Collaborative Sentiment Analysis in Social Media for Detection of Attacks: Preliminary Results
ABSTRACT
Social media is becoming a major and popular technological platform that allows users to discuss and share information. Information is generated and managed on computers or mobile devices by one person and consumed by many others. The importance of this user-generated textual content has been recognized by researchers and marketing managers. Extracting valuable, high-quality nuggets of knowledge from these huge amounts of data, for example by capturing and summarizing sentiments and by filtering out spam users and activities, could help users make informed and accurate decisions. In this work, we develop a sentiment identification system called SES, which implements three different sentiment identification algorithms. The first algorithm augments basic compositional semantic rules. The second algorithm treats sentiment not as a simple classification into positive, negative, and objective, but as a continuous score that reflects the degree of sentiment; all word scores are calculated from a large volume of customer reviews. Because of the special characteristics of social media texts, we propose a third algorithm, which takes emoticons, negation word position, and domain-specific words into account. Furthermore, a machine-learning model is applied to features derived from the outputs of the three algorithms. We conduct our experiments on user comments from Facebook and tweets from Twitter. The results show that Random Forest achieves better accuracy than decision tree, neural network, and logistic regression. In addition, a preliminary spam filtering system is built by incorporating users' historical social activities and the sentiments of their comments.
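As an illustration only, the idea of a continuous sentiment score that accounts for emoticons and negation word position can be sketched as follows. This is not the SES implementation: the word and emoticon scores, the negation list, the two-token negation window, and the names `tokenize` and `sentiment_score` are all hypothetical choices made for this example.

```python
import re

# Hypothetical word scores in [-1, 1]; the abstract states that real scores
# are derived from a large volume of customer reviews.
WORD_SCORES = {"good": 0.6, "great": 0.8, "bad": -0.6, "terrible": -0.9}
# Hypothetical emoticon scores.
EMOTICON_SCORES = {":)": 0.5, ":(": -0.5}
NEGATIONS = {"not", "no", "never"}
# Assumption: a negation flips sentiment words at most 2 tokens after it.
NEGATION_WINDOW = 2


def tokenize(text):
    """Split text into lowercase words and simple emoticons such as ':)'."""
    return re.findall(r"[:;]-?[()]|[a-z']+", text.lower())


def sentiment_score(text):
    """Return the mean score of all scored tokens, or 0.0 if none are found."""
    total, hits = 0.0, 0
    last_negation = None  # index of the most recent negation word
    for i, tok in enumerate(tokenize(text)):
        if tok in NEGATIONS:
            last_negation = i
        elif tok in EMOTICON_SCORES:
            # Emoticons are scored directly and are not affected by negation.
            total += EMOTICON_SCORES[tok]
            hits += 1
        elif tok in WORD_SCORES:
            score = WORD_SCORES[tok]
            # Flip the sign when the word falls inside the negation window.
            if last_negation is not None and i - last_negation <= NEGATION_WINDOW:
                score = -score
            total += score
            hits += 1
    return total / hits if hits else 0.0
```

Under these assumed scores, `sentiment_score("not good :(")` is negative and `sentiment_score("This is great :)")` is positive. Scores of this kind, one per algorithm, could then serve as input features to a classifier such as Random Forest.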