Power of Predictive Analytics: Using Emotion Classification of Twitter Data for Predicting 2016 US Presidential Elections

Satish Mahadevan Srinivasan, Raghvinder Sangwan, Colin Neill, Tianhai Zu

Abstract


Predictive analytics using Twitter feeds is becoming a popular field of research. A tweet holds a wealth of information on how individuals express and communicate their feelings and emotions within their social network. Large-scale collection, cleaning, and mining of tweets can help capture not only an individual’s emotions but also those of a larger group. However, capturing a large volume of tweets and identifying the emotions expressed in them is a challenging task. First, the classification algorithms employed in the past for classifying emotions have yielded low-to-moderate accuracies, making it difficult to precisely predict the outcome of an event. Second, the available emotion-annotated datasets are diverse and none is specific to a particular domain, which has limited the potential of supervised algorithms for classification. In this study, we demonstrate the potential of a lexicon-based classifier, NRC, which can mine emotions and sentiments in tweets. Using the NRC classifier, we first determined the emotions and sentiments within the tweets and then used them to predict the swing direction of 19 US states towards the candidates in the 2016 US presidential election. Comparing the NRC predictions with the actual outcome of the election, we observed ~90% accuracy, a performance superior to that of the mainstream pollsters, indicating the potential that emotion- and sentiment-based classification holds for predicting the outcome of significant social and political events.
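
As a rough illustration of what lexicon-based emotion classification involves, the following minimal Python sketch counts emotion and sentiment labels for the words in each tweet and compares candidates by net sentiment. Everything in it is an assumption made for illustration: the toy lexicon entries are hypothetical and are not taken from the NRC lexicon, and the function names (`score_tweet`, `net_sentiment`, `predict_lean`) and the "positive minus negative" aggregation rule are not the procedure reported in the study.

```python
import re
from collections import Counter

# Hypothetical toy lexicon: word -> emotion/sentiment labels.
# Illustrative entries only; NOT taken from the NRC lexicon.
TOY_LEXICON = {
    "win": {"joy", "positive"},
    "great": {"joy", "positive"},
    "hope": {"anticipation", "positive"},
    "fear": {"fear", "negative"},
    "corrupt": {"anger", "disgust", "negative"},
    "lies": {"anger", "negative"},
}


def score_tweet(text):
    """Count how often each label is triggered by the words of one tweet."""
    counts = Counter()
    for token in re.findall(r"[a-z']+", text.lower()):
        for label in TOY_LEXICON.get(token, ()):
            counts[label] += 1
    return counts


def net_sentiment(tweets):
    """Assumed aggregation: positive hits minus negative hits over all tweets."""
    total = Counter()
    for tweet in tweets:
        total.update(score_tweet(tweet))
    return total["positive"] - total["negative"]


def predict_lean(tweets_by_candidate):
    """Return the candidate whose tweets carry the highest net sentiment."""
    scores = {name: net_sentiment(tweets)
              for name, tweets in tweets_by_candidate.items()}
    return max(scores, key=scores.get)
```

In this sketch, `predict_lean` would be called once per state with that state's tweets grouped by the candidate they mention. A real lexicon is far larger, and real pipelines typically need tweet cleaning (URLs, hashtags, retweets) before scoring.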


Keywords


machine learning, emotion classification, lexicon-based classifier, predictive analytics, social media, Twitter





Based at Tarleton State University in Stephenville, Texas, USA, The Journal of Social Media in Society is sponsored by the Colleges of Liberal and Fine Arts, Education, Business Administration, and Graduate Studies.