User-Level Twitter Sentiment Analysis with a Hybrid Approach
With the objective of extracting useful information from the opinion-rich data on Twitter, both supervised learning-based and unsupervised lexicon-based methods for sentiment analysis on Twitter corpora have been studied in recent years. However, the unique characteristics of tweets, such as the lack of labels and the frequent use of emoticons, pose challenges to most existing learning-based and lexicon-based methods. In addition, current studies on Twitter sentiment analysis mainly focus on domain-specific tweets, while a larger share of tweets concern personal feelings and comments on daily life events. In this paper, a hybrid approach combining augmented lexicon-based and learning-based methods is designed to handle the distinctive characteristics of tweets and perform sentiment analysis at the user level, providing information about specific Twitter users’ typing habits and their online sentiment fluctuations. Our model achieves an overall accuracy of 81.9%, substantially outperforming current baseline models on tweet sentiment analysis.
Keywords: Twitter, Social media, Data mining, Sentiment analysis
The authors would like to acknowledge the funding support from the National Natural Science Foundation of P.R. China (under Grants 51009017 and 51379002), the Applied Basic Research Funds from the Ministry of Transport of P.R. China (under Grant 2012-329-225-060), and the Program for Liaoning Excellent Talents in University (under Grant LJQ2013055).