Lexicon-Based Sentiment Analysis on Topical Chinese Microblog Messages
Microblogging is a popular social medium through which people express their opinions and sentiments on social topics. The Chinese microblogging service, Weibo, has become a prominent medium in Chinese society. People are eager to know others’ attitudes towards social events, so sentiment analysis of topical microblog messages is important. In this paper we introduce a lexicon-based sentiment analysis method. We construct a Weibo lexicon containing representative topical words and out-of-vocabulary (OOV) words, which are usually informal and do not appear in formal dictionaries. In addition, we use a propagation algorithm to automatically assign sentiment polarity scores to the discovered words. These scores reflect the Weibo context more closely, since words may take on new or even opposite polarities compared with their formal meanings. Evaluations on classification tasks show that our method is effective at recognizing the subjectivity and sentiment of Weibo sentences, and that the Weibo lexicon improves classification performance.
Keywords: Sentiment Analysis, Sentiment Classification, Sentiment Dictionary, Polarity Score, Sentiment Word
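The abstract does not spell out the propagation algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a generic label-propagation scheme over a word co-occurrence graph: seed words with known polarity are held fixed, and every other word repeatedly takes the weighted average score of its neighbours. The function names (`build_graph`, `propagate`, `classify`), the toy seed words, the pre-segmented input, and the threshold are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def build_graph(sentences):
    """Count how often two words co-occur in the same pre-segmented sentence."""
    graph = defaultdict(lambda: defaultdict(float))
    for words in sentences:
        unique = set(words)
        for a in unique:
            for b in unique:
                if a != b:
                    graph[a][b] += 1.0
    return graph


def propagate(graph, seeds, iterations=20):
    """Spread polarity from seed words to their neighbours (assumed scheme).

    Seed scores stay clamped; every other word takes the co-occurrence-weighted
    average of its neighbours' current scores.
    """
    scores = {word: seeds.get(word, 0.0) for word in graph}
    for _ in range(iterations):
        updated = {}
        for word, neighbours in graph.items():
            if word in seeds:
                updated[word] = seeds[word]
                continue
            total = sum(neighbours.values())
            updated[word] = sum(scores[n] * w for n, w in neighbours.items()) / total
        scores = updated
    return scores


def classify(words, lexicon, threshold=0.05):
    """Label a sentence by summing the polarity scores of its words."""
    score = sum(lexicon.get(w, 0.0) for w in words)
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


# Toy, already-segmented sentences (hypothetical data, not from the paper).
sentences = [
    ["好", "给力"],      # "good" with the slang word 给力 ("awesome")
    ["给力", "开心"],    # 给力 with "happy"
    ["坏", "坑爹"],      # "bad" with the slang word 坑爹 ("a rip-off")
    ["坑爹", "难过"],    # 坑爹 with "sad"
]
# Seed polarities, standing in for a formal sentiment dictionary.
seeds = {"好": 1.0, "开心": 1.0, "坏": -1.0, "难过": -1.0}

lexicon = propagate(build_graph(sentences), seeds)
print(classify(["给力"], lexicon), classify(["坑爹"], lexicon))
```

In this toy setting the informal words 给力 and 坑爹 are absent from the seed dictionary and receive their polarity only from the words they co-occur with, which is the role the abstract assigns to propagation for OOV words.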
The NExT search center is supported by the Singapore National Research Foundation and Interactive Digital Media R&D Program Office, MDA, under research grant (WBS: R-252-300-001-490).