Abstract
Lexicon-based approaches to review sentiment analysis have attracted significant attention in recent years, and many sentiment lexicon generation methods have been proposed. However, generating a domain-specific lexicon from unlabeled data has not been effectively addressed. In this paper, we propose a new domain-specific sentiment lexicon generation method. Mutual information is introduced to assign sentiment values to terms with Part-Of-Speech (POS) tags in the lexicon. The training data are selected from an unlabeled corpus according to their sentiment scores, which are evaluated by a SentiWordNet (SWN) based sentiment classifier. We then propose a complete lexicon-based sentiment analysis framework that uses the domain-specific sentiment lexicon generated by this method. Experiments are carried out on publicly available datasets. The results show that the proposed framework, using domain-specific lexicons generated by the proposed method, performs well.
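The two steps named in the abstract (selecting training reviews by a prior sentiment score, then scoring POS-tagged terms with mutual information) can be illustrated with a minimal sketch. The abstract gives no formulas, so everything below is an assumption: the toy prior lexicon `TOY_PRIOR` is a hypothetical stand-in for SentiWordNet, the threshold is arbitrary, and the score PMI(term, pos) - PMI(term, neg) is a common pointwise mutual information formulation, not necessarily the one used in the paper.

```python
import math
from collections import Counter

# Hypothetical stand-in for SentiWordNet: (term, POS) -> prior score in [-1, 1].
TOY_PRIOR = {("good", "a"): 0.6, ("bad", "a"): -0.6}


def prior_score(doc):
    """Average prior score of a tokenized review; unknown terms count as 0."""
    return sum(TOY_PRIOR.get(tok, 0.0) for tok in doc) / max(len(doc), 1)


def select_training_data(docs, threshold=0.1):
    """Keep reviews whose prior score is clearly positive or negative (assumed rule)."""
    selected = []
    for doc in docs:
        score = prior_score(doc)
        if abs(score) >= threshold:
            selected.append((doc, "pos" if score > 0 else "neg"))
    return selected


def pmi_lexicon(labeled_docs, smoothing=1.0):
    """Score each (term, POS) as PMI(term, pos) - PMI(term, neg).

    The difference simplifies to log2(P(term | pos) / P(term | neg)),
    estimated here from smoothed document frequencies.
    """
    term_class = Counter()   # ((term, POS), label) -> number of documents
    class_count = Counter()  # label -> number of documents
    for doc, label in labeled_docs:
        class_count[label] += 1
        for tok in set(doc):
            term_class[(tok, label)] += 1

    lexicon = {}
    for tok in {tok for tok, _ in term_class}:
        p_pos = (term_class[(tok, "pos")] + smoothing) / (class_count["pos"] + 2 * smoothing)
        p_neg = (term_class[(tok, "neg")] + smoothing) / (class_count["neg"] + 2 * smoothing)
        lexicon[tok] = math.log2(p_pos / p_neg)
    return lexicon


# Toy unlabeled reviews, already tokenized and POS-tagged (illustrative data only).
reviews = [
    [("good", "a"), ("battery", "n"), ("long", "a")],
    [("bad", "a"), ("battery", "n"), ("short", "a")],
    [("battery", "n"), ("arrived", "v")],
    [("good", "a"), ("long", "a"), ("screen", "n")],
]
training = select_training_data(reviews)
domain_lexicon = pmi_lexicon(training)
```

In this toy run the domain terms ("long", "a") and ("short", "a") do not appear in the prior lexicon, yet they receive positive and negative scores respectively because of the reviews they co-occur in. The review with no prior signal is left out of the training data.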
Acknowledgements
This work is supported by the National Natural Science Foundation of China (No. 61672179, No. 61370083, No. 61402126), the Specialized Research Fund for the Doctoral Program of Higher Education (No. 20122304110012), the Heilongjiang Postdoctoral Science Foundation (LBH-Z14071), the Natural Science Foundation for Young Scientists of Heilongjiang Province (QC2016083) and the Natural Science Foundation of Heilongjiang Province (F2015030).
Cite this article
Han, H., Zhang, J., Yang, J. et al. Generate domain-specific sentiment lexicon for review sentiment analysis. Multimed Tools Appl 77, 21265–21280 (2018). https://doi.org/10.1007/s11042-017-5529-5
DOI: https://doi.org/10.1007/s11042-017-5529-5