Multimedia Tools and Applications, Volume 77, Issue 16, pp 21265–21280

Generate domain-specific sentiment lexicon for review sentiment analysis

  • Hongyu Han
  • Jianpei Zhang
  • Jing Yang
  • Yiran Shen
  • Yongshi Zhang


Lexicon-based approaches to review sentiment analysis have attracted significant attention in recent years, and many sentiment lexicon generation methods have been proposed. However, the generation of domain-specific lexicons from unlabeled data has not been effectively addressed. In this paper, we propose a new domain-specific sentiment lexicon generation method. Mutual information is introduced to assign sentiment weights to the terms in the lexicon, each paired with its Part-Of-Speech (POS) tag. The training data are selected from an unlabeled corpus according to their sentiment scores, which are evaluated by a SentiWordNet (SWN) based sentiment classifier. We then propose a complete lexicon-based sentiment analysis framework that uses the domain-specific sentiment lexicon generated by the proposed method. Experiments are carried out on publicly available datasets. The results show that the proposed framework, using domain-specific lexicons generated by the proposed method, performs well.
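The scoring step described above can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the code below assumes a common choice: the score of a (term, POS) pair is the difference between its pointwise mutual information with the positive class and with the negative class, estimated from document frequencies with add-one smoothing. The function names `build_lexicon` and `classify`, the toy reviews, and the pseudo-labels (which stand in for the output of the SWN-based classifier) are all hypothetical.

```python
import math
from collections import Counter


def build_lexicon(docs, smoothing=1.0):
    """Score (term, POS) pairs from pseudo-labeled reviews.

    docs: iterable of (tagged_tokens, label), where tagged_tokens is a list
    of (word, pos_tag) tuples and label is "pos" or "neg". The labels are
    assumed to come from an upstream classifier, not from human annotation.
    """
    doc_freq = {"pos": Counter(), "neg": Counter()}
    n_docs = Counter()
    for tagged_tokens, label in docs:
        n_docs[label] += 1
        # Count each (term, POS) pair at most once per review.
        doc_freq[label].update(set(tagged_tokens))

    lexicon = {}
    for term in set(doc_freq["pos"]) | set(doc_freq["neg"]):
        # Smoothed estimates of P(term | class).
        p_pos = (doc_freq["pos"][term] + smoothing) / (n_docs["pos"] + 2 * smoothing)
        p_neg = (doc_freq["neg"][term] + smoothing) / (n_docs["neg"] + 2 * smoothing)
        # PMI(term, pos) - PMI(term, neg) reduces to this log ratio.
        lexicon[term] = math.log2(p_pos / p_neg)
    return lexicon


def classify(tagged_tokens, lexicon):
    """Sum lexicon scores; terms missing from the lexicon contribute 0."""
    total = sum(lexicon.get(term, 0.0) for term in tagged_tokens)
    return "pos" if total >= 0 else "neg"


# Toy pseudo-labeled reviews (illustrative data only).
docs = [
    ([("great", "JJ"), ("battery", "NN")], "pos"),
    ([("great", "JJ"), ("screen", "NN")], "pos"),
    ([("poor", "JJ"), ("battery", "NN")], "neg"),
    ([("poor", "JJ"), ("screen", "NN")], "neg"),
]
lexicon = build_lexicon(docs)
prediction = classify([("great", "JJ"), ("sound", "NN")], lexicon)
```

Keying the lexicon on (word, POS) pairs rather than on bare words lets the same surface form carry different scores in different grammatical roles, which is the motivation the abstract gives for keeping POS tags in the lexicon.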


Lexicon-based approach, Sentiment analysis, SentiWordNet, Mutual information



This work is supported by the National Natural Science Foundation of China (No. 61672179, No. 61370083, No. 61402126), the Specialized Research Fund for the Doctoral Program of Higher Education (No. 20122304110012), the Heilongjiang Postdoctoral Science Foundation (LBH-Z14071), the Natural Science Foundation for Young Scientists of Heilongjiang Province (QC2016083) and the Natural Science Foundation of Heilongjiang Province (F2015030).



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • Hongyu Han
    • 1
  • Jianpei Zhang
    • 1
  • Jing Yang
    • 1
  • Yiran Shen
    • 1
  • Yongshi Zhang
    • 1
  1. College of Computer Science and Technology, Harbin Engineering University, Harbin, China
