Building Domain Specific Sentiment Lexicons Combining Information from Many Sentiment Lexicons and a Domain Specific Corpus

  • Hugo Hammer
  • Anis Yazidi
  • Aleksander Bai
  • Paal Engelstad
Part of the IFIP Advances in Information and Communication Technology book series (IFIPAICT, volume 456)


Most approaches to sentiment analysis require a sentiment lexicon in order to automatically predict sentiment or opinion in a text. The lexicon is generated by selecting words and assigning scores to the words, and the performance of the sentiment analysis depends on the quality of the assigned scores. This paper addresses an aspect of sentiment lexicon generation that has been overlooked so far, namely that the most appropriate score assigned to a word in the lexicon depends on the domain. The common practice, on the contrary, is to use the same lexicon without adjustments across different domains, ignoring the fact that the scores are normally highly sensitive to the domain. Consequently, the same lexicon might perform well on one domain while performing poorly on another, unless some score adjustment is performed. In this paper, we advocate that a sentiment lexicon needs further adjustments in order to perform well in a specific domain. To cope with these domain-specific adjustments, we adopt a stochastic formulation of the sentiment score assignment problem instead of the classical deterministic formulation. Viewing a sentiment score as a stochastic variable permits us to accommodate the domain-specific adjustments. Experimental results demonstrate the feasibility of our approach and its superiority to generic lexicons without domain adjustments.
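The abstract does not spell out the stochastic model, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a normal prior over a word's score, estimated from the scores the word receives in several generic lexicons, and normally distributed domain observations with a known noise variance. Under squared-error loss, the Bayes estimate is the posterior mean. The function name, parameters, and all numbers are hypothetical.

```python
from statistics import mean, variance


def domain_adjusted_score(lexicon_scores, domain_observations,
                          obs_variance=1.0, min_prior_var=1e-6):
    """Return (posterior mean, posterior variance) of a word's score.

    Assumed model (not taken from the paper):
      prior       score ~ Normal(mean, variance of the generic lexicon scores)
      likelihood  each domain observation = score + Normal(0, obs_variance)
    """
    prior_mean = mean(lexicon_scores)
    # Sample variance needs at least two lexicon scores; the floor avoids
    # a division by zero when all lexicons agree exactly.
    prior_var = max(variance(lexicon_scores), min_prior_var)

    n = len(domain_observations)
    if n == 0:
        # No domain evidence: fall back to the generic prior.
        return prior_mean, prior_var

    prior_precision = 1.0 / prior_var
    data_precision = n / obs_variance
    post_var = 1.0 / (prior_precision + data_precision)
    # Precision-weighted average of the prior mean and the domain mean.
    post_mean = post_var * (prior_precision * prior_mean
                            + data_precision * mean(domain_observations))
    return post_mean, post_var


# Hypothetical word: positive in three generic lexicons,
# but used negatively in the target domain.
score, uncertainty = domain_adjusted_score(
    lexicon_scores=[2.0, 3.0, 1.0],
    domain_observations=[-1.0, -2.0, -1.5, -1.5],
)
print(score, uncertainty)
```

With these made-up inputs the generic prior mean of 2.0 is pulled to about -0.8, because four domain observations outweigh a prior whose variance is 1.0. With no domain observations the function simply returns the generic prior, which corresponds to using the lexicon unadjusted.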


Keywords: Bayesian decision theory, Cross-domain, Sentiment classification, Sentiment lexicon





Copyright information

© IFIP International Federation for Information Processing 2015

Authors and Affiliations

  • Hugo Hammer (1)
  • Anis Yazidi (1)
  • Aleksander Bai (1)
  • Paal Engelstad (1)
  1. Department of Computer Science, Oslo and Akershus University College of Applied Sciences, Oslo, Norway
