A Case-Based Approach to Cross Domain Sentiment Classification

  • Bruno Ohana
  • Sarah Jane Delany
  • Brendan Tierney
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7466)

Abstract

This paper considers the task of sentiment classification of subjective text across many domains, in particular in scenarios where no in-domain data is available. Motivated by the more general applicability of such methods, we propose an extensible approach to sentiment classification that leverages sentiment lexicons and out-of-domain data to build a case-based system in which solutions to past cases are reused to predict the sentiment of new documents from an unknown domain. In our approach, the case representation uses a set of features based on document statistics, while the case solution stores the sentiment lexicons employed in past predictions, allowing for later retrieval and reuse on similar documents. The case-based nature of our approach also allows for future improvements, since new lexicons and classification methods can be added to the case base as they become available. In a cross-domain experiment, our method showed robust results when compared to a baseline single-lexicon classifier for which the lexicon has to be pre-selected for the domain in question.
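The retrieve-and-reuse cycle described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the three document statistics, the Euclidean nearest-neighbour retrieval, the toy lexicons `lex_a` and `lex_b`, and all function names are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Tuple

# Toy lexicons (word -> polarity). Hypothetical data, not taken from the paper.
LEXICONS = {
    "lex_a": {"good": 1.0, "great": 1.0, "bad": -1.0, "poor": -1.0},
    "lex_b": {"sturdy": 1.0, "reliable": 1.0, "broken": -1.0, "flimsy": -1.0},
}


@dataclass
class Case:
    features: Tuple[float, ...]  # case representation: document statistics
    lexicon: str                 # case solution: lexicon that worked on that document


def doc_features(text: str) -> Tuple[float, ...]:
    """Assumed statistics: token count, mean token length, type/token ratio."""
    tokens = text.lower().split()
    n = len(tokens)
    if n == 0:
        return (0.0, 0.0, 0.0)
    mean_len = sum(len(t) for t in tokens) / n
    return (float(n), mean_len, len(set(tokens)) / n)


def distance(a, b) -> float:
    # Plain Euclidean distance; a real system would likely normalise features first.
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve(case_base, query, k=1):
    """Return the k stored cases closest to the query features."""
    return sorted(case_base, key=lambda c: distance(c.features, query))[:k]


def lexicon_score(text: str, lexicon: dict) -> float:
    """Sum the polarity of every token found in the lexicon."""
    return sum(lexicon.get(t, 0.0) for t in text.lower().split())


def classify(text: str, case_base, k=1) -> str:
    """Reuse the lexicons stored in the nearest cases to score a new document."""
    neighbours = retrieve(case_base, doc_features(text), k)
    total = sum(lexicon_score(text, LEXICONS[c.lexicon]) for c in neighbours)
    return "positive" if total >= 0 else "negative"


# Two past cases, each recording which lexicon was used on a past document.
case_base = [
    Case(doc_features("good great movie"), "lex_a"),
    Case(doc_features("a sturdy and reliable blender that never arrived broken"), "lex_b"),
]

print(classify("bad poor film", case_base))
```

In this sketch, adding a new lexicon only requires a new entry in `LEXICONS` and cases that point to it, which mirrors the extensibility the abstract describes.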

Keywords

case-based reasoning; sentiment classification; sentiment lexicons

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Bruno Ohana (1)
  • Sarah Jane Delany (1)
  • Brendan Tierney (1)
  1. Dublin Institute of Technology, Dublin, Ireland
