Information Retrieval, Volume 14, Issue 3, pp 337–353

Sentiment classification: a lexical similarity based approach for extracting subjectivity in documents

  • Kiran Sarvabhotla
  • Prasad Pingali
  • Vasudeva Varma
Web Mining for Search

Abstract

With the growth of social media, document sentiment classification has become an active area of research in the past decade. It can be viewed as a special case of topical classification applied only to the subjective portions of a document (the sources of sentiment). Hence, the key task in document sentiment classification is extracting subjectivity. Existing approaches to extracting subjectivity rely heavily on linguistic resources such as sentiment lexicons and on complex supervised patterns based on part-of-speech (POS) information. This makes subjective feature extraction complex and resource dependent. In this work, we try to minimize the dependency on linguistic resources in sentiment classification. We propose a simple statistical methodology called review summary (RSUMM) and use it in combination with well-known feature selection methods to extract subjectivity. Our experimental results on a movie review dataset demonstrate the effectiveness of the proposed methodology.

Keywords

Social media; Sentiment classification; Subjectivity; Linguistic resources; RSUMM

Notes

Acknowledgments

We thank Bo Pang, Lillian Lee, and Shivakumar Vaithyanathan for providing the URL in their paper to download the IMDb movie review dataset. We thank the Department of Computer Science, Cornell University for providing the link to download the dump of the IMDb archive.

References

  1. Argamon, S., Koppel, M., & Avneri, G. (1998). Routing documents according to style. In Proceedings of 1st international workshop on innovative information systems.
  2. Aue, A., & Gamon, M. (2005). Customizing sentiment classifiers to new domains: A case study. In Proceedings of the international conference RANLP-2005.
  3. Baccianella, S., Esuli, A., & Sebastiani, F. (2009). Multi-facet rating of product reviews. In Proceedings of the 31th European conference on IR research on advances in information retrieval, ECIR ’09 (pp. 461–472). Springer-Verlag.
  4. Baeza-Yates, R. A., & Ribeiro-Neto, B. (1999). Modern information retrieval. Boston: Addison-Wesley Longman.
  5. Beineke, P., Hastie, T., & Vaithyanathan, S. (2004). The sentimental factor: Improving review classification via human-provided information. In Proceedings of the 42nd annual meeting on association for computational linguistics, ACL ’04. Association for Computational Linguistics.
  6. Cui, H., Mittal, V., & Datar, M. (2006). Comparative experiments on sentiment classification for online product reviews. In Proceedings of the 21st national conference on artificial intelligence (Vol. 2, pp. 1265–1270). AAAI Press.
  7. Gretzel, U., & Yoo, K. H. (2008). Use and impact of online travel reviews. Information and Communication Technologies in Tourism (pp. 35–46).
  8. Hatzivassiloglou, V., & McKeown, K. R. (1997). Predicting the semantic orientation of adjectives. In Proceedings of the 8th conference on European chapter of the association for computational linguistics (pp. 174–181). Association for Computational Linguistics.
  9. Hu, M., & Liu, B. (2004). Mining and summarizing customer reviews. In Proceedings of the 10th ACM SIGKDD international conference on knowledge discovery and data mining (pp. 168–177). KDD ’04.
  10. Hu, Y., Lu, R., Li, X., Chen, Y., & Duan, J. (2007). A language modeling approach to sentiment analysis. In Proceedings of the 7th international conference on computational science, Part II, ICCS ’07 (pp. 1186–1193). Springer-Verlag.
  11. Kessler, B., Nunberg, G., & Schütze, H. (1997). Automatic detection of text genre. In Proceedings of the 35th annual meeting of the association for computational linguistics and 8th conference of the European chapter of the association for computational linguistics, ACL-35 (pp. 32–38). Association for Computational Linguistics.
  12. Li, S., Lee, S. Y. M., Chen, Y., Huang, C.-R., & Zhou, G. (2010). Sentiment classification and polarity shifting. In Proceedings of the 23rd international conference on computational linguistics (Coling 2010) (pp. 635–643).
  13. Liu, B. (2010). Sentiment analysis and subjectivity. In Handbook of natural language processing (2nd ed.). Boca Raton, FL: CRC Press, Taylor and Francis Group.
  14. Matsumoto, S., Takamura, H., & Okumura, M. (2005). Sentiment classification using word sub-sequences and dependency sub-trees. In Proceedings of PAKDD (pp. 301–311).
  15. Mullen, T., & Collier, N. (2004). Sentiment analysis using support vector machines with diverse information sources. In Proceedings of EMNLP (pp. 412–418).
  16. Pang, B., & Lee, L. (2004). A sentimental education: Sentiment analysis using subjectivity summarization based on minimum cuts. In Proceedings of the ACL (pp. 271–278).
  17. Pang, B., & Lee, L. (2008). Opinion mining and sentiment analysis. Foundations and Trends in Information Retrieval, 2, 1–135. ISSN 1554-0669.
  18. Pang, B., Lee, L., & Vaithyanathan, S. (2002). Thumbs up? Sentiment classification using machine learning techniques. In Proceedings of the ACL-02 conference on empirical methods in natural language processing (Vol. 10, pp. 79–86).
  19. Raychev, V., & Nakov, P. (2009). Language-independent sentiment analysis using subjectivity and positional information. In Proceedings of the international conference RANLP-2009 (pp. 360–364). Association for Computational Linguistics.
  20. Tan, S., Cheng, X., Wang, Y., & Xu, H. (2009). Adapting naive Bayes to domain adaptation for sentiment analysis. In Proceedings of the 31th European conference on IR research on advances in information retrieval, ECIR ’09 (pp. 337–349). Springer-Verlag.
  21. Thet, T. T., Na, J.-C., & Khoo, C. S. (2008). Sentiment classification of movie reviews using multiple perspectives. In Proceedings of the 11th international conference on Asian digital libraries: Universal and ubiquitous access to information, ICADL 08 (pp. 184–193). Springer-Verlag.
  22. Turney, P. D. (2002). Thumbs up or thumbs down? Semantic orientation applied to unsupervised classification of reviews. In Proceedings of the 40th annual meeting on association for computational linguistics, ACL ’02 (pp. 417–424). Association for Computational Linguistics.
  23. Wang, S., Li, D., Wei, Y., & Li, H. (2009). A feature selection method based on Fisher’s discriminant ratio for text sentiment classification. In Proceedings of the international conference on web information systems and mining, WISM ’09 (pp. 88–97). Springer-Verlag.
  24. Whitelaw, C., Garg, N., & Argamon, S. (2005). Using appraisal groups for sentiment analysis. In Proceedings of the 14th ACM international conference on information and knowledge management, CIKM ’05 (pp. 625–631). ACM.
  25. Yang, Y., & Pedersen, J. O. (1997). A comparative study on feature selection in text categorization. In Proceedings of the 14th international conference on machine learning, ICML ’97 (pp. 412–420). Morgan Kaufmann.

Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  • Kiran Sarvabhotla (1)
  • Prasad Pingali (1)
  • Vasudeva Varma (1)
  1. Search and Information Extraction Lab, International Institute of Information Technology, Hyderabad, India
