Skip to main content

Cross-Domain Sentiment Classification via Polarity-Driven State Transitions in a Markov Model

  • Conference paper
  • First Online:
Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2015)

Abstract

Nowadays understanding people’s opinions is the way to success, whatever the goal. Sentiment classification automates this task, assigning a positive, negative or neutral polarity to free text concerning services, products, TV programs, and so on. Learning accurate models requires a considerable effort from human experts that have to properly label text data. To reduce this burden, cross-domain approaches are advisable in real cases and transfer learning between source and target domains is usually demanded due to language heterogeneity. This paper introduces some variants of our previous work [1], where both transfer learning and sentiment classification are performed by means of a Markov model. While document splitting into sentences does not perform well on common benchmark, using polarity-bearing terms to drive the classification process shows encouraging results, given that our Markov model only considers single terms without further context information.

Giacomo Domeniconi—This work was partially supported by the project “Toreador”.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

  1. 1.

    www.amazon.com.

References

  1. Domeniconi, G., Moro, G., Pagliarani, A., Pasolini, R.: Markov chain based method for in-domain and cross-domain sentiment classification. In: Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management, pp. 127–137 (2015)

    Google Scholar 

  2. Dai, W., Xue, G.-R., Yang, Q., Yu, Y.:Co-clustering based classification for out-of-domain documents. In: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 210–219. ACM (2007)

    Google Scholar 

  3. Xue, G.-R., Dai, W., Yang, Q., Yu, Y.: Topic-bridged PLSA for cross-domain text classification. In: Proceedings of the 31st Annual International ACM SIGIR Conference on Research and development in Information Retrieval, pp. 627–634. ACM (2008)

    Google Scholar 

  4. Li, L., Jin, X., Long, M.: Topic correlation analysis for cross-domain text classification. In: AAAI (2012)

    Google Scholar 

  5. Tan, S., Wang, Y., Cheng, X.: Combining learn-based and lexicon-based techniques for sentiment detection without using labeled examples. In: Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 743–744. ACM (2008)

    Google Scholar 

  6. Qiu, L., Zhang, W., Hu, C., Zhao, K.: Selc: a self-supervised model for sentiment classification. In: Proceedings of the 18th ACM Conference on Information and Knowledge Management, pp. 929–936. ACM (2009)

    Google Scholar 

  7. Melville, P., Gryc, W., Lawrence, R.D.: Sentiment analysis of blogs by combining lexical knowledge with text classification. In: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1275–1284. ACM (2009)

    Google Scholar 

  8. Domeniconi, G., Moro, G., Pasolini, R., Sartori, C.: Cross-domain text classification through iterative refining of target categories representations. In: Proceedings of the 6th International Conference on Knowledge Discovery and Information Retrieval (2014)

    Google Scholar 

  9. Shrivastava, A., Malisiewicz, T., Gupta, A., Efros, A.A.: Data-driven visual similarity for cross-domain image matching. ACM Trans. Graph. (TOG) 30, 154 (2011)

    Article  Google Scholar 

  10. Domeniconi, G., Masseroli, M., Moro, G., Pinoli, P.: Cross-organism learning method to discover new gene functionalities. Comput. Methods Programs Biomed. 126, 20–34 (2016)

    Article  Google Scholar 

  11. Pan, S.J., Yang, Q.: A survey on transfer learning. IEEE Trans. Knowl. Data Eng. 22(10), 1345–1359 (2010)

    Article  Google Scholar 

  12. Domeniconi, G., Moro, G., Pasolini, R., Sartori, C.: Iterative refining of category profiles for nearest centroid cross-domain text classification. In: Fred, A., Dietz, J.L.G., Aveiro, D., Liu, K., Filipe, J. (eds.) IC3K 2014. CCIS, vol. 553, pp. 50–67. Springer, Heidelberg (2015). doi:10.1007/978-3-319-25840-9_4

    Chapter  Google Scholar 

  13. Dave, K., Lawrence, S., Pennock, D.M.: Mining the peanut gallery: opinion extraction and semantic classification of product reviews. In: Proceedings of the 12th International Conference on World Wide Web, pp. 519–528. ACM (2003)

    Google Scholar 

  14. Domeniconi, G., Moro, G., Pasolini, R., Sartori, C.: A comparison of term weighting schemes for text classification and sentiment analysis with a supervised variant of tf.idf. In: Helfert, M., Holzinger, A., Belo, O., Francalanci, C. (eds.) DATA 2015. CCIS, vol. 584, pp. 39–58. Springer, Heidelberg (2016). doi:10.1007/978-3-319-30162-4_4

    Chapter  Google Scholar 

  15. Domeniconi, G., Masseroli, M., Moro, G., Pinoli, P.: Random perturbations of term weighted gene ontology annotations for discovering gene unknown functionalities. In: Fred, A., Dietz, J.L.G., Aveiro, D., Liu, K., Filipe, J. (eds.) IC3K 2014. CCIS, vol. 553, pp. 181–197. Springer, Heidelberg (2015). doi:10.1007/978-3-319-25840-9_12

    Chapter  Google Scholar 

  16. Paltoglou, G., Thelwall, M.: A study of information retrieval weighting schemes for sentiment analysis. In: Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pp. 1386–1395. Association for Computational Linguistics (2010)

    Google Scholar 

  17. Deng, Z.-H., Luo, K.-H., Yu, H.-L.: A study of supervised term weighting scheme for sentiment analysis. Expert Syst. Appl. 41(7), 3506–3513 (2014)

    Article  Google Scholar 

  18. Wu, H., Gu, X.: Reducing over-weighting in supervised term weighting for sentiment analysis. In: COLING, pp. 1322–1330 (2014)

    Google Scholar 

  19. Aue, A., Gamon, M.: Customizing sentiment classifiers to new domains: A case study. In: Proceedings of Recent Advances in Natural Language Processing (RANLP), vol. 1, pp. 2– 1 (2005)

    Google Scholar 

  20. Bollegala, D., Weir, D., Carroll, J.: Cross-domain sentiment classification using a sentiment sensitive thesaurus. IEEE Trans. Knowl. Data Eng 25(8), 1719–1731 (2013)

    Article  Google Scholar 

  21. Blitzer, J., Dredze, M., Pereira, F., et al.: Biographies, bollywood, boom-boxes and blenders: domain adaptation for sentiment classification. In: ACL, vol. 7, pp. 440–447 (2007)

    Google Scholar 

  22. Pan, S.J., Ni, X., Sun, J.-T., Yang, Q., Chen, Z.: Cross-domain sentiment classification via spectral feature alignment. In: Proceedings of the 19th International Conference on World wide web, pp. 751–760. ACM (2010)

    Google Scholar 

  23. He, Y., Lin, C., Alani, H.: Automatically extracting polarity-bearing topics for cross-domain sentiment classification. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, vol. 1, pp. 123–131. Association for Computational Linguistics (2011)

    Google Scholar 

  24. Turney, P.D.: Thumbs up or thumbs down?: semantic orientation applied to unsupervised classification of reviews. In: Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, pp. 417–424. Association for Computational Linguistics (2002)

    Google Scholar 

  25. Taboada, M., Brooke, J., Tofiloski, M., Voll, K., Stede, M.: Lexicon-based methods for sentiment analysis. Comput. linguist. 37(2), 267–307 (2011)

    Article  Google Scholar 

  26. Qiu, L.: Markov models of search state patterns in a hypertext information retrieval system. J. Am. Soc. Inf. Sci. 44(7), 413–427 (1993)

    Article  Google Scholar 

  27. Sarukkai, R.R.: Link prediction and path analysis using markov chains. Comput. Netw. 33(1), 377–386 (2000)

    Article  Google Scholar 

  28. Page, L., Brin, S., Motwani, R., Winograd, T.: The pagerank citation ranking: bringing order to the web. Technical Report, Stanford University (1999)

    Google Scholar 

  29. Mittendorf, E., Schäuble, P.: Document and passage retrieval based on hidden Markov models. In: Croft, B.W., van Rijsbergen, C.J. (eds.) SIGIR 1994, pp. 318–327. Springer, Heidelberg (1994)

    Google Scholar 

  30. Miller, D.R., Leek, T., Schwartz, R.M.: A hidden Markov model information retrieval system. In: Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 214–221. ACM (1999)

    Google Scholar 

  31. Pan, Y.-C., Lee, H.-Y., Lee, L.-S.: Interactive spoken document retrieval with suggested key terms ranked by a Markov decision process. IEEE Trans. Audio Speech Lang. Process. 20(2), 632–645 (2012)

    Article  Google Scholar 

  32. Xu, J., Weischedel, R.: Cross-lingual information retrieval using hidden Markov models. In: Proceedings of the 2000 Joint SIGDAT Conference on Empirical Methods in Natural Language Processing, Very Large Corpora: held in Conjunction with the 38th Annual Meeting of the Association for Computational Linguistics, vol. 13, pp. 95–103. Association for Computational Linguistics (2000)

    Google Scholar 

  33. Cao, G., Nie, J.-Y., Bai, J.: Using Markov chains to exploit word relationships in information retrieval. In: Large Scale Semantic Access to Content (Text, Image, Video, and Sound), pp. 388–402. Le Centre de Hautes Etudes Internationales D’Informatique Documentaire (2007)

    Google Scholar 

  34. Li, F., Huang, M., Zhu, X.: Sentiment analysis with global topics and local dependency. In: AAAI, vol. 10, pp. 1371–1376 (2010)

    Google Scholar 

  35. Mei, Q., Ling, X., Wondra, M., Su, H., Zhai, C.: Topic sentiment mixture: modeling facets and opinions in weblogs. In: Proceedings of the 16th International Conference on World Wide Web, pp. 171–180. ACM (2007)

    Google Scholar 

  36. Jo, Y., Oh, A.H.: Aspect and sentiment unification model for online review analysis. In: Proceedings of the Fourth ACM International Conference on Web Search and Data Mining, pp. 815–824. ACM (2011)

    Google Scholar 

  37. Jin, W., Ho, H.H., Srihari, R.K.: Opinionminer: a novel machine learning system for web opinion mining and extraction. In: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1195–1204. ACM (2009)

    Google Scholar 

  38. Nasukawa, T., Yi, J.: Sentiment analysis: capturing favorability using natural language processing. In: Proceedings of the 2nd International Conference on Knowledge Capture, pp. 70–77. ACM (2003)

    Google Scholar 

  39. Yi, K., Beheshti, J.: A hidden Markov model-based text classification of medical documents. J. Inf. Sci. 35, 67–81 (2008)

    Article  Google Scholar 

  40. Xu, R., Supekar, K., Huang, Y., Das, A., Garber, A.: Combining text classification and hidden Markov modeling techniques for structuring randomized clinical trial abstracts. In: AMIA Annual Symposium Proceedings 2006, p. 824. American Medical Informatics Association (2006)

    Google Scholar 

  41. Yi, K., Beheshti, J.: A text categorization model based on hidden Markov models. In: Proceedings of the Annual Conference of CAIS/Actes du congrès annuel de l’ACSI (2013)

    Google Scholar 

  42. Li, F., Dong, T.: Text categorization based on semantic cluster-hidden markov models. In: Tan, Y., Shi, Y., Mo, H. (eds.) ICSI 2013. LNCS, vol. 7929, pp. 200–207. Springer, Heidelberg (2013). doi:10.1007/978-3-642-38715-9_24

    Chapter  Google Scholar 

  43. Frasconi, P., Soda, G., Vullo, A.: Hidden Markov models for text categorization in multi-page documents. J. Intell. Inf. Syst. 18(2–3), 195–217 (2002)

    Article  Google Scholar 

  44. Porter, M.F.: An algorithm for suffix stripping. Program 14(3), 130–137 (1980)

    Article  Google Scholar 

  45. Domeniconi, G., Moro, G., Pasolini, R., Sartori, C.: A study on term weighting for text categorization: a novel supervised variant of tf.idf. In: Proceedings of the 4th International Conference on Data Management Technologies and Applications (2015)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Andrea Pagliarani .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Domeniconi, G., Moro, G., Pagliarani, A., Pasolini, R. (2016). Cross-Domain Sentiment Classification via Polarity-Driven State Transitions in a Markov Model. In: Fred, A., Dietz, J., Aveiro, D., Liu, K., Filipe, J. (eds) Knowledge Discovery, Knowledge Engineering and Knowledge Management. IC3K 2015. Communications in Computer and Information Science, vol 631. Springer, Cham. https://doi.org/10.1007/978-3-319-52758-1_8

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-52758-1_8

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-52757-4

  • Online ISBN: 978-3-319-52758-1

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics