Probabilistic Approaches for Sentiment Analysis: Latent Dirichlet Allocation for Ontology Building and Sentiment Extraction

Sentiment Analysis and Ontology Engineering

Part of the book series: Studies in Computational Intelligence (SCI, volume 639)

Abstract

People’s opinions have always driven human choices and behaviors, even before the diffusion of Information and Communication Technologies. Thanks to the World Wide Web and the widespread adoption of online collaborative tools such as blogs, focus groups, review web sites, forums, and social networks, millions of messages appear on the web, which is becoming a rich source of opinionated data. Sentiment analysis refers to the use of natural language processing, text analysis and computational linguistics to identify and extract subjective information in documents, comments and posts. The aim of this work is to show how a probabilistic approach based on Latent Dirichlet Allocation (LDA), used as a Sentiment Grabber, can serve as an effective Sentiment Analyzer. Through this approach, a graph, the Mixed Graph of Terms, can be automatically extracted for a set of documents belonging to the same knowledge domain. This graph, which contains a set of Mixed Graph of Terms, can be transformed into a Sentiment Oriented Terminological Ontology thanks to a methodology that involves the introduction of an annotated lexicon such as WordNet. The chapter shows how the obtained ontology can be discriminative for sentiment classification. The proposed method has been tested in different contexts: standard datasets and comments extracted from social networks. The experimental evaluation shows that the proposed approach is effective and the results are quite satisfactory.

Notes

  1. The chosen value of \(\beta\) makes it possible to weight the presence of adjective/adverb pairs in a comment more heavily than the simple occurrence of positive/negative adjectives or adverbs.

Author information

Corresponding author

Correspondence to Francesco Colace.

Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Colace, F., De Santo, M., Greco, L., Moscato, V., Picariello, A. (2016). Probabilistic Approaches for Sentiment Analysis: Latent Dirichlet Allocation for Ontology Building and Sentiment Extraction. In: Pedrycz, W., Chen, SM. (eds) Sentiment Analysis and Ontology Engineering. Studies in Computational Intelligence, vol 639. Springer, Cham. https://doi.org/10.1007/978-3-319-30319-2_4

  • DOI: https://doi.org/10.1007/978-3-319-30319-2_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-30317-8

  • Online ISBN: 978-3-319-30319-2

  • eBook Packages: Engineering, Engineering (R0)
