Learning patterns for discovering domain-oriented opinion words

  • Pantelis Agathangelou
  • Ioannis Katakis
  • Ioannis Koutoulakis
  • Fotis Kokkoras
  • Dimitrios Gunopulos
Regular Paper

Abstract

Sentiment analysis is a challenging task that has attracted increasing interest in recent years. The availability of online data, along with the business interest in keeping up with consumer feedback, generates a constant demand for online analysis of user-generated content. A key role in this task is played by domain-specific lexicons of opinion words, which enable algorithms to classify short snippets of text into sentiment classes (positive, negative). This process is known as dictionary-based sentiment analysis. Related work tends to solve the lexicon identification problem either by exploiting a corpus and a thesaurus or by manually defining a set of patterns that extract opinion words. In this work, we propose an unsupervised approach for discovering patterns that extract a domain-specific dictionary. Our approach (DidaxTo) utilizes opinion modifiers, sentiment consistency theories, polarity assignment graphs and pattern similarity metrics. The resulting lexicons are compared against lexicons extracted by state-of-the-art approaches on a sentiment analysis task. Experiments on user reviews from a diverse set of products demonstrate the utility of the proposed method. We also present an implementation of the proposed approach as an easy-to-use application for extracting opinion words from any domain and evaluating their quality.
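To make the notion of dictionary-based sentiment analysis concrete, the following is a minimal sketch of how a lexicon of opinion words can classify a short snippet. It is an illustration only and not the DidaxTo method: the toy lexicon, the negator list, the `classify` function and the extra "neutral" outcome for ties are all assumptions introduced here.

```python
import re

# Hypothetical toy lexicon (assumption): word -> polarity (+1 positive, -1 negative).
# A real domain-specific lexicon would be extracted from a review corpus.
LEXICON = {
    "great": 1, "excellent": 1, "sharp": 1,
    "poor": -1, "noisy": -1, "blurry": -1,
}

# Hypothetical negators (assumption): flip the polarity of the next opinion word.
NEGATORS = {"not", "no", "never"}


def classify(snippet, lexicon=LEXICON):
    """Sum the polarities of the opinion words found in the snippet."""
    tokens = re.findall(r"[a-z']+", snippet.lower())
    score = 0
    for i, token in enumerate(tokens):
        polarity = lexicon.get(token, 0)
        # Flip polarity when the opinion word directly follows a negator.
        if polarity and i > 0 and tokens[i - 1] in NEGATORS:
            polarity = -polarity
        score += polarity
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"  # assumption: ties and snippets without opinion words


if __name__ == "__main__":
    for text in ("The screen is sharp and the sound is excellent",
                 "The fan is noisy",
                 "The picture is not blurry"):
        print(text, "->", classify(text))
```

The quality of such a classifier depends almost entirely on the lexicon, which is why the word "sharp" may be positive for a display but negative for a toy, and why extracting the dictionary per domain matters.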


Copyright information

© Springer-Verlag London 2017

Authors and Affiliations

  1. University of Athens, Athens, Greece
  2. Technological Educational Institute of Thessaly, Larissa, Greece
