Learning patterns for discovering domain-oriented opinion words

  • Regular Paper
Knowledge and Information Systems

Abstract

Sentiment analysis is a challenging task that has attracted increasing interest in recent years. The availability of online data, along with the business interest in keeping up with consumer feedback, generates a constant demand for online analysis of user-generated content. Domain-specific lexicons of opinion words play a key role in this task, enabling algorithms to classify short snippets of text into sentiment classes (positive, negative). This process is known as dictionary-based sentiment analysis. Related work tends to solve the lexicon identification problem either by exploiting a corpus and a thesaurus or by manually defining a set of patterns that extract opinion words. In this work, we propose an unsupervised approach for discovering patterns that extract a domain-specific dictionary. Our approach (DidaxTo) utilizes opinion modifiers, sentiment consistency theories, polarity assignment graphs and pattern similarity metrics. The outcome is compared, on a sentiment analysis task, against lexicons extracted by state-of-the-art approaches. Experiments on user reviews covering a diverse set of products demonstrate the utility of the proposed method. We also present an implementation of the proposed approach as an easy-to-use application for extracting opinion words from any domain and evaluating their quality.
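To make the notion of dictionary-based sentiment analysis concrete, the following is a minimal sketch of how a lexicon of opinion words could be used to classify a short snippet. It is not the DidaxTo method. The lexicon entries, the negator list, the `classify` function and the extra "neutral" label for snippets without evidence are all assumptions introduced for illustration.

```python
# Toy domain lexicon: opinion word -> polarity (+1 positive, -1 negative).
# These entries are hypothetical and only serve the illustration.
LEXICON = {"sharp": 1, "reliable": 1, "blurry": -1, "noisy": -1}

# Hypothetical negation words; a negator flips the next opinion word found.
NEGATORS = {"not", "never", "no"}


def classify(snippet, lexicon=LEXICON):
    """Classify a snippet by summing the polarities of its opinion words."""
    score = 0
    negate = False
    for token in snippet.lower().split():
        word = token.strip(".,!?;:")
        if word in NEGATORS:
            negate = True
            continue
        polarity = lexicon.get(word, 0)
        if polarity:
            score += -polarity if negate else polarity
            negate = False  # the negation has been consumed
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"  # no opinion words found, or they cancel out


print(classify("The screen is sharp and reliable."))
print(classify("The lens is not sharp"))
```

The quality of such a classifier depends almost entirely on the lexicon, which is why a word such as "sharp" needs a domain-specific polarity: it may be positive for camera lenses and irrelevant or negative elsewhere.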

Notes

  1. http://deixto.com/didaxto.

  2. http://www.pricegrabber.com.

  3. http://deixto.com/.

  4. http://deixto.com/didaxto.

Acknowledgements

This project received funding from the European Union Horizon 2020 Programme (Horizon2020/2014-2020), under Grant Agreement 688380.

Author information

Corresponding author

Correspondence to Ioannis Katakis.

About this article

Cite this article

Agathangelou, P., Katakis, I., Koutoulakis, I. et al. Learning patterns for discovering domain-oriented opinion words. Knowl Inf Syst 55, 45–77 (2018). https://doi.org/10.1007/s10115-017-1072-y

  • DOI: https://doi.org/10.1007/s10115-017-1072-y
