Abstract
Sentiment analysis is a challenging task that has attracted increasing interest in recent years. The availability of online data, along with the business interest in keeping up with consumer feedback, generates a constant demand for online analysis of user-generated content. Domain-specific lexicons of opinion words play a key role in this task, enabling algorithms to classify short snippets of text into sentiment classes (positive, negative). This process is known as dictionary-based sentiment analysis. Related work tends to solve this lexicon identification problem either by exploiting a corpus and a thesaurus or by manually defining a set of patterns that extract opinion words. In this work, we propose an unsupervised approach for discovering patterns that extract a domain-specific dictionary. Our approach (DidaxTo) utilizes opinion modifiers, sentiment consistency theories, polarity assignment graphs and pattern similarity metrics. The resulting lexicon is compared, on a sentiment analysis task, against lexicons extracted by state-of-the-art approaches. Experiments on user reviews from a diverse set of products demonstrate the utility of the proposed method. We also present an implementation of the proposed approach as an easy-to-use application for extracting opinion words from any domain and evaluating their quality.
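To make the notion of dictionary-based sentiment analysis concrete, the following is a minimal sketch of how a lexicon of opinion words might be used to classify a short snippet. It is not the DidaxTo method: the lexicon entries, the `classify` function, the one-token negation rule and the "neutral" fallback are all illustrative assumptions.

```python
import re

# Hypothetical toy lexicon (word -> polarity). In practice this would be a
# domain-specific dictionary extracted from a review corpus.
LEXICON = {
    "great": 1, "sharp": 1, "reliable": 1,
    "noisy": -1, "blurry": -1, "flimsy": -1,
}

# Assumed simplification: a negator flips only the opinion word right after it.
NEGATORS = {"not", "never", "no"}


def classify(snippet: str) -> str:
    """Sum the polarities of known opinion words and map the sign to a class."""
    tokens = re.findall(r"[a-z']+", snippet.lower())
    score = 0
    for i, token in enumerate(tokens):
        polarity = LEXICON.get(token, 0)
        if polarity and i > 0 and tokens[i - 1] in NEGATORS:
            polarity = -polarity
        score += polarity
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    # Assumed fallback for snippets with no net polarity.
    return "neutral"


if __name__ == "__main__":
    print(classify("The screen is sharp and reliable"))
    print(classify("Not great, and the fan is noisy"))
```

The quality of such a classifier depends almost entirely on how well the lexicon covers the vocabulary of the target domain, which is the problem the proposed approach addresses.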
Acknowledgements
This project received funding from the European Union Horizon 2020 Programme (Horizon2020/2014-2020), under Grant Agreement 688380.
About this article
Cite this article
Agathangelou, P., Katakis, I., Koutoulakis, I. et al. Learning patterns for discovering domain-oriented opinion words. Knowl Inf Syst 55, 45–77 (2018). https://doi.org/10.1007/s10115-017-1072-y
DOI: https://doi.org/10.1007/s10115-017-1072-y