Learning patterns for discovering domain-oriented opinion words
- First Online:
Sentiment analysis is a challenging task that attracted increasing interest during the last years. The availability of online data along with the business interest to keep up with consumer feedback generates a constant demand for online analysis of user-generated content. A key role to this task plays the utilization of domain-specific lexicons of opinion words that enables algorithms to classify short snippets of text into sentiment classes (positive, negative). This process is known as dictionary-based sentiment analysis. The related work tends to solve this lexicon identification problem by either exploiting a corpus and a thesaurus or by manually defining a set of patterns that will extract opinion words. In this work, we propose an unsupervised approach for discovering patterns that will extract domain-specific dictionary. Our approach (DidaxTo) utilizes opinion modifiers, sentiment consistency theories, polarity assignment graphs and pattern similarity metrics. The outcome is compared against lexicons extracted by the state-of-the-art approaches on a sentiment analysis task. Experiments on user reviews coming from a diverse set of products demonstrate the utility of the proposed method. An implementation of the proposed approach in an easy to use application for extracting opinion words from any domain and evaluate their quality is also presented.