Social Network Analysis and Mining, Volume 3, Issue 4, pp 1149–1163

Good location, terrible food: detecting feature sentiment in user-generated reviews

  • Mario Cataldi
  • Andrea Ballatore
  • Ilaria Tiddi
  • Marie-Aude Aufaure
Original Article

Abstract

A growing corpus of informal online reviews is generated every day by non-experts, on social networks and blogs, about an unlimited range of products and services. Users not only express holistic opinions, but often focus on the specific features that interest them. The automatic understanding of “what people think” at the feature level can greatly support decision making, for both consumers and producers. In this paper, we present an approach to feature-level sentiment detection that integrates natural language processing with statistical techniques, in order to extract users’ opinions about specific features of products and services from user-generated reviews. First, we extract domain features, and each review is modelled as a lexical dependency graph. Second, for each review, we estimate the polarity relative to the features by leveraging the syntactic dependencies between the terms. The approach is evaluated against a ground truth consisting of a set of user-generated reviews, manually annotated by 39 human subjects and available online, showing its human-like ability to capture feature-level opinions.
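The two steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the dependency edges for the phrase "Good location, terrible food" are written by hand rather than produced by a parser, the polarity lexicon and its values are invented for the example, and the 1 / distance² damping and the `max_dist` cut-off are assumptions chosen only to show how graph proximity to a feature could weight the sentiment of surrounding terms.

```python
from collections import defaultdict, deque

# Toy polarity lexicon (hypothetical values, not taken from any real resource).
LEXICON = {"good": 0.5, "terrible": -1.0}

# Hand-written dependency edges for "Good location, terrible food".
# A real system would obtain these from a dependency parser.
EDGES = [("location", "good"), ("food", "terrible"), ("location", "food")]


def build_graph(edges):
    """Return an undirected adjacency map from (head, dependent) pairs."""
    graph = defaultdict(set)
    for head, dep in edges:
        graph[head].add(dep)
        graph[dep].add(head)
    return graph


def feature_polarity(graph, feature, lexicon, max_dist=2):
    """Sum lexicon polarities of terms near `feature`, damped by 1 / distance**2."""
    score = 0.0
    seen = {feature}
    queue = deque([(feature, 0)])
    # Breadth-first search, so each term is reached at its shortest distance.
    while queue:
        term, dist = queue.popleft()
        if dist > 0 and term in lexicon:
            score += lexicon[term] / dist ** 2
        if dist < max_dist:
            for neighbour in graph[term]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append((neighbour, dist + 1))
    return score


graph = build_graph(EDGES)
for feature in ("location", "food"):
    print(feature, feature_polarity(graph, feature, LEXICON))
```

With these toy inputs, "location" receives a positive score and "food" a negative one, because each feature is closest in the graph to its own modifier. The damping keeps the opinion word attached to the other feature from dominating.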

Keywords

Sentiment analysis, Opinion mining, Natural language processing, Feature detection, Dependency graphs

Copyright information

© Springer-Verlag Wien 2013

Authors and Affiliations

  • Mario Cataldi (1)
  • Andrea Ballatore (2)
  • Ilaria Tiddi (3)
  • Marie-Aude Aufaure (1)

  1. École Centrale Paris, Paris, France
  2. University College Dublin, Dublin, Ireland
  3. Knowledge Media Institute, The Open University, Milton Keynes, UK