Advertisement

Hybrid attribute based sentiment classification of online reviews for consumer intelligence

  • Barkha Bansal
  • Sangeet Srivastava
Article
  • 23 Downloads

Abstract

Rich online consumer reviews (OCR) can be mined to gain valuable insights, beneficial for both brands and future buyers. Recently, aspect based sentiment classification have shown excellent results for fine grained sentiment analysis of OCR. However, there are only few studies so far that rely on both explicitly deriving sentiment using syntactic features, and capturing implicit contextual word relations for the task of aspect based sentiment classification. In this paper, we propose a novel method: Hybrid Attribute Based Sentiment Classification (HABSC) with the aim to derive sentiment orientation of OCR by capturing implicit word relations and incorporating domain specific knowledge. First, we detect the most frequent bigrams and trigrams in the corpus, followed by POS tagging to retain aspect descriptions and opinion words. Then, we employ TFIDF (term frequency inverse document frequency) to represent each document, followed by automatically extracting optimal number of topics in the given corpus. All the adjectives and adverbs are labelled using domain specific knowledge and pre-existing lexicons. Lastly, we find sentiment orientation of each review under the assumption that each review is a mixture of weighted and sentiment labelled attributes. We test the efficiency of our method using datasets from two different domains: hotel reviews from TripAdvisor.com and mobile phone reviews from Amazon.com. Results show that, the classification accuracy of HABSC significantly exceeds various state-of-the-art methods including aspect-based sentiment classification and supervised classification using distributed word and paragraph vectors. Our method also exhibits less computational time as compared to distributed vectorization schemes.

Keywords

Consumer intelligence Topic sentiment Semantic sentiment Distributed vector Lexicons n gram Part-of-speech 

Notes

Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

References

  1. 1.
    Abburi H, Akkireddy ESA, Gangashetti S, Mamidi R (2016) Multimodal sentiment analysis of telugu songs. In: SAAIP@ IJCAI, pp 48–52Google Scholar
  2. 2.
    Abdelwahab O, Elmaghraby A (2016) Uofl at semeval-2016 task 4: multi domain word2vec for twitter sentiment classification. In: Proceedings of the 10th international workshop on semantic evaluation (SemEval-2016), pp 164–170Google Scholar
  3. 3.
    Al-Amin M, Islam MS, Uzzal SD (2017) Sentiment analysis of bengali comments with word2vec and sentiment information of words. In: International conference on electrical, computer and communication engineering (ECCE). IEEE, pp 186-190Google Scholar
  4. 4.
    Alam MH, Ryu WJ, Lee S (2016) Joint multi-grain topic sentiment: modeling semantic aspects for online reviews. Inform Sci 339:206–223CrossRefGoogle Scholar
  5. 5.
    Appel O, Chiclana F, Carter J, Fujita H (2018) Successes and challenges in developing a hybrid approach to sentiment analysis. Appl Intell 48(5):1176–1188Google Scholar
  6. 6.
    Bansal B, Srivastava S (2018) Sentiment classification of online consumer reviews using word vector representations. Procedia Computer Science 132:1147–1153CrossRefGoogle Scholar
  7. 7.
    Blei DM, Ng AY, Jordan MI (2003) Latent dirichlet allocation. J Mach Learn Res 3(Jan):993–1022zbMATHGoogle Scholar
  8. 8.
    Cerón-Guzmán JA, León-Guzmán E (2016) A sentiment analysis system of spanish tweets and its application in Colombia 2014 presidential election. In: 2016 IEEE international conferences on big data and cloud computing (BDCloud), social computing and networking (socialcom), sustainable computing and communications (sustaincom)(BDCloud-socialcom-sustaincom), pp 250–257Google Scholar
  9. 9.
    Chen R, Xu W (2017) The determinants of online customer ratings: a combined domain ontology and topic text analytics approach. Electron Commer Res 17(1):31–50CrossRefGoogle Scholar
  10. 10.
    Chen R, Zheng Y, Xu W, Liu M, Wang J (2018) Secondhand seller reputation in online markets: a text analytics framework. Decis Support Syst 108:96–106CrossRefGoogle Scholar
  11. 11.
    Dataset (2016) Amazon mobile review dataset. https://www.kaggle.com/PromptCloudHQ/amazon-reviews-unlocked-mobile-phones/data. Online; Accessed Nov 2017
  12. 12.
    Giatsoglou M, Vozalis MG, Diamantaras K, Vakali A, Sarigiannidis G, Chatzisavvas KC (2017) Sentiment analysis leveraging emotions and word embeddings. Expert Syst Appl 69:214–224CrossRefGoogle Scholar
  13. 13.
    Hogenboom A, Heerschop B, Frasincar F, Kaymak U, de Jong F (2014) Multi-lingual support for lexicon-based sentiment analysis guided by semantics. Decis Support Syst 62:43–53CrossRefGoogle Scholar
  14. 14.
    Hu M, Liu B (2004) Mining opinion features in customer reviews. In: AAAI, vol 4. pp 755-760Google Scholar
  15. 15.
    Jiang S, Lewris J, Voltmer M, Wang H (2016) Integrating rich document representations for text classification. In: 2016 IEEE systems and information engineering design symposium (SIEDS). IEEE, 303-308Google Scholar
  16. 16.
    Jiang Y, Song X, Harrison J, Quegan S, Maynard D (2017) Comparing attitudes to climate change in the media using sentiment analysis based on latent dirichlet allocation. In: Proceedings of the 2017 EMNLP workshop natural language processing meets journalism, pp 25–30Google Scholar
  17. 17.
    Jo Y, Oh AH (2011) Aspect and sentiment unification model for online review analysis. In: Proceedings of the fourth ACM international conference on Web search and data mining. ACM, pp 815–824Google Scholar
  18. 18.
    Karami A, Gangopadhyay A, Zhou B, Kharrazi H (2017) Fuzzy approach topic discovery in health and medical corpora. Int J Fuzzy Syst 20(4):1334–1345CrossRefGoogle Scholar
  19. 19.
    Karami A, Dahl AA, Turner-McGrievy G, Kharrazi H, Shaw G (2018) Characterizing diabetes, diet, exercise, and obesity comments on twitter. Int J Inf Manage 38(1):1–6CrossRefGoogle Scholar
  20. 20.
    Kim HK, Kim M (2016) Model-induced term-weighting schemes for text classification. Appl Intell 45 (1):30–43CrossRefGoogle Scholar
  21. 21.
    Koltcov S, Koltsova O, Nikolenko S (2014) Latent dirichlet allocation: stability and applications to studies of user-generated content. In: Proceedings of the 2014 ACM conference on web science. ACM, pp 161-165Google Scholar
  22. 22.
    Le Q, Mikolov T (2014) Distributed representations of sentences and documents. In: International conference on machine learning, pp 1188–1196Google Scholar
  23. 23.
    Li G, Liu F (2014) Sentiment analysis based on clustering: a framework in improving accuracy and recognizing neutral opinions. Appl Intell 40(3):441–452CrossRefGoogle Scholar
  24. 24.
    Lin C, He Y (2009) Joint sentiment/topic model for sentiment analysis. In: Proceedings of the 18th ACM conference on Information and knowledge management. ACM, pp 375–384Google Scholar
  25. 25.
    Liu B (2012) Sentiment analysis and opinion mining. Synthesis Lectures On Human Language Technologies 5(1):1–167CrossRefGoogle Scholar
  26. 26.
    Liu B, Hu M, Cheng J (2005) Opinion observer: analyzing and comparing opinions on the web. In: Proceedings of the 14th international conference on world wide web. ACM, pp 342–351Google Scholar
  27. 27.
    Liu H (2017) Sentiment analysis of citations using word2vec. arXiv:170400177
  28. 28.
    Liu X, Burns AC, Hou Y (2017) An investigation of brand-related user-generated content on twitter. J Advert 46(2):236–247CrossRefGoogle Scholar
  29. 29.
    Liu Y, Jiang C, Zhao H (2018) Using contextual features and multi-view ensemble learning in product defect identification from online discussion forums. Decis Support Syst 105:1–12CrossRefGoogle Scholar
  30. 30.
    Ma S, Zhang C, He D (2016) Document representation methods for clustering bilingual documents. Proceedings of the Association for Information Science and Technology 53(1):1–10Google Scholar
  31. 31.
    Mei Q, Ling X, Wondra M, Su H, Zhai C (2007) Topic sentiment mixture: modeling facets and opinions in weblogs. In: Proceedings of the 16th international conference on world wide web. ACM, pp 171–180Google Scholar
  32. 32.
    Mikolov T, Chen K, Corrado G, Dean J (2013) Efficient estimation of word representations in vector space. arXiv:13013781
  33. 33.
    Mikolov T, Sutskever I, Chen K, Corrado GS, Dean J (2013) Distributed representations of words and phrases and their compositionality. In: Advances in neural information processing systems, pp 3111–3119Google Scholar
  34. 34.
    Mikolov T, Wt Yih, Zweig G (2013) Linguistic regularities in continuous space word representations. In: Proceedings of the 2013 conference of the north american chapter of the association for computational linguistics: human language technologies, pp 746-751Google Scholar
  35. 35.
    Nielsen FÅ (2011) A new anew: evaluation of a word list for sentiment analysis in microblogs. arXiv:11032903
  36. 36.
    Panichella A, Dit B, Oliveto R, Di Penta M, Poshynanyk D, De Lucia A (2013) How to effectively use topic models for software engineering tasks? an approach based on genetic algorithms. In: 2013 35th international conference on software engineering (ICSE). IEEE, pp 522–531Google Scholar
  37. 37.
    Pham DH, Le AC (2017) Learning multiple layers of knowledge representation for aspect based sentiment analysis. Data Knowl EngGoogle Scholar
  38. 38.
    Qiang J, Li Y, Yuan Y, Liu W (2018) Snapshot ensembles of non-negative matrix factorization for stability of topic modeling. Appl Intell:1–13Google Scholar
  39. 39.
    Qiao Z, Zhang X, Zhou M, Wang GA, Fan W (2017) A domain oriented lda model for mining product defects from online customer reviewsGoogle Scholar
  40. 40.
    Rehurek R. Gensim. https://radimrehurek.com/gensim/models/phrases.html. Last accessed Nov 2017
  41. 41.
    Rehurek R, Sojka P (2010) Software framework for topic modelling with large corpora. In: Proceedings of the LREC 2010 workshop on new challenges for NLP frameworks. CiteseerGoogle Scholar
  42. 42.
    Salehan M, Kim DJ (2016) Predicting the performance of online consumer reviews: a sentiment mining approach to big data analytics. Decis Support Syst 81:30–40CrossRefGoogle Scholar
  43. 43.
    Sanguansat P (2016) Paragraph2vec-based sentiment analysis on social media for business in thailand. In: 2016 8th international conference on knowledge and smart technology (KST). IEEE, pp 175–178Google Scholar
  44. 44.
    Schwenk H (2007) Continuous space language models. Comput Speech Lang 21(3):492–518CrossRefGoogle Scholar
  45. 45.
    Spacy https://spacy.io. Last accessed Nov 2017
  46. 46.
    Tripathy A, Agrawal A, Rath SK (2016) Classification of sentiment reviews using n-gram machine learning approach. Expert Syst Appl 57:117–126CrossRefGoogle Scholar
  47. 47.
    Wang T, Cai Y, Hf Leung, Lau RY, Li Q, Min H (2014) Product aspect extraction supervised with online domain knowledge. Knowl-Based Syst 71:86–100CrossRefGoogle Scholar
  48. 48.
    Wang W, Wang H, Song Y (2017) Ranking product aspects through sentiment analysis of online reviews. J Exp Theor Artif Intell 29(2):227–246CrossRefGoogle Scholar
  49. 49.
    Wang Z, Gu S, Xu X (2018) Gslda: lda-based group spamming detection in product reviews. Appl Intell 48(9):3094–3107CrossRefGoogle Scholar
  50. 50.
    Xianghua F, Guo L, Yanyan G, Zhiqiang W (2013) Multi-aspect sentiment analysis for chinese online social reviews based on topic modeling and hownet lexicon. Knowl-Based Syst 37:186–195CrossRefGoogle Scholar
  51. 51.
    Xin Y, Yang J, Xie ZQ, Zhang JP (2015) An overlapping semantic community detection algorithm base on the arts multiple sampling models. Expert Syst Appl 42(7):3420–3432CrossRefGoogle Scholar
  52. 52.
    Yan X, Guo J, Lan Y, Cheng X (2013) A biterm topic model for short texts. In: Proceedings of the 22nd international conference on world wide web. ACM, pp 1445–1456Google Scholar
  53. 53.
    Yao Y, Li X, Liu X, Liu P, Liang Z, Zhang J, Mai K (2017) Sensing spatial distribution of urban land use by integrating points-of-interest and google word2vec model. Int J Geogr Inf Sci 31(4):825–848CrossRefGoogle Scholar
  54. 54.
    Yu D, Mu Y, Jin Y (2017) Rating prediction using review texts with underlying sentiments. Inf Process Lett 117:10–18MathSciNetCrossRefGoogle Scholar
  55. 55.
    Zainuddin N, Selamat A, Ibrahim R (2017) Hybrid sentiment classification on twitter aspect-based sentiment analysis. Appl Intell 48(5):1–15Google Scholar
  56. 56.
    Zhang D, Xu H, Su Z, Xu Y (2015) Chinese comments sentiment classification based on word2vec and svmperf. Expert Syst Appl 42(4):1857–1863CrossRefGoogle Scholar
  57. 57.
    Zirn C, Stuckenschmidt H (2014) Multidimensional topic analysis in political texts. Data Knowl Eng 90:38–53CrossRefGoogle Scholar

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. 1.Department of Applied SciencesThe NorthCap UniversityGurugramIndia

Personalised recommendations