Abstract
With the help of Internet and Web technologies, more and more consumers tend to seek opinions online before making purchase decisions. However, with the ever-increasing volume of user generated reviews, people are overwhelmed with the amount of data they have. Thus there is a great need for a system that can summarize the reviews and produce a set of aspects being mentioned in the reviews together with the pros/cons being expressed to them. To address the need, this paper proposes a new probabilistic topic model, SentiLDA, for mining reviews (unstructured data) and their ratings (structured data) jointly to detect the product/service aspects and their corresponding positive and negative opinions simultaneously. A key feature of SentiLDA is that it is capable of mining positive and negative sub-topics under the same aspect without the need of sentiment seed words. Experiment results show that the performance of SentiLDA outperforms the other related state-of-the-art models in detecting product/service aspects and their corresponding sentiments in reviews.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
References
Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent dirichlet allocation. J. Mach. Learn. Res. 3, 993–1022 (2003)
Diao, Q., Qiu, M., Wu, C.Y., Smola, A.J., Jiang, J., Wang, C.: Jointly modeling aspects, ratings and sentiments for movie recommendation (jmars). In: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 193–202. ACM, August 2014
Fellbaum, C.: WordNet. Blackwell Publishing Ltd. (1998)
Ganu, G., Elhadad, N., Marian, A.: Beyond the stars: improving rating predictions using review text content. In: WebDB, vol. 9, pp. 1–6, June 2009
Griffiths, T.L., Steyvers, M.: Finding scientific topics. Proc. Natl. Acad. Sci. 101(suppl. 1), 5228–5235 (2004)
Horrigan, J.A.: Online shopping. In: Pew Internet & American Life Project Report (2008)
Hu, M., Liu, B.: Mining opinion features in customer reviews. In: AAAI, vol. 4, No. 4, pp. 755–760, July 2004
Jo, Y., Oh, A.H.: Aspect and sentiment unification model for online review analysis. In: Proceedings of the Fourth ACM International Conference on Web Search and Data Mining, pp. 815–824. ACM, February 2011
Lin, C., He, Y.: Joint sentiment/topic model for sentiment analysis. In: Proceedings of the 18th ACM Conference on Information and Knowledge Management, pp. 375–384. ACM, November 2009
Manning, C.D., Surdeanu, M., Bauer, J., Finkel, J., Bethard, S.J., McClosky, D.: The Stanford CoreNLP natural language processing toolkit. In: Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pp. 55–60, June 2014
Mei, Q., Ling, X., Wondra, M., Su, H., Zhai, C.: Topic sentiment mixture: modeling facets and opinions in weblogs. In: Proceedings of the 16th International Conference on World Wide Web, pp. 171–180. ACM, May 2007
Titov, I., McDonald, R.: Modeling online reviews with multi-grain topic models. In: Proceedings of the 17th International Conference on World Wide Web, pp. 111–120. ACM, April 2008
Turney, P.D.: Thumbs up or thumbs down?: semantic orientation applied to unsupervised classification of reviews. In: Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, pp. 417–424. Association for Computational Linguistics, July 2002
Wallach, H.M., Mimno, D.M., McCallum, A.: Rethinking LDA: why priors matter. In: Advances in Neural Information Processing Systems, pp. 1973–1981 (2009)
Wang, H., Lu, Y., Zhai, C.: Latent aspect rating analysis on review text data: a rating regression approach. In: Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 783–792. ACM, July 2010
Wu, N., Liu, F., Zhang, J.: A study on consistency of cross-site online reviews. In: The 10th IEEE International Conference on Pervasive, Intelligence and Computing, December 2013
Xu, X., Tan, S., Liu, Y., Cheng, X., Lin, Z.: Towards jointly extracting aspects and aspect-specific sentiment knowledge. In: Proceedings of the 21st ACM International Conference on Information and Knowledge Management, pp. 1895–1899. ACM, October 2012
Zhai, K., Boyd-Graber, J., Asadi, N., Alkhouja, M.L.: Mr. LDA: a flexible large scale topic modeling package using variational inference in mapreduce. In: Proceedings of the 21st International Conference on World Wide Web, pp. 879–888. ACM, April 2012
Zhai, Z., Liu, B., Xu, H., Jia, P.: Clustering product features for opinion mining. In: Proceedings of the Fourth ACM International Conference on Web Search and Data Mining, pp. 347–354. ACM, February 2011
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Liu, F., Wu, N. (2016). SentiLDA — An Effective and Scalable Approach to Mine Opinions of Consumer Reviews by Utilizing Both Structured and Unstructured Data. In: Madria, S., Hara, T. (eds) Big Data Analytics and Knowledge Discovery. DaWaK 2016. Lecture Notes in Computer Science(), vol 9829. Springer, Cham. https://doi.org/10.1007/978-3-319-43946-4_6
Download citation
DOI: https://doi.org/10.1007/978-3-319-43946-4_6
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-43945-7
Online ISBN: 978-3-319-43946-4
eBook Packages: Computer ScienceComputer Science (R0)