
Topic Models with Sentiment Priors Based on Distributed Representations

Journal of Mathematical Sciences

In recent works, topic models for aspect-based opinion mining have been extended to automatically train sentiment priors for topic-word distributions, leading to automated discovery of sentiment words and improved sentiment classification. In this work, we propose an approach where sentiment priors are trained in the space of word embeddings; this allows us to both discover more aspect-related sentiment words and further improve classification. We also present an experimental study that validates our results.
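The core idea of the abstract — using distributed word representations to induce sentiment priors, so that words close to known sentiment words in embedding space inherit a sentiment-skewed prior — can be illustrated with a minimal sketch. The toy embeddings, seed words, and the `sentiment_prior` function below are illustrative assumptions, not the paper's actual model; in practice the embeddings would come from a model such as word2vec or GloVe trained on the review corpus.

```python
import numpy as np

# Toy 2-d embeddings standing in for real word vectors
# (hypothetical values chosen so that sentiment aligns with axis 0).
emb = {
    "good":      np.array([ 1.0, 0.2]),
    "bad":       np.array([-1.0, 0.1]),
    "excellent": np.array([ 0.9, 0.3]),
    "terrible":  np.array([-0.8, 0.2]),
    "table":     np.array([ 0.0, 1.0]),  # a neutral aspect word
}

def cos(u, v):
    """Cosine similarity between two embedding vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def sentiment_prior(word, pos_seeds=("good",), neg_seeds=("bad",)):
    """Score a word by embedding similarity to seed sentiment words.

    A positive score suggests a positive-sentiment prior, a negative
    score a negative one, and a score near zero a neutral/aspect word;
    such scores could then skew the Dirichlet priors of topic-word
    distributions in a joint sentiment-topic model.
    """
    p = max(cos(emb[word], emb[s]) for s in pos_seeds)
    n = max(cos(emb[word], emb[s]) for s in neg_seeds)
    return p - n

for w in ("excellent", "terrible", "table"):
    print(w, round(sentiment_prior(w), 2))
```

In this sketch, "excellent" scores positive and "terrible" negative even though neither is a seed word, which is the effect the abstract describes: sentiment words are discovered automatically through their position in the embedding space.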




Corresponding author

Correspondence to E. V. Tutubalina.

Additional information

Published in Zapiski Nauchnykh Seminarov POMI, Vol. 499, 2021, pp. 284–301.


Cite this article

Tutubalina, E.V., Nikolenko, S.I. Topic Models with Sentiment Priors Based on Distributed Representations. J Math Sci 273, 639–652 (2023). https://doi.org/10.1007/s10958-023-06525-8
