A joint model for analyzing topic and sentiment dynamics from large-scale online news

Article
Part of the following topical collections:
  1. Special Issue on Web Information Systems Engineering

Abstract

Many of today’s online news websites and aggregator apps let users publish their opinions regardless of time and place. Existing work on topic-based sentiment analysis of product reviews cannot be applied directly to online news for two reasons: (1) the dynamic nature of news streams requires the topic and sentiment analysis model to be updated dynamically as well; (2) user interactions among news comments can easily lead to inaccurate topic extraction and sentiment classification. In this paper, we propose a novel probabilistic generative model (DTSA) that extracts topics and their specific sentiments from news streams and simultaneously analyzes their evolution over time. In DTSA, three timescale models (continuous, skip, and multiple) are studied to account for the historical dependencies of the sentiment-topic word distributions at the current epoch. We also consider the links among news comments to avoid errors caused by user interactions. To mine more interpretable topics, a Conditional Random Fields (CRF) model is adopted to label a set of meaningful phrases that augment the bag-of-words features. Finally, we derive distributed online inference procedures to update the model with newly arrived data, and we show the effectiveness of the proposed model on real-world data sets.

Keywords

Topic-based sentiment analysis; Topic model; User interaction; Conditional random fields; Online inference

Notes

Acknowledgements

This work is partially funded by the Research Council of Norway (No. 245469).

Copyright information

© Springer Science+Business Media, LLC 2017

Authors and Affiliations

  1. Department of Computer Science, Norwegian University of Science and Technology, Trondheim, Norway
