User group based emotion detection and topic discovery over short text

Abstract

In recent years, with the development of social media platforms, more and more people express their emotions online through short messages. Detecting emotions and their associated topics from such data is valuable. However, the feature sparsity of short texts poses challenges for joint topic-emotion models. In many cases, it is necessary to know not only what people think of specific topics, but also which individuals give similar feedback and what characteristics these users have. In this paper, we propose a user group based topic-emotion model named UGTE for emotion detection and topic discovery, which alleviates the feature sparsity problem of short texts. Specifically, the characteristics of each user are used to discover groups of individuals who share similar emotions, and UGTE aggregates the short texts within each group into long pseudo-documents. Experiments conducted on a real-world short text dataset validate the effectiveness of the proposed model.
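
To make the aggregation idea concrete, the following is a minimal sketch of pooling short texts into per-group pseudo-documents. It is not the UGTE inference procedure: UGTE discovers the user groups itself, whereas here the user-to-group mapping is assumed to be given. The function name `build_pseudo_documents`, the whitespace tokenizer, and the toy messages are all hypothetical.

```python
from collections import defaultdict


def build_pseudo_documents(messages, user_to_group):
    """Pool the tokens of all short texts written by users of the same group.

    messages: iterable of (user_id, text) pairs.
    user_to_group: dict mapping user_id -> group id. This mapping is an
        assumption of the sketch; UGTE infers the groups from user
        characteristics, which is not attempted here.
    Returns a dict mapping group id -> list of tokens (the pseudo-document).
    """
    grouped = defaultdict(list)
    for user_id, text in messages:
        # Naive lowercase whitespace tokenization, for illustration only.
        grouped[user_to_group[user_id]].extend(text.lower().split())
    return dict(grouped)


# Toy data: three short messages from three users in two groups.
messages = [
    ("u1", "great phone"),
    ("u2", "battery great"),
    ("u3", "awful delay"),
]
user_to_group = {"u1": 0, "u2": 0, "u3": 1}

pseudo_docs = build_pseudo_documents(messages, user_to_group)
for group_id, tokens in sorted(pseudo_docs.items()):
    print(group_id, tokens)
```

Each pseudo-document contains more word co-occurrences than any single message, which is why this kind of pooling can ease the sparsity problem for a topic model trained on the pooled text.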

Notes

  1. http://www.affective-sciences.org/researchmaterial

Acknowledgment

This work has been supported by Top-Up Fund (TFG-04) and Seed Fund (SFG-10) for General Research Fund / Early Career Scheme and Interdisciplinary Research Scheme of the Dean’s Research Fund 2018-19 (FLASS/DRF/IDS-3), Departmental Collaborative Research Fund 2019 (MIT/DCRF-R2/18-19), Funding Support to General Research Fund Proposal (RG 39/2019-2020R) and the Internal Research Grant (RG 90/2018-2019R) of The Education University of Hong Kong, and LEO Dr David P. Chan Institute of Data Science, Lingnan University, Hong Kong. The work has also been supported by the Research Grants Council of the Hong Kong Special Administrative Region, China (Collaborative Research Fund, project number C1031-18G).

Author information

Correspondence to Yanghui Rao.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix:

For clarity, numerical results of Figures 2–7 are provided as follows.

Table 9 Coherence@10 of UGTE_ID and baselines with different topic numbers when |G| = 10, where the best results are highlighted in boldface
Table 10 Coherence@20 of UGTE_ID and baselines with different topic numbers when |G| = 10, where the best results are highlighted in boldface
Table 11 Coherence@30 of UGTE_ID and baselines with different topic numbers when |G| = 10, where the best results are highlighted in boldface
Table 12 Accuracy of UGTE_ID and baselines with different topic numbers when |G| = 10, where the best results are highlighted in boldface
Table 13 Kappa Score of UGTE_ID and baselines with different topic numbers when |G| = 10, where the best results are highlighted in boldface
Table 14 The mean and variance of topic discovery and emotion discovery of UGTE_ID over different numbers of user groups, where the best results are highlighted in boldface
Table 15 The mean and variance values of impact of extremely short text on UGTE_ID and MSTM
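
The captions above report accuracy and a Kappa Score for emotion detection. As a rough illustration of how such scores can be computed, the sketch below implements accuracy and Cohen's kappa between gold and predicted labels. That Cohen's kappa is the variant behind Table 13 is an assumption, and the function names and toy labels are hypothetical.

```python
from collections import Counter


def accuracy(gold, predicted):
    """Fraction of items whose predicted label equals the gold label."""
    if len(gold) != len(predicted) or not gold:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


def cohen_kappa(gold, predicted):
    """Cohen's kappa: agreement corrected for chance agreement.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected from the two label distributions.
    """
    p_o = accuracy(gold, predicted)
    n = len(gold)
    gold_counts = Counter(gold)
    pred_counts = Counter(predicted)
    p_e = sum(gold_counts[c] * pred_counts[c] for c in gold_counts) / (n * n)
    if p_e == 1.0:
        # Both sides use a single identical label; kappa is undefined.
        # Returning 1.0 is a convention chosen for this sketch only.
        return 1.0
    return (p_o - p_e) / (1.0 - p_e)


# Toy example with two emotion labels.
gold = ["joy", "joy", "anger", "anger"]
predicted = ["joy", "joy", "anger", "joy"]
print(accuracy(gold, predicted))
print(cohen_kappa(gold, predicted))
```

Unlike plain accuracy, kappa discounts the agreement a classifier would reach by chance given the label frequencies, so it is more informative when emotion classes are imbalanced.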

About this article

Cite this article

Feng, J., Rao, Y., Xie, H. et al. User group based emotion detection and topic discovery over short text. World Wide Web (2019). https://doi.org/10.1007/s11280-019-00760-3

Keywords

  • Joint topic-emotion model
  • Short text modeling
  • User characteristics
  • User group based mining