MBMN: Multivariate Bernoulli Mixture Network for News Emotion Analysis

  • Xue Zhao
  • Ying Zhang
  • Wenya Guo
  • Xiaojie Yuan
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11642)


In text classification, labels, not just text features, are crucial to the final classification performance, yet most existing work does not consider them. In the context of emotions, labels are correlated and some of them can co-occur. Such label features and label dependencies, used as auxiliary information, can benefit text classification.

In this paper, we propose a Multivariate Bernoulli Mixture Network (MBMN) to learn a text representation as well as a label representation. Specifically, it generates the text representation with a simple convolutional neural network and learns a mixture of multivariate Bernoulli distributions that models the label distribution as well as label dependencies. Labels can be sampled from this distribution and used to generate a label representation. With both the text representation and the label representation, MBMN can achieve better classification performance.
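The mixture of multivariate Bernoulli distributions at the core of the method can be sketched as follows. This is a minimal illustration with hand-picked mixture weights (`pi`) and Bernoulli means (`mu`); in the actual model these parameters would be predicted by the network from the text representation:

```python
import numpy as np

# Illustrative setup: K mixture components over L binary emotion labels.
# pi and mu are toy values, not parameters from the paper.
rng = np.random.default_rng(0)
K, L = 3, 6                               # components, emotion labels
pi = np.array([0.5, 0.3, 0.2])            # mixture weights, sum to 1
mu = rng.uniform(0.1, 0.9, size=(K, L))   # per-component Bernoulli means

def log_prob(y, pi, mu):
    """log p(y) = log sum_k pi_k * prod_j mu_kj^y_j (1-mu_kj)^(1-y_j)."""
    # Per-component log-probability of the binary label vector y.
    log_comp = (y * np.log(mu) + (1 - y) * np.log(1 - mu)).sum(axis=1)
    a = np.log(pi) + log_comp
    m = a.max()                            # log-sum-exp for stability
    return m + np.log(np.exp(a - m).sum())

def sample(pi, mu, rng):
    """Draw a label vector: pick a component, then flip each label."""
    k = rng.choice(len(pi), p=pi)
    return (rng.random(mu.shape[1]) < mu[k]).astype(int)

y = sample(pi, mu, rng)                    # a sampled binary label vector
ll = log_prob(np.asarray(y), pi, mu)       # its log-probability
```

Because each component has its own mean vector, the mixture can assign high probability to label combinations that co-occur (e.g. anger with sadness) even when the marginals alone would not, which is how label dependencies are captured.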

Experiments on two public datasets show the effectiveness of the proposed method against competitive alternatives.


Keywords: Mixture density networks · Sentiment analysis · Label embedding



We thank the reviewers for their constructive comments. This research is supported by the National Natural Science Foundation of China (No. U1836109) and the Fundamental Research Funds for the Central Universities, Nankai University (No. 63191709 and No. 63191705).



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. College of Computer Science, Nankai University, Tianjin, China
