Abstract
Multi-modal sentiment and emotion analysis has emerged as a prominent field at the intersection of natural language processing, deep learning, machine learning, computer vision, and speech processing. A sentiment and emotion prediction model identifies the attitude of a speaker or writer towards a discussion, debate, event, document, or topic. This attitude can be expressed in different ways, such as the words spoken, the energy and tone with which the words are delivered, and the accompanying facial expressions and gestures. Moreover, related and similar tasks generally depend on each other and are predicted better when solved through a joint framework. In this paper, we present a multi-task gated contextual cross-modal attention framework that considers all three modalities (viz. text, acoustic, and visual) and multiple utterances for joint sentiment and emotion prediction. We evaluate our proposed approach on the CMU-MOSEI dataset for sentiment and emotion prediction. The evaluation results show that our proposed approach captures the correlation among the three modalities and improves over the previous state-of-the-art models.
The first two authors contributed equally.
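To make the flavour of the mechanism concrete, the following is a minimal PyTorch sketch of a gated cross-modal attention block for one pair of modalities. It is an illustration under our own assumptions (utterance-level features of a shared dimension, scaled dot-product attention, a sigmoid gate), not the authors' implementation; all names and shapes here are hypothetical.

```python
# Illustrative sketch (not the paper's code): gated cross-modal attention
# for one modality pair, with features of shape (batch, utterances, dim).
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedCrossModalAttention(nn.Module):
    """Attend from modality A over modality B, then gate the attended
    context against the original A features."""
    def __init__(self, dim: int):
        super().__init__()
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        # Scaled dot-product scores: each utterance in A attends over
        # the utterances of B -> (batch, len_a, len_b).
        scores = torch.matmul(a, b.transpose(1, 2)) / (a.size(-1) ** 0.5)
        attended = torch.matmul(F.softmax(scores, dim=-1), b)
        # The gate decides, per feature, how much cross-modal context
        # to mix into the original representation.
        g = torch.sigmoid(self.gate(torch.cat([a, attended], dim=-1)))
        return g * attended + (1.0 - g) * a

# Usage with hypothetical text/acoustic features:
text = torch.randn(8, 20, 128)      # (batch, utterances, dim)
acoustic = torch.randn(8, 20, 128)
fused = GatedCrossModalAttention(128)(text, acoustic)
print(fused.shape)  # torch.Size([8, 20, 128])
```

In a multi-task setting, such pairwise fused representations would feed shared layers with separate sentiment and emotion heads; the sketch shows only the attention-and-gate step.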
Acknowledgement
Asif Ekbal acknowledges the Young Faculty Research Fellowship (YFRF), supported by the Visvesvaraya Ph.D. Scheme of MeitY, Government of India. The research reported here is also partially supported by "Skymap Global India Private Limited".
Copyright information
© 2019 Springer Nature Switzerland AG
Cite this paper
Sangwan, S., Chauhan, D.S., Akhtar, M.S., Ekbal, A., Bhattacharyya, P. (2019). Multi-task Gated Contextual Cross-Modal Attention Framework for Sentiment and Emotion Analysis. In: Gedeon, T., Wong, K., Lee, M. (eds) Neural Information Processing. ICONIP 2019. Communications in Computer and Information Science, vol 1142. Springer, Cham. https://doi.org/10.1007/978-3-030-36808-1_72
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-36807-4
Online ISBN: 978-3-030-36808-1