A multi-perspective global–local interaction framework for identifying dialogue acts and sentiments of dialogue utterances jointly

  • Original Article
International Journal of Machine Learning and Cybernetics

Abstract

Dialogue act recognition (DAR) and sentiment classification (SC) are crucial tasks in dialogue systems; they aim to uncover speakers’ implicit intentions and sentiments by analyzing contextual information. Recent approaches have sought to improve accuracy by modeling dialogue acts and sentiments jointly, taking complex relationships and latent structures into account. However, these methods often neglect two critical challenges. First, real-world dialogues follow a chronological order, with interlocutors discussing one or more topics. Second, the joint task operates at the sentence level, which makes it essential to exploit fine-grained word-level information from utterances effectively. To tackle these challenges, we propose a multi-perspective global–local interaction framework that captures overall contextual information and simulates the flow of dialogue acts and sentiments for each speaker. The framework explores three perspectives: explicit intra-task interactions, cross-task collaboration, and token-level information reuse. We also incorporate a time span to accommodate real-world scenarios with chronological, multi-topic dialogues. Experimental results on widely used benchmark datasets show that our framework outperforms mainstream approaches. A comprehensive analysis validates the effectiveness of each component and indicates its potential for improving DAR and SC.

Availability of data and materials

Mastodon [7]: https://github.com/cerisara/DialogSentimentMastodon
DailyDialog [13]: http://yanran.li/dailydialog

References

  1. Chen H, Liu X, Yin D, Tang J (2017) A survey on dialogue systems: Recent advances and new frontiers. ACM SIGKDD Explorations Newsl 19(2):25–35

  2. Ni J, Young T, Pandelea V, Xue F, Cambria E (2023) Recent advances in deep learning based dialogue systems: A systematic survey. Artif Intell Rev 56(4):3055–3155

  3. Liu B (2012) Sentiment analysis and opinion mining. Synthesis Lectures on Human Language Technologies 5(1):1–167

  4. Fung P, Dey A, Siddique FB, Lin R, Yang Y, Bertero D, Wan Y, Chan RHY, Wu C-S (2016) Zara: A virtual interactive dialogue system incorporating emotion, sentiment and personality recognition. In: Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: System Demonstrations, pp. 278–281

  5. Kim M, Kim H (2018) Integrated neural network model for identifying speech acts, predicators, and sentiments of dialogue utterances. Pattern Recogn Lett 101:1–5

  6. Ma Y, Nguyen KL, Xing FZ, Cambria E (2020) A survey on empathetic dialogue systems. Information Fusion 64:50–70

  7. Cerisara C, Jafaritazehjani S, Oluokun A, Le HT (2018) Multi-task dialog act and sentiment recognition on Mastodon. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 745–754

  8. Qin L, Che W, Li Y, Ni M, Liu T (2020) DCR-Net: A deep co-interactive relation network for joint dialog act recognition and sentiment classification. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, pp. 8665–8672

  9. Li J, Fei H, Ji D (2020) Modeling local contexts for joint dialogue act recognition and sentiment classification with bi-channel dynamic convolutions. In: Proceedings of the 28th International Conference on Computational Linguistics, pp. 616–626

  10. Qin L, Li Z, Che W, Ni M, Liu T (2021) Co-GAT: A co-interactive graph attention network for joint dialog act recognition and sentiment classification. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, pp. 13709–13717

  11. Xing B, Tsang I (2022) DARER: Dual-task temporal relational recurrent reasoning network for joint dialog sentiment classification and act recognition. In: Findings of the Association for Computational Linguistics: ACL 2022, pp. 3611–3621

  12. Sordoni A, Galley M, Auli M, Brockett C, Ji Y, Mitchell M, Nie J-Y, Gao J, Dolan WB (2015) A neural network approach to context-sensitive generation of conversational responses. In: Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 196–205

  13. Li Y, Su H, Shen X, Li W, Cao Z, Niu S (2017) DailyDialog: A manually labelled multi-turn dialogue dataset. In: Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pp. 986–995

  14. Chen Z, Yang R, Zhao Z, Cai D, He X (2018) Dialogue act recognition via CRF-attentive structured network. In: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, pp. 225–234

  15. Kumar H, Agarwal A, Dasgupta R, Joshi S (2018) Dialogue act sequence labeling using hierarchical encoder with CRF. In: Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence and Thirtieth Innovative Applications of Artificial Intelligence Conference and Eighth AAAI Symposium on Educational Advances in Artificial Intelligence, pp. 3440–3447

  16. Raheja V, Tetreault J (2019) Dialogue act classification with context-aware self-attention. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pp. 3727–3733

  17. Li R, Lin C, Collinson M, Li X, Chen G (2019) A dual-attention hierarchical recurrent neural network for dialogue act classification. In: 23rd Conference on Computational Natural Language Learning, CoNLL 2019, pp. 383–392. Association for Computational Linguistics

  18. Colombo P, Chapuis E, Manica M, Vignon E, Varni G, Clavel C (2020) Guiding attention in sequence-to-sequence models for dialogue act prediction. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, pp. 7594–7601

  19. He Z, Tavabi L, Lerman K, Soleymani M (2021) Speaker turn modeling for dialogue act classification. In: Findings of the Association for Computational Linguistics: EMNLP 2021, pp. 2150–2157

  20. Wu T-W, Su R, Juang B-H (2021) A context-aware hierarchical BERT fusion network for multi-turn dialog act detection. arXiv preprint arXiv:2109.01267

  21. Pengfei G, Yinglong M (2022) A universality-individuality integration model for dialog act classification. arXiv preprint arXiv:2204.06185

  22. Gella S, Padmakumar A, Lange PL, Hakkani-Tur D (2022) Dialog acts for task driven embodied agents. In: Proceedings of the 23rd Annual Meeting of the Special Interest Group on Discourse and Dialogue, pp. 111–123

  23. Chang F-J, Muniyappa T, Sathyendra KM, Wei K, Strimel GP, McGowan R (2023) Dialog act guided contextual adapter for personalized speech recognition. In: ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5. IEEE

  24. Zhang L, Wang S, Liu B (2018) Deep learning for sentiment analysis: A survey. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 8(4):1253

  25. Qiu M, Huang X, Chen C, Ji F, Qu C, Wei W, Huang J, Zhang Y (2021) Reinforced history backtracking for conversational question answering. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, pp. 13718–13726

  26. Musto C, de Gemmis M, Semeraro G, Lops P (2017) A multi-criteria recommender system exploiting aspect-based sentiment analysis of users’ reviews. In: Proceedings of the Eleventh ACM Conference on Recommender Systems, pp. 321–325

  27. Liu P, Zhang L, Gulla JA (2021) Multilingual review-aware deep recommender system via aspect-based sentiment analysis. ACM Transactions on Information Systems (TOIS) 39(2):1–33

  28. Ghosal D, Majumder N, Poria S, Chhaya N, Gelbukh A (2019) DialogueGCN: A graph convolutional neural network for emotion recognition in conversation. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp. 154–164

  29. Majumder N, Poria S, Hazarika D, Mihalcea R, Gelbukh A, Cambria E (2019) DialogueRNN: An attentive RNN for emotion detection in conversations. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 6818–6825

  30. Zhang C, Li Q, Song D (2019) Aspect-based sentiment classification with aspect-specific graph convolutional networks. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp. 4568–4578

  31. Bai X, Liu P, Zhang Y (2020) Investigating typed syntactic dependencies for targeted sentiment classification using graph attention neural network. IEEE/ACM Transactions on Audio, Speech, and Language Processing 29:503–514

  32. Shen W, Wu S, Yang Y, Quan X (2021) Directed acyclic graph network for conversational emotion recognition. In: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pp. 1551–1560

  33. Augustine E, Jandaghi P, Albalak A, Pryor C, Dickens C, Wang W, Getoor L (2022) Emotion recognition in conversation using probabilistic soft logic. arXiv preprint arXiv:2207.07238

  34. Ghazarian S, Hedayatnia B, Papangelis A, Liu Y, Hakkani-Tur D (2022) What is wrong with you?: Leveraging user sentiment for automatic dialog evaluation. In: Findings of the Association for Computational Linguistics: ACL 2022, pp. 4194–4204

  35. Cheng Z, Zhou J, Wu W, Chen Q, He L (2023) Tell model where to attend: Improving interpretability of aspect-based sentiment classification via small explanation annotations. In: ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5. IEEE

  36. Kumar S, Mondal I, Akhtar MS, Chakraborty T (2023) Explaining (sarcastic) utterances to enhance affect understanding in multimodal dialogues. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37, pp. 12986–12994

  37. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I (2017) Attention is all you need. Advances in Neural Information Processing Systems 30

  38. Veličković P, Cucurull G, Casanova A, Romero A, Lio P, Bengio Y (2017) Graph attention networks. arXiv preprint arXiv:1710.10903

  39. He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778

  40. Kingma DP, Ba J (2014) Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980

  41. Devlin J, Chang M-W, Lee K, Toutanova K (2019) BERT: Pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pp. 4171–4186

  42. Liu Y, Ott M, Goyal N, Du J, Joshi M, Chen D, Levy O, Lewis M, Zettlemoyer L, Stoyanov V (2019) RoBERTa: A robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692

  43. Yang Z, Dai Z, Yang Y, Carbonell J, Salakhutdinov R, Le QV (2019) XLNet: Generalized autoregressive pretraining for language understanding. In: Proceedings of the 33rd International Conference on Neural Information Processing Systems, pp. 5753–5763

Author information

Corresponding author

Correspondence to Qichen Zhang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhang, Q., Li, J. A multi-perspective global–local interaction framework for identifying dialogue acts and sentiments of dialogue utterances jointly. Int. J. Mach. Learn. & Cyber. 15, 1995–2011 (2024). https://doi.org/10.1007/s13042-023-02010-5

  • DOI: https://doi.org/10.1007/s13042-023-02010-5
