Abstract
Natural Language Understanding (NLU) is a critical component in building a conversational system. So far, most systems have processed user inputs at the utterance level and assumed a single dialog act (DA) per utterance. In fact, one utterance may contain more than one DA, each denoted by a different contiguous text span inside it (a.k.a. a functional segment). As a step towards natural and flexible interaction between humans and machines, especially in low-resource languages, this paper presents work on dialog segmentation (DS) and DA classification in Vietnamese. We first introduce the corpus and then systematically investigate different pipeline and joint learning approaches to the two tasks. Experimental results show that the joint learning approach improves performance on both tasks and outperforms the conventional pipeline approach, which treats the two tasks separately. Moreover, to further enhance the final performance, this paper proposes a technique to enrich the models with useful DA knowledge. Compared to the standard models, which do not use DA knowledge, we achieve considerably better results on both tasks. Specifically, we achieve an F1 score of 86% in segmenting dialogues and a micro-F1 score of 74.75% in classifying DAs. This provides a strong foundation for future research in this field.
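To make the idea of several functional segments inside one utterance concrete, the following is a minimal sketch of how segmentation and DA classification can be cast as a single token-level tagging problem. The B-/I- tag scheme, the function name `decode_segments`, the English example tokens, and the DA labels are all illustrative assumptions; the abstract does not specify the encoding the paper's models use.

```python
from typing import List, Tuple


def decode_segments(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Turn per-token joint tags into (segment_text, dialog_act) pairs.

    Assumed (hypothetical) scheme: "B-<act>" marks the first token of a
    functional segment and "I-<act>" marks every following token of it.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")
    segments: List[Tuple[List[str], str]] = []
    for token, tag in zip(tokens, tags):
        prefix, _, act = tag.partition("-")
        # A "B-" tag opens a new segment. An "I-" tag also opens one when no
        # segment is open or the act changes (lenient repair of bad predictions).
        if prefix == "B" or not segments or segments[-1][1] != act:
            segments.append(([token], act))
        else:
            segments[-1][0].append(token)
    return [(" ".join(words), act) for words, act in segments]


# Illustrative utterance carrying two dialog acts (labels are made up).
example_tokens = ["ok", ",", "what", "time", "is", "it", "?"]
example_tags = ["B-agreement", "I-agreement",
                "B-question", "I-question", "I-question", "I-question", "I-question"]
example_segments = decode_segments(example_tokens, example_tags)
```

With this encoding, one sequence labeler predicts both the segment boundaries and the DA of each segment, which is the general shape of a joint approach; a pipeline would instead predict boundaries first and classify each resulting span afterwards.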
Notes
- 1. The dataset and baseline models will be made freely available online once the paper is accepted for publication.
- 2.
- 3.
- 4.
- 5. In the experiments, we used the five previous utterances as the dialog history context.
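As a rough illustration of the five-utterance history mentioned in the last note, the sketch below pairs each utterance with up to five preceding ones. The function name `with_history` and the list-of-strings representation are assumptions made for illustration; how the paper's models actually consume the context is not described here.

```python
from typing import List


def with_history(utterances: List[str], window: int = 5) -> List[List[str]]:
    """For each utterance, return up to `window` preceding utterances
    followed by the utterance itself (oldest first)."""
    return [utterances[max(0, i - window): i + 1] for i in range(len(utterances))]


# Placeholder dialog of eight utterances, named u0..u7.
dialog = [f"u{i}" for i in range(8)]
contexts = with_history(dialog)
```

Early utterances simply receive a shorter history, so no padding is assumed in this sketch.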
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Luong, T.C., Tran, O.T. (2022). Dialog Act Segmentation and Classification in Vietnamese. In: Arai, K. (eds) Intelligent Computing. SAI 2022. Lecture Notes in Networks and Systems, vol 507. Springer, Cham. https://doi.org/10.1007/978-3-031-10464-0_40
DOI: https://doi.org/10.1007/978-3-031-10464-0_40
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-10463-3
Online ISBN: 978-3-031-10464-0
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)