Dialog Act Segmentation and Classification in Vietnamese

Part of the Lecture Notes in Networks and Systems book series (LNNS, volume 507)


Natural Language Understanding (NLU) is a critical component in building a conversational system. So far, most systems have processed user inputs at the utterance level and assumed a single dialog act (DA) per utterance. In fact, one utterance may contain more than one DA, each denoted by a different contiguous text span inside it (a.k.a. a functional segment). As a step towards natural and flexible interaction between human and machine, especially in low-resource languages, this paper presents work on dialog segmentation (DS) and DA classification in Vietnamese. We first introduce the corpus and then systematically investigate different pipeline and joint learning approaches to the two tasks. Experimental results show that the joint learning approach improves performance on both tasks and outperforms the conventional pipeline approach, which treats the two tasks separately. Moreover, to further enhance the final performance, this paper proposes a technique to enrich the models with useful DA knowledge. Compared to the standard models, which do not use DA knowledge, we achieve considerably better results on both tasks. Specifically, we achieve an F1 score of 86% in segmenting dialogues and a micro-F1 score of 74.75% in classifying DAs. This provides a strong foundation for future research in this field.
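To make the idea of functional segments concrete, the sketch below shows one common way to cast segmentation and DA classification as a single token-level tagging problem. It is an illustration only: this preview does not show the tagging scheme the paper actually uses, so the BIO encoding, the function names `segments_to_tags` and `tags_to_segments`, and the act labels `greeting` and `question` are all assumptions. The example utterance is an English gloss, not corpus data.

```python
from typing import List, Tuple

# A functional segment: (start token index, end token index (exclusive), DA label).
Segment = Tuple[int, int, str]


def segments_to_tags(num_tokens: int, segments: List[Segment]) -> List[str]:
    """Encode contiguous, non-overlapping segments as per-token joint BIO tags."""
    tags = ["O"] * num_tokens
    for start, end, act in segments:
        tags[start] = f"B-{act}"          # first token of the segment
        for i in range(start + 1, end):
            tags[i] = f"I-{act}"          # remaining tokens of the segment
    return tags


def tags_to_segments(tags: List[str]) -> List[Segment]:
    """Decode joint BIO tags back into (start, end, act) segments."""
    segments: List[Segment] = []
    start, act = None, None
    for i, tag in enumerate(tags):
        prefix, _, label = tag.partition("-")
        # An I- tag with the same label continues the open segment.
        if prefix == "I" and start is not None and label == act:
            continue
        # Anything else closes the open segment, if there is one.
        if start is not None:
            segments.append((start, i, act))
            start, act = None, None
        # B- opens a new segment; a stray I- is treated as a segment start.
        if prefix in ("B", "I"):
            start, act = i, label
    if start is not None:
        segments.append((start, len(tags), act))
    return segments


# Hypothetical utterance with two dialog acts (labels are illustrative).
tokens = "hello do you have this shirt in blue".split()
segments = [(0, 1, "greeting"), (1, 8, "question")]
tags = segments_to_tags(len(tokens), segments)
recovered = tags_to_segments(tags)
```

Under this encoding, a single sequence tagger predicts both segment boundaries and act labels at once, which is one way to realize a joint approach; a pipeline would instead predict boundaries first and classify each resulting span afterwards.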


  • Dialog segmentation
  • Dialog act
  • Deep learning
  • Vietnamese retail domain


  1. The dataset and baseline models will be freely available online once the paper is accepted for publication.

  5. In experiments, we used five previous utterances as history dialog contexts.
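As a rough illustration of the five-utterance history mentioned in the last note, the sketch below collects the utterances immediately preceding a target utterance. The function name `history_context` and its `window` parameter are hypothetical, and how the paper feeds this context to its models is not shown in this preview.

```python
from typing import List


def history_context(utterances: List[str], index: int, window: int = 5) -> List[str]:
    """Return up to `window` utterances immediately preceding utterances[index]."""
    start = max(0, index - window)   # clamp at the start of the dialog
    return utterances[start:index]


# Placeholder dialog of eight utterances, u0 .. u7.
dialog = [f"u{i}" for i in range(8)]
context_late = history_context(dialog, 7)   # full five-utterance window
context_early = history_context(dialog, 2)  # fewer than five utterances exist
```

Early turns in a dialog simply receive a shorter history; padding to a fixed length, if a model needs it, would be a separate step.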



Corresponding author

Correspondence to Oanh Thi Tran.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this paper

Luong, T.C., Tran, O.T. (2022). Dialog Act Segmentation and Classification in Vietnamese. In: Arai, K. (eds) Intelligent Computing. SAI 2022. Lecture Notes in Networks and Systems, vol 507. Springer, Cham.
