Abstract
Owing to the pervasive semantic ambiguity of open-domain conversation, current deep dialogue models fail to detect potential emotional and action response features in the latent space, and therefore tend to produce inaccurate and irrelevant sentences. To address this problem, we propose a semantic-aware conditional variational autoencoder that discriminates sentiment and action response features in the latent space for one-to-many open-domain dialogue generation. Specifically, explicit controllable variables produced by the proposed module are leveraged to generate diverse conversational texts. These controllable variables constrain the distribution of the latent space, disentangling its features during training. The resulting feature disentanglement improves dialogue generation in terms of both interpretability and text quality, and also reveals how the latent features of different emotions shape the logic of text generation.
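The one-to-many mechanism described above rests on the conditional VAE machinery: a recognition network maps the dialogue context together with a controllable variable (e.g., a sentiment label) to a latent Gaussian, and repeated sampling from that Gaussian yields diverse responses for the same context. The sketch below is a minimal, hypothetical illustration of that sampling step only, not the paper's actual model; the weight matrices `W_mu`/`W_lv` and all dimensions are invented for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

def reparameterize(mu, logvar, rng):
    # Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I).
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

def kl_divergence(mu, logvar):
    # Closed-form KL(q(z|x,c) || N(0, I)), averaged over the batch.
    return float(np.mean(-0.5 * np.sum(1 + logvar - mu**2 - np.exp(logvar), axis=1)))

# Toy recognition network (hypothetical): the context vector (dialogue history
# encoding) is concatenated with a one-hot controllable variable, then linearly
# mapped to the Gaussian parameters mu and log-variance.
latent_dim, ctx_dim, n_labels = 4, 8, 3
W_mu = rng.standard_normal((ctx_dim + n_labels, latent_dim)) * 0.1
W_lv = rng.standard_normal((ctx_dim + n_labels, latent_dim)) * 0.1

context = rng.standard_normal((2, ctx_dim))   # batch of 2 dialogue contexts
label = np.eye(n_labels)[[0, 2]]              # controllable sentiment labels
h = np.concatenate([context, label], axis=1)

mu, logvar = h @ W_mu, h @ W_lv
z1 = reparameterize(mu, logvar, rng)
z2 = reparameterize(mu, logvar, rng)          # same context, different sample
print(z1.shape, round(kl_divergence(mu, logvar), 4))
```

Because `z1` and `z2` are distinct draws from the same conditional posterior, a decoder fed either sample would produce different responses to the same context, which is the one-to-many behavior the abstract refers to; the KL term is what regularizes this latent distribution during training.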
Acknowledgements
This work was partly supported by the National Key R&D Program of China (2019YFB2103000), the National Natural Science Foundation of China (62136002, 62102057, and 61876027), the Science and Technology Research Program of the Chongqing Municipal Education Commission (KJQN202100627 and KJQN202100629), and the Natural Science Foundation of Chongqing (cstc2019jcyj-cxttX0002).
Ethics declarations
Conflict of interest
We declare that we have no commercial or associative interests that represent a conflict of interest in connection with the submitted work.
About this article
Cite this article
Wang, Y., Liao, J., Yu, H. et al. Semantic-aware conditional variational autoencoder for one-to-many dialogue generation. Neural Comput & Applic 34, 13683–13695 (2022). https://doi.org/10.1007/s00521-022-07182-9