A response generator with response-aware encoder for generating specific and relevant responses

Abstract

Dialogue data usually consist of pairs of a query and its response, yet no previous response generator has exploited responses explicitly during training, even though a response provides significant information about the meaning of its query. This paper therefore proposes a sequence-to-sequence response generator with a response-aware encoder. The proposed generator exploits golden responses by reflecting them in the query representation. For this purpose, the response-aware encoder adds to the transformer encoder a relevancy scorer layer that calculates the relevancy of each query token to a response. However, golden responses are available only while training the response generator and are unavailable at inference time. To solve this problem, a teacher relevancy scorer and a student relevancy scorer are learned jointly. That is, at training time both scorers are optimized, but the decoder generates a response using only the relevancy computed by the teacher scorer; at inference time, the decoder uses the relevancy computed by the student scorer. Because the student scorer is trained to minimize its difference from the teacher scorer, it can be used to compute the relevancy of query tokens to a prospective response. The proposed model is the first attempt to use a golden response directly in generating a query representation, whereas previous studies reflected responses only implicitly and indirectly. As a result, it achieved higher dialogue evaluation scores than the current state-of-the-art model on the Reddit, Persona-Chat, and DailyDialog data sets.
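The teacher and student arrangement described in the abstract can be illustrated with a small sketch. It is not the authors' implementation. The scorer forms are assumptions made only for illustration: the teacher is a sigmoid of the dot product with the mean response vector, and the student is a single linear layer followed by a sigmoid. The squared-error distillation loss and all function names (`teacher_scores`, `student_scores`, `student_step`, `encode`) are likewise hypothetical. The sketch shows only the control flow: the teacher's relevancy re-weights the query tokens when a golden response is available, and the student's relevancy is used otherwise.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def teacher_scores(query_vecs, response_vecs):
    # Assumed form: relevancy of each query token to the golden response is
    # the sigmoid of its dot product with the mean response vector.
    dim = len(response_vecs[0])
    mean_resp = [sum(v[i] for v in response_vecs) / len(response_vecs)
                 for i in range(dim)]
    return [sigmoid(dot(q, mean_resp)) for q in query_vecs]


def student_scores(query_vecs, weights):
    # Assumed form: relevancy predicted from the query alone
    # (one linear layer followed by a sigmoid).
    return [sigmoid(dot(q, weights)) for q in query_vecs]


def distillation_loss(student, teacher):
    # Mean squared difference between student and teacher relevancy.
    return sum((s - t) ** 2 for s, t in zip(student, teacher)) / len(teacher)


def student_step(query_vecs, response_vecs, weights, lr=0.1):
    # One gradient-descent step that moves the student toward the teacher.
    teacher = teacher_scores(query_vecs, response_vecs)
    student = student_scores(query_vecs, weights)
    grad = [0.0] * len(weights)
    for q, s, t in zip(query_vecs, student, teacher):
        g = 2.0 * (s - t) * s * (1.0 - s) / len(teacher)
        for j, x in enumerate(q):
            grad[j] += g * x
    return [w - lr * gj for w, gj in zip(weights, grad)]


def encode(query_vecs, weights, response_vecs=None):
    # Training (golden response given): re-weight query tokens with the
    # teacher's relevancy and also return the distillation loss.
    # Inference (no response): re-weight with the student's relevancy.
    student = student_scores(query_vecs, weights)
    if response_vecs is not None:
        teacher = teacher_scores(query_vecs, response_vecs)
        scores, loss = teacher, distillation_loss(student, teacher)
    else:
        scores, loss = student, None
    weighted = [[s * x for x in q] for q, s in zip(query_vecs, scores)]
    return weighted, loss
```

In this sketch, a call such as `encode(query_vecs, weights, response_vecs)` stands in for the training-time path and `encode(query_vecs, weights)` for the inference-time path. A real model would feed the re-weighted query representation into a transformer decoder and optimize the generation loss together with the scorer loss.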

Data availability and materials

The datasets generated and/or analysed during the current study are available in the ACL2020-ConKADI repository, https://github.com/pku-sixing/ACL2020-ConKADI.

Notes

  1. https://github.com/microsoft/MASS/tree/master/MASS-summarization.

  2. https://github.com/pytorch/fairseq/blob/master/examples/bart.

Funding

This work was supported in part by the Institute of Information and Communications Technology Planning and Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2013-0-00109, WiseKB: Big data based self-evolving knowledge base and reasoning platform), by the Korea government (MSIT) (Artificial Intelligence Innovation Hub) under Grant 2021-0-0206, and by the National Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (No. 2020R1A4A1018607).

Author information

Contributions

All authors contributed to the study conception and design. Material preparation and data collection were performed by So-Eon Kim, and data analysis was performed by Hyun-Je Song. The first draft of the manuscript was written by So-Eon Kim, and the manuscript was reviewed and edited by Seong-Bae Park. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Seong-Bae Park.

Ethics declarations

Conflict of interest

The authors have no relevant financial or non-financial interests to disclose.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Kim, SE., Song, HJ. & Park, SB. A response generator with response-aware encoder for generating specific and relevant responses. Soft Comput 27, 3721–3732 (2023). https://doi.org/10.1007/s00500-022-07664-x

  • DOI: https://doi.org/10.1007/s00500-022-07664-x

Keywords

  • Response generator
  • Sequence-to-sequence model
  • Response-aware
  • Transformer architecture
  • Natural language processing