Meta-context Transformers for Domain-Specific Response Generation

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2021)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12714)

Abstract

Transformer-based models such as GPT-2 have revolutionized dialogue generation by capturing long-range structure through language modeling. Although these models produce highly coherent language, their responses often lack relevance and domain-specific terms when used for domain-specific response generation. In this paper, we present DSRNet (Domain Specific Response Network), a transformer-based model for dialogue response generation that reinforces domain-specific attributes. In particular, we extract meta attributes from the context and model them jointly with the dialogue context utterances for better attention over domain-specific key terms and relevance. We study the use of DSRNet in a multi-turn, multi-interlocutor environment for domain-specific response generation. In our experiments, we evaluate DSRNet on the Ubuntu dialogue datasets, which mainly consist of technical dialogues for IT issue resolution, and on the CamRest676 dataset, which contains restaurant-domain conversations. We observe that the responses produced by our model carry higher relevance because they contain domain-specific key attributes that overlap better with the attributes of the context. Our analysis shows that the performance improvement is mostly due to the infusion of key terms along with the dialogues, which results in better attention over domain-relevant terms.
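As a rough illustration of the idea of modeling meta attributes jointly with the dialogue context, the sketch below builds a single input string in which key terms precede the context utterances. This is not the DSRNet implementation: the separator tokens, the function names, and the frequency-based key-term extractor are all assumptions made for illustration, since the abstract does not specify how the attributes are extracted or encoded.

```python
from collections import Counter

# Hypothetical separator tokens; the tokens actually used by DSRNet
# are not given in the abstract.
META_TOKEN = "<meta>"
CONTEXT_TOKEN = "<context>"
TURN_TOKEN = "<turn>"


def extract_key_terms(utterances, domain_vocab, top_k=3):
    """Toy extractor (an assumption): rank domain-vocabulary words by
    how often they occur across the context utterances."""
    counts = Counter(
        word
        for utterance in utterances
        for word in utterance.lower().split()
        if word in domain_vocab
    )
    return [term for term, _ in counts.most_common(top_k)]


def build_model_input(utterances, domain_vocab, top_k=3):
    """Concatenate the key terms and the context utterances into one
    sequence that a language model could be conditioned on."""
    meta = " ".join(extract_key_terms(utterances, domain_vocab, top_k))
    context = f" {TURN_TOKEN} ".join(utterances)
    return f"{META_TOKEN} {meta} {CONTEXT_TOKEN} {context}"


# Made-up example context in the style of an IT support chat.
context = [
    "my wifi driver stopped working after the kernel upgrade",
    "which kernel version are you running",
    "the new kernel is 5.4 and the driver fails to load",
]
vocab = {"wifi", "driver", "kernel", "grub", "apt"}

print(build_model_input(context, vocab))
```

Placing the key terms in the same sequence as the utterances lets the self-attention layers of the model attend to them directly when generating each response token, which is one plausible reading of the "better attention over domain-relevant terms" described in the abstract.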

D. Kar—Work done during internship at IBM Research India.

Notes

  1. http://faculty.nps.edu/cmartell/NPSChat.htm

  2. https://github.com/rkadlec/ubuntu-ranking-dataset-creator

  3. https://github.com/DebanjanaKar/DSRNet

  4. https://github.com/Maluuba/nlg-eval

  5. https://github.com/gmftbyGMFTBY/MultiTurnDialogZoo

Author information

Correspondence to Suranjana Samanta.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Kar, D., Samanta, S., Azad, A.P. (2021). Meta-context Transformers for Domain-Specific Response Generation. In: Karlapalem, K., et al. Advances in Knowledge Discovery and Data Mining. PAKDD 2021. Lecture Notes in Computer Science, vol 12714. Springer, Cham. https://doi.org/10.1007/978-3-030-75768-7_23

  • DOI: https://doi.org/10.1007/978-3-030-75768-7_23

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-75767-0

  • Online ISBN: 978-3-030-75768-7

  • eBook Packages: Computer Science, Computer Science (R0)
