Generating Synthetic Dialogues from Prompts to Improve Task-Oriented Dialogue Systems

Steindl, Sebastian; Schäfer, Ulrich; Ludwig, Bernd

doi:10.1007/978-3-031-42608-7_17

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14236)

Included in the following conference series:

German Conference on Artificial Intelligence (Künstliche Intelligenz)

Abstract

Research into language models fine-tuned to follow prompts has recently made notable advances. These models are commonly deployed as chatbots. One special case of chatbots is Task-Oriented Dialogue (TOD) systems, which aim to help the user achieve specific tasks using external services. High-quality training data for these systems is costly to obtain. We therefore evaluate whether the new prompt-following models can generate annotated synthetic dialogues and whether these can be used to train a TOD system. To this end, we generate data based on descriptions of a dialogue's goal. We train a state-of-the-art TOD system and compare its performance in a low-resource setting with and without synthetic dialogues. The evaluation shows that using prompt-following language models to generate synthetic dialogues could help train better TOD systems.
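
As a rough illustration of what generating an annotated dialogue from a goal description might involve, the following Python sketch builds a prompt from a goal and parses a model's plain-text reply into turns with state annotations. It is not the authors' pipeline: the prompt wording, the USER/SYSTEM/STATE line format, the slot names, and the function names build_generation_prompt and parse_dialogue are all assumptions made for this example, and no model is actually called.

```python
import json


def build_generation_prompt(goal_description: str, slot_names: list[str]) -> str:
    """Assemble a prompt asking a prompt-following model for one annotated dialogue.

    The wording and the STATE annotation format are hypothetical.
    """
    slots = ", ".join(slot_names)
    return (
        "Write a dialogue between a USER and a SYSTEM that fulfils the goal below.\n"
        f"Goal: {goal_description}\n"
        "After each USER turn, add a line 'STATE: {...}' holding a JSON object "
        f"over these slots: {slots}.\n"
    )


def parse_dialogue(raw: str) -> list[dict]:
    """Parse 'USER: ...', 'SYSTEM: ...' and 'STATE: {...}' lines into turns.

    A STATE line is attached to the most recent turn. If its JSON is
    malformed, the turn is kept and its state stays None.
    """
    turns = []
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first colon only, so colons inside the text survive.
        speaker, _, text = line.partition(":")
        speaker = speaker.strip().upper()
        text = text.strip()
        if speaker in ("USER", "SYSTEM"):
            turns.append({"speaker": speaker.lower(), "text": text, "state": None})
        elif speaker == "STATE" and turns:
            try:
                turns[-1]["state"] = json.loads(text)
            except json.JSONDecodeError:
                pass  # leave the annotation empty rather than guess
    return turns
```

In practice the generated dialogues would likely also need filtering, since a model's reply is not guaranteed to follow the requested format; the parser above only tolerates malformed annotations and does not validate slot values.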

Author information

Authors and Affiliations

Department of Electrical Engineering, Media and Computer Science, Ostbayerische Technische Hochschule Amberg-Weiden, 92224, Amberg, Germany
Sebastian Steindl & Ulrich Schäfer
Chair for Information Science, University Regensburg, 93053, Regensburg, Germany
Bernd Ludwig

Corresponding author

Correspondence to Sebastian Steindl.

Editor information

Editors and Affiliations

Universität Würzburg, Würzburg, Germany
Dietmar Seipel
University of Greifswald, Greifswald, Germany
Alexander Steen

About this paper

Cite this paper

Steindl, S., Schäfer, U., Ludwig, B. (2023). Generating Synthetic Dialogues from Prompts to Improve Task-Oriented Dialogue Systems. In: Seipel, D., Steen, A. (eds) KI 2023: Advances in Artificial Intelligence. KI 2023. Lecture Notes in Computer Science, vol 14236. Springer, Cham. https://doi.org/10.1007/978-3-031-42608-7_17

DOI: https://doi.org/10.1007/978-3-031-42608-7_17
Published: 18 September 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-42607-0
Online ISBN: 978-3-031-42608-7
eBook Packages: Computer Science, Computer Science (R0)
