Recent advances and challenges in task-oriented dialog systems

  • Review
Science China Technological Sciences

Abstract

Owing to their significance and value in human-computer interaction and natural language processing, task-oriented dialog systems are attracting increasing attention in both the academic and industrial communities. In this paper, we survey recent advances and challenges in task-oriented dialog systems. We also discuss three critical topics for task-oriented dialog systems: (1) improving data efficiency to facilitate dialog modeling in low-resource settings, (2) modeling multi-turn dynamics for dialog policy learning to achieve better task-completion performance, and (3) integrating domain ontology knowledge into the dialog model. In addition, we review recent progress in dialog evaluation and some widely used corpora. We believe that this survey, though incomplete, can shed light on future research in task-oriented dialog systems.

Author information

Corresponding author

Correspondence to MinLie Huang.

Additional information

This work was supported by the National Natural Science Foundation of China (Grant Nos. 61936010 and 61876096), and the National Key R&D Program of China (Grant No. 2018YFC0830200).

About this article

Cite this article

Zhang, Z., Takanobu, R., Zhu, Q. et al. Recent advances and challenges in task-oriented dialog systems. Sci. China Technol. Sci. 63, 2011–2027 (2020). https://doi.org/10.1007/s11431-020-1692-3

  • DOI: https://doi.org/10.1007/s11431-020-1692-3
