Deep Learning in Spoken and Text-Based Dialog Systems

  • Asli Celikyilmaz
  • Li Deng
  • Dilek Hakkani-Tür
Chapter

Abstract

The last few decades have witnessed substantial breakthroughs in several areas of speech and language understanding research, specifically in building human-to-machine conversational dialog systems. Dialog systems, also known as interactive conversational agents, virtual agents, or sometimes chatbots, are useful in a wide range of applications, from technical support services to language learning tools and entertainment. Recent success in deep neural networks has spurred research in building data-driven dialog models. In this chapter, we present state-of-the-art neural network architectures and details on each of the components of building a successful dialog system using deep learning. Task-oriented dialog systems are the focus of this chapter; later, we describe different networks for building open-ended, non-task-oriented dialog systems. Furthermore, to facilitate research in this area, we survey publicly available datasets and software tools suitable for data-driven learning of dialog systems. Finally, the appropriate choice of evaluation metrics for the learning objective is discussed.

Copyright information

© Springer Nature Singapore Pte Ltd. 2018

Authors and Affiliations

  1. Microsoft Research, Redmond, USA
  2. Citadel, Chicago & Seattle, USA
  3. Google, Mountain View, USA