Reinforcement Learning Based Dialogue Management Strategy

Saha, Tulika; Gupta, Dhawal; Saha, Sriparna; Bhattacharyya, Pushpak

doi:10.1007/978-3-030-04182-3_32

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11303)

Included in the following conference series:

International Conference on Neural Information Processing

Abstract

This paper proposes a novel Markov Decision Process (MDP) formulation for learning an optimal Dialogue Manager strategy in a flight enquiry system. It presents a unique state representation, a corresponding action set, and a reward model that is specific to different time-steps. Several Reinforcement Learning (RL) algorithms, based on both classical methods and Deep Learning techniques, have been implemented for the Dialogue Management component. To establish the robustness of the system, an existing Slot-Filling (SF) module has been integrated with it; the system can still generate valid responses and act sensibly even if the SF module falters. The experimental results indicate that the proposed MDP and the system hold promise of being scalable while satisfying the intent of the user.
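As a rough illustration of the kind of formulation the abstract describes, the sketch below trains a tabular Q-learning agent (one classical RL method) on a toy slot-filling dialogue. It is not the paper's MDP: the slot names, the action set, the reward values, and the 0.8 probability that the simulated slot filler succeeds are all assumptions chosen for illustration.

```python
import random
from collections import defaultdict

# Hypothetical slots and actions; the paper's own state/action design is not shown here.
SLOTS = ("origin", "destination", "date")
ACTIONS = [f"ask_{s}" for s in SLOTS] + ["close"]


def step(state, action, rng, fill_prob=0.8):
    """Toy environment: returns (next_state, reward, done)."""
    if action == "close":
        # Assumed reward: closing pays off only when every slot is filled.
        return state, (10.0 if all(state) else -10.0), True
    idx = ACTIONS.index(action)  # ask actions share indices with SLOTS
    filled = list(state)
    # The simulated slot filler fails with probability 1 - fill_prob.
    if rng.random() < fill_prob:
        filled[idx] = True
    return tuple(filled), -1.0, False  # assumed per-turn penalty


def train(episodes=5000, alpha=0.1, gamma=0.95, epsilon=0.1,
          max_turns=20, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = defaultdict(float)  # q[(state, action)] -> estimated return
    for _ in range(episodes):
        state = (False,) * len(SLOTS)
        for _ in range(max_turns):
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            next_state, reward, done = step(state, action, rng)
            # Bootstrap from the best next action unless the dialogue ended.
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            td_target = reward + gamma * best_next
            q[(state, action)] += alpha * (td_target - q[(state, action)])
            state = next_state
            if done:
                break
    return q


def greedy_action(q, state):
    """Action with the highest learned value in the given state."""
    return max(ACTIONS, key=lambda a: q[(state, a)])


if __name__ == "__main__":
    q = train()
    print(greedy_action(q, (False, False, False)))
    print(greedy_action(q, (True, True, True)))
```

Because the simulated slot filler sometimes fails, the agent has to learn to re-ask for a missing slot instead of closing early, which loosely mirrors the robustness concern raised in the abstract.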

Notes

1. Ran on an Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20 GHz with 251 GB RAM.

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Indian Institute of Technology Patna, Patna, India
Tulika Saha, Dhawal Gupta, Sriparna Saha & Pushpak Bhattacharyya

Corresponding author

Correspondence to Tulika Saha.

Editor information

Editors and Affiliations

The Chinese Academy of Sciences, Beijing, China
Long Cheng
City University of Hong Kong, Kowloon, Hong Kong
Andrew Chi Sing Leung
Kobe University, Kobe, Japan
Seiichi Ozawa

About this paper

Cite this paper

Saha, T., Gupta, D., Saha, S., Bhattacharyya, P. (2018). Reinforcement Learning Based Dialogue Management Strategy. In: Cheng, L., Leung, A., Ozawa, S. (eds) Neural Information Processing. ICONIP 2018. Lecture Notes in Computer Science, vol 11303. Springer, Cham. https://doi.org/10.1007/978-3-030-04182-3_32

DOI: https://doi.org/10.1007/978-3-030-04182-3_32
Published: 18 November 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-04181-6
Online ISBN: 978-3-030-04182-3
eBook Packages: Computer Science, Computer Science (R0)
