Co-adaptation in Spoken Dialogue Systems

  • Senthilkumar Chandramohan
  • Matthieu Geist
  • Fabrice Lefèvre
  • Olivier Pietquin
Conference paper

Abstract

Spoken dialogue systems are man-machine interfaces that use speech as the medium of interaction. In recent years, dialogue optimization using reinforcement learning has become a state-of-the-art technique. The primary focus of research in this domain is to learn an optimal policy with respect to the task description (reward function) and the user simulation employed. In human-human interaction, however, both parties mutually evolve over the course of the conversation. This ability of humans to co-adapt contributes largely to the naturalness of the dialogue. This paper outlines a novel framework for co-adaptation in spoken dialogue systems, in which the dialogue manager and the user simulation evolve over time, incrementally and mutually optimizing their respective behaviors.
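To make the co-adaptation loop concrete, the sketch below alternates a policy-improvement step for the dialogue manager with a re-estimation step for the user simulation. It is a minimal toy illustration under stated assumptions, not the authors' algorithm: the `DialogueManager` and `UserSimulation` classes, the two-action strategy space, and the epsilon-greedy and preference-drift update rules are all hypothetical stand-ins for the reinforcement-learning and user-modeling components the abstract refers to.

```python
# Hypothetical sketch of a co-adaptation loop: the dialogue manager and the
# user simulation incrementally and mutually optimize their behaviors.
# All classes and update rules are illustrative stand-ins, not the paper's method.

import random


class DialogueManager:
    """Toy dialogue policy: epsilon-greedy choice between two strategies,
    with a running-mean value estimate per strategy (a stand-in for a
    full RL policy-optimization step)."""

    def __init__(self, epsilon=0.1):
        self.q = {"confirm": 0.0, "ask_open": 0.0}  # value estimate per strategy
        self.n = {"confirm": 0, "ask_open": 0}      # visit counts
        self.epsilon = epsilon

    def act(self):
        if random.random() < self.epsilon:
            return random.choice(list(self.q))
        return max(self.q, key=self.q.get)

    def update(self, action, reward):
        # Incremental mean update of the chosen strategy's value.
        self.n[action] += 1
        self.q[action] += (reward - self.q[action]) / self.n[action]


class UserSimulation:
    """Toy user model: its cooperativeness drifts toward whichever system
    strategy has led to successful turns (a stand-in for re-estimating
    the user model from interaction data)."""

    def __init__(self):
        self.pref = {"confirm": 0.5, "ask_open": 0.5}

    def respond(self, action):
        # Higher preference -> higher chance the exchange succeeds (reward 1).
        return 1.0 if random.random() < self.pref[action] else 0.0

    def update(self, action, reward):
        # Nudge the preference toward strategies yielding successful turns.
        p = self.pref[action] + 0.05 * (reward - 0.5)
        self.pref[action] = min(1.0, max(0.0, p))


def coadapt(episodes=1000):
    dm, user = DialogueManager(), UserSimulation()
    for _ in range(episodes):
        a = dm.act()         # dialogue manager chooses a strategy
        r = user.respond(a)  # user simulation reacts, yielding a reward
        dm.update(a, r)      # policy-optimization step
        user.update(a, r)    # user-model re-estimation step
    return dm.q, user.pref


if __name__ == "__main__":
    print(coadapt())
```

Running the loop shows the two components converging together: the manager's value estimates and the simulated user's preferences reinforce the same strategy, which is the mutual-evolution effect the framework targets.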

Acknowledgements

This research was partly funded by the EU INTERREG IVa project ALLEGRO and by the Région Lorraine (France).

Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  • Senthilkumar Chandramohan (1, 2)
  • Matthieu Geist (1)
  • Fabrice Lefèvre (2)
  • Olivier Pietquin (1, 3)
  1. Supelec, MaLIS - IMS Research Group, Metz, France
  2. Université d’Avignon et des Pays de Vaucluse, LIA-CERI, Avignon, France
  3. UMI 2958 (CNRS - GeorgiaTech), Metz, France
