Optimizing Dialogue Strategy in Large-Scale Spoken Dialogue System: A Learning Automaton Based Approach

  • G. Kumaravelan
  • R. Sivakumar
Conference paper
Part of the Lecture Notes in Electrical Engineering book series (LNEE, volume 222)


The application of statistical methods to modelling dialogue strategy in spoken dialogue systems is a growing research area. Reinforcement learning is a promising technique for building a dialogue management component that takes the semantic features of the current dialogue state as input and seeks the best action given those features. In practice, as the number of dialogue states grows, memory and processing requirements increase, and exhaustive search techniques such as dynamic programming yield sub-optimal solutions. Hence, this paper investigates an adaptive policy iteration method based on learning automata, which covers a large state-action space by organizing the automata hierarchically in order to learn an optimal dialogue strategy. The proposed approach has clear advantages over baseline reinforcement learning algorithms: faster learning with good exploitation in its updates, and scalability to larger problems.
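To make the idea of hierarchically organized learning automata concrete, the following is a minimal Python sketch under stated assumptions. The abstract does not specify which update scheme the paper uses, so the standard linear reward-inaction rule is assumed here purely for illustration. The class name `LinearRewardInactionAutomaton`, the helper `hierarchical_step`, the two-level layout, and the toy reward function are all hypothetical and are not taken from the paper.

```python
import random


class LinearRewardInactionAutomaton:
    """A single learning automaton with the linear reward-inaction update.

    Assumption: this update rule is a textbook choice, not necessarily
    the scheme used in the paper.
    """

    def __init__(self, n_actions, learning_rate=0.1, seed=None):
        # Start from a uniform action probability vector.
        self.probs = [1.0 / n_actions] * n_actions
        self.learning_rate = learning_rate
        self.rng = random.Random(seed)

    def choose(self):
        # Sample an action index according to the current probabilities.
        return self.rng.choices(range(len(self.probs)), weights=self.probs)[0]

    def update(self, action, reward):
        # reward is expected in [0, 1]; a reward of 0 leaves the vector unchanged.
        step = self.learning_rate * reward
        for j in range(len(self.probs)):
            if j == action:
                self.probs[j] += step * (1.0 - self.probs[j])
            else:
                self.probs[j] -= step * self.probs[j]


def hierarchical_step(top, children, reward_fn):
    """One interaction with a hypothetical two-level hierarchy.

    The top automaton selects a child automaton, the child selects a
    leaf action, and both automata on the chosen path receive the same reward.
    """
    branch = top.choose()
    leaf = children[branch].choose()
    reward = reward_fn(branch, leaf)
    top.update(branch, reward)
    children[branch].update(leaf, reward)
    return branch, leaf, reward


if __name__ == "__main__":
    # Toy environment (hypothetical): only the path (branch 0, leaf 1) is rewarded.
    top = LinearRewardInactionAutomaton(2, seed=1)
    children = [LinearRewardInactionAutomaton(2, seed=2),
                LinearRewardInactionAutomaton(2, seed=3)]
    for _ in range(2000):
        hierarchical_step(top, children,
                          lambda b, l: 1.0 if (b, l) == (0, 1) else 0.0)
    print(top.probs, children[0].probs)
```

In this sketch each automaton stores only a probability vector over its own actions, so the memory needed grows with the number of automata on a path rather than with the full state-action table. This is one plausible reading of the scalability argument, not a reproduction of the paper's experiments.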


Human–computer interaction; Reinforcement learning; Learning automata; Spoken dialogue system



Copyright information

© Springer India 2013

Authors and Affiliations

  1. Department of Computer Science, Pondicherry University, Karaikal, India
  2. Department of Computer Science, AVVM Sri Puspam College, Thanjavur, India
