Evaluating the Markov assumption in Markov Decision Processes for spoken dialogue management
The goal of dialogue management in a spoken dialogue system is to take actions based on observations and inferred beliefs. To ensure that these actions optimize the performance or robustness of the system, researchers have turned to reinforcement learning methods to learn policies for action selection. To derive an optimal policy from data, the dynamics of the system are often represented as a Markov Decision Process (MDP), which assumes that the state of the dialogue depends only on the previous state and action. In this article, we investigate whether constraining the state space by the Markov assumption, especially when the structure of the state space may be unknown, truly affords the highest reward. In simulation experiments conducted in the context of a dialogue system for interacting with a speech-enabled web browser, models built under the Markov assumption did not perform as well as an alternative model that classifies the total reward with accumulating features. We discuss the implications of the study as well as its limitations.
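To make the Markov assumption concrete: under an MDP, the value update for a state–action pair uses only the current state, the action taken, the immediate reward, and the next state, never any earlier dialogue history. The following sketch illustrates this with tabular Q-learning on a toy dialogue task; the states, actions, and reward values are invented for illustration and are not the article's model or experimental setup.

```python
import random

random.seed(0)

STATES = ["greeting", "query", "confirm", "done"]
ACTIONS = ["ask", "confirm", "execute"]

def step(state, action):
    """Hypothetical transition/reward model for a toy dialogue task."""
    if state == "greeting":
        return ("query", 0.0)
    if state == "query":
        if action == "ask":
            return ("confirm", 0.0)
        return ("query", -1.0)      # penalize acting before asking
    if state == "confirm":
        if action == "execute":
            return ("done", 10.0)   # reward for completing the task
        return ("confirm", -1.0)
    return ("done", 0.0)

# Tabular Q-learning: Q[s][a] is updated from (s, a, r, s') alone,
# which is exactly the Markov assumption the article examines.
Q = {s: {a: 0.0 for a in ACTIONS} for s in STATES}
alpha, gamma, epsilon = 0.5, 0.9, 0.1

for episode in range(500):
    state = "greeting"
    while state != "done":
        if random.random() < epsilon:
            action = random.choice(ACTIONS)      # explore
        else:
            action = max(Q[state], key=Q[state].get)  # exploit
        next_state, reward = step(state, action)
        # Markov update: nothing before (state, action) is consulted.
        best_next = max(Q[next_state].values())
        Q[state][action] += alpha * (reward + gamma * best_next
                                     - Q[state][action])
        state = next_state

policy = {s: max(Q[s], key=Q[s].get) for s in STATES if s != "done"}
print(policy)
```

The learned policy asks before confirming and executes only once confirmed. The alternative model the article evaluates instead predicts the total reward from features that accumulate over the whole dialogue, relaxing this single-step dependence.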
Language Resources and Evaluation
Volume 40, Issue 1, pp. 47–66
- Kluwer Academic Publishers
- Spoken dialogue
- Dialogue management
- Markov assumption