Characterizing Markov Decision Processes
Problem characteristics often have a significant influence on the difficulty of solving optimization problems. In this paper, we propose attributes for characterizing Markov Decision Processes (MDPs), and discuss how they affect the performance of reinforcement learning algorithms that use function approximation. The attributes mainly measure the amount of randomness in the environment. Their values can be calculated from the MDP model or estimated on-line. We show empirically that two of the proposed attributes have a statistically significant effect on the quality of learning. We discuss how measurements of the proposed MDP attributes can be used to facilitate the design of reinforcement learning systems.
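As an illustration of how a randomness attribute could be calculated from the MDP model or estimated on-line, the sketch below uses the Shannon entropy of the next-state distribution for each state-action pair. This choice of measure is an assumption made for the example; the abstract does not name the attributes it proposes. The names `transition_entropy`, `entropy_from_model`, and `OnlineEntropyEstimator` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def transition_entropy(probs):
    """Shannon entropy (in bits) of one next-state distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


def entropy_from_model(P):
    """Compute the entropy for every (state, action) pair of a known model.

    P maps (state, action) -> {next_state: probability}.
    This dictionary layout is an assumption made for the example.
    """
    return {sa: transition_entropy(dist.values()) for sa, dist in P.items()}


class OnlineEntropyEstimator:
    """Estimate the same quantity from observed transitions."""

    def __init__(self):
        # (state, action) -> counts of observed next states
        self.counts = defaultdict(Counter)

    def update(self, s, a, s_next):
        self.counts[(s, a)][s_next] += 1

    def estimate(self, s, a):
        c = self.counts.get((s, a), Counter())
        n = sum(c.values())
        if n == 0:
            return 0.0  # no data yet for this pair
        # Plug-in estimate from empirical frequencies
        return transition_entropy(k / n for k in c.values())


# Model-based use: one deterministic and one uniformly random transition.
model = {
    ("s0", "left"): {"s1": 1.0},
    ("s0", "right"): {"s1": 0.5, "s2": 0.5},
}
model_entropy = entropy_from_model(model)

# On-line use: feed observed (state, action, next_state) samples.
est = OnlineEntropyEstimator()
for s_next in ["s1", "s2", "s1", "s2"]:
    est.update("s0", "right", s_next)
online_entropy = est.estimate("s0", "right")
```

A deterministic transition yields an entropy of zero, and the value grows as the next state becomes less predictable. The plug-in estimate from counts is biased for small samples, so on-line values are only rough until each state-action pair has been visited many times.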