Abstract
Reinforcement learning is a paradigm under which an agent seeks to improve its policy by making learning updates based on the experiences it gathers through interaction with the environment. Model-free algorithms perform updates solely based on observed experiences. By contrast, model-based algorithms learn a model of the environment that effectively simulates its dynamics. The model may be used to simulate experiences or to plan into the future, potentially expediting the learning process. This paper presents a model-based reinforcement learning approach for Keepaway, a complex, continuous, stochastic, multiagent subtask of RoboCup simulated soccer. First, we propose the design of an environmental model that is partly learned based on the agent’s experiences. This model is then coupled with the reinforcement learning algorithm to learn an action selection policy. We evaluate our method through empirical comparisons with model-free approaches that have been previously applied successfully to this task. Results demonstrate significant gains in the learning speed and asymptotic performance of our method. We also show that the learned model can be used effectively as part of a planning-based approach with a hand-coded policy.
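To make the distinction between model-free and model-based updates concrete, the following is a minimal sketch of a tabular Dyna-style learner on a toy chain environment. It is not the method of this paper: Keepaway is continuous and multiagent, whereas this example assumes a small deterministic chain, a lookup-table model, and Q-learning updates. The names `step` and `dyna_q` and all parameter values are hypothetical, chosen only for illustration.

```python
import random


def step(state, action, n_states):
    """Toy chain (an assumption for illustration): action 1 moves right,
    action 0 moves left; reward 1.0 on reaching the last state."""
    if action == 1:
        next_state = min(state + 1, n_states - 1)
    else:
        next_state = max(state - 1, 0)
    done = next_state == n_states - 1
    return next_state, (1.0 if done else 0.0), done


def dyna_q(n_states=6, episodes=50, planning_steps=10,
           alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    model = {}  # learned model: (state, action) -> (next_state, reward, done)

    def update(s, a, r, s2, done):
        # One Q-learning backup toward the one-step target.
        target = r if done else r + gamma * max(q[s2])
        q[s][a] += alpha * (target - q[s][a])

    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # Epsilon-greedy action selection with random tie-breaking.
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            s2, r, done = step(s, a, n_states)
            update(s, a, r, s2, done)      # model-free update from real experience
            model[(s, a)] = (s2, r, done)  # record the observed transition in the model
            # Model-based part: replay transitions simulated by the learned model.
            for _ in range(planning_steps):
                ps, pa = rng.choice(list(model))
                ps2, pr, pdone = model[(ps, pa)]
                update(ps, pa, pr, ps2, pdone)
            s = s2
    return q


if __name__ == "__main__":
    for state, values in enumerate(dyna_q()):
        print(state, [round(v, 3) for v in values])
```

Setting `planning_steps=0` reduces the sketch to plain model-free Q-learning, which gives a simple way to compare how many real environment steps each variant needs on this toy problem.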
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kalyanakrishnan, S., Stone, P., Liu, Y. (2008). Model-Based Reinforcement Learning in a Complex Domain. In: Visser, U., Ribeiro, F., Ohashi, T., Dellaert, F. (eds) RoboCup 2007: Robot Soccer World Cup XI. RoboCup 2007. Lecture Notes in Computer Science, vol 5001. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68847-1_15
DOI: https://doi.org/10.1007/978-3-540-68847-1_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68846-4
Online ISBN: 978-3-540-68847-1
eBook Packages: Computer Science, Computer Science (R0)