Abstract
This paper presents the CQ algorithm, which decomposes and solves a Markov Decision Process (MDP) by automatically generating a hierarchy of smaller MDPs from state variables. The algorithm relies on a heuristic that applies to problems that can be modelled by a set of state variables conforming to a special ordering, defined in this paper as a “nested Markov ordering”. The benefits of this approach are (1) the automatic generation of actions and termination conditions at all levels of the hierarchy, and (2) linear scaling with the number of variables under certain conditions. The approach draws heavily on Dietterich’s MAXQ value function decomposition and on the region-based decomposition of MDPs by Hauskrecht, Meuleau, Kaelbling, Dean, Boutilier and others. The CQ algorithm is described and its operation illustrated using a four-room example. Solutions with different numbers of hierarchical levels are then generated for Dietterich’s taxi tasks.
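As a rough illustration of what modelling a problem "by a set of state variables" can look like for a four-room grid, the sketch below factors a flat grid cell into a (room, position-in-room) pair and back. The 2x2 room layout, the 5x5 room size, and the function names are assumptions made for illustration only; they are not taken from the paper, and the code does not implement the CQ algorithm itself.

```python
from typing import Tuple

ROOM_SIZE = 5               # assumed: each room is a 5x5 block of cells
GRID_SIZE = 2 * ROOM_SIZE   # assumed: four rooms arranged in a 2x2 layout


def to_state_variables(x: int, y: int) -> Tuple[int, int]:
    """Map a grid cell (x, y) to the pair (room, position-in-room)."""
    if not (0 <= x < GRID_SIZE and 0 <= y < GRID_SIZE):
        raise ValueError("cell outside the grid")
    # Coarse variable: which of the four rooms the cell belongs to.
    room = (y // ROOM_SIZE) * 2 + (x // ROOM_SIZE)
    # Fine variable: the cell's index within its own room.
    position = (y % ROOM_SIZE) * ROOM_SIZE + (x % ROOM_SIZE)
    return room, position


def from_state_variables(room: int, position: int) -> Tuple[int, int]:
    """Inverse mapping: recover the grid cell from (room, position)."""
    x = (room % 2) * ROOM_SIZE + position % ROOM_SIZE
    y = (room // 2) * ROOM_SIZE + position // ROOM_SIZE
    return x, y
```

In this hypothetical encoding the position variable changes on almost every step while the room variable changes only when the agent passes between rooms, which is the kind of structure a hierarchy built from state variables could exploit.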
References
Dean, T., Lin, S.-H.: Decomposition Techniques for Planning in Stochastic Domains. (Technical Report CS-95-10). Department of Computer Science, Brown University, Providence, RI (1995)
Dietterich, T. G.: Hierarchical Reinforcement Learning with MAXQ Value Function Decomposition. Department of Computer Science, Oregon State University, Corvallis, OR (1999)
Digney, B. L.: Emergent Hierarchical Control Structures: Learning Reactive/Hierarchical Relationships in Reinforcement Environments. In: Maes, P., et al. (eds.): From Animals to Animats 4: Proceedings of the Fourth International Conference on Simulation of Adaptive Behaviour. MIT Press, Cambridge (MA), London (1996) 363–372
Hauskrecht, M., Meuleau, N., Kaelbling, L. P., Dean, T., Boutilier, C.: Hierarchical Solution of Markov Decision Processes using Macro-actions. (Technical Report). Department of Computer Science, Brown University, Providence, RI (1998)
Parr, R. E.: Hierarchical Control and Learning for Markov Decision Processes. Doctoral dissertation, Computer Science, University of California, Berkeley (1998)
Parr, R., Russell, S.: Reinforcement Learning with Hierarchies of Machines. In: Advances in Neural Information Processing Systems 10. MIT Press (1998)
Singh, S.: Reinforcement Learning with a Hierarchy of Abstract Models. In: Proceedings of the Tenth National Conference on Artificial Intelligence. AAAI Press, Menlo Park (1992)
Sutton, R. S., Barto, A. G.: Reinforcement Learning: An Introduction. MIT Press, Cambridge (1998)
Sutton, R. S., Singh, S., Precup, D., Ravindran, B.: Improved Switching among Temporally Abstract Actions. In: Advances in Neural Information Processing Systems 11 (Proceedings of the 1998 conference). MIT Press (1999) 1066–1072
Sutton, R. S., Precup, D., Singh, S.: Between MDPs and Semi-MDPs: Learning, Planning, and Representing Knowledge at Multiple Temporal Scales. (Technical Report). Department of Computer and Information Sciences, University of Massachusetts, Amherst, MA (1998)
Thrun, S., O’Sullivan, J.: Discovering Structure in Multiple Learning Tasks: The TC Algorithm. Proceedings of the Thirteenth International Conference on Machine Learning. Morgan Kaufmann, San Mateo (1996)
Thrun, S., Schwartz, A.: Finding Structure in Reinforcement Learning. Advances in Neural Information Processing Systems 7, Morgan Kaufmann, San Mateo (1995)
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hengst, B. (2000). Generating Hierarchical Structure in Reinforcement Learning from State Variables. In: Mizoguchi, R., Slaney, J. (eds) PRICAI 2000 Topics in Artificial Intelligence. PRICAI 2000. Lecture Notes in Computer Science, vol 1886. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44533-1_54
DOI: https://doi.org/10.1007/3-540-44533-1_54
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67925-7
Online ISBN: 978-3-540-44533-3
eBook Packages: Springer Book Archive