A Modular Hierarchical Reinforcement Learning Algorithm

  • Zhibin Liu
  • Xiaoqin Zeng
  • Huiyi Liu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7390)


How to improve learning efficiency and optimize the encapsulation of subtasks is a key problem that hierarchical reinforcement learning needs to solve. This paper proposes a modular hierarchical reinforcement learning algorithm, named MHRL, in which modularized hierarchical subtasks are trained by their own independent reward systems. During learning, MHRL produces an optimization strategy for the different modular layers, which enables independent modules to execute concurrently. In addition, this paper presents experimental results for application problems with nested learning processes. The results show that MHRL can increase learning reusability and improve learning efficiency dramatically.


Keywords: Modular Hierarchical Reinforcement Learning · MAXQ · Markov Decision Process · Software Reuse · Optimization Algorithm
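The abstract's core idea, hierarchical subtasks packaged as modules that are each trained against an independent reward system, can be illustrated with plain tabular Q-learning. The sketch below is not the paper's MHRL algorithm (the full method involves MAXQ-style decomposition and coordination across layers); it only shows the "independent reward system per module" ingredient, using hypothetical names (`SubtaskModule`, `reach_goal`) and a toy corridor task invented for illustration.

```python
import random

random.seed(0)  # reproducible training for this sketch

class SubtaskModule:
    """A modular subtask with its own Q-table and private reward function,
    so each module can be trained (and reused) independently of the others."""

    def __init__(self, reward_fn, actions, alpha=0.5, gamma=0.9, epsilon=0.1):
        self.q = {}                 # (state, action) -> estimated value
        self.reward_fn = reward_fn  # this module's independent reward system
        self.actions = actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def choose(self, state):
        """Epsilon-greedy action selection within the module."""
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q.get((state, a), 0.0))

    def update(self, state, action, next_state):
        """One tabular Q-learning step using the module's own reward."""
        r = self.reward_fn(state, action, next_state)
        best_next = max(self.q.get((next_state, a), 0.0) for a in self.actions)
        old = self.q.get((state, action), 0.0)
        self.q[(state, action)] = old + self.alpha * (r + self.gamma * best_next - old)

# Hypothetical 1-D corridor: states 0..4, actions -1 (left) / +1 (right).
# This module's private reward pays 1.0 only for reaching the goal state 4.
reach_goal = SubtaskModule(
    reward_fn=lambda s, a, s2: 1.0 if s2 == 4 else 0.0,
    actions=[-1, 1],
)

for _ in range(200):  # train this module on its own, independently of any parent task
    s = 0
    while s != 4:
        a = reach_goal.choose(s)
        s2 = min(4, max(0, s + a))  # deterministic corridor dynamics
        reach_goal.update(s, a, s2)
        s = s2
```

Because each module owns its reward function and Q-table, modules can be trained concurrently or reused inside a different hierarchy without retraining the rest, which is the reusability the abstract emphasizes.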





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Zhibin Liu (1)
  • Xiaoqin Zeng (1)
  • Huiyi Liu (1)
  1. Institute of Intelligence Science and Technology, Hohai University, Nanjing, China
