
Hierarchical Reinforcement Learning with Clustering Abstract Machines

  • Conference paper
Artificial Intelligence (RCAI 2019)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1093)

Abstract

Hierarchical reinforcement learning (HRL) is a step towards the convergence of learning and planning methods. The reusable abstract plans it produces both enable transfer learning and increase resilience in difficult environments with delayed rewards. However, the practical application of HRL, especially in robotics, faces a number of difficulties, chief among them the semi-manual construction of the hierarchy of actions that the agent uses as a pre-trained scheme. In this paper, we present a new approach that constructs and applies the hierarchy of actions and sub-goals simultaneously. In contrast to prior efforts in this direction, the method is based on a unified loop that clusters the environment states observed by the agent and allocates sub-goals via a modified bottleneck method, from which the hierarchy of abstract machines is constructed. The top-level machine is built from so-called programmable schemes, which are general enough to support transfer learning for a wide class of tasks. A separate abstract machine is assigned to each set of clustered states; the goal of each machine is to reach one of the discovered bottleneck states and then pass into another cluster. We evaluate our approach on a standard suite of experiments in a challenging planning domain and show that it facilitates learning without prior knowledge.
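The loop sketched in the abstract — cluster the observed states, mark bottleneck states on the cluster boundaries, and assign one abstract machine per cluster whose goal is to reach a bottleneck and cross into the neighbouring cluster — can be illustrated on a toy two-room gridworld. The layout, the trivial geometric clustering rule, and all names below are illustrative assumptions for this sketch, not the paper's actual implementation:

```python
# Toy two-room gridworld (our own illustrative layout): left room,
# right room, and a single doorway cell joining them.
LEFT = {(x, y) for x in range(4) for y in range(3)}
RIGHT = {(x, y) for x in range(5, 9) for y in range(3)}
DOOR = (4, 1)
STATES = LEFT | RIGHT | {DOOR}

def neighbors(s):
    """4-connected neighbours that are valid states."""
    x, y = s
    return [c for c in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1))
            if c in STATES]

# Step 1: cluster the observed states. A trivial geometric split stands
# in for the clustering step of the paper's unified loop.
def cluster_of(s):
    return 0 if s[0] <= 4 else 1

# Step 2: bottleneck states are those whose neighbourhood crosses the
# cluster boundary -- here, the doorway and the cell just beyond it.
bottlenecks = {
    s for s in STATES
    if any(cluster_of(n) != cluster_of(s) for n in neighbors(s))
}

# Step 3: one abstract machine per cluster; its sub-goal set is the
# bottlenecks of its own cluster, after which control passes to the
# machine of the neighbouring cluster.
machines = {
    c: {s for s in bottlenecks if cluster_of(s) == c}
    for c in (0, 1)
}

print(sorted(bottlenecks))  # -> [(4, 1), (5, 1)]
```

In a real agent, the clustering would run online over observed transitions and each machine would be a learned policy; here the machines are reduced to their sub-goal sets to keep the control flow of the loop visible.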




This work was supported by the Russian Science Foundation, project no. 18-71-00143.

Author information

Corresponding author

Correspondence to Aleksandr I. Panov.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Skrynnik, A., Panov, A.I. (2019). Hierarchical Reinforcement Learning with Clustering Abstract Machines. In: Kuznetsov, S., Panov, A. (eds) Artificial Intelligence. RCAI 2019. Communications in Computer and Information Science, vol 1093. Springer, Cham.

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-30762-2

  • Online ISBN: 978-3-030-30763-9

  • eBook Packages: Computer Science, Computer Science (R0)
