
Hierarchical Reinforcement Learning with Clustering Abstract Machines

Part of the Communications in Computer and Information Science book series (CCIS, volume 1093)

Abstract

Hierarchical reinforcement learning (HRL) is a further step towards the convergence of learning and planning methods. The reusable abstract plans it produces facilitate transfer learning and increase resilience in difficult environments with delayed rewards. However, the practical application of HRL, especially in robotics, faces a number of difficulties, chief among them the semi-manual task of creating the hierarchy of actions that the agent uses as a pre-trained scheme. In this paper, we present a new approach for simultaneously constructing and applying a hierarchy of actions and sub-goals. In contrast to prior efforts in this direction, the method is based on a unified loop that clusters the environment states observed by the agent and allocates sub-goals via a modified bottleneck method, from which a hierarchy of abstract machines is constructed. The general machine is built from so-called programmable schemes, which are universal enough to organize transfer learning for a wide class of tasks. A particular abstract machine is assigned to each set of clustered states; the goal of each machine is to reach one of the discovered bottleneck states and thereby pass into another cluster. We evaluate our approach on a standard suite of experiments in a challenging planning domain and show that it facilitates learning without prior knowledge.
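The abstract outlines a loop of state clustering, bottleneck (sub-goal) discovery, and per-cluster abstract machines. The sketch below is only an illustration of that loop under loose assumptions: k-means stands in for whatever clustering the authors actually use, a simple cluster-crossing heuristic stands in for the modified bottleneck method, and the names cluster_states, find_bottlenecks, and ClusterMachine are hypothetical. It does not reproduce the paper's programmable schemes.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_states(states, n_clusters=4, seed=0):
    """Partition observed state vectors into clusters; one abstract
    machine will be assigned to each cluster."""
    km = KMeans(n_clusters=n_clusters, random_state=seed, n_init=10)
    return km.fit_predict(states)

def find_bottlenecks(transitions, labels):
    """Simplified stand-in for the bottleneck criterion: a state is a
    candidate sub-goal if an observed transition leaves its cluster."""
    return {s for s, s_next in transitions if labels[s] != labels[s_next]}

class ClusterMachine:
    """Toy stand-in for an abstract machine tied to one cluster: it
    targets the cluster's bottleneck states and terminates once the
    agent has crossed into another cluster."""
    def __init__(self, cluster_id, bottlenecks, labels):
        self.cluster_id = cluster_id
        self.subgoals = sorted(s for s in bottlenecks
                               if labels[s] == cluster_id)

    def is_done(self, state_idx, labels):
        # The machine's episode ends when the agent leaves its cluster.
        return labels[state_idx] != self.cluster_id

# Toy usage: 200 random 2-D "states" and a random walk over them.
rng = np.random.default_rng(0)
states = rng.normal(size=(200, 2))
walk = rng.integers(0, 200, size=50)
transitions = list(zip(walk[:-1], walk[1:]))

labels = cluster_states(states)
bottlenecks = find_bottlenecks(transitions, labels)
machines = [ClusterMachine(c, bottlenecks, labels)
            for c in sorted(set(labels))]
for m in machines:
    print(f"cluster {m.cluster_id}: {len(m.subgoals)} candidate sub-goals")
```

In the paper the loop is run jointly with learning, so clusters and sub-goals are refined as the agent gathers experience; the one-shot version above is only meant to make the cluster-machine-bottleneck relationship concrete.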

Keywords

  • Hierarchical reinforcement learning
  • Machine learning
  • Reinforcement learning



Acknowledgements

This work was supported by the Russian Science Foundation, project no. 18-71-00143.

Author information

Correspondence to Aleksandr I. Panov.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Skrynnik, A., Panov, A.I. (2019). Hierarchical Reinforcement Learning with Clustering Abstract Machines. In: Kuznetsov, S., Panov, A. (eds) Artificial Intelligence. RCAI 2019. Communications in Computer and Information Science, vol 1093. Springer, Cham. https://doi.org/10.1007/978-3-030-30763-9_3


  • DOI: https://doi.org/10.1007/978-3-030-30763-9_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-30762-2

  • Online ISBN: 978-3-030-30763-9

  • eBook Packages: Computer Science, Computer Science (R0)