Abstract
Hierarchical Reinforcement Learning (HRL) is a viable approach to learning complex tasks, especially those with an inner hierarchical structure. HRL methods decompose the problem into a typically two-layered hierarchy. At the lower level, individual skills are trained to solve specific non-trivial subtasks, such as locomotion primitives. The high-level agent can then use these skills as its actions, enabling it to tackle the overall task. Identifying an appropriate skill set is, however, a difficult problem in itself. Most current approaches solve it with a pre-training phase in which the skills are trained and fixed before the training of the high-level agent begins. Fixing the skill set prior to the main training session can, however, introduce flaws into the HRL system, especially if a useful skill was not successfully identified and is therefore missing from the skill set. Our Adaptive Skill Acquisition (ASA) framework targets precisely these situations. It can be plugged into existing HRL architectures to fix defects in the pre-trained skill set. During the training of the high-level agent, ASA detects a missing skill, trains it, and integrates it into the existing system. In this paper, we present new improvements to the ASA framework, notably a new skill-training reward function and support for skill-stopping functions that enable better integration. Furthermore, we extend our earlier pilot tests into extensive experiments that evaluate the functionality of ASA against its theoretical bounds. The source code of ASA is available online (https://github.com/holasjuraj/asa).
This research was supported by KEGA grant no. 042UK-4/2019.
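To make the loop described in the abstract (detect a missing skill, train it, integrate it) concrete, the following is a minimal, self-contained Python sketch. The toy integer-state environment and all names (Skill, detect_missing_skill, train_skill, train_high_level_agent) are our own illustrative assumptions; they do not reflect the actual API of the linked repository or of the garage toolkit it builds on.

```python
import random
from typing import Callable, List, Optional


class Skill:
    """A low-level skill: a behaviour plus a skill-stopping function that
    decides when control returns to the high-level agent."""

    def __init__(self, name: str, stop_fn: Callable[[int], bool]):
        self.name = name
        self.stop_fn = stop_fn

    def run(self, state: int) -> int:
        # Dummy low-level behaviour: advance the integer state one unit
        # at a time until the stopping function fires.
        while True:
            state += 1
            if self.stop_fn(state):
                return state


def detect_missing_skill(failures: List[int], threshold: int = 3) -> Optional[int]:
    """Toy stand-in for ASA's deficiency detection: if the agent keeps
    getting stuck around the same state, a skill for that region is missing."""
    for s in set(failures):
        if failures.count(s) >= threshold:
            return s
    return None


def train_skill(target: int) -> Skill:
    """Toy stand-in for skill training; a real implementation would optimise
    a policy under a skill-training reward for reaching the sub-goal."""
    return Skill(f"reach_{target}", stop_fn=lambda s: s >= target)


def train_high_level_agent(goal: int = 10, n_iterations: int = 20) -> List[Skill]:
    skills = [Skill("step", stop_fn=lambda s: True)]  # incomplete pre-trained set
    failures: List[int] = []
    for _ in range(n_iterations):
        state = 0
        for _ in range(5):                    # high-level episode: pick 5 skills
            state = random.choice(skills).run(state)
        if state < goal:                      # task failed; remember where
            failures.append(state)
        gap = detect_missing_skill(failures)
        if gap is not None:
            # ASA step: train the missing skill on the fly and integrate it
            # into the high-level action space. In ASA proper, the new
            # skill's sub-goal is derived from the detected deficiency.
            skills.append(train_skill(goal))
            failures.clear()
    return skills


if __name__ == "__main__":
    final_skills = train_high_level_agent()
    print("Skill set after training:", [s.name for s in final_skills])
```

In this sketch the initial skill set cannot solve the task, so the detector repeatedly observes failures at the same state, triggers training of a bridging skill, and the extended skill set then lets the high-level agent reach the goal; the real framework performs the analogous steps with learned policies instead of hard-coded stubs.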