Hierarchical Actor-Critic with Hindsight for Mobile Robot with Continuous State Space

Aleksey, Staroverov; Panov, Aleksandr I.

doi:10.1007/978-3-030-30425-6_6

Conference paper
First Online: 04 September 2019

Part of the book series: Studies in Computational Intelligence (SCI, volume 856)

Abstract

Hierarchies are used in reinforcement learning to speed up learning in sparse-reward tasks. In such tasks, the main problem is the time the initial policy needs to reach the goal for the first time. A hierarchy can split a problem into a set of subproblems, each of which can be solved in less time. To implement this idea, Hierarchical Reinforcement Learning (HRL) algorithms must be able to learn the multiple levels of a hierarchy in parallel, so that these smaller subproblems are solved at the same time. Most well-known HRL algorithms that can learn multi-level hierarchies cannot efficiently learn the levels of policies simultaneously, especially in environments with continuous state and action spaces. To address this problem, we analyzed the most recent such framework, Hierarchical Actor-Critic with Hindsight (HAC), tested it in a simulated mobile robot environment, and determined the optimal configuration of parameters and ways to encode information about the environment states.
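
The abstract does not describe the hindsight mechanism itself. As an illustration only, and not code from the paper, the sketch below shows one common form of hindsight goal relabeling for a sparse-reward task: a failed episode is stored a second time with the goal replaced by the state that was actually reached, so at least one transition carries a non-penalty reward. The reward convention (-1.0 for a miss, 0.0 for reaching the goal), the tolerance `tol`, the 2D state, and the names `sparse_reward` and `relabel_with_hindsight` are all assumptions made for this example.

```python
from typing import List, Tuple

State = Tuple[float, float]
# (state, action, next_state, goal, reward); layout is an assumption for this sketch
Transition = Tuple[State, int, State, State, float]


def sparse_reward(state: State, goal: State, tol: float = 0.1) -> float:
    """Assumed sparse reward: 0.0 within `tol` of the goal, otherwise -1.0."""
    dist = ((state[0] - goal[0]) ** 2 + (state[1] - goal[1]) ** 2) ** 0.5
    return 0.0 if dist <= tol else -1.0


def relabel_with_hindsight(episode: List[Transition]) -> List[Transition]:
    """Copy an episode, using the last state actually reached as the goal."""
    if not episode:
        return []
    achieved = episode[-1][2]  # final next_state of the episode
    return [
        (s, a, s_next, achieved, sparse_reward(s_next, achieved))
        for (s, a, s_next, _goal, _reward) in episode
    ]


# A short episode that never gets near the original goal.
goal = (5.0, 5.0)
states = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
episode = [
    (states[i], 0, states[i + 1], goal, sparse_reward(states[i + 1], goal))
    for i in range(len(states) - 1)
]
relabeled = relabel_with_hindsight(episode)
```

In this toy episode every original transition is penalized, while the relabeled copy ends with a transition that reaches its (substituted) goal. That is the kind of learning signal hindsight methods rely on when the real goal is rarely reached early in training.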

Acknowledgements

The reported study was supported by RFBR, research project No. 17-29-07079.

Author information

Authors and Affiliations

Bauman Moscow State University, Moscow, Russia
Staroverov Aleksey
Artificial Intelligence Research Institute, Federal Research Center “Computer Science and Control” of the Russian Academy of Sciences, Moscow, Russia
Aleksandr I. Panov
Moscow Institute of Physics and Technology, Moscow, Russia
Aleksandr I. Panov

Corresponding author

Correspondence to Aleksandr I. Panov.

Editor information

Editors and Affiliations

Scientific Research Institute for System Analysis of Russian Academy of Sciences, Moscow, Russia
Boris Kryzhanovsky
Scientific Research Institute for System Analysis of Russian Academy of Sciences, Moscow, Russia
Witali Dunin-Barkowski
Scientific Research Institute for System Analysis of Russian Academy of Sciences, Moscow, Russia
Vladimir Redko
Moscow Aviation Institute (National Research University), Moscow, Russia
Yury Tiumentsev

Cite this paper

Aleksey, S., Panov, A.I. (2020). Hierarchical Actor-Critic with Hindsight for Mobile Robot with Continuous State Space. In: Kryzhanovsky, B., Dunin-Barkowski, W., Redko, V., Tiumentsev, Y. (eds) Advances in Neural Computation, Machine Learning, and Cognitive Research III. NEUROINFORMATICS 2019. Studies in Computational Intelligence, vol 856. Springer, Cham. https://doi.org/10.1007/978-3-030-30425-6_6

DOI: https://doi.org/10.1007/978-3-030-30425-6_6
Published: 04 September 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-30424-9
Online ISBN: 978-3-030-30425-6
eBook Packages: Intelligent Technologies and Robotics (R0)