
Hierarchical Reinforcement Learning with Options and United Neural Network Approximation

Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 874)


The “curse of dimensionality” and environments with sparse, delayed rewards are among the main challenges in reinforcement learning (RL). To tackle these problems we can use hierarchical reinforcement learning (HRL), which provides abstraction over both the actions and the states of the environment. This work proposes an algorithm that combines a hierarchical approach to RL with the ability of neural networks to serve as universal function approximators. To build the hierarchy of actions we use the options framework, whose main idea is to utilize macro-actions (sequences of simpler actions). The state of the environment is the input to a convolutional neural network that plays the role of a Q-function, estimating the utility of every possible action and skill in the given state. We learn each option separately using different neural networks and then combine the results into one architecture with a top-level approximator. We compare the performance of the proposed algorithm with the deep Q-network (DQN) algorithm in an environment where the aim of a magnet-arm robot is to build a tower from bricks.
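As a minimal illustration of the SMDP Q-learning rule that underlies the options framework, the sketch below uses a linear approximator in place of the paper's convolutional networks; the class and function names are illustrative, not taken from the paper. The key point is the target for a macro-action that ran for k primitive steps: r + γ^k · max Q(s′, ·).

```python
import numpy as np

class LinearQ:
    """Linear Q-function over state features -- a lightweight stand-in for
    the convolutional Q-networks described in the paper. One row of weights
    per choice, where a 'choice' is either a primitive action or an option."""
    def __init__(self, n_features, n_choices, lr=0.1):
        self.W = np.zeros((n_choices, n_features))
        self.lr = lr

    def values(self, phi):
        # One Q-value estimate per primitive action / option.
        return self.W @ phi

    def update(self, phi, choice, target):
        # Semi-gradient Q-learning step toward the given target.
        td_error = target - self.W[choice] @ phi
        self.W[choice] += self.lr * td_error * phi
        return td_error


def smdp_q_target(option_reward, duration, phi_next, top_q, gamma=0.99):
    """SMDP Q-learning target for a macro-action (option) that ran for
    `duration` primitive steps and accumulated the discounted reward
    `option_reward`: r + gamma**k * max over next choices of Q(s', .)."""
    return option_reward + gamma ** duration * top_q.values(phi_next).max()


if __name__ == "__main__":
    # Top-level approximator over 2 primitive actions + 2 learned options.
    top_q = LinearQ(n_features=3, n_choices=4)
    phi, phi_next = np.ones(3), np.ones(3)
    # Suppose option 2 ran for 5 steps and collected discounted reward 1.0.
    target = smdp_q_target(1.0, duration=5, phi_next=phi_next, top_q=top_q)
    top_q.update(phi, choice=2, target=target)
    print(top_q.values(phi))
```

Primitive actions are the special case with `duration=1`, which recovers the ordinary one-step Q-learning target, so a single top-level approximator can rank skills and primitive actions on the same scale.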


Keywords:
  • Hierarchical reinforcement learning
  • Options
  • Neural network
  • DQN
  • Deep neural network
  • Q-learning






This work was supported by the Russian Science Foundation (Project No. 18-71-00143).



Corresponding author

Correspondence to Aleksandr I. Panov .


Copyright information

© 2019 Springer Nature Switzerland AG


Cite this paper

Kuzmin, V., Panov, A.I. (2019). Hierarchical Reinforcement Learning with Options and United Neural Network Approximation. In: Abraham, A., Kovalev, S., Tarassov, V., Snasel, V., Sukhanov, A. (eds) Proceedings of the Third International Scientific Conference “Intelligent Information Technologies for Industry” (IITI’18). IITI'18 2018. Advances in Intelligent Systems and Computing, vol 874. Springer, Cham.
