Hierarchical Control Architecture for a Learning Robot Based on Heterogenic Behaviors

  • Maxim Rovbo
  • Anton Moscowsky
  • Petr Sorokoumov
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 1093)


The paper describes a hierarchical control architecture for learning robotic systems that allows various goal-directed algorithms to be combined. A top-level control algorithm is proposed that switches control between base algorithms: Q-learning, random walk and rule-based planning. The algorithm is implemented as a software module and is verified on the task of finding a given door in a building with a complex layout. The task is treated as a reinforcement learning problem in two distinct cases: with the goal fixed between episodes and with the goal changing from episode to episode. Simulation showed that the proposed method is more stable across the task variants than any of the base algorithms alone, although it does not give the best result in every individual case.
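The switching idea can be illustrated with a minimal sketch. The abstract does not state the switching rule, so the sketch assumes one: the top level picks a single behavior per episode, epsilon-greedily over each behavior's running mean episode return. All class and function names, the one-dimensional corridor standing in for the building, and the reward values are hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict


class QLearningBehavior:
    """Tabular Q-learning with epsilon-greedy action selection."""

    def __init__(self, actions, rng, alpha=0.5, gamma=0.9, epsilon=0.1):
        self.actions, self.rng = list(actions), rng
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q = defaultdict(float)  # (state, action) -> estimated value

    def act(self, state):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state, done):
        best_next = 0.0 if done else max(self.q[(next_state, a)] for a in self.actions)
        td_error = reward + self.gamma * best_next - self.q[(state, action)]
        self.q[(state, action)] += self.alpha * td_error


class RandomWalkBehavior:
    def __init__(self, actions, rng):
        self.actions, self.rng = list(actions), rng

    def act(self, state):
        return self.rng.choice(self.actions)

    def update(self, *transition):
        pass  # a random walk has nothing to learn


class RuleBasedBehavior:
    """Wraps a fixed rule (state -> action); the rule itself is an assumption."""

    def __init__(self, rule):
        self.rule = rule

    def act(self, state):
        return self.rule(state)

    def update(self, *transition):
        pass


class TopLevelController:
    """Assumed switching rule: epsilon-greedy over mean episode return."""

    def __init__(self, behaviors, rng, epsilon=0.2):
        self.behaviors, self.rng, self.epsilon = behaviors, rng, epsilon
        self.count = {name: 0 for name in behaviors}
        self.mean_return = {name: 0.0 for name in behaviors}

    def choose(self):
        for name, n in self.count.items():
            if n == 0:
                return name  # try every behavior once before comparing
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.behaviors))
        return max(self.mean_return, key=self.mean_return.get)

    def record(self, name, episode_return):
        self.count[name] += 1
        self.mean_return[name] += (episode_return - self.mean_return[name]) / self.count[name]


def run_episode(controller, size=6, start=0, goal=5, max_steps=50):
    """One episode in a toy corridor: -1 per step, +10 on reaching the goal cell."""
    name = controller.choose()
    state, total = start, 0.0
    for _ in range(max_steps):
        action = controller.behaviors[name].act(state)
        next_state = min(max(state + action, 0), size - 1)
        done = next_state == goal
        reward = 10.0 if done else -1.0
        # Every behavior sees the transition; only Q-learning uses it.
        for behavior in controller.behaviors.values():
            behavior.update(state, action, reward, next_state, done)
        state, total = next_state, total + reward
        if done:
            break
    controller.record(name, total)
    return name, total


def make_controller(seed=0):
    rng = random.Random(seed)
    actions = (-1, +1)
    return TopLevelController({
        "q_learning": QLearningBehavior(actions, rng),
        "random_walk": RandomWalkBehavior(actions, rng),
        "rule_based": RuleBasedBehavior(lambda state: +1),  # hypothetical rule: always step right
    }, rng)


controller = make_controller()
for _ in range(30):
    run_episode(controller)
print(controller.count, controller.mean_return)
```

Because Q-learning is off-policy, the sketch lets it learn from transitions generated while another behavior is in control. That is a design choice of this illustration rather than something the abstract states.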


Robot, Control architecture, Behavior, Reinforcement learning



This work was supported in part by the National Research Center “Kurchatov Institute” (Order No. 1601 of July 5, 2018) (Sect. 3) and the RFBR grant 17-29-07083 (Sects. 1, 2 and 4).



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. National Research Center “Kurchatov Institute”, Moscow, Russia
