Hierarchical Control Architecture for a Learning Robot Based on Heterogeneous Behaviors

  • Maxim Rovbo
  • Anton Moscowsky
  • Petr Sorokoumov
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 1093)

Abstract

The paper describes a hierarchical control architecture for robotic systems with learning that allows various goal-directed algorithms to be combined. A top-level control algorithm is proposed that switches control between base algorithms: Q-learning, random walk, and rule-based planning. The algorithm is implemented as a software module and verified on the task of finding a given door in a building with a complex layout. The task is treated as a reinforcement learning problem in two distinct cases: with the goal fixed between episodes and with the goal changing from episode to episode. Simulation showed that the proposed method is more robust across these task variants than any of the base algorithms alone, although it does not achieve the best result in every individual case.
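The abstract does not specify how the top-level algorithm decides which base behavior receives control. A minimal sketch of one plausible reading, assuming a bandit-style rule that favors the behavior with the best recent episode returns, is given below; the TopLevelController class, its eps and window parameters, and the behavior names are all hypothetical, not taken from the paper.

```python
import random
from collections import defaultdict


class TopLevelController:
    """Switches control between named base behaviors using an
    epsilon-greedy rule over each behavior's recent episode returns.
    The selection criterion is an assumption; the abstract only states
    that a top-level algorithm switches between base algorithms."""

    def __init__(self, behavior_names, eps=0.2, window=10):
        self.behavior_names = list(behavior_names)
        self.eps = eps                    # probability of exploring a behavior at random
        self.window = window              # number of recent episodes to average over
        self.history = defaultdict(list)  # behavior name -> list of episode returns

    def select(self):
        # Try every behavior at least once before comparing them.
        untried = [n for n in self.behavior_names if not self.history[n]]
        if untried:
            return random.choice(untried)
        # Occasionally hand control to a random behavior to keep exploring.
        if random.random() < self.eps:
            return random.choice(self.behavior_names)

        # Otherwise pick the behavior with the best recent average return.
        def recent_mean(name):
            recent = self.history[name][-self.window:]
            return sum(recent) / len(recent)

        return max(self.behavior_names, key=recent_mean)

    def record(self, name, episode_return):
        self.history[name].append(episode_return)


# Usage sketch: behavior names mirror the three base algorithms named
# in the abstract; the rollout itself is a placeholder.
controller = TopLevelController(["q_learning", "random_walk", "planner"])
for episode in range(100):
    name = controller.select()
    episode_return = random.random()  # stand-in for running one episode under `name`
    controller.record(name, episode_return)
```

An epsilon-greedy choice at the behavior level mirrors the exploration/exploitation trade-off already present inside Q-learning; the actual switching criterion used in the paper may well differ.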

Keywords

Robot · Control architecture · Behavior · Reinforcement learning

Acknowledgements

This work was supported in part by the National Research Center “Kurchatov Institute” (Order No. 1601 of July 5, 2018) (Sect. 3) and the RFBR grant 17-29-07083 (Sects. 1, 2 and 4).

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. National Research Center “Kurchatov Institute”, Moscow, Russia
