International Journal of Social Robotics, Volume 10, Issue 2, pp 235–249

Teaching Robot Navigation Behaviors to Optimal RRT Planners

  • Noé Pérez-Higueras
  • Fernando Caballero
  • Luis Merino


This work presents an approach for learning robot navigation behaviors using Optimal Rapidly-exploring Random Trees (RRT*) as the main planner. A new learning algorithm that combines Inverse Reinforcement Learning with RRT* is developed to learn the RRT* cost function from demonstrations. A comparison with other state-of-the-art algorithms shows that the method can recover the behavior encoded in the demonstrations. Finally, a learned cost function for social navigation is tested in real experiments with a robot in the laboratory.
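The core idea described above can be sketched in a few lines: a path's cost is a weighted sum of features, and the weights are adjusted until the planner's preferred path matches the demonstration. The sketch below is a toy illustration under stated assumptions, not the paper's implementation: the three candidate paths, their feature values, and the argmin "planner" stand in for paths an RRT*-style planner would actually return.

```python
# Per-path feature counts (all treated as costs, lower is better):
# [path length, closeness-to-people penalty, obstacle-proximity penalty]
# These numbers are illustrative assumptions.
candidates = [
    [0.5, 0.9, 0.1],  # short path that passes close to people
    [1.2, 0.1, 0.1],  # long detour that stays far from everyone
    [0.8, 0.3, 0.1],  # the demonstrated "social" compromise
]
demo_idx = 2  # index of the demonstrated path

def cost(w, f):
    """Linear cost function: weighted sum of path features."""
    return sum(wi * fi for wi, fi in zip(w, f))

def plan(w):
    """Stand-in planner: index of the minimum-cost candidate path."""
    return min(range(len(candidates)), key=lambda i: cost(w, candidates[i]))

def learn_weights(w, lr=0.1, iters=200):
    """Max-margin-style IRL subgradient step: while the planner prefers a
    non-demonstrated path, lower the cost of the demonstration's features
    and raise the cost of the planner's choice (weights kept non-negative)."""
    f_demo = candidates[demo_idx]
    for _ in range(iters):
        best = plan(w)
        if best == demo_idx:
            break
        w = [max(0.0, wi - lr * (fd - fb))
             for wi, fd, fb in zip(w, f_demo, candidates[best])]
    return w

w = learn_weights([1.0, 0.0, 0.0])  # start by penalising path length only
print(plan(w) == demo_idx)          # True: planner now reproduces the demo
```

With the initial weights, the planner picks the short path that crowds people; after a handful of updates, the learned weights make the demonstrated compromise the cheapest path. In the actual method, the same kind of weight update would drive an RRT* planner's cost function toward the demonstrated navigation behavior.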


Keywords: Path planning · Learning from demonstration · Social robots



This study was funded by the EC-FP7 under grant agreement no. 611153 (TERESA).

Compliance with Ethical Standards

Conflict of Interest

The authors declare that they have no conflict of interest.



Copyright information

© Springer Science+Business Media B.V., part of Springer Nature 2017

Authors and Affiliations

  1. School of Engineering, Universidad Pablo de Olavide, Seville, Spain
  2. Department of System Engineering and Automation, University of Seville, Seville, Spain
