
Teaching Robot Navigation Behaviors to Optimal RRT Planners

International Journal of Social Robotics

Abstract

This work presents an approach for learning navigation behaviors for robots using Optimal Rapidly-exploring Random Trees (RRT\(^{*}\)) as the main planner. A new learning algorithm that combines Inverse Reinforcement Learning with RRT\(^{*}\) is developed to learn the RRT\(^{*}\) cost function from demonstrations. A comparison with other state-of-the-art algorithms shows how the method can recover the demonstrated behavior. Finally, a cost function learned for social navigation is tested in real experiments with a robot in the laboratory.
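As a rough illustration of the idea summarized above, the sketch below shows inverse reinforcement learning with an RRT\(^{*}\)-style planner in the loop: the path cost is a weighted sum of features, and the weights are adjusted until plans produced under the current cost match the feature counts of the demonstrations. This is a minimal sketch under assumed interfaces, not the paper's exact algorithm; the names `path_features`, `learn_cost_weights`, and `plan_with_rrt_star` are placeholders.

```python
import numpy as np

def path_features(path, feature_fns):
    """Accumulate each feature along a path (sum over waypoints)."""
    return np.sum([[f(x) for f in feature_fns] for x in path], axis=0)

def learn_cost_weights(demos, plan_with_rrt_star, feature_fns,
                       n_iters=50, lr=0.1):
    """demos: list of demonstrated paths (sequences of states).
    plan_with_rrt_star: callable (start, goal, weights) -> planned path.
    Returns weights of the linear cost c(path) = w . f(path)."""
    n_features = len(feature_fns)
    w = np.ones(n_features) / n_features  # start from uniform weights

    # Mean feature counts of the expert demonstrations
    f_demo = np.mean([path_features(d, feature_fns) for d in demos], axis=0)

    for _ in range(n_iters):
        # Plan with the RRT*-style planner under the current cost,
        # reusing each demonstration's start and goal states
        planned = [plan_with_rrt_star(d[0], d[-1], w) for d in demos]
        f_plan = np.mean([path_features(p, feature_fns) for p in planned], axis=0)

        # Increase the weight of features over-represented in the planner's
        # paths relative to the demonstrations, so they become more costly
        w += lr * (f_plan - f_demo)
        w = np.clip(w, 0.0, None)  # keep the cost non-negative
        if w.sum() > 0:
            w /= w.sum()           # normalize the weights
    return w
```

Once the weights converge, the same weighted cost function can be used directly by the RRT\(^{*}\) planner at navigation time, so the robot plans with the behavior encoded in the demonstrations.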


Notes

  1. http://wiki.ros.org/move_base.

  2. https://github.com/robotics-upo/upo_robot_navigation.


Funding

This study was funded by the EC-FP7 under Grant Agreement No. 611153 (TERESA).

Author information

Corresponding author

Correspondence to Noé Pérez-Higueras.

Ethics declarations

Conflict of Interest

The authors declare that they have no conflict of interest.

Additional information

This work is partially supported by the EC-FP7 under Grant Agreement No. 611153 (TERESA) and the project PAIS-MultiRobot funded by the Junta de Andalucía (TIC-7390).

About this article

Cite this article

Pérez-Higueras, N., Caballero, F. & Merino, L. Teaching Robot Navigation Behaviors to Optimal RRT Planners. Int J of Soc Robotics 10, 235–249 (2018). https://doi.org/10.1007/s12369-017-0448-1

Keywords

Navigation