Abstract
This work presents an approach for teaching navigation behaviors to robots using Optimal Rapidly-exploring Random Trees (RRT\(^{*}\)) as the main planner. A new learning algorithm that combines Inverse Reinforcement Learning with RRT\(^{*}\) is developed to learn the RRT\(^{*}\) cost function from demonstrations. A comparison with other state-of-the-art algorithms shows how the method recovers the behavior encoded in the demonstrations. Finally, a cost function learned for social navigation is tested in real experiments with a robot in the laboratory.
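The core idea described in the abstract, learning the weights of an RRT\(^{*}\) cost function so that planned paths reproduce the feature statistics of demonstrated paths, can be illustrated with a minimal sketch. This is an assumption-laden toy, not the paper's implementation: the cost is taken to be linear in state features, `plan_with_cost` stands in for an actual RRT\(^{*}\) planner, and the update rule is a generic feature-matching gradient step of the kind used in IRL.

```python
import numpy as np

def feature_counts(path, features):
    """Sum each feature along a path (a list of states)."""
    return np.sum([[f(s) for f in features] for s in path], axis=0)

def irl_weight_update(weights, demo_paths, plan_with_cost, features,
                      lr=0.1, iterations=50):
    """Feature-matching IRL loop (illustrative sketch).

    Adjusts the cost weights so that paths returned by the planner
    reproduce the average feature counts of the demonstrations.
    `plan_with_cost` is a hypothetical stand-in for an RRT* planner
    that returns the best path under the cost c(s) = w . f(s).
    """
    demo_f = np.mean([feature_counts(p, features) for p in demo_paths],
                     axis=0)
    for _ in range(iterations):
        plan = plan_with_cost(weights)
        plan_f = feature_counts(plan, features)
        # If the planner over-uses a feature relative to the demos,
        # raise that feature's cost weight (and vice versa).
        weights = weights + lr * (plan_f - demo_f)
        weights = np.clip(weights, 0.0, None)  # keep costs non-negative
        weights = weights / max(weights.sum(), 1e-9)  # normalize
    return weights
```

In the full method, each planning call inside the loop is an RRT\(^{*}\) query under the current cost, which is what makes the learned weights consistent with how the planner will actually be used at deployment time.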
Funding
This study was funded by the EC-FP7 under Grant Agreement No. 611153 (TERESA).
Ethics declarations
Conflict of Interest
The authors declare that they have no conflict of interest.
Additional information
This work is partially supported by the EC-FP7 under Grant Agreement No. 611153 (TERESA) and the project PAIS-MultiRobot funded by the Junta de Andalucía (TIC-7390).
Cite this article
Pérez-Higueras, N., Caballero, F. & Merino, L. Teaching Robot Navigation Behaviors to Optimal RRT Planners. Int J of Soc Robotics 10, 235–249 (2018). https://doi.org/10.1007/s12369-017-0448-1