Robot learning from demonstration for path planning: A review

Xie, ZongWu; Zhang, Qi; Jiang, ZaiNan; Liu, Hong

doi:10.1007/s11431-020-1648-4

Review
Published: 06 July 2020

Volume 63, pages 1325–1334 (2020)

Science China Technological Sciences

Abstract

Learning from demonstration (LfD) is an appealing way to help robots learn new skills, and numerous papers have presented LfD methods that perform well in robotics. However, how to handle complicated robot tasks that require carefully regulated path planning strategies remains an open problem. Contact and non-contact constraints in specific robot tasks make the path planning problem more difficult because the interaction between the robot and the environment is time-varying. In this paper, we focus on path planning for complex robot tasks in the LfD domain and offer a novel perspective for classifying imitation learning and inverse reinforcement learning methods, based on constraints and obstacle avoidance. Finally, we summarize these methods and present promising directions for robot applications and LfD theory.

Author information

Authors and Affiliations

State Key Laboratory of Robotics and System, Harbin Institute of Technology, Harbin, 150001, China
ZongWu Xie, Qi Zhang, ZaiNan Jiang & Hong Liu

Corresponding author

Correspondence to ZaiNan Jiang.

Additional information

This work was supported by the National Natural Science Foundation of China (Grant No. 91848202), and the Foundation for Innovative Research Groups of the National Natural Science Foundation of China (Grant No. 51521003).

About this article

Cite this article

Xie, Z., Zhang, Q., Jiang, Z. et al. Robot learning from demonstration for path planning: A review. Sci. China Technol. Sci. 63, 1325–1334 (2020). https://doi.org/10.1007/s11431-020-1648-4

Received: 13 March 2020
Accepted: 21 May 2020
Published: 06 July 2020
Issue Date: August 2020
DOI: https://doi.org/10.1007/s11431-020-1648-4
