Learning from Demonstration Based on a Classification of Task Parameters and Trajectory Optimization

Abstract

Learning from demonstration involves extracting important information from demonstrations and reproducing robot action sequences or trajectories with generalization capabilities. Because the state-space environment is effectively infinite, task parameters, which represent dependencies observed in the demonstrations, are used to constrain and define a robot action. We present a methodology for learning from demonstration based on a classification of task parameters. The classified task parameters are used to construct a cost function that describes the demonstration data. For reproduction, we propose a novel trajectory optimization that generates a simplified version of the trajectory for different configurations of the task parameters. As the final step before reproduction on a real robotic arm, we approximate this trajectory with a dynamic movement primitive (DMP)-based system to retrieve a smooth trajectory. Results obtained for trajectories with three degrees of freedom (two translations and one rotation) show that the system can encode multiple task parameters from a small number of demonstrations and generate collision-free trajectories.
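The final stage of the pipeline described above approximates the optimized trajectory with a DMP-based system to obtain a smooth, executable motion. As a rough illustration of that stage only, the sketch below fits a standard one-dimensional discrete DMP (in the style of Ijspeert et al.) to a demonstrated trajectory and integrates it forward; all function names, gains, and basis-function settings are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def learn_dmp(y_demo, dt, n_basis=20, alpha_z=25.0, alpha_x=4.0):
    """Fit a 1-DoF discrete DMP to a demonstrated trajectory y_demo."""
    beta_z = alpha_z / 4.0                      # critical damping
    tau = len(y_demo) * dt                      # movement duration
    yd = np.gradient(y_demo, dt)                # demo velocity
    ydd = np.gradient(yd, dt)                   # demo acceleration
    y0, g = y_demo[0], y_demo[-1]
    t = np.arange(len(y_demo)) * dt
    x = np.exp(-alpha_x * t / tau)              # canonical-system phase
    # forcing term the DMP must reproduce to match the demo
    f_target = tau**2 * ydd - alpha_z * (beta_z * (g - y_demo) - tau * yd)
    # Gaussian basis functions spaced along the phase variable
    c = np.exp(-alpha_x * np.linspace(0, 1, n_basis))
    h = n_basis / c**2
    psi = np.exp(-h * (x[:, None] - c)**2)
    # locally weighted regression for the basis weights
    s = x * (g - y0)
    w = np.array([(s * psi[:, i] @ f_target) / (s * psi[:, i] @ s + 1e-10)
                  for i in range(n_basis)])
    return dict(w=w, c=c, h=h, y0=y0, g=g, tau=tau, dt=dt,
                alpha_z=alpha_z, beta_z=beta_z, alpha_x=alpha_x)

def rollout(dmp, n_steps):
    """Euler-integrate the DMP forward; returns the reproduced trajectory."""
    y, v, x = dmp['y0'], 0.0, 1.0
    out = []
    for _ in range(n_steps):
        psi = np.exp(-dmp['h'] * (x - dmp['c'])**2)
        f = (psi @ dmp['w']) * x * (dmp['g'] - dmp['y0']) / (psi.sum() + 1e-10)
        vdot = (dmp['alpha_z'] * (dmp['beta_z'] * (dmp['g'] - y) - v) + f) / dmp['tau']
        ydot = v / dmp['tau']
        xdot = -dmp['alpha_x'] * x / dmp['tau']
        v += vdot * dmp['dt']
        y += ydot * dmp['dt']
        x += xdot * dmp['dt']
        out.append(y)
    return np.array(out)

# demonstration: a minimum-jerk-like reach from 0 to 1
dt = 0.01
t = np.linspace(0, 1, 101)
demo = 10 * t**3 - 15 * t**4 + 6 * t**5
traj = rollout(learn_dmp(demo, dt), len(demo))
```

Because the forcing term decays with the phase variable, the reproduced trajectory converges to the goal regardless of the learned weights, which is what makes the DMP a convenient final smoothing stage after trajectory optimization. Each of the three demonstrated degrees of freedom would get its own such transformation system sharing one canonical system.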


Acknowledgements

The authors would like to acknowledge the Croatian Scientific Foundation through the “Young researchers’ career development project – training of new doctoral students”, the Regional Centre of Excellence for Robotic Technologies – CRTA, and the project DATACROSS – Advanced Methods and Technologies in Data Science and Cooperative Systems.

Author information

Corresponding author

Correspondence to Bojan Šekoranja.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Vidaković, J., Jerbić, B., Šekoranja, B. et al. Learning from Demonstration Based on a Classification of Task Parameters and Trajectory Optimization. J Intell Robot Syst 99, 261–275 (2020). https://doi.org/10.1007/s10846-019-01101-2

Keywords

  • Learning from demonstration
  • Task parameterized movement
  • Trajectory optimization
  • Robot trajectory