Programming by Demonstration: A Taxonomy of Current Relevant Methods to Teach and Describe New Skills to Robots

  • Jordi Bautista-Ballester
  • Jaume Vergés-Llahí
  • Domènec Puig
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 252)

Abstract

Programming by Demonstration (PbD) covers methods by which a robot learns new skills through human guidance and imitation. PbD has been a key topic in robotics over the last decade, encompassing the development of robust algorithms for motor control, motor learning, gesture recognition and visual-motor integration. Nowadays, PbD relies more on learning methods than on traditional programming approaches, and it is frequently referred to as Imitation Learning or Behavioral Cloning. This work reviews and analyses existing works in order to build a taxonomy of the elements that constitute the most relevant approaches in the field to date. We intend to establish the categories and types of algorithms involved so far in PbD, and to describe their advantages, disadvantages and potential developments.
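
As a concrete illustration of the behavioral cloning view of PbD described in the abstract, the sketch below fits a linear policy to recorded state-action pairs and queries it on an unseen state. It is a minimal example, not code from the paper: the synthetic data, the linear policy class and all names are illustrative assumptions; real demonstrations would come from teleoperation, kinesthetic teaching or motion capture.

```python
import numpy as np

# Synthetic demonstration data: each row pairs a robot state (e.g. joint
# angles) with the action a human teacher chose in that state. Both are
# fabricated here purely for illustration.
rng = np.random.default_rng(0)
states = rng.uniform(-1.0, 1.0, size=(200, 4))
teacher = np.array([[0.5], [-1.0], [0.3], [0.8]])  # hypothetical teacher policy
actions = states @ teacher + 0.01 * rng.normal(size=(200, 1))

# Behavioral cloning in its simplest form: supervised regression that
# learns a policy pi(state) = state @ W reproducing the demonstrated actions.
W, *_ = np.linalg.lstsq(states, actions, rcond=None)

# The learned policy can then be queried on states never demonstrated.
new_state = np.array([[0.2, -0.4, 0.9, 0.1]])
print("predicted action:", (new_state @ W).ravel())
```

In practice the linear regressor would be replaced by whatever function approximator the chosen PbD method prescribes (e.g. a Gaussian mixture or a neural network), but the supervised structure of the problem is the same.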

Keywords

Mobile Robotics · Programming by Demonstration (PbD) · Imitation Learning · Learning from Demonstration (LfD) · Taxonomy

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Jordi Bautista-Ballester (1, 2)
  • Jaume Vergés-Llahí (1)
  • Domènec Puig (2)
  1. ATEKNEA Solutions, Cornellà de Llobregat, Spain
  2. Department of Computer Engineering and Mathematics, Universitat Rovira i Virgili, Tarragona, Spain