International Journal of Social Robotics, Volume 4, Issue 4, pp 343–355

Keyframe-based Learning from Demonstration

Method and Evaluation
  • Baris Akgun
  • Maya Cakmak
  • Karl Jiang
  • Andrea L. Thomaz


We present a framework for learning skills from novel types of demonstrations that have been shown to be desirable from a Human–Robot Interaction perspective. Our approach, Keyframe-based Learning from Demonstration (KLfD), takes demonstrations that consist of keyframes: a sparse set of points in the state space that produces the intended skill when visited in sequence. KLfD also handles conventional trajectory demonstrations, or a hybrid of the two, by converting them to keyframes. Our method produces a skill model that consists of an ordered set of keyframe clusters, which we call Sequential Pose Distributions (SPD). The skill is reproduced by splining between clusters. We present results from two domains: mouse gestures in 2D, and scooping, pouring, and placing skills on a humanoid robot. KLfD performs similarly to existing LfD techniques when applied to conventional trajectory demonstrations. Additionally, we demonstrate that KLfD may be preferable when the demonstration type is suited to the skill.
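The pipeline described in the abstract (keyframe demonstrations, an ordered set of keyframe clusters, reproduction by splining between clusters) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It rests on assumptions that do not come from the abstract: every demonstration is assumed to have the same number of keyframes, so grouping the k-th keyframes stands in for the clustering step; each cluster is summarized by a per-dimension mean and variance; and a uniform Catmull-Rom spline is swapped in for whatever spline the paper uses. The names `build_spd`, `catmull_rom`, and `reproduce` are hypothetical.

```python
from statistics import mean, pvariance


def build_spd(demos):
    """Group the k-th keyframe of every demonstration into cluster k.

    Assumption (not from the paper): all demonstrations have the same
    number of keyframes, so index-based grouping stands in for clustering.
    Returns an ordered list of (mean, variance) tuples, one per cluster.
    """
    n_keyframes = len(demos[0])
    if any(len(demo) != n_keyframes for demo in demos):
        raise ValueError("this sketch needs equal-length demonstrations")
    spd = []
    for k in range(n_keyframes):
        cluster = [demo[k] for demo in demos]
        dims = list(zip(*cluster))  # one tuple of values per dimension
        mu = tuple(mean(d) for d in dims)
        var = tuple(pvariance(d) for d in dims)
        spd.append((mu, var))
    return spd


def catmull_rom(p0, p1, p2, p3, t):
    """Uniform Catmull-Rom interpolation between p1 and p2, t in [0, 1]."""
    return tuple(
        0.5 * (2 * b + (c - a) * t
               + (2 * a - 5 * b + 4 * c - d) * t ** 2
               + (-a + 3 * b - 3 * c + d) * t ** 3)
        for a, b, c, d in zip(p0, p1, p2, p3)
    )


def reproduce(spd, samples_per_segment=10):
    """Spline through the cluster means, in order, to get a dense path."""
    means = [mu for mu, _ in spd]
    padded = [means[0]] + means + [means[-1]]  # repeat the endpoints
    path = []
    for i in range(len(means) - 1):
        for s in range(samples_per_segment):
            t = s / samples_per_segment
            path.append(catmull_rom(*padded[i:i + 4], t))
    path.append(means[-1])
    return path


# Two hypothetical 2D "mouse gesture" demonstrations, three keyframes each.
demos = [
    [(0.0, 0.0), (1.0, 2.0), (2.0, 0.0)],
    [(0.2, 0.0), (1.0, 2.2), (2.2, 0.0)],
]
spd = build_spd(demos)
path = reproduce(spd, samples_per_segment=5)
```

The sketch computes the per-cluster variances only to show that each cluster is a distribution rather than a single point; the reproduction step here uses the means alone.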


Learning from Demonstration · Kinesthetic teaching · Human–Robot Interaction · Humanoid robotics



This research is supported by NSF CAREER grant IIS-1032254.



Copyright information

© Springer Science & Business Media BV 2012

Authors and Affiliations

  • Baris Akgun (1)
  • Maya Cakmak (1)
  • Karl Jiang (1)
  • Andrea L. Thomaz (1)
  1. School of Interactive Computing, Georgia Institute of Technology, Atlanta, USA
