A formalism for learning from demonstration

Billing, Erik A.; Hellström, Thomas

doi:10.2478/s13230-010-0001-5

Research Article
Published: 31 March 2010

Volume 1, pages 1–13, (2010)
Paladyn

Erik A. Billing¹ &
Thomas Hellström¹

Abstract

The paper describes and formalizes the concepts and assumptions involved in Learning from Demonstration (LFD), a common learning technique used in robotics. LFD-related concepts like goal, generalization, and repetition are here defined, analyzed, and put into context. Robot behaviors are described in terms of trajectories through information spaces and learning is formulated as mappings between some of these spaces. Finally, behavior primitives are introduced as one example of good bias in learning, dividing the learning process into the three stages of behavior segmentation, behavior recognition, and behavior coordination. The formalism is exemplified through a sequence learning task where a robot equipped with a gripper arm is to move objects to specific areas. The introduced concepts are illustrated with special focus on how bias of various kinds can be used to enable learning from a single demonstration, and how ambiguities in demonstrations can be identified and handled.
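
As a rough illustration of the three stages named in the abstract, the sketch below runs a toy demonstration through behavior recognition, behavior segmentation, and behavior coordination. It is not the authors' formalism or implementation: the event encoding, the primitive names (`approach`, `grasp`, `release`), the sensor and action labels, and the first-match recognition rule are all assumptions made for this example.

```python
from itertools import groupby
from typing import Callable, Dict, List, Tuple

# Hypothetical encoding: a demonstration is a sequence of (sensor, action) events.
Event = Tuple[str, str]

# Hypothetical behavior primitives: each maps a sensor reading to an action.
# These act as the bias that makes learning from one demonstration feasible.
PRIMITIVES: Dict[str, Callable[[str], str]] = {
    "approach": lambda s: "forward",
    "grasp": lambda s: "close_gripper" if s == "object_near" else "wait",
    "release": lambda s: "open_gripper" if s == "at_target" else "wait",
}


def recognize(event: Event) -> str:
    """Behavior recognition: first primitive whose output matches the event."""
    sensor, action = event
    for name, controller in PRIMITIVES.items():
        if controller(sensor) == action:
            return name
    return "unknown"


def segment(demo: List[Event]) -> List[Tuple[str, int]]:
    """Behavior segmentation: merge consecutive events with the same label."""
    labels = [recognize(e) for e in demo]
    return [(label, len(list(run))) for label, run in groupby(labels)]


def coordinate(segments: List[Tuple[str, int]]) -> List[str]:
    """Behavior coordination: here, simply the ordered sequence of primitives."""
    return [label for label, _ in segments]


# A single toy demonstration of moving an object to a target area.
demo: List[Event] = [
    ("object_far", "forward"),
    ("object_far", "forward"),
    ("object_near", "close_gripper"),
    ("holding", "forward"),
    ("at_target", "open_gripper"),
]

segments = segment(demo)
plan = coordinate(segments)
print(segments)
print(plan)
```

Note that an event such as `("holding", "wait")` would match both `grasp` and `release` in this toy setup; the first-match rule hides that ambiguity, whereas the abstract indicates that the paper treats identifying and handling such ambiguities explicitly.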

Author information

Authors and Affiliations

Department of Computing Science, Umeå University, Umeå, Sweden
Erik A. Billing & Thomas Hellström

Corresponding author

Correspondence to Erik A. Billing.

Additional information

Parts of this text also appear as a technical report: E. A. Billing and T. Hellström. Formalising Learning from Demonstration, UMINF 08.10, Department of Computing Science, Umeå University, Sweden, 2008

About this article

Cite this article

Billing, E.A., Hellström, T. A formalism for learning from demonstration. Paladyn 1, 1–13 (2010). https://doi.org/10.2478/s13230-010-0001-5

Received: 12 October 2009
Accepted: 26 February 2010
Published: 31 March 2010
Issue Date: March 2010
DOI: https://doi.org/10.2478/s13230-010-0001-5
