Socially guided intrinsic motivation for robot learning of motor skills

Nguyen, Sao Mai; Oudeyer, Pierre-Yves

doi:10.1007/s10514-013-9339-y

Socially guided intrinsic motivation for robot learning of motor skills

Published: 11 July 2013

Volume 36, pages 273–294, (2014)
Cite this article

Autonomous Robots Aims and scope Submit manuscript

Sao Mai Nguyen¹ &
Pierre-Yves Oudeyer¹

1184 Accesses
33 Citations
4 Altmetric
Explore all metrics

Abstract

This paper presents a technical approach to robot learning of motor skills which combines active intrinsically motivated learning with imitation learning. Our algorithmic architecture, called SGIM-D, allows efficient learning of high-dimensional continuous sensorimotor inverse models in robots, and in particular learns distributions of parameterised motor policies that solve a corresponding distribution of parameterised goals/tasks. This is made possible by the technical integration of imitation learning techniques within an algorithm for learning inverse models that relies on active goal babbling. After reviewing social learning and intrinsic motivation approaches to action learning, we describe the general framework of our algorithm, before detailing its architecture. In an experiment where a robot arm has to learn to use a flexible fishing line, we illustrate that SGIM-D efficiently combines the advantages of social learning and intrinsic motivation and benefits from human demonstration properties to learn how to produce varied outcomes in the environment, while developing more precise control policies in large spaces.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Learning from Humans

Autonomous exploration of motor skills by skill babbling

Article 28 December 2016

The iCub Platform: A Tool for Studying Intrinsically Motivated Learning

References

Abbeel, P. & Ng, A. Y. (2004). Apprenticeship learning via inverse reinforcement learning. In Proceedings of the 21st international conference on machine learning (ICML’04) (pp. 1–8).
Akgun, B., Cakmak, M., Yoo, J., & Thomaz, A. (2012). Trajectories and keyframes for kinesthetic teaching: A human–robot interaction perspective. In International conference on human–robot interaction.
Argall, B. D., Browning, B., & Veloso, M. (2008). Learning robot motion control with demonstration and advice-operators. In Proceedings IEEE/RSJ international conference on intelligent robots and systems IEEE (pp. 399–404).
Argall, B. D., Chernova, S., Veloso, M., & Browning, B. (2009). A survey of robot learning from demonstration. Robotics and Autonomous Systems, 57(5), 469–483. doi:10.1016/j.robot.2008.10.024.
Article Google Scholar
Argall, B. D., Browning, B., & Veloso, M. (2011). Teacher feedback to scaffold and refine demonstrated motion primitives on a mobile robot. Robotics and Autonomous Systems, 59(3–4), 243–255.
Article Google Scholar
Baldassarre, G. (2011). What are intrinsic motivations? A biological perspective. In 2011 IEEE international conference on development and learning (ICDL) (Vol. 2, pp. 1–8).
Baranes, A., & Oudeyer, P. Y. (2010). Intrinsically motivated goal exploration for active motor learning in robots. Paris: INRIA.
Baranes, A., & Oudeyer, P. Y. (2013). Active learning of inverse models with intrinsically motivated goal exploration in robots. Robotics and Autonomous Systems, 61(1), 49–73.
Article Google Scholar
Barto, A. G., Singh, S., & Chenatez, N. (2004a). Intrinsically motivated learning of hierarchical collections of skills. In Proceedings of 3rd international conference on development and learning, San Diego, CA (pp. 112–119).
Barto, A. G., Singh, S., & Chentanez, N. (2004b). Intrinsically motivated learning of hierarchical collections of skills. In ICDL international conference on developmental learning.
Billard, A., Calinon, S., Dillmann, R., & Schaal, S. (2007). Robot programming by demonstration. In B. Siciliano & O. Khatib (Eds.), Handbook of robotics (Chapt. 59). New York: Springer.
Bishop, C. (2007). Pattern recognition and machine learning. In Information science and statistics. Heidelberg: Springer.
Blumberg, B., Downie, M., Ivanov, Y., Berlin, M., Johnson, M. P., & Tomlinson, B. (2002). Integrated learning for interactive synthetic characters. ACM Transactions on Graphics 21:417–426. doi:10.1145/566654.566597.
Google Scholar
Breazeal, C., & Scassellati, B. (2002). Robots that imitate humans. Trends in Cognitive Sciences, 6(11), 481–487.
Article Google Scholar
Cakmak, M., & Thomaz, A. L. (2010). Optimality of human teachers for robot learners. In IEEE international conference on development and learning (ICDL) (Vol. 4).
Cakmak, M., DePalma, N., Thomaz, A. L., & Arriaga, R. (2009). Effects of social exploration mechanisms on robot learning. In The 18th IEEE international symposium on robot and human interactive communication (RO-MAN 2009) (pp. 128–134).
Cakmak, M., Chao, C., & Thomaz, A. L. (2010). Designing interactions for robot active learners. IEEE Transactions on Autonomous Mental Development, 2(2), 108–118.
Article Google Scholar
Calinon, S. (2009). Robot programming by demonstration: A probabilistic approach. Boca Raton: EPFL/CRC Press. EPFL Press ISBN 978-2-940222-31-5, CRC Press ISBN 978-1-4398-0867-2.
Calinon, S., & F G, Billard A,. (2007). On learning, representing and generalizing a task in a humanoid robot. IEEE Transactions on Systems, Man and Cybernetics, 37(2), 286–298.
Call, J., & Carpenter, M. (2002). Three sources of information in social learning. In K. Dautenhahn & C. L. Nehaniv (Eds.), Imitation in animals and artifacts (pp. 211–228). Cambridge, MA: MIT Press.
Cederborg, T., Li, M., Baranes, A., & Oudeyer, P. Y. (2010). Incremental local inline gaussian mixture regression for imitation learning of multiple tasks. In Proceedings of the IEEE/RSJ international conference on intelligent robots and systems (IROS), Taipei, Taiwan.
Chernova, S., & Veloso, M. (2009). Interactive policy learning through confidence-based autonomy. Journal of Artificial Intelligence Research, 34. doi:10.1613/jair.2584.
Clouse, J., & Utgoff, P. (1992). A teaching method for reinforcement learning. In Proceedings of the nineth international conference on machine learning.
Cohn, D. A., Ghahramani, Z., & Jordan, M. I. (1996). Active learning with statistical models. Journal of Artificial Intelligence Research, 4, 129–145.
MATH Google Scholar
Coleman, T., & Li, Y. (1994). On the convergence of reflective newton methods for large-scale nonlinear minimization subject to bounds. Mathematical Programming, 67(2), 189–224.
Article MATH MathSciNet Google Scholar
Coleman, T., & Li, Y. (1996). An interior, trust region approach for nonlinear minimization subject to bounds. SIAM Journal on Optimization, 6, 418–445.
Article MATH MathSciNet Google Scholar
Csibra, G. (2003). Teleological and referential understanding of action in infancy. Philosophical Transactions of the Royal Society of London Series B: Biological Sciences, 358(1431), 447.
Article Google Scholar
Csibra, G., & Gergely, G. (2007). Obsessed with goals: Functions and mechanisms of teleological interpretation of actions in humans. Acta Psychologica, 124(1), 60–78. doi:10.1016/j.actpsy.2006.09.007. Becoming an intentional agent: Early development of action interpretation and action control.
Google Scholar
da Silva, B., Konidaris, G., & Barto, A. (2012). Learning parameterized skills. In 29th international conference on machine learning (ICML 2012).
Dautenhahn, K., & Nehaniv, C. L. (2002). Imitation in animals and artifacts. Cambridge: MIT Press.
Google Scholar
d’Avella, A., Portone, A., Fernandez, L., & Lacquaniti, F. (2006). Control of fast-reaching movement by muscle synergies combinations. The Journal of Neuroscience, 26(30), 7791–7810.
Article Google Scholar
Deci, E., & Ryan, R. M. (1985). Intrinsic motivation and self-determination in human behavior. New York: Plenum Press.
Book Google Scholar
Fedorov, V. (1972). Theory of optimal experiment. New York, NY: Academic Press, Inc.
Google Scholar
Grollman, D. H., & Jenkins, O. C. (2008). Sparse incremental learning for interactive robot control policy estimation. In International conference on robotics and automation (ICRA 2008) (pp. 3315–3320).
Kaplan, F., Oudeyer, P. Y., Kubinyi, E., & Miklosi, A. (2002). Robotic clicker training. Robotics and Autonomous Systems, 38(3–4), 197–206.
Article Google Scholar
Kober, J., & Peters, J. (2011). Policy search for motor primitives in robotics. Machine Learning, 84(1), 171–203.
Article MATH MathSciNet Google Scholar
Kober, J., Wilhelm, A., Oztop, E., & Peters, J. (2012). Reinforcement learning to adjust parametrized motor primitives to new situations. Autonomous Robots, 1–19. doi:10.1007/s10514-012-9290-3.
Koenig, N., Takayama, L., & Matarić, M. (2010). Communication and knowledge sharing in human–robot interaction and learning from demonstration. Neural Networks, 23(8–9), 1104–1112. doi:10.1016/j.neunet.2010.06.005.
Article Google Scholar
Kormushev, P., Calinon, S., & Caldwell, D. G. (2010). Robot motor skill coordination with EM-based reinforcement learning. In Proceedings of IEEE/RSJ international conference on intelligent robots and systems (IROS), Taipei, Taiwan (pp 3232–3237).
Kormushev, P., Calinon, S., & Caldwell, D. G. (2011). Imitation learning of positional and force skills demonstrated via kinesthetic teaching and haptic input. Advanced Robotics, 25(5), 581–603.
Article Google Scholar
Krzanowski, W. J. (1988). Principles of multivariate analysis: A user’s perspective. New York: Oxford University Press.
MATH Google Scholar
Lagarias, J. C., Reeds, J. A., Wright, M. H., & Wright, P. E. (1998). Convergence properties of the nelder-mead simplex method in low dimensions. SIAM Journal of Optimization, 9(1), 112–147.
Article MATH MathSciNet Google Scholar
Lopes, M. (2012). Optimal teaching on sequential decision tasks (to appear).
Lopes, M., & Oudeyer, P. Y. (2010). Active learning and intrinsically motivated exploration in robots: Advances and challenges (guest editorial). IEEE Transactions on Autonomous Mental Development, 2(2), 65–69.
Article Google Scholar
Lopes, M., Melo, F., Montesano, L., & Santos-Victor, J. (2009a). Abstraction levels for robotic imitation: Overview and computational approaches. In From motor to interaction learning in robots. Berlin: Springer.
Lopes, M., Melo, F. S., Kenward, B., & Santos-Victor, J. (2009b). A computational model of social-learning mechanisms. Adaptive Behaviour, 17(6), 467–483.
Google Scholar
Lopes, M., Melo, F., Montesano, L., & Santos-Victor, J. (2010b). Abstraction levels for robotic imitation: Overview and computational approaches. In O. Sigaud & J. Peters (Eds.), From motor to interaction learning in robots, Studies in computational intelligence (Vol. 264, pp. 313–355). Berlin: Springer.
Lopes, M., Cederbourg, T., & Oudeyer, P. Y. (2011) Simultaneous acquisition of task and feedback models. In IEEE international conference on development and learning.
Mangin, O., & Oudeyer, P. Y. (2012) Learning the combinatorial structure of demonstrated behaviors with inverse feedback control. In A. A. Salah, J., Ruiz-del Solar, Ç. Meriçli, & P. Y. Oudeyer (Eds.), HBU 2012. LNCS (Vol. 7559, pp 135–148). Heidelberg: Springer.
Muja, M., & Lowe, D. (2009). Fast approximate nearest neighbors with automatic algorithm. In International conference on computer vision theory and applications (VISAPP’09).
Nehaniv, C. L., Dautenhahn, K., et al. (2004). Imitation and social learning in robots, humans, and animals: Behavioural, social and communicative dimensions. Cambridge: Cambridge University Press.
Nehaniv, C. L., & Dautenhahn, K. (2007). Imitation and social learning in robots, humans and animals: Behavioural, social and communicative dimensions. Cambridge: Cambridge University Press.
Book Google Scholar
Nguyen, S. M., & Oudeyer, P.-Y. (2012a). Interactive learning gives the tempo to an intrinsically motivated robot learner. In IEEE-RAS international conference on humanoid robots.
Nguyen, S. M., & Oudeyer, P.-Y. (2012b). Whom will an intrinsically motivated robot learner choose to imitate from? In J. Szufnarowska (Ed.), Proceedings of the post-graduate conference on robotics and development of cognition (pp. 32–35). doi:10.2390/biecoll-robotdoc2012-12.
Nguyen, S. M., & Oudeyer, P.-Y. (2012c). Active choice of teachers, learning strategies and goals for a socially guided intrinsic motivation learner. Paladyn Journal of Behavioural Robotics, 3(3), 136–146.
Google Scholar
Nicolescu, M., & Mataric, M. (2003). Natural methods for robot task learning: Instructive demonstrations, generalization and practice. In Proceedings of the second international joint conference on autonomous agents and multiagent systems, ACM (pp. 241–248).
Oudeyer, P. Y. (2011). Developmental constraints on the evolution and acquisition of sensorimotor skills. Habilitation a Diriger des Recherches.
Oudeyer, P. Y., & Kaplan, F. (2007). What is intrinsic motivation? a typology of computational approaches. Frontiers in Neurorobotics, 1, 6.
Oudeyer, P. Y., Kaplan, F., & Hafner, V. (2007). Intrinsic motivation systems for autonomous mental development. IEEE Transactions on Evolutionary Computation, 11(2), 265–286.
Article Google Scholar
Oudeyer, P. Y., Baranes, A., & Kaplan, F. (2013). Intrinsically motivated learning of real-word sensorimotor skills with developmental constraints. In G. Baldassarre & Miroli (Eds.), Intrinsically motivated learning in natural and artificial system. London: Springer.
Google Scholar
Peters, J., & Schaal, S. (2008). Reinforcement learning of motor skills with policy gradients. Neural Networks, 21(4), 682–697.
Article Google Scholar
Rolf, M., Steil, J., & Gienger, M. (2010). Gobal babbling permits direct learning of inverse kinematics. IEEE Transactions on Autonomous Mental Development, 2(3), 216–229.
Article Google Scholar
Roy, N., & McCallum, A. (2001). Towards optimal active learning through sampling estimation of error reduction. In Proceedings of the 18th international conference on machine learning, 1, 143–160.
Schaal, S., Ijspeert, A., & Billard, A. (2003). Computational approaches to motor learning by imitation. Philosophical Transactions of the Royal Society of London Series B, Biological sciences 358(1431), 537–547.
Google Scholar
Schmidhuber J (1991) Curious model-building control systems. In: Proceedings of the international joint conference on neural networks (Vol. 2, pp. 1458–1463).
Schmidhuber, J. (2010). Formal theory of creativity, fun, and intrinsic motivation (1990–2010). IEEE Transactions on Autonomous Mental Development, 2(3), 230–247.
Article Google Scholar
Slater, A., & Lewis, M. (2006). Introduction to infant development. Oxford: Oxford University Press.
Google Scholar
Smart, W., & Kaelbling, L. (2002). Effective reinforcement learning for mobile robots. In Proceedings of the IEEE international conference on robotics and automation (pp 3404–3410).
Stulp, F., & Schaal, S. (2011). Hierarchical reinforcement learning with movement primitives. In Humanoids (pp. 231–238).
Stulp, F., & Sigaud, O. (2012). Policy improvement methods: Between black-box optimization and episodic reinforcement learning.
Theodorou, E., Buchli, J., & Schaal, S. (2010). Reinforcement learning of motor skills in high dimensions: A path integral approach. In IEEE international conference on robotics and automation (ICRA) 2010 (pp. 2397–2403).
Thomaz, A. L. (2006). Socially guided machine learning. PhD thesis, MIT.
Thomaz, A. L., & Breazeal, C. (2008). Experiments in socially guided exploration: Lessons learned in building robots that learn with and without human teachers. Connection Science, Special Issue on Social Learning in Embodied Agents, 20(2, 3), 91–110.
Tomasello, M., & Carpenter, M. (2007). Shared intentionality. Developmental Science, 10(1), 121–125.
Google Scholar
Verma, D., & Rao, R. (2006). Goal-based imitation as probabilistic inference over graphical models. In Advances in NIPS (Vol. 18).
Weiss, E., & Flanders, M. (2004). Muscular and postural synergies of the human hand. Journal of Neurophysiology, 92, 523–535.
Article Google Scholar
Whiten, A. (2000). Primate culture and social learning. Cognitive Science, 24(3), 477–508.
Article Google Scholar
Xu, T., Yu, C., & Smith, L. (2011). It’s the child’s body: The role of toddler and parent in selecting toddler’s visual experience. IN Proceedings of IEEE 10th international conference in development and learning.

Download references

Acknowledgments

The authors would like to thank Paul Fudal, Jerome Bechu and Haylee Fogg for their support for the experimental setup, and Freek Stulp for his very helpful comments. This research was partially funded by ERC Grant EXPLORERS 240007 and ANR MACSi.

Author information

Authors and Affiliations

Flowers Team, INRIA and ENSTA ParisTech, Paris, France
Sao Mai Nguyen & Pierre-Yves Oudeyer

Authors

Sao Mai Nguyen
View author publications
You can also search for this author in PubMed Google Scholar
Pierre-Yves Oudeyer
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Sao Mai Nguyen.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Nguyen, S.M., Oudeyer, PY. Socially guided intrinsic motivation for robot learning of motor skills. Auton Robot 36, 273–294 (2014). https://doi.org/10.1007/s10514-013-9339-y

Download citation

Received: 29 March 2012
Accepted: 01 June 2013
Published: 11 July 2013
Issue Date: March 2014
DOI: https://doi.org/10.1007/s10514-013-9339-y

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Socially guided intrinsic motivation for robot learning of motor skills

Abstract

Access this article

Similar content being viewed by others

Learning from Humans

Autonomous exploration of motor skills by skill babbling

The iCub Platform: A Tool for Studying Intrinsically Motivated Learning

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Keywords

Navigation

Socially guided intrinsic motivation for robot learning of motor skills

Abstract

Access this article

Similar content being viewed by others

Learning from Humans

Autonomous exploration of motor skills by skill babbling

The iCub Platform: A Tool for Studying Intrinsically Motivated Learning

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation