Intrinsically Motivated Exploration for Developmental and Active Sensorimotor Learning

  • Pierre-Yves Oudeyer
  • Adrien Baranes
  • Frédéric Kaplan

Abstract

Intrinsic motivation is a central mechanism that guides spontaneous exploration and learning in humans. It fosters incremental and progressive sensorimotor and cognitive development by pushing exploration toward activities of intermediate complexity given the current state of capabilities. This chapter presents and studies two computational intrinsic motivation systems, IAC and R-IAC, that share similarities with human intrinsic motivation systems and aim at self-organizing and efficiently guiding exploration for sensorimotor learning in robots. IAC was initially introduced to model the qualitative formation of developmental motor stages of increasing complexity, as shown in the Playground Experiment, which we outline. We argue that IAC and other intrinsically motivated learning heuristics can also be viewed as active learning algorithms that are particularly suited to learning forward models in unprepared sensorimotor spaces with large unlearnable subspaces. We then introduce a novel formulation of IAC, called R-IAC, and show that its performance as an intrinsically motivated active learning algorithm is far superior to that of IAC in a complex sensorimotor space where only a small subspace is “interesting”, i.e. neither unlearnable nor trivial. We also show results in which the learnt forward model is reused in a control scheme. Finally, accompanying open-source software containing these algorithms, as well as tools to reproduce in simulation all the experiments presented in this chapter, is made publicly available.
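
The idea of steering exploration toward what is neither trivial nor unlearnable can be illustrated with a minimal sketch. This is not the chapter's IAC or R-IAC implementation. It assumes a fixed set of three hypothetical regions, a simulated learner whose prediction error simply shrinks by a constant factor with practice, and a learning-progress measure defined as the drop in mean error between two consecutive windows. Every name and constant below is an illustrative assumption.

```python
from collections import deque

WINDOW = 10  # assumed number of errors averaged in each window


class Region:
    """Hypothetical region of a sensorimotor space with a simulated learner."""

    def __init__(self, name, initial_error, decay):
        self.name = name
        self.error = initial_error  # current prediction error in this region
        self.decay = decay          # fraction of the error left after one practice step
        self.errors = deque(maxlen=2 * WINDOW)  # most recent observed errors

    def practice(self):
        """Run one simulated experiment and record the prediction error."""
        self.errors.append(self.error)
        self.error *= self.decay

    def learning_progress(self):
        """Decrease in mean error from the older window to the newer one."""
        if len(self.errors) < 2 * WINDOW:
            return float("inf")  # too little data: force initial sampling
        history = list(self.errors)
        older = sum(history[:WINDOW]) / WINDOW
        newer = sum(history[WINDOW:]) / WINDOW
        return older - newer


def explore(regions, steps):
    """Greedily practice in the region with the highest learning progress."""
    counts = {region.name: 0 for region in regions}
    for _ in range(steps):
        region = max(regions, key=lambda r: r.learning_progress())
        region.practice()
        counts[region.name] += 1
    return counts


def demo():
    regions = [
        Region("trivial", initial_error=0.0, decay=1.0),      # already predicted perfectly
        Region("unlearnable", initial_error=1.0, decay=1.0),  # error never decreases
        Region("learnable", initial_error=1.0, decay=0.97),   # error shrinks with practice
    ]
    return explore(regions, steps=200)


counts = demo()
print(counts)
```

In this toy setting each region is first sampled for 2 × WINDOW steps, after which only the learnable region shows a positive error decrease and receives all remaining experiments. A real unlearnable region would typically yield noisy rather than constant errors, so its estimated progress would fluctuate around zero instead of being exactly zero; the sketch ignores that difficulty.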

Index Terms

active learning; intrinsically motivated learning; exploration; developmental robotics; artificial curiosity; sensorimotor learning

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Pierre-Yves Oudeyer (1)
  • Adrien Baranes (1)
  • Frédéric Kaplan (2)

  1. INRIA, France
  2. CRAFT-EPFL, Switzerland
