Theory in Biosciences, Volume 131, Issue 3, pp 161–179

Information-driven self-organization: the dynamical system approach to autonomous robot behavior

  • Nihat Ay
  • Holger Bernigau
  • Ralf Der
  • Mikhail Prokopenko
Original Paper

Abstract

In recent years, information theory has come into the focus of researchers interested in the sensorimotor dynamics of both robots and living beings. One root of these approaches is the idea that living beings are information processing systems and that optimizing these processes should confer an evolutionary advantage. Apart from these more fundamental questions, there has recently been much interest in the question of how a robot can be equipped with an internal drive for innovation or curiosity that may serve as a drive for an open-ended, self-determined development of the robot. The success of these approaches depends essentially on the choice of a convenient measure of information. This article studies in some detail the use of the predictive information (PI), also called excess entropy or effective measure complexity, of the sensorimotor process. The PI of a process quantifies the total information of past experience that can be used for predicting future events. However, the application of information-theoretic measures in robotics is mostly restricted to the case of a finite, discrete state-action space. This article aims at applying the PI in the dynamical systems approach to robot control. We study linear systems as a first step and derive exact results for the PI together with explicit learning rules for the parameters of the controller. Interestingly, these learning rules are of Hebbian nature and local in the sense that the synaptic update is given by the product of activities available directly at the pertinent synaptic ports. The general findings are exemplified by a number of case studies. In particular, in a two-dimensional system designed to mimic embodied systems with latent oscillatory locomotion patterns, it is shown that maximizing the PI amounts to recognizing and amplifying the latent modes of the robotic system. This and many other examples show that the learning rules derived from the maximum PI principle are a versatile tool for the self-organization of behavior in complex robotic systems.
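As a rough illustration of what an exact PI result can look like in the linear case, the sketch below uses a toy scalar system that is an assumption of this example and is not taken from the article: a stationary Gaussian process x[t+1] = a·x[t] + noise with |a| < 1. For a Markov process the PI reduces to the mutual information between consecutive states, and for jointly Gaussian variables with correlation ρ that mutual information is −½ ln(1 − ρ²); here ρ = a. The function names are hypothetical.

```python
import math
import random


def gaussian_ar1_predictive_information(a: float) -> float:
    """PI in nats of the stationary process x[t+1] = a * x[t] + noise.

    Assumption of this sketch: scalar, Gaussian, Markov, |a| < 1.
    The lag-1 correlation of such a process equals a, so the mutual
    information between consecutive states is -0.5 * ln(1 - a^2).
    """
    if not -1.0 < a < 1.0:
        raise ValueError("stationarity requires |a| < 1")
    return -0.5 * math.log(1.0 - a * a)


def estimate_lag1_correlation(a: float, steps: int = 100_000, seed: int = 0) -> float:
    """Simulate the toy process and estimate corr(x[t], x[t+1])."""
    rng = random.Random(seed)
    x = 0.0
    # Burn-in so that the state is approximately drawn from the stationary law.
    for _ in range(1_000):
        x = a * x + rng.gauss(0.0, 1.0)
    sxx = sxy = syy = 0.0
    for _ in range(steps):
        y = a * x + rng.gauss(0.0, 1.0)
        # The process has zero mean, so uncentered moments suffice.
        sxx += x * x
        sxy += x * y
        syy += y * y
        x = y
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    for a in (0.1, 0.5, 0.9):
        rho = estimate_lag1_correlation(a)
        pi_exact = gaussian_ar1_predictive_information(a)
        pi_from_data = -0.5 * math.log(1.0 - rho * rho)
        print(f"a={a}: exact PI={pi_exact:.4f} nats, from simulation={pi_from_data:.4f} nats")
```

In this toy setting the PI grows without bound as |a| approaches 1, which gives some intuition for why maximizing it can amplify slowly decaying modes of a system.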

Keywords

  • Autonomous systems
  • Predictive information
  • Self-organization
  • Sensorimotor loop
  • Embodiment
  • Hebbian learning
  • Intrinsic motivation

Copyright information

© Springer-Verlag 2011

Authors and Affiliations

  • Nihat Ay (1, 3)
  • Holger Bernigau (1)
  • Ralf Der (1)
  • Mikhail Prokopenko (1, 2)

  1. Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany
  2. CSIRO, Sydney, Australia
  3. Santa Fe Institute, Santa Fe, USA
