Biological Cybernetics, Volume 97, Issue 1, pp 99–112

Learning with incomplete information and the mathematical structure behind it

  • Reimer Kühn
  • Ion-Olimpiu Stamatescu


We investigate the problem of learning with incomplete information, as exemplified by learning with delayed reinforcement. We study a two-phase learning scenario in which a phase of Hebbian associative learning based on momentary internal representations is supplemented by an ‘unlearning’ phase depending on a graded reinforcement signal. The reinforcement signal quantifies the success rate globally over a number of learning steps in phase one, and ‘unlearning’ is indiscriminate with respect to the associations learnt in that phase. Learning according to this model is studied via simulations and analytically within a student–teacher scenario, both for single-layer networks and for a committee machine. Success and speed of learning depend on the ratio λ of the learning rates used for the associative Hebbian learning phase and for the unlearning correction in response to the reinforcement signal. Asymptotically perfect generalization is possible only if this ratio exceeds a critical value λ_c, in which case the generalization error exhibits a power-law decay with the number of examples seen by the student, with an exponent that depends in a non-universal manner on the parameter λ. We find these features to be robust against a wide spectrum of modifications of microscopic modelling details. Two illustrative applications are also provided: a robot learning to navigate a field containing obstacles, and the identification of a specific component in a collection of stimuli.
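The two-phase scheme described above can be sketched in code. The following is a minimal, hypothetical simulation of the single-layer student–teacher setting: the precise form of the Hebbian step, the use of the success rate over P steps as the graded reinforcement signal, and the value λ = 2 are illustrative assumptions, not the paper's exact equations.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 50        # input dimension
P = 20        # associative learning steps per reinforcement period (assumed)
BATCHES = 1500
eta = 0.05    # Hebbian learning rate for phase one (assumed)
lam = 2.0     # ratio of unlearning to learning rate; hypothetical value

teacher = rng.standard_normal(N)   # rule to be learnt: sign(teacher . xi)
student = rng.standard_normal(N)

def generalization_error(J, B):
    """Perceptron generalization error: angle between J and B divided by pi."""
    c = J @ B / (np.linalg.norm(J) * np.linalg.norm(B))
    return float(np.arccos(np.clip(c, -1.0, 1.0)) / np.pi)

errs = []
for _ in range(BATCHES):
    batch_increment = np.zeros(N)
    successes = 0
    # phase one: Hebbian association of each input with the student's
    # own momentary output (its internal representation of the answer)
    for _ in range(P):
        xi = rng.choice([-1.0, 1.0], size=N)
        sigma = 1.0 if student @ xi >= 0 else -1.0
        step = (eta / N) * sigma * xi
        student += step
        batch_increment += step
        successes += int(sigma == (1.0 if teacher @ xi >= 0 else -1.0))
    # graded reinforcement signal: global success rate over the whole period
    r = successes / P
    # phase two: indiscriminate unlearning of everything associated in phase
    # one, scaled by the failure rate and by the rate ratio lambda
    student -= lam * (1.0 - r) * batch_increment
    errs.append(generalization_error(student, teacher))
```

Note that the unlearning step never singles out individual associations; it removes a uniform fraction of the whole batch increment, so only the delayed, global reinforcement signal steers the student. For λ above the critical value λ_c the paper reports a power-law decay of the generalization error; whether this sketch reproduces that decay depends on the assumed parameter values.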


Keywords: Learning behaviour · Learning dynamics · Committee machine · Single-layer network





Copyright information

© Springer-Verlag 2007

Authors and Affiliations

  1. Department of Mathematics, King’s College London, London, UK
  2. FESt, Heidelberg, and Institut für Theoretische Physik, Universität Heidelberg, Heidelberg, Germany
