Statistics and Computing, Volume 8, Issue 1, pp 55–88

Statistical mechanical analysis of the dynamics of learning in perceptrons

  • C. W. H. MACE
  • A. C. C. COOLEN
Article

Abstract

We describe the application of tools from statistical mechanics to analyse the dynamics of various classes of supervised learning rules in perceptrons. The character of this paper is mostly that of a cross between a biased non-encyclopaedic review and lecture notes: we try to present a coherent and self-contained picture of the basics of this field, to explain the ideas and tricks, to show how the predictions of the theory compare with (simulation) experiments, and to bring together scattered results. Technical details are given explicitly in an appendix. In order to avoid distraction we concentrate the references in a final section. In addition this paper contains some new results: (i) explicit solutions of the macroscopic equations that describe the error evolution for on-line and batch learning rules; (ii) an analysis of the dynamics of arbitrary macroscopic observables (for complete and incomplete training sets), leading to a general Fokker–Planck equation; and (iii) the macroscopic laws describing batch learning with complete training sets. We close the paper with a preliminary exposé of ongoing research on the dynamics of learning for the case where the training set is incomplete (i.e. where the number of examples scales linearly with the network size).

Learning dynamics, generalization, statistical mechanics

Copyright information

© Chapman and Hall 1998

Authors and Affiliations

  • C. W. H. MACE (1)
  • A. C. C. COOLEN (1)
  1. Department of Mathematics, King's College London, Strand, London, UK
