Optimal weight decay in a perceptron

  • Siegfried Bös
Poster Presentations 1 Theory III: Generalization
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1112)


Weight decay was proposed to reduce the overfitting that often appears in the learning tasks of artificial neural networks. In this paper, weight decay is applied to a well-defined model system based on a single-layer perceptron that exhibits strong overfitting. Since the optimal non-overfitting solution is known for this system, the effect of weight decay can be compared against it. A strategy for finding the optimal weight-decay strength is proposed, which leads to the optimal solution for any number of examples.
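The mechanism the abstract describes can be illustrated with a minimal sketch (this is an illustration, not the paper's exact model or its optimal-strength strategy): a student perceptron is fit to a noisy linear teacher, and an L2 weight-decay penalty of strength `lam` (ridge regression) is swept to show how a well-chosen decay improves generalization over the near-unregularized fit. All names and parameter values below are assumptions chosen for the demonstration.

```python
import numpy as np

# Hypothetical setup: a linear teacher generates noisy targets; the student
# minimises ||Xw - y||^2 + lam * ||w||^2, i.e. squared error plus weight decay.
rng = np.random.default_rng(0)
N = 50       # input dimension
P = 60       # training examples; P/N slightly above 1, a strongly overfitting regime
noise = 0.5  # output noise level

teacher = rng.standard_normal(N) / np.sqrt(N)
X = rng.standard_normal((P, N))
y = X @ teacher + noise * rng.standard_normal(P)

def train_with_decay(lam):
    """Ridge solution of the penalised least-squares problem."""
    return np.linalg.solve(X.T @ X + lam * np.eye(N), X.T @ y)

def generalization_error(w):
    """Expected squared error on fresh Gaussian inputs: ||w - teacher||^2 + noise^2."""
    return float(np.sum((w - teacher) ** 2) + noise ** 2)

# Sweep the decay strength; an intermediate lam should beat the near-zero one.
errors = {lam: generalization_error(train_with_decay(lam))
          for lam in [1e-6, 0.1, 1.0, 10.0, 100.0]}
best = min(errors, key=errors.get)
```

In this toy version the "optimal weight decay strength" is simply the `lam` minimizing the dictionary `errors`; the paper instead derives the optimal strength analytically for its perceptron model, valid for any number of examples.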





Copyright information

© Springer-Verlag Berlin Heidelberg 1996

Authors and Affiliations

  • Siegfried Bös, Lab for Information Representation, RIKEN, Saitama, Japan
