Supervised and Unsupervised Learning in Linear Networks

  • Pierre Baldi
  • Yves Chauvin
  • Kurt Hornik


We give an overview of the main facts on supervised and unsupervised learning in networks of linear units and present several new results and open questions. In the case of back-propagation, the complete structure of the landscape of the error function and its connections to known statistical techniques such as linear regression, principal component analysis and discriminant analysis have been established. Here, we examine the dynamical aspects of the learning process, how in certain cases the spectral properties of a covariance matrix are learnt according to the order defined by the eigenvalues, and the effects of noise. In the low noise limit, we prove that the strategy adopted by the networks is unchanged, whereas in the high noise limit, the solution adopted is one of complete redundancy. In the case of unsupervised learning, several algorithms based on various Hebbian and anti-Hebbian mechanisms are reviewed together with the structure of their fixed points. We show that three “symmetric” algorithms suggested in the literature (Oja, 1982; Williams, 1985; Baldi, 1988) are in fact equivalent. Results of simulations are presented.
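As a point of orientation for the Hebbian algorithms mentioned above, the following is a minimal sketch of the single-unit rule from Oja (1982), in which a linear unit with output y = w·x updates its weights by Δw = η y (x − y w) and its weight vector tends toward the leading eigenvector of the input covariance matrix. This is an illustration only, not the authors' simulation code: the function names, the synthetic two-dimensional data, and the values of `eta`, `epochs` and `seed` are all assumptions chosen for the example.

```python
import math
import random


def oja_principal_direction(samples, eta=0.005, epochs=50, seed=0):
    """Estimate the leading principal direction with Oja's (1982) rule.

    samples: list of (approximately) zero-mean input vectors.
    eta, epochs, seed: illustrative hyperparameters (assumptions).
    """
    rng = random.Random(seed)
    n = len(samples[0])
    # Random unit-norm initial weight vector.
    w = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(wi * wi for wi in w))
    w = [wi / norm for wi in w]
    for _ in range(epochs):
        for x in samples:
            # Output of the linear unit.
            y = sum(wi * xi for wi, xi in zip(w, x))
            # Hebbian term eta*y*x plus the decay term -eta*y^2*w.
            w = [wi + eta * y * (xi - y * wi) for wi, xi in zip(w, x)]
    return w


def make_samples(count=500, seed=1):
    """Synthetic 2-D data (hypothetical): standard deviation 3.0 along
    (1, 1)/sqrt(2) and 0.5 along (1, -1)/sqrt(2)."""
    rng = random.Random(seed)
    s = 1.0 / math.sqrt(2.0)
    samples = []
    for _ in range(count):
        a = rng.gauss(0.0, 3.0)  # coordinate along the dominant direction
        b = rng.gauss(0.0, 0.5)  # coordinate along the minor direction
        samples.append([s * (a + b), s * (a - b)])
    return samples


if __name__ == "__main__":
    w = oja_principal_direction(make_samples())
    print("learned weight vector:", w)
```

With this data the learned vector should align, up to sign, with the direction (1, 1)/√2 and have norm close to one; the step size is kept small so that the decay term keeps the weights bounded.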




  1. Baldi, P. (1988). Linear learning: Landscapes and algorithms. In D. S. Touretzky (Ed.), Advances in neural information processing systems 1. Morgan Kaufmann, Palo Alto, CA.
  2. Baldi, P. and Hornik, K. (1989). Neural networks and principal component analysis: Learning from examples without local minima. Neural Networks, 2, 1, 53–58.
  3. Baldi, P. and Hornik, K. (1990). Back-propagation and unsupervised learning in linear networks. In Y. Chauvin and D. E. Rumelhart (Eds.), Back-propagation: Theory, architectures and applications. Lawrence Erlbaum Associates. To appear.
  4. Bourlard, H. and Kamp, Y. (1988). Auto-association by the multilayer perceptrons and singular value decomposition. Biological Cybernetics, 51, 291–294.
  5. Chauvin, Y. (1989). Principal component analysis by gradient descent on a constrained linear Hebbian cell. Proceedings of the 1989 IJCNN Conference, 1, 373–380, Washington, D.C.
  6. Gallinari, P., Thiria, S. and Fogelman Soulié, F. (1988). Multilayer perceptrons and data analysis. Proceedings of the 1988 IJCNN Conference, 391–399, San Diego, CA.
  7. Linsker, R. (1988). Self-organization in a perceptual network. Computer, March, 105–117.
  8. Oja, E. (1982). A simplified neuron model as a principal component analyser. Journal of Mathematical Biology, 15, 267–273.
  9. Williams, R. J. (1985). Feature discovery through error-correction learning. Technical Report 8501. Institute for Cognitive Science, UCSD, La Jolla, CA.

Copyright information

© Springer Science+Business Media Dordrecht 1990

Authors and Affiliations

  • Pierre Baldi (1, 2)
  • Yves Chauvin (3, 4)
  • Kurt Hornik (5)

  1. Jet Propulsion Laboratory, 303-310, California Institute of Technology, Pasadena, USA
  2. Division of Biology, California Institute of Technology, USA
  3. Thomson-CSF, Inc., Palo Alto, USA
  4. Psychology Department, Stanford University, USA
  5. Institut für Statistik und Wahrscheinlichkeitstheorie, Technische Universität Wien, Wien, Austria