
Part 2: Multilayer perceptron and natural gradient learning

Tutorial on Brain-Inspired Computing

Abstract

Since the perceptron was first developed for learning to classify input patterns, simple and multilayer perceptrons have been studied extensively. Despite wide and active study in both theory and applications, multilayer perceptrons still have unsettled problems, such as slow learning and overfitting. To solve these problems thoroughly, it is necessary to consolidate previous studies and to find new directions that raise the practical power of multilayer perceptrons. As a first step toward this new stage of study, we give short reviews of two important approaches: the stochastic approach and the geometric approach. We also explain an efficient learning algorithm developed from these statistical and geometrical studies, now widely known as the natural gradient learning method.
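The natural gradient method preconditions the ordinary gradient with the inverse Fisher information matrix, updating parameters as θ ← θ − η F(θ)⁻¹ ∇L(θ) rather than θ ← θ − η ∇L(θ). The following is a minimal Python sketch of the idea on a toy quadratic loss; the loss, the choice of Fisher matrix (set equal to the curvature matrix here), and the step sizes are illustrative assumptions, not material from the article.

```python
# Minimal sketch: ordinary vs. natural gradient descent on a toy
# ill-conditioned quadratic loss. All quantities below are assumptions
# chosen for illustration only.
import numpy as np

def loss_grad(theta, A, b):
    """Gradient of L(theta) = 0.5 * (theta - b)^T A (theta - b)."""
    return A @ (theta - b)

A = np.diag([100.0, 1.0])   # ill-conditioned curvature: one steep, one flat direction
b = np.array([1.0, 1.0])    # minimizer of the loss
fisher = A                  # for this toy quadratic the Fisher matrix equals A

theta_gd = np.zeros(2)      # ordinary gradient descent iterate
theta_ngd = np.zeros(2)     # natural gradient descent iterate
eta_gd, eta_ngd = 0.009, 0.9  # step sizes; eta_gd must stay below 2/100 for stability

for _ in range(50):
    # Ordinary gradient step: theta <- theta - eta * grad L
    theta_gd -= eta_gd * loss_grad(theta_gd, A, b)
    # Natural gradient step: theta <- theta - eta * F^{-1} grad L
    theta_ngd -= eta_ngd * np.linalg.solve(fisher, loss_grad(theta_ngd, A, b))

print("ordinary GD :", theta_gd)   # still far from b along the flat direction
print("natural grad:", theta_ngd)  # essentially at b after a few steps
```

On this toy problem the ordinary gradient crawls along the flat direction while the natural gradient contracts the error at the same rate in every direction, which is the intuition behind the speedup the natural gradient method brings to multilayer perceptron training.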



Author information

Hyeyoung Park, Ph.D.: She is an Assistant Professor of Computer Science at the School of Electrical Engineering and Computer Science, Kyungpook National University, Korea. She received her B.S., M.A., and Ph.D. from Yonsei University, Korea, in 1994, 1996, and 2000, respectively. She also worked as a research scientist at the Brain Science Institute, RIKEN, from 2000 to 2004. Her research interests are in learning theory and pattern recognition, as well as statistical data analysis.

About this article

Cite this article

Park, H. Part 2: Multilayer perceptron and natural gradient learning. New Gener Comput 24, 79–95 (2006). https://doi.org/10.1007/BF03037294
