Revisiting the Problem of Weight Initialization for Multi-Layer Perceptrons Trained with Back Propagation

Adam, Stavros; Karras, Dimitrios Alexios; Vrahatis, Michael N.

doi:10.1007/978-3-642-03040-6_38

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5507)

Included in the following conference series:

International Conference on Neural Information Processing

Abstract

One of the main reasons for the slow convergence and suboptimal generalization of Multilayer Perceptrons (MLPs) trained by gradient descent is the lack of a proper initialization of the weights to be adjusted. Even sophisticated learning procedures cannot compensate for bad initial weight values, while a good initial guess leads to fast convergence and/or better generalization capability even with simple gradient-based error minimization techniques. Although the initial weight space of MLPs appears to be so critical, there has been no study so far of its properties with regard to which regions lead to solutions or failures, in terms of generalization and convergence, in real-world problems. Only some preliminary studies exist for toy problems such as XOR. In this paper, a data mining approach based on Self-Organizing Feature Maps (SOM) is employed to demonstrate that a complete analysis of the MLP weight space is possible. This is the main novelty of the paper. The conclusions drawn from this novel application of the SOM algorithm to MLP analysis significantly extend previous preliminary results in the literature. MLP initialization procedures are reviewed along with the conclusions drawn so far in the literature, and an extensive experimental study on more representative tasks, using our data mining approach, reveals important properties of the initial weight space of MLPs.
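
The general idea of mapping initial weight vectors with a SOM and then inspecting which map regions lead to convergence can be illustrated with a small sketch. This is not the authors' procedure or their experimental setup: the 2-2-1 XOR network, the uniform sampling range, the convergence tolerance, the minimal SOM training loop, and every function name below (init_weights, train_xor, train_som, analyse) are assumptions chosen only for illustration, and XOR is used merely because the abstract names it as a toy problem.

```python
import numpy as np

def init_weights(rng, scale, n_in=2, n_hidden=2):
    """Draw one flat initial weight vector uniformly from [-scale, scale]."""
    n_weights = (n_in + 1) * n_hidden + (n_hidden + 1)  # both layers, incl. biases
    return rng.uniform(-scale, scale, size=n_weights)

def train_xor(w0, epochs=1000, lr=0.5):
    """Batch back propagation on XOR (2-2-1 sigmoid MLP); returns the final MSE."""
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([[0], [1], [1], [0]], dtype=float)
    Xb = np.hstack([X, np.ones((4, 1))])      # inputs with a bias column
    W1 = w0[:6].reshape(3, 2).copy()          # (inputs + bias) x hidden
    W2 = w0[6:].reshape(3, 1).copy()          # (hidden + bias) x output
    sig = lambda z: 1.0 / (1.0 + np.exp(-z))

    def forward():
        h = sig(Xb @ W1)
        hb = np.hstack([h, np.ones((4, 1))])
        return h, hb, sig(hb @ W2)

    for _ in range(epochs):
        h, hb, out = forward()
        d_out = (out - y) * out * (1 - out)        # output-layer delta
        d_hid = (d_out @ W2[:2].T) * h * (1 - h)   # hidden delta (bias row excluded)
        W2 -= lr * hb.T @ d_out
        W1 -= lr * Xb.T @ d_hid
    return float(np.mean((forward()[2] - y) ** 2))

def train_som(data, rows=4, cols=4, iters=2000, seed=0):
    """Minimal online SOM with a Gaussian neighbourhood; returns the codebook."""
    rng = np.random.default_rng(seed)
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    codebook = data[rng.integers(0, len(data), rows * cols)].copy()
    for t in range(iters):
        frac = t / iters
        lr = 0.5 * (1 - frac)                                   # decaying step size
        sigma = max(0.5, (max(rows, cols) / 2) * (1 - frac))    # shrinking radius
        x = data[rng.integers(len(data))]
        bmu = np.argmin(np.sum((codebook - x) ** 2, axis=1))    # best matching unit
        dist2 = np.sum((grid - grid[bmu]) ** 2, axis=1)
        nbh = np.exp(-dist2 / (2 * sigma ** 2))
        codebook += lr * nbh[:, None] * (x - codebook)
    return codebook

def analyse(n_samples=100, scale=2.0, epochs=1000, som_iters=2000, tol=0.01, seed=0):
    """Sample initial weights, train each, and summarise convergence per SOM node."""
    rng = np.random.default_rng(seed)
    inits = np.array([init_weights(rng, scale) for _ in range(n_samples)])
    converged = np.array([train_xor(w, epochs) < tol for w in inits])
    codebook = train_som(inits, iters=som_iters, seed=seed)
    bmus = np.array([np.argmin(np.sum((codebook - w) ** 2, axis=1)) for w in inits])
    n_nodes = len(codebook)
    hits = np.bincount(bmus, minlength=n_nodes)
    wins = np.bincount(bmus, weights=converged.astype(float), minlength=n_nodes)
    # Fraction of converged runs per node; NaN where no sample was mapped.
    rate = np.divide(wins, hits, out=np.full(n_nodes, np.nan), where=hits > 0)
    return hits, rate, codebook

if __name__ == "__main__":
    hits, rate, _ = analyse()
    print(np.round(rate.reshape(4, 4), 2))  # per-node convergence rate on the map
```

Nodes whose rate is close to 1 would correspond to regions of the initial weight space from which training tends to succeed, and nodes close to 0 to regions from which it tends to fail; in a real study the sample count, network size, and task would need to be far larger than in this sketch.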

Author information

Authors and Affiliations

Dept. Mathematics, University of Patras Artificial Intelligence Research Center (UPAIRC), GR-26110 Patras, Greece and TEI Hpeirou, Arta, Greece
Stavros Adam
Dept. Automation, Chalkis Institute of Technology, Psachna, Evoia GR-34400 and Hellenic Open University, Greece
Dimitrios Alexios Karras
Dept. Mathematics, University of Patras Artificial Intelligence Research Center (UPAIRC), University of Patras, GR-26110, Patras, Greece
Michael N. Vrahatis

Editor information

Editors and Affiliations

Network Design and Research Center, Kyushu Institute of Technology, 680-4, Kawazu, Iizuka, 820-8502, Fukuoka, Japan
Mario Köppen
Knowledge Engineering and Discovery Research Institute (KEDRI), School of Computing and Mathematical Sciences, Auckland University of Technology, 350 Queen Street, 10110, Auckland, New Zealand
Nikola Kasabov
Department of Electrical and Computer Engineering, Robotics Laboratory, Auckland University of Technology, 38 Princes Street, 1142, Auckland, New Zealand
George Coghill

About this paper

Cite this paper

Adam, S., Karras, D.A., Vrahatis, M.N. (2009). Revisiting the Problem of Weight Initialization for Multi-Layer Perceptrons Trained with Back Propagation. In: Köppen, M., Kasabov, N., Coghill, G. (eds) Advances in Neuro-Information Processing. ICONIP 2008. Lecture Notes in Computer Science, vol 5507. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03040-6_38

DOI: https://doi.org/10.1007/978-3-642-03040-6_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03039-0
Online ISBN: 978-3-642-03040-6
eBook Packages: Computer Science, Computer Science (R0)
