Classical Training Methods

Part of the Operations Research/Computer Science Interfaces Series book series (ORCS, volume 36)


This chapter reviews classical training methods for multilayer neural networks. These methods are widely used for classification and function modelling tasks; nevertheless, they exhibit a number of drawbacks that should be addressed when developing such systems. They work by searching for the minimum of an error function that defines the optimal behaviour of the neural network. Several standard problems are used to show the capabilities of these models; in particular, we have benchmarked the algorithms on a nonlinear classification problem and on three function modelling problems.
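The minimization the abstract refers to can be illustrated with the delta rule applied to a single sigmoid unit. The sketch below is not the chapter's own implementation; the toy dataset (logical AND), learning rate, and number of epochs are illustrative assumptions. Each update moves the weights along the negative gradient of the squared-error function, so the error decreases over training.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Toy training set: logical AND (a linearly separable problem chosen
# for illustration; not one of the chapter's benchmarks)
data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]

random.seed(0)
w = [random.uniform(-0.5, 0.5) for _ in range(2)]  # weights
b = 0.0                                            # bias
eta = 0.5                                          # learning rate (assumed value)

def mse(w, b):
    """Mean squared error of the unit over the training set."""
    return sum((t - sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)) ** 2
               for x, t in data) / len(data)

e0 = mse(w, b)
for epoch in range(2000):
    for x, t in data:
        y = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
        # Delta rule: output error scaled by the sigmoid derivative y(1 - y)
        delta = (t - y) * y * (1 - y)
        for i in range(len(w)):
            w[i] += eta * delta * x[i]
        b += eta * delta
e1 = mse(w, b)
```

After training, `e1` is far smaller than the initial error `e0`: gradient descent has found a (local) minimum of the cost function. The classical methods reviewed in the chapter generalize this idea to multiple layers via backpropagation of the error term.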

Key words

Multilayer perceptron, delta rule, cost function





Copyright information

© Springer Science+Business Media, LLC 2006

Authors and Affiliations

  1. Grupo de Procesado Digital de Señales, Dpt. Enginyeria Electrònica, Escola Tècnica Superior d'Enginyeria, Universitat de València, Spain
  2. The Statistics and Neural Computation Research Group, School of Computing and Mathematical Sciences, Liverpool John Moores University, UK
