
Training Multilayer Perceptrons Via Minimization of Sum of Ridge Functions

Abstract

Motivated by the problem of training multilayer perceptrons in neural networks, we consider the problem of minimizing $E(x)=\sum_{i=1}^{n} f_i(\xi_i \cdot x)$, where $\xi_i \in \mathbb{R}^s$, $1 \le i \le n$, and each $f_i(\xi_i \cdot x)$ is a ridge function. We show that when $n$ is small the problem of minimizing $E$ can be treated as one of minimizing univariate functions, and we use gradient algorithms for minimizing $E$ when $n$ is moderately large. For large $n$, we present online gradient algorithms and, in particular, establish their monotonicity and weak convergence.
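The objective above is a finite sum of ridge terms, and for large $n$ the abstract points to online (sample-by-sample) gradient updates. The following is a minimal sketch of such an online gradient loop, assuming squared-error ridge terms $f_i(t)=(\sigma(t)-y_i)^2$ with a sigmoid $\sigma$ and hypothetical targets $y_i$; the paper's actual choice of $f_i$, step sizes, and convergence conditions may differ.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def online_gradient(xi, y, x0, lr=0.1, epochs=50):
    """Online gradient descent for E(x) = sum_i f_i(xi_i . x),
    here with the illustrative choice f_i(t) = (sigmoid(t) - y_i)^2.

    xi : (n, s) array whose rows are the directions xi_i
    y  : (n,) array of hypothetical targets y_i
    x0 : (s,) initial weight vector
    """
    x = x0.astype(float).copy()
    for _ in range(epochs):
        for xi_i, y_i in zip(xi, y):
            t = xi_i @ x                                     # ridge argument xi_i . x
            s = sigmoid(t)
            grad_i = 2.0 * (s - y_i) * s * (1.0 - s) * xi_i  # gradient of f_i w.r.t. x
            x -= lr * grad_i                                 # update after each single term
    return x

# Usage on a small synthetic problem (all data hypothetical).
rng = np.random.default_rng(0)
xi = rng.normal(size=(20, 3))
y = (xi @ np.array([1.0, -2.0, 0.5]) > 0).astype(float)
print(online_gradient(xi, y, x0=np.zeros(3)))
```

Each pass updates the weights immediately after processing a single ridge term, which is the setting in which the abstract's monotonicity and weak-convergence statements apply.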

Cite this article

Wu, W., Feng, G. & Li, X. Training Multilayer Perceptrons Via Minimization of Sum of Ridge Functions. Advances in Computational Mathematics 17, 331–347 (2002). https://doi.org/10.1023/A:1016249727555
