Training Multilayer Perceptrons Via Minimization of Sum of Ridge Functions

Wu, Wei; Feng, Guorui; Li, Xin

doi:10.1023/A:1016249727555

Published: November 2002

Advances in Computational Mathematics, Volume 17, pages 331–347 (2002)

Wei Wu¹, Guorui Feng¹ & Xin Li²

Abstract

Motivated by the problem of training multilayer perceptrons in neural networks, we consider the problem of minimizing $E(x)=\sum_{i=1}^{n} f_i(\xi_i\cdot x)$, where $\xi_i\in\mathbb{R}^s$, $1\le i\le n$, and each $f_i(\xi_i\cdot x)$ is a ridge function. We show that when $n$ is small the problem of minimizing $E$ can be treated as one of minimizing univariate functions, and we use the gradient algorithms for minimizing $E$ when $n$ is moderately large. For large $n$, we present the online gradient algorithms and especially show the monotonicity and weak convergence of the algorithms.
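
To make the setting of the abstract concrete, the following is a minimal sketch, not the authors' implementation, of a batch gradient step and an online (one term at a time) gradient pass for an objective of the form E(x) = sum of f_i(ξ_i·x). The gradient of each term is f_i'(ξ_i·x) ξ_i, which is all the sketch relies on. The choice f_i(t) = 0.5 (sigmoid(t) − y_i)², the data `xis` and `targets`, the step size `eta`, and all function names are assumptions introduced only for illustration.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def make_ridge(target):
    """Return (f, f') for the assumed profile f(t) = 0.5 * (sigmoid(t) - target)**2."""
    def f(t):
        return 0.5 * (sigmoid(t) - target) ** 2
    def df(t):
        s = sigmoid(t)
        return (s - target) * s * (1.0 - s)
    return f, df

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def total_error(x, xis, ridges):
    """E(x) = sum_i f_i(xi_i . x)."""
    return sum(f(dot(xi, x)) for xi, (f, _) in zip(xis, ridges))

def batch_gradient_step(x, xis, ridges, eta):
    """One step x <- x - eta * grad E(x), with grad E(x) = sum_i f_i'(xi_i . x) xi_i."""
    grad = [0.0] * len(x)
    for xi, (_, df) in zip(xis, ridges):
        g = df(dot(xi, x))
        for k in range(len(x)):
            grad[k] += g * xi[k]
    return [xk - eta * gk for xk, gk in zip(x, grad)]

def online_gradient_epoch(x, xis, ridges, eta):
    """One pass over the terms, updating x after each single term's gradient."""
    x = list(x)
    for xi, (_, df) in zip(xis, ridges):
        g = df(dot(xi, x))
        x = [xk - eta * g * xik for xk, xik in zip(x, xi)]
    return x

# Hypothetical data: four directions xi_i in R^2 with target outputs y_i.
xis = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
targets = [0.9, 0.1, 0.5, 0.8]
ridges = [make_ridge(t) for t in targets]

x_batch = [0.0, 0.0]
x_online = [0.0, 0.0]
e0 = total_error(x_batch, xis, ridges)
for _ in range(200):
    x_batch = batch_gradient_step(x_batch, xis, ridges, eta=0.5)
    x_online = online_gradient_epoch(x_online, xis, ridges, eta=0.5)
e_batch = total_error(x_batch, xis, ridges)
e_online = total_error(x_online, xis, ridges)
print(e0, e_batch, e_online)
```

The batch step touches all n terms before updating x, while the online pass updates x after every single term, which is why the online variant is the natural choice when n is large. The fixed step size used here is a simplification; the conditions under which the paper establishes monotonicity and weak convergence are not reproduced in this sketch.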

Author information

Authors and Affiliations

Department of Applied Mathematics, Dalian University of Technology, Dalian, 116023, P.R. China
Wei Wu & Guorui Feng
Department of Mathematical Sciences, University of Nevada, Las Vegas, NV, 89154-4020, USA
Xin Li

About this article

Cite this article

Wu, W., Feng, G. & Li, X. Training Multilayer Perceptrons Via Minimization of Sum of Ridge Functions. Advances in Computational Mathematics 17, 331–347 (2002). https://doi.org/10.1023/A:1016249727555

Issue Date: November 2002
DOI: https://doi.org/10.1023/A:1016249727555
