Abstract
Learning in artificial neural networks is usually based on local minimization methods, which have no mechanism for escaping the influence of an undesired local minimum. This chapter presents strategies for developing globally convergent modifications of local search methods and investigates the use of popular global search methods in neural network learning. The proposed methods tend to lead to desirable weight configurations and allow the network to learn the entire training set; in that sense, they improve the efficiency of the learning process. Simulation experiments on learning problems that are notorious for their local minima are presented, and an extensive comparison of several learning algorithms is provided.
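The chapter itself contains no code, but the two strategies named in the abstract can be illustrated concretely. Below is a minimal Python/NumPy sketch, not the authors' exact algorithms: plain gradient descent stands in for the local method, the XOR problem serves as the classic example of stagnation, and a simplified DE/rand/1/bin variant of Differential Evolution (Storn and Price, 1997) plays the role of the global search invoked when the local method stalls. The network size (2-2-1), finite-difference gradients (in place of backpropagation), learning rate, error threshold, and DE parameters are all illustrative assumptions.

```python
import numpy as np

# XOR training set: the standard example of a task on which plain
# gradient descent can stagnate far from a desirable weight configuration.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def unpack(w):
    # 2-2-1 network: 9 parameters in total, biases included.
    W1 = w[0:4].reshape(2, 2); b1 = w[4:6]
    W2 = w[6:8].reshape(2, 1); b2 = w[8:9]
    return W1, b1, W2, b2

def error(w):
    # Mean squared error over the whole training set.
    W1, b1, W2, b2 = unpack(w)
    h = sigmoid(X @ W1 + b1)
    y = sigmoid(h @ W2 + b2)
    return np.mean((y - T) ** 2)

def numgrad(w, eps=1e-6):
    # Central-difference gradient; keeps the sketch short, whereas
    # backpropagation would compute the same quantity analytically.
    g = np.zeros_like(w)
    for i in range(w.size):
        d = np.zeros_like(w); d[i] = eps
        g[i] = (error(w + d) - error(w - d)) / (2 * eps)
    return g

def local_search(w, lr=0.5, steps=2000, tol=1e-8):
    # Plain gradient descent: converges only to a nearby minimizer.
    for _ in range(steps):
        g = numgrad(w)
        if np.linalg.norm(g) < tol:
            break
        w = w - lr * g
    return w

def differential_evolution(dim=9, pop=30, F=0.5, CR=0.9, gens=300, rng=None):
    # Simplified DE/rand/1/bin: population-based global search over the
    # weight vector; pop, F, CR, gens are illustrative choices.
    if rng is None:
        rng = np.random.default_rng(0)
    P = rng.uniform(-2, 2, size=(pop, dim))
    fit = np.array([error(p) for p in P])
    for _ in range(gens):
        for i in range(pop):
            a, b, c = P[rng.choice(pop, 3, replace=False)]
            mutant = a + F * (b - c)
            cross = rng.random(dim) < CR
            trial = np.where(cross, mutant, P[i])
            f = error(trial)
            if f <= fit[i]:            # greedy selection
                P[i], fit[i] = trial, f
    return P[np.argmin(fit)]

w = local_search(np.random.default_rng(1).uniform(-0.5, 0.5, 9))
if error(w) > 1e-3:                    # stagnation: fall back to global search
    w = local_search(differential_evolution())
print("final error:", error(w))
```

The driver at the bottom reflects the hybrid idea in the abstract: run the cheap local method first, and only when it fails to learn the training set invoke the global search to escape the undesired region and hand a better starting point back to the local method.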
© 2006 Springer Science+Business Media, LLC
Cite this chapter
Plagianakos, V.P., Magoulas, G.D., Vrahatis, M.N. (2006). Improved Learning of Neural Nets Through Global Search. In: Pintér, J.D. (ed.) Global Optimization. Nonconvex Optimization and Its Applications, vol. 85. Springer, Boston, MA. https://doi.org/10.1007/0-387-30927-6_15
DOI: https://doi.org/10.1007/0-387-30927-6_15
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-30408-3
Online ISBN: 978-0-387-30927-9