Improved Learning of Neural Nets Through Global Search

  • Chapter
Global Optimization

Part of the book series: Nonconvex Optimization and Its Applications (NOIA, volume 85)

Abstract

Learning in artificial neural networks is usually based on local minimization methods, which have no mechanism for escaping the influence of an undesired local minimum. This chapter presents strategies for developing globally convergent modifications of local search methods and investigates the use of popular global search methods in neural network learning. The proposed methods tend to lead to desirable weight configurations and allow the network to learn the entire training set, and in that sense they improve the efficiency of the learning process. Simulation experiments on learning problems notorious for their local minima are presented, and an extensive comparison of several learning algorithms is provided.
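As a rough illustration of the idea (not a method taken from the chapter), the sketch below treats the weights of a tiny 2-2-1 feedforward network as a single vector and trains it on XOR, a standard example of a learning task plagued by undesired local minima, using differential evolution, one popular global search method, instead of plain gradient descent. The network size, the weight bounds, and the use of SciPy's differential_evolution are illustrative assumptions, not details from the chapter.

```python
# A minimal sketch (assumptions noted below): training a tiny 2-2-1 feedforward
# network on XOR by treating its weights as one vector and searching that space
# globally with differential evolution rather than a local gradient method.
import numpy as np
from scipy.optimize import differential_evolution

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([0.0, 1.0, 1.0, 0.0])  # XOR targets

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sse(w):
    """Sum-of-squares training error of a 2-2-1 network with weights flattened in w."""
    W1 = w[0:4].reshape(2, 2)   # input -> hidden weights
    b1 = w[4:6]                 # hidden biases
    W2 = w[6:8]                 # hidden -> output weights
    b2 = w[8]                   # output bias
    H = sigmoid(X @ W1 + b1)
    y = sigmoid(H @ W2 + b2)
    return np.sum((y - T) ** 2)

# Search the 9-dimensional weight space globally; the bounds are an assumption.
bounds = [(-10.0, 10.0)] * 9
result = differential_evolution(sse, bounds, seed=0, tol=1e-8, maxiter=500)
print("final training error:", result.fun)  # typically close to 0, i.e. XOR is learned
```

A purely local method such as plain backpropagation can stall at a nonzero error on this task when started from an unlucky initialization; a global search explores the weight space more broadly, at a higher cost per run.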

Copyright information

© 2006 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Plagianakos, V.P., Magoulas, G.D., Vrahatis, M.N. (2006). Improved Learning of Neural Nets Through Global Search. In: Pintér, J.D. (eds) Global Optimization. Nonconvex Optimization and Its Applications, vol 85. Springer, Boston, MA. https://doi.org/10.1007/0-387-30927-6_15
