Fitness Landscape Analysis of Weight-Elimination Neural Networks

Neural Processing Letters

Abstract

Neural network architectures can be regularised by adding a penalty term to the objective function, thus minimising network complexity in addition to the error. However, adding a term to the objective function inevitably changes the surface of the objective function. This study investigates the landscape changes induced by the weight elimination penalty function under various parameter settings. Fitness landscape metrics are used to quantify and visualise the induced landscape changes, as well as to propose sensible ranges for the regularisation parameters. Fitness landscape metrics are shown to be a viable tool for neural network objective function landscape analysis and visualisation.
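
To make the penalised objective concrete, the following is a minimal sketch of the weight elimination penalty in the form commonly given in the literature (Weigend, Rumelhart and Huberman, 1991): each weight w contributes (w/w0)^2 / (1 + (w/w0)^2), and the sum is scaled by a coefficient and added to the error. The function names, the parameter names `lam` and `w0`, and the default values are assumptions made for illustration; the preview text does not state the exact formulation or parameter ranges used in the study.

```python
import numpy as np


def weight_elimination_penalty(weights, w0=1.0):
    """Sum of (w/w0)^2 / (1 + (w/w0)^2) over all weights.

    Each term lies in [0, 1): it is near 0 for |w| << w0 and
    approaches 1 for |w| >> w0, so the penalty roughly counts
    the weights that are large relative to w0.
    """
    scaled = (np.asarray(weights, dtype=float) / w0) ** 2
    return float(np.sum(scaled / (1.0 + scaled)))


def regularised_objective(error, weights, lam=1e-3, w0=1.0):
    """Training error plus the scaled penalty (lam and w0 are illustrative)."""
    return error + lam * weight_elimination_penalty(weights, w0)


if __name__ == "__main__":
    w = np.array([0.01, 0.5, 3.0])
    for w0 in (0.1, 1.0, 10.0):
        print(w0, weight_elimination_penalty(w, w0))
```

Varying `lam` and `w0` in a sketch like this changes how strongly, and at what weight magnitude, the penalty reshapes the objective surface, which is the kind of landscape change the study sets out to quantify.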



Acknowledgements

The authors would like to thank the Centre for High Performance Computing (CHPC) (http://www.chpc.ac.za) for the use of their cluster to obtain the data for this study. This work is based on research supported by the National Research Foundation (NRF) of South Africa (Grant Number 46712). The opinions, findings, conclusions and recommendations expressed in this article are those of the author(s) alone, and not those of the NRF. The NRF accepts no liability whatsoever in this regard.

Corresponding author

Correspondence to Anna Bosman.


Cite this article

Bosman, A., Engelbrecht, A. & Helbig, M. Fitness Landscape Analysis of Weight-Elimination Neural Networks. Neural Process Lett 48, 353–373 (2018). https://doi.org/10.1007/s11063-017-9729-9
