Abstract
Neural network architectures can be regularised by adding a penalty term to the objective function, thus minimising network complexity in addition to the error. However, adding a term to the objective function inevitably changes the surface of the objective function. This study investigates the landscape changes induced by the weight elimination penalty function under various parameter settings. Fitness landscape metrics are used to quantify and visualise the induced landscape changes, as well as to propose sensible ranges for the regularisation parameters. Fitness landscape metrics are shown to be a viable tool for neural network objective function landscape analysis and visualisation.
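As a minimal sketch of the kind of objective studied here, the following assumes the commonly cited form of the weight elimination penalty, in which each weight w contributes (w/w0)^2 / (1 + (w/w0)^2). The function names and the parameter names `lam` (penalty coefficient) and `w0` (weight scale) are illustrative assumptions, not taken from the article, and the default values are arbitrary.

```python
def weight_elimination_penalty(weights, w0=1.0):
    """Sum of (w/w0)^2 / (1 + (w/w0)^2) over all weights (assumed form)."""
    total = 0.0
    for w in weights:
        r = (w / w0) ** 2
        # Each term lies in [0, 1): near 0 for |w| << w0, near 1 for |w| >> w0.
        total += r / (1.0 + r)
    return total


def regularised_objective(error, weights, lam=1e-3, w0=1.0):
    """Training error plus the weighted penalty term.

    `lam` and `w0` are hypothetical names for the regularisation parameters.
    """
    return error + lam * weight_elimination_penalty(weights, w0)
```

Because each term saturates at 1, the penalty roughly counts the weights that are large relative to `w0`, so both `lam` and `w0` change the shape of the resulting objective surface.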
Acknowledgements
The authors would like to thank the Centre for High Performance Computing (CHPC) (http://www.chpc.ac.za) for the use of their cluster to obtain the data for this study. This work is based on research supported by the National Research Foundation (NRF) of South Africa (Grant Number 46712). The opinions, findings and conclusions or recommendations expressed in this article are those of the author(s) alone, and not those of the NRF. The NRF accepts no liability whatsoever in this regard.
About this article
Cite this article
Bosman, A., Engelbrecht, A. & Helbig, M. Fitness Landscape Analysis of Weight-Elimination Neural Networks. Neural Process Lett 48, 353–373 (2018). https://doi.org/10.1007/s11063-017-9729-9
DOI: https://doi.org/10.1007/s11063-017-9729-9