An Empirically-Sourced Heuristic for Predetermining the Size of the Hidden Layer of a Multi-layer Perceptron for Large Datasets

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9992)

Abstract

We recommend a guiding heuristic for locating a sufficiently sized multilayer perceptron (MLP) for larger datasets. Intended to narrow the search scope, it is based on an experimental comparison of 14 existing approaches, with global-minimum ranges, on 31 larger datasets. The most consistent performer was Baum’s [1] rule, which sets the number of hidden neurons equal to the square root of the number of training instances.
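As a minimal illustration of the recommended heuristic, the rule from Baum [1] can be applied directly before any architecture search. The function name below is our own; the paper only specifies the square-root relationship.

```python
import math

def baum_hidden_neurons(n_training_instances: int) -> int:
    """Baum's heuristic [1]: size the MLP hidden layer as the
    square root of the number of training instances."""
    if n_training_instances < 1:
        raise ValueError("need at least one training instance")
    # Round to the nearest whole neuron, with a floor of 1.
    return max(1, round(math.sqrt(n_training_instances)))

# e.g. a dataset of 10,000 training instances suggests 100 hidden neurons
print(baum_hidden_neurons(10_000))
```

In practice this value would serve as a starting point for the search over hidden-layer sizes, not as a final architecture.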

Keywords

Neural network · Multilayer perceptron · Hidden layer size · Global minimum · Local minimum

References

  1. Baum, E.B.: On the capabilities of multilayer perceptrons. J. Complex. 4, 193–215 (1988)
  2. Cybenko, G.: Approximation by superpositions of a sigmoidal function. Math. Control Signals Syst. 2, 303–314 (1989)
  3. Hornik, K.: Approximation capabilities of multilayer feedforward networks. Neural Netw. 4, 251–257 (1991)
  4. Zeng, X., Yeung, D.S.: Hidden neuron pruning of multilayer perceptrons using a quantified sensitivity measure. Neurocomputing 69, 825–837 (2006)
  5. Aran, O., Yildiz, O.T., Alpaydin, E.: An incremental framework based on cross-validation for estimating the architecture of a multilayer perceptron. Int. J. Pattern Recogn. Artif. Intell. 23, 159–190 (2009)
  6. Hecht-Nielsen, R.: Kolmogorov’s mapping neural network existence theorem. In: Proceedings of the IEEE First Annual International Conference on Neural Networks, pp. III-11–III-14 (1987)
  7. Sprecher, D.A.: A universal mapping for Kolmogorov’s superposition theorem. Neural Netw. 6, 1089–1094 (1993)
  8. Barron, A.R.: Approximation and estimation bounds for artificial neural networks. Mach. Learn. 14, 115–133 (1994)
  9. Rogers, L.L., Dowla, F.U.: Optimization of groundwater remediation using artificial neural networks with parallel solute transport modeling. Water Resour. Res. 30, 457–481 (1994)
  10. Somaratne, S., Seneviratne, G., Coomaraswamy, U.: Prediction of soil organic carbon across different land-use patterns. Soil Sci. Soc. Am. J. 69, 1580–1589 (2005)
  11. Denker, J.S., Schwartz, D., Wittner, B., Solla, S., Howard, R., Jackel, L., Hopfield, J.: Large automatic learning, rule extraction and generalization. Complex Syst. 1, 877–922 (1987)
  12. Wanas, N.M., Auda, G.A., Kamel, M.S., Karray, F.O.: On the optimal number of hidden nodes in a neural network. In: IEEE Canadian Conference on Electrical and Computer Engineering 1998, vol. 2, pp. 918–921 (1998)
  13. Gallinari, P., Thiria, S., Soulie, F.F.: Multilayer perceptrons and data analysis. In: IEEE International Conference on Neural Networks 1988, pp. 391–399 (1988)
  14. Shibata, K., Ikeda, Y.: Effect of number of hidden neurons on learning in large-scale layered neural networks. In: ICROS-SICE International Joint Conference 2009, pp. 5008–5013. SICE, Fukuoka International Congress Center, Japan (2009)
  15. Arai, M.: Bounds on the number of hidden units in binary-valued three-layer neural networks. Neural Netw. 6, 855–860 (1993)
  16. Huang, S.-C., Huang, Y.-F.: Bounds on the number of hidden neurons in multilayer perceptrons. IEEE Trans. Neural Netw. 2, 47–55 (1991)
  17. Deepa, S.N., Sheela, K.G.: Estimation of number of hidden neurons in back propagation networks for wind speed prediction in renewable energy systems. Draft (2013)
  18. Xu, S., Chen, L.: A novel approach for determining the optimal number of hidden layer neurons for FNNs and its application in data mining. In: Proceedings of the 5th International Conference on Information Technology and Applications, 23–26 June 2008, Cairns, Qld, pp. 683–686 (2008)
  19. Gorman, R.P., Sejnowski, T.J.: Analysis of hidden units in a layered network trained to classify sonar targets. Neural Netw. 1, 75–89 (1988)
  20. Bache, K., Lichman, M.: UCI Machine Learning Repository. School of Information and Computer Science, University of California, Irvine (2013)
  21. Hoyer, P.O., Ong, C.S., Henschel, S., Braun, M.L., Sonnenburg, S.: IDA Benchmark Repository, vol. 0.1.6. ML Group, Berlin (2013)
  22. ACM Special Interest Group on Knowledge Discovery and Data Mining: KDD Cup 2004: particle physics; plus protein homology prediction. ACM (2004). http://www.kdd.org
  23. Dayhoff, J.: Neural Network Architectures: An Introduction. International Thomson Computer Press, Boston (1996)
  24.
  25. Flexer, A.: Statistical evaluation of neural network experiments: minimum requirements and current practice, pp. 1005–1008. The Austrian Research Institute for Artificial Intelligence, Schottengasse 3, A-1010 (1994)
  26. Demuth, H., Beale, M., Hagan, M.: Neural Network Toolbox 6 User’s Guide. The MathWorks Inc., Natick (2009)
  27. Møller, M.F.: A scaled conjugate gradient algorithm for fast supervised learning. Neural Netw. 6, 525–533 (1993)

Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  1. School of Engineering and ICT, University of Tasmania, Launceston, Australia
