AI 2016: Advances in Artificial Intelligence pp 542–547
An Empirically-Sourced Heuristic for Predetermining the Size of the Hidden Layer of a Multi-layer Perceptron for Large Datasets
Conference paper
Abstract
We recommend a guiding heuristic for locating a sufficiently sized multilayer perceptron (MLP) for larger datasets. Intended to narrow the search space, it is based on experimental research into the comparative performance of 14 existing approaches, with global minimum ranges, on 31 larger datasets. The most consistent performer was Baum's [1] equation, which sets the number of hidden neurons equal to the square root of the number of training instances.
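The recommended heuristic can be expressed directly in code. A minimal sketch (the function name is illustrative, not from the paper), assuming the hidden-layer size is rounded to the nearest whole neuron:

```python
import math

def baum_hidden_units(n_train: int) -> int:
    """Baum's [1] heuristic: hidden-layer size = sqrt(#training instances)."""
    if n_train < 1:
        raise ValueError("need at least one training instance")
    return max(1, round(math.sqrt(n_train)))

# For a dataset of 10,000 training instances, the heuristic
# suggests a hidden layer of 100 neurons.
print(baum_hidden_units(10_000))  # -> 100
```

Since the heuristic is meant only to seed the architecture search, the result would typically serve as a starting point rather than a final hidden-layer size.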
Keywords
Neural network · Multilayer perceptron · Hidden layer size · Global minimum · Local minimum

References
- 1. Baum, E.B.: On the capabilities of multilayer perceptrons. J. Complex. 4, 193–215 (1988)
- 2. Cybenko, G.: Approximation by superpositions of a sigmoidal function. Math. Control Signals Syst. 2, 303–314 (1989)
- 3. Hornik, K.: Approximation capabilities of multilayer feedforward networks. Neural Netw. 4, 251–257 (1991)
- 4. Zeng, X., Yeung, D.S.: Hidden neuron pruning of multilayer perceptrons using a quantified sensitivity measure. Neurocomputing 69, 825–837 (2006)
- 5. Aran, O., Yildiz, O.T., Alpaydin, E.: An incremental framework based on cross-validation for estimating the architecture of a multilayer perceptron. Int. J. Pattern Recogn. Artif. Intell. 23, 159–190 (2009)
- 6. Hecht-Nielsen, R.: Kolmogorov's mapping neural network existence theorem. In: Proceedings of IEEE First Annual International Conference on Neural Networks, pp. III-11–III-14 (1987)
- 7. Sprecher, D.A.: A universal mapping for Kolmogorov's superposition theorem. Neural Netw. 6, 1089–1094 (1993)
- 8. Barron, A.R.: Approximation and estimation bounds for artificial neural networks. Mach. Learn. 14, 115–133 (1994)
- 9. Rogers, L.L., Dowla, F.U.: Optimization of groundwater remediation using artificial neural networks with parallel solute transport modeling. Water Resour. Res. 30, 457–481 (1994)
- 10. Somaratne, S., Seneviratne, G., Coomaraswamy, U.: Prediction of soil organic carbon across different land-use patterns. Soil Sci. Soc. Am. J. 69, 1580–1589 (2005)
- 11. Denker, J.S., Schwartz, D., Wittner, B., Solla, S., Howard, R., Jackel, L., Hopfield, J.: Large automatic learning, rule extraction and generalization. Complex Syst. 1, 877–922 (1987)
- 12. Wanas, N.M., Auda, G.A., Kamel, M.S., Karray, F.O.: On the optimal number of hidden nodes in a neural network. In: IEEE Canadian Conference on Electrical and Computer Engineering 1998, vol. 2, pp. 918–921 (1998)
- 13. Gallinari, P., Thiria, S., Soulie, F.F.: Multilayer perceptrons and data analysis. In: IEEE International Conference on Neural Networks 1988, pp. 391–399 (1988)
- 14. Shibata, K., Ikeda, Y.: Effect of number of hidden neurons on learning in large-scale layered neural networks. In: ICROS-SICE International Joint Conference 2009, pp. 5008–5013. SICE, Fukuoka International Congress Center, Japan (2009)
- 15. Arai, M.: Bounds on the number of hidden units in binary-valued three-layer neural networks. Neural Netw. 6, 855–860 (1993)
- 16. Huang, S.-C., Huang, Y.-F.: Bounds on the number of hidden neurons in multilayer perceptrons. IEEE Trans. Neural Netw. 2, 47–55 (1991)
- 17. Deepa, S.N., Sheela, K.G.: Estimation of number of hidden neurons in back propagation networks for wind speed prediction in renewable energy systems. Draft (2013)
- 18. Xu, S., Chen, L.: A novel approach for determining the optimal number of hidden layer neurons for FNNs and its application in data mining. In: Proceedings of the 5th International Conference on Information Technology and Applications, 23–26 June 2008, Cairns, Qld, pp. 683–686 (2008)
- 19. Gorman, R.P., Sejnowski, T.J.: Analysis of hidden units in a layered network trained to classify sonar targets. Neural Netw. 1, 75–89 (1988)
- 20. Bache, K., Lichman, M.: UCI Machine Learning Repository. School of Information and Computer Science, University of California, Irvine (2013)
- 21. Hoyer, P.O., Ong, C.S., Henschel, S., Braun, M.L., Sonnenburg, S.: IDA Benchmark Repository, vol. 0.1.6. ML Group, Berlin (2013)
- 22. ACM Special Interest Group on Knowledge Discovery and Data Mining: KDD Cup 2004: Particle physics; plus protein homology prediction. ACM (2004). http://www.kdd.org
- 23. Dayhoff, J.: Neural Network Architectures: An Introduction. International Thomson Computer Press, Boston (1996)
- 24.
- 25. Flexer, A.: Statistical evaluation of neural network experiments: minimum requirements and current practice, pp. 1005–1008. The Austrian Research Institute for Artificial Intelligence, Schottengasse 3, A-1010 (1994)
- 26. Demuth, H., Beale, M., Hagan, M.: Neural Network Toolbox 6 User's Guide. The MathWorks Inc., Natick (2009)
- 27. Møller, M.F.: A scaled conjugate gradient algorithm for fast supervised learning. Neural Netw. 6, 525–533 (1993)
Copyright information
© Springer International Publishing AG 2016