Abstract
The advent of deep learning and the emergence of big data have led to renewed interest in the study of artificial neural networks (ANNs). An ANN is a highly effective classifier that is capable of learning both linear and non-linear decision boundaries. Choosing the number of hidden layers and the number of nodes in each hidden layer (along with many other parameters) of an ANN is a model selection problem. With the success of deep learning, especially on big datasets, there is a prevalent belief in the machine learning community that a deep model (that is, a model with many hidden layers) is preferable. However, this appears to contradict earlier theorems proved for ANNs, which state that a single hidden layer (with sufficiently many nodes), i.e., a shallow but broad ANN, is capable of approximating any arbitrary function. This raises the question of whether one should build a deep network or opt for a broad one. In this paper, we present a systematic study of the depth and breadth of ANNs in terms of accuracy (0-1 loss), bias, variance, and convergence performance on 72 standard UCI datasets, and we argue that broad ANNs have better overall performance than deep ANNs.
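The depth-versus-breadth comparison the abstract describes can be illustrated with a short sketch. The snippet below is a minimal illustration, not the authors' experimental code: it pits a deep against a broad multilayer perceptron with the same total number of hidden units, using scikit-learn's MLPClassifier on a single stand-in dataset. The dataset choice, layer sizes, and hyperparameters are assumptions for illustration only; the paper's actual protocol (72 UCI datasets, bias-variance estimation) is far more extensive.

```python
# Minimal sketch (assumed setup, not the paper's code): compare a "deep"
# versus a "broad" MLP that both have 64 hidden units in total.
from sklearn.datasets import load_breast_cancer  # stand-in for a UCI dataset
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# Same number of hidden units, arranged deep vs. broad.
architectures = {
    "deep (8 layers x 8 nodes)": (8,) * 8,
    "broad (1 layer x 64 nodes)": (64,),
}

for name, hidden in architectures.items():
    clf = make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=hidden, max_iter=2000, random_state=0),
    )
    # 0-1 loss = 1 - accuracy, averaged over cross-validation folds.
    acc = cross_val_score(clf, X, y, cv=5, scoring="accuracy").mean()
    print(f"{name}: mean 0-1 loss = {1 - acc:.3f}")
```

Holding the total number of hidden units fixed isolates the effect of how they are arranged; a full replication would also need repeated train/test splits to estimate bias and variance separately.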
Notes
1. These are mostly problems in text, vision, and natural language processing where a certain structure is present in the input features. For example, deep learning performed extremely well on the MNIST digit dataset (accuracy improved by over 20%) when compared to typical machine learning algorithms.
References
Rosenblatt, F.: The perceptron: a perceiving and recognizing automaton. Cornell Aeronautical Laboratory, Technical report 85-460-1 (1957)
Hinton, G.E., Osindero, S., Teh, Y.W.: A fast learning algorithm for deep belief nets. Neural Comput. 18(7), 1527–1554 (2006)
Zhang, X., LeCun, Y.: Text understanding from scratch. arXiv:1502.01710 (2015)
Zaidi, N.A., Petitjean, F., Webb, G.I.: Preconditioning an artificial neural network using Naive Bayes. In: Bailey, J., Khan, L., Washio, T., Dobbie, G., Huang, J.Z., Wang, R. (eds.) PAKDD 2016. LNCS (LNAI), vol. 9651, pp. 341–353. Springer, Heidelberg (2016). doi:10.1007/978-3-319-31753-3_28
Frank, A., Asuncion, A.: UCI machine learning repository (2010). http://archive.ics.uci.edu/ml
Zaidi, N.A., Carman, M.J., Cerquides, J., Webb, G.I.: Naive-Bayes inspired effective pre-conditioners for speeding-up logistic regression. In: IEEE International Conference on Data Mining, pp. 1097–1102 (2014)
Zaidi, N.A., Webb, G.I., Carman, M.J., Petitjean, F., Cerquides, J.: ALR^n: accelerated higher-order logistic regression. Mach. Learn. 104(2), 151–194 (2016)
Fayyad, U.M., Irani, K.B.: On the handling of continuous-valued attributes in decision tree generation. Mach. Learn. 8(1), 87–102 (1992)
Acknowledgment
This research has been supported by the Australian Research Council (ARC) under grant DP140100087, and by the Asian Office of Aerospace Research and Development, Air Force Office of Scientific Research, under contracts FA2386-15-1-4007 and FA2386-15-1-4017. Nian Liu was supported by an Early Career Researcher seed grant (2015) from the Faculty of Information Technology, Monash University, Australia.
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Liu, N., Zaidi, N.A. (2016). Artificial Neural Network: Deep or Broad? An Empirical Study. In: Kang, B.H., Bai, Q. (eds) AI 2016: Advances in Artificial Intelligence. AI 2016. Lecture Notes in Computer Science, vol 9992. Springer, Cham. https://doi.org/10.1007/978-3-319-50127-7_46
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-50126-0
Online ISBN: 978-3-319-50127-7
eBook Packages: Computer Science (R0)