Abstract
Convolutional neural networks (CNNs) are a very powerful learning method in the deep learning framework. However, they contain a very large number of parameters, which restricts their use on resource-limited devices. Searching for an appropriate and simple convolutional neural network architecture with an optimal number of parameters is still a challenging problem. In fact, in many well-known convolutional neural networks such as LeNet, AlexNet, and VGGNet, the weights of the fully connected layers account for more than 86% of the total weights in the network. Based on this observation, we propose a sparse regularization term based on smoothing the \(L_0\) and \(L_2\) regularizers to prune unnecessary neurons in the fully connected layers. This ensures neuron sparsity and effectively reduces the complexity of convolutional neural networks, as shown in many experiments.
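The abstract does not spell out the penalty itself; one way to realize a neuron-wise smoothed \(L_0\) term combined with \(L_2\) weight decay on a fully connected layer, given here only as an illustrative sketch (the smoothing function \(h_\varepsilon\) and the coefficients \(\lambda_0,\lambda_2\) are assumptions, not the authors' exact formulation), is

\[
  R(W) \;=\; \lambda_0 \sum_{j=1}^{n} h_{\varepsilon}\!\bigl(\lVert w_j \rVert_2\bigr) \;+\; \lambda_2 \lVert W \rVert_2^2,
  \qquad
  h_{\varepsilon}(t) \;=\; \frac{t^2}{t^2 + \varepsilon},
\]

where \(w_j\) collects the weights attached to neuron \(j\) of the fully connected layer and \(\varepsilon > 0\) controls the smoothing. As \(\varepsilon \to 0\), \(h_{\varepsilon}\) approaches the \(L_0\) indicator, so neurons whose group norm is driven to zero during training can be removed, while the \(L_2\) term keeps the optimization of the remaining weights well conditioned.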
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Quasdane, M., Ramchoun, H., Masrour, T. (2023). Learning Sparse Fully Connected Layers in Convolutional Neural Networks. In: Masrour, T., Ramchoun, H., Hajji, T., Hosni, M. (eds) Artificial Intelligence and Industrial Applications. A2IA 2023. Lecture Notes in Networks and Systems, vol 772. Springer, Cham. https://doi.org/10.1007/978-3-031-43520-1_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43519-5
Online ISBN: 978-3-031-43520-1