Abstract
Convolutional neural networks (CNNs) are a very powerful learning method in the deep learning framework. However, they contain a very large number of parameters, which restricts their use on resource-limited devices. Searching for an appropriate and simple convolutional neural network architecture with an optimal number of parameters is still a challenging problem. In fact, in many well-known convolutional neural networks such as LeNet, AlexNet, and VGGNet, the weights of the fully connected layers account for more than 86% of the total weights in the network. Based on this observation, we propose a sparse regularization term based on smoothing the \(L_0\) and \(L_2\) regularizers to prune unnecessary neurons in the fully connected layers. This ensures neuron sparsity and effectively reduces the complexity of convolutional neural networks, as shown in many experiments.
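The abstract does not spell out the penalty itself; one way to realize a neuron-wise smoothed \(L_0\) term combined with \(L_2\) weight decay on a fully connected layer, given here only as an illustrative sketch (the smoothing function \(h_\varepsilon\) and the coefficients \(\lambda_0,\lambda_2\) are assumptions, not the authors' exact formulation), is

\[
  R(W) \;=\; \lambda_0 \sum_{j=1}^{n} h_{\varepsilon}\!\bigl(\lVert w_j \rVert_2\bigr) \;+\; \lambda_2 \lVert W \rVert_2^2,
  \qquad
  h_{\varepsilon}(t) \;=\; \frac{t^2}{t^2 + \varepsilon},
\]

where \(w_j\) collects the weights attached to neuron \(j\) of the fully connected layer and \(\varepsilon > 0\) controls the smoothing. As \(\varepsilon \to 0\), \(h_{\varepsilon}\) approaches the \(L_0\) indicator, so neurons whose group norm is driven to zero during training can be removed, while the \(L_2\) term keeps the optimization of the remaining weights well conditioned.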
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Quasdane, M., Ramchoun, H., Masrour, T. (2023). Learning Sparse Fully Connected Layers in Convolutional Neural Networks. In: Masrour, T., Ramchoun, H., Hajji, T., Hosni, M. (eds) Artificial Intelligence and Industrial Applications. A2IA 2023. Lecture Notes in Networks and Systems, vol 772. Springer, Cham. https://doi.org/10.1007/978-3-031-43520-1_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43519-5
Online ISBN: 978-3-031-43520-1