Learning Sparse Fully Connected Layers in Convolutional Neural Networks

  • Conference paper
Artificial Intelligence and Industrial Applications (A2IA 2023)

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 772)


Abstract

Convolutional neural networks (CNNs) are powerful learning methods within the deep learning framework. However, they contain a very large number of parameters, which restricts their use on resource-constrained devices. Searching for an appropriate and simple convolutional neural network architecture with an optimal number of parameters remains a challenging problem. In fact, in many well-known convolutional neural networks such as LeNet, AlexNet, and VGGNet, the weights of the fully connected layers account for more than 86% of the total weights in the network. Based on this observation, we propose a sparse regularization term, obtained by smoothing the \(L_0\) and \(L_2\) regularizers, that removes unnecessary neurons in the fully connected layers. This ensures neuron sparsity and effectively reduces the complexity of convolutional neural networks, as demonstrated in numerous experiments.
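The idea of pruning fully connected neurons through a smoothed combination of \(L_0\) and \(L_2\) penalties can be illustrated with a short sketch. The snippet below is a minimal, hypothetical PyTorch example, not the paper's implementation: it assumes a common exponential surrogate for the group \(L_0\) norm, \(1-\exp(-\lVert w_i\rVert^2/\beta)\), applied row-wise to each fully connected weight matrix (one row per neuron) and combined with an \(L_2\) term. The function name `smoothed_l0_l2_penalty` and the coefficients `beta`, `lam_l0`, `lam_l2` are illustrative choices only; the paper's exact smoothing function and training setup may differ.

```python
# Hypothetical sketch of a smoothed L0 + L2 group penalty on fully connected
# layers. The smoothing (1 - exp(-||w_i||^2 / beta)) is an assumed surrogate,
# not necessarily the form used in the paper.
import torch
import torch.nn as nn

def smoothed_l0_l2_penalty(fc_weight, beta=1.0, lam_l0=1e-4, lam_l2=1e-4):
    # For nn.Linear, row i of `weight` holds the incoming weights of output
    # neuron i, so penalizing rows encourages whole neurons to switch off.
    group_sq_norms = (fc_weight ** 2).sum(dim=1)                   # ||w_i||^2 per neuron
    smoothed_l0 = (1.0 - torch.exp(-group_sq_norms / beta)).sum()  # ~ count of active neurons
    l2 = group_sq_norms.sum()                                      # plain weight-decay term
    return lam_l0 * smoothed_l0 + lam_l2 * l2

# Toy classifier head standing in for the fully connected part of a CNN.
model = nn.Sequential(nn.Flatten(), nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
x, y = torch.randn(32, 1, 28, 28), torch.randint(0, 10, (32,))

loss = nn.functional.cross_entropy(model(x), y)
# Add the penalty of every fully connected layer to the task loss.
loss = loss + sum(smoothed_l0_l2_penalty(m.weight)
                  for m in model.modules() if isinstance(m, nn.Linear))
loss.backward()
```

After training with such a penalty, neurons whose row norm has been driven close to zero can be removed, which is the mechanism by which the fully connected layers shrink; the pruning threshold and the schedule for the regularization coefficients would need to be tuned per network.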

Author information

Corresponding author

Correspondence to Mohamed Quasdane.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Quasdane, M., Ramchoun, H., Masrour, T. (2023). Learning Sparse Fully Connected Layers in Convolutional Neural Networks. In: Masrour, T., Ramchoun, H., Hajji, T., Hosni, M. (eds) Artificial Intelligence and Industrial Applications. A2IA 2023. Lecture Notes in Networks and Systems, vol 772. Springer, Cham. https://doi.org/10.1007/978-3-031-43520-1_16
