
A lightweight deep neural network model and its applications based on channel pruning and group vector quantization

  • Original Article
  • Published in Neural Computing and Applications

Abstract

Deep convolutional neural networks (DCNNs) contain millions of parameters and require a tremendous amount of computation, so they are poorly suited to resource-constrained edge devices. We propose a two-stage model compression method, channel pruning and group vector quantization (CP-GVQ), to alleviate this restriction. In the channel-pruning stage, many channels of the network layers are removed, which reduces the model size and speeds up inference. In the second stage, GVQ, an extension of vector quantization (VQ), represents the parameters of grouped layers with shared group codebooks and code matrices, greatly reducing the model size. After each stage, the model is fine-tuned to recover the original accuracy. CP-GVQ therefore both decreases model size dramatically and improves inference speed. When applied to a model that classifies filament indices in microscopic images of activated sludge, classification accuracy decreased only marginally, from 0.99 to 0.97, while the model size shrank by 99% and inference speed improved by 42%.
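The abstract describes the two stages only at a high level. As a rough illustration, below is a minimal NumPy sketch of the two ideas, assuming L1-norm channel scoring for the pruning stage and a plain k-means codebook for the quantization stage; the function names, the subvector length dim, the codebook size k, and the omission of the fine-tuning steps are simplifying assumptions made here for illustration, not the paper's exact procedure.

    import numpy as np

    def prune_channels(weight, keep_ratio=0.5):
        # Stage 1 (illustrative): keep the output channels with the largest
        # L1 norms; weight has shape (out_channels, in_channels, kh, kw).
        scores = np.abs(weight).sum(axis=(1, 2, 3))        # one score per channel
        keep = np.sort(np.argsort(scores)[-int(len(scores) * keep_ratio):])
        return weight[keep], keep

    def group_vector_quantize(weight, dim=4, k=64, iters=20, seed=0):
        # Stage 2 (illustrative): split the flattened kernel into dim-long
        # subvectors, build a shared codebook with plain Lloyd/k-means, and
        # replace every subvector by the index of its nearest codeword.
        rng = np.random.default_rng(seed)
        vecs = weight.reshape(-1, dim)                     # group into subvectors
        codebook = vecs[rng.choice(len(vecs), k, replace=False)].copy()
        for _ in range(iters):
            dists = ((vecs[:, None, :] - codebook[None]) ** 2).sum(-1)
            codes = dists.argmin(1)                        # nearest codeword ids
            for c in range(k):                             # recenter non-empty cells
                members = vecs[codes == c]
                if len(members):
                    codebook[c] = members.mean(0)
        return codebook, codes

    w = np.random.randn(64, 32, 3, 3).astype(np.float32)  # toy convolution kernel
    w_pruned, kept = prune_channels(w, keep_ratio=0.5)
    codebook, codes = group_vector_quantize(w_pruned, dim=4, k=64)
    w_hat = codebook[codes].reshape(w_pruned.shape)        # dequantized weights
    print(w_pruned.shape, codebook.shape, codes.shape,
          float(np.abs(w_pruned - w_hat).mean()))

After quantization, only the small codebook (k x dim floats) and the integer code matrix need to be stored, which is where the large size reduction comes from. A real pipeline would also prune the matching input channels of each following layer and, as the abstract notes, fine-tune the model after each stage to recover accuracy.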


Data availability

The datasets generated and analyzed during the current study are available from the corresponding author on reasonable request.


Acknowledgements

This work was supported by the National Key R&D Program of China under Grant 2018YFB1700200, the 2020 Support Plan for Innovative Talents of Higher Education, and the 2021 Basic Scientific Research Projects of Higher Education (Key Projects) under Grant LJKZ0422.

Author information


Corresponding author

Correspondence to Lijie Zhao.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Huang, M., Liu, Y., Zhao, L. et al. A lightweight deep neural network model and its applications based on channel pruning and group vector quantization. Neural Comput & Applic 36, 5333–5346 (2024). https://doi.org/10.1007/s00521-023-09332-z

