Abstract
Deep convolutional neural networks (DCNNs) contain millions of parameters and require a tremendous amount of computation, so they are poorly suited to resource-constrained edge devices. We propose a two-stage model compression method, channel pruning and group vector quantization (CP-GVQ), to alleviate this restriction. Channel pruning removes many channels from the DCNN's layers, reducing the model size and improving inference speed. GVQ, which builds on vector quantization (VQ), represents the parameters of grouped layers with shared group codebooks and code matrices, greatly reducing the model size. Together, the two stages both dramatically decrease model size and improve inference speed; after each stage, the model is fine-tuned to recover the original accuracy. When applied to a model that classifies filament indices in microscopic images of activated sludge, classification accuracy dropped marginally from 0.99 to 0.97, while the model size decreased by 99% and inference speed improved by 42%.
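The quantization stage described above can be illustrated with a minimal sketch: a layer's weights are split into short sub-vectors, a codebook is learned (here with plain k-means, standing in for the codebook learning the paper describes), and each sub-vector is replaced by the index of its nearest codeword. All names, sizes, and the k-means choice below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def vq_compress(weights, dim=4, k=16, iters=20, seed=0):
    """Quantize a flat weight array into a (codebook, codes) pair.

    weights : 1-D array whose length is a multiple of `dim`
    dim     : length of each sub-vector
    k       : number of codewords in the codebook
    """
    rng = np.random.default_rng(seed)
    vecs = weights.reshape(-1, dim)                    # split into sub-vectors
    codebook = vecs[rng.choice(len(vecs), k, replace=False)]
    for _ in range(iters):                             # plain k-means
        dist = ((vecs[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
        codes = dist.argmin(1)                         # nearest codeword index
        for j in range(k):                             # update non-empty clusters
            mask = codes == j
            if mask.any():
                codebook[j] = vecs[mask].mean(0)
    return codebook, codes

def vq_decompress(codebook, codes):
    """Rebuild an approximate weight array from codebook and code indices."""
    return codebook[codes].reshape(-1)

w = np.random.default_rng(1).normal(size=1024).astype(np.float32)
cb, codes = vq_compress(w, dim=4, k=16)
w_hat = vq_decompress(cb, codes)
# Storage drops from 1024 floats to a 16x4 codebook plus 256 small indices.
```

In the paper's GVQ, the same idea is applied per group of layers, so several layers share one codebook and are stored only as code matrices; the sketch above shows the single-layer case for clarity.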
Data availability
The datasets generated and analyzed during the current study are available from the corresponding author on reasonable request.
Acknowledgements
This work was supported by the National Key R&D Program of China under Grant 2018YFB1700200, the 2020 Support Plan for Innovative Talents of Higher Education, and the 2021 Basic Scientific Research Projects of Higher Education (Key Projects) under Grant LJKZ0422.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Huang, M., Liu, Y., Zhao, L. et al. A lightweight deep neural network model and its applications based on channel pruning and group vector quantization. Neural Comput & Applic 36, 5333–5346 (2024). https://doi.org/10.1007/s00521-023-09332-z