
Review of Lightweight Deep Convolutional Neural Networks

  • Review article
  • Published:
Archives of Computational Methods in Engineering


Lightweight deep convolutional neural networks (LDCNNs) are vital components of mobile intelligence, particularly in mobile vision. Although increasingly deep and wide heavyweight networks have continuously broken accuracy records since 2012, the proliferation of terminals and mobile devices has made neural networks that suit such devices central to practical applications. In this review, we focus on several representative lightweight deep convolutional neural network (DCNN) technologies that hold significant potential for advancing the field. More than 190 references were selected, covering architecture design and model compression; of these, over 50 representative works are emphasized from the perspectives of methods, performance, advantages, and drawbacks, as well as underlying framework support and benchmark datasets. Based on a comprehensive analysis, we identify open problems and outline prospects for the future development of lightweight DCNNs.



  1. Krizhevsky A, Sutskever I, Hinton GE (2012) ImageNet classification with deep convolutional neural networks. In: Bartlett PL, Pereira FCN, Burges CJC, Bottou L, Weinberger KQ (eds) Advances in neural information processing systems 25: 26th annual conference on neural information processing systems 2012. Proceedings of a meeting held December 3–6, 2012, Lake Tahoe, NV, USA, pp 1106–1114

  2. Simonyan K, Zisserman A (2015) Very deep convolutional networks for large-scale image recognition. In: Bengio Y, LeCun Y (eds) 3rd international conference on learning representations, ICLR 2015, San Diego, CA, USA, May 7–9, 2015, conference track proceedings

  3. He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: 2016 IEEE conference on computer vision and pattern recognition, CVPR 2016, Las Vegas, NV, USA, June 27–30, 2016. IEEE Computer Society, pp 770–778

  4. Li Y, Liu J, Wang L (2018) Lightweight network research based on deep learning: a review. In: 2018 37th Chinese control conference (CCC). IEEE, pp 9021–9026

  5. Zhou Y, Chen S, Wang Y, Huan W (2020) Review of research on lightweight convolutional neural networks. In 2020 IEEE 5th information technology and mechatronics engineering conference (ITOEC). IEEE, pp 1713–1720

  6. Ge D-H, Li H-S, Zhang L, Liu R, Shen P, Miao Q-G (2020) Survey of lightweight neural network. J. Softw 31:2627–2653

    Google Scholar 

  7. Zheng M, Tian Y, Chen H, Yang S, Song F, Gao X (2022) Lightweight network research based on deep learning. In: International conference on computer graphics, artificial intelligence, and data processing (ICCAID 2021), vol 12168. SPIE, pp 333–338

  8. Ma J, Zhang Y, Ma Z, Mao K (2022) Research progress of lightweight neural network convolution design. J Front Comput Sci Technol 16(3):512–528

    Google Scholar 

  9. Wang CH, Huang KY, Yao Y, Chen JC, Shuai HH, Cheng WH (2022) Lightweight deep learning: an overview. In IEEE consumer electronics magazine, pp 1–12

  10. Mishra R, Gupta H (2023) Transforming large-size to lightweight deep neural networks for IoT applications. ACM Comput Surv 55(11):1–35

    Article  Google Scholar 

  11. Hafiz AM (2023) A survey on light-weight convolutional neural networks: trends, issues and future scope. J Mob Multimed 19:1277–1298

    Google Scholar 

  12. Cheng Y, Wang D, Zhou P, Zhang T (2018) Model compression and acceleration for deep neural networks: the principles, progress, and challenges. IEEE Signal Process Mag 35(1):126–136

    Article  Google Scholar 

  13. Zhou Z, Chen X, Li E, Zeng L, Luo K, Zhang J (2019) Edge intelligence: paving the last mile of artificial intelligence with edge computing. Proc IEEE 107(8):1738–1762

    Article  Google Scholar 

  14. Deng S, Zhao H, Fang W, Yin J, Dustdar S, Zomaya AY (2020) Edge intelligence: the confluence of edge computing and artificial intelligence. IEEE Internet Things J 7(8):7457–7469

    Article  Google Scholar 

  15. Deng L, Li G, Han S, Shi L, Xie Y (2020) Model compression and hardware acceleration for neural networks: a comprehensive survey. Proc IEEE 108(4):485–532

    Article  Google Scholar 

  16. Dianlei X, Li T, Li Y, Xiang S, Tarkoma S, Jiang T, Crowcroft J, Hui P (2021) Edge intelligence: empowering intelligence to the edge of network. Proc IEEE 109(11):1778–1837

    Article  Google Scholar 

  17. Zhao T, Xie Y, Wang Y, Cheng J, Guo X, Bin H, Chen Y (2022) A survey of deep learning on mobile devices: applications, optimizations, challenges, and research opportunities. Proc IEEE 110(3):334–354

    Article  Google Scholar 

  18. Han Cai, Ji Lin, Song Han (2022) Efficient methods for deep learning, In: Proceedings of computer vision and pattern recognition (CVPR), Advanced Methods and Deep Learning in Computer Vision, pp 159–190

  19. Shuvo MH, Islam SK, Cheng J, Morshed BI (2022) Efficient acceleration of deep learning inference on resource-constrained edge devices: a review. Proc IEEE 111(1): 42–91

  20. Deng J, Dong W, Socher R, Li L-J, Li K, Fei-Fei L (2009) ImageNet: a large-scale hierarchical image database. In: 2009 IEEE computer society conference on computer vision and pattern recognition (CVPR 2009), 20–25 June 2009, Miami, FL, USA. IEEE Computer Society, pp 248–255

  21. Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S, Huang Z, Karpathy A, Khosla A, Bernstein MS, Berg AC, Fei-Fei L (2014) ImageNet large scale visual recognition challenge. Int J Comput Vis 115:211–252

    Article  MathSciNet  Google Scholar 

  22. LeCun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proc IEEE 86(11):2278–2324

    Article  Google Scholar 

  23. Denil M, Shakibi B, Dinh L, Ranzato MA, de Freitas N (2013) Predicting parameters in deep learning. In: Burges CJC, Bottou L, Ghahramani Z, Weinberger KQ (eds) Advances in neural information processing systems 26: 27th annual conference on neural information processing systems 2013. Proceedings of a meeting held December 5–8, 2013, Lake Tahoe, NV, USA, pp 2148–2156

  24. Denton EL, Zaremba W, Bruna J, LeCun Y, Fergus R (2014) Exploiting linear structure within convolutional networks for efficient evaluation. In: Ghahramani Z, Welling M, Cortes C, Lawrence ND, Weinberger KQ (eds) Advances in neural information processing systems 27: annual conference on neural information processing systems 2014, December 8–13 2014, Montreal, QC, Canada, pp 1269–1277

  25. Szegedy C, Liu W, Jia Y, Sermanet P, Reed SE, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A (2015) Going deeper with convolutions. In: IEEE conference on computer vision and pattern recognition, CVPR 2015, Boston, MA, USA, June 7–12, 2015. IEEE Computer Society, pp 1–9

  26. Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z (2016) Rethinking the inception architecture for computer vision. In: 2016 IEEE conference on computer vision and pattern recognition, CVPR 2016, Las Vegas, NV, USA, June 27–30, 2016. IEEE Computer Society, pp 2818–2826

  27. Szegedy C, Ioffe S, Vanhoucke V (2016) Inception-V4, inception-ResNet and the impact of residual connections on learning. CoRR.

  28. Chollet F (2017) Xception: deep learning with depthwise separable convolutions. In: 2017 IEEE conference on computer vision and pattern recognition, CVPR 2017, Honolulu, HI, USA, July 21–26, 2017. IEEE Computer Society, pp 1800–1807

  29. Wang M, Liu B, Foroosh H (2016) Design of efficient convolutional layers using single intra-channel convolution, topological subdivisioning and spatial “bottleneck” structure. arXiv: Computer Vision and Pattern Recognition

  30. Wang M, Liu B, Foroosh H (2017) Factorized convolutional neural networks. In: 2017 IEEE international conference on computer vision workshops, ICCV Workshops 2017, Venice, Italy, October 22–29, 2017. IEEE Computer Society, pp 545–553

  31. Iandola FN, Moskewicz MW, Ashraf K, Han S, Dally WJ, Keutzer K (2016) SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and \(<\)1mb model size. CoRR.

  32. Lin M, Chen Q, Yan S (2014) Network in network. In: Bengio Y, LeCun Y (eds) 2nd international conference on learning representations, ICLR 2014, Banff, AB, Canada, April 14–16, 2014, conference track proceedings

  33. Gholami A, Kwon K, Wu B, Tai Z, Yue X, Jin PH, Zhao S, Keutzer K (2018) SqueezeNext: hardware-aware neural network design. In: 2018 IEEE conference on computer vision and pattern recognition workshops, CVPR Workshops 2018, Salt Lake City, UT, USA, June 18–22, 2018. Computer Vision Foundation/IEEE Computer Society, pp 1638–1647

  34. Howard AG, Zhu M, Chen B, Kalenichenko D, Wang W, Weyand T, Andreetto M, Adam H (2017) MobileNets: efficient convolutional neural networks for mobile vision applications. CoRR.

  35. Sandler M, Howard AG, Zhu M, Zhmoginov A, Chen L-C (2018) MobileNetV2: inverted residuals and linear bottlenecks. In: 2018 IEEE conference on computer vision and pattern recognition, CVPR 2018, Salt Lake City, UT, USA, June 18–22, 2018. Computer Vision Foundation/IEEE Computer Society, pp 4510–4520

  36. Zhang X, Zhou X, Lin M, Sun J (2018) ShuffleNet: an extremely efficient convolutional neural network for mobile devices. In: 2018 IEEE conference on computer vision and pattern recognition, CVPR 2018, Salt Lake City, UT, USA, June 18–22, 2018. Computer Vision Foundation/IEEE Computer Society, pp 6848–6856

  37. Ma N, Zhang X, Zheng H-T, Sun J (2018) ShuffleNet V2: practical guidelines for efficient CNN architecture design. In: Proceedings of the European conference on computer vision (ECCV), pp 116–131

  38. Zhou D, Hou Q, Chen Y, Feng J, Yan S (2020) Rethinking bottleneck structure for efficient mobile network design. In: Vedaldi A, Bischof H, Brox T, Frahm J-M (eds) Computer vision—ECCV 2020—16th European conference, Glasgow, UK, August 23–28, 2020, proceedings, part III, volume 12348 of lecture notes in computer science. Springer, pp 680–697

  39. Haase D, Amthor M (2020) Rethinking depthwise separable convolutions: how intra-kernel correlations lead to improved MobileNets. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 14600–14609

  40. Gao H, Wang Z, Cai L, Ji S (2018) ChannelNets: compact and efficient convolutional neural networks via channel-wise convolutions. IEEE transactions on pattern analysis and machine intelligence, pp 2570–2581

  41. Kopuklu O, Kose N, Gunduz A, Rigoll G (2019) Resource efficient 3D convolutional neural networks. In: Proceedings of the IEEE/CVF international conference on computer vision workshops, pp 1910–1919

  42. Wu B, Iandola F, Jin PH, Keutzer K (2017) SqueezeDet: unified, small, low power fully convolutional neural networks for real-time object detection for autonomous driving. In: Proceedings of the IEEE conference on computer vision and pattern recognition workshops, pp 129–137

  43. Wu B, Wan A, Yue X, Keutzer K (2018) SqueezeSeg: convolutional neural nets with recurrent CRF for real-time road-object segmentation from 3D LiDAR point cloud. In: 2018 IEEE international conference on robotics and automation (ICRA). IEEE, pp 1887–1893

  44. Wang RJ, Li X, Ling CX (2018) Pelee: a real-time object detection system on mobile devices. In Proceedings of the 32nd International Conference on Neural Information Processing Systems (NIPS'18). Curran Associates Inc., Red Hook, NY, USA, pp 1967–1976

  45. Chen S, Liu Y, Gao X, Han Z (2018) MobileFaceNets: efficient CNNs for accurate real-time face verification on mobile devices. In: Biometric recognition: 13th Chinese conference, CCBR 2018, Urumqi, China, August 11–12, 2018, proceedings 13. Springer, pp 428–438

  46. Duong CN, Quach KG, Jalata I, Le N, Luu K (2019) MobiFace: a lightweight deep learning face recognition on mobile devices. In 2019 IEEE 10th international conference on biometrics theory, applications and systems (BTAS). IEEE, pp 1–6

  47. Han K, Wang Y, Chang X, Guo J, Chunjing X, Enhua W, Tian Q (2022) GhostNets on heterogeneous devices via cheap operations. Int J Comput Vis 130(4):1050–1069

    Article  Google Scholar 

  48. Cui C, Gao T, Wei S, Du Y, Guo R, Dong S, Lu B, Zhou Y, Lv X, Liu Q et al (2021) PP-LCNet: a lightweight CPU convolutional neural network. arXiv Preprint.

  49. Duong CN, Quach KG, Jalata I, Le N, Luu K (2019) MobiFace: a lightweight deep learning face recognition on mobile devices. In: 2019 IEEE 10th international conference on biometrics theory, applications and systems (BTAS). IEEE, pp 1–6

  50. Mehta S, Rastegari M, Caspi A, Shapiro L, Hajishirzi H (2018) ESPNet: efficient spatial pyramid of dilated convolutions for semantic segmentation. In: Proceedings of the European conference on computer vision (ECCV), pp 552–568

  51. Mehta S, Rastegari M, Shapiro L, Hajishirzi H (2019) ESPNetV2: a light-weight, power efficient, and general purpose convolutional neural network. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 9190–9200

  52. Han K, Wang Y, Tian Q, Guo J, Xu C, Xu C (2020) GhostNet: more features from cheap operations. In: 2020 IEEE/CVF conference on computer vision and pattern recognition, CVPR 2020, Seattle, WA, USA, June 13–19, 2020. Computer Vision Foundation/IEEE, pp 1577–1586

  53. Darbani P, Rohbani N, Beitollahi H, Lotfi-Kamran P (2022) RASHT: a partially reconfigurable architecture for efficient implementation of CNNs. IEEE Trans Very Large Scale Integr Syst 30(7):860–868

    Article  Google Scholar 

  54. Vasu PKA, Gabriel J, Zhu J, Tuzel O, Ranjan A (2023) MobileOne: an improved one millisecond mobile backbone. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 7907–7917

  55. Cai Z, Shen Q (2023) FalconNet: Factorization for the light-weight ConvNets. arXiv Preprint.

  56. Ding X, Zhang X, Ma N, Han J, Ding G, Sun J (2021) RepVGG: making VGG-style ConvNets great again. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 13733–13742

  57. Nair V, Hinton GE (2010) Rectified linear units improve restricted Boltzmann machines. In: Proceedings of the 27th international conference on machine learning (ICML-10), pp 807–814

  58. He K, Zhang X, Ren S, Sun J (2015) Delving deep into rectifiers: surpassing human-level performance on ImageNet classification. In: Proceedings of the IEEE international conference on computer vision, pp 1026–1034

  59. Hu J, Shen L, Sun G (2018) Squeeze-and-excitation networks. In: 2018 IEEE conference on computer vision and pattern recognition, CVPR 2018, Salt Lake City, UT, USA, June 18–22, 2018. Computer Vision Foundation/IEEE Computer Society, pp 7132–7141

  60. Wu B, Wan A, Yue X, Jin P, Zhao S, Golmant N, Gholaminejad A, Gonzalez J, Keutzer K (2018) Shift: a zero flop, zero parameter alternative to spatial convolutions. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 9127–9135

  61. Mehta S, Hajishirzi H, Rastegari M (2020) DiceNet: dimension-wise convolutions for efficient networks. IEEE Trans Pattern Anal Mach Intell 44(5):2416–2425

    Google Scholar 

  62. Lai L, Suda N, Chandra V (2018) Not all ops are created equal! CoRR.

  63. Tan M, Le QV (2019) EfficientNet: rethinking model scaling for convolutional neural networks. In: Chaudhuri K, Salakhutdinov R (eds) Proceedings of the 36th international conference on machine learning, ICML 2019, 9–15 June 2019, Long Beach, CA, USA, volume 97 of proceedings of machine learning research. PMLR, pp 6105–6114

  64. Tan M, Le QV (2021) EfficientNetV2: smaller models and faster training. In: Meila M, Zhang T (eds) Proceedings of the 38th international conference on machine learning, ICML 2021, 18–24 July 2021, virtual event, volume 139 of proceedings of machine learning research. PMLR, pp 10096–10106

  65. Zoph B, Le QV (2016) Neural architecture search with reinforcement learning. arXiv Preprint.

  66. Zoph B, Vasudevan V, Shlens J, Le QV (2018) Learning transferable architectures for scalable image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 8697–8710

  67. Real E, Aggarwal A, Huang Y, Le QV (2019) Regularized evolution for image classifier architecture search. In: Proceedings of the AAAI conference on artificial intelligence, vol 33, pp 4780–4789

  68. Tan M, Chen B, Pang R, Vasudevan V, Sandler M, Howard A, Le QV (2019) MnasNet: platform-aware neural architecture search for mobile. In: IEEE conference on computer vision and pattern recognition, CVPR 2019, Long Beach, CA, USA, June 16–20, 2019. Computer Vision Foundation/IEEE, pp 2820–2828

  69. Howard A, Pang R, Adam H, Le QV, Sandler M, Chen B, Wang W, Chen L-C, Tan M, Chu G, Vasudevan VK, Zhu Y (2019) Searching for MobileNetV3. In: International conference on computer vision

  70. Yang T-J, Howard A, Chen B, Zhang X, Go A, Sandler M, Sze V, Adam H (2018) NetAdapt: platform-aware neural network adaptation for mobile applications. In: Proceedings of the European conference on computer vision (ECCV), pp 285–300

  71. Chu X, Zhang B, Xu R (2019) MoGA: searching beyond MobileNetV3. In: ICASSP 2020—2020 IEEE international conference on acoustics, speech and signal processing (ICASSP), pp 4042–4046

  72. Wu B, Dai X, Zhang P, Wang Y, Sun F, Wu Y, Tian Y, Vajda P, Jia Y, Keutzer K (2018) FBNet: hardware-aware efficient ConvNet design via differentiable neural architecture search. CoRR.

  73. Wan A, Dai X, Zhang P, He Z, Tian Y, Xie S, Wu B, Yu M, Xu T, Chen K, Vajda P, Gonzalez JE (2020) FBNetV2: differentiable neural architecture search for spatial and channel dimensions. CoRR.

  74. Dai X, Zhang P, Wu B, Yin H, Sun F, Wang Y, Dukhan M, Hu Y, Wu Y, Jia Y, Vajda P, Uyttendaele M, Jha NK (2018) ChamNet: towards efficient network design through platform-aware model adaptation. CoRR.

  75. Cai H, Zhu L, Han S (2018) ProxylessNAS: direct neural architecture search on target task and hardware. CoRR.

  76. Tan M, Le QV (2019) MixConv: mixed depthwise convolutional kernels. CoRR.

  77. Lin M, Chen H, Sun X, Qian Q, Li H, Jin R (2020) Neural architecture design for GPU-efficient networks. arXiv Preprint.

  78. Dai X, Wan A, Zhang P, Wu B, He Z, Wei Z, Chen K, Tian Y, Yu M, Vajda P, Gonzalez JE (2020) FBNetV3: joint architecture-recipe search using neural acquisition function. CoRR.

  79. Wu B, Li C, Zhang H, Dai X, Zhang P, Yu M, Wang J, Lin Y, Vajda P (2021) FBNetV5: neural architecture search for multiple tasks in one run. CoRR.

  80. Zhang L, Shen H, Luo Y, Cao X, Pan L, Wang T, Feng Q (2022) Efficient CNN architecture design guided by visualization. In: 2022 IEEE international conference on multimedia and expo (ICME). IEEE, pp 1–6

  81. Cai H, Zhu L, Han S (2019) ProxylessNAS: direct neural architecture search on target task and hardware. In: 7th international conference on learning representations, ICLR 2019, New Orleans, LA, USA, May 6–9, 2019.

  82. Han S, Pool J, Tran J, Dally WJ (2015) Learning both weights and connections for efficient neural networks. In Proceedings of the 28th international conference on neural information processing systems - volume 1 (NIPS'15). MIT Press, Cambridge, MA, USA, pp 1135–1143

  83. Han S, Liu X, Mao H, Pu J, Pedram A, Horowitz MA, Dally WJ (2016) EIE: efficient inference engine on compressed deep neural network. ACM SIGARCH Comput Archit News 44(3):243–254

    Article  Google Scholar 

  84. Meng F, Cheng H, Li K, Luo H, Guo X, Lu G, Sun X (2020) Pruning Filter in Filter. In Proceedings of the 34th international conference on neural information processing systems (NIPS'20). Curran Associates Inc., Red Hook, NY, USA, Article 1479, pp 17629–17640

  85. Huo Z, Wang C, Chen W, Li Y, Wang J, Wu J (2022) Balanced stripe-wise pruning in the filter. In: International conference on acoustics, speech, and signal processing, pp 4408–4412

  86. Ma X, Guo F-M, Niu W, Lin X, Tang J, Ma K, Ren B, Wang Y (2019) PCONV: the missing but desirable sparsity in DNN weight pruning for real-time execution on mobile devices. In: the AAAI conference on artificial intelligence, pp 5117–5124

  87. Niu W, Ma X, Lin S, Wang S, Qian X, Lin X, Wang Y, Ren B (2020) PatDNN: Achieving Real-Time DNN Execution on Mobile Devices with Pattern-based Weight Pruning. In Proceedings of the twenty-fifth international conference on architectural support for programming languages and operating systems (ASPLOS '20), pp 907–922

  88. Vysogorets A, Kempe J (2021) Connectivity matters: neural network pruning through the lens of effective sparsity.

  89. Li H, Kadav A, Durdanovic I, Samet H, PGraf H (2016) Pruning filters for efficient convnets. arXiv: Computer Vision and Pattern Recognition

  90. Ye J, Lu X, Lin Z, Wang JZ (2018) Rethinking the smaller-norm-less-informative assumption in channel pruning of convolution layers. In: International conference on learning representations

  91. He Y, Liu P, Wang Z, Hu Z, Yang Y (2019) Filter pruning via geometric median for deep convolutional neural networks acceleration. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 4335–4344

  92. Luo J-H, Wu J, Lin W (2017) ThiNet: a filter level pruning method for deep neural network compression. In: 2017 IEEE international conference on computer vision (ICCV)

  93. Fang G, Ma X, Song M, Mi MB, Wang X (2023) DepGraph: towards any structural pruning. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 16091–16101

  94. He Y, Lin J, Liu Z, Wang H, Li LJ, Han S (2018) AMC: AutoML for model compression and acceleration on mobile devices. In: European conference on computer vision

  95. Li B, Wu B, Su J, Wang G, Lin L (2020) EagleEye: fast sub-net evaluation for efficient neural network pruning. In: European conference on computer vision, pp 639–654

  96. Blalock D, Ortiz JJG, Frankle J, Guttag J (2020) What is the state of neural network pruning? Proc Mach Learn Syst 2:129–146

    Google Scholar 

  97. Wang H, Qin C, Bai Y, Fu Y (2023) Why is the state of neural network pruning so confusing? On the fairness, comparison setup, and trainability in network pruning. arXiv Preprint.

  98. Li Y, Adamczewski K, Li W, Gu S, Timofte R, Van Gool L (2021) Revisiting random channel pruning for neural network compression. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 191–201

  99. Jaderberg M, Vedaldi A, Zisserman A (2014) Speeding up convolutional neural networks with low rank expansions. In: British machine vision conference

  100. Zhang X, Zou J, He K, Sun J (2015) Accelerating very deep convolutional networks for classification and detection. IEEE Trans Pattern Anal Mach Intell.

    Article  Google Scholar 

  101. Kim Y, Park E, Yoo S, Choi T-L, Yang L, Shin D (2015) Compression of deep convolutional neural networks for fast and low power mobile applications. arXiv: Computer Vision and Pattern Recognition

  102. Chen Y, Jin X, Kang B, Feng J, Yan S (2018) Sharing residual units through collective tensor factorization to improve deep neural networks. In: International joint conference on artificial intelligence

  103. Su J, Li J, Bhattacharjee B, Huang F (2018) Tensorial neural networks: generalization of neural networks and application to model compression. arXiv: Machine Learning

  104. Garipov T, Podoprikhin D, Novikov A, Vetrov D (2016) Ultimate tensorization: compressing convolutional and fc layers alike. arXiv: Learning

  105. Hawkins C, Yang H, Li M, Lai L, Chandra V (2021) Low-rank+ sparse tensor compression for neural networks. arXiv Preprint.

  106. Chu B-S, Lee C-R (2021) Low-rank tensor decomposition for compression of convolutional neural networks using funnel regularization. arXiv Preprint.

  107. Miyashita D, Lee EH, Murmann B (2016) Convolutional neural networks using logarithmic data representation. CoRR.

  108. Zhou A, Yao A, Guo Y, Xu L, Chen Y (2017) Incremental network quantization: towards lossless CNNs with low-precision weights. In: 5th international conference on learning representations, ICLR 2017, Toulon, France, April 24–26, 2017, conference track proceedings

  109. Guo Y, Yao A, Zhao H, Chen Y (2017) Network sketching: exploiting binary structure in deep CNNs. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 5955–5963

  110. Nahshan Y, Chmiel B, Baskin C, Zheltonozhskii E, Banner R, Bronstein AM, Mendelson A (2021) Loss aware post-training quantization. Mach Learn 110(11):3245–3262

    Article  MathSciNet  Google Scholar 

  111. Li Y, Gong R, Tan X, Yang Y, Hu P, Zhang Q, Yu F, Wang W, Gu S (2021) BRECQ: pushing the limit of post-training quantization by block reconstruction. arXiv Preprint.

  112. Nagel M, van Baalen M, Blankevoort T, Welling M (2019) Data-free quantization through weight equalization and bias correction. In: 2019IEEE/CVF international conference on computer vision, ICCV 2019, Seoul, Korea (South), October 27–November 2, 2019. IEEE, pp 1325–1334

  113. Gholami A, Kim S, Dong Z, Yao Z, Mahoney MW, Keutzer K (2021) A survey of quantization methods for efficient neural network inference. CoRR.

  114. Nagel M, Fournarakis M, Amjad RA, Bondarenko Y, van Baalen M, Blankevoort T (2021) A white paper on neural network quantization. CoRR.

  115. Nagel M, Amjad RA, van Baalen M, Louizos C, Blankevoort T (2020) Up or down? Adaptive rounding for post-training quantization. In: Proceedings of the 37th international conference on machine learning, ICML 2020, 13–18 July 2020, virtual event, volume 119 of proceedings of machine learning research. PMLR, pp 7197–7206

  116. Banner R, Nahshan Y, Soudry D (2019) Post training 4-bit quantization of convolutional networks for rapid-deployment. In: Wallach HM, Larochelle H, Beygelzimer A, d’Alché-Buc F, Fox EB, Garnett R (eds) Advances in neural information processing systems 32: annual conference on neural information processing systems 2019, NeurIPS 2019, December 8–14, 2019, Vancouver, BC, Canada, pp 7948–7956

  117. Cai Y, Yao Z, Dong Z, Gholami A, Mahoney MW, Keutzer K (2020) ZeroQ: a novel zero shot quantization framework. In: 2020IEEE/CVF conference on computer vision and pattern recognition, CVPR 2020, Seattle, WA, USA, June 13–19, 2020. Computer Vision Foundation/IEEE, pp 13166–13175

  118. Hubara I, Nahshan Y, Hanani Y, Banner R, Soudry D (2021) Accurate post training quantization with small calibration sets. In: Meila M, Zhang T (eds) Proceedings of the 38th international conference on machine learning, ICML 2021, 18–24 July 2021, Virtual Event, volume 139 of proceedings of machine learning research. PMLR, pp 4466–4475

  119. Wei X, Gong R, Li Y, Liu X, Yu F (2022) QDrop: randomly dropping quantization for extremely low-bit post-training quantization. In: The tenth international conference on learning representations, ICLR 2022, virtual event, April 25–29, 2022.

  120. Courbariaux M, Bengio Y, David J-P (2015) BinaryConnect: training deep neural networks with binary weights during propagations. In: Cortes C, Lawrence ND, Lee DD, Sugiyama M, Garnett R (eds) Advances in neural information processing systems 28: annual conference on neural information processing systems 2015, December 7–12, 2015, Montreal, QC, Canada, pp 3123–3131

  121. Hubara I, Courbariaux M, Soudry D, El-Yaniv R, Bengio Y (2016) Binarized neural networks. In: Lee DD, Sugiyama M, von Luxburg U, Guyon I, Garnett R (eds) Advances in neural information processing systems 29: annual conference on neural information processing systems 2016, December 5–10, 2016, Barcelona, Spain, pp 4107–4115

  122. Rastegari M, Ordonez V, Redmon J, Farhadi A (2016) XNOR-Net: ImageNet classification using binary convolutional neural networks. In: Leibe B, Matas J, Sebe N, Welling M (eds) Computer vision-ECCV 2016-14th European conference, Amsterdam, The Netherlands, October 11–14, 2016, proceedings, part IV, volume 9908 of lecture notes in computer science. Springer, pp 525–542

  123. Zhou S, Wu Y, Zekun N, Zhou X, Wen H, Zou Y (2016) DoReFa-Net: training low bitwidth convolutional neural networks with low bitwidth gradients. arXiv: Neural and Evolutionary Computing

  124. Hubara I, Courbariaux M, Soudry D, El-Yaniv R, Bengio Y (2017) Quantized neural networks: training neural networks with low precision weights and activations. J Mach Learn Res 18:187:1–187:30

    MathSciNet  Google Scholar 

  125. Gysel P, Pimentel JJ, Motamedi M, Ghiasi S (2018) Ristretto: a framework for empirical study of resource-efficient inference in convolutional neural networks. IEEE Trans Neural Netw Learn Syst 29(11):5784–5789

    Article  Google Scholar 

  126. Bengio Y, Léonard N, Courville AC (2013) Estimating or propagating gradients through stochastic neurons for conditional computation. CoRR.

  127. Choi J, Wang Z, Venkataramani S, Chuang PI-J, Srinivasan V, Gopalakrishnan K (2018) PACT: parameterized clipping activation for quantized neural networks. CoRR.

  128. Jung S, Son C, Lee S, Son JW, Han J-J, Kwak Y, Hwang SJ, Choi C (2019) Learning to quantize deep networks by optimizing quantization intervals with task loss. In: IEEE conference on computer vision and pattern recognition, CVPR 2019, Long Beach, CA, USA, June 16–20, 2019. Computer Vision Foundation/IEEE, pp 4350–4359

  129. Esser SK, McKinstry JL, Bablani D, Appuswamy R, Modha DS (2020) Learned step size quantization. In: 8th international conference on learning representations, ICLR 2020, Addis Ababa, Ethiopia, April 26–30, 2020.

  130. Bhalgat Y, Lee J, Nagel M, Blankevoort T, Kwak N (2020) LSQ+: improving low-bit quantization through learnable offsets and better initialization. In: 2020IEEE/CVF conference on computer vision and pattern recognition, CVPR workshops 2020, Seattle, WA, USA, June 14–19, 2020. Computer Vision Foundation/IEEE, pp 2978–2985

  131. Asim F, Park J, Azamat A, Lee J (2022) Centered symmetric quantization for hardware-efficient low-bit neural networks. British Machine Vision Association (BMVA)

  132. Dong Z, Yao Z, Arfeen D, Gholami A, Mahoney MW, Keutzer K (2020) HAWQ-V2: Hessian aware trace-weighted quantization of neural networks. Adv Neural Inf Process Syst 33:18518–18529

    Google Scholar 

  133. Yao Z, Dong Z, Zheng Z, Gholami A, Yu J, Tan E, Wang L, Huang Q, Wang Y, Mahoney M et al (2021) HAWQ-V3: dyadic neural network quantization. In: International conference on machine learning. PMLR, pp 11875–11886

  134. He Y, Lin J, Liu Z, Wang H, Li L-J, Han S (2018) AMc: AutoML for model compression and acceleration on mobile devices. In: Proceedings of the European conference on computer vision (ECCV), pp 784–800

  135. Bucila C, Caruana R, Niculescu-Mizil A (2006) Model compression. In: Eliassi-Rad T, Ungar LH, Craven M, Gunopulos D (eds) Proceedings of the twelfth ACM SIGKDD international conference on knowledge discovery and data mining, Philadelphia, PA, USA, August 20–23, 2006. ACM, pp 535–541

  136. Hinton GE, Vinyals O, Dean J (2015) Distilling the knowledge in a neural network. CoRR.

  137. Romero A, Ballas N, Kahou SE, Chassang A, Gatta C, Bengio Y (2015) FitNets: hints for thin deep nets. In: Bengio Y, LeCun Y (eds) 3rd International conference on learning representations, ICLR 2015, San Diego, CA, USA, May 7–9, 2015, conference track proceedings

  138. Zhang Y, Xiang T, Hospedales TM, Lu H (2018) Deep mutual learning. In: 2018 IEEE conference on computer vision and pattern recognition, CVPR 2018, Salt Lake City, UT, USA, June 18–22, 2018. Computer Vision Foundation/IEEE Computer Society, pp 4320–4328

  139. Furlanello T, Lipton ZC, Tschannen M, Itti L, Anandkumar A (2018) Born-again neural networks. In: Dy JG, Krause A (eds) Proceedings of the 35th international conference on machine learning, ICML 2018, Stockholmsmässan, Stockholm, Sweden, July 10–15, 2018, volume 80 of proceedings of machine learning research. PMLR, pp 1602–1611

  140. Yang C, Xie L, Su C, Yuille AL (2019) Snapshot distillation: teacher–student optimization in one generation. In: IEEE conference on computer vision and pattern recognition, CVPR 2019, Long Beach, CA, USA, June 16–20, 2019. Computer Vision Foundation/IEEE, pp 2859–2868

  141. Cho JH, Hariharan B (2019) On the efficacy of knowledge distillation. In: 2019 IEEE/CVF international conference on computer vision, ICCV 2019, Seoul, Korea (South), October 27–November 2, 2019. IEEE, pp 4793–4801

  142. Mirzadeh SI, Farajtabar M, Li A, Levine N, Matsukawa A, Ghasemzadeh H (2020) Improved knowledge distillation via teacher assistant. In: Proceedings of the AAAI conference on artificial intelligence, vol 34, pp 5191–5198

  143. Chen D, Mei J-P, Zhang H, Wang C, Feng Y, Chen C (2022) Knowledge distillation with the reused teacher classifier. In: IEEE/CVF conference on computer vision and pattern recognition, CVPR 2022, New Orleans, LA, USA, June 18–24, 2022. IEEE, pp 11923–11932

  144. Beyer L, Zhai X, Royer A, Markeeva L, Anil R, Kolesnikov A (2022) Knowledge distillation: a good teacher is patient and consistent. In: IEEE/CVF conference on computer vision and pattern recognition, CVPR 2022, New Orleans, LA, USA, June 18–24, 2022. IEEE, pp 10915–10924

  145. Yim J, Joo D, Bae J-H, Kim J (2017) A gift from knowledge distillation: fast optimization, network minimization and transfer learning. In: 2017 IEEE conference on computer vision and pattern recognition, CVPR 2017, Honolulu, HI, USA, July 21–26, 2017. IEEE Computer Society, pp 7130–7138

  146. Zagoruyko S, Komodakis N (2017) Paying more attention to attention: improving the performance of convolutional neural networks via attention transfer. In: 5th international conference on learning representations, ICLR 2017, Toulon, France, April 24–26, 2017, conference track proceedings.

  147. Huang Z, Wang N (2017) Like what you like: knowledge distill via neuron selectivity transfer. CoRR.

  148. Heo B, Lee M, Yun S, Choi JY (2018) Knowledge transfer via distillation of activation boundaries formed by hidden neurons. CoRR.

  149. Kim J, Park S, Kwak N (2018) Paraphrasing complex network: network compression via factor transfer. In: Bengio S, Wallach HM, Larochelle H, Grauman K, Cesa-Bianchi N, Garnett R (eds) Advances in neural information processing systems 31: annual conference on neural information processing systems 2018, NeurIPS 2018, December 3–8, 2018, Montréal, Canada, pp 2765–2774

  150. Heo B, Kim J, Yun S, Park H, Kwak N, Choi JY (2019) A comprehensive overhaul of feature distillation. In: 2019 IEEE/CVF international conference on computer vision, ICCV 2019, Seoul, Korea (South), October 27–November 2, 2019. IEEE, pp 1921–1930

  151. Park W, Kim D, Lu Y, Cho M (2019) Relational knowledge distillation. In: IEEE conference on computer vision and pattern recognition, CVPR 2019, Long Beach, CA, USA, June 16–20, 2019. Computer Vision Foundation/IEEE, pp 3967–3976

  152. Peng B, Jin X, Li D, Zhou S, Wu Y, Liu J, Zhang Z, Liu Y (2019) Correlation congruence for knowledge distillation. In: 2019 IEEE/CVF international conference on computer vision, ICCV 2019, Seoul, Korea (South), October 27–November 2, 2019. IEEE, pp 5006–5015

  153. Tian Y, Krishnan D, Isola P (2019) Contrastive representation distillation. CoRR.

  154. Tung F, Mori G (2019) Similarity-preserving knowledge distillation. In: 2019 IEEE/CVF international conference on computer vision, ICCV 2019, Seoul, Korea (South), October 27–November 2, 2019. IEEE, pp 1365–1374

  155. Zhao B, Cui Q, Song R, Qiu Y, Liang J (2022) Decoupled knowledge distillation. In: IEEE/CVF conference on computer vision and pattern recognition, CVPR 2022, New Orleans, LA, USA, June 18–24, 2022. IEEE, pp 11943–11952

  156. Abadi M, Barham P, Chen J, Chen Z, Davis A, Dean J, Devin M, Ghemawat S, Irving G, Isard M, Kudlur M, Levenberg J, Monga R, Moore S, Murray DG, Steiner B, Tucker PA, Vasudevan V, Warden P, Wicke M, Yu Y, Zheng X (2016) TensorFlow: a system for large-scale machine learning. In: Keeton K, Roscoe T (eds) 12th USENIX symposium on operating systems design and implementation, OSDI 2016, Savannah, GA, USA, November 2–4, 2016. USENIX Association, pp 265–283

  157. Paszke A, Gross S, Massa F, Lerer A, Bradbury J, Chanan G, Killeen T, Lin Z, Gimelshein N, Antiga L, Desmaison A, Köpf A, Yang EZ, DeVito Z, Raison M, Tejani A, Chilamkurthy S, Steiner B, Fang L, Bai J, Chintala S (2019) PyTorch: an imperative style, high-performance deep learning library. In: Wallach HM, Larochelle H, Beygelzimer A, d’Alché-Buc F, Fox EB, Garnett R (eds) Advances in neural information processing systems 32: annual conference on neural information processing systems 2019, NeurIPS 2019, December 8–14, 2019, Vancouver, BC, Canada, pp 8024–8035

  158. Keras. Accessed 16 Nov 2022

  159. PyTorch Lightning. Accessed 20 Dec 2022

  160. Theano. Accessed 16 Nov 2022

  161. The Microsoft cognitive toolkit. Accessed 16 Dec 2022

  162. Deeplearning4j suite overview. Accessed 16 Nov 2022

  163. Jia Y, Shelhamer E, Donahue J, Karayev S, Long J, Girshick RB, Guadarrama S, Darrell T (2014) Caffe: convolutional architecture for fast feature embedding. In: Hua KA, Rui Y, Steinmetz R, Hanjalic A, Natsev A, Zhu W (eds) Proceedings of the ACM international conference on multimedia, MM’14, Orlando, FL, USA, November 03–07, 2014. ACM, pp 675–678

  164. Chen T, Li M, Li Y, Lin M, Wang N, Wang M, Xiao T, Xu B, Zhang C, Zhang Z (2015) MXNet: a flexible and efficient machine learning library for heterogeneous distributed systems. CoRR.

  165. Apache MXNet. Accessed 23 Dec 2022

  166. NVIDIA TensorRT. Accessed 20 Dec 2022

  167. Goyal K. Deep learning frameworks in 2023 you can’t ignore. Accessed 09 Jan 2023

  168. Shuvo MMH, Islam SK, Cheng J, Morshed BI (2023) Efficient acceleration of deep learning inference on resource-constrained edge devices: a review. Proc IEEE 111(1):42–91

  169. Sze V, Chen Y-H, Yang T-J, Emer JS (2017) Efficient processing of deep neural networks: a tutorial and survey. Proc IEEE 105(12):2295–2329

  170. Xu X, Ding Y, Hu SX, Niemier M, Cong J, Hu Y, Shi Y (2018) Scaling for edge inference of deep neural networks. Nat Electron 1(4):216–222

  171. Jetson TX2 Module. Accessed 09 Dec 2022

  172. Intel Edison development platform. Accessed 29 Dec 2022

  173. Chen Y-H, Krishna T, Emer JS, Sze V (2017) Eyeriss: an energy-efficient reconfigurable accelerator for deep convolutional neural networks. IEEE J Solid State Circuits 52(1):127–138

  174. Chen Y-H, Yang T-J, Emer JS, Sze V (2019) Eyeriss V2: a flexible accelerator for emerging deep neural networks on mobile devices. IEEE J Emerg Sel Top Circuits Syst 9(2):292–308

  175. Baischer L, Wess M, TaheriNejad N (2021) Learning on hardware: a tutorial on neural network accelerators and co-processors. arXiv Preprint.

  176. Krizhevsky A, Hinton G (2009) CIFAR-100 (Canadian institute for advanced research). Technical report, CIFAR

  177. Khosla A, Jayadevaprakash N, Yao B, Fei-Fei L (2011) Novel dataset for fine-grained image categorization. In: First workshop on fine-grained visual categorization, IEEE conference on computer vision and pattern recognition, Colorado Springs, CO, June 2011

  178. Lin T-Y, Maire M, Belongie SJ, Hays J, Perona P, Ramanan D, Dollár P, Zitnick CL (2014) Microsoft COCO: common objects in context. In: Fleet DJ, Pajdla T, Schiele B, Tuytelaars T (eds) Computer vision—ECCV 2014—13th European conference, Zurich, Switzerland, September 6–12, 2014, proceedings, part V, volume 8693 of lecture notes in computer science. Springer, pp 740–755

  179. Cordts M, Omran M, Ramos S, Rehfeld T, Enzweiler M, Benenson R, Franke U, Roth S, Schiele B (2016) The cityscapes dataset for semantic urban scene understanding. In: 2016 IEEE conference on computer vision and pattern recognition, CVPR 2016, Las Vegas, NV, USA, June 27–30, 2016. IEEE Computer Society, pp 3213–3223

  180. Lai L, Suda N (2018) Rethinking machine learning development and deployment for edge devices. CoRR.

  181. Polino A, Pascanu R, Alistarh D (2018) Model compression via distillation and quantization. arXiv Preprint.

  182. Liu Z, Mu H, Zhang X, Guo Z, Yang X, Cheng K-T, Sun J (2019) MetaPruning: meta learning for automatic neural network channel pruning. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 3296–3305

  183. Liu Y, Yang G, Qiao S, Liu M, Qu L, Han N, Wu T, Yuan G, Wu T, Peng Y (2023) Imbalanced data classification: using transfer learning and active sampling. Eng Appl Artif Intell 117(Part):105621

  184. Ren P, Xiao Y, Chang X, Huang P-Y, Li Z, Gupta BB, Chen X, Wang X (2022) A survey of deep active learning. ACM Comput Surv 54(9):180:1–180:40

  185. Wang Q, Wu B, Zhu P, Li P, Zuo W, Hu Q (2020) ECA-Net: efficient channel attention for deep convolutional neural networks. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 11534–11542

  186. Zhang Q-L, Yang Y-B (2021) SA-Net: shuffle attention for deep convolutional neural networks. In: ICASSP 2021–2021 IEEE international conference on acoustics, speech and signal processing (ICASSP). IEEE, pp 2235–2239

  187. Woo S, Park J, Lee J-Y, Kweon IS (2018) CBAM: convolutional block attention module. In: Proceedings of the European conference on computer vision (ECCV), pp 3–19

  188. Cao Y, Xu J, Lin S, Wei F, Hu H (2019) GCNet: non-local networks meet squeeze-excitation networks and beyond. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 1971–1980

  189. Li X, Hu X, Yang J (2019) Spatial group-wise enhance: improving semantic feature learning in convolutional networks. arXiv Preprint.

  190. Mehta S, Rastegari M (2021) MobileViT: light-weight, general-purpose, and mobile-friendly vision transformer. arXiv Preprint.

  191. Wu H, Xiao B, Codella N, Liu M, Dai X, Yuan L, Zhang L (2021) CvT: introducing convolutions to vision transformers. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 22–31

  192. Maaz M, Shaker A, Cholakkal H, Khan S, Zamir SW, Anwer RM, Khan FS (2022) EdgeNeXt: efficiently amalgamated CNN-transformer architecture for mobile vision applications. CoRR.

  193. Chen Y, Dai X, Chen D, Liu M, Dong X, Yuan L, Liu Z (2022) Mobile-former: bridging MobileNet and transformer. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 5270–5279

  194. Zhang J, Li X, Li J, Liu L, Xue Z, Zhang B, Jiang Z, Huang T, Wang Y, Wang C (2023) Rethinking mobile block for efficient neural models. arXiv Preprint.

The authors gratefully acknowledge the anonymous reviewers for their constructive comments. This work is supported in part by the National Natural Science Foundation of China (No. 62132007 and 20210424), the Fundamental Research Funds for the Central Universities of China (No. lzujbky-2022-pd12), and by the Natural Science Foundation of Gansu Province, China (No. 22JR5RA492). All authors have read and agreed to the published version of the manuscript.

Author information

Corresponding authors

Correspondence to Fengyuan Ren or Zhen Yang.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Chen, F., Li, S., Han, J. et al. Review of Lightweight Deep Convolutional Neural Networks. Arch Computat Methods Eng 31, 1915–1937 (2024).
