
Improving the accuracy of pruned network using knowledge distillation

  • Short Paper
  • Published in Pattern Analysis and Applications

Abstract

The introduction of convolutional neural networks (CNNs) to the image processing field has attracted researchers to explore their applications. Several network designs have been proposed to reach state-of-the-art capability. However, current neural network designs still suffer from an issue related to model size, so some researchers have introduced techniques to reduce or compress the model. A compression technique may degrade the accuracy of the compressed model relative to the original one and may also affect its performance, so a new scheme is needed to enhance the accuracy of the compressed network. In this study, we show that knowledge distillation (KD) can be integrated with one of the pruning methodologies, namely filter pruning, as the compression technique to enhance the accuracy of the pruned model. From all experimental results, we conclude that incorporating KD when building a MobileNets model can enhance the accuracy of the pruned network with almost no increase in inference time: the model trained with KD is only 0.1 s slower at inference than the one trained without KD. Furthermore, when the model size is reduced by 26.08%, the accuracy without KD is 63.65%, whereas incorporating KD raises it to 65.37%.
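
To make the approach concrete, below is a minimal sketch (not the authors' released code) of the two ingredients the abstract combines: ranking convolutional filters by the L1 norm of their weights, which is the criterion used in filter pruning, and fine-tuning the pruned student network with a distillation loss that blends the temperature-softened outputs of the unpruned teacher with the usual hard-label loss. The helper names, the temperature T, the weight alpha, and the keep ratio are illustrative assumptions rather than the paper's exact configuration.

import torch
import torch.nn as nn
import torch.nn.functional as F

def rank_filters_by_l1(conv: nn.Conv2d, keep_ratio: float = 0.7) -> torch.Tensor:
    # Score each output filter by the L1 norm of its weights and return the
    # indices of the highest-scoring filters; the remaining filters would be pruned.
    # keep_ratio is an illustrative value, not the paper's setting.
    scores = conv.weight.detach().abs().sum(dim=(1, 2, 3))
    n_keep = max(1, int(round(keep_ratio * scores.numel())))
    return torch.argsort(scores, descending=True)[:n_keep]

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    # Soft-target term: KL divergence between temperature-scaled distributions.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    # Hard-label term: ordinary cross-entropy on the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Fine-tuning loop: the unpruned teacher is frozen, only the pruned student learns.
# (teacher, student, loader and optimizer are assumed to be defined elsewhere.)
# teacher.eval()
# for images, labels in loader:
#     with torch.no_grad():
#         t_logits = teacher(images)
#     loss = distillation_loss(student(images), t_logits, labels)
#     optimizer.zero_grad()
#     loss.backward()
#     optimizer.step()

The T*T factor on the soft term is the standard scaling from the original knowledge-distillation formulation; it keeps the gradient contribution of the soft targets comparable to that of the hard labels as the temperature grows.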

Acknowledgment

The authors gratefully acknowledge the support of the Ministry of Science and Technology, Taiwan, under grant MOST-109-2218-E-011-012-

Author information


Corresponding author

Correspondence to Jenq-Shiou Leu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Prakosa, S.W., Leu, JS. & Chen, ZH. Improving the accuracy of pruned network using knowledge distillation. Pattern Anal Applic 24, 819–830 (2021). https://doi.org/10.1007/s10044-020-00940-2

