Abstract
Convolutional neural networks (CNNs) serve as the backbone for extracting image features in the majority of computer vision tasks. To make them deployable on small devices, researchers have either designed compact networks by hand or compressed large models through model pruning. Model pruning is a simple and efficient way to speed up neural networks. However, the pruned model (sparse network) typically performs worse than the original model (dense network) and is harder to train to convergence. Recent work has focused on improving the effectiveness and convergence of sub-networks. In this paper, we instead ask how to narrow the performance gap between sparse and dense networks, rather than how to obtain a better sub-network. To bridge this gap, we propose a novel training strategy based on mutual learning. Furthermore, we introduce a new pruning criterion, matching distance (MD), which aims to let the sparse network inherit most of the knowledge learned by the dense network. Experimental results demonstrate that our approach transfers knowledge from dense networks to sparse networks more efficiently.
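To make the idea of mutual learning between a dense and a sparse network concrete, the following is a minimal sketch of the generic deep mutual learning objective for a single sample, in which each network minimizes its own cross-entropy plus a KL term pulling it toward its peer's prediction. It is an illustration under stated assumptions, not the loss used in this paper: the abstract does not give the exact formulation, the function names and the weight alpha are hypothetical, and the matching distance criterion is not shown because it is not defined here.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Cross-entropy of a predicted distribution against an integer label."""
    return -math.log(probs[label])


def kl_divergence(p, q):
    """KL(p || q); assumes q has no zero entries (true for moderate logits)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def mutual_learning_losses(dense_logits, sparse_logits, label, alpha=1.0):
    """Return (dense_loss, sparse_loss) for one sample.

    Each network is supervised by the label and by its peer's prediction.
    alpha is a hypothetical weight on the peer term (an assumption).
    """
    p_dense = softmax(dense_logits)
    p_sparse = softmax(sparse_logits)
    # The dense network is pulled toward the sparse network's prediction.
    loss_dense = cross_entropy(p_dense, label) + alpha * kl_divergence(p_sparse, p_dense)
    # The sparse network is pulled toward the dense network's prediction.
    loss_sparse = cross_entropy(p_sparse, label) + alpha * kl_divergence(p_dense, p_sparse)
    return loss_dense, loss_sparse


if __name__ == "__main__":
    # Illustrative logits only; these are not values from the paper.
    print(mutual_learning_losses([2.0, 0.5, -1.0], [0.1, 1.5, 0.3], label=0))
```

When both networks produce the same prediction, the KL terms vanish and each loss reduces to plain cross-entropy, so the peer term only acts where the sparse and dense networks disagree.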
Data Availability
All data included in this study are available from the corresponding author upon request.
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China under Grants 62067002 and 62062033, and in part by the Science and Technology Program Project of Jiangxi Province Department of Transportation under Grant 2022X0040.
Funding
National Natural Science Foundation of China (62067002, 62062033); Science and Technology Program Project of Jiangxi Province Department of Transportation (2022X0040).
Author information
Contributions
Liyan Xiong agrees to be accountable for all aspects of the work, ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved; Qingsen Chen made substantial contributions to the conception of the work and drafted the manuscript; Jiawen Huang contributed to the acquisition, analysis, or interpretation of data; Xiaohui Huang critically revised the manuscript for important intellectual content; Peng Huang contributed to the acquisition, analysis, or interpretation of data; Shangfeng Wei contributed to the creation of new software used in the work. All authors reviewed the manuscript.
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
Additional information
Communicated by F. Wu.
About this article
Cite this article
Xiong, L., Chen, Q., Huang, J. et al. Students and teachers learning together: a robust training strategy for neural network pruning. Multimedia Systems 30, 122 (2024). https://doi.org/10.1007/s00530-024-01315-x
DOI: https://doi.org/10.1007/s00530-024-01315-x