
Global Instance Relation Distillation for convolutional neural network compression

  • Original Article
  • Neural Computing and Applications

Abstract

Previous instance-relation knowledge distillation methods transfer structural relations between instances from a heavy teacher network to a lightweight student network, effectively enhancing the accuracy of the student. However, these methods have two limitations: (1) relation knowledge is modeled only from the instances in the current mini-batch, so the captured instance relations are incomplete; (2) the information flow hidden in the evolution of instance relations through the network is neglected. To address these problems, we propose Global Instance Relation Distillation (GIRD) for convolutional neural network compression, which improves globality at both the instance level and the relation level. First, we design a feature reutilization mechanism that stores previously learned features to move beyond the confines of the current mini-batch. Second, we model pairwise similarity relations based on the stored features to reveal more complete instance relations. Furthermore, we construct pairwise relation evolution across different layers to reflect the information flow. Extensive experiments on benchmark datasets demonstrate that the proposed method outperforms state-of-the-art approaches on various visual tasks.
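The abstract outlines three components: a feature-reutilization memory that retains features from earlier mini-batches, pairwise similarity relations computed against the stored features, and a relation-evolution term that tracks how those relations change across layers. The sketch below is a minimal, hypothetical PyTorch rendering of these ideas for intuition only; the class and function names (FeatureBank, similarity_relation, gird_losses), the FIFO bank, the MSE matching losses, and the layer-difference form of relation evolution are assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F


class FeatureBank:
    """FIFO memory that stores L2-normalized features from earlier mini-batches."""

    def __init__(self, dim: int, size: int = 4096):
        # Random initialization; entries are overwritten as training proceeds.
        self.bank = F.normalize(torch.randn(size, dim), dim=1)
        self.ptr = 0

    @torch.no_grad()
    def enqueue(self, feats: torch.Tensor) -> None:
        feats = F.normalize(feats.detach(), dim=1)
        idx = (self.ptr + torch.arange(feats.size(0))) % self.bank.size(0)
        self.bank[idx] = feats
        self.ptr = int((self.ptr + feats.size(0)) % self.bank.size(0))


def similarity_relation(feats: torch.Tensor, bank: torch.Tensor) -> torch.Tensor:
    # Pairwise cosine similarities between current-batch features (B, D)
    # and stored features (K, D) -> relation matrix of shape (B, K).
    return F.normalize(feats, dim=1) @ bank.t()


def gird_losses(student_feats, teacher_feats, student_banks, teacher_banks):
    """Similarity-relation and relation-evolution losses over matched layers.

    student_feats / teacher_feats: lists of per-layer (B, D_l) embeddings.
    student_banks / teacher_banks: one FeatureBank per layer (all with the same size K).
    """
    rel_s, rel_t, loss_sim = [], [], 0.0
    for fs, ft, bs, bt in zip(student_feats, teacher_feats, student_banks, teacher_banks):
        rs = similarity_relation(fs, bs.bank)        # student relations, (B, K)
        rt = similarity_relation(ft, bt.bank)        # teacher relations, (B, K)
        rel_s.append(rs)
        rel_t.append(rt)
        loss_sim = loss_sim + F.mse_loss(rs, rt)     # match relations per layer

    # One plausible relation-evolution term: compare how the relation matrices
    # change between consecutive layers in the student and in the teacher.
    loss_evo = 0.0
    for l in range(len(rel_s) - 1):
        loss_evo = loss_evo + F.mse_loss(rel_s[l + 1] - rel_s[l],
                                         rel_t[l + 1] - rel_t[l])

    # After the losses are computed, refresh the banks with the new features.
    for fs, ft, bs, bt in zip(student_feats, teacher_feats, student_banks, teacher_banks):
        bs.enqueue(fs)
        bt.enqueue(ft)

    return loss_sim, loss_evo
```

In practice these two terms would be weighted and added to the usual cross-entropy and logit-distillation losses; the exact relation measures, bank update rule, and layer pairing used in the paper may differ from this sketch.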

Data availability

Data will be made available on reasonable request.

Acknowledgements

This work was supported in part by the National Key R&D Program of China under Grant 2021YFE0205400, in part by the Key Program of Natural Science Foundation of Fujian Province under Grant 2023J02022, in part by the Natural Science Foundation for Outstanding Young Scholars of Fujian Province under Grant 2022J06023, in part by the Natural Science Foundation of Fujian Province under Grant 2022J01294, in part by the Key Science and Technology Project of Xiamen City under Grant 3502Z20231005, in part by the High-level Talent Team Project of Quanzhou City under Grant 2023CT001, and in part by the Key Science and Technology Project of Quanzhou City under Grant 2023GZ4.

Author information

Corresponding author

Correspondence to Huanqiang Zeng.

Ethics declarations

Conflicts of interest

We declare that we have no financial or personal relationships with other people or organizations that could inappropriately influence our work, and no professional or other personal interest of any nature in any product, service, and/or company that could be construed as influencing the position presented in, or the review of, the manuscript entitled 'Global Instance Relation Distillation for Convolutional Neural Network Compression.'

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Hu, H., Zeng, H., Xie, Y. et al. Global Instance Relation Distillation for convolutional neural network compression. Neural Comput & Applic (2024). https://doi.org/10.1007/s00521-024-09635-9
