Distilling Object Detectors with Global Knowledge

Tang, Sanli; Zhang, Zhongyu; Cheng, Zhanzhan; Lu, Jing; Xu, Yunlu; Niu, Yi; He, Fan

doi:10.1007/978-3-031-20077-9_25


Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13669)

Included in the following conference series:

European Conference on Computer Vision


Abstract

Knowledge distillation trains a lightweight student model to mimic a cumbersome teacher. Existing methods treat knowledge as the features of individual instances or the relations between them. This is instance-level knowledge drawn only from the teacher model, i.e., local knowledge. However, empirical studies show that local knowledge is very noisy in object detection tasks, especially for blurred, occluded, or small instances. A more intrinsic approach is therefore to measure the representations of instances w.r.t. a group of common basis vectors in the two feature spaces of the teacher and student detectors, i.e., global knowledge. Distillation can then be cast as space alignment. To this end, a novel prototype generation module (PGM) is proposed to find the common basis vectors, dubbed prototypes, in the two feature spaces. A robust distilling module (RDM) then constructs the global knowledge from the prototypes and filters out noisy local knowledge by measuring the discrepancy between the representations in the two feature spaces. Experiments with Faster R-CNN and RetinaNet on the PASCAL and COCO datasets show that our method achieves the best performance for distilling object detectors with various backbones, even surpassing the performance of the teacher model. We also show that existing methods can be easily combined with global knowledge to obtain further improvement. Code is available at https://github.com/hikvision-research/DAVAR-Lab-ML.
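
The idea of describing instances relative to shared prototypes can be illustrated with a toy sketch. This is not the authors' implementation. The abstract does not specify how prototypes are selected, how similarity is measured, or how noisy instances are down-weighted, so the cosine similarity, the squared-distance discrepancy, the exponential weighting, the hand-picked prototype indices, and every function name below are assumptions made purely for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0

def represent(features, prototype_ids):
    """Describe every instance by its similarity to each prototype instance.

    The same instance indices act as prototypes in both feature spaces, so the
    resulting representations are comparable even when the teacher and student
    feature dimensions differ.
    """
    protos = [features[i] for i in prototype_ids]
    return [[cosine(f, p) for p in protos] for f in features]

def discrepancy(rep_t, rep_s):
    """Per-instance squared distance between teacher and student representations."""
    return [sum((a - b) ** 2 for a, b in zip(rt, rs))
            for rt, rs in zip(rep_t, rep_s)]

def global_loss(disc):
    """Mean discrepancy: an assumed stand-in for a space-alignment objective."""
    return sum(disc) / len(disc)

def local_weights(disc, temperature=1.0):
    """Assumed weighting: instances whose representations disagree count less."""
    return [math.exp(-d / temperature) for d in disc]

# Toy data: 4 instances, teacher features in 3-D, student features in 2-D.
teacher = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
student = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, 0.0]]
prototype_ids = [0, 1]  # hypothetical choice; the paper's PGM selects these

rep_t = represent(teacher, prototype_ids)
rep_s = represent(student, prototype_ids)
disc = discrepancy(rep_t, rep_s)
weights = local_weights(disc)
loss = global_loss(disc)
```

In this toy setup the first three instances have matching representations in both spaces, while the fourth disagrees and therefore receives a smaller weight for any instance-level (local) distillation term.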

S. Tang and Z. Zhang contributed equally.



Author information

Authors and Affiliations

Hikvision Research Institute, Hangzhou, China
Sanli Tang, Zhongyu Zhang, Zhanzhan Cheng, Jing Lu, Yunlu Xu & Yi Niu
Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai, China
Fan He


Corresponding author

Correspondence to Zhanzhan Cheng.

Editor information

Editors and Affiliations

Tel Aviv University, Tel Aviv, Israel
Shai Avidan
University College London, London, UK
Gabriel Brostow
Google AI, Accra, Ghana
Moustapha Cissé
University of Catania, Catania, Italy
Giovanni Maria Farinella
Facebook (United States), Menlo Park, CA, USA
Tal Hassner



About this paper

Cite this paper

Tang, S. et al. (2022). Distilling Object Detectors with Global Knowledge. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13669. Springer, Cham. https://doi.org/10.1007/978-3-031-20077-9_25


DOI: https://doi.org/10.1007/978-3-031-20077-9_25
Published: 06 November 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-20076-2
Online ISBN: 978-3-031-20077-9
eBook Packages: Computer Science, Computer Science (R0)
