Abstract
Conventional knowledge distillation (KD) methods for object detection mainly concentrate on homogeneous teacher-student detectors. However, a lightweight detector designed for deployment often differs significantly from a high-capacity detector. We therefore investigate KD between heterogeneous teacher-student pairs for wider applicability. We observe that the core difficulty of heterogeneous KD (hetero-KD) is the significant semantic gap between the backbone features of heterogeneous detectors, which stems from their different optimization manners. Conventional homogeneous KD (homo-KD) methods suffer from this gap and struggle to achieve satisfactory performance when applied directly to hetero-KD. In this paper, we propose the HEtero-Assists Distillation (HEAD) framework, which leverages heterogeneous detection heads as assistants to guide the optimization of the student detector and reduce this gap. In HEAD, the assistant is an additional detection head, homogeneous in architecture to the teacher head, attached to the student backbone. A hetero-KD is thus transformed into a homo-KD, allowing efficient knowledge transfer from the teacher to the student. Moreover, we extend HEAD into a Teacher-Free HEAD (TF-HEAD) framework for cases where a well-trained teacher detector is unavailable. Our method achieves significant improvements over current detection KD methods. For example, on the MS-COCO dataset, TF-HEAD helps R18 RetinaNet achieve 33.9 mAP (\(+2.2\)), while HEAD further pushes the limit to 36.2 mAP (\(+4.5\)).
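To make the mechanism concrete, below is a minimal PyTorch-style sketch of the assistant idea described in the abstract: a head homogeneous to the teacher's is attached to the student backbone, so teacher-to-assistant transfer becomes a homogeneous KD problem. All names here (`HEADStudent`, `head_kd_loss`, the module arguments) and the soft-label KL loss are illustrative assumptions, not the authors' released implementation, which also involves feature-level transfer.

```python
import torch.nn as nn
import torch.nn.functional as F

class HEADStudent(nn.Module):
    """Student detector with a teacher-homogeneous assistant head.

    The assistant head shares the teacher head's architecture but is
    attached to the student backbone, turning the hetero-KD problem
    into a homo-KD one between the assistant and the teacher head.
    """
    def __init__(self, student_backbone, student_head, assistant_head):
        super().__init__()
        self.backbone = student_backbone   # lightweight, e.g. R18 + FPN
        self.head = student_head           # student's own detection head
        self.assistant = assistant_head    # same architecture as teacher head

    def forward(self, images):
        feats = self.backbone(images)
        # Both heads see the same student features; the assistant's
        # outputs are supervised by the teacher, guiding the backbone.
        return self.head(feats), self.assistant(feats)

def head_kd_loss(assistant_logits, teacher_logits, T=2.0):
    # Homogeneous KD term: since assistant and teacher heads share an
    # architecture, their class logits align one-to-one, so a standard
    # temperature-scaled soft-label KL applies (a simplifying assumption;
    # feature-level distillation is omitted here for brevity).
    s = F.log_softmax(assistant_logits / T, dim=-1)
    t = F.softmax(teacher_logits / T, dim=-1)
    return F.kl_div(s, t, reduction="batchmean") * T * T
```

In this sketch the assistant exists only during training; at deployment time only the student backbone and its own head are kept, so inference cost is unchanged.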
Acknowledgement
This work was partly supported by the National Natural Science Foundation of China (62122010, 61876177), the Fundamental Research Funds for the Central Universities, and the Key Research and Development Program of Zhejiang Province (2022C01082).
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Wang, L. et al. (2022). HEAD: HEtero-Assists Distillation for Heterogeneous Object Detectors. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13669. Springer, Cham. https://doi.org/10.1007/978-3-031-20077-9_19
DOI: https://doi.org/10.1007/978-3-031-20077-9_19
Print ISBN: 978-3-031-20076-2
Online ISBN: 978-3-031-20077-9