Abstract
Medical image datasets for training deep classification models are typically scarce, making these models prone to overfitting the training data. Studies have shown that knowledge distillation (KD), especially the mean-teacher framework, which is more robust to perturbations, can help mitigate overfitting. However, directly transferring KD from computer vision to medical image classification yields inferior performance, as medical images suffer from higher intra-class variance and class imbalance. To address these issues, we propose a novel Categorical Relation-preserving Contrastive Knowledge Distillation (CRCKD) algorithm, which takes the commonly used mean-teacher model as the supervisor. Specifically, we propose a novel Class-guided Contrastive Distillation (CCD) module to pull positive image pairs from the same class closer together across the teacher and student models, while pushing apart negative image pairs from different classes. With this regularization, the feature distribution of the student model shows higher intra-class similarity and inter-class variance. Furthermore, we propose a Categorical Relation Preserving (CRP) loss to distill the teacher's relational knowledge in a robust and class-balanced manner. Through the contributions of CCD and CRP, our CRCKD algorithm distills relational knowledge more comprehensively. Extensive experiments on the HAM10000 and APTOS datasets demonstrate the superiority of the proposed CRCKD method. The source code is available at https://github.com/hathawayxxh/CRCKD.
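As a rough illustration of the CCD idea described above, the following sketch computes a class-guided contrastive loss between student and teacher embeddings: same-class teacher embeddings serve as positives and different-class ones as negatives, in an InfoNCE-style formulation. This is not the authors' implementation; the function name, the exact loss form, and the temperature value are assumptions for illustration only (see the linked repository for the actual code).

```python
import numpy as np

def ccd_loss(student_feats, teacher_feats, labels, tau=0.07):
    """Class-guided Contrastive Distillation loss (illustrative sketch).

    For each student embedding, teacher embeddings of the same class are
    treated as positives and those of other classes as negatives; the
    loss pulls positives together and pushes negatives apart.
    """
    # L2-normalize both sets of embeddings onto the unit hypersphere.
    s = student_feats / np.linalg.norm(student_feats, axis=1, keepdims=True)
    t = teacher_feats / np.linalg.norm(teacher_feats, axis=1, keepdims=True)
    # Temperature-scaled cosine similarities between all student/teacher pairs.
    sim = s @ t.T / tau
    # Numerically stable softmax over teacher embeddings for each anchor.
    exp_sim = np.exp(sim - sim.max(axis=1, keepdims=True))
    log_prob = np.log(exp_sim / exp_sim.sum(axis=1, keepdims=True))
    # Same-class pairs (including the anchor's own teacher view) are positives.
    pos_mask = labels[:, None] == labels[None, :]
    # Average log-probability of the positives for each anchor, negated.
    loss = -(log_prob * pos_mask).sum(axis=1) / pos_mask.sum(axis=1)
    return loss.mean()
```

When student and teacher embeddings are identical class prototypes, the loss reduces to roughly log of the number of positives per anchor, the floor of this formulation; minimizing it drives exactly the behavior the abstract describes, higher intra-class similarity and inter-class separation.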
Acknowledgements
The work described in this paper was supported by National Key R&D program of China with Grant No. 2019YFB1312400, Hong Kong RGC CRF grant C4063-18G, and Hong Kong RGC GRF grant #14211420.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Xing, X., Hou, Y., Li, H., Yuan, Y., Li, H., Meng, M.Q.-H. (2021). Categorical Relation-Preserving Contrastive Knowledge Distillation for Medical Image Classification. In: de Bruijne, M., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2021. MICCAI 2021. Lecture Notes in Computer Science, vol 12905. Springer, Cham. https://doi.org/10.1007/978-3-030-87240-3_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-87239-7
Online ISBN: 978-3-030-87240-3
eBook Packages: Computer Science (R0)