Abstract
Medical image datasets for training deep classification models are typically scarce, making these models prone to overfitting the training data. Studies have shown that knowledge distillation (KD), especially the mean-teacher framework, which is more robust to perturbations, can help mitigate overfitting. However, directly transferring KD from computer vision to medical image classification yields inferior performance, as medical images suffer from higher intra-class variance and class imbalance. To address these issues, we propose a novel Categorical Relation-preserving Contrastive Knowledge Distillation (CRCKD) algorithm, which takes the commonly used mean-teacher model as the supervisor. Specifically, we propose a novel Class-guided Contrastive Distillation (CCD) module to pull positive image pairs from the same class closer together across the teacher and student models, while pushing apart negative image pairs from different classes. With this regularization, the feature distribution of the student model shows higher intra-class similarity and inter-class variance. Furthermore, we propose a Categorical Relation Preserving (CRP) loss to distill the teacher's relational knowledge in a robust and class-balanced manner. Through the contributions of CCD and CRP, our CRCKD algorithm distills relational knowledge more comprehensively. Extensive experiments on the HAM10000 and APTOS datasets demonstrate the superiority of the proposed CRCKD method. The source code is available at https://github.com/hathawayxxh/CRCKD.
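As a rough illustration of the CCD idea described above, the following sketch computes a class-guided contrastive loss between student and teacher embeddings: same-class teacher embeddings serve as positives and different-class ones as negatives, in an InfoNCE-style formulation. This is not the authors' implementation; the function name, the exact loss form, and the temperature value are assumptions for illustration only (see the linked repository for the actual code).

```python
import numpy as np

def ccd_loss(student_feats, teacher_feats, labels, tau=0.07):
    """Class-guided Contrastive Distillation loss (illustrative sketch).

    For each student embedding, teacher embeddings of the same class are
    treated as positives and those of other classes as negatives; the
    loss pulls positives together and pushes negatives apart.
    """
    # L2-normalize both sets of embeddings onto the unit hypersphere.
    s = student_feats / np.linalg.norm(student_feats, axis=1, keepdims=True)
    t = teacher_feats / np.linalg.norm(teacher_feats, axis=1, keepdims=True)
    # Temperature-scaled cosine similarities between all student/teacher pairs.
    sim = s @ t.T / tau
    # Numerically stable softmax over teacher embeddings for each anchor.
    exp_sim = np.exp(sim - sim.max(axis=1, keepdims=True))
    log_prob = np.log(exp_sim / exp_sim.sum(axis=1, keepdims=True))
    # Same-class pairs (including the anchor's own teacher view) are positives.
    pos_mask = labels[:, None] == labels[None, :]
    # Average log-probability of the positives for each anchor, negated.
    loss = -(log_prob * pos_mask).sum(axis=1) / pos_mask.sum(axis=1)
    return loss.mean()
```

When student and teacher embeddings are identical class prototypes, the loss reduces to roughly log of the number of positives per anchor, the floor of this formulation; minimizing it drives exactly the behavior the abstract describes, higher intra-class similarity and inter-class separation.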
Acknowledgements
The work described in this paper was supported by National Key R&D program of China with Grant No. 2019YFB1312400, Hong Kong RGC CRF grant C4063-18G, and Hong Kong RGC GRF grant #14211420.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Xing, X., Hou, Y., Li, H., Yuan, Y., Li, H., Meng, M.Q.-H. (2021). Categorical Relation-Preserving Contrastive Knowledge Distillation for Medical Image Classification. In: de Bruijne, M., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2021. MICCAI 2021. Lecture Notes in Computer Science, vol 12905. Springer, Cham. https://doi.org/10.1007/978-3-030-87240-3_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-87239-7
Online ISBN: 978-3-030-87240-3
eBook Packages: Computer Science (R0)