Abstract
Building generalizable AI models is one of the primary challenges in the healthcare domain. While radiologists rely on generalizable descriptive rules of abnormality, Neural Network (NN) models suffer even from a slight shift in the input distribution (e.g., scanner type). Fine-tuning a model to transfer knowledge from one domain to another requires a significant amount of labeled data in the target domain. In this paper, we develop an interpretable model that can be efficiently fine-tuned to an unseen target domain with minimal computational cost. We assume the interpretable component of a NN to be approximately domain-invariant. However, interpretable models typically underperform compared to their Blackbox (BB) variants. We start with a BB in the source domain and distill it into a mixture of shallow interpretable models using human-understandable concepts. Because each interpretable model covers a subset of the data, the mixture of interpretable models achieves performance comparable to that of the BB. We then use the pseudo-labeling technique from semi-supervised learning (SSL) to learn the concept classifier in the target domain, followed by fine-tuning the interpretable models in that domain. We evaluate our model on a real-life, large-scale chest X-ray (CXR) classification dataset. The code is available at: https://github.com/batmanlab/MICCAI-2023-Route-interpret-repeat-CXRs.
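For concreteness, below is a minimal PyTorch sketch of the pseudo-labeling step described in the abstract: a concept classifier trained on the source domain assigns pseudo concept labels to unlabeled target-domain images, and only its confident predictions are used to fine-tune it before the interpretable experts themselves are fine-tuned. All names here (`concept_clf`, `target_loader`, the threshold `tau`) are illustrative assumptions, not taken from the authors' released code.

```python
import torch
import torch.nn.functional as F

def pseudo_label_concepts(concept_clf, target_loader, optimizer, tau=0.9):
    """One epoch of confidence-thresholded pseudo-labeling for a
    multi-label concept classifier on unlabeled target-domain images."""
    concept_clf.train()
    for images in target_loader:                     # unlabeled target-domain CXRs
        with torch.no_grad():
            probs = torch.sigmoid(concept_clf(images))
        # keep only the predictions the source-trained classifier is sure about
        confident = (probs > tau) | (probs < 1.0 - tau)
        if not confident.any():
            continue
        pseudo = (probs > 0.5).float()               # hard pseudo concept labels
        logits = concept_clf(images)
        loss = F.binary_cross_entropy_with_logits(
            logits[confident], pseudo[confident]
        )
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```

In this sketch the confidence threshold `tau` trades label coverage against pseudo-label noise; once the concept classifier is adapted, the shallow interpretable experts can be fine-tuned on its target-domain concept predictions.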
Acknowledgement
This work was partially supported by NIH Award Number 1R01HL141813-01 and the Pennsylvania Department of Health. We are grateful for the computational resources from the Pittsburgh Supercomputing Center, grant number TG-ASC170024.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Ghosh, S., Yu, K., Batmanghelich, K. (2023). Distilling BlackBox to Interpretable Models for Efficient Transfer Learning. In: Greenspan, H., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2023. MICCAI 2023. Lecture Notes in Computer Science, vol 14221. Springer, Cham. https://doi.org/10.1007/978-3-031-43895-0_59
DOI: https://doi.org/10.1007/978-3-031-43895-0_59
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43894-3
Online ISBN: 978-3-031-43895-0
eBook Packages: Computer Science, Computer Science (R0)