Abstract
Various interpretable methods have been proposed to trace the decision-making process of deep neural networks on fine-grained image recognition, and they have obtained promising results. However, existing methods remain limited in how well they align abstract semantic concepts with concrete image regions, because human guidance is absent during model training. To address this issue, and inspired by machine teaching techniques, we formulate the training of interpretable methods as an interactive learning process that concisely simulates the human learning mechanism. Specifically, we propose a semantic-aware self-teaching framework that progressively improves a given neural network through an interactive teacher-student learning protocol. After being initialized from the well-trained parameters of the given model, the teacher model provides a minimal set of informative image regions to train the student model, which generates interpretable predictions (i.e., semantic image regions) as feedback. This feedback encourages the teacher model to further refine the alignment between semantic concepts and image regions. In addition, the proposed framework is compatible with most existing network architectures. Extensive comparisons with state-of-the-art interpretable approaches on public benchmarks demonstrate that our interactive learning approach achieves improved interpretability, higher classification accuracy, and greater generality.
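The teacher-student protocol described above can be illustrated with a toy feedback loop. The sketch below is not the authors' implementation: the scalar region features, the sign-based student, the update rule, and every function name and parameter (`make_toy_data`, `teacher_select`, `student_feedback`, `self_teach`, `k`, `lr`) are assumptions introduced purely for illustration. It only shows the general shape of the interaction, in which a teacher shows a few regions, a student reports how useful each one was, and the teacher updates its region weights from that feedback.

```python
import random


def make_toy_data(n=200, regions=6, informative=2, seed=0):
    """Toy samples: per-region scalar features plus a binary label.

    Assumption of this sketch: only region `informative` carries label signal.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        feats = [rng.gauss(0.0, 1.0) for _ in range(regions)]
        feats[informative] = (1.0 if label else -1.0) + rng.gauss(0.0, 0.3)
        data.append((feats, label))
    return data


def teacher_select(weights, k):
    """Teacher shows only the k highest-weighted regions."""
    return sorted(range(len(weights)), key=lambda r: -weights[r])[:k]


def student_feedback(data, shown):
    """Student scores each shown region by how well its sign predicts the label."""
    scores = {}
    for r in shown:
        correct = sum((feats[r] > 0) == bool(label) for feats, label in data)
        scores[r] = abs(correct / len(data) - 0.5) * 2.0  # 0 = chance, 1 = perfect
    return scores


def self_teach(data, regions, k=3, rounds=5, lr=0.5):
    """Alternate teacher selection and student feedback for a few rounds."""
    weights = [1.0] * regions  # hypothetical stand-in for pretrained parameters
    for _ in range(rounds):
        shown = teacher_select(weights, k)
        feedback = student_feedback(data, shown)
        for r, score in feedback.items():
            weights[r] += lr * (score - weights[r])  # move toward the feedback
    return weights


if __name__ == "__main__":
    data = make_toy_data()
    weights = self_teach(data, regions=6)
    print("region weights:", [round(w, 2) for w in weights])
```

In this toy setting the teacher's weight for the label-bearing region stays high while the weights of the noise regions decay, which mirrors, in a very simplified way, the refinement of concept-to-region alignment that the abstract describes.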
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Jiang, H., Li, H., Chen, J., Wan, W., Wang, K. (2024). Interactive Learning for Interpretable Visual Recognition via Semantic-Aware Self-Teaching Framework. In: Liu, Q., et al. Pattern Recognition and Computer Vision. PRCV 2023. Lecture Notes in Computer Science, vol 14433. Springer, Singapore. https://doi.org/10.1007/978-981-99-8546-3_12
DOI: https://doi.org/10.1007/978-981-99-8546-3_12
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8545-6
Online ISBN: 978-981-99-8546-3
eBook Packages: Computer Science, Computer Science (R0)