Interactive Learning for Interpretable Visual Recognition via Semantic-Aware Self-Teaching Framework

Jiang, Hao; Li, Haowei; Chen, Junhao; Wan, Wentao; Wang, Keze

doi:10.1007/978-981-99-8546-3_12

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14433)

Included in the following conference series:

Chinese Conference on Pattern Recognition and Computer Vision (PRCV)

Abstract

Various interpretable methods have been proposed to trace back the decision-making process of deep neural networks on the fine-grained image recognition task, and they have obtained promising results. However, existing methods still have limited interpretability in aligning abstract semantic concepts with concrete image regions, owing to the lack of human guidance during model training. To address this issue, and inspired by machine teaching techniques, we formulate the training of interpretable methods as an interactive learning process that concisely simulates the human learning mechanism. Specifically, we propose a semantic-aware self-teaching framework that progressively improves a given neural network through an interactive teacher-student learning protocol. After being initialized from the well-trained parameters of the given model, the teacher model focuses on providing a minimal set of informative image regions to train the student model, which generates interpretable predictions (i.e., semantic image regions) as feedback. This feedback encourages the teacher model to further refine the alignment between semantic concepts and image regions. In addition, the proposed framework is compatible with most existing network architectures. Extensive comparisons with state-of-the-art interpretable approaches on public benchmarks demonstrate that our interactive learning approach achieves improved interpretability, higher classification accuracy, and greater generality.
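
The teacher-student interaction described in the abstract can be pictured with a toy loop. The sketch below is not the authors' implementation: the per-region scores, the threshold-based student, the update rule, and every function name are assumptions made purely for illustration. It only shows the shape of the protocol: the teacher shows a small set of regions, the student answers with the regions it judges semantic, and the teacher refines its region scores from that feedback.

```python
from typing import List, Sequence


def teacher_select(scores: Sequence[float], k: int) -> List[int]:
    """Teacher shows only the k regions it currently scores highest (hypothetical rule)."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def student_feedback(regions: Sequence[float], shown: Sequence[int],
                     threshold: float) -> List[int]:
    """Toy student: among the shown regions, report those whose feature
    value exceeds the threshold as 'semantic' regions (an assumption)."""
    return [i for i in shown if regions[i] > threshold]


def teacher_refine(scores: Sequence[float], shown: Sequence[int],
                   feedback: Sequence[int], lr: float = 0.5) -> List[float]:
    """Move the score of each shown region toward 1 if the student
    confirmed it and toward 0 otherwise (assumed update rule)."""
    confirmed = set(feedback)
    new_scores = list(scores)
    for i in shown:
        target = 1.0 if i in confirmed else 0.0
        new_scores[i] += lr * (target - new_scores[i])
    return new_scores


def self_teach(regions: Sequence[float], scores: Sequence[float], k: int,
               threshold: float, rounds: int) -> List[float]:
    """Run the teacher -> student -> teacher loop for a fixed number of rounds."""
    current = list(scores)
    for _ in range(rounds):
        shown = teacher_select(current, k)
        feedback = student_feedback(regions, shown, threshold)
        current = teacher_refine(current, shown, feedback)
    return current


if __name__ == "__main__":
    # Four regions; regions 0 and 2 carry the strongest (toy) semantic signal.
    regions = [0.9, 0.1, 0.8, 0.2]
    initial_scores = [0.5, 0.6, 0.4, 0.3]
    final_scores = self_teach(regions, initial_scores, k=2, threshold=0.5, rounds=3)
    print(teacher_select(final_scores, 2))
```

In this toy setup the teacher initially ranks region 1 above region 2, and after three rounds of feedback its top two regions are 0 and 2. A real system would replace the scalar scores and threshold with learned networks and image features.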

Author information

Authors and Affiliations

Computer Science and Technology, Sun Yat-sen University, Guangzhou, China
Hao Jiang, Haowei Li, Junhao Chen, Wentao Wan & Keze Wang

Corresponding author

Correspondence to Keze Wang.

Editor information

Editors and Affiliations

Nanjing University of Information Science and Technology, Nanjing, China
Qingshan Liu
Xiamen University, Xiamen, China
Hanzi Wang
Beijing University of Posts and Telecommunications, Beijing, China
Zhanyu Ma
Sun Yat-sen University, Guangzhou, China
Weishi Zheng
Peking University, Beijing, China
Hongbin Zha
Chinese Academy of Sciences, Beijing, China
Xilin Chen
Chinese Academy of Sciences, Beijing, China
Liang Wang
Xiamen University, Xiamen, China
Rongrong Ji

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 2068 KB)

About this paper

Cite this paper

Jiang, H., Li, H., Chen, J., Wan, W., Wang, K. (2024). Interactive Learning for Interpretable Visual Recognition via Semantic-Aware Self-Teaching Framework. In: Liu, Q., et al. Pattern Recognition and Computer Vision. PRCV 2023. Lecture Notes in Computer Science, vol 14433. Springer, Singapore. https://doi.org/10.1007/978-981-99-8546-3_12

DOI: https://doi.org/10.1007/978-981-99-8546-3_12
Published: 26 December 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8545-6
Online ISBN: 978-981-99-8546-3
eBook Packages: Computer Science, Computer Science (R0)
