
HOLMES: HOLonym-MEronym Based Semantic Inspection for Convolutional Image Classifiers

  • Conference paper
Explainable Artificial Intelligence (xAI 2023)

Abstract

Convolutional Neural Networks (CNNs) are nowadays the model of choice in Computer Vision, thanks to their ability to automate the feature extraction process in visual tasks. However, the knowledge acquired during training is fully sub-symbolic, and hence difficult to understand and explain to end users. In this paper, we propose a new technique called HOLMES (HOLonym-MEronym based Semantic inspection) that decomposes a label into a set of related concepts, providing component-level explanations for an image classification model. Specifically, HOLMES leverages ontologies, web scraping and transfer learning to automatically construct meronym (part)-based detectors for a given holonym (class). It then produces heatmaps at the meronym level and, finally, by probing the holonym CNN with occluded images, highlights the importance of each part for the classification output. Compared to state-of-the-art saliency methods, HOLMES takes a step further and provides information about both where and what the holonym CNN is looking at. It does so without relying on densely annotated datasets and without forcing concepts to be associated with single computational units. Extensive experimental evaluation on different categories of objects (animals, tools and vehicles) shows the feasibility of our approach. On average, HOLMES explanations include at least two meronyms, and the ablation of a single meronym roughly halves the holonym model's confidence. The resulting heatmaps were quantitatively evaluated using deletion/insertion/preservation curves. All metrics were comparable to those achieved by GradCAM, while offering the advantage of further decomposing the heatmap into human-understandable concepts. In addition, results were largely above chance level, highlighting both the relevance of meronyms to object classification and HOLMES' ability to capture it. The code is available at https://github.com/FrancesC0de/HOLMES.
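
To make the pipeline concrete, here is a minimal sketch of three of the steps described above: extracting part-meronyms for a class label from the WordNet ontology, scoring a part by the confidence drop under occlusion, and computing a deletion curve. It assumes a PyTorch image classifier and NLTK's WordNet interface; all function names and the pre-computed `part_mask`/`saliency` inputs are illustrative assumptions, not the authors' actual API.

```python
# Illustrative sketch of the HOLMES-style steps named in the abstract.
# Assumes: a PyTorch CNN classifier in eval() mode and NLTK's WordNet
# corpus (nltk.download('wordnet')). Names here are hypothetical.
import torch
from nltk.corpus import wordnet as wn


def get_meronyms(holonym: str) -> list[str]:
    """Decompose a holonym (class label) into its WordNet part-meronyms,
    e.g. 'airplane' -> wings, fuselage, cockpit, ..."""
    synsets = wn.synsets(holonym, pos=wn.NOUN)
    if not synsets:
        return []
    return [lemma.name()
            for part in synsets[0].part_meronyms()
            for lemma in part.lemmas()]


@torch.no_grad()
def meronym_importance(model, image, class_idx, part_mask):
    """Score one meronym as the drop in holonym confidence when the
    part's region (a boolean HxW mask) is occluded in the CxHxW image."""
    def prob(x):
        return torch.softmax(model(x.unsqueeze(0)), dim=1)[0, class_idx]
    base = prob(image)
    occluded = image.clone()
    occluded[:, part_mask] = 0.0      # black out the meronym region
    return (base - prob(occluded)).item()


@torch.no_grad()
def deletion_curve(model, image, class_idx, saliency, steps=20):
    """Deletion metric: progressively zero the most salient pixels and
    track the class confidence (a faithful heatmap makes it fall fast)."""
    order = saliency.flatten().argsort(descending=True)
    flat = image.clone().view(image.shape[0], -1)
    chunk = len(order) // steps
    confs = []
    for i in range(steps):
        flat[:, order[i * chunk:(i + 1) * chunk]] = 0.0
        logits = model(flat.view_as(image).unsqueeze(0))
        confs.append(torch.softmax(logits, dim=1)[0, class_idx].item())
    return confs                      # its mean approximates the AUC
```

In the paper's pipeline, the part regions come from meronym detectors trained by transfer learning on web-scraped images; here `part_mask` and `saliency` simply stand in for those detector outputs.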



Author information


Corresponding author

Correspondence to Francesco Dibitonto.



Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Dibitonto, F., Garcea, F., Panisson, A., Perotti, A., Morra, L. (2023). HOLMES: HOLonym-MEronym Based Semantic Inspection for Convolutional Image Classifiers. In: Longo, L. (ed.) Explainable Artificial Intelligence. xAI 2023. Communications in Computer and Information Science, vol 1902. Springer, Cham. https://doi.org/10.1007/978-3-031-44067-0_25


  • DOI: https://doi.org/10.1007/978-3-031-44067-0_25

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-44066-3

  • Online ISBN: 978-3-031-44067-0

  • eBook Packages: Computer Science, Computer Science (R0)
