
ADVISE: ADaptive feature relevance and VISual Explanations for convolutional neural networks

Original article. Published in The Visual Computer (2023).

Abstract

To equip convolutional neural networks (CNNs) with explainability, it is essential to interpret how opaque models make specific decisions, understand what causes their errors, improve architecture design, and identify unethical biases in classifiers. This paper introduces ADVISE, a new explainability method that quantifies and leverages the relevance of each unit of the feature map to provide better visual explanations. To this end, we propose using adaptive-bandwidth kernel density estimation to assign a relevance score to each unit of the feature map with respect to the predicted class. We also propose an evaluation protocol to quantitatively assess the visual explainability of CNN models. Our extensive evaluation of ADVISE on image classification tasks, using pretrained AlexNet, VGG16, ResNet50, and Xception models on ImageNet, shows that our method outperforms other visual explanation methods in quantifying feature relevance and visual explainability while maintaining competitive time complexity. Our experiments further show that ADVISE fulfils the sensitivity and implementation-independence axioms and passes the sanity checks. The implementation is available for reproducibility at https://github.com/dehshibi/ADVISE.
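As an illustration of the idea described above, the sketch below assigns a kernel-density-based relevance score to each channel of a convolutional feature map and combines the channels into a single saliency map. This is a minimal sketch under stated assumptions, not the authors' method: the function names, the Silverman-rule bandwidth (standing in for the paper's adaptive-bandwidth optimisation), and the peak-density scoring heuristic are all hypothetical choices made for illustration; the actual implementation lives in the repository linked above.

# Illustrative sketch only; NOT the ADVISE implementation
# (see https://github.com/dehshibi/ADVISE). All names and the
# scoring heuristic are hypothetical.
import numpy as np
from scipy.stats import gaussian_kde

def unit_relevance(feature_maps: np.ndarray) -> np.ndarray:
    """Score each unit (channel) of an (H, W, C) feature map in [0, 1]."""
    _, _, c = feature_maps.shape
    scores = np.zeros(c)
    for k in range(c):
        unit = feature_maps[:, :, k].ravel()
        if unit.std() < 1e-8:  # a constant unit carries no information
            continue
        # Silverman's rule adapts the bandwidth to each unit's activations;
        # the paper optimises the bandwidth itself, which is not reproduced here.
        kde = gaussian_kde(unit, bw_method="silverman")
        # Heuristic score: how peaked the estimated activation density is.
        scores[k] = kde(unit).max()
    rng = scores.max() - scores.min()
    return (scores - scores.min()) / (rng + 1e-12)

def saliency_map(feature_maps: np.ndarray) -> np.ndarray:
    """Combine units, weighted by their relevance, into one (H, W) map."""
    return np.tensordot(feature_maps, unit_relevance(feature_maps), axes=([2], [0]))

In a CAM-style pipeline, feature_maps would hold the last convolutional layer's activations for one image; the resulting (H, W) map would then be upsampled to the input resolution and overlaid on the image.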


Data Availability

All images and data belong to the ImageNet database, which is publicly available at https://www.image-net.org/.

Notes

  1. The terms feature map and activation map are used interchangeably here: the former refers to a mapping of where a specific type of feature is found in an image, while the latter maps the activations produced across different areas of the image.

  2. \(\frac{\omega^{*}}{\gamma} = n\) is used in our experiments.


Funding

This work is partially supported by funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement No. 101002711) and by grant PID2022-138721NB-I00 from the Spanish Ministry of Science and Innovation (FEDER funds).

Author information

Corresponding author

Correspondence to Mohammad Mahdi Dehshibi.

Ethics declarations

Conflict of Interest

There are no competing interests to declare.

Human and Animal Rights

This study does not contain any experiments with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Dehshibi, M.M., Ashtari-Majlan, M., Adhane, G. et al. ADVISE: ADaptive feature relevance and VISual Explanations for convolutional neural networks. Vis Comput (2023). https://doi.org/10.1007/s00371-023-03112-5
