The Blame Problem in Evaluating Local Explanations and How to Tackle It

  • Conference paper

Artificial Intelligence. ECAI 2023 International Workshops (ECAI 2023)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1947)


Abstract

The number of proposed local model-agnostic explanation techniques has grown rapidly in recent years. One main reason is that the bar for developing new explainability techniques is low, owing to the lack of rigorous evaluation measures. Without such measures, it is hard to obtain concrete evidence of whether a new explanation technique significantly outperforms its predecessors. Our study proposes a new taxonomy for evaluating local explanations: robustness, evaluation using ground truth from synthetic datasets and interpretable models, model randomization, and human-grounded evaluation. Using this taxonomy, we highlight that all categories of evaluation methods, except those based on ground truth from interpretable models, suffer from a problem we call the “blame problem.” We argue that evaluation based on ground truth from interpretable models is a more reasonable way to evaluate local model-agnostic explanations, but we show that even this category of evaluation measures has further limitations. The evaluation of local explanations remains an open research problem.
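The robustness category mentioned in the abstract is typically operationalized by checking how much a local explanation changes when the explained instance is perturbed slightly. The following is a minimal sketch in that spirit (comparable in flavor to the stability measures discussed in [4], not code or a metric taken from this paper); explain_fn, x, radius, and n_samples are illustrative placeholders.

    import numpy as np

    def explanation_sensitivity(explain_fn, x, n_samples=20, radius=0.1, seed=0):
        """Estimate how unstable an attribution method is around instance x:
        the largest ratio of explanation change to input change over random
        perturbations in a small box (illustrative sketch only)."""
        rng = np.random.default_rng(seed)
        base = np.asarray(explain_fn(x), dtype=float)
        worst = 0.0
        for _ in range(n_samples):
            delta = rng.uniform(-radius, radius, size=x.shape)
            e_pert = np.asarray(explain_fn(x + delta), dtype=float)
            # Local Lipschitz-style ratio: change in explanation / change in input.
            worst = max(worst, np.linalg.norm(e_pert - base) / np.linalg.norm(delta))
        return worst  # Higher values indicate a less robust explanation.

    # Toy usage: feature-times-weight attributions of a linear model.
    w = np.array([0.5, -1.0, 2.0])
    linear_explainer = lambda x: w * x
    x0 = np.array([1.0, 2.0, -0.5])
    print(explanation_sensitivity(linear_explainer, x0))

Such a score says nothing about whether the attributions are correct; it only quantifies how consistent they are under small perturbations, which is exactly why robustness alone can misattribute blame between the model and the explanation technique.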


Notes

  1. For brevity, we refer to them as local explanations in our study.

  2. See Sect. 3 for a formal definition of these techniques.

  3. For brevity, we refer to local ground truth importance scores as ground truth. Note that these ground truth vectors differ from the common ground truth in machine learning, which consists of discrete class labels for the data points.

  4. Other studies have shown that saliency maps are unreliable for evaluating explanations [1, 14].

  5. For brevity, we may refer to local model-agnostic explanations as local explanations in our study.

  6. See Table 2 of that study for the scale of the robustness values reported for the explanations.

  7. In this example, the Euclidean similarity is defined as \(1 / (\epsilon + d)\), where \(d\) is the Euclidean distance and \(\epsilon\) is the machine epsilon of Python; a minimal sketch of this computation follows these notes.
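To make the similarity in note 7 concrete, here is a minimal sketch (not from the paper) of how the Euclidean similarity between two explanation vectors could be computed with NumPy; the function name euclidean_similarity and the example vectors are illustrative assumptions.

    import numpy as np

    def euclidean_similarity(a, b):
        """Similarity from note 7: 1 / (eps + d), where d is the Euclidean
        distance between the two vectors and eps is the machine epsilon of
        double-precision floats (illustrative sketch, not the paper's code)."""
        d = np.linalg.norm(np.asarray(a, dtype=float) - np.asarray(b, dtype=float))
        eps = np.finfo(float).eps
        return 1.0 / (eps + d)

    # Example: compare two feature-importance vectors for the same instance.
    explanation_a = np.array([0.40, -0.10, 0.25])
    explanation_b = np.array([0.38, -0.12, 0.30])
    print(euclidean_similarity(explanation_a, explanation_b))

Adding the machine epsilon keeps the similarity finite: identical vectors map to 1/eps rather than to infinity.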

References

  1. Adebayo, J., Gilmer, J., Muelly, M., Goodfellow, I., Hardt, M., Kim, B.: Sanity checks for saliency maps. In: Advances in Neural Information Processing Systems, vol. 31 (2018)
  2. Agarwal, C., et al.: Rethinking stability for attribution-based explanations. arXiv preprint arXiv:2203.06877 (2022)
  3. Agarwal, C., et al.: OpenXAI: towards a transparent evaluation of model explanations. In: Advances in Neural Information Processing Systems, vol. 35, pp. 15784–15799 (2022)
  4. Alvarez-Melis, D., Jaakkola, T.S.: On the robustness of interpretability methods. arXiv preprint arXiv:1806.08049 (2018)
  5. Arnold, T., Kasenberg, D.: Value alignment or misalignment - what will keep systems accountable? In: AAAI Workshop on AI, Ethics, and Society (2017)
  6. Chen, H., Janizek, J.D., Lundberg, S., Lee, S.-I.: True to the model or true to the data? arXiv preprint arXiv:2006.16234 (2020)
  7. Chen, J., Song, L., Wainwright, M., Jordan, M.: Learning to explain: an information-theoretic perspective on model interpretation. In: International Conference on Machine Learning, pp. 883–892. PMLR (2018)
  8. Covert, I., Lundberg, S.M., Lee, S.-I.: Explaining by removing: a unified framework for model explanation. J. Mach. Learn. Res. 22(209), 1–90 (2021)
  9. Craven, M., Shavlik, J.: Extracting tree-structured representations of trained networks. In: Advances in Neural Information Processing Systems, vol. 8 (1995)
  10. Craven, M.W., Shavlik, J.W.: Using sampling and queries to extract rules from trained neural networks. In: Machine Learning Proceedings 1994, pp. 37–45. Elsevier (1994)
  11. Doshi-Velez, F., Kim, B.: Towards a rigorous science of interpretable machine learning. arXiv preprint arXiv:1702.08608 (2017)
  12. Fong, R.C., Vedaldi, A.: Interpretable explanations of black boxes by meaningful perturbation. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 3429–3437 (2017)
  13. Freitas, A.A.: Comprehensible classification models: a position paper. ACM SIGKDD Explor. Newsl. 15(1), 1–10 (2014)
  14. Geirhos, R., Zimmermann, R.S., Bilodeau, B., Brendel, W., Kim, B.: Don’t trust your eyes: on the (un)reliability of feature visualizations (2023)
  15. Ghorbani, A., Abid, A., Zou, J.: Interpretation of neural networks is fragile. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 3681–3688 (2019)
  16. Guidotti, R.: Evaluating local explanation methods on ground truth. Artif. Intell. 291, 103428 (2021)
  17. Guidotti, R., Monreale, A., Pedreschi, D., Giannotti, F.: Principles of explainable artificial intelligence. In: Sayed-Mouchaweh, M. (ed.) Explainable AI Within the Digital Transformation and Cyber Physical Systems, pp. 9–31. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-76409-8_2
  18. Hancox-Li, L.: Robustness in machine learning explanations: does it matter? In: Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, pp. 640–647 (2020)
  19. Hedström, A., et al.: Quantus: an explainable AI toolkit for responsible evaluation of neural network explanations and beyond. J. Mach. Learn. Res. 24(34), 1–11 (2023)
  20. Hooker, G., Mentch, L., Zhou, S.: Unrestricted permutation forces extrapolation: variable importance requires at least one more model, or there is no free variable importance. Stat. Comput. 31(6), 1–16 (2021)
  21. Hooker, S., Erhan, D., Kindermans, P.-J., Kim, B.: A benchmark for interpretability methods in deep neural networks. In: Advances in Neural Information Processing Systems, vol. 32 (2019)
  22. Hsieh, C.-Y., et al.: Evaluations and methods for explanation through robustness analysis (2021)
  23. Jiang, L., Zhou, Z., Leung, T., Li, L.-J., Fei-Fei, L.: MentorNet: learning data-driven curriculum for very deep neural networks on corrupted labels. In: International Conference on Machine Learning, pp. 2304–2313. PMLR (2018)
  24. Kim, B., Khanna, R., Koyejo, O.O.: Examples are not enough, learn to criticize! Criticism for interpretability. In: Advances in Neural Information Processing Systems, vol. 29 (2016)
  25. Krishna, S., et al.: The disagreement problem in explainable machine learning: a practitioner’s perspective. arXiv preprint arXiv:2202.01602 (2022)
  26. Leavitt, M.L., Morcos, A.: Towards falsifiable interpretability research. arXiv preprint arXiv:2010.12016 (2020)
  27. Liu, Y., Khandagale, S., White, C., Neiswanger, W.: Synthetic benchmarks for scientific research in explainable machine learning. arXiv preprint arXiv:2106.12543 (2021)
  28. Lundberg, S.M., Lee, S.-I.: A unified approach to interpreting model predictions. In: Advances in Neural Information Processing Systems, vol. 30 (2017)
  29. Lundberg, S.M., et al.: From local explanations to global understanding with explainable AI for trees. Nat. Mach. Intell. 2(1), 56–67 (2020)
  30. Molnar, C., Casalicchio, G., Bischl, B.: Interpretable machine learning – a brief history, state-of-the-art and challenges. In: Koprinska, I., et al. (eds.) ECML PKDD 2020. CCIS, vol. 1323, pp. 417–431. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-65965-3_28
  31. Molnar, C., et al.: General pitfalls of model-agnostic interpretation methods for machine learning models. In: Holzinger, A., Goebel, R., Fong, R., Moon, T., Müller, K.R., Samek, W. (eds.) xxAI 2020. LNCS, vol. 13200, pp. 39–68. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-04083-2_4
  32. Montavon, G., Samek, W., Müller, K.-R.: Methods for interpreting and understanding deep neural networks. Digit. Sig. Process. 73, 1–15 (2018)
  33. Nauta, M., et al.: From anecdotal evidence to quantitative evaluation methods: a systematic review on evaluating explainable AI. ACM Comput. Surv. 55(13s), 1–42 (2023)
  34. Nguyen, D.: Comparing automatic and human evaluation of local explanations for text classification. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 1069–1078 (2018)
  35. Poursabzi-Sangdeh, F., Goldstein, D.G., Hofman, J.M., Vaughan, J.W.W., Wallach, H.: Manipulating and measuring model interpretability. In: Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, pp. 1–52 (2021)
  36. Rahnama, A.H.A., Boström, H.: A study of data and label shift in the LIME framework. arXiv preprint arXiv:1910.14421 (2019)
  37. Rahnama, A.H.A., Bütepage, J., Geurts, P., Boström, H.: Can local explanation techniques explain linear additive models? Data Min. Knowl. Discov., 1–44 (2023)
  38. Ribeiro, M.T., Singh, S., Guestrin, C.: Model-agnostic interpretability of machine learning. arXiv preprint arXiv:1606.05386 (2016)
  39. Ribeiro, M.T., Singh, S., Guestrin, C.: “Why should I trust you?” Explaining the predictions of any classifier. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1135–1144 (2016)
  40. Rudin, C.: Please stop explaining black box models for high stakes decisions. Stat 1050, 26 (2018)
  41. Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., Batra, D.: Grad-CAM: visual explanations from deep networks via gradient-based localization. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 618–626 (2017)
  42. Strumbelj, E., Kononenko, I.: An efficient explanation of individual classifications using game theory. J. Mach. Learn. Res. 11, 1–18 (2010)
  43. Sturmfels, P., Lundberg, S., Lee, S.-I.: Visualizing the impact of feature attribution baselines. Distill 5(1), e22 (2020)
  44. Sundararajan, M., Taly, A., Yan, Q.: Axiomatic attribution for deep networks. In: International Conference on Machine Learning, pp. 3319–3328. PMLR (2017)
  45. Yeh, C.-K., Hsieh, C.-Y., Suggala, A., Inouye, D.I., Ravikumar, P.K.: On the (in)fidelity and sensitivity of explanations. In: Advances in Neural Information Processing Systems, vol. 32 (2019)
  46. Zhang, C., Bengio, S., Hardt, M., Recht, B., Vinyals, O.: Understanding deep learning (still) requires rethinking generalization. Commun. ACM 64(3), 107–115 (2021)
  47. Zhou, J., Gandomi, A.H., Chen, F., Holzinger, A.: Evaluating the quality of machine learning explanations: a survey on methods and metrics. Electronics 10(5), 593 (2021)

Author information

Corresponding author: Amir Hossein Akhavan Rahnama.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Akhavan Rahnama, A.H. (2024). The Blame Problem in Evaluating Local Explanations and How to Tackle It. In: Nowaczyk, S., et al. Artificial Intelligence. ECAI 2023 International Workshops. ECAI 2023. Communications in Computer and Information Science, vol 1947. Springer, Cham. https://doi.org/10.1007/978-3-031-50396-2_4

  • DOI: https://doi.org/10.1007/978-3-031-50396-2_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-50395-5

  • Online ISBN: 978-3-031-50396-2

  • eBook Packages: Computer Science, Computer Science (R0)
