Abstract
The number of proposed local model-agnostic explanation techniques has grown rapidly in recent years. One main reason is that the bar for developing new explainability techniques is low due to the lack of optimal evaluation measures. Without rigorous measures, it is hard to obtain concrete evidence of whether new explanation techniques significantly outperform their predecessors. Our study proposes a new taxonomy for evaluating local explanations: robustness, evaluation using ground truth from synthetic datasets and interpretable models, model randomization, and human-grounded evaluation. Using this taxonomy, we highlight that all categories of evaluation methods, except those based on ground truth from interpretable models, suffer from a problem we call the “blame problem.” We argue that evaluation based on ground truth from interpretable models is a more reasonable method for evaluating local model-agnostic explanations, although we show that even this category of evaluation measures has further limitations. The evaluation of local explanations remains an open research problem.
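To make the advocated evaluation category concrete, here is a minimal sketch (our own illustration, not the paper's implementation): it derives local ground-truth importance scores from an interpretable model (a logistic regression, using coefficient times feature value, one common choice), computes a toy occlusion-style local explanation for the same instance, and measures their agreement with Spearman rank correlation. The explainer, the ground-truth definition, and all names here are assumptions made for illustration.

```python
# Minimal sketch: evaluating a local explanation against ground truth
# derived from an interpretable (logistic regression) model.
# The explainer below is an illustrative occlusion baseline, not any
# specific published technique.
import numpy as np
from scipy.stats import spearmanr
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=8, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)

x = X[0]  # instance to explain

# Local ground truth for a linear additive model: coefficient * feature value
# (one common definition; others exist).
ground_truth = model.coef_[0] * x

# Toy model-agnostic explanation: change in predicted probability when a
# feature is replaced by its training mean (occlusion).
baseline = X.mean(axis=0)
p = model.predict_proba(x.reshape(1, -1))[0, 1]
explanation = np.empty_like(x)
for j in range(len(x)):
    x_perturbed = x.copy()
    x_perturbed[j] = baseline[j]
    explanation[j] = p - model.predict_proba(x_perturbed.reshape(1, -1))[0, 1]

# Agreement with the ground truth, e.g., via rank correlation.
rho, _ = spearmanr(ground_truth, explanation)
print(f"Spearman rank correlation with ground truth: {rho:.3f}")
```

The key design choice is that the ground truth is read directly off the interpretable model, so disagreement can be attributed to the explanation technique rather than to the model being explained.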
Notes
1. For brevity, we refer to them as local explanations in our study.
2. See Sect. 3 for a formal definition of these techniques.
3. For brevity, we refer to local ground truth importance scores as ground truth. Note that these ground truth vectors differ from the common ground truth in machine learning, which consists of the discrete class labels of the data points.
4.
5. For brevity, we might refer to local model-agnostic explanations as local explanations in our study.
6. See Table 2 of the study and the scale of the values that explanations show for robustness.
7. In this example, the Euclidean similarity is defined as \(1 / (\epsilon + d)\), where \(d\) is the Euclidean distance and \(\epsilon\) is the machine epsilon of Python (see the sketch below).
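As a concrete reading of the definition in note 7, the following minimal sketch (our own illustration; the explanation vectors are made up) computes this Euclidean similarity between two explanation vectors using Python's machine epsilon:

```python
# Minimal sketch of the Euclidean similarity from note 7:
# sim(a, b) = 1 / (eps + d), where d is the Euclidean distance
# and eps is the machine epsilon of Python floats.
import sys
import numpy as np

def euclidean_similarity(a: np.ndarray, b: np.ndarray) -> float:
    d = float(np.linalg.norm(a - b))
    return 1.0 / (sys.float_info.epsilon + d)

# Example: similarity between two (made-up) explanation vectors.
e1 = np.array([0.4, -0.1, 0.3])
e2 = np.array([0.35, -0.05, 0.25])
print(euclidean_similarity(e1, e2))  # large values indicate near-identical explanations
```

Adding the machine epsilon in the denominator keeps the similarity finite when the two explanations are identical (d = 0).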