Investigating the impact of calibration on the quality of explanations

  • Open Access
  • Published: 13 March 2023

  • Helena Löfström 1,2 (ORCID: orcid.org/0000-0001-9633-0423),
  • Tuwe Löfström 3,
  • Ulf Johansson 3 &
  • Cecilia Sönströd 3

Annals of Mathematics and Artificial Intelligence (2023)


Abstract

Predictive models used in Decision Support Systems (DSS) are often requested to explain their reasoning to users. Explanations of instances consist of two parts: the predicted label with an associated certainty, and a set of weights, one per feature, describing how each feature contributes to the prediction for the particular instance. In techniques like Local Interpretable Model-agnostic Explanations (LIME), the probability estimate from the underlying model is used as the measure of certainty; consequently, the feature weights represent how each feature contributes to that probability estimate. It is, however, well known that probability estimates from classifiers are often poorly calibrated, i.e., the probability estimates do not correspond to the actual probabilities of being correct. With this in mind, explanations from techniques like LIME risk becoming misleading, since the feature weights will only describe how each feature contributes to a possibly inaccurate probability estimate. This paper investigates the impact of calibrating predictive models before applying LIME. The study covers 25 benchmark data sets, using random forests and Extreme Gradient Boosting (xGBoost) as learners and Venn-Abers and Platt scaling as calibration methods. The results show that explanations of better calibrated models are themselves better calibrated: the expected calibration error (ECE) and log loss of the explanations after calibration conform more closely to the ECE and log loss of the model. The conclusion is that calibration improves both the models and the explanations by making them represent reality more accurately.
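A minimal sketch of the pipeline described above may help fix ideas. The snippet below is not the authors' implementation (their code is available in the GitHub repository listed under Data Availability); it assumes scikit-learn for the random forest and for Platt scaling, a synthetic binary data set, and ten probability bins for the expected calibration error (ECE), and it shows how a model can be calibrated before its probability estimates are handed to LIME.

```python
# Minimal sketch (not the authors' implementation): calibrate a random forest
# with Platt scaling before handing its probability estimates to LIME, and
# compare calibration quality via ECE and log loss. The synthetic data set,
# bin count and hyperparameters are illustrative assumptions.
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split


def expected_calibration_error(y_true, p_pos, n_bins=10):
    """ECE over the positive-class probability: per-bin gap between mean
    predicted probability and observed frequency, weighted by bin size."""
    bin_ids = np.minimum((p_pos * n_bins).astype(int), n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        in_bin = bin_ids == b
        if in_bin.any():
            ece += in_bin.mean() * abs(p_pos[in_bin].mean() - y_true[in_bin].mean())
    return ece


X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Uncalibrated learner vs. a Platt-scaled (sigmoid) wrapper fitted with cross-validation.
raw = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
platt = CalibratedClassifierCV(
    RandomForestClassifier(n_estimators=100, random_state=0), method="sigmoid", cv=5
).fit(X_train, y_train)

for name, model in [("uncalibrated", raw), ("Platt-scaled", platt)]:
    p = model.predict_proba(X_test)[:, 1]
    print(f"{name}: ECE = {expected_calibration_error(y_test, p):.3f}, "
          f"log loss = {log_loss(y_test, p):.3f}")

# LIME then explains the calibrated probability estimate simply by receiving the
# wrapper's predict_proba instead of the raw model's (requires the `lime` package):
# from lime.lime_tabular import LimeTabularExplainer
# explainer = LimeTabularExplainer(X_train, mode="classification")
# explanation = explainer.explain_instance(X_test[0], platt.predict_proba, num_features=5)
```

Venn-Abers calibration, the second method studied, would replace only the calibration step; the explanation step is unchanged, since LIME simply receives the calibrated model's probability estimates in place of the raw model's.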


Data Availability

All 25 data sets used (see Table 1) are binary classification problems that are publicly available from either the UCI repository [31] or the PROMISE Software Engineering Repository [32]. A GitHub repository, https://github.com/tuvelofstrom/calibrating-explanations, has been prepared with the data sets and code.

References

  1. High-Level Expert Group on AI: Ethics Guidelines for Trustworthy AI. Report, European Commission, Brussels (2019)

  2. Muhlbacher, T., Piringer, H., Gratzl, S., Sedlmair, M., Streit, M.: Opening the black box: Strategies for increased user involvement in existing algorithm implementations. IEEE Trans. Vis. Comput. Graph. 20(12), 1643–1652 (2014)

  3. Freitas, A.A.: Comprehensible Classification Models—a position paper. SigKDD Explor. 15(1), 1–10 (2014)

  4. Rudin, C.: Algorithms for interpretable machine learning. In: Proc. of the 20th ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining, pp 1519–1519 (2014)

  5. Ribeiro, M.T., Singh, S., Guestrin, C.: “Why Should I Trust You?”: Explaining the predictions of any classifier. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. KDD ’16, pp 1135–1144. Association for Computing Machinery (2016). https://doi.org/10.1145/2939672.2939778

  6. Liu, M., Shi, J., Li, Z., Li, C., Zhu, J., Liu, S.: Towards better analysis of deep convolutional neural networks. IEEE Trans. Vis. Comput. Graph. 23(1), 91–100 (2017)

  7. Lim, B.Y., Dey, A.K., Avrahami, D.: Why and why not explanations improve the intelligibility of context-aware intelligent systems. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp 2119–2128 (2009)

  8. Gunning, D.: Explainable Artificial Intelligence. Web. DARPA. https://www.darpa.mil/attachments/XAIProgramUpdate.pdf. Accessed 29 Aug 2019 (2017)

  9. Guidotti, R., Monreale, A., Ruggieri, S., Turini, F., Giannotti, F., Pedreschi, D.: A survey of methods for explaining black box models. ACM Comput. Surv. (CSUR) 51(5), 1–42 (2018)

  10. Lundberg, S.M., Lee, S.-I.: A unified approach to interpreting model predictions. In: Proceedings of the 31st International Conference on Neural Information Processing Systems, pp 4768–4777 (2017)

  11. Vovk, V., Gammerman, A., Shafer, G.: Algorithmic Learning in a Random World. Springer, Berlin (2005)

  12. Platt, J., et al.: Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. Adv. Large Margin Classifiers 10(3), 61–74 (1999)

  13. Zadrozny, B., Elkan, C.: Obtaining calibrated probability estimates from decision trees and naive Bayesian classifiers. In: Proc. 18th International Conference on Machine Learning, pp 609–616 (2001)

  14. Vovk, V., Shafer, G., Nouretdinov, I.: Self-calibrating probability forecasting. In: Advances in Neural Information Processing Systems, pp 1133–1140 (2004)

  15. Lambrou, A., Nouretdinov, I., Papadopoulos, H.: Inductive Venn prediction. Ann. Math. Artif. Intell. 74(1), 181–201 (2015)

  16. Vovk, V., Petej, I.: Venn-Abers predictors. arXiv:1211.0025 (2012)

  17. Molnar, C.: Interpretable Machine Learning, 2nd edn. Accessed 1 February 2023. https://christophm.github.io/interpretable-ml-book (2022)

  18. Lipton, Z.C.: The mythos of model interpretability: In machine learning, the concept of interpretability is both important and slippery. Queue 16(3), 31–57 (2018)

  19. Dhurandhar, A., Shanmugam, K., Luss, R., Olsen, P.: Improving simple models with confidence profiles. arXiv:1807.07506 (2018)

  20. Rahnama, A.H.A., Bütepage, J., Geurts, P., Boström, H.: Evaluation of local model-agnostic explanations using ground truth. arXiv:2106.02488 (2021)

  21. Rahnama, A.H.A., Boström, H.: A study of data and label shift in the LIME framework. arXiv:1910.14421 (2019)

  22. Recio-García, J.A., Díaz-Agudo, B., Pino-Castilla, V.: CBR-LIME: A case-based reasoning approach to provide specific local interpretable model-agnostic explanations. In: International Conference on Case-Based Reasoning, pp 179–194. Springer (2020)

  23. Yeh, C.-K., Hsieh, C.-Y., Suggala, A., Inouye, D.I., Ravikumar, P.K.: On the (in) fidelity and sensitivity of explanations. Adv. Neural Inf. Process. Syst. 32, 10967–10978 (2019)

  24. Arya, V., Bellamy, R.K., Chen, P.-Y., Dhurandhar, A., Hind, M., Hoffman, S.C., Houde, S., Liao, Q.V., Luss, R., Mojsilović, A., et al.: One explanation does not fit all: A toolkit and taxonomy of AI explainability techniques. arXiv:1909.03012 (2019)

  25. Johansson, U., Linusson, H., Löfström, T., Boström, H.: Interpretable regression trees using conformal prediction. Expert Syst. Appl. 97, 394–404 (2018)

  26. Ribeiro, M.T., Singh, S., Guestrin, C.: “Why Should I Trust You?”: Explaining the predictions of any classifier. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp 1135–1144 (2016)

  27. Mehrtash, A., Wells, W.M., Tempany, C.M., Abolmaesumi, P., Kapur, T.: Confidence calibration and predictive uncertainty estimation for deep medical image segmentation. IEEE Trans. Med. Imaging 39(12), 3868–3878 (2020)

  28. Johansson, U., Sönströd, C., Löfström, T., Boström, H.: Customized interpretable conformal regressors. In: 2019 IEEE International Conference on Data Science and Advanced Analytics (DSAA), pp 221–230. IEEE (2019)

  29. Johansson, U., Löfström, T.: Well-calibrated and specialized probability estimation trees. In: Proceedings of the 2020 SIAM International Conference on Data Mining, pp 415–423. SIAM (2020)

  30. Python Software Foundation: Python. Accessed 5 Jan 2023. https://www.python.org/downloads/release/python-397/

  31. Dua, D., Graff, C.: UCI Machine Learning Repository. Accessed 1 October 2022. http://archive.ics.uci.edu/ml (2017)

  32. Sayyad Shirabad, J., Menzies, T.J.: PROMISE Repository of Software Engineering Databases. School of Information Technology and Engineering, University of Ottawa, Canada (2005)

Funding

Open access funding provided by University of Borås. This research is partly funded by the Swedish Knowledge Foundation through the Industrial Research School INSiDR. The authors also acknowledge the Knowledge Foundation, Jönköping University, and the industrial partners for financially supporting the research and education environment on Knowledge Intensive Product Realization SPARK at Jönköping University, Sweden. Project: AFAIR with agreement number 20200223.

Author information

Authors and Affiliations

  1. Department of Information Technology, University of Borås, Allégatan 1, Borås, 50190, Sweden

    Helena Löfström

  2. Jönköping International Business School, Jönköping University, Gjuterigatan 5, Jönköping, 55111, Sweden

    Helena Löfström

  3. Department of Computing, Jönköping University, Gjuterigatan 5, Jönköping, 55111, Sweden

    Tuwe Löfström, Ulf Johansson & Cecilia Sönströd

Authors
  1. Helena Löfström
  2. Tuwe Löfström
  3. Ulf Johansson
  4. Cecilia Sönströd

Corresponding author

Correspondence to Helena Löfström.

Ethics declarations

Conflict of Interests

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Löfström, H., Löfström, T., Johansson, U. et al. Investigating the impact of calibration on the quality of explanations. Ann Math Artif Intell (2023). https://doi.org/10.1007/s10472-023-09837-2

  • Accepted: 31 January 2023

  • Published: 13 March 2023

  • DOI: https://doi.org/10.1007/s10472-023-09837-2

Keywords

  • Predicting with confidence
  • Calibration
  • Explainable artificial intelligence
  • Decision support systems
  • Venn-Abers
  • Uncertainty in explanations

Mathematics Subject Classification (2010)

  • 68Q87