Explainable Artificial Intelligence (XAI): Motivation, Terminology, and Taxonomy

Chapter in: Machine Learning for Data Science Handbook

Abstract

Deep learning algorithms and deep neural networks (DNNs) have become extremely popular due to their high accuracy in complex tasks such as image and text classification, speech understanding, document segmentation, credit scoring, and facial recognition. Because of their highly nonlinear structure, these networks are hard to interpret: it is not clear how the models reach their conclusions, and they are therefore often considered black-box models. This lack of transparency is a major drawback despite their effectiveness. In addition, recent regulations such as the General Data Protection Regulation (GDPR) require that, in many cases, an explanation be provided whenever a learning model may affect a person’s life. For example, in autonomous vehicle applications, methods for visualizing, explaining, and interpreting the deep learning models that analyze driver behavior and the road environment have become standard. Explainable artificial intelligence (XAI), or interpretable machine learning (IML), programs aim to develop a suite of methods and techniques that produce more explainable models while maintaining a high level of predictive accuracy [1–4]. These programs enable human users to better understand, trust, and manage the emerging generation of artificially intelligent systems [4].
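
To make the idea of a post-hoc explanation concrete, the sketch below (an illustration added here, not an excerpt from the chapter) asks a model-agnostic question of a black-box classifier: how much does held-out accuracy drop when each feature is shuffled? This is in the spirit of the permutation-based importance measures cited in the references (e.g., [29, 38]); the dataset, model, and scikit-learn utilities are assumptions chosen only for the example.

```python
# Illustrative sketch (assumes scikit-learn is available); not the chapter's own method.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Load a small tabular dataset and train an opaque ("black-box") model.
X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

# Model-agnostic, post-hoc explanation: permute each feature on held-out data
# and measure the drop in accuracy; large drops mark features the model relies on.
result = permutation_importance(model, X_test, y_test, n_repeats=20, random_state=0)

# Report the five most influential features.
ranked = sorted(zip(X.columns, result.importances_mean), key=lambda t: -t[1])
for name, score in ranked[:5]:
    print(f"{name}: {score:.3f}")
```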

References

  1. E. Tjoa and C. Guan, “A survey on explainable artificial intelligence (XAI): Towards medical XAI,” 2019. [Online] https://arxiv.org/pdf/1907.07374.pdf

  2. S. Chakraborty et al., “Interpretability of deep learning models: A survey of results,” 2017. [Online] https://orca.cf.ac.uk/101500/1/Interpretability%20of%20Deep%20Learning%20Models%20-%20A%20Survey%20of%20Results.pdf

  3. A. Adadi and M. Berrada, “Peeking inside the black-box: A survey on explainable artificial intelligence (XAI),” 2018. [Online] https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8466590

  4. F. K. Došilović, M. Brčić, and N. Hlupić, “Explainable artificial intelligence: A survey,” 2018. [Online] https://www.researchgate.net/publication/325398586_Explainable_Artificial_Intelligence_A_Survey

  5. J. M. Schoenborn and K.-D. Althoff, “Recent trends in XAI: A broad overview on current approaches, methodologies and interactions,” 2019. [Online] http://gaia.fdi.ucm.es/events/xcbr/papers/XCBR-19_paper_1.pdf

  6. B. Letham, C. Rudin, T. H. McCormick, and D. Madigan, “Interpretable classifiers using rules and Bayesian analysis: Building a better stroke prediction model,” Ann. Appl. Statist., vol. 9, no. 3, pp. 1350–1371, 2015. [Online] https://arxiv.org/pdf/1511.01644.pdf

  7. R. Caruana, Y. Lou, J. Gehrke, P. Koch, M. Sturm, and N. Elhadad, “Intelligible models for healthcare: Predicting pneumonia risk and hospital 30-day readmission,” in Proc. 21st ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, 2015, pp. 1721–1730. [Online] http://people.dbmi.columbia.edu/noemie/papers/15kdd.pdf

  8. K. Xu et al., “Show, attend and tell: Neural image caption generation with visual attention,” in Proc. Int. Conf. Mach. Learn. (ICML), 2015, pp. 1–10. [Online] https://arxiv.org/pdf/1502.03044.pdf

  9. Z. C. Lipton, “The mythos of model interpretability,” in Proc. ICML Workshop Hum. Interpretability Mach. Learn., 2016, pp. 96–100. [Online] https://arxiv.org/pdf/1606.03490.pdf

  10. C. Yang, A. Rangarajan, and S. Ranka. (2018). “Global model interpretation via recursive partitioning.” [Online] https://arxiv.org/pdf/1802.04253.pdf

  11. M. A. Valenzuela-Escárcega, A. Nagesh, and M. Surdeanu. (2018). “Lightly-supervised representation learning with global interpretability.” [Online] https://arxiv.org/pdf/1805.11545.pdf

  12. A. Nguyen, A. Dosovitskiy, J. Yosinski, T. Brox, and J. Clune, “Synthesizing the preferred inputs for neurons in neural networks via deep generator networks,” in Proc. Adv. Neural Inf. Process. Syst. (NIPS), 2016, pp. 3387–3395. [Online] https://arxiv.org/pdf/1605.09304.pdf

  13. M. T. Ribeiro, S. Singh, and C. Guestrin, “‘Why should I trust you?’: Explaining the predictions of any classifier,” in Proc. 22nd ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, 2016, pp. 1135–1144. [Online] https://www.kdd.org/kdd2016/papers/files/rfp0573-ribeiroA.pdf

  14. M. T. Ribeiro, S. Singh, and C. Guestrin, “Anchors: High-precision model-agnostic explanations,” in Proc. AAAI Conf. Artif. Intell., 2018, pp. 1–9. [Online] https://homes.cs.washington.edu/~marcotcr/aaai18.pdf

  15. K. Simonyan, A. Vedaldi, and A. Zisserman. (2013). “Deep inside convolutional networks: Visualising image classification models and saliency maps.” [Online] https://arxiv.org/pdf/1312.6034.pdf

  16. M. D. Zeiler and R. Fergus, “Visualizing and understanding convolutional networks,” in Proc. Eur. Conf. Comput. Vis. Zurich, Switzerland: Springer, 2014, pp. 818–833. [Online] https://cs.nyu.edu/~fergus/papers/zeilerECCV2014.pdf

  17. B. Zhou, A. Khosla, A. Lapedriza, A. Oliva, and A. Torralba, “Learning deep features for discriminative localization,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., June 2016, pp. 2921–2929. [Online] https://arxiv.org/abs/1512.04150

  18. M. Sundararajan, A. Taly, and Q. Yan. (2017). “Axiomatic attribution for deep networks.” [Online] https://arxiv.org/pdf/1703.01365.pdf

  19. D. Smilkov, N. Thorat, B. Kim, F. Viégas, and M. Wattenberg. (2017). “SmoothGrad: Removing noise by adding noise.” [Online] https://arxiv.org/pdf/1706.03825.pdf

  20. S. M. Lundberg and S. I. Lee, “A unified approach to interpreting model predictions,” in Proc. Adv. Neural Inf. Process. Syst., 2017, pp. 4768–4777. [Online] https://arxiv.org/pdf/1705.07874.pdf

  21. R. Guidotti, A. Monreale, S. Ruggieri, D. Pedreschi, F. Turini, and F. Giannotti. (2018). “Local rule-based explanations of black box decision systems.” [Online] https://arxiv.org/pdf/1805.10820.pdf

  22. D. Linsley, D. Scheibler, S. Eberhardt, and T. Serre. (2018). “Global-and-local attention networks for visual recognition.” [Online] https://arxiv.org/pdf/1805.08819.pdf

  23. O. Bastani, C. Kim, and H. Bastani. (2017). “Interpretability via model extraction.” [Online] https://arxiv.org/pdf/1706.09773.pdf

  24. J. J. Thiagarajan, B. Kailkhura, P. Sattigeri, and K. N. Ramamurthy. (2016). “TreeView: Peeking into deep neural networks via feature-space partitioning.” [Online] https://arxiv.org/pdf/1611.07429.pdf

  25. D. P. Green and H. L. Kern, “Modeling heterogeneous treatment effects in large-scale experiments using Bayesian additive regression trees,” in Proc. Annu. Summer Meeting Soc. Political Methodol., 2010, pp. 1–40. [Online] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.190.3826&rep=rep1&type=pdf

  26. J. Elith, J. Leathwick, and T. Hastie, “A working guide to boosted regression trees,” J. Animal Ecol., vol. 77, no. 4, pp. 802–813, 2008. [Online] https://besjournals.onlinelibrary.wiley.com/doi/full/10.1111/j.1365-2656.2008.01390.x

  27. S. H. Welling, H. H. F. Refsgaard, P. B. Brockhoff, and L. H. Clemmensen. (2016). “Forest floor visualizations of random forests.” [Online] https://arxiv.org/pdf/1605.09196.pdf

  28. A. Goldstein, A. Kapelner, J. Bleich, and E. Pitkin, “Peeking inside the black box: Visualizing statistical learning with plots of individual conditional expectation,” J. Comput. Graph. Statist., vol. 24, no. 1, pp. 44–65, 2015. [Online] https://www.tandfonline.com/doi/abs/10.1080/10618600.2014.907095

  29. G. Casalicchio, C. Molnar, and B. Bischl. (2018). “Visualizing the feature importance for black box models.” [Online] https://arxiv.org/pdf/1804.06620.pdf

  30. U. Johansson, R. König, and L. Niklasson, “The truth is in there—Rule extraction from opaque models using genetic programming,” in Proc. FLAIRS Conf., 2004, pp. 658–663. [Online] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.470.4124&rep=rep1&type=pdf

  31. T. Hailesilassie. (2017). “Rule extraction algorithm for deep neural networks: A review.” [Online] https://arxiv.org/abs/1610.05267

  32. P. Sadowski, J. Collado, D. Whiteson, and P. Baldi, “Deep learning, dark knowledge, and dark matter,” in Proc. NIPS Workshop High-Energy Phys. Mach. Learn. (PMLR), vol. 42, 2015, pp. 81–87. [Online] http://proceedings.mlr.press/v42/sado14.pdf

  33. S. Tan, R. Caruana, G. Hooker, and Y. Lou. (2018). “Detecting bias in black-box models using transparent model distillation.” [Online] https://arxiv.org/abs/1710.06169

  34. Z. Che, S. Purushotham, R. Khemani, and Y. Liu. (2015). “Distilling knowledge from deep networks with applications to healthcare domain.” [Online] https://arxiv.org/abs/1512.03542

  35. Y. Zhang and B. Wallace. (2016). “A sensitivity analysis of (and practitioners’ guide to) convolutional neural networks for sentence classification.” [Online] https://arxiv.org/abs/1510.03820

  36. P. Cortez and M. J. Embrechts, “Using sensitivity analysis and visualization techniques to open black box data mining models,” Inf. Sci., vol. 225, pp. 1–17, Mar. 2013. [Online] https://core.ac.uk/download/pdf/55616214.pdf

  37. S. Bach, A. Binder, G. Montavon, F. Klauschen, K.-R. Müller, and W. Samek, “On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation,” PLoS ONE, vol. 10, no. 7, p. e0130140, 2015. [Online] https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0130140

  38. A. Fisher, C. Rudin, and F. Dominici. (2018). “Model class reliance: Variable importance measures for any machine learning model class, from the ‘rashomon’ perspective.” [Online] https://arxiv.org/abs/1801.01489

  39. B. Kim, R. Khanna, and O. O. Koyejo, “Examples are not enough, learn to criticize! Criticism for interpretability,” in Proc. 29th Conf. Neural Inf. Process. Syst. (NIPS), 2016, pp. 2280–2288. [Online] https://papers.nips.cc/paper/6300-examples-are-not-enough-learn-to-criticize-criticism-for-interpretability.pdf

  40. X. Yuan, P. He, Q. Zhu, and X. Li. (2017). “Adversarial examples: Attacks and defenses for deep learning.” [Online] https://arxiv.org/abs/1712.07107

Author information

Correspondence to Irad Ben-Gal.

Copyright information

© 2023 Springer Nature Switzerland AG

Cite this chapter

Notovich, A., Chalutz-Ben Gal, H., Ben-Gal, I. (2023). Explainable Artificial Intelligence (XAI): Motivation, Terminology, and Taxonomy. In: Rokach, L., Maimon, O., Shmueli, E. (eds) Machine Learning for Data Science Handbook. Springer, Cham. https://doi.org/10.1007/978-3-031-24628-9_41
