Abstract

The last decade has witnessed the rise of a black box society in which obscure classification models are adopted by Artificial Intelligence (AI) systems. The lack of explanations of how AI systems make their decisions is a key ethical issue for their adoption in socially sensitive and safety-critical contexts. Indeed, the problem lies not only in the lack of transparency but also in the possible biases that an AI system inherits from prejudices hidden in its training data. Thus, research in eXplainable AI (XAI) has recently attracted much attention. Since AI systems are employed in a wide variety of applications, different users require different types of explanations. We survey the existing proposals in the literature and discuss the principles of XAI. In addition, we illustrate the different types of explanations returned by established explainers. Finally, we discuss their usability and how they can be exploited in real-world applications.
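
As a concrete illustration of the kind of local explanation an established explainer can return, consider the minimal sketch below. It is not the chapter's own method: it assumes the third-party scikit-learn and lime Python packages and uses a standard benchmark dataset purely for demonstration, wrapping a trained black-box classifier with a model-agnostic explainer and querying it for a single prediction.

# Minimal sketch (assumes the third-party scikit-learn and lime packages).
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer

data = load_iris()
X, y = data.data, data.target

# The "black box" to be explained: any classifier exposing predict_proba would do.
black_box = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Model-agnostic local explainer built over the training data distribution.
explainer = LimeTabularExplainer(
    X,
    feature_names=data.feature_names,
    class_names=list(data.target_names),
    discretize_continuous=True,
)

# Explain a single prediction: the result is a list of (condition, weight) pairs
# estimating how much each feature pushes the predicted class probability.
explanation = explainer.explain_instance(X[0], black_box.predict_proba, num_features=4)
print(explanation.as_list())

Other explainers return different artifacts, for instance rules, counterfactual examples, or saliency maps for images, but the overall usage pattern of training a black box and then querying an explainer around individual predictions is similar.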

Acknowledgements

This work is partially supported by the European Community H2020 programme under the funding schemes: INFRAIA-01-2018-2019 Res. Infr. G.A. 871042 SoBigData++ (sobigdata.eu), G.A. 952026 Humane AI Net (humane-ai.eu), G.A. 825619 AI4EU (ai4eu.eu), G.A. 952215 TAILOR (tailor.eu), and the ERC-2018-ADG G.A. 834756 “XAI: Science and technology for the eXplanation of AI decision making” (xai.eu).

Author information

Corresponding author

Correspondence to Riccardo Guidotti.

Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Guidotti, R., Monreale, A., Pedreschi, D., Giannotti, F. (2021). Principles of Explainable Artificial Intelligence. In: Sayed-Mouchaweh, M. (eds) Explainable AI Within the Digital Transformation and Cyber Physical Systems. Springer, Cham. https://doi.org/10.1007/978-3-030-76409-8_2

  • DOI: https://doi.org/10.1007/978-3-030-76409-8_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-76408-1

  • Online ISBN: 978-3-030-76409-8

  • eBook Packages: Computer Science, Computer Science (R0)
