Abstract
The last decade has witnessed the rise of a black box society in which obscure classification models are adopted by Artificial Intelligence (AI) systems. The lack of explanations of how AI systems make decisions is a key ethical issue for their adoption in socially sensitive and safety-critical contexts. The problem lies not only in the lack of transparency but also in the possible biases that an AI system inherits from prejudices hidden in the training data. For these reasons, research in eXplainable AI (XAI) has recently attracted much attention. Since AI systems are employed in a wide variety of applications, different users require different types of explanations. We survey the existing proposals in the literature, discussing the principles of XAI. In addition, we illustrate the different types of explanations returned by established explainers. Finally, we discuss their usability and how they can be exploited in real-world applications.
Acknowledgements
This work is partially supported by the European Community H2020 programme under the funding schemes: INFRAIA-01-2018-2019 Res. Infr. G.A. 871042 SoBigData++ (sobigdata.eu), G.A. 952026 Humane AI Net (humane-ai.eu), G.A. 825619 AI4EU (ai4eu.eu), G.A. 952215 TAILOR (tailor.eu), and the ERC-2018-ADG G.A. 834756 “XAI: Science and technology for the eXplanation of AI decision making” (xai.eu).
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this chapter
Guidotti, R., Monreale, A., Pedreschi, D., Giannotti, F. (2021). Principles of Explainable Artificial Intelligence. In: Sayed-Mouchaweh, M. (eds) Explainable AI Within the Digital Transformation and Cyber Physical Systems. Springer, Cham. https://doi.org/10.1007/978-3-030-76409-8_2
Print ISBN: 978-3-030-76408-1
Online ISBN: 978-3-030-76409-8