Abstract
The last decade has witnessed the rise of a black box society in which obscure classification models are adopted by Artificial Intelligence (AI) systems. The lack of explanations of how AI systems make decisions is a key ethical issue for their adoption in socially sensitive and safety-critical contexts. The problem lies not only in the lack of transparency but also in the possible biases that an AI system inherits from prejudices hidden in the training data. For these reasons, research in eXplainable AI (XAI) has recently attracted much attention. Since AI systems are employed in a wide variety of applications, different users require different types of explanations. We survey the existing proposals in the literature, discussing the principles of XAI. In addition, we illustrate the different types of explanations returned by established explainers. Finally, we discuss their usability and how they can be exploited in real-world applications.
Acknowledgements
This work is partially supported by the European Community H2020 programme under the funding schemes: INFRAIA-01-2018-2019 Res. Infr. G.A. 871042 SoBigData++ (sobigdata.eu), G.A. 952026 Humane AI Net (humane-ai.eu), G.A. 825619 AI4EU (ai4eu.eu), G.A. 952215 TAILOR (tailor.eu), and the ERC-2018-ADG G.A. 834756 “XAI: Science and technology for the eXplanation of AI decision making” (xai.eu).
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this chapter
Guidotti, R., Monreale, A., Pedreschi, D., Giannotti, F. (2021). Principles of Explainable Artificial Intelligence. In: Sayed-Mouchaweh, M. (eds) Explainable AI Within the Digital Transformation and Cyber Physical Systems. Springer, Cham. https://doi.org/10.1007/978-3-030-76409-8_2
Print ISBN: 978-3-030-76408-1
Online ISBN: 978-3-030-76409-8