Abstract
The growing importance of Explainable Artificial Intelligence (XAI) has highlighted the need to understand the decision-making processes of black-box models. Surrogation, that is, emulating a black-box model (BB) with a white-box model (WB), is crucial in applications where BBs are unavailable due to security or practical concerns. Traditional fidelity measures evaluate only the similarity of the final predictions. This is a significant limitation: a WB may be considered faithful when it makes the same prediction as the BB but follows a completely different rationale. Addressing this limitation is crucial for developing practical Trustworthy AI applications beyond XAI. To this end, we introduce ShapGAP, a novel metric that assesses the faithfulness of surrogate models by comparing their reasoning paths, using SHAP explanations as a proxy. We validate the effectiveness of ShapGAP by applying it to real-world datasets from healthcare and finance domains, comparing its performance against traditional fidelity measures. Our results show that ShapGAP enables better understanding and trust in XAI systems, revealing the potential dangers of relying on models with high task accuracy but unfaithful explanations. ShapGAP serves as a valuable tool for identifying faithful surrogate models, paving the way for more reliable and Trustworthy AI applications.
Keywords
- Explainable Artificial Intelligence (XAI)
- Fidelity Measures
- Surrogate Models
- Interpretability
- Black-box
- White-box
- Faithfulness
- SHAP
1 Introduction
Explainable Artificial Intelligence (XAI) aims to provide insights into the decision-making processes of complex machine learning models, particularly black-box models, which are often difficult to interpret due to their inherent complexity [2]. As the adoption of machine learning models in critical applications (e.g., healthcare, finance, or business decision-making) continues to increase, understanding their decision-making rationale becomes essential for building trust, ensuring fairness, and making informed decisions based on the output of the model [12].
Surrogate models, which are interpretable white-box (WB) models trained to approximate the behavior of black-box (BB) models, have emerged as a popular approach for providing explanations in XAI [21]. These surrogate models are particularly relevant in use cases where BB models are unavailable due to security or practical concerns or when stakeholders require explanations to support their decision-making processes. For instance, in healthcare, a BB model may predict the likelihood of a patient having a specific disease based on various symptoms and test results. A surrogate model that provides the same prediction but with a different rationale could lead medical professionals to focus on the wrong symptoms or tests, resulting in incorrect treatment or mismanagement of the patient’s condition. In finance, a BB model might predict the probability of default for loan applicants based on their credit history, income, and other financial factors. A surrogate model that matches the BB model’s predictions but with different reasoning could lead to unfair lending decisions or increased financial risk for the lending institution.
However, evaluating the faithfulness of these surrogate models remains challenging. Traditional fidelity measures such as accuracy focus on the similarity of the final predictions between the BB and surrogate models. This can lead to a significant limitation, as a surrogate model might be considered faithful even if it provides the same prediction as the BB model but with a completely different rationale. In critical applications like the ones mentioned above, this can be dangerous, as decision-makers might act on unfaithful explanations, leading to suboptimal or even harmful outcomes [14].
In this paper, we address this limitation by introducing a novel metric called ShapGAP, which assesses the faithfulness of surrogate models by comparing their reasoning paths, using SHAP explanations as a proxy. ShapGAP measures the average L2 distance between the SHAP explanations of BB and WB models, providing a more comprehensive evaluation of surrogate model faithfulness that goes beyond the similarity of final predictions. The main contributions are:
- We propose ShapGAP, a novel metric for evaluating surrogate model faithfulness that considers the reasoning paths of the models by comparing their SHAP explanations.
- We demonstrate the effectiveness of ShapGAP through experiments with real-world datasets, comparing it against traditional fidelity measures.
- We highlight the potential dangers and ethical concerns of relying on unfaithful explanations in critical applications, drawing on philosophical arguments for truthfulness and ethical AI.
By introducing ShapGAP, we aim to contribute to the ongoing research towards Trustworthy AI by providing a more effective method for evaluating surrogate model faithfulness that captures the essence of reasoning paths, enabling better understanding, trust, and ethical considerations in AI systems.
The rest of the manuscript is organized as follows. Section 2 introduces related work. Section 3 describes ShapGAP. Section 4 presents details about the experiments (i.e., experimental setting, datasets and models). Section 5 discusses the reported results. Section 6 examines some ethical concerns in depth. Finally, Sect. 7 concludes the paper with final remarks and future work.
2 Related Works
The research field of XAI encompasses various approaches for extracting secondary models from primary models. Model distillation, for instance, often refers to the transfer of knowledge from a larger model (teacher) to a smaller one (student) with the aim of optimizing space and speed, though not necessarily interpretability [4, 13]. In our context, however, we focus on distillation that results in a secondary model with greater interpretability than the primary one while maintaining most of its core characteristics in terms of both behaviour and performance.
In this domain, two main threads can be identified: local surrogates and global surrogates. Local surrogates ensure fidelity within a suitably defined neighborhood [17, 21], resulting in multiple local models that collectively describe the global behavior. Conversely, global surrogates attempt to build a more interpretable model across the entire data domain, providing a bird’s-eye view of the problem. Several methods have been proposed for distilling global surrogates:
- Pedagogical approach [6, 8]: This method trains the surrogate on queries to the primary model, assuming availability of the primary model for evaluating arbitrary synthesized data points in order to obtain labels for training the secondary model.
- Audit approach [27]: When probing the model with new data is not possible, this approach trains the surrogate on predictions made by the primary model. This setup is sometimes more realistic in industrial scenarios where the primary model is unavailable due to security or practical concerns.
The quality of the secondary model is typically assessed using one or two metrics, which can be referred to as task accuracy and model accuracy. Task accuracy evaluates the accuracy of the secondary model concerning the true labels in the dataset, while model accuracy (sometimes called fidelity) measures the accuracy of the secondary model with respect to the labels provided by the primary model [7, 8, 15]. As an alternative, precision and recall are also used to evaluate the resulting rule system or Bayesian network trained with the pedagogical approach [23]. Bastani et al. [6] generalize this approach to non-classification domains, incorporating suitable metrics for regression and reinforcement learning tasks.
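As an illustration of the difference between the two metrics, the following minimal sketch computes both on made-up label vectors. The function names and the toy labels are assumptions introduced here for illustration; they are not taken from the cited works.

```python
import numpy as np


def task_accuracy(y_true, y_wb):
    """Share of surrogate predictions that match the true labels."""
    return float(np.mean(np.asarray(y_wb) == np.asarray(y_true)))


def model_accuracy(y_bb, y_wb):
    """Share of surrogate predictions that match the primary model (fidelity)."""
    return float(np.mean(np.asarray(y_wb) == np.asarray(y_bb)))


# Made-up labels for five instances.
y_true = [1, 0, 1, 1, 0]   # ground truth
y_bb = [1, 0, 0, 1, 0]     # primary (black-box) predictions
y_wb = [1, 0, 0, 1, 1]     # secondary (surrogate) predictions

print("task accuracy :", task_accuracy(y_true, y_wb))
print("model accuracy:", model_accuracy(y_bb, y_wb))
```

In this toy case the surrogate agrees with the primary model more often than with the ground truth, because it reproduces one of the primary model's mistakes. Neither number says anything about why the two models agree.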
In addition to the aforementioned work, there are also studies that focus on alternative metrics for evaluating faithfulness of explanations in different contexts. For instance, Alvarez-Melis and Jaakkola [3] propose a self-explainable neural network that provides both prediction and self-relevance scores for feature or concept importance. Their faithfulness measure is based on the correlation between explanation vectors and probability drops from ablation studies, offering an alternative approach to assess faithfulness using explanation vectors. Although their setup is different from a global surrogate, it shares some similarity with ShapGAP in utilizing explanations for measuring faithfulness.
Alaa and van der Schaar [1] proposed another approach that employs feature importance for model comparison, although not necessarily in a global surrogate setting. They qualitatively compare the global feature importance of two models, demonstrating another perspective on assessing faithfulness and explanation quality between models. In addition, Dai et al. [9] proposed alternative fidelity measures such as ground truth fidelity, which can be adapted for comparing two models. This measure, which can be referred to as the “Top-k Percentage Accordance”, calculates the percentage of top-k features from one explanation that are also in the top-k features of another explanation. While this metric might have limitations, it demonstrates another perspective on evaluating faithfulness between models.
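A minimal sketch of such a top-k accordance between two explanation vectors is given below. Ranking features by absolute attribution is an assumption made here, and the function name and example vectors are hypothetical.

```python
import numpy as np


def topk_accordance(expl_a, expl_b, k):
    """Fraction of the top-k features of expl_a that are also top-k in expl_b."""
    expl_a = np.asarray(expl_a, dtype=float)
    expl_b = np.asarray(expl_b, dtype=float)
    # Assumption: features are ranked by the absolute value of their attribution.
    top_a = set(np.argsort(-np.abs(expl_a))[:k].tolist())
    top_b = set(np.argsort(-np.abs(expl_b))[:k].tolist())
    return len(top_a & top_b) / k


# Made-up attributions over four features.
a = [0.5, -0.3, 0.1, 0.05]   # top-2 features: 0 and 1
b = [0.1, 0.4, -0.6, 0.0]    # top-2 features: 2 and 1
print(topk_accordance(a, b, k=2))
```

Only feature 1 is shared between the two top-2 sets in this example. The measure ignores both the sign and the size of the attributions, which is one of the limitations a distance on the full explanation vector avoids.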
It is worth noting that previous work does not directly address surrogation but contributes to the broader understanding of evaluating explanations and faithfulness in various settings. This context helps to clarify how ShapGAP fits into the larger landscape of XAI research.
3 Proposed Approach: ShapGAP
To provide context, we recall some preliminary concepts before defining ShapGAP. Shapley values [25], originally derived from cooperative game theory, represent a strategy for allocating the payoff of a cooperative game among the players, based on their contributions. In the context of feature importance, each feature is considered as a player, and the prediction is the payoff. The Shapley value of a feature quantifies its average marginal contribution across all possible feature combinations. SHapley Additive exPlanations (SHAP) [17] combine Shapley values with a unified measure of feature importance for machine learning models. For a given prediction, SHAP assigns an importance value to each feature, such that the sum of all values equals the difference between the prediction and the average prediction for the dataset.
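To make the notion of average marginal contribution concrete, the sketch below computes exact Shapley values for a toy three-player game by enumerating all orderings. The payoff function and player names are hypothetical, and this brute-force enumeration is only feasible for a handful of players; it is not how SHAP is computed in practice.

```python
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: t / len(orderings) for p, t in totals.items()}


def toy_value(coalition):
    """Hypothetical payoff: "a" and "b" add 1 each, plus 2 when both are present."""
    payoff = ("a" in coalition) + ("b" in coalition)
    if {"a", "b"} <= coalition:
        payoff += 2
    return float(payoff)


phi = shapley_values(["a", "b", "c"], toy_value)
print(phi)
print("sum of values:", sum(phi.values()))
```

The interaction bonus is split evenly between "a" and "b", the irrelevant player "c" receives nothing, and the values add up to the payoff of the full coalition minus that of the empty one. This is the same additivity property that SHAP explanations satisfy with respect to the model output.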
We define a BB model as a function \(f_{bb}: X \rightarrow Y\), where X is the input space and Y is the output space, and a WB model as a function \(f_{wb}: X \rightarrow Y\). Given a dataset D with n instances, we compute the SHAP values for each instance \(x_i\) in D for both the BB and WB models. Let \(S_{bb}(x_i)\) and \(S_{wb}(x_i)\) represent the SHAP values for instance \(x_i\) for the BB and WB models, respectively.
To define a generic version of ShapGAP, we use a distance function \(d(\cdot , \cdot )\) that measures the dissimilarity between the SHAP explanations of the BB and WB models for instance \(x_i\). The ShapGAP metric is then the average of these distances across all instances in the dataset:
\[ \text{ShapGAP}(f_{bb}, f_{wb}, D) = \frac{1}{n} \sum_{i=1}^{n} d(S_{bb}(x_i), S_{wb}(x_i)) \qquad (1) \]
We can implement the distance function \(d(S_{bb}(x_i), S_{wb}(x_i))\) using different distance measures, such as the L2 Euclidean distance and the Cosine distance. The L2 Euclidean distance, as shown in Eq. (2), is more precise and faithful to the final probability values, as the SHAP explanations sum up to the output of the model. This choice emphasizes the importance of the exact contribution of each feature in the explanation and is more sensitive to differences in magnitude between the SHAP values associated with the BB and WB models.
However, the L2 Euclidean distance has some limitations worth noting. It is sensitive to outliers: a single instance with a large disparity in SHAP values can significantly affect the overall distance. Therefore, the L2 Euclidean distance might not represent the overall similarity well in the presence of outliers.
On the other hand, the Cosine distance (see Eq. (3)) is more relaxed and focuses on the similarity in the direction of the SHAP explanations rather than their magnitude. This choice makes it possible to identify surrogate models with similar reasoning paths, even if the magnitude of their feature contributions differs. By being more scale-agnostic with respect to the magnitude of the explanations, the Cosine distance might be better suited for cases where the focus is on the general structure of the explanation, rather than the exact values of the SHAP contributions.
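The displayed forms of Eq. (2) and Eq. (3) are not reproduced in this text. A reconstruction consistent with the descriptions above is sketched below; it assumes the standard definitions of the two distances and treats \(S_{bb}(x_i)\) and \(S_{wb}(x_i)\) as vectors with one entry per feature.

```latex
% Eq. (2): L2 (Euclidean) distance between the two SHAP vectors
d_{L2}\big(S_{bb}(x_i), S_{wb}(x_i)\big)
  = \big\lVert S_{bb}(x_i) - S_{wb}(x_i) \big\rVert_2

% Eq. (3): Cosine distance, i.e. one minus the cosine similarity
d_{cos}\big(S_{bb}(x_i), S_{wb}(x_i)\big)
  = 1 - \frac{S_{bb}(x_i) \cdot S_{wb}(x_i)}
             {\lVert S_{bb}(x_i) \rVert_2 \, \lVert S_{wb}(x_i) \rVert_2}
```

Under these forms, rescaling either explanation by a positive constant changes the L2 distance but leaves the Cosine distance unchanged, which matches the scale-agnostic behaviour described above.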
By offering these two distance measures, the ShapGAP metric can accommodate different application requirements and preferences, providing more flexibility in the evaluation of surrogate model faithfulness.
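The definition above can be sketched in a few lines of NumPy, assuming the SHAP values of both models have already been computed and stacked into arrays with one row per instance. The function name, the handling of all-zero explanations, and the toy numbers are assumptions made for this illustration.

```python
import numpy as np


def shapgap(shap_bb, shap_wb, distance="l2"):
    """Average per-instance distance between two (n_instances, n_features) arrays."""
    a = np.asarray(shap_bb, dtype=float)
    b = np.asarray(shap_wb, dtype=float)
    if a.shape != b.shape or a.ndim != 2:
        raise ValueError("expected two arrays of identical 2-D shape")
    if distance == "l2":
        per_instance = np.linalg.norm(a - b, axis=1)
    elif distance == "cosine":
        norms = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
        dots = np.sum(a * b, axis=1)
        # Assumption: an all-zero explanation has no direction, so it is
        # given distance 1 to avoid dividing by zero.
        safe = np.where(norms > 0, norms, 1.0)
        per_instance = np.where(norms > 0, 1.0 - dots / safe, 1.0)
    else:
        raise ValueError(f"unknown distance: {distance!r}")
    return float(per_instance.mean())


# Made-up SHAP values for 2 instances and 2 features.
s_bb = np.array([[3.0, 0.0], [1.0, 1.0]])
s_wb = np.array([[0.0, 4.0], [2.0, 2.0]])
print("ShapGAP L2    :", shapgap(s_bb, s_wb, "l2"))
print("ShapGAP Cosine:", shapgap(s_bb, s_wb, "cosine"))
```

In the toy data, the first instance is explained by different features in the two models, so both distances are large. The second instance has the same direction but twice the magnitude, so it contributes to the L2 variant while contributing essentially nothing to the Cosine variant.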
To compute SHAP values, we use the widely adopted shap package, which offers efficient implementations for various types of models. For tree-like models such as random forests and decision trees, the package provides the fast TreeSHAP algorithm. Likewise, for linear models, an efficient method based on the coefficients of the model is available. In cases where the BB model is neither tree-based nor linear, the package offers a model-agnostic method called KernelSHAP, which can be applied to any model at the expense of increased computational cost. By leveraging these implementations, we can calculate the ShapGAP metric for a diverse range of surrogate models, ensuring that the metric remains flexible and adaptable to various application scenarios.
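A possible way to obtain comparable SHAP arrays for tree-based models is sketched below. It assumes a binary classifier and relies on `shap.TreeExplainer(...).shap_values(...)`; since the layout of the returned values for classifiers differs across shap versions (a list with one array per class, or a single array with a trailing class axis), a small helper normalises both cases. The helper names are hypothetical and this is not necessarily how the experiments were implemented.

```python
import numpy as np


def positive_class_shap(values, class_index=1):
    """Return a 2-D (n_instances, n_features) array for one class."""
    if isinstance(values, list):
        # Layout 1: one (n_instances, n_features) array per class.
        return np.asarray(values[class_index])
    values = np.asarray(values)
    if values.ndim == 3:
        # Layout 2: (n_instances, n_features, n_classes).
        return values[:, :, class_index]
    return values  # already 2-D, e.g. for single-output models


def tree_shap(model, X, class_index=1):
    """SHAP values of a fitted tree-based model via TreeSHAP."""
    import shap  # third-party package, imported lazily

    explainer = shap.TreeExplainer(model)
    return positive_class_shap(explainer.shap_values(X), class_index)


# Stand-ins for the two possible layouts (4 instances, 3 features, 2 classes).
as_list = [np.zeros((4, 3)), np.ones((4, 3))]
as_array = np.stack(as_list, axis=-1)
print(positive_class_shap(as_list).shape, positive_class_shap(as_array).shape)
```

Calling `tree_shap` on the BB and on a tree surrogate with the same test instances would yield two arrays of identical shape, which is the input a ShapGAP computation needs.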
4 Experimental Section
In this section, we present the experimental setting for using and validating both ShapGAP L2 and Cosine distance metrics, compared against Task Accuracy, Fidelity Accuracy (in the sense of model accuracy, i.e., the accuracy with respect to the labels predicted by the BB model), and ShapLength [18]. ShapLength, a model-agnostic metric, enables the comparison of fundamentally different models, such as Logistic Regression and Decision Trees, in terms of their explanation complexity. By examining ShapLength, we can assess the trade-offs between faithfulness and simplicity in surrogate models, and gain insights into the complexity of models and their explanations, making our analysis more comprehensive and thorough.
In line with the application domains discussed in the introduction, we use two popular datasets: the Breast Cancer dataset [26], which contains 569 instances with 30 features, and the German Credit dataset from UCI [10], which includes 1,000 instances with 20 features. Both datasets are widely used for benchmarking machine learning models in the context of binary classification. For surrogation, we employ the Audit approach and perform a 10-fold cross-validation. We then compute the quality metrics on the test set of each fold. The reported results represent the average across all the folds. To explore various surrogate models, we train Decision Trees by varying two parameters: max_depth, which can take values of 3, 4, or 5, and ccp_alpha, a regularization parameter for cost complexity pruning, which takes values of 0.001, 0.01, and 0.1. For the German Credit dataset, which contains categorical columns, we preprocess the data by one-hot encoding the categorical columns.
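The protocol described above can be sketched with scikit-learn as follows. The sketch assumes a random forest as the BB, uses the copy of the Breast Cancer dataset shipped with scikit-learn, and reports only Task Accuracy and Fidelity Accuracy; the forest size, the random seed, and the function name are assumptions, not the settings of the original experiments.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold
from sklearn.tree import DecisionTreeClassifier


def audit_surrogate_cv(X, y, depths=(3, 4, 5), alphas=(0.001, 0.01, 0.1),
                       n_splits=10, seed=0):
    """Mean (task accuracy, fidelity accuracy) per tree configuration."""
    folds = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = {(d, a): [] for d in depths for a in alphas}
    for train_idx, test_idx in folds.split(X, y):
        # Black box: trained on the true labels (assumed random forest).
        bb = RandomForestClassifier(n_estimators=50, random_state=seed)
        bb.fit(X[train_idx], y[train_idx])
        bb_train, bb_test = bb.predict(X[train_idx]), bb.predict(X[test_idx])
        for d, a in scores:
            # Audit approach: the surrogate is fitted on the BB's predictions.
            wb = DecisionTreeClassifier(max_depth=d, ccp_alpha=a,
                                        random_state=seed)
            wb.fit(X[train_idx], bb_train)
            wb_test = wb.predict(X[test_idx])
            task = np.mean(wb_test == y[test_idx])
            fidelity = np.mean(wb_test == bb_test)
            scores[(d, a)].append((task, fidelity))
    return {k: tuple(np.mean(v, axis=0)) for k, v in scores.items()}


X, y = load_breast_cancer(return_X_y=True)
results = audit_surrogate_cv(X, y)
for (depth, alpha), (task, fidelity) in sorted(results.items()):
    print(f"max_depth={depth} ccp_alpha={alpha}: "
          f"task={task:.3f} fidelity={fidelity:.3f}")
```

Each of the nine tree configurations is evaluated on the same ten folds, so their averaged scores are directly comparable. ShapGAP would be added to the inner loop by computing SHAP values for both models on the test fold.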
For the sake of experimental reproducibility, everything required for running the experiments is available online.
5 Discussions of Results
Our analysis of the results on both the Breast Cancer (Fig. 1) and German Credit (Fig. 2) datasets reveals some important insights about the relationships between Task Accuracy, Fidelity Accuracy, Complexity, and ShapGAP. In both experiments, Logistic Regression (LR) performs better in terms of both Task Accuracy and Fidelity Accuracy compared to Decision Trees (DT). However, when we consider the ShapGAP metric, the LR model exhibits a very high ShapGAP compared to DT models, indicating that its explanations are unfaithful to the BB model despite its superior accuracy.
In the Breast Cancer dataset, the LR model exhibits both high Task and Fidelity Accuracy, suggesting its potential as an effective global surrogate for the BB model. The additional advantage of lower complexity in the LR model strengthens this proposition. However, the high ShapGAP value warns against this, as it would lead to unfaithful explanations of the BB model.
In the German Credit dataset, it is interesting to observe that the LR model achieves a Task Accuracy similar to that of the BB model but with higher complexity. One possible reason for this observation is that the simpler structure of the LR model, with its linear decision boundary, allows it to effectively capture the underlying patterns in the data, but at the expense of using more features on average, as reflected in its higher ShapLength complexity. On the other hand, the more expressive nature of the Random Forest BB model, with its ensemble of DT models, enables it to capture complex relationships among features and potentially represent the patterns in the data with fewer features on average.
Given that a WB model such as the LR model performs almost on par with the BB model in terms of Task Accuracy, readers may wonder whether it is necessary to employ a more complex, less interpretable BB model in this case. Indeed, this question has already been raised in previous work, such as [19, 22], which argues that if a WB model performs well on the task, it should be used directly for both prediction and explanation, discarding the BB model. Of course, this choice is also motivated by the nature of the task and its specific requirements. For high-stakes scenarios, such as medical diagnosis, it might be preferable to have slightly lower Task Accuracy but higher explainability, making the use of a WB model more appropriate. For other scenarios, like movie or song recommendations, the trade-off between accuracy and explainability might be less critical, and the choice between WB and BB models may depend on other factors. The decision ultimately depends on the value of a given improvement in accuracy and the importance of explainability in the given context.
Overall, ShapGAP reveals that LR models behave in a very different way compared to DT models. While LR models can achieve higher accuracy, in some cases, it might be preferable to use DT models as global surrogates due to their explanations being more faithful to the BB model. This illustrates the value of the ShapGAP metric in guiding the selection of surrogate models based on the faithfulness of their explanations, in addition to their performance on the task.
6 Ethical Considerations in Surrogate Explanations
The ethical implications of unfaithful surrogate explanations in critical applications are manifold. Unfaithful explanations can lead to misinformed decision-making, adversely affecting individuals’ lives and well-being, especially in sensitive domains like healthcare, justice and finance [5]. This misalignment can also hinder the identification and correction of AI model shortcomings, exacerbating existing societal inequalities [20].
Accurate and reliable surrogate explanations are vital for ensuring AI systems align properly with the ethical principles guiding AI development. Such explanations enable users to understand and scrutinize model behavior, allowing them to make fully informed choices about the use of AI systems. Informed consent is a fundamental ethical principle that should be upheld in AI development and deployment, as it preserves users’ autonomy and agency [11].
Unfaithful surrogate explanations can erode trust in AI systems, which is essential for the successful adoption of AI technologies [24]. Legal and liability issues can arise due to misaligned explanations, making responsibility attribution for errors or adverse outcomes challenging [28]. Furthermore, unfaithful explanations can mask biases and discrimination, perpetuating societal inequalities.
AI developers have an ethical obligation to provide truthful and accurate explanations. In summary, ShapGAP offers a means to evaluate and compare surrogate models, promoting responsible development and deployment of Trustworthy AI by helping developers in their pursuit of more faithful explanations. Ensuring truthfulness and accuracy in AI explanations is essential for preserving human values, promoting fairness, and fostering transparency and accountability in agreement with ethical guidelines [28].
7 Conclusions, Limitations, and Future Work
In this paper, we introduced the ShapGAP metric, a novel approach for evaluating the faithfulness of surrogate models by comparing their SHAP explanations with those of the black-box model. Through two illustrative case studies, we demonstrated the utility of ShapGAP in revealing the unfaithfulness of models that may otherwise appear as strong global surrogates based on Task Accuracy, Fidelity Accuracy, and complexity metrics like ShapLength.
The ShapGAP metric has the potential to improve the trustworthiness and utility of surrogate models, particularly in high-stakes applications such as healthcare and finance, where faithful explanations are crucial for decision-making. Our experimental results emphasize the importance of considering faithfulness as an essential criterion for surrogate model evaluation and selection.
While ShapGAP offers a promising approach for evaluating surrogate model faithfulness, there are some limitations to consider:
- Computational Expense: SHAP explanations can be computationally expensive to compute, especially for complex models or large datasets. Although various approximation methods exist, the computational cost of calculating SHAP values may still pose a challenge in some scenarios. Nevertheless, the utility of ShapGAP as an evaluation metric makes it worthwhile to consider these explanations despite the potential computational burden.
- Approximate Nature: Though SHAP explanations are widely adopted in the XAI community, they are only approximations of the reasoning paths followed by the underlying models. Furthermore, the SHAP computation itself yields approximations to the true Shapley values. Despite these limitations, we believe that a reasonable approximation is still useful for assessing surrogate model faithfulness, as it provides insights into the models’ reasoning processes that are otherwise unavailable or incomparable.
- Dependency on SHAP: Our approach relies on SHAP explanations to compare reasoning paths, which may limit its applicability to other explanation methods. Although SHAP has gained widespread acceptance in the XAI community, future research could explore alternative approaches for evaluating surrogate model faithfulness using different explanation methods.
To address these limitations, future work could focus on expanding the scope of ShapGAP to incorporate other explanation methods or on reducing the computational cost of producing SHAP explanations. Moreover, SHAP, as it stands, provides a first-order explanation, meaning it presents the impact of each individual feature without considering interactions between features. As a result, nuances in feature relationships might be missed. To account for this, we plan to take into account SHAP interaction values [16], which consider the synergistic or antagonistic effects between feature pairs, as a basis for an enhanced ShapGAP. In addition, more in-depth investigations into the factors that contribute to high or low ShapGAP scores, and how to optimize surrogate models for faithfulness, could be valuable. Further research might also delve into the role of ShapGAP in model selection and evaluation pipelines, its integration into automated machine learning (AutoML) frameworks, and its potential impact on the design or training of surrogate models for improved faithfulness. Finally, future studies could investigate potential biases or limitations in SHAP explanations on more datasets and explore methods to mitigate their impact on the evaluation of surrogate model faithfulness.
References
1. Alaa, A.M., van der Schaar, M.: Demystifying black-box models with symbolic metamodels. In: Advances in Neural Information Processing Systems, vol. 32. Curran Associates, Inc. (2019)
2. Ali, S., et al.: Explainable Artificial Intelligence (XAI): what we know and what is left to attain trustworthy artificial intelligence. Inf. Fusion 101805 (2023). https://doi.org/10.1016/j.inffus.2023.101805. https://linkinghub.elsevier.com/retrieve/pii/S1566253523001148
3. Alvarez-Melis, D., Jaakkola, T.S.: Towards robust interpretability with self-explaining neural networks. In: Proceedings of the 32nd International Conference on Neural Information Processing Systems, NIPS 2018, pp. 7786–7795. Curran Associates Inc., Red Hook (2018)
4. Ba, J., Caruana, R.: Do deep nets really need to be deep? In: Advances in Neural Information Processing Systems, vol. 27. Curran Associates, Inc. (2014)
5. Barredo Arrieta, A., et al.: Explainable Artificial Intelligence (XAI): concepts, taxonomies, opportunities and challenges toward responsible AI. Inf. Fusion 58, 82–115 (2020). https://doi.org/10.1016/j.inffus.2019.12.012
6. Bastani, O., Kim, C., Bastani, H.: Interpretability via model extraction (2018). http://arxiv.org/abs/1706.09773 [cs, stat]
7. Burkart, N., Huber, M.F.: A survey on the explainability of supervised machine learning. J. Artif. Intell. Res. 70, 245–317 (2021). https://doi.org/10.1613/jair.1.12228. https://www.jair.org/index.php/jair/article/view/12228
8. Craven, M., Shavlik, J.: Extracting tree-structured representations of trained networks. In: Advances in Neural Information Processing Systems, vol. 8. MIT Press (1995)
9. Dai, J., Upadhyay, S., Aivodji, U., Bach, S.H., Lakkaraju, H.: Fairness via explanation quality: evaluating disparities in the quality of post hoc explanations. In: Proceedings of the 2022 AAAI/ACM Conference on AI, Ethics, and Society, pp. 203–214 (2022). https://doi.org/10.1145/3514094.3534159. http://arxiv.org/abs/2205.07277 [cs]
10. Dua, D., Graff, C.: UCI Machine Learning Repository (2017). http://archive.ics.uci.edu/ml. University of California, Irvine, School of Information and Computer Sciences
11. Floridi, L., et al.: AI4People—an ethical framework for a good AI society: opportunities, risks, principles, and recommendations. Mind. Mach. 28(4), 689–707 (2018). https://doi.org/10.1007/s11023-018-9482-5
12. Gunning, D., Vorm, E., Wang, J.Y., Turek, M.: DARPA’s explainable AI (XAI) program: a retrospective. Appl. AI Lett. 2(4), e61 (2021). https://doi.org/10.1002/ail2.61
13. Hinton, G., Vinyals, O., Dean, J.: Distilling the knowledge in a neural network (2015). http://arxiv.org/abs/1503.02531 [cs, stat]
14. Jacovi, A., Goldberg, Y.: Towards faithfully interpretable NLP systems: how should we define and evaluate faithfulness? In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 4198–4205. Association for Computational Linguistics, Online (2020). https://doi.org/10.18653/v1/2020.acl-main.386. https://www.aclweb.org/anthology/2020.acl-main.386
15. Lakkaraju, H., Kamar, E., Caruana, R., Leskovec, J.: Interpretable & explorable approximations of black box models (2017). http://arxiv.org/abs/1707.01154 [cs]
16. Lundberg, S.M., Erion, G.G., Lee, S.I.: Consistent individualized feature attribution for tree ensembles (2018)
17. Lundberg, S.M., Lee, S.I.: A unified approach to interpreting model predictions. In: Advances in Neural Information Processing Systems, vol. 30. Curran Associates, Inc. (2017)
18. Mariotti, E., Alonso-Moral, J.M., Gatt, A.: Measuring model understandability by means of shapley additive explanations. In: 2022 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), Padua, Italy, pp. 1–8. IEEE (2022). https://doi.org/10.1109/FUZZ-IEEE55066.2022.9882773. https://ieeexplore.ieee.org/document/9882773/
19. Markus, A.F., Kors, J.A., Rijnbeek, P.R.: The role of explainability in creating trustworthy artificial intelligence for health care: a comprehensive survey of the terminology, design choices, and evaluation strategies. J. Biomed. Inform. 113, 103655 (2021). https://doi.org/10.1016/j.jbi.2020.103655. https://www.sciencedirect.com/science/article/pii/S1532046420302835
20. Mittelstadt, B.D., Allo, P., Taddeo, M., Wachter, S., Floridi, L.: The ethics of algorithms: mapping the debate. Big Data Soc. 3(2), 205395171667967 (2016). https://doi.org/10.1177/2053951716679679. http://journals.sagepub.com/doi/10.1177/2053951716679679
21. Ribeiro, M.T., Singh, S., Guestrin, C.: “Why should I trust you?”: explaining the predictions of any classifier. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2016, pp. 1135–1144. Association for Computing Machinery, New York (2016). https://doi.org/10.1145/2939672.2939778
22. Rudin, C.: Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat. Mach. Intell. 1(5), 206–215 (2019). https://doi.org/10.1038/s42256-019-0048-x
23. Sanchez, I., Rocktaschel, T., Riedel, S., Singh, S.: Towards extracting faithful and descriptive representations of latent variable models (2015)
24. Selbst, A.D., Barocas, S.: The intuitive appeal of explainable machines. SSRN Electron. J. (2018). https://doi.org/10.2139/ssrn.3126971. https://www.ssrn.com/abstract=3126971
25. Shapley, L.S.: A value for n-person games. In: Kuhn, H.W., Tucker, A.W. (eds.) Contributions to the Theory of Games (AM-28), vol. II, pp. 307–318. Princeton University Press (1953). https://doi.org/10.1515/9781400881970-018
26. Street, N., Wolberg, W.H., Mangasarian, O.L.: Nuclear feature extraction for breast tumor diagnosis. In: Proceedings of the Conference on Biomedical Image Processing and Biomedical Visualization, vol. 1905 (1993). https://doi.org/10.1117/12.148698
27. Tan, S., Caruana, R., Hooker, G., Lou, Y.: Distill-and-compare: auditing black-box models using transparent model distillation. In: Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society, pp. 303–310. ACM, New Orleans (2018). https://doi.org/10.1145/3278721.3278725. https://dl.acm.org/doi/10.1145/3278721.3278725
28. Wachter, S., Mittelstadt, B., Russell, C.: Counterfactual explanations without opening the black box: automated decisions and the GDPR. SSRN Electron. J. (2017). https://doi.org/10.2139/ssrn.3063289. https://www.ssrn.com/abstract=3063289
Acknowledgement
E. Mariotti and A. Sivaprasad are ESRs in the NL4XAI project which has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Grant Agreement No. 860621. In addition, this work is supported by Grant PID2021-123152OB-C21 funded by MCIN/AEI/10.13039/501100011033 and by “ESF Investing in your future”, by Grant TED2021-130295B-C33 funded by MCIN/AEI/10.13039/501100011033 and by the “European Union NextGenerationEU/PRTR”, and by the Galician Ministry of Culture, Education, Professional Training and University (grants ED431G2019/04, ED431C2022/19 co-funded by the European Regional Development Fund, ERDF/FEDER program).
Rights and permissions
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Copyright information
© 2023 The Author(s)
About this paper
Cite this paper
Mariotti, E., Sivaprasad, A., Moral, J.M.A. (2023). Beyond Prediction Similarity: ShapGAP for Evaluating Faithful Surrogate Models in XAI. In: Longo, L. (eds) Explainable Artificial Intelligence. xAI 2023. Communications in Computer and Information Science, vol 1901. Springer, Cham. https://doi.org/10.1007/978-3-031-44064-9_10
DOI: https://doi.org/10.1007/978-3-031-44064-9_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-44063-2
Online ISBN: 978-3-031-44064-9
eBook Packages: Computer Science, Computer Science (R0)