Contrastive counterfactual visual explanations with overdetermination

A novel explainable AI method called CLEAR Image is introduced in this paper. CLEAR Image is based on the view that a satisfactory explanation should be contrastive, counterfactual and measurable. CLEAR Image seeks to explain an image’s classification probability by contrasting the image with a representative contrast image, such as an auto-generated image obtained via adversarial learning. This produces a salient segmentation and a way of using image perturbations to calculate each segment’s importance. CLEAR Image then uses regression to determine a causal equation describing a classifier’s local input–output behaviour. Counterfactuals are also identified that are supported by the causal equation. Finally, CLEAR Image measures the fidelity of its explanation against the classifier. CLEAR Image was successfully applied to a medical imaging case study where it outperformed methods such as Grad-CAM and LIME by an average of 27% using a novel pointing game metric. CLEAR Image also identifies cases of causal overdetermination, where there are multiple segments in an image that are sufficient individually to cause the classification probability to be close to one.


INTRODUCTION
Data-driven AI for Computer Vision can achieve high levels of predictive accuracy, yet the rationale behind these predictions is often opaque. This paper proposes a novel explainable AI (XAI) method called CLEAR Image that seeks to reveal the causal structure implicitly modelled by an AI system, where the causes are an image's segments and the effect is the AI system's classification probability. The explanations are for single predictions and describe the local input-output behaviour of the classifier. CLEAR Image is based on the philosopher James Woodward's seminal analysis of causal explanation [32], which develops Judea Pearl's manipulationist account of causation [18]. Together they constitute the dominant accounts of explanation in the philosophy of science. We argue that a successful explanation for an AI system should be contrastive, counterfactual and measurable.
According to Woodward, to explain an event E is "to provide information about the factors on which it depends and exhibit how it depends on those factors". This requires a causal equation to describe the causal structure responsible for generating the event. The causal equation must support a set of counterfactuals; a counterfactual specifies a possible world where, contrary to the facts, a desired outcome occurs. The counterfactuals serve to illustrate the causal structure, and to answer a set of 'what-if-things-had-been-different' questions. In XAI, counterfactuals usually state minimal changes needed to achieve the desired outcome. A contrastive explanation seeks to answer the question 'Why E rather than F?', where F comes from a contrast class of events that were alternatives to E, but which did not happen [26]. An explanation identifies the causes that led to E occurring rather than F, even though the relevant contrast class to which F belongs is often not explicitly conveyed.
For Woodward, all causal claims are counterfactual and contrastive: 'to causally explain an outcome is always to explain why it, rather than some alternative, occurred'. Woodward's theory of explanation stands in opposition to the multiple XAI methods that claim to provide counterfactual explanations [28], but which only provide statements of single or multiple counterfactuals. As this paper will illustrate, counterfactuals will only provide incomplete explanations without a supporting causal equation.
CLEAR Image excels at identifying cases of 'causal overdetermination'. The causal overdetermination of an event occurs when two or more sufficient causes of that event occur. A standard example from the philosophy literature is of soldiers in a firing squad simultaneously shooting a prisoner, with each shot being sufficient to kill the prisoner. The death of the prisoner is causally overdetermined. This causal structure may well be ubiquitous in learning systems. For example, there may be multiple patches in a medical image, any of which is sufficient by itself to cause a classification probability close to one. To the best of our knowledge, CLEAR Image is the first XAI method capable of identifying causal overdetermination.
CLEAR Image explains an image's classification probability by contrasting the image with a corresponding GAN generated image. Previously, XAI use of GANs has just focused on their difference masks, which are created by subtracting the original image from its corresponding GAN generated image. However, as we will illustrate, a difference mask should only be a starting point for segmentation and explanation. This is because the segments identified from a difference mask can vary significantly in their relevance to a classification; furthermore, other segments critical to the classification can often be absent from the mask. Therefore, CLEAR Image uses a novel segmentation method that combines information from the difference mask, the original image and the classifier's behaviour. After completing its segmentation, CLEAR Image identifies counterfactuals and then follows a process of perturbation, whereby segments of the original image are changed, and the change in outcome is observed to produce a regression equation. The regression equation is used to determine the contribution each segment makes to the classification probability. As we will show, the explanations provided by leading XAI methods such as LIME and Grad-CAM often cannot be trusted. CLEAR Image, therefore, measures the fidelity of its explanation against the classifier, where fidelity refers to how closely an XAI method is able to mimic a classifier's behaviour.
CLEAR Image was evaluated in two case studies, both involving overdetermination. The first uses a multifaceted synthetic dataset, and the second uses chest X-rays. CLEAR Image outperformed XAI methods such as LIME and Grad-CAM by an average of 31% on the synthetic dataset and 27% on the X-ray dataset based on a pointing game metric defined in this paper for the case of multiple targets. Our code will be made available on GitHub.
The contribution of this paper is four-fold. We introduce an XAI method that:
• generates contrastive, counterfactual and measurable explanations outperforming established XAI methods in challenging image domains;
• uses a GAN-generated contrast image in determining a causal equation, segment importance scores and counterfactuals;
• offers novel segmentation and pointing game algorithms for the evaluation of image explanations;
• is capable of identifying causal overdetermination, i.e. the multiple sufficient causes for an image classification.
CLEAR Image is a substantial development of an earlier XAI method, CLEAR (Counterfactual Local Explanations viA Regression), which only applies to tabular data [30]. New functionality includes: (i) the novel segmentation algorithm; (ii) generating perturbed images by infilling from the corresponding GAN image; (iii) a novel pointing game suitable for images with multiple targets; (iv) identification of sufficient causes and overdetermination; (v) measurement of fidelity errors for counterfactuals involving categorical features.
The remainder of the paper is organised as follows: Section 2 defines the relevant notation and background. Section 3 discusses the immediate related work. Section 4 introduces the CLEAR Image method and algorithms. Section 5 details the experimental setup and discusses the results. Section 6 concludes the paper and indicates directions for future work.

Key Notation
This paper adopts the following notation: let the instance $x$ be an image, and $m$ be a machine learning system that maps $x$ to class label $l$ with probability $y$. Let $x$ be partitioned into a set $S$ of segments (regions) $\{s_1, \ldots, s_n\}$. Let any variable with a prime $'$ be the corresponding variable for the GAN-generated image, e.g. $x'$ is the GAN-generated image derived from $x$, and $m$ maps $x'$ to class $l$ with probability $y'$.

Explanation by Perturbation
Methods such as Occlusion [35], Extremal Perturbation [8], FIDO [3], LIME [19] and Kernel SHAP [15] use perturbation to evaluate which segments of an image $x$ are most responsible for $x$'s classification probability $y$. The underlying idea is that the contribution that a segment $s_i$ makes to $y$ can be determined by substituting it with an uninformative segment $s'_i$, where $s'_i$ may be either grey, black or blurred [35,8,19] or in-painted without regard to any contrast class [3]. There are three key problems with using perturbed images to explain a classification:
(1) A satisfactory explanation must be contrastive; it must answer 'Why E rather than F?' None of the above methods does this. Their contrasts are instead images of uninformative segments.
(2) The substitution may fail to identify the contribution that $s_i$ makes to $y$. For example, replacing $s_i$ with black pixels can take the entire image beyond the classifier's training distribution. By contrast, blurring or uninformative in-painting might result in $s'_i$ being too similar to $s_i$, resulting in the contribution of $s_i$ being underestimated.
(3) A segmentation needs to be relevant to its explanatory question. Current XAI perturbation approaches produce radically different segmentations. FIDO and Extremal Perturbation identify 'optimal' segments that, when substituted by an uninformative segment, maximally affect the classification probability; by contrast, LIME uses a texture/intensity/colour algorithm (e.g. Quickshift [27]).
CLEAR Image uses GAN generated images to address each of these problems: (i) its foil is a GAN generated image $x'$ belonging to a contrast class selected by the user; (ii) inpainting with segments derived from $x'$ enables better estimation of each segment's contribution to the difference between probabilities $y$ and $y'$; (iii) the differences between $x$ and $x'$ are used to guide the segmentation.
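The difference between infilling from a contrast image and infilling with an uninformative patch can be illustrated with a minimal sketch. The function name `infill`, the toy arrays and the segment map are all hypothetical and chosen only for illustration; this is not the paper's implementation.

```python
def infill(image, contrast, seg_map, replaced):
    """Copy `image`, taking pixels from `contrast` wherever the pixel's
    segment id (given by `seg_map`) is in the set `replaced`."""
    return [
        [contrast[r][c] if seg_map[r][c] in replaced else image[r][c]
         for c in range(len(image[0]))]
        for r in range(len(image))
    ]

# Toy 2x4 greyscale images: x is the original, x_prime stands in for
# the GAN-generated contrast image (hypothetical values).
x = [[9, 9, 1, 1], [9, 9, 1, 1]]
x_prime = [[2, 2, 1, 1], [2, 2, 1, 1]]
seg_map = [[0, 0, 1, 1], [0, 0, 1, 1]]  # segment id of each pixel
all_black = [[0] * 4 for _ in range(2)]

# Contrastive perturbation: segment 0 is replaced by its counterpart in x_prime.
perturbed = infill(x, x_prime, seg_map, replaced={0})
# Uninformative perturbation: segment 0 is replaced by black pixels.
blacked = infill(x, all_black, seg_map, replaced={0})
```

In the contrastive case the replaced region stays within the range of values seen in the contrast class, whereas the black patch introduces values that appear in neither image.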

RELATED WORK
The XAI methods most relevant to this paper can be broadly grouped into four types:
(i) Counterfactual methods: Wachter et al. [29] first proposed using counterfactuals as explanations of single machine learning predictions. Many XAI methods have attempted to generate 'optimal' counterfactuals; for example, [12] review sixty counterfactual methods. The algorithms differ in their constraints and the attributes referenced in their loss functions [28]. Desiderata often include that a counterfactual is: (1) actionable, e.g. does not recommend that a person reduces their age; (2) near to the original observation, where common measures include the Manhattan distance (L1 norm) and the L2 norm; (3) sparse, only changing the values of a small number of features; (4) plausible, e.g. the counterfactual must correspond to a high density part of the training data; (5) efficient to compute. Karimi et al. [13] argue that these methods are likely to identify counterfactuals that are either suboptimal or infeasible in terms of their actionability. This is because they do not take account of the causal structure that determines the consequences of the person's actions. The underlying problem is that unless all of the person's features are causally independent of each other, then when the person acts to change the value of one feature, other downstream dependent features may also change.
(ii) [...] [25,33] have been found to be invariant to network re-parameterisation or the class predicted [1,16]. A third approach [22,4] uses the product of gradients and activations starting from a late layer. In Grad-CAM [22], the product is clamped to only highlight positive influences on class scores.
(iii) Perturbation based methods: LIME and Kernel SHAP generate a dataset of perturbed images, which feeds into a regression model, which then calculates segment importance scores (LIME) or Shapley Values (Kernel SHAP). These bear some similarity to CLEAR Image but key differences include: they do not use a GAN generated image, do not identify counterfactuals and do not report fidelity. Extremal Perturbation uses gradient descent to determine an optimal perturbed version of an image that, for a fixed area, has the maximal effect on a network's output whilst guaranteeing that the selected segments are smooth. FIDO uses variational Bernoulli dropout to find a minimal set of segments that would change an image's class. In contrast to LIME, Kernel SHAP and Extremal Perturbation, FIDO uses a GAN to in-paint segments with 'plausible alternative values'; however, these values are not generated to belong to a chosen contrast class. Furthermore, segment importance scores are not produced.
(iv) GAN difference methods: Generative adversarial networks (GANs) [9] have been widely applied for synthetic image generation. Image translation through direct mapping of the original image to its target class has gained popularity, with examples such as CycleGAN [36] and StarGAN [5]. StarGAN-V2 [6] introduced a style vector for conditional image translation and produced high quality images over a diverse set of target conditions. These models, however, do not keep the translation minimal and make modifications even for intra-domain translation. Fixed-point GAN [23] introduced an identity loss penalising any deviation of the image during intra-domain translation. This aims to enhance visual similarity with the original image. DeScarGAN [31] adopts this loss function in its own GAN architecture and has outperformed Fixed-point GAN in their case study for Chest X-Ray pathology identification and localisation.
CLEAR Image builds on the strengths of the above XAI methods but also addresses key shortcomings. As already outlined, it uses a 'GAN-augmented' segmentation algorithm rather than just a difference mask. Next, methods such as LIME and Kernel SHAP assume that a classification probability is a simple linear addition of its causes. This is incorrect for cases of causal overdetermination, and CLEAR Image, therefore, uses a sigmoid function (see section 4.2). Finally, our experiments confirm that prominent XAI methods often fail to identify the most relevant regions of an image; CLEAR Image, therefore, measures the fidelity of its explanations.

THE CLEAR IMAGE METHOD
CLEAR Image is a model-agnostic XAI method that explains the classification of an image made by any classifier (see Figure 1). It requires both an image $x$ and a contrast image $x'$ generated by a GAN. CLEAR Image segments $x$ into $\{s_1, \ldots, s_n\} \in S$ and then applies the same segmentation to $x'$, creating $\{s'_1, \ldots, s'_n\} \in S'$. It then determines the contributions that different subsets of $S$ make to $y$ by substituting with the corresponding segments of $x'$. CLEAR Image is GAN agnostic, allowing the user to choose the GAN architecture most suitable to their project. A set of 'image-counterfactuals' $\{c_1, \ldots, c_k\}$ is also identified. Figures 1 to 5 provide a running example of the CLEAR Image pipeline, using the same X-ray taken from the CheXpert dataset.

GAN-Based Image Generation
To generate contrastive images, StarGAN-V2 [6] and DeScarGAN [31] have been deployed as the network architectures for our two case studies, the first using CheXpert, the second using a synthetic dataset.The use of these established GAN networks demonstrates how the generated contrastive images can aid in the overall CLEAR Image pipeline.
Default training hyperparameters were applied unless otherwise stated. Details of model training and hyperparameters can be found in Appendix B. The source image was used as input for the Style Encoder instead of a specific reference image for StarGAN-V2. This ensures the generated style mimics that of the input source images. StarGAN-V2 is also not locally constrained (i.e. the network will modify all pixels in an image related to the targeted class, which will include irrelevant spurious regions of the image). A post-generation lung space segmentation step using a pre-trained U-Net model [20] was therefore implemented. The original diseased lung space was replaced with the generated image, with a Gaussian Blur process to fuse the edge effect (see Figure 2). This confines the feature identification space used by CLEAR Image to the lung space. It is an advantage of the CLEAR Image pipeline that it is possible to use pre-processing to focus the explanation on the relevant parts of $x$. As we will show, XAI methods that do not take a contrast image as part of their input can sometimes identify parts of $x$, known to be irrelevant, as being responsible for the classification.
The following steps generate an explanation of prediction $y$ for image $x$:
(1) GAN-Augmented segmentation algorithm. This algorithm is based on our findings (in Section 5.4) that the segments ($S_h$) determined by analysing high intensity differences between an image $x$ and its corresponding GAN generated image $x'$ will often miss regions of $x$ that are important to explaining $x$'s classification. It is therefore necessary to supplement segments $S_h$ with a second set of segments $S_l$ confined to those regions of $x$ corresponding to low intensity differences between $x$ and $x'$. $S_l$ is created based on similar textures/intensities/colours solely within $x$.
Pseudocode for our algorithm is shown in Algorithm 1. First, high and low thresholds ($T_h$ and $T_l$) are determined by comparing the differences between $x$ and $x'$ using multi-Otsu; alternatively the thresholds can be user-specified. $T_h$ is then used to generate a set of segments, $S_h$. The supplementary segments, $S_l$, are determined by applying the low threshold, $T_l$, to the low intensity regions and then applying a sequence of connected component labelling, erosion and Felzenszwalb [7]. The combined set of segments, $S_h$ and $S_l$, is checked to see if any individual segment is an image-counterfactual. If none is found, an iterative process is applied to gradually increase the minimum segment size parameter. The final set of segments ($S$, $S'$) is subsequently created using the combined set ($S_h$, $S_l$) as shown in Figure 3.
(2) Determine $x$'s image-counterfactuals. A dataset of perturbed images is created by selectively replacing segments of $x$ with the corresponding segments of $x'$ (see Figure 4). A separate image is created for every combination in which either 1, 2, 3, or 4 segments are replaced. Each perturbed image is then passed through $m$ to determine its classification probability. All image-counterfactuals involving changes in up to four segments are then identified.
(The maximum number of perturbed segments in a counterfactual is a user parameter; the decision to set it to 4 in our experiments was made as we found counterfactuals involving 5+ segments to have little additional explanatory value.)
(3) Perform a stepwise logistic regression. A tabular dataset is created by using a {0,1} representation of the segments in each perturbed image from step 2. Consider a perturbed image $x_p$. This will be composed of a combination of segments $s_i$ from the original image $x$ and segments $s'_i$ from the GAN contrast image $x'$. In order to represent $x_p$ in tabular form, each segment of $x_p$ that is from $x$ is represented as a 1 and each segment of $x_p$ that is from $x'$ is represented as a 0. For example, if $x_p$ consisted solely of $\{s'_1, s_2, s_3, s_4\}$, and had a classification probability from $m$ equal to 0.75 of being 'pleural effusion', then this would be represented in tabular form as {0, 1, 1, 1, 0.75}. The table of representation vectors can then be used to generate a weighted logistic regression in which those perturbed images that are image-counterfactuals are given a high weighting and act as soft constraints.
(4) Calculate segment importance scores.These are the regression coefficients for each segment from step 3.
(6) Measure the fidelity of the regression by calculating fidelity errors (see Figure 5) and goodness of fit statistics.
(7) Iterate to the best explanation. Because CLEAR Image produces fidelity statistics, its parameters can be changed to achieve a better trade-off between interpretability and fidelity. For example, increasing the number of segments in the regression equation and including interaction terms might each increase the fidelity of an explanation but reduce its interpretability.
Fig. 5. Extracts from a CLEAR Image report. The report identifies that substituting both segments 4 and 11 with the corresponding segments from its contrast image flips the classification probability to 'healthy'. According to the logistic regression equation these substitutions would change the probability of the X-ray being classified as 'pleural effusion' to 0.44. However, when these segments are actually substituted and passed through the classifier, the probability changed to 0.43, hence the fidelity error is 0.01. CLEAR Image also identifies that substituting segments 3 and 11 creates an image-counterfactual. Note that unlike methods such as Grad-CAM, CLEAR Image is able to identify segments that have a negative impact on a classification probability.
For CLEAR Image an explanation is a tuple $\langle G; C; r; O, e \rangle$, where $G$ are segment importance scores, $C$ are image-counterfactuals, $r$ is a regression equation, $O$ are the causes resulting in overdetermination, and $e$ are fidelity errors.
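Steps (2), (3), (4) and (6) above can be sketched end to end in plain Python. This is a simplified illustration under stated assumptions, not the paper's implementation: `toy_classifier` is a hypothetical stand-in for the classifier, the counterfactual weight of 10 is an arbitrary choice, the fit uses plain gradient descent, and the stepwise (AIC-based) variable selection is omitted.

```python
import math
from itertools import combinations

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def toy_classifier(row):
    # Hypothetical stand-in for the classifier: segments 0 and 2 drive 'diseased'.
    return sigmoid(-3.0 + 4.0 * row[0] + 0.5 * row[1] + 4.0 * row[2])

def perturbation_rows(n_segments, max_replaced=4):
    """Binary rows: 1 = segment kept from x, 0 = segment infilled from x'."""
    rows = []
    for k in range(1, min(max_replaced, n_segments) + 1):
        for replaced in combinations(range(n_segments), k):
            rows.append([0 if j in replaced else 1 for j in range(n_segments)])
    return rows

def fit_logistic(rows, probs, weights, lr=0.5, epochs=10000):
    """Weighted logistic regression on probability targets (gradient descent)."""
    n = len(rows[0])
    w, b = [0.0] * n, 0.0
    total = sum(weights)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n, 0.0
        for row, p, wt in zip(rows, probs, weights):
            err = wt * (sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) - p)
            grad_b += err
            for j, xj in enumerate(row):
                grad_w[j] += err * xj
        b -= lr * grad_b / total
        w = [wj - lr * g / total for wj, g in zip(w, grad_w)]
    return b, w

rows = perturbation_rows(n_segments=3, max_replaced=3)
probs = [toy_classifier(r) for r in rows]
# Image-counterfactuals: perturbed images whose probability falls below 0.5.
is_cf = [p < 0.5 for p in probs]
# Counterfactuals get a high weight so that they act as soft constraints.
weights = [10.0 if cf else 1.0 for cf in is_cf]
intercept, coefs = fit_logistic(rows, probs, weights)  # coefs = importance scores
# Fidelity error: |regression estimate - classifier probability| per counterfactual.
fidelity_errors = [
    abs(sigmoid(intercept + sum(c * v for c, v in zip(coefs, r))) - p)
    for r, p, cf in zip(rows, probs, is_cf) if cf
]
```

Because the toy classifier is itself logistic, the fidelity errors here are small; with a real classifier they indicate how far the regression can be trusted for each counterfactual.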
The regression equation is a causal equation with each independent variable (each referring to whether a particular segment is from $x$ or $x'$) being a direct cause of the classification probability. Figure 5 shows an extract from a CLEAR Image report. Pseudocode summarising how CLEAR Image generates an explanation is provided in Algorithm 2.
The causal overdetermination of an effect occurs when multiple sufficient causes of that effect occur. By default, CLEAR Image only reports sufficient causes which each consist of a single segment belonging to $S$. Substituting a sufficient cause for its corresponding member in $x'$ guarantees the effect. In the philosophy of science, it is generally taken that for an effect to be classified as overdetermined, it should be narrowly defined, such that all the sufficient causes have the same, or very nearly the same impact [17]. Hence for the case studies, the effect is defined as $p(x \in \text{diseased}) > 0.99$, though the user may choose a different probability threshold. A sufficient cause changes a GAN generated healthy image to a diseased image. This is in the opposite direction to CLEAR Image's counterfactuals whose perturbed segments flip the classification to 'healthy'. Sufficient causes can be read off from CLEAR Image's regression equation.
Fig. 6. Overdetermination. The report identifies segments 9 and 11 as each sufficient to have caused the original X-ray to be classified as 'pleural effusion' with a probability greater than 0.99. Hence this is a case of causal overdetermination. The corresponding GAN-generated image $x'$ has a classification probability ≈ 0 for pleural effusion. If a perturbed image $x_p$ was created by substituting all the segments of the original image $x$ with the corresponding segments of $x'$ except for segment 9, then $x_p$ would still have a classification probability for pleural effusion greater than 0.99. The same would apply if only segment 11 was substituted.
Using the example in Figure 6 with the logistic formula, a classification probability of > 0.99 requires $\mathbf{w}^T\mathbf{x} > 4.6$. The GAN healthy image corresponds to all the binary segment variables being equal to 0. Hence, $\mathbf{w}^T\mathbf{x}$ is equal to the intercept value of -4.9, giving a probability of $(1 + e^{4.9})^{-1} \approx 0.01$. If a segment $s'_i$ is now replaced by $s_i$, the corresponding binary variable changes to 1. Hence if segment 9 is infilled, then Seg09 = 1 and $\mathbf{w}^T\mathbf{x} = 6.8$ (i.e. 11.7 − 4.9). Similarly, infilling just segment 11 will make $\mathbf{w}^T\mathbf{x} > 4.6$. Either substitution is sufficient to guarantee $\mathbf{w}^T\mathbf{x} > 4.6$, irrespective of any other changes that could be made to the values of the other segment variables. Hence segments 9 and 11 are each a sufficient cause leading to overdetermination.
By contrast, XAI methods such as LIME and Kernel SHAP cannot identify cases of overdetermination. This is because they use simple linear regression instead of logistic regression. For example, suppose that an image has three segments: $s_1$, $s_2$, $s_3$. In the regression dataset, each segment infilled from $x$ has a value of 1 and each segment infilled from $x'$ has a value of 0. LIME/Kernel SHAP's regression equation will have the form: $y = w_1 s_1 + w_2 s_2 + w_3 s_3$. In the case of LIME, $y$ is meant to be the classification probability and the regression coefficients ($w_1$, $w_2$, $w_3$) are the feature importance scores.
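Reading sufficient causes off a logistic regression equation can be sketched as follows. The intercept (-4.9) and the Seg09 coefficient (11.7) are taken from the Figure 6 example; every other coefficient, and the function name `sufficient_single_segments`, is hypothetical. The worst-case test (all other negative-coefficient segments present) is our reading of "irrespective of any other changes", not a stated rule of the method.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sufficient_single_segments(intercept, coefs, threshold=0.99):
    """Segments that alone keep the probability above `threshold`,
    even in the worst case for the remaining segment variables."""
    logit_needed = math.log(threshold / (1.0 - threshold))  # about 4.6 for 0.99
    found = []
    for name, w in coefs.items():
        # Worst case: every other segment with a negative coefficient is present.
        worst = intercept + w + sum(
            v for k, v in coefs.items() if k != name and v < 0
        )
        if worst > logit_needed:
            found.append(name)
    return found

intercept = -4.9            # from the Figure 6 example
coefs = {
    "Seg09": 11.7,          # from the Figure 6 example
    "Seg11": 10.5,          # hypothetical value
    "Seg03": 1.2,           # hypothetical value
    "Seg04": -0.6,          # hypothetical value
}
causes = sufficient_single_segments(intercept, coefs)
overdetermined = len(causes) >= 2
healthy_probability = sigmoid(intercept)  # all segment variables equal to 0
```

With these illustrative numbers, two segments pass the test, so the classification would be reported as overdetermined.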
Let us suppose there is overdetermination, with segments $s_1$ and $s_2$ each being a sufficient cause for $x$ to be in a given class (e.g. 'pleural effusion') with more than 0.99 probability. Hence, the regression equation should set $y$ to a value greater than 0.99 not only when $s_1 = s_2 = 1$, but also when either $s_1 = 1$ or $s_2 = 1$. This is clearly impossible with the above linear form (and the constraint that $y \leq 1$). Mutatis mutandis, the same argument applies for Kernel SHAP.
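The impossibility can be written out explicitly for the three-segment linear form, with $w_i$ denoting the regression coefficients and $s_i$ the binary segment variables (this worked step is our own illustration of the argument above):

```latex
\begin{align*}
y(s_1{=}1,\, s_2{=}0,\, s_3{=}0) &= w_1 > 0.99, \\
y(s_1{=}0,\, s_2{=}1,\, s_3{=}0) &= w_2 > 0.99, \\
y(s_1{=}1,\, s_2{=}1,\, s_3{=}0) &= w_1 + w_2 > 1.98 > 1 .
\end{align*}
```

The third line violates the constraint that a probability cannot exceed 1, so no choice of coefficients can represent both sufficient causes at once. A sigmoid avoids this because it saturates below 1 however large $\mathbf{w}^T\mathbf{x}$ becomes.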

EXPERIMENTAL INVESTIGATION
There are two case studies, the first using a synthetic dataset, the second analysing pleural effusion X-rays taken from the CheXpert dataset [11]. Transfer learning was used to train both a VGG-16 with batch normalisation and a DenseNet-121 classifier for each dataset. CLEAR Image was evaluated against Grad-CAM, Extremal Perturbations and LIME. The evaluation consisted of both a qualitative comparison of saliency maps and a comparison of pointing game and intersection over union (IoU) scores. CLEAR Image's fidelity errors were also analysed (none of the other XAI methods measures fidelity).

Datasets
The synthetic dataset's images share some key characteristics found in medical imaging including: (i) different combinations of features leading to the same classification; (ii) irrelevant features. All images (healthy and diseased) contain a set of concentric circles, a large and a small ellipse. An image is 'diseased' if either: (1) the small ellipse is thin-lined, and the large ellipse contains a square or (2) there is a triangle, and the large ellipse contains a square. The dataset is an adaptation of [31].
CheXpert is a dataset of chest X-rays with automated pathological label extraction through radiology reports, consisting of 224,316 radiographs of 65,240 patients in total. Images were extracted just for the classes 'pleural effusion' and 'no finding'. Mis-classified images and images significantly obstructed by supporting devices were manually filtered.
A random frontal X-ray image per patient was collected. In total, a dataset of 2,440 images was used in this work for model training, validation and testing. Appendix A.2 details the data preparation process. A hospital doctor provided the ground truth annotation to the X-ray images with pleural effusion for our case study.

Evaluation Metrics
This paper uses two complementary metrics to evaluate XAI methods. Both require annotated images identifying 'target' regions that should be critical to their classification. A pointing game produces the first metric, which measures how successfully a saliency map 'hits' an image's targets. Previously pointing games have been designed for cases where (i) images have single targets and (ii) the saliency maps have a maximum intensity point [8,34]. By contrast, this paper's case studies have multiple targets, and the pixels within each CLEAR Image segment have the same value. We, therefore, formulated a novel pointing game. The pointing game partitions a 'diseased' image into 49 square segments, $P = \{p_1, \ldots, p_{49}\}$, and identifies which squares contain each of the targets. The corresponding saliency map is also partitioned, and each square is allocated a score equal to the average intensity of that square's pixels, $Q = \{q_1, \ldots, q_{49}\}$. The pointing game then starts with the $q_i$ of highest intensity and determines if the corresponding $p_i$ contains a relevant feature. A successful match is a 'hit' and an unsuccessful match is a 'miss'. This process continues until every target has at least one hit. The score for an image is the number of hits over the number of hits plus misses. Pseudocode is provided in Algorithm 3.
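The pointing game described above can be sketched as follows. This is a minimal reading of the prose, not a transcription of Algorithm 3: the function name and data layout are hypothetical, ties in intensity are broken by square index, and a square belonging to an already-hit target still counts as a hit (the text does not specify either point).

```python
def pointing_game(saliency_scores, target_squares):
    """saliency_scores: mean intensity of each grid square (the set Q).
    target_squares: one set per target, holding the indices of the grid
    squares (from the set P) that contain that target."""
    order = sorted(range(len(saliency_scores)),
                   key=lambda i: saliency_scores[i], reverse=True)
    all_target_squares = set().union(*target_squares)
    hit_targets = [False] * len(target_squares)
    hits = misses = 0
    for i in order:
        if all(hit_targets):
            break  # every target has at least one hit
        if i in all_target_squares:
            hits += 1
            for t, squares in enumerate(target_squares):
                if i in squares:
                    hit_targets[t] = True
        else:
            misses += 1
    return hits / (hits + misses) if hits + misses else 0.0

# Toy example on a 7x7 grid (49 squares) with two targets.
scores = [0.0] * 49
scores[10], scores[3], scores[30] = 0.9, 0.8, 0.7
targets = [{10, 11}, {30}]
score = pointing_game(scores, targets)
```

In this toy case the most salient square hits the first target, the second most salient square misses, and the third hits the second target, so the game stops with two hits and one miss.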
The second metric is IoU. It is assumed that each pixel in a saliency map is classified as 'salient' if it is above the 70th percentile of the intensities in that map. IoU then measures the overlap between the 'salient' pixels $A_s$ and the pixels belonging to the image's targets $A_t$: $\mathrm{IoU} = |A_s \cap A_t| \, / \, |A_s \cup A_t|$.
The chosen percentile was an empirically identified threshold to maintain a relatively high IoU score by balancing high intersection with $A_t$ and small union of pixel regions with a large enough $A_s$ (see Appendix A.1 for details).
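A minimal sketch of the IoU computation is given below. The function name is hypothetical, and the nearest-rank percentile rule is an assumption, since the text does not say how the percentile is computed; other percentile definitions would shift the threshold slightly.

```python
def iou_at_percentile(saliency, target_mask, percentile=70):
    """IoU between pixels above the given percentile of the saliency map
    and the annotated target pixels (non-zero entries of target_mask)."""
    flat = sorted(v for row in saliency for v in row)
    # Nearest-rank percentile using integer arithmetic (an assumption).
    rank = max(1, (percentile * len(flat) + 99) // 100)
    threshold = flat[rank - 1]
    salient = {(r, c) for r, row in enumerate(saliency)
               for c, v in enumerate(row) if v > threshold}
    target = {(r, c) for r, row in enumerate(target_mask)
              for c, v in enumerate(row) if v}
    union = salient | target
    return len(salient & target) / len(union) if union else 0.0

# Toy 2x5 saliency map and target annotation (hypothetical values).
saliency = [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
target_mask = [[0, 0, 0, 0, 0], [0, 0, 0, 1, 1]]
iou = iou_at_percentile(saliency, target_mask)
```

Here three pixels are salient and two of them lie on the target, giving an intersection of 2 over a union of 3.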
Both metrics are useful but have counterexamples. For example, IoU would give too high a score to a saliency map that strongly overlapped with a large image target but completely missed several smaller targets that were also important to a classification. However, applied together, the two metrics provide a good indication of an XAI method's performance.

Experimental Runs
CLEAR Image was run using logistic regression with the Akaike information criterion; full testing and parameter values can be found in Appendix B.3. The test datasets consisted of 95 annotated X-rays and 100 synthetic images. The average running time for CLEAR Image was 20 seconds per image for the synthetic dataset and 38 seconds per image for the CheXpert dataset, running on a Windows i7-8700 RTX 2070 PC. Default parameter values were used for the other XAI methods, except for the following beneficial changes: Extremal Perturbations was run with 'fade to black' perturbation type, and using areas {0.025, 0.05, 0.1, 0.2} with the masks summed and a Gaussian filter applied. LIME was run using Quickshift segmentation with kernel sizes 4 and 20 for the CheXpert and synthetic datasets respectively.
Algorithm 3: Pointing Game. Input: input image, annotated features, XAI saliency map. The game breaks once all features are hit and returns <hits, misses>.

Experimental Results
CLEAR Image outperforms the other XAI methods on both datasets (Figure 7a). Furthermore, its fidelity errors are low, indicating that the regression coefficients are accurate for the counterfactually important segments (Figure 7b). Figure 7c illustrates some of the benefits of using the 'Best Configuration', which uses GAN-augmented segmentation and infills using $x'$. This is compared with (i) segmenting with Felzenszwalb and infilling with $x'$, (ii) segmenting with GAN-augmented segmentation but infilling with black patches, and (iii) segmenting with Felzenszwalb and infilling with black patches. Figure 8 illustrates how CLEAR Image's use of GAN-augmented segmentation leads to a better explanation than just using a difference mask (e.g. [...]). CLEAR Image's performance was similar for VGG-16 and DenseNet; therefore, only the DenseNet results are presented unless otherwise stated.
CLEAR Image's regression equation was able to capture the relatively complex causal structure that generated the synthetic dataset. Figure 9 shows an example. A square (SQ) is a necessary but insufficient cause for being diseased. An image is labelled as diseased if there is also either a triangle (TR) or the small ellipse is thin-lined (TE). When SQ, TR and TE are all present in a single image, there is a type of overdetermination in which TR and TE are each a sufficient cause relative to the 'image with SQ already present'. As before, a diseased image corresponds to the binary segment variables equalling one, and a classification probability of being diseased > 0.99 requires $\mathbf{w}^T\mathbf{x} > 4.6$. This can only be achieved by Seg 5 (corresponding to SQ) plus at least one of Seg 2 or Seg 7 (TE, TR) being set to 1 (i.e. being present).
For the CheXpert dataset, Figure 11 illustrates how CLEAR Image allows for a greater appreciation of the pathology compared to 'broad-brush' methods such as Grad-CAM (please see Appendix A.1 for further saliency maps). Nevertheless, the IoU scores highlight that the segmentation can be further improved. For CheXpert's counterfactuals, only 5% of images did not have a counterfactual with four or fewer $s'$ segments. Most images required several segments to be infilled before their classification flipped to 'healthy': 17% required one segment, 30% two segments, 24% three segments and 24% four segments. 17% of the X-rays were found to be causally overdetermined.

CONCLUSION AND FUTURE WORK
A key reason for CLEAR Image's outperformance of other XAI methods is its novel use of GANs. It recognises that a difference mask is only the starting point for an explanation. Instead, it uses a GAN image both for infilling and as an input into its own segmentation algorithm.
As AI systems for image data are increasingly adopted in society, understanding their implicit causal structures has become paramount. Yet the explanations provided by XAI methods cannot always be trusted, as the differences in Figure 11's saliency maps show. It is therefore critical that an XAI method measures its own fidelity. By 'knowing when it does not know', it can alert the user when its explanations are unfaithful.
The examples in this paper help to illustrate our claim that XAI counterfactual methods will often fail to provide satisfactory explanations of a classifier's local input-output behaviour. This is because a satisfactory explanation requires both counterfactuals and a supporting causal equation. It is only because CLEAR Image produces a causal equation that it is then able to identify (a) segment importance scores, including identifying segments with negative scores (Figure 5), (b) segments that are necessary but insufficient causes (Figure 9), and (c) cases of overdetermination (Figure 6). Providing only counterfactuals by themselves is clearly insufficient; imagine another science, say physics, treating a statement of counterfactuals as being an explanation, rather than seeking to discover the governing equation. Perhaps the primary benefit of XAI counterfactual methods is in suggesting sets of actions, but as Karimi et al. [13] argue, counterfactual methods will often perform suboptimally at this task.
A limitation of CLEAR Image is that it first requires training a GAN, which can be a challenging process. Another possible limitation could be the understandability of CLEAR Image to non-technical users. However, its reports can be suitably tailored, e.g. only showing saliency maps, lists of counterfactuals and cases of overdetermination.
We have shown that CLEAR Image can illuminate cases of causal overdetermination. Many other types of causal structures may also be ubiquitous in AI. For example, causal pre-emption and causal clustering are well documented within the philosophy of science [2,21]. The relevance of these to XAI will be a future area of work. A user study should also be carried out. However, such studies are time- and resource-consuming and need to be devised carefully by experts within specific application domains to produce sound, reliable results. Instead, we focus on objective measures and evaluations of XAI research, which in our view must precede any user study. Future work will also focus on improving segmentation, e.g. by introducing domain-specific constraint parameters for GANs, to minimise the modifications of specified attributes (e.g. changes in the heart when generating lung X-rays).

Figure 12 shows the additional results for the DenseNet model while Figure 13 presents the results for the VGG16 model. These results demonstrate that CLEAR Image identifies significant segments matching the annotated ground truth with higher precision than the other explanation methods. These two figures provide a qualitative comparison to supplement the results presented in Figure 7, where CLEAR Image outperforms other XAI methods.

A.2 Data Pre-Processing
CheXpert has a total of 14 pathological classes including 'No Finding', and these are labelled through an automated rule-based labeller from text radiology reports. For each observation, the Stanford team has classified each radiograph as either negative (0), uncertain (-1) or positive (1). Other metadata includes gender, age, X-ray image projection and presence of supporting devices.
In this study, this dataset (v1.0) was used to develop a model for a binary classification task, in order to demonstrate the capability of CLEAR Image as an XAI framework. An initial filtering of the metadata was applied for the two classes used in the study: (1) Diseased with Pleural Effusion and (2) Healthy (this was assumed to be X-ray images with no findings and no positive observations in any of the pathological conditions). To minimise potential complications with other pathological conditions, only X-ray images that were positive for pleural effusion and either negative or blank for all other pathological categories were used.
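The filtering rule described above can be sketched as follows. This is an illustrative reading of the text, not the authors' actual pre-processing code: the function name, the dictionary representation of a metadata row, and the treatment of uncertain labels (-1) as disqualifying for the diseased class but tolerated for the healthy class are all assumptions.

```python
from typing import Optional

TARGET = "Pleural Effusion"
NO_FINDING = "No Finding"


def assign_class(labels: dict) -> Optional[str]:
    """Map one radiograph's observation labels to 'diseased', 'healthy' or None.

    `labels` holds only observation columns, with values 1 (positive),
    0 (negative), -1 (uncertain) or None (blank).
    """
    others = [v for name, v in labels.items() if name not in (TARGET, NO_FINDING)]
    # Diseased: positive for pleural effusion, all other observations negative/blank.
    if labels.get(TARGET) == 1 and all(v in (0, None) for v in others):
        return "diseased"
    # Healthy: 'No Finding' is positive and no observation is positive.
    no_positive = labels.get(TARGET) != 1 and all(v != 1 for v in others)
    if labels.get(NO_FINDING) == 1 and no_positive:
        return "healthy"
    # Anything else is excluded from the study.
    return None
```

Rows for which the function returns `None` would be dropped before the manual review of image quality.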
A review of the filtered images also identified that the dataset contained some images with significant artefacts that can hamper model training performance. Figure 15 presents some of these images in both diseased and healthy categories. Many of these consisted of artefacts from image capturing and processing (e.g. image distortion, orientation, low resolutions or miscalibration). Some images were also significantly obstructed by limbs or support devices. Some healthy images were also wrongly labelled, according to a hospital doctor who assisted in our project. A secondary manual filtering was conducted to remove any identified images with artefacts.
The 2440 selected images were split approximately 80/10/10 for training/validation/testing. The images were also resized to 256 x 256 as the input into the classification model and generative adversarial network (GAN) as described in Section 5. Figure 16 presents some typical images in the final dataset for both diseased and healthy categories.

The DeScarGAN architecture was adopted for the synthetic dataset in Section 5.1. 80% of the dataset (4000 images) was used for GAN training and 20% of the dataset (1000 images) was used for validation. A total of 2,500 epochs was run and the best epoch was selected on visual quality. An additional 100 images were generated as an out-of-sample test dataset. The Adam optimizer was used with $\beta_1 = 0.5$, $\beta_2 = 0.999$. An initial learning rate of $10^{-4}$ was used and stepped down to a final learning rate of $10^{-6}$. Default hyperparameters for loss functions were used to mimic a similar investigation from the original author, as shown below.

StarGAN-V2 [6] has been adopted in this work as a state-of-the-art GAN network for image translation. The GAN provided the necessary contrastive images for the CheXpert dataset. Default hyperparameters were maintained, while notable loss weights are highlighted in Table 2. The Adam optimizer was used with $\beta_1 = 0$, $\beta_2 = 0.99$. A total of 50,000 epochs were run for the CheXpert dataset. The style encoding was referenced to the input image for the translation to the targeted class. This aided in maintaining the general features of the images compared to the original. As StarGAN-V2 [6] did not constrain its generation to a localised region (e.g. lungs), post-processing of segmentation and blending was implemented for the CheXpert dataset. Segmentation of the lung region was based on a pre-trained model with a U-Net architecture.
The segmentation mask was subsequently used to guide the replacement of pixels within the lung region from the GAN-generated healthy image onto the original diseased image. Gaussian blur was applied to minimise the edge effect during the blending process. This post-processing step aided in restricting the feature identification space within the lungs and reducing the computational cost for locating the counterfactuals.
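The blending step can be illustrated with a small sketch. It assumes that the Gaussian blur is applied to the binary lung mask so that the transition between GAN-generated and original pixels is feathered; the text does not state exactly where the blur is applied, so this is one plausible reading rather than the authors' implementation. Images are represented as nested lists of floats to keep the example self-contained, and all function names, the kernel radius and sigma are hypothetical.

```python
import math


def gaussian_kernel(sigma: float, radius: int) -> list:
    """Normalised 1-D Gaussian kernel of length 2 * radius + 1."""
    weights = [math.exp(-(d * d) / (2.0 * sigma * sigma))
               for d in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_rows(img: list, kernel: list) -> list:
    """Convolve each row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    width = len(img[0])
    return [
        [sum(k * row[min(max(x + d, 0), width - 1)]
             for d, k in zip(range(-radius, radius + 1), kernel))
         for x in range(width)]
        for row in img
    ]


def transpose(img: list) -> list:
    return [list(col) for col in zip(*img)]


def gaussian_blur(img: list, sigma: float = 1.0, radius: int = 2) -> list:
    """Separable Gaussian blur: blur rows, then columns."""
    kernel = gaussian_kernel(sigma, radius)
    return transpose(blur_rows(transpose(blur_rows(img, kernel)), kernel))


def blend(original: list, gan_healthy: list, lung_mask: list,
          sigma: float = 1.0, radius: int = 2) -> list:
    """Paste GAN pixels inside the (feathered) lung mask onto the original."""
    soft = gaussian_blur(lung_mask, sigma, radius)
    return [
        [m * g + (1.0 - m) * o for o, g, m in zip(o_row, g_row, m_row)]
        for o_row, g_row, m_row in zip(original, gan_healthy, soft)
    ]
```

Pixels far from the mask keep their original values, pixels well inside the mask take the GAN values, and pixels near the mask boundary are a weighted mix of the two, which is what suppresses a visible seam.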

Loss Term | Weight Value
Style Reconstruction Loss | $\lambda_{sty} = 1$
Style Diversification Loss | $\lambda_{ds} = 1$
Cyclic Loss | $\lambda_{cyc} = 1$

An evaluation of similarity to real healthy images was performed using the Fréchet inception distance (FID) [10], benchmarking against the set of healthy images in the model training dataset. Four image sets were compared: (1) real healthy images in the validation set, and images with pleural effusion processed as described in Figure 2 with lung segments replaced by (2) the corresponding GAN-generated healthy images, (3) a Gaussian-blurred version of the original images and (4) a constant value of zero (i.e. black). As observed in Figure 17, the processed images with replacement using GAN-generated healthy lung segments more closely resemble actual healthy images than those using blurred or black segments. As such, the GAN-generated processed images described in Figure 2 were selected as the synthetic healthy images for this work.

Fig. 1. The CLEAR Image pipeline. The GAN produces a contrast image. CLEAR Image explains the classification probability by comparing the input image with its contrast image. It produces a regression equation that measures segment scores, reports fidelity and identifies cases of overdetermination. In this example, class $l$ is 'pleural effusion' and its contrast class $l'$ is 'healthy'. Using our DenseNet model, the X-ray shown in this figure had a probability of belonging to $l$ equal to 1, and its contrast image had a probability of belonging to $l$ equal to 0.

Fig. 2. The process of generating a contrast image. An original diseased image is first used to generate a healthy contrast image with a trained GAN model. In this example, StarGAN v2 is used as the architecture. The generated healthy lung airspace is then segmented using a U-Net segmentation model and blended onto the original diseased image to produce the final image, with Gaussian blur applied to minimise any edging effect around the segments.

4.2 Generating Contrastive Counterfactual Explanations
Definition 4.1. An image-counterfactual $c_i$ from $l$ to $l'$ is an image resulting from a change in the values of one or more segments of $x$ to their corresponding values in $x'$ such that $\mathrm{class}(m(x)) = l$, $\mathrm{class}(m(c_i)) = l'$ and $l \neq l'$. The change is minimal in that if any of the changed segments had remained at its original value, then $\mathrm{class}(m(x)) = \mathrm{class}(m(c_i))$.
CLEAR Image uses a regression equation to quantify the contribution that the individual segments make to $y$. It then measures the fidelity of its regression by comparing the classification probability resulting from each $c_i$ with an estimate obtained from the regression equation.
Definition 4.2. Counterfactual-regression fidelity error. Let $\mathrm{reg}(c_i)$ denote the application of the CLEAR Image regression equation given image-counterfactual $c_i$. Counterfactual-regression fidelity error $= |\mathrm{reg}(c_i) - y_{c_i}|$.
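The two definitions can be illustrated with a short sketch. It is not the CLEAR Image implementation: the classifier and regression below are hypothetical toy functions over binary segment indicators (1 means the original segment is kept, 0 means it is replaced by the corresponding contrast segment), and the exhaustive search over subsets of up to four segments is an assumption made for clarity.

```python
import math
from itertools import combinations


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def classifier(x: tuple) -> float:
    """Hypothetical stand-in for the classifier's class-l probability."""
    return sigmoid(-0.5 + 1.5 * x[0] + 1.5 * x[1] + 0.2 * x[2])


def regression(x: tuple) -> float:
    """Hypothetical stand-in for the fitted regression equation."""
    return sigmoid(-0.4 + 1.4 * x[0] + 1.4 * x[1] + 0.3 * x[2])


def perturb(n_segments: int, replaced: tuple) -> tuple:
    """Indicator vector with the listed segments swapped for contrast segments."""
    return tuple(0 if i in replaced else 1 for i in range(n_segments))


def find_counterfactuals(predict, n_segments: int,
                         max_size: int = 4, boundary: float = 0.5) -> list:
    """Return minimal sets of replaced segments that flip the classification."""
    found = []
    for size in range(1, min(max_size, n_segments) + 1):
        for subset in combinations(range(n_segments), size):
            # Minimality: a superset of a known counterfactual is not minimal.
            if any(set(prev) <= set(subset) for prev in found):
                continue
            if predict(perturb(n_segments, subset)) < boundary:
                found.append(subset)
    return found


counterfactuals = find_counterfactuals(classifier, n_segments=3)

# Counterfactual-regression fidelity error for each image-counterfactual:
# absolute gap between the regression estimate and the classifier output.
fidelity_errors = {
    c: abs(regression(perturb(3, c)) - classifier(perturb(3, c)))
    for c in counterfactuals
}
```

In this toy setting, replacing either of the first two segments alone leaves the classification unchanged, while replacing both flips it, so the pair forms the only minimal image-counterfactual.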

Fig. 3. The GAN-Augmented segmentation algorithm. There are three stages. First, segments are identified from the high intensity differences between the original image $x$ and its contrast image $x'$ (a). Second, additional segments are identified from the regions of $x$ corresponding to low intensity differences between $x$ and $x'$ (b). Third, the segments from the two steps are combined (c).

Fig. 4. Determining image-counterfactuals. In this example segments $s_4$ and $s_{11}$ are evaluated both separately and in combination. Substituting $s_{11}$ with its corresponding contrast segment $s'_{11}$ creates a perturbed image (b) with the same classification probability as the original image (a). The same applies with segment $s_4$ (c). However, substituting both segments $s_4$ and $s_{11}$ results in a perturbed image (d) which has a classification probability of 0.43. Given a decision boundary at probability of 0.5, (d) would be classified as a 'healthy' X-ray and would therefore be an image-counterfactual.

Figure 10 compares the saliency maps for synthetic data.

Fig. 7. Evaluation metrics. Figure (a) compares the performances of different XAI methods with the DenseNet models. Figure (b) shows the fidelity errors for the DenseNet models. Figure (c) compares the performances of different configurations of CLEAR Image. The bars show 95% confidence intervals.

Fig. 8. GAN-Augmented Segmentation versus GAN difference mask. The difference mask identifies four segments but when CLEAR Image perturbs these, the two nearest to the top were found to be irrelevant. Of the other two segments, CLEAR Image identifies the segment it colors green to be far more important to the classification probability.

Fig. 9. Extracts from a CLEAR Image report for a synthetic image. The regression equation shows that Seg05 is a necessary but insufficient cause of the X-ray being diseased.

Fig. 10. Comparison of XAI methods on synthetic data. The pointing game scores are shown in green and the IoU scores are in purple. The maps illustrate how CLEAR Image and LIME are able to tightly focus on salient regions of an image compared to broad-brush methods such as Grad-CAM and Extremal. The significance of a patch is indicated by its red intensity.

Fig. 11. Comparison of XAI methods on X-ray. The pointing game scores are shown in green and the IoU scores are in purple. The significance of a patch is indicated by the intensity of red against the blue outlined annotated ground truth.
Fig. 14. Comparison of IoU scores for four XAI methods, (1) CLEAR Image, (2) Grad-CAM, (3) Extremal and (4) LIME, with the intensity threshold varied at 10% intervals. CLEAR Image outperforms the other XAI methods for each of the 4 intensity thresholds.

Fig. 15. Representative examples of poorly curated images including image distortion, mis-orientation, obstruction by limbs and support devices as well as significant spine deformation.

Fig. 16. Representative examples of final images for (a) diseased with identifiable regions of pathology and (b) healthy images with clear air space. All images have minimal obstructions from support devices.
The FID score indicated how close each of the four compared image sets was to the benchmark images in the training set. A low score indicated similarity between the two datasets.
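For reference, the standard definition of the Fréchet inception distance compares the Gaussian statistics of Inception features from two image sets. The symbols below are assumptions introduced here for illustration: $(\mu_r, \Sigma_r)$ denote the feature mean and covariance of the benchmark set (the healthy training images) and $(\mu_g, \Sigma_g)$ those of the image set being compared.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \mathrm{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

The distance is zero when the two sets of statistics coincide and grows as the feature distributions diverge, which is why a lower score is read as greater similarity.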

Fig. 17. Comparison of Fréchet inception distance (FID) against the training healthy image dataset with (1) a set of real healthy images in the validation set, and a set of images with pleural effusion processed as described in Figure 2 with replacement of lung segments using (2) corresponding GAN-generated healthy images, (3) Gaussian blurred version of the original images and (4) constant value of zero (i.e. black).
However, this criticism does not apply to CLEAR Image. CLEAR Image's purpose is to explain the local