Introduction

The umbrella term explicability refers to increasing the understanding of black-box artificial intelligence (AI) systems (Robbins 2019). Reducing the opacity of black-box AI systems is crucial for medical AI applications because of the moral and professional responsibility of physicians to provide reasons for decisions (Swartout 1983). As such, commentators claim that medical AI systems “must have an explainable architecture, designed to align with human cognitive decision-making processes familiar to physicians, and directly tied to clinical evidence” (Char et al. 2020). This is because the output of medical AI systems provides reasons to justify further diagnostic and therapeutic decisions (Ferreira and Monteiro 2021). In contrast, other commentators state that accurate results for decisions in medicine are more important than the ability to explain how results are produced (London 2019). Accordingly, a “defense of the black box” justifies its application in medicine if the “cost of a wrong answer is low relative to the value of a correct answer” (Holm 2019). This position prioritizes reducing harms from erroneous human decisions over having full understanding of how decisions were reached. In this paper, we mediate between these positions by distinguishing levels of explicability for obtaining valid informed consent to medical AI-aided procedures.

The discussion about explicability is rooted in a technological dilemma of black-box AI algorithms, namely the trade-off between accuracy and opacity (Reyes et al. 2020; London 2019). Accurate AI algorithms are increasingly difficult to comprehend, while comprehensible algorithms tend to perform worse. Understanding how automated decision support for doctors and patients came about is important when it comes to life-threatening surgical interventions or vigorous medications, as Smith (2018, pp. 149–150) illustrates: “I don’t know why you are ill, but my computer says ‘take these pills’ […] or recommends surgery.” This creates ethical tension because, in medicine, both physicians and patients may have a strong interest in tracing how action-guiding findings came about, without having to sacrifice performance.

The centrality of explicability in medical AI invites one to reflect on the ethical requirements for information disclosure to meet the demands of informed consent. Therefore, the aim of this paper is to determine the levels of explicability required for ethically defensible informed consent processes and how they can technically be met by developers of medical AI. To assist clinical decision-making, we will conclude by proposing four levels of explicability, i.e., disclosure, intelligibility, interpretability, and explainability, as a framework for assessing how extensively patients need to be informed within the informed consent process. Our framework is normative, so physicians can infer what information they should disclose to patients when they intend to use medical AI to assist in diagnostic or therapeutic decision-making.

Throughout this article, we take diagnostic systems in radiology as an example of clinical decision support systems (CDSS), because radiological AI-aided diagnostic systems represent the most advanced medical AI applications developed to date and are already commercially available and in clinical use (American College of Radiology 2023; Muehlematter et al. 2021). AI-driven diagnostic systems are used for cancer screening (e.g., mammography, cf. Jairam and Ha 2022), neuroimaging (e.g., dementia, cf. Ursin et al. 2021a), ophthalmology (e.g., diabetic retinopathy, cf. Ursin et al. 2021b), and dermatology (e.g., skin cancer, cf. Beltrami et al. 2022). Our findings may also be relevant for nondiagnostic medical AI systems, such as those assisting in administrative processes or those integrated in medical devices like automatically adjusting defibrillators (Brown et al. 2022; Ranschaert et al. 2021).

Methodical procedure

We proceed in five steps: first, to derive the normative requirements for information content, we summarize the ethical demands for informed consent with a particular focus on the German healthcare system. We consider it an advantage to start from the acknowledged and established ethical and legal standards for informed consent in the German healthcare system, because this allows us to be as specific as possible while acknowledging that our account is not globally exhaustive. Although there is a plethora of legal, regulatory, social, and ethical issues regarding the question “what to tell the patient” when AI is involved (Cohen 2020; de Miguel et al. 2020; Mitchell and Ploem 2018), we are particularly interested in the normative sources of the extensiveness of information required for ethical decision-making.

Second, we map the conceptions that commonly fall under the umbrella term “explicability” as described in the literature on ethical AI, policy frameworks, ethical guidelines, and even computer science. Our own contribution is to synthesize and apply the various notions of explicability in these different discourses to medical AI. The aim of the second step is to distinguish levels of explicability through a conceptual analysis. Recognizing that we cannot exhaustively cover both the medical ethics and the computer science literature within the scope of this work, we have chosen a goal-oriented approach. Therefore, our knowledge base for the conceptual analysis rests on two targeted literature searches to identify key ideas in reviews. First, we selected reviews of guidelines for ethical AI (Fjeld et al. 2020; Hagendorff 2020; Floridi and Cowls 2019; Jobin et al. 2019). Second, we selected recent taxonomies of explainable AI (XAI) in computer science (Yang et al. 2022; Graziani et al. 2022; Barredo Arrieta et al. 2020; Miller 2019; Holzinger et al. 2019; Lipton 2018; Adadi and Berrada 2018; see appendix). Although there are only early attempts to harmonize a global taxonomy of XAI and consensus across disciplinary boundaries is still lacking (Graziani et al. 2022; Miller 2019), we cover a representative scope of the relevant discussion in both ethics and computer science. We have attached the findings of our concept mapping as an appendix to this article (see Table 4 in the appendix) to facilitate the replication of our interdisciplinary approach in other medical ethics assessments.

Third, we distinguish hurdles for explicability in terms of epistemic and explanatory opacity, building on the works of Ferretti et al. (2018) and Burrell (2016). This step aims at creating a list of criteria and questions for safeguarding the ethical utilization of medical AI against the background of the four levels of opacity by Ferretti et al. (2018). Fourth, we connect the normative requirements for informed consent with the levels of opacity and conclude which level of explicability physicians normatively must reach and what patients can expect. In a last step, we show how the identified levels of explicability can technically be met from the perspective of computer science. To this end, we discuss recent attempts to develop and deploy XAI in radiology.

Requirements for informed consent

We take the established framework of Faden et al. (1986, p. 274) as a starting point for distinguishing five elements of an ethically valid informed consent: information disclosure, comprehension, competence, voluntariness, and the consent itself. These five elements are acknowledged both globally (Eyal 2019) and nationally in Germany in the ethics literature (Becker 2019, pp. 16–26). Concerning the explicability of AI models as applied in various clinical fields, all five elements can reasonably be discussed. However, issues of information disclosure and comprehension seem to gain particular importance with respect to medical AI due to the abovementioned black-box dilemma, whereas voluntariness, competence, and the consent itself are more likely to resemble “standard” settings of clinical care.

According to the traditional view of informed consent, information disclosure is closely linked to comprehension. Physicians are constantly asked to “tailor” their information in a way that is appropriate to the needs of the individual patient, e.g., by considering educational backgrounds, health literacy, or the patient’s current competence for intellectually grasping complex medical facts. Furthermore, it is well known that patients’ information needs differ and can be assessed systematically by applying standardized instruments (Christalle et al. 2019). In addition, various interventions for improving patients’ comprehension have been developed, including written, audiovisual, and interactive digital materials (Glaser et al. 2020). However, empirical evidence still indicates that patients’ understanding of the information provided by physicians is often far from optimal (Pietrzykowski and Smilowska 2021; Schenker et al. 2010). We conclude that each patient requires a specific content and quality of information to understand a medical procedure and consent to it.

At the same time, patients’ comprehension is closely linked to healthcare providers’ competence and practice in providing medical information. Recent ethical analyses highlight that comprehension and disclosure requirements rest on different normative sources: whereas disclosure aims at preventing illegitimate control over a person’s decision, the comprehension requirement focuses more on enabling the decision-maker to decide for something concrete (and not for something else; Millum and Bromwich 2021).

Besides the ethical standards of informed consent, there are also legal obligations further specifying which concrete information needs to be disclosed to patients. Meeting these requirements is a sine qua non for transforming an illegitimate act of violating the patient’s bodily integrity into an act that is in principle permissible for physicians (Becker 2019, p. 76). The German Civil Code (BGB) specifies that physicians must disclose information in a comprehensible manner on all circumstances that are essential for the treatment, in particular the diagnosis, the expected medical progression, and the therapy (BGB § 630c Abs. 2). Specifically, information on the nature, extent, execution, expected consequences and risks of the medical procedure as well as its necessity, urgency, suitability and prospects of success with regard to the diagnosis or therapy needs to be disclosed (BGB § 630e Abs. 1). The patient also has to be informed about alternatives and their different burdens, risks or chances of recovery (BGB § 630e Abs. 1). If a patient explicitly declines to be informed, he or she does not have to be informed (BGB § 630c Abs. 4 and 630e Abs. 3).

Against the background of such high requirements regarding the content and quality of information and the challenges which the practice of informed consent faces in clinical reality, the question arises how such requirements are met (or need to be further transformed) in light of a healthcare practice which is supported by AI-driven systems. In addition to issues of communication, the specific characteristics of medical AI necessitate developing new standards for disclosure and comprehension. For example, it matters whether the AI-driven system is already approved as a medical device so that physicians can trust its reliable functioning. If the system still has a novel character and clinical studies are missing, then higher demands are placed on physicians regarding both a critical benefit–risk assessment and informing patients about the details of the new treatment modality (ZEKO 2021).

The EU’s General Data Protection Regulation (GDPR) adds to this already demanding patient–physician communication the “right not to be subject to a decision based solely on automated processing” (GDPR 2016, article 22). However, there is no legally binding “right to explanation”, since it is specified only in a recital (GDPR 2016, recital 71); nevertheless, data subjects must be provided with “meaningful information about the logic involved” (GDPR 2016, articles 13.2.f, 14.2.g, and 15.1.h). We conclude that patients must be informed that a medical procedure is taking place at all (whether or not AI is involved) and physicians must disclose the circumstantial information of the medical procedure. However, the amount and quality of information regarding the nature, extent, execution, etc. of the medical procedure must be aligned to the comprehension capacity of the patient, unless he or she refuses to be informed.

Mapping concepts of explicability

Explicability is an ethical concept often used as an umbrella term that incorporates “the epistemological sense of ‘intelligibility’ (as an answer to the question ‘How does it work?’) and the ethical sense of ‘accountability’ (as an answer to the question ‘Who is responsible for the way it works?’)” (Floridi and Cowls 2019, p. 8). Although the same term is not used in computer science (see appendix), it was introduced to the debate on ethical AI by Floridi et al. (2018). Notions of explicability can be found in high-level guidelines on ethical AI (AI4People in Floridi et al. 2018; Robbins 2019, p. 499), although its conceptual value has been contested (Ursin et al. 2022; Cortese et al. 2022; Wadden 2021; Krishnan 2020; Mittelstadt 2019; Robbins 2019).

Despite the frequent use of such terms, Jobin et al. (2019) found significant differences in the meaning and justification of terms related to explicability. In their scoping review of principles in 84 guidelines for ethical AI, they clustered eleven principles, of which transparency was the most common, followed by explainability, explicability, understandability, interpretability, communication, disclosure, and showing. Floridi and Cowls (2019) synthesized six guidelines for ethical AI authored by high-profile initiatives, comprising 49 principles in total, into a five-principles approach. Hagendorff (2020) examined 22 guidelines for ethical AI and found 18 principles. He clustered the terms transparency and openness (16 mentions); explainability and interpretability (10 mentions); and openness, human oversight, control, and auditing (12 mentions). Fjeld et al. (2020) clustered under “transparency & explainability” the terms open source data and algorithms, notification when interacting with an AI, notification when AI makes a decision about an individual, regular reporting requirement, right to information, and open procurement (for governments).

There are three major shortcomings in using principles to guide ethical decisions. First, principles are vague and therefore difficult to interpret; second, principles can conflict with each other, as the authors of the four prima facie principles of biomedical ethics concede (Beauchamp and Childress 2019); and third, there is a lack of conceptual clarity because, e.g., explainability and transparency are often treated as synonymous (Kazim and Koshiyama 2021; Robbins 2019). For example, Funer (2022) refrained from defining explicability, explainability, and transparency when discussing whether accuracy or comprehension should guide information disclosure in the clinical application of AI systems. Being aware of these conceptual ambiguities means recognizing that certain concepts are domain-specific, so that, for example, explainability is used differently in ethics than in computer science (Powers and Ganascia 2020, pp. 29–33).

While these shortcomings can hinder the implementation of principles into practice (Morley et al. 2020), the medical domain has a long tradition of coherence approaches to translate “high-level commitments and principles into practical requirements and norms of good practice” (Mittelstadt 2019, p. 503). There are attempts to reconcile conceptual ambiguities between philosophy, computational disciplines, law, economics, and engineering (Mattingly-Jordan et al. 2022; Amann et al. 2020). To the best of our knowledge, the article by Miller (2019) is the most comprehensive attempt to bring together insights from the social sciences and computer science for XAI, but it does not explicitly address medicine. Therefore, we still lack interdisciplinary work on XAI that brings together medicine, ethics, and computer science concerning the information that must be provided when medical AI is utilized to justify diagnostic or treatment decisions—this is a gap we aim to help bridge.

Explicability is not considered a moral principle on its own, comparable to the other four principles of biomedical ethics (Cortese et al. 2022; Morley et al. 2020). It is linked to them mostly by instrumental chains of reasoning aimed at avoiding harm and increasing trust or performance (Ursin et al. 2022; McCoy et al. 2022). A system is explicable when it is explainable and interpretable, which makes it more transparent and thus more amenable to accountability, human oversight, and justifiable decision-making (Morley et al. 2020). The EU’s Guideline on Trustworthy AI provides a definition for explicability (High Level Expert Group on Artificial Intelligence 2019, p. 13):

Explicability is crucial for building and maintaining users’ trust in AI systems. This means that processes need to be transparent, the capabilities and purpose of AI systems openly communicated, and decisions—to the extent possible—explainable to those directly and indirectly affected. Without such information, a decision cannot be duly contested. […] The degree to which explicability is needed is highly dependent on the context and the severity of the consequences if that output is erroneous or otherwise inaccurate.

In general, we conclude that the inner workings of AI systems should be taken into account in decision-making. Specifically, our analysis of the requirements for informed consent suggests that explicability should be considered in the clinical application of medical AI. Nevertheless, explicability faces four hurdles that need to be overcome.

Hurdles for explicability

Our contribution builds on a synthesis of Burrell’s (2016) account of different forms of opacity with that of Ferretti et al. (2018) and its application to the informed consent process in the medical domain. The influential work of Ferretti et al. (2018) has already been adopted descriptively in other work on ethical issues of informed consent to AI applications (Goisauf and Cano Abadía 2022; Astromskė et al. 2021), but the perspective on AI developers by Burrell (2016) has been neglected so far. Our interdisciplinary approach acknowledges that the hurdles for explicability in the patient–physician encounter are intertwined with the interests of AI developers.

Burrell (2016, pp. 3–5) distinguishes between opacity as (1) intentional corporate secrecy, (2) technical illiteracy, and (3) the way algorithms operate at the scale of application. The cause for intentional secrecy may be a form of self-protection by companies intending to maintain “their trade secrets and competitive advantage” (Burrell 2016, p. 3). Low technical literacy, which can also be understood as a general epistemic opacity, is common among patients as well as physicians, since reading and writing code remains inaccessible to the majority of the population (Burrell 2016, p. 4). Developers, too, are affected by opacity, since algorithms and models are multicomponent systems built by teams; thus, programmers must also contend with a specific epistemic opacity, i.e., how algorithms operate at the scale of a specific application.

Ferretti et al. (2018) distinguish between the (1) lack of disclosure, (2) general epistemic opacity, (3) specific epistemic opacity, as well as (4) explanatory opacity. While Burrell (2016) used a conceptual approach to distinguish forms of opacity, Ferretti et al. (2018) examined the GDPR to identify rights for data subjects (Table 1). The most basic hurdle is when physicians themselves are not fully aware that they are working with an AI technology (Ferretti et al. 2018, pp. 326–327), e.g., when they assume they use a conventional picture archiving and communication system (PACS). Physicians might also be hesitant to disclose that an AI system is used at all because they worry that patients harbor irrational fears of AI systems; they may therefore choose, out of paternalism, not to disclose technical aspects.

Table 1 Levels of opacity based on Ferretti et al. (2018) and Burrell (2016)

The second hurdle is the general epistemic opacity of AI systems, i.e., an agent does not understand the general principles of their design and functionality (as opposed to the technical details) (Ferretti et al. 2018, p. 328; Wachter et al. 2017a, p. 76; GDPR art. 13–15). This hurdle refers to the lack of meaningful background knowledge about the components of AI systems, the importance of training data, the data processing from inputs to outputs, and how rules are established for classifications by learning from examples (Ferretti et al. 2018, pp. 327–329). In accordance with the GDPR, these are the “logic, significance, envisaged consequences, and general functionality” (Wachter et al. 2017a, p. 78).

Like Ferretti et al. (2018, pp. 327–329) and Burrell (2016, p. 4), we see the need to draw an explicit distinction between how AI systems generally work (general epistemic opacity) and how a specific AI system works (specific epistemic opacity). To provide an illustrative example, it makes a difference whether (1) a patient is satisfied by learning that AI systems can diagnose a disease from medical images, (2) a specific AI system can only discriminate between “likely has this disease” and “likely does not have this disease” (e.g., IDx-DR, cf. Ursin et al. 2021b), or (3) a yet hypothetical AI system can screen for any disease in a given medical specialty. As with conventional medical devices, persons who are not familiar with the general functioning of such a device must receive a general introduction to the technology to be able to consent to the procedure.

In light of its ethical, technical, and medical significance, the next hurdle for explicability is the specific epistemic opacity of a particular AI system, considering its specific training data, the clinical relevance of specific inputs (feature relevance), the limitations of the spectrum of possible outputs, and the internal rules for data processing (Ferretti et al. 2018, pp. 327–329). Specific epistemic opacity relates “to the question of how an AI system provides a specific outcome” (Ferretti et al. 2018, p. 327). This is important because there might be biased training data and resulting rules, which can lead to unjustified discrimination against a patient or patient group. In other words, specific epistemic opacity covers the hurdle for explicability when a particular AI system might not be the ideal choice to answer a clinical question or serve particular groups of patients.

The fourth hurdle is explanatory opacity, which “relates to the question of why an AI system provides a specific outcome” (Ferretti et al. 2018, p. 329). This also derives from the “black box problem” that “occurs whenever the reasons why an AI decision-maker has arrived at its decision are not currently understandable to the patient or those involved in the patient’s care because the system itself is not understandable to either of these agents” (Wadden 2021, p. 4). Being able to provide effective, efficient, and satisfactory justifications for individual decisions is crucial for further medical treatment (Holzinger et al. 2019).

Levels of explicability

The clinical application of medical AI requires informed consent as for any other clinical procedure (Neri et al. 2020; Brady and Neri 2020). The reason why AI systems should be treated differently from other technologies already used in patient care lies in the inherent levels of opacity of AI. The difference between conventional radiological images and an AI system analyzing those images is that the AI system’s interpretation of the imaging data is an additional task that influences the interpretation of the physician analyzing that data. Therefore, we conclude that the four levels of opacity influence the patient–physician encounter and should be countered by corresponding levels of explicability (see next section, Table 2). The aspects of information disclosure, comprehension, competence, voluntariness, and the consent itself have to be secured by considering the technological particularities of such systems. The “black box problem” makes this process particularly challenging. We found that the need for four different levels of explicability derives not only from the requirements for informed consent and the norms identified in the mapping of ethical concepts, but also from the need to respond to the above-mentioned levels of opacity.

Table 2 Levels of explicability as applied for this work

There are various interpretations of the levels of explicability medical professionals should provide. So far, Amann et al. (2020) distinguish only two levels: first, understanding how systems arrive at conclusions in general, and second, identifying features for an individual prediction. Floridi and Cowls (2019) distinguish intelligibility and accountability. Jobin et al. (2019) claim that communication and disclosure are crucial to increase explicability, i.e., the fact that AI is used, the evidence base for AI use, limitations of the AI algorithm, and auditability, thereby expanding the levels of explicability to four. The notion of four aspects is reasonable, although it is not tailored to medicine and healthcare and hence remains abstract and not rooted in the ethical requirements for informed consent.

To adapt the concept of explicability to medical practice, we propose four levels of explicability: disclosure, intelligibility, interpretability, and explainability (Table 2). We do so by synthesizing both the concept mapping of the ethical discourse and the prevalent concepts of XAI in computer science to mitigate the opacity hurdles for explicability (appendix 1). We enrich each level with an ethical guiding question and identify ethical implications specifically oriented to the patient–physician encounter. For each level, we discuss which information should be given to overcome which hurdle, why this is ethically important, and what the ethical implications are.

Disclosure

As AI-supported decision-making is not yet the standard in medical practice and the technology is still under early development, patients may have a general interest in knowing that a medical decision was influenced by AI (ZEKO 2021). The GDPR requires disclosure when persons are subject to algorithmic decision-making (GDPR 2016, article 22). In Germany, it is legally required to inform about the nature of a diagnostic procedure (BGB § 630c Abs. 2).

From an ethical perspective, the clearest case requiring disclosure of the use of medical AI is when the patient asks. Physicians should generally not lie to patients—particularly on issues that are of their direct concern, such as the provenance of reasons backing a medical decision. Furthermore, to respect autonomy, physicians have an obligation not to deceive patients. Pretending that a medical decision was reached only through human wit would be a form of deception. Lastly, physicians should avoid withholding information patients are likely to want. If patients are generally interested in whether new technologies have been used for their diagnosis or treatment, physicians incur an obligation to share such information with them.

Therefore, to obtain an ethically valid informed consent, medical professionals need to disclose the application of medical AI within diagnosis or treatment. It is not ethically desirable to subject patients to algorithmic decision-making without their awareness.

Intelligibility

The aim of intelligibility is to counter general epistemic opacity, guided by the question “How do AI systems generally work?” This question can be split up into two further questions. The first question, “What are the parts of algorithmic models?” refers to the principle of decomposability, i.e., the “ability to explain each of the parts of a model (input, parameter and calculation)” (Barredo Arrieta et al. 2020, p. 88; Lipton 2018, p. 14). Decomposability is a condition for the second question, “How do AI systems learn from training data and generate an algorithmic model’s output?”, which refers to the ability to explain the general functioning of the technology. A general overview of how medical AI systems work may be of interest to those who have a moderate interest in or skepticism towards new technologies, but have no particular worries about such systems. Informing about the general aspects of medical AI might be enough as long as physicians do not identify any factors that would place the patient at high risk.

To be intelligible, information needs to be provided on the general risks and benefits inherent in the technology. Medical AI, especially in radiology, may be beneficial by providing high diagnostic accuracy, accelerating the radiological workflow, or potentially lowering healthcare costs (Canadian Association of Radiologists 2019). Concerning risks, there may be biases in the training data, and once the model is trained, there are algorithmic bias and automation bias, i.e., “the tendency for humans to favour machine-generated decisions” (Geis et al. 2019). In Germany, the obligation to communicate these general risks derives from the German Civil Code, which states that specified information on the nature, extent, procedure, expected consequences, and risks must be disclosed (BGB § 630c Abs. 2).

Interpretability

Inspired by differentiations made in computer science (Molnar 2022, chapter 3) and radiology (Geis et al. 2019), we distinguish between explainability and interpretability. While explainability concerns local post hoc explanations of individual predictions with the aid of explanatory methods, interpretability refers to the degree to which a human can predict a model’s output by comprehending its inner workings. We treat interpretability as a feature of a specific algorithm, in contrast to intelligibility, which concerns machine learning (ML) as a technology and branch of AI. We draw this distinction because it makes a difference which specific algorithm is used for which purpose.

If there are 209 radiological AI systems commercially available (American College of Radiology 2023), why should exactly this or that one be used? Suppose a specific algorithm can diagnose 15 different diseases, but the patient has a 16th; then the algorithm is not suitable for broad screening. If an algorithm’s output is only binary, say, whether the patient has a given disease or not, the algorithm should only be used if that is the diagnostic question. If an algorithm was trained only on data from Caucasian cis-males, then this is a limitation because the algorithm will not perform well on a diverse patient population (Obermeyer et al. 2019).

Therefore, there is an ethical obligation for physicians to inform themselves about vulnerable groups and about conditions that may lead to inaccurate results, with accuracy, validity, uncertainty, and applicability serving as minimally acceptable criteria for interpretability (Arbelaez Ossa et al. 2022). Furthermore, once risks have been identified, there is the obligation to inform affected patients about individual or group risks. Lastly, the spectrum of possible outputs of a specific AI system must be disclosed. The German Civil Code demands that specific information on the suitability and prospects of success with regard to the diagnosis must be provided (BGB § 630c Abs. 2). In addition, well-informed patients may point out issues that physicians did not consider in their assessment.

We therefore agree with Ploug and Holm (2020) as well as Neri et al. (2020) that patients need to receive certain information to be able to contest AI-driven diagnostics (an illustrative sketch of these items follows the list):

  • Personal health data: information on the type and source of input data,

  • Bias: information on (a) the character of the training data; (b) how training data were categorized by domain experts; (c) how the AI model was tested,

  • Performance: information on (a) accuracy, specificity, and sensitivity; (b) how the performance was tested, and

  • Decision: information on the (a) degree of human or algorithmic agency in making decisions; (b) that physicians are responsible for the final diagnosis.
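To make these items concrete for developers and clinical users, the following minimal Python sketch (our own illustration, not part of Ploug and Holm’s or Neri et al.’s proposals; all field names and example values are hypothetical) bundles them into a simple data structure that could accompany an AI-supported report:

from dataclasses import dataclass, field
from typing import List

# Illustrative (hypothetical) structure bundling the contestability items listed
# above; field names and example values are our own, not part of the cited proposals.

@dataclass
class ContestabilityInformation:
    # Personal health data: type and source of the input data
    input_data_types: List[str] = field(default_factory=list)
    input_data_sources: List[str] = field(default_factory=list)
    # Bias: character of the training data, labelling process, and model testing
    training_data_description: str = ""
    labelling_process: str = ""
    model_testing: str = ""
    # Performance: reported metrics and how they were obtained
    accuracy: float = 0.0
    sensitivity: float = 0.0
    specificity: float = 0.0
    performance_evaluation: str = ""
    # Decision: division of agency and responsibility
    degree_of_automation: str = "decision support only"
    final_responsibility: str = "physician"

# Example with purely illustrative values for a hypothetical mammography system
sheet = ContestabilityInformation(
    input_data_types=["digital mammogram"],
    input_data_sources=["hospital PACS"],
    training_data_description="screening cohort, single vendor",
    labelling_process="double reading by radiologists",
    model_testing="external validation on a held-out site",
    accuracy=0.91, sensitivity=0.88, specificity=0.93,
    performance_evaluation="retrospective reader study",
)
print(sheet.final_responsibility)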

Explainability

Explanations refer to causes and answer why-questions (Miller 2019, p. 6). Why-questions like “Why did that specific algorithm diagnose that condition?” are the most difficult to answer, because they are counterfactual, i.e., they exclude all other possible diagnoses. In medicine, this process is called differential diagnosis and it resembles abductive reasoning, i.e., concluding that the most likely hypothesis is true because all other hypotheses cannot explain an event properly. This resembles the scientific process of falsification, i.e., the empirical falsifiability of hypotheses (that a patient may have one out of n possible conditions).

One has to differentiate what an explanation means for a physician and for a patient, because they ask different questions due to their different interests. While the physician as a domain expert uses the causes of a condition as an explanation for a medical indication and thereby as a justification for further examination, diagnosis or treatment, a patient needs an explanation for understanding his or her condition, maybe to adjust lifestyle or increase compliance. In the case of the patient’s understanding, explanations are mostly of intrinsic value because they satisfy curiosity. In all other cases, explanations are instrumental, because they are means to achieve an end. Further instrumental reasons for explainability are examination, finding meaning (reconciling contradictions and inconsistencies), managing social interaction (sharing understanding), and persuasion or “assignments of blame” (Miller 2019, p. 9). We conclude that physicians and patients need different types (and thereby levels) of explanations.

Physicians as domain experts should have access to the explanation why an algorithm reached a certain decision because the algorithm’s output justifies or contests their own decision (Henin and Le Métayer 2021). There are already preliminary proposals for situations where the AI’s output conflicts with the physician’s decision, but it is beyond the scope of this paper to consider all possible combinations of peer-disagreement between a physician, colleagues, and AI systems (Kempt and Nagel 2022). However, meeting a certain level of explicability has the advantage that physicians may be able to defend their position against peers for using or forgoing the use of AI.

As medical technologies increase in complexity, we need to acknowledge that limited resources oblige us to set limits on the level of detail in the explanations a patient can expect to receive. Beyond a certain level, we can even say that offering more details about the working of a particular medical technology becomes a supererogatory act. Under resource scarcity, spending an excessive amount of time informing one patient may also conflict with obligations towards other patients.

Applying the levels of explicability

The question remains how the levels of explicability can be applied within the information process to obtain informed consent. As the levels we propose represent a stepwise model of increased complexity, not every patient may request or need the highest level of explicability. One may consider stratified risk levels (unacceptable, high, and low or minimal risk) as the EU’s AI regulatory framework does (European Commission 2021), but the risks have to be specific. In the literature, there are proposals to concentrate on justifiability and contestability in high-stakes situations (Henin and Le Métayer 2021) or to provide minimally acceptable criteria for explainability (Arbelaez Ossa et al. 2022).

We propose two principles to translate theory into practice: first, tailoring the levels of explicability to patient needs and wishes and, second, tailoring the levels of explicability to the scope of a medical decision for a patient. Tailoring to patient needs and wishes honors the respect for autonomy from which both the “right to know” and the “right not to know” are derived. The “right not to know” is not an absolute right, but rather a right that must be “activated” (Andorno 2004), i.e., patients should be asked up to which level they wish to be informed. This requires at the very least that physicians disclose the fact that AI was used. We expect that every possible combination can be encountered in reality: patients with high technical literacy and low health literacy may wish to know how the AI system works. While having general knowledge (therefore not “needing” intelligibility), they may request interpretability out of curiosity, but may not wish to get an explanation for the AI’s specific output. A patient with low technical literacy and high health literacy may wish to receive an explanation of the reasons backing a decision, but may reject elaborations on intelligibility and interpretability.

The second principle has been recently established by Funer (2022, p. 13), stating that “the greater the scope of a medical decision for a patient [is], the more normatively decisive the patient’s insight into the factors relevant to this decision and their interpretation in the context of the patient’s personal life.” While this principle is abstract, we suggest three criteria: the level of invasiveness of the treatment, the reversibility of the treatment, and the risk of reducing the quality of life. If an AI system diagnoses lung cancer based on a patient’s radiograph, indicating surgery or chemotherapy, then the indicated procedure is invasive, not reversible, and affects the quality of life. Therefore, in this case the highest level of explicability is appropriate. If an AI system automatically determines the age of persons by assessing radiographs, then the lowest level of explicability is appropriate because this test does not entail invasive or irreversible procedures and does not affect quality of life (Fig. 1).
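For illustration only, the following Python sketch (our own simplification; the function, names, and mapping are assumptions rather than a validated clinical rule) shows how these three criteria and the patient’s expressed wish could be combined into a recommended level of explicability:

# Illustrative sketch only: maps the three criteria suggested above to a
# recommended level of explicability. Names and thresholds are assumptions.

LEVELS = ["disclosure", "intelligibility", "interpretability", "explainability"]

def recommended_level(invasive: bool, irreversible: bool,
                      affects_quality_of_life: bool,
                      patient_requested_level: str = "disclosure") -> str:
    """Return the higher of (a) the level suggested by the decision's scope
    and (b) the level explicitly requested by the patient."""
    scope_score = sum([invasive, irreversible, affects_quality_of_life])
    scope_level = LEVELS[min(scope_score, len(LEVELS) - 1)]
    # Respect the patient's wishes: never offer less than what was requested.
    ranked = max(LEVELS.index(scope_level), LEVELS.index(patient_requested_level))
    return LEVELS[ranked]

# Lung cancer diagnosis indicating surgery: invasive, irreversible, affects QoL
print(recommended_level(True, True, True))        # -> "explainability"
# Automated bone-age assessment from a radiograph
print(recommended_level(False, False, False))     # -> "disclosure"

In practice, such a heuristic could only support, never replace, the physician’s judgment about the scope of the decision and the patient’s wishes.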

Fig. 1 Levels of explicability in relation to levels of opacity. How to read the iceberg-shaped figure: start from the top and proceed to the bottom if the patient desires to move to the next level of explicability. AI artificial intelligence

XAI approaches in computer science

In a last step, we discuss whether and how the identified levels of explicability can technically be met from the perspective of computer science (Table 3). We rely on state-of-the-art XAI taxonomies (Graziani et al. 2022; Barredo Arrieta et al. 2020) and select five XAI methods specifically suited for visual data. To this end, we also discuss recent attempts at developing, deploying, and combining XAI in radiology (see section “Applying XAI in radiology”). The technical reasons for selecting these XAI methods are that their explanations can justify an AI system’s output, serve the desire for control in terms of error identification, can improve the model itself, and enable the discovery of new facts and information (Adadi and Berrada 2018).

Table 3 Selection of five main methods for explainable artificial intelligence (AI) for visual data and how they meet the levels of explicability according to Graziani et al. (2022, table 6)

Table 3 distinguishes five main methods proposed for building XAI systems for visual data. One possibility is to train so-called inherently interpretable models instead of models as opaque as deep neural networks (DNNs). For instance, Li et al. (2012) train a K-nearest neighbor (KNN) model for image-based cancer prediction. Explaining how KNNs work in general is a simple task: They store each training image along with its classification label. To classify a new image as either showing cancer or not, the KNN looks up the K stored images that are closest to the new image. When the majority of these K stored images carry the “cancer” label, then the new image is also classified as cancer. In this sense, KNNs have a high degree of intelligibility. They are also interpretable because the internal structure of a trained KNN can be visualized by showing all the stored images, their mutual distances, and the assigned labels. Finally, to explain an individual prediction of a given image, the K‑nearest neighbors of the given image and their labels can be shown: “This image has been classified as cancer because it is similar to these K images which also show cancer.” Unfortunately, inherently interpretable models such as KNNs achieve accuracy scores below those of DNNs.
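For illustration, the following minimal Python sketch (not the implementation of Li et al. 2012; the images and labels are synthetic) shows how a KNN prediction can be accompanied by its nearest training images as an explanation:

import numpy as np

def knn_predict_and_explain(train_images, train_labels, new_image, k=3):
    """Classify `new_image` by majority vote of its k nearest training images
    and return those neighbours as the explanation (illustrative sketch)."""
    # Flatten images and compute Euclidean distances to every training image.
    train = train_images.reshape(len(train_images), -1)
    query = new_image.reshape(-1)
    distances = np.linalg.norm(train - query, axis=1)
    nearest = np.argsort(distances)[:k]
    votes = train_labels[nearest]
    prediction = np.bincount(votes).argmax()       # majority label
    return prediction, nearest                     # the neighbours serve as explanation

# Synthetic stand-in data: 20 random 8x8 "images", half labelled 1 ("cancer").
rng = np.random.default_rng(0)
images = rng.random((20, 8, 8))
labels = np.array([0, 1] * 10)
pred, neighbours = knn_predict_and_explain(images, labels, rng.random((8, 8)))
print(f"Predicted label {pred}, because the image resembles training images "
      f"{neighbours.tolist()}, which mostly carry this label.")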

Several methods have been developed that help in understanding trained DNNs. Feature visualization (Nguyen et al. 2016) is a technique which is especially suitable for image classification with DNNs. A DNN consists of various layers of neurons that get activated by stimuli. The input image provides the stimulus for the first layer of neurons. If a neuron is sufficiently stimulated, it propagates a transformed stimulus further to the next layer of neurons. The idea behind feature visualization is to automatically generate images as input stimuli that maximize the activation of a neuron or a layer of neurons of interest. Some neurons may show high activity for images that contain a lot of edges, others may be more active for images with certain textures. This way, one can understand which parts of a given AI model respond to which parts of the input, and thus feature visualization contributes to a model’s interpretability. The method may also be useful for conveying how AI generally works (intelligibility). However, feature visualization does not provide an explanation of why a specific image was classified as, say, cancer.
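The following minimal PyTorch sketch illustrates the mechanics of activation maximization underlying feature visualization; the tiny untrained network is merely a stand-in for a trained diagnostic model, so the optimized image is meaningless here, but the procedure is the same:

import torch
import torch.nn as nn

# Minimal sketch of activation maximization: synthesize an input image that
# maximally activates one channel of a convolutional layer. With a trained
# diagnostic model, the optimized image would reveal the patterns that channel
# responds to; the untrained toy network below only demonstrates the mechanics.

model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 16, kernel_size=3, padding=1), nn.ReLU(),
)
model.eval()

image = torch.rand(1, 1, 64, 64, requires_grad=True)   # start from noise
optimizer = torch.optim.Adam([image], lr=0.05)
target_channel = 3                                      # channel of interest

for step in range(200):
    optimizer.zero_grad()
    activations = model(image)                          # shape (1, 16, 64, 64)
    # Maximize the mean activation of the chosen channel (gradient ascent,
    # implemented here as minimizing the negative activation).
    loss = -activations[0, target_channel].mean()
    loss.backward()
    optimizer.step()

print("Mean intensity of the optimized input:", image.detach().mean().item())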

Prototypes (Kim et al. 2016) can contribute to both interpretability and explainability. Prototypes are prototypical images from the training data set. By using the trained AI model to classify these prototypical images, one can get an overview of the model’s behavior for a handful (or so) of particularly representative images. An individual prediction can be explained by showing the prototype that is closest to the input image and receives the same prediction.
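As a simplified stand-in for the method of Kim et al. (2016), the following Python sketch selects, for each class, the training image closest to the class mean as its prototype and uses it to illustrate a given prediction (synthetic data; our own simplification):

import numpy as np

# Simplified sketch of prototype-based explanation (a stand-in for the method
# of Kim et al. 2016): pick, per class, the training image closest to the
# class mean as its prototype, and explain a prediction by showing the
# prototype of the predicted class.

def class_prototypes(images, labels):
    protos = {}
    flat = images.reshape(len(images), -1)
    for c in np.unique(labels):
        members = flat[labels == c]
        mean = members.mean(axis=0)
        protos[c] = members[np.argmin(np.linalg.norm(members - mean, axis=1))]
    return protos

def explain_with_prototype(protos, new_image, predicted_class):
    proto = protos[predicted_class]
    distance = np.linalg.norm(new_image.reshape(-1) - proto)
    return proto, distance

rng = np.random.default_rng(1)
images = rng.random((30, 8, 8))                 # synthetic stand-in "images"
labels = np.array([0, 1] * 15)                  # two classes
protos = class_prototypes(images, labels)
proto, dist = explain_with_prototype(protos, rng.random((8, 8)), predicted_class=1)
print(f"Prediction illustrated by its class prototype (distance {dist:.2f}).")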

Often, the explainee is interested in knowing what parts of the image were particularly important for the AI model’s prediction. Counterfactuals (Wachter et al. 2017b) and feature attribution methods (Ribeiro et al. 2016; Lundberg and Lee 2017; Simonyan et al. 2013) can provide this information. Counterfactuals identify regions in the image that are important for the observed prediction, that is, if these regions were removed (greyed out), the AI model would change its prediction. Feature attribution methods not only identify important regions but also assign importance values to regions in the image, viz., they can show which regions speak in favor of the AI model’s prediction and which ones speak against it.
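The following Python sketch illustrates one simple feature attribution variant, occlusion sensitivity: patches of the image are greyed out one at a time and the change in the model’s score is recorded. The predict function is a hypothetical stand-in for a trained model’s probability output:

import numpy as np

# Minimal occlusion-based attribution sketch: grey out one image patch at a
# time and record how much the model's score drops.

def predict(image):
    # Toy stand-in model (assumption for illustration): the "disease" score is
    # simply the mean intensity in the image centre.
    return image[24:40, 24:40].mean()

def occlusion_map(image, patch=8, fill=0.0):
    base_score = predict(image)
    heat = np.zeros((image.shape[0] // patch, image.shape[1] // patch))
    for i in range(heat.shape[0]):
        for j in range(heat.shape[1]):
            occluded = image.copy()
            occluded[i*patch:(i+1)*patch, j*patch:(j+1)*patch] = fill
            # Positive values: the patch supported the prediction;
            # negative values: the patch spoke against it.
            heat[i, j] = base_score - predict(occluded)
    return heat

image = np.random.default_rng(2).random((64, 64))
print(np.round(occlusion_map(image), 3))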

Applying XAI in radiology

Due to the medical relevance of XAI approaches, several review articles cover a wide range of explainability techniques in the medical domain (Yang et al. 2022; Graziani et al. 2022; Knapič et al. 2021; Adadi and Berrada 2020). In this section, we illustrate two such examples for radiology in detail. The first example has been chosen to show that even traditional saliency map techniques can still be improved. The second example demonstrates that a combination of XAI methods achieves higher user satisfaction.

Saliency maps are likely the most commonly used XAI approach in medicine. By highlighting relevant regions, it is not only possible to provide the reasoning behind an AI decision but also to generate new knowledge when the AI discovers features not represented in a labelled data set. Nevertheless, saliency techniques such as GradCAM (Selvaraju et al. 2016) often produce blurry highlights, which make localization difficult. To obtain more precise saliency maps, Major et al. (2020) have developed an approach which relies on image inpainting. During an inpainting process, they substitute healthy and unhealthy tissue. Based on the score difference between the images as well as the saliency map quality, they compute a saliency loss, which is used in an iterative optimization to obtain sharper saliency maps. Their results are demonstrated on mammograms and show a much more detailed saliency map when compared to state-of-the-art algorithms.
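For readers unfamiliar with the basic mechanics, the following PyTorch sketch computes a Grad-CAM-style map in the spirit of Selvaraju et al. (2016) for an untrained toy network (a stand-in for a diagnostic model); it does not reproduce the inpainting-based refinement of Major et al. (2020):

import torch
import torch.nn as nn
import torch.nn.functional as F

# Grad-CAM-style sketch: weight the last convolutional feature maps by the
# average gradient of the class score and combine them into a coarse saliency
# map. The untrained toy CNN below is only a stand-in for a diagnostic model.

features = nn.Sequential(
    nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(),
)
classifier = nn.Linear(16, 2)       # two classes, e.g., "finding" vs "no finding"

image = torch.rand(1, 1, 64, 64)
fmaps = features(image)                       # (1, 16, 64, 64)
fmaps.retain_grad()                           # keep gradients on the feature maps
logits = classifier(fmaps.mean(dim=(2, 3)))   # global average pooling + linear
target_class = logits.argmax(dim=1).item()
logits[0, target_class].backward()

weights = fmaps.grad.mean(dim=(2, 3))                     # per-channel importance
cam = F.relu((weights[:, :, None, None] * fmaps).sum(1))  # (1, 64, 64)
cam = cam / (cam.max() + 1e-8)                            # normalize to [0, 1]
print("Saliency map shape:", cam.shape)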

While saliency maps provide a good intuition about relevant regions within a radiological image, they lack the means to communicate via language. As highlighted image regions often require additional textual labels, or a text describing the particular medical context, natural language needs to be regarded as an important component of XAI in radiology. Gale et al. (2018) were among the first to address this demand by generating medical reports on radiological images. In their work, they propose an image-to-text model, which has been designed and trained to generate such reports for hip fractures from frontal pelvic x‑rays. By combining a DenseNet with attention mechanisms, they are able to generate textual reports for x‑ray images, whereby the report also contains details not provided through supervision. Thus, the authors were able to generate sentences that described the type and location of the fracture with great accuracy, even outperforming the original reports. When confronted with the outcomes of the system, physicians on average rated text alone (7.0) higher than saliency maps (4.4), while rating the combination of the two best (8.8) on a 10-point Likert scale. This clearly shows that XAI should not only be visual, but should also consider other means of providing information.

Conclusion and future outlook

An ethically defensible information process when utilizing medical AI is possible through four levels of explicability as consecutive steps of escalation. Disclosure is the first step and makes it possible to anticipate whether the patient desires further details. After the patient becomes aware of the intended use of medical AI, he or she is in a position to request further details or allow physicians to inform at their own discretion. Physicians should be able to offer patients the further three levels, i.e., intelligibility, interpretability, and explainability, to counter the epistemic and explanatory hurdles of medical AI. However, there is no ethical obligation to provide all three further levels in every case. In terms of applicability, we advise physicians to tailor the level of explicability to the needs of patients and the scope of the medical decision: the more invasive, the higher the effect upon quality of life, and the less reversible a medical decision is, the more levels of explicability should be provided.

We believe that our analysis of the explicatory hurdles of medical AI has implications not only for aligning information requirements in radiology in particular, but also for health care in general. We acknowledge that our analysis of the normative requirements for informing patients sets a high bar for the use of medical AI. This could lead to questioning the feasibility of these normative claims. However, instead of lowering the bar and reducing the explicatory burden physicians should bear, we rather suggest that the insights from medical AI ethics should be used to re-evaluate established medical practices and technologies. While we are increasingly learning about algorithmic biases of medical AI systems (Obermeyer et al. 2021, 2019), we also become aware that AI systems reproduce or exacerbate already existing biases, inequalities, and discriminations inherent in the training data (Ntoutsi et al. 2020). This is not bad news, as we are now witnessing a window of opportunity to address the discriminatory effects that may also be prevalent in other sectors of medical practice today.

Informing about the quality of training data and its impact on marginalized population groups should not be a matter for medical AI alone, as increasing awareness in fields such as dermatology reveals. Darker skin types are underrepresented in cutaneous imaging data, and models trained on these data perform poorly on patients with such skin tones (Kim et al. 2022). Discrimination derives not only from the training data but also from the classification system for skin types itself, the Fitzpatrick skin phototypes, because it does not capture variations in darker skin color and therefore restricts the range of options for people with darker skin. Ultimately, medical AI ethics can be an attention catalyst for confronting the biases long hidden in medical classification systems.