Introduction

As Artificial Intelligence (AI) systems continue to be integrated into society, the ambiguity and opacity of these systems have stirred considerable concern, prompting an increased focus on eXplainable AI (XAI) research. The intention of XAI is to shed light on the internal workings of AI systems, thereby making them more transparent, comprehensible, and accountable (Gunning & Aha, 2019). This effort aligns closely with the broader endeavor towards ethical governance of AI. Indeed, the ethical implications of AI technologies have gained significant attention due to their potential to perpetuate existing inequalities, produce unintended negative consequences, and create new ethical dilemmas (Blasimme & Vayena, 2020; Buyl et al., 2022; Jobin et al., 2019; Pastaltzidis et al., 2022). However, the extent to which XAI research genuinely addresses ethical considerations, and effectively assimilates them into the design, development, and evaluation of AI systems, remains a topic of considerable debate (Balasubramaniam et al., 2023; van Otterlo & Atzmueller, 2020; Kaur et al., 2020; Alufaisan et al., 2021; Chazette et al., 2019).

Explanations can be misleading, oversimplified, or biased, and may not always align with human values and preferences (Bertrand et al., 2022; He et al., 2023; Bordt et al., 2022; Balagopalan et al., 2022). Moreover, the level of detail and complexity of explanations must be carefully calibrated to the needs and capacities of different stakeholders (Bhatt et al., 2020; Liao et al., 2020; Ehsan et al., 2021; Zhang et al., 2020; Bansal et al., 2021). Such calibration is also needed to ultimately inform and support the adoption of unambiguous policies around AI explainability (Nannini et al., 2023). Designing XAI systems thus requires grappling with complex trade-offs between competing values. Navigating such ethical challenges requires a deep engagement with the principles and frameworks of moral philosophy. Ethical theories such as deontology, consequentialism, and virtue ethics offer valuable resources for evaluating different XAI techniques and approaches. At the same time, the novel and complex nature of XAI systems may require going beyond traditional ethical frameworks to develop new context-specific principles and guidelines (Floridi et al., 2018; Vainio-Pekka et al., 2023). The field of applied ethics, which focuses on translating moral theories into action-guiding principles for real-world decision-making, provides a useful lens for considering the responsible development and deployment of XAI systems (Beauchamp & Childress, 2001).

The primary research question driving this study is: “What is the extent and depth of ethical discussions within XAI research, and how are ethical theories or frameworks applied in this domain?". To pursue this research direction, we critically investigate the relationship between XAI and ethical considerations by conducting a systematic review of 410 papers gathered from Scopus. Our research queries were formalised with the aim of identifying papers proposing contributions at the intersection of XAI and ethics. We classify the papers according to their treatment of ethical aspects in the context of XAI, using a rigorous five-category approach that takes into account the presence, depth, and focus of ethical discussions, as well as the application of ethical theories. The main contributions of this paper are: (1) a novel methodology and taxonomy for conducting a bibliometric study on the current state of ethical discourse in the XAI field; (2) a classification of XAI research papers based on their treatment of ethical aspects; (3) an in-depth analysis of the findings.

The rest of the paper is organized as follows. Section 2 provides a detailed overview of the background knowledge and introduces the terminology essential for comprehending our discussion. Section 3 presents the methodology employed and describes our bibliometric approach, with the classification further detailed in Appendices A & B. In Sect. 4 we report our findings on the extent of ethical considerations within the field of XAI research. Section 5 discusses pathways towards a more comprehensive integration of ethical considerations in XAI; we account for limitations in Sect. 6 before concluding in Sect. 7.

Background

XAI techniques can foster the comprehension of AI systems by illuminating the reasoning behind their decisions. To grapple with ethical complexities, XAI research must engage substantively with normative ethical theories and principles from the field of applied ethics. This background section provides an overview of the major ethical frameworks relevant to XAI, outlines key ethical challenges in operationalizing XAI principles, and reviews related work examining the treatment of ethics in XAI research to date.

Section 2.1 summarizes the core tenets of deontological, consequentialist, and virtue ethics perspectives, considering their potential implications for the design and governance of XAI systems. Section 2.2 then discusses the crucial role of applied ethics in translating abstract moral theories into context-sensitive guidance. There, Sect. 2.2.1 briefly outlines key philosophical debates underpinning these ethical theories, such as the nature of moral reasoning, the grounding of moral principles, and the scope of moral consideration. With this philosophical grounding established, Sect. 2.3 finally positions our research among recent systematic mapping studies and self-critical assessments examining the treatment of ethics within the XAI research landscape.

Major ethical theories and their relevance to XAI

To provide a solid foundation for our analysis, it is essential to briefly discuss the main normative ethical theories that have shaped moral reasoning and decision-making, and consider their potential implications for XAI. These theories, namely deontology, consequentialism, and virtue ethics, offer distinct perspectives on what constitutes ethical behavior and how to navigate moral dilemmas (Shafer-Landau, 2012; Copp, 2006).

Deontological theories—focus on the intrinsic rightness of actions based on moral rules or duties (Kant, 1959; Ross, 1930). The core commitment is to ground objective moral principles in the nature of rational agency itself. Kant argued that moral requirements are categorical imperatives—absolute, universal duties derived from pure practical reason that hold independent of contingent desires or social conventions (Kant, 1959; Hill, 1992). Actions have moral worth only if done from a “good will"—a stable disposition to act from duty rather than mere inclination (Kant, 1996). In the context of XAI, a deontological approach would emphasize the inherent rightness of designing AI systems that respect user autonomy and provide truthful, non-deceptive explanations as a matter of moral duty, regardless of the consequences. Neo-Kantians have developed this idea in terms of respecting the autonomy of persons as “ends-in-themselves" and acting only on principles that could consistently serve as universal laws (Korsgaard, 1996; Hill, 1992; O’Neill, 1975). The focus is on the formal principle used to assess maxims, not the consequences of individual acts—this would require ensuring that the underlying principles and decision-making processes of XAI systems are universalizable and can be made transparent to users without violating their autonomy or dignity. Some deontologists, such as Ross, propose a system of prima facie duties that can conflict in particular situations, requiring agents to weigh and balance competing moral considerations (Ross, 1930). This perspective aligns with the discussion of deliberative agency and the conditions for a right to explanation (Jongepier & Keymolen, 2022).

Consequentialist theories—in contrast, maintain that the rightness of an action depends on the value of its consequences (Bentham, 1961; Mill, 1979; Sidgwick, 1907). Classic utilitarianism is the paradigmatic example, holding that we should act to maximize overall welfare or well-being impartially considered (Bentham, 1961; Mill, 1979). From this perspective, the development and deployment of XAI systems would be evaluated based on their overall impact on human welfare, taking into account factors such as the benefits of increased transparency and understanding, the potential risks of gaming or misuse, and the trade-offs between explainability and other desirable outcomes like accuracy or efficiency. More recent consequentialist views incorporate a wider range of goods beyond just pleasure or preference-satisfaction, and allow for agent-relative reasons to favor certain individuals (Parfit, 1984; Railton, 1984; Sen, 1979). But the core idea remains that the ethical status of XAI methods depends on their outcomes rather than their intrinsic nature or the motives behind them. Moral rules are at best reliable heuristics that should be set aside when better results are achievable by violating them (Smart & Williams, 1973; Hare, 1981). The consequentialist and deontological perspectives on explainability discussed in Kempt et al. (2022) illustrate the diverse ethical considerations at play in XAI design and the potential tensions between them.

Virtue ethics—shifts focus from right action to good character, contending that virtues are stable dispositions or traits that reliably lead to human flourishing (Anscombe, 1958; MacIntyre, 1981; Slote, 1992; Hursthouse, 1999). The tradition draws heavily from Aristotle’s conception of virtues as mean states between extreme vices, cultivated through proper upbringing and practical judgment (phronesis) rather than rigid rule-following (Aristotle, 1999; Sherman, 1989). While specific virtues may differ across cultures, the basic notion is that character and context are as ethically relevant as actions and their consequences (Nussbaum, 1988). Neo-Aristotelians argue that truly virtuous agents act for the right reasons and with appropriate emotions, not just in line with moral duties (Hursthouse, 1999; Foot, 1978; Oakley, 1996). Practical reasoning, akin to skill, is thus essential for translating virtuous dispositions into situationally-sensitive judgments (McDowell, 1979). In the context of XAI, a virtue ethics approach would prioritize the development of AI systems that embody and promote virtuous character traits, such as honesty and benevolence. This would require cultivating the practical wisdom necessary to discern when and how to provide explanations that are sensitive to the needs and situations of individual users, rather than simply following rigid rules or protocols.

Philosophical debates and challenges

The three main ethical frameworks raise important philosophical questions and debates that complicate the work of translating ethical principles into practice. One key debate concerns the relationship between motives, actions, and consequences in determining the moral status of an agent or decision. As seen, deontological theories emphasize the intrinsic rightness of actions based on universal duties, while consequentialist theories focus solely on outcomes. Virtue ethics, meanwhile, stresses the importance of character and moral perception in navigating context-specific challenges (Adams, 1976). These differences have implications for how XAI systems are designed and evaluated, and for how the decision-making of human agents interacting with these systems is understood and assessed. Related to this is the question of moral worth—whether right actions must flow from good will or virtuous character to be praiseworthy, or if accidental conformity to moral principles suffices (Arpaly, 2002).

Another consideration is the nature and grounding of moral principles. Deontologists argue that moral rules are grounded in the necessary requirements of rational agency, consequentialists justify them by their generally optimific results, and virtue ethicists see moral rules as heuristics to guide those still cultivating practical wisdom (Hursthouse, 1999). However, all three approaches recognize that there can be hard cases where moral rules conflict (Ross, 1930; Smart & Williams, 1973). The ethical frameworks also take different positions regarding the scope of moral consideration and the demands of impartiality, which have implications for how broadly we conceive of the moral status of AI systems and the ethical obligations we have towards them (Sidgwick, 1907; Scheffler, 1982). Finally, the broader philosophical question of the nature of intelligence shapes the way we conceive of the capacities and limitations of AI (Boden, 2006), which has significant implications for the standards of interpretability, robustness, and control that we demand from XAI systems.

Applied ethics in XAI

While major ethical theories offer valuable normative foundations, they are not always well-suited to the practical challenges of designing and governing XAI systems. This is where the field of applied ethics comes in, developing mid-level principles and context-sensitive guidance to address the moral, political and social implications of technologies in real-world settings (Felzmann et al., 2020; Beauchamp & Childress, 2001). A wealth of XAI surveys and reviews mention such applied ethics principles while identifying and reporting various explanation techniques, shedding light on diverse domain applications (Samek et al., 2017; Adadi & Berrada, 2018; Arrieta et al., 2020; Cambria et al., 2023; Stepin et al., 2021; Saeed & Omlin, 2023a; Martins et al., 2024).

As seen in the history of bioethics, a purely deductive approach that seeks to derive practical guidance from overarching moral theories is often insufficient due to the gap between theoretical principles and the nuanced ethical dilemmas encountered in practice (Gert et al., 2006; Jonsen, 2012). Drawing from the lessons of bioethics, XAI ethics should strive for a reflective equilibrium between principles, contextual factors, stakeholder perspectives, and the actual challenges arising in the development and use of explainable AI (Loi & Spielkamp, 2021; Theodorou et al., 2017). This involves an iterative process of specifying principles in light of practical considerations, while also allowing on-the-ground insights to inform the interpretation and balancing of competing principles. Effective applied ethics in XAI requires close engagement with the technical, organizational, and social realities shaping the technology, as well as the needs and concerns of diverse stakeholders (Langer et al., 2021; Muralidharan et al., 2024). This requires a sociotechnical lens attentive to cognitive biases, power dynamics, and the distribution of authority between human and algorithmic agents (Zhang et al., 2020; Kitamura et al., 2021).

In this spirit, XAI can learn from other domains where applied ethics has addressed the responsible development of emerging technologies, such as bioethics, environmental ethics, and research ethics (Cohen et al., 2014; Morley et al., 2021; Mittelstadt, 2019). These fields provide valuable strategies for inclusive stakeholder engagement, contextual awareness, balancing principles and practice, and navigating trade-offs. For example, bioethics offers tools for ethical deliberation and oversight (Solomon, 2005; Dubler & Liebman, 2011), while environmental ethics provides insights on balancing competing values amid uncertainty (Brennan & Lo, 2022).

Ethical complexity in XAI

Challenges remain in translating both major ethical theories and applied ethics concepts into concrete XAI practices (Zicari et al., 2021; Morley et al., 2023). Especially for applied ethics, conceptual tensions must be resolved between competing desiderata like transparency and privacy or efficiency and user-friendliness (Loi & Spielkamp, 2021; Theodorou et al., 2017; Mittelstadt et al., 2019; Brey, 2010; Ehsan et al., 2021). This requires going beyond blanket imperatives to consider which stakeholders need what types of explanations for which aspects of AI systems in particular contexts (Felzmann et al., 2019; Bhatt et al., 2020; Tsamados et al., 2022; Nyrup & Robinson, 2022).

The Transparency-Explainability Dilemma—One key challenge is the complex relationship between transparency and explainability. While enhanced transparency is often heralded as desirable for facilitating explainability and contributing to ethical goals (do Prado Leite & Cappelli, 2010; Cysneiros, 2013), mere amplification of transparency does not inherently lead to superior explainability without clear guidelines on what information to disclose and how to disclose it (Habibullah & Horkoff, 2021; Chazette et al., 2019; Köhl et al., 2019). The interplay between transparency and other requirements like trust, privacy, security, and accuracy must also be considered (Zerilli et al., 2019). Conflating explainability and transparency can stem from XAI designers lacking in-depth ethical understanding or researchers exploiting “ethics" rhetoric without genuine consideration of societal needs (Floridi, 2019; Bietti, 2020; Wagner, 2018a). Designers should be guided by how disclosed information will be processed and used, not just the need to disclose (Miller, 2023; Cabitza et al., 2024, 2023).

Enhancing Accountability—XAI techniques can facilitate auditing by illuminating decision processes, but accountability also requires pathways for recourse when problems are detected. Relying solely on “after-the-fact" explanations can instill false confidence without appropriate feedback channels and governance (Mökander & Axente, 2023; Bordt et al., 2022; Casper et al., 2024). Further, fairness and bias mitigation present challenges for XAI. Various mathematical definitions of fairness exist, sometimes encoding mutually exclusive criteria (Brun et al., 2018; Chouldechova, 2017). Identifying appropriate standards requires normative deliberation, not just computational evaluation, always cognizant that XAI techniques can perpetuate biases if not carefully designed (Bertrand et al., 2022; Chaudhuri & Salakhutdinov, 2019; Shamsabadi et al., 2022). Finally, still from the fairness perspective, value pluralism poses issues, as diverse stakeholders bring different ethical priorities. The same model may demand distinct explanations for different audiences (Markus et al., 2021). Trade-offs arise between competing goods in high-stakes applications, benefiting from ethical analysis and community input.

Trust and Reliance Dynamics—Engendering appropriate trust and reliance in AI remains a key XAI motivation, but real-world dynamics are fraught. More or better explanations do not automatically improve human judgment or error detection (Zhang et al., 2020; Kitamura et al., 2021; Bertrand et al., 2022). Overconfidence can lead to misplaced trust, while exposing flaws might foster undue skepticism. Levels of explainability should be based on the realities of imperfect human reasoning to avoid unfairly holding AI to a “double standard," though a higher bar may sometimes be justified, e.g., given physicians’ ability to take responsibility for their own heuristics but not for an AI’s inscrutable reasoning (Kempt et al., 2022). As Loi argues, moving beyond post-hoc explanations to consider broader institutional contexts and “design publicity" is beneficial (Loi et al., 2021).

Related mapping studies in XAI and AI ethics

As shown, core ethical principles often conflict when operationalized in real-world XAI deployments. Purely technical approaches cannot resolve the inevitable value tensions and contextual particularities at play. Instead, grappling with the ethics of XAI requires critically examining the assumptions, methods and impacts of these systems through interdisciplinary collaboration and inclusive stakeholder engagement. The field of applied ethics offers conceptual frameworks and methodological tools well-suited to this challenge, by putting technical choices in dialogue with their social and institutional context.

To better position the current research, it is worth noting that scholarly discourse has moved towards self-critical approaches in the XAI field, with meta-surveys and analogous structural work inquiring into future research directions and stronger ethical considerations (Löfström et al., 2022; Saeed & Omlin, 2023a; Ali et al., 2023; Schmid & Wrede, 2022; Brand & Nannini, 2023). In particular, a recent manifesto by Longo et al. (2024) outlined 28 key challenges and future directions for XAI research, organized into 9 high-level categories. While the article’s primary focus was not on ethics, it recognized XAI as a key component of responsible AI and highlighted various ethical challenges and considerations that the XAI community needs to grapple with moving forward. These included the need for human-centered explanations, the mitigation of potential negative impacts, and the role of XAI in addressing societal issues like power imbalances and the “right to be forgotten”. The authors advocated for participatory design approaches involving impacted stakeholders as an ethically-minded way forward. Brand and Nannini (2023) offer a unique philosophical perspective on the ethical grounding of XAI, arguing that it should be viewed not merely as a universal right, but as a moral duty rooted in the principle of reciprocity. They contend that XAI plays a crucial role in maintaining reciprocal relationships between human agents in AI-assisted decision-making contexts by providing transparency and supporting genuine reason-sharing. Highlighting XAI’s instrumental value in upholding human agency and moral duties in the face of opaque AI systems, they map how such an approach to XAI would benefit different communities, such as developers of XAI techniques, HCI designers, and policymakers. Similarly, Kasirzadeh (2021) contributes to this critical examination by systematically mapping the relationships between technical explanations, value judgments, and stakeholder perspectives in XAI systems, complementing and extending the typologies and challenges identified in other mapping studies of the XAI ethics landscape.

Yet, to the best of our knowledge, the work closest to this research is the systematic mapping study by Vainio-Pekka et al. (2023), investigating the role of XAI in the field of AI ethics research. Their work provided valuable insights into the prevalence of XAI as a research focus within empirical AI ethics scholarship, the main themes and methodological approaches in this area, and potential research gaps. While their work shares some similarities with the present study in terms of the broad topic and the use of a systematic mapping methodology, there are important differences in scope and emphasis. Notably, their study focused specifically on the role of XAI within empirical AI ethics research, whereas the current analysis considers the engagement with ethical considerations across the broader landscape of XAI research, including both empirical and theoretical work. In addition, our study places greater emphasis on the depth and quality of ethical engagement in XAI research, using a novel classification scheme to assess the level of ethical analysis and the application of specific ethical theories and frameworks.

By providing a more comprehensive and fine-grained analysis of the ethical dimensions of XAI research, the present research aims to complement and extend findings of the aforementioned studies, offering new insights into the current state of the field and opportunities for future work at the intersection of ethics and XAI.

Methodology

This study employs a systematic review approach to investigate the landscape of ethical considerations in explainable AI (XAI) research. Our methodology consists of three key stages: (1) formulating research queries (Sect. 3.1); (2) applying a multi-stage filtering process (Sect. 3.2); and (3) developing a taxonomy for classifying the depth and quality of ethical engagement in the XAI literature (Sect. 3.3).

Research queries

Identifying relevant papers necessitated systematic searching on Scopus. Our search strings incorporated both XAI-specific and ethics-specific terms. The selection of XAI-related terms (i.e., “Explainable AI," “XAI," “interpretable machine learning," “interpretability," and “AI explainability") was straightforward given their direct relevance to the research focus. The choice of ethics-related terms, however, required careful consideration due to the complexity and diversity of ethical concepts applicable in the XAI context. We adopted a twofold approach:

  • Major Ethical Theories: We incorporated key terms related to the major normative ethical theories, including consequentialism, deontology, virtue ethics, and care ethics (Alexander & Moore, 2021; Hursthouse & Pettigrove, 2018; Held, 2005). These theories provide the philosophical underpinnings for many of the ethical principles and frameworks discussed in the context of AI and XAI.

  • Applied Ethics in XAI: We delved into the specifics of ethics as they pertain to XAI. Principles like transparency, accountability, and fairness have unique connotations in this context (Jobin et al., 2019; Weller, 2019). For instance, transparency might refer to the explainability of AI systems, while accountability might involve mechanisms to hold AI systems and their creators responsible.

This approach resulted in an extensive list of ethics-related keywords, fully detailed in Appendix B. By casting a wide net across both foundational ethical theories and XAI-specific ethical principles, these search terms aim to capture the multifaceted ethical discussions within the XAI literature.
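To make this concrete, the sketch below shows one way such a boolean search string could be assembled for Scopus’ TITLE-ABS-KEY field. The keyword lists are an abbreviated, illustrative excerpt drawn from the terms mentioned above, and the helper function is a hypothetical name introduced for illustration; the authoritative keyword lists are those detailed in Appendix B.

```python
# Illustrative sketch of how the Scopus search strings combine XAI-specific
# and ethics-specific term groups. Abbreviated excerpt; see Appendix B for
# the complete keyword lists actually used.

XAI_TERMS = [
    "Explainable AI", "XAI", "interpretable machine learning",
    "interpretability", "AI explainability",
]

ETHICS_TERMS = [
    # generic ethics terms
    "ethics", "ethical", "moral", "morality",
    # major normative theories
    "deontology", "consequentialism", "virtue ethics", "care ethics",
    # applied-ethics principles with XAI-specific connotations
    "transparency", "accountability", "fairness",
]

def scopus_query(xai_terms, ethics_terms):
    """Build a boolean query over titles, abstracts, and keywords."""
    xai = " OR ".join(f'"{t}"' for t in xai_terms)
    eth = " OR ".join(f'"{t}"' for t in ethics_terms)
    # A hit must match at least one term from EACH group.
    return f"TITLE-ABS-KEY(({xai}) AND ({eth}))"

print(scopus_query(XAI_TERMS, ETHICS_TERMS))
```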

Filtering process

The initial search yielded a pool of 410 papers, which underwent a multi-stage filtering process to ensure the relevance and quality of the included studies. The filtering process was conducted by three PhD students in XAI, two of whom had backgrounds in AI ethics and policy, while the third had a more technical focus. This diverse expertise allowed for a comprehensive and balanced assessment of the papers. The filtering process involved the following steps:

  1. Initial Pool Screening—We started with the preliminary full pool of papers as follows: we first removed duplicate entries to ensure that each paper was considered only once; we then excluded papers that were not written in English, as well as papers produced before 2016, the year of DARPA’s XAI program release (Gunning & Aha, 2019), to focus on the most recent and relevant developments in the field. Finally, we excluded papers that were not peer-reviewed (i.e., tutorials, workshop abstracts, white papers, and theoretical reviews) to ensure the inclusion of high-quality, original research that advances the state-of-the-art in XAI tools, applications, evaluations, or theoretical/framing contributions.

    Each paper was screened by reading titles and abstracts using a three-reviewer system: each paper was independently assessed by two members of the research team to determine its relevance to both XAI and ethical considerations, and classified as “relevant," “irrelevant," or “uncertain". Disagreements and “uncertain" papers were resolved through discussion and consensus; the third reviewer, with a more technical background, was consulted when consensus could not be reached, in order to minimize individual bias and ensure a more reliable selection process (Cumpston et al., 2019; McDonald et al., 2019). A simplified sketch of this decision rule is given after this list.

  2. Examined Papers Review—After the preliminary screening, we obtained 237 papers that were analyzed through a full-text review. In this phase, we identified and excluded non-relevant papers, i.e., works that appeared relevant based on their title and abstract but did not directly contribute to the study’s focus upon closer examination. We then assessed the quality and depth of ethical discussions in the remaining papers using a four-step review process:

      (a) Identification of Ethical Discussions: Searching for any sections or subsections addressing ethical concerns, considerations, or issues within the context of XAI. Papers without any mention of ethics in XAI were further excluded.

      (b) Evaluation of Discussion Depth: Evaluating the depth of the ethical discussions within each paper, considering the complexity of the ethical issues addressed, the sophistication of the analysis, and the extent to which ethics was integrated.

      (c) Examination of Ethical Theories: Identifying and evaluating mentions and/or applications of the ethical theories reported in Sect. 2.

      (d) Focus Evaluation: Determining the paper’s primary focus based on the research question, objectives, and overall contribution to the field.

  3. Final Pool—With the final pool (77 entries) established, we assigned to each remaining paper the most suitable category, from A to E, according to our proposed taxonomy, which is outlined in the following subsection and detailed in Appendix A. The taxonomy considers both the depth and extent of ethical discussions and the paper’s overall focus on ethics within the XAI context. This process was also conducted by the reviewers independently, with disagreements resolved through discussion and consensus. To further improve transparency and reproducibility, we documented the reasons for exclusion at each stage of the filtering process and maintained a detailed record.
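As a minimal sketch of the screening logic from step 1, the following models the two-reviewer decision rule with discussion-based consensus and third-reviewer tie-breaking. The function and verdict labels are hypothetical names introduced for illustration, not part of any tooling we used.

```python
# Hypothetical sketch of the screening decision rule: two independent
# verdicts per paper, discussion on disagreement or "uncertain" verdicts,
# and a third (more technical) reviewer as tie-breaker.

RELEVANT, IRRELEVANT, UNCERTAIN = "relevant", "irrelevant", "uncertain"

def screening_decision(review_1, review_2, discussion=None, third_review=None):
    """Return the final verdict for one paper.

    review_1, review_2: independent verdicts from the first two reviewers.
    discussion: consensus verdict reached by discussion, if any.
    third_review: third reviewer's verdict, consulted only when discussion
    fails to produce consensus.
    """
    # Agreement on a definite verdict stands as-is.
    if review_1 == review_2 and review_1 != UNCERTAIN:
        return review_1
    # Disagreements and "uncertain" cases go to discussion first.
    if discussion is not None:
        return discussion
    # If discussion yields no consensus, the third reviewer decides.
    if third_review is not None:
        return third_review
    return UNCERTAIN  # flagged for a further discussion round

# Example: the reviewers disagree; discussion resolves the paper as relevant.
assert screening_decision(RELEVANT, IRRELEVANT, discussion=RELEVANT) == RELEVANT
```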

Proposed classification taxonomy

To analyze depth and quality of ethical engagement in XAI research, we developed a novel classification scheme comprising five categories (A–E). This taxonomy builds upon existing approaches to evaluating the integration of ethical considerations in technology design and development, while addressing their limitations in capturing the specific nuances of the XAI context.

The categories are differentiated based on three key dimensions: (i) the depth of ethical discussion, (ii) the application of specific ethical theories or frameworks, and (iii) the overall emphasis on ethical issues in relation to XAI. By considering these dimensions in combination, our taxonomy provides a more comprehensive and fine-grained assessment of the ethical landscape within XAI research. Each category is associated with a set of quantitative thresholds and qualitative criteria to ensure a systematic and replicable classification process (see Appendix A). These thresholds were iteratively refined through pilot testing and calibration among the research team to enhance inter-rater reliability.
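For illustration only, the sketch below encodes the qualitative distinctions between the five categories as a simple rule-based assignment over the three dimensions. The dimension names and decision rules are hypothetical simplifications; the actual quantitative thresholds and qualitative criteria are those detailed in Appendix A.

```python
# Hypothetical sketch of the A-E assignment logic, encoding only the
# qualitative distinctions summarised in Sect. 4; the actual thresholds
# and criteria are those detailed in Appendix A.

from dataclasses import dataclass

@dataclass
class EthicsAssessment:
    depth: int              # 0 = passing mention ... 3 = systematic analysis
    applies_theories: bool  # explicit use of ethical theories/frameworks
    informs_design: bool    # ethics shapes the proposed XAI tool/technique
    substantiated: bool     # ethics-design link analysed comprehensively

def assign_category(a: EthicsAssessment) -> str:
    if a.informs_design and a.substantiated and a.applies_theories:
        return "E"  # ethics fully integrated into design and evaluation
    if a.informs_design:
        return "D"  # ethically informed tools, link not fully substantiated
    if a.depth >= 2 or a.applies_theories:
        return "C"  # substantive ethical analysis, not tied to a tool
    if a.depth == 1:
        return "B"  # principles discussed, no systematic analysis
    return "A"      # ethics mentioned only in passing

# Example: deep analysis without tool-level integration maps to "C".
print(assign_category(EthicsAssessment(3, True, False, False)))
```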

Results

In the primary phase of our bibliometric study, an initial pool of 410 research papers was established. Following the application of our predefined inclusion criteria, we eliminated 173 of these articles, leaving a sample of 237 papers for further review. Within this remaining pool, each paper was thoroughly examined, with both abstract and body text read and analyzed. Prior to the final classification process, an additional elimination of papers deemed not relevant was undertaken. These were primarily research articles that emerged as false positives in our methodology—papers not directly applicable to our study focus. They included a total of 143 papers that treated the subjects of XAI or ethical considerations independently, without a focus on their intersection. This category also encompassed 17 survey articles that were identified within our pool. The entire process of filtering and categorization is visually depicted in Fig. 1.

Fig. 1 Decision tree illustrating the distribution of papers at distinct stages of the process

Overview of paper distribution

Our multi-stage filtering process resulted in a final pool of 77 research papers for in-depth analysis. These papers were classified according to our pre-established five-tiered ranking system (A-E), which assessed the relevance and depth of ethical engagement in the context of XAI research.

Distribution across Categories—The distribution of papers across the five categories comprised 29 papers (37.66% of the pool) in Category A; 21 (27.27%) in Category B; 12 (15.58%) in Category C; 9 (11.69%) in Category D; and 6 (7.79%) in Category E. Notably, over 60% of the papers fell into categories A and B, indicating a relatively superficial engagement with ethical considerations in a significant portion of XAI research. In contrast, only about 20% of the papers (categories D and E) demonstrated a deeper integration of ethical analysis into the design and development of XAI systems. Of the 77 papers in the final pool, 39 (50.6%) were published in conference proceedings, 34 (44.2%) in journals, and 4 (5.2%) in workshops or other publication types. This distribution highlights the importance of both conferences and journals in advancing research on ethics in XAI.

Key Publication Venues—Several conferences and journals have emerged as key outlets for research on ethics in XAI, as reported in Table 1. The most prominent venue in the list is Lecture Notes in Computer Science (including the subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), with 9 papers. This is followed by the AIES conference (AAAI/ACM Conference on AI, Ethics, and Society) with 4 papers, Ethics and Information Technology with 3 papers, and Philosophy and Technology with 2 papers. Other notable venues include Minds and Machines with 2 papers, the IEEE International Conference on Fuzzy Systems with 2 papers, and Advances in Intelligent Systems and Computing with 2 papers. These venues highlight the multidisciplinary nature of research on ethics in XAI.

Table 1 Key Publication Venues and References

In terms of the disciplinary focus of publication venues, we report in Table 2 an excerpt of the distribution across the categories that most consistently engaged with ethical considerations.

Across the complete list, a majority of papers (36%) come from Computer Science outlets; Medicine/Healthcare venues account for 32%; Ethics & Society outlets for 16%; Law venues for 12%; and Business/Management for 4%. This distribution showcases the multidisciplinary nature of research on ethics in XAI, with significant contributions from computer science, medicine/healthcare, ethics & society, law, and business/management outlets. The strong representation of computer science and medicine/healthcare venues highlights the technical and domain-specific considerations in the development and application of ethical XAI systems.

Table 2 Disciplinary Domains and References among C, D, and E categories

Depth and extent of ethical discussions

The majority of XAI articles in categories A-B make reference to ethical considerations regarding AI exclusively in the abstract or in the introduction section, without further developing the discussion. Ethics is often presented as a motivation for the work or used to contextualize the proposed XAI methods within the landscape of real-world applications. In practice, applications and ethical implications are almost always mentioned together, alongside legal issues. Furthermore, these considerations are typically used to introduce the term XAI in general as an AI ethics principle, rather than being concretely connected to the specific proposed method. Papers classified under categories C, D, and E demonstrated varying, yet more substantiated, levels of engagement with ethical theories and frameworks. A closer examination of the papers in each category reveals distinct patterns in the depth and quality of ethical engagement:

  • Category A papers, which constituted the largest group (37.66%), typically mentioned ethics or ethical values in passing without engaging in any substantive ethical analysis. Many of these papers referred to ethics in the abstract or introduction as a general motivation for the work, but failed to connect these considerations to the specific XAI methods or applications being proposed.

  • Category B papers (27.27%) went a step further by discussing ethical principles or values in the context of XAI, but still lacked a thorough or systematic ethical analysis. These papers often highlighted the importance of ethical considerations such as transparency, accountability, or fairness, but did not delve into the nuances of how these principles might be operationalized or navigated in practice.

  • Category C papers (15.58%), such as (Graziani et al., 2023; Fleisher, 2022; Narayanan & Tan, 2023; El-Nasr & Kleinman, 2020; Nicodeme, 2020; Heinrichs & Eickhoff, 2020; Larsson & Heintz, 2020; Jongepier & Keymolen, 2022; Martinho et al., 2021; Waefler & Schmid, 2021; Morris et al., 2023; Löfström et al., 2022), present ethical analyses but do not explicitly link the ethical considerations to the design or development of specific XAI tools. Instead, they focus on critiquing existing approaches, highlighting ethical challenges, or proposing conceptual frameworks and guidelines for addressing ethical issues in XAI.

  • Category D papers (11.69%) began to bridge this gap by proposing XAI tools or techniques informed by ethical considerations, such as (Hofeditz et al., 2022; Baum et al., 2022; Kempt et al., 2022; Lindner & Möllney, 2019; Sibai, 2020; Dexe et al., 2020; Kasirzadeh & Smart, 2021; Calegari et al., 2020; Gerdes, 2021; van Otterlo & Atzmueller, 2020). These papers often focus on specific aspects of explainable AI, such as generating explanations for moral judgments (Lindner & Möllney, 2019), classifying AI crimes (Sibai, 2020), or designing human-agent collaboration protocols (van der Waa et al., 2021). However, the connection between the ethical principles invoked and the specific XAI solutions proposed was not always thoroughly substantiated or explored in depth.

  • Category E papers, while representing the smallest proportion (7.79%), offered the most comprehensive and rigorous integration of ethical considerations into the design and development of XAI systems. Papers such as (John-Mathews, 2021; Falomir & Costa, 2021; van der Waa et al., 2021; Amann et al., 2020; Herzog, 2022a; Sullivan & Verreault-Julien, 2022) explicitly integrate ethical considerations into the design and development of XAI tools and provide comprehensive ethical analyses of the proposed solutions. These papers engage more deeply with ethical theories and frameworks, using them to guide the design and evaluation of explainable AI systems. For example, Amann et al. (2020) conduct an ethical assessment of explainability in AI-based clinical decision support using the “Principles of Biomedical Ethics," while Sullivan and Verreault-Julien (2022) propose using the capability approach to provide ethical standards for algorithmic recourse. Notably, papers from philosophical, social science, and interdisciplinary backgrounds (e.g., Amann et al., 2020) often provide more extensive engagement with ethical theories and frameworks compared to papers from purely technical domains.

Across the papers reviewed, a diverse range of ethical theories and frameworks are applied to analyze the role of explainability in AI systems, and certain theories and principles emerged as more prominent than others: consequentialism, deontological ethics, and virtue ethics are all relatively present. In Table 3, we report a list of works that refer to them more or less explicitly. Explicit mentions are also found of the “Principles of Biomedical Ethics" by Beauchamp & Childress (autonomy, beneficence, nonmaleficence, and justice), e.g., used as an analytical framework in Amann et al. (2020) to assess the ethical implications of explainability in AI-based clinical decision support systems. Similarly, Herzog (2022a) builds on the notion of “explicability" proposed by Floridi and Cowls (2019), which combines the demands for intelligibility and accountability of AI systems, to argue for the ethical and epistemological utility of explainable AI in medicine.

Table 3 References of papers that engage in a discourse regarding major ethical theories presented (not necessarily just one)

Several papers draw upon philosophical concepts and frameworks to examine the ethical dimensions of explainable AI. Narayanan and Tan (2023) explore the attitudinal tensions between explainability and trust in AI decision support tools, discussing the incompatible deliberative and unquestioning attitudes required for each. Kasirzadeh and Smart (2021) critique the use of counterfactuals in algorithmic fairness and explainability, arguing that social categories may not admit counterfactual manipulation and proposing tenets for using counterfactuals in machine learning. John-Mathews (2021) introduces the concept of “denunciatory power" as an ethical desideratum for AI explanations, measuring their ability to reveal unethical decisions or behavior. Dexe et al. (2020) employ the Value Sensitive Design (VSD) method to facilitate transparency and the realization of ethical principles in AI and digital systems design. van der Waa et al. (2021) propose three team design patterns with varying levels of agent autonomy and human involvement to enable moral decision-making in human-agent teams. Other papers engage with various ethical principles and concepts, such as informed consent, shared decision-making, accountability, fairness, and transparency (Jongepier & Keymolen, 2022; Kempt et al., 2022; Lindner & Möllney, 2019; Sullivan & Verreault-Julien, 2022).

Discussion

Our bibliometric analysis has revealed a complex landscape of ethical engagement within the field of XAI research. The quantitative findings expose a striking disparity between the high number of papers acknowledging the importance of ethics (categories A and B, >60%) and the limited number providing explicit theoretical ethical frameworks or substantively integrating ethical considerations into XAI design and development (categories D and E, <20%). This raises critical questions about the depth of ethical considerations and implications for XAI systems’ application. We structure our discussion into three themes emerging from the observed patterns.

First, we discuss the prevalence of “ethics-acknowledging" research (Sect. 5.1), which signals ethics’ importance but fails to substantively embed ethical complexity, arguing for rigorous engagement with ethical theories and frameworks. Second, we examine explainability’s inherent ethical tensions (Sect. 5.2), highlighted by the diverse ethical theories and principles applied, emphasizing the need for nuanced, context-specific guidelines for navigating XAI’s complex trade-offs and competing interests. Finally, we underscore the importance of ethical education and interdisciplinary collaboration (Sect. 5.3) in advancing XAI’s responsible development, drawing insights from the diverse disciplinary backgrounds represented and arguing for cross-disciplinary dialogue and the incorporation of underrepresented ethical perspectives.

From signaling to embedding ethical complexity

Ethics being mentioned as a general concept without substantive engagement—as reported in our coded categories A and B—suggests a trend of superficial treatment of ethical issues in XAI. We term this trend the prevalence of “ethics-acknowledging" research. This approach risks oversimplifying the multifaceted nature of ethics and creating misalignment between the design of XAI systems and their intended ethical impacts.

As outlined in Sect. 2, the major ethical theories (deontology, consequentialism, and virtue ethics) and the field of applied ethics offer valuable frameworks for navigating the complex ethical challenges surrounding XAI systems (Shafer-Landau, 2012; Copp, 2006; Felzmann et al., 2020; Beauchamp & Childress, 2001). These theories provide the necessary grounding for substantive ethical engagement, enabling researchers to consider the specific implications of their XAI systems in light of established moral principles and context-specific guidelines (Floridi, 2019; Bietti, 2020). Yet our analysis reveals that while many XAI papers acknowledge the importance of ethics, there is often a lack of deep engagement with these theories and frameworks.

This failure to embed ethical considerations substantively in the research design, execution, or interpretation of XAI studies threatens to undermine the ethical grounding of these systems (Graziani et al., 2023; Floridi, 2019). We argue that this trend aligns with corporate ethics initiatives—also affecting XAI applications—that may lack both intrinsic value (as they are not undertaken out of genuine commitment to moral principles) and instrumental value (as they do not lead to beneficial outcomes for society) (Metcalf et al., 2019; Bietti, 2020). This dynamic risks perpetuating a superficial form of ethical engagement, where ethics is invoked to legitimize existing practices rather than to drive genuine transformation (Hu, 2021). Similarly, the lack of robust metrics for evaluating the ethical implications of XAI systems, as highlighted by Floridi’s discussion of “ethics bluewashing," further compounds the risk of superficial ethical engagement (Floridi, 2019). Without clear, shared, and publicly accepted ethical standards, as well as metrics that capture not just the performance of XAI systems but also their potential adverse outcomes and adherence to ethical principles, the ethical claims made by XAI researchers may remain unsubstantiated and fail to drive genuine ethical progress in the field (Hu, 2021; Wagner, 2018b; Bietti, 2020).

To address these challenges, it would be beneficial for XAI researchers to be intentional about which ethical theories they apply and to consider the specific implications of their systems in light of these theories (Floridi, 2019; Wagner, 2018b). This necessitates asking critical questions such as: “What ethical implications might arise due to the nature of my system, its users, or its context of use?". Such questions move beyond generic ethical concerns to reflect on the specific ethical paradigms that guide behavior and decision-making. For example, under a consequentialist stance, a system should be evaluated on its ability to forecast and mitigate adverse outcomes. This would require metrics that not only measure the accuracy or performance of the system but also its potential implications (de Bruijn et al., 2022), while remaining aware of the difficulty of predicting all possible negative consequences beforehand (Genus & Stirling, 2018). On the other hand, a deontological approach would prioritize fidelity to defined rules and principles (Alexander & Moore, 2021), being centered around regulatory compliance and integrity of operation, and thus potentially detrimental to the more nuanced, contextually grounded approach advocated in the following subsection.

Inherent ethical tensions in explainability

Explainability in AI systems often intersects with deep-seated ethical dilemmas that arise from the very principles of our normative philosophical frameworks, as stressed in Sect. 2.2.1. For example, Narayanan and Tan (2023) discuss the attitudinal tensions between explainability and trust in AI decision support tools, arguing that the deliberative attitude required for meaningful engagement with explanations is incompatible with the unquestioning attitude implied by trust. Similarly, Kasirzadeh and Smart (2021) critique the use of counterfactuals in XAI, contending that social categories may not admit counterfactual manipulation. Addressing these tensions requires careful consideration of the specific context and stakeholders involved, as well as the development of nuanced ethical guidelines that can adapt to the unique challenges of different domains (Nyrup & Robinson, 2022).

Another challenge is ensuring meaningful stakeholder engagement throughout the XAI development process. The bibliometric analysis underscores the importance of involving domain experts, end-users, and affected communities in the design and evaluation of XAI systems (Langer et al., 2021; Muralidharan et al., 2024). The substantial contributions from computer science, philosophy, ethics, and interdisciplinary outlets highlight the need for continued cross-disciplinary dialogue and collaboration to address this gap. As echoed by van Otterlo and Atzmueller (2020); Kasirzadeh (2021); Amann et al. (2020), a multidisciplinary approach is crucial for balancing the various legitimate but potentially conflicting interests involved in XAI, such as transparency, privacy protection, and intellectual property rights (Langer et al., 2021; Muralidharan et al., 2024). However, facilitating effective collaboration and communication between these diverse stakeholders can be difficult, particularly when there are differences in technical expertise, values, and priorities (Green, 2022; Kroll, 2021). As Metcalf et al. (2019) argue, the influence of corporate logics on the institutionalization of ethics in the tech industry can further complicate these efforts.

Nonetheless, these challenges also present valuable opportunities for advancing the integration of ethics into XAI. The development of standardized ethical frameworks and guidelines tailored to the specific needs of XAI can provide a common language and set of principles to guide the responsible development of explainable AI systems (Amann et al., 2020; Longo et al., 2024; Sokol & Flach, 2020). These frameworks should be informed by the insights gained from the diverse ethical theories and approaches identified in the bibliometric analysis, such as the “Principles of Biomedical Ethics" (Amann et al., 2020), the capability approach (Sullivan & Verreault-Julien, 2022), and the concept of “reflective equilibrium" between principles and practice (Loi & Spielkamp, 2021; Theodorou et al., 2017). Other recent frameworks, such as “Evaluative AI" (Miller, 2023), recognize the inherent tensions in XAI and aim to provide a more flexible and context-sensitive approach. By designing XAI systems that promote cognitive reflection, such frameworks can help developers and users navigate the ethical complexities of XAI in a more nuanced and contextually-grounded manner (Ehsan et al., 2022; Cabitza et al., 2024, 2023).

Ethical education and interdisciplinary collaboration

In line with stakeholder engagement, we finally underscore the value of cross-disciplinary dialogue in illuminating the multifaceted ethical landscape of XAI. Much can be learned from other domains of applied ethics, such as bioethics and environmental ethics, which have grappled with similar challenges of balancing competing values and interests in the face of uncertainty and high stakes (Beauchamp & Childress, 2001; Markus et al., 2021; Blasimme & Vayena, 2020).

The landscape of ethical theories is vast, encompassing not just the mainstream utilitarian or deontological approaches, but also less represented ones like virtue ethics, care ethics, and non-Western ethical traditions (Wu et al., 2023; Amugongo et al., 2023; Okolo et al., 2022). These lesser-known paths may offer valid perspectives, allowing researchers to navigate ethical dilemmas in XAI through an unexplored lens (Okolo, 2023). Initiatives such as workshops, tutorials, and courses designed to provide a robust understanding of ethical theories and their practical implications are instrumental in this endeavor. There are already promising steps in this direction, as evidenced by institutional initiatives like NIST’s effort to develop comprehensive reports on human psychology and tools for XAI implementation (Broniatowski, 2021; Phillips et al., 2021) or research on the moral value of XAI for the public sector (Brand, 2023). In this spirit, future studies should further investigate how organizational constraints influence XAI deployers’ alignment with specific ethical stances and their willingness to express dissenting views (Hickok, 2021; Ibáñez & Olmeda, 2021; Kitamura et al., 2021).

Research limitations

Our study provides valuable insights into the ethical discourse within XAI research, but it is essential to consider the following key limitations:

  1. Scope and Framing: It is important to note that our research queries were designed to capture a broad spectrum of ethical considerations in XAI research. As demonstrated by the search queries provided in the Appendix, we included both generic ethics terms (e.g., “ethics," “ethical," “moral," “morality") and specific theories (e.g., “deontology," “consequentialism," “virtue ethics"). This approach aimed to ensure that our analysis was not limited to papers explicitly mentioning ethical theories but also included those discussing ethical issues more broadly. By combining generic and specific ethical key terms, we sought to minimize the potential bias towards any particular ethical framework. Yet, focusing solely on works that explicitly discuss ethics in XAI may overlook articles that embed ethical considerations within alternative framings, such as “responsible AI" or “human-centered AI". Future research should explore these diverse conceptualizations to capture a more comprehensive understanding of the ethical landscape in XAI. In terms of linguistic and chronological constraints, we recognize that by concentrating on English-language articles published after 2016, we may have excluded valuable insights from non-English publications and pre-DARPA works (Gunning & Aha, 2019).

  2. Classification Complexity: Despite our efforts to mitigate bias through double-coding, the inherent subjectivity in our research process remains a limitation. Researchers’ shared backgrounds may influence their interpretations, emphasizing the importance of reflexivity, diverse research teams, and systematic approaches to managing subjectivity in future studies. Furthermore, our five-tier classification scheme, while useful for structured analysis, may oversimplify the intricate nature of ethical discussions. Future research could explore more nuanced or multi-dimensional classification approaches to better capture the complexity of ethical engagement in XAI.

  3. Academic Perspectives: By focusing on the academic domain, our study does not fully capture the broader discourse on ethics in XAI that occurs in industry, policy-making, and societal contexts. These non-academic spaces may surface practical and societal considerations and misalignments that are less emphasized in scholarly publications but are critical for a holistic understanding of ethics in XAI (Nannini et al., 2023). Finally, while our study highlights the need for deeper engagement with ethical theories in XAI research, we acknowledge the constraints of scientific publishing. Not all AI journals may prioritize extensive discussions of philosophical works, which may contribute to the observed lack of depth in some papers. Future research could investigate these structural barriers and propose strategies for fostering more substantive ethical deliberation within the confines of academic publishing; similarly, research could greatly benefit from incorporating non-academic perspectives while navigating the challenges of accessing and analyzing non-public or proprietary information.

Conclusion

Our study contributes to the growing body of research on the ethical dimensions of XAI by critically examining the depth and breadth of ethical engagement in the field. Our bibliometric analysis has revealed a complex landscape of ethical engagement within XAI research: while many studies acknowledge the importance of ethics, there is often a lack of depth in the application of ethical theories and frameworks. This superficial treatment risks oversimplifying the multifaceted nature of ethics and creating misalignments between the design of XAI systems and their intended ethical impacts. By acknowledging our limitations and identifying avenues for future research, we invite further exploration and discourse to advance a more comprehensive, nuanced, and inclusive understanding of ethics in XAI. Ultimately, our aim is to stimulate a reflective and actionable dialogue on the role of ethics in shaping the responsible development and deployment of explainable AI systems.