Introduction

As the frontier of computational advancements, artificial intelligence (AI) is currently pushing the boundaries of what is feasible in data-driven problem-solving (Berente et al. 2021). In this context, AI can be considered an abstract concept for solving data-driven problems by using mathematical and statistical algorithms to build machine learning (ML) models that do not require explicit programming (Hutson 2017; Janiesch et al. 2021). Unsurprisingly, many kinds of systems use AI today to achieve or surpass human intelligence for selected tasks (Berente et al. 2021). AI-based decision support systems (DSS) are a particular type of such systems capable of supporting human decision-making in many situations (Herm, Heinrich, et al., 2022a; Mohseni et al. 2021), such as evaluating heat-flux sensor data to track plastic welding processes and ensure the durability of the welding seam (see Section 5).

As past research has primarily focused on solving mathematical constraints and thereby improving the performance of ML models, their inherent algorithmic complexity has steadily increased (Arrieta et al. 2020; Meske et al. 2022). Lately, a class of ML algorithms called deep learning (DL) has been employed increasingly, as the resulting deep ML models regularly outperform shallow ML models (Janiesch et al. 2021). In turn, these models are particularly opaque, making them de facto black boxes for human users. Hence, they cause difficulties in interpreting or even understanding the model’s inherent processing logic as well as its predictions in complex real-world use cases (Herm et al. 2021; Sharma et al. 2021). This lack of explainability of the decision-making process reduces trust and lowers the acceptance of intelligent systems, especially in high-stake use cases (Shin 2021; Thiebes et al. 2021). Hence, their overall adoption in practice is still hesitant (Hradecky et al. 2022; Kelly et al. 2019). In response, multiple studies have shown that explainability can directly contribute to adopting these models for decision support in practice (Sardianos et al. 2021; Wanner et al. 2021).

The research domain of explainable AI (XAI) addresses this issue by developing diverse techniques to maintain the high level of performance of black-box algorithms while increasing their level of explainability at the same time (Mohseni et al. 2021). Consequently, the integration of such XAI techniques into intelligent systems and the development of explainable intelligent systems (EIS) for decision support is considered a key factor for intelligent system acceptance (Gunning et al. 2019; Mohseni et al. 2021). Due to the novelty of the research domain, there are several unsolved problems (Abedin et al. 2022; Meske et al. 2022). Despite numerous applications and developments of XAI techniques, there is still a lack of a holistic reappraisal of design factors to enable the integration of XAI techniques into intelligent systems (Abedin et al. 2022; Herm et al. 2021; Meske et al. 2022; Mohseni et al. 2021). Complicating matters further, recent XAI techniques are predominantly developed by ML experts for ML experts, leading to a situation where the desired explainability of the models only becomes accessible to experts but is barely accepted by end-users in practice. In this context, ML experts are developers with in-depth knowledge of ML algorithms who build and evaluate ML models. In contrast, end-users are users who are skilled in their application domain and thus use EIS in support of decision-making without having any profound ML background (Arrieta et al. 2020; Herm, Wanner, et al., 2022b). As intelligent systems rapidly emerge as core assistance for daily work, in our research we predominantly address the future workforce that will be affected by such systems (Berente et al. 2021; McKinney et al. 2020). Users come with various age and experience profiles. We focus on educated people with some work experience and little (for end-users) to pre-existing (for developers) AI background. We do not consider in-training or late-career specificities. In this respect, our requirements analysis and evaluations focus on work systems and professional work situations and do not consider EIS for private uses such as entertainment.

In our research, we address this lack of system development guidelines and consider both user groups to foster the acceptance of EIS. Employing design science research (DSR), we investigate which design requirements, design principles, and design features, cumulated as a nascent information systems design theory, are relevant for EIS in theory and practice. The following research questions (RQ) summarize our socio-technical research intent:

  • RQ1) What are design requirements, design principles, and design features of a nascent design theory for EIS?

  • RQ2) How do the results vary for end-users and developers?

To answer our research questions, we applied a two-cycle DSR methodology according to Vaishnavi and Kuechler (2007). In the first design cycle, we conducted a structured literature review to derive an initial theory-based design theory, which we then adjusted and validated through expert interviews. In the second design cycle, we refined our design theory and evaluated it against a real-world use case application. Ultimately, we propose a nascent design theory crafted for the domain-independent development of EIS that addresses multiple user groups. Due to its multidisciplinary nature, our design theory takes the diverse facets of XAI’s human-agent interaction (Miller 2019) into account and can be considered a starting point for adaptations to all types of use cases, including electronic market scenarios that require decision support such as e-business, supply chain, or service management.

Our paper is structured as follows: In Section 2, we present the theoretical background and related research on EIS. Section 3 describes the applied DSR methodology, including a comprehensive description of the two design cycles. Section 4 introduces the final nascent design theory, and Section 5 presents a real-world EIS use case application and evaluation. We discuss the results in Section 6, before we conclude with a summary.

Research background

From decision support systems to intelligent systems

While DSS gained significant momentum in information systems research in the 1970s and 1980s, their application is still essential today (Liu et al. 2008). In this context, DSS are interactive, computer-based software systems that use decision rules and models to aid decision makers in solving unstructured problems (Turban and Watkins 1986). Since this is a broad definition, any system that contributes to a decision-making process can be defined as a DSS (Sprague 1980). Unlike expert systems, DSS do not replace users but rather provide them with decision recommendations (Turban and Watkins 1986). In the early days of the DSS era, software engineers handcrafted the decision rules and decision models underlying the DSS. That is, knowledge workers had to transfer their skills into the DSS’s logic explicitly (Sprague 1980). Since then, computational breakthroughs due to advances in ML technology have enabled the use of DSS in highly complex and critical situations (Janiesch et al. 2021). Recent examples can be found in all kinds of application fields, such as medicine (McKinney et al. 2020), manufacturing (Nor et al. 2022), or social media (Meske and Bunde 2022). In the following, we align with Herm, Heinrich, et al. (2022a) and Mohseni et al. (2021) by referring to these types of AI-based DSS or intelligent DSS as intelligent systems.

Artificial intelligence and intelligent systems

According to the definition of Berente et al. (2021, p. 4), AI is the “frontier of computational advancements that references human intelligence in addressing ever more complex decision-making problems”, which is pushed further by intelligent systems to provide decision-making with human-like or even superhuman cognitive abilities (Herm, Heinrich, et al., 2022a; Janiesch et al. 2021). To enable these decision-making abilities for decision support, intelligent systems use ML to allow for the autonomous generation of decision knowledge based on observations (Nilsson 2014; Poole et al. 1998). The field of ML has gained increasing attention due to groundbreaking computational advances (Thiebes et al. 2021). Here, mathematical and statistical algorithms are used to iteratively learn nonlinear relationships and complex patterns from empirical data to train ML models (Goodfellow et al. 2016; Janiesch et al. 2021). This includes models from DL, which are based on (deep) artificial neural networks (DNNs) (LeCun et al. 2015). Nowadays, the predictive performance of DNNs can exceed that of domain experts (McKinney et al. 2020). On the downside, while their architectural structure is becoming more complex, the user’s ability to comprehend the inner decision logic decreases (Ribeiro et al. 2016). In practice, this results in a complex tradeoff between the performance and the explainability of these models (Herm, Heinrich, et al., 2022a). That is, models with high predictive accuracy also tend to be more challenging to comprehend and vice versa (Herm et al. 2021). Since we focus on any non-white-box model and thus do not distinguish between shallow ML and DL in this article, in the following we subsume DL under the larger umbrella term ML.

Integrating ML models into intelligent systems increases the tension between a user and the intelligent system during the decision-making process (Sundar 2020), as the user may not be able to understand the underlying rationale of the ML model. Consequently, the user’s willingness to adopt such a system diminishes, as humans desire to reduce uncertainty and ambiguity in their environment (Epley et al. 2007). Ultimately, the overall goal should be to implement intelligent systems that can describe their rationale with sufficient explanations to aid in decision-making (Mohseni et al. 2021; Rudin 2019). We define those systems as EIS.

Explainable artificial intelligence in explainable intelligent systems

According to Miller (2019), explanations as the product of explanation theory are about the assignment of causal responsibility derived through a cognitive and social process of knowledge transfer. Hence, he outlines that explanation theory for AI must account for multiple dimensions, ranging from information requirements, information access, and functional capacities to the pragmatic goals of the explainer and the explanatory tool to address cognitive aspects, as well as beliefs, desires, intentions, emotions, and thoughts derived from the theory of mind to address social aspects.

Correspondingly, we define explainability as the ability to use information to comprehend an event by formalizing logic-based causal chains (Arrieta et al. 2020; Lewis 1986). In this regard, missing explainability can cause trust issues and reduce the acceptance of those systems (Shin et al. 2020; Zerilli et al. 2022), resulting in so-called algorithmic aversion (Berger et al. 2021). As an explanation includes both the product of cognitive reasoning and the social process, an explanation may be inappropriate if it is not correctly understood by the receiver or perceived as irrelevant (Hilton 1996). Accordingly, recent research has demonstrated the importance of considering a plethora of factors to provide the receiver with an adequate explanation (Mahmud et al. 2022; Shin et al. 2020).

Explaining ML decisions is of paramount importance, as misclassified training data can have devastating consequences when human lives are at stake (Lebovitz et al. 2021). To achieve explainability in intelligent systems, the system must either apply inherently explainable shallow ML models (e.g., decision trees), that is, white-box models, and thus potentially forfeit predictive power, or consider more complex models (e.g., DNNs) that are black boxes when considered in isolation and require explanation augmentations (Arrieta et al. 2020; Rudin 2019).

The multidisciplinary research field of XAI addresses this objective by developing transfer techniques that provide users with comprehensible explanations of an opaque model’s decision logic or insights from the data utilized for a decision (Das and Rad 2020; Meske et al. 2022). XAI is gaining momentum due to policy initiatives and regulations such as the “right to explanation” in the wake of the General Data Protection Regulation (GDPR) (Goodman and Flaxman 2017). In addition, the integration of XAI into intelligent systems for decision support is motivated by the need to manage, control, and improve intelligent systems (Arrieta et al. 2020; Mohseni et al. 2021), establishing the need for EIS (Herm, Heinrich, et al., 2022a).

Hence, various techniques have been developed for DNNs (Adadi and Berrada 2018), showing promising suitability for resolving the tradeoff between performance and explainability (Arrieta et al. 2020; Herm, Heinrich, et al., 2022a). In this context, model-agnostic techniques enable the transformation of opaque black-box models into transparent white-box models while maintaining their predictive power (Mohseni et al. 2021). They can be distinguished into two post-hoc explanation types (Gunning et al. 2019): global explanations and local explanations. Global explanations allow deeper traceability of the model’s behavior, making the holistic decision-making process of models transparent (Lundberg et al. 2020). In theory, these types of explanations are mainly used by developers to validate trained models (Miller 2019). In contrast, local explanations, primarily aimed at end-users, provide explanations for specific predictions presented in the form of visual, textual, or example-based explanations (Arrieta et al. 2020; Herm et al. 2021; Lipton 2018). However, the literature criticizes the lack of user-centered evaluation of existing XAI techniques, which may lead to inadequate XAI explanations and thus hinder successful human-agent interaction (Miller 2019; van der Waa et al. 2021).
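To make the distinction between the two post-hoc explanation types concrete, the sketch below contrasts a global and a local explanation for the same black-box model using the SHAP library as one example of a post-hoc explanation technique; the model, data, and plot choices are illustrative assumptions, not elements of the studies cited above.

```python
# Sketch: global vs. local post-hoc explanations with SHAP (illustrative only).
# Assumes scikit-learn and shap are installed; data and model are placeholders.
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=500, n_features=8, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X, y)  # the "black box"

explainer = shap.Explainer(model, X)   # unified SHAP interface around the model
shap_values = explainer(X)

# Global explanation (developer-oriented): average feature influence across all predictions.
shap.plots.bar(shap_values)

# Local explanation (end-user-oriented): why the model scored this single observation as it did.
shap.plots.waterfall(shap_values[0])
```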

Related work

Apart from IS-related contributions such as Förster et al. (2020), who provide a design process for user-centric XAI systems, and Herm, Wanner, et al. (2022b), who introduce a taxonomy to assist user-centered XAI research, we were only able to identify a handful of DSR-based contributions that focus on user-based studies for EIS (Bunde 2021; Cirqueira et al. 2021; Landwehr et al. 2022; Meske and Bunde 2022; Schemmer et al. 2022). Meske and Bunde (2022) and Bunde (2021) provide design principles for explainable DSS limited to detecting hate speech. Landwehr et al. (2022) derive design knowledge for image-based DSS. Further, Cirqueira et al. (2021) propose design principles for XAI-based systems in fraud detection, and Schemmer et al. (2022) propose design principles for an XAI-based DSS for real estate appraisals.

Related to this, we identified further XAI design studies in the field of human-computer interaction (HCI) relevant to our work. Here, Amershi et al. (2019) and Mohseni et al. (2021) provide generic design recommendations for XAI research. Moreover, Sokol and Flach (2020) and Liao et al. (2020) primarily focus on design needs for EIS. Similarly, current research in the field of HCI-based XAI investigates how users perceive user interfaces (UI) and thereby their expectations towards the use of intelligent systems (e.g., Mualla et al. 2022; Stumpf et al. 2019). This research aims to reveal the influence of HCI on the field of XAI research (e.g., Abdul et al. 2018; Bove et al. 2022). Lastly, research addresses the impact of interactive UI elements within intelligent systems (e.g., Evans et al. 2022; Khanna et al. 2022).

In addition, we identified XAI-related research that implicitly derives challenges and thus requirements for the use of EIS. This includes human-in-the-loop approaches for EIS development (Chou et al. 2022), identifying the degree of an EIS’s decision explainability (Herm, Heinrich, et al., 2022a), and defining new responsibilities to handle an EIS’s outcome (Storey et al. 2022).

While preliminary research has laid a first theoretical foundation for the derivation of a design theory, it is apparent that this research has not yet been synthesized into design knowledge that can serve as a starting point for deriving use-case-dependent design theories. Instead, recent research primarily focuses on specialized use cases. This manifests the deficit and thus the need for first-hand, use-case-independent design knowledge to enhance and ensure future EIS design theory development.

Research methodology

Design science research methodology

Design science research.

DSR is a problem-solving-oriented research approach to generate IT artifacts (e.g., design theories) for a more effective and efficient use, implementation, and management of information systems or to solve a specific organizational problem. The goal is to transform a defined problem state into a solution state by intervening with a defined IT artifact (Hevner et al. 2004; Möller et al. 2020). In this context, the role of DSR is twofold. First, a kernel theory initiates the search process for an appropriate solution state. As elaborated above, explanation theory (Miller 2019) serves as our kernel theory, with XAI as its instantiation to enable AI-based applications in DSS, resulting in EIS. Second, the application of DSR aims at providing prescriptions for how to solve a defined problem state. These prescriptions can be provided by a design theory (Vaishnavi and Kuechler 2007). Design theories contain certain classes of (meta-) design requirements, practices for IT artifact development (e.g., design principles), and IT artifacts themselves or distinctive design features that contribute to design knowledge (Meth et al. 2015). Gregor and Hevner (2013) distinguish between situated implementations, nascent design theories, and well-developed design theories. While the former deals with instantiations and the latter encompasses mid-range to grand theories, nascent design theories focus on knowledge as operational principles expressed through design principles. Design principles are precepts that are inductively or deductively derived from experience or empirical evidence to support achieving a prosperous solution state. Finally, the concrete problem is solved by translating the design principles into concrete design features (Fu et al. 2015; Möller et al. 2020).

Application of design science research.

The aim of our research is to develop a nascent design theory. To ensure the quality of the IT artifact, we applied the DSR methodology according to Vaishnavi and Kuechler (2007) and extended it by including multiple theory-building elements (Glaser and Strauss 1967; vom Brocke et al. 2015). This combination of qualitative and quantitative research is also recommended by Mohseni et al. (2021). The resulting methodology is divided into five phases: problem awareness, suggestions, design & development, evaluation, and conclusion. For our research, we ran through two design cycles (see Fig. 1).

Fig. 1 Application of DSR according to Vaishnavi and Kuechler (2007)

Overview of first design cycle.

Initially, the first design cycle began with the problem awareness phase, in which we identified the lack of design knowledge and built the knowledge foundation. Here, we identified that information systems research currently lacks design knowledge for the derivation of use-case-independent design theories for EIS (cf. Section 2.3). To address this lack, we used prior design knowledge as input for the derivation of three meta design requirement proposals (vom Brocke et al. 2020). In order to do so, we conducted a structured literature review according to vom Brocke et al. (2015), including design studies, case studies, scenarios, and reviews. During the suggestions phase, we extracted goals, design requirements, design principles, and design features from the structured literature review to address our meta design requirements (Möller et al. 2020). Extending this, we followed the guidelines of Gregor et al. (2020) to propose an initial design theory. In the subsequent design & development phase, we specified design principles using the development process of Möller et al. (2020) to materialize the theory-based design theory. In the evaluation phase, we enriched the theory-based design theory and demonstrated as well as validated it with practitioners and researchers in qualitative semi-structured interviews according to Kaiser (2014). This preliminary nascent design theory constitutes the result of the conclusion phase of the first design cycle and serves as input for the second design cycle.

Overview of second design cycle.

As we observed improvement potential during the evaluation of the first design cycle, we conducted a second design cycle, including findings from recent XAI publications and input from the evaluation phase of the first design cycle in the problem awareness phase. Then, we refined the design principles and features in the suggestions phase and, consequently, the overall design theory in the design & development phase. Subsequently, we performed a threefold evaluation in the evaluation phase with experts from a German predictive maintenance project to prove the rigor of our design theory (Hevner et al. 2004; Mohseni et al. 2021). This includes a qualitative study to ensure the validity of our design theory and reveal possible improvement potential, an instantiation of the design theory through the implementation and evaluation of an EIS for a real-world use case within the maintenance project, and a quantitative evaluation against Iivari et al. (2021)’s reusability criteria. Finally, we operationalized the final design theory and thereby contribute to theory and practice by revealing novel design knowledge (Vaishnavi and Kuechler 2007). Section 4 introduces and details our final nascent design theory, while Section 5 comprises the design theory instantiation and the quantitative evaluation.

Results of first design cycle

Awareness of problem, suggestions, design, and development.

To obtain the theoretical foundation for the derivation of the design theory, we applied a structured literature review according to vom Brocke et al. (2015). Due to the interdisciplinary nature of the topic, we considered databases from economics (Emerald Insight, EBSCOhost), computer science (IEEE Xplore, ACM Digital Library), and information systems (AISeL, ScienceDirect). We queried contributions focusing on the topics of XAI, HCI, explainability, and (design) requirements. Please see Appendix A.1 for a comprehensive overview of the search strings, the used terms, and synonyms. Further, due to the novelty of the subject, we did not restrict the search in terms of rankings. This resulted in 1,426 potential hits, which we then screened and analyzed using reduction criteria consisting of title, keyword, and abstract analysis as well as duplication and language checking. This led to 114 remaining contributions, of which we classified 86 as relevant using full-text analysis and forward/backward search. As inclusion criteria, we considered contributions from the XAI domain focusing on requirements, guidelines, best practices, and different explanatory concepts from a (non-)technical perspective. Figure 2 summarizes the process of the literature review.

Fig. 2 Process of structured literature review according to vom Brocke et al. (2015)

We iteratively developed a concept matrix using these 86 contributions by following Möller et al. (2020), including three iterations to develop a theory-based design theory. Please note that to improve readability, we will only provide details on the evaluated design theory of the first design cycle within the following subsection. See Appendix A for a full overview of the iterations of the first design cycle and a visualization of the initial theory-based design theory.

Adjustment and evaluation of theory-based design theory.

Following the FEDS framework of Venable et al. (2016), we conducted an artificial summative evaluation to “demonstrate the utility, quality, and efficacy” (Venable et al. 2016, p. 77) of our design theory. First, we conducted two preliminary expert test interviews (TI) to make initial adjustments to the design theory (cf. Appendix B). Then, we conducted eleven additional semi-structured expert interviews to evaluate the design theory (Kaiser 2014). Here, we define an expert as a person who has theoretical and practical knowledge in the field of AI and XAI. In this context, we interviewed German-speaking researchers and practitioners who classified themselves in the role of an end-user (n=5) or a developer (n=6). All interviewees were in the age group of late 20s to mid-40s. See Table 1 for further information on their demographics, such as their experience with AI.

Table 1 Overview interviewees and demographics (first design cycle)

We divided the interviews into four phases: 1) At the beginning, we asked the experts about their demographics and their knowledge and experience in the field of XAI, including their estimation of potential barriers to the adoption of intelligent systems, to carry out an initial completeness check of our meta design requirements. 2) Furthermore, we asked them to classify themselves as either end-users or developers. 3) We then evaluated our nascent design theory with these experts by presenting the theory-based design theory and openly discussing it with them. Here, we assessed appropriateness and completeness by asking them if they would add, change, or replace any elements. As additional support, we used hypothetical use cases to empower the participants to put themselves in a corresponding situation. 4) Lastly, we asked them to rate the perceived relevance of the design requirements, design principles, and design features on a seven-point Likert scale.

In line with Glaser and Strauss (1967), we transcribed and classified the results by creating inductive and deductive codes. Likewise, we performed a qualitative analysis according to Flick (2020). As a single coder primarily coded the data, we ensured intercoder reliability according to O’Connor and Joffe (2020) by having an additional coder code a sample of the data. Altogether, the interviews comprise 559 minutes of audio material, which is equivalent to 126 pages of transcripts (Herm et al. 2022).

Initial design theory

Using the relevance rating of the experts, we categorized the design requirements, principles, and features into a user group if the median of the perceived relevance is “slightly important” or above. Table 2 illustrates the derived and evaluated design requirements, design principles, and design features, as well as the related rating from the experts of the first design cycle. See Appendix B for a graphical overview and a detailed description of the applied steps and the corresponding design theory; in Section 4, we provide a comprehensive explanation of each element of the design theory except DF11.

Table 2 Design requirements, design principles, and design features of first design cycle including their relevance

During the expert study, we found that there was improvement potential for our design theory. We used this as input knowledge for the second design cycle.

Results of second design cycle

Awareness of problem, suggestions, and design & development.

In the second design cycle, we refined the nascent design theory. Thereby, we included the input from the expert study of the first design cycle and revisited current XAI and HCI research. That is, we adapted DR1 to “improve intelligibility of system’s decision” to emphasize that users must have some access to the logic of ML models for decision support rather than explanations per se. Explanations represent one means to do so, as introduced by the subsequent design principles. With this change, we acknowledge that the solution space may actually be larger than only considering explanations. In addition, we assigned DP3 to end-user relevance because a personalized interface design decreases the perceived cognitive effort and increases end-users’ motivation to use the EIS for decision support (Arrieta et al. 2020; Conati et al. 2021). Likewise, we made DF1 only applicable for developers, as end-users are often overwhelmed by (technical) details about the used ML model and are not able to comprehend the provided information (Evans et al. 2022; Holzinger et al. 2022). Further, we added the need for explaining the visualization technique to DF8, which results from the fact that XAI visualizations are often difficult to understand for non-technical users and thus may hamper decision support (Herm et al. 2021; Mualla et al. 2022; van der Waa et al. 2021). Lastly, following the first evaluation, we discarded DF11, since “users are used [to] receiving abstract information from different systems, so [they] don’t need these anthropomorphic stories” (I8) and the experts rated the relevance of this design feature as overall unimportant. We could not identify any further aspects through the inclusion of recent XAI-related literature.

Expert study, use case application, and reusability evaluation.

The evaluation phase in the second design cycle consists of a threefold naturalistic summative evaluation (Venable et al. 2016). First, we conducted a semi-structured expert study, consisting of a pre-test (TU1-2) and the main expert study (U1-6), with four end-users and four developers (Kaiser 2014) who are part of an AI project in the field of predictive maintenance involving two German companies. Since we observed theoretical saturation, we did not include further expert interviews in our evaluation (Strauss and Corbin 1994). In line with the first semi-structured expert interview study, we asked the participants about their demographics. Subsequently, we showed them the adjusted design theory and asked them about their perception and whether they would modify, add, or remove any elements within the design theory. Again, all interviewees were in the age group of late 20s to mid-40s. See Table 3 for further information on their demographics, such as their experience with AI. To minimize group bias, we conducted the interviews with each expert individually. Altogether, the interviews comprise 271 minutes of audio.

Table 3 Overview interviewees and demographics (second design cycle)

In the second step, we presented the implemented EIS following our design theory to them. We provided them with the opportunity to use this system and think about the corresponding design theory once again. Lastly, we asked them to rate the design principles according to the reusability evaluation criteria of Iivari et al. (2021). We illustrate the use case application of the design theory as well as the results from the evaluation according to Iivari et al. (2021) in Section 5.

Final nascent design theory

While contemporary intelligent systems can support users with precise recommendations for decision support, their application is hampered, especially in high-stake scenarios, due to their lack of explainability (Shin 2021), which highlights the need for EIS (Herm, Heinrich, et al., 2022b). However, due to the novelty of the subject, there is only scarce research on EIS design theories, which are predominantly developed for domain-dependent tasks (e.g., Landwehr et al. 2022). To this end, we propose a broad and domain-independent nascent design theory for EIS that facilitates the adaptation to different types of use cases (RQ1). Moreover, since XAI research has primarily focused on developers as the target group and not the actual end-user of an EIS (van der Waa et al. 2021), we extend this body of knowledge through the differentiated consideration of end-users and developers within the design theory (RQ2). In Fig. 3, we comprehensively visualize the results of the derived design theory for EIS and its dependencies. We present meta design requirements that form the basis for our design requirements and subsequently for the design principles and design features. In addition, we present the user group relevance for each element. When both user groups deemed an aspect necessary, we marked it as “end-user and developer relevance”.

Fig. 3 Visualization of final nascent design theory

Meta design requirements and design requirements

Meta design requirements.

Baskerville and Pries-Heje (2019) state that DSR-based research must be projectable to propagate design knowledge. Following the argument of Zschech et al. (2020), we used prior design research as input knowledge for our IT artifact (vom Brocke et al. 2020) to gather meta design requirements (Chandra Kruse et al. 2022; Lee and Baskerville 2003). To this end, we derived three meta design requirements: system transparency, user trust, and system accessibility, as described below.

MDR1: Increase system transparency. The lack of transparency of the system is a significant barrier to the adoption of AI in practice (Wanner et al. 2022), as users are incapable of comprehending a model’s internal logic or the reasoning behind a model’s recommendation, rendering EIS for decision support inefficacious (Arrieta et al. 2020; Sardianos et al. 2021). Consequently, system transparency can be seen as a prerequisite for enabling a trustworthy user interaction with the EIS (Landwehr et al. 2022; Samek et al. 2017; Shin et al. 2020). Increasing system transparency also results in a shift in user perception, making decisions more conscious (Chazette and Schneider 2020). Simultaneously, system transparency increases the acceptance of using an EIS in work environments (Arrieta et al. 2020; Bhatt et al. 2020).

MDR2: Increase user trust. The acceptance of EIS and, consequently, their adoption depends on trust in the results a system provides (Wanner et al. 2022; Carvalho et al. 2019; Thiebes et al. 2021). Especially for critical decisions, users have to rely on these results to make an informed decision (Choi and Ji 2015; Herm et al. 2021). Consequently, it is only possible to establish initial trust in a (new) intelligent system if no unknown risk factors are present and users are not afraid of losing control due to a lack of information about the results (McKnight et al. 2011; Slade et al. 2015). However, while this may lead to the perception that trust is influenced by system transparency (e.g., Schmidt et al. 2020), empirical research has shown that there is no significant direct effect of system transparency on the perceived level of trust (Wanner et al. 2022; Cramer et al. 2008). Lastly, the EIS must take into account several influencing factors, such as keeping humans in the loop during system development, to ensure that users perceive the EIS as a competent decision support system for their use case, leading to increased user trust and thus acceptance of EIS (Mualla et al. 2022; Shin 2021).

MDR3: Enhance system accessibility. Crucial to using an EIS is the transfer of knowledge to the user (Berger et al. 2021). Here, a fluent and non-restrictive interaction must be ensured if recommendations differ from user expectations due to the user’s reservations or domain knowledge (Chander et al. 2018; Meth et al. 2015). Using XAI transfer techniques to ensure such an interaction increases acceptance and improves the intrinsic attitude towards the system (Sokol and Flach 2020). This also includes the adaptation of the system’s recommendation (Ferreira and Monteiro 2020) as well as the ability to generate causalities for subsequent actions (Liao et al. 2020).

Design requirements.

Design requirements describe how general meta design requirements from fields related to the IT artifact’s topic should be addressed in a way that allows for an evaluation of a developed design solution (Baskerville and Pries-Heje 2019; vom Brocke et al. 2020). During our structured literature review, we scrutinized the meta design requirements unearthed initially and operationalized them into more output-related design requirements. We ensured their validity and completeness through the expert interviews in the first and second design cycles (see Sections 3.2 and 3.3). We describe them in the following.

DR1: Improve intelligibility of system’s decision. The use of EIS empowers end-users and developers to compare their intrinsic mental model, and consequently their expectations, with the recommendation of an EIS. Thus, when users’ expectations conform with the recommendation explanations, their willingness to use the system in practice increases (Carvalho et al. 2019; Malhi et al. 2020). In doing so, EIS must provide recommendations with associated accounts in a way that adequately supports users during the decision process (Longo et al. 2020).

DR2: Support human in own decision-making. To support and improve a human’s own decision-making by providing accounts for predictions, those accounts need to be enriched with domain knowledge and situation-specific context (Dikmen and Burns 2022). Providing such accounts increases the user’s confidence during the decision-making process (Evans et al. 2022). Once end-users can understand the recommendation, they are enabled to make sound decisions. This is also true for developers when they intend to understand the internal processing logic of the model (Malhi et al. 2020).

DR3: Increase user motivation. If users are extrinsically or intrinsically motivated to use the EIS, their degree of motivation and, consequently, their system acceptance increase (Stumpf et al. 2019). EIS should therefore incorporate features that raise the motivation of end-users to use an EIS for decision support (Ferreira and Monteiro 2020). This could include different paradigms, as they are directly related to user expectations, leading to a well-perceived user experience (Nunes and Jannach 2017).

DR4: Reduce cognitive effort. If users require a long time to understand recommendations and their accounts, for example because they are counterintuitive or complex, this may be perceived as cognitively demanding and lead to frustration and rejection (Fürnkranz et al. 2020). It is worth noting that the perceived cognitive load may vary by individual due to context-specific circumstances (Oviatt 2006). Hence, EIS must provide accounts in a manner that reduces the cognitive effort of users (Zschech et al. 2020).

Design principles and corresponding design features

Design principles and design features are intended to explain how derived design requirements can be addressed in a design theory (Baskerville and Pries-Heje 2019; vom Brocke et al. 2020). In the following, we present the final and validated design principles and design features of our nascent design theory. For each design principle, we first provide a comprehensive rationale, followed by a tabular formulation of the design principle using the design principle schema established by Gregor et al. (2020) (see Tables 4, 5, 6 and 7). Lastly, we present corresponding design features to illustrate how the design principles can be implemented into an associated instantiation (Gregor et al. 2020; Seidel et al. 2018).

Table 4 Principle of global explanations
Table 5 Principle of local explanations
Table 6 Principle of personalized interface design (preference, needs)
Table 7 Principle of ability to address psychological/emotional factors (intrinsic barriers)

DP1: Principle of global explanations.

With an EIS, users can understand the general behavior of an intelligent system within the decision-making process and thereby comprehend the inner logic of the model to a certain level. For this purpose, the internal logic of the system must be represented in a user-friendly manner in order for the developer to understand the ML model (Das and Rad 2020). It is essential to grasp the capabilities of the model beforehand because “it is pointless using an ML model that makes completely insufficient predictions” (I5). Furthermore, Rudin (2019) calls for per-se interpretable but performance-wise appropriate ML models when deploying intelligent systems in highly critical environments, as this may be necessary due to regulatory constraints (Vale et al. 2022).

On the one hand, (technical) information (DF1), such as system capabilities of the ML model, (hyper-) parameters, and information about the training data and training history, must be provided to ensure lawfulness and fairness of the training process (Hepenstal and McNeish 2020; Kaur et al. 2022) (U3; U4). This is primarily relevant to developers, since if the logic of an ML model “is far above the level of knowledge, then it’s all magic [for them]” (U5). Furthermore, (performance) metrics must be provided (DF2) to quantitatively evaluate the decision support capability of an EIS (e.g., accuracy, F1-score, decision certainty) (Glomsrud et al. 2019; Sun et al. 2022).
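As a minimal illustration of DF1 and DF2, the sketch below assembles the kind of technical information and performance metrics a developer-facing view could expose; the model, data split, and selected fields are our own illustrative assumptions rather than prescriptions from the cited sources.

```python
# Sketch: collecting (technical) information (DF1) and performance metrics (DF2)
# for a developer-facing view. Model, split, and fields are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)
proba = model.predict_proba(X_test)
pred = proba.argmax(axis=1)

model_facts = {
    # DF1: hyperparameters and training data characteristics
    "hyperparameters": model.get_params(),
    "training_samples": len(X_train),
    "n_features": X_train.shape[1],
    # DF2: quantitative evaluation of the decision support capability
    "accuracy": accuracy_score(y_test, pred),
    "f1_score": f1_score(y_test, pred),
    "mean_decision_certainty": proba.max(axis=1).mean(),
}
print(model_facts)
```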

DP2: Principle of local explanations.

To render the recommendations for individual observations explicable, an EIS must provide local explanations. This allows (end-)users to validate or adjust their own expectations if certain recommendations “fit somewhere in [their] expectations” (I8). This internal process can assist in resolving cognitive restrictions (Hepenstal and McNeish 2020). Local explanations complement global explanations and make recommendations easier to understand. Consequently, they are necessary, especially for end-users and novices (Hohman et al. 2019; Mohseni et al. 2021). Moreover, our research shows that this representation is also relevant for developers, since they “[..] can use local explanations to analyze the pre-trained models for reliability by manipulating data and seeing how the model’s outputs change” (U1). This becomes especially important if transfer-learned models are used.

The EIS must display related input data to enable end-users and developers to trace the specific data input used (DF3) for the recommendations and the resulting data output (Liao and Varshney 2021; Nunes and Jannach 2017). This is also true for associative information (DF5) to understand causal decision chains of the EIS in a user-friendly way (Haynes et al. 2009; Nunes and Jannach 2017). This also includes process diagrams, graphical explanations (e.g., correlation matrices) (U4), and look-up glossaries to understand complex issues in time-constrained situations (U1; U3). Similarly, filterable historical information about past decisions (DF4), including the used visualizations, must be displayed (Atkinson et al. 2020) (U3), as users can form their decision based on previous data and receive information about the decision-making process when legal issues arise (e.g., in high-risk cases) (U1). Moreover, additional information about possible decision alternatives (DF6) must be presented, especially in cases of low decision certainty (Nor et al. 2022). In addition, providing input options to customize the input data allows developers to validate and debug an ML model according to (regulatory) unit tests (U3). Lastly, providing hypothetical scenarios (DF7), for example simulations for end-users, would reveal the potential impacts of the provided recommendations (Amershi et al. 2019).
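As a minimal illustration of decision alternatives (DF6) and hypothetical scenarios (DF7), the sketch below perturbs a single input of an already trained classifier and reports how the prediction and its certainty change; the function name, feature index, and perturbation are hypothetical and not part of the cited studies.

```python
# Sketch: a "what-if" scenario (DF7) by perturbing one input feature of a trained model.
# The model is a placeholder; any fitted estimator exposing predict_proba works.
import numpy as np

def what_if(model, instance, feature_idx, new_value):
    """Return the model's prediction and certainty for the original and a modified input."""
    original = np.asarray(instance, dtype=float).reshape(1, -1)
    modified = original.copy()
    modified[0, feature_idx] = new_value

    p_orig = model.predict_proba(original)[0]
    p_mod = model.predict_proba(modified)[0]
    return {
        "original_prediction": int(p_orig.argmax()),
        "original_certainty": float(p_orig.max()),
        "what_if_prediction": int(p_mod.argmax()),
        "what_if_certainty": float(p_mod.max()),
    }

# Hypothetical usage, assuming `model` and `X_test` from a prior training step:
# print(what_if(model, X_test[0], feature_idx=3, new_value=0.0))
```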

DP3: Principle of personalized interface design.

When using EIS, different user groups have varying preferences and needs for information presentation (Arrieta et al. 2020; Bhatt et al. 2020). Only flexible customization of system components can ensure user comprehension and consequently increase adoption of an EIS (Conati et al. 2021; Mualla et al. 2022). In addition, it is essential to pay attention to reducing the cognitive effort for the user when designing individual EIS components (Carvalho et al. 2019; Cheng et al. 2019). That is, established UI design guidelines (e.g., Shneiderman and Plaisant 2016) and best practices from numerous application domains must be consulted (Amershi et al. 2019) to avoid creating “a confusing system with a thousand numbers and variables and layers” (I8). While developers primarily identified this requirement, it is apparent that it is meant to support end-users.

To enable personalized adaptation, several visualization techniques, for example XAI-based argumentations, should be used (DF8) (Jesus et al. 2021), including justifications for why these types of visualizations are used to gain the trust of end-users and developers (U1). Therein, these visualizations should offer different levels of granularity in information presentation (DF9) and should be independently adjustable by users (Amershi et al. 2019). An example would be zooming into an explanation “so [it] can be successively traced further and further in detail” (I2). Similarly, it is necessary to group and prioritize (DF10) individual explanation components for specific user groups to enable adequate presentation and consequently not overwhelm users cognitively (Schneider and Handali 2019).

DP4: Principle of ability to address psychological/emotional factors.

For successful interaction with end-users and developers, the EIS should address their emotions, beliefs, and expectations to achieve the intended goals (Arrieta et al. 2020). This includes situational representations to support the user emotionally and psychologically (Kocielnik et al. 2019), thus addressing their “[..] personal idiosyncrasies and preferences so that they are satisfied with the results” (I1). This improved interaction increases the perceived ease of use, leading to higher adoption of the EIS (Ferreira and Monteiro 2020).

The incorporation of multiple visualization techniques (DF8) enables users to handle individual emotions, such as stress, when faced with time-critical decisions by allowing them to customize the UI to their individual preferences (Chromik and Butz 2021). In addition, end-users must be able to reexamine textual explanations of the corresponding visualizations in case of interpretational uncertainties during process execution. Besides, end-users require training prior to using an EIS to reduce the cognitive effort required (U1; U2).

Evaluation of the final nascent design theory

Overall, the naturalistic summative evaluation in the last design cycle consists of a threefold evaluation following the FEDS framework of Venable et al. (2016). While we present the qualitative expert study and its findings in Sections 3.3 and 4, in this subsection, we describe the instantiation of the nascent design theory using an EIS prototype implemented in a production-ready environment, including a subsequent reusability evaluation (Iivari et al. 2021) with use-case-related employees.

The use case is part of an AI-based predictive maintenance project performed by the two German companies ROBOUR Automation GmbH and SKZ - German Plastics Centre. In this project, heat-flux sensors track plastic welding processes of polypropylene homopolymer pipes (Lambers and Balzer 2022). This welding process is used when setting up infrastructural underground pipes for freshwater or wastewater supply. Using poorly welded pipes can lead to the loss of the transported goods and, consequently, to the contamination of the soil with potentially toxic substances.

Based on the tracked sensor data, a multi-layer DNN predicts the ratio between the flexural strength of the welded specimen and the raw materials, whereby a ratio lower than 0.7 indicates an insufficient welding process. Taking the DNN’s ability to outperform experts and the relatively low acceptance of DNNs in this high-risk scenario into account, the application of an EIS that supports the decision-making process of experts is a promising setting for evaluating our nascent design theory.
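To make the prediction task tangible, the sketch below shows a multi-layer regression DNN of the kind described above; the network architecture, feature count, and training data are our own illustrative assumptions, not the project’s actual model.

```python
# Sketch: a multi-layer DNN regressing the flexural-strength ratio from heat-flux
# sensor features; a ratio below 0.7 flags an insufficient weld. Architecture,
# feature count, and data are illustrative assumptions, not the project's model.
import numpy as np
from tensorflow import keras

n_features = 16                       # assumed number of aggregated heat-flux features
X = np.random.rand(256, n_features)   # placeholder sensor data
y = np.random.uniform(0.4, 1.1, 256)  # placeholder strength ratios

model = keras.Sequential([
    keras.Input(shape=(n_features,)),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dense(1),            # predicted flexural-strength ratio
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=10, batch_size=32, verbose=0)

ratio = float(model.predict(X[:1], verbose=0)[0, 0])
recommendation = "insufficient weld" if ratio < 0.7 else "acceptable weld"
print(ratio, recommendation)
```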

As an in-depth pre-test with one developer and one end-user during EIS development revealed, splitting the EIS into multiple dashboards reduces the cognitive load of end-users and developers. As illustrated in Fig. 4, the implemented EIS consists of five different dashboards. Following the proposed nascent design theory, the user-specific dashboards are only accessible to the respective user groups.

Fig. 4 Overview of the different dashboards of the EIS instantiation

These five dashboards comprise the different views for the end-users and the developers of the EIS and consequently constitute a meaningful representation of the derived nascent design theory. The first dashboard provides an overview of the input information (DF3) from the tracked sensors, the corresponding prediction from the ML model, and a (local) explanation of this prediction and thus of the resulting decision recommendation (DF8). By clicking on a button below the shown prediction (DF6), the dashboard highlights decision alternatives. In conjunction with the prediction, a hypothetical scenario is presented to the end-user (DF7). The second dashboard contains the associative information for end-users and developers, including (graphical) information about the related sensors, process execution, and data processing steps (DF5). The third dashboard provides (technical) information about the EIS, including a comprehensive description, the applied ML model architecture, information about ML model training (DF1), and the corresponding performance metrics (DF2). Comparable to the first dashboard, the fourth dashboard addresses DF8 by providing (global) explanations of the ML model for the developer. The last dashboard contains an archive of historical decisions, including the associated sensor data and its history (DF4). By dividing the EIS into multiple dashboards, we ensure granularity and navigability throughout the EIS (DF9). Similarly, within the first and fourth dashboards, we provide drop-down menus that allow end-users and developers to group and prioritize explanations according to their own preferences (DF10).
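As a simple illustration of how such a dashboard-to-user-group mapping could be encoded, the configuration below mirrors the five dashboards and the design features they realize; the dashboard names, role names, and access rules are hypothetical implementation choices, not part of the design theory itself.

```python
# Sketch: hypothetical access configuration mapping the five dashboards (and the
# design features they realize) to user groups, loosely following the description above.
DASHBOARDS = {
    "prediction":  {"features": ["DF3", "DF6", "DF7", "DF8"], "groups": {"end-user"}},
    "associative": {"features": ["DF5"],                      "groups": {"end-user", "developer"}},
    "technical":   {"features": ["DF1", "DF2"],               "groups": {"developer"}},
    "global_xai":  {"features": ["DF8"],                      "groups": {"developer"}},
    "archive":     {"features": ["DF4"],                      "groups": {"end-user", "developer"}},
}

def accessible_dashboards(user_group: str) -> list[str]:
    """Return the dashboards visible to a given user group (supporting DF9)."""
    return [name for name, cfg in DASHBOARDS.items() if user_group in cfg["groups"]]

print(accessible_dashboards("end-user"))   # e.g., ['prediction', 'associative', 'archive']
```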

We asked the experts using the system to speak unreservedly about their impressions and whether they would change, add, or remove any elements. In doing so, we qualitatively analyzed their feedback to identify whether it would affect the proposed design theory. In this regard, we noticed that our experts, except for occasional comments, were satisfied with this EIS instantiation. Here, one developer stated that “The system is well designed and offers all necessary functions to assist me during my work” (U3), and another remarked: “I would like to use the system in our production. As a minor improvement, more technical information about data gathering and preprocessing would be appreciated, at least for our use case” (U1). Likewise, an end-user concluded that “The system seems to offer a solid and comprehensible approach to support end-users.” (U5), while another one claimed that “At first, I perceived the dashboard as complex, which is why I believe that a short introduction is necessary, especially for new end-users. Afterwards, the system appears complete and well designed.” (U6).

Lastly, we evaluated the derived design principles by following the reusability evaluation propositions for DSR-based design principles of Iivari et al. (2021). We performed this quantitative evaluation at the end to ensure that the participants were familiar with the implemented EIS and thus with our nascent design theory, as real-world use of an EIS may reveal additional changes to the proposed design theory. To do so, we asked the participants to rate the constructs of accessibility, importance, novelty & insightfulness, actability & guidance, as well as effectiveness through multiple questions per construct on a 5-point Likert scale (1 = strongly disagree, 5 = strongly agree). We conducted the evaluation anonymously via an online survey to avoid biases. Figure 5 illustrates the corresponding results. Please see Appendix C for the questionnaire.

Since we used multiple questions per construct, we calculated the median for each construct and expert group. Then, we used the median, minimum, and maximum of this data for the overall construct evaluation per user group (Boone and Boone 2012).
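One plausible implementation of this aggregation step is sketched below, assuming the raw ratings are available as a table of expert, user group, construct, and 5-point Likert item scores; the column names and example data are illustrative assumptions.

```python
# Sketch: aggregating multi-item Likert ratings per construct and user group
# (median over a construct's items per expert, then median/min/max across experts).
# Column names and the example data are illustrative assumptions.
import pandas as pd

ratings = pd.DataFrame({
    "expert":    ["U1", "U1", "U2", "U2", "U5", "U5"],
    "group":     ["developer", "developer", "developer", "developer", "end-user", "end-user"],
    "construct": ["importance"] * 6,
    "score":     [4, 5, 5, 5, 3, 4],   # 5-point Likert items belonging to one construct
})

# Median over the items of a construct for each expert ...
per_expert = ratings.groupby(["group", "construct", "expert"])["score"].median()

# ... then median, minimum, and maximum across experts per user group and construct.
summary = per_expert.groupby(["group", "construct"]).agg(["median", "min", "max"])
print(summary)
```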

This results in overall positive expert feedback. As the experts did not consider any further changes to our design theory necessary, stating that “the design theory seems complete” (U4), and had a positive perception of the design principles (cf. Fig. 5), we consider our nascent design theory ready to use.

Fig. 5 Results of reusability evaluation according to Iivari et al. (2021)

Discussion of findings

Discussion and implications

Discussion.

There are several contributions dealing with design approaches for EIS (e.g., Bunde 2021; Landwehr et al. 2022; Meske and Bunde 2022; Schemmer et al. 2022) to create what Dellermann et al. (2019) have called hybrid intelligence.

While we conclude that intelligibility (DR1), expressed through both global and local explanations, is important, Meske and Bunde (2022) and Landwehr et al. (2022) are limited to local explanations; only Schemmer et al. (2022) describe the need for providing overall explainability. Further, recent DSR-based XAI contributions (e.g., Landwehr et al. 2022; Meske and Bunde 2022) do not include the support of own decision-making (DR2) within their design theory. In contrast, these research findings are primarily derived from the HCI field (e.g., Dikmen and Burns 2022) and demonstrate the need for an interdisciplinary design theory. The same applies to increasing user motivation (DR3) (e.g., Ferreira and Monteiro 2020) and reducing cognitive effort (DR4) (e.g., Oviatt 2006). Moreover, while we observed the need for increasing user motivation and reducing cognitive effort within recent literature, end-users and developers barely envisioned this need when discussing both design requirements on a theoretical basis. Nonetheless, we were able to uncover during the EIS application that users still require design principles related to DR3 and DR4.

In terms of the derived design principles, our study also extends the current body of design knowledge. That is, while recent research targets end-users and is thus limited to addressing local explainability (e.g., Bunde 2021; Landwehr et al. 2022; Meske and Bunde 2022), our nascent design theory does not only include local explainability (DP2) but also incorporates global explainability (DP1) for developers. In addition, while theoretical contributions (e.g., Mohseni et al. 2021) mainly assign DP2 to end-users, our research indicates that developers also benefit from using local explanations. This extension of design science knowledge based on our research applies to DP3 and DP4 as well. While personalized interface design (DP3) is considered important (Conati et al. 2021), during our first design cycle only developers confirmed this finding. Nonetheless, during the second design cycle, end-users also confirmed the importance of DP3. Regarding the consideration of psychological/emotional factors (DP4) for end-users and developers, our findings are in line with recent research (Arrieta et al. 2020).

Lastly, matching theoretical foundations with our research findings also reveals differences. Comparing our findings with related design theories (Bunde 2021; Landwehr et al. 2022; Meske and Bunde 2022; Schemmer et al. 2022) shows that only four out of our ten design features have been mentioned earlier. This includes design features such as providing input information (DF3) and historical information (DF4) as well as using explanation techniques (DF8) and incorporating granularity and navigability (DF9). Six out of our ten design features were derived from interdisciplinary contributions. Comparing the targeted user groups from theory with our findings uncovers further distinctions: while the six design features DF1 (Hepenstal and McNeish 2020), DF2 (Sun et al. 2022), DF3 (Nunes and Jannach 2017), DF4 (Atkinson et al. 2020), DF6 (Nor et al. 2022), and DF7 (Amershi et al. 2019) are in line with recent interdisciplinary research, four design features are not. Although previous research considers DF5 (Haynes et al. 2009), DF8 (Jesus et al. 2021), DF9 (Amershi et al. 2019), and DF10 (Schneider and Handali 2019) for both user groups, our evaluations reveal that DF5 and DF8 have a purely unilateral preference towards end-users and DF9 and DF10 towards developers. While our theory-based initial design theory, drawing on scholarly literature, included the need for an anthropomorphic design language, as in chatbots, to reduce adaptation barriers (Weitz et al. 2019), we did not include this design feature (DF11) in our final nascent design theory because our experts rejected it, as non-novice users are accustomed to working with abstract information, which would lead to undesirable complexity within the EIS. We could not find evidence for it with the EIS instantiation either. We acknowledge, though, that DF11 may be relevant in situations where end-users possess no technical skills at all (e.g., private use of intelligent assistance services, chatbots, etc.).

Theoretical implications.

DSR seeks to develop prescriptive design knowledge by developing and evaluating novel IT artifacts to solve practical problems (Hevner et al. 2004). Corresponding to mode 3B of Drechsler and Hevner's (2018) design theorizing modes, we derived a nascent design theory that provides explicit prescriptions for entity realization for a class of explainable AI-based DSS, so-called EIS. Further, following Gregor and Hevner's (2013) DSR knowledge contribution framework, we contribute a nascent design theory comprising (meta) design requirements, design principles, and design features (level 2 contribution) and a situated implementation of the IT artifact (level 1 contribution). Since we applied two design cycles, the design theory can be considered rigorous and can consequently serve as input for future research (Hevner 2021).

Looking at previous design science research reveals that the integration of AI in DSS leads to intelligent systems that are capable of supporting users in their decision-making process (Janiesch et al. 2021). However, due to their focus on user performance, these systems are primarily developed for low-stake use cases in which users do not rely on comprehending the reasoning of an ML model (e.g., Zschech et al. 2020), as an incorrect recommendation has no significant impact on humans or the environment (Rudin 2019). In contrast, utilizing these systems in high-stake use cases, in which incorrect decisions may endanger human lives or have vast consequences, requires the explicit consideration of techniques such as XAI to make the ML model's behavior traceable (Mohseni et al. 2021), resulting in the need for EIS applications (Herm, Heinrich, et al., 2022a). Hence, recent research has already developed first design principles for domain-dependent EIS development (e.g., Landwehr et al. 2022). To extend this sparse research, we position our research as a broad design theory for EIS development (Chandra Kruse et al. 2022) that distinguishes itself from recent research in three ways:

First, to the best of our knowledge, there is no other scholarly contribution providing a nascent design theory for domain-independent EIS including an instantiation. That is, compared to current research contributions that develop DSR-based design principles for specific use cases (e.g., Bunde 2021; Landwehr et al. 2022; Meske and Bunde 2022), our research provides first-hand design knowledge as a starting point for adoption and refinement across all types of decision support use cases. As an example, applying our design theory to a healthcare use case may lead to the consideration of additional factors to assist physicians in high-stake cases in which human lives could depend on a decision.

Second, in our design theory we consider recent findings from design-based XAI, interdisciplinary XAI, and HCI research. To this end, our design theory comprises not only technical XAI aspects but also socio-technical aspects that originate from the fields of HCI and psychology. In doing so, we take into account the diverse facets of human-agent interaction that unfold due to XAI's nature (Miller 2019).

Third, our design theory also includes the consideration of different user groups. Since previous XAI research has not sufficiently addressed the integration of end-users, we have focused our design theory not only on developers and ML experts but also on end-users. However, we recognize that there is no one-size-fits-all EIS. That is, during the interview studies, we mostly relied on end-users who are domain experts but largely unskilled in terms of ML. During our qualitative research, we identified this type of end-user as widespread. Hence, we take our design theory as a starting point for the consideration of end-users, with potential adjustments to the design theory required for specific use cases, for instance, when novice users perform tasks.

Practical implications.

During our research, we found that XAI is not a silver bullet. That is, in practice the use of XAI does not automatically ensure the utilization of EIS. Even when XAI-based transfer techniques are used, novice users need to be empowered to use these EIS and thereby develop a thorough understanding. This is especially true for high-stake scenarios, where recommendations and explanations must be comprehensible to users at all times. In addition, this can (psychologically) support users when they compare explanations with their own expertise and expectations.

Besides, companies should discuss the required cognitive effort with their end-users. Surprisingly, although we particularly focused on reducing this effort, end-users told us that using the EIS seemed quite complicated to them at first. Consequently, conducting training before an EIS is used guides these novice users and likewise reduces the required cognitive effort as they become familiar with the system.

Nevertheless, we revealed that some end-users not only want to comprehend a recommendation but also want to determine the quality of the underlying ML model based on metrics such as accuracy, F1-score, or decision certainty in order to critically evaluate the provided recommendation. At the same time, these users are not interested in understanding how the models operate. Instead, we found that end-users trust the model development and selection by the EIS designers. Conversely, talking to the experts shed light on the correlation between AI knowledge and trust in AI: AI experts tend to have more reservations about AI because they are aware of potential difficulties in selecting, training, and developing ML models.
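To make this concrete, the following minimal sketch, assuming a scikit-learn classifier and a public benchmark data set as stand-ins for an EIS back end, shows how the quality indicators requested by end-users (accuracy, F1-score, and per-recommendation decision certainty) could be computed and surfaced alongside each recommendation; all names are illustrative and not part of our artifact.

```python
# Minimal sketch (illustrative placeholders, not the evaluated EIS): compute the
# quality indicators end-users asked for and expose them next to a recommendation.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)                       # placeholder data set
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

y_pred = model.predict(X_test)
proba = model.predict_proba(X_test)

quality_panel = {
    "accuracy": accuracy_score(y_test, y_pred),                   # overall model quality
    "f1_score": f1_score(y_test, y_pred),                         # balance of precision and recall
    "decision_certainty": proba.max(axis=1),                      # certainty per recommendation
}
print(quality_panel["accuracy"], quality_panel["f1_score"], quality_panel["decision_certainty"][0])
```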

Finally, in the second evaluation phase of the design cycle, we found that experts view the implemented EIS not only as an opportunity to deploy AI into practice in an explainable fashion but also as an opportunity to use the data-driven knowledge it generates to train end-users for use case execution. In doing so, we noticed that the utilization of an EIS fosters the acceptance of AI and allows experts to view AI as trustworthy.

Limitations and future research

Although we ensured scientific rigor by applying established DSR guidelines (Gregor and Hevner 2013; Iivari et al. 2021; Vaishnavi and Kuechler 2007), we noticed certain limitations in our research. These include the two expert studies we conducted while adjusting and evaluating the proposed nascent design theory, in which the experts already had several years of experience in the field of AI. Hence, we must assume that the results could differ for novice users. Further, all interviewees were early- to mid-career employees, so our results are more likely to apply to this age group than to employees in their mid-50s and older. We conducted the last evaluation phase based on an exemplary and thus context-dependent scenario, which is why the results could vary in other scenarios. Also, end-users did not have to make time-critical decisions in the use case application. With this in mind, we assume that the design of EIS may differ when additional technical, privacy, or cognitive constraints must be considered. Lastly, we did not test all 15 possible design principle configurations to ensure design principle expressiveness (Janiesch et al. 2020). Our design theory represents a nascent design theory; it is not yet a fully developed grand theory.

During our research, we noticed several shortcomings in the current XAI literature and in XAI applications in practice, leading to novel research opportunities. As part of a DSR-based research project, we provide research prospects that future projects can use as a starting point and thus as meta design requirements for their work (Peffers et al. 2007).

Contrary to existing theoretical assumptions (e.g., Liao et al. 2020), global explanations are not necessarily suitable for developers, as they, too, may be cognitively overwhelmed. For future research, it is therefore necessary not only to investigate interactive XAI-based explanations with different levels of granularity for end-users but also to consider developers as a relevant user group. This is especially true since the algorithmic output of common XAI tools can be challenging for this user group (Herm et al. 2021; van der Waa et al. 2021), as not all developers have a data science background.

Connected to this, we found that all experts emphasized the importance of adequate XAI-based explanations during the evaluation of the use case. However, none of these experts was able to provide dedicated requirements for such an explanation. Consequently, research should target the derivation of frameworks and guidelines for selecting context-specific and appropriate XAI explanation types to assist decision-making. This includes evaluation metrics and standards to define the quality of an explanation, which may also differ across use case scenarios. While previous research has already endeavored to define criteria such as clarity, fairness, bias, completeness, and soundness (e.g., Zhou et al. 2021), it is not evident how these can be objectively measured and whether they are sufficient in constrained scenarios. In addition, the use of EIS requires interdisciplinary research to define guidelines and norms that ensure legally compliant utilization of EIS across different application domains, transitioning EIS into trustworthy AI (Thiebes et al. 2021).
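As one possible direction, the hedged sketch below illustrates a candidate, objectively measurable explanation-quality criterion: the fidelity with which a simple, interpretable surrogate reproduces the black-box model's predictions. The models and data are assumptions for illustration; this is not an established standard nor part of our evaluation.

```python
# Hedged sketch of one candidate explanation-quality metric: fidelity of an
# interpretable surrogate to the black-box model it is meant to explain.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

black_box = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Train a shallow, human-readable surrogate on the black box's own predictions.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X_train, black_box.predict(X_train))

# Fidelity: share of test instances on which surrogate and black box agree.
fidelity = np.mean(surrogate.predict(X_test) == black_box.predict(X_test))
print(f"Surrogate fidelity: {fidelity:.2f}")
```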

Lastly, we found divergent results regarding the relevance of user motivation (Ferreira and Monteiro 2020). Here, we assume that components to increase user motivation are primarily necessary for novice users, since experienced users have already internalized the benefits provided by an EIS. Although our experts mentioned the potential of gamification concepts to reduce EIS acceptance barriers through play, recent research has not yet focused on this approach. While research has already shown how students can learn new content through an interactive, game-based learning platform (Xinogalos and Satratzemi 2022), a gamified approach with a leaderboard could provide employees with the necessary EIS knowledge and potentially increase adoption or reduce learning barriers when it comes to using yet unknown technologies. However, our experts were unable to define how such a learning platform should be designed to support their employees without overwhelming them.

Conclusion and outlook

The lack of explainability of intelligent systems inhibits their acceptance. XAI offers a potential path out of this dilemma. In response, we have developed a rigorous nascent design theory for EIS that includes four design principles and ten design features to foster the acceptance of AI-assisted decision-making, focusing on local and global explanations, personalization, and the reduction of intrinsic barriers. In doing so, we incorporate both technical and socio-technical aspects of XAI to address the needs of different user groups, including end-users and developers, and thereby develop a broad, domain-independent design theory that also considers human-agent interaction. In summary, our nascent design theory provides novel design knowledge for a symbiosis of expert and system and can further foster the integration of AI into operational practice.