1 Introduction

Non-communicable diseases (NCDs), including chronic conditions like heart disease, cancer, and chronic kidney disease, pose a major public health issue. These diseases, which are not transmitted from person to person, lead to significant economic strain, premature mortality, and a reduction in the quality of life (QoL) for individuals affected (Lopez et al. 2014). Older adults are the most affected in terms of health, and they are also the group with a greater need for support in rehabilitation (De Biase et al. 2020). More recently, the COVID-19 pandemic has also revealed shortcomings in healthcare and the need to implement solutions for coping with diseases from a distance or at home (Fagherazzi et al. 2020). In this sense, persuasive systems may contribute to improving the situation. Persuasive systems are designed and studied in Information Systems Research (ISR) to inform and support patients, change undesirable behaviors, or guide decision-making (Chatterjee and Price 2009; Orji and Moffat 2018). Despite the potential benefits these systems offer to the elderly, their often less advanced digital skills might hinder acceptance, highlighting the need for tailored solutions to enhance their accessibility (Mannheim et al. 2019).

The use of digital solutions is also seen at the policy level as an important means of disease prevention and control. There is strong evidence that proper rehabilitation helps patients regain independence and QoL (Clifford 2011; Bethge et al. 2014; Siegel and Dorner 2017). Despite its importance, rehabilitation is often disrupted after hospital discharge. The transition to the patient’s home is difficult, which has a negative impact on adherence to treatment plans and the long-term success of the treatment. To address this gap, Virtual Coaches (VCs) are being recognized as promising interventions in the realm of digital health. They have the potential to support the continuity of care and to partially replace or complement traditional in-person coaching, especially for rehabilitation at home (Ding et al. 2010; Kyriazakos et al. 2020; Graßmann and Schermuly 2021; Weimann et al. 2022). In particular, they could motivate the patient to adhere to physical activities, set incentives to be active, monitor vital signs, support health literacy (Kyriazakos et al. 2020), and serve as a trusted companion offering guidance (Allen et al. 2008). Therefore, VCs must adequately reflect both the needs of the patients and the relevant clinical (process) knowledge in terms of clinical pathways (Gand et al. 2021b).

In recent years, the interface of VCs has frequently taken the form of so-called conversational agents (CAs) that are autonomous systems designed to resemble a human communication partner (Seeger et al. 2021). Embodied conversational agents (ECAs) represent a special type of CAs, characterized by computer-generated and often human-like avatars interacting with the user (Curtis et al. 2021). Given their capacity to also convey non-verbal cues, several studies suggest their potential suitability for health coaching purposes by enhancing user engagement, building a trustful and empathic relationship, and promoting system accessibility (Tropea et al. 2019; Kramer et al. 2020; Jiang et al. 2024). However, further explorations beyond the current state-of-art are required in the following areas:

  • Use of CAs in real-world care scenarios: Evidence for (E)CAs has often been gathered in laboratory settings with a cross-sectional character and healthy subjects (e.g., ter Stal et al. (2020); Schachner et al. (2020)). Real-world aspects like the long-term use of ECAs in a clinical setting are possibly often neglected due to complicated and costly processes (e.g., ethical approval, data protection or medical device regulations, patient recruitment, and patient training). This is especially relevant when integrating sensor devices feeding the VC, using personalization algorithms, or dealing with older/impaired users or users who lack digital skills.

  • Design knowledge and impact of VCs: There is only scarce evidence for the design of VCs and their advantageousness in the medical domain (Tropea et al. 2019; Bin Sawad et al. 2022). This includes the absence of explicitly formulated and validated design knowledge in terms of meta-requirements, design principles, and design features, which are at the heart of contemporary ISR (Strohmann et al. 2023). Detailed, systematically structured and evaluated design knowledge is crucial for understanding and replicating VCs in practice. Without rigorous clinical methodology applied in practitioner-researcher interventions, ISR may struggle to produce truly fit-for-purpose solutions (Baskerville et al. 2023).

  • Investigation of virtual coaching platforms: To date, VCs have been primarily investigated and developed for specific disease scenarios (Kramer et al. 2020; Car et al. 2020; Jiang et al. 2024). The development of holistic coaching platforms applicable for a range of medical indications, including the demonstration of the technical feasibility and a rigorous evaluation for multiple disease contexts, has been a largely unaddressed field in the literature so far. Such platforms, with a scalable architecture, could potentially transform healthcare delivery by providing tailored and efficient support to a vast, diverse, and morbid population (Fürstenau et al. 2023).

The aim of advancing knowledge on the design and use of VCs for home rehabilitation therefore leads us to the following research questions (RQs):

RQ1: What are the design principles for implementing VC-based home rehabilitation systems to support the continuity of care?

RQ2: Which design principles could be adopted for designing comparable systems?

RQ3: To what extent do VC-based rehabilitation systems influence rehabilitation positively concerning user experience and QoL?

The present study reports on a comprehensive research project that built and evaluated a VC for supporting the rehabilitation of older patients at home. Methodologically, the project followed a Design Science Research (DSR) approach, embedding three build and evaluate cycles (Kuechler and Vaishnavi 2008; Sonnenberg and vom Brocke 2012). The project has taken the design prototype into pilot testing to address the above shortcomings of VC studies. To this end, involving different professions, such as clinicians and software engineers, provided access to different stakeholders, particularly patients, who are essential for the design and evaluation.

For practice, the study contributes with the design artifact itself, including the requirements set, technical specification, and the learning from three build-and-evaluate cycles. For theory, we justify a set of validated meta-requirements, design principles, and design features that could inform future researchers when designing comparable interventions.

The remainder of the paper is structured as follows: In chapter 2, we outline the background and theoretical foundations reflecting the Computers-Are-Social-Actors Paradigm and the Social Response Theory. In chapter 3, the method is presented. In chapter 4.1, we derive the initial design principles for VCs in rehabilitation contexts based on literature and summarize the artifact design. Chapter 4.2 consolidates the evaluation cycles. We conclude with implications for theory and practice, the limitations, and an outlook for future research.

2 Theory and Background

2.1 Virtual Coaches and Related System Classes

Due to the novelty of VCs, there has long been a lack of terminological clarity in the literature concerning how they relate to other concepts such as virtual assistants or CAs. Furthermore, a variety of synonymous terms exist in the literature (e.g., “e-coach” or “virtual trainer”; Tropea et al. 2019). A holistic and integrative view of VCs and related system classes was recently given by Weimann et al. (2022). In this work, VCs were defined as adaptive software systems that target goal-directed transformations of the coachee’s (i.e., patient’s) cognition, affection, and behavior to achieve improvements on the individual and societal level. Virtual assistants, in contrast, aim to simplify the user’s life and rather function as servants. Overall, VCs could support human skill development in many domains (Chatterjee et al. 2021). Regarding healthcare, VCs have been previously studied to train physicians or nursing staff (Cook et al. 2010; Richards et al. 2020) but even more frequently as patient-supporting applications to promote physical, mental, and/or social health by instructing exercises, imparting disease-related knowledge, and serving as a long-term motivational companion (Tropea et al. 2019). The literature review by Tropea et al. further highlights several examples of VCs working with healthy individuals (e.g., physical activity promotion to prevent obesity), but points out a notable deficiency in their use for elderly patients’ rehabilitation.

From a conceptual view, both the human coach and coachee can vary in their digital presence on a continuum. At one end of this continuum, the ultimate form of the coach’s digital evolution can be regarded as the replacement by an autonomous system (Kool et al. 2013, p. 27). Yet, this classification approach tends to blur the distinction between how care is delivered (in-person, remotely, and/or via standalone autonomous systems, e.g., Tuckson et al. (2017)) and the differing levels of the coach’s and coachee’s digital enrichment. Instead, another perspective posits that all three modes of care delivery can be complemented by the widespread adoption of digital technologies (e.g., using sensors, decision support systems, or virtual reality tools during in-person coaching; Philpot et al. (2023)). While remote and standalone delivery modes can be summarized as VC in a broader sense, prior literature argues that VCs go beyond mere computer-mediated communication tools by being at least semi-autonomous software agents (Weimann et al. 2022). Further, VCs emphasize the conversational character and context of the coaching process. This allows for more intelligent behavior by adapting the coaching actions depending on the user’s current context (Kamphorst 2017). Given their conversational character, it seems unsurprising that current VCs are associated with CAs for the interface. However, recent literature reviews on ECAs as health coaches highlight a lack of context-based personalization and overall evidence, underscoring the need for further advancements and clinical trials (Lyzwinski et al. 2023; Jiang et al. 2024).

Given the socio-technical nature of VCs, different perspectives are of interest when evaluating the VC in medical care. These could be assessing the VC’s clinical efficacy and effectiveness by considering psychological, behavioral, and physiological outcome parameters (e.g., QoL, daily steps, blood pressure), usability or user-experience metrics, and the evaluation of the general technology implementation along with economic measures (Curtis et al. 2021; Jiang et al. 2024). Given the long-term use of the VC application, ensuring high user experience coupled with approaches to promote adherence in a targeted manner (e.g., serious games or gamification) is crucial for achieving clinical improvements and plays a significant role in the design process of VCs (McCallum 2012; Oinas-Kukkonen 2013).

2.2 Virtual Coaches as Social Actors

When CAs are used for the front-end of the VC, design aspects such as the avatar’s visual appearance, voice, or gestures may trigger emotional responses (e.g., enjoyment) and thus (positively) impact user experience (Loveys et al. 2020; ter Stal et al. 2020). The general idea of deliberately including such social cues (e.g., giving the VC a name, selecting particular clothing) in the system design is empirically grounded in the so-called “Computers-Are-Social-Actors” Paradigm and the Social Response Theory based on it (Nass and Moon 2000). According to this fundamental theory, computers can influence cognition, affection, and behavior similar to humans. Considering the high media richness of ECAs, a variety of different social cues can be implemented that may increase the persuasive characteristics of the VC and the likeliness that the user follows the VC’s advice (Feine et al. 2019). However, because of this larger design space, the design of ECAs is more challenging than that of simple disembodied chatbots. Consequently, an unfavorable design may lead to mismatches of social cues that negatively affect user experience. The so-called theory of “uncanny valley” (Mori et al. 2012) from robotic research postulates that increasing the humanness of the VC can increase acceptance to a certain level. Still, the acceptance rapidly drops when there are imperfections between human-like appearance and the system’s actual behavior. Consequently, using ECAs as health coaches can be beneficial for persuading the user to adopt a specific behavior but may lead to adverse effects with an unfavorable design (Venning et al. 2021).

2.3 Grounding the Design of Virtual Coaches in Behavior Change Theory

Promoting behavioral changes to improve the disease trajectory or prevent diseases can be considered the primary intention of coaching in medical care (Olsen and Nesbitt 2010; Passmore and Lai 2020). Therefore, evidence from behavioral medicine, social psychology, and behavioral IS research provides a rich design knowledge base for VCs. Mainly, two theoretical frameworks are referenced in this regard and have been used in the past to design digital health interventions (Wang et al. 2019): The Behavior Change Technique Taxonomy by Abraham and Michie (2008) and Michie et al. (2013) (from the field of behavioral medicine), and the Persuasive System Design Model by Oinas-Kukkonen and Harjumaa (2009) (originated from ISR). The Persuasive System Design Model is, due to its origin, one step closer to the software development perspective by describing concrete design principles along with exemplary requirements and implementations. Both frameworks are grounded on multiple established psychological theories, such as goal-setting theory (Locke and Latham 1990), theory of planned behavior and its predecessor (Ajzen 1991), or the information-motivation-behavioral skills model (Fisher and Fisher 1992). However, both frameworks vary in their arrangement and number of proposed techniques, respectively, design principles. The Persuasive System Design Model classifies 28 design principles related to primary task support, dialog support, system credibility support, and social support. In contrast, the Behavior Change Technique Taxonomy proposes 93 techniques divided into 16 groups (e.g., goals and planning, feedback and monitoring, social support). Despite some similarities (e.g., self-monitoring, social support), the frameworks complement each other and, therefore, serve as a holistic fundament for designing virtual coaching interventions (Asbjørnsen et al. 2022). 
However, to the best of our knowledge, the frameworks have so far not been used to implement and test a VC in the home rehabilitation domain for multiple disease contexts.

3 Method

3.1 Case Description

The study was part of the EU-funded vCare project (see Footnote 1). This provided access to four clinical reference sites and their stakeholders, as well as to resources for implementing and evaluating a VC solution regarding technical feasibility, usability, and effectiveness in a real-world setting. Following the DSR paradigm, the project methodology included three primary joint clinical-technical development phases (Kyriazakos et al. 2020) corresponding to three distinct build and evaluate cycles (Kuechler and Vaishnavi 2008; Sonnenberg and vom Brocke 2012). The development team included 12 interdisciplinary members with clinical, technical, and exploitation backgrounds. In mid-2017, the development of the VC solution started; the final artifact was released within the third phase in mid-2022 (see also Fig. 2 below for a procedural overview). Ethical approval was granted for the study sites.

The project focused on home rehabilitation of two neurological (stroke and Parkinson’s disease) and two cardiological (heart failure and ischemic heart disease) pathologies. Clinical sites in Italy (Lombardy), Spain (Basque Country) and Romania (Bucharest region) have been involved. The clinical aim was to support patients to continue their inpatient rehabilitation at home using a VC. In sum, 80 patients were involved. The users obtained a holistic VC solution that uses serious games (games that go beyond mere entertainment, here for better health conditions, and use elements like striving for new high scores to achieve the goals) in front of a camera with body recognition for motoric exercises, or tablet-based games for cognitive exercises. The VC provides motivational feedback, controls the overall rehabilitation procedure, shows exercise results and vital signs overviews, and partly compensates for the absence of human caregivers. A reasoning component allows the VC to adapt to the condition and preferences of the patients and, thus, allows a personalization of the rehabilitation procedures replicating clinical process knowledge and human adaptivity (e.g., adapting physical exercises to heart rate excesses). Various sensors (e.g., for step count, heart rate, or position in the home) are integrated and used as the data basis and for decisions or inference mechanisms. Clinical procedural knowledge is represented and referred to for overall control and scheduling of all coaching measures in terms of clinical pathways as conceptual procedural models (Gand et al. 2021b).
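The adaptation example mentioned above (adjusting physical exercises when the heart rate is exceeded) can be illustrated with a minimal sketch. It is not the vCare reasoning component; the class `ExercisePlan`, the function `adapt_to_heart_rate`, the five-level intensity scale, and the threshold values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExercisePlan:
    name: str
    intensity: int  # hypothetical scale: 1 (lightest) to 5 (hardest)


def adapt_to_heart_rate(plan: ExercisePlan, heart_rate_bpm: int,
                        ceiling_bpm: int) -> ExercisePlan:
    """Step the intensity down by one level when the measured heart rate
    exceeds the ceiling; otherwise return the plan unchanged."""
    if heart_rate_bpm > ceiling_bpm and plan.intensity > 1:
        return ExercisePlan(plan.name, plan.intensity - 1)
    return plan


# Illustrative values only; a real ceiling would be set by a clinician.
plan = ExercisePlan("seated leg raises", intensity=3)
calm = adapt_to_heart_rate(plan, heart_rate_bpm=95, ceiling_bpm=120)
strained = adapt_to_heart_rate(plan, heart_rate_bpm=131, ceiling_bpm=120)
print(calm.intensity, strained.intensity)  # prints: 3 2
```

In such a sketch the ceiling is passed in as a parameter, so that the clinical limit stays a decision of the health professional rather than of the software.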

3.2 Research Approach

DSR has evolved as a fundamental prescriptive paradigm in ISR, contributing to theory and practice to solve real-world problems (vom Brocke et al. 2020). Moreover, DSR provides distinct guidance to help formulate research aims and structure work (Hevner et al. 2004; Gregor and Hevner 2013; vom Brocke et al. 2020).

The paper outlines a high-level design approach that makes use of various design science methods. Based on the problem (the relevance outlined by the clinical experts in workshops and interviews; upper left part in Fig. 1), a problem-diminishing solution is suggested. The knowledge base in terms of existing literature is the primary source of evidence for the initial derivation of the design (upper right part in Fig. 1; Hevner 2007). With these inputs, the (meta-) requirements for addressing the problem are derived, guiding the development of the artifact and feeding its evaluation (upper middle part in Fig. 1; Peffers et al. 2007; Kuechler and Vaishnavi 2008).

Fig. 1

Cyclic DSR approach (upper part; (Hevner 2007)) and respective DSR contribution types (lower part; building upon Gregor and Hevner (2013); Drechsler and Hevner (2018))

Also, the experts actively participated in all technical developments and always accompanied the requirements (as far as content-related clinical and not purely technical aspects were concerned; upper left part in Fig. 1). This then fed further executions of the (design) cycles. In this sense, the project is a mode of convergence between the problem space (the need for home rehabilitation) and the solution space (personalized virtual coaching) that not only leads to this convergence itself, but also contributes to the understanding of how this convergence takes place (solution design knowledge or λ-knowledge according to Drechsler and Hevner (2018)).

After theoretically deriving the design principles for VCs in rehabilitation contexts in general (focal level 2 of DSR contribution types in Fig. 1), we developed an instantiation during the project, which we then evaluated in a threefold test setting (level 1 in Fig. 1). The testing phases were crucial in determining whether the VC solution could achieve the targeted clinical outcomes and in confirming the effectiveness of the applied prescriptive knowledge.

The project methodology included a division into three primary development phases (Kyriazakos et al. 2020) that correspond to three distinct design cycles (Kuechler and Vaishnavi 2008; Sonnenberg and vom Brocke 2012; see Fig. 2). Overall, a mixed method approach was used to receive feedback on the proposed solution (ex-ante), or proofs of the technical prototype (ex-post). With each sub-cycle, the focus shifted from technical feasibility to usability and clinical outcomes (Kyriazakos et al. 2020):

  1. Tech Lab (DSR Cycle 1): The technical feasibility of the solution in terms of a formative evaluation was assessed. Based on the clinical needs, the technical project partners developed a prototype to test in a laboratory setting, examining the system integration. Also, the clinicians simulated the patients to confirm the prototype’s compliance with the clinical requirements and the external validity by assessing its compatibility with the needs of the clinical sites. In terms of an artificial evaluation (Venable et al. 2012), the fit of the actual implementation with the defined technical (functional and non-functional) requirements was investigated.

  2. Living Lab (DSR Cycle 2): A controlled environment in terms of the clinical facilities was used to assess the clinical feasibility of the VC solution. The interaction with the artifact was performed by the patients, but not in a real-life environment in order to better control and assess the effects of and on this interaction. With the technical feasibility established, now the perception by the users was in primary focus. Also, some minor technical issues and user experience flaws were detected during the actual usage of the solution.

  3. Pilot Phase (DSR Cycle 3): After assessing both the technical and clinical feasibility of the VC solution, its effect on the clinical outcomes had to be finally assessed. This was done by using standard clinical scales and quantitative assessments (like for QoL) to evaluate the effect of the solution on the patients. Finally, the user perception, motivation, and acceptance of the technology were assessed. Ten patients per pathology were included in a control group, and another ten patients in a test group.

Fig. 2

Overview of the design phases (top: timeline/phases of the project; left: generic Design Science Research Cycle according to Kuechler and Vaishnavi (2008); right: partly reusing the Build and Evaluate phases according to Sonnenberg and vom Brocke (2012))

For each of the three phases, we evaluated the appropriateness of the design given the respective requirements (see the evaluation results in Sect. 4.2, the respective lab adaptation overview in Appendix A (available online via http://link.springer.com), and the final requirement set as deduced in Sect. 4.1.1). Figure 3 offers an overall overview of the evaluation approach. It is with intention that all aspects are covered in terms of a holistic evaluation concept, including distinct participatory design tools (Harst et al. 2021).

Fig. 3

Overview of the evaluation during the different phases (on the left), contrasted to the choice within the DSR Evaluation Framework by Venable et al. (2012) on the right (the vertical spanning of the phases on the left and the evaluation types on the right is matched)

The high degree of interaction between health professionals and patients (especially in the Living Lab) also poses a risk to the reliability of the evaluation results. In particular, confirmation bias can lead to distortion (Nickerson 1998). To counteract this, a study protocol with precisely defined quality criteria and scales for the individual steps was developed in advance of all three study phases (parallel to the derivation of the requirements). This had to be strictly adhered to. Thus, bias could at least be reduced, although person-related biases may be difficult to fully exclude. Furthermore, several pilot locations were chosen not only to ensure pathological as well as geographical/cultural diversity. Rather, the multi-site design also allows contradictory results arising from overly heterogeneous settings, or a major bias at one of the locations, to be uncovered. The different locations therefore also act as correctives for one another in terms of mutual learning or failure prevention.

The results of the literature review (knowledge base in Fig. 1, top right) are presented in chapter 4.1. A participatory design approach (Harst et al. 2021) was chosen for the relevance part (Fig. 1, top left) that similarly contributed to the derivation of the requirements for the design of the VC solution. The results of the participatory design approach can be found in the evaluation for each phase (see chapter 4.2 for the overall evaluation results and Appendix A for the adaptions per test phase). The basic procedure of the participatory design approach and the sub-steps per phase are summarized in Fig. 4. More details are given in Appendix C. Particularly for the first part, we extensively gathered user viewpoints. Also, the basic shape of the VC solution was based on the needs and respective discussions with the clinicians. These initial requirements were refined and amended in discussions with the technical project partners, the participatory design elements and also taking into account the literature, resulting in the final set of requirements as discussed in Sect. 4.1.1. For example, the original requirements analysis envisaged monitoring users’ vital signs and activity continuously. This was intended to ensure an optimal database and the best possible basis for clinical decision-making. However, this was changed as part of the participatory design. Criticism was voiced by both patients and experts in the Tech Lab (see Appendix C). For example, users said that there should also be unobserved times (“stealth mode”). Patients’ autonomy should not be unnecessarily restricted. Experts also expressed concerns about data protection regarding full-time monitoring. For this reason, an easy-to-activate standby mode was developed within the VC solution, in which the avatar does not “listen” or is inactive as far as the technical monitoring components are concerned. 
Ultimately, the final version of the requirement is reflected below in DP1 or DF1-5, 7, 9, 11, 13–19, which can be actively restricted by the user through the standby mode.
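The standby mode described above can be read as a simple gate in front of all monitoring components. The following sketch shows one way such a gate could look; the class `MonitoringSession` and its methods are hypothetical names and do not reflect the actual vCare implementation.

```python
from dataclasses import dataclass, field


@dataclass
class MonitoringSession:
    """Collects sensor readings unless the user has activated standby."""
    standby: bool = False
    readings: list = field(default_factory=list)

    def toggle_standby(self) -> None:
        # A single, easy-to-reach switch controlled by the user.
        self.standby = not self.standby

    def record(self, sensor: str, value: float) -> bool:
        """Store a reading and return True; discard it and return False
        while standby is active."""
        if self.standby:
            return False
        self.readings.append((sensor, value))
        return True


session = MonitoringSession()
session.record("heart_rate", 72)   # stored
session.toggle_standby()           # user requests an unobserved period
session.record("heart_rate", 75)   # discarded, nothing is kept
session.toggle_standby()           # monitoring resumes
session.record("steps", 120)       # stored
print(session.readings)
```

Discarding readings at the point of collection, rather than filtering them later, matches the data protection concern raised by the experts: during standby no monitoring data exists in the first place.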

Fig. 4

Participatory Design Approach (Harst et al. 2021) as part of the DSR environment/Relevance cycle (Hevner 2007)

The test results per phase are discussed in Sect. 4.2. The generalizable learnings for the overall design process are summarized below. We also provide micro-learnings from each build-evaluate cycle in terms of particular design adaptions (see Sect. 4.2 and Appendix A).

4 Results

4.1 Artifact Design

In the following, we report on the derived meta-requirements (MRs), design principles (DPs), design features (DFs), and their technical instantiation in the final design artifact. The complete set of DFs can be found in Appendix D.

The design knowledge is the result of the three DSR cycles illustrated in Fig. 2. Based on the recommendations by Bradbury et al. (2014), we combined an analysis of the literature and a participatory design involving both clinical domain experts and patients (i.e., deductive and inductive). This way, both the DSR environment and knowledge base contributed to the final design (see Fig. 5, left). The deductively derived MRs were substantiated and validated through empirical studies with clinical domain experts and patients. The literature review focused on papers conceptualizing the domain of VCs and related concepts (esp. CAs, personalization) as well as papers on the application of behavior change techniques in the medical fields relevant to our project (Parkinson’s disease, stroke, heart failure, ischemic heart disease). Several literature reviews were conducted as part of the vCare project (see Tropea et al. (2019), Philipp et al. (2019), and Weimann et al. (2022)), which served to identify essential concepts in the domain of VCs and key publications. The findings of these reviews also served as guidance to unsystematically search for further literature reviews or surveys addressing specific aspects in more detail (e.g., behavior change techniques). Likewise, dedicated papers emerged within the context of the project that discussed some requirements and their implementation aspects in more detail (esp. Benedict et al. 2019; Kropf et al. 2020; Gand et al. 2021a, b). Consequently, the overall literature search process can be regarded as semi-systematic. For deriving the design principles (DPs) based on the elaborated meta-requirements, we followed Möller et al. (2020) and Gregor et al. (2020). Additionally, the set of DFs was collected based on these studies to specify the solution space (see Fig. 5).

Fig. 5

Mapping principle of MRs, DPs and DFs (based on Strohmann et al. (2023))

4.1.1 Meta-Requirements and Design Principles

Several previous research efforts have significantly contributed to understanding the “anatomy” of VCs. For example, Kamphorst (2017) postulated several requirements for VCs: social abilities, credibility, context awareness, tailoring and learning abilities, the ability to interface with different data streams, proactivity, model of behavior change and planning. Likewise, Ding et al. (2010) identified self-monitoring, coaching strategy, context awareness, and interface modality as core aspects for designing VCs. Both conceptualizations are only partially suitable to serve as guidance for subsequent implementation on this level of abstraction. In addition, multiple aspects are closely related and need to be refined and systemized (e.g., context awareness and tailoring). Nonetheless, context awareness of the system can be considered an essential property of the VC for enabling personalized and non-static coaching (e.g., recommending a walk when the observed physical activity is low). Following the literature on self-adaptive systems and autonomous computing (e.g., Huebscher and McCann (2008); Salehie and Tahvildari (2009)), context awareness is a prerequisite for achieving adaptivity via different technical mechanisms (e.g., rule-based and/or machine learning-based adaptation). Consequently, we consider the adaptivity of the system as the first meta-requirement (MR1).

However, to make the system “context-aware” in the first place, data inputs are needed that fill the contextual variables (e.g., physical activity level, personal preference, weather) with values and thus give the system a notion of the surrounding context. The necessary data inputs can be, in general, gathered using passive sensors or by actively asking the patient via questionnaires (Sim 2019). Therefore, we consider the system’s ability to provide active and passive sensing mechanisms and process these different types of data as a dedicated sub-requirement to address adaptivity (MR1.1). Based on the observed data, the system can set up and maintain a continuously evolving model of the user (patient) that serves as a basis for personalizing the coaching conversation to the user/user groups (MR1.2). Kamphorst (2017) included the idea of “self-learning” as an essential property of a VC. However, the requirements analysis with the medical experts during the design phase revealed that solely using machine learning algorithms to personalize experiences and to recommend coaching activities to the patient can compromise clinical safety. Particularly, constraints need to be defined beforehand to ensure that the VC does not recommend activities that go beyond the level of the patient’s capabilities and contradict clinical goals. This increased transparency is also important to promote trust in the VC from both the health professionals as well as the patients (Philipp et al. 2019). Nonetheless, self-learning algorithms are highly useful for addressing the individual needs and preferences of the patient when acting within clearly defined boundaries (Boecking and Philipp 2020). As both paradigms can complement each other, we concluded that there is a need for a combined approach using rule-based and machine learning-based mechanisms for the care pathway (MR1.3; see Fig. 6).
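The combined approach in MR1.3 can be sketched as a two-step selection: clinician-defined rules first remove every activity outside the permitted range, and a learned preference score then ranks what remains. The sketch below is an assumption-laden illustration, not the project's reasoning component; `Activity`, `ClinicalLimits`, and `recommend` are hypothetical names, and the `preference_score` dictionary merely stands in for the output of a machine learning model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Activity:
    name: str
    intensity: int       # hypothetical scale: 1 (lightest) to 5 (hardest)
    duration_min: int


@dataclass(frozen=True)
class ClinicalLimits:
    """Boundaries defined beforehand by the health professional."""
    max_intensity: int
    max_duration_min: int


def recommend(candidates: list, preference_score: dict,
              limits: ClinicalLimits) -> Optional[Activity]:
    # Rule-based step: discard anything beyond the patient's capabilities.
    allowed = [a for a in candidates
               if a.intensity <= limits.max_intensity
               and a.duration_min <= limits.max_duration_min]
    if not allowed:
        return None
    # Learned step: pick the allowed activity the patient is most likely
    # to prefer (scores are placeholders for a model's predictions).
    return max(allowed, key=lambda a: preference_score.get(a.name, 0.0))


candidates = [
    Activity("walk", intensity=2, duration_min=20),
    Activity("cycling", intensity=4, duration_min=30),
    Activity("stretching", intensity=1, duration_min=10),
]
scores = {"cycling": 0.9, "walk": 0.6, "stretching": 0.3}
limits = ClinicalLimits(max_intensity=3, max_duration_min=25)

choice = recommend(candidates, scores, limits)
print(choice.name)  # prints: walk
```

Although cycling has the highest preference score, it is never recommended here because it violates the clinical limits. Applying the rules before the ranking keeps the learned component inside clearly defined boundaries and makes the reason for an exclusion easy to explain to clinicians and patients.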

Fig. 6 MR1.1–1.3, DP1 and DFs related to the intervention adaptivity

To make use of the mentioned adaptation mechanisms in a goal-directed manner, the VC first needs a strategy of health behavior change (coaching strategy) (MR2) (Ding et al. 2010; Beinema et al. 2021). Across different literature reviews summarizing the use of behavior change techniques in digital health interventions for cardiovascular and neurological diseases, the behavior change techniques of self-monitoring, feedback, prompts, and goal-setting were most frequently used and associated with positive effects (Winter et al. 2016; Howes et al. 2017; Felsberg et al. 2019).

Self-monitoring of patient behavior and behavioral outcomes serves two purposes. On the one hand, it serves as a behavior change technique to promote self-regulatory control of internal or external processes and is, in accordance with the control theory, typically coupled with feedback mechanisms (Hennessy et al. 2020; Zhang et al. 2021). On the other hand, the self-monitoring data also enables the system to adjust interventions as outlined above (Nahum-Shani et al. 2018). Therefore, self-monitoring of therapy-related parameters can be regarded as a technique applicable across the entire care pathway.

However, the short-term and long-term goals vary along the pathway with the corresponding activities that need to be accomplished. For example, prompts for certain activities or tasks (e.g., recommending a walk), completing a questionnaire after the activity, or referring to educational material usually have an inherent temporal and logical order. The idea to arrange the coaching actions in temporal order is analogous to coaching strategies, coaching plans, or trajectory models of behavior change as proposed in prior work on VCs (Ding et al. 2010; Kamphorst 2017; Ochoa and Gutierrez 2018). With regard to the behavior change technique taxonomy, this refers to “goals and planning” (Michie et al. 2013). To summarize, we identified the need for the VC to possess (and computationally process) a model of the care pathway (MR 2.1) and to implement self-monitoring abilities (vital signs, activities) (MR 2.2). Furthermore, the VC should provide social and dialogue support for structuring the patient’s (inter-)personal daily activities and promoting their performance using feedback, especially in the form of reminders, suggestions, and praise (see Kropf et al. (2020) for details; MR 2.3). Beyond merely giving advice and providing a daily schedule, the VC should also specifically instruct the patient by referring to educational material (MR 2.4). A common and ongoing challenge in the field of digital behavior change interventions is how to stimulate and sustain user engagement in the long run, especially when behaviors have not yet become habits. Recent literature discusses gamification and serious games (Deterding et al. 2011) as promising approaches for promoting engagement (Krath et al. 2021). Consequently, the VC should implement strategies to promote user engagement and, based on prior literature and our qualitative studies, consider gamification and serious games as appropriate strategies (MR 2.5; see Fig. 7).

Fig. 7 MR2.1–2.5, DP2 and DFs related to the coaching strategy

The user interface is a key success factor when designing systems that deliberately attempt to change the user’s behavior (Oinas-Kukkonen 2013). Based on Weimann et al. (2022), a distinction can be made between the interface of the human coach to integrate the clinical knowledge and to conduct (dependent on the system’s degree of autonomy) adaptations if necessary and the actual interface for the coachee (patient). Given that both parties are users of the system, we consider the need for a “multi-user interface modality” (MR3).

Prior research has found human-like (anthropomorphic) interfaces useful for the patient’s front-end. Considering that coaching and therapeutic interventions traditionally take the form of face-to-face conversations, several authors proposed (E)CAs as essential for computerizing the human coach (Tropea et al. 2019). In accordance with Kamphorst (2017), we also regard the social abilities of the VC and mechanisms to resemble a natural conversation as important (MR 3.1). Additionally, these social abilities should be designed so that the VC is perceived as persuasive (e.g., having expertise; MR 3.2). This implies avoiding common pitfalls such as the “uncanny valley” (see Sect. 2.2). Also, the accessibility of the user interface is crucial for long-term engagement and is particularly important when designing digital health interventions for older adults, people with a low literacy level, and cognitively or physically impaired people (Cheng et al. 2020; MR 3.3). Because the system’s emulation of social abilities through the use of (E)CAs is associated with high accessibility and usability among these target groups, MR 3.1 and 3.3 can be considered interrelated.

The interface for the human coach, that is, the gap between the medical domain experts and the technical system, has received less research attention so far (Weimann et al. 2022). Enabling “lay developers” to (partially) design and adapt software systems on their own is closely related to model-driven engineering (Di Ruscio et al. 2022). A model reflecting the clinical knowledge in terms of the care pathway could then be executed by a dedicated adaptation or workflow engine (see MR1) and also coupled with machine learning for personalization (Philipp et al. 2019; Sahay et al. 2020). Based on the participatory design approach, we identified the need for a domain-specific modeling language to define the intervention workflow (pathway) in an understandable way (accessible for non-technical experts; Gand et al. (2021b)). This modeling approach should be supported by a dedicated software tool and provide a mapping between the conceptual and technology-related levels (MR 3.4). Furthermore, our participatory design approach revealed that the front-end of the human coach should allow the enrolment of patients in an intervention and the prescription of an individual pathway (MR 3.5; see Fig. 8).

Fig. 8 MR3.1–3.5, DP3 and DFs related to the user interface (coachee and coach)

MR4 refers to the implementation of a sustainable technical infrastructure that promotes scalability, extensibility, and reusability of the platform (Benedict et al. 2019). First, the technical architecture should be modularized and interoperable to enable different integration options in terms of a service-oriented architecture (MR 4.1). Offered as a cloud-based platform, such an architecture opens up new business opportunities. It is of paramount importance to rigorously build on established data exchange interoperability standards (MR 4.2). For instance, in recent years, the HL7 FHIR standard (see Footnote 2) has come to be considered key to integrated healthcare processes, as has FIWARE (Cirillo et al. 2019) for the integration of context data. In addition, data security and privacy are particularly important when managing health data. All layers and endpoints should include dedicated security measures (MR 4.3; see Fig. 9).

Fig. 9 MR4.1–4.3 and DP4 related to the technical infrastructure sustainability

4.1.2 Implementation of the Artifact

The artifact design instantiates the above-mentioned meta-requirements and design principles and is thus based on an open, decoupled, multi-layered service-oriented architecture (MR4.1). Various existing solutions have been combined to provide the necessary functions. The architecture, described in detail in Kyriazakos et al. (2020), consists of the following layers:

  • The knowledge layer focuses on ontology-based storage and reasoning, using reinforcement learning (Philipp et al. 2019) for personalizing the user interaction (MR1.3, 2.4) and for executing the patient’s care pathway (MR2.1). During the pilot phase, a simplified reinforcement learning algorithm, specifically a contextual bandit, was used to recommend the number of steps for each day and, when triggered by the patient, e-learning videos. This enabled tailored coaching interactions based on the patient’s profile (MR1.2).

  • The sensor layer offers integration services for sensor data, which provide essential data and context information, including activity patterns (MR2.2, 1.1).

  • The middleware layer provides the basic bus system needed to connect the single layers and allows for context integration. It provides basic security and database services (MR4.2).

  • The pathway layer provides an interface for the human users to specify the clinical pathway, thus providing the clinical procedural knowledge to govern the rehabilitation care through a graphical pathway modeler (MR2.1). These models are translated into machine-readable FHIR resources (MR4.2) to allow for the machine processing and execution/instantiation of the pathways (Gand et al. 2021b).

  • The coaching layer includes specific coaching and support services that interact with and engage patients in their rehabilitation (MR1.1, 1.2, 2.1, 2.3, 2.5).

  • The UI/Exploitation layer enables the interaction with the user and external systems. The first includes the interface for the patients, namely the avatar, the VC itself, and the serious games (MR3.1, 3.2). We created a character for the avatar styled to mirror the role of a physician (e.g., white jacket). Beyond the mere human-like appearance, the avatar is also able to show emotions. The professional portal allows for the entry of patient data and for the assignment of pathways to the patients (MR3.4, 3.5). External systems can be connected following the service-oriented architecture (MR4.1).
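To give an impression of how a contextual bandit, as mentioned for the knowledge layer, might recommend a daily step goal, the following sketch uses a tabular epsilon-greedy strategy. This is an assumption for illustration: the source only states that a contextual bandit was used, not which variant. The class name, the arms (step goals), the context label, and the reward definition are hypothetical.

```python
import random
from collections import defaultdict


class EpsilonGreedyContextualBandit:
    """Tabular contextual bandit: one running mean reward per (context, arm)."""

    def __init__(self, arms, epsilon=0.1, seed=None):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = defaultdict(int)    # (context, arm) -> number of updates
        self.values = defaultdict(float)  # (context, arm) -> mean observed reward

    def select(self, context):
        """Explore with probability epsilon, otherwise pick the best-known arm."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)
        return max(self.arms, key=lambda arm: self.values[(context, arm)])

    def update(self, context, arm, reward):
        """Incrementally update the mean reward of the chosen arm."""
        key = (context, arm)
        self.counts[key] += 1
        self.values[key] += (reward - self.values[key]) / self.counts[key]


# Hypothetical arms (daily step goals) and context (coarse activity level).
# epsilon=0.0 disables exploration so that the example is deterministic.
bandit = EpsilonGreedyContextualBandit(arms=[2000, 3000, 4000], epsilon=0.0, seed=0)
bandit.update("low_activity", 2000, reward=1.0)  # assumed reward: goal reached
bandit.update("low_activity", 4000, reward=0.0)  # assumed reward: goal missed
print(bandit.select("low_activity"))
```

In a clinical setting, the set of arms would itself be restricted to goals that the prescribed pathway permits, so that exploration stays within clinically defined boundaries.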

The artifact concept enables a flexible combination and application of various coaching services that can be re-composed to tailor a personalized pathway. Agents within the knowledge layer address the different medical scenarios, and an ontology technically represents the patients' needs and states, along with the services and clinical pathways. This forms the basis for machine learning algorithms to assess the patient’s adherence and progress along the pathway. Based on the outcomes, suggestions for modifying the pathway can be proposed (Kyriazakos et al. 2020).

Figure 10 illustrates the general form of the VC system in use. Initially, clinical experts have to develop a clinical pathway, which includes the procedural rules as an initial configuration of the system (MR2.1). The coaching procedure then starts with the prescription of clinical pathways (see box “Pathway modeler” in Fig. 10). On that basis, the agenda for each individual patient is derived (MR2.3, 2.5, 3.1, 3.2). The agenda serves as the starting point for coaching and care activities (MR3). Among other things, e-learning content is provided to foster healthy behaviors (MR2.1, 2.4). Also, both motor and cognitive serious games making use of 3D camera recognition and touchscreen are triggered to activate and improve patients’ motor and cognitive skills (MR1.1, 1.2, 2.1).
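The step from a prescribed pathway to a patient's daily agenda can be sketched as follows. The data structure is a deliberately simplified assumption: the actual system represents pathways as FHIR resources (see the pathway layer), whereas `PathwayActivity`, its fields, and the example activities here are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PathwayActivity:
    """Hypothetical pathway entry: what to do, in which period, and how often."""
    name: str
    first_day: int     # first pathway day on which the activity applies
    last_day: int      # last pathway day (inclusive)
    every_n_days: int  # 1 = daily, 2 = every second day, ...


def agenda_for_day(pathway, day):
    """Return the names of all activities scheduled on the given pathway day."""
    return [
        activity.name
        for activity in pathway
        if activity.first_day <= day <= activity.last_day
        and (day - activity.first_day) % activity.every_n_days == 0
    ]


# Illustrative pathway only (not one of the clinical pathway templates).
pathway = [
    PathwayActivity("walk", first_day=1, last_day=90, every_n_days=1),
    PathwayActivity("e-learning video", first_day=1, last_day=14, every_n_days=2),
    PathwayActivity("motor serious game", first_day=8, last_day=90, every_n_days=3),
]
print(agenda_for_day(pathway, 8))  # ['walk', 'motor serious game']
```

Keeping the pathway as declarative data, separate from the scheduling logic, mirrors the idea that clinicians edit the model while the engine derives the agenda.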

Fig. 10 VC system in use, including actual implementation of the meta-requirements (MRs)

4.2 Evaluation

In the following, we present the results from the three lab phases or build-evaluate cycles, focusing primarily on the results in the form of the QoL and usability/acceptance evaluation.

4.2.1 Tech Lab – DSR Cycle 1 – Focus on Technical Feasibility

The Tech Lab outcomes document the technical setup of the solution and give an overview of the progress in integrating the technical components. Semi-integrated tests, in which system interactions were partially simulated, were carried out on these components; they encompassed unit and integration tests. All technical interfaces were completely specified and established, enabling data and information to flow seamlessly across the system. Concurrently, emphasis was placed on processing this information. As evidenced by the test documentation, the defined requirements of the disease-related activities were met. Still, some adaptations of the technical realization were needed when transitioning to the Living Lab. These adjustments fed into the overall knowledge base as lessons learned for the design implementation (see Appendix A, Table A.1).

To involve the end-users in the participatory design activity, a dedicated demo video was produced (see Footnote 3). It aims to clarify how a VC could impact the everyday life of an older person. A group of 10 elderly subjects was enrolled in each clinical center for qualitative usability surveys after they had watched the demo video. In the interviews, the clinical staff administered the User Experience Questionnaire (UEQ) (Laugwitz et al. 2008; see Appendix B for an overview of the scale) and thoroughly explored the causes of the most significant problems in order to feed the next design iteration. The patient surveys yielded mean scores > 1.5 (above the mean value) in all six domains, which is considered a positive result (see Appendix B, Table B.2 for the full data). In particular, the most positive results were observed in the attractiveness, stimulation, and novelty domains. The following input for guiding the further design was given (see full feedback provided in Appendix C): technological interfaces need to be easily accessible and provide immediate and natural interaction with end users. Psychological support should be provided in the form of feedback and motivational reinforcement techniques to improve care adherence at home. In particular, a human-like avatar design, both in appearance and in behavioral empathy, is important to coach the user successfully.

4.2.2 Living Lab – DSR Cycle 2 – Focus on Clinical Feasibility

For the Living Lab, 20 patients from each clinical site were enrolled to interact with the VC in a controlled environment within the clinics’ premises. Over a period of at most two weeks, patients executed the clinical pathway prescribed to them. Before the Living Lab tests with patients began, staff members of the clinical sites tested and confirmed the stability of the VC system functionalities in executing the clinical pathways as part of integration tests (Seregni et al. 2021).

For the quantitative usability survey, the patients self-administered the System Usability Scale (SUS) for each activity (Brooke 1996; Bangor et al. 2009; see Appendix B for an overview of the scale). On a scale from 0 to 100, a SUS score exceeding 68 is deemed to be above average, indicating that system usability is satisfactory. Scores higher than 80 were defined as excellent; below 50 as non-acceptable. Also a qualitative User’s Open Feedback Form in terms of semi-structured interviews was derived on the basis of the UEQ evaluating the same domains (Laugwitz et al. 2008). Interviews followed the UEQ template, extracting key insights related to the VC solution. This method offers a clearer view of user opinions on specific experiences, potentially improving evaluation effectiveness compared to pure quantitative approaches (Seregni et al. 2021). This way, both the professional as well as the patient users’ qualitative feedback could be used to evaluate the user experience (see Appendix C for all the feedback in the course of the participatory design activities).
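For readers unfamiliar with the SUS, the following sketch shows the standard scoring procedure (Brooke 1996) together with the score bands named above. The label "marginal" for scores between 50 and 68 is our own placeholder, since the text does not name that range; the example responses are illustrative and not study data.

```python
def sus_score(responses):
    """Compute the SUS score (0-100) from ten item responses on a 1-5 scale.

    Odd-numbered items are positively worded (contribution: response - 1),
    even-numbered items are negatively worded (contribution: 5 - response).
    """
    if len(responses) != 10 or any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("SUS needs ten responses, each between 1 and 5")
    total = sum(
        (r - 1) if index % 2 == 0 else (5 - r)  # index 0 corresponds to item 1
        for index, r in enumerate(responses)
    )
    return total * 2.5


def sus_category(score):
    """Map a SUS score to the bands used in the text."""
    if score > 80:
        return "excellent"
    if score > 68:
        return "above average"
    if score >= 50:
        return "marginal"  # placeholder label, not defined in the text
    return "non-acceptable"


example = [4, 2, 4, 2, 4, 2, 4, 2, 4, 2]  # illustrative responses only
print(sus_score(example), sus_category(sus_score(example)))  # 75.0 above average
```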

Table 1 provides an overview of the Living Lab participants. The threshold for the system's excellent usability (SUS score > 80) was clearly surpassed during the tests in the Living Lab. The SUS scores of the patients with Parkinson’s Disease and heart failure were greater than 80 (= excellent), despite some problems that surfaced from the semi-structured interview evaluation (esp. issues related to the tablet’s touch sensitivity and accuracy, the need for greater variability, and difficulty of the rehabilitation games). For stroke and ischemic heart disease, the scores still remained above the threshold of 60.

Table 1 Overview of the Living Lab participants

Table 2 provides some exemplary quotations from Living Lab participants to provide a more intuitive impression of the feedback. The strong usability achieved in the Living Lab tests can also be attributed to the system's intuitive interaction (e.g., see quotes (3), (4) in Table 2). Most patients reported gradually increasing confidence with the technology in subsequent sessions, enabling them to use it independently at home (e.g., see quotes (1), (2), (6) in Table 2). In only a few instances, additional instruction and support were deemed necessary; patients would have appreciated user manuals or brief demonstration videos as reminders on how to operate the system (e.g., see quotes (2), (7) in Table 2). In some cases, patients expressed a desire for a more immersive gaming environment and higher complexity levels in the games to encourage more constant activity (e.g., see quotes (10), (12) in Table 2).

Table 2 Exemplary quotations from Living Lab participants

The technical components as such had already been proven in the preceding Tech Lab phase. Only some optimizations in the runtime behavior (iterative improvements of the app) or additions to the scope of functions were necessary. Thus, the focus moved more to a processual or pathway-related level (see Gand et al. (2021a, b) for more details). Also, there were new requirements for the rehabilitation care pathways that resulted from discussions with clinicians during real system operations. This is because the first set of pathway templates was derived primarily from paperwork without having an actual working system. These changes are summarized in Appendix A, Table A.2. As the changes regarding the pathway content and methodology were limited, a well-conducted initial requirements analysis can be assumed.

4.2.3 Pilot Phase – DSR Cycle 3 – Focus on Clinical Outcomes

The Pilot Phase was conducted subsequent to system refinement, incorporating feedback received from users during earlier stages. The primary technical work was about installing the needed devices at the patients’ homes, maintaining the devices, and training the patients on how to use them. Also, within the Professional Portal, overviews of the measured vital parameters and questionnaire results from remote patients were added. Some clinical pathways again had to be refined, and the machine learning algorithms were initialized and fed with first patient data.

The Pilot Phase was planned as a small-scale pilot randomized trial (Cocks and Torgerson 2013), wherein the developed system underwent clinical validation within the user’s home environment to evaluate the VC’s impact. The participants were recruited from the pool of patients who had been treated at the clinical sites and who met the clinical inclusion criteria. Ten participants per pathology followed a personalized rehabilitation program at home for a period of up to three months. During the study, the participants used the VC app on a tablet that sent reminders for prescribed exercises (for physical and cognitive activation and rehabilitation), allowed monitoring of vital and movement parameters, and provided hints and e-learning material via the VC’s avatar interface (following the precepts as set by the pathway templates). Also, a set for conducting physical and cognitive serious games at home (in front of a screen, with a 3D depth camera for motion sensing) was installed for them. For each clinical site, a further ten patients were enrolled as a control group (receiving conventional rehabilitation, i.e., clinical recommendations at discharge; see Table 3 for an overview of the participants).

Table 3 Overview of the Pilot Phase participants

It is acknowledged that the final outcomes slightly differ based on the pathology and specific indicators assessed. Nonetheless, overall, the goal of restoring active and independent living at home, in terms of QoL, has been achieved through improvements in continuous care and access to personalized cognitive (for neurological cases), motor exercises (for all patient groups), and comprehensive coaching advice provided by the VC. The EQ5D scale (see Footnote 4) has been used as a standard tool to compare QoL before and after singular/shorter interventions (see Table 4), given the relatively small sample size of the Pilot study (Gandhi et al. 2017). The target improvement of 10% for the QoL values stems from the clinical experts as an empirical value to be considered very good and competitive compared to other interventions. It is, therefore, a consolidated expert assessment.

Table 4 Improvements in QoL pre-post intervention, sub-domains of the EQ5D QoL scale, project ambition was an improvement by at least 10%

Figure 11 provides a comparison of the summative QoL scales pre- and post-intervention. For example, the mobility score for the stroke case decreased from an average of 2.0 to 1.7 (pre- to post-intervention), corresponding to an improvement of about 17%. Overall, there is a clear tendency toward improvement, given that the scale is seen as a first indicator of the advantageousness and usability of the VC solution.

Fig. 11 Comparison of the EQ5D QoL scales pre (T0) and post (T1) intervention as a basic indicator for the advantageousness of the VC solution. The scores range from 0 (no problems) to 4 (unable to walk, not being able to perform daily activities, etc.)

Figure 12 depicts the EQ5D visual analog scale (VAS) ratings assessing the perceived health status of patients in the VC-guided intervention (experimental) and the traditional intervention (control). Given the small sample size but similar treatment needs and rehabilitation programs, we decided to pool the data (neurological and cardiological patients) for analyzing the overall changes. The results suggest that the VC-guided intervention was associated with at least comparable, and in part more positive, effects on the perceived health status, particularly for VC-guided cardiological programs (see percentage improvement compared to T0).

Fig. 12 Changes in the EQ5D-VAS Scale (Perceived Health Status) of the experimental group and the control group for the cardiological and neurological rehabilitation program

Positive outcomes were also observed in reducing risk factors, including a significant increase in daily steps and time devoted to exercises or e-learning sessions. The adherence to the rehabilitation plan regarding access to the platform, system interactions, or the total number of times patients performed a suggested activity was in line with the expectations. Positive indications were also noted concerning the viability of the anticipated personalization and health promotion efforts. However, the volume of data collected was somewhat less than expected which in turn limited the (automated) refinements to the rehabilitation therapy. Despite this, the VC managed to adjust the pathways in certain instances.

The user experience ratings were mainly positive for the hedonic dimensions (stimulation and novelty) across the four pathologies (see Fig. 13 for an overview). In contrast, ratings of the pragmatic dimensions (perspicuity, efficiency, and dependability) were rather neutral to even negative, particularly for the cardiological programs. One explanation for this result may be the more complex setting in the cardiological programs (additional blood pressure and weight scale devices; more delays due to the pandemic). Patients considered that the technical aspects in terms of monitoring devices could be improved by refining the system and eliminating difficulties they encountered. The effort required for technical adjustments and a reduced affinity for the technology of older patients may, therefore, have been underestimated to some extent. However, the results suggest that, particularly for neurological patients, the user experience was satisfactory, and no such adverse events occurred.

Fig. 13 User Experience Questionnaire (UEQ) ratings for all patient groups after the pilot phase; UEQ ratings above 0.8 are considered a positive evaluation
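The interpretation rule for UEQ ratings can be made explicit with a short sketch. It assumes the usual UEQ convention that scale means lie between -3 and +3, that values above 0.8 count as positive, values below -0.8 as negative, and values in between as neutral; only the positive threshold is stated in the source. The function name and the example means are hypothetical and do not reproduce the study data.

```python
def classify_ueq(scale_means, threshold=0.8):
    """Classify UEQ scale means as positive, neutral, or negative."""
    result = {}
    for scale, mean in scale_means.items():
        if not -3.0 <= mean <= 3.0:
            raise ValueError(f"UEQ mean for {scale!r} must lie in [-3, 3]")
        if mean > threshold:
            result[scale] = "positive"
        elif mean < -threshold:
            result[scale] = "negative"
        else:
            result[scale] = "neutral"
    return result


# Illustrative values only (not the ratings shown in Fig. 13).
example_means = {
    "stimulation": 1.4,
    "novelty": 1.1,
    "perspicuity": 0.3,
    "efficiency": -0.9,
}
print(classify_ueq(example_means))
```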

When taking the results of the SUS evaluation into account, the acceptability threshold was met for all pathologies with the exception of patients with ischemic heart disease (see Fig. 14). The evaluation of the perceived usefulness and ease of use, proposed as key factors of technology acceptance (see Davis (1989)), further underscores this finding (see Fig. B.2 in Appendix B). It can, therefore, be concluded that the developed VC represents a safe, engaging, and aesthetically pleasing system for ensuring the continuity of medical rehabilitation care in the patient’s home environment. Thus, the design cycle can be regarded as successful, supporting the broader applicability of the elaborated DPs.

Fig. 14 SUS scores for Living Lab and Pilot Phase

Detailed insights into the medical efficacy of the VC with a focus on the separate medical use cases can be found in Del Pino et al. (2023), Seregni et al. (2022), Busnatu et al. (2022) and Lăcraru et al. (2023).

5 Discussion

In contrast to prior studies investigating single DPs in an artificial experimental setting, we have focused on their integration into a full-featured VC system in a natural environment. We were able to investigate the impact of a VC from both the clinical and human–computer interaction perspectives (see Fig. 15 for an overview of the main findings). For example, the mobility score for the stroke case decreased from an average of 2.0 to 1.7 (pre- to post-intervention), corresponding to an improvement of about 17%; patients generally had positive user experiences with the VC system, particularly as regards stimulation and novelty (hedonic UEQ dimensions). See also the positive user feedback shown by way of example in Table 2. The study was able to show how the principles of DSR were used in a real-life context to intertwine MRs and DPs. The VC solution has proven itself in a practically relevant context and is beneficial in terms of user experience and clinical aspects. Moreover, the findings provide real-world evidence for the usefulness of DPs in the field of VC solutions. By evaluating clinical outcomes and user experience-related aspects together, the study narrows the current evaluation gap.

Fig. 15 Main findings and primary results of the build-evaluate cycles (double tick = fully fulfilled; single tick = minor weakness, but overall fulfilled)

From a clinical viewpoint, the study lays the foundations for large-scale clinical trials (RCTs) to validate the results and rule out further confounders. From an ISR viewpoint, the study delineates first the complexity of the design and implementation of digital interventions in healthcare. Second, the findings broaden the design knowledge regarding VCs for home rehabilitation and clinical research in general. With this study, we provide four DPs as guidance and a detailed set of DFs which help to implement those principles. The study was conducted within a diverse naturalistic setting, where the dimensions of relevance and rigor encounter a substantially broader range of influencing factors for evaluation than in a controlled, laboratory-based experimental context. The study elucidates the manner in which comprehensive evaluation and documentation efforts can facilitate the incorporation of complex interventions into the ISR knowledge corpus. The specific implications for theory and practice and limitations are delineated in the subsequent sections.

Regarding RQ1 (What are the design principles for implementing VC-based home rehabilitation systems to support the continuity of care?), our study introduces a comprehensive framework consisting of rigorously developed meta-requirements and detailed sub-requirements. This is achieved by employing iterative build-evaluate cycles for VCs for rehabilitation purposes, which is substantiated by positive outcomes during their evaluation. The contribution of our research is guided by four pivotal DPs: Adaptivity, Coaching Strategy, Multi-user Interface, and Sustainable Infrastructure. These principles guide the creation of VCs, ensuring they are not only effective in rehabilitation settings but also grounded in robust theoretical foundations. Further, we elaborate on a detailed list of design features, illustrating how each contributes to the realization of the DPs.

In addressing RQ2 (Which design principles could be adopted for designing comparable systems?), our investigations provide prescriptive knowledge for the development of distinct VC solutions, thereby illustrating the usability and transferability of design knowledge across various contexts. Showing the instantiation of the prescriptive knowledge in the solution of the vCare project not only facilitates the creation of tailored VC interventions but also contributes to the overarching discourse on the utilization of VCs in the realm of digital healthcare.

Finally, the study provides insights into RQ3 (To what extent do VC-based rehabilitation systems influence rehabilitation positively concerning user experience and QoL?): Our analysis unveils that the deployment of our VC solution significantly enhances clinical outcomes and exhibits commendable results in terms of user experience (including system acceptance and usability). The evaluation highlights that participants experienced noteworthy improvements in their QoL, substantiating the potential clinical efficacy and effectiveness of our VC solution. Following an integrated evaluation design based on the framework of Venable et al. (2012), as explicated in Sect. 3.2, our study reinforces the clinical and practical value of VCs in rehabilitation. Due to its costly design and interdisciplinary character, this kind of study has been rarely conducted so far.

5.1 Implications for Theory

This study's contributions embody the essential criteria of a nascent design theory (Gregor and Jones (2007); Gregor and Hevner (2013); see Table 5). Grounded in a robust theoretical foundation, it extends existing knowledge with rigorous theoretical development by conducting an iterative process to design and develop VCs for rehabilitation backed by a positive clinical trial (Purpose and Scope). We described the VC system's key technical parts and the varied stakeholders involved, from caregivers to solution designers (Constructs). The VC's DPs are derived from existing theories and experts’ knowledge, to guide medical professionals in VC design and use (Principle of Form and Function). The VC design is flexible and adaptable, thus fitting various clinical scenarios and offering customization and personalization (Artifact Mutability). The VC's effectiveness can be measured through clinical trials, where study results have shown promise (Testable Propositions). The design built is rooted in core VC design theories, and integrates past research and practical approaches (Justificatory Knowledge). The research provides a tangible VC design, which has been operationalized and set into motion (Principles of Implementation). A VC for rehabilitation has been designed, while maintaining design consistency (Expository Instantiation). Table 5 provides more details of the components.

Table 5 Design theory components (Gregor and Jones 2007; Gregor and Hevner 2013)

5.2 Implications for Practice

As mentioned, there is a lack of large-scale VC solutions that consider the clinical routines and patient needs while complementing the existing face-to-face care. Therefore, the research results provide a multitude of contributions to practice. First, they provide evidence that the employed DPs can facilitate product development. We demonstrated the generic applicability of the DPs across diverse disease contexts, as evidenced by their instantiation in a scalable platform tailored to the requirements of both cardiological and neurological patients. The distinct coaching system of the project could function as a baseline for other VC systems since it is built as a modular and distributed system. It also lays the conceptual foundation for a VC ecosystem.

Second, the DPs with the detailed set of DFs allow for the development of sub-products. Specifically, a first step has been taken towards delivering VC solutions that target multiple morbidities, a rising issue in healthcare (Skou et al. 2022). For example, cardiovascular diseases can be associated with obesity, and both conditions require a specialized coaching program. It is crucial to spare patients from interacting with separate software applications, both to enhance accessibility, user experience, and long-term technology adoption and to avoid contradictions between incompatible coaching programs.

Third, new business models in the field of VCs could emerge. Standardization and interoperability are the main drivers of a VC ecosystem in which different vendors engage in “co-opetition” for the best solutions, services, and business models (Eisenmann et al. 2008).

Fourth and finally, a VC solution may change the roles and modalities of rehabilitation and related information systems. Pressure could be taken off caregivers without replacing them. Rather, the VC supplements the remote physician, for example, by giving therapy recommendations and constantly collecting vital parameters. With AI, more advanced personalization and adaptation mechanisms are likely to become feasible soon (Weimann and Gißke 2024). Consequently, our study paves the way for additional investigations into the use of AI to customize health-related cues and training, thereby enhancing the overall effect of the intervention. Nevertheless, while the DPs underscore the importance of integrating personalization (via AI) within the comprehensive solution, they do not furnish detailed specifications for the distinct AI components.

5.3 Limitations and Opportunities for Future Research

The selection of pathologies is one of the limitations of the current study. The design decisions and adaptations (see Appendix A) might have been different if applied to, e.g., a younger population or people with stronger cognitive impairments. We tried to mitigate this effect by including different pathologies with high social impact in different countries.

A further design-related limitation is the choice of technology. Since the study lasted over a period of five years, we cannot rule out that alternative services could have shown a better performance. However, to increase the stability and comparability of the evaluation, the main components had to persist over the project phases after final approval.

Limitations in the evaluation phase also result from choosing specific scales for the clinical acceptability of the artifact. This does not imply that alternative scales are inherently more meaningful or valid. Rather, the selection of scales was influenced by their availability in the required languages (essential for direct use with older patients and for an international perspective) and by the existing knowledge of the clinical researchers.

Furthermore, the evaluation was partly designed like a classical RCT (with intervention and control group), but with an exploratory rather than a confirmatory intention, which resulted in a limited number of participants. As part of this exploration, we were able to conduct a feasibility study in a clinical-scientific setting with real patients, providing results on acceptance and efficacy. This small-scale study in a naturalistic setting lays the foundation for a large-scale RCT with a more medical focus that aims to confirm the results already obtained.

Future research should also address current trends, such as digital ubiquity or generative AI (esp. large language models) (Feuerriegel et al. 2024). This would potentially result in better and non-invasive context awareness and, overall, in a better personalization of coaching interventions.

Finally, research should also address the question of a multi-VC approach rather than a single-VC approach (i.e., multiple avatars; Beinema et al. 2021). Depending on the application scenario, a specialized coach (different databases, different technical equipment) could then appear and interact with the patient (e.g., for physical activity, nutrition, social health, etc.).

6 Summary & Outlook

Our investigation delves into the utilization of VCs as a specialized form of ECAs within a clinical setting to enhance the continuity of care, esp. in home rehabilitation. The primary aim of this study was to broaden our understanding regarding the deployment of clinical VCs, particularly in addressing the evident scarcity of comprehensive interventions, evaluating the overarching influence of VCs, and rectifying the shortfall of evidence that is deeply rooted in practical application.

Further research should contribute to the development of home-based rehabilitation through VC solutions, with a particular focus on personalization and personalization algorithms aiming to improve the adaptation quality of rehabilitation programs to the individual needs and conditions of patients. Ensuring patient safety within these personalized interventions emerges as a critical research directive, necessitating the development of robust frameworks that monitor and mitigate potential risks associated with automatic adaptations.

The design and implementation of a VC ecosystem also raises a number of research questions. This includes the question of how to integrate low-/no-code capabilities to facilitate a more accessible and adaptable environment for VC-based treatment programs, as well as the question regarding suitable business models for the different stakeholders in the ecosystem.

Furthermore, larger clinical trials (RCTs) should be included in further research to extend the statistical evidence. In addition to traditional experimental and control groups for a statistically valid proof of effect, it would be desirable to form further subgroups. This could provide insights into the benefits of specific DPs and DFs.

All in all, our work contributes to the evidence for VC solutions in the clinical domain and to the research on persuasive systems (Chatterjee and Price 2009). We aspire to advance toward a new era characterized by the seamless integration of digital health solutions.