Introduction

The world’s changing demographics are a double-edged sword. On the one hand, life expectancy is increasing through improvements in medicine, nutrition, and hygiene. On the other hand, the viability of health and social systems is threatened by a declining workforce, obesity, and sedentary behavior, as well as an increasing number of beneficiaries who are older and thus more dependent on care (Ho et al. 1993; Oeppen and Vaupel 2002; Giannakouris 2008; López-Otín et al. 2013).

New consumer health informatics (CHI) technologies may help mitigate the negative consequences of these demographic changes. Ambient assisted living (AAL) and serious games for healthcare are two such CHI approaches. The former aims to increase the safety, comfort, and quality of life of residents by integrating technology into their living environment and to enable them to live autonomous and self-determined lives despite chronic illnesses (Kleinberger et al. 2007). The latter combines the motivational character of games with serious goals to increase health awareness, stamina, body control, and memory or to support rehabilitation measures.

In this article, we combine these two rarely connected branches and present the development and empirical evaluation of two serious games for healthcare in a prototypical AAL residential environment: one game promotes physical activity in a garden-themed exercise game, and the other addresses the improvement of executive functioning through a cooking game. Furthermore, as social acceptance of medical technology by users (especially in the case of technology for end users) is a major challenge (Czaja et al. 2006; Flandorfer 2012), we apply methods from technology acceptance research to evaluate our game environments and to identify individual factors that promote or limit the adoption of the game-based healthcare interventions in AAL. Key questions addressed in this article are the relationship between user factors and performance within both serious games, the influence of user factors and the attained performance on the evaluation of the games, how the game evaluation relates to the overall projected use, and the similarities and differences between the two games under investigation.

The structure of this article is as follows: After this introduction, we present an overview of the benefits of exercise, serious games for healthcare as a method to stimulate exercise, ambient assisted living, and technology acceptance research. Next, we present the two game prototypes developed for this work. After that we illustrate our experimental framework for the evaluation, the samples, and the key findings. Finally, we discuss our work, its implications, and its limitations and ideas for future work.

Related work

As this article combines concepts and methods from multiple distinct domains, this section gives a broad overview: it argues the benefits of physical and cognitive exercise and presents related work on serious games for healthcare, AAL, and technology acceptance research.

Benefits of physical and cognitive exercise

Aging is “characterized by a progressive loss of physiological integrity, leading to impaired function and increased vulnerability to death” (López-Otín et al. 2013, p. 1194), and affects all aspects of human life. It is associated with an increased likelihood of chronic illnesses or disabilities such as congestive heart failures (Ho et al. 1993), stroke, diabetes mellitus (Hader et al. 2004), mild cognitive impairment (MCI), dementia, or Alzheimer’s disease (AD) (Gao et al. 1998). Besides age-related diseases, overweight and obesity are increasing risk factors and are threatening life expectancy (Mann 2005).

Physical inactivity is associated with several diseases including cancer, diabetes, hypertension, and coronary or cerebrovascular diseases, and reduces life expectancy (Knight 2012). However, health and life expectancy can be increased by maintaining a healthy body weight, a healthy diet, not smoking, low alcohol intake, and regular exercise, as shown by a 35-year long-term study (Elwood et al. 2013). People who showed all or almost all habits had a 73% lower likelihood of diabetes, 60% lower likelihood of dementia, 60% lower risk for stroke, and an increase in life expectancy by 6 years. Physical exercise increases pulmonary function, oxygen consumption, cardiovascular function, standing and walking balance, bone stability, and heart and other muscle strength, and improves the immune system (Aldwin et al. 2006). In combination with a healthy diet, it reduces overweight and obesity and thus associated diseases such as diabetes. Regular physical exercise reduces the likelihood or intensity of cardiovascular diseases, and patients suffering from chronic syndromes benefit from regular and lifelong exercise at home, such as a combination of aerobic activities and resistance, flexibility, and balance exercises (Perez‐Terzic 2012). One hundred and fifty minutes of medium-intensity aerobic exercise per week is sufficient for a significant positive impact on health (Mendes et al. 2011), especially for children, older adults, and people with overweight or obesity. Although the frequency, intensity, and duration of exercise can be optimized, the key is that some physical activity is always better than none (Saint-Maurice et al. 2018).

Physical exercise also has benefits that are less apparent. It reduces the probability of silent (unnoticed) brain infarcts in the elderly by 40% (Willey et al. 2011) and can be as effective against the occurrence of migraines as an anti-migraine drug (Varkey et al. 2011). A more active lifestyle with social interaction, listening to music, and physical activity lowers tension, improves mood, and increases perceived energy levels (Thayer et al. 1994). Exercise can also reduce the symptoms of depression, although antidepressants and psychotherapy are found to be most effective (Cooney et al. 2013).

Physical exercise also has short-term effects on cognition. Working memory performance is directly increased after 30 minutes of exercise (McMorris et al. 2011). Regular aerobic exercise improves executive functioning for healthy people, and older adults show improved task switching ability, higher selective attention, higher working memory capacity, and better overall performance (Guiney and Machado 2013). Remarkably, physical exercise in midlife reduces the risk of cognitive decline, mild cognitive impairment, and dementia later in life (Ahlskog et al. 2011).

For the domain of cognitive training, research on the effectiveness and medical implications is less conclusive. On the one hand, computerized cognitive training (CCT) programs are postulated to be suitable, effective, and necessary for healthy brain aging by stimulating neuroplasticity (Ball et al. 2004; Heyn et al. 2004), and more recent work suggests that they may indeed contribute to it (Shah et al. 2017). On the other hand, the effectiveness and transferability of cognitive training games are not well established, as discussed in the next section.

Serious games and serious games for healthcare

The main idea behind serious games (for healthcare and for other domains) builds on the Premack principle (Premack 1959): the likelihood of performing an unpleasant and thus infrequent activity, such as training and exercise, can be increased by linking it with a more pleasant and thus more frequently performed activity, such as playing games. Less liked and therefore less practiced activities are thus packaged in games to increase motivation. Serious games can be defined as “a game in which education (in its various forms) is the primary goal, rather than entertainment” (Michael and Chen 2006). In contrast, the related concept of gamification (Deterding et al. 2011) does not use games as packaging around activities; instead, certain real-world activities are incentivized by game-like elements (e.g., points, badges).

Examples of serious games for healthcare are numerous and address various medical domains such as mental health (Fleming et al. 2017), rehabilitation (Nap and Diaz-Orueta 2015; Bonnechère et al. 2016), physical exercise for the elderly (Konstantinidis et al. 2016), and even training of medical professionals (Gentry et al. 2019).

Also, there are already many available commercial game titles for physical exercise or cognitive training (e.g., a variety of mobile games and console titles such as Wii Fit and Brain Boost for Nintendo, Body and Brain Connection for Microsoft Xbox, or Smart As… for Sony PlayStation). There is evidence that exercise gaming at home has positive effects on health, pain perception, and overall well-being (Warburton et al. 2007; Whitehead et al. 2010; Franco et al. 2012; Brauner and Ziefle 2020), but the effectiveness and transferability of cognitive training games is not as well established. A meta-analysis found small but significant positive effects on processing speed, working memory, and verbal and nonverbal memory, but no effect on attention or executive function (Lampit et al. 2014). Also, home-based and too frequent training was found to be ineffective. A subsequent meta-analysis found that commercial video games increased cognitive performance, such as attention and problem-solving skills. However, the performance gains were limited to the specific task in the games and had no further positive effects (Choi et al. 2020). Consequently, more research on the effectiveness of CCT is needed.

Ambient assisted living

Apart from serious games for healthcare, AAL environments are another active research area aimed at mitigating the negative effects of the changing demographics: by augmenting the living environments with smart sensors and actuators, elderly (or chronically ill) people can stay independent in their own living environments for longer (Raisinghani et al. 2006; Kleinberger et al. 2007; Rashidi and Mihailidis 2013). These environments reduce the costs of healthcare systems while increasing individuals’ independent living and self-determination. However, despite its immense potential and the increasing use of smart home components in the homes of a few technology enthusiasts, the widespread penetration of AAL is still a bold vision of the future, and many fundamental questions regarding acceptance, perception, use, and interaction styles are still open (Cook and Das 2007; Lindley et al. 2008; Cook et al. 2009).

To study how AAL environments are perceived by future inhabitants and how these environments need to be designed, and to evaluate how these will shape future living, researchers construct prototypical living labs to simulate the envisioned models. These labs are built to test functional and nonfunctional properties of innovative technologies and to have them evaluated by potential users. Potential residents thus not only have the opportunity to take part in the evaluation of the technology, but can also play an active role by contributing feedback, fears, and ideas regarding the modern technology, thereby shaping their future technology-augmented living environments. We will present the test environment used for this work in more detail in the Method section.

Which factors influence the use of technology?

The main goal of technology acceptance research is predicting whether and how a given technology will be adopted by users and which technological (capabilities, design, interfaces, etc.) and user factors (age, gender, attitude towards technology) are relevant to the actual use (Davis 1993; Rogers 2003).

A cornerstone of this research domain is Davis’ technology acceptance model (TAM) aimed at predicting the actual use of a technology through the intention to use that can be measured in advance (Davis 1993). TAM has revealed a strong relationship between the perceived usefulness and ease of use of the technology, and the projected and the later actual use.

Over time, numerous other models have emerged and have integrated additional explanatory factors, increased the predictive power, and opened new areas of application. However, this has led to a variety of highly specific models that are rarely transferable to other areas (such as TAM2, TAM3, and unified theory of acceptance and use of technology [UTAUT]) (Venkatesh et al. 2003). Regarding our work, there is no established model for studying serious games for healthcare or for evaluating AAL environments. Nevertheless, the existing and established models provide at least some guidance as to which factors are suitable for studying the social acceptance of serious games for healthcare in AAL environments. In this work, we use the UTAUT2 model (Venkatesh et al. 2012), as it is the first model that integrates the hedonic evaluation of a technology. In this model, later use is predicted through the intention to use, which is explained by the seven factors of performance expectancy (perceived benefits of the technology), effort expectancy (how complicated it is to use), facilitating conditions (is the technology compatible with my environment, can others help me in case of trouble), social influence (how do my peers rate my use of the technology), price–value trade-off (is the technology worth the money), hedonic motivation (how much fun is the technology to use), and habit (can I integrate the use into my daily routine).

Social acceptance is both shaped by the evaluation of the system and mediated by individual user factors. Numerous studies have shown that age is associated not only with less ease of use in interacting with novel technologies, but also with lower perceived usefulness and overall acceptance (Selwyn et al. 2003; Arning and Ziefle 2007; Schreder et al. 2013). Furthermore, women, on average, report lower interest and skill in dealing with technical systems. This is especially relevant for the design of technology-based interventions in the healthcare sector, as these interventions should be usable and useful for all potential users, regardless of age, gender, or (perceived) technical self-efficacy (Wilkowska et al. 2018).

Key challenges

Even though the two areas of AAL and serious games for healthcare are established concepts in research that are increasingly flowing into the market, the intersection has received little attention. Consequently, our research objective is to develop healthcare games in technology-augmented environments, to evaluate how elderly people—who are often reluctant to deal with innovative technology—interact with these games, and to identify the predictors for social acceptance and adoption by means of technology acceptance research. Thus, this work is at the intersection of technology development for AAL, healthcare, and user research. Here, we combine the knowledge about the benefits of exercise, serious games as a method to promote activities usually perceived as unpleasant, the invisible integration of interfaces in AAL environments, and user and acceptance modeling to understand whether this approach increases the likelihood of exercising and which factors are the largest contributors to projected acceptance and thus later use.

Designing healthcare games for the elderly is difficult, as many distinct aspects must be considered. Besides targeting suitable training domains, the games must be fun and entertaining and must adapt to various age-related changes such as lower information processing speed and longer reaction times, altered vision (Fisk et al. 2009), different interests, and often less ability to interact with computers compared to younger adults.

To help reconcile these numerous, often conflicting, requirements, guidelines and recommendations have evolved over the years. One of the earliest guidelines for serious games for healthcare emerged in the early 1980s when Weisman (1983) modified and evaluated games on the Apple II for institutionalized elderly.

Games should be slowed down and be focused on single tasks; on-screen objects should be well-defined and larger, and additional hints should be provided to support interaction. Playing with others has also been found to be fun, entertaining, and rewarding. Personalized messages may increase engagement with the game as well (“Linus, your score is not so good; keep on trying.”).

More recently, Ijsselsteijn et al. (2007) identified requirements, benefits, and opportunities for game design for elderly users. For example, they suggest that the boundaries of the visual game environment should adjust to the individual visual decrement. Also, information and feedback should use multiple modalities (e.g., visual and auditory feedback) to account for preferences and individual limitations. Memory load and cognitive requirements should be kept to a minimum, for example, by avoiding the need to remember information from various parts of the game. Lastly, players should be given time to get used to the game and learn the basic interactions. During this time in particular, encouraging feedback should be provided to support insecure players.

The first guidelines for exercise games that use body sensors were presented by Whitehead et al. (2010). Games should detect whether body movements were correctly executed, and shortened movements should be rejected. Activities and gestures should be designed to use larger muscles of the body, such as the legs. The games should support players gaining experience and playing the games for longer periods, for example by providing incentives, as experience and frequent use yield higher physical benefits. These incentives should have a lasting effect, not just for hours or days, but for months and years. Ideally, these incentives should not be related to the health benefits but should not be an abstraction. Furthermore, there must be regular recuperation periods within the games to allow players to rest between active phases.

Gerling et al. (2012) presented guidelines specifically for full-body motion exercise games for the elderly. For example, interactions within the game should take account of potential individual physical or cognitive impairments, games should adapt to limitations in players’ range of motion, and overexertion should be managed by alternating between physically intense and less challenging parts. Further, the difficulty and challenge of the game should not be fixed but should adapt to players’ abilities. Movement gestures should be easy to learn and easy to remember, and ideally there should be a natural mapping between gestures and real-world activities. Players should be guided by initial tutorials and later hints at required actions. Finally, games should support independent play, meaning that routines to start the game should be easy and intuitive, and no assistance should be needed.

Method

Our work focuses on the concept of prevention before treatment: the games presented here do not address a specific disease, but rather overall physical or cognitive fitness. We developed two different game prototypes to study the commonalities and differences between distinct kinds of serious games for healthcare in AAL environments. The first game addresses general physical fitness, agility, endurance, and body control (“Fitness Farm”), whereas the second game is designed to promote executive functioning (“Cook it Right”). Both games incorporate the sensors and display surfaces embedded in the AAL lab presented in the following, and thus require no further interaction devices and no additional hardware setup.

Ambient assisted living lab

We developed and evaluated the serious games in RWTH Aachen University’s AAL lab. It replicates a typical 25 m² living room with windows, a parquet floor, two couches and a table, ceiling lamps, and pictures and shelves. In addition, the lab integrates various novel input and output technologies to make the room more supportive. It assists future residents in monitoring their diseases, by providing novel communication channels to family, friends, and caregivers, or simply by increasing comfort. Sensors invisibly embedded in the parquet floor detect potential falls and may alert family members or caregivers (Klack et al. 2011a; Leusmann et al. 2011), a hidden infrared camera measures a person’s body temperature for detecting illnesses (Klack et al. 2011b), and a large multi-touch wall can visually extend the room (Kasugai et al. 2010), support novel interaction techniques, and provide telemedicine consultation services (Beul et al. 2012).

What is essential about this environment is that the monitoring technology is integrated invisibly, and the devices do not have to be set up by the users.

Game concept and implementation

In the first game, the player is in a garden environment and the task is to collect several types of fruits and vegetables through full-body motion gestures. Microsoft Kinect sensors hidden in the AAL lab capture the player’s body postures. These gestures were developed in cooperation with a physiotherapist and address harmless and often insufficiently trained movements such as stretching or bending. Throughout the game’s levels, different and increasingly difficult body gestures are introduced and practiced.

In the second game, players take on the role of a cook. They must first memorize recipes and then recook them in a graphical, touch-based interface. The difficulty of the recipes increases with the levels of the game, for example, by requiring more ingredients or more branched work steps. The game uses the multi-touch display surfaces of the AAL lab, namely a conventional tablet device, a sofa table with an integrated display, or the large wall-sized display.

For both game prototypes we applied a strictly iterative, user-centered, and participatory design approach. Starting with low-fidelity paper prototypes (Snyder 2003), we evaluated the game concept and the basic interactions with potential future users of the games. Using interviews, we captured feedback and ideas for improving the game design and the interactions. In subsequent iterations, we increased the maturity of the game prototypes and eventually arrived at fully functional game prototypes that were then evaluated in the final evaluations presented below. Key improvements were, for example, implicit feedback on the remaining playing time through a changing daylight in the case of the motion game, and a focus on turn-based interactions compared to a timed user interface in the cognitive training game. An in-depth presentation of the design process and related findings for each game can be found in Wittland et al. (2015) and Brauner and Ziefle (2020). Figure 1 shows a picture of an early (left) and the final prototype (right) for the motion (top) and cognitive (bottom) training game.

Fig. 1 Illustration of the iterative development process of both games

Experimental procedure

We applied a congruent research design to evaluate both game prototypes from the perspective of potential future users. Sixty-four participants evaluated each prototype in a between-subject design. They were recruited through personal social networks and public notices in the city. To determine whether age or gender has a particular influence on performance or social acceptance, we recruited both younger and older adults, as well as male and female subjects. For the experiment, we first introduced the participants to the AAL lab, its core features, and how they could contribute to the design of serious games in this environment. Second, using a questionnaire, we surveyed the participants’ characteristics including age, gender, chronic illnesses, technical self-efficacy, and inclination to play. Third, we introduced the games, and the participants played three game rounds while performance metrics were captured using log files. Fourth, we measured the perception of the game in a second and final questionnaire. We concluded the experimental session with a brief and informal chat about the game and the living lab. Participation was voluntary and, aside from cookies, not rewarded. Figure 2 illustrates the experimental procedure and the captured variables.

Fig. 2 Experimental procedure for evaluating the two serious games for healthcare in AAL environments

User factors

On a questionnaire before the game intervention, we captured the participants’ age and their gender, as both factors are often linked with different usage, perceived ease of use, self-efficacy, and performance when interacting with information and communication technology (Arning and Ziefle 2007; Schreder et al. 2013). We constructed a sample with both younger and older, and male and female participants (see description of the sample below).

We also measured additional explanatory variables (not used for constructing the sample) that prior work had identified as relevant predictors for performance or acceptance.

Chronic illnesses

As people with chronic illness may have lower performance levels (e.g., due to physical limitations) or higher evaluation (e.g., higher perceived benefits), we asked whether the participants suffered from chronic diseases, and if so, which ones.

Self-efficacy in interacting with technology (SET)

Self-efficacy is the domain-specific belief, based on the perception of one’s own competence, that one can succeed in a certain activity. Women and older adults often report lower scores in SET, which results in less motivation to deal with technical devices or lower performance in interacting with them. We used the four items from Beier’s SET scale with the highest item-total correlation, each measured on a 6-point Likert scale. The internal reliability was very good (α ≥ .830).

Need for achievement (NfA)

This construct describes the general and stable tendency to work on relevant tasks with energy and perseverance until successful completion, and relates to the choice of tasks and the performance attained in the tasks (Schuler and Prochaska 2001; Elliot and Dweck 2005). We suspected that a lower NfA could lead to lower performance in the games or lower acceptance of games because of the competitive components they contain. We used a short form with four items of Schuler’s scale, for example, “I am attracted to difficult problems.” The internal reliability of the scale was very good (α ≥ .829).

Gaming frequency (GF)

The general tendency to play is also likely to affect the performance in and evaluation of serious games. To capture this, we asked the participants for their current gaming frequency across multiple game domains and media, such as games of skill and board and card games, but also games mediated through game consoles and cell phones, as well as ball and outdoor games. The scale consists of nine items such as “I frequently play board games.” It had very good internal reliability (α ≥ .730) and was further related to items such as “I enjoy playing games” in prior work (Brauner et al. 2015).

All scales were measured on 6-point Likert items ranging from 0 to 5 (max).
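The internal reliability values reported for the scales above are Cronbach's α coefficients. As a minimal sketch of how such a coefficient can be computed from raw questionnaire data, the following function uses the standard formula based on item variances and the variance of the sum score. The response matrix is hypothetical illustration data, not data from this study.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of per-participant item responses.

    responses: one list per participant with one score per item.
    """
    k = len(responses[0])  # number of items in the scale
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two participants")
    # Sample variance of each item across participants.
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of the participants' sum scores.
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical answers of five participants to a four-item scale (0 to 5).
example = [
    [4, 5, 4, 5],
    [2, 2, 3, 2],
    [5, 5, 5, 4],
    [1, 0, 1, 1],
    [3, 3, 2, 3],
]
alpha = cronbach_alpha(example)
print(round(alpha, 3))
```

Values close to 1 indicate that the items of a scale vary together and can be averaged into a single score.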

Dependent variables

For the evaluation, we measured (a) the participants’ performance in the games using log files, (b) the usability, perception, and social acceptance of the games using the UTAUT2 model in post-game surveys, and (c) qualitative observations during the participants’ play and informal talks afterwards.

The performance metrics differed between games. In Fitness Farm, each level lasted 90 s, and we counted the number of collected objects as the performance metric (a higher number means higher performance). In Cook it Right, we measured the time the participants needed to successfully finish the recipes (a shorter duration means higher performance).

For the evaluation of both games, we designed a survey based on the UTAUT2 model (see above). First, we measured the evaluation of the game on the seven dimensions of performance expectancy, effort expectancy, facilitating conditions, social influence, price–value trade-off, hedonic motivation, and habit, each with three 6-point Likert items. Next, we asked for the intention to use the game as an indicator of social acceptance using three 6-point Likert items (e.g., “I would like to play the game again”).
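To illustrate how the seven evaluation dimensions can be turned into scores, the following sketch averages the three items of each dimension. The item labels (for example "PE1") and the use of the arithmetic mean are assumptions made for illustration; the 0 to 5 range follows the coding of the Likert items described above.

```python
from statistics import mean

# The seven UTAUT2 dimensions named in the text, three items each.
# The item keys are hypothetical labels, not the study's actual item codes.
CONSTRUCTS = {
    "performance_expectancy": ["PE1", "PE2", "PE3"],
    "effort_expectancy": ["EE1", "EE2", "EE3"],
    "facilitating_conditions": ["FC1", "FC2", "FC3"],
    "social_influence": ["SI1", "SI2", "SI3"],
    "price_value": ["PV1", "PV2", "PV3"],
    "hedonic_motivation": ["HM1", "HM2", "HM3"],
    "habit": ["HT1", "HT2", "HT3"],
}


def score_constructs(answers):
    """Return the mean item score per dimension for one participant."""
    scores = {}
    for construct, items in CONSTRUCTS.items():
        values = [answers[item] for item in items]
        # 6-point Likert items coded from 0 to 5.
        if any(not 0 <= value <= 5 for value in values):
            raise ValueError(f"item out of range for {construct}")
        scores[construct] = mean(values)
    return scores


# Hypothetical participant who answers 3 everywhere except on the fun items.
answers = {item: 3 for items in CONSTRUCTS.values() for item in items}
answers.update({"HM1": 5, "HM2": 4, "HM3": 5})
scores = score_constructs(answers)
print(scores["hedonic_motivation"], scores["habit"])
```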

As this work focuses on the social acceptance of serious games in AAL, we did not measure health-related outcomes of the games. Yet, based on the presented related work, we argue that if the games are used to exercise, they will also have positive effects on health and well-being.

Statistical analysis

All subjective measures were captured on 6-point Likert scales. For data analysis, we used bivariate correlations (Pearson’s r, Spearman’s ρ), χ² tests, univariate and multivariate analyses of variance (ANOVA/MANOVA), and multiple linear regression. We used and report Pillai’s value V for omnibus tests of the MANOVAs. For the multiple linear regressions, we used the ENTER method, iteratively removed models with low standardized beta coefficients, and excluded models with high variance inflation factors (VIF ≫ 1). We set the type I error rate (level of significance) to α = .05 and report effect sizes as partial η². Missing values were deleted list-wise on a per-test basis.
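The variance inflation factor mentioned above quantifies how strongly one predictor is explained by the other predictors. The following sketch computes it with the identity that the VIF of predictor j equals the j-th diagonal element of the inverse of the predictor correlation matrix. It is a plain-Python illustration under that assumption, not the statistics software used in the study, and the predictor values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def invert(matrix):
    """Invert a square matrix with Gauss-Jordan elimination."""
    n = len(matrix)
    # Augment the matrix with the identity matrix.
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def variance_inflation_factors(columns):
    """columns: dict mapping predictor name to its list of values."""
    names = list(columns)
    corr = [[pearson(columns[a], columns[b]) for b in names] for a in names]
    inverse = invert(corr)
    return {name: inverse[i][i] for i, name in enumerate(names)}


# Hypothetical predictor values for six participants.
predictors = {
    "age": [23, 35, 48, 61, 74, 29],
    "set": [4.5, 4.0, 3.0, 2.5, 2.0, 3.5],
    "gf": [3.0, 3.5, 1.5, 2.0, 0.5, 2.5],
}
vifs = variance_inflation_factors(predictors)
print({name: round(value, 2) for name, value in vifs.items()})
```

A VIF of 1 means a predictor is uncorrelated with the others; larger values signal growing multicollinearity.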

Description of the samples

The sample for the physical exercise game included 64 participants (32 male, 32 female) aged 17–85 years (M = 43.2, SD = 19.6), with no correlation between age and gender (r = .110, p = .386 > .05). In our sample, age was associated with a decline in reported SET (r = −.548, p < .001) and lower GF (r = −.569, p < .001). In addition, chronic diseases increased with age (ρ = .406, p < .001), and the most frequently mentioned diseases were asthma, hypertension, and diabetes. Further, a link was observed between gender and SET (ρ = −.265, p = .035 < .05), with women reporting lower levels of self-efficacy than men. GF, however, was unrelated to gender in this sample (ρ = −.160, p = .206). NfA was not related to either age (r = −.055, p = .672 > .05) or gender (ρ = −.139, p = .282 > .05). Table 1 (left) presents the correlations within the first sample.

Table 1 Characteristics of both samples. Left: motion-based physical exercise game. Right: touch-based cognitive training game

The sample of the cognitive training game also consisted of 64 participants (34 male, 30 female) aged 16–84 years (M = 41.6, SD = 18.4). Again, there was no correlation between age and gender (ρ = −.065, p = .611 > .05), and age was linked to lower SET (r = −.490, p < .001) and lower GF (r = −.722, p < .001). Contrary to the first sample, NfA was linked to age (r = −.327, p < .001) and gender (ρ = −.258, p = .042 < .05), with older people and women reporting being less performance-motivated. Again, gender was related to SET, with women reporting lower scores than men (ρ = −.387, p = .002 < .05). However, gender was not related to GF (ρ = −.219, p = .082 > .05). Table 1 (right) shows an overview of the correlations.

Next, the participants’ performance and their evaluation of the games are analyzed.

Results

The structure of the results section is as follows: First, we analyze the attained performance in both games (physical exercise and cognitive training) and identify the strongest predictors of performance. Second, we analyze how individual factors, including attained performance, relate to the overall acceptance of the game.

Performance

In this section, we analyze the players’ performance in the game, how it changed over the course of the game, and whether user factors influenced performance.

For the motion-based exercise game, the number of collected objects was M = 22.8, SD = 6.5 (range 7–37) in the first level (grabbing), M = 18.8, SD = 4.8 (9–29) in the second level (grabbing + holding), and M = 17.0, SD = 4.9 (2–26) in the third level of the game (grabbing + holding + crossed over). The increasing difficulty through more complex targets yielded fewer collected game objects.

In the cognitive training game, the average completion time was M = 57.3 s, SD = 18.7 s (range 33.2–107.2 s) for the first level (low complexity), M = 95.2 s, SD = 30.5 s (60.0–167.3 s) for the second level (medium complexity), and M = 116.5 s, SD = 33.6 s (62.5–199.6 s) for the last level (high complexity). Consequently, and despite practice, the average duration of a level increased with complexity.

More interestingly, performance across the three levels of each game was strongly correlated and therefore consistent, meaning that players were consistently slower or faster regardless of the level (α = .804 for the motion-based exercise game and α = .852 for the cognitive training game). These findings are summarized in Table 2.

Table 2 Descriptive statistics for the performance in the motion-based exercise game (left) and the cognitive training game (right)
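The consistency coefficient reported above can be illustrated with a short sketch. It assumes that α denotes Cronbach's α computed over the three level scores (the text does not name the coefficient explicitly); the function name and the scores are hypothetical and are not data from the study.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for k items (here: levels), each a list with one score per player."""
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    n = len(items[0])
    if any(len(scores) != n for scores in items):
        raise ValueError("all items need one score per player")
    # Sum of the sample variances of the individual items.
    item_variance_sum = sum(variance(scores) for scores in items)
    # Sample variance of each player's total score across all items.
    totals = [sum(player_scores) for player_scores in zip(*items)]
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical scores of five players in three levels (illustration only).
level1 = [30, 25, 20, 15, 10]
level2 = [24, 21, 17, 13, 9]
level3 = [22, 18, 16, 11, 8]
alpha = cronbach_alpha([level1, level2, level3])
print(f"alpha = {alpha:.3f}")
```

Because the invented players keep almost the same rank order in every level, the resulting α is close to 1. Values such as those reported for the two games indicate the same pattern in a weaker form.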

To evaluate how individual differences influenced performance, we first calculated an average performance score across the three levels, which is used in the analyses below. As Table 3 shows, most user factors correlated with performance: In both games, performance was influenced by age (r = −.440 and r = −.543) and technical self-efficacy (r = .357 and r = .309), with higher age and lower self-efficacy yielding lower performance. Gender, however, was not related to performance in either game. The reported GF was linked to performance in the motion-based exercise game (r = .269), with gamers performing better, whereas higher NfA yielded higher performance in the cognitive training game (r = .318).

Table 3 Correlation between user factors and attained performance in the motion-based exercise game (left) and the cognitive training game (right) [gender dummy coded: 1 = male, 2 = female]

However, the sample description (see Table 1) showed that most user factors were interdependent, so it is not clear which factor had the strongest influence on performance. Consequently, we calculated a multiple linear regression (MLR) with average performance as the dependent variable and the user factors (age, gender, GF, SET, NfA) as independent variables to identify the strongest predictors.
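To make the role of the standardized coefficients (β) concrete, the following minimal sketch fits a regression on z-scored variables using only the Python standard library. It is an illustration under stated assumptions, not the procedure used in the study: the function names, the two predictors, and all values are hypothetical, and a statistics package would normally be used instead.

```python
from statistics import mean, stdev


def zscore(values):
    """Standardize values to mean 0 and sample standard deviation 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def standardized_regression(predictors, outcome):
    """Return (betas, r_squared) of a linear regression on z-scored variables."""
    zx = [zscore(p) for p in predictors]
    zy = zscore(outcome)
    n = len(zy)
    # Correlations among the predictors and between each predictor and the outcome.
    rxx = [[sum(a * b for a, b in zip(xi, xj)) / (n - 1) for xj in zx] for xi in zx]
    rxy = [sum(a * b for a, b in zip(xi, zy)) / (n - 1) for xi in zx]
    betas = solve(rxx, rxy)
    # For standardized variables, R^2 is the sum of beta_j * r_jy.
    r_squared = sum(beta * r for beta, r in zip(betas, rxy))
    return betas, r_squared


# Hypothetical data for six players (illustration only).
age = [20, 30, 40, 50, 60, 70]
self_efficacy = [4, 5, 3, 4, 2, 3]
performance = [25, 24, 18, 17, 10, 9]
betas, r_squared = standardized_regression([age, self_efficacy], performance)
print([round(beta, 3) for beta in betas], round(r_squared, 3))
```

Because the predictors are z-scored, the coefficients are directly comparable, which is why the largest absolute β identifies the strongest predictor even when the predictors are correlated with each other.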

For the first game (physical exercise), the regression model explained 60% of the variance in performance (F(5,58) = 18.763, p < .001, r2 = .603). For the second game (cognitive training), the regression model accounted for 57% of the variance in performance (F(5,39) = 9.913, p < .001, r2 = .573). Tables 4 and 5 show the regression models for the two games.

Table 4 Regression table for performance in the motion-based exercise game (r2 = .603; gender dummy coded: 1 = male, 2 = female)
Table 5 Regression table for performance in the game for cognitive training (r2 = .573; gender dummy coded: 1 = male, 2 = female)

For the motion-based game, the strongest predictor was age (β = −.598, p < .001), with older people being far slower than younger people, followed by participant gender (β = −.221, p = .013 < .05) and NfA (β = .213, p = .022 < .05), with women being slower than men and higher NfA yielding higher performance in the game. Neither SET nor participant GF had a significant influence in the MLR model, which accounts for the mutual influence among the predictor variables.

For the cognitive training game, age was also the strongest predictor of performance (β = −.739, p < .001), followed by GF (β = −.346, p = .047 < .05) and SET (β = .338, p = .020 < .05). Neither the participants’ gender nor their NfA influenced performance in the game.

Qualitatively, we observed that, independent of their age, players of both games enjoyed being able to control their own speed without being incentivized or even punished: while some tried to be as fast as possible, others took their time. Although both games provided feedback on the performance (objects collected during a round or number of steps and time taken), we did not show benchmarks for comparison (such as high scores). Here, user preferences should be implemented individually: while some, mostly older adults, commented that they liked not having to compare their performance against others, others found this particularly motivating.

In summary, the performance the participants attained was mostly determined by their age, with older people being slower than younger people. This result may not come as a surprise, so the important question of whether the performance achieved in the game influenced acceptance will be examined in the next section.

Usability and acceptance

The overall intention to use the motion-based exercise game was high (M = 3.70/5, SD = 1.02) and above the center of the scale (2.5). The cognitive training game was likewise rated positively (M = 3.40/5, SD = 1.11) and above the center of the scale.

To understand what drives social acceptance of serious games for healthcare in AAL environments, we first analyzed how individual user factors, including attained performance in the game, relate to the intention to use the game in the future. For both games, correlation analyses showed that performance did not significantly influence the usage intention (r = .161, p = .205 > .05 and r = −.049, p = .766 > .05). In the first game, only GF influenced the intention to use (r = .445, p < .001), with gamers reporting a higher intention to use the game. No other individual factors had a significant effect. For the second game, none of the individual factors influenced the intention to use the game, including GF (r = .005, p = .976 > .05). So, although NfA had a positive effect on performance, no relationship between NfA and intention to use was found in either study.
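As a plausibility check for correlations such as those above, a correlation coefficient can be converted into a t statistic with n − 2 degrees of freedom. The sketch below assumes the standard two-sided test of a Pearson correlation against zero and a sample of n = 64, matching the degrees of freedom reported for the exercise game; the text does not state how the p-values were computed, and the function name is hypothetical.

```python
from math import sqrt


def correlation_t(r, n):
    """t statistic (n - 2 degrees of freedom) for testing a Pearson correlation against zero."""
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    if n < 3:
        raise ValueError("need at least three observations")
    return r * sqrt((n - 2) / (1 - r * r))


# Correlations with intention to use reported for the exercise game, assuming n = 64.
t_performance = correlation_t(0.161, 64)
t_gaming = correlation_t(0.445, 64)
print(round(t_performance, 2), round(t_gaming, 2))
```

With 62 degrees of freedom, the two-sided critical value at the 5% level is roughly 2.0. The statistic for performance (about 1.28) stays below that threshold, whereas the statistic for GF clearly exceeds it, in line with the significance pattern reported for the exercise game.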

Lastly, we studied how the evaluation dimensions from the UTAUT2 model related to the overall intention to use both games. In Fitness Farm, all evaluation dimensions had a significant influence on the intended use. In Cook It Right, all dimensions except effort expectancy were significantly related to the intention to use. Since the criteria correlated strongly with each other, we again identified the strongest predictors using an MLR. For Fitness Farm, this procedure identified a model (F(2,61) = 16.090, p < .001) with two significant predictors and an overall explained variance in intention to use of slightly below 50% (r2 = .478). Specifically, habit (β = .503) and social influence (β = .301) were identified as the two strongest significant predictors. In Cook It Right, this procedure also identified a model with two significant predictors (F(2,44) = 24.62, p < .001) and an explained variance of above 50% (r2 = .512). The two strongest predictors were hedonic motivation (β = .550) and, again, habit (β = .294). Figure 3 illustrates the correlations (r) and the strongest predictors (β) from the MLR.

Fig. 3 Significant correlations between the UTAUT2 predictor variables and intention to use the motion-based exercise game (left) and the cognitive training game (right). Bold lines show the key predictors and their strengths (β) after a multiple linear regression

From a qualitative perspective, the participants had mixed feelings about both games. For the physical exercise game, most participants liked the idea of embedding exercise in games, yet some had doubts as to whether the current game was motivating in the long run. They suggested more activities within the game, the ability to connect with friends and to compete against each other, and other game environments, such as walking through a city by exercising. While the comments on the cognitive exercise game were more positive, a general concern was whether the activities within the games would have actual benefits for health and cognitive functioning. We suspect that there is a fine balance between a low usage barrier due to a simple and entertainment-like user interface on the one hand and a credible medical effect on the other.

Discussion and conclusion

The evaluation of both serious games for healthcare within the prototypical AAL environments revealed some expected and some unexpected results.

First, none of the participants in the two final evaluation studies had any serious difficulties in using the games and successfully completing the three required levels. Although we cannot formally prove this, we attribute this to the utilization of user-centered and participatory design principles, such as personas guiding the development phases, early tests with paper prototypes with different game concepts and interaction styles, and later frequent tests with functional prototypes. This is also reflected in the very positive overall evaluation and the above-average intention to use the games.

Second, the results show that performance is explained mainly by the age of the players, with older people being significantly slower than younger players. This may not sound particularly surprising, yet it shows that other frequently observed factors, such as lower technical self-efficacy and ability to interact with information and communication technology, play only a minor role in performance compared with age-related differences in physiology and cognition.

Third, the overall intention to use the games, based on the UTAUT2 technology acceptance model, is not directly related to any of the investigated (explanatory) user factors aside from GF in Fitness Farm. On the one hand, this raises an important question that is not sufficiently discussed in the serious game community: How can we realize preventive healthcare interventions for people not particularly inclined towards (computer) games? Can we work together with users to develop new forms of exercise activities that are not embedded within classic games, for example, by embedding the suggested movements in other interactions, such as making music or art? What do we do when people have no interest in exercising at all—for example, do they want to be nudged to be active more often? On the other hand, this also shows that the attained performance—which was much lower for older participants—is irrelevant for later acceptance of the games: as we did not give feedback on the player’s performance compared to others (e.g., high score), slower players still had fun with the game.

Strikingly, intention to use was not influenced by either technical self-efficacy or NfA. Because NfA did influence performance, this null result is unlikely to stem from a problem with the measurement itself. Hence, the games were perceived as worthwhile independent of the participants’ interest in delivering high performance and independent of the participants’ ability to interact with complicated technical systems, such as AAL environments. Again, both may be attributed to our design methodology. However, a different picture might have emerged if we had not followed the various guidelines for designing games for the elderly. Games with complicated setup routines, complex menu structures, or overcrowded interfaces may be a stronger barrier to performance, and here we expect an influence of, for example, technical self-efficacy on the intention to use.

Fourth, in both games, the application of the UTAUT2 again showed that most factors of the model were related to the intention to use the game. Hence, the model is applicable, although other factors relevant for adopting medical technology are missing (e.g., privacy disposition, trust in medical technology). Regarding the particular factors, habit was a key predictor in both games. Hence, people who envision integrating the game into their daily routine report higher intention to use the game in the future. Consequently, the strongest levers to increase the likelihood that the games are adopted are strategies to increase this perception. For Fitness Farm, social influence was also influential: people are more likely to adopt the game if others (particularly peers) positively appraise the game play. One way to realize this might be game communities in which family, friends, and others might connect and play together. In Cook It Right, hedonic motivation was the strongest predictor for the acceptance of the game: the wish to play the game again was directly linked to the perceived fun, which some, but not all, participants reported. Here, we suggest developing a variety of games so that future inhabitants of AAL environments can freely choose the training games they like the most.

Lastly, financing of (preventive) serious games for healthcare and the augmentation of living environments by AAL technology need to be discussed. Currently, it is difficult to estimate the monetary benefits of such AAL environments and the long-term medical benefits of serious games for healthcare. Consequently, more research is needed to reduce the cost of these technologies and to make their benefits quantifiable.

Limitations

This work provides insights into the acceptance of future serious games for healthcare in AAL environments, yet it is not without its limitations.

First, the evaluation of the serious games for healthcare is based on a short-term intervention. Hence, future work should evaluate the long-term effectiveness of these interventions, whether and how long positive effects will emerge and remain after playing the game, whether and by whom the games will be used over a longer period, and whether and to what extent health and overall well-being are improved. Prior work suggests that if people use the games to exercise (see related work), positive effects should unfold. Also, our own work suggests that older adults’ perceived pain decreases shortly after playing the physical exercise game (Brauner and Ziefle 2020). On the other hand, the anticipated AAL environments are not yet available, and only single AAL labs distributed across universities and other research institutions exist for user studies. Our approach of studying one short-term interaction among many users enables deeper insights into usability and technology acceptance than studying few users over a longer period.

Second, to keep the length of the experiment reasonable, the set of (explanatory) user factors and evaluation dimensions was limited. Of course, there are numerous other user factors (e.g., disposition to trust, privacy disposition) and evaluation dimensions (e.g., perceived privacy of the system, trust in this medical technology) that could influence social acceptance of these serious games. Therefore, future studies should systematically identify relevant factors and weight their importance in the respective factor space. This work, however, suggests that the application of technology acceptance research models is feasible for evaluating these systems, although additional evaluation dimensions might be needed.

Third, in this article we refrained from detailed correlation analyses between the (explanatory) user factors and the evaluation dimensions. These analyses are presented separately for Fitness Farm (Brauner and Ziefle 2020) and Cook It Right (Wittland et al. 2015).

Lastly, this work presented a game-based approach for exercise that embedded physical and cognitive exercises within two different game environments. Yet, the findings show that non-players are not easily motivated to exercise by means of games. Thus, to address non-players, exercises also need to be integrated into other environments, and studies need to determine whether a larger and more diverse range of measures has a positive effect on social acceptance, use, and health and well-being.