1 Introduction

Developed countries are experiencing significant population ageing. According to the United Nations [35], the number of older adults is projected to grow to 1.4 billion by 2030, making it one of the most significant social transformations of the twenty-first century. This demographic shift brings with it several societal challenges. One significant challenge is the increase in demand for healthcare services and caregivers, caused by the higher health risks faced by the elderly population, who in turn require regular use of those care services. One approach to mitigate this need is to develop technologies that help the population to age in a more active way, allowing them to remain healthy and independent for a longer period of time. This concept is denoted in the literature as active ageing [33].

This important societal challenge was first identified by the World Health Organization [37] and has led to significant efforts over the past decade to propose new technological tools to help society face it. These advances have been proposed under the umbrella of ambient assisted living, which is defined as “the use of information and communications technology (ICT) in a person’s daily environment that enable them to stay active longer, remain socially connected and live independently into old age” [24]. This is accomplished through “smart” environments, a.k.a. smart homes [11], which usually include intelligent sensors and actuators that coexist with social assistive robots [8] capable of interacting with people in a natural and non-intrusive way, e.g. through dialog.

Active ageing encompasses several factors [33] which can be grouped into three main dimensions: cognitive stimulation, socio-emotional stimulation, and physical activity promotion. The focus of this article is on the last of these dimensions. Regular physical activity has been shown to bring significant health benefits, irrespective of age. A large-scale longitudinal study [36] found that, in comparison with individuals in the inactive group, subjects who exercised an average of 15 min a day had a 14% lower risk of all-cause mortality and a 3-year longer life expectancy. In the elderly, in particular, physical activity reduces the ageing-related decline of cardiovascular function, strength, stability, and flexibility [21]. The benefits to mental health have also been studied, with literature showing that regular physical activity reduces the propensity for some types of mental illness [26], lessens depression and anxiety [5], and has various mood-enhancing effects [2]. In [19], the authors present statistical evidence that, compared with no exercise, physical activity is associated with lower risks of cognitive impairment, Alzheimer disease, and dementia of any type. Regular activity could therefore be an important and potent protective factor against cognitive decline and dementia in elderly persons. These health benefits reduce the need for healthcare services and caregivers, and consequently the strain imposed by the aforementioned demographic changes.

In line with this, a multi-agent system for motivating and helping elderly users to perform regular physical activity is proposed in this article. The role of these agents is to not only persuade the users to perform exercise routines during detected periods of inactivity, but also guide and accompany them during the practice. This article focuses on the ability of the proposed exercise system to deal with several barriers that hinder regular physical activity habits and provide effective training through various exercises; and on its potential to elicit a flow state in elderly users.

It also extends a recent work by the same authors [22] in three main aspects: (i) a more thorough review of the state of the art, to better frame the contributions of this work in the area; (ii) a clearer analysis of the application context and of the complementary features of the different agents included in the proposed system; and (iii) the presentation of the front-end graphical user interface included in the proposed system, the exercise routine designer, which is a module that allows a therapist, physician, or carer to prescribe the set of physical exercises by using a domain-specific language. This module then converts the information expressed in the expert’s view into a set of basic (motion) primitives (the low-level view of the information) that are sent to the robotic and virtual agents, which in turn help the elderly user to perform the prescribed exercises.

The next section surveys related work on assisting and motivating elderly users to perform regular physical activity, and explains the main differentiating aspects of the system proposed herein and its contribution to the area. Section 3 summarises the goals and the design guidelines of the proposed system. Sections 4 and 5 describe in detail the proposed multi-agent system, namely each individual agent comprising it and their integration in the smart environment. Section 6 presents the pilot study that was conducted to validate the system with end users, as well as the obtained results and their discussion. The final section wraps up the main conclusions of the study.

2 Related work and contributions

A considerable body of research already exists on social assistive robots and other elderly-focused technologies. These technologies generally focus on monitoring dangerous situations [14], providing psychological support [15], and helping elderly users in their activities of daily living (ADLs) [23]. All of these issues can be positively affected by regular physical activity [3], and thus benefit from technologies that promote it. As such, multiple exercise systems designed to help elderly users perform physical activity have been developed.

Many of these exercise systems make use of screen interfaces, displaying virtual avatars to provide a visualization of the execution of the exercises. The role of the avatars is typically to guide the users in performing specific exercises for rehabilitation [18] or for maintaining physical condition [16]. In [31], the authors evaluate Active Lifestyle, a training application running on a tablet, which assists, monitors, and motivates elders to follow personalised training plans autonomously at home. Others have proposed software combining sport exercises and gaming, i.e. serious games, to assist elderly persons in performing physical exercises [17]. While these kinds of screen interfaces have been demonstrated to facilitate physical training, they present some shortcomings that limit the attained benefits, namely the a priori lack of motivation to use them on a regular basis, or difficulties in using them due to visual impairment or some degree of ICT illiteracy.

Approaches that explore the use of robot-based systems to promote elderly physical activity have also been studied. In [27], a small robot designed to deliver basic coaching for physical activities is presented. Görer et al. [12] explored and tested an autonomous robotic exercise tutor to help elders with their daily exercises. Its motion learning capabilities enable a human to define exercises by demonstrating them to the robot. After watching the human instructor performing an exercise, the robot replicates it to the best of its ability. This exercise tutor is then able to guide users through physical activity sessions comprising the exercises it learned. In [6], the authors present a social robot that recommends and monitors physical exercises designed for elders. Besides the human-robot interaction provided by the robot, the robotic system evaluates the users’ exercise performance using deep learning and schedules personalised physical exercises in the users’ agenda. Shen and Wu [30] benchmarked a robot instructor, a humanoid robot based on the Aldebaran NAO platform, against a human instructor. This pilot study, conducted in Singapore with 41 elderly participants, showed that a robot can be more effective than, and preferred by users over, a human instructor in that context.

The comparison between an embodied robot coach and a virtual coach has also been the subject of study. Fasola and Matarić [9] used a system consisting of a physical robot and a computer simulation of that robot in a physical exercise context. The participants of the study evaluated them individually in terms of companionship, helpfulness, social attraction, social presence, and as exercise partners. Results showed that the users preferred the physical robot over its computer simulation counterpart on all of these scales. This was attributed to the positive effect that physical embodiment has on user acceptance, which agrees with other similar studies [13].

The ICT-based solution proposed herein goes beyond the state of the art by making use of a multi-agent system that aims to take advantage of the merits of both software-based agents and embodied agents for the promotion of physical activity in elderly users. The combination of different types of agents also mitigates their individual shortcomings in the service provided by the system to the elderly user. An avatar, a software-based agent with advanced interaction capabilities, acts as a virtual coach and instructs the elderly user in performing a sequence of physical exercises with a high degree of fidelity in the body motions. On the other hand, a social robot uses advanced interaction features based on posture and speech to invite the elderly user to initiate a physical exercise session. The physical embodiment provided by the social robot is used to promote a better engagement of users in the proposed exercises, by accompanying and encouraging them during the physical activity session, thus acting as an exercise buddy. This engagement arises from the fact that the robot performs, side by side with the user, basic movements at the same pace and amplitude as instructed by the avatar. Moreover, the system also innovates by including in the multi-agent system an exercise performance evaluation and feedback module, i.e. a software-based agent that uses computer vision and classification to assess whether the user is performing the exercise correctly. The feedback data provided by this agent is used as an input to the assistive robot companion which, in turn, uses it to adapt the spoken utterances that encourage the user during the workout and help him/her to perform better.

3 Goals and approach

The aim of this article is to propose a system capable of motivating, accompanying and evaluating the elderly during exercise. The proposed system should be capable of:

  1. Correctly detecting periods of inactivity;

  2. Effectively guiding elderly users through exercise routines composed of diverse exercises;

  3. Evaluating user performance and keeping a record of progress;

  4. Positively affecting elderly users’ motivation to remain active.

In order to effectively accomplish these goals, four distinct agents were developed and integrated: a smart camera network, a robotic exercise companion (i.e. a social robot acting as an exercise buddy), a virtual coach (i.e. an avatar), and a performance evaluator. Each agent is responsible for a specific task. The smart camera network, deployed in the user’s home, is responsible for identifying when a user has been physically inactive for an excessively long period, and for launching an alert event about the situation. The robot, by making use of its physical presence and mobility, is able to interact with the user to try to convince him/her to do some physical exercise, and later to take the role of an exercise buddy. The virtual coach, with its ability to faithfully replicate human movements, is used to effectively guide the user through an exercise routine. The performance evaluator is responsible for tracking and analysing the user while he/she performs the routine, issuing warnings about incorrect practice as well as motivational words or sentences to improve engagement. By combining these complementary capabilities, the system exploits the individual strengths of the agents and mitigates their shortcomings, thus producing a multi-agent system capable of tackling the problem at hand.

Putting it all together, the system keeps track of the users inside the smart home and detects periods of prolonged physical inactivity. When it determines that a user is in a state of prolonged inactivity, it sends a message to the centralized planner. This triggers the exercise service, which sends the robot agent to the user’s location in order to try to convince him/her to exercise. In the case of a positive response, the robot guides the user to the exercise area, where the virtual coach is displayed on a screen. Once there, the user chooses the routine he/she wants the virtual coach to demonstrate. After capturing the user’s physical characteristics, the virtual coach guides and evaluates the user throughout the chosen exercise routine. During the routine execution, the robotic exercise buddy accompanies the user in all the exercises, even if this means using adaptations given its mobility and structural characteristics. When the routine ends, the virtual coach records the performance evaluation and the robot goes to its charging station.

4 Exercise routines: design, guidance, and motivation aspects

It is well known that, as we age, muscles and bones tend to lose mass and, consequently, functionality. These losses can be very steep in the absence of physical activity. On the other hand, physical activity not only improves muscle condition, but the application of tension to the bones also contributes to the increase in their density via a piezoelectric effect. Naturally, the exercises to perform must be adequate to each person and carefully chosen by professionals, in order to avoid dropout due to frustration, or even accidents. As the health benefits are not easily perceived, there is frequently a tendency to relax and reduce physical activity or simply forget about it. It is therefore important to find ways to help people to engage in this type of practice. To this end, the most frequently cited approach is the use of games to increase interest, reduce boredom, and even make users forget their limitations [10].

In the work described herein, the choice was to pay attention to both intrinsic and extrinsic motivational aspects. As for the former, it is important that the person does not experience frustration and instead feels able to do the exercises. As already mentioned, this requires the exercises to be adequate in terms of range of movement, effort, and speed. Maladjusted speed and cadence is possibly one of the factors that most rapidly frustrates any practitioner; therefore, any coach must pay attention to the limits of each particular person and make adequate adaptations. Lastly, guidance and positive correction are very important for the exercises to attain their goals and contribute to an improved physical condition.

Though it is important to affect extrinsic motivations for promoting regular exercise, especially during the early stages, studies have shown that intrinsic motivation is vital for the maintenance of those habits. One way to verify whether a given activity is having a positive effect on the intrinsic motivation of the participant is to see if the flow state is being elicited. Flow state was defined in [25] and may be summarised, in simplified terms, as a level of immersion in an activity so deep that one loses the sense of time. This can be subjectively quantified with the help of questionnaires like the Flow Questionnaire (FQ) [7], the Flow Short Scale (FSS) [29], or others.

4.1 Exercise routine designer

One important part of the proposed system is the Exercise Routine Designer, a software module that supports therapists or trainers in creating and modifying exercise routines. Based on a library of exercises, this utility may be used to select, trim, and combine specific exercises and to define their appropriate execution speeds, number of repetitions, and the pauses between them. This enables the customisation of a routine to a specific person’s abilities and needs. The editing of routines is illustrated in Fig. 1.

Fig. 1

Exercise Routine Designer editing windows: editing the number of repetitions, pause length, and speed factor (top left); adding a new exercise to the routine from the current list (top right); definition of a new exercise from parts of exercises (bottom)

Although the module relies on predefined exercises, the person in charge can select parts of different exercises and glue them together to create new ones. Fig. 1 (bottom) shows an example where a pause was introduced in the middle of a squat, but any combination of parts of the same or different exercises is possible.

Currently, the exercises are based on model animations stored in the Collada file format. The exercises are mapped to the corresponding files through a JSON dictionary that includes a set of relevant parameters. Similarly, the routines are also stored in JSON format. Figure 2 shows examples of how routines and exercise mappings are stored.

Fig. 2

Left: a routine as stored in JSON. Note the use of exercise names defined in the exercise dictionary. Right: an entry of the exercise dictionary as stored in JSON
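As a rough illustration of the storage scheme described above, the following Python sketch loads a routine and an exercise dictionary from JSON and resolves each routine entry to its animation file. The key names (`animation_file`, `joints`, `demonstrations`, `repetitions`, `speed`, `rest_s`), the file names, and the helper `resolve_routine` are hypothetical, chosen only for illustration; they are not the schema used by the system.

```python
import json

# Hypothetical exercise dictionary: exercise name -> Collada file plus parameters.
EXERCISE_DICTIONARY = json.loads("""
{
  "squat":     {"animation_file": "squat.dae",     "joints": ["left_knee", "right_knee"]},
  "arm_raise": {"animation_file": "arm_raise.dae", "joints": ["left_shoulder", "right_shoulder"]}
}
""")

# Hypothetical routine: ordered entries that refer to names in the dictionary.
ROUTINE = json.loads("""
{
  "name": "example_routine",
  "exercises": [
    {"exercise": "arm_raise", "demonstrations": 1, "repetitions": 8, "speed": 1.0, "rest_s": 20},
    {"exercise": "squat",     "demonstrations": 1, "repetitions": 6, "speed": 0.8, "rest_s": 30}
  ]
}
""")


def resolve_routine(routine, dictionary):
    """Attach the animation file to each routine entry; reject unknown names."""
    resolved = []
    for entry in routine["exercises"]:
        name = entry["exercise"]
        if name not in dictionary:
            raise KeyError(f"exercise {name!r} is not in the exercise dictionary")
        resolved.append({**entry, "animation_file": dictionary[name]["animation_file"]})
    return resolved
```

Keeping the routine and the dictionary separate, as the text describes, means a routine only needs to reference exercises by name, so an animation file can be replaced without editing every routine that uses it.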

4.2 Virtual coach

The virtual coach (VC) included in the proposed system (see Fig. 3) is an animated character that demonstrates faithfully how to perform each specific exercise. It is supported by OpenAR, an in-house developed software platform for the creation of 3D applications. When the exercise system starts, the application generates and displays a virtual environment on which the virtual coach greets the user.

Fig. 3

A user executing exercises accompanied by the robotic buddy and guided by the virtual coach on the screen

The virtual coach is provided with a set of low-level motion primitives that enable the reproduction of the sequences generated via the Exercise Routine Designer, based on the exercise routine prescribed by a therapist, physician, or carer according to the user’s specific needs. The virtual coach agent is also responsible for communicating a low-level specification of the exercise routine to the robot, herein denoted as Exercise Buddy, so that it may follow the exercise execution. As mentioned previously, a routine comprises a set of exercises, each with an associated animation, number of demonstrations, number of repetitions, exercise speed, and rest time between exercises. The routine is interpreted by the graphical engine as a queue of animations, with the specified speeds and wait times between them. The animations are in turn defined by sets of timed key-frame transformations, which are interpolated over time to compute the configurations to be applied to the avatar skeleton and, therefore, animate its polygon mesh. The virtual coach maintains a communication channel with the robot that supports the synchronisation of the exercises executed by the two agents.
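The key-frame interpolation mentioned above can be sketched in Python for a single scalar joint angle. This is a deliberate simplification: a real skeleton interpolates full transformations per joint (typically rotations), and the function names, the linear interpolation scheme, and the `fps` parameter are assumptions made for illustration rather than details of the graphical engine.

```python
from bisect import bisect_right


def interpolate_keyframes(keyframes, t):
    """Linearly interpolate a joint angle at time t.

    keyframes: list of (time_s, angle_deg) pairs sorted by time.
    Times outside the key-frame range are clamped to the first/last value.
    """
    times = [k[0] for k in keyframes]
    if t <= times[0]:
        return keyframes[0][1]
    if t >= times[-1]:
        return keyframes[-1][1]
    i = bisect_right(times, t)  # index of the first key-frame strictly after t
    (t0, a0), (t1, a1) = keyframes[i - 1], keyframes[i]
    w = (t - t0) / (t1 - t0)    # blend weight in [0, 1)
    return a0 + w * (a1 - a0)


def sample_animation(keyframes, speed, fps=30):
    """Sample one playback of an animation at a given speed factor.

    A speed of 2.0 plays the animation in half its nominal duration.
    """
    duration = keyframes[-1][0] / speed
    n_frames = int(duration * fps) + 1
    return [interpolate_keyframes(keyframes, (k / fps) * speed) for k in range(n_frames)]
```

In this sketch the speed factor simply rescales the animation clock, which is one straightforward way to realise the per-exercise speeds set in the routine.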

4.3 Exercise buddy

The exercise buddy is the GrowMu social robot [20], shown in Fig. 3, which is able to communicate verbally through a speech synthesizer, and non-verbally by making use of a LED array on its head to change facial expressions, or by moving its base to convey some sort of message or emotion.

Since the robot is not equipped with any limbs, it may only execute base movements (compositions of planar translations and rotations) in synchrony with the exercises demonstrated by the virtual coach agent. In spite of its limited movements, the effect of behaving as an exercise buddy was shown to increase the engagement and effectiveness in physical exercise promotion. Apart from that, there are also the visual effects and the sense of presence provided by an embodied (physical) agent moving close to the user and in synchrony. This is one of the effects of night club lights, giant screens and moving heads, which flash and generate motion patterns at the rhythm of the music to stimulate people to dance. In gyms, it is usual to have classes with several people performing rhythmic exercises while looking forward at the coach, who establishes the cadence and shouts instructions in sync. Why do people engage more in a group class than alone with the coach? In fact, people try to follow the coach’s instructions, but the literature demonstrates that the perception of nearby motion by exercise companions, i.e. some sort of companionship [32] and social support [34], makes them “go with the wave”, thus contributing to their higher engagement in physical exercises. This is the effect that the inclusion of the Exercise Buddy in the system intends to explore. The robot thus acts as an exercise buddy which, although not performing all the exercises due to its missing limbs and articulations, induces through its motion the sensation of someone exercising together with the user.

To this end, the exercise routines are translated into motions that the robot can execute, turning each exercise into basic movements that it can rhythmically execute. As aforementioned, the robot performs the exercise together with the user and in sync with the Virtual Coach at each repetition.

4.4 Exercise performance evaluation and feedback

The evaluation of the user performance is done by comparing the user movements with the virtual coach ones, in terms of some parameters that are presented below. Depending on whether the user is following correctly or not, either a verbal motivational or corrective instruction is spoken by the coach.

The user exercise analysis starts with tracking the user’s body configuration over time using OpenPose [4], a real-time CNN-based algorithm that is able to extract the 2D pose of people in images. Figure 4 shows the extracted skeleton superimposed on the user image. The second step consists in the extraction of body configuration parameters, which are to be compared with those executed by the virtual coach.

Fig. 4

Samples from exercise sequence showing the user body pose extraction by OpenPose

The evaluation of the exercises is based on the analysis of several parameters that relate the joints relevant to the exercises, the predominant axis of the movement, events related with expected changes in direction of movement, the time at which each event should occur on the exercise, and the motion range that should be reached until that time.

When an exercise routine is prescribed through the Exercise Routine Designer, a set of goals for each exercise is generated, defining what should be accomplished by each joint using the information previously described. The evaluation is then performed, for each joint, by comparing the events detected with the goals defined in terms of percentage of executed repetitions, average delay, and percentage of repetitions with correct range. The percentage of executed repetitions is given by

$$\begin{aligned} p_{{\textit{executed}}} = \frac{r_{{\textit{detected}}}}{r_{{\textit{expected}}}} \times 100, \end{aligned}$$
(1)

where \(r_{{\textit{detected}}}\) is the number of detected events, and \(r_{{\textit{expected}}}\) is the number of expected repetitions at the time of the evaluation t. The delay of detected event n with respect to the goal is given by

$$\begin{aligned} \varDelta t_n = t^d_n - t^g_n, \end{aligned}$$
(2)

where \(t^d_n\) is the time at which event n was detected and \(t^g_n\) is the time of the goal closest to when event n was detected. The average delay is then given by

$$\begin{aligned} \varDelta t_{{\textit{avg}}} = \frac{ \sum _{n=1}^{r_{{\textit{detected}}}}\varDelta t_n}{r_{{\textit{detected}}} }. \end{aligned}$$
(3)

The range of motion is evaluated by verifying during the execution of the repetition if it is within the acceptable interval defined in the exercise.
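The three measures above can be computed directly from lists of event times. The following Python sketch follows Eqs. (1) to (3) and adds a simple range check; the function names, the use of one peak value per repetition for the range test, and the handling of empty inputs are assumptions made for illustration.

```python
def percentage_executed(r_detected, r_expected):
    """Eq. (1): percentage of executed repetitions."""
    if r_expected == 0:
        return 0.0  # assumption: nothing expected yet, report 0%
    return r_detected / r_expected * 100


def delays(detected_times, goal_times):
    """Eq. (2): delay of each detected event w.r.t. the closest goal time."""
    return [
        t_d - min(goal_times, key=lambda t_g: abs(t_d - t_g))
        for t_d in detected_times
    ]


def average_delay(detected_times, goal_times):
    """Eq. (3): mean of the per-event delays."""
    d = delays(detected_times, goal_times)
    return sum(d) / len(d) if d else 0.0


def percentage_in_range(peak_values, low, high):
    """Percentage of repetitions whose peak lies in the acceptable interval.

    peak_values holds one normalised motion extreme per repetition
    (a hypothetical representation of the range-of-motion check).
    """
    if not peak_values:
        return 0.0
    ok = sum(1 for v in peak_values if low <= v <= high)
    return ok / len(peak_values) * 100
```

For example, with goals at 2, 4, 6 and 8 s and events detected at 2.5, 4.25 and 6.75 s, three of four expected repetitions were executed (75%) and the average delay is 0.5 s. A positive delay means the user lags behind the coach.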

The use of these parameters requires that, before the exercise, the system enters a calibration phase, where the relative limb size is taken for normalisation of the range of motion. Here, the user must stand still, with all limbs stretched and at a central location in the camera image. This restriction on location allows for the capture of all movements, without the chance of the extremity joints going out of bounds. When the calibration phase ends, the system registers the user characteristics and triggers the Virtual Coach to start the routine. During the exercise session, the Virtual Coach progressively executes the animations in its queue, evaluating the user and recording that evaluation. If the evaluation detects that the user is delayed with respect to the Virtual Coach’s movements, this agent decreases its speed in response so as to help the user to catch up.

5 Integrated smart environment for exercise promotion

As aforementioned, this system is integrated in a smart environment using a distributed approach to make use of the abilities of multiple agents, as depicted in Fig. 5. The central decision making agent interprets the data provided by the available agents (e.g. users’ location, exercise performance, etc.), combines it with the current system state, and triggers the appropriate service by means of a high-level command sent to the agents. The communication between the agents is done by making use of ROS (Robot Operating System) communication primitives. Once a service is initiated, its action plan is translated by each agent into low-level commands, such as velocity commands, speech output, visual output, etc., which they follow in order to fulfil the global goal.

Four distributed agents are included in the multi-agent system proposed herein: smart camera network, exercise buddy, virtual coach, and exercise performance evaluator. The smart camera network informs the system about the user’s location within the smart environment and his/her (in)activity level. At pre-programmed times, or when the user is detected to be excessively inactive, the exercise promotion orchestration comes into play.

Fig. 5

Architecture composed of a centralised decision making agent and four main distributed execution agents

5.1 Smart camera network

The camera network is installed in the user’s home, where each camera observes a separate room. The video streams are analysed by a convolutional neural network (CNN), YOLO 9000 [28]. This CNN allows the agent to identify objects and people present in the image and track their locations. When detecting an object or person, it identifies the detected class with an associated confidence level \(c \in [0, 1]\), and defines the image location via a bounding box. As the goal in this work is to detect people, the user, once detected, is localised by using a homography transformation to map the camera view into the corresponding spatial locations. The required homography matrices \({\mathbf {H}}\) are initially estimated for each camera through a calibration procedure. For this, a set of known floor marks \(\mathbf {p}_{\mathbf {i}}=\begin{bmatrix} x_i&y_i&1 \end{bmatrix}^{T}\) are identified in the cameras’ images as \(\mathbf {p}_{\mathbf {i}}^{\mathbf {\prime }}=\begin{bmatrix} x'_i&y'_i&1 \end{bmatrix}^{T}\), as they are related by \(\mathbf {p}_{\mathbf {i}}^{\mathbf {\prime }}={\mathbf {Hp}}_{\mathbf {i}}\). In fact, in a view of the user, only the spatial location of the feet can be estimated, as the established homography is only valid for the ground floor. The coordinates of the user are then determined by computing \(x', y'\) with respect to the lowest point of the bounding box that contains the person.
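A minimal Python sketch of this localisation step is given below. It assumes that a calibrated matrix H maps floor coordinates to image coordinates, as in the relation above, so that the floor position of the feet follows from the inverse mapping, and that homogeneous points are equal only up to scale. The function names, the bounding-box layout `(x_min, y_min, x_max, y_max)`, and the choice of the bottom-centre of the box as the feet point are assumptions made for illustration.

```python
def invert_3x3(m):
    """Invert a 3x3 matrix (nested lists) using the adjugate formula."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("homography matrix is singular")
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]


def apply_homography(h, x, y):
    """Map (x, y) through h in homogeneous coordinates, then dehomogenise."""
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return u / w, v / w


def feet_point(bbox):
    """Bottom-centre of a box (x_min, y_min, x_max, y_max); image y grows downwards."""
    x_min, _y_min, x_max, y_max = bbox
    return (x_min + x_max) / 2, y_max


def locate_user(h_floor_to_image, bbox):
    """Estimate the floor position of a detected person from the bounding box."""
    h_image_to_floor = invert_3x3(h_floor_to_image)
    return apply_homography(h_image_to_floor, *feet_point(bbox))
```

Using the lowest point of the box matters because the homography is only valid on the floor plane: any image point above the feet would be projected to a floor position further from the camera than the user actually is.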

In order to reduce the computational load imposed by simultaneously processing images from multiple camera feeds, a mechanism for the selection of the most informative camera was implemented. Since the purpose of this agent is to track humans, all cameras are swept periodically and their captured images are analysed and searched for people. The camera presenting the highest confidence of human detection is selected for continuous processing during the next interval. This allows the resources to be focused on the camera that best captures the user, instead of being divided among all cameras.
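The selection rule can be sketched in a few lines of Python. The input format (a mapping from camera identifier to a list of `(class, confidence)` detections from one sweep) and the function name are assumptions made for illustration.

```python
def select_camera(detections_by_camera, person_class="person"):
    """Return the camera with the most confident person detection, or None.

    detections_by_camera: dict mapping a camera id to a list of
    (class_name, confidence) pairs produced during one sweep.
    """
    best_camera, best_conf = None, 0.0
    for camera_id, detections in detections_by_camera.items():
        for class_name, conf in detections:
            if class_name == person_class and conf > best_conf:
                best_camera, best_conf = camera_id, conf
    return best_camera
```

Between two sweeps, only the feed of the returned camera would be processed continuously; a result of `None` means that no person was found in any room during the sweep.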

Once the user of interest is detected immobilised for a predefined period of time, the camera network agent launches an alert event that triggers the exercise promotion sequence.
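One simple way to realise this trigger is sketched below in Python: the user is considered immobile while the estimated floor position stays within a small radius of the last position at which movement was observed. The class name, the radius-based definition of immobility, and both thresholds are hypothetical and would need tuning.

```python
from math import hypot


class InactivityMonitor:
    """Raise an alert when the user stays near one spot for too long."""

    def __init__(self, max_idle_s, min_move_m):
        self.max_idle_s = max_idle_s  # hypothetical idle-time threshold (seconds)
        self.min_move_m = min_move_m  # hypothetical movement threshold (metres)
        self._anchor = None           # position where the user last moved
        self._since = None            # time at which the idle period started

    def update(self, t, x, y):
        """Feed one timestamped floor position; return True if an alert is due."""
        moved = (
            self._anchor is None
            or hypot(x - self._anchor[0], y - self._anchor[1]) > self.min_move_m
        )
        if moved:
            # Restart the idle period at the new position.
            self._anchor, self._since = (x, y), t
            return False
        return t - self._since >= self.max_idle_s
```

Comparing against a fixed anchor rather than the previous sample keeps slow drift in the position estimate from resetting the idle timer, while a genuine change of place does.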

5.2 Calling for exercise practice

When the smart camera network registers a period of user inactivity above a certain threshold, it sends an alert message to the central planning agent. As a consequence, the latter establishes communication with the robot and assigns it the new mission of going to the user’s location and inviting him/her to the exercise routine. The robot navigates to that point, approaches the user, and starts an oral dialog proposing the exercise. The user is expected to answer in the same way, as previous studies found this to be the most adequate form of interaction for this age group.

In the case the user refuses to go along with the robot, the latter tries to be more persuasive by reinforcing the invitation with positive arguments. In the case the user still does not accept, it acknowledges the choice, and returns to its charging dock. Figure 6 shows the situation where the robot approaches the user and establishes the invitation dialog.

Fig. 6

Robotic buddy inviting user for some physical exercise practice

If the user accepts, the robot guides him/her to the screen where the Virtual Coach will guide the exercise session. Prior to the exercise, the robot establishes a short dialog with the user to inquire about exercise preferences such as coach gender, characteristics of the exercises, etc. Finally, the exercise session is started as described above.

6 Pilot study

In order to verify if the exercise system is able to accomplish the proposed goals, a multi-session user study with elderly users was conducted.

The description of this pilot study and the obtained results were previously included in a conference article [22] and are repeated here for the sake of completeness. Figure 7 shows two of the participants of that pilot study doing the exercises demonstrated by the virtual coach and accompanied by the robot.

Fig. 7

Two of the voluntary participants doing the exercises

6.1 Experimental setup and participants

Three variations of the system were included in the tests:

  • Variant 1, a workout video: a video of the Virtual Coach performing an exercise routine. This variant is the simplest of the three. The warning of prolonged idleness is given to the system, which triggers a video of the Virtual Coach greeting the user and performing an exercise routine. The user’s performance is not taken into account and no agent is present in this variant. It is intended to serve as a baseline for the results obtained from the other two variants.

  • Variant 2, an avatar-guided session: the Virtual Coach performs the routine while vocalising feedback and adapting the exercise speed according to the user’s performance. After the warning of prolonged idleness, the Virtual Coach appears on the screen, greets the user, and asks him/her to stay inside the central area for the calibration phase, where it acquires the resting appearance of the user, height, and limb lengths. Once this phase ends, the Virtual Coach starts conducting the routine execution. Depending on the evaluation of the last three repetitions, the Virtual Coach either maintains the speed at which it is executing the exercise or, in the case the user is falling behind, slightly slows down. The evaluation is vocalised by the Virtual Coach every three repetitions, also considering the last three repetitions. If the evaluation is positive, the Virtual Coach motivates the user with a congratulatory sentence (e.g. “Well done”, “Keep it up”). On the other hand, if any of the analysed joints has a negative evaluation, the Virtual Coach warns the user of that mistake (e.g. “Be careful with your right elbow”).

  • Variant 3: a robot-accompanied session. The Exercise Buddy (i.e. the social robot) accompanies the user during the session guided by the Virtual Coach. All agents are employed in this variant. The Exercise Buddy greets the user at the start of the session and asks him/her to stay inside the central area for calibration. After the calibration phase ends, the Exercise Buddy informs the user that it is done and the Virtual Coach starts executing the routine. The Exercise Buddy accompanies the user and the Virtual Coach through the exercises, vocalising the evaluation in the same manner as in variant 2. The exercise speed is adapted in the same way as in variant 2.
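
The adaptation and feedback rule shared by variants 2 and 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the names `RepEvaluation`, `adapt_speed`, `feedback_sentence`, and `run_session` are hypothetical, and the lag tolerance, slow-down factor, and minimum speed are assumed values, since the text only states that the coach slows down slightly when the user falls behind.

```python
from dataclasses import dataclass, field


@dataclass
class RepEvaluation:
    """Evaluation of one repetition (hypothetical structure)."""
    delay_s: float                                  # lag behind the Virtual Coach, in seconds
    bad_joints: list = field(default_factory=list)  # joints evaluated negatively


def adapt_speed(speed, last_three, lag_tolerance_s=1.0, slow_factor=0.9, min_speed=0.5):
    """Keep the current speed, or slow down slightly if the user is falling behind.

    All numeric defaults are assumptions chosen for illustration.
    """
    mean_delay = sum(r.delay_s for r in last_three) / len(last_three)
    if mean_delay > lag_tolerance_s:
        return max(min_speed, speed * slow_factor)
    return speed


def feedback_sentence(last_three):
    """Warn about the first negatively evaluated joint, otherwise congratulate."""
    for rep in last_three:
        if rep.bad_joints:
            return f"Be careful with your {rep.bad_joints[0]}"
    return "Well done"


def run_session(evaluations, speed=1.0):
    """Process repetitions in complete blocks of three; return final speed and utterances."""
    utterances = []
    usable = len(evaluations) - len(evaluations) % 3  # ignore an incomplete trailing block
    for i in range(0, usable, 3):
        block = evaluations[i:i + 3]
        speed = adapt_speed(speed, block)
        utterances.append(feedback_sentence(block))
    return speed, utterances
```

In variant 2 the returned sentence would be spoken by the Virtual Coach, and in variant 3 by the Exercise Buddy; the speed update is the same in both.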

These variants were tested with a group of 12 elderly participants (6 male and 6 female) with an average age of \(79.08 \pm 6.53\) years. Each session lasted between 25 and 30 min, of which 15 min were devoted to the exercise routine and the remaining 10–15 min to answering the questionnaires. One variant was tested in each experimental session, with sessions taking place once a week over three weeks. The exercise routine was identical in all variants and consisted of four distinct exercises, two sets each, performed both seated and standing. The exercises targeted the strength and flexibility of the user’s arms and legs.

During routine execution, the aforementioned human evaluator analysed and recorded each participant’s performance in terms of number of repetitions, quality of execution, and synchronisation with the trainer. After the routine ended and the agent said goodbye, the participant answered two questionnaires about the system and his/her experience with it.

6.2 Hypotheses and metrics

With this study we intended to support the following hypotheses:

Hypothesis 1

Participants will evaluate all variants of the system with a high usability.

Hypothesis 2

Participants will experience a higher flow if motivation is given.

Hypothesis 3

Participants will experience a higher flow from the variant with the robot when compared with the other two.

Hypothesis 4

The system will be able to effectively guide the participants through various exercises.

Hypothesis 5

The system will be able to evaluate user performance similarly to a human counterpart.

In order to validate these hypotheses, we employed the following metrics:

M1:

System Usability Scale (SUS): a ten-item Likert scale for a subjective assessment of the system usability;

M2:

Flow Short Scale (FSS): evaluates the nine components of flow experience with ten items over a 7-point scale ranging from not at all (1) to very much (7);

M3:

Human evaluator score: evaluates participant performance in terms of number of repetitions, execution, and synchronisation of each exercise. Synchronisation is evaluated on a 5-point scale from completely desynchronised (1) to perfectly synchronised (5). Execution is also evaluated on a 5-point scale: wrong exercise (1), very badly performed (2), with small error in execution (3), insufficient range of motion (4), perfectly executed (5);

M4:

Automatic evaluation score: real-time system evaluation of participant performance in terms of number of repetitions, execution, and synchronisation of each exercise. Synchronisation is measured as the number of seconds by which the user lags behind the avatar’s repetition. Execution is classified as correct or incorrect range of motion.
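
To illustrate how the two M4 quantities might be derived for a single repetition, the sketch below computes the delay from a pair of timestamps and classifies range of motion against a target joint angle. The function names, the use of a peak joint angle as the range-of-motion measure, and the 15° tolerance are assumptions made for this example; the article does not specify how the module computes them.

```python
def sync_delay_s(avatar_peak_time_s, user_peak_time_s):
    """Seconds by which the user trails the avatar (0 if the user is not behind)."""
    return max(0.0, user_peak_time_s - avatar_peak_time_s)


def range_of_motion_ok(user_peak_angle_deg, target_angle_deg, tolerance_deg=15.0):
    """True if the user's peak angle reaches the target within an assumed tolerance."""
    return user_peak_angle_deg >= target_angle_deg - tolerance_deg


def evaluate_repetition(avatar_t, user_t, user_angle, target_angle):
    """Bundle both M4-style measures for one repetition."""
    return {
        "delay_s": sync_delay_s(avatar_t, user_t),
        "range_ok": range_of_motion_ok(user_angle, target_angle),
    }
```

Unlike the 5-point scales of M3, this yields a continuous delay and a binary range-of-motion label, so any comparison between M3 and M4 requires mapping one onto the other.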

Both the SUS [1] and the FSS [29] were answered by the participants at the end of each session, right after they finished the exercise routine. The human evaluator graded the exercise sessions during their execution and confirmed the evaluation at a later date using the videos recorded during each session.
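
For reference, the standard SUS scoring rule converts ten responses on a 1 to 5 scale into a single score between 0 and 100. The sketch below assumes that this standard procedure was applied; the article does not describe the scoring step, and the function name `sus_score` is hypothetical.

```python
def sus_score(responses):
    """Return the 0-100 SUS score for ten responses given on a 1-5 scale."""
    if len(responses) != 10 or any(not 1 <= r <= 5 for r in responses):
        raise ValueError("SUS needs ten responses between 1 and 5")
    total = 0
    for item_number, r in enumerate(responses, start=1):
        # Odd items are positively worded and contribute (response - 1);
        # even items are negatively worded and contribute (5 - response).
        total += (r - 1) if item_number % 2 == 1 else (5 - r)
    return total * 2.5
```

Each item therefore contributes between 0 and 4 points before the final multiplication by 2.5.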

6.3 Results

After the experiments, we analysed the questionnaire answers and the evaluation scores from both the human and the automatic evaluators. For the SUS questionnaire, given the goals of the study, questions 1 (“I think that I would like to use this system frequently.”) and 2 (“I found the system unnecessarily complex”) were considered the most important. The FSS results were submitted to an analysis of variance (ANOVA) to verify the statistical significance of the data, and the level of flow state obtained for each variant was computed. To verify whether the system was able to guide the user through the routine, user performance was compared across variants using the data provided by the human evaluator. Finally, the human evaluator’s data were compared with the output of the Exercise Performance Evaluation and Feedback module to validate the performance analysis done automatically by the system.
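
As a reminder of what a one-way ANOVA computes, the sketch below derives the F statistic from the between-group and within-group sums of squares. It is illustrative only: the article does not state which factor was used to group the FSS answers, the function name is hypothetical, and the numbers in the usage note are made up rather than taken from the study.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```

For example, `one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])` compares three made-up groups of three observations each. Obtaining a p-value additionally requires the F distribution with the returned degrees of freedom; a library routine such as `scipy.stats.f_oneway` returns both the statistic and the p-value.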

Variant 1 (see Fig. 8) was evaluated positively by the participants in terms of usability (see Fig. 8a), with an average score of \(81.875 \pm 8.99\), where question 1 obtained an average score of \(2.91 \pm 0.79\) and question 2 an average score of \(2.91 \pm 0.99\). On the FSS (see Fig. 8b) we verified the statistical significance of the results (\(p \approx 10^{-17}\)) and obtained an average flow level of \(5.475 \pm 0.486\). The human evaluator scored positively 100% of the repetitions of exercise 1, 80.6% of the repetitions of exercise 2, 79.9% of the repetitions of exercise 3, and 78.8% of the repetitions of exercise 4.

Fig. 8

Results obtained for variant 1 (workout video). In graphs a and b, the blue bars represent the average score obtained for the corresponding item, and the black arrows represent the \(\pm \sigma\) range. Graph c shows, for each exercise, the percentage of repetitions that the human evaluator scored 4 or above, 3, or 2 or lower, and the percentage not done

Variant 2 (see Fig. 9) was also evaluated positively by the participants in terms of usability (see Fig. 9a), with an average score of \(86.041 \pm 8.289\), where question 1 obtained an average score of \(3.17 \pm 0.94\) and question 2 an average score of \(3.41 \pm 0.51\). On the FSS (see Fig. 9b) we verified the statistical significance of the results (\(p \approx 10^{-10}\)) and obtained an average flow level of \(5.375 \pm 0.420\). The human evaluator scored positively 99.9% of the repetitions of exercise 1, 62.5% of the repetitions of exercise 2, 84.8% of the repetitions of exercise 3, and 82.9% of the repetitions of exercise 4.

Fig. 9

Results obtained for variant 2 (avatar-guided session). In graphs a and b, the blue bars represent the average score obtained for the corresponding item, and the black arrows represent the \(\pm \sigma\) range. Graph c shows, for each exercise, the percentage of repetitions that the human evaluator scored 4 or above, 3, or 2 or lower, and the percentage not done

For variant 3 (see Fig. 10), participants also evaluated usability positively (see Fig. 10a), with an average score of \(87.708 \pm 6.524\), where question 1 obtained an average score of \(3.58 \pm 0.515\) and question 2 an average score of \(3.16 \pm 1.11\). On the FSS (see Fig. 10b) we once again verified the statistical significance of the results (\(p \approx 10^{-18}\)) and obtained an average flow level of \(5.658 \pm 0.281\). The human evaluator scored positively 98.5% of the repetitions of exercise 1, 73.3% of the repetitions of exercise 2, 83.4% of the repetitions of exercise 3, and 86.6% of the repetitions of exercise 4.

Fig. 10

Results obtained for variant 3 (full system, including the robotic exercise buddy). In graphs a and b, the blue bars represent the average score obtained for the corresponding item, and the black arrows represent the \(\pm \sigma\) range. Graph c shows, for each exercise, the percentage of repetitions that the human evaluator scored 4 or above, 3, or 2 or lower, and the percentage not done
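
The per-variant figures reported above can be collected for a side-by-side comparison. The sketch below transcribes the average flow levels and the percentages of positively scored repetitions from the text, ranks the variants by flow, and lists any exercise whose positive percentage falls below a chosen threshold. The helper names are hypothetical, and the default threshold of 70% is the level referred to in the discussion of hypothesis 4.

```python
# Average flow level per variant, transcribed from the reported results.
flow = {1: 5.475, 2: 5.375, 3: 5.658}

# Percentage of repetitions scored positively by the human evaluator,
# for exercises 1 to 4 of each variant, transcribed from the reported results.
positive_pct = {
    1: [100.0, 80.6, 79.9, 78.8],
    2: [99.9, 62.5, 84.8, 82.9],
    3: [98.5, 73.3, 83.4, 86.6],
}


def rank_by_flow(flow_means):
    """Return variant numbers ordered from highest to lowest average flow."""
    return sorted(flow_means, key=flow_means.get, reverse=True)


def below_threshold(pct_by_variant, threshold=70.0):
    """Return (variant, exercise, percentage) triples that fall below the threshold."""
    return [
        (variant, exercise, pct)
        for variant, pcts in sorted(pct_by_variant.items())
        for exercise, pct in enumerate(pcts, start=1)
        if pct < threshold
    ]
```

Calling `rank_by_flow(flow)` and `below_threshold(positive_pct)` gives the ordering and the outliers that the discussion in Sect. 6.4 relies on. Note that the flow means differ by less than one reported standard deviation, so the ranking alone says nothing about whether the differences between variants are significant.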

6.3.1 Human evaluator vs automatic evaluation

The automatic evaluation was able to detect 93.4% of the repetitions of exercise 1, 40.56% of the repetitions of exercise 2, 93.94% of the repetitions of exercise 3, and 84.58% of the repetitions of exercise 4. In relation to range of motion, the system classified as correct 81.6% of the repetitions of exercise 1, 95.7% of the repetitions of exercise 2, 83.3% of the repetitions of exercise 3, and 89.6% of the repetitions of exercise 4. The evaluation also detected an error of execution in 83.6% of the repetitions of exercise 3, associated with movement of the elbow.

6.4 Discussion

We now analyse the results presented above and verify whether they support the hypotheses raised in Sect. 6.2.

Hypothesis 1

We hypothesised that all variants of the system would be easy to use for elderly users. The SUS results support this hypothesis, with all variants obtaining an average SUS score above 80 on the 0–100 SUS scale.

Hypothesis 2

We hypothesised that participants would experience a higher flow state with the addition of motivation. The results show that variant 3 had the highest flow value (\(5.658 \pm 0.281\)), followed by variant 1 (\(5.475 \pm 0.486\)) and variant 2 (\(5.375 \pm 0.420\)). Flow was therefore greater in variant 3, where the robotic Exercise Buddy gives the motivation, but the motivation given by the Virtual Coach alone (variant 2) failed to elicit a higher flow than the variant without it (variant 1). The data thus support hypothesis 2 only partially.

Hypothesis 3

Results of the FSS support hypothesis 3, with variant 3 obtaining the highest flow value (\(5.658 \pm 0.281\)).

Hypothesis 4

The system was capable of correctly guiding the participants throughout the designed exercise routine. The participants were able to execute the correct exercises, with all but one exercise (exercise 2 in variant 2, at 62.5%) obtaining a positive percentage above 70%. Negative scores were given to some executions, but these were due mostly to physical problems (e.g. mobility problems, lack of strength) rather than to misinterpretation of which exercise to execute.

Hypothesis 5

In terms of evaluation ability, the system was only able to evaluate correctly some of the exercises performed. Exercises 1, 3, and 4 were all upper-body exercises, and for these the assessment done automatically by the Exercise Performance Evaluation and Feedback module was similar to the one provided manually by the human evaluator. The module was also able to detect the same error in the execution of exercise 3, related to erroneous elbow movement, that the human evaluator detected. Exercise 2 consisted of the user raising the legs above a certain height. Most participants had difficulty moving their legs, which resulted in a range of motion much smaller than the system design anticipated. As such, the module performed significantly worse than expected on this exercise.

7 Conclusions

This article proposed a multi-agent system designed to promote physical activity in elderly users. It addresses important barriers that are usually reported as detrimental to the adherence of the elderly to regular physical activity. One important barrier is self-efficacy, which affects elderly people more prominently than other age groups. The proposed system intervenes in this domain through effective coaching and companionship. This is attained through faithful demonstration by the virtual coach, companionship by the robotic exercise buddy, and automatic evaluation of the user’s performance followed by warning and motivational utterances. The results of the pilot study show that the system provides a usable training tool able to elicit a flow state in elderly users. By adapting the exercise execution speed so that the user can catch up with the virtual coach, it improves the user’s perceived control and self-efficacy, as shown by the answers to the FSS questions related to fluency of performance. The system also deals with other barriers that tend to hinder physical activity habits in the elderly. Because it is designed to be used at home, problems such as access to exercise facilities, transportation, parking, and weather conditions simply do not arise. The lack of an exercise partner, another important barrier, is also addressed by our approach through the exercise buddy. In fact, the FSS results show that including a robotic exercise buddy increased the flow state.

The system can still be improved, in particular through clearer vocal guidance, as several participants reported some confusion about the advice the system was giving. It also became clear that user profiles, e.g. with information about the user’s physical limitations, could be employed to focus the evaluation mechanism on the extent of the user’s capabilities, and therefore not to report unaccomplished movements for injured limbs or limbs with known limited mobility. This would help avoid frustrating users or harming their self-esteem.