Robotic Versus Human Coaches for Active Aging: An Automated Social Presence Perspective

This empirical study compares elderly people’s social perception of human versus robotic coaches in the context of an active and healthy aging program. In evaluating hedonic and utilitarian value perceptions of exergames (i.e., video games integrating physical activity), we consider elderly people’s judgments of the warmth and competence (i.e., social cognition) of their assigned coach (human vs. robot). The field experiments involve 58 elderly participants in a real-life context. Leveraging a mixed-method approach that combines quantitative and qualitative data, we show that (1) socially assistive robots activate feelings of (automated) social presence, (2) human coaches score higher on perceived warmth and competence relative to robotic coaches, and (3) social cognition affects elderly people’s experience (i.e., emotional and cognitive reactions and behavioral intentions) with respect to exergames. These findings can inform future developments and design of social robots and systems for their smoother inclusion into elderly people’s social networks. In particular, we recommend that socially assistive robots take complementary roles (e.g., motivational coach) and assist human caregivers in improving elderly people’s physical and psychosocial well-being.


Introduction
Many societies face the challenges of aging populations, at risk of reduced physical activity [54], whereas an active lifestyle has proven health-related benefits for aging adults [3]. When a sedentary lifestyle becomes routine though [5], falls among elderly people can develop into an alarming problem, often leading to hospitalization and reduced physical autonomy [77]. Evidence links decreased motivation for physical activity to advanced age [67], which implies the need for healthcare systems to develop effective solutions to ensure the physical wellbeing of elderly people. Exergames offer an innovative way for seniors to avoid a sedentary lifestyle and combat the degenerative effects of aging; their easy-to-follow steps and gamified nature motivate seniors to remain physically active through playful interactions. The application of exergames in healthcare settings also has yielded positive outcomes for both physical and cognitive well-being [9,76]. Recent studies of the use of exergames in rehabilitation [16,71] suggest comparable or slightly better results (e.g., balance, gait) than achieved with conventional fitness programs.
According to a study of the usability of exergame platforms [70], however, elderly people's cognitive deficiencies can hinder their engagement with exergames. We propose that a social agent can serve a supportive function, guiding and motivating elderly people through the gaming process. Professionals in care institutions do their best to keep elderly patients physically and mentally active, but the unprecedented relative growth of the elderly population is creating a gap between the supply of and demand for care services [42]. As an aid, gaming platforms can include virtual agents that serve in guidance and motivational roles, with an automated social presence. Still, the presence of a physical entity enhances interactions with an autonomous agent, compared with a virtual or real entity presented on screen [49,58,69]. These findings have motivated researchers to turn their attention to socially assistive robots (SARs), like Vizzy [51], MBOT [74], and GrowMu [57], which understand social cues through facial and voice recognition technology, are capable of engaging in quasi-social interactions, and can cause users to feel as if they are in the company of a social entity [14,72].
The extent to which current SARs are capable of helping people enjoy and perceive exergames as useful is a crucial research topic. The adoption of a user-centered design perspective, with continuous benchmarks and comparisons between SARs and care professionals, can support technological improvements that better suit professionals and elderly patients. In this study, we take the perspective of elderly users to address the following research question: How does an automated (i.e., SAR) versus human presence affect elderly people's experience with exergames to improve their physical activity? For this purpose, we use the Vizzy robot (Fig. 1) as a robotic coach and the Portable Exergames Platform for Elderly (PEPE) [70] as an exergame system. Vizzy is a semiautonomous robot, and PEPE is a fixed gaming platform. These systems were developed and designed specifically for use in elderly care institutions. With an experimental design, applied in five elderly care locations, we investigate users' experience with playing one of PEPE's exergames, in the company of a human or robotic coach (Vizzy), by gathering their responses to a questionnaire augmented with a set of probing questions. The resulting contributions and practical implications include:
• An empirical test of the new concept of automated social presence and how it influences exergame experiences for seniors.
• Quantitative and qualitative comparisons of the social perceptions elderly people form about human and robotic coaches.
• Evaluations of elderly people's experiences (i.e., emotional and cognitive reaction and behavioral intentions) with the exergame, depending on the company.
Fig. 1 Vizzy interacting with an elderly lady
In Sect. 2, we discuss prior end-user studies with SARs in motivational and conversational roles, which constitute the background for our conceptual model in Sect. 3. We present the hypotheses in Sect. 4. Next, we describe Vizzy, PEPE, the research setup, experimental design, and data collection procedures in Sect. 5. After we detail the results in Sect. 6, we discuss contributions and implications of our study in Sect. 7.

Social Robots for Older Adults
Applications of SARs in elderly care constitute a trending research topic. Several studies with older adults test their acceptability and possible range of applications. In Iwamura et al.'s work [41], seniors used two robots as shopping partners, and the findings suggest that the robots' social skills (e.g., conversation) improve people's intentions to use them. McColl and Nejat [50] present an exploratory study at an elderly care facility to investigate user engagement and compliance during mealtime interactions with a robot. Participants were engaged and compliant with the robot's instructions and also perceived the robot as enjoyable.
Other studies have investigated the evolution of seniors' perceptions of SARs over a longer period of time. In a study of which factors determine long-term user acceptance of social robots [34], the evidence reveals that hedonic factors gain the most attention, but utilitarian factors are a fundamental prerequisite of long-term interactions (i.e., the robot must have a clear purpose). A more recent contribution investigates older adults with dementia at residential aged care facilities over four years [46]. After each trial, the robot's services improved, reflecting staff and resident feedback, leading to statistically significant increases in engagement and robot acceptability.

Robotic Coaches
Robotic coaches are platforms that engage, monitor, and support physical exercise activities, through verbal and non-verbal communication, demonstration of exercises, and real-time corrective or motivational feedback. An early study [25] explored interaction enjoyment and perceived utility, with a comparison of two different approaches. The first relied on a relational robot that praised, addressed users by name, and expressed humor and empathy. The second used a non-relational robot that just provided scores and help as needed. Users regarded the relational robot as a better companion and exercise coach than the non-relational one. Robotic coaches thus need to provide motivational support to users during the exercise. In turn, users perceive relational robots as more intelligent and more helpful. Feasibility tests of a robot coach architecture with older adults also indicate promising acceptability results [24]. This architecture includes a rule-based decision process but also can learn users' preferences, estimate their affective state, and recall past interactions. Users' perceptions of their interaction with the robot thus improved, relative to their initial expectations. Schneider and Kümmert [66] also investigate the differences in motivation created by a robot that solely instructs users (instructor role) and one that exercises alongside the user (companion role); people appear more motivated by the robot companion role. Finally, in a robot coaching scenario [33], an elderly user study cites engagement and enjoyment while exercising but also reveals some difficulties, such as hearing instructions, focusing on the physical aspects of the robot while ignoring verbal instructions, and confusion associated with performing sequences of gestures.

Comparing Human and Robot Actors
Extant studies also compare the performances of human and robotic actors in a variety of contexts. To summarize these findings, we start with studies in which participants have no direct interaction with the human/robotic actor, then continue on to studies in laboratory settings, and finally end our analysis in this section with findings from real-world field experiments.
The first group of studies relies on online surveys that present participants with written [75] or video recorded [39] scenarios, followed by a set of questions designed to evaluate their perceptions of human versus robotic actors. The foci of the studies vary, from moral dilemmas to the communication ability of the human or robot agent. For example, one online study [75] reports that participants assign higher moral obligations to human than robotic actors, such that they believe it is morally wrong for a hypothetical human actor to choose to act and save four, while sacrificing one, coal miner, but they expect the robot to make this choice when faced with the same moral dilemma. Another study [39] compares human and robotic doctors' ability to inform patients about their health conditions, following a standardized script. Participants evaluate the robot as better at communicating bad news, though these researchers predict that human social presence might invert the results in a real-life scenario. Although the design of these experiments allows for replication and comparability between humans and robots, they have limited external validity, because they refer to specific, artificial scenarios, and participants have no contact with actors or no immersion in the scenario. Such a limitation is especially important in critical scenarios [39], such as when participants do not experience a real health diagnosis.
Another group of studies relies on laboratory-based methodologies that enable participant-agent interactions. These studies generally report no significant differences in the behaviors of participants in the presence of robotic versus human agents. For example, in a shared attention memorization task with a human or robot agent [47], the participant and agent point at images and look at them together. Later, participants reviewed another set of pictures and had to identify which they had seen before. Recognition performance was better when the participant initiated the pointing behavior than when the collaborator did so, but it did not matter if that collaborator was a human or a robot. In a turn-based variant of the Tower of Hanoi experiment, conducted with a human or robot collaborator [44], participants performed significantly better with a collaborator than when playing individually, but again, there were no significant differences between the human and robot collaborator conditions. Because the collaborator always made the best move, the lack of performance differences could be because the game was too simple. Although these studies suggest that participants benefit equally from collaborating with other humans or robots, one contribution [44] assigns an advantage to humans, in terms of perceived social presence and perceived credibility. These designs are replicable and controlled, and they rely on participant-collaborator interactions. However, the laboratory setting might limit internal and external validity.
Finally, some studies use realistic field experiments to compare human and robotic teachers [28,45]. For example, when teachers gave a tutorial on prime numbers, no significant differences emerged in the student evaluations of human or robotic teachers [45]. The robot in this study was fully autonomous but could not adapt its gestures to events in the classroom or engage in mutual gazes with students. In a similar study of programming principles [28], students between 14 and 17 years of age performed better with the robotic teacher, whereas those 10-13 years of age performed better with a human teacher. Younger students (6-9 years) showed no significant performance differences. Although they do not suffer the limitations of the previously mentioned studies, strong novelty effects are present in these studies, especially [28]. Beyond the novelty of interacting with a robot, the participants were visiting students, unfamiliar with the university environment.
This literature review reveals that prior research has focused on different domains of social interaction, including moral expectations, diagnosis communication, shared attention benefits, gaming performance effects with a collaborator, and teaching. These studies also provide valuable insights into the issues and methodologies that are required to achieve successful experiments involving human and robot actors. First, they reveal that people evaluate humans and robots differently, even if the robot is perceived to be intelligent [75]. Second, lab experiments allow for more control over the stimuli, but they might not be valid in real-life scenarios [39]. Third, some elements of these methodologies are standard and necessary for interaction studies, to increase their internal validity. The human actor must be trained to follow the same script and move similarly to the robot; the task should match the robot's skills to reduce possible biases. However, literature is still sparse in this domain, and to the best of our knowledge, no study compares older adults' perceptions of exergames when they are coached by either a human or a robot. The following section substantiates the importance and relevance of such a study.

Conceptual Background
Social interactions are not reserved for humans. Technologies mimic humans, in their appearance and behaviors (i.e., social robotics, [11]), and quasi-social interactions increasingly appear in diverse service settings, such as robotic waiters, bank tellers, and receptionists. These artificially intelligent social actors can engage in quasi-social interactions [6]. Robots are not social by nature but instead are programmed to act as conversational partners and socially interactive peers [32]. Yet, when they engage in these quasi-social interactions, humans tend to perceive robotic systems as social beings [36,60]. This phenomenon is closely linked to the concept of social presence, broadly defined as the sense of being with others (e.g., in virtual reality, [38]) or having access to others' minds (i.e., cognitive, affective, and intentional states, [7]). Social presence is a property of humans, not technologies [6], so current theories may benefit from using social presence as a lens to compare interactions with either human or non-human actors. The concept of automated social presence, as introduced in service literature, refers to "the extent to which machines (e.g., robots) make consumers feel that they are in the company of another social entity" ([72], p. 44). If robots can interact with people in human-like ways, such as by listening, conversing, or reading emotional cues, then people might automatically respond socially to them [60]. In particular, this study argues that people may activate their social cognition and judge the non-human social actor on the basis of warmth and competence dimensions, which are the universal dimensions of social cognition.

Social Cognition
According to Fiske et al. [31], when interacting with members of the same species, humans need to determine whether the other is friend or foe (i.e., warmth dimension), and whether it is capable of acting on good or ill intentions (i.e., competence dimension). Although robots are not members of the same species, their anthropomorphism [53] and increasing social dexterity make them resemble humans in many aspects. This study in particular focuses on SARs, which offer assistance through social interactions in a humanlike manner [26]. In a healthcare context, SARs are autonomous, understand social cues through facial and voice recognition, and can provide both child (e.g., autism therapy, [65]) and elderly (e.g., medication reminders, [13]) care. In an elderly care context, SARs offer services such as health monitoring and safety, encouragement to engage in rehabilitation or general health-promoting exercises, social mediation, interactions, and companionship [27]. By performing more socially engaging tasks, these robots take roles as companions, collaborators, partners, pets, or friends [21,32]. We build on the idea that humans judge a social interactor (even a non-human actor) on warmth (i.e., friendliness, kindness, caring) and competence (i.e., efficacy, skill, confidence) dimensions, which then determine their emotional, cognitive, and behavioral reactions. Thus we address recent calls for research on the effects of automated social presence on the perceived enjoyment and usefulness of service interactions and the acceptance of new technologies [36,72].

Hypotheses Development
We test our empirical model through field experiments with elderly people. The research aim is to investigate the differences between human and robotic companions when it comes to motivating elderly persons to engage in physical activity. We introduce hypotheses related to robotic coaches only (H1), comparisons of human and robotic coaches (H2 and H3), and user experiences with the exergame (H4 and H5). Figure 2 depicts the empirical model.
Using the argument that humans tend to treat non-human actors/systems as social beings [36,60] and the definition of automated social presence [72], we investigate whether SARs, which display human-like social behavior and communicative skills [10], evoke feelings of social presence in humans. Furthermore, this study extends previous research [30,43] by involving socially dexterous robots that are capa-
If people feel like they are in the company of a social entity, they likely evaluate robots using the mechanisms of social cognition (i.e., warmth and competence), similar to their evaluations of human companions [72]. We extend previous research [30] with a comparison of human and robotic coaches, with the prediction that
H2 There is a difference in the (a) perceived warmth and (b) perceived competence of human versus robotic coaches.
Moreover, in line with calls for further research [35,37], we explore the effects of automated social presence versus the presence of human coaches on users' emotional and cognitive reactions, which may influence their acceptance of the technologies [73]. Being in the company of a human or a robot may determine whether elderly participants find playing the exergame enjoyable and effective. We therefore postulate:
H3 There is a difference in the (a) hedonic and (b) utilitarian value perceptions of an exergame, depending on whether the coach is a human or a robot.
Finally, assuming users evaluate both human and robotic entities using universal social cognition dimensions, we further consider how differences in people's social cognition-based evaluations of human versus robot coaches affect their emotional and cognitive responses and, ultimately, their intentions to use the exergame. Therefore:
H4 (a) Perceived warmth and (b) perceived competence of the coach affect users' (i) hedonic and (ii) utilitarian value perceptions of the exergame experience.
H5 (a) Hedonic and (b) utilitarian value perceptions affect users' intentions to use the exergame.

Robot Description
The robotic platform in these experiments is the Vizzy robot [51], developed by the Institute for Systems and Robotics (ISR-Lisboa/ISR). Vizzy is a 1.3-m-tall, wheeled robot with a humanoid upper torso and a friendly, marsupial-like design (Fig. 3). It has a total of 30 mechanical degrees of freedom (DOF). The two DOFs of the differential drive base allow it to navigate planar surfaces easily and plan its trajectories using well-established algorithms [59,64]. The head can perform pan and tilt movements, and its eyes can do tilt, vergence, and version movements, for a total of five DOFs. Vizzy's arms and torso also have 23 DOFs, such that it can perform human-like motions. Two laser scanners at the bottom of the mobile base capture planar point-clouds of the environment, providing valuable information for the robot's safe navigation. Each of Vizzy's eyes contains a camera, used for object/people detection and tracking. A depth camera is mounted on Vizzy's chest, further enhancing the robot's sensing capabilities for human movement analysis. Other components include a loudspeaker and a microphone, useful for human-robot interactions, and 12 tactile sensors on each hand [55], mounted on the finger phalanges.
Fig. 3 The Vizzy robot
Vizzy's actuators and controllers allow it to navigate, grasp objects, perform gestures, and gaze in a biologically inspired way [1,63]. The robot also has software that enables it to detect and follow people in the environment, using an implementation of the Aggregate Channel Features Detector [22] with appearance-based tracking using color features [29]. A set of RVIZ plug-ins allows the remote control of the robot (head and base) in Wizard-of-Oz (WoZ) experiments, in a gaming-like way, while visualizing the obstacles gathered by the laser and the people detected by the cameras. A web interface (see Fig. 4) allows a "wizard" to select utterances for the robot to speak while listening through the robot's microphone. This approach supports faster replies and interactions, compared with explicitly typing utterances, though at the cost of speech flexibility.

Gaming Platform Description
To perform the exergames with seniors, we used the Portable Exergames Platform for Elderly (PEPE) [70], an augmented reality gaming platform that projects exergames on the floor, while a Kinect sensor captures the person's movements to control the game elements. Users can play through PEPE without requiring any wearable sensor or controller, which minimizes the burden and complexity for older adults. Two touchscreens on the top of the platform provide additional information to staff members monitoring the exercise. The games currently need to be initiated and terminated by a person, but in the future, the social robot will be able to control the gameflow. The chosen game for our experiments is Exer-Pong [16], which requires players to control a green paddle (Fig. 5a) with their body movements. The objective is to hit the yellow ball with the green paddle, making sure it does not leave the game area. Colored boxes populate the gaming area. When touched by the yellow ball, the colored boxes are destroyed and yield points to the player. After destroying all the boxes, the player completes the level. Every time the player fails to hit the yellow ball, and it leaves the playing area, a box reappears. During the game, audiovisual stimuli give performance information to the player, in the form of red visual feedback and success and failure sounds. The

Participants
In total, 58 elderly persons (42 female, 16 male) participated in the study; they ranged in age between 48 and 94 years (μ = 78.79, σ = 9.85). We excluded people suffering from severe physical (e.g., full physical immobility) or mental (e.g., dementia) health problems, as well as those incapable of giving their consent. The target population comprised elderly persons living autonomously (e.g., in their own home) or in a nursing home who accept services from a care institution. Thus, the main inclusion criterion was that they were clients of the elderly care centers with which we partnered, which offer a wide range of tailored activities (e.g., card games, fitness, handcrafts). We randomly divided the overall sample into two experimental groups: 22 participants in the human coach group (16 female, 6 male), aged between 65 and 94 years (μ = 75.46, σ = 12.79), and 36 participants in the robotic coach group (26 female, 10 male), aged between 48 and 91 years (μ = 80.83, σ = 6.98). The human coach experiments were carried out at two locations (Location A: 10, Location B: 12 elderly participants), and the robotic coach experiments were carried out at three locations (Location C: 11, Location D: 10, Location E: 15 elderly participants).
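As a quick consistency check on these descriptives, the overall mean age should equal the size-weighted average of the two group means. The following is a minimal Python sketch that uses only the figures reported above; the function name `pooled_mean` is illustrative and not part of the study.

```python
def pooled_mean(groups):
    """Return the size-weighted mean for a list of (n, mean) pairs."""
    total_n = sum(n for n, _ in groups)
    return sum(n * mean for n, mean in groups) / total_n


# (group size, mean age) for the human and robotic coach groups, as reported.
age_groups = [(22, 75.46), (36, 80.83)]

overall = pooled_mean(age_groups)
print(f"Overall mean age: {overall:.2f}")  # 78.79, in line with the reported value
```

The same weighting applies to the gender and location counts, which sum to 58 across the two groups.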

Experimental Setup
The data collection, in collaboration with five elderly care institutions, took place during July-September 2017. Each participating elderly care institution provided a spacious activity area, an adjoining room for surveys/interviews, and an in-house psychologist. The activity area was essential, because we needed to fit the gaming platform and secure enough space for the robot to navigate throughout the area. We requested an additional room to ensure each informant's privacy during the data collection. Finally, the in-house psychologist helped screen elderly clients, to evaluate their fit for the experiment (exclusion criteria: severe physical or mental health problems). The visits were scheduled during the regular activities time held by each care center.

Experimental Procedure
As a part of the study, the exergame platform was installed in the main activities room of each care institution. During the regular activities time, when people usually participate in various activities offered by the institution (e.g., drawing, knitting classes, yoga, card or board games), the seniors were approached either by a human or the robot, depending on their assigned experimental group, and invited to try a new activity that reportedly was being considered as an addition to the existing activity offers. Because all the activities took place in a main activities room, peers could observe the exergame activity, though some remained engaged in their own activities. To keep the experimental scenarios as constant as possible, we developed a standardized set of steps for each actor (i.e., human/robot) to follow: (1) Approach an elderly person who satisfies the inclusion criteria; (2) introduce the new activity (i.e., exergames) and invite the elderly person to join the game; (3) if the elderly person accepts, escort her or him to the gaming area; (4) provide instructions for how to play the game; (5) motivate the elderly person by words of encouragement and feedback on game progression; and (6) ask the elderly person whether to continue or terminate the game. The locations of the coach during Steps 4-6 were standardized, as depicted in Fig. 5.
Although the "wizards" directing the robots were present in the same room, they were described as technicians, available to resolve any problems with the exergame system. Furthermore, they were standing in a non-visible area, out of the field of view of the participant during the game. The communication was standardized; both human and robot actors used the same sentences to introduce the exergames, give instructions, and motivate players. Past studies have shown that gendered voices can influence perceived trust, likeability [48], and competence [18] if the agent is a computer with no gender-specific characteristics other than the voice. People tend to give higher ratings to agents whose voice gender matches their own, and they also apply gender stereotypes. Therefore, Vizzy had a female voice to match the gender of the caregivers, who were all women. On average, the time participants spent playing the exergame was 08:07 min, with a standard deviation of 02:32 min. The minimum playing time was 1:58 min, and the maximum was 12:25 min. After the game, a member of the research team escorted the elderly participants to the separate room to conduct the survey and interview. Prior to being surveyed, all elderly informants signed an informed consent form, outlining the main objectives of the research project and guaranteeing the anonymity of their responses. We consciously decided not to ask for informed consent before their interaction with Vizzy or the exergame platform, because we wanted to mimic a regular activity at the center, rather than create a research environment.

Data Collection
To test whether a robot (i.e., automated social presence) is comparable to humans in terms of motivating elderly people to engage in physical activity, we chose a mixed-method approach, combining quantitative and qualitative data. The main instrument was a questionnaire, augmented with probing questions to elicit more in-depth, qualitative insights.

Quantitative Data Collection
The 58 elderly participants were asked to allocate 10-15 min of their time and join the first author to discuss their overall experience with the activity. For that purpose, a questionnaire was constructed to guide the discussion. As their cognitive functioning declines, elderly people often struggle to comprehend questions and rating scales [68], so we decided to conduct guided surveys. The researcher read the question to the elderly person and explained the rating scale (e.g., totally disagree to totally agree), then asked the elderly person to indicate her or his response.
To measure the experience with playing exergames (i.e., emotional and cognitive reactions), we adopted two bipolar scales [17,52]. Specifically, we employed a five-item measure (α = .85) to assess their hedonic value perceptions (i.e., users' enjoyment of the physical activity of using the exergame platform). We assessed utilitarian value perceptions (i.e., users' perceptions of the effectiveness of the physical activity of using the exergame platform) with a three-item measure (α = .85). The responses were on seven-point bipolar scales. To add a component of behavioral intentions, we also measured elderly participants' future intentions to play exergames with a two-item measure (α = .91). Furthermore, we gathered the elderly participants' evaluations of whether the human/robotic coach was good- or ill-intended (i.e., warmth) and competent to perform the task. In line with prior conceptualizations of social cognition [20], we adopted a three-item measure for perceived warmth (α = .76) and a three-item measure for perceived competence (α = .88). Participants responded to all items on five-point Likert scales ("totally disagree" = 1 to "totally agree" = 5).
Finally, to evaluate the experience of sensing a social entity when interacting with the robot, we developed a four-item measure (α = .7) of automated social presence [2,37]. Again, all items were measured on five-point Likert scales. Tables 1 and 2 contain the constructs and items used in the questionnaire.
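The α values reported for each measure are reliability coefficients, presumably Cronbach's alpha, which for k items is k/(k − 1) × (1 − Σ item variances / variance of the summed score). The sketch below shows how such a coefficient could be computed; the response data are hypothetical and are not the study's data, and the function name is an assumption made for illustration.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of items.

    `items` holds one list of responses per item; all lists must have the
    same length (one entry per respondent).
    """
    k = len(items)
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Summed score per respondent across all items.
    totals = [sum(responses) for responses in zip(*items)]
    sum_item_variances = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - sum_item_variances / variance(totals))


# Hypothetical five-point Likert responses from six respondents to a
# three-item scale (illustrative only).
demo_items = [
    [4, 5, 3, 4, 2, 5],
    [4, 4, 3, 5, 2, 5],
    [5, 5, 2, 4, 3, 4],
]
print(f"alpha = {cronbach_alpha(demo_items):.2f}")
```

Values closer to 1 indicate that the items vary together, which is why a multi-item measure is usually summarized by a single mean score only when α is acceptably high.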

Qualitative Data Collection
We chose to employ a convergent parallel mixed methods design [19], in which we collected both quantitative and qualitative data at the same time, with the goal of determining whether the findings associated with both types of data support each other. During the survey process, the researcher probed particular items as appropriate, using either "tell me more" probes [4] or a laddering technique [61]. For example, after elderly participants indicated their enjoyment level (from "I hated it" = 1 to "I enjoyed it" = 7), the researcher would ask what in particular they found enjoyable or not. After they indicated their intentions to use the exergame platform again (scale from 1 to 5), the researcher would ask why and thereby potentially uncover the core values that guide their behavioral intentions.
All collected narratives were digitally audio-recorded and then transcribed, translated, and reviewed. The analysis proceeded in the following order: First, the authors all read the transcripts independently to form their own understanding of each participant's narratives [62]. Second, four co-authors met for a joint analysis session to share emerging codes and develop a more focused coding scheme. Third, the first two authors coded the remaining narratives according to the coding scheme. Due to the exploratory nature of the study, codes and thematic descriptions emerged from the data rather than from preconceived categories or theories [40].

Automated Social Presence
To investigate H1, we ran a one-sample t-test to determine whether the mean value of automated social presence is statistically different from a neutral value, that is, the midpoint (3) on the 5-point Likert scale. In support of H1, the mean score of 3.5 (SD = .97) is significantly different from 3, t(35) = 2.97, p = .005 (Fig. 6). That is, elderly people experience some sort of automated social presence when interacting with the robot. When we complement these quantitative findings with the qualitative data, we identify the emergence of a thematic paradox that we label "We know it's a machine, but it feels like a human." From the narratives, we recognize that the elderly participants experience some conflicting thoughts, as captured in the following quotes:
The comparison of quantitative and qualitative data thus reveals similarities and support for H1.
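The one-sample test against the scale midpoint can be approximated from the reported summary statistics alone. The sketch below assumes n = 36, inferred from the reported 35 degrees of freedom; the helper name `one_sample_t` is hypothetical. Because the published mean and SD are rounded, the statistic it yields is close to, but not identical with, the reported t value.

```python
import math

def one_sample_t(mean, sd, n, mu0):
    """t statistic and degrees of freedom for H0: population mean == mu0."""
    standard_error = sd / math.sqrt(n)
    return (mean - mu0) / standard_error, n - 1

# Rounded summary statistics from the text; n = 36 is inferred from df = 35.
t_stat, df = one_sample_t(mean=3.5, sd=0.97, n=36, mu0=3.0)
print(f"t({df}) = {t_stat:.2f}")
```

A p-value would then follow from the t distribution with those degrees of freedom, for instance through a statistics library.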

Social Cognition
To test H2, we ran an independent-samples t-test to determine whether the means of the two groups (human vs. robot) differ statistically. We find a significant difference in the mean scores for perceived warmth when the coach is a human (M = 4.82, SD = .37) or a robot (M = 4.45, SD = .73), t(54.40) = 2.517, p < .05. Similar results emerge for the perceived competence of the coach, as a human (M = 4.84, SD = .47) or robot (M = 4.50, SD = .67), t(54.88) = 2.431, p < .05 (Fig. 7).
These results support H2: the difference for both perceived warmth and perceived competence is statistically significant. Elderly people perceive the robot to be friendly, well-intentioned, and understanding of their needs, but to a lesser extent than the human coach. Perceiving the robot as high on the warmth dimension is in itself an important finding, which we also can affirm with the qualitative data. In coding the narratives, we identified another, broader theme, which we label "It's a machine, but it's kind and gentle." As emphasized by the participants:
Similarly, the elderly participants perceive the robot as competent, reliable, and a knowledgeable expert for the par-
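The fractional degrees of freedom reported for these comparisons are characteristic of an unequal-variance (Welch) t-test. The sketch below computes the Welch statistic and the Welch-Satterthwaite degrees of freedom from summary statistics. The group sizes are assumptions, not values stated in this section: 36 for the robot condition (inferred from the 35 degrees of freedom of the H1 test) and 22 for the human condition (the remainder of the 58 participants). The function name `welch_t` is hypothetical.

```python
import math

def welch_t(m1, s1, n1, m2, s2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2   # squared standard errors per group
    t = (m1 - m2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Perceived warmth, human vs. robot; group sizes 22 and 36 are ASSUMED.
t_warmth, df_warmth = welch_t(4.82, 0.37, 22, 4.45, 0.73, 36)
print(f"t({df_warmth:.2f}) = {t_warmth:.3f}")
```

With rounded inputs and assumed group sizes, the output should land near the reported statistic rather than reproduce it exactly. The same helper applies to the other human-versus-robot comparisons.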

Hedonic and Utilitarian Value
To investigate H3, we ran another independent-samples t-test, to determine whether the different types of coaches (human vs. robot) affect elderly participants' value perceptions regarding their enjoyment and the effectiveness of exercising using the exergame platform. As hypothesized, we find a significant difference in the mean scores, for both hedonic and utilitarian value perceptions, across the experimental groups. In particular, the mean for hedonic value when the coach is a human is 6.83 (SD = .33), whereas when it is a robot, the mean is 6.50 (SD = .73), t(52.67) = 2.313, p < .05. For utilitarian value, the means again are higher for a human coach (M = 6.40, SD = .78) compared with a robotic one (M = 5.56, SD = 1.31), t(55.99) = 3.051, p < .05 (Fig. 8).
These results support H3 but do not explain the source of these differences. When we integrate the qualitative data, we realize that the elderly people had fun and enjoyed the experience with the exergames while in the company of a robot:
Yet they enjoyed the experience slightly less than they did with a human coach, as indicated by their narratives:
Similarly, when asked to share their narratives of whether they found the exergames effective (i.e., utilitarian value perceptions) for improving their physical condition, some participants shared concerns:
Robot's explanations were not enough! I was there doing something that I didn't know if it was right or wrong. And if I really was… as the robot was saying "Excellent", but those are the words it has inside and we cannot trust that very much. (Female, 81)
Let's see… if the robot only does that exercise, that isn't much more than moving left and right, I think it's insufficient. So it should have more, more advantages… (Female, 81)

Regressions
Table 3 contains an overview of the descriptive statistics and correlations. To test H4 and H5 (Fig. 2), we estimate a system of five linear equations (see Table 4 for an overview of the dependent and independent variables in each equation), using the seemingly unrelated regressions (SUR) procedure in STATA. The SUR method is based on generalized least squares and is more efficient than ordinary least squares regression when the independent variables differ across a system of equations, because it accounts for correlated errors [78]. The significant Breusch-Pagan-Lagrange multiplier test for error independence (χ²(10) = 28.306, p < .01) indicates correlated errors in the five equations, and the R-square for each individual equation is statistically significant at p < .01 (see Table 5). Therefore, using SUR is appropriate [15,23]. The first two equations have perceived warmth and perceived competence as dependent variables and the type of coach as the independent variable. These two equations control for the influence of the coach on social cognition. The third equation includes hedonic value as the dependent variable and perceived warmth and competence as independent variables; the fourth equation features utilitarian value as the dependent variable and perceived warmth and competence as independent variables. These two equations test H4. Finally, the fifth equation includes intention to use as the dependent variable and hedonic and utilitarian value as the independent variables, to test H5.
The unstandardized coefficients and their standard errors, obtained with the SUR estimator (sureg command) in STATA 14.2, are in Table 5. The perceived warmth of the coach has a positive, significant coefficient for hedonic value (.29; p < .05), in support of H4a.i; its impact on utilitarian value does not produce a significant coefficient though (.23; p > .05), so we cannot confirm H4a.ii. The perceived competence of the coach has a positive, significant coefficient for both hedonic (.46; p < .01) and utilitarian (1.08; p < .01) value, in support of both elements of H4b. Finally, hedonic value has a positive, significant coefficient for intention to use (.52; p < .01), as does utilitarian value (.22; p < .05). Therefore, we find support for H5a and H5b.
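For reference, the five-equation system described in this paragraph can be written out as follows. This is a sketch based only on the verbal description: the symbols α, β, ε, and σ are notation introduced here, the coding of the coach dummy (e.g., 1 = human, 0 = robot) is an assumption, and any control variables listed in Table 4 are omitted.

```latex
\begin{aligned}
\text{Warmth}_i      &= \alpha_1 + \beta_{1}\,\text{Coach}_i + \varepsilon_{1i} \\
\text{Competence}_i  &= \alpha_2 + \beta_{2}\,\text{Coach}_i + \varepsilon_{2i} \\
\text{Hedonic}_i     &= \alpha_3 + \beta_{31}\,\text{Warmth}_i + \beta_{32}\,\text{Competence}_i + \varepsilon_{3i} \\
\text{Utilitarian}_i &= \alpha_4 + \beta_{41}\,\text{Warmth}_i + \beta_{42}\,\text{Competence}_i + \varepsilon_{4i} \\
\text{Intention}_i   &= \alpha_5 + \beta_{51}\,\text{Hedonic}_i + \beta_{52}\,\text{Utilitarian}_i + \varepsilon_{5i}
\end{aligned}
\qquad
\operatorname{Cov}(\varepsilon_{ji}, \varepsilon_{ki}) = \sigma_{jk}
```

Under this notation, H4 concerns the coefficients in the third and fourth equations and H5 those in the fifth, while the nonzero cross-equation covariances σ_jk are what the SUR estimator exploits.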
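To illustrate the estimation logic, the following is a minimal NumPy sketch of a two-step feasible GLS estimator for a SUR system, together with the Breusch-Pagan LM statistic for cross-equation error independence. It is not the STATA sureg implementation and uses synthetic data rather than the study's data; the function names `sur_fgls` and `breusch_pagan_lm`, the divisor n for the residual covariance, and the simulated variables are all assumptions made for the example.

```python
import numpy as np

def sur_fgls(ys, Xs):
    """Two-step feasible GLS for a SUR system.

    ys: list of m response vectors (length n each).
    Xs: list of m design matrices (n x k_i), intercept column included.
    Returns stacked coefficients, standard errors, and the m x m residual
    covariance estimated from equation-by-equation OLS.
    """
    m, n = len(ys), len(ys[0])
    # Step 1: OLS per equation to obtain residuals.
    resid = np.column_stack([
        y - X @ np.linalg.lstsq(X, y, rcond=None)[0] for y, X in zip(ys, Xs)
    ])
    sigma = resid.T @ resid / n  # divisor n is one common convention
    # Step 2: GLS on the stacked system, weighted by inv(sigma) kron I_n.
    ks = [X.shape[1] for X in Xs]
    X_big = np.zeros((m * n, sum(ks)))
    col = 0
    for i, X in enumerate(Xs):
        X_big[i * n:(i + 1) * n, col:col + ks[i]] = X
        col += ks[i]
    y_big = np.concatenate(ys)
    W = np.kron(np.linalg.inv(sigma), np.eye(n))
    A = X_big.T @ W @ X_big
    beta = np.linalg.solve(A, X_big.T @ W @ y_big)
    se = np.sqrt(np.diag(np.linalg.inv(A)))
    return beta, se, sigma

def breusch_pagan_lm(sigma, n):
    """LM statistic (n times the sum of squared residual correlations) and df."""
    d = np.sqrt(np.diag(sigma))
    corr = sigma / np.outer(d, d)
    m = sigma.shape[0]
    lm = n * sum(corr[i, j] ** 2 for i in range(m) for j in range(i))
    return lm, m * (m - 1) // 2

# Synthetic two-equation example (NOT the study's data).
rng = np.random.default_rng(0)
n = 58
x1, x2 = rng.normal(size=n), rng.normal(size=n)
shared = rng.normal(size=n)  # common shock that correlates the two errors
y1 = 1.0 + 0.5 * x1 + shared + 0.3 * rng.normal(size=n)
y2 = 2.0 - 0.8 * x2 + shared + 0.3 * rng.normal(size=n)
X1 = np.column_stack([np.ones(n), x1])
X2 = np.column_stack([np.ones(n), x2])
beta, se, sigma = sur_fgls([y1, y2], [X1, X2])
lm, df = breusch_pagan_lm(sigma, n)
print(beta.round(2), se.round(2), round(float(lm), 2), df)
```

For a system of five equations, the LM statistic has 5 × 4 / 2 = 10 degrees of freedom, which matches the χ²(10) reported for the test of error independence. When every equation shares the same regressors, this estimator coincides with equation-by-equation OLS, so the efficiency gain arises only when the regressor sets differ.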

Discussion
This research evaluates the application of robots (i.e., SARs) in social contexts, formerly exclusively reserved for human agents. Motivated by the importance of physical activity for healthy aging, and the shortages of elderly care staff [54], our aim is to examine if SARs can take a coaching role and motivate elderly people to undertake physical activity, as well as potentially expand their social networks. In particular, we test whether SARs, which are not inherently social entities but are programmed to exhibit social behaviors, can give rise to social perceptions in elderly people and affect them in ways similar to those induced by human agents [36]. With the idea that SARs equipped with social capabilities are comparable to human agents, we also seek to investigate their efficiency in motivating elderly people to engage in a new kind of physical activity (i.e., exergames), relative to the impact of human coaches.

Automated Social Presence and Social Cognition
Our findings suggest that elderly people sense that they are in the company of social entities when they interact with SARs [72]. Our mixed-method empirical tests affirm that they struggle to classify the robot as a "machine," because of its expressiveness, friendly voice, politeness, and considerateness. In their narratives, we detect evidence that they see the robot as a "metal box," but due to the robot's social dexterity, they humanize the machine in their imagination. This evidence implies that SARs potentially could extend elderly people's social networks and support human caregivers in not just mechanical tasks but also tasks of a social nature (e.g., motivational coaches, playing games, conversational partners). In these quasi-social interactions, in which elderly people perceive SARs as social entities, they also tend to evaluate SARs using the social cognition mechanisms they apply to humans. Our experiments shed light on social perceptions of automated agents [60]. Elderly people activate their warmth and competence judgments when interacting with SARs. However, when we compare human and robot actors, performing the same (coaching) task, we find that they evaluate humans as superior to the robot. We explain this slight difference with qualitative data, which reveal that the human coaches are better at responding to elderly persons' individual needs. Even though they find Vizzy cute and funny, some elderly participants complained that they could barely hear what it was saying. Unlike Vizzy, for which the voice pace and pitch remained the same, and the volume could not be adjusted (i.e., it was already set at its maximum), human coaches could adjust their voices to the needs of hearing-impaired participants. This difference likely affected both warmth and competence judgments.

Hedonic and Utilitarian Values
By conducting these experiments in a real-life context, we could test whether elderly people enjoy and appreciate this new type of gamified workout, depending on their coaching partner (human vs. robot). We demonstrate that seniors' hedonic and utilitarian value perceptions of exergames are driven by the social cognition they develop of their coaches. Elderly participants both enjoy and find the exergame more effective if they are instructed and motivated by a human, rather than a robot. The reason for this difference might be the slightly higher levels of anxiety that we find expressed in the narratives, due to interacting with the robot. For many of the elderly participants, this experience was the first time they had seen or engaged with an automated agent. Some noted their nervousness when the robot first approached them ("Initially, I wasn't very relaxed, but I ended up being more relaxed as we continued," Male, 86). Some seniors also lacked the information they needed to enjoy playing exergames ("I did not quite understand the purpose of the game, because I could not hear well what the robot was saying," Female, 87). These quotes provide insights into the participants' psychological states and physical (e.g., hearing) impairments, which hinder their full enjoyment of the exergames. These important inputs for technology developers and service designers can help them design robotic platforms to ensure the best experience for senior users.
Our analysis also demonstrates that perceived competence has a positive impact on both hedonic and utilitarian value perceptions. The effect of competence on hedonic value is even stronger than that of perceived warmth. We explain the effect by taking the exercise context into account. That is, the activity the participants were evaluating involved physical exercise, so they were particularly interested in the reliability and ability of the coach to react flexibly. Thus, it makes sense that competence judgments exhibit primacy over warmth judgments. Furthermore, for those with the robotic coach, the participants were being exposed to a new and unfamiliar agent, which increased their need to trust in its abilities before being able to enjoy the activity. We argue that human coaches might seem to perform better because they are able, if necessary, to demonstrate the movement. That is, some seniors had difficulty understanding the requirements of the exergame. In such cases, the robot could only repeat the instructions, in the hope that the participant would understand what was expected; human coaches could augment their voice instructions with a movement demonstration, which likely improved the evaluations of their competence. Despite the robots' affective capabilities (e.g., kindness, care, emotional sensitivity), which enhance warmth judgments, their lack of competence might hinder the adoption of robotic coaches.

Contributions and Implications
With these findings, we primarily contribute to the theory of social presence and explain cognitive mechanisms activated by interacting with artificial minds [8]. We advance current knowledge on the ways SARs, through their social resemblance to humans, allow people interacting with them to feel as if they are in the presence of social entities [72]. In turn, by demonstrating that elderly people sense a social presence when interacting with SARs and develop social perceptions about them, we offer insights for further developments and designs of social robotics and systems. Including robots in people's social networks, as exercise coaches, could improve both physical and psychosocial well-being. However, in contexts that require high social engagement, humans still perform better than SARs. Social robots need continued improvements, in terms of their empathy, flexibility, and spontaneity, to adjust their behavior to the heterogeneity of human needs.
Still, SARs can be used to enhance elderly people's experience and assist human caregivers in certain tasks. Assisted living facility managers should make well-considered decisions about which tasks to automate and which to leave in human hands. Reflecting on our coaching tasks, SARs can be used to motivate seniors to be more active, but human caregivers should make sure the elderly users perform the exercises correctly and assist them if necessary.

Limitations and Further Research
This study provides an interesting and necessary perspective on ways in which SARs can support human caregivers in motivating elderly people to engage in physical activity, but it also contains some limitations. First, none of the elderly care institutions we collaborated with houses as many physically active clients as would be ideal for our experiments. Therefore, we conducted the study across several locations. Second, the participating care institutions, though located in the same region in the same country, feature heterogeneous elderly populations and varying conditions for the field experiments. Additional research might find other ways to expand the sample but also to minimize the differences across elderly populations and care institutions.
Third, the short-term nature of the study (i.e., single session) introduces a novelty effect, for both the robot and exergames. To rule out the effect of novelty, a longitudinal study might measure the same constructs in different phases of the experiment. Researchers then could track not only behavioral intentions but also actual behavior (i.e., whether elderly people keep playing the exergames in the future). Fourth, the robot's speech abilities were controlled with a WoZ method, which limits the spontaneity of dialog. Continued research could provide for more flexibility, moving beyond preset utterances to allow for more natural social interactions (e.g., typing responses that exactly address the questions/statements raised by the elderly participants), at least until we reach the point that the human granting agency to the robot can be eliminated and the robot can interact completely autonomously. Although the volume of the robot's speech was already at its maximum, it seemed insufficient for some elderly participants. Robot developers thus should find ways to enhance robots' volume and produce sounds on a range of frequencies that older adults can hear more easily [56].
Fifth, in addition to speech, we controlled the robot's navigation and gaze, so our experiment does not involve fully autonomous robot behavior. Efforts to endow the robot with autonomous navigation, perception, speech, and decision making could reduce the degree of human intervention needed to ensure appropriate functioning. However, this level of autonomous behavior is rare and difficult to achieve, considering the currently limited capabilities of social robots [12]. Sixth, another limitation is the robot's inability to mimic humans' nonverbal communication, such as facial expressions and gestures. It would be interesting to investigate how robots endowed with nonverbal capabilities, in addition to verbal ones, affect users' perceptions of their warmth and competence. We expect that facial expressiveness (e.g., smile, eye contact) influences warmth judgments, and gestures (e.g., use of the limbs to demonstrate the exercise) might affect competence judgments.

Fig. 2 Empirical model. Notes: H1 is tested with one-sample t-tests; H2 and H3 are tested with independent-samples t-tests; H4 and H5 are tested with seemingly unrelated regressions (SUR)

Fig. 4 Dialogue control GUI. a Default view with several buttons grouping verbal intentions. b One of the buttons pressed, showing several options

Fig. 5 Experimental conditions. a Scenario 1, human coaches elderly user in playing the exergame using PEPE. b Scenario 2, Vizzy performs the coaching role

3: I think the human/robotic coach is an expert/knowledgeable
Automated social presence (1-5 Likert scale, α = .70)
1: I can imagine the robot to be a living creature
2: When interacting with the robot I felt I'm talking with a real person
3: Sometimes the robot seems to have real feelings
4: I felt like the robot was actually looking at me throughout the interaction

6 Results
It's almost the same thing [as talking with a real person]. (Female, 80)

Fig. 7 Perceptions of warmth and competence

It felt like talking to an adult person. It's like the grownups. (Male, 71)
The robot spoke very politely to us [laughs]. He is really friendly. (Female, 86)
The robot seems gentle to me. (Female, 64)
If he has revealed all of his intentions… they are good. (Male, 72)
I was very loyal to the robot and it became my friend. (Male, 90)

I was a bit shy and anxious, right? Without knowing what to say, or what to do… the movements I should do. (Female, 89)
I found it funny [enjoyable]. But I almost can't hear, so I couldn't understand most things that the robot said. (Female, 91)

Table 1
Items related to the exergame
Hedonic value (α = .85)

Table 2
Items related to the human/robotic coach
Perceived warmth (1-5 Likert scale, α = .76)
1: I feel the human/robotic coach understands me
2: I think the human/robotic coach is well-intentioned
3: I think the human/robotic coach is friendly
Perceived competence (1-5 Likert scale, α = .88)
1: I think the human/robotic coach is competent
2: I think the human/robotic coach is reliable

Table 4
Overview of Seemingly Unrelated Regressions (SUR) Equations