1 Introduction

In recent years, a growing number of HRI research projects have concerned themselves with eldercare (Mynatt et al. 2000; Pollack 2005; Taggart et al. 2005). Indeed, the future of eldercare could be one of elders living independently for longer, supported by technology. Robotics could be an essential part of this, not least because robots and screen agents with social abilities could function both as assistive technology and as social company (Forlizzi 2005). But will elders be willing to accept all this assistive technology, especially when it concerns interactive systems that could be perceived as autonomous and intelligent, such as robots and screen agents (Forlizzi et al. 2004)? These systems differ from other technologies because they are not always perceived merely as technology: a robot or screen agent can be (partly) perceived as a social actor, and interaction with it may follow the same principles as inter-human communication rather than those of human–machine interaction. If so, this should show in the behavior of people interacting with robots or screen agents (Bartneck and Forlizzi 2005).

Recent research with robots in an eldercare environment shows that elders can feel positive about robots (Kidd et al. 2006) and that robots can have a comforting effect comparable to that of pets (Beck and Katcher 1996; Beck et al. 2003; Pineau et al. 2003; Wada et al. 2003; Wada and Shibata 2006). Experiments focusing on the effects of social behavior of robots and screen agents show that a more social or more caring condition has an effect comparable to that of humans behaving in a more social or more caring manner (Heylen et al. 2002; Bickmore and Picard 2004; Heerink et al. 2006a).

The research presented here is part of a project on developing a methodology for predicting and explaining the acceptance of robots and screen agents by elderly users. It aims to enable researchers to identify the different factors that influence acceptance of robots and screen agents after a 3 min test. Our main instrument is a questionnaire with a rating scale, filled out by the subjects after their test session. Besides this, we want to explore the possibilities of user observation and develop an instrument that can be used in addition to our questionnaire. Since our questionnaire is designed to yield quantitative data, we are specifically interested in processing data on behavior that can be related to acceptance of a robot or screen agent as a conversational partner.

Earlier publications on our research (Heerink et al. 2006a, b, 2008) reported on the results of experiments with the robot in eldercare institutions, which included conversational behavior analysis, but we did not relate this analysis to social presence or acceptance. In this paper, we present and discuss data from a new experiment with a robot, focusing on the development of a more thorough instrument for the analysis of conversational behavior data. The results of this analysis, obtained by observing video recordings of the 3 min sessions, will be linked to questionnaire results to explore the relationship between user behavior and both social presence and robot acceptance.

After a short review of related research, we describe the setup and instruments used; next, we present and interpret the results.

2 Robots in eldercare

Several projects have addressed the response of elderly users toward different types of robots that could serve different purposes, varying from just being good company to physical support and giving advice. An example of a pet-like robot with no other functionality than being good company is Paro. Since 2002, a number of experiments with this seal-shaped robot have been carried out (Shibata et al. 2003; Wada et al. 2003; Wada and Shibata 2006). In early studies, it was placed in a group of elders who could interact with it, mainly by caressing and talking to it. The aim of these studies was to observe the use of a robot in a setting described as ‘robot assisted activity’ and to show that elders felt more positive after a few sessions. This was done by measuring the moods of the participants, both with a face scale form and with the Profile of Mood States (POMS) questionnaire. More recently, research with Paro has focused on collecting physiological data on elders who have been exposed to the robot, to measure its effect on their well-being.

An example of a robot with more functionalities that was the subject of experiments in an eldercare institution is Pearl (Montemerlo et al. 2002; Pollack et al. 2002; Pineau et al. 2003). This robot was used in open-ended interactions, delivering candies and guiding elders through the building to the physiotherapy department. The experiments with both Paro and Pearl registered a high level of positive excitement on the side of the elders, suggesting that a robot would be accepted. In the case of Paro, it would mainly be beneficial as a pet (a study by Libin and Cohen-Mansfield (2006) shows that a robotic pet is preferred over a plush toy cat), and in the case of Pearl, it would be used as an actual assistant.

A robot with advanced assistive functionalities to be applied in eldercare is the German Care-o-bot (Graf et al. 2004; Parlitz et al. 2007). It is intended to provide assistance in many ways, varying from being a walking aid to functioning as a butler.

Other projects focus on an assistive environment rather than on the development of a specific robot. An example of this is the Italian RoboCare project (Cesta and Pecora 2005, 2006), in which a robot is an interface to a smart home for older adults.

Research concerning experiments with screen agents for elders is reported by Bickmore and Picard (Bickmore and Picard 2004, 2005; Bickmore et al. 2005a, b). The study focuses on the acceptance of a relational agent (a screen agent that simulates a personal interest in the user) appearing on a computer screen and functioning as a health advisor for older adults. Findings (scores on questions related to affection, trust, and acceptance) indicate that the agent was accepted by the participants as a conversational partner on health and health behavior issues and was rated highly on trust and friendliness. It was also found to be successful as a health advisor. Other research with the same agent (Bickmore and Picard 2005) focuses on its ability to function in long-term relationships, in which social abilities also appear essential. This is linked to the notion of social presence (Lombard and Ditton 1997; Lee and Nass 2003) that people feel in interaction with systems, which can play a role in interpreting the responses of participants when they apparently perceive social abilities.

We can divide research on robot and agent acceptance into two areas: acceptance of the robot in terms of usefulness and ease of use (functional acceptance) and acceptance of the robot as a conversational partner with which a human-like or pet-like relationship is possible (social acceptance). The experiments with Paro can be seen as a good example of research focused on social acceptance, while the experiments with Pearl focused more on the acceptance of the robot regarding its functionalities. Since we consider behavior in general to be an indication of acceptance, it is appropriate to state that we are researching the social side of acceptance. For our approach, this means that we take interaction with a robot to be interaction with a social entity, and the extent to which users perceive a robot as such can influence their acceptance.

3 Social presence and conversational expressiveness

Since it is not unusual for humans to treat systems and devices as social beings (Reeves and Nass 1996), it seems likely that humans treat embodied agents as such. The extent to which they do so seems to be related to a factor that is often referred to as either ‘presence’ or, more specifically, ‘social presence’. Many research projects that are related to ours incorporate this concept (DiSalvo et al. 2002; Lee and Nass 2003; Bickmore and Schulman 2006).

The term presence originally refers to two different phenomena. First, it relates to the feeling of really being present in a virtual environment and can be defined as ‘the sense of being there’ (Witmer and Singer 1998). Second, it can relate to the feeling of being in the company of a social entity: ‘the perceptual illusion of non-mediation’ (Lombard and Ditton 1997). In our context, the second definition is the relevant one.

In an earlier study, we found a crucial role for social presence in the process of functional and conversational acceptance of embodied agent technology (Heerink et al. 2008). Therefore, we intend to incorporate a measure of social presence when measuring the acceptance of socially assistive robots and screen agents.

The experience of the presence of a social entity usually shows in a higher rate and intensity of the expressions a speaker uses (Wagner and Smith 1991; Lee and Wagner 2002). It demonstrates the amount of conversational engagement one feels (Nakano and Nishida 2005). We call this conversational expressiveness: the amount and intensity of facial expressions and gestures when engaged in a conversation. We hypothesize that, for our user group too, a higher score on the construct of social presence will correlate with a higher score on conversational expressiveness. Since in earlier research (Heerink et al. 2008) we found a higher score on social presence to correlate with a higher score on acceptance (as indicated by the expressed intention to use the system), we expect conversational expressiveness to correlate with intention to use as well.

4 Robot acceptance

Defining user acceptance as “the demonstrable willingness within a user group to employ technology for the tasks it is designed to support” (Dillon 2001) brings the need to develop evaluation methodologies. Specifically for robots and screen agents, several methods have been used, varying from applying heuristics (Clarkson and Arkin 2007) or other usability-type tests (Yanco et al. 2004), classifying tests (Riek and Robinson 2008), and role-based evaluation (Scholtz 2004) to measuring physical responses (Dautenhahn and Werry 2002). Also, technology acceptance modeling has been used (de Ruyter et al. 2005): a methodology that not only provides insight into the probability of acceptance of a specific technology, but also into the influences underlying acceptance tendencies.

However, technology acceptance models have not been developed for systems that can be perceived as a social entity, such as robots or screen agents, nor specifically for elderly users. Influences that are known to be important in the acceptance of a social entity have never been incorporated into any technology acceptance model, and neither have influences that are known to be important for elderly users.

Therefore, in our study, we researched the possibilities of using an acceptance model for quantitative research on acceptance of robots and screen agents by elderly users. We aimed to include specific influences representing social acceptance and the specific demands of elderly users.

Traditionally, the sole instrument used in technology acceptance methodology is a questionnaire with items rated on a Likert scale. Relating conversational expressiveness to acceptance would add behavior analysis to the instrumentation and thus enrich acceptance methodology.

5 Experiment

By analyzing data from an experiment with elderly participants using a robot, we want to find out whether there are differences in measured conversational expressiveness between users in a more expressive and a less expressive condition. Furthermore, we want to know whether the conversational expressiveness of users can be related to their experience of social presence.

The participants were 40 elderly citizens living in eldercare institutions. Given the results of an earlier study (Heerink et al. 2007), we expected the more social condition to evoke more conversational expressiveness in the participants.

5.1 Experimental design

For the experiment, a specific interaction context was created in which the robot was used in a Wizard of Oz fashion: it was connected to a hidden operator who controlled its behavior. A Wizard of Oz setup made it possible to have a similar discourse pattern for all sessions, as all users had the same limited set of tasks. We considered this an advantage when comparing counted behavior across sessions.

We created two different conditions for the robot: a more social one (showing more expressiveness) and a less social one. They were realized with the following behavioral features:

1. The robot in the more social condition would gaze straight at the participant; the robot in the less social condition would look past the participant.

2. The robot made mistakes, such as saying good morning in the afternoon or the other way round. When such a mistake was pointed out, the robot in the more social condition would apologize; the robot in the less social condition would not. This feature was demonstrated to all participants in an introduction session.

3. The robot in the more social condition would smile when appropriate and express cheerfulness in its facial expression; the robot in the other condition did not.

4. The robot in the more social condition remembered the participant’s name and used it; the robot in the less social condition did not.

5. The robot in the more social condition would support the conversation by nodding and blinking; the less social robot would not do this.

6. The robot in the more social condition was better at turn taking, waiting until the conversation partner had finished speaking; the robot in the less social condition was less polite in this respect.

7. The robot in the more social condition was more outgoing, both in its facial expressiveness and in its use of voice (less monotonous, with pitch variation).

As stated, the intention was to create a more social condition, which means that not all of the differences concern a more expressive robot.

Besides these behavioral differences, the functionalities and spoken texts were the same for both conditions.

5.2 Used robot

The robot we used in our experiment is the iCat (“interactive cat”), developed by Philips. The iCat is a research platform for studying social robotic user interfaces. It is a 38 cm tall immobile robot with movable lips, eyes, eyelids, and eyebrows, with which it can display different facial expressions to simulate emotional behavior. A camera is installed in the iCat’s nose, which can be used for different computer vision capabilities, such as recognizing objects and faces (Fig. 1).

Fig. 1 The iCat as used in the experiment

The iCat’s base contains two microphones to record the sounds it hears, and a loudspeaker is built in for sound and speech output. We used the iCat with a female voice, because this was the voice that the three pretest subjects felt most comfortable with.

5.3 Procedure

Participants were elderly people (17 male, 23 female) between 65 and 96 years old, living in eldercare institutions in the cities of Lelystad and Loosdrecht in the Netherlands. They were divided between the two conditions as equally as possible (the social condition featured one more male and one fewer female). They were first exposed to the robot in groups (two groups of eight participants and one group of four participants per condition). After a short introduction by one of the researchers, the robot told them what its possibilities were: it could be used as an interface to domestic applications, for monitoring the user, for companionship, for providing information, for keeping an agenda, and for memorizing medication times and dates. They were told that for the experiment of that day, the robot was only programmed to perform three tasks: setting an alarm, giving directions to the nearest supermarket, and giving the weather forecast for the next day. The experimenter subsequently demonstrated how to have a conversation with the robot in which it performed these tasks. After this group session, the participants were invited one by one to have a conversation with the robot, while the other group members waited in a different section of the room (separated by soundproof movable walls). The conversation was standardized as much as possible, and we asked the participants to have the robot perform the three simple tasks. Furthermore, we told them that the robot would be available during the next 5 days.

While engaged in conversation, the participants’ behavior was observed by a researcher and recorded on camera. The group session and the individual session each took about 5 min, so the maximum time spent with the robot was 10 min per participant. To give an example, a typical conversation would start with the participant saying ‘Good morning iCat’. The robot, seemingly asleep until spoken to, would raise its head and respond with ‘Good morning, what can I do for you?’ Subsequently, the participant could ask what the iCat’s possibilities were, or go straight to a task like setting the alarm. In the latter case, the iCat would ask for the time at which the alarm should go off and for the sound to make at the given time (the choices were music, an alarm bell, and calling the participant’s name).

5.4 Behavior analysis methodology

Although participants were observed during the experiment, we based our analysis on observations of the videos afterward. During the analysis, non-verbal forms of conversational expressiveness were counted for each participant, such as greeting the robot, nodding or shaking the head, smiling, looking surprised or irritated (frowning), and moving toward or away from the robot. This list of conversational expressiveness items was generated by listing classical feedback gestures (see Scherer 1987; Cerrato 2002; Axelrod and Hone 2005; Sidner and Lee 2005; Heylen et al. 2006) without categorizing them into specific communicative functions. The gestures are not specifically intentional or non-intentional, but they can be identified as conversational behavior.

To each counted item, the observers attributed two values: one for its strength (weight) and one for the certainty of the observer. Both could be one, two, or three points. So if an observer was certain of someone laughing very loudly, this would score three points twice: three for strength and three for certainty.
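To make this scoring scheme concrete, the following is a minimal sketch in Python. The two 1–3 point values per item and the addition of the two observers’ scores follow the description in this section; how strength and certainty combine into a single item score is not specified above, so the multiplicative combination, like the function names and the behavior labels, is an illustrative assumption rather than our actual coding procedure.

```python
# Minimal sketch of the observation scoring scheme (Sect. 5.4).
# Assumption: strength and certainty (each 1-3 points) are multiplied into
# one item score; the behavior labels are placeholders, not a coding manual.

BEHAVIORS = ["greet", "nod", "shake_head", "smile",
             "surprised", "frown", "move_toward", "move_away"]

def item_score(strength: int, certainty: int) -> int:
    """Score one observed expression; both values range from 1 to 3."""
    assert 1 <= strength <= 3 and 1 <= certainty <= 3
    return strength * certainty  # assumed combination of the two values

def observer_totals(observations: list[tuple[str, int, int]]) -> dict[str, int]:
    """Sum item scores per behavior type for one observer of one video."""
    totals = {b: 0 for b in BEHAVIORS}
    for behavior, strength, certainty in observations:
        totals[behavior] += item_score(strength, certainty)
    return totals

def add_observers(a: dict[str, int], b: dict[str, int]) -> dict[str, int]:
    """Add the two observers' totals per behavior, as described in Sect. 5.4."""
    return {k: a[k] + b[k] for k in BEHAVIORS}
```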

The observers were students who were trained to observe objectively, but who were unaware of the nature of the experiment. They watched the videos in which the camera was turned toward the participant, so the robot was not visible. They were not made aware of the different conditions of the robot. We had two observers for each video and added their scores for each behavior.

5.5 Used questionnaire

For measuring acceptance, we used a questionnaire with statements that could be responded to on a 5-point Likert scale (rating scales are a common instrument in TAM studies). Table 1 shows the statements on intention to use and social presence (in the questionnaire used, these items were not grouped by construct, but sequenced randomly). The statements we used for social presence are derived from the questions developed by Bailenson et al. (2001). As explained in Sect. 4, intention to use is determined by other influences (like perceived usefulness and perceived ease of use), but these are beyond the focus of this research. Since intention to use has proven to be an effective predictor of actual use of this technology by older adults, we did not intend to measure actual use in this study.

Table 1 Used statements for intention to use and social presence
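Since the analyses below operate on per-participant construct scores, a brief sketch of how the Likert responses can be aggregated may be useful. It assumes the common practice in TAM studies of averaging a construct’s items; the item identifiers and their grouping are placeholders, not the actual statements of Table 1.

```python
import numpy as np

# Placeholder item-to-construct mapping; the real statements are in Table 1.
CONSTRUCTS = {
    "intention_to_use": ["itu1", "itu2", "itu3"],
    "social_presence": ["sp1", "sp2", "sp3", "sp4"],
}

def construct_scores(responses: dict[str, int]) -> dict[str, float]:
    """Average the 1-5 Likert responses of each construct's items."""
    return {name: float(np.mean([responses[i] for i in items]))
            for name, items in CONSTRUCTS.items()}

# Example: one participant's (hypothetical) ratings
print(construct_scores({"itu1": 4, "itu2": 5, "itu3": 4,
                        "sp1": 3, "sp2": 4, "sp3": 2, "sp4": 3}))
```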

6 Results

The different types of expressive behavior shown by participants during their interaction with the robot were counted for each participant, summed for each condition, and analyzed to measure conversational expressiveness. To account for inter-rater reliability, we calculated Lin’s concordance correlation coefficient (Lin 1989, 2000), which we found to be 0.94 on average.
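As a reference for this reliability measure, the sketch below computes Lin’s concordance correlation coefficient for one pair of raters, following the estimator in Lin (1989); the example numbers and the per-video data layout are assumptions for illustration.

```python
import numpy as np

def lins_ccc(x, y) -> float:
    """Lin's concordance correlation coefficient between two raters' scores."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    mx, my = x.mean(), y.mean()
    sx2, sy2 = x.var(), y.var()          # biased (1/n) variances, per Lin (1989)
    sxy = ((x - mx) * (y - my)).mean()   # biased (1/n) covariance
    return 2.0 * sxy / (sx2 + sy2 + (mx - my) ** 2)

# Example with two raters' item scores for one video (illustrative numbers);
# averaging such coefficients over all videos gives the reported mean.
print(lins_ccc([6, 4, 9, 2, 6], [6, 3, 9, 2, 4]))
```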

Table 2 shows that there is a pattern of more conversational expressiveness in the more social condition, in the sense that the participants show a higher frequency for almost all types of behavior; there are, however, no significant differences between the conditions when we look at the individual behaviors.

Table 2 Means and t scores on items of conversational expressiveness

We categorized the behavior types as positive or negative (reflecting a positive or negative attitude toward the conversational partner) and looked at the total number of times each type of behavior occurred in the different conditions.

We considered the behaviors ‘shaking head’, ‘move away’, and ‘frown’ negative, and all others positive. Table 3 shows that the behaviors categorized as negative did in fact correlate with intention to use (in a negative direction, as expected).

Table 3 Pearson correlation scores for constructs and categorized conversational expressiveness

Table 3 also shows that there is a correlation between social presence and conversational expressiveness. There is, however, no correlation between intention to use and overall conversational expressiveness.
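The correlations in Table 3 are Pearson coefficients between per-participant construct scores and expressiveness scores; a minimal sketch using SciPy’s Pearson implementation follows, with illustrative numbers rather than our actual data.

```python
import numpy as np
from scipy.stats import pearsonr

# Illustrative per-participant values: questionnaire construct scores and
# summed expressiveness scores (not the actual study data).
social_presence = np.array([2.4, 3.1, 3.8, 2.0, 4.2, 3.5, 2.8, 3.0])
expressiveness = np.array([5, 9, 12, 4, 15, 10, 7, 8])

r, p = pearsonr(social_presence, expressiveness)
print(f"Pearson r = {r:.2f}, p = {p:.3f}")
```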

Table 4 shows that there is a clear difference between the more social and the less social condition, both in the total number of expressions and in the total number of expressions categorized as positive.

Table 4 t scores comparing a more and less social condition on constructs and conversational expressiveness
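Given the between-subjects design (participants divided between the two conditions), the t scores in Tables 2 and 4 correspond to independent-samples t tests on per-participant scores; below is a minimal sketch, again with illustrative numbers rather than our data.

```python
import numpy as np
from scipy.stats import ttest_ind

# Illustrative total expressiveness scores per participant in each condition
# (the study had 20 participants per condition; these numbers are made up).
more_social = np.array([12, 15, 9, 14, 11, 13, 16, 10])
less_social = np.array([7, 6, 9, 5, 8, 6, 10, 7])

t, p = ttest_ind(more_social, less_social)
print(f"t = {t:.2f}, p = {p:.3f}")
```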

7 Discussion and conclusions

There is a clear pattern of more conversational expressiveness (a higher frequency of non-verbal behaviors) among participants who were in conversation with the robot in the more social condition. This corresponds with a higher score on social presence, showing that users who experience a social entity do indeed respond to it as such. This may say something about the effect of what we understand as social presence on users; but although social presence corresponds with intention to use, conversational expressiveness only partly seems to be an indication of acceptance. Positive expressions may correspond with higher scores on acceptance, while an increasing number of negative expressions (shakes, frowns, and moving away) may indicate a lower acceptance rate.

Still, we find that this research shows that behavior observation can be an additional instrument for studies on robot acceptance. However, there are more possibilities to explore, considering both qualitative and quantitative instruments. A detailed discourse analysis, for example, could provide clues that can be related to acceptance, although a different (non-Wizard of Oz) setup would be more appropriate in that case.

Another item for further research could be the question of whether conversational expressions occurred in response to the same expressions by the robot (a smile in response to a smile, a frown in response to a frown). In that case, we would be speaking of imitative behavior. This would be an occurrence of a well-known phenomenon in psychology called the chameleon effect (Chartrand and Bargh 1999). It concerns imitative behavior between humans, which seems to occur naturally unless two people do not like each other. The occurrence of this behavior could very well be interpreted as a sign of acceptance (Kahn et al. 2006). However, during behavior analysis, the observers just counted the number of behaviors without looking at the robot behavior that evoked them: the camera was always directed toward the participant. In future research, this possibility of imitative behavior could be something to observe, also when comparing robots with different embodiments, since it could add interesting viewpoints to HRI theory on this aspect (Dautenhahn 1994; Dautenhahn and Nehaniv 2002).

Furthermore, there are factors like enjoyment, perceived ease of use, perceived usefulness, attitude, and anxiety that influence acceptance (Venkatesh et al. 2003; Heerink et al. 2008), and future research could explore their relation with conversational expressiveness and social presence. This would not only provide a more complete picture of the relationship between conversational behavior in human–robot interaction and acceptance, but it would also tell us more about what enhances the sense of social presence for this particular user group.