1 Introduction

In the European project “SERA” (Social Engagement with Robots and Agents), a series of field studies was conducted that involved repeated interactions of participants with a simple robotic installation. The SERA project was inspired by two types of projects that are currently popular. The first aims to introduce companion robots and agents into the homes of older people to satisfy social goals and to establish affective bonds: the idea of companions [3, 20]. The Paro robot is a typical case (Footnote 1). The second aims to introduce coaches that can help people maintain a healthier lifestyle. Bickmore’s Laura agent is a seminal example [2]. The robot companion that was built in the SERA project was a simple robotic device that knows about people’s activity plans and encourages them to stick to the plan through a series of interactions during the daytime. The goal of the project was to see what would happen if such technology was put in people’s homes for about ten days. In particular, we were interested in the ways participants would interact with the rabbit, the range of attitudes that might be displayed and how these might change as the week progressed.

The installation consisted of the Nabaztag (rabbit) robot that was used as the interface to carry out a series of dialogues with the participants during the trials. The dialogues were all about the daily activities that the participants had planned. Each trial lasted about ten days. For each day we hoped to collect about 3 to 5 interactions between a participant and the robot. There were three iterations of the trials, with each iteration improving on the dialogue complexity. The idea was to have 3 participants in the first iteration, 6 in the second (the 3 participants of iteration 1 + 3 new ones) and 9 in the third (the 6 participants of the second iteration with 3 new ones). Participants in the field studies could choose to make video recordings of the interactions with the rabbit (each time an interaction was initiated). After each iteration, the participants were interviewed.

Collecting data about human-robot interaction in the wild instead of in controlled lab environments is a challenging undertaking. Technically, the system (the robot and the recording equipment) has to work for an extended period of time without any intervention. Methodologically, it may be claimed that the data is more ecologically valid than data collected in the lab, but the poor level of control and other limitations of the data may not allow proper quantitative analysis. In this paper we therefore look at some other lessons learned and some issues for further investigation that arose from our experience. We will not analyse the data we gathered but use data from some of the interviews to illustrate these points. For more systematic studies of the data collected in SERA we refer to other papers that have been published about it. Given the limited number of participants, the analyses that have been carried out are mostly qualitative in nature. Klamer and Ben Allouch [11] analyse the interviews of the first iteration from the point of view of technology acceptance. The recordings have been analysed and discussed in several papers, such as [15, 16], from an ethnographic point of view. A Special Issue of Applied Artificial Intelligence (Vol. 25(6)) contains a collection of papers on the project. The paper by Creer and co-authors [4] describes the set-up in more detail. The paper by Eimler and others [12] deals with empirical results related to acceptance and emotion attribution. How the interactions in one iteration inspired new ideas in the next iterations is the topic of [24]. Technical details about dialogue tools that were developed during the project are in [8].

One thing that appears from the papers that deal with the analysis of the interactions or the interviews is that the various participants differ a lot in their attitudes and interaction styles. This will also become apparent from the selection of quotes taken from two interviews that appear in this paper. In this paper, we use the interviews of two participants that took part in the first iteration only to illustrate some wider issues in the kind of study that SERA intended to perform. This paper reflects on the nature of the study and some of the results found rather than presenting a systematic analysis of the data.

2 Set-up of the field studies

For the SERA project, the partners at Sheffield built a set-up consisting of the Nabaztag robot, connected to a platform that could record interactions with selected individuals [4]. For each iteration, a series of simple dialogues was devised that would allow reflection on the activity plan of older participants and the activities they had carried out during the day. The simple device was put into the participants’ homes for about ten days and about 3 to 6 interactions were recorded every day.

The set-up consisted of a Nabaztag agent that was mounted on a pedestal with a computer connected to the Internet, a camera above and a couple of buttons (yes/no) that participants could push. See Fig. 1 for the rabbit set-up. The dialogues consisted of greetings, explanations and advice by the robot. There were also simple (yes/no) questions that the rabbit would ask about the activities the participants had carried out—the “activity plan” for the week was a primary source of information for the device.

Fig. 1 The set-up

In the morning, as soon as someone passed by the rabbit set-up, the system would start up and greet the person who passed by. The internal clock and an infrared sensor collaborated to trigger this interaction. The first dialogue involved a greeting, the advice to weigh oneself and the question whether one would like to hear the weather forecast. The set-up supposed that people would put their house keys on a hook (the idea was that the set-up was positioned in the hallway) and the rabbit assumed that people planned to go out when the keys were taken from the hook. When this happened, the system consulted the activity plan and started a conversation on the activities planned.
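The two triggers described above can be summarised in a minimal sketch. This is not the SERA implementation, which is not shown in this paper; the class name, the method names, the dialogue labels and the noon cut-off for “morning” are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import date, datetime, time
from typing import Optional, Set

# Assumption: the paper gives no cut-off for "morning"; noon is used here.
MORNING_END = time(12, 0)


@dataclass
class TriggerRules:
    """Hypothetical reconstruction of the two triggers described in the text."""

    # Days on which the morning greeting has already been given.
    greeted_on: Set[date] = field(default_factory=set)

    def on_motion(self, now: datetime) -> Optional[str]:
        """Infrared sensor fired: greet once per day, and only in the morning."""
        if now.time() < MORNING_END and now.date() not in self.greeted_on:
            self.greeted_on.add(now.date())
            return "morning_greeting"
        return None

    def on_keys_taken(self, now: datetime) -> Optional[str]:
        """Keys lifted from the hook: the rule assumes the person is going out."""
        return "activity_plan_dialogue"
```

Rules of this kind react only to a sensor event and the clock. They carry no information about who passed the sensor or why the keys were taken.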

In the SERA project, three versions of the system were built which differed in the complexity of the dialogues and the modes of input. In the first iteration the dialogues were simple and the participants could only push either a yes or a no button. In the subsequent iterations, the set-up made use of RFID cards to make it possible for the participants to enter different types of information.

3 Some issues in data collection

In this section we use data from the interviews to illustrate some general points and issues that arose in the project, most of them related to the methodology of investigating human-robot interaction in “naturalistic” conditions. Even though we use the interviews mainly for illustrative purposes, it is worth giving some background information about how they were conducted.

3.1 The Interviews

The set-up of the semi-structured interviews and the first results are discussed in [11]. After the 10 day interaction period, the participants were interviewed about their experiences with the Nabaztag. All interviews were audio-recorded with permission of the participants. The interviews were semi-structured and the questions were about the following topics (Footnote 2).

  1. General use of Nabaztag: intention to use, usefulness, usage, expectations, health exercises, evaluation of the possibilities of the Nabaztag
  2. Communication with the Nabaztag (information, appearance, interaction): perceived enjoyment, perceived playfulness
  3. Relationship development with the Nabaztag: trust, likeability, source credibility, appearance, relationship building, novelty
  4. Social factors (family/friends): subjective norm, self-identity
  5. Personal interest in technology and demographic variables

In the following we present excerpts from interviews that were collected after the first iteration. They are all from two participants who took part in the first iteration. We will let them “speak” by quoting them extensively, as it is important to hear the differences in tone between the participants. The two participants differed quite a bit in their educational background. Participant A is a woman aged 65 who had education until she was 16. She is now retired and living alone. Participant B is a woman aged 50+ who is now doing accounts but was trained and had worked before as a biochemist. She is living with her husband.

The first topic that we want to discuss concerns the nature of the study we performed. Our intention was to go out of the lab and into more naturalistic settings for our data. Having the system in people’s homes for ten days certainly resulted in data that offers interesting new perspectives. However, it does not resolve all the typical biases that are found in experimental research.

The second topic we want to present concerns the attitude towards the robotic device. In our case we used a zoomorphic interface, the Nabaztag rabbit that talked to the people. People differ in how they react to robots or agents, depending on their own personality and other characteristics [2]. Some people can really get attached to a robot, treating it as if it were a pet or a living being [9]. Other people cannot imagine having a bond with a robot or treating it as if it were alive [11, 22]. We know that people treat things and media as social actors and have a tendency towards anthropomorphism. In this part, we show some of the subtlety involved in these processes as they appear from the interviews and the interaction of the participants with the robotic device.

Next, we turn to how the participants liked the robot and how useful they judged it to be. Besides presenting their opinions, we put these into the perspective of the way we asked about them and the way the participants framed the whole undertaking.

Finally, we present a major challenge in designing interactive systems like the one we built: how to deal with limitations in situational awareness.

3.2 The nature of the experiment

A major objective of the project was to get the robotic device into people’s homes rather than have the participants come to a lab, in order to get more ecologically valid data on interaction. But the fact that the rabbit was put in people’s homes for about ten days does not make it less of an experiment to the participants. The fact that the rabbit was put there by researchers, the fact that the participants consented to have it in their house and to participate in the study, and the fact that they met the researchers (Sarah and Peter) make them act in a particular way and make them evaluate the conduct of the rabbit and the general experience from this perspective. The context is not that of the laboratory, so people are freer in determining when and how to use the rabbit. But the situation is clearly different from one in which someone had bought a companion robot that functioned (or did not function) in this way from a shop and could return it.

The participants clearly expressed that they viewed themselves as participants in a study.

A::

I knew that it had all to do with research.

Often they interacted with the rabbit not because they wanted to. A major motivation for interacting with the rabbit, despite its obvious limitations, was that they felt it was their duty as participants in the experiment.

A::

Ehm some people won’t see it that way, who think you are either mad. Mad. ’Cause I know someone said: “What are you doing with this rabbit?” and I said: “Not a lot really. It’s conversation is limited.” But I said, it’s there, that it is doing what it is supposed to do. And somebody, somewhere, may benefit from all of us having conversations with this rabbit.

A::

An amusing thing in your kitchen, that you know with a bit of luck in 10 years time, all my little ramblings to it have been of some use for somebody who is old by this time.

B::

I will continue with the experiment. I am really happy to do that. But whether I would want one or not depends on what improvements you come up with.

That the research framing played such an important role in the motivation to interact with the rabbit had to do with the way we presented the purpose of the study: as a study looking into technology that promotes a healthy lifestyle. The participants viewed themselves as consumers trying to evaluate a product. The participants all kept interacting with the rabbit for the whole of the ten days, though in this and in other iterations the number of interactions per day and the length of the interactions became increasingly smaller.

The way the participants framed the whole undertaking is important to take into account when drawing conclusions from studies like this. In this case, the participants saw themselves not so much as subjects in an experiment or in a data collection initiative but as people who were testing out a product.

3.3 Anthropomorphism

The rabbit is a mechanical contraption with limited interactive capabilities and limited intelligence. It is meant to resemble an animal (though stylised) and to talk like a human (though limited). To what extent do the participants take this character of the rabbit into account in their interactions with and attitudes towards the set-up?

Participant A is more inclined to go along with the idea that the mechanical construction could be attributed some animacy or agency: she gave it a name (Harvey, after the 1950 movie with James Stewart) and she interacted accordingly with the rabbit. Here are some things she said about the rabbit and her interactions.

A::

Ow, I knew it was a mechanical thing wired to a computer. I wasn’t quite as daft as that. But I did talk to it, because it spoke to me.

A::

No, I think I spoke to it occasionally, you know, because you feel sorry for this inanimate object, that is sort of programmed to speak and move. But that’s all its doing. So I’d tell him sometimes where I was going and what I was doing, but I don’t think it understood [laughs] if it wasn’t Aqua or just come back from the Park Tavern.

A::

Ehm, it didn’t understand. But it didn’t matter, because by this time, you know, you are treating it like not a figure of fun, because that sounds horrible, but as an amusing thing in your kitchen

A::

So, in an evening, especially when I was watching football, I be in there and he probably saw me with all those bottles of wine. As you do. And, even making some cheese and biscuits to go with it. And he saw the cooking. So I don’t talk when I am cooking.

The way A talks about the rabbit shows a peculiar ambivalence. Of course she knows it is just a mechanical thing, but she cannot help treating it as an animate thing. In particular, the fact that she felt sorry for it and that she felt embarrassed by its presence when she was drinking wine are telling. When participant A was asked how interactions with the rabbit differed from interactions with people, she pointed out that the rabbit does not really do anything with what you say to it.

A::

Oh yes, ‘cause you knew it was not going to argue with you. [laughs] You could talk to it, tell it all sorts of things and you would know that you would not be contradicted or, that’s a daft thing to say or whatever, it was quite good fun in that way, you could actually talk to it and know that whatever you said, it would not create a sting and there would not be no arguments.

In talking about the rabbit, participant A constantly shows this ambivalent perspective on the rabbit as something inanimate that is animated, not just in the sense that it moves but in the sense that it has some kind of persona (“because you feel sorry for this inanimate object, that is sort of programmed to speak and move”). At the end of the week and in the next iterations, participant A interacted less and less with the rabbit in this way. The relationship was clearly not as intense as the one described in [23], which gives the example of an older woman who talked to Paro after not interacting with him for a month because she was in hospital for treatment: “I was lonely Paro. I wanted to see you again.” Kidd et al. [9] write: “Some residents expressed a special attachment to Paro. They spoke to it like it was a pet, gave it names and engaged it in (one-sided) conversations […] These users generally began a relationship with Paro in which they saw it as dependent of them. Very often they are/were pet owners.”

The ambivalence in treating the rabbit as both a thing and an animate being hardly shows up in participant B’s comments.

B::

I didn’t really have any emotions toward it worth giving it a name. I did not empathise with the rabbit. I did call it her, obvious because it had a ladies voice.

B::

It was just something in the way … it was just a thing.

The answers to the questions: “Was the rabbit willing to listen to you/was it open for your ideas? Can you explain why you felt the rabbit was/was not open to your ideas/willing to listen to you?” are interesting to compare for both participants.

A::

Eh yes, I mean it did sort of like sit there, recording to what I was saying to it. So somewhere along the line, they have got the balance right. So that, see, I never ever thought of the rabbit as being alive and human [laughs]. Yes.

B::

Well it didn’t really. When I answered then it would tell me that I wasn’t pressing the buttons. That was as much interaction as I felt we got really.

Our objective with the project was not to build the best possible interface for an advice-giving robot. One of the things we wanted to know more about was in what ways people differ in their attitudes and what factors influence this. The tendency towards taking an anthropomorphic stance is clearly an important factor that has already been put forward in the literature (see above). What the recordings and the interviews made us see, however, was that this tendency is highly multi-faceted and constantly shifting. For persuasive systems to be effective, it is important to understand the process better and to get a grip on the mechanism.

3.4 How they liked the rabbit

Participant A seemed to enjoy having had the rabbit in the house despite its shortcomings. In the interviews she made up excuses for its shortcomings several times. Of course, her attitude may be due in part to a well-known bias that participants in an experiment will be more positive in their evaluation to the researcher than they really are. Here are some of her remarks.

A::

An amusing thing.

A::

It was just a small, kind, friendly looking thing. I thought it was a nice… It was a sort of thing that you got used to seeing, and it was, I am looking for a word that doesn’t sound like friendly, it was a comfortable appearance.

A::

Enjoyable? Yeah. As I say, I suppose, because in the back of my mind I know that it is all for research. Nothing about it annoyed me. Apart from it would ask if I had been to Aqua when I’d already been to Aqua, but that’s not really the rabbit’s fault. It’s just a little glitch in what you can do with the rabbit.

From the recordings it appeared that certainly in the beginning participant A enjoyed the interactions but later on they became more routine. Participant B was more blunt in expressing her lack of appreciation.

B::

Ugly.

B::

It was just something in the way … It was a bit of a nuisance really.

B::

I didn’t find it really likeable.

B::

It was fairly uninteresting really, because its topics were limited and it was fairly, ehm, it was pretty much the same everyday and not very much on it.

B::

It was a bit boring really. It might have been fun, as I say, it might have been fun had there been more varied. It was fun when it said: I have a message for you from Sarah. Who goody, you know, that was fun.

These comments show a couple of things, such as the lack of variation in the dialogues, but also that what was appreciated most was the dialogue with the people who set up the system.

Participant B also softens her criticism so as not to offend the researchers.

B::

So, to improve it, make it more less of a long puff horribleness. Less ugly. Which I know this is going to be a new, a new one. But I realise this is a very early stage, you need to find out.

And while B answers with negative remarks when asked the open questions (such as “What did you think of it?”), she turns slightly more positive when asked directly whether she liked to use the rabbit. In contrast to the negative remarks made earlier she says:

B::

It was OK. It was amusing.

Also, when asked the question directly how sympathetic and sincere she found the rabbit, participant B is slightly positive. But in this answer she shows again how reluctant she is to anthropomorphise the rabbit and attribute it human qualities (see above).

B::

I suppose it was slightly sympathetic. Sincere, in given what it was programmed to do. I mean, it was just a thing.

As with all studies, the way the questions are put is important. Also important is who asks the questions. For the interviews, we deliberately chose to have the participants interviewed by a different person from the ones who installed the device in their house. Asking the questions in different ways and not guiding the participants too much appeared to be very important.

When asked how useful they thought the rabbit was, both participants found it rather useless. They said it did not have much impact on their behaviour, either while the rabbit was present or afterwards. Participant A did weigh herself every day as the rabbit recommended but stopped when the rabbit was taken out of the house. She also started thinking about her activity plan and in particular about all the exercise she did that was not explicitly recorded (for instance going to the shops). Participant B, on the other hand, states clearly that she was not going to be bossed around by a rabbit.

A::

I was not daft enough to think it was going to sort of change my life 3 times a year.

A::

So, I did not keep, I did not weigh myself since he left. ’Cause weighing yourself everyday was one thing but not seeing any difference that weight was one thing. Thus I did not weigh myself since he left.

B::

I found out that it would tell me the weather. And I found out that Sarah could send me messages, which was quite nice.

B::

It told me the weather that was all really and asked me if I had weighed and asked me if I have had done my activity, which I do if I want to and I very rarely don’t. And if I don’t do there is a very good reason. I didn’t need to be told.

The limited functionality of the robotic device and the repetitiveness of the dialogues were a major obstacle to engaging with the robot. The fact that the participants were convinced that their interactions would be useful for future generations of systems seemed to be the main reason they continued in the study. Besides limitations in functionality there were also limitations in the design of the interactions. How these showed up and how the participants dealt with them is discussed next.

3.5 Situational awareness

The set-up consisted of various dialogues. The triggers for the start of a particular dialogue were rather simple. For instance, the first time in the morning someone walks past the rabbit, the rabbit’s morning greeting dialogue is activated. When a participant goes out and takes the keys from the hook, the rabbit consults the activity plan and starts a dialogue about it. Similarly, a dialogue is started when the keys are returned to the hook. It turned out that in many cases the dialogues were not appropriately timed, because the cues that were assumed to indicate when a dialogue was relevant proved unreliable. Here are some examples.

B::

Well, the fact that it would sometimes talk to you when you had gone past and then you had to go back. My husband would come into the kitchen and said: “The rabbit, she is just talking now.” What does she say. And he would say, I don’t know, I just walked past. I would say, go and listen. So, he was very anti, he wasn’t very keen at all. But it didn’t help, because I would have run and listened if I were him. It often would speak to him and not to me.

B::

So, I had to make sure I was down first [instead of her husband], otherwise he would have ignored it [=the morning greeting].

B::

I didn’t use it for my keys. I need to put my keys either in the door or downstairs in the garage door. So, the little rabbit that hangs on the hook, I think only remembered to take it off once when I went out.

These design glitches forced the participants to work around them.

A::

Ehm. Yes I could fool it when just leaving the keys on and not letting it know, because I had another set of keys, But I only did that, when it was something daft just like say bringing the washing in or even hanging the washing out. Ehm, maybe just dashing to the shop at the garage and dashing back. ‘Cause you would be out 10 minutes and it seemed silly to have all these questions directed at you that are the same questions that were directed to you when you came back from Aqua.

This leads participant A to suggest the following solution, which shows insight into the complexity of the problem.

A::

Don’t know how they can do it, but I think he ought to have some sort of clock-timer device in him.

A::

It was just so funny, because, you know, you realise that they got to be programmed, so somebody who has a very active life, who did different things during the week, they probably have to program that in, but then it can’t, how can I say, it can’t be easy to think of everything that somebody might do. Because I went to the Opera one night, and he thought I went swimming or Aqua. So, it doesn’t allow, because there is no timing in it, I think eventually when they get something like a timing in it, that they realise that people aren’t going and doing the same thing at 7 o’clock in the evening, that they would be doing at 9 or 10 o’clock in the morning.

Realising the complexity, she again takes into consideration that a lot of research still needs to be done before the technology becomes useful. But she is hopeful.

A::

Ehm, well I think he was okay. It was just the fact that they probably need to tweak a few more things in him, because if he went wrong once, and then he went wrong again, there is something not quite right. That’s all highly technical, I know. And I am not a highly technical minded person […] Ehm, but then, you know, if you know how televisions were when they first came out and what they are like now, things like that can only improve. As they get used, and people say what was wrong with them and I don’t know what was wrong with him, but it was obviously technical.

The quotes above show an important multifaceted problem that every design of a robot companion will need to address: how can we make the companion situationally aware, given that there are so many situations to deal with? Is it a matter of building in more sensors, a matter of design methodology, or a matter of intelligence? Probably all of the above and more.
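To make one facet of this problem concrete, the “clock-timer” idea raised by participant A could be sketched as a guard that is checked before the activity dialogue starts. This is only an illustration and was not part of the SERA system: the function name, the two thresholds and the shape of the activity plan entry are assumptions. It addresses timing only, not the other facets mentioned above.

```python
from datetime import datetime, timedelta
from typing import Optional

# Assumptions for illustration; the paper specifies no such thresholds.
MIN_ABSENCE = timedelta(minutes=15)  # shorter trips are treated as errands
PLAN_TOLERANCE = timedelta(hours=1)  # allowed gap between leaving and planned start


def should_ask_about_activity(
    left_at: datetime,
    returned_at: datetime,
    planned_start: Optional[datetime],
) -> bool:
    """Decide whether to ask about a planned activity when the keys return."""
    if planned_start is None:
        # Nothing was planned, so there is nothing to ask about.
        return False
    if returned_at - left_at < MIN_ABSENCE:
        # A ten-minute dash to the shop should not trigger the full dialogue.
        return False
    # Only ask if the person left around the time the activity was planned.
    return abs(left_at - planned_start) <= PLAN_TOLERANCE
```

With these assumed thresholds, a ten-minute errand or an evening outing on a day with a morning activity would not trigger questions about that activity. Such a guard still says nothing about who took the keys or what the person actually did.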

4 Conclusion

It becomes clear from the interviews that the rabbit set-up was not considered useful by either of the two participants. One enjoyed its presence more than the other and took a completely different approach to the rabbit in interaction. With respect to setting up an emotional relationship, participant A enjoyed the rabbit as a kind of pet. Participant B, on the other hand, thought of it as a completely inanimate, malfunctioning object which she dutifully interacted with because it was part of a research project that she had consented to participate in.

The interviews also show that putting technology into people’s homes, instead of getting the people into the lab, does not necessarily lead to naturalistic studies. The fact that the rabbit was put in the people’s homes did not change the fact that they considered it a typical experimental procedure, with all the biases in engagement and reporting that this entails. They dutifully kept interacting with the system and made up excuses for its improper functioning out of sympathy for the researchers.

The dialogues were simple and repetitive which did not help in scoring points for the application. But the most important shortcoming of the system was that it made a mess of choosing the right dialogue at the appropriate time. The pre-programmed assumptions about what triggers would fire a particular dialogue were too simplistic. It is interesting to see how most of the frustration of the participants derives from these inappropriate interventions of the system. Situational awareness is certainly an important property to consider in the design of social robots.