The first item in the questionnaire asked whether participants found the robots “cute.” As the data show, there is an overwhelming perception that all three robots are cute, with 23–29% strongly agreeing with this statement and an additional 35% (Pepper), 45% (AV1), and 57% (Tessa) agreeing. On the negative side of the spectrum (disagree and strongly disagree), we find 22% (Pepper), 10% (AV1), and 0% (Tessa), while neutral opinions (coupled with “I don’t know”) amount to 17% (Pepper), 15% (AV1), and 20% (Tessa). This could partly be explained by the design of the robots: all three are relatively small, but Pepper is of similar height to many of the child participants and taller than some (Fig. 3).
The next parameter asked participants to assess whether they perceived the robot as “cool.” This question was added to cover positive associations more thoroughly, because children of that age, especially boys, attach quite different meanings to the words “cute” and “cool.” For Pepper, 38% strongly agreed that it was cool, 48% agreed, 6% were neutral, 3% disagreed, and 4% strongly disagreed. For AV1, 43% strongly agreed, 47% agreed, 7% were neutral, 1% disagreed, and 3% strongly disagreed. For Tessa, 70% strongly agreed, 23% agreed, 3% were neutral, and 3% did not know. Nearly all participants either agreed or strongly agreed that the robots were cool. However, Pepper and AV1 had a larger share of respondents who agreed than strongly agreed, whereas with Tessa there was a clear majority for “strongly agree.” This surprised us, but it could be related to the fact that the survey participants in year 2, when Tessa was introduced, had more time to interact with the robot. Again, as with the previous parameter, no one directly disagreed with the statement regarding Tessa (Fig. 4).
The third parameter evaluated whether participants perceived the robots as “scary.” For Pepper, 3% strongly agreed that it was scary, 5% agreed, 7% were neutral, 32% disagreed, and 53% strongly disagreed. For AV1, 1% strongly agreed, 3% agreed, 7% were neutral, 25% disagreed, and 64% strongly disagreed. For Tessa, 0% strongly agreed or agreed, 3% were neutral, 20% disagreed, 73% strongly disagreed, and 3% did not know. Although the vast majority of respondents did not find any of the robots scary, Pepper received the most ambiguous response. This was perhaps the parameter with the strongest cultural connection, as Hollywood movies are littered with killer robots and strongly affect our perception of robots in the real world (Søraa 2019). At 121 cm tall, Pepper was taller than some of the respondents, and height is a trait traditionally associated with power. If a grown man of 170 cm were to meet a 2-m-tall robot, some insecurity would probably be present.
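To show how such Likert distributions can be tabulated, the following sketch aggregates a batch of responses into percentage shares. The response labels match the scale used in the survey, but the sample counts below are invented for illustration and are not the study’s raw data:

```python
from collections import Counter

LIKERT = ["strongly agree", "agree", "neutral",
          "disagree", "strongly disagree", "do not know"]

def likert_shares(responses):
    """Return each Likert category's share of responses as a rounded percentage."""
    counts = Counter(responses)
    total = len(responses)
    return {label: round(100 * counts[label] / total) for label in LIKERT}

# Hypothetical responses to "Is the robot scary?" -- illustrative only.
sample = (["strongly disagree"] * 53 + ["disagree"] * 32 +
          ["neutral"] * 7 + ["agree"] * 5 + ["strongly agree"] * 3)
shares = likert_shares(sample)
print(shares["strongly disagree"])  # 53
```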
It seems worthwhile to consider the correlation between how human a robot looks and the possibility of children finding it scary. Indeed, Pepper tops the list, although by a small percentage, followed by AV1, whose human resemblance is weaker than Pepper’s, and lastly Tessa, which no one considered scary. This may be explained by the uncanny valley effect (Mori 2012; originally 1970), reflecting the hypothesis “that a person’s response to a humanlike robot would abruptly shift from empathy to revulsion as it approached, but failed to attain, a lifelike appearance.” In the qualitative part of the second research year, some participants were asked what they thought robots in general look like. Several answered, “Not like that” or “Different than this one” while pointing at the robot Tessa. Being designed as a flowerpot seems to set it apart from their perceptions of robots. One person commented: “The voice is a little bit scary,” indicating that elements of Tessa could be perceived as less likable (Fig. 5).
By analyzing the results for the question “Is the robot lifelike?”, we can see that children’s perceptions of robots vary greatly depending on robot design. A Finnish study in a shopping mall found a similar willingness to interact. The authors conclude that “careful balancing between the robot being entertaining (esp. children) and purposeful (adults)... should not be mutually exclusive” (Aaltonen et al. 2017, p. 54). We found similar results in our experiments, with children emphasizing the entertainment value of the robots while holding a deep fascination with the creatures, whereas adults were more pragmatic in their interactions, asking what functions the robots had and how they worked. As recommended by Bartneck (2009, p. 78), results of surveys on social robots at this early stage of their implementation in society should not be interpreted as absolute values, but rather as a tool for comparison (Fig. 6).
By adding “do not know” (DK) as an alternative answer, some participants’ actual opinions might have been lost. If one experiences contradictory thoughts and/or feelings toward the object one is asked to consider, or is concerned that one will not be able to defend one’s answer due to lack of knowledge, it is often easier to respond with DK than to choose a weighted response (Krosnick and Presser 2010).
Fairs, especially those held inside tents placed outdoors, represent one of the most difficult environments for controlling sound and light quality. Difficulties include strong sound reflections, a large amount of ambient noise, and sudden variations in light intensity. Such harsh conditions had a strongly negative impact on the performance of the robots, especially Pepper, owing to its more advanced functionality. The Pepper robot experienced technical difficulties with two pre-programmed algorithms: “Guess my mood” and “Guess my age.” Pepper had trouble recognizing faces and often could not respond to the question at hand. The researchers tried adjusting the lighting, which seemed too weak, by placing floodlights at different angles. However, the right lighting conditions could not be achieved in the fair tent, and the functions therefore remained inconsistent. This concurs with the experiences of Kalaiselvi and Nithya (2013), who found that face recognition algorithms did not work properly when the lighting was too weak or too strong (e.g., under a bright lamp).
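One common software-side mitigation for this failure mode is to check a frame’s overall brightness before attempting detection at all. The sketch below is a minimal illustration of such a guard, assuming grayscale frames given as nested lists of 0–255 intensities; the threshold values are illustrative assumptions, not those used by Pepper’s software:

```python
# Mean-luminance guard before running face detection.
# Thresholds are illustrative assumptions, not Pepper's actual values.

def mean_luminance(gray_frame):
    """Average pixel intensity of a grayscale frame (rows of 0-255 values)."""
    pixels = [p for row in gray_frame for p in row]
    return sum(pixels) / len(pixels)

def lighting_ok(gray_frame, low=40, high=215):
    """Flag frames whose overall brightness makes detection unreliable."""
    return low <= mean_luminance(gray_frame) <= high

dim_frame = [[20, 25], [30, 15]]       # under-exposed: mean 22.5
good_frame = [[120, 130], [110, 140]]  # mid-range: mean 125.0
print(lighting_ok(dim_frame), lighting_ok(good_frame))  # False True
```

A robot could use such a check to ask the operator to adjust the lights, or to decline a “Guess my age” request, instead of returning an unreliable answer.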
Pepper’s speech recognition was also challenged at the research fair, which was quite noisy, with sound sources coming from different directions and even reflecting off the roof of the tent. This caused misunderstandings between the robot and its interlocutors. When speaking with fairgoers, Pepper could, for example, freeze and break off an established dialog. When trying to guess a person’s age, it was sometimes off by several decades, which the interlocutor occasionally perceived as offensive. As with the lighting challenges, such noise problems have previously been reported by Gardecki and Porpora (2017). The malfunctions of the Pepper robot could have affected the participants’ perception of the robot in general, causing them to answer more negatively than they would have had they seen the robot fully functional under more controlled environmental conditions. However, one could argue that the more controlled the conditions, the less natural or realistic the setting. In reality, there are few contexts where sound and lighting are ideal, making the fair a suitable site for a more realistic evaluation of Pepper’s functionalities.
Tessa, on the other hand, had fewer issues with the setting. Its reduced set of functionalities makes it less sensitive to the environment, and, because Tessa is designed for users with potential hearing impairments, its volume can be raised to a relatively high level. Most of the time, there was no difficulty hearing the robot, even with considerable background noise. This suits the older adult user group it is built for, who often have the TV on at high volume.
Another possible explanation for negative perceptions of Pepper is that a humanoid robot may elicit higher expectations of interaction capabilities than less sophisticated robots such as Tessa and AV1. The disappointment caused by failed communication would therefore be correspondingly larger for Pepper than for its less humanoid counterparts. Additionally, cultural factors, such as what different social groups expect a robot to be able to do, should be taken into consideration (Nomura and Nakazawa 2017). Through conversations with some of the participating children, the researchers learned that many had prior experience with, or were already familiar with, AV1 and its purpose and function. Because AV1 is made solely to help school-aged children, it might have been easier for our participants to relate to than Pepper, which was presented as a potential helper for older adults. Tessa, on the other hand, was mostly unknown to the children, and although it had a familiar design (being a flowerpot), they were quite unsure and curious about what it could actually do when interacting with them.
“But not my grandparents”
As the current discourse on social robots relates strongly to elderly care (Wright 2018; Fong 2003; Van Wynsberghe 2016), it is interesting to learn how young people think about robots taking care of the elderly. Van Wynsberghe (2016, p. 1) states: “Make no mistake, the robots are coming! The question then is: what will this new technology do to the age-old practice of care-giving?” In that regard, a key finding in the research data is the contrast between two questions we asked in relation to the social robots Pepper and Tessa: (1) “The robot can help the elderly in their daily lives” and (2) “The robot can take care of my grandparents.” AV1 was not included in these questions since it does not target older adults as a user group. With these two questions we wanted to see whether the participants showed any bias toward bringing robots into their own lives, especially when tied to the elderly in their own families. Our hypothesis was that robots could be more accepted when caring for the elderly in general than when caring for a loved one, as interpersonal connections to one’s own grandparents might affect the acceptance rate of robots.
The results support our hypothesis. 76% strongly agree or agree that Pepper can help the elderly, but only 60% would similarly allow it to help their own grandparents (see Fig. 7). Around 10% disagree or strongly disagree that Pepper can care for the elderly, while double that share (20%) disagree or strongly disagree that Pepper can take care of their own grandparents. The results thus show that a much larger percentage of children are positive toward allowing Pepper to take care of the “elderly” in general than “my grandparents” in particular. The same trend can be observed on the negative side of the scale: a much larger percentage is reluctant to allow Pepper to take care of “my grandparents” than the generic “elderly.” The Tessa robot, on the other hand, received a quite positive response, with over 97% strongly agreeing or agreeing that Tessa can help the elderly and 86% allowing it to care for their own grandparents (see Fig. 8). 10% are neutral toward Tessa taking care of their grandparents, as exemplified by this quote from one of the respondents (11 years old):
I think it can help my grandmother because she’s a bit forgetful. But not my other grandparent [X], because that one is unsure about technology.
The discrepancy between whether the children thought the Pepper robot could watch over older adults in general or their own grandparents could potentially be explained by the age and relatively good health of their grandparents. The children who answered our survey were aged 6–13 years, and their grandparents could in some cases be under 50. Furthermore, at the age of 80, many older adults still live an active lifestyle and manage quite well on their own, so there might not be a visible “need for a robot.” Our participants might not perceive their own grandparents as “old and frail” compared to the elderly in general. This is exemplified by a quote from one respondent (11 years): “I think it can help my great-grandfather because he is quite forgetful, and he lives alone.” This child thinks that their great-grandparent, but not their grandparent, could benefit from the robot, presumably indicating that grandparents are considered too young to be perceived as needing to live with a robot.
Measuring social robot and human interaction
Bartneck’s (2009) suggestion for a standardized measurement tool for human–robot interaction and his five key concepts of social robots (anthropomorphism, animacy, likability, perceived intelligence, and perceived safety) can be seen in the following way in our study:
All the robots have some degree of anthropomorphization, although it is clear to all that they are in fact technological entities and not humans, except, perhaps, for Tessa, which could be mistaken for a normal flowerpot. This became apparent with an earlier prototype of the robot, as test users would water the plant on top. However, later designs, including the version we had on display, were made waterproof. Pepper, with measurements approximating those of a 7-year-old child and a larger degree of mobility, is the robot perceived as most humanoid. As Breazeal (2003, p. 167) discussed, social robots are robots that people anthropomorphize in order to interact with them. For Tessa, however, this is a mixed entanglement of anthropomorphization and “plantification,” i.e., people want to relate to it as a plant as well.
As we could see in our question about the robots being “lifelike,” there is a correlation with anthropomorphization as described above. Since Pepper moves about, gesticulates, and is pre-programmed to be a bit “quirky” in its speech, people were charmed and found it more alive; Aaltonen et al. (2017) reported comparable findings. Our participants did not, for the most part, agree that Tessa was very lifelike, which we presume relates to it being quite stationary, not moving save for the blinking. As we did not question the participants about the animacy of AV1, no data can be presented on this. Our reasoning for not asking this question about AV1 is that its teleoperation by a child gives it a certain amount of human embodiment and animacy in its own right when used in a classroom setting. This is similar to findings by Choi et al. (2014, p. 1069), who found that people perceive “autonomous robots as more intelligent than tele-operated robots while they felt more social presence toward tele-operated robots than autonomous robots.”
For likability, our study examined several parameters: whether the participants found the robot to be “cool,” “cute,” or “scary.” Whereas both AV1 and Tessa scored relatively high on being cute, participants found Pepper less so. This could be due to Pepper being quite tall, at least from the perspective of a child of the same height. Tessa, with its flowery appearance, and AV1, with its round eyes and no mouth, are designed to have some degree of cuteness. On the cool parameter, Pepper scored better, which may relate to it having more functions. Tessa was also perceived as very cool, which, given Tessa’s lack of movement, we found surprising. For scariness, we see that all the robots were quite well liked, but participants tended to be a bit warier of Pepper, placing it slightly lower in Mori’s (2012) uncanny valley.
Because AV1 is teleoperated, and therefore controlled directly by another user, only Tessa and Pepper could be evaluated for perceived intelligence independently of an operator. Communication and recognition are two aspects that can be linked to perceived intelligence. Tessa is possibly too simple to be perceived as intelligent; this simplicity is reflected as predictability, and something totally predictable cannot be intelligent. Pepper, in contrast, is expected to be highly intelligent due to its claimed ability to recognize people and its advanced, conversational-level voice communication features. As mentioned, due mostly to contextual circumstances (poor light and environmental noise), both skills failed to impress the children, who were ultimately rather disappointed.
The final concept, perceived safety, relates in our study to the perceived capability of the robots to take care of older adults. However, since the demonstrations of the robots were limited, this aspect was not further investigated in this study.
Limits of the study and its design
The validity of the data must be read in light of the localized context and the randomization of the fair attendees. The data material was collected at the same type of research fair, but in two separate years, which might have led to dissimilarities in the data collection. Firstly, different researchers were involved in the two fairs, which could have led to different ways of presenting the robots and/or communicating with the children. All members of the research team were instructed to keep descriptions of, and conversations about, the robots neutral to limit the risk of influencing the children’s first impressions. In the first year the research team introduced the visitors to two robots, whereas in the second year only one robot was present. Additionally, during the second year, more personnel operated the robot stand than in the first year, allowing visitors to engage with the flowerpot robot for an extended period, although it offered no point of comparison with other robots.
As our data collection depended on who came by, a controlled trial would have been more representative; however, because visiting the research fair is mandatory for schools, a wide sample of attendees came by. The assembly of activities and objects at the fair varied as well. In the first year, the two robots were the only items at the booth, while in the second year the booth was shared with four stationary bikes connected to an exercise game. The exercise game had a screen where the game virtually took place, and gamers “drove” virtual tanks powered by their pedalling and shot at other gamers using controllers on their handlebars. This caught a lot of attention due to its popularity and significantly larger installation, and could have affected the size and composition of that year’s sample. There is also a possibility that people more interested in robots came by our stand, thus affecting the data.