1 Introduction

Human–robot interaction is increasingly widespread in society and has acquired a certain importance within the broader set of interactions that individuals experience in their everyday lives. It is therefore not surprising that the scientific community shows a growing interest in developing robotic models that can enhance the interaction between humans and robots. This paper explores the extent to which some present-day social robots are perceived by people in terms of their cognitive and affective capabilities. In other words, we investigate to what extent current models (beyond good intentions) are really effective in simulating mental functions and emotions. In this study, we used four real, physical robots: InMoov, Padbot, Joy Robot, and Turtlebot (Fig. 1).

Fig. 1 The four robots used in the study: InMoov, Padbot, Joy Robot and Turtlebot (right to left, top to bottom)

A great deal of prior work has used images or videos rather than real, physical robots. While interesting results have emerged from this rich body of studies, they must be treated with caution because we know that interaction with physical robots changes perception (e.g., [2, 33, 54, 76]). Several studies have compared two or more real, physical robots with different appearances in interactions with humans (e.g., [11, 39, 44, 47, 53, 57]). For example, Li et al. [53] compared three real, physical robots with different shapes (one anthropomorphic, one zoomorphic and one machine-like). They started from the assumption that a robot with recognizable eyes and a human-like body shape is more likely to be categorized as human-like, while a robot without facial features and with wheels and tracks is more likely to be categorized as machine-like [78]. They found that the robot's appearance generates strong, positive correlations between interaction performance (understood as active response and engagement) and preference (understood as likeability, trust, and satisfaction in the human–robot interaction). However, it is difficult to identify in their study which element of appearance, among all those that compose it, generates the effects that were examined. In our study, therefore, we decided to use robots characterized by a decreasing degree of morphological human-likeness. To establish this decreasing degree, we took inspiration from Eaton's taxonomy [17, pp. 36–37] of n + 1 different levels of robots, which classifies robots as follows:

0: Replicant, which looks perfectly like a human being, both in physical appearance and behavior, except for the biological functions; people with normal senses cannot distinguish it from a human.

1: Android, which presents a morphology and behavior very close to those of a human and is characterized by high levels of intelligence and dexterity.

n − 3: Humanoid, which is close to a human in both "body" and "brain," but without the possibility of mistaking the robot for a human being.

n − 2: Inferior Humanoid, which has a broadly human morphology but reasonable intelligence and dexterity, mainly oriented to a limited set of tasks.

n − 1: Human-inspired, but unlike a human; it may be wheeled or have bipedal capabilities, and has limited intelligence as well as dexterity.

n: Built-for-Humans, which does not look like a human and is able to operate in most environments, performing a limited set of tasks.

Among our robots, InMoov corresponds to the Humanoid level (n − 3): very human-like, but without the possibility of mistaking the robot for a human being. Padbot and Joy Robot correspond to Human-inspired robots that are unlike a human (n − 1); the fact that they belong to the same level is not particularly problematic for us, because this layer is so broad that it includes very different robots. In our case, Padbot has a long neck topped by a face-like screen in which the real face of an operator can appear; it can talk and move, which gives it a certain similarity to the human body and places it at the upper end of the layer. Joy Robot, which is similar to a doll but has expressive eyes and mouth and arms that it can move, sits at a lower point of layer n − 1. Finally, Turtlebot corresponds to a Built-for-Humans robot (n), which does not look like a human and is able to operate in most environments. Our four robots thus occupy the medium-to-low range of the human-likeness scale for robots. Consequently, they should be free from any uncanny-valley effect, which, according to Mori [61], should arise at the first levels of this taxonomy (Replicant = 0 and Android = 1). However, is this the case? We will return to this question in the next section. Overall, in our exploration of the mental and emotional capabilities attributed to these four robots, we took inspiration from research carried out in Australia, China, and Italy [34] that explored the mental capacities and emotions believed to differentiate humans from animals, robots, and supernatural beings. That study, conducted 12 years ago, showed that Chinese, Australian, and Italian respondents attributed low levels of mental capacities, and even lower levels of primary and secondary emotions, to robots. It found that distinctive patterns of mental capacities distinguished humans from non-humans in these three cultures: robots were perceived as lacking capacities of perception and rational cognition, and as even more deficient in emotion. In the 12 years since that research, many social processes and phenomena have unfolded, such as the further development of new technologies (e.g., [5, 25, 55]); the digitalization of the whole field of information, communication and entertainment (e.g., [1, 38, 63]), including the mass media (e.g., [72]); and the diffusion of much more advanced social robots and virtual assistants in several sectors of society [3, 22, 30].
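To keep this classification at hand in what follows, the mapping between the four study robots and Eaton's levels can be written out explicitly. The short Python sketch below is purely illustrative: the level labels follow Eaton [17], but the "upper/lower part of the layer" annotation inside level n − 1 is our own device, not part of the taxonomy.

```python
# Illustrative mapping of the four study robots onto Eaton's taxonomy [17].
# The "upper/lower part of the layer" notes inside level n-1 are our own
# reading of the robots' relative human-likeness, not part of the taxonomy.
STUDY_ROBOTS = {
    "InMoov": ("Humanoid", "n-3"),
    "Padbot": ("Human-inspired", "n-1, upper part of the layer"),
    "Joy Robot": ("Human-inspired", "n-1, lower part of the layer"),
    "Turtlebot": ("Built-for-Humans", "n"),
}

# Robots are listed in decreasing order of morphological human-likeness.
for robot, (category, level) in STUDY_ROBOTS.items():
    print(f"{robot}: {category} (level {level})")
```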

These phenomena have probably increased people's familiarity with media products that simulate human lifelikeness to varying degrees (e.g., reality shows; [65]) and with visual products featuring robots (e.g., movies, television series, cartoons) that anticipate many factual aspects of robotic innovation and merge them with elements of fantasy [24]. The increasing technological sophistication in the creation of more advanced social robots seems to be matched by increasing technological discernment on the part of potential users [73]. Consequently, these sociotechnical changes could have changed people's perceptions of robots.

We have taken three theoretical points of reference for our study: for the robots' mental functions, the theory of the dimensions of mind perception proposed by Gray et al. [28]; for the robots' emotions, the theory of infra-humanization elaborated by Leyens et al. [52]; and for the robots' conception, the theory of social representations [62]. The first and second approaches are pioneering contributions to the study of the human/non-human contrast. In the present study, we considered a selection of mental states and processes from the classification described by D'Andrade [14], framed within the mind-perception approach. This taxonomy consists of four dimensions: Perceptions (hearing, seeing, smelling, tasting), Wishes (desiring, needing, wanting, wishing), Thoughts (imagining, knowing, reasoning, thinking) and Intentions (choosing, deciding, expecting, intending, planning). The second theory, infra-humanization, regards emotions and focuses on a subtle form of dehumanization and on the human/non-human contrast. While explicit dehumanization equates its targets (e.g., the members of a minority group) with animals, devils, machines, or objects to deprive them of their humanity [35], infra-humanization reaches the same objective by denying those targets the experience of secondary emotions. This theory rests on the distinction between primary and secondary emotions: the former are the emotional reactions that humans share with animals (e.g., anger, fear, pleasure), while the latter, also called sentiments, are the emotions rated as specifically human (e.g., pride, nostalgia, shame, remorse). The third theory, that of social representations, helps us capture how our respondents spontaneously perceive and conceive these robots [62]. By social representations we mean the set of concepts, statements and explanations that arise in daily life in the course of interpersonal communication; they can be considered organizing principles and specific ways of expressing knowledge in a society, shared by large groups of people. All social representations tend to transform what is extraneous or new into something familiar.

This paper is structured as follows. In the next section, we analyze the question of robots' appearance and the uncanny-valley theory; in the section after that, we illustrate the method and the measures applied. The results of the research are presented in a dedicated section and discussed in the final section, together with an examination of the strong and weak points of the study and future directions for research on the relation between the perceived cognitive and affective capabilities of robots and their decreasing degree of morphological human likeness.

2 Robot Appearance and Uncanny Valley Theory

So far, research has shown that a robot's physiognomy affects the mental model that humans build of the robot itself. In other words, it changes people's perceptions of a robot's human likeness, knowledge, and sociability [66]. For instance, Powers and Kiesler found that a shorter chin makes a robot's eyes appear larger in the face. This type of face (large eyes and a small chin) is called baby-faced. A baby-faced robot seems to replicate what Berry and McArthur [7] found for humans: baby-faced men are perceived as more naive, honest, kind, and warm. As robots begin to enter people's everyday lives and homes, and ordinary people start to interact with them, the question of how their appearance affects people's perception and behavior becomes increasingly important.

Four main fields of investigation regarding robots' appearance have been explored so far. The first concerns facial features and their importance for the perceived similarity of robots to humans [70]. Research has found that the degree to which a robot is perceived as human-like is related to the number of features its head presents (e.g., eyes, nose, mouth; DiSalvo et al. [16]). More recently, McGinn [60] has presented a review of the literature exploring, at a macro level, the capability for social interaction of robots with head-like features, and has proposed a taxonomy of robots' social interfaces. The second field focuses on how robots' appearance shapes the evaluation of their cognitive abilities. A number of studies have shown that expectations of a robot's cognitive capabilities are connected to its appearance (e.g., [29, 36, 37, 48, 78]). In particular, Krach et al. [48] found that increased perceived human-likeness (in terms of appearance) raises expectations about the robot's cognitive abilities, in the sense that humans ascribe mental states to humanoid robots, though at a lower level than to humans. The degree of a robot's human-likeness modulates its perception and biases "mental" state attribution, and this modulation seems to be linear: the more a robot exhibits human-like features, the more people build a model of its "mind." The third field investigates the role played by human-like abilities in robot mind perception. For instance, Küster et al. [50] found that human-like abilities supply more potent cues to mind perception than appearance does. The fourth field explores emotions: only a few studies have systematically investigated humans' emotional reactions towards robots (e.g., [69, 75]).

The problem of a robot's appearance is central to the development of a computational model that can simulate human modes of social interaction. Taking this issue into account is crucial for robot designers and engineers because no consensus has yet been reached as to whether it is better to use more human-like or more machine-like robots, and the context probably matters. To give some examples, several studies have demonstrated that people empathize more with human-like robots [67], especially in healthcare, where humanoid robots have been shown to increase users' propensity to imitate [64] and have proven effective in the treatment of people with autism [12, 27] and Alzheimer's disease [74]. Other studies have indicated that robots that are more human-like and more complex in terms of sophisticated functionality are less accepted [31].
More recent research suggests that a robot's appearance should be matched to the task it performs [9]. According to another study, users prefer a fluffy robot as a companion but a more machine-like robot when they need to be reminded to take their medicines [8]. In several robotics studies, the impression is that when researchers use a robot with a morphological similarity to humans, this objective human-likeness is taken for granted, without verifying at the outset whether the robot's appearance is actually perceived as human-like by the participants. In this research, we asked whether the graded morphological similarity of these four robots was really perceived as such by the observers.

Not only is the appearance of social robots strategic; the issue of the uncanny valley also stands out (Footnote 1). Mori [61] hypothesized an 'uncanny valley' effect in human–robot interaction, proposing that there is no linear relationship between robots' human-likeness and our affinity with them. The human likeness of a robot elicits a feeling of affinity, but this affinity turns into eeriness once a certain degree of human likeness is reached. For Mori, affinity with human-like robots increases until an uncanny valley is reached, caused by perceived imperfections in near human-like robot forms [71]. The more human-like the robots, the more pleasantly they are experienced, up to the point at which they start to elicit a negative emotional response: the uncanny feeling (UF) [75]. According to Mori, movement in human-like robots magnifies the uncanny valley [15, 59], given that movement is synonymous with life. The appearance of movement in an object that should be inanimate arouses greater concern than the reverse, that is, stillness in a person who otherwise looks alive [20]. When the boundaries between the various spheres blur (i.e., human beings, animals, plants, and the inanimate world), the construction and representation of reality becomes perturbed (e.g., [18]). Here, we investigate whether robots whose human-likeness clearly cannot reach the point at which the uncanny valley effect is supposed to be produced can nonetheless generate this effect. We hope that Eaton's taxonomy will allow us to better locate the breaking point of robot humanness at which the uncanny valley is supposed to start.

This theory is problematic because, although a number of studies have been conducted to test it, the results are contradictory. While there is considerable evidence supporting the existence of some uncanny valley effect (e.g., [40, 43, 56, 79]), there is also evidence denying its existence (e.g., [31, 32]). These studies applied various methods and used various materials as stimuli (e.g., morphed pictures, videos of actual robots, robots and computer graphics, and pictures of actual humans), which were often not standardized (e.g., [4, 58, 68]). Furthermore, the studies that confirm the uncanny valley hypothesis demonstrate that uncanny valley responses exist tout court, without exploring which elements of appearance contribute to a positive or negative perception.

Despite an increasing interest in conducting empirical research on the uncanny valley theory, these contradictory findings and shortcomings have raised concerns among researchers about the scientific standing of the theory, for both theoretical [10, 45, 77] and methodological reasons [51]. The present study aims to contribute to this debate by addressing one of the central aspects of the theory: the point from which the uncanny valley effect is generated. As mentioned earlier, we want to explore whether the uncanny valley effect can also be generated by robots with a moderate human likeness. We set up the present study before becoming aware of a very recent paper by Kim et al. [46]. In their study, using the largest set of real-world robots (n = 251) currently available in open-source format (the ABOT Database; Footnote 2), they discovered that an additional valley emerges when robots' appearance has low to moderate human-likeness. Our study will either confirm or disconfirm this result.

Our research questions and hypotheses are as follows:

  • RQ1. Is the attribution (or non-attribution) of mental and emotional capabilities to each of these four robots modulated according to their decreasing degrees of morphological and behavioral similarity to humans?

    We expected a decreasing gradation in the attribution of mental and emotional capabilities to these robots, corresponding to their decreasing degree of morphological human-likeness (H1).

  • RQ2. Is the morphological similarity of a robot to humans perceived as such by observers?

    We hypothesized that a robot's morphological similarity to humans might not be perceived as such by people (H2).

  • RQ3. Does the effect of the uncanny valley still arise even at lower levels of robot–human likeness?

    We hypothesized that robots whose human-likeness clearly does not reach the point at which the uncanny valley effect is produced will not be able to generate this effect (H3).

3 Participants, Materials, and Measures

3.1 Participants

Our convenience sample was composed of 62 students, 38 males and 24 females, with an average age of 21.8 years (SD = 3.72). Their ages ranged from 18 to 44 years, with almost three-fourths of the respondents aged 18 to 21. As to nationality, 44 students were Italian and 18 were foreign (mainly German); German students were present because this study was co-organized by the University of Udine and the University of Erfurt within an Erasmus exchange program. Regarding the students' curricula, 11 came from socio-humanistic programs and 51 from techno-scientific programs. The students' previous familiarity with robots was very limited: only four (6.5%) had used robots at school, nine (14.5%) at university and 12 (19.4%) at home, which is considered the hub of the robotization of the reproductive sphere [13, 21]. The few robots used at school were Turtlebot, Joy Robot, Padbot and robots created by students, such as a robotic arm; at university the students had used Alfa 1, Nao and Replika; and at home Roomba, Bimby, and Alexa.

3.2 Materials

This research was designed as a live presentation and illustration, to a group of students, of four robots showing decreasing degrees of human likeness. The robots used in this study, in descending order of morphological and behavioral similarity to humans, are InMoov, Padbot, Joy Robot and Turtlebot (Fig. 1).

InMoov is the first open-source, life-sized humanoid robot that can be 3D printed and animated. It began in 2012 as a personal project of Gael Langevin, a French sculptor and designer, and is easily replicable in any home by means of a 3D printer. Padbot U1 is a commercial telepresence bot designed by a Chinese company (Tianhe, Guangzhou). It has a long neck and a face-like screen over this neck, in which the real face of an operator can appear, which gives it a vague similarity to the human body. It is quite similar to the Giraffe robot used by social workers who wish to be present remotely. Joy Robot is a DIY (Do It Yourself) robot with a certain ability to express emotions by changing the shape of its eyes and mouth and by moving its arms in various ways. This volunteer project was born within the maker movement in Brazil in 2016, with the purpose of developing a technology able to attract communities and assist children in hospitals. Turtlebot 2 is a machine-like robot with a very low degree of similarity to humans: a low-cost, personal robot kit with open-source software that has two-wheel navigation, can see in 3D, and can build maps and drive around. The robots were presented to the students in the following order: first InMoov, then Padbot, then Joy Robot and finally Turtlebot.

3.3 Procedure and Measures

We brought all of the robots to an amphitheater classroom at the University of Udine. We began with InMoov, which was already seated at the professor's desk, with two Master's students beside it, when the Bachelor's students arrived. The robot faced the Bachelor's students while Master's students from the Laboratory of Social Robotics illustrated its main characteristics and functionalities (15 min). After these explanations, the Bachelor's students were invited to approach the robot at close distance to familiarize themselves with it and directly experience some forms of interaction with it. This phase lasted 30 min. In the last 15 min, the students were invited to complete a paper-and-pencil questionnaire containing both closed and open-ended questions. After the first hour, InMoov was removed and Padbot entered the room by itself, moving through the classroom up to the desk. This was the only exception in the procedure, but it was important for the students to see how this robot moved and that the tablet over its neck showed the face of one of the researchers, who greeted them. The procedure applied to InMoov was then repeated for Padbot. After the second hour, Padbot left the room and Joy Robot was brought in; we showed the students how it could move its eyes, mouth and arms in an expressive way, and the procedure was repeated again. After the third hour, Joy Robot was removed, Turtlebot was brought into the room, and the procedure was repeated for the fourth time. Overall, the students' experience lasted 4 h.

In the questionnaire, we investigated several areas. The first regarded the conception of each robot and was addressed with the task: 'Please write the first three words that the robot you have observed evoked in you.' We used this method, called a free association exercise, because it captures the spontaneous resurfacing of words elicited by the cue. Thanks to its projective character, this technique [6] offers the advantage of bringing out the latent and implicit dimensions of knowledge and opinion about the specific object of representation, thereby allowing access to the figurative core of the social representations of these four robots [62].

The second area of the questionnaire concerned the human characteristics attributed to each robot. The degree of human likeness was investigated through a series of items on the perception of mental functions adapted from D'Andrade [14], as well as on the perception of primary emotions (e.g., anger, fear, surprise, pleasure, pain, and joy) and secondary emotions (e.g., hope, love, guilt, remorse, pride, and shame) as identified by Leyens et al. [52]. Operationally, we gave the students the task: 'Please evaluate the degree to which you think that this robot is able to…'. A list of 14 items followed: hearing, seeing, smelling, tasting, desiring, needing, wanting, imagining, reasoning, thinking, choosing, deciding, expecting and planning. These items were measured on a 10-point Likert scale ranging from not at all (1) to very much (10). The first four items described the four senses and belonged to the dimension of Perceptions; they were followed by the dimension of Wishes (desiring, needing and wanting), the dimension of Thoughts (imagining, reasoning and thinking) and, finally, the dimension of Intentions (choosing, deciding, expecting and planning). We also added the item 'being conscious of oneself and the world,' which marks the boundary beyond which, according to Faggin [19], a robot cannot go.
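To illustrate how the composite cluster scores used in the Results section can be derived from the item ratings, here is a minimal sketch in Python. The data-frame layout and column names are our assumption for illustration, not the authors' actual data file:

```python
import pandas as pd

# The four clusters of mental functions, as operationalized in the
# questionnaire (each item rated on a 10-point scale, 1 = not at all,
# 10 = very much).
CLUSTERS = {
    "Perceptions": ["hearing", "seeing", "smelling", "tasting"],
    "Wishes": ["desiring", "needing", "wanting"],
    "Thoughts": ["imagining", "reasoning", "thinking"],
    "Intentions": ["choosing", "deciding", "expecting", "planning"],
}

def add_composite_scores(df: pd.DataFrame) -> pd.DataFrame:
    """Add one composite column per cluster (the mean of its items)."""
    out = df.copy()
    for cluster, items in CLUSTERS.items():
        out[cluster] = out[items].mean(axis=1)
    return out

# Hypothetical rows: one per participant x robot.
df = pd.DataFrame({
    "participant": [1, 1],
    "robot": ["InMoov", "Turtlebot"],
    "hearing": [8, 4], "seeing": [8, 6], "smelling": [2, 1], "tasting": [1, 1],
    "desiring": [3, 2], "needing": [3, 2], "wanting": [2, 2],
    "imagining": [4, 2], "reasoning": [5, 4], "thinking": [5, 4],
    "choosing": [4, 4], "deciding": [4, 4], "expecting": [3, 3], "planning": [5, 5],
})
print(add_composite_scores(df)[["robot"] + list(CLUSTERS)])
```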

Emotions were explored by asking: 'Please evaluate the degree to which you think that this robot is able to experience the following emotions…'. A list of 12 items followed: anger, fear, surprise, pleasure, pain, joy, hope, love, guilt, remorse, pride and shame (all items measured on a 10-point Likert scale from 1 = not at all to 10 = very much).

In addition, we asked: 'Please write the three most relevant emotions that this robot evoked in you' (in free-answer mode). We then asked the participants to evaluate some of the emotions they felt: 'Looking at the robot that you observed, what did you feel? A sense of discomfort, a sense of eeriness, a sense of curiosity' (each measured on a 10-point Likert scale from 1 = not at all to 10 = very much).

A third area that aimed to test the familiarity of students with robots contained questions about the use of robots at home, at school and at university (with yes/no response categories).

In this study, we used both qualitative (free association exercises) and quantitative methods (survey). The gathered data were analyzed by means of content analysis and multivariate analysis of variance. As to the content analysis, given that the free association exercise essentially collects words and that the number of participants was low, we opted to perform it manually. As is required in these cases, three independent judges (or coders) carried out the analysis separately; they then compared their results and negotiated a shared decision on the elaboration of the categories of meaning [49]. The other statistical analyses were performed with SPSS Statistics 21. The results are presented in the next section.
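The authors ran these analyses in SPSS. For readers who wish to reproduce the within-subjects design with open-source tools, a minimal sketch follows; note that statsmodels' AnovaRM fits a univariate repeated-measures ANOVA for one dependent variable at a time (not the full MANOVA reported below), and the file and column names are assumptions:

```python
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Hypothetical long-format data: one row per participant x robot, with a
# composite cluster score (e.g., Perceptions) as the dependent variable.
df = pd.read_csv("ratings_long.csv")  # columns: participant, robot, perceptions

# Repeated-measures ANOVA with 'robot' as the within-subjects factor.
res = AnovaRM(
    data=df,
    depvar="perceptions",
    subject="participant",
    within=["robot"],
).fit()
print(res.anova_table)  # F value, numerator/denominator df, p-value
```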

4 Results

4.1 The Robots’ Conception

The results obtained by means of the free association exercise on how our observers viewed the robots are shown in Table 1. Overall, for the four robots, we collected a dictionary of 721 headwords and a dictionary of 575 distinct words. After the elimination of unclassifiable words, 698 remained. These words were classified by means of content analysis into seven categories, the same for each robot: (1) Innovation, science and technology; (2) Negative values; (3) Positive values; (4) Humankind and the body; (5) Functions and characteristics; (6) Fiction and audio-visual products; and (7) Intelligence (Table 1).

Table 1 The categories of the conception of the four robots

We will briefly describe each category by reporting the most frequent words forming the core of its meaning. The category 'Innovation, science and technology' contained terms such as technology, machine, future, innovation, progress, robot, vacuum cleaner, humanoid and science. The 'Negative values' category included terms such as restlessness, eeriness, anguish, uselessness, incompleteness and boredom. 'Positive values' contained terms such as curiosity, interest, simplicity, utility, sweetness, sympathy, tenderness, cuteness, wonder, exceptionality, amazement and extraordinariness. The category 'Functions and characteristics' contained words such as modularity, white, small, complicated, multifunctionality, interaction, work, company, communication, driver, design, art, aesthetics, movement, videocall, practical, social, toy and game. The category 'Fiction and audio-visual products' included terms such as fake and Wall-e. The category 'Intelligence' contained artificial intelligence and intelligence. Finally, the category 'Humankind and the body' included terms such as identity, children, infancy, heart, skeleton, humanized, person and anthropomorphism.
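As a purely illustrative sketch of the tallying step (the actual coding was negotiated manually by three independent judges), the classification of free associations into categories can be thought of as a lookup followed by a count; the small word-to-category map below is an assumed excerpt, not the judges' full coding frame:

```python
from collections import Counter

# Assumed excerpt of the coding frame; the real frame was elaborated and
# negotiated manually by three independent judges [49].
CODING_FRAME = {
    "technology": "Innovation, science and technology",
    "future": "Innovation, science and technology",
    "eeriness": "Negative values",
    "curiosity": "Positive values",
    "person": "Humankind and the body",
    "movement": "Functions and characteristics",
    "wall-e": "Fiction and audio-visual products",
    "intelligence": "Intelligence",
}

def tally(associations):
    """Count category occurrences, skipping unclassifiable words."""
    return Counter(
        CODING_FRAME[word.lower()]
        for word in associations
        if word.lower() in CODING_FRAME
    )

# Words not in the frame (here "xyzzy") are dropped as unclassifiable.
print(tally(["technology", "curiosity", "eeriness", "xyzzy"]))
```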

The results in Table 1 show that the conception of these robots does not vary gradually with their degree of human likeness: InMoov, the most human-like robot, is perceived as the most negative, the most deceptive and the most intelligent; Padbot is seen as quite positive and functional; Joy Robot is considered the most positive and quite functional; and Turtlebot the most innovative and functional. Second, the category of 'Positive values' receives the highest number of occurrences, which suggests that the images of these robots are mainly positive. Third, the category 'Negative values' shows twice as many occurrences for InMoov (the robot most similar to humans) as for Padbot and Turtlebot, and many more than for Joy Robot. In this case, the highest degree of human likeness is associated with particularly strong negative values. Fourth, the category of 'Innovation, science and technology' has a certain importance, but not for every robot. Fifth, the category of 'Functions and characteristics' becomes more important as the degree of similarity to humans decreases.

4.2 Are Human Mental Functions Attributed to InMoov, Padbot, Joy Robot and Turtlebot, and, If Yes, to What Extent?

We asked the participants to evaluate whether each of these robots is able to carry out the following mental functions, organized into four clusters: Perceptions (hearing, seeing, smelling and tasting), Wishes (desiring, needing and wanting), Thoughts (imagining, reasoning and thinking) and Intentions (choosing, deciding, expecting and planning) [14]. The means and standard deviations of the single functions, and those of the composite scores of the four clusters that define the main dimensions of mind perception, are reported in Table 2 for each robot; the means of the clusters are illustrated in Fig. 2 (Footnote 3).

Table 2 Mental functions attributed to the four robots

Fig. 2 The means of the four clusters of mental functions attributed to InMoov, Padbot, Joy Robot and Turtlebot

The scores are overall rather low; the only scores that exceed the mid-point of the response scale are those related to two perceptions: the ability to hear and the ability to see.

Both the scores of each single mental function and the composite scores of each cluster were subjected to multivariate analysis of variance with one within-subjects factor (the four robots). The multivariate effect of the within factor was significant both in the analysis with the single mental functions as dependent variables, F(42, 462) = 5.05, p < 0.0001, η²p = 0.32, and in the analysis with the four clusters as dependent variables, F(12, 492) = 7.57, p < 0.0001, η²p = 0.16. Table 2 also reports the results of the univariate tests and the partial eta-squared values; in both analyses, all the univariate tests were significant. In the analysis of the four clusters of mental functions, we ran post-hoc tests with the Bonferroni method. These comparisons showed that the capacity for Perceptions is attributed significantly more to InMoov than to Padbot and Turtlebot (which do not differ from each other) and much more than to Joy Robot. Wishes, Thoughts, and Intentions were also attributed significantly more to InMoov than to the other three robots. Regarding Intentions, Turtlebot is in second position, at a significant distance from InMoov, while Padbot and Joy Robot receive the lowest scores (without differences between them).
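For readers who want to verify the reported effect sizes, partial eta squared for a within-subjects effect can be recovered from the F statistic and its degrees of freedom. A worked check against the cluster-level test above:

\[
\eta_p^2 \;=\; \frac{F \times df_{\text{effect}}}{F \times df_{\text{effect}} + df_{\text{error}}}
\;=\; \frac{7.57 \times 12}{7.57 \times 12 + 492}
\;=\; \frac{90.84}{582.84} \;\approx\; 0.16 .
\]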

Finally, we examined the respondents' evaluations on the item 'being conscious of oneself and the world.' The mean for this item is 2.73 (SD = 2.59) for InMoov, 1.65 (SD = 1.73) for Padbot, 1.39 (SD = 1.26) for Joy Robot and 2.70 (SD = 3.02) for Turtlebot. Thus, the respondents do not attribute the capacity for self-awareness to any of these robots, which confirms Faggin's [19] argument. However, the analysis of variance with one within-subjects factor (the four robots) revealed a significant main effect of this factor, F(3, 180) = 8.13, p < 0.0001, η²p = 0.12: as the pairwise comparisons showed, the two robots judged relatively less lacking in self-awareness are InMoov and Turtlebot (without differences between them), while the least aware are Padbot and Joy Robot (again without differences between them).

4.3 Are Emotions Attributed to InMoov, Padbot, Joy Robot and Turtlebot and, If Yes, to What Extent?

We asked the participants to evaluate whether these robots were able to experience a list of primary emotions (anger, fear, surprise, pleasure, pain and joy) and secondary emotions (hope, love, guilt, remorse, pride and shame). The means and standard deviations of the single emotions, and those of the two clusters (Primary emotions and Secondary emotions), are reported in Table 3 for each robot and illustrated in Fig. 3.

Table 3 Primary and secondary emotions attributed to InMoov, Padbot, Joy Robot and Turtlebot

Fig. 3 The means of the two clusters of Primary and Secondary emotions attributed to InMoov, Padbot, Joy Robot and Turtlebot

Table 3 also reports the results of the univariate tests of the multivariate analyses of variance (F values and partial eta squared).

These mean scores show that the respondents consider the four robots fundamentally incapable of feeling emotions. There are, however, some differences. The multivariate analysis of variance with one within-subjects factor (the four robots) and the single emotions as dependent variables showed a significant multivariate effect of the within factor, F(36, 522) = 1.61, p < 0.02, η²p = 0.10, as well as significant univariate tests throughout. For the two clusters (Primary and Secondary emotions), the multivariate effect of the within factor was also significant, F(6, 366) = 4.90, p < 0.0001, η²p = 0.07, as were the two univariate effects. The post-hoc analyses with the Bonferroni method showed that primary emotions were attributed slightly but significantly more to InMoov and Joy Robot (which did not differ from each other) than to the other two robots. The situation is similar for secondary emotions: the post-hoc comparisons showed that respondents attributed to InMoov and Joy Robot a slightly but significantly greater capacity to feel secondary emotions, which are those typical of human beings.

4.4 The Emotions Evoked by these Robots and the Uncanny Valley Effect

4.4.1 Free Association Exercises Regarding the Emotions that InMoov, Padbot, Joy Robot and Turtlebot Evoked in the Respondents

After exploring the emotions that, according to our respondents, the four robots could experience, we investigated the spontaneous emergence of the emotions that InMoov, Padbot, Joy Robot and Turtlebot aroused in our participants. From this free association exercise, we collected an overall dictionary of 587 headwords and a dictionary of 172 distinct words. After removing some unclassifiable words, we conducted a content analysis of the remaining words, which could be traced back to the twelve emotion categories used previously (Table 4). These categories express the main constellations of emotions that emerged from the emotional continuum gathered around these twelve emotions. 'Anger' includes only anger. 'Surprise' comprises curiosity, wonder, surprise and interest. 'Pain' includes words such as restlessness, anxiety, discomfort, disorientation, insecurity, indifference, disappointment, boredom, doubt and sadness. 'Fear' consists of the word fear. 'Pleasure' includes pleasure, serenity, calm, amusement, sympathy, empathy, enthusiasm and tenderness. 'Joy' contains terms like happiness and joy. 'Love' contains words such as love, fondness, intimacy and affection. 'Hope' and 'Pride' include only hope and pride, respectively. Remorse, guilt and shame are totally absent; others, such as anger, are almost absent, while still others are very much present.

Table 4 Emotions that emerged from the free association exercise as evoked in the participants by the four robots

The main category of emotions evoked by our robots is surprise (N = 210); the second is pleasure (N = 162), followed by pain (N = 115), with fear being almost irrelevant. As to the single robots, InMoov evokes especially pain and surprise (in roughly equal measure) and, at a distance, pleasure. Padbot arouses pleasure, followed by surprise and pain. Joy Robot evokes, first, pleasure, followed by surprise and joy. Turtlebot arouses surprise, followed by pain and pleasure. Comparing the robots with each other, Joy Robot is unbeatable in provoking joy, Turtlebot is the robot that surprises the students the most, InMoov evokes more pain than any other robot, and Padbot is the most balanced in the emotions it provokes.

4.4.2 Evaluation of the Discomfort, Eeriness, and Curiosity Evoked in the Participants

We further investigated the emotions felt by the participants towards these robots to see whether the intensity of three specific feelings (discomfort, eeriness and curiosity) varies gradually with the robots' degrees of human likeness. The means, standard deviations and results of the analyses of variance are reported in Table 5.

Table 5 Feelings evoked by InMoov, Padbot, Joy Robot and Turtlebot

These results reveal that the degree of discomfort and eeriness felt towards these four robots is quite low (under the mid-point of the response scale), while curiosity seems to be the dominant feeling. The analyses of variance with one within-subjects factor showed that the judgments of the four robots differed. The post-hoc paired comparisons revealed that the participants felt the highest sense of discomfort and eeriness, but also of curiosity, towards InMoov as compared with the other robots. Furthermore, another series of comparisons conducted with paired-samples t-tests showed that, for all four robots, discomfort does not differ from eeriness, while discomfort and eeriness are both significantly lower than curiosity (all ts above 5.91, p < 0.0001).
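A minimal sketch of these paired comparisons in Python (the authors used SPSS); the rating vectors below are invented placeholders, and the Bonferroni step simply multiplies each p-value by the number of comparisons:

```python
from scipy import stats

# Hypothetical per-participant ratings (1-10) of the three feelings for one robot.
discomfort = [2, 3, 1, 4, 2, 3, 2, 1]
eeriness   = [2, 2, 1, 3, 2, 4, 1, 2]
curiosity  = [8, 7, 9, 6, 8, 7, 9, 8]

# Paired-samples t-tests; with three pairwise comparisons per robot, a
# Bonferroni correction multiplies each p-value by 3 (capped at 1).
pairs = [
    ("discomfort vs eeriness", discomfort, eeriness),
    ("discomfort vs curiosity", discomfort, curiosity),
    ("eeriness vs curiosity", eeriness, curiosity),
]
for label, a, b in pairs:
    t, p = stats.ttest_rel(a, b)
    print(f"{label}: t = {t:.2f}, p_bonferroni = {min(p * 3, 1.0):.4f}")
```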

5 Discussion and Final Remarks

The first result of this study is that, in general, our students attributed very low scores to these robots regarding their mental functions, and even lower scores regarding their emotions. Our overall results do not differ much from those obtained by Haslam et al. [34]: the "folk model" of the mind and emotions of robots shows only a modest change. Regarding the mental functions, the respondents attributed two perceptions to these robots, the ability to hear and the ability to see, which is perfectly understandable given the level reached by current sensors. In particular, the ability to hear is recognized only for InMoov, while the ability to see is attributed to InMoov, to Turtlebot and, at a distance, to Padbot. For the rest, the only other mental functions that approach (without reaching) the mid-point of the response scale are reasoning and planning, attributed to InMoov. Regarding the attribution of emotions, the students' evaluations are even more severe: these robots are fundamentally perceived as incapable of feeling emotions.

We will use our research questions as a guide to structure the rest of the discussion. We begin with RQ1: "Is the attribution (or non-attribution) of mental and emotional capabilities to each of these four robots modulated according to their decreasing degrees of morphological and behavioral similarity to humans?" The first result is that the students do not attribute mental functions and emotions to these robots in a way that is modulated by their different degrees of human likeness. Thus, our H1 (i.e., the more a robot resembles humans, the more it is perceived as having mental and emotional capabilities) is not confirmed. The point is not the gradualness of the similarity to humans: only InMoov, the most human-like robot, was, and only very partially, credited with human mental functions. A robot whose appearance is not strongly similar to humans receives no recognition of any mental function from the participants. This confirms the split described earlier and contrasts with research such as that by Krach et al. [48], who found that the more a robot exhibits human-like features, the more people build a model of its "mind."

We next answer RQ2: "Is the morphological similarity of a robot to humans perceived as such by observers?" This study has found that the morphological similarity of a robot to humans is not automatically perceived as such by observers, and thus our hypothesis (H2) is confirmed. In the figurative nucleus of the students' spontaneous conceptions of these robots, the robots' material body is in fact completely absent. This means that our participants have elaborated a disembodied image of these robots. The representations of these robots differ greatly from those described in previous research on robots [24], as well as from those describing information and communication technologies at the beginning of their diffusion [23]; in those studies, one of the most important dimensions of the representations was the material "body." Thus, the human body is no longer a point of reference or comparison, and these robots are not conceptualized through a process leading back to similarity to, or dissimilarity from, human beings. They are instead perceived as different entities, and their intelligence is not a true issue. Robots are thus perceived not on the basis of their morphological similarity to humans but beyond it. This result remains unexplained for the moment, because we cannot yet identify the reasons why current mental schemas of robots have internalized disembodied robots. Furthermore, it is important to stress that there is no unique structure of the figurative nuclei of the representation of robots; rather, each robot has its own structure. This makes it impossible to detect a gradualness of representation corresponding to the gradualness of these four robots' human likeness. There is, instead, a caesura between InMoov and the others, in the sense that InMoov, the most human-like robot, is perceived in the most negative way. This result suggests that a high similarity to humans is considered a disvalue for a robot, a real problem that engulfs any interest in its affordances. With InMoov, people are, in fact, not interested in understanding what it can do but only in defining what it is, in negative terms. Last but not least, this robot also presents some problems of fictionality, in the sense that it is perceived as pretending to be a human.

Finally, we answer RQ3: "Does the effect of the uncanny valley arise even at lower levels of robot–human likeness?" An unexpected result comes from the categories of emotions most frequently evoked in the participants by our four robots: InMoov, the robot most similar to humans, is perturbing because it evokes pain (e.g., anxiety, disorientation, insecurity) more than any other robot. This shows that even robots with a medium–high level of similarity to humans (n − 3), like InMoov, can generate the uncanny valley effect, somehow confirming the study by Kim et al. [46], who discovered an additional valley connected to robots with low to moderate human likeness. Consequently, our H3, according to which robots with a medium human-likeness (n − 3) are unable to generate the uncanny valley effect, is not confirmed. A second result, from our exploration of the sense of discomfort, eeriness and curiosity felt towards these robots, is that our participants felt the highest sense of both discomfort and eeriness towards InMoov in comparison with the other robots. This is in line with the previous result. However, this second analysis adds another important finding: the uncanny feeling is moderated in this case by the feeling of curiosity, which is stronger. While discomfort and eeriness remain under the mid-point, curiosity seems to be the dominant feeling. This result also leads us to conclude that there is no gradual intensity of emotional discomfort and eeriness corresponding to the various degrees of these robots' human likeness; rather, there is a contraposition between the humanoid InMoov and all the other robots, whose similarity to humans is insufficient to arouse the uncanny valley effect.

The strong points of this study are, first, that it used four physical robots to empirically measure both the live evaluation of their mental functions and emotions and the degree to which students reacted emotionally towards them, obtaining a series of clear results; and, second, that it combined quantitative and qualitative methods.

This study also has several limitations. First, it was limited by the small number of participants, who formed a convenience sample. Second, because we used real robots (not pictures or videos), we had to limit their number: the practical exercise with the four robots lasted four hours, and it would have been impossible to extend it further, even though only a few robots were used. Third, we had to choose among the robots available in the university's laboratory, which lacked a more advanced humanoid; this was a major drawback for an analysis based on the gradualness of robot similarity to humans. Of these four robots, only InMoov and Padbot are included in the ABOT database, where they have human-likeness scores of 59 and 4.13, respectively. However, the human-likeness scoring procedure applied in the ABOT Database is very specific and is performed online with a picture of the robot, whereas we decided to show real, physical robots. Moreover, the fact that Padbot and Joy Robot belong to the same level n − 1 of Eaton's taxonomy can represent a problem, although this layer is so broad that it includes very different robots; as noted earlier, Padbot, with its long neck, its face-like screen in which the operator's real face can appear, its speech and its ability to move, presents a certain similarity to the human body and sits at the upper end of the layer, while Joy Robot, similar to a doll but with expressive eyes and mouth and movable arms, sits at a lower point of layer n − 1. Fourth, it was not possible to control for differences between these four robots that could interfere with the objectives of the research and the tasks assigned to the students. For example, the evaluations of similarity to humans attributed to InMoov could have been favored by the fact that it was the only robot that was dressed (it was presented wearing a shirt), while the other robots lacked a degree of anthropomorphization that would make clothing meaningful. Cultural cues, like clothes, are often used alongside biological cues to increase anthropomorphization; sometimes dressing cues are part of a robot's physical body, as with the robots Doro (which has a bonnet) and Coro (which has a tie), while at other times robots are dressed with actual clothing items. But while it makes sense to dress an anthropomorphized robot, it can be ridiculous to dress robots that are only vaguely similar to humans; for this reason, only InMoov was dressed. Fifth, it was not possible to randomize the order in which we presented the robots, because the participants had to be present in the classroom at the same time. Sixth, we cannot guarantee that all of the students experienced exactly the same interactions with the robots during the half hour in which they approached and interacted with each one. This is a limitation because any difference among interactions could affect the participants' perception of the robots, whereas each interaction should be comparable with the others.
Finally, the participants were unfortunately not very familiar with actual robots, and this has affected their evaluations. For future studies, this research recommends, at a minimum, involving a larger sample of participants and adding robots with a higher degree of human likeness, such as androids/gynoids.