1 Introduction

The current increase in life expectancy has heightened the health challenge of assuring elders a proper quality of life and wellbeing while they live independently at home. Ageing reduces an individual’s autonomy and independence, triggering difficulties in mobility and/or self-care activities, discontinuing interpersonal interactions and acquaintances, and arousing social isolation and feelings of loneliness and neglect (Umberson and Montez 2010; Puvill et al. 2016). Both the physical and the psychosocial distresses require treatments that social and health care institutions are increasingly unable to provide, given the high number of requests and the costs associated with delivering such care. In an attempt to lighten the increased demands on providers of formal clinical social services and/or informal services in elderly home settings, research in ambient assisted living (AAL) has proposed socially assistive robots (SARs) devoted to supporting older adults in daily activities, promoting healthier behaviours, encouraging greater social participation, extending independent living, and ensuring active and healthy ageing (AHA) while they continue to live in their own homes (Beer et al. 2012; Bishop et al. 2019; Cardinaux et al. 2011; Esposito and Jain 2016a, b; Esposito et al. 2014). SARs should act as companions, navigation and communication tools, and coaches, promoting elders’ wellbeing while minimising the risks deriving from physical and cognitive shortcomings due to ageing. SARs’ design should be suitable for residential environments, such as homes and nursing homes, and functional in empowering older users’ physical and social skills, thereby enhancing their quality of life. Introducing robots into domestic spheres is, nevertheless, not an easy challenge: acceptance by their users is a fundamental step towards a successful implementation. 
However, users’ acceptance is a complex variable that accounts for multiple theorized determinants and considerations that users apply concurrently when asked to interact with robots. Among these determinants are: the robot’s perceived usefulness (its ability to adapt to environments and to the user’s challenges), the robot’s ease of use, the social influence and cognitive instrumental processes elicited by robots, hedonic motivations (pleasure in using the robot), trade-offs between the perceived benefits and the expenses related to the robot’s use, and the degree of cognitive load required to overcome the strain of engaging in machine-driven interactional behaviours. These factors have been incorporated into several technology acceptance models proposed over the years, as technology changed rapidly together with users’ needs and expectations (see Troncone et al. (2020) for a current review; Alaiad and Zhou (2014); Heerink et al. (2010) for the Almere model; Venkatesh et al. (2003) for the unified theory of acceptance and use of technology (UTAUT, UTAUT2); Davis (1989) for the first technology acceptance models (TAM, TAM2)). Over time, however, these models have proved inadequate for designing autonomous systems that are fully accepted by their users, since they neglect additional factors that play a role in users’ acceptance. Among these factors it is worth mentioning age (young people are more accepting of robots than older adults), gender (men are more open than women to technological changes), education (highly educated people are more willing to interact than less educated people), and the interactions of these factors with technology experience, users’ contextual needs, context of use, and, most importantly, robots’ appearance (Matarić 2017; Nomura 2017; Latikka et al. 2019). 
Designing robots with human features (humanoid robots) or with a strong resemblance to human beings (android robots) reflects some of these factors, grouped under the concept of robots’ appearance. It has been suggested that endowing social robots with human characteristics adversely affects users’ perception of them (the uncanny valley response, Mori (1970)), eliciting negative acceptance responses (Eyssel and Hegel 2012). However, this effect has been questioned by several researchers, who have provided different explanations for these negative user reactions. For example, a controlled experimental study on the effects of varying levels of human likeness on the acceptance of robots’ digital faces (Burleigh et al. 2013), in which the faces were carefully manipulated by modifying facial features and merging human and non-human semblances at different levels, showed that the uncanny valley effects were caused by the stimulus’ category membership (human vs non-human) rather than by its human likeness. Wang et al. (2015) attributed this effect to a “dehumanization process” triggered by the inadequacy of the humanlike characteristics (such as slow movements and/or a poor lexicon) embedded into the robot replica of a human being. Złotowski et al. (2015) showed that repeated interactions with a robot and a friendly robot attitude reduce feelings of eeriness independently of the robot’s appearance. MacDorman and Chattopadhyay (2016) showed that eeriness was aroused by a lack of consistency in human characteristics rather than by robots being inanimate or animate, or likely nonhuman or human. Esposito et al. (2020a) conducted a pilot experiment in which video clips of five manufactured robots (Roomba, Nao, Pepper, Ishiguro, and Erica) were shown to 100 seniors (50 females) aged 65+ years. 
After watching each video, seniors were administered a short questionnaire assessing their willingness to interact with the robots shown, the feelings the robots aroused, and the occupations seniors would entrust to them. The results appeared to support the uncanny valley effect, showing that elders significantly preferred humanoid to android robots. The authors concluded that, as long as android robots fail to appropriately render typical features of human interaction, such as natural voices, free body movements, and natural facial expressions, they will always produce feelings of eeriness. These conclusions were also supported by another set of experiments assessing seniors’ preferences toward humanoid virtual agents. In this context, the significant differences observed in the acceptance of male and female agents, in favor of female ones, completely disappeared when the same agents were muted and adopted a neutral facial expression (Esposito et al. 2019a). However, another study, involving female humanoid and android robots and a different group of seniors, showed that seniors significantly preferred android to humanoid female robots (Esposito et al. 2019b). The inconsistent results reported by Esposito et al. (2020a, 2019b) may derive from the different methodological approaches adopted in the two studies. In Esposito et al. (2020b) results were assessed through a questionnaire that, although valid, was relatively short and neglected some of the determinants involved in the acceptance of robots. In Esposito et al. (2019b) the questionnaire attempted to embrace theoretical concepts derived from the most current versions of the technology acceptance model (TAM2) and of the unified theory of acceptance and use of technology (UTAUT2).

This work aims to shed light on these inconsistent results and to add the following new research questions. First, differences, if any, in the acceptance and perception of male and female robots among differently aged groups of users (in particular, seniors, young adults, and adolescents) will be investigated. To our knowledge, this aspect has usually been neglected by the literature, which reports experiments involving mostly elders or young and adult users (Bedaf et al. 2019; Beuscher et al. 2017; Vandemeulebroucke et al. 2018; Latikka et al. 2019). The second research question assesses differences, if any, among adolescents, young adults, and seniors in the age they attribute to robots and in the robot age they prefer. The last research question assesses differences, if any, among a set of potential occupations (healthcare, housework, protection and security, and front office) that adolescents, young adults, and seniors would entrust to male and female humanoid and android robots. These investigations will also account for the effects of participants’ and robots’ gender on robot acceptance.

2 Material and method

Two experiments were planned, each involving a group of adolescents, a group of young adults, and a group of elders. In each experiment, the stimuli were three video clips showing one humanoid and two android robots, the latter two disclosing a high degree of human-likeness and Caucasian and Asian traits, respectively. The two experiments assessed the degree of acceptance, the age preferences, and the occupations that the three differently aged groups of participants entrusted to three selected female robots (Pepper, Erica, and Sophia) and three selected male robots (Romeo, Yuri, and Albert), respectively. Acceptance was assessed in terms of participants’ willingness to interact with the robots and their scores on the pragmatic, hedonic, and attractiveness dimensions.

2.1 Participants

A total of 240 participants, split into six groups, were involved. Groups 1 and 4 were each composed of 45 adolescents (20 males and 25 females in Group 1, mean age = 14.09, SD = ± .29, and 20 males and 25 females in Group 4, mean age = 14.67, SD = ± .48). Groups 2 and 5 were each composed of 30 young adults (12 males and 18 females in Group 2, mean age = 25.60, SD = ± 2.72, and 15 males and 15 females in Group 5, mean age = 24.70, SD = ± 3.42). Groups 3 and 6 were each composed of 45 seniors (22 males and 23 females in Group 3, mean age = 73.04, SD = ± 7.03, and 19 males and 26 females in Group 6, mean age = 72.44, SD = ± 6.40). Groups 1, 2, and 3 watched video clips showing female robots. Groups 4, 5, and 6 watched video clips showing male robots. Participants were recruited in Campania (south of Italy). They joined the study on a voluntary basis and signed an informed consent form drawn up according to the current Italian and European laws on privacy and data protection (GDPR and Italian D. Lgs. 196/2003). The research was approved, with the protocol number 25/2017, by the ethical committee of the Università degli Studi della Campania “Luigi Vanvitelli”, Department of Psychology. The datasets generated and/or analysed during the current study are available from the corresponding author on reasonable request.

2.2 Stimuli

Fig. 1 The humanoid (a) Pepper and the two android robots Erica (b) and Sophia (c)

Fig. 2 The humanoid (a) Romeo and the two android robots Yuri (b) and Albert (c)

A total of six video clips, each lasting between 4 and 7 s and showing three female and three male robots, were created. Each set of three consisted of a humanoid and two android robots, the android ones disclosing Caucasian and Asian physical traits, respectively. The female robots comprised the humanoid Pepper (renamed “Tina” to characterize it as female, Fig. 1a), and the two android robots Erica (Fig. 1b) and Sophia (Fig. 1c). The male robots comprised the humanoid Romeo (Fig. 2a), and the two android robots Geminoid HI-1 (renamed Yuri, Fig. 2b) and Geminoid DK (renamed Albert, Fig. 2c). Sophia was developed by Hanson Robotics (http://www.hansonrobotics.com), and Pepper and Romeo by SoftBank Robotics (http://www.softbankrobotics.com). Erica and Geminoid HI-1, both with Asian traits, were created by Professor Hiroshi Ishiguro. In particular, Geminoid HI-1 is an exact replica of Professor Ishiguro. The robotics firm Kokoro in Tokyo developed Geminoid DK (with a Caucasian appearance) as a replica of Professor Henrik Scharfe from Aalborg University in Denmark. Erica, Yuri, and Albert are currently housed at the Advanced Telecommunications Research Institute (ATR, http://www.geminoid.jp/en/robots.html) in Nara, Japan. The robots’ video clips originated from videos published on the “YouTube” website and show each robot’s upper torso in a frontal position. Each robot, depending on the gender attributed to it, was endowed with an Italian female or male synthetic voice generated through the Natural Reader synthesizer (http://www.naturalreaders.com) and recorded with the free audio software Audacity (http://www.audacityteam.org). The synthetic voices were inserted in the robots’ video clips using the Windows 10 application “Videomomenti”. Each robot produced the Italian sentence “Ciao sono Tina/ Erica/ Sophia/ Romeo/ Yuri/ Albert. Se vuoi posso aiutarti nelle tue attività quotidiane” (Hi, my name is Tina/ Erica/ Sophia/ Romeo/ Yuri/ Albert. If you allow, I can assist you in your daily activities).

2.3 Tools

Participants’ preferences toward the proposed robots were assessed by administering a digitalized version of the Robots’ Acceptance Questionnaire (RAQ). The RAQ is the robotic version of the Virtual Agent Acceptance Questionnaire (VAAQ) developed within the H2020 EU funded project Empathic (http://www.empathic-project.eu/) by Esposito et al. (2018). A Java application was used to develop this digitalized version so that the questionnaire’s items and sections could be randomized automatically. The questionnaire was structured into 8 sections, plus an initial six-item part devoted to collecting participants’ socio-demographic information.

Section 1 consisted of seven items aimed at investigating participants’ degree of experience and familiarity with technological devices such as smartphones, tablets, and laptops.

Section 2 consisted of a single item focused on participants’ willingness to be involved in a long-lasting interaction with each proposed robot.

Sections 3, 4, 5 and 6 (each composed of ten items) collected participants’ assessments of the following robots’ qualities (as introduced by Hassenzahl (2014) in the AttrakDiff questionnaire): (1) Pragmatic Qualities (PQ), describing how effective, useful, practical, clear and controllable the proposed robot is perceived to be; (2) Hedonic Qualities-Identity (HQI), evaluating how original, creative, presentable, and aesthetically pleasing the proposed robot appears; (3) Hedonic Qualities-Feeling (HQF), concerning the positive or negative emotions aroused by the sight of the robot; (4) Attractiveness (ATT), assessing how far the proposed robot encourages increased use and arouses positive emotions. Except for section 1, questionnaire items required 5-point Likert scale responses ranging from 1 to 5 (1 = strongly agree, 2 = agree, 3 = I do not know, 4 = disagree, 5 = strongly disagree). Since the items included both positive and negative statements, scores from negative items were reverse-scored, so that low scores correspond to positive and high scores to negative evaluations of the robots.
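The reverse scoring described above can be sketched as follows. This is a minimal illustration, not the authors’ Java implementation: the function names are hypothetical, and the block assumes that a section score is the sum of its ten item scores. The text does not state the aggregation rule, although the section means reported in the results (roughly 22 to 31) are compatible with a 10 to 50 range.

```python
# Hypothetical sketch of reverse scoring on the 5-point scale used in the RAQ
# (1 = strongly agree ... 5 = strongly disagree).
LIKERT_MIN, LIKERT_MAX = 1, 5


def reverse_score(raw: int) -> int:
    """Mirror a response around the scale midpoint (1 <-> 5, 2 <-> 4, 3 stays 3)."""
    if not LIKERT_MIN <= raw <= LIKERT_MAX:
        raise ValueError(f"response {raw} is outside the 1-5 scale")
    return LIKERT_MIN + LIKERT_MAX - raw


def section_score(responses, negative_items):
    """Sum item scores after reversing negatively worded items.

    ASSUMPTION: summing the ten items is our reading, not stated in the text.
    `negative_items` holds the zero-based positions of negative statements.
    """
    return sum(
        reverse_score(r) if i in negative_items else r
        for i, r in enumerate(responses)
    )


# A participant who strongly agrees (1) with every statement: agreement with
# a negative statement is turned into a high, that is negative, score.
all_agree = [1] * 10
total = section_score(all_agree, negative_items={0, 1})
```

With this convention, agreeing with a positive statement and disagreeing with a negative one both yield low values, which is why lower totals indicate a more favourable evaluation.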

Section 7 consisted of three items investigating the effects of robots’ age on participants’ willingness to interact with them. The first and third items asked participants about the age they preferred for robots and the age they attributed to the robots shown. Scores from 1 to 5 refer to age ranges: 1 = 19–28 years old; 2 = 29–38 years old; 3 = 39–48 years old; 4 = 49–58 years old; 5 = 59+ years old. The second item explicitly asked participants whether a robot’s age would affect their willingness to interact with it. This item required either a positive (yes) or a negative (no) answer.

Section 8 consisted of four items assessing the occupations participants would entrust to robots among healthcare, housework, protection/security, and front office. This section required 5-point Likert scale responses from 1 to 5 (1 = unsuitable, 2 = hardly suitable, 3 = I do not know, 4 = quite suitable, 5 = very suitable); here high scores reflected a positive evaluation of the robot’s suitability for the proposed occupations.

2.4 Procedures

Participants were briefed on the aims of the study and subsequently signed an informed consent form. Then, they were asked to provide socio-demographic information and to complete section 1 of the RAQ. Subsequently, participants were randomly assigned to either the female or the male set of robots’ video clips and, after watching each robot of the set in random order, were asked to fill in the digitalized questionnaire. This procedure was repeated three times for each participant, once for each robot of the assigned set.
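As a rough illustration of the procedure above, the sketch below draws a robot set for one participant and shuffles the order in which its three clips are shown. The function name and the equal-probability draw are assumptions: the text does not say how the assignment was randomized, and the fixed group sizes suggest that some balancing was applied.

```python
import random

FEMALE_SET = ("Pepper", "Erica", "Sophia")
MALE_SET = ("Romeo", "Yuri", "Albert")


def plan_session(rng: random.Random) -> list:
    """Return the viewing order for one participant (hypothetical helper).

    ASSUMPTION: each set is drawn with equal probability; the study's actual
    assignment rule is not described in the text.
    """
    robot_set = rng.choice((FEMALE_SET, MALE_SET))
    order = list(robot_set)
    rng.shuffle(order)  # random presentation order within the assigned set
    return order


# A seeded generator makes the plan reproducible; the questionnaire is then
# filled in once after each of the three clips.
session = plan_session(random.Random(2017))
for robot in session:
    print(f"show clip of {robot}, then administer RAQ sections 2-8")
```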

2.5 Degree of experience and familiarity with technological devices

Before running the experiment, participants were asked to self-evaluate their degree of experience with technological devices as high, low, or none; to define their frequency of use of smartphones, tablets, and laptops as every day, often but not every day, or never; and to rate the ease of use of such devices as very easy, easy, do not know, difficult, or very difficult. The percentages of such answers computed for adolescents, young adults, and older adults are reported in Tables 1, 2, and 3. Table 1 shows that both adolescents and young adults declared a high degree of experience, while 48.89% of older adults reported no experience. Table 2 shows that smartphones were the most frequently used technological devices among participants, even though 56.67% of older adults declared that they had never used them. Tablets were the least frequently used, and laptops were more or less frequently used only by young adults. Older adults rarely used tablets (85.56% had never used this device) and laptops (77.78% had never used a laptop). Smartphones were considered the easiest to use among the proposed technological devices, even though 33.33% of the older adults considered them very difficult to use. Tablets and laptops were considered very difficult to use by 43.33% and 46.67% of older adults, respectively. These data indicate that older adults were less familiar with the proposed technological devices and considered them less easy to use, even though they were users of computers and mobile phones.
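To make the percentages above more concrete, the following sketch converts them into approximate head counts. It assumes that the percentages for older adults are computed over all 90 seniors (Groups 3 and 6, 45 participants each). The text does not state the denominator, so the counts are derived values rather than reported data.

```python
# ASSUMPTION: the denominator is the 90 older adults of Groups 3 and 6.
N_OLDER_ADULTS = 45 + 45


def implied_count(percent: float, n: int = N_OLDER_ADULTS) -> int:
    """Whole number of respondents closest to `percent` of `n`."""
    return round(percent / 100 * n)


reported_percentages = {
    "no experience with technological devices": 48.89,
    "never used a smartphone": 56.67,
    "never used a tablet": 85.56,
    "never used a laptop": 77.78,
    "smartphone very difficult to use": 33.33,
}

for label, pct in reported_percentages.items():
    count = implied_count(pct)
    back = 100 * count / N_OLDER_ADULTS  # percentage recovered from the count
    print(f"{label}: {pct}% -> about {count} of {N_OLDER_ADULTS} ({back:.2f}%)")
```

Each reported percentage maps back onto a whole number of respondents under this assumption, which is consistent with a denominator of 90.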

Table 1 Adolescents, young and older adults’ experience with technological devices
Table 2 Participants’ self-declared frequency of use of smartphones, tablets, and laptops
Table 3 Participants’ self-declared easiness of use of smartphones, tablets, and laptops

3 Results

The design of the proposed experiments involves differently aged populations (age can be considered a between-subjects factor with three levels) assessing the same stimuli (the female and male robots, respectively) on different dimensions (PQ, HQI, HQF, etc., which can be considered within-subjects factors since all subjects are asked to assess them). To evaluate whether the scores attributed to the dimensions differ significantly among the differently aged groups, and whether the dimensions themselves differ significantly, repeated measures ANOVAs (Analysis of Variance) were carried out. A repeated measures ANOVA compares marginal means for each dimension (the within-subjects factors) and for each differently aged population (the between-subjects factor). Results are expressed in terms of F and p values, where F (Fisher’s F statistic) is a ratio of two variances and p is the probability of obtaining an F at least as large if there were no true differences; p is compared against the significance cut-off (generally set to .05). When a set of different variables is tested for statistical significance across various groups, some of them may be falsely considered statistically significant. The purpose of Bonferroni’s post hoc tests (a multiple testing correction) is to keep the overall rate of false positives below the specified cut-off (more details on ANOVA can be found in Judd et al. (2017)). ANOVA was chosen because the experimental set-up involved three differently aged groups for each of the male and female sets of robot stimuli. The age and gender of participants were considered between-subjects factors, and PQ, HQI, HQF, ATT, willingness to interact, age preferences, and occupational suitability scores within-subjects factors. Separate repeated measures ANOVAs were carried out for female (A) and male (B) robots. 
For questionnaire sections 2 (willingness to interact), 3 (PQ), 4 (HQI), 5 (HQF) and 6 (ATT), due to the reverse scoring of negative items, low scores correspond to positive assessments of the robots and high scores to negative ones. Scores from section 7 were analysed separately for the age participants preferred for robots and the age they attributed to them. These scores vary from 1 to 5 and reflect age ranges (1 = 19–28 years old; 2 = 29–38 years old; 3 = 39–48 years old; 4 = 49–58 years old; 5 = 59+ years old). Four separate analyses were carried out to assess differences among the scores obtained by each robot for the occupations participants entrusted to them (healthcare, housework, protection/security, and front office jobs). In this case, scores vary from 1 to 5, and 5 reflects a very positive assessment. The significance level was set at \(\alpha \) = .05 and differences among means were assessed through Bonferroni’s post hoc tests.
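The Bonferroni adjustment mentioned above can be illustrated with a short sketch. It shows the textbook correction (each raw p value multiplied by the number of comparisons and capped at 1), which may differ in detail from what the statistical package used by the authors does. The raw p values below are invented for illustration and are not results from the study.

```python
from itertools import combinations

ALPHA = 0.05  # significance level stated in the text


def bonferroni_adjust(p_values):
    """Multiply each raw p value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Three age groups give three pairwise comparisons.
groups = ("adolescents", "young adults", "seniors")
pairs = list(combinations(groups, 2))

raw_p = [0.004, 0.030, 0.600]  # HYPOTHETICAL raw p values, one per pair
adjusted = bonferroni_adjust(raw_p)

for (a, b), p_raw, p_adj in zip(pairs, raw_p, adjusted):
    verdict = "significant" if p_adj < ALPHA else "not significant"
    print(f"{a} vs {b}: raw p = {p_raw:.3f}, adjusted p = {p_adj:.3f} ({verdict})")
```

The second comparison shows why an omnibus effect can vanish in post hoc testing: a raw p value below .05 may exceed the cut-off once it is adjusted for the number of comparisons.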

Additional ANOVAs were performed to assess the effects of robots’ type (android/humanoid) and ethnicity (Caucasian vs. Asian) on robot acceptance by adolescents and young adults (results concerning elders’ acceptance are reported in Esposito et al. (2020b)). These analyses considered participants’ and robots’ gender as between-subjects factors, and the scores obtained in each RAQ section (willingness to interact, pragmatic, hedonic-identity, hedonic-feeling, and attractiveness) by Albert, Sophia, Yuri, Erica, Romeo, and Pepper as within-subjects factors. The significance level was set at \(\alpha \) = .05 and differences among means were assessed through Bonferroni’s post hoc tests. For these analyses too, due to the reverse scoring of negative items, low scores correspond to positive assessments of the robots and high scores to negative ones.

3.1 Female robots’ assessment

The following sections report the preferences expressed by the three differently aged groups toward the female robots Pepper (Tina), Erica, and Sophia.

3.1.1 Willingness to interact

No significant effects of participants’ gender (F (1,114) = .427, p = .515) were observed for the willingness to interact with the female robots. A significant difference emerged among age groups (F (2,114) = 17.188, p \(<<\) .01). Bonferroni’s post hoc tests revealed that seniors (mean = 3.052) were significantly less willing than adolescents (mean = 2.132, p \(<<\).01) and young adults (mean = 1.801, p \(<<\).01) to interact with the proposed female robots. Figure 3 illustrates this result. In all the following figures, the Y-axis indicates the score obtained by each robot (X-axis) for each age group (series).

Fig. 3 Willingness to interact scores attributed to Sophia, Erica, and Pepper by adolescents, young, and older adults. Curly brackets with a “*” above indicate where differences in willingness to interact are significant among the three differently aged groups

Participants’ willingness to interact differed slightly among the three proposed female robots (F (2,228) = 4.017, p = .019). Bonferroni’s post hoc tests revealed that this difference was due to Pepper (mean = 2.244, p = .042), which was slightly preferred to Sophia (mean = 2.508). A significant interaction was found between age groups and participants’ willingness to interact with female robots (F (4,228) = 8.496, p \(<<\).01). Bonferroni’s post hoc tests revealed that adolescents and young adults were significantly more willing than seniors to interact with Sophia (mean adolescents = 2.235; mean young adults = 1.875; mean seniors = 3.414, p \(<<\).01) and Pepper (mean adolescents = 1.820; mean young adults = 1.653; mean seniors = 3.260, p \(<<\).01). When willingness to interact was considered for each age group it was found that:

  (a) Adolescents were slightly but significantly more in favour of interacting with Pepper (mean = 1.820) than with Sophia (mean = 2.235, p = .046) and Erica (mean = 2.340, p = .010)

  (b) Young adults were equally willing to interact with Pepper (mean = 1.653), Erica (mean = 1.875), or Sophia (mean = 1.875)

  (c) Seniors significantly preferred to interact with Erica (mean = 2.482) rather than Sophia (mean = 3.414, p \(<<\).01) or Pepper (mean = 3.260, p \(<<\).01).

To sum up, results revealed that older adults were less willing than adolescents and young adults to interact with the proposed female robots, and that among the robots their preference went to Erica. Adolescents preferred to interact with Pepper slightly more than with Sophia and Erica, and young adults were equally comfortable interacting with any of the three robots.
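The degrees of freedom attached to the F values in this section follow from the design. The sketch below reproduces them under the assumption of a full factorial mixed model, with age group and participant gender as between-subjects factors and robot as the within-subjects factor, and with no sphericity correction. The function name is hypothetical; the sample size is that of the female-robot experiment (Groups 1, 2, and 3).

```python
def mixed_design_df(n_participants, age_levels=3, gender_levels=2, robot_levels=3):
    """Return (effect df, error df) for the main terms of a mixed ANOVA.

    ASSUMPTION: full factorial between-subjects model (age x gender) with one
    within-subjects factor (robot) and no sphericity correction.
    """
    error_between = n_participants - age_levels * gender_levels
    df_robot = robot_levels - 1
    error_within = df_robot * error_between
    return {
        "gender": (gender_levels - 1, error_between),
        "age": (age_levels - 1, error_between),
        "gender x age": ((gender_levels - 1) * (age_levels - 1), error_between),
        "robot": (df_robot, error_within),
        "age x robot": ((age_levels - 1) * df_robot, error_within),
    }


N_FEMALE_ROBOT_EXPERIMENT = 45 + 30 + 45  # adolescents + young adults + seniors
df = mixed_design_df(N_FEMALE_ROBOT_EXPERIMENT)
for effect, (df_effect, df_error) in df.items():
    print(f"{effect}: F ({df_effect},{df_error})")
```

The values match the pairs reported in the text: (1,114) for gender, (2,114) for age group, (2,228) for robot, and (4,228) for the interaction between age group and robot.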

3.1.2 Pragmatic qualities

Robots’ pragmatic qualities were not affected by participants’ gender (F (1,114) = 1.219, p = .272). Significant differences emerged among age groups (F (2,114) = 7.052, p = .001). Bonferroni’s post hoc tests revealed that adolescents (mean = 25.080, p = .001) considered the proposed female robots significantly more useful than seniors did (mean = 29.243). Pragmatic qualities differed slightly but significantly among the three robots (F (2,228) = 3.343, p = .037). Nevertheless, when Bonferroni’s post hoc tests were performed these differences disappeared because of the adjustment for multiple comparisons, even though PQ means were slightly different among the three robots (Pepper’s mean = 26.362, Erica’s mean = 26.549, Sophia’s mean = 27.685). Additionally, a significant interaction between age groups and pragmatic qualities (F (4,228) = 3.247, p = .013) was observed. Bonferroni’s post hoc tests revealed that adolescents rated the PQ of Sophia (mean adolescents = 25.575; mean seniors = 30.715, p = .002) and Pepper (mean adolescents = 23.590; mean seniors = 29.247, p \(<<\).01) significantly better than seniors did. Inside each age group, significant PQ differences emerged for adolescents between Erica (mean = 26.075) and Pepper (mean = 23.590), in favour of Pepper (p = .009), and for seniors between Erica (mean = 27.766) and Sophia (mean = 30.715), in favour of Erica (p = .005). These results are summarized in Fig. 4.

Fig. 4 PQ scores attributed to Sophia, Erica, and Pepper by adolescents, young adults, and elders. Curly brackets with a “*” above indicate where PQ scores are significantly different among the differently aged groups

To sum up, adolescents considered both Sophia and Pepper significantly more useful than seniors did, young adults considered these two robots equally useful, and all three differently aged groups attributed the same degree of usefulness to Erica.

3.1.3 Hedonic qualities-identity (HQI)

Robots’ hedonic qualities (identity) were not affected by participants’ gender (F (1,114) = .461, p = .498), although significant differences emerged among age groups (F (2,114) = 5.950, p = .003). Bonferroni’s post hoc tests revealed that adolescents (mean = 25.830, p = .003) attributed better (i.e., lower) HQI scores to female robots than seniors did (mean = 29.289), meaning that adolescents considered the proposed female robots more engaging than seniors did.

A significant interaction was observed between participants’ gender and age groups (F (2,114) = 4.293, p = .016). Bonferroni’s post hoc tests showed that these differences were due to female participants: females in the adolescent (mean = 26.093, p = .001) and young adult groups (mean = 25.519, p = .001) attributed better (i.e., lower) HQI scores to female robots than elder female participants did (mean = 31.290). Robots differed significantly in their HQI scores (F (2,228) = 10.517, p \(<<\).01). Bonferroni’s post hoc tests revealed that these differences were due to Erica (mean = 25.668), which received better HQI scores than Pepper (mean = 27.987, p = .001) and Sophia (mean = 28.320, p \(<<\) .01). A significant interaction was found between age groups and the HQI scores attributed to robots (F (4,228) = 7.443, p \(<<\) .01). Bonferroni’s post hoc tests showed that seniors (mean = 31.354) attributed significantly worse HQI scores to Sophia than adolescents (mean = 26.940, p = .008) and young adults (mean = 26.667, p = .013). In addition, adolescents (mean = 24.400) attributed significantly better HQI scores to Pepper than young adults (mean = 28.944, p = .004) and elders (mean = 30.618, p \(<<\).01). Inside each age group, adolescents attributed slightly but significantly better HQI scores to Pepper (mean = 24.400) than to Sophia (mean = 26.940, p = .046). Young adults attributed significantly better HQI scores to Erica (mean = 24.958) than to Pepper (mean = 28.944, p = .003). Seniors attributed significantly better HQI scores to Erica (mean = 25.895) than to Sophia (mean = 31.354, p \(<<\).01) and Pepper (mean = 30.618, p \(<<\).01).

In summary, the HQI scores attributed to female robots differed significantly among participants’ age groups. Seniors and young adults preferred Erica to Sophia and Pepper, whereas for adolescents it was the opposite. Female elders attributed more negative HQI scores to robots than female adolescents and young adults did. These results are illustrated in Fig. 5.

Fig. 5 HQI scores attributed to Sophia, Erica, and Pepper by adolescents, young adults, and elders. Curly brackets with a “*” above indicate where HQI scores are significantly different among the differently aged groups

3.1.4 Hedonic qualities-feeling (HQF)

Robots’ hedonic qualities (feeling) were not affected by participants’ gender (F (1,114) = 2.140, p = .146), although significant differences emerged among age groups (F (2,114) = 5.390, p = .006). Bonferroni’s post hoc tests revealed that adolescents (mean = 24.000) attributed better HQF scores to robots than young adults (mean = 27.815, p = .011) and seniors (mean = 26.943, p = .033). A significant interaction emerged between participants’ gender and age groups (F (2,114) = 4.326, p = .015). Bonferroni’s post hoc tests showed that these differences were due to male adolescents (mean = 22.533, p = .003), who attributed better HQF scores to robots than male young adults did (mean = 29.278).

HQF scores differed slightly but significantly among robots (F (2,228) = 3.217, p = .042). Bonferroni’s post hoc tests showed that these differences were due to Erica (mean = 25.543), which aroused more positive feelings than Sophia (mean = 27.080, p = .027). A significant interaction emerged between age groups and the HQF scores attributed to robots (F (4,228) = 5.241, p \(<<\) .01). Bonferroni’s post hoc tests showed that adolescents (mean = 22.020, p \(<<\) .01) attributed better (i.e., lower) HQF scores to Pepper than young adults (mean = 28.528) and seniors (mean = 27.857). Inside each age group, significant differences were found for adolescents, who attributed better HQF scores to Pepper (mean = 22.020) than to Erica (mean = 24.605, p = .029) and Sophia (mean = 25.375, p = .004), and for seniors, who significantly preferred Erica (mean = 24.760) to Sophia (mean = 28.213, p = .001) and Pepper (mean = 27.857, p = .006).

To sum up, adolescents attributed significantly better HQF scores to robots than young adults and seniors did, and male adolescents considered the proposed robots more able to arouse positive feelings than young male adults did. Among the robots, Erica aroused more positive feelings than Pepper and Sophia, even though Pepper was the adolescents’ favourite. Figure 6 illustrates these results.

Fig. 6 HQF scores attributed to Sophia, Erica, and Pepper by adolescents, young adults, and elders. Curly brackets with a “*” above indicate where HQF scores are significantly different among the differently aged groups

3.1.5 Attractiveness

Robots’ attractiveness was not affected by participants’ gender (F (1,114) = 1.635, p = .204). Significant differences emerged among the age groups (F (2,114) = 6.134, p = .003). Bonferroni’s post hoc tests revealed that these differences arose because adolescents (mean = 24.817) attributed significantly better (i.e., lower) attractiveness scores to robots than seniors did (mean = 26.943, p = .002). Additionally, an interaction between participants’ gender and age groups (F (2,114) = 4.599, p = .012) was found. Bonferroni’s post hoc tests showed that these differences were due to both male and female participants. In detail, male adolescents (mean = 23.700, p = .046) rated the proposed robots as significantly more attractive than young male adults did (mean = 28.500). Female elders (mean = 31.087) rated the robots as significantly less attractive than female young adults (mean = 25.463, p = .003), female adolescents (mean = 25.933, p = .003) and male elders (mean = 26.439, p = .004) did. ATT scores differed significantly among the three robots (F (2,228) = 4.109, p = .018). Bonferroni’s post hoc tests revealed that these differences were due to Erica (mean = 26.179), which was considered more attractive than Sophia (mean = 27.797, p = .013). A significant interaction was found between age groups and ATT scores (F (4,228) = 6.177, p \(<<\) .01). Bonferroni’s post hoc tests showed that, inside the age groups, adolescents scored Pepper (mean = 22.580) as significantly more attractive than Sophia (mean = 26.575, p = .001) and Erica (mean = 25.295, p = .006), while seniors scored Erica (mean = 26.674) as significantly more attractive than Pepper (mean = 29.355, p = .006) and Sophia (mean = 30.261, p \(<<\).01).

In summary, the data showed that adolescents rated the proposed robots as more attractive than elders did and, within this group, the preference was for Pepper. Young adults did not show a particular preference for any of the three proposed robots, attributing similar ATT scores to them. Elders showed a clear preference for Erica. Among the robots, Sophia was considered the least attractive. Regarding participants’ gender, female elders rated the robots’ attractiveness significantly lower than male elders did. Figure 7 summarizes these results.

Fig. 7

ATT scores attributed to Sophia, Erica, and Pepper by adolescents, young adults, and elders. Curly brackets with a “*” above indicate where ATT scores are significantly different among the differently aged groups

3.1.6 Age range

The following analyses assess participants’ attributed and preferred age ranges, as defined in Section 7 of the Robots’ Acceptance Questionnaire (RAQ). As a reminder, scores for items in this section varied from 1 to 5 and reflected age ranges (1 = 19–28 years old; 2 = 29–38 years old; 3 = 39–48 years old; 4 = 49–58 years old; 5 = 59 years old or more).
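Because the analyses below report means of these ordinal codes, it can help to translate a mean back into the nearest age band. The sketch below is illustrative only: the helper name `nearest_age_range` is hypothetical, the example values are not taken from the results, and reading a mean of ordinal codes as an age band is a coarse summary rather than part of the authors’ analysis.

```python
# Age-range codes as defined for Section 7 of the questionnaire
AGE_RANGES = {1: "19-28", 2: "29-38", 3: "39-48", 4: "49-58", 5: "59+"}


def nearest_age_range(mean_score):
    """Return the label of the age-range code closest to a mean score.

    Ties (e.g. 1.5) resolve to the lower code, because min() keeps the
    first minimum it encounters and the codes are stored in ascending order.
    """
    if not 1 <= mean_score <= 5:
        raise ValueError("mean score must lie between 1 and 5")
    code = min(AGE_RANGES, key=lambda c: abs(c - mean_score))
    return AGE_RANGES[code]


# Illustrative values only (not taken from the reported results)
print(nearest_age_range(1.2))
print(nearest_age_range(2.4))
```

A mean of 1.2 maps to the 19–28 band and a mean of 2.4 to the 29–38 band; values near a half-point boundary are better described as falling between two bands.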

Preferred age range Participants’ preferred age range was not affected by participants’ gender (F (1,114) = .3244, p = .623) but was significantly affected by age group (F (2,114) = 9.208, p \(<<\) .01). Bonferroni’s post hoc tests revealed that elders (mean = 2.410) differed significantly from adolescents (mean = 1.862, p = .010) and young adults (mean = 1.574, p \(<<\) .01), preferring older robots. The preferred age range differed significantly among the three female robots (F (2,228) = 5.069, p = .007). Bonferroni’s post hoc tests revealed that the preferred age range for Sophia (mean = 2.113) differed significantly from that for Erica (mean = 1.845, p = .009) and, less markedly, from that for Pepper (mean = 1.888, p = .046). In summary, seniors generally preferred older robots than adolescents and young adults did, and the preferred age range for Sophia was higher than that for Erica and Pepper.

Attributed age range Attributed age range was not affected by participants’ gender (F (1,114) = .514, p = .475) but was significantly affected by age group (F (2,114) = 5.145, p = .007). Bonferroni’s post hoc tests showed that the age range attributed to the robots by elders (mean = 1.845) differed significantly from that attributed by adolescents (mean = 1.578, p = .031) and young adults (mean = 1.519, p = .017). Attributed age range differed significantly among the three female robots (F (2,228) = 44.185, p \(<<\) .01). Bonferroni’s post hoc tests revealed that these differences were due to Sophia’s attributed age range (mean = 2.180, p \(<<\) .01), which was significantly higher than the age ranges attributed to Erica (mean = 1.322) and Pepper (mean = 1.440). A significant interaction was found between age groups and attributed age range (F (4,228) = 3.092, p = .017). Bonferroni’s post hoc tests revealed that these differences were due to the age range attributed to Pepper by seniors (mean = 1.846), which differed significantly from that attributed by adolescents (mean = 1.140, p = .001) and young adults (mean = 1.333, p = .043). Even though Bonferroni’s post hoc tests revealed no significant differences among age groups in the age range attributed to Sophia and Erica, it is worth reporting their mean scores: for Sophia, adolescents = 2.175, young adults = 2.056, and elders = 2.310; for Erica, adolescents = 1.420, young adults = 1.167, and elders = 1.379. These values suggest that, to a certain extent, Erica was judged younger than Sophia. Table 4 reports the attributed and preferred age range mean scores for female robots, showing clear differences between the attributed and preferred age for adolescents, young adults, and seniors. All participants seem to prefer female robots older than they appear to be.
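The closing observation can be illustrated by subtracting, for each age group, the mean attributed age range from the mean preferred age range reported above for female robots. This is a purely descriptive sketch, not a significance test and not the authors’ procedure; the function name `preference_gap` is hypothetical, and a positive gap simply means that the preferred age code exceeds the attributed one.

```python
# Group means reported in the text for female robots (age-range codes 1-5)
preferred = {"adolescents": 1.862, "young adults": 1.574, "seniors": 2.410}
attributed = {"adolescents": 1.578, "young adults": 1.519, "seniors": 1.845}


def preference_gap(preferred_means, attributed_means):
    """Preferred minus attributed mean age code, per group.

    A positive value means the group would like the robots to look older
    than they currently appear; a negative value means younger.
    """
    return {
        group: round(preferred_means[group] - attributed_means[group], 3)
        for group in preferred_means
    }


gaps = preference_gap(preferred, attributed)
for group, gap in gaps.items():
    print(f"{group}: {gap:+.3f}")
```

All three gaps are positive, although the gap for young adults is very small compared with that for seniors, so the descriptive pattern is much stronger in the oldest group.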

Table 4 Female robots preferred and attributed age range mean scores
Table 5 Occupations (suitability scores) more entrusted to female robots by adolescents, young adults, and seniors

3.1.7 Robots’ entrusted occupations

This section summarizes results concerning occupations entrusted to the three proposed robots by the differently aged groups among healthcare, housework, protection/security, and front office jobs. High and low scores indicate respectively low and high suitability participants attributed to robots in accomplishing the proposed occupations.

Healthcare No effect of participants’ gender (F (1,114) = .854, p = .357) was found, whereas a significant difference among age groups (F (2,114) = 7.001, p = .001) was observed. Bonferroni’s post hoc tests revealed that adolescents (mean = 3.563) considered female robots significantly more suitable for healthcare occupations than seniors did (mean = 2.782, p = .001). No significant differences were observed among the robots in their suitability for healthcare occupations (F (2,228) = .264, p = .768). A significant interaction was found between age groups and robots’ suitability to provide healthcare (F (4,228) = 10.568, p \(<<\) .01). Bonferroni’s post hoc tests showed that these differences were due to the differing opinions of the three age groups on the suitability of Sophia and Pepper to provide healthcare assistance. Adolescents rated Pepper (mean = 4.160) as significantly more suitable to provide healthcare assistance than Sophia (mean = 3.385, p = .001) and Erica (mean = 3.145, p \(<<\) .01). For young adults, Sophia (mean = 3.458) was as suitable as Erica (mean = 3.444) but significantly more appropriate than Pepper (mean = 2.764, p = .022). Seniors considered Erica (mean = 3.120) more suitable than Pepper (mean = 2.644) and significantly more suitable than Sophia (mean = 2.582, p = .011) for healthcare services.

In summary, adolescents considered the robots more suitable for healthcare occupations than seniors did and, in particular, considered Pepper more suitable than Sophia and Erica. Young adults considered Pepper less suitable than Erica and Sophia, while seniors considered Sophia less suitable than Erica and Pepper for healthcare services.

Housework No effect of participants’ gender (F (1,114) = 2.785, p = .098) was observed, while a significant difference among age groups (F (2,114) = 10.342, p \(<<\) .01) emerged. Bonferroni’s post hoc tests revealed that adolescents (mean = 3.833, p \(<<\) .01) and young adults (mean = 3.653, p = .008) considered female robots significantly more suitable for housework occupations than seniors did (mean = 2.941). Suitability for housework occupations was rated significantly differently among the three robots (F (2,228) = 7.657, p = .001). Bonferroni’s post hoc tests revealed that Sophia (mean = 3.227) was considered significantly less suitable for housework than Erica (mean = 3.528, p = .020) and Pepper (mean = 3.672, p = .001). In summary, adolescents and young adults would entrust more housework to the proposed robots than elders would. Interestingly, among the robots, Sophia was considered the least suitable for performing housework tasks.

Protection and security tasks No significant differences were observed for participants’ gender (F (1,114) = 1.311, p = .255) or age groups (F (2,114) = .910, p = .405). A significant effect (F (2,228) = 10.407, p \(<<\) .01) was found concerning the robots’ suitability to perform protection and security tasks. Bonferroni’s post hoc tests revealed that Pepper (mean = 3.076) was considered significantly more qualified for protection and security tasks than Erica (mean = 2.624, p \(<<\) .01) and Sophia (mean = 2.679, p = .001).

Front office No effect of participants’ gender (F (1,114) = .184, p = .668) was found, whereas a significant difference among age groups (F (2,114) = 4.933, p = .009) emerged. Bonferroni’s post hoc tests revealed that young adults (mean = 3.259, p = .014) considered female robots significantly more suitable for front office tasks than seniors did (mean = 2.620). Significant differences (F (2,228) = 5.699, p = .004) emerged concerning robots’ suitability for front office occupations. Bonferroni’s post hoc tests revealed that Erica (mean = 3.216) was considered significantly more suitable than Sophia (mean = 2.930, p = .038) and Pepper (mean = 2.818, p = .005). A significant interaction emerged between participants’ gender and robots’ suitability to perform front office tasks (F (2,228) = 4.251, p = .015). Bonferroni’s post hoc tests showed that male participants considered Erica (mean = 3.424) more qualified than Sophia (mean = 2.783, p = .001) and Pepper (mean = 2.869, p = .010) for this job. A significant interaction emerged between age groups and their opinions on the robots’ suitability to perform front office tasks (F (4,228) = 5.482, p \(<<\) .01). Bonferroni’s post hoc tests showed that, within the groups, young adults considered Pepper (mean = 2.708) significantly less suitable than Sophia (mean = 3.389, p = .026) and Erica (mean = 3.681, p \(<<\) .01), and elders considered Sophia (mean = 2.315) significantly less suitable than Erica (mean = 2.988, p = .001) in performing front office tasks.

In summary, young adults considered the proposed robots more suitable for front office tasks than seniors did. Erica was considered significantly more suitable than Sophia and Pepper by seniors, while young adults considered Pepper the least suitable. Male participants considered Erica more suitable than Sophia and Pepper, while female participants considered the three proposed robots equally suitable for front office tasks. Table 5 summarizes the suitability scores attributed to the female robots for the proposed occupations by adolescents, young adults, and seniors.

3.2 Male robots’ assessment

3.2.1 Willingness to interact

No significant effect of participants’ gender (F (1,114) = 1.843, p = .177) was observed on the willingness to interact with male robots. A significant difference emerged among age groups (F (2,114) = 7.346, p = .001). Bonferroni’s post hoc tests revealed that adolescents (mean = 2.122, p = .023) and young adults (mean = 1.889, p = .001) were significantly more willing than seniors (mean = 2.590) to interact with the proposed male robots. No significant differences emerged among the three proposed robots in terms of participants’ willingness to interact with them (F (2,228) = 1.400, p = .249). A slightly significant interaction emerged between participants’ gender and willingness to interact with male robots (F (2,228) = 3.096, p = .047). Bonferroni’s post hoc tests revealed that this difference was due to female participants, who were more willing to interact with Romeo (mean = 2.075) than with Yuri (mean = 2.491, p = .012). These results are illustrated in Fig. 8.

Fig. 8

Willingness to interact scores attributed to Albert, Yuri, and Romeo by adolescents, young adults, and elders

To sum up, adolescents and young adults showed a greater willingness to interact with male robots than seniors did, and female participants were more willing to interact with Romeo than with Yuri.

3.2.2 Pragmatic qualities

Robots’ attributed pragmatic qualities were not affected by participants’ gender (F (1,114) = 3.578, p = .061). A significant difference emerged among age groups (F (2,114) = 7.843, p = .001). Bonferroni’s post hoc tests revealed that adolescents (mean = 24.655, p \(<<\) .01) attributed higher PQ scores to the proposed robots than seniors did (mean = 28.887). No significant differences emerged among the PQ scores attributed to the three proposed male robots (F (2,228) = .328, p = .721), even though adolescents appreciated them more than seniors did. Figure 9 illustrates these results. Generally, adolescents considered the male robots significantly more useful than seniors did.

Fig. 9

PQ scores attributed to Albert, Yuri, and Romeo by adolescents, young adults, and elders

3.2.3 Hedonic qualities-identity (HQI)

Robots’ hedonic qualities (identity) were not affected by participants’ gender (F (1,114) = 2.474, p = .118), whereas a significant difference emerged among age groups (F (2,114) = 4.985, p = .008). Bonferroni’s post hoc tests revealed that adolescents (mean = 25.913, p = .018) and young adults (mean = 25.844, p = .034) attributed higher HQI scores to the robots than seniors did (mean = 28.864). However, no significant differences emerged among the robots in the HQI scores attributed by participants (F (2,228) = 2.540, p = .081). A significant interaction emerged between age groups and HQI scores (F (4,228) = 2.630, p = .035). Bonferroni’s post hoc tests revealed that elders (mean = 30.751) attributed significantly lower HQI scores to Romeo than adolescents (mean = 26.860, p = .010) and young adults (mean = 24.633, p \(<<\) .01) did. In summary, adolescents and young adults considered the male robots significantly more engaging than seniors did. These results are depicted in Fig. 10.

Fig. 10

HQI scores attributed to Albert, Yuri, and Romeo by adolescents, young adults, and elders. Curly brackets with a “*” above indicate where HQI scores are significantly different among the differently aged groups

3.2.4 Hedonic qualities-feeling (HQF)

No significant differences emerged for either participants’ gender (F (1,114) = 3.396, p = .068) or age groups (F (2,114) = 2.248, p = .110). HQF scores (F (2,228) = .990, p = .373) were not significantly different among the three male robots, though the means may suggest a slight preference for Romeo and Albert over Yuri (Romeo’s mean = 24.875, Albert’s mean = 24.902, Yuri’s mean = 25.532). Figure 11 illustrates these results.

Fig. 11

HQF scores attributed to Albert, Yuri, and Romeo by adolescents, young adults, and elders

3.2.5 Attractiveness

Robots’ attractiveness was not affected by participants’ gender (F (1,114) = 3.652, p = .059), while a significant difference emerged among age groups (F (2,114) = 4.874, p = .009). Bonferroni’s post hoc tests revealed that adolescents (mean = 23.693) attributed higher ATT scores to the robots than seniors did (mean = 27.036, p = .008). ATT scores did not differ significantly among the three robots (F (2,228) = .400, p = .671), even though the means suggest that Romeo was considered slightly more attractive than Yuri and Albert (Romeo’s mean = 24.901, Yuri’s mean = 25.307, Albert’s mean = 25.355). Figure 12 illustrates these results.

Fig. 12

ATT scores attributed to Albert, Yuri, and Romeo by adolescents, young adults, and elders

3.2.6 Age range

The following analyses are based on scores varying from 1 to 5 that reflect the age ranges participants preferred for and attributed to the robots (1 = 19–28 years old; 2 = 29–38 years old; 3 = 39–48 years old; 4 = 49–58 years old; 5 = 59 years old or more).

Preferred age range Participants’ preferred age range was not affected by participants’ gender (F (1,114) = .000, p = .988), while a significant difference emerged among age groups (F (2,114) = 12.338, p \(<<\) .01). Bonferroni’s post hoc tests showed that adolescents (mean = 1.303) differed significantly from elders (mean = 2.187, p \(<<\) .01) in their preferred age range. Interestingly, the age range preferred by young adults (mean = 1.733) fell between those expressed by adolescents and seniors. Age range preference (F (2,228) = 18.643, p \(<<\) .01) differed significantly among the three male robots. Bonferroni’s post hoc tests revealed that these differences were due to Romeo’s preferred age range (mean = 1.585, p \(<<\) .01), which differed significantly from those of Albert (mean = 1.804) and Yuri (mean = 1.834). A significant interaction emerged between age groups and preferred age range (F (4,228) = 5.926, p \(<<\) .01). Bonferroni’s post hoc tests showed that the age range adolescents preferred for Albert (mean = 1.340) differed significantly from that preferred by young adults (mean = 1.867, p = .044) and elders (mean = 2.206, p \(<<\) .01). Yuri’s preferred age range differed significantly between adolescents (mean = 1.435) and elders (mean = 2.168, p = .001). The age range elders preferred for Romeo (mean = 2.187) differed significantly from that preferred by adolescents (mean = 1.135, p \(<<\) .01) and young adults (mean = 1.433, p = .001).

To sum up, adolescents favoured a robot close to their own age (age range from 19 to 28 years old), regardless of whether the robot was Albert, Yuri, or Romeo. Young adults fell between the first and second age ranges for Albert and Yuri but preferred a younger Romeo, while seniors expressed a clear preference for all robots to be aged between 29 and 38 years old.

Attributed age range Age range attribution was affected neither by participants’ gender (F (1,114) = .058, p = .811) nor by age group (F (2,114) = .726, p = .486). Attributed age ranges differed significantly (F (2,228) = 73.698, p \(<<\) .01) among the three male robots. Bonferroni’s post hoc tests revealed that these differences were due to Romeo’s attributed age range (mean = 1.472, p \(<<\) .01): Romeo was considered significantly younger than Yuri (mean = 2.512) and Albert (mean = 2.702). A significant interaction was observed between age groups and robots’ attributed age ranges (F (4,228) = 3.853, p = .005). Bonferroni’s post hoc tests revealed that Yuri’s attributed age range was significantly different between young adults (mean = 2.867) and seniors (mean = 2.278, p = .043). Table 6 reports the attributed and preferred age range mean scores for male robots, showing clear differences between the attributed and preferred age for adolescents, young adults, and seniors. All participants seem to prefer male robots younger than they appear to be, except seniors for Romeo.

Table 6 Male robots preferred and attributed age range mean scores
Table 7 Occupations (suitability scores) more entrusted to male robots by adolescents, young adults, and seniors

3.2.7 Robots’ entrusted occupations

This section reports the scores attributed by adolescents, young adults, and seniors to the occupations entrusted to robots, among healthcare, housework, protection/security, and front office. High and low scores indicate respectively low and high suitability attributed to robots in accomplishing the proposed occupations.

Healthcare No age group effect (F (2,114) = 1.876, p = .158) emerged, whereas a significant effect of participants’ gender (F (1,114) = 4.802, p = .030) was found on robots’ suitability to perform healthcare tasks. Bonferroni’s post hoc tests revealed that male participants (mean = 3.300) considered male robots significantly more suitable for healthcare tasks than female participants did (mean = 2.939, p = .030). No significant differences were observed among the three robots concerning their suitability to perform healthcare tasks (F (2,228) = 2.302, p = .102). A significant interaction emerged between age groups and robots’ suitability for healthcare tasks (F (4,228) = 5.074, p = .001). Bonferroni’s post hoc tests showed that these differences were due to seniors (mean = 3.333, p = .022), who considered Yuri more suitable than young adults did (mean = 2.633), and to young adults (mean = 3.567, p = .008), who considered Romeo more suitable than adolescents did (mean = 2.755). To sum up, male participants considered the three proposed male robots significantly more suitable for healthcare tasks than female participants did; seniors considered Yuri more suitable than young adults did, while young adults considered Romeo more suitable than adolescents did.

Housework No significant effects emerged for either participants’ gender (F (1,114) = 2.094, p = .150) or age groups (F (2,114) = 1.611, p = .204) on robots’ suitability to perform housework occupations. The three proposed robots, however, scored significantly differently (F (2,228) = 6.602, p = .002) on their suitability for housework tasks. Bonferroni’s post hoc tests revealed that these differences were due to Romeo (mean = 3.491), who was considered significantly more suitable than Albert (mean = 3.129, p = .017) and Yuri (mean = 3.109, p = .006). In addition, there was a significant interaction between age groups and robots’ suitability for housework tasks (F (4,228) = 6.066, p \(<<\) .01). Bonferroni’s post hoc tests showed that these differences were due to Albert and Yuri’s assessments. Seniors (mean = 3.597, p = .042) considered Albert more suitable than adolescents did (mean = 2.890), and seniors (mean = 3.475, p = .027) considered Romeo more suitable than young adults did (mean = 2.667) in accomplishing housework tasks. In summary, Romeo was considered significantly more suitable for housework tasks than Albert and Yuri.

Protection and security tasks No significant effects emerged for either participants’ gender (F (1,114) = 2.081, p = .152) or age groups (F (2,114) = .636, p = .531), and no significant differences emerged among the three proposed robots (F (2,228) = 1.174, p = .311) concerning their suitability for protection and security tasks. A slightly significant interaction emerged between age groups and robots’ suitability to perform protection and security tasks (F (4,228) = 2.451, p = .047). Nevertheless, when Bonferroni’s post hoc tests were performed, these differences disappeared owing to the Bonferroni adjustment for multiple comparisons. To sum up, no significant differences emerged among age groups, genders, or robots concerning robots’ adequacy in performing protection and security occupations.
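The way a borderline omnibus effect can vanish in the pairwise follow-up is easy to see from the Bonferroni rule itself, which multiplies each unadjusted p value by the number of comparisons (capped at 1). The sketch below uses hypothetical p values chosen only for illustration; they are not taken from the study, and the function names are not from any statistical package.

```python
def bonferroni_adjust(p_values):
    """Bonferroni-adjusted p values: each p times the number of comparisons,
    capped at 1.0."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def significant_after_bonferroni(p_values, alpha=0.05):
    """True for each comparison whose adjusted p value stays below alpha."""
    return [p_adj < alpha for p_adj in bonferroni_adjust(p_values)]


# Hypothetical unadjusted p values for three pairwise comparisons
# (illustrative only, NOT values from the study)
raw = [0.03, 0.20, 0.60]
adjusted = bonferroni_adjust(raw)
flags = significant_after_bonferroni(raw)

for p, p_adj, flag in zip(raw, adjusted, flags):
    print(f"raw p = {p:.2f}, adjusted p = {p_adj:.2f}, significant: {flag}")
```

In this illustration the first comparison would pass an unadjusted .05 threshold, but its adjusted value of about .09 does not, which mirrors the situation described above.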

Front office No effect of participants’ gender (F (1,114) = 1.613, p = .207) emerged, whereas significant differences among age groups (F (2,114) = 5.876, p = .004) were observed concerning robots’ suitability to perform front office tasks. Bonferroni’s post hoc tests revealed that young adults (mean = 3.522, p = .004) considered the robots significantly more qualified for front office tasks than seniors did (mean = 2.782). Significant differences (F (2,228) = 4.410, p = .013) emerged among robots concerning their suitability for front office occupations. Bonferroni’s post hoc tests revealed that Albert (mean = 3.401) was considered more suitable than Romeo (mean = 3.044, p = .018) in accomplishing front office jobs. To sum up, young adults considered male robots significantly more qualified for front office jobs than seniors did. Albert was considered significantly more suitable than Romeo. Table 7 summarizes the male robots’ suitability for the proposed occupations as expressed by adolescents, young adults, and seniors.

3.3 Adolescents, young adults, and seniors’ preferences toward robots’ type (android vs humanoid), ethnicity (Caucasian vs Asian), and gender (male vs female)

The following subsections report the preferences expressed by adolescents and young adults toward Caucasian and Asian android robots and humanoid robots, and toward female and male robots.

3.3.1 Adolescents

Willingness to interact When considering adolescents’ preferences toward the proposed robots, no significant effects of participants’ gender (F (1, 86) = .764, p = .385) or robots’ gender (F (1, 86) = .005, p = .942), and no interaction between robots’ and participants’ gender (F (1, 86) = .100, p = .753), emerged. Participants’ willingness to interact with the robots was slightly significantly affected by robots’ type (F (2,172) = 3.670, p = .027). Bonferroni’s post hoc tests showed that adolescents’ willingness to interact was slightly but significantly more in favour of the humanoid robots, Romeo and Pepper (mean = 1.950, p = .038), than of the Asian android robots, Yuri and Erica (mean = 2.285), while no significant differences emerged for the Caucasian android robots, Albert and Sophia (mean = 2.145).

Pragmatic qualities No significant effects of participants’ gender (F (1, 86) = 2.144, p = .147) or robots’ gender (F (1, 86) = .276, p = .601) and no significant interaction between robots’ and participants’ gender (F (1, 86) = .047, p = .829) emerged. Participants’ evaluation of robots’ pragmatic qualities was slightly significantly affected by robots’ type (F (2,172) = 3.917, p = .022). Bonferroni’s post hoc tests showed that adolescents considered the Asian android robots, Yuri and Erica (mean = 25.525), slightly less useful than the humanoid robots, Romeo and Pepper (mean = 23.910, p = .014), while no differences were observed concerning the Caucasian android robots, Albert and Sophia (mean = 25.168).

Hedonic qualities (identity) No significant effects of participants’ gender (F (1, 86) = .809, p = .371) or robots’ gender (F (1, 86) = .009, p = .923) and no significant interaction between robots’ and participants’ gender (F (1, 86) = .084, p = .773) emerged. Adolescents considered the proposed robots similarly original, creative, presentable, and aesthetically pleasing, since no significant differences emerged among the HQI scores attributed to robots (F (2,172) = .242, p = .785). A significant interaction emerged between robots’ gender and adolescents’ HQI scores (F (2,172) = 6.247, p = .002). Bonferroni’s post hoc tests revealed that adolescents preferred the humanoid robot Pepper (mean = 24.400) significantly more than its male counterpart Romeo (mean = 26.860, p = .010).

Hedonic qualities (feeling) A slightly significant effect of participants’ gender was observed on the HQF scores attributed to robots (F (1, 86) = 6.145, p = .015): male participants (mean = 22.667) experienced more positive emotions toward the proposed robots than female participants did (mean = 24.980, p = .015). No significant effect of robots’ gender (F (1, 86) = .143, p = .706) and no significant interaction between robots’ gender and participants’ gender (F (1, 86) = .441, p = .508) emerged. HQF scores differed significantly among the proposed robots (F (2,172) = 7.403, p = .001). Bonferroni’s post hoc tests showed that humanoid robots (mean = 22.445) aroused significantly more positive emotions than Caucasian (mean = 24.708, p = .002) and Asian (mean = 24.318, p = .005) android robots.

Attractiveness Concerning robots’ attractiveness, a slightly significant effect of participants’ gender (F (1, 86) = 4.908, p = .029) was observed. Male participants (mean = 23.283) considered the robots more attractive than female participants did (mean = 25.227, p = .029). No significant effect of robots’ gender (F (1, 86) = 1.640, p = .204) and no significant interaction between robots’ and participants’ gender (F (1, 86) = .109, p = .742) emerged. Adolescents’ assessment of robots’ attractiveness differed significantly among robot types (F (2,172) = 10.251, p \(<<\) .01). Pairwise comparisons showed that humanoid robots (mean = 22.805) were considered significantly more attractive than Caucasian (mean = 25.615, p \(<<\) .01) and Asian (mean = 24.345, p = .008) android robots.

To sum up, adolescents’ HQF and ATT scores were significantly more in favour of the humanoid robots than of the Asian and Caucasian android robots. Willingness to interact, PQ, and HQI scores were also slightly but significantly more positive for humanoid than for android robots. In addition, adolescents preferred the female to the male humanoid robot. In general, male adolescents judged the humanoid robots more positively than female adolescents did. The Asian android robots were the least favoured among the proposed robots. In conclusion, the humanoid robots (Romeo and Pepper) were rated as more attractive and better able to arouse positive feelings than the Asian (Yuri and Erica) and Caucasian (Albert and Sophia) android robots. These results are depicted in Figs. 13 and 14.

Fig. 13

Adolescents’ evaluations of android and humanoid robots. Curly brackets with a “*” above indicate where PQ, HQI, HQF, ATT, and willingness to interact scores differ significantly among android (Caucasian and Asian) and humanoid robots

Fig. 14

Adolescents’ evaluations of female and male android and humanoid robots. Curly brackets with a “*” above indicate where PQ, HQI, HQF, ATT, and willingness to interact scores differ significantly among android (Caucasian and Asian) and humanoid robots

3.3.2 Young adults

Willingness to interact No significant effects of participants’ gender (F (1,56) = .812, p = .371) or robots’ gender (F (1,56) = .245, p = .623) and no interaction between robots’ and participants’ gender (F (1,56) = 2.727, p = .104) emerged. Young adults’ willingness to interact with the proposed robots was not significantly affected by robots’ type (F (2,112) = 2.340, p = .101). A significant interaction among participants’ gender, robots’ gender, and participants’ willingness to interact with the proposed robots (F (2,112) = 4.005, p = .021) emerged. Bonferroni’s post hoc tests revealed that this three-way interaction was due to female young adults (mean = 1.500) showing a slightly but significantly greater willingness to interact with the robot Sophia than male young adults (mean = 2.250, p = .028). In addition, female young adults were significantly more willing to interact with Romeo (mean = 1.533) than with Yuri (mean = 2.267, p = .005).

Pragmatic qualities No significant effects of participants’ (F (1,56) = .132, p = .717) and robots’ gender (F (1,56) = .108, p = .744), and no significant interaction between robots’ and participants’ gender (F (1,56) = .937, p = .170) emerged. Young adults considered all the proposed robots equally useful since no significant differences emerged among PQ scores attributed to robots (F (2,112) = .704, p = .478).

Hedonic qualities (identity) No significant effects of participants’ gender (F (1,56) = .160, p = .691) or robots’ gender (F (1,56) = .673, p = .416) and no significant interaction between robots’ and participants’ gender (F (1,56) = 3.127, p = .082) emerged. Young adults considered the proposed robots equally original, creative, presentable, and aesthetically pleasing, since no significant differences emerged among the HQI scores attributed to robots (F (2,112) = .395, p = .675). A significant interaction emerged between robots’ gender and HQI scores (F (2,112) = 5.191, p = .007). Bonferroni’s post hoc tests revealed that young adults rated the male robot Romeo (mean = 24.633) as significantly more engaging than its female counterpart Pepper (mean = 28.944, p = .014).

Hedonic qualities (feeling) No significant effects of participants’ gender (F (1,56) = .008, p = .930) or robots’ gender (F (1,56) = 1.422, p = .238) and no significant interaction between robots’ and participants’ gender (F (1,56) = 3.008, p = .088) emerged. Young adults considered all the proposed robots similarly able to arouse positive feelings, since no significant differences emerged among the HQF scores attributed to robots (F (2,112) = .150, p = .861).

Attractiveness No significant effects of participants’ (F (1,56) = .015, p = .904) and robots’ gender (F (1,56) = 1.768, p = .189) and no significant interaction emerged between robots’ and participants’ gender (F (1,56) = 3.092, p = .084). Young adults considered all the proposed robots equally attractive since no significant differences emerged among ATT scores attributed to robots (F (2,112) = .048, p = .954).

In summary, young adults did not express particular preferences toward a specific robot type (android vs humanoid), gender (male vs female), or ethnicity (Caucasian vs Asian). The exceptions were a stronger preference for Sophia among female than among male young adults and, again among female young adults, a stronger preference for Romeo than for Yuri. Mean values of the assessed dimensions showed a preference toward humanoid rather than android robots, with no effect of ethnicity. These results are depicted in Figs. 15 and 16.

Fig. 15

Young adults’ evaluations of android and humanoid robots. No significant differences were observed among PQ, HQI, HQF, ATT, and willingness to interact scores attributed to the android (Caucasian and Asian) and humanoid robots

Fig. 16

Young adults’ evaluations of female and male android and humanoid robots. Curly brackets with a “*” above for the HQI scores indicate that Romeo was significantly more engaging than Pepper

For the sake of completeness, Figs. 17 and 18 report seniors’ preferences toward different robots’ type (android vs humanoid), ethnicity (Caucasian vs Asian), and gender (male vs female). These data are not discussed here since they have already been discussed in Esposito et al. (2020a). Figure 17 clearly shows that seniors’ preferences are toward android rather than humanoid robots. In particular, seniors were significantly more willing to interact with the Asian android robots, and considered them significantly more engaging (HQI scores) and more able to arouse positive emotions (HQF scores), than the Caucasian android and humanoid robots. However, when the preferences are detailed for each of the proposed robots, it clearly appears that seniors preferred Erica and Albert, which always received higher, and in some cases significantly higher, scores (in willingness to interact, PQ, HQI, HQF, and ATT) than Sophia, Yuri, Romeo, and Pepper (see Fig. 18). In this respect, seniors’ preferences toward the proposed robots differ significantly from those expressed by young adults, which, in turn, differ significantly from those of adolescents. The imagery underlying users’ preferences for robots is an obscure variable, where the age of interactants and robots’ appearances play a significant role in defining long-lasting and successful interactions between humans and complex autonomous systems such as robots.

Fig. 17

Seniors’ evaluations of female and male android and humanoid robots. Curly brackets with a “*” above indicate where the PQ, HQI, HQF, ATT, and willingness-to-interact scores differ significantly among android (Caucasian and Asian) and humanoid robots

Fig. 18

Seniors’ evaluations of female and male android and humanoid robots. Curly brackets with a “*” above for the HQI, ATT, and willingness-to-interact scores indicate significant differences

4 Conclusion

The present study aimed to investigate robots’ acceptance by adolescents, young adults, and seniors. The variables at stake were robots’ gender, their appearance (android vs humanoid) and, within the android robots, their ethnic traits (Asian vs Caucasian). Two experiments were conducted to explore the effects of these variables on robots’ acceptance, the first exploiting video clips depicting female robots and the second video clips depicting male robots, each involving three differently aged groups of adolescents, young adults, and seniors. Data showed that adolescents and young adults were more willing than seniors to interact with female robots, and that among these robots they preferred the humanoid Pepper to the androids Erica and Sophia. Seniors were more reluctant to interact with robots, and they preferred Erica to Pepper and Sophia. In addition, adolescents and young adults rated female robots significantly more positively than elders in terms of usefulness, pleasantness, appeal, and engagement (higher PQ, HQI, HQF and ATT scores). Among robots, adolescents’ preferences were strongly for Pepper, and elders’ preferences were strongly for Erica. Young adults’ preferences were not clearly oriented toward a specific female robot. Young adults showed a greater willingness to interact with Pepper and judged it more useful (higher PQ scores) than Sophia and Erica, but then considered Pepper and Sophia equally more appealing, pleasant, and engaging (higher HQI, HQF and ATT scores) than Erica. The abovementioned effects were not modulated by the gender of seniors, young adults, and adolescents.

Adolescents, young adults, and seniors showed clear differences between the attributed and preferred age of female robots, preferring them older than they appear to be. Adolescents would entrust Pepper, more than Erica and Sophia, with all the proposed occupations (healthcare, housework, front office and protection and security). Young adults rated Pepper more suitable than Erica and Sophia for housework and protection and security, while Sophia and Erica were considered more suitable than Pepper for front office and healthcare occupations. Seniors entrusted Erica, more than Pepper and Sophia, with the proposed occupations (healthcare, housework, front office and protection and security). Specifically, seniors always rated female robots significantly worse than adolescents and young adults did on the investigated acceptance dimensions (willingness to interact, usefulness (PQ scores), appeal (HQI scores), pleasantness (HQF scores), and engagement (ATT scores)), and expressed less confidence in their suitability to perform all the proposed occupations, clearly showing reluctance to deal with such assistive technologies.

Seniors were more willing to interact with Erica and rated it higher than Pepper and Sophia on all the investigated acceptance dimensions (appeal, pleasantness, and engagement) and on suitability for the proposed robots’ occupations. In general, adolescents and young adults showed more willingness to interact with male robots and rated them more useful, pleasant, appealing, and engaging (higher PQ, HQI, HQF, and ATT scores) than seniors did. Young adults preferred Romeo in terms of willingness to interact, usefulness, pleasantness, and engagement (higher PQ, HQI and ATT scores) and rated Albert and Romeo similarly in terms of appeal (HQF scores). Adolescents rated Albert and Romeo similarly in terms of willingness to interact but then rated Romeo more useful, appealing, and engaging (higher PQ, HQF, and ATT scores) than Albert and Yuri, while they considered Albert and Yuri more pleasant than Romeo (higher HQI scores). Seniors expressed more willingness to interact with Romeo than with Albert and Yuri but rated Albert more useful, pleasant, appealing, and engaging (higher PQ, HQI, HQF and ATT scores) than Yuri and Romeo.

Concerning robots’ age, clear differences emerged between the attributed and preferred age of male robots. All participants seemed to prefer male robots younger than they appear to be, except for Romeo, which young adults and seniors preferred older than it appears to be.

Concerning suitability for the proposed occupations, young adults considered Romeo more suitable for housework and healthcare than Albert and Yuri, Albert more suitable for front office than Romeo and Yuri, and Yuri more suitable for protection and security than Albert and Romeo. Adolescents considered Romeo more suitable for housework and protection and security than Albert and Yuri, and Albert more suitable for front office and healthcare than Yuri and Romeo. Seniors considered Albert more suitable than Yuri and Romeo for all the proposed occupations.

Concerning differences between male and female robots, seniors expressed more willingness to interact with Erica and Albert and rated them more useful, pleasant, appealing, and engaging than Sophia, Pepper, Yuri, and Romeo, with Erica slightly preferred to Albert in all the acceptance dimensions except for engagement, where Albert was slightly preferred to Erica. This result may suggest a preference of seniors for female robots. However, Sophia was the worst rated by seniors (followed by Pepper and Romeo) in all the proposed acceptance dimensions. Young adults expressed more willingness to interact with Pepper and Romeo than with Sophia, Erica, Yuri, and Albert, scoring female and male robots similarly and slightly preferring the female ones. In addition, they considered the male robots more appealing and engaging (higher HQF and ATT scores) than female robots, preferring Romeo to Albert and Yuri. However, Erica was rated the most useful and pleasant, followed by Romeo, Pepper and Sophia with similar scores for usefulness (similar PQ scores), while Yuri and Albert were rated less useful. Albert and Romeo received similar and better scores than Yuri and Sophia for pleasantness. Adolescents expressed more willingness to interact with Pepper and rated it more useful and pleasant (higher PQ and HQI scores) than Romeo, Sophia, Erica, Yuri, and Albert. Pepper and Romeo were rated similarly, and as more appealing and engaging (HQF and ATT scores) than Albert, Yuri, Erica, and Sophia.

Going back to the research questions considered in the introduction, the current data do not seem to add support to the uncanny valley theory, since a clear preference did not emerge between humanoid and android robots, even though adolescents appeared to be more confident with humanoid robots than young adults and seniors. Adolescents’ preferences toward humanoid robots do not seem to derive from their being scared by the human likeness of android robots; rather, adolescents rated humanoid robots more engaging and pleasant to the extent that they considered them as toys to play with. Along this same line of reasoning, the opposite preferences of seniors toward android robots seem to derive from the fact that seniors consider humanoid robots more as toys than as possible assistants, suggesting an intermingled age and appearance effect on robots’ acceptance dimensions. Finally, no effect of participants’ and robots’ gender was observed. Concerning attributed vs preferred age of robots, both participants’ age and robots’ gender effects were observed. In this respect, adolescents and young adults attributed to robots, and expressed a preference for, age ranges close to their own age, while seniors preferred robots’ age ranges clearly younger (between 28 and 39 years old) than their own age. Regarding robots’ gender effects on the attributed and preferred robots’ age ranges, it was observed that female robots appeared to participants younger than the age they would have preferred, while male robots appeared to participants older than the age they would have preferred. Finally, about occupations entrusted by the three differently aged groups of participants to the proposed robots, no effects of participants’ age (adolescents, young adults, and seniors), robots’ gender (male vs. female) and robots’ appearance (humanoids vs androids) were observed. These data suggest that neither participants’ age or gender nor robots’ human likeness affected users’ acceptance.
Rather, acceptance was determined by a nonlinear combination of different factors, which remain to be determined. What we learned from these data is that the imagery underlying users’ preferences for robots is an obscure variable, where the age and gender of interactants and robots’ appearances are intermingled with other factors that are hard to identify and measure through the current experimental set-up. These factors can originate from cultural and social conducts, cognitive efforts and personal needs, education levels, and environmental constraints. However, these factors may play a significant role in defining long-lasting and successful interactions between humans and complex autonomous systems such as robots. More data need to be collected to establish secure guidelines for the design and implementation of humanoid or android robots that are well accepted by their users.