1 Introduction

The great ‘robot invasion’ has been expected for decades [1]. Robots have been described as becoming ‘ubiquitous’ in the near future (cf. [14, 36, 41]). Instead, it seems more likely that we will have to wait some more years until we actually welcome robots as social interaction partners into our lives. Social robots capable of engaging in social interactions or able to recognize emotions are still rare [2]. Even if technological progress eventually enables the development of functional social robots, user reactions and attitudes are still hard to predict. Potential users have high hopes for the use of robots [15]. However, these hopes may currently easily be crushed by incompetent robots, causing disappointment and a hesitation to use robots at all [16, 20]. Relatedly, users have serious and justified concerns about robot use, e.g., regarding security or privacy [8, 21, 40]. As a potential reinforcement of such negative evaluations, robots are likely to be seen as an outgroup to humans. As such, they are prone to suffer discrimination, like other discriminated groups [3, 10], and even bullying [18]. Such discrimination might in part be caused by evaluative conflict on the individual level, namely attitudinal ambivalence [38]. Potential robot users feel torn between aspirations and concerns associated with the use of social robots, resulting in ambivalent attitudes towards robots.

Ambivalence may strongly impact attitudes and behavioral intentions towards robots and should therefore be investigated in detail. In the current work, we extended previous work that commonly assessed ambivalence towards robots using self-reports [38]. To this end, we measured ambivalence towards a variety of robot-related stimuli both explicitly via self-report and implicitly via mouse tracking, and investigated its cognitive and behavioral consequences.

2 Related Work

2.1 Ambivalence Towards Robots

Social psychological research has recently investigated ambivalence towards robots in general. Despite often being described as neutral, attitudes towards robots actually seem to be highly ambivalent [38]. That is, attitudes towards robots encompass strong positive and strong negative evaluations at the same time. Such ambivalent attitudes cause negative affect and the experience of conflict, of being torn between two sides of an attitude. In contrast, neutral attitudes imply weak positive and negative evaluations. As such, neutral attitudes do not cause strong affective responses. Previous research has shown that neutral and ambivalent attitudes can easily be confused, depending on the measurement method used [33]. The distinction between ambivalence and neutrality is practically relevant because of its affective and behavioral consequences, e.g., higher arousal or decision delay, which might result in potential users’ reluctance to engage with robots. Previous research on ambivalence towards robots has predominantly relied on self-report measures concerning ambivalent attitudes and their affective consequences [38]. To extend this work, we applied a response-time-based method, thereby measuring the magnitude and resolution of ambivalence both implicitly and on a behavioral level. Our work was based on a theoretical framework that represents the affective, cognitive, and behavioral aspects of ambivalence, the ‘ABC of Ambivalence’ by van Harreveld, Nohlen, and Schneider [47]. According to this model, subjective ambivalence, the subjective experience of conflicting evaluations, results in negative affect, since subjective ambivalence indicates an unpleasant state of conflict and arousal. Further, ambivalence can be observed in behavioral indicators (i.e., decision delay and motor behavior) and cognitive indicators (i.e., compensatory cognitions and systematic processing).
In the current work, we aimed to replicate the findings concerning cognitive and behavioral indicators of ambivalence in the domain of social robotics, while extending the notion of interindividual differences in the experience of ambivalence.

2.2 Behavioral Indicators of Ambivalence

Ambivalence in attitudes influences behavior. Specifically, it causes choice delay and diverging motor behavior [47]. Tracking mouse trajectories represents a reliable method reflecting the decision-making process, measuring choice delay and implicit indicators of conflict in motor behavior [13]: In a common mouse tracking task, the mouse cursor is fixed at a starting point. Buttons in the top corners of the screen have to be reached to make an evaluation (e.g., ‘positive’ and ‘negative’). A stimulus appears in the middle of the screen and participants are asked to quickly move the mouse to the answer button of their choice, while their trajectories are recorded. In addition to overall decision times, tracking mouse trajectories provides the opportunity to measure ‘Maximum Deviation’ (MD). During the evaluation task, when recording the path of the mouse cursor, MD indicates the point at which the trajectory deviates the most from an ideal path from the starting point to the chosen answer button. To illustrate, responses towards univalent stimuli usually follow a straight line, and ambivalent responses show a ‘pull’ towards the non-chosen option [13] (see Fig. 1). While response times unspecifically indicate overall decision difficulty or processing difficulty, MD specifically indicates attitudinal conflict [32]. In addition, MD-time can be assessed. This is the point in time at which the highest deviation from the direct cursor path occurs. MD-time can thus be interpreted as the moment of highest conflict, after which the experienced conflict is then resolved.
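
The two trajectory measures described above can be made concrete with a small sketch. It is only an illustration under stated assumptions, not the MouseTracker implementation: the function name `max_deviation` and the sample coordinates are hypothetical, the ideal path is taken to be the straight line between the first and last recorded sample, and the deviation is treated as an unsigned perpendicular distance without any coordinate or time normalization.

```python
import math


def max_deviation(xs, ys, ts):
    """Return (md, md_time) for one recorded cursor trajectory.

    md      : largest perpendicular distance between any sample and the
              straight line from the first to the last sample
    md_time : time elapsed since the first sample when that distance occurs

    Assumption: unsigned distance, no normalization of space or time.
    """
    x0, y0 = xs[0], ys[0]
    x1, y1 = xs[-1], ys[-1]
    length = math.hypot(x1 - x0, y1 - y0)
    if length == 0:
        raise ValueError("start and end point coincide")

    best_dev, best_t = 0.0, ts[0]
    for x, y, t in zip(xs, ys, ts):
        # |cross product| / line length = perpendicular distance to the line
        dev = abs((x1 - x0) * (y - y0) - (y1 - y0) * (x - x0)) / length
        if dev > best_dev:
            best_dev, best_t = dev, t
    return best_dev, best_t - ts[0]


# Hypothetical trajectory: the cursor first drifts to the left (towards the
# non-chosen option) before ending on the right-hand response button.
xs = [0.0, -0.2, -0.3, 0.4, 1.0]
ys = [0.0, 0.4, 0.8, 1.0, 1.0]
ts = [0, 100, 200, 300, 400]  # milliseconds
md, md_time = max_deviation(xs, ys, ts)
```

In this made-up trajectory the largest deviation occurs at the third sample, so MD-time falls in the middle of the response; a perfectly straight movement would yield an MD of zero.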

Fig. 1

Screenshot of an example mouse tracking task. The mouse cursor starts on the next button and the participant moves it to one of the response buttons (‘positive’ and ‘negative’, in German) during decision making

To illustrate, Schneider and colleagues [32] have used mouse tracking to demonstrate longer response times (i.e., choice delay) during decision making concerning ambivalent stimuli. Choice delay is one indicator of evaluative conflict. Moreover, participants’ mouse trajectories during a dichotomous decision-making task did not follow a straight line between the starting point and the response button. Instead, mouse movements deviated significantly from the straight line from starting point to answer button when attitudes were ambivalent (i.e., diverging motor behavior). That is, the feeling of being ‘torn’ between positive and negative evaluations translated directly to mouse movements during the decision. To date, aside from ambivalence research, mouse tracking methodology has often been used in cognitive categorization tasks, e.g., categorizing human versus non-human agents [51], or investigating the uncanny valley [50].

In the current research, we applied the mouse tracking methodology to the domain of attitudes towards social robots as a tool to assess implicit behavioral measures of ambivalence. Further, we operationalized choice delay not only in terms of a delay in response times, but also in terms of a low score on contact intentions. Contact intentions were measured via self-report to assess participants’ readiness to meet and interact with robots.

2.3 Cognitive Indicators of Ambivalence

In addition to behavioral indicators, ambivalent attitudes are associated with specific cognitive indicators, distinguishing ambivalent from univalent or neutral attitudes. Specifically, ambivalent attitudes are associated with systematic processing and compensatory cognitions [47]. Previous research has shown that ambivalence leads to systematic processing of attitude-relevant information. This might include seeking out further information about the attitude object in order to reduce ambivalence and come to a non-conflicted attitude [7]. People who hold univalent attitudes do not experience conflict. Accordingly, they might not feel the need to obtain further information on the attitude object. In contrast, individuals who hold ambivalent attitudes are motivated to resolve their conflict, following consistency motives [12]. This attitudinal conflict can be resolved by obtaining further information on the respective attitude object. Therefore, people with ambivalent attitudes might welcome further information on an attitude object in order to resolve the unpleasant state of attitudinal conflict.

Further, they might engage in compensatory cognitions. Compensatory cognitions are a mechanism to attenuate experienced conflict and may or may not be related to the respective attitude object. For example, concerning the attitude object itself, participants might specifically commit to one side of their evaluations to reduce conflict [47]. This process is called affirmation [26]. Affirmation can also be unrelated in content to the ambivalent attitude object, such as compensating for the conflict by finding order in distorted images or even reporting stronger conspiracy beliefs [46].

2.4 Interindividual Differences

Whereas ambivalence has certain behavioral and cognitive indicators, ambivalence itself is influenced by various interindividual differences.

2.4.1 Technology Commitment

Not everyone resolves evaluative conflict in the same way, and interindividual difference variables (i.e., attitudes, traits etc.) might play a role in this process. One of these interindividual differences is technology commitment, which might be specifically important concerning novel technologies such as robots. Technology commitment refers to people’s general affinity and their ease of use of technology [24]. Previous research has provided evidence for the fact that people high in technology commitment feel less conflicted about robots [38]. It is therefore plausible that people high in technology commitment experience less conflict concerning robots overall, or, alternatively, they reach the point of highest conflict (MD-time) earlier in the decision making process, as measured via mouse tracking.

2.4.2 Self-Control

Self-control also plays a key role in the context of ambivalence research. The relationship between self-control and ambivalence has been reviewed in a recent meta-analysis [34]. People high in self-control turn out to be more successful and better adjusted (e.g., they report less psychopathology, higher self-esteem, and more optimal emotional responses) [42]. Moreover, self-control facilitates efficient conflict resolution. This may be helpful when trying to resolve ambivalence. Namely, self-control leads to an earlier moment of highest conflict (MD-time), but not to less conflict overall (MD), as measured via mouse tracking. That is, people high in self-control resolve attitudinal conflict earlier in the decision-making process. Therefore, self-control might also influence decision-making towards robots.

2.4.3 Proclivity to Anthropomorphize

Another construct that might be relevant in the context of evaluating robots may be the individual tendency to anthropomorphize nonhuman entities. This entails the attribution of humanlike characteristics to nonhuman entities [49]. The tendency to anthropomorphize likewise influences attitudes towards robots, especially increasing empathy and trust [29]. In the current research, we explored whether participants’ tendency to anthropomorphize might influence the magnitude and resolution of attitudinal conflict. Specifically, participants with a high level of anthropomorphization proclivity might report more positive attitudes towards robots, since they see them as more human-like and therefore experience less conflict. However, the opposite mechanism is equally plausible, since a robot high in humanlikeness in terms of appearance or behavior might be perceived as both threatening and likeable at the same time. This, in turn, would increase ambivalent attitudes.

2.4.4 Big Five Factors of Personality

Finally, personality traits, such as the Big Five factors of personality (i.e., agreeableness, openness to experience, conscientiousness, neuroticism, and extraversion) might influence ambivalent attitudes towards robots. For instance, extraverted people report more positive attitudes towards robots [17]. We explore possible relationships of the Big Five with ambivalence towards robots in the present research.

3 The Present Research

In the present work, we aimed to extend previous research concerning ambivalent attitudes towards robots by replicating it with various robot-related stimuli in four experiments. Further, we additionally investigated behavioral and cognitive indicators of ambivalence in attitudes towards robots with the help of self-report and response-time-based measures. This way, we tested the applicability of the ABC of Ambivalence [44] to the domain of social robotics and gathered further detailed insights into ambivalent attitudes towards robots on the affective, behavioral, and cognitive level. Specifically, in Experiments 1 and 2, we employed robot-related words as stimuli (robot categories and robot functions in Experiment 1; general robot-related words in Experiment 2) and tested them against univalent words. In Experiments 3 and 4 we employed robot pictures as stimuli (machine-like robots and humanoid robots in Experiment 3; social robots in Experiment 4) and tested them against univalent pictures. Further information can be found in the Experimental Manipulation section of the respective experiment.

All four experiments were approved by Bielefeld University’s Ethics Committee (applications No. 2019-237 of 19/11/06 and No. 2020-094 of 20/06/15). For all four experiments presented in the following, we report how we determined our sample size, all data exclusions (if any), all manipulations, and all measures in the study or in the respective preregistrations. As effect sizes, we report Cohen’s d [4].

4 Experiment 1

In Experiment 1, we conducted a laboratory-based experiment, assessing ambivalence towards robot-related words reflecting robot categories and robot functions via self-report and mouse tracking. Furthermore, we explored the cognitive consequences of ambivalence towards robots qualitatively. We further investigated the role of the interindividual difference variables self-control and technology commitment in the experience and resolution of attitudinal ambivalence.

We used different types of robot-related words as stimuli, namely robot categories and robot functions. When investigating attitudes towards robots, it might be essential to specify the type of robot that is being investigated, since, e.g., an industrial robot is fundamentally different in its functions and appearance from a social robot. Within Experiment 1, we examined five prominent robot categories (i.e., service robot, industrial robot, medical robot, exploration robot, social robot). Concerning robot functions, we investigated five defining functions of a social robot that is currently under development, namely the VIVA robot (https://navelrobotics.com/viva/). VIVA was developed as a social robot for home use that is able to carry out short conversations with the user, recognize emotions and react accordingly, and plan and carry out actions autonomously [37, 39]. We investigated several robot functions since evaluations might differ greatly depending on the respective function, while all functions represent important aspects that constitute various autonomous robots (social function, personalizability, mobility, video function, voice control).

We assumed that ambivalence in attitudes towards robots would reflect in behavioral indications of evaluative conflict. Thus, as preregistered (https://aspredicted.org/x5qq9.pdf), we hypothesized that MD, as a measure of behavioral indicators of ambivalence, would be higher for robot stimuli compared to univalent stimuli (H1a, H1b). To replicate previous findings concerning self-reported ambivalence towards robots [38], we hypothesized that objective (H2) and subjective ambivalence (H3) would be higher for robot stimuli than for univalent stimuli. Moreover, to investigate the role of interindividual differences on the resolution of ambivalence, we hypothesized that participants high in technology commitment would experience less ambivalence overall, and thus, show lower MD (H4a). Finally, to investigate the influence of interindividual differences on the resolution of conflict, we assessed MD-time. MD-time was defined as the time-point of the highest deviation and thus, the highest conflict. This measure marked the time of conflict resolution, independent from the overall response-time. We hypothesized that participants high in technology commitment (H4b) or self-control (H5) would resolve their attitudinal conflict more quickly, and, in turn, would show lower MD-time.

  • H1a: MD is higher for robot category stimuli compared to univalent stimuli.

  • H1b: MD is higher for robot function stimuli compared to univalent stimuli.

  • H2: Objective Ambivalence is higher for robot stimuli than for univalent stimuli.

  • H3: Subjective Ambivalence is higher for robot stimuli than for univalent stimuli.

  • H4a: Technology commitment would correlate negatively with MD.

  • H4b: Technology commitment would correlate negatively with MD-time.

  • H5: Self-control would correlate negatively with MD-time.

4.1 Method

4.1.1 Participants and Design

118 participants were recruited at Bielefeld University to participate in a 15-minute laboratory study for a raffle of three 20 vouchers or course credit, and sweets. As preregistered, we excluded 7 participants who failed the attention check, resulting in 111 valid cases (\(\textit{M}_{age} = 23.11\), \(\textit{SD}_{age} = 3.79\); 43 female, 57 male, 11 not specified). We calculated a required sample size of 111 using the software G*Power [11] to achieve 95 % power with an alpha error of .05 and a medium effect. We employed a one-factorial within-participants design with two levels (stimulus type: robot stimuli vs. univalent stimuli).

4.1.2 Experimental Manipulation

We used ten univalent words (i.e., happy, holiday, in love (one word in German), sunshine, vegetable, abuse, depressed, disgust, unhappy, cockroach) for comparison based on previous research investigating ambivalence [6, 32]. In Experiments 2–4, we also employed univalent stimuli, i.e., stimuli that do not evoke ambivalence but are rather perceived as clearly positive or negative. For the robot conditions, we chose five robot category words and five robot function words. For the robot categories, we chose five prominent categories (service robot, industrial robot, medical robot, exploration robot, social robot) and provided a short explanation for each. That is, the service robot was introduced as a robot providing service for people, the industrial robot as being able to handle and assemble work pieces, the medical robot as performing medical tasks, such as surgery, diagnostics and care, the exploration robot as being used in places that are dangerous for people, and the social robot as being capable of basic social interactions.

We chose the robot functions based on qualitative data from a previous experiment [39]. There, participants had indicated potential advantages and disadvantages of a social robot, which mainly concerned five functions of the robot (social function, personalizability, mobility, video function, voice control). We concluded that these functions seemed to be important for participants’ evaluations of social robots and we therefore included them in the current research. To illustrate the practical significance of the functions, we introduced them with positive and negative remarks provided about the VIVA robot in a previous experiment [39].

We further included five ambivalent stimuli (i.e., abortion, organ donation, euthanasia, alcohol, candy [6, 32]) for exploratory reasons. They are not included in the hypotheses or results, but they are available in the provided dataset. In total, the experiment consisted of 25 trials in which participants categorized the respective stimulus as ‘positive’ or ‘negative’, following [32], who included between eight and 24 trials per experiment.

4.1.3 Measures

Unless otherwise indicated, self-report measures consist of seven-point scales ranging from ‘not at all’ to ‘very’.

Mouse Tracking We assessed the magnitude of evaluative conflict by observing mouse trajectories during responding with the validated MouseTracker software [13]. Here, the path of the mouse cursor during the evaluation task is recorded, ranging from a set starting point to one of two choices (positive, negative). This path shows a stronger ‘pull’ towards the non-chosen option when evaluating ambivalent attitude objects, compared to univalent attitude objects, operationalized as MD [32]. In addition, response times were recorded as an indicator of choice delay, and MD-time was recorded as an indicator of the point of highest conflict.

Subjective Ambivalence We assessed subjective ambivalence towards each stimulus with one item reading ‘To what degree do you experience conflicting thoughts and/or feelings?’ The seven-point scale ranged from ‘no conflicting thoughts/feelings’ to ‘completely conflicting thoughts/feelings’, cf. [25].

Objective Ambivalence We assessed objective ambivalence with two items, assessing the positive and the negative sides of the attitude separately, reading ‘How positive [negative] do you find this?’. The values were integrated into a quantitative score of objective ambivalence using the following formula: [(P + N)/2] - |P - N| [43]. P is substituted by the positive values and N by the negative values. Low values for both evaluation sides or for one evaluation side result in a low score, while high values for both evaluation sides result in a high score for objective ambivalence, indicating the objectively opposing evaluations with the potential to cause attitudinal conflict.
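
As a worked illustration of the formula above, the following sketch computes the score for a few rating pairs. The function name and the example ratings are hypothetical, and the sketch assumes that both items are coded from 1 to 7; the source does not state the numeric coding of the scale.

```python
def objective_ambivalence(positive, negative):
    """Objective ambivalence score: [(P + N) / 2] - |P - N|.

    positive, negative: separate ratings of the positive and the negative
    side of the attitude (assumed here to be coded 1..7).
    """
    return (positive + negative) / 2 - abs(positive - negative)


# Hypothetical rating pairs on an assumed 1..7 coding
both_high = objective_ambivalence(7, 7)   # strong positive AND strong negative
one_sided = objective_ambivalence(7, 1)   # clearly positive, hardly negative
both_low = objective_ambivalence(1, 1)    # weak on both sides (neutral)
```

Under this coding, the score is highest when both sides are rated strongly, lower when both are weak, and lowest when one side clearly dominates, which matches the verbal description of the index.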

Technology Commitment We assessed technology commitment with eight items (adapted by [28]; original version by [24]). The questionnaire features the subscales technology acceptance, e.g., ‘I like to use the newest technological devices.’, and technology competence, e.g., ‘I find it difficult to deal with new technology.’ (reverse-coded). The subscale technology control is not included in this version due to its lack of discriminant validity and internal consistency, cf. [24].

Self-Control We assessed trait self-control with the Brief Self-Control Scale [42], consisting of 13 items, e.g., ‘I am good at resisting temptations.’.

Qualitative Items To extend our insights into potential users’ specific evaluations regarding robots, we included five items, three of which used an open response format and were analyzed qualitatively. These items read ‘Which benefits and disadvantages of robots cause an inner conflict in you?’ (open answer format), ‘Which robot function causes an inner conflict in you?’ (six options with the presented robot functions, and the option ‘none’), ‘Why did you choose this function?’ (open answer format), ‘Which robot category causes an inner conflict in you?’ (six options with the presented robot categories, and the option ‘none’) and ‘Why did you choose this category?’ (open answer format).

Demographics and Attention Check We assessed gender, age, and German language skills. We included a check of data quality by asking participants whether they had participated meticulously.

4.1.4 Procedure

Participants were told that they would be asked for their opinion about a new social robot and robots in general. After reading the instructions and providing informed consent, participants were presented with an image and a short description of a social robot. After a first attention check asking for the name of the robot, they saw descriptions of five of its functions together with positive and negative statements about these functions that were collected from a previous study. We selected these functions, because qualitative results from a previous study indicated that participants hold positive as well as negative evaluations towards these functions [39]. We aimed to make this potential ambivalence salient by presenting the participants in this study with positive and negative aspects of the function, since people do not encounter robots in their daily lives [2], and might not have formed strong opinions yet about various robot functions. We randomized whether the positive or the negative statement was presented first. To introduce participants to various robot categories, participants were presented with five pictures of robot categories (i.e., service robot, industrial robot, medical robot, exploration robot, social robot) and a short description, respectively. To ensure attention, participants were told to memorize the descriptions for a subsequent memory task. In the memory task, participants were asked to pair the robot category with the right description. Then, participants completed the mouse tracking task, evaluating five positive, five negative, and five ambivalent stimuli, the five robot functions and the five robot categories as ‘positive’ or ‘negative’. Then, we presented all stimuli again and assessed self-reported objective and subjective ambivalence for each stimulus. Finally, we assessed self-control and technology commitment, qualitative questions, demographic data, and a final attention check. 
Participants were debriefed, thanked, and dismissed.

4.2 Results and Discussion

4.2.1 Main Analyses

Mouse Tracking We conducted dependent t-tests with stimulus type as the independent variable to investigate the main hypotheses concerning MD as an indicator of ambivalence towards robots. MD was higher towards robot stimuli overall (\(\textit{M} = 0.39\), \(\textit{SD} = 0.22\)), than towards univalent stimuli (\(\textit{M} = 0.33\), \(\textit{SD} = 0.28\), \(\textit{t}(108) = 2.15\), \(\textit{p} =.017\), \(\textit{d} = 0.20\); see Fig. 2). When investigating the stimulus categories individually, contrary to our expectations (H1a), MD was not significantly higher towards robot category stimuli (\(\textit{M} = 0.37\), \(\textit{SD} = 0.24\)) than towards univalent stimuli (\(\textit{t}(110) = 1.45\), \(\textit{p} = .075\), \(\textit{d} = 0.14\)). However, in line with Hypothesis H1b, MD was higher towards robot function stimuli (\(\textit{M} = 0.40\), \(\textit{SD} = 0.29\)) than towards univalent stimuli (\(\textit{t}(108) = 2.17\), \(\textit{p} =.016\), \(\textit{d} =.20\)). This indicates a small effect size, which might be observed more consistently with larger sample sizes. Moreover, the lack of a significant difference in MD between univalent stimuli and robot category stimuli could be due to a lack of information about the presented robot categories. When a limited amount of information is accessible, attitudes might tend to be ambiguous, rather than ambivalent, potentially resulting in less evaluative conflict [31]. Concerning the robot category stimuli, participants were provided with a picture and a short description of the robot, while for the robot function stimuli, they were provided with pro and con arguments from peers. This way, ambivalence might have been more successfully induced towards robot functions, compared to robot categories.
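
The dependent t-tests reported here compare two conditions within the same participants. A minimal sketch of that computation follows, using only the standard library. The per-participant values are made up for illustration, the function name is hypothetical, and the effect size shown is the standardized mean difference of the paired differences (often written d_z), which is an assumption: the text does not specify which variant of Cohen's d was computed.

```python
import math
import statistics


def paired_t_and_dz(cond_a, cond_b):
    """Dependent-samples t statistic, degrees of freedom, and d_z.

    cond_a, cond_b: per-participant means in two within-subject conditions,
    in the same participant order.
    """
    if len(cond_a) != len(cond_b):
        raise ValueError("both conditions need one value per participant")
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)          # sample SD of the differences
    t_value = mean_diff / (sd_diff / math.sqrt(n))
    d_z = mean_diff / sd_diff                  # assumed effect size variant
    return t_value, n - 1, d_z


# Hypothetical per-participant MD means (not data from the experiment)
robot_md = [0.5, 0.4, 0.6, 0.3]
univalent_md = [0.3, 0.3, 0.4, 0.3]
t_value, df, d_z = paired_t_and_dz(robot_md, univalent_md)
```

With this definition, t and d_z differ only by a factor of the square root of the number of participants, which is why a small standardized difference can still reach significance in a sufficiently large sample.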

Fig. 2

Means, standard errors and statistical significance of maximum deviation, response time, objective ambivalence and subjective ambivalence for each condition in Experiment 1

Furthermore, in line with previous work (e.g., [32, 44]), overall response times in milliseconds were higher when evaluating robot-related stimuli (\(\textit{M} = 1701.36\), \(\textit{SD} = 364.95\)) compared to univalent stimuli (\(\textit{M} = 1444.46\), \(\textit{SD} = 251.48\), \(\textit{t}(108) = 10.34\), \(\textit{p} < .001\), \(\textit{d} = 0.71\)). This was the case for both robot category stimuli (\(\textit{M} = 1637.91\), \(\textit{SD} = 361.44\); \(\textit{t}(110) = 7.46\), \(\textit{p} < .001\), \(\textit{d} = 0.58\)) and robot function-related stimuli (\(\textit{M} = 1777.48\), \(\textit{SD} = 414.43\); \(\textit{t}(108) = 11.31\), \(\textit{p} < .001\), \(\textit{d} = 0.74\)), indicating medium to large effect sizes. Response times might be used as an additional parameter to reflect choice delay caused by ambivalence. However, they should be interpreted with caution: robot words, despite having been introduced at the beginning of the experiment, might be less familiar to most participants than the univalent words, which are often used in everyday life. Participants might therefore process the respective information more slowly and respond later.

Self-Report Measures To ensure convergent validity of the mouse tracking measurement, we also investigated self-reported ambivalence towards robots. In line with previous research [38], objective ambivalence was higher towards robot stimuli (\(\textit{M} = 2.57\), \(\textit{SD} = 1.01\)) than towards univalent stimuli (\(\textit{M} = 0.86\), \(\textit{SD} = 0.70\)), \(\textit{t}(110) = 17.47\), \(\textit{p} < .001\), \(\textit{d} = 0.86\) (H2). Both robot categories (\(\textit{M} = 2.07\), \(\textit{SD} = 1.06\); \(\textit{t} = 11.65\), \(\textit{p} < .001\), \(\textit{d} = 0.74\)) and robot functions (\(\textit{M} = 3.07\), \(\textit{SD} = 1.24\); \(\textit{t} = 18.83\), \(\textit{p} < .001\), \(\textit{d} = 0.87\)) evoked higher objective ambivalence than univalent stimuli, with large effect sizes. Furthermore, self-reported subjective ambivalence was higher towards robots (\(\textit{M} = 2.82\), \(\textit{SD} = 0.97\)) than towards univalent stimuli (\(\textit{M} = 1.60\), \(\textit{SD} = 0.61\)), \(\textit{t}(110) = 14.16\), \(\textit{p} < .001\), \(\textit{d} = 0.80\) (H3), indicating a large effect size. Both robot categories (\(\textit{M} = 2.52\), \(\textit{SD} = 1.02\); \(\textit{t}(110) = 11.65\), \(\textit{p} < .001\), \(\textit{d} = 0.74\)) and robot functions (\(\textit{M} = 3.13\), \(\textit{SD} = 1.13\); \(\textit{t}(110) = 18.83\), \(\textit{p} < .001\), \(\textit{d} = 0.87\)) evoked higher subjective ambivalence than univalent stimuli, indicating medium to large effect sizes. However, these differences in self-reported ambivalence apparently did not translate directly to motor behavior assessed via mouse tracking. We discuss possible causes for this outcome in the General Discussion section.

4.2.2 Interindividual Differences

To investigate the influence of interindividual differences on conflict resolution, we correlated technology commitment (\(\textit{M} = 4.98\), \(\textit{SD} = 1.10\), \(\alpha = .80\)) and self-control (\(\textit{M} = 4.23\), \(\textit{SD} = 0.98\), \(\alpha = .83\)) with mouse tracking data in robot-related trials. As predicted, technology commitment correlated moderately negatively with MD (\(\textit{r}(106) = -.35\), \(\textit{p} < .001\)). That is, people high in technology commitment experienced less conflict overall, compared to people low in technology commitment (H4a). Also, technology commitment correlated negatively with MD-time (\(\textit{r}(106) = -.21\), \(\textit{p} = .028\); H4b). This indicates that people high in technology commitment experience less ambivalence and also resolve ambivalence earlier in the decision-making process compared to people low in technology commitment. However, contrary to our expectations, self-control (\(\textit{r}(106) = .03\), \(\textit{p} = .786\); H5) did not significantly correlate with MD-time. That is, in this case, people high in self-control did not reach the moment of highest conflict, and thus conflict resolution, earlier than people low in self-control (cf. [34]).

4.2.3 Qualitative Data

To gain more detailed insights into the actual contents that might have caused ambivalence towards robots, we recorded evaluation contents in an open format. In this qualitative part of the experiment, we asked participants which robot functions would elicit attitudinal conflict. Participants provided 345 evaluations in total. Those evaluations were categorized by two raters into 18 categories, namely 8 assets (i.e., assistance, companionship, entertainment, usability, personalization, information, status, surveillance) and 10 risks (i.e., privacy, isolation, data security, discomfort, trouble, loss of autonomy, realistic threat, inhumanity, abuse, resources; based on [39]). The concern that robots could take over humans’ jobs was voiced frequently, so we created an additional risk category (i.e., robots taking over jobs). The most frequently voiced assets were assistance (82 mentions, e.g., ‘help in everyday life’), companionship (12 mentions, e.g., ‘social contact for lonely people’), and usability (14 mentions, e.g., ‘easy to use’). The most frequently voiced risks were privacy and data security concerns (57 mentions, e.g., ‘violation of privacy’), the fear of social contact being replaced by robots (33 mentions, e.g., ‘neglecting social interaction’), and robots taking over jobs (32 mentions, e.g., ‘can take jobs from humans’). In the ABC of Ambivalence [47], one of the cognitive consequences of ambivalence is engaging in compensatory cognitions. That is, participants try to attenuate experienced conflict by compensating via related cognitions (e.g., focusing on one side of the argument) or even unrelated cognitions (e.g., finding order in snowy pictures or showing higher belief in conspiracy theories).
In the current data, we found exploratory indications of compensatory cognitions: The data suggested that people might be prone to resolve their conflict by relying on especially strong arguments, such as ‘No matter how useful the robot is, I don’t want to use it if my data are not safe’ or ‘Technologies are useless if people suffer’. However, this interpretation is purely exploratory and might be investigated on a quantitative level in future experiments.

5 Experiments 2–4

Experiments 2 to 4 were conducted online because the COVID-19 pandemic restricted laboratory experiments. Since the original software was not compatible with online use, we conducted Experiments 2–4 with a new mouse tracking package for Qualtrics [22].

In these experiments, we aimed to establish the main effect of robot-related stimuli on MD as a behavioral indicator of ambivalence. We measured ambivalence via mouse tracking and self-report concerning various robot pictures and words. We further investigated the behavioral and cognitive indicators of ambivalence by measuring contact intentions towards robots and interest in further information on robots (information search). The preregistered hypotheses (https://aspredicted.org/4p3zh.pdf) for Experiments 2 to 4 were as follows:

  • H1: MD is higher for robot stimuli than for univalent stimuli.

  • H2: MD predicts lower contact intentions.

  • H3: MD predicts extended information search.

  • H4: Technology commitment correlates negatively with MD.

The required sample size was computed using G*Power [11]. A power analysis for a one-sided t-test for H1 expecting a medium effect (\(\textit{d} = 0.5\), \(\alpha = 0.05\), power \(1 - \beta = 0.95\)) revealed a required sample size of 45. A further analysis for the correlations in H2–4 (\(\textit{r} = 0.35\), \(\alpha = 0.05\), power \(1 - \beta = 0.95\)) revealed a required sample size of 100 per experiment. We therefore preregistered a sample size of 100 for each experiment.
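The sample size reasoning above can be approximated with a short sketch. It is not the exact routine used by G*Power: it relies on the normal approximation for a one-sided paired t-test and on the Fisher z approximation for a correlation, and it assumes a paired design for H1 and a two-sided test for the correlations (neither is stated explicitly in the text). The function names are hypothetical. Under these assumptions, the results should land within a few participants of the reported values of 45 and 100.

```python
from math import atanh, ceil
from statistics import NormalDist


def n_for_paired_t(d, alpha=0.05, power=0.95):
    """Approximate number of pairs for a one-sided paired t-test (normal approximation)."""
    z = NormalDist()
    return ceil(((z.inv_cdf(1 - alpha) + z.inv_cdf(power)) / d) ** 2)


def n_for_correlation(r, alpha=0.05, power=0.95, two_sided=True):
    """Approximate sample size to detect a correlation r (Fisher z approximation)."""
    z = NormalDist()
    # Assumption: two-sided test unless stated otherwise.
    z_alpha = z.inv_cdf(1 - alpha / 2) if two_sided else z.inv_cdf(1 - alpha)
    return ceil(((z_alpha + z.inv_cdf(power)) / atanh(r)) ** 2 + 3)


n_t = n_for_paired_t(0.5)        # H1: medium effect, d = 0.5
n_corr = n_for_correlation(0.35)  # H2-4: r = 0.35
print(n_t, n_corr)
```

Because both formulas are approximations, small deviations from the exact noncentral-distribution results are expected.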

6 Experiment 2

In Experiment 2, we investigated the behavioral and cognitive indicators of ambivalence towards robots by using robot-related words vs univalent words, while exploring interindividual differences in the experience of ambivalence.

6.1 Method

6.1.1 Participants and Design

171 complete datasets were collected in a 15-minute online experiment for a raffle of three 10 vouchers or course credit. As preregistered, we excluded data from participants who indicated not having participated meticulously (16 datasets) or not having used a mouse in the evaluation task (51 datasets). Participants using a touchscreen were excluded because a touchscreen cannot trace the path of the mouse; we also excluded participants using a mouse pad to keep the experimental conditions as similar as possible to Experiment 1. Further, we excluded invalid mouse tracking responses as specified in the MouseTracking software (3 datasets) and responses by participants not speaking German fluently (0 datasets). From the remaining 101 datasets, the last one was excluded to reach the preregistered sample of 100 datasets (\(\textit{M}_{age} = 33.87\), \(\textit{SD}_{age} = 14.90\), 59 female, 38 male, 1 diverse, 2 not specified). 53 participants were students, 29 were employed, 5 were self-employed, 1 was unemployed, 4 were retired, and 8 did not specify their occupation.

6.1.2 Experimental Manipulation

We used six univalent words from Experiment 1 (i.e., disgust, abuse, unhappy, happy, holiday, sunshine) and six robot-related words (i.e., android, humanoid, robot, robotics, robotic, robot-like). Each stimulus consisted of a single German word. This resulted in a total of twelve trials.

6.1.3 Measures

Mouse Tracking With the new online mouse tracking package for Qualtrics [22], Maximum Deviation (MD) and response times (in milliseconds) were recorded, whereas MD-time could not be obtained. Further measures automatically recorded by the software were not used in the current experiments.
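To illustrate what a maximum deviation measure captures, the following minimal sketch computes the largest perpendicular distance between a recorded cursor trajectory and the straight line from its first to its last sample. This is the common geometric definition and is an assumption here; it is not taken from the Qualtrics package [22], whose internal computation (e.g., coordinate normalization or signed deviations) may differ. The function name and the example trajectories are hypothetical.

```python
from math import hypot


def maximum_deviation(trajectory):
    """Largest perpendicular distance of any sample from the straight start-to-end line."""
    (x0, y0), (x1, y1) = trajectory[0], trajectory[-1]
    length = hypot(x1 - x0, y1 - y0)
    if length == 0:
        raise ValueError("start and end point coincide")
    # |cross product| / line length = perpendicular distance of (x, y) from the line
    return max(
        abs((x1 - x0) * (y - y0) - (y1 - y0) * (x - x0)) / length
        for x, y in trajectory
    )


# Hypothetical trajectories in normalized screen coordinates.
direct = [(0.0, 0.0), (0.5, 0.5), (1.0, 1.0)]  # straight path to the response button
curved = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]  # pulled towards the other option first
md_direct = maximum_deviation(direct)
md_curved = maximum_deviation(curved)
print(md_direct, md_curved)
```

A trajectory that bends towards the non-chosen response option yields a larger value than a direct movement, which is why MD is read as a behavioral indicator of evaluative conflict.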

Self-Report Measures We used the same self-report measures as in Experiment 1 to assess subjective ambivalence, objective ambivalence, and technology commitment. We measured contact intentions towards robots through the mean of five items that assess the willingness to interact with a robot in general, adapted from [9]. For instance, we adapted the item ‘How much would you like to meet the robot?’ to read ‘How much would you like to meet a robot?’. Further, we assessed behavioral intentions to seek out further information about robots (information search) through the mean of four items (adapted from [7]), e.g., ‘To what degree are you curious about robots?’. For exploratory purposes, we further assessed the Big Five factors of personality with a short scale consisting of ten items from [27] and the tendency to anthropomorphize with a scale of 15 items by [49].

6.1.4 Procedure

Participants were informed that they would be asked to evaluate several words as positive or negative and that their mouse movements would be recorded during the evaluations. They then provided informed consent. After the practice trials, participants evaluated all attitude objects as positive or negative in random order in the experimental trials. Participants then completed the measures of subjective and objective ambivalence for each item. Subsequently, they filled out the measures regarding contact intentions and information search. Finally, the Big Five factors of personality, proclivity to anthropomorphize, and technology commitment items were presented, and participants completed the attention check and indicated their demographic data.

6.2 Results and Discussion

6.2.1 Main Analyses

Mouse Tracking As predicted, MD was higher in the robot condition (\(\textit{M} = 0.93\), \(\textit{SD} = 0.43\)) than in the univalent condition (\(\textit{M} = 0.68\), \(\textit{SD} = 0.43\)), \(\textit{t}(99) = 5.28\), \(\textit{p} <.001\), \(\textit{d} = 0.53\), indicating a medium effect size (H1; see Fig. 3). As in Experiment 1, response times were higher concerning robot words (\(\textit{M} = 1761.32\), \(\textit{SD} = 532.05\)) compared to univalent words (\(\textit{M} = 1298.16\), \(\textit{SD} = 361.66\)), \(\textit{t}(99) = 7.94\), \(\textit{p} <.001\), \(\textit{d} = 0.79\), indicating a medium to large effect size. Therefore, participants showed different motor behavior, operationalized through MD, as well as choice delay via response times when evaluating robot stimuli compared to univalent stimuli.
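As a plausibility check, the reported effect sizes can be related to the test statistics. The sketch below assumes that the effect size for the within-participant comparison was computed as \(d_z = t / \sqrt{n}\), with \(n = 100\) participants; this formula is an assumption that happens to be consistent with the reported values, not a documented detail of the analysis. The function name is hypothetical.

```python
from math import sqrt


def cohens_dz(t, n):
    """Effect size for a paired design, recovered from t and the number of pairs."""
    return t / sqrt(n)


# Test statistics reported for Experiment 2 (n = 100, df = 99).
d_md = cohens_dz(5.28, 100)  # maximum deviation
d_rt = cohens_dz(7.94, 100)  # response time
print(round(d_md, 2), round(d_rt, 2))
```

Under this assumption, the recovered values round to the reported \(d = 0.53\) and \(d = 0.79\).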

Fig. 3 Means, standard errors, and statistical significance of maximum deviation, response time, objective ambivalence, and subjective ambivalence for each condition in Experiment 2

To test Hypothesis 2 (H2), we used a regression analysis to investigate whether MD would predict contact intentions (\(\textit{M} = 3.32\), \(\textit{SD} = 1.56\), \(\alpha = .82\)) towards robots. This was not the case: \(\beta = -0.02\), \(\textit{t}(98) = -0.82\), \(\textit{p} =.413\), \(\textit{R}^{2} =.01\). Concerning our third hypothesis (H3), MD also did not significantly predict information search (\(\textit{M} = 3.67\), \(\textit{SD} = 1.73\), \(\alpha = .92\)), \(\beta = -0.04\), \(\textit{t}(98) = -1.76\), \(\textit{p} =.081\), \(\textit{R}^{2} =.03\). Consequently, MD as a behavioral indicator of ambivalence was not predictive of the usual behavioral and cognitive consequences of ambivalent attitudes in this experiment. For the fourth hypothesis (H4), we tested whether technology commitment (\(\textit{M} = 4.72\), \(\textit{SD} = 1.15\), \(\alpha =.86\)) would correlate negatively with MD measured in robot evaluation tasks, as in Experiment 1. This was not the case, \(\textit{r}(98) =.12\), \(\textit{p} =.235\). Therefore, we found no evidence in this experiment indicating that participants high in technology commitment would experience particularly low ambivalence.

Self-Report Measures In Experiment 2, objective ambivalence was higher in the robot condition (\(\textit{M} = 1.32\), \(\textit{SD} = 1.25\)) than in the univalent condition (\(\textit{M} = -0.64\), \(\textit{SD} = 0.80\)), \(\textit{t}(99) = 12.55\), \(\textit{p} <.001\), \(\textit{d} = 1.25\). Subjective ambivalence was also higher in the robot condition (\(\textit{M} = 3.39\), \(\textit{SD} = 1.22\)) than in the univalent condition (\(\textit{M} = 1.69\), \(\textit{SD} = 0.74\)), \(\textit{t}(99) = 11.83\), \(\textit{p} <.001\), \(\textit{d} = 1.18\). That is, robot stimuli evoked higher objective and subjective ambivalence compared to univalent stimuli, with large effect sizes in both cases.

6.2.2 Secondary Analyses

In order to explore the impact of interindividual differences on the experience of ambivalence, we analyzed correlation patterns between measures of ambivalence and the tendency to anthropomorphize and the Big Five factors of personality using Pearson’s correlations. Tendency to anthropomorphize (\(\textit{M} = 2.50\), \(\textit{SD} = 0.88\), \(\alpha =.88\)) did not significantly correlate with MD towards robots, \(\textit{r}(98) = -0.19\), \(\textit{p} =.062\). Neither did openness (\(\textit{M} = 3.04\), \(\textit{SD} = 0.55\), \(\textit{r}(98) =.05\), \(\textit{p} =.656\)), conscientiousness (\(\textit{M} = 3.37\), \(\textit{SD} = 0.64\), \(\textit{r}(98) = -0.15\), \(\textit{p} =.148\)), neuroticism (\(\textit{M} = 2.96\), \(\textit{SD} = 0.79\), \(\textit{r}(98) = -0.10\), \(\textit{p} =.327\)), agreeableness (\(\textit{M} = 2.95\), \(\textit{SD} = 0.65\), \(\textit{r}(98) = -0.07\), \(\textit{p} =.459\)), or extraversion (\(\textit{M} = 3.06\), \(\textit{SD} = 0.49\), \(\textit{r}(98) = -0.02\), \(\textit{p} =.878\)). To conclude, in Experiment 2, response-time-based mouse tracking data were statistically unrelated to the interindividual difference measures.

7 Experiment 3

With Experiment 3, we replicated and extended Experiments 1 and 2. We did so by using pictures instead of words as stimulus materials in order to generalize our findings to visual materials. Concretely, we utilized pictures depicting two robot categories, namely machine-like robots and humanoid robots. We chose these particular categories because they reflect a broad spectrum, from robots that merely resemble appliances to robots that resemble humans. We presumed that these different robot types would likely evoke diverging evaluations. Moreover, we included these distinct robot categories to follow up on Experiment 1, in which robot category words did not evoke higher MD compared to univalent stimuli. We tested whether this would also be the case when comparing machine-like vs. humanoid robots. Again, we considered self-reported ambivalence, technology commitment, and the tendency to anthropomorphize.

7.1 Method

7.1.1 Participants and Design

161 complete datasets were collected in a 15-minute online experiment for a raffle of three 10 vouchers or course credit. As preregistered, we stopped data collection when 100 complete datasets (\(\textit{M}_{age} = 27.62\), \(\textit{SD}_{age}= 8.43\), 65 female, 35 male) were collected after excluding data as in Experiment 2. We excluded data from participants who indicated not having participated meticulously (8 datasets), not having used a mouse in the evaluation task (38 datasets), or not speaking German fluently (1 dataset). Invalid mouse tracking responses were excluded as specified in the mouse tracking software (13 datasets). From the remaining 101 datasets, the last one was excluded to reach the preregistered sample of 100 datasets. Of the remaining 100 participants, 74 were students, 21 were employed, 2 were self-employed, 2 were unemployed, and 1 did not specify an occupation.

7.1.2 Experimental Manipulation

In Experiment 3, participants were presented with five univalent pictures and ten robot pictures. The layout was analogous to Experiment 1 (see Fig. 1), except that the respective picture was displayed instead of the word. The univalent pictures depicted a cockroach (‘cockroach 4’), a coffee (‘coffee 1’), a rubber duck (‘rubber duck 1’), a fence (‘fence 2’), and weapons (‘gun 6’) from the Open Affective Standardized Image Set (OASIS) [19], which were unequivocally positive or negative. The robot pictures were separated into two categories: five machine-like robots (Versatrax by Inuktun Services, Packbot by FLIR Systems, Phantom by DJI, Quince by Chiba Institute of Tech. and Tohoku University, and Spirit by NASA Jet Propulsion Laboratory) and five humanoid robots (HRP-4C by AIST, Kaspar by University of Hertfordshire, Sophia by Hanson Robotics, and Geminoid F and Geminoid HI by Osaka University, ATR, and Kokoro). All robot pictures used in Experiment 3 can be obtained at https://robots.ieee.org/ or upon request from the authors. In total, all participants evaluated 15 stimuli.

7.1.3 Measures

We employed the same measures for mouse tracking and self-reports as in Experiment 2, but in Experiment 3, we refrained from assessing the Big Five factors of personality.

7.1.4 Procedure

As in Experiment 2, after providing informed consent and being informed about the purpose of the study, participants evaluated all stimuli as positive or negative in the mouse tracking task. Then they completed the self-report scales, the attention check, and reported demographic information.

7.2 Results and Discussion

7.2.1 Main Analyses

Mouse Tracking In line with Experiments 1 and 2, we had predicted that MD would be higher for robot stimuli compared to univalent stimuli (H1). We created a robot condition as the mean MD from all ten robot-related stimuli. Contrary to our predictions, MD was not higher for robot pictures (\(\textit{M} = 0.63\), \(\textit{SD} = 0.51\)) compared to univalent pictures (\(\textit{M} = 0.61\), \(\textit{SD} = 0.33\)), \(\textit{t}(99) = 0.53\), \(\textit{p} =.298\), \(\textit{d} = 0.05\) (see Fig. 4). When analyzing the robot categories independently, MD was not higher for humanoid robot pictures (\(\textit{M} = 0.53\), \(\textit{SD} = 0.40\); \(\textit{t}(99) = -2.01\), \(\textit{p} =.976\), \(\textit{d} = 0.2\)) compared to univalent pictures. Surprisingly, MD was even lower for humanoid robots, although the difference was not significant. However, MD was significantly higher concerning machine-like robot pictures (\(\textit{M} = 0.78\), \(\textit{SD} = 0.58\); \(\textit{t}(99) = 1.79\), \(\textit{p} =.040\), \(\textit{d} = 0.18\)) compared to univalent pictures, indicating a small effect size. In the current experiment, our main hypothesis could thus only be confirmed for part of the robot stimuli, namely the machine-like robots. The humanoid robots did not evoke higher MD compared to univalent stimuli. One explanation for the surprising results concerning the humanoid robot stimuli might be the Uncanny Valley Phenomenon [23], a strong negative affective response that occurs when a robot approaches human-likeness but does not fully reach it. This negative response might have dominated the evaluation, overriding potential motor behavior indicating ambivalence.

Fig. 4 Means, standard errors, and statistical significance of maximum deviation, response time, objective ambivalence, and subjective ambivalence for each condition in Experiment 3

Moreover, we analyzed response time data as indicators of choice delay. Overall, response times were higher for robot pictures (\(\textit{M} = 1402.49\), \(\textit{SD} = 648.42\)) compared to univalent pictures (\(\textit{M} = 975.19\), \(\textit{SD} = 360.61\)), \(\textit{t}(99) = 9.12\), \(\textit{p} <.001\), \(\textit{d} = 0.91\). This was the case for both humanoid robot pictures (\(\textit{M} = 1289.55\), \(\textit{SD} = 520.20\); \(\textit{t}(99) = 7.81\), \(\textit{p} <.001\), \(\textit{d} = 0.78\)) and machine-like robot pictures (\(\textit{M} = 1515.42\), \(\textit{SD} = 756.66\); \(\textit{t}(99) = 7.49\), \(\textit{p} <.001\), \(\textit{d} = 0.75\)), indicating large effect sizes. Taken together, these results indicate that evaluations of humanoid robot pictures partly resemble evaluations of univalent pictures: Mouse trajectories during the evaluation of humanoid robot pictures did not deviate significantly more from a direct path to the response button than trajectories for univalent pictures. However, these evaluations also resemble ambivalent evaluations in terms of higher response times. One reason for the high response times might be that the robot’s eeriness prompted an initial, intuitive, and straight negative response, which was then consciously reevaluated. This reevaluation might take up additional time in the decision-making process. As in Experiment 2, MD did not significantly predict contact intentions (\(\textit{M} = 3.40\), \(\textit{SD} = 1.73\), \(\alpha =.91\)), \(\beta < 0.01\), \(\textit{t}(98) = 0.07\), \(\textit{p} =.947\), \(\textit{R}^2 <.001\) (H2) or information search (\(\textit{M} = 4.11\), \(\textit{SD} = 1.60\), \(\alpha =.91\)), \(\beta = 0.01\), \(\textit{t}(98) = 0.39\), \(\textit{p} =.701\), \(\textit{R}^{2} =.001\) (H3). As in Experiment 2, contrary to our predictions, technology commitment (\(\textit{M} = 4.95\), \(\textit{SD} = 1.05\), \(\alpha =.84\)) did not correlate significantly with MD, \(\textit{r}(98) =.01\), \(\textit{p} =.957\) (H4). Therefore, Hypotheses 2, 3, and 4 could not be confirmed.

Self-Report Measures Similar to the pattern of results from Experiments 1 and 2, findings from Experiment 3 revealed that objective ambivalence was higher in the robot condition (\(\textit{M} = 0.94\), \(\textit{SD} = 1.06\)) than in the univalent condition (\(\textit{M} = -0.76\), \(\textit{SD} = 0.88\)), \(\textit{t}(99) = 13.83\), \(\textit{p} <.001\), \(\textit{d} = 1.38\). Specifically, objective ambivalence was higher for both humanoid robot pictures (\(\textit{M} = 0.80\), \(\textit{SD} = 1.37\); \(\textit{t}(99) = 10.64\), \(\textit{p} <.001\), \(\textit{d} = 1.06\)) and machine-like robot pictures (\(\textit{M} = 1.08\), \(\textit{SD} = 1.23\); \(\textit{t}(99) = 12.85\), \(\textit{p} <.001\), \(\textit{d} = 1.29\)) compared to univalent pictures, indicating large effect sizes. Moreover, self-reported subjective ambivalence was higher in the robot condition (\(\textit{M} = 3.17\), \(\textit{SD} = 1.02\)) than in the univalent condition (\(\textit{M} = 2.15\), \(\textit{SD} = 0.92\)), \(\textit{t}(99) = 8.64\), \(\textit{p} <.001\), \(\textit{d} = 0.86\). Specifically, subjective ambivalence was higher for both humanoid robot pictures (\(\textit{M} = 3.33\), \(\textit{SD} = 1.29\); \(\textit{t}(99) = 8.97\), \(\textit{p} <.001\), \(\textit{d} = 0.90\)) and machine-like robot pictures (\(\textit{M} = 3.01\), \(\textit{SD} = 1.16\); \(\textit{t}(99) = 6.02\), \(\textit{p} <.001\), \(\textit{d} = 0.60\)), indicating medium to large effect sizes.

7.2.2 Secondary Analyses

Proclivity to anthropomorphize (\(\textit{M} = 2.93\), \(\textit{SD} = 0.94\), \(\alpha =.88\)) did not correlate with MD (\(\textit{r}(98) = 0.02\), \(\textit{p} =.839\)), as in Experiment 2. That is, we again did not find any indication of a connection between interindividual differences and motor behavior.

8 Experiment 4

In Experiment 4, we replicated and extended our work to yet another group of stimuli, namely various social robot pictures. Social robots were not chosen as a category distinct from the stimuli in the previous experiments (e.g., humanoid robots), but rather as a means to replicate and extend our research, since replication and generalization are crucial in consolidating scientific results.

Fig. 5 Means, standard errors, and statistical significance of maximum deviation, response time, objective ambivalence, and subjective ambivalence for each condition in Experiment 4

8.1 Method

8.1.1 Participants and Design

121 complete datasets were collected in a 15-minute online experiment for a raffle of three 10 vouchers or course credit. As preregistered, we stopped data collection when 100 datasets (\(\textit{M}_{age} = 29.02\), \(\textit{SD}_{age} = 13.03\), 61 female, 39 male) were collected, after excluding data. Specifically, we had preregistered to exclude data from participants who had indicated not having participated meticulously (0 datasets), not having used a mouse in the evaluation task (17 datasets), or not speaking German fluently (0 datasets). From the remaining 103 datasets, the last three were excluded to reach the preregistered sample of 100 datasets. Seventy-one participants were students, 20 participants were employed, 4 individuals were self-employed, one participant reported being unemployed, 2 people were retired, and one individual did not specify a response.

8.1.2 Experimental Manipulation

In Experiment 4, participants were shown five univalent pictures from [19], depicting swans (‘bird 1’), a cat (‘cat 4’), flowers (‘flowers 2’), a fence (‘prison 2’), and a toilet (‘toilet 4’), and five pictures of social robots (Sophia and Zeno by Hanson Robotics, Pepper by Softbank Robotics, VIVA by Navel Robotics, and Kobian developed at Waseda University in Japan). Robot pictures used in Experiment 4 can be obtained at https://robots.ieee.org/ or upon request from the authors. We chose social robots as stimulus materials in this experiment, since this category follows up on the robot functions from Experiment 1. In total, all participants evaluated ten stimuli.

8.1.3 Measures and Procedure

The procedure for Experiment 4 was identical to that of Experiment 3. However, unfortunately, due to a technical error no MD and response time data were collected. Nonetheless, we report the available self-report results for Experiment 4 here to complement the series of experiments.

8.2 Results and Discussion

Due to the technical failure in measuring mouse tracking, Hypotheses 1–4 pertaining to MD could not be tested. However, fortunately, self-report data were recorded: In line with Experiments 1–3, self-reported objective ambivalence was higher in the robot condition (\(\textit{M} = 0.90\), \(\textit{SD} = 0.96\)) than in the univalent condition (\(\textit{M} = -1.10\), \(\textit{SD} = 0.90\)), \(\textit{t}(99) = 17.77\), \(\textit{p} <.001\), \(\textit{d} = 1.78\) (see Fig. 5).

Further, subjective ambivalence was higher in the robot condition (\(\textit{M} = 3.41\), \(\textit{SD} = 1.10\)) than in the univalent condition (\(\textit{M} = 1.67\), \(\textit{SD} = 0.70\)), \(\textit{t}(99) = 16.58\), \(\textit{p} <.001\), \(\textit{d} = 1.66\), indicating large effect sizes. Experiment 4 thus extended the array of stimuli used in this set of studies to pictures of social robots. In contrast to the humanoid robot pictures in Experiment 3, the social robots in Experiment 4 varied in appearance, ranging from a very human-like Sophia robot to a cartoon-like Pepper robot. Participants seemed to hold opposing evaluations and feel conflicted towards social robots. Whether this self-reported ambivalence would be reflected implicitly in mouse tracking might be investigated in future experiments.

9 Meta Analysis of Experiments 1–4

In the current experiments, self-reported ambivalence did not consistently translate to mouse tracking data as a behavioral indicator of ambivalence: Specifically, in Experiment 1, MD was higher concerning robot function words but not robot category words compared to univalent words. In Experiment 2, MD was higher concerning general robot words compared to univalent words, and in Experiment 3, MD was higher towards machine-like robot pictures but not humanoid robot pictures compared to univalent pictures. To explore the inconsistent findings regarding MD across experiments, we conducted a random effects meta analysis of the observed effects in Experiments 1–3 (means and standard deviations of MD concerning robot vs. univalent stimuli) using the metafor package in R [48]. The meta analysis estimated a restricted maximum likelihood (REML) model (\(\textit{k} = 3\)) with an estimated amount of total heterogeneity of \(\tau ^2 = 0.052\), corresponding to an estimated standard deviation of the true effects of \(\tau = 0.229\). Furthermore, the standardized mean difference was \(-0.29\), with a confidence interval of \([-0.59, 0.02]\) (see Fig. 6). Since the confidence interval includes 0, the overall effect was not statistically significant. Therefore, based on this meta analysis, robots in general might not evoke significantly higher MD compared to univalent stimuli, despite the direction of mean differences being hypothesis-conform. This is interesting, since all robot stimuli evoked responses that translated to medium to large effect sizes on self-reported ambivalence. Whether such self-reported ambivalence translates to behavioral consequences seems to depend on the specific stimulus type; we found such consequences for robot functions and machine-like robots. Overall, we observed a high variability across experiments concerning MD, and further research is necessary to investigate the specific stimulus characteristics, e.g., robot type, that lead to MD as a behavioral indicator of ambivalence. Future research might investigate whether there is no overall effect concerning MD or whether the power in our studies was too low to detect it. Due to the high heterogeneity of stimuli, this meta-analysis should be interpreted with caution and might rather be seen as an exploration of results than an additional result.
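The logic of a random effects meta analysis can be sketched as follows. Note that this sketch uses the closed-form DerSimonian-Laird estimator of \(\tau^2\) rather than the iterative REML estimator reported above, so it would not reproduce the metafor results exactly. The effect sizes and sampling variances in the example are hypothetical placeholders, not the values from Experiments 1–3, and the function name is an illustrative choice.

```python
from math import sqrt
from statistics import NormalDist


def random_effects(effects, variances, level=0.95):
    """Pool effect sizes with the DerSimonian-Laird random-effects estimator."""
    w = [1 / v for v in variances]  # fixed-effect (inverse-variance) weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)  # between-study variance
    w_re = [1 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = sqrt(1 / sum(w_re))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return pooled, (pooled - z * se, pooled + z * se), tau2


# Hypothetical effect sizes and sampling variances, NOT the study's data.
pooled, ci, tau2 = random_effects([0.1, 0.3, 0.5], [0.01, 0.01, 0.01])
print(pooled, ci, tau2)
```

With only three studies, the estimate of \(\tau^2\) is itself imprecise, which is one reason to read a pooled estimate of this kind as exploratory.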

Fig. 6 Meta analysis of the effect of stimulus type (robot vs. univalent) on MD in Experiments 1–3

10 General Discussion

The aim of the current research was to investigate ambivalence in a multi-faceted manner, taking into account the affective, behavioral, and cognitive dimensions in explicit and implicit measures. In four experiments, we investigated self-reported objective and subjective ambivalence. To reflect behavioral indicators of ambivalence, we assessed MD and response times in Experiments 1 to 3, extended by contact intentions in Experiments 2 to 4. Further, we assessed compensatory cognitions in Experiment 1 and systematic processing in Experiments 2 to 4 as cognitive indicators of ambivalence. Across all four experiments, we found that self-reported objective and subjective ambivalence was higher for various robotic stimuli compared to univalent stimuli. Large effect sizes underline the practical relevance of the observed effects. Participants seem to hold competing positive and negative evaluations and to feel torn between the positive and negative aspects of robots concerning robot function words, robot category words, general robot words, humanoid robot pictures, machine-like robot pictures, and various social robot pictures. This underlines the generalizability of previous results (e.g., [38]) and emphasizes that all sorts of robots evoke high levels of evaluative conflict, namely ambivalence. However, self-reported ambivalence did not consistently translate to behavioral indicators of ambivalence, as measured via MD.

Our hypotheses concerning motor behavior were confirmed in one of three experiments concerning words related to robots in general (Experiment 2), whereas choice delay was consistently higher concerning robotic stimuli compared to univalent stimuli in all three mouse tracking experiments (Experiments 1 to 3). In two further experiments, the main hypothesis was confirmed only for a part of the stimuli, namely robot functions (Experiment 1) and pictures of machine-like robots (Experiment 3). In Experiment 3, it was especially surprising that participants showed particularly low behavioral ambivalence (measured via MD) concerning humanoid robot pictures. One possible explanation might be the uncanny valley phenomenon, which describes the effect that nonhuman entities are evaluated more favourably the more human-like they are, but seem eerie and uncanny when they approach human-likeness without completely achieving it [23]. In the literature, the role of the uncanny valley in the evaluation process has not yet been thoroughly investigated and does not seem to fit the originally proposed cubic function between human-likeness and evaluations [30]. In the current work, participants might have experienced an initial negative response towards pictures of humanoid robots, strongly affecting the evaluative conflict. Observing the evaluation process further via mouse tracking might provide valuable insights for roboticists into the specific cognitive and behavioral implications of the uncanny valley during decision making.

Contrary to our hypotheses, MD as an indicator of ambivalence did not predict systematic processing and interest in further attitude-relevant information across three experiments. Therefore, in the case of robots, experienced conflict might not be the most important factor when deciding whether to seek contact with or information on robots. External factors such as availability, price, or usability might currently be more important for robot contact in everyday life. However, qualitative data in Experiment 1 suggested that participants might engage in compensatory cognitions, which should be investigated further. Engaging in cognitions that enable a feeling of order is one strategy to cope with attitudinal conflict. Further studies might investigate whether related or unrelated compensatory cognitions, e.g., finding order in snowy pictures [46], might help potential users cope with robot-related ambivalence.

Finally, in contrast to earlier work [39], participants high in technology commitment did not experience less ambivalence overall in Experiments 2 to 4. Neither the Big Five Factors nor interindividual differences in the proclivity to anthropomorphize correlated with ambivalence as measured by means of mouse tracking and self-reports. Accordingly, the investigated interindividual differences might not be connected with behavioral indicators of ambivalent attitudes towards robots. However, it should be noted that the power in the current experiments might have been too low to detect small effects of interindividual differences on ambivalence.

10.1 Strengths and Limitations

Conducting a four-experiment series comes with a range of both methodological assets and challenges. The current research is programmatic and experimental, and the empirical approach enabled us to examine the validity and generalizability of our results. This was due to the variety of stimuli employed and the replication of results across studies. In a mixed-methods approach, we combined explicit and implicit measures of ambivalence towards robots by including self-report and response-time-based data, and extended the notion of ambivalence in attitudes towards robots through qualitative results. We explored the connection between interindividual differences and behavioral indicators of ambivalence, with scales of acceptable to high internal consistency. However, we did not obtain consistent significant correlations between interindividual differences and ambivalence. Furthermore, in line with the notions of open science and reproducibility, all four experiments were preregistered and data and code are available. The inconsistent results in behavioral measures underline the importance of replication in both psychological and social robotics research. Whereas Experiment 2 alone would have provided convincing data regarding MD as a behavioral indicator of ambivalence towards robots, the variability of stimuli used in this series of experiments showed that the external validity of single experiments might be limited. It further demonstrated that explicit measures of ambivalence do not always transfer to implicit measures of ambivalence, and that the correlation might depend on stimulus type. With a replication approach, one can strengthen the confidence in results that are evident repeatedly and identify potential incidental findings that only appear once.
The current work contributes to social robotics research by providing examples of using social psychological methodology and theory to further our understanding of attitudes towards robots, integrating explicit and implicit measures. It furthermore contributes to social psychological research by applying fundamental research on ambivalent attitudes to a novel and practically relevant topic, namely social robotics.

Despite our focus on replicability and reproducibility, we also had to face methodological challenges. Due to the contact restrictions caused by the COVID-19 pandemic, laboratory experiments were not possible for Experiments 2 to 4. Switching mouse tracking software had several implications: First, the comparability between experiments is limited due to the different methods. Further, online mouse tracking produces more noise and data exclusions than mouse tracking in the laboratory due to diverging devices and uncontrollable environments. Although we included an attention check at the end of every experiment, we cannot be sure how the online setting influenced participants’ attentiveness compared to a laboratory setting. We hoped to counterbalance these restrictions using our replication approach and a large overall sample of 411 participants. Also, to provide utmost transparency, we informed participants during informed consent in all experiments that their mouse movements would be tracked. We assumed that this did not significantly alter results, since mouse tracking is thought to record ongoing cognitive processes and implicit components of attitudes in an unobtrusive manner (e.g., [32]). However, we cannot be sure that participants’ awareness did not influence the mouse tracking results. Furthermore, we conducted a relatively small number of trials per condition in order to keep the length of the experiments, especially the online experiments, to a minimum. Using more trials in future experiments might reduce noise and yield clearer results. Moreover, for future experiments, laboratory settings are preferable in order to eliminate as much variance unrelated to the investigated effect as possible.

Another limitation of the current work is the limited practical relevance of participants’ decisions, which might have particularly influenced the cognitive consequences of ambivalence. Previous research has indicated that ambivalence is especially strong when a relevant decision has to be made [45]. Since participants in the current experiments had no possibility to meet any of the presented robots, they might not have felt the decision to be important for their individual lives and might therefore not have been motivated to reduce their experienced ambivalence through cognitive strategies. Future studies might replicate the proposed methodology using robots that participants encounter in a research environment, including a practically relevant decision (for example, whether participants would like to speak to the robot) while tracking mouse trajectories. Finally, in the current experiments we only considered ambivalence towards robots; future work might investigate ambivalence towards other technological devices that might evoke similar evaluative conflict, such as smart speakers, smart home devices, or virtual agents.

10.2 Future Work

In the current work, some indicators of ambivalence were consistently higher towards robot-related stimuli (objective and subjective ambivalence, response times), whereas MD as an implicit indicator of ambivalence yielded inconsistent results. To understand the connection between explicit and implicit measures of ambivalence towards robots, further categories of robot stimuli might be investigated with the proposed methodology, e.g., education robots or telepresence robots. Moreover, further knowledge concerning the uncanny valley effect might be gained with the use of mouse tracking methodology, since supposedly uncanny humanoid robots elicited surprising results in the current work.

As we did not find any indication of cognitive or behavioral consequences (i.e., information search, contact intentions) being influenced by MD in Experiments 2 and 3, future studies might investigate further variables derived from the components of the ABC of Ambivalence, for instance unrelated compensatory cognitions or long-term behavioral consequences such as avoidance behavior [34]. Furthermore, in our experiments, the trait variables technology commitment, tendency to anthropomorphize, and personality characteristics (Big Five) were not connected to ambivalence towards robots. Future research might investigate other variables from the ambivalence literature, such as need for cognition [25] or a general tendency towards ambivalent attitudes [35]. Possibly, ambivalence is not necessarily a negative state to be reduced, but also a factor that improves decisions and makes ambivalent individuals less susceptible to cognitive bias [35].

Future research might manipulate ambivalence experimentally by having participants themselves generate arguments both for and against the attitude objects, or manipulate univalence by having participants generate only one-sided arguments [46]. This way, a stronger manipulation of ambivalence might be achieved, which in turn might affect behavioral and cognitive indicators of ambivalence.

11 Conclusion

This paper presented a set of four programmatic experiments investigating ambivalence on the affective, behavioral, and cognitive levels concerning various robot stimuli, using explicit and implicit measures. Results indicated that self-reported attitudes towards robots are indeed ambivalent and partly evoke behavioral expressions of conflict as measured through mouse tracking data. This is in line with previous research on ambivalent attitudes towards robots [5, 16, 38, 39].

Through the current work, we demonstrated the applicability of the ABC of Ambivalence to a new field, namely social robotics, while at the same time exploring the limits of the model’s applicability. For instance, behavioral indicators of ambivalence towards robots seem to depend on the specific type of robot that is being evaluated. Furthermore, we hope to contribute to the advancement of measurement methodology in social robotics by emphasizing the relevance of measuring ambivalence in attitudes towards robots and providing examples of how to do so on an explicit and implicit level.