1 Introduction

Work environments are changing rapidly with the emergence of new technology, such as robots, and their employment in different industries [1]. As robots are expected to collaborate more with humans in the future [2], there is an urgent need to investigate their effects on human workers. One crucial aspect of a well-performing workforce is the feedback workers give to and receive from each other regarding their performance. Performance feedback, that is, “information regarding some aspect(s) of one’s task performance” provided “by (an) external agent(s)” ([3], p. 255), helps workers better understand their successes and mistakes and can motivate them to perform a task or become better at it [4].

Extensive research has been conducted to investigate the psychosocial effects of performance feedback provided by humans [5,6,7], showing effects on intrinsic motivation and self-esteem [8]. However, it is not yet fully understood whether such human-human interactions differ from human-robot interactions and if performance feedback from humans and robots has different effects on people.

Such knowledge can vastly increase the understanding of how to implement robotic technology in professional work settings and of what social aspects must be considered. It can also guide the development of supervising robotic systems correcting human errors and providing feedback about errors in a way that humans can more easily accept the information and the role of the robot.

Therefore, the main goal of this study is to examine how performance feedback given by either a human or a robot affects recipients’ subjective and psychophysiological reactions. Specifically, it compares effects of human and robotic feedback on intrinsic motivation, self-esteem, and psychophysiological reactions. Such a multilevel approach can uncover more facets of social effects, since these often happen unconsciously [9] and can therefore be more reliably investigated by supplementing questionnaire-based data with psychophysiological measurements.

1.1 Effects of Performance Feedback on Motivation and Self-Esteem

Performance feedback is a commonly used technique in human resource management [10]. Research [11] shows that the productivity, satisfaction, and motivation of workers depend strongly on feedback on their performance [4] coming from colleagues or supervisors in the form of, e.g., praise and critique [12]. Performance feedback is necessary to maintain a high-quality level of work, to exchange information about specific aspects of tasks, and to increase workers’ preparedness for challenges [13].

Besides these general effects, performance feedback was shown to impact intrinsic motivation [14], that is, motivation that one experiences out of the task itself due to interest [15]. Specifically, performance feedback affects motivation towards a task [16,17,18] by reinforcing or diminishing the perceived feeling of self-competence and therefore one’s enjoyment of the task [19]. Deci and Ryan [15, 20] describe this effect in their Self-Determination Theory as the consequence of the (non)fulfillment of people’s basic psychological needs, especially the need to feel competent in one’s task and work, but also the need to belong to a social group and to decide and act autonomously. Since individuals strive for the fulfillment of their needs, they will be more motivated by a task if they are rewarded with need fulfillment. On the other hand, people are prone to losing interest in a task if the result of their performance is the denial of need fulfillment [15]. In line with these arguments, [19] showed that individuals who received positive performance feedback perceived themselves as more competent and autonomous and in turn also reported higher intrinsic motivation than individuals who received negative performance feedback. Hence, performance feedback contributes to intrinsic motivation of workers as it provides “direct and clear information about the effectiveness of his or her performance” ([21], p. 258). This leads to our first hypothesis.

Hypothesis 1:

Individuals are more intrinsically motivated towards a task after receiving positive feedback on their performance of the task than after receiving negative performance feedback.

Besides being associated with intrinsic motivation, the fulfillment of basic psychological needs also affects self-esteem. Self-esteem is defined as the personal judgment of one’s worthiness [22]. It describes the positive and negative image people have of themselves. Depending on the temporal stability, two different types of self-esteem have been proposed: trait self-esteem and state self-esteem [23]. Trait self-esteem describes a long term and relatively stable image one has about oneself, which builds over months or years. It interacts with state self-esteem, which is the situational self-image that will fluctuate depending on one’s current mood and external factors.

If people do not feel competent, socially included or autonomous, their self-esteem tends to be lower than if those needs are fulfilled [24]. Since the feeling of competence is associated with one’s self-esteem, performance feedback can either be a threat to one’s state self-esteem or act as an encouragement [20, 25, 26]. Generally, positive performance feedback increases, and negative performance feedback decreases self-esteem [14]. From these findings, we derive

Hypothesis 2:

Individuals experience higher state self-esteem after receiving positive feedback on their performance of a task than after receiving negative performance feedback.

1.2 Effects of Performance Feedback on Parameters of Cardiovascular and Electrodermal Activity

Since performance feedback causes an affective reaction [27], it is known to influence psychophysiological parameters, especially those associated with cardiovascular and electrodermal activity.

The expected relationship can be described using Gray’s Three Arousal Theory. The theory explains different types of behavioral orientation reactions to rewarding (praise) and punishing (critique) stimuli [28]. It distinguishes three systems of the human nervous system controlling behavioral reactions and the psychophysiological phenomena accompanying those reactions. The systems are intended to regulate energy-resource allocation and initialize information acquisition to prepare for an adequate reaction to a situation. They are known as the behavioral activation system (BAS) and the behavioral inhibition system (BIS), and are integrated and regulated by the non-specific arousal system (NAS).

The BAS reacts primarily to positive affect, rewarding stimuli or unexpected non-punishment after expecting a negative outcome of a situation. It mobilizes resources to react with increased activation to anticipate the positive situation, which shows in an increased heart rate [29]. However, studies additionally show that positive affect and reward also increase parameters of skin conductance, which suggests that skin conductance also reflects the effects of the BAS [30].

The BIS is responsible primarily for inhibiting behavior after being exposed to negative affect, punishing stimuli or unexpected non-reward after expecting positive outcomes of a situation. It inhibits behavior to allow an individual to contemplate the situation and invest more energy in thinking about how to potentially turn the situation into a positive outcome. The reaction of the BIS is reflected in an especially high increase in skin conductance. It is argued that skin conductance not only reflects negative experiences and punishment, but also “a general component of the whole somatic emotional response” (Damasio (1994) cited in [31], p. 21) [30]. Additionally, negative feedback often has a decelerating effect on the heart rate [32, 33].

According to the theory about BIS and BAS, we expect

Hypothesis 3:

Heart rate is higher after receiving positive performance feedback compared to negative performance feedback.

Hypothesis 4:

The level of skin conductance and non-specific fluctuations in skin conductance are higher after receiving negative performance feedback compared to positive performance feedback.

1.3 Differential Effects of Performance Feedback from Robots and Humans

While social robots may have an impact on people in social interaction situations, it is not yet fully understood to what extent these robotic interaction effects are comparable to human interaction effects. The best-known theory describing affective reactions to robots is the Uncanny Valley Theory, which proposes a positive connection between human-likeness and likeability of robots until a sharp decline in trustworthiness and likeability occurs if a robot’s appearance is eerily human-like [34]. Research, however, shows that robots with human-like features appear less uncanny if they avoid heavily mimicking a human [35].

In addition to the Uncanny Valley Theory, Social Surrogate Theory states that people can find social interaction partners in surrogates that do not fulfill all criteria of sociability like another human being does [36]. However, since social surrogates such as social robots do not display as much (natural) sociability and “human-like” features as humans, one can infer that they show less social presence in interaction situations. This means that although social robots are perceived as “quasi social actors” and humans interact with them “as if” they were social beings [37], social robots would in general have less impact on socially relevant interaction parameters than a human. A study by Cormier et al. [38] points in this direction. The researchers modeled an experiment similar to Milgram’s compliance experiment, where participants had to perform monotonous tasks in the presence of a robotic or a human experimenter. The robot had a similar, but significantly weaker effect on participants’ behavior than the human experimenter [38].

While several studies in the field of human-robot interaction have explored the effects of performance feedback on task performance [39,40,41] as well as the evaluation of [42, 43] and trust in robots [40], only a few studies have investigated effects of performance feedback on intrinsic motivation. Fasola and Matarić [44] found that elderly people who received positive performance feedback from a robot, such as praise, when performing physical exercises rated the task as more enjoyable compared to those who did not receive positive performance feedback. In the context of higher education learning, Donnermann et al. [45] examined adaptive performance feedback from a robot for exam preparation but did not find any effect on intrinsic motivation. Thus, the valence of the feedback (i.e., positive versus negative) appears to have a greater impact on intrinsic motivation in human-robot interaction than its adaptiveness.

There is limited research on the effects of performance feedback on self-esteem and psychophysiological reactions in human-robot interaction. A study on the social effects of robots as social surrogates examined the impact of social feedback, including social rejection and acceptance [46]. The study found that being rejected by a robot had a negative impact on humans’ self-esteem compared to receiving no social feedback or social acceptance from a robot. Huang and Rau [39] demonstrated that negative performance feedback in a cognitive task resulted in higher activation of emotion-related brain regions.

Lastly, to the best of our knowledge, no studies have compared the effects of human and robotic performance feedback on recipients’ psychosocial outcomes. However, some studies suggest that while robots can act as a social surrogate, they may have less social presence than a human [38]. Therefore, we hypothesize that

Hypothesis 5:

The effect of performance feedback on (a) intrinsic motivation and (b) self-esteem is weaker when receiving performance feedback from a robot than when receiving performance feedback from a human.

Since the physiological reaction reflects the affective response to a cue and a robot should induce less arousal due to its lower social presence, we hypothesize that

Hypothesis 6:

(a) Heart rate and (b) skin conductance are lower after receiving performance feedback from a robot than after receiving performance feedback from a human.

2 Methods

2.1 Statement of Ethical Research

The study procedure, its design, and all questionnaires were approved before the study was conducted by the ethics committee of the last author’s institution (No. 57-2019/20). All methods used in the study complied with the ethical guidelines for psychological research in the country in which the study was conducted. Participants were informed about the procedure and purpose of the study and gave their written consent to participate in the study.

2.2 Sample

We tested 72 participants because a sensitivity analysis with G*Power 3.1.9.7 showed that this sample size would be sufficiently large for detecting the hypothesized effects. The sensitivity of an ANOVA was calculated with an α error of .05, a power (1 − β) of .80, and 4 groups as parameters, which resulted in a critical F of 2.74.

The participants’ age ranged from 18 to 32 years (M = 23.94, SD = 2.83) and gender was balanced with 36 female and 36 male participants. The educational level of the participants was quite high, with 38 participants (53%) having a high school diploma, 22 participants (30%) holding a bachelor’s degree, and 10 participants (14%) holding a master’s degree. Only two participants had left school after completing their compulsory education (3%). Of the participants, 15 were employed (21%) and 57 were students (79%), the majority of whom were studying either psychology or computer science.

2.3 Study Design and Procedure

The study used a multilevel approach to subjective and objective data, including individual ratings and psychophysiological parameters, in order to examine the effects of different forms of feedback in human-human and human-robot interaction on subjective perception and psychophysiological reactivity.

A mixed model with two between-subjects factors (feedback giver and valence of feedback) and one within-subjects factor (before and after the feedback) was used. Each participant was given positive (n = 36) or negative feedback (n = 36) by a robot (n = 37) or a human experimenter (n = 35). The feedback they received was determined via randomization to ensure comparable group sizes. Each subjective parameter was measured once before and once after the feedback. The physiological parameters were assessed during a baseline measurement and directly after receiving the feedback.

2.4 Study Procedure

The participants were seated in front of a desk with various questionnaires and a computer (see Figure 1). First, the physiological instruments were mounted on the participants. In the robot condition, the human experimenter left the laboratory after attaching the instruments and the robot continued the experiment. In the human condition, the human experimenter just continued. The human experimenter was trained to perform the experiment consistently.

Fig. 1 Experimental setup

Following a physiological baseline measurement, the participants provided sociodemographic data and were familiarized with the experimenter (either the robot, n = 37, or the human, n = 35) through a brief conversation [47]. The test conductor (robot or human) then explained the task to the participants, who familiarized themselves with it by completing one task run. Subsequently, participants answered questionnaires about their intrinsic motivation and self-esteem. Then the participants performed a second task run, which was followed by positive or negative feedback. After receiving feedback, they again answered psychological questionnaires on intrinsic motivation and self-esteem. During the feedback reception, their heart rate and skin conductance were assessed. A final task run was then conducted. Upon completion, participants were debriefed and informed that they had received random feedback that did not correspond to their performance. Figure 2 provides a schematic illustration of the study procedure.

Fig. 2 Schematic illustration of the study procedure and the measurements

2.5 Task

We used a 3-back version of the n-back task to induce a performance situation that could be evaluated by an agent. The n-back task is a widely used cognitive task for assessing working memory abilities and capacity. The task was presented on a computer in front of the participants. Participants were shown a sequence of letters and had to press a button on the keyboard when the current letter matched the one from three steps earlier in the sequence. Figure 3 provides a graphical illustration of the task.

Fig. 3 Schematic illustration of the 3-back task

The 3-back task was selected because participants should not realize that the feedback they received did not correspond to their performance but was given randomly. According to validation studies [48], the visual 3-back task has a 66% correct response rate, indicating an average level of difficulty. Therefore, the task should not result in obviously good or poor performance as it is neither too difficult nor too easy. Additionally, the task requires an intense involvement of the working memory, making self-evaluation of task performance challenging for participants. Therefore, participants should not be aware that the feedback did not match their actual performance.
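The matching rule of the 3-back task can be made concrete with a short sketch. The function names, the example letter sequence, and the scoring rule (proportion of hits plus correct rejections among scorable positions) are illustrative assumptions and are not taken from the software used in the study.

```python
def n_back_targets(letters, n=3):
    """Return the indices at which the letter equals the one n steps earlier."""
    return [i for i in range(n, len(letters)) if letters[i] == letters[i - n]]


def score_run(letters, pressed, n=3):
    """Proportion of scorable positions answered correctly.

    Assumed scoring rule: a position counts as correct if the participant
    pressed the key on a target or withheld the key press on a non-target.
    The first n positions cannot be targets and are not scored.
    """
    targets = set(n_back_targets(letters, n))
    pressed = set(pressed)
    scorable = range(n, len(letters))
    correct = sum((i in targets) == (i in pressed) for i in scorable)
    return correct / len(scorable)


# Hypothetical letter sequence; positions 3 and 7 repeat the letter shown
# three steps earlier.
sequence = list("ABCADEFDGH")
print(n_back_targets(sequence))
# One hit (position 3), one false alarm (position 5), one miss (position 7).
print(score_run(sequence, pressed=[3, 5]))
```

Under this assumed rule, a participant who never presses the key still scores well when targets are rare, which illustrates why participants may find it hard to judge their own performance.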

2.6 Feedback

Each participant received one of two feedback variants, either positive or negative feedback. We derived the feedback from a study about the effects of comparative feedback [8]. The positive feedback stated: “Your performance lies in the 87th percentile. This means that you gave more correct answers and were quicker than 87% of other people performing this task.” The negative variant stated: “Your performance lies in the 19th percentile. This means that 81% of other people performing this task gave more correct answers and were quicker than you.”

2.7 Robot

The robot used in the study was the model PEPPER from Aldebaran. It is about 1.2 m tall and has a simple color pattern of mostly white with blue LEDs. It is an anthropomorphic social robot with a stylized face, arms with fingers and a roughly human body shape. It is capable of speaking in a rather natural way.

We used a Wizard-of-Oz design, where a human experimenter controlled the robot with a web-interface from another room [49]. The human experimenter was able to observe the study with a network camera installed in the laboratory. A schematic illustration of the spatial study setup is provided in Figure 4.

Fig. 4 Schematic illustration of the spatial study setup

The robot was programmed to stay in front of the participant’s desk at a distance of about two meters. It made slight naturalistic and anthropomorphic movements, such as moving its arms slightly from front to back. It maintained eye contact with the participants most of the time but looked around the room when not speaking. While speaking, it made gestures with its arms and moved its body a little. Overall, the goal of its programming was to increase its anthropomorphism.

2.8 Psychophysiological Parameters

In the present study, we measured cardiovascular activity and electrodermal activity. Cardiovascular activity was measured with single-use electrodes using a three-point ECG on the chest. Electrodermal activity was measured using AgCl electrodes, positioned on the hypothenar and thenar eminence of the non-dominant hand. To measure and process psychophysiological data, the Varioport system and the Variograf v469 were used. Analysis of the data was conducted using EKG Vario v1.85 and EDA Vario v1.94. The cardiovascular system parameter of interest was heart rate. The electrodermal activity parameters used in this study were skin conductance level (SCL) and non-specific skin conductance response (NS.SCR). The measurement sections of interest for the psychophysiological parameters covered the 20 seconds directly after the feedback.

Psychophysiological parameters were baseline corrected for each participant using an individual 60-second resting-phase as a baseline to control for participants’ initial physical conditions and to avoid cross-interactions and effects of participants’ individual physical conditions during the experiment. Psychophysiological parameters were adjusted regarding individual artifacts for each participant and measurement section.
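A minimal sketch of such a baseline correction is given below. It assumes that the correction is a simple difference between the mean of the 20-second post-feedback section and the mean of the 60-second resting phase; the text does not specify the exact computation, and the function name and sample values are hypothetical.

```python
from statistics import mean


def baseline_corrected(baseline_samples, post_feedback_samples):
    """Mean of the post-feedback section minus the mean of the resting baseline.

    Assumption: correction by subtraction. Positive values indicate higher
    activity after the feedback than at rest.
    """
    return mean(post_feedback_samples) - mean(baseline_samples)


# Hypothetical heart-rate samples in beats per minute for one participant.
resting_phase = [70.0, 72.0, 71.0]      # would cover 60 seconds in practice
after_feedback = [75.0, 77.0, 76.0]     # would cover 20 seconds in practice
print(baseline_corrected(resting_phase, after_feedback))
```

Because each participant is compared with their own resting level, differences in initial physical condition between participants do not enter the group comparison.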

2.9 Questionnaires

Various questionnaires were given to the participants to assess psychological variables. Since this study was conducted with German native speakers, the German versions of all surveys and questionnaires were presented to the participants.

2.10 Intrinsic Motivation

Intrinsic motivation was assessed via an adapted version of the Questionnaire for Measuring the Current Motivation [50]. This questionnaire originally consists of 18 items, measuring fear of failure, interest, subjective probability of success, and perceived challenge. However, three items were excluded because they did not match the features of the given task in the experiment. An example of such an item is “In the task, I like the role of the scientist who discovers connections”, which was not representative for a working memory task. Answers were given on a seven-point rating scale, ranging from 1 (does not apply) to 7 (does apply). The authors [50] argue against the computation of a total score. For that reason, we chose the sub-scale “interest” for analyses, since interest appropriately represents internal or intrinsic motivation [51]. Internal consistency of the interest sub-scale as indicated by Cronbach’s α was .71.
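For readers unfamiliar with the reliability coefficient reported here, the following sketch shows how Cronbach’s α can be computed from item-level ratings using the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of the sum score). The ratings are made up for illustration and are not data from this study.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of per-participant lists of item ratings."""
    k = len(responses[0])  # number of items in the sub-scale
    # Sample variance of each item across participants.
    item_variances = [variance([row[j] for row in responses]) for j in range(k)]
    # Sample variance of the participants' sum scores.
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical ratings of five participants on three items (scale 1 to 7).
ratings = [
    [4, 5, 4],
    [2, 3, 3],
    [6, 6, 7],
    [3, 4, 3],
    [5, 5, 6],
]
print(round(cronbach_alpha(ratings), 2))
```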

2.11 State Self-Esteem

To measure state self-esteem, we used the State Self-Esteem Scale [52]. The scale consists of 15 statements, grouped into three sub-scales (a) performance state self-esteem, (b) social state self-esteem, and (c) visual appearance state self-esteem. For each statement, participants were required to indicate the extent to which they agree with the statement on a 5-point rating scale. We focused on performance state self-esteem, since it best represents self-esteem in a performance situation. The internal consistency of the performance state self-esteem scale was satisfactory, with α = .80.

2.12 Agent Attributes

To better understand the impression the participants had of the different test conductors in a feedback situation, the Robotic Social Attributes Scale (RoSAS) was used [53]. It measures the perceived warmth and competence the agent radiates, as well as the discomfort the participant experiences when interacting with the agent. All sub-scales were measured with six items each on a 9-point rating scale. The internal consistencies of the sub-scales ranged from α = .82 to .91.

3 Results

Overall, the participants performed rather well in the 3-back task. On average, they correctly responded to 80% of the cues in the first feedback run and to 82% of the cues in the second feedback run. Thus, their accuracy was higher than the 66% accuracy reported in previous validation studies [48].

Participants received bogus feedback that was either positive or negative, regardless of their actual performance. We therefore tested whether the feedback was perceived as plausible using the following self-developed item: “Did you feel the feedback was representative of your performance in the task you just completed?” Participants rated the item on a 5-point scale ranging from 1 (not at all) to 5 (very much). To evaluate differences in feedback plausibility, participants were assigned to one of four groups: inadequate positive feedback, inadequate negative feedback, adequate positive feedback and adequate negative feedback. For example, participants who had an accuracy of 66% or higher and received negative feedback were assigned to the inadequate negative feedback group. The results of an ANOVA indicate that there were no significant differences in the assessment of feedback plausibility among the four groups (F(3,66) = 0.962, p = .42, η2 = 0.04).
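The group assignment can be sketched as follows. Only one of the four cases is stated explicitly in the text (accuracy of 66% or higher combined with negative feedback); the remaining three cases are inferred by symmetry and should be read as an assumption, as should the function name and the string labels.

```python
def feedback_adequacy_group(accuracy, valence, threshold=0.66):
    """Classify bogus feedback as adequate or inadequate for a participant.

    accuracy: proportion of correct responses (0 to 1).
    valence: "positive" or "negative".
    Assumption: feedback counts as adequate when its valence matches whether
    the participant reached the accuracy threshold.
    """
    performed_well = accuracy >= threshold
    matches_performance = performed_well == (valence == "positive")
    prefix = "adequate" if matches_performance else "inadequate"
    return f"{prefix} {valence} feedback"


# The case described in the text: good performance, negative feedback.
print(feedback_adequacy_group(0.80, "negative"))
```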

3.1 Intrinsic Motivation and Self-Esteem

Intrinsic motivation and performance self-esteem were analyzed via a mixed model MANOVA. For this calculation, one participant had to be excluded due to extreme outliers in both variables of more than two standard deviations. Therefore, the analysis of intrinsic motivation and self-esteem was conducted with 71 participants.

Results show that no significant difference between receiving positive or negative performance feedback exists for intrinsic motivation, leading to a rejection of H1. For performance self-esteem, a main effect for time and a significant interaction between time and valence were found. People reported higher self-esteem after receiving feedback than beforehand (F(1,67) = 10.42, p < .01, η2 = 0.04). However, their self-esteem increased even more after receiving positive feedback than after receiving negative feedback (F(1,67) = 4.15, p < .05, η2 = 0.02). This result supports H2.

People are significantly more motivated when receiving feedback from a robot than from a human (F(1,67) = 5.63, p < .05; η2 = 0.07), contradicting H5a. There are also no significant differences in the effects of a human and a robotic feedback giver on self-esteem, which leads to the rejection of H5b. The group values for self-esteem and intrinsic motivation can be seen in Table 1. The effect on intrinsic motivation is depicted in Figure 5.

Table 1 Means and standard deviations of subjective parameters by study groups

Fig. 5 Effects of feedback valence and agent on intrinsic motivation

3.2 Psychophysiological Results

Two participants were excluded from psychophysiological data analysis due to insufficient psychophysiological data quality, leaving 70 participants for further data analysis.

Concerning parameters of skin conductance (SCL and NS.SCR) and heart rate, analyses reveal no significant differences for the main effect of valence, leading to a rejection of hypotheses H3 and H4. Regarding the main effect of agent, analyses reveal significant differences for heart rate, NS.SCR, and SCL. Within all these parameters, participants show more psychophysiological reactivity when receiving feedback from a human than from a robot, which supports hypotheses H6a and H6b. Figure 6 shows the significant difference in the baseline-corrected mean frequency of NS.SCRs, as an example of the main effects of the agent on all psychophysiological parameters.

Fig. 6 Main effect of agent on the frequency of non-specific skin conductance response (NS.SCR)

As shown in Table 2, a significant interaction of valence and agent was found for the SCL. As can be seen in Figure 7, participants show higher reactivity in SCL after receiving positive feedback from a human agent than from a robot, whereas no difference in reactivity can be seen between a human agent and a robot if participants receive negative feedback. Still, negative feedback from both a human agent and a robot provokes higher reactivity than positive feedback from a robot (for means of the groups see Table 3). This qualifies Hypothesis 6 insofar as, with regard to SCL, it was only supported for positive feedback.

Table 2 Results of analysis of variance of psychophysiological parameters

Fig. 7 Interaction of feedback valence and agent on skin conductance level (SCL)

Table 3 Means of psychophysiological parameters by study groups

3.3 Additional Analyses

Since not all hypotheses were supported, with significant effects mainly in psychophysiological parameters and small effects in subjective parameters, we conducted additional exploratory analyses of important facets of social interactions.

A MANOVA was calculated to examine differences in social warmth, perceived competence and discomfort of the interaction between humans and social robots. The results showed a significant difference between the agents in terms of social warmth (F(1,61) = 40.427, p < .001, η2 = 0.399). Participants perceived significantly more social warmth when interacting with the human (M = 5.33, SD = 1.68) than when interacting with the robot (M = 2.80, SD = 1.38), as shown in Figure 8. Another significant difference between agents was found in discomfort (F(1,61) = 4.113, p < .05, η2 = 0.063). Participants reported feeling more discomfort when interacting with the robot (M = 1.71, SD = 0.72) than with the human (M = 1.36, SD = 0.65). There were no significant differences for the perceived competence of the agents.
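The effect sizes in this paragraph can be related to the reported F values with a short calculation. The sketch assumes that η2 here denotes partial eta squared, computed as F · df_effect / (F · df_effect + df_error); the text does not name the variant, but the values reported for warmth and discomfort are consistent with this formula.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values and degrees of freedom as reported for the agent effect.
warmth = partial_eta_squared(40.427, 1, 61)
discomfort = partial_eta_squared(4.113, 1, 61)
print(round(warmth, 3), round(discomfort, 3))
```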

Fig. 8 Effects of agent and feedback valence on perceived social warmth

We also calculated correlations between psychophysiological parameters and warmth, competence, and discomfort, to determine if psychophysiological reactions are associated with the quality of social interaction. To conduct these correlation analyses, we first assessed the normality of the variables’ distributions. As normality was violated for all three variables, Spearman correlations were used (see Table 4). The results show significant positive correlations between all skin conductance parameters and the social warmth dimension of the RoSAS Scale. The higher the agent’s social warmth was perceived, the higher the SCL and the NS.SCR.
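A Spearman correlation is a Pearson correlation computed on ranks, which is why it does not require normally distributed variables. The following sketch illustrates the computation with tied values receiving their average rank; the function names and the example values are hypothetical and do not reproduce the study data or the statistics software used.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 upwards; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    mx, my = mean(x), mean(y)
    covariance = sum((a - mx) * (b - my) for a, b in zip(x, y))
    spread_x = sum((a - mx) ** 2 for a in x) ** 0.5
    spread_y = sum((b - my) ** 2 for b in y) ** 0.5
    return covariance / (spread_x * spread_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical warmth ratings and baseline-corrected SCL values.
warmth_ratings = [2.0, 3.5, 3.5, 5.0, 6.5, 7.0]
scl_change = [0.1, 0.4, 0.2, 0.9, 0.8, 1.5]
print(round(spearman(warmth_ratings, scl_change), 2))
```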

Table 4 Spearman correlations between psychophysiological parameters and the RoSAS Scale

4 Discussion

Since robots will play an increasingly important role in work environments [1], this study aimed to investigate the psychological consequences of deploying social robots in certain work situations, particularly in supervision and feedback situations. Specifically, we compared subjective and psychophysiological effects of positive and negative feedback from a human and a robot. We expected that feedback from a robot would have weaker effects than feedback from a human.

In contrast to our hypothesis, the results show that people are generally more motivated when receiving feedback from a social anthropomorphic robot than from a human, regardless of the valence of the feedback. Self-esteem, on the other hand, was only influenced by the valence of the feedback. It increased after receiving positive and negative performance feedback, but the effect was stronger for the positive feedback. Regarding parameters of psychophysiological reactions, we found no differences between positive and negative feedback. However, people have a higher heart rate, SCL, and NS.SCR when interacting with a human as opposed to a social robot. The difference in the SCL occurs solely when receiving positive performance feedback.

Additional exploratory analyses revealed that people perceive social interaction with a social robot as less warm and more discomforting than social interaction with another human. Moreover, people’s SCL and NS.SCR were positively associated with the perceived social warmth.

4.1 Theoretical Implications

In the following, we discuss theoretical implications of our study results, first regarding the subjective outcomes, then regarding the psychophysiological parameters.

4.2 Intrinsic Motivation and Self-Esteem

Contrary to our expectations, intrinsic motivation was higher when receiving performance feedback from a robot than from a human. This unexpected finding can be explained by a novelty effect. Most people do not have prior experience with social robots, which causes greater engagement and interest in the agent than when interacting with another human [54]. This can result in a spillover effect, where the interest in the robot extends to other aspects of the interaction and therefore to increased motivation and interest in the task.

Although unexpected, the result supports Smedegaard’s [55] notion that novelty is a relevant category in human-robot interaction. Smedegaard argues that interacting with social robots is radically different from established sense-making and therefore novel, because it challenges fundamental distinctions between living versus non-living beings. This novelty requires people to seek new knowledge but also provides them opportunities for exploration and learning, making it intrinsically motivating [15, 55]. Consequently, besides social presence, which, according to Social Surrogate Theory, is lower in interactions with robots [36], other psychological mechanisms, such as novelty, might influence people’s perceptions of and reactions towards robots.

For state self-esteem, we found an increase over time, particularly when receiving positive performance feedback, but no effect of agent type. These findings suggest that the valence of the performance feedback has a greater impact on state self-esteem than the type of the feedback giver. The lack of a significant agent effect is consistent with a previous study that also found no effect of agent type (human versus computer) on state self-esteem in a situation where participants received negative performance feedback [10]. The fact that we found an effect on state self-esteem over time only for positive performance feedback might be explained by the Theory of Self-Serving-Attribution [56]. The theory posits that people tend to attribute positive outcomes of their actions internally and negative outcomes externally, thereby protecting their self-esteem. These different attributions could lead people to take positive feedback more seriously than negative feedback and even to question the validity of the negative feedback. Therefore, positive performance feedback might have a stronger impact on state self-esteem than negative performance feedback.

In summary, the findings suggest that robots, probably due to their novelty, may be more effective in promoting intrinsic motivation and engagement, while feedback valence seems to be more important for state self-esteem.

4.3 Psychophysiology

Even though the agent type seems to have little effect on a subjective level, it does affect people on the psychophysiological level. A robot has less effect on cardiovascular and electrodermal activity than a human does in a feedback situation. Thus, our results suggest that robots are used as social surrogates, as proposed by Nash et al. [46], and that they also affect parameters of psychophysiology corresponding with social interactions. However, the robots’ effects on a person’s psychophysiology are diminished, because they only act as social surrogates and are perceived less as social entities than humans are. This finding is in line with research by Rosenthal-von der Pütten et al. [57], who compared empathy in human-human and human-robot interaction situations. Their research revealed that people show significantly less empathy towards a robot after interaction with a robot that was abused in a video than towards one that was treated in an affectionate way. They argue that the presence and the actions of the robot have less effect on people’s arousal, supporting the theory that robots are perceived as less socially present due to their social surrogate status. Other studies also support this line of reasoning, arguing that acceptance, enjoyment, and intention to use a robot are mediated by its social presence [58, 59]. Moreover, our findings add to recent research on differential effects of robot type on physiological arousal. While Zhang et al. [60] showed that robots with moderate levels of anthropomorphic appearance evoke higher physiological arousal than robots with low or high anthropomorphic appearance, we demonstrated that robots have weaker effects on cardiovascular and electrodermal activity than human interaction partners.
Our exploratory findings that skin conductance correlates positively with perceived social warmth and negatively with discomfort also support the assumption that psychophysiological parameters are influenced by the quality of the social interaction. This finding responds to recent calls to extend research on human-robot interactions by using multimodal measurements that consider both subjective experiences and physiological activity [60].

Regarding the SCL, however, the agent effect was mainly visible in performance feedback with a positive valence. Whereas no difference was found for negative feedback, people showed a higher skin conductance reaction when receiving positive feedback from a human as opposed to a robot. The general increase in SCL after receiving negative feedback can be explained by an enhanced activity of the BIS, while the BAS seems to become more active only after receiving positive feedback from a human (but not from a robot). One explanation could be that negative performance feedback primarily conveys information about one’s insufficiency, thus activating the BIS, regardless of which agent gives the feedback. Positive performance feedback, on the other hand, might also convey additional information about appraisal from a socially relevant interaction partner. This could explain why people show lower skin conductance after receiving positive feedback from a robot, which is less socially present, but about the same skin conductance after receiving negative feedback. Non-specific fluctuations also show tendencies toward this pattern.

The lack of a main effect of feedback valence on heart rate and parameters of electrodermal activity suggests that the psychophysiological parameters in our experimental situation do not represent the classical effects often observed for BAS and BIS. It rather indicates that our study design primarily induced activity in the non-specific activation system (NAS). Given the correlations between psychophysiology and social interaction aspects, it seems that all parameters represent some form of general arousal invoked by the quality of the social interaction. Fowles [28] also stated that electrodermal activity in particular tends to fluctuate in its subjective meaning between different experiments, which can explain why only a non-specific arousal was observed.

In sum, the results support the assumption that robots are less socially present than humans, thus inducing weaker psychophysiological reactions in social situations. Therefore, it is recommended to supplement current research on the uncanny valley [34] with studies on the social presence of social robots [58, 59].

4.4 Practical Implications

Since feedback from today’s social robots appears to have weaker effects on people’s arousal than feedback from a human, it is important to carefully plan the deployment of social robots in work settings in order to utilize them for the greatest benefit. Based on the results of the present study, it must first be clarified what purpose the feedback from, or interaction with, the robot is intended to fulfill, and in what setting the robot is deployed. For negative feedback, the agent type does not seem to make much of a difference, but for positive feedback, a robot leads to less arousal.

The arousal induced by the agent should be calibrated to the situational requirements. If an interaction is intended to invoke little arousal, such as in certain therapeutic settings, it may be advantageous to use a robot. For example, studies showed that autistic children could use social robots to communicate with other humans and learn social skills more easily [61,62,63]. The lower level of social presence and induced arousal when interacting with a robot could be beneficial there. On the other hand, people often require medium to higher levels of arousal for optimal performance and well-being in certain situations [64]. In monotonous workplaces [65], for example, interaction partners with a higher level of social presence, such as humans or robots that are capable of providing such social presence, would be beneficial for the performance and health of the workers.

Criticism and negative feedback are often intended to convey information about how the receiver should change their behavior in order to improve in the future [66]. Hence, future studies need to examine whether the reduced level of arousal when interacting with a robot also reduces the likelihood of behavioral change in humans.

5 Limitations

Despite its theoretical and practical contributions, our study has, like any other study, some limitations. First, the sample consisted primarily of students, which caused a skew in the education distribution. The sample was also relatively young (M = 23.94). Since robot effects may be influenced by people’s age and education, with younger and more highly educated people being more receptive to the idea of using a robot and generally perceiving robots as less social [67], we recommend that future studies in this field recruit more diverse samples in terms of age and education.

Second, the feedback given to the participants did not represent their actual performance, but was randomized to ensure comparable group sizes [8]. Even though people’s acceptance of the feedback was medium to high (M = 3.01 on a 5-point rating scale) and the groups did not differ in their assessment of feedback plausibility, feedback that precisely matches the participants’ performance could result in larger effects. Such an approach might also lead to a better differentiation between effects of the activation of either the BIS or BAS. While this limitation primarily affects the interpretations of results based on feedback valence, it does not affect the validity of results based on the agent type.

Third, the 3-back task used in this study is a cognitive task to evaluate working memory capacity. It therefore mainly represents work tasks that require high levels of cognitive attention and mental resources, such as retaining and processing information. However, as work tasks are rather heterogeneous, future studies should replicate our results using other task types [10].

6 Conclusion

The present study shows that while social robots such as PEPPER are able to act as social surrogates, they are perceived as being less socially present and as offering a lower quality of interaction than a human. These effects are mainly visible in psychophysiological parameters, such as skin conductance. Effects such as the social presence of robots require closer investigation, as does the uncanniness of social robots. The implications suggest that feedback from robots should be specifically considered when a situation requires that interaction partners have lower arousal levels, such as in very stressful situations.