1 Introduction

Robots and other complex autonomous systems offer potential benefits by assisting humans in accomplishing their tasks. These benefits, however, may not be realized due to maladaptive forms of interaction. While robots are only now being fielded in appreciable numbers, a substantial body of experience and research already exists characterizing human interactions with more conventional forms of automation in aviation and the process industries.

In human interaction with automation, it has been observed that the human may fail to use the system when it would be advantageous to do so. This has been called disuse (underutilization or under-reliance) of the automation [97]. People have also been observed to fail to monitor automation properly (e.g. turning off alarms) when automation is in use, or to accept the automation’s recommendations and actions when it is inappropriate to do so [71, 97]. This has been called misuse, complacency, or over-reliance. Disuse can decrease automation benefits and lead to accidents if, for instance, safety systems and alarms are not consulted when needed. Another maladaptive attitude is automation bias [33, 55, 77, 88, 112], a user tendency to ascribe greater power and authority to automated decision aids than to other sources of advice (e.g. humans). When the decision aid’s recommendations are incorrect, automation bias may have dire consequences [2, 78, 87, 89] (e.g. errors of omission, where the user does not respond to a critical situation, or errors of commission, where the user does not analyze all available information but follows the advice of the automation).

Both naïve and expert users show these tendencies. In [128], it was found that skilled subject matter experts had misplaced trust in the accuracy of diagnostic expert systems (see also [127]). Additionally, the Aviation Safety Reporting System contains many reports from pilots that link their failure to monitor to excessive trust in automated systems such as autopilots or flight management systems (FMS) [90, 119]. On the other hand, when corporate policy or federal regulations mandate the use of automation that is not trusted, operators may “creatively disable” the device [113]; in other words, they disuse the automation.

Studies have shown [64, 92] that trust towards automation affects reliance (i.e. people tend to rely on automation they trust and not to use automation they do not trust). For example, trust has frequently been cited [56, 93] as a contributor to human decisions about monitoring and using automation. Indeed, within the literature on trust in automation, complacency is conceptualized interchangeably as the overuse of automation, the failure to monitor automation, and lack of vigilance [6, 67, 96]. For optimal performance of a human-automation system, human trust in automation should be well calibrated. Both disuse and misuse of automation have resulted from improper calibration of trust, which has also led to accidents [51, 97].

In [58], trust is conceived to be an “attitude that an agent (automation or another person) will help achieve an individual’s goals in a situation characterized by uncertainty and vulnerability.” A majority of research on trust in automation has focused on the relation between automation reliability and operator usage, often without measuring the intervening variable, trust. The utility of introducing an intervening variable between automation performance and operator usage, however, lies in the ability to make more precise or accurate predictions with the intervening variable than without it. This requires that trust in automation be influenced by factors in addition to automation reliability/performance. The three-dimensional (Purpose, Process, and Performance) model proposed by Lee and See [58], for example, presumes that trust (and indirectly, propensity to use) is influenced by a person’s knowledge of what the automation is supposed to do (purpose), how it functions (process), and its actual performance. While such models seem plausible, support for the contribution of factors other than performance has typically been limited to correlations between questionnaire responses and automation use. Despite multiple studies of trust in automation, the conceptualization of trust and how it can be reliably modeled and measured remains a challenging problem.

In contrast to automation, where system behavior has been pre-programmed and performance is limited to the specific actions the system has been designed to perform, autonomous systems/robots have been defined as having intelligence-based capabilities that allow them a degree of self-governance, enabling them to respond to situations that were not pre-programmed or anticipated in the design. Therefore, the role of trust in interactions between humans and robots is more complex and difficult to understand.

In this chapter, we present the conceptual underpinnings of trust in Sect. 8.2, and then discuss models of, and the factors that affect, trust in automation in Sects. 8.3 and 8.4, respectively. Next, we discuss instruments for measuring trust in Sect. 8.5, before moving on to trust in the context of human-robot interaction (HRI) in Sect. 8.6, covering both how humans influence robots and vice versa. We conclude in Sect. 8.7 with open questions and areas of future work.

2 Conceptualization of Trust

Trust has been studied in a variety of disciplines (including social psychology, human factors, and industrial organization) for understanding relationships between humans or between human and machine. The wide variety of contexts within which trust has been studied has led to various definitions and theories, with trust defined as an attitude, an intention, or a behavior [72, 76, 86]. Both within the interpersonal literature and the human-automation trust literature, a widely accepted definition of trust is lacking [1]. However, it is generally agreed that trust is best conceptualized as a multidimensional psychological attitude involving beliefs and expectations about the trustee’s trustworthiness derived from experience and interactions with the trustee in situations involving uncertainty and risk [47]. Trust has also been said to have both cognitive and affective features. In the interpersonal literature, trust is seen as involving affective processes, since trust development requires seeing others as personally motivated by care and concern to protect the trustor’s interests [65]. In the automation literature, cognitive (rather than affective) processes may play a dominant role in the determination of trustworthiness, i.e., the extent to which automation is expected to do the task that it was designed to do [91]. In the trust in automation literature, it has been argued that trust is best conceptualized as an attitude [58], and a relatively well accepted definition of trust is: “...an attitude which includes the belief that the collaborator will perform as expected, and can, within the limits of the designer’s intentions, be relied on to achieve the design goals” [85].

3 Modeling Trust

The basis of trust can be considered as a set of attributional abstractions (trust dimensions) that range from the trustee’s competence to its intentions. Muir [91] combined the dimensions of trust from two works ([4] and [100]). Barber’s model [4] is framed in terms of the human expectations that form the basis of trust between human and machine: persistence, technical competency, and fiduciary responsibility. Although the number and concepts of the trust dimensions vary in the subsequent literature [58], there seems to be a convergence on the three dimensions—Purpose, Process, and Performance [58]—mentioned earlier, along with correspondences of those to earlier concepts, such as the dimensions in [4] and those of Ability, Integrity, and Benevolence [76]. Ability is the trustee’s competence in performing expected actions, benevolence is the trustee’s intrinsic and positive intentions towards the trustor, and integrity is the trustee’s adherence to a set of principles that are acceptable to the trustor [76].

Both the trust in automation literature [92] and the interpersonal relations literature [37, 53, 84, 107] agree that trust relations are dynamic and vary over time. Three phases characterize trust over time: trust formation, where trustors choose to trust trustees and potentially increase their trust over time; trust dissolution, where trustors decide to lower their trust in trustees after a trust violation has occurred; and trust restoration, where trust stops decreasing after a trust violation and is restored (although potentially not to the same level as before the violation). Early in the relationship, trust in the system is based on the predictability of the system’s behavior. Work in the literature has shown shifts in trust in response to changes in properties and performance of the automation [56, 91]: when the automation was reliable, operator trust increased over time, and vice versa. Varying levels of trust were also positively correlated with varying levels of automation use; as trust decreased, for instance, manual control became more frequent. As the operator interacts with the system, he/she attributes dependability to the automation. Prolonged interaction leads the operator to make generalizations about the automation and broader attributions about the future behavior of the system (faith). There is some difference in the literature as to when exactly faith develops in the dynamic process of trust development. Whereas [100] argue that interpersonal trust progresses from predictability to dependability to faith, [92] suggest that for trust in automation, faith is a better predictor of trust early rather than late in the relationship.

Some previous work has explored trust with respect to an automation versus a human trustee [64]. The results indicate (a) the dynamics of trust are similar, in that faults diminish trust towards automation and towards another human alike, (b) the sole predictor of reliance on automation was the difference between trust and self-confidence, and (c) in human-human experiments, participants were more likely to delegate a task to a human partner when they believed that partner held a low opinion of the participant’s trustworthiness. In other words, when participants thought their own trustworthiness in the eyes of others was high, they were more likely to retain control over a task. However, trustworthiness played no role when the collaborative partner was an automated controller, i.e. only participants’ own confidence in their performance determined their decision to retain/obtain control. Other work on trust in humans versus trust in automation [61] explored the extent to which participants trusted identical advice given by an expert system under the belief that it was given by a human or a computer. The results of these studies, however, were somewhat contradictory. In one study, participants were more confident in the advice of the human (though their agreement with the human advice did not differ from their agreement with the expert system’s advice), while in the second study, participants agreed more with the advice of the expert system but had less confidence in it. Similar contradictory results have been shown in HRI studies, where work indicated that errors by a robot did not affect participants’ decisions of whether or not to follow the advice of a robot [111], yet did affect their subjective reports of the robot’s reliability and trustworthiness [104]. Study results by [71], however, indicated that reliance on a human aid was reduced in situations of higher risk.

4 Factors Affecting Trust

The factors that are likely to affect trust in automation have generally been categorized as those pertaining to the automation, the operator, and the environment. Most of the factors that have been empirically researched pertain to characteristics of the automation. Here we briefly present relevant work on the most important of these factors.

4.1 System Properties

The most important correlates of automation use have been system reliability and the effects of system faults. Reliability typically refers to automation that has some error rate—for example, misclassifying targets. Typically this rate is constant and data are analyzed using session means. Faults are typically more drastic, such as a controller that fails and makes the whole system behave erratically. Faults are typically single events and are studied as time series.

System reliability: Prior literature has provided empirical evidence that there is a relationship between trust in automation and the automation’s reliability [85, 96,97,98, 102]. Research shows [86] that declining system reliability can lead to systematic decline in trust and trust expectations, and most crucially, these changes can be measured over time. There is also some evidence that only the most recent experiences with the automation affect trust judgments [51, 56].

System faults: System faults are a form of system reliability, but are treated separately because they concern discrete system events and involve different experimental designs. Different aspects of faults influence the relation between trust and automation. Lee and Moray [56] showed that in the presence of continual system faults, trust in the automation reached its lowest point only after six trials, but trust did recover gradually even as faults continued. The magnitude of system faults has differential effects on trust: smaller faults had minimal effect, while large faults negatively affected trust, and trust was slower to recover. Another finding [92] showed that faults of varying magnitude diminished trust more than large constant faults. Additionally, it was found that when faults occurred in a particular subsystem, the corresponding distrust did spread to other functions controlled by the same subsystem. The distrust did not, however, spread to independent or similar subsystems.

System predictability: Although system faults affect trust in the automation, this happens when the human has little a priori knowledge about the faults. Research has shown that when people have prior knowledge of faults, these faults do not necessarily diminish trust in the system [64, 102]. A plausible explanation is that knowing that the automation may fail reduces the uncertainty and consequent risk associated with use of the automation. In other words, predictability may be as important as (or more important than) reliability.

System intelligibility and transparency: Systems that can explain their reasoning will be more likely to be trusted, since they would be more easily understood by their users [66, 117, 121, 122]. Such explanatory facility may also allow the operator to query the system in periods of low system operation in order to incrementally acquire and increase trust.

Level of Automation: Another factor that may affect trust in the system is its level of automation (i.e. the level of functional allocation between the human and the system). It has been suggested [91, 93] that system understandability is an important factor for trust development. In their seminal work on the subject [116], Sheridan and Verplank propose a 10-level scale for assessing the level of automation in a system, ranging from level 1, where the human performs the task with no assistance from the computer, to level 10, where the system is fully autonomous. Since higher levels of automation are more complex, and thus potentially more opaque to the operator, higher levels of automation may engender less trust. Some limited empirical work suggests that different levels of automation may have different implications for trust [86]: results obtained with Level 3 automation [116] were not replicated when the study was conducted with Level 7 (higher) automation.

4.2 Properties of the Operator

Propensity to trust: In the sociology literature [105] it has been suggested that people have different propensities to trust others, and it has been hypothesized that this is a stable personality trait. In the trust in automation literature, there is very limited empirical work on the propensity to trust. Some evidence provided in [97] suggests that an operator’s overall propensity to trust is distinct from trust towards a specific automated system. In other words, an operator may have a high propensity to trust automation in general and yet, faced with a specific automated system, exhibit very low trust.

Self Confidence: Self-confidence is an individual-difference factor and one of the few operator characteristics that has been studied in the trust in automation literature. Work in [57] suggested that when trust was higher than self-confidence, automation, rather than manual control, would be used, and vice versa when trust was lower than self-confidence. However, later work [86], which was conducted with a higher level of automation than [57], did not obtain similar results. It was instead found that trust was influenced by properties of the system (e.g., real or apparent false diagnoses) while self-confidence was influenced by operator traits and experiences (e.g. whether they had been responsible for accidents). Furthermore, self-confidence was not affected by system reliability. This last finding was also suggested in the work of [64], which found that self-confidence was not lowered by shifts in automation reliability.

Individual Differences and Culture: It has been hypothesized, and supported by various studies, that individual differences [57, 74, 80, 119] and culture [50] affect people’s trust behavior. The interpersonal relations literature has identified many personal characteristics of a trustor, such as self-esteem [105, 106], secure attachment [17], and motivational factors [54], that contribute to the different stages in the dynamics of trust. Besides individual characteristics, socio-cultural factors that contribute to differences in trust decisions in these different trust phases have also been identified [8, 10, 32, 37]. For example, combinations of socio-cultural factors that may result in quick trust formation (also called “swift trust” formation in temporary teams [83]) are time pressure [25] and high power distance with authority [16]. People in high power distance (PD) societies expect authority figures to be benign, competent, and of high integrity, and will thus engage in less vigilance and monitoring for possible violations by authority figures. To the extent, then, that people of high PD cultures perceive the automation as authoritative, they should be quick to form trust. On the other hand, once violations have occurred, people in high PD cultures should be slow to restore trust [11]. Additionally, it has been shown [79], via replication of Hofstede’s [45] cultural dimensions for a very large-scale sample of pilots, that even in such a highly specialized and regulated profession, national culture still exerts a meaningful influence on attitude and behavior over and above the occupational context.

To date, only a handful of studies, among them [99, 125] and [22], consider cultural factors and potential differences in the context of trust in automation. As the use of automation becomes increasingly globalized, it is imperative that we gain an understanding of how trust in automation is conceptualized across cultures and how it influences operator reliance, use of automation, and overall human-system performance.

4.3 Environmental Factors

In terms of environmental factors that influence trust in automation, risk seems most important. Research in trust in automation suggests that reliance on automation is modulated by the risk present in the decision to use the automation [101]. People are more averse to using the automation if negative consequences are more probable and, once trust has been lowered, it takes people longer to re-engage the automation in high-risk versus low risk situations [102]. However, knowing the failure behavior of the automation in advance may modify the perception of risk, in that people’s trust in the system does not decrease [101].

5 Instruments for Measuring Trust

While a large body of work on trust in automation and robots has developed over the past two decades, standardized measures have remained elusive, with many researchers continuing to rely on short, idiosyncratically worded questionnaires. Trust (in automation) refers to a cognitive state or attitude, yet it has most often been studied indirectly through its purported influence on behavior, often without any direct cognitive measure. The nature and complexity of the tasks and failures studied have varied greatly, ranging from simple automatic target recognition (ATR) classification [33], to erratic responses of a controller embedded within a complex automated system [57], to robots misreading QR codes [30]. The variety of reported effects (automation bias, complacency, reliance, compliance, etc.) mirrors these differences in tasks and scenarios. The authors of [27] and [28] have criticized the very construct of trust in automation on the basis of this diversity, calling it an unfalsifiable “folk model” without clear empirical grounding. Although the work cited in the reply to these criticisms in [98], as well as the large body of work cited in the review by [96], has begun to examine the interrelations and commonalities of concepts involving trust in automation, empirical research is needed to integrate divergent manifestations of trust within a single task/test population so that common and comparable measures can be developed.

Most “measures” of trust in automation since the original study [92] have been created for individual studies based on face validity and have not in general benefited from the same rigor in development and validation that has characterized measures of interpersonal trust. “Trust in automation” has been primarily understood through its analogy to interpersonal trust and more sophisticated measures of trust in automation have largely depended on rationales and dimensions developed for interpersonal relations, such as ability, benevolence, and integrity.

Three measures of trust in automation have benefited from systematic development and validation: the Empirically Derived (ED) scale, the Human-Computer Trust (HTC) scale, and the SHAPE Automation Trust Index (SATI). The Empirically Derived 12-item scale developed by [46] was systematically developed, subjected to a validation study [120], and used in other studies [75]. The scale was developed in three phases, beginning with a word elicitation task; based on an examination of clusters of words, a 12-factor structure was extracted and used to develop the 12-item scale. The twelve items roughly correspond to the classic three dimensions: benevolence (purpose), integrity (process), and ability (performance).

The Human-Computer Trust (HTC) instrument developed in [72] demonstrated construct validity and high reliability within their validation sample and has subsequently been used to assess automation in air traffic control (ATC) simulations, most recently in [68]. Subjects initially identified constructs that they believed would affect their level of trust in a decision aid. Following refinement and modification of the constructs and potential items, the instrument was reduced to five constructs (reliability, technical competence, understandability, faith, and personal attachment). A subsequent principal components analysis limited to five factors found most scale items related to these factors.

The SHAPE Automation Trust Index (SATI) [41], developed by the European Organization for the Safety of Air Navigation, is the most pragmatically oriented of the three measures. Preliminary measures of trust in ATC systems were constructed based on a literature review and a model of the task. This resulted in a seven-dimensional scale (reliability, accuracy, understanding, faith, liking, familiarity, and robustness). The measure was then refined in focus groups with air traffic controllers from different cultures rating two ATC simulations. Scale usability evaluations and construct validity judgments were also collected. The instrument/items have reported reliabilities in the high .80s, but its constructs have not been empirically validated.

All three scales have benefited from empirical study and systematic development yet each has its flaws. The ED instrument in [46], for instance, addresses trust in automation in the abstract without reference to an actual system and as a consequence appears to be more a measure of propensity to trust than trust in a specific system. A recent study [115] found scores on the ED instrument to be unaffected by reliability manipulations that produced significant changes in ratings of trust on other instruments. The HTC was developed from a model of trust and demonstrated agreement between items and target dimensions but stopped short of confirmatory factor analysis. Development of the SATI involved the most extensive pragmatic effort to adapt items so they made sense to users and captured aspects of what users believed contributed to trust. However, SATI development neglected psychometric tests of construct validity.

A recent effort [21, 23] has led to a general measure of trust in automation validated across large populations in three diverse cultures (the US, Taiwan, and Turkey), chosen as representative of Dignity, Face, and Honor cultures [63]. The cross-cultural measure of trust is consistent with the three dimensions (performance, purpose, process) of [58, 81] and contains two 9-item scales, one measuring the propensity to trust, as in [46], and the other measuring trust in a specific system. The second scale is designed to be administered repeatedly to measure the effects of manipulations expected to affect trust, while the propensity scale is administered once at the start of an experiment. The scales have been developed and validated for US, Taiwanese, and Turkish samples and are based on 773 responses (propensity scale) and 1673 responses (specific scale).

The Trust Perception Scale-HRI [114, 115] is a psychometrically developed 40-item instrument intended to measure human trust in robots. Items are based on data collected by identifying robot features from pictures and their perceived functional characteristics. While development was guided by the triadic (human, robot, environment) model of trust inspired by the meta-analysis in [43], a factor analysis of the resulting scale found four components corresponding roughly to capability, behavior, task, and appearance. Capability and behavior correspond to two of the dimensions commonly found in interpersonal trust [81] and trust in automation [58], while appearance may have a special significance for trust in robots. The instrument was validated in same-trait and multi-trait analyses, producing changes in rated trust associated with manipulation of robot reliability. The scale was developed based on 580 responses and validated with 21 participants.

The HRI Trust Scale [131] was developed from items based on five dimensions (team configuration, team process, context, task, and system) identified by 11 subject matter experts (SMEs) as likely to affect trust. A 100 participant Mechanical Turk sample was used to select 37 items representing these dimensions. The HRI Trust Scale is incomplete as a sole measure of trust and is intended to be paired with Rotter’s [105] interpersonal trust inventory when administered. While Lee and See’s dimensions [58] other than “process” are missing from the HRI scale, they are represented in Rotter’s instrument.

Because trust in automation or robots is an attitude, self-report through psychometric instruments such as these provides the most direct measurement. Questionnaires, however, suffer from a number of weaknesses. Because they are intrusive, measurements cannot be conveniently taken during the course of a task but only after the task is completed. This may suffice for automation such as ATR, where targets are missed at a fixed rate and the experimenter is investigating the effect of that rate on trust [33], but it does not work for measuring moment-to-moment trust in a robot reading QR codes to get its directions [30].

6 Trust in Human Robot Interaction

Robots are envisioned to be able to process many complex inputs from the environment and to be active participants in many aspects of life, including work environments, home assistance, battlefield and crisis response, and others. Robots are thus envisioned to transition from tool to teammate, as humans transition from operator to teammate, in an interaction more akin to human-human teamwork. These envisioned transitions raise a number of general questions: How would human interaction with the robot be affected? How would performance of the human-robot team be affected? How would human performance or behavior be affected? Although there are numerous tasks, environments, and situations of human-robot collaboration, in order to best clarify the role of trust we distinguish two general types of interactions between humans and robots: performance-based interactions, where the focus is on the human influencing/controlling the robot so it can perform useful tasks for the human, and social-based interactions, where the focus is on how the robot’s behavior influences the human’s beliefs and behavior. In both cases, the human is the trustor and the robot the trustee. In performance-based interactions there is a particular task with a clear performance goal. Examples are a human and a robot collaborating on manufacturing assembly, or a UAV performing surveillance and victim recognition in a search and rescue mission; here, measures of performance could be accuracy and time to complete the task. In social interactions, on the other hand, the performance goal is not as crisply defined. Examples of such tasks are a robot influencing a human to reveal private knowledge, or influencing a human to take medicine or do useful exercises.

6.1 Performance-Based Interaction: Humans Influencing Robots

A large body of HRI research investigating factors thought to affect behavior via trust, such as reliability, relies strictly on behavioral measures without reference to trust. Meyer’s [82] expected value (EV) theory of alarms provides one alternative by describing the human’s choice as one between compliance (responding to an alarm) and reliance (not responding in the absence of an alarm). The expected values of these decisions are determined by the utilities associated with an uncorrected fault, the cost of intervention, and the probabilities of misses (affecting reliance) and false alarms (affecting compliance). Research in [31], for example, investigated the effects of unmanned aerial vehicle (UAV) false alarms and misses on operator reliance (inferred from longer reaction times for misses) and compliance (inferred from shorter reaction times to alarms). While reliance/compliance effects were not found, higher false alarm rates correlated with poorer performance on a monitoring task, while misses correlated with poorer performance on a parallel inspection task. A similar study by [20] of unmanned ground vehicle (UGV) control found participants with higher perceived attentional control were more adversely affected by false alarms (under-compliance), while those with low perceived attentional control were more strongly affected by misses (over-reliance). Reliance and compliance can be measured in much the same way for homogeneous teams of robots, as illustrated by a follow-up study of teams of UGVs [19] with a similar design and results. A similar study [26] involved multiple UAVs, manipulating ATR reliability and administering a trust questionnaire, again finding that ratings of trust increased with reliability.
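
To make this expected-value framing concrete, the following is a minimal sketch of how the compliance and reliance decisions can be scored against the quantities Meyer’s account turns on: the cost of an uncorrected fault, the cost of intervention, and the miss and false alarm probabilities. It is an illustrative simplification, not the formalization in [82]; the function name, payoff structure, and numbers are assumptions.

```python
# Illustrative sketch of an expected-value view of alarm responses, in the
# spirit of the compliance/reliance framing above. The payoff structure and
# numbers are assumptions, not values from the cited studies.

def expected_costs(p_fault, p_alarm_given_fault, p_alarm_given_no_fault,
                   cost_uncorrected_fault, cost_intervention):
    """Return expected costs of acting vs. not acting, given an alarm or silence."""
    # Posterior probability of a fault after an alarm (Bayes rule).
    p_alarm = (p_alarm_given_fault * p_fault
               + p_alarm_given_no_fault * (1 - p_fault))
    p_fault_given_alarm = p_alarm_given_fault * p_fault / p_alarm

    # Posterior probability of a fault when no alarm sounds (a possible miss).
    p_silence = 1 - p_alarm
    p_fault_given_silence = (1 - p_alarm_given_fault) * p_fault / p_silence

    # Compliance question: given an alarm, intervene or ignore it?
    ev_comply = cost_intervention
    ev_ignore_alarm = p_fault_given_alarm * cost_uncorrected_fault

    # Reliance question: given silence, rely on the automation or check anyway?
    ev_rely = p_fault_given_silence * cost_uncorrected_fault
    ev_check_anyway = cost_intervention
    return ev_comply, ev_ignore_alarm, ev_rely, ev_check_anyway

# Example: rare faults, a sensitive but false-alarm-prone alert system.
print(expected_costs(p_fault=0.05, p_alarm_given_fault=0.9,
                     p_alarm_given_no_fault=0.2,
                     cost_uncorrected_fault=100.0, cost_intervention=5.0))
```

In this toy setting, complying with alarms and relying on silence both carry the lower expected cost, which is the pattern a well-calibrated operator would be expected to show.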

Transparency, common ground, or shared mental models involve a second construct (“process” [58] or “integrity” [76]) believed to affect trust. According to these models, the extent to which a human can understand the way in which an autonomous system works and predict its behavior will influence trust in the system. There is far less research on the effects of transparency, with most involving level of automation manipulations. An early study [60] in which all conditions received full information found the best performance for an intermediate level of automation that facilitated checks of accuracy (i.e., was transparent). Participants, however, made substantially greater use of a higher level of automation that provided an opaque recommendation. In this study, ratings of trust were affected by reliability but not transparency. More recent studies have equated transparency with additional information providing insight into robot behavior. Researchers in [9] compared conditions in which participants observed a simulated robot represented on a map by a status icon (level of transparency 1), overlaid with environmental information such as terrain (level 2), or with additional uncertainty and projection information (level 3). Note that these levels are distinct from Sheridan’s levels of automation mentioned previously. What might appear as erratic behavior in level 1, for example, might be “explained” by the terrain being navigated in level 2. Participants’ ratings of trust were higher for levels 2 and 3. A second study manipulated transparency by comparing minimal (such as a static image), contextual (such as a video clip), and constant (such as video) information for a simulated robot teammate with which participants had intermittent interactions, but found no significant differences in trust. In [126], researchers took a different approach to transparency by having a simulated robot provide “explanations” of its actions. The robot, guided by a POMDP model, can make different aspects of its decision making, such as beliefs (the probability of dangerous chemicals in a building) or capabilities (its ATR has 70% reliability), available to its human partner. Robot reliability affected both performance and trust. Explanations did not improve performance but did increase trust among those in the high reliability condition. As these studies suggest, reliability appears to have a large effect on trust, reliance/compliance, and performance, while transparency about function has a relatively minor one, primarily influencing trust. The third component of trust, the robot’s “purpose” [58] or “benevolence” [76], has been attributed [69, 70, 95] to “transparency” as conveyed by appearance, discussed in Sect. 8.6.2. By this interpretation, matching the human expectations aroused by a robot’s appearance to its purpose and capabilities can make interactions more transparent by providing a more accurate model to the human.
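
As an illustration of this explanation-based style of transparency, the sketch below shows one way an agent might surface selected internal quantities (a belief and a capability estimate) as short textual explanations. The class, field names, and message wording are hypothetical and are not taken from [126].

```python
# Hypothetical sketch: exposing an agent's beliefs and capability estimates as
# short, human-readable explanations. Names and phrasing are illustrative
# assumptions, not the interface of the system described above.

from dataclasses import dataclass

@dataclass
class RobotState:
    belief_chemicals_present: float   # current belief, in [0, 1]
    atr_reliability: float            # self-reported capability estimate, in [0, 1]
    chosen_action: str

def explain(state: RobotState) -> list[str]:
    """Generate explanations from selected internal variables."""
    return [
        f"I chose to {state.chosen_action}.",
        f"I believe there is a {state.belief_chemicals_present:.0%} chance "
        f"of dangerous chemicals in this building.",
        f"My target recognition is about {state.atr_reliability:.0%} reliable, "
        f"so please double-check flagged targets.",
    ]

for line in explain(RobotState(0.7, 0.7, "search the east wing")):
    print(line)
```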

Studies discussed to this point have treated trust as a dependent variable to be measured at the end of a trial and have investigated whether or not it had been affected by characteristics of the robot or situation. If trust in a robot is modified through a process of interaction, however, it must vary continuously as evidence accumulates of the robot’s trustworthiness or untrustworthiness. This was precisely the conception of trust investigated by Lee and Moray [56] in their seminal study, but it has been infrequently employed since. A recent example of such a study is reported in [29], where a series of experiments addressing temporal aspects of trust, involving levels of automation and robot reliability, was conducted using a robot navigation and barriers task. In that task, a robot navigates through a course of boxes bearing labels that the operator can read through the robot’s camera and QR codes presumed readable by the robot. The labels contain directions such as “turn right” or “U turn”. In automation modes, robots follow a predetermined course, with “failures” appearing to be misread QR codes. Operators can choose either the automation mode or a manual mode in which they determine the direction the robot takes. An initial experiment [29] investigated the effects of reliability drops at different intervals across a trial, finding that the decline in trust, as measured by a post-trial survey, was greatest if the reliability decline occurred in the middle or final segments. In subsequent experiments, trust ratings were collected continuously by periodic button presses indicating an increase or decrease in trust. These studies [30, 49] confirmed the primacy-recency bias in episodes of unreliability and the contribution of transparency in the form of confidence feedback from the robot.

Work in [24] collected similar periodic measures of trust using brief, periodically presented questionnaires given to participants performing a multi-UAV supervision task to test the effects of priming on trust. These same data were used to fit a model similar to that formalized by [39], which uses decision field theory to address the decision to rely on the automation/robot’s capabilities or to manually intervene, based on the balance between the operator’s self-confidence and her trust in the automation/robot. The model contains parameters characterizing the information conveyed to the operator, inertia in changing beliefs, noise, uncertainty, growth-decay rates for trust and self-confidence, and an inhibitory threshold for shifting between responses. By fitting these parameters to human subject data, the time course of trust (as defined by the model) can be inferred. An additional study of UAV control [38] has also demonstrated good fits for dynamic trust models, with matches within 2.3% for control over teams of UGVs. By predicting the effects of reliability and initial trust on system performance, such models might be used to select appropriate levels of automation or provide feedback to human operators. In another study involving assisted driving [123], the researchers use both objective measures (car position, velocity, acceleration, and lane marking scanners) and subjective measures (gaze detection and foot location) to train a mathematical model to recognize and diagnose over-reliance on the automation. The authors show that their models can be applied to other domains outside automation-assisted driving as well.
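
The following is a minimal sketch of the kind of dynamic reliance model described above: a preference state drifts with the momentary difference between trust and self-confidence plus noise, and the control mode switches when the state crosses an inhibitory threshold. It is not the fitted model from [24, 39]; the update rule, parameter names, and values are illustrative assumptions.

```python
# Decision-field-theory-style sketch of the reliance decision: the preference
# state accumulates (trust - self-confidence) evidence with inertia and noise,
# and the operator switches between automatic and manual control when the
# state crosses an inhibitory threshold. All parameters are illustrative.

import random

def simulate(steps=200, gain=0.15, inertia=0.9, noise_sd=0.05, threshold=0.25,
             trust0=0.5, confidence0=0.7, trust_growth=0.01):
    trust, confidence = trust0, confidence0
    preference = 0.0          # > 0 favors automation, < 0 favors manual control
    mode = "manual"
    for _ in range(steps):
        trust = min(1.0, trust + trust_growth)   # e.g., reliable automation builds trust
        drift = gain * (trust - confidence)      # momentary evidence for automation
        preference = inertia * preference + drift + random.gauss(0.0, noise_sd)
        if preference > threshold:
            mode = "auto"
        elif preference < -threshold:
            mode = "manual"
    return mode, preference, trust

print(simulate())
```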

Willingness to rely on the automation has been found in the automation literature to correlate with the user’s self-confidence in their ability to perform the task [57]. It has been found that users who are more confident in their own ability to perform the task will take over control from the automation more frequently if they perceive that the automation does not perform well. However, as robots are envisioned to be deployed in increasingly risky situations, it may be the case that a user (e.g. a soldier) will elect to use a robot for bomb disposal irrespective of their confidence in performing the task. Another factor that has considerably influenced use of automation is user workload. It has been found in the literature that users exhibit over-reliance [7, 40] on the automation in high workload conditions.

Experiments in [104] show that people over-trusted a robot in fire emergency evacuation scenarios conducted with a real robot in a campus building, even though the robot had been shown to be defective in various ways (e.g. taking a circuitous rather than an efficient route when guiding the participant to a waiting room before the emergency started). The experimenters hypothesized that participants who had experienced an interaction with a defective robot would decrease their trust (compared to those who interacted with a non-defective robot), and also that participants’ self-reported trust would correlate with their behavior (i.e. their decision to follow the robot or not). The results showed that, in general, participants did not rate the inefficient robot as a bad guide, and even those who rated it poorly still followed it during the emergency. In other words, trust rating and trust behavior were not correlated. Interestingly, participants in a previous study by the same researchers with similar emergency evacuation scenarios in simulation [103] behaved differently: they rated less reliable simulated robots as less trustworthy and were less prone to follow them in the evacuation. The results from the simulation studies of emergency evacuation, namely a positive correlation between participants’ trust assessments and behavior, are similar to results in low-risk studies [30]. These contradictory results point strongly to the need for more research to refine the robot, operator, and task-context variables and relations that would lead to correct trust calibration, and to better understand the relationship between trust and performance in human-robot interaction.

One important issue is how an agent forms trust in agents it has not encountered before. One approach from the literature in multiagent systems (MAS) investigates how trust forms in ad hoc groups, where agents that have not interacted before come together for short periods of time to interact and achieve a goal, after which they disband. In such scenarios, a decision tree model based on both trust and other factors (such as incentives and reputation) can be used [13]. A significant problem in such systems, known as the cold start problem, is that when such groups form there is little to no prior information on which to base trust assessments. In other words, how does an agent choose whom to trust and interact with when it has no information on any agent? Recent work has focused on bootstrapping such trust assessments by using stereotypes [12]. Similar to stereotypes used in interpersonal interactions among humans, stereotypes in MAS are quick judgements based on easily observable features of the other agent. However, whereas human judgements are often clouded by cultural or societal biases, stereotypes in MAS can be constructed in a way that maximizes their accuracy. Further work by the researchers in [14] shows how stereotypes in MAS can be spread throughout the group to improve others’ trust assessments, and can be used by agents to detect unwanted biases received from others in the group. In [15], the authors show how this work can be used by organizations to create decision models based on trust assessments from stereotypes and other historical information about the other agents.
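
A minimal sketch of the stereotype idea, under assumptions: initial trust in a never-before-seen agent is estimated from the observed trustworthiness of previously encountered agents with similar observable features. The features, data, and similarity-weighted estimator are illustrative and are not the method of [12].

```python
# Illustrative sketch of stereotype-based trust bootstrapping for the cold
# start problem: estimate an initial trust value for an unseen agent from the
# observed trustworthiness of past partners with similar observable features.

def similarity(a, b):
    """Simple feature similarity in [0, 1]: fraction of matching features."""
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)

def bootstrap_trust(newcomer_features, experience, default=0.5):
    """Similarity-weighted average of observed trustworthiness; neutral prior as fallback."""
    weighted = [(similarity(newcomer_features, feats), trust)
                for feats, trust in experience]
    total = sum(w for w, _ in weighted)
    if total == 0:
        return default
    return sum(w * trust for w, trust in weighted) / total

# Observable features of past partners (e.g., role, vendor, sensor class),
# paired with the trustworthiness observed after interacting with them.
experience = [(("scout", "vendorA", "lidar"), 0.85),
              (("scout", "vendorB", "camera"), 0.40),
              (("hauler", "vendorA", "lidar"), 0.70)]

print(bootstrap_trust(("scout", "vendorA", "camera"), experience))
```

Direct experience with the newcomer would then update this prior, with the stereotype only filling the gap before any interaction has occurred.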

6.1.1 Towards Co-adaptive Trust

In other studies [129, 130], Xu and Dudek create an online trust model that allows a robot or other automation to assess the operator’s trust in the system while a mission is ongoing, using the results of the model to adjust the automation’s behavior on the fly to adapt to the estimated trust level. Their end goal is trust-seeking adaptive robots, which actively monitor and adapt to the estimated trust of the user to allow for greater efficiency in human-robot interactions. Importantly, the authors combined common objective, yet indirect, measures of trust (such as the quantity and type of user interaction) with a subjective measure in the form of periodic queries to the operator about their current degree of trust.
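
The sketch below illustrates the general shape of such an online, trust-adaptive loop: an estimate is nudged down by operator interventions, nudged up by quiet intervals, occasionally anchored by an explicit rating, and then mapped to how much the automation does on its own. The update rules, thresholds, and class name are illustrative assumptions rather than the model of [129, 130].

```python
# Illustrative sketch of an online trust estimator driving behavior adaptation.
# Signal choices, update rules, and thresholds are assumptions for illustration.

class OnlineTrustEstimator:
    def __init__(self, initial=0.5, decay=0.1, anchor_weight=0.5):
        self.trust = initial
        self.decay = decay                  # how strongly interventions lower the estimate
        self.anchor_weight = anchor_weight  # weight given to explicit self-reports

    def on_intervention(self):
        """Operator overrode the automation: treat as evidence of lowered trust."""
        self.trust = max(0.0, self.trust - self.decay)

    def on_quiet_interval(self):
        """No interventions for a while: treat as weak evidence of trust."""
        self.trust = min(1.0, self.trust + self.decay / 4)

    def on_subjective_report(self, reported):
        """Blend in an occasional explicit trust rating in [0, 1]."""
        self.trust = (1 - self.anchor_weight) * self.trust + self.anchor_weight * reported

    def autonomy_level(self):
        """Map estimated trust to how much the automation does on its own."""
        return "high" if self.trust > 0.7 else "medium" if self.trust > 0.4 else "low"

est = OnlineTrustEstimator()
est.on_intervention(); est.on_intervention()
est.on_subjective_report(0.6)
print(est.trust, est.autonomy_level())
```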

In an attempt to develop an objective and direct measure of the trust the human has in the system, the authors of [36] use a mathematical decision model to estimate trust by determining the expected value of the decisions a trusting operator would make, and then evaluate the user’s decisions in relation to this model. In other words, if the operator deviates substantially from the expected value of their decisions, they are said to be less trusting, and vice versa. In another study [108], the authors use two-way trust to adjust the relative contribution of the human input to that of the autonomous controller, as well as the haptic feedback provided to the human operator. They model both robot-to-human and human-to-robot trust, with lower values of the former triggering higher levels of force feedback, and lower values of the latter triggering a higher degree of human control over that of the autonomous robot controller. The authors demonstrate that their model can significantly improve performance and lower the workload of operators when compared to previous models and to manual control alone.
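
As a minimal sketch of this two-way arbitration idea, the functions below blend human and controller commands according to human-to-robot trust and scale haptic feedback according to robot-to-human trust. The linear mappings, gains, and names are illustrative assumptions, not the controller of [108].

```python
# Illustrative sketch of two-way trust arbitration: human-to-robot trust scales
# how much control authority the robot receives, and robot-to-human trust
# scales the haptic feedback gain used to guide the operator.

def blend_command(human_cmd, robot_cmd, human_to_robot_trust):
    """Low trust in the robot -> human input dominates; high trust -> robot input dominates."""
    alpha = max(0.0, min(1.0, human_to_robot_trust))
    return (1 - alpha) * human_cmd + alpha * robot_cmd

def haptic_gain(robot_to_human_trust, max_gain=5.0):
    """The less the robot trusts the human's current input, the stronger the force feedback."""
    beta = max(0.0, min(1.0, robot_to_human_trust))
    return max_gain * (1 - beta)

# Example: operator steers left (-1.0), controller prefers slight right (+0.3).
print(blend_command(-1.0, 0.3, human_to_robot_trust=0.4))  # mostly human authority
print(haptic_gain(robot_to_human_trust=0.2))               # strong corrective feedback
```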

These studies help introduce the idea of “inverse trust”. The inverse trust problem is defined in [34] as determining how “an autonomous agent can modify its behavior in an attempt to increase the trust a human operator will have in it”. In this paper, the authors base this measure largely on the number of times the automation is interrupted by a human operator, and use this to evaluate the autonomous agent’s assessment of change in the operator’s trust level. Instead of determining an absolute numerical value of trust, the authors choose to have the automation estimate changes in the human’s trust level. This is followed in [35] by studies in simulation validating their inverse trust model.

6.2 Social-Based Interactions: Robots Influencing Humans

Social robotics deals with humans and robots interacting in ways humans typically interact with each other. In most of these studies, the robot—either by its appearance or its behavior—influences the human’s beliefs about trustworthiness, feelings of companionship, comfort, feelings of connectedness with the robot, or behavior (such as whether the human discloses secrets to the robot or follows the robot’s recommendations). This is distinct from the prior work discussed, such as ATR, where a robot’s actions are not typically meant to influence the feelings or behaviors of its operator. These social human-robot interactions contain affective elements that are closer to human-human interactions. There is a body of literature that has looked at how robot characteristics affect ratings of animacy and other human-like characteristics, as well as trust in the robot, without explicitly naming a performance or social goal for the robot. It has been consistently found in the social robotics literature that people tend to judge robot characteristics, such as reliability and intelligence, based on robot appearance. For example, people ascribe human qualities to robots that look more anthropomorphic. Another result of people’s tendency to anthropomorphize robots is that they tend to ascribe animacy and intent to robots. This finding has been reported not just for robots [109] but even for simple moving shapes [44, 48]. Kiesler and Goetz [52] found that people rated more anthropomorphic looking robots as more reliable. Castro-Gonzalez et al. [18] investigated how the combination of movement characteristics with body appearance can influence people’s attributions of animacy, likeability, trustworthiness, and unpleasantness. They found that naturalistic motion was judged to be more animate, but only if the robot had a human appearance. Moreover, naturalistic motion improved ratings of likeability irrespective of the robot’s appearance. More interestingly, a robot with human-like appearance was rated as more disturbing when its movements were more naturalistic. Participants also ascribe personality traits to robots based on appearance. For instance, in [118], robots with spider legs were rated as more aggressive, whereas robots with arms were rated as more intelligent than those without arms. Physical appearance is not the only attribute that influences human judgment about robot intelligence and knowledge. For example, [59] found that robots that spoke a particular language (e.g. Chinese) were rated higher in their purported knowledge of Chinese landmarks than robots that spoke English.

Robot appearance, physical presence [3], and matched speech [94] are likely to engender trust in the robot. The authors of [124] found that empathetic language and physical expression elicit higher trust, and [62] found that highly expressive pedagogical interfaces engender more trust. A recent meta-analysis by Hancock et al. [43] found that robot characteristics such as reliability, behaviors, and transparency influenced people’s ratings of trust in a robot. Besides these characteristics, the researchers in [43] also found that anthropomorphic qualities had a strong influence on ratings of trust, and that trust in robots is influenced by experience with the robot.

Martelaro et al. [73] found that a more expressive robot encourages participants to disclose information about themselves. However, counter to their hypotheses, disclosure of private information by the robot, a behavior that the authors labelled as making the robot more vulnerable, did not engender increased willingness to disclose on the part of the participants. In a qualitative study on the willingness of children to disclose secrets, Bethel et al. [5] found that preschool children were as likely to share a secret with an adult as with a humanoid robot.

An interesting study is reported in [111], where the authors studied how errors made by the robot affect the human’s assessment of its trustworthiness and the human’s willingness to subsequently comply with the robot’s (somewhat unusual) requests. Participants interacted with a home companion robot in an experimental room that served as the pretend home of the robot’s human owner, under two conditions: (a) the robot did not make mistakes, and (b) the robot made mistakes. The study found that the participants’ assessment of robot reliability and trustworthiness decreased significantly in the faulty robot condition; nevertheless, the participants were not substantially influenced in their decisions to comply with the robot’s unusual requests. It was further found that the nature of the request (revocable versus irrevocable) influenced the participants’ decisions on compliance. Interestingly, the results in this study also show that participants attributed less anthropomorphism to the robot when it made errors, which contradicts the findings of an earlier study by the same authors [110].

7 Conclusions and Recommendations

In this chapter we briefly reviewed the role of trust in human-robot interaction. We draw several conclusions, the first of which is that there is no accepted definition of what “trust” is in the context of trust in automation. Furthermore, when participants are asked to answer questions about their level of trust in a robot or software automation, they are almost never given a definition of trust, leaving open the possibility that different participants interpret the question of trust differently. From a review of the literature, it is apparent that robots still have not achieved full autonomy, and still lack the attributes that would allow them to be considered true teammates by their human counterparts. This is especially true because the literature is largely limited to simulation, or to specific, scripted interactions in the real world. Indeed, in [42], the authors argue that without human-like mental models and a sense of agency, robots will never be considered equal teammates within a mixed human-robot team. They argue that the reason researchers include robots in common HRI tasks is their ability to complement the skills of humans. Yet, because of the tendency of humans to anthropomorphize things they interact with, the controlled interactions researchers develop for HRI studies are more characteristic of human-human interactions. While this tendency to anthropomorphize can be helpful in some cases, it poses a serious risk if it naturally gives humans a higher degree of trust in robots than is warranted. The question of how a robot’s performance influences anthropomorphization is also unclear, with recent studies finding conflicting results [110, 111].

There is general agreement that the notion of trust involves vulnerability of the trustor to the trustee in circumstances of risk and uncertainty. In the performance-based literature, where the human is relying on the robot to do the whole task or part of the task, it is clear that the participant is vulnerable to the robot with respect to the participant’s performance in the experimental task. In most of the studies in social robotics, however, where the robot is trying to get the participant to do something (e.g. comply with instructions to throw away someone else’s mail, or disclose a secret), it is not clear whether the participant is truly vulnerable to the robot (unless we regard breaking a social convention as making oneself vulnerable), or is merely enjoying the novelty of robots or feeling pressure to follow the experimental procedure. Therefore, the notion measured in those studies may not have been trust in the sense that the term is defined in the trust literature. For example, in [104], where participants complied with a robot guide even though they ranked its reliability lower after an error, the researchers acknowledge several confounding factors (e.g., participants did not have enough time to deliberate). The findings on human tendencies to ascribe reliability, trustworthiness, intelligence, and other positive characteristics to robots may impede correct estimation of a robot’s abilities and prevent correct trust calibration. This is particularly dangerous since the use of robots is envisioned to increase, especially in high-risk situations such as emergency response and the military.

This overview enables us to provide several recommendations for how future work investigating trust in human-autonomy and human-robot interaction should proceed. First, it would be useful for each study to state clearly what autonomy and what teammate characteristics the robot in the study possesses. Second, it would be useful for each study to define the notion of trust the authors espouse, as well as which dimensions of that notion they believe are relevant to the task being investigated. The experimenters should also try to understand, via surveys or other means, what definition of trust the participants have in their heads. Experimenters could even give their definition of trust to the participants and see how this affects the participants’ answers.

Another recommendation is that, given the novelty of robots for the majority of the population, along with the well-known finding from in-group/out-group studies that people can be influenced very easily and for trivial reasons, it would be useful to perform longer-duration studies to investigate the transient nature of trust assessments. In other words, how does trust in automation change as a function of how familiar users are with the automation and how much they interact with it over time? One could imagine someone unfamiliar with automation or robots placing a high degree of trust in them due to prior beliefs (which may be incorrect). Over time, this implicit trust may fade as they work more with the automation and realize that it is not perfect.

Furthermore, we believe there is a need for more research on multi-robot systems, as well as on robots helping human teams. As the number of robots increases and hardware and operation costs decrease, it is inevitable that humans will be interacting with larger numbers of robots to perform increasingly complex tasks. Trust in larger groups and collectives of robots is no doubt influenced by factors, particularly those concerning the robots’ collective behaviors, beyond those at play in single-robot control. Similarly, there is little work investigating how multiple humans working together with robots affect each other’s trust levels, and this needs to be addressed.

Finally, it would be helpful for the community to define a set of task categories of human-robot interaction whose characteristics involve specific, differing dimensions of trust. Such characteristics could include the degree of risk to the trustor, the degree of uncertainty, the degree of potential gain, and whether the trustor’s vulnerability is to the reliability of the robot or to the robot’s integrity or benevolence. Other studies should expand on the notion of co-adaptive trust to improve how robots assess their own behavior and how it affects the trust their operator places in them. As communication is key to any collaborative interaction, research should focus not merely on how the human sees the robot, but also on how the robot sees the human.