1 Introduction

The use of autonomous and robotic agents to assist humans is expanding rapidly. Robots have been developed for various application domains such as urban search and rescue (USAR) [1], manufacturing [2], and healthcare [3]. For example, an in-home robot can be used to improve the coordination of patient communication with care providers and to assist the patient with medication management. In order for the human–robot team to interact effectively, the human should establish appropriate trust toward the robotic agents [4,5,6,7].

Humans’ trust in automation, or more recently trust in autonomy, has received extensive research attention in the past three decades. This diverse interest has generated multiple definitions of trust as a belief, attitude, and behavior [6]. In this paper, we use the definition by Lee and See [8]: Trust is the “attitude that an agent will help achieve an individual’s goals in situations characterized by uncertainty and vulnerability” (see [6, 9, 10] for discussions on the definitions of trust and see [11,12,13] for examples using Lee and See’s definition).

Despite the research effort, existing research faces two major challenges. First, the majority of prior literature adopted a “snapshot” view of trust and typically evaluated trust at one point, usually at the end of an experiment (Fig. 1). The static snapshot approach, however, does not fully acknowledge that trust is a dynamic variable that can strengthen and decline over time. With few exceptions (e.g., [14,15,16,17,18,19,20,21]), we have little understanding of a human agent’s trust formation and evolution process after repeated interactions with a robotic agent [7, 20]. Second, trust in automation is usually measured by questionnaires administered to the human agents. This approach introduces operational challenges, especially in high-workload and time-critical settings, because the human agent may not have the resources or time to report trust periodically.

To address the two challenges, we develop a computational model that does not depend on repeatedly querying the human interacting with a robotic agent. Instead, this model infers a human’s trust at any time by analyzing the robotic agent’s performance history. We model a human agent’s temporal trust using a Beta distribution and learn its parameters using Bayesian inference based on the history of the robotic agent’s performance. This formulation adheres to three major properties of trust dynamics found in prior empirical studies: trust at the present moment is significantly influenced by trust at the previous moment [15]; negative experiences with autonomy usually have a greater influence on trust than positive experiences [18, 21]; and a human agent’s trust stabilizes over repeated interactions with the same autonomous agent [20]. We test the proposed method using an existing dataset involving 39 human participants interacting with four drones in a simulated surveillance task. Results demonstrate that the proposed model significantly outperforms existing models [15, 19]. Beyond its superior predictive performance, the proposed model has two further advantages over existing trust inference models. Because its formulation reflects how human agents actually form and adjust trust, it offers high model explicability and generalizability. Additionally, the proposed model does not rely on human agents’ physiological data, which can be difficult to collect.

Fig. 1 The static “snapshot” view versus the dynamic view of trust. At time t, both agents have the same level of trust; however, their trust dynamics are different

The remainder of the article is organized as follows. Section 2 reviews the relevant literature on trust dynamics and prediction models. Section 3 formulates the trust prediction problem. Section 4 describes the proposed model and Sect. 5 describes the dataset. Section 6 presents and discusses the prediction results of the proposed model. Section 7 concludes the study and suggests future research.

2 Background

As described in Sect. 1, the majority of prior literature on trust in automation adopted a “snapshot” view and typically evaluated trust at the end of an experiment. More than two dozen factors have been identified that influence one’s “snapshot” trust in automation. These factors can be broadly categorized into three groups: individual (i.e., truster) factors, system (i.e., trustee) factors, and environmental factors. Examples of individual factors are the human’s culture and age [22,23,24]. System factors include the robot’s reliability [25, 26], level of autonomy [27], adaptivity [28] and transparency [29], the timing and magnitude of robotic errors [9, 30], and the robot’s physical presence [31], vulnerability [32], and anthropomorphism [33]. Environmental factors include multi-tasking requirements [34] and task emergency [35].

This “snapshot” view, however, does not acknowledge that trust can strengthen or decay due to moment-to-moment interaction with autonomy. Only a few studies have emphasized the dynamic nature of trust and examined how trust changes as a human agent interacts with a robotic agent over time [14,15,16,17,18,19,20,21].

Manzey et al. [18] noted two feedback loops in the human agent’s trust adjustment process, namely a positive and a negative feedback loop. The positive loop is triggered by experiencing automation success, and the negative loop by experiencing automation failure. The negative feedback loop exerts a stronger influence on trust adjustment than the positive feedback loop [15, 21]. In addition, Lee and Moray [15] proposed an auto-regressive moving average vector (ARMAV) time series model of trust, which calculated trust at the present moment t as a function of trust at the previous moment \(t-1\), task performance, and the occurrence of automation failures. Yang et al. [20] examined how trust in automation evolved as an average human agent gained experience interacting with robotic agents. Their results showed that the average human agent’s trust in automation stabilized over repeated interactions, and that this process can be modeled as a first-order linear time-invariant dynamic system. These studies provide valuable insight into the trust dynamics of an average human agent.

More recent studies used a data-driven approach to model trust dynamics. In this approach, trust is considered information internal to the human that is not directly observable but can be inferred from other observable information [19]. For example, Hu et al. [14] proposed to predict trust as a dichotomy, i.e., trust/distrust, by analyzing the human agent’s electroencephalography (EEG) and galvanic skin response (GSR) data. Similarly, Lu and Sarter [17] proposed the use of eye-tracking metrics, including fixation duration and scan path length, to infer the human’s real-time trust. Their follow-up study [16] used three machine learning techniques, logistic regression, k-nearest neighbors (kNN), and random forest, to classify the human’s real-time trust level. Instead of using physiological signals, Xu and Dudek [19] built an online probabilistic trust inference model based on the dynamic Bayesian network framework, treating the human agent’s trust as a hidden variable estimated by analyzing the autonomy’s performance and the human agent’s behavior. In [19] the trust dynamics of each individual human agent were modeled.

The above-mentioned data-driven methods provided insights into how to predict a human’s real-time trust by analyzing other observable information. However, they were subject to two limitations. First, some of them depend on physiological sensors such as EEG and eye-tracking devices, which can be intrusive or sensitive to noise [14, 16, 17]. Second, as none of the existing models fully considered the empirical results showing how human agents actually adjust their trust, the resulting models could be limited in model explicability and generalizability.

3 Problem Statement

In the present study, we propose a personalized trust prediction model that predicts each individual human agent’s trust dynamics as s/he interacts with a robotic agent over time. In this section, we formulate the trust prediction problem mathematically.

We consider a scenario where a robotic agent is going to work with a new human agent on a series of tasks. We denote the robot’s performance on the ith task as \(p_i\in \{0,1\}\), where \(p_i=1\) indicates a success and \(p_i=0\) indicates a failure. The reliability of the robotic agent, \(r\in [0,1]\), is defined as the probability that the robot succeeds in a task. We assume that the robot maintains the same reliability while working with the new human agent. At time i, after observing the robot’s performance \(p_i\), the new human agent updates his or her current trust \(t_i \in [0,1]\) according to the robot’s performance history \(\{p_1,p_2,\ldots ,p_i\}\), where \(t_i=1\) means the new human agent completely trusts the robotic agent and \(t_i=0\) means s/he does not trust it at all.

We assume that, before working with the new human agent, the robotic agent has worked with k other (old) human agents, each of whom finished n tasks. Each old human agent reported his or her trust at the end of each task, so his or her trust history \(T^j=\{t^j_1,\ldots ,t^j_n\}\) and the robot’s performance history \(P^j=\{p^j_1,\ldots ,p^j_n\}\) are fully available, \(j=1,2,\ldots ,k\).

Before performing any real task, the new human agent receives a training session consisting of l tasks (see Fig. 2). In the training session, the new human agent reports his or her trust after every task. After the training session, the new agent performs the real tasks, during which s/he may report his or her trust in the robotic agent at his or her own discretion.

Fig. 2 The new human agent receives a training session before performing the real tasks. During the training, the agent reports his or her trust after every interaction. When performing the real tasks, the agent occasionally reports his or her trust at his or her own discretion

The objective of the trust prediction problem is to predict the new human agent’s trust \(t_m\) after s/he finishes the mth task, based on the robot’s performance history \(P_m=\{p_i|i=1,2,3,\ldots ,m\}\), the trust history during the training session \(T_m^t=\{t_i|i=1,2,3,\ldots ,l\}\), the occasionally reported trust \(T_m^o=\{t_i|i\in O_m\}\) with \(O_m\subset \{l+1,l+2,\ldots ,m-1\}\), and the data \(T^j\) and \(P^j\) from the k old agents, \(j=1,2,\ldots ,k\). Here, \(O_m\) is an indicator set: \(O_{m}=O_{m-1}\cup \{m-1\}\) if the user chooses to report his or her trust after the \((m-1)\)th task; otherwise \(O_{m}=O_{m-1}\). We define the trust history at time m as \(T_m=T_m^t\cup T_m^o\).
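To make the notation concrete, the following is a minimal sketch in Python of how the performance history \(P_m\), the trust reports \(T_m^t\) and \(T_m^o\), and the indicator set \(O_m\) could be represented; the variable and function names are illustrative and not part of the formulation.

```python
# A minimal sketch of the notation in plain Python; names are illustrative, not from the paper.
performance_history = []   # P_m = {p_1, ..., p_m}, each p_i in {0, 1}
trust_reports = {}         # all reported trust values, keyed by task index i
l = 10                     # length of the training session (illustrative value)

def record_task(p_i, t_i=None):
    """Record the robot's performance on the next task and, optionally, a trust report."""
    performance_history.append(p_i)
    i = len(performance_history)
    if t_i is not None:
        trust_reports[i] = t_i   # i <= l contributes to T^t; i > l contributes to O_m and T^o

# After m tasks, the training and occasional reports are recovered as:
T_t = {i: t for i, t in trust_reports.items() if i <= l}   # T_m^t
T_o = {i: t for i, t in trust_reports.items() if i > l}    # T_m^o; its indices form O_m
```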

This formulation applies to any interaction scenario wherein the human and the robotic agent interact repeatedly and the human can observe the robotic agent’s task performance over time. For example, a newly purchased in-home robot reminds an elderly adult of an upcoming monthly medical check-up. The elderly adult does not double-check the calendar and shows up at the doctor’s office, only to find out that the appointment has been re-scheduled by the doctor but the robot has not updated the calendar due to an error. Such a situation is considered a task failure of the robotic agent and will most likely lead to a trust decrement. After the elderly adult interacts with the robot many times, s/he will probably have a more calibrated trust toward the robot and may not blindly follow the robot’s monthly reminders anymore.

4 Personalized Trust Prediction Model

In this section, we summarize the major empirical findings on trust dynamics. After that, we introduce the proposed Beta distribution model and explain how it adheres to the empirical findings. Finally, we describe the Bayesian framework we use to infer the model’s parameters.

4.1 Major Empirical Findings on Trust Dynamics

Based on the studies reviewed in Sect. 2, a desired trust prediction model should adhere to three properties:

1. Trust at the present moment i is significantly influenced by trust at the previous moment \(i-1\) [15].

2. Negative experiences with autonomy usually have a greater influence on trust than positive experiences [18, 21].

3. A human agent’s trust will stabilize over repeated interactions with the same autonomous agent [20].

4.2 Personalized Trust Prediction Model

We use the Beta distribution to model a human agent’s temporal trust for three reasons. First, the Beta distribution is defined on the interval [0, 1], consistent with the bounded self-reported trust, whereas other distributions, e.g., the Gaussian distribution, can be unbounded. Second, the Beta distribution fits the exploration-exploitation scheme and can be useful in a reinforcement learning scenario. Third, and most importantly, the Beta distribution formulation adheres to the three properties in Sect. 4.1.

We use Bayesian inference to calculate the parameters defining the Beta distribution because it provides better explainability than other machine learning methods, such as neural networks. Also, Bayesian inference provides a belief rather than a point estimate of trust, and thus incorporates uncertainty. Moreover, Bayesian inference can leverage a population-wise prior when calculating the model parameters for each individual human agent.

After the robotic agent completes the ith task, the human agent’s temporal trust \(t_{i}\) follows a Beta distribution:

$$\begin{aligned} t_{i} \sim Beta( \alpha _{i} ,\beta _{i}) \end{aligned}$$
(1)

The predicted trust \({\hat{t}}_i\) is calculated as the mean of \(t_i\):

$$\begin{aligned} {\hat{t}}_i=E(t_i)=\frac{\alpha _{i}}{\alpha _{i} +\beta _{i}} \end{aligned}$$
(2)

\(\alpha _{i}\) and \(\beta _{i}\) are updated by

$$\begin{aligned} \begin{array}{l} \alpha _{i} ={\left\{ \begin{array}{ll} \alpha _{i-1} +w^{s}, &{} \text {if} \ p_{i} =1\\ \alpha _{i-1}, &{} \text {if} \ p_{i} =0 \end{array}\right. }\\ \\ \beta _{i} ={\left\{ \begin{array}{ll} \beta _{i-1} +w^{f}, &{} \text {if} \ p_{i} =0\\ \beta _{i-1}, &{} \text {if} \ p_{i} =1 \end{array}\right. } \end{array} \end{aligned}$$
(3)

where \(p_i\) is the performance of the robot on the ith task, \(\alpha _i\) and \(\beta _i\) are the parameters of the Beta distribution, and \(w^{s}\) and \(w^{f}\) are the gains due to the human agent’s positive and negative experiences with the robotic agent, respectively. In other words, a success of the robot increases \(\alpha _i\) by \(w^s\) and a failure of the robot increases \(\beta _i\) by \(w^f\). The superscripts s and f stand for success and failure, respectively.
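For concreteness, the following is a minimal Python sketch of the update rule in Eq. (3) and the prediction in Eq. (2); the parameter values are illustrative (the weights match those used in Fig. 3), not values estimated from the dataset.

```python
# A minimal sketch of the trust update in Eqs. (1)-(3); parameter values are illustrative.
def update(alpha, beta, p_i, w_s, w_f):
    """Update the Beta parameters after observing the robot's performance p_i in {0, 1}."""
    if p_i == 1:          # success: increase alpha by w^s
        alpha += w_s
    else:                 # failure: increase beta by w^f
        beta += w_f
    return alpha, beta

def predict_trust(alpha, beta):
    """Predicted trust is the mean of Beta(alpha, beta), Eq. (2)."""
    return alpha / (alpha + beta)

# Example with the weights used in Fig. 3 (w^f = 50, w^s = 20) and a balanced prior:
alpha, beta = 100.0, 100.0
t_hat = predict_trust(alpha, beta)                              # 0.5
gain = predict_trust(*update(alpha, beta, 1, 20, 50)) - t_hat   # trust gain after a success
loss = t_hat - predict_trust(*update(alpha, beta, 0, 20, 50))   # trust loss after a failure
print(gain, loss)   # the loss (0.1) exceeds the gain (~0.045), illustrating the negativity bias
```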

Next, we explain how the model adheres to the three properties of trust dynamics. First, it is clear from Eq. (3) that the present trust is influenced by the previous trust, which satisfies the first property. Second, we calculate the difference between the trust increment caused by the robotic agent’s success and the trust decrement caused by the robotic agent’s failure at time i:

$$\begin{aligned} \begin{aligned}&({\hat{t}}_{i} |_{p_{i} =1} -{\hat{t}}_{i-1}) -({\hat{t}}_{i-1} -{\hat{t}}_{i} |_{p_{i} =0})\\&\quad = \frac{1}{D}\left( \frac{w^{s} \beta _{i-1}}{D+w^{s}} -\frac{w^{f} \alpha _{i-1}}{D+w^{f}}\right) \end{aligned} \end{aligned}$$
(4)

where \(D=\alpha _{i-1} +\beta _{i-1}\).

If \(\alpha _{i-1}\) and \(\beta _{i-1}\) are close, Eq. (4) indicates that the robotic agent’s failure will lead to a greater trust change than its success when \(w^f>w^s\). More precisely, the robotic agent’s failures have a greater impact when \(\frac{\alpha }{\beta } >\frac{w^{s} D+w^{s} w^{f}}{w^{f} D+w^{s} w^{f}}\). An example is shown in Fig. 3: within the white region, the robotic agent’s failure leads to a larger trust change. In Sect. 6.1 we show that \(w^f>w^s\) holds for most human agents, so the second property is satisfied when the values of \(w^s\) and \(w^f\) are appropriately chosen.

Fig. 3 In the white region, the robot agent’s failure would have a greater impact on trust than the robot agent’s success. Here we set \(w^f=50\) and \(w^s=20\)

We assume the robot has a constant reliability r. After n tasks, the robot has succeeded in \(n^s\) tasks and failed in \(n^f\) tasks, where \(n^s+n^f=n\). Then

$$\begin{aligned} t_n \sim Beta(\alpha _0+n^sw^s,\beta _0+n^fw^f) \end{aligned}$$
(5)

When \(n\rightarrow \infty \), \(t_n\) converges to a point mass distribution centered at

$$\begin{aligned} \frac{\alpha _0+n^sw^s}{\alpha _0+\beta _0+n^fw^f+n^sw^s}=\frac{rw^s}{rw^s+(1-r)w^f} \end{aligned}$$
(6)

which is a constant, meaning that trust stabilizes with repeated interactions. Therefore, the proposed model satisfies the three properties of trust dynamics.
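As a sanity check of the convergence property in Eqs. (5) and (6), the short simulation below rolls the update forward under a fixed reliability r; the reliability, weights, and prior used here are illustrative values, not quantities from the experiment.

```python
# A small simulation sketch of the convergence property in Eqs. (5)-(6);
# reliability and weights are illustrative, not values from the experiment.
import random

def simulate(r=0.8, w_s=20.0, w_f=50.0, alpha0=10.0, beta0=5.0, n=10_000, seed=0):
    random.seed(seed)
    alpha, beta = alpha0, beta0
    for _ in range(n):
        if random.random() < r:    # robot succeeds with probability r
            alpha += w_s
        else:                      # robot fails with probability 1 - r
            beta += w_f
    return alpha / (alpha + beta)

limit = (0.8 * 20.0) / (0.8 * 20.0 + 0.2 * 50.0)   # r*w^s / (r*w^s + (1-r)*w^f) ≈ 0.615
print(simulate(), limit)   # the simulated mean approaches the analytical limit as n grows
```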

To infer the model’s parameters after the mth trial, given the robot’s performance history \(P_m=\{p_1,p_2,\ldots ,p_m\}\), we determine trust \(T_m=\{t_1,t_2,\ldots ,t_m\}\) by the parameter set

$$\begin{aligned} \displaystyle \theta =\left\{ \alpha _{0} ,\beta _{0} ,w^{s},w^{f}\right\} \end{aligned}$$
(7)

Personalizing the trust model for the new human agent means finding the best \(\theta \) for him or her. Here, we use maximum a posteriori (MAP) estimation to estimate \(\theta \), i.e., we maximize the posterior of \(\theta \) given the robotic agent’s performance \(P_m\), the trust history \(T_m\), and the robot reliability r. First, we have

$$\begin{aligned} \begin{aligned}&P( \theta \ |\ P_m,T_m,r)\\&\quad \propto P( P_m,T_m,r\ |\ \theta ) \ P( \theta )\\&\quad =P( T_m\ |\ \theta ,P_m,r) \ P( P_m,r\ |\ \theta ) \ P( \theta )\\&\quad =P( T_m\ |\ \theta ,P_m) \ P( P_m\ |\ r,\theta ) \ P( r\ |\ \theta ) \ P( \theta )\\&\quad =P( T_m\ |\ \theta ,P_m) \ P( P_m\ |\ r) \ P( r) \ P( \theta )\\&\quad \propto \prod _{t_{i} \in T_m} Beta( t_{i} ;\alpha _{i} ,\beta _{i}) \ \cdot P( \theta ) \end{aligned} \end{aligned}$$
(8)

Then

$$\begin{aligned} {\begin{matrix} \theta &{}= \underset{\theta }{\text {argmax}} \ P( \theta \ |\ P_m,T_m,r)\\ &{}=\underset{\theta }{\text {argmax}} \prod _{t_{i} \in T_m} Beta( t_{i} ;\alpha _{i} ,\beta _{i}) \ \cdot P( \theta )\\ &{}=\underset{\theta }{\text {argmax}}\sum _{t_{i} \in T_m}\log ( Beta( t_{i} ;\alpha _{i} ,\beta _{i})) \ +\log P( \theta ) \end{matrix}} \end{aligned}$$
(9)

The above equation shows that \(\theta \) is updated only when the human agent provides a new trust report. As \(P(\theta )\) is unknown, the model needs to learn \(P(\theta )\) first. This prior can be estimated from the empirical distribution of the parameters of the k old human agents who previously worked with the same robotic agent. The parameter \(\theta _j\) of agent j is estimated via maximum likelihood estimation (MLE):

$$\begin{aligned} \begin{aligned} \theta _j&=\underset{\theta }{\text {argmax}} \ P( T^j\ |\ \theta ,P^j )\\&=\underset{\theta }{\text {argmax}} \ \prod \limits ^{n}_{i=1} Beta( t^j_{i} ;\alpha ^j _{i} ,\beta ^j _{i}) \end{aligned} \end{aligned}$$
(10)

where \(\alpha ^j_i\) and \(\beta ^j_i\), \(i=1,2,\ldots ,n\), are determined by Eq. (3).
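The sketch below illustrates the MLE fit of Eq. (10) for a single old agent using SciPy; the MAP estimate in Eq. (9) for a new agent would add \(\log P(\theta )\) to the same objective. The initial guess, bounds, and toy data are assumptions for illustration only.

```python
# A sketch of fitting theta = (alpha0, beta0, w^s, w^f) by MLE for one old agent (Eq. 10);
# the MAP fit for a new agent (Eq. 9) adds log P(theta) to the objective. Assumes SciPy.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import beta as beta_dist

def neg_log_likelihood(theta, performance, trust_reports):
    """-log P(T | theta, P): roll Eq. (3) forward and score each reported trust t_i."""
    alpha0, beta0, w_s, w_f = theta
    alpha, b, nll = alpha0, beta0, 0.0
    for i, p_i in enumerate(performance, start=1):
        alpha, b = (alpha + w_s, b) if p_i == 1 else (alpha, b + w_f)
        if i in trust_reports:                          # only reported trials contribute
            t_i = np.clip(trust_reports[i], 1e-3, 1 - 1e-3)
            nll -= beta_dist.logpdf(t_i, alpha, b)
    return nll

def fit_theta(performance, trust_reports, theta0=(5.0, 5.0, 10.0, 10.0)):
    bounds = [(1e-3, None)] * 4                         # all parameters are positive
    res = minimize(neg_log_likelihood, theta0, args=(performance, trust_reports),
                   bounds=bounds, method="L-BFGS-B")
    return res.x

# Usage example with toy data (illustrative, not from the dataset):
P = [1, 1, 0, 1, 1, 0, 1, 1, 1, 1]
T = {i: t for i, t in enumerate([0.6, 0.65, 0.5, 0.55, 0.6, 0.45, 0.5, 0.55, 0.6, 0.62], start=1)}
print(fit_theta(P, T))
```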

5 Experiment and Dataset

In this section, we describe the experiment and dataset used to test our proposed model.

We use the dataset from Yang et al. [20]. Participants in the study had an average age of 24.3 years (SD \(=\) 5.0 years), had normal or corrected-to-normal vision, and reported no color vision deficiency.

All participants performed a simulated surveillance task with four drones. Each participant performed two tasks simultaneously (Fig. 4): controlling four drones using a joystick and detecting potential threats in the images captured by the drones. The participant was able to access only one task at any time and had to switch between the controlling and the detection tasks.

Fig. 4 Dual-task environment in the simulation testbed. The two images show displays from the simulation testbed for the tracking (left) and detection (right) tasks, respectively. Participants could access only one of the two displays at a time and could switch between them

The drones were able to detect potential threats and would report ‘danger’ when a threat was detected. Due to environmental noise, the threat detection was imperfect. The system reliability of the drones was set at 70%, 80%, or 90% according to signal detection theory (SDT) [36, 37]. Combining the drones’ detection results with the true states of the world yields four outcomes: hits, misses, false alarms, and correct rejections. As the drones cannot detect threats perfectly, the task involves uncertainty. For this particular experiment, a more contextualized definition of trust is a person’s attitude that the drones will help him or her achieve his or her goal in the surveillance mission.

The participants completed two practice sessions: a 30-trial block of the tracking task alone and an 8-trial block of both the tracking and the detection tasks, during which hits, misses, false alarms, and correct rejections were illustrated. Each participant then completed the experimental block of 100 trials. The experiment lasted approximately 60 min with a 5-minute break at the halfway point. After each trial, participants reported their perceived reliability of the drones, their trust in automation, and their confidence. Each participant received compensation (a $10 base) plus a bonus (up to $5); the compensation scheme, determined from a pilot study, incentivized participants to perform well.

6 Results and Discussion

In the present study, we use data from the 39 participants who received binary detection alerts. We use the participants’ self-reported trust and the drones’ detection performance data, following the problem statement in Sect. 3. To fully exploit the dataset, we use the leave-one-out method to evaluate the proposed model. In each run, we select one participant as the new human agent and treat the remaining 38 participants as the old agents who previously worked with the drones. The trust histories of the old agents and the robotic agent’s performance history are fully available for estimating \(P(\theta )\). For the new human agent, we assume s/he performs l trials during the personalized training session; thereafter, when performing the real tasks, s/he reports his or her trust every q trials. In other words, after the new human agent’s mth trial, where \(m>l\), we predict his or her trust \(t_m\) toward the robotic agent given his or her personalized training trust history \(T_m^t=\{t_i|i=1,2,3,\ldots ,l\}\), the occasionally reported trust feedback \(T_m^o=\{t_i|i=l+q,l+2q,l+3q,\ldots ,i<m\}\), and the data \(T^j\) and \(P^j\) from the old agents.
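As an illustration of this evaluation protocol, the snippet below enumerates the trials on which the new human agent reports trust under a given training duration l and report gap q; the specific values are examples only.

```python
# Illustration of the trust-report schedule in the leave-one-out evaluation:
# the new agent reports trust on every training trial (i <= l) and every q trials thereafter.
# l, q, and the number of trials are example values.
l, q, n_trials = 10, 10, 100
training_reports = list(range(1, l + 1))                 # trials contributing to T^t
occasional_reports = list(range(l + q, n_trials + 1, q)) # trials contributing to T^o / O_m
print(training_reports)    # [1, 2, ..., 10]
print(occasional_reports)  # [20, 30, ..., 100]

# Reports available when predicting trust at an arbitrary trial m:
m = 47
available = [i for i in training_reports + occasional_reports if i < m]
print(available)           # [1, ..., 10, 20, 30, 40]
```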

6.1 Estimation of \(P(\theta )\)

We use Eq. (10) to estimate \(\theta _j\) for each old agent and take the empirical distribution of these estimates as \(P(\theta )\). Due to the small size of the dataset, we assume \(\alpha _{0} ,\beta _{0} ,w^{s} ,w^{f}\) are independent and learn the prior distribution of the four parameters using MLE. Figure 5 shows the empirical distributions of \(\alpha _0,\beta _0,w^s,\) and \(w^f\). Comparing the distributions of \(\alpha _0\) and \(\beta _0\) shows that \(\alpha _0\) has a larger mean than \(\beta _0\), indicating that the participants in the experiment generally had a positive attitude toward the robotic agent. Comparing the distributions of \(w^s\) and \(w^f\) shows that in general \(w^f>w^s\), indicating that for most participants detection failures caused larger trust changes than detection successes.

Fig. 5 Learned distributions of \(w^s,w^f,\alpha _0,\) and \(\beta _0\)

Fig. 6 Trust prediction results for all participants under the leave-one-out setting. X axis: trial number; Y axis: trust value. Blue curve: ground truth; red curve: predicted trust. The number in each plot is the participant’s ID. (Color figure online)

6.2 Prediction Results and Performance Comparisons

Figure 6 shows the prediction results for all 39 participants. The proposed model successfully captures the trust dynamics of many participants.

We compare the proposed model with two existing trust prediction models. We use root mean square error (RMSE) to evaluate the difference between the predicted value and the ground truth. The smaller the RMSE, the more accurate the prediction.

The two models are the online probabilistic trust inference (Optimo) model [19] and the auto-regressive moving average vector (ARMAV) model [15]. We do not compare our model with [14] or [16] because our dataset lacks physiological data. Since the Optimo and ARMAV models use different sets of variables, we modify them so that all three models use only the robot’s performance history and not other behavioral variables (e.g., the human agent’s intervention behaviors [19]).

For each participant h, we calculate his or her RMSE using each prediction model g:

$$\begin{aligned} \text {RMSE}_h^g =\sqrt{\frac{\sum ^{100}_{i=l+1}\left( {t_{i}} -{\hat{t}}_i^g\right) ^2}{100-l}} \end{aligned}$$
(11)

where \({t_{i}}\) is the self-reported trust, \({\hat{t}}_{i}^g\) is the predicted trust calculated using method g (i.e., our proposed model, ARMAV, or Optimo), and l is the length of the personalized training session.

The RMSE for each trust prediction model is calculated as the average over all 39 participants: \(\text {RMSE}^g = \frac{1}{39}\sum _{h=1}^{39}{\text {RMSE}_h^g}\). Table 1 lists the mean and standard deviation of the RMSE values of the three models.
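A minimal sketch of Eq. (11) and the across-participant average is given below; it assumes the self-reported and predicted trust values for each participant are stored as length-100 sequences (the names are illustrative).

```python
# Sketch of Eq. (11) and the across-participant average; inputs are illustrative names.
import numpy as np

def rmse(reported, predicted, l):
    """Eq. (11): RMSE over the post-training trials l+1, ..., 100 for one participant."""
    reported, predicted = np.asarray(reported, float), np.asarray(predicted, float)
    return float(np.sqrt(np.mean((reported[l:] - predicted[l:]) ** 2)))

def mean_rmse(reported_all, predicted_all, l):
    """Average RMSE over all participants for one prediction model."""
    return float(np.mean([rmse(r, p, l) for r, p in zip(reported_all, predicted_all)]))
```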

Table 1 Mean and standard deviation (SD) of the RMSE values of the three models

To compare the performance of the three trust prediction models, we conduct a repeated-measures analysis of variance (ANOVA), followed by pairwise comparisons with Bonferroni adjustments. The omnibus ANOVA reveals a significant difference among the three models (F(2, 76) \(=\) 21.64, \(p <.001\)). Pairwise comparisons reveal that our proposed model significantly outperforms ARMAV with a medium-large effect size (\(t(39) = 3.9\), \(p <.001\), Cohen’s \(d = 0.63\)) and Optimo with a large effect size (\(t(39) = 5.7\), \(p <.001\), Cohen’s \(d = 0.91\)). Figure 7 compares the three models.
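The statistical comparison could be reproduced along the lines of the sketch below, which assumes the per-participant RMSE values are arranged in a long-format table with columns 'participant', 'model', and 'rmse'; the use of statsmodels/SciPy and the paired-samples form of Cohen's d (\(d_z\)) are choices made here for illustration, not necessarily those of the original analysis.

```python
# A sketch of the statistical comparison, assuming per-participant RMSE values are stored
# in a long-format DataFrame with columns 'participant', 'model', 'rmse'. The Cohen's d
# computed here is the paired-samples d_z, one common convention.
import numpy as np
import pandas as pd
from scipy.stats import ttest_rel
from statsmodels.stats.anova import AnovaRM

def compare_models(df: pd.DataFrame):
    # Omnibus repeated-measures ANOVA across the three models
    print(AnovaRM(df, depvar="rmse", subject="participant", within=["model"]).fit())

    # Pairwise comparisons with Bonferroni adjustment (3 comparisons)
    models = list(df["model"].unique())
    for i in range(len(models)):
        for j in range(i + 1, len(models)):
            a = df[df.model == models[i]].sort_values("participant")["rmse"].to_numpy()
            b = df[df.model == models[j]].sort_values("participant")["rmse"].to_numpy()
            t, p = ttest_rel(a, b)
            d = (a - b).mean() / (a - b).std(ddof=1)   # paired Cohen's d_z
            print(models[i], "vs", models[j],
                  f"t={t:.2f}, p_bonf={min(p * 3, 1.0):.4f}, d={d:.2f}")
```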

Fig. 7 Mean and standard error (SE) of RMSE for the three models. Error bars indicate standard errors

The superior performance of our proposed model could be due to two reasons. First, the proposed method captures the nonlinearity of trust dynamics, namely that trust stabilizes over repeated interactions with the same autonomous agent. In other words, the effect of a success or a failure of the robotic agent on trust changes as interaction experience accumulates: while the first task failure of the robotic agent may cause trust to decline substantially, a failure after the human agent has gained more experience may not. In contrast, the ARMAV and Optimo models employ a linear rule for updating the predicted trust. Figure 6 shows that most participants’ trust varies at the start of the experiment and then stabilizes as more trials are completed. Second, although all three models define trust on the bounded interval [0, 1], only our proposed method guarantees that the predicted value stays within this interval; the predicted trust from ARMAV or Optimo must be truncated if it exceeds the boundary.

6.3 Effects of Trust Report Gap and Training Duration

Since \(w^s,w^f,\alpha _0,\) and \(\beta _0\) are learned from the dataset, the only tunable parameters are the trust report gap q and the training duration l (i.e., the number of personalized training trials). Thus, it is necessary to understand the effects of these two parameters on the prediction results of our proposed method.

To examine the effect of varying the trust report gap, we set the training duration \(l=10\) and vary the trust report gap \(q= 2, 5, 10,\) and 25. The average and SD of RMSE across the 39 participants are \(0.059 \pm 0.050\), \(0.064 \pm 0.052 \), \(0.072 \pm 0.053\), and \(0.085 \pm 0.062\), respectively. The effect of using different trust report gaps is further illustrated using the data of one participant. Figure 8a shows that as the trust report gap increases from 2 to 25, the deviation between the ground truth and the predicted values increases accordingly. Since the model parameters are updated when new trust feedback is available, there are “jumps” in the prediction curve whenever the human agent reports his or her trust after the training period. If the trust report gap is too wide, such as 25, the prediction accuracy is substantially degraded. This suggests that the trust report gap should be selected carefully so that prediction accuracy is maintained without excessively disturbing the human agent during real tasks.

Fig. 8 Prediction results with increasing trust report gap q from 2 to 25. Training duration l is fixed at 10

Fig. 9 Prediction results with increasing training duration l from 5 to 40. Trust report gap q is fixed at 10

To examine the effect of using different training durations, we vary the training duration \(l = 5, 10, 20, 40\) while fixing the trust report gap q at 10. The average and SD of RMSE across the 39 participants are \(0.076 \pm 0.079\), \(0.072 \pm 0.053\), \(0.071 \pm 0.051\), and \(0.068 \pm 0.052\), respectively (Fig. 9). This result suggests that the prediction error decreases with a longer personalized training session.

6.4 Three Types of Trust Dynamics

Fig. 10 Three types of trust dynamics over 100 interactions between the human and the autonomous agent. Trust at time i is normalized (\(t_i \in [0,1]\)). Blue curve: the human agent’s reported trust (ground truth); red curve: the predicted trust. (Color figure online)

Detailed investigation of Fig. 6 reveals the existence of different types of trust dynamics. To further investigate them, we perform k-means clustering [38]. We find that while most participants’ trust can be accurately predicted by the proposed method, some participants’ self-reported trust deviates significantly from the predicted values. Moreover, four participants almost always reported very low trust in the experiment (participants 7, 16, 19, 21 in Fig. 6). Therefore, we select the RMSE and the average log trust as the two features for the clustering analysis. RMSE measures how closely a participant’s trust dynamics follow the properties described in Sect. 4.1. Average log trust, defined as \({\sum _{i=1}^{n}\log t_i}/{n}\), separates out the participants who almost always reported zero trust. We normalize the features across participants and determine the number of clusters by the elbow rule [39], a commonly used heuristic for selecting the number of clusters. Figure 10 shows the three types of trust dynamics and Fig. 11 shows the clustering analysis. The first type is the Bayesian rational decision maker, shown in Fig. 10a, whose trust dynamics follow the three properties: trust is dynamic, changes according to the robotic agent’s performance, and stabilizes over repeated interactions. The second is the oscillator, shown in Fig. 10b, whose temporal trust fluctuates significantly. The third is the disbeliever, shown in Fig. 10c, whose trust in the robotic agent is constantly low. The different types of trust dynamics may be related to each human agent’s individual characteristics, such as his or her propensity to trust autonomy [12].
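A sketch of the clustering step is given below, assuming the two features (per-participant RMSE and average log trust) have already been computed and that trust values are clipped away from zero before taking logarithms; scikit-learn and the choice of three clusters via the inertia (elbow) curve are used here for illustration.

```python
# A sketch of the clustering analysis; rmse and mean_log_trust are assumed to be
# length-39 arrays computed as described above (with trust clipped away from 0
# before taking logs, an assumption). The cluster count of 3 reflects the elbow rule.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

def cluster_participants(rmse, mean_log_trust, k_max=8, seed=0):
    X = StandardScaler().fit_transform(np.column_stack([rmse, mean_log_trust]))
    # Elbow rule: inspect how the within-cluster sum of squares (inertia) drops with k
    inertias = [KMeans(n_clusters=k, random_state=seed, n_init=10).fit(X).inertia_
                for k in range(1, k_max + 1)]
    labels = KMeans(n_clusters=3, random_state=seed, n_init=10).fit_predict(X)
    return labels, inertias
```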

Fig. 11 Clustering of participants based on their trust dynamics

7 Conclusion

We proposed a personalized trust prediction model that adheres to three properties characterizing how human agents actually develop trust. Trust was modeled by a Beta distribution with performance-induced parameters. The parametric model learned the prior of the parameters from a training dataset. When predicting the temporal trust of a new human agent, the model estimated the posterior of its parameters based on the interaction history between the human agent and the robotic agent. The model was tested using an existing dataset and significantly outperformed existing models. Beyond its superior predictive performance, the proposed model has two further advantages over existing trust inference models. Because its formulation reflects how human agents actually form and adjust trust, it offers high model explicability and generalizability. Additionally, the proposed model does not depend on the collection of human agents’ physiological information, which can be intrusive and difficult to collect.

The proposed trust model complements subjective measures of trust and can be applied to the design of adaptive robots. Accurately predicting trust in real time is the first step toward designing robotic agents that can adapt to human agents’ trust. For example, if a home companion robot detects an unexpected decline in its owner’s trust, the robot can adopt specific trust recovery strategies to regain the owner’s trust.

The results should be viewed in light of the following limitations: First, the proposed model assumes that the robotic agent’s ability is constant across all interactions. Second, it assumes that the parameters are independent of each other. Third, the proposed model assumes that the robotic agent’s performance is dichotomous and immediately available after a task. Fourth, each participant in the experiment had 100 interaction episodes with the robotic agent within a relatively short period of time. To address these limitations, further research is needed to test whether the proposed method works in situations where a robotic agent learns and improves over time. The independence assumption can be removed once a larger dataset is available. Another promising direction is to examine how the proposed model should be modified for situations wherein the robotic agent’s performance has multiple levels (e.g., extremely good, good, neutral, bad, extremely bad) or the performance results are delayed. Further research is also needed to validate the proposed method with longer interaction episodes and to examine the relationships between participants’ individual characteristics and their trust dynamics.