Iterative learning of human partner’s desired trajectory for proactive human–robot collaboration

Xia, Jingkang; Huang, Deqing; Li, Yanan; Qin, Na

doi:10.1007/s41315-020-00132-5

Iterative learning of human partner’s desired trajectory for proactive human–robot collaboration

Regular Paper
Open access
Published: 05 June 2020

Volume 4, pages 229–242, (2020)
Cite this article

Download PDF

You have full access to this open access article

International Journal of Intelligent Robotics and Applications Aims and scope Submit manuscript

Iterative learning of human partner’s desired trajectory for proactive human–robot collaboration

Download PDF

Jingkang Xia¹,
Deqing Huang¹,
Yanan Li ORCID: orcid.org/0000-0002-1443-2547² &
…
Na Qin¹

6234 Accesses
5 Altmetric
Explore all metrics

Abstract

A period-varying iterative learning control scheme is proposed for a robotic manipulator to learn a target trajectory that is planned by a human partner but unknown to the robot, which is a typical scenario in many applications. The proposed method updates the robot’s reference trajectory in an iterative manner to minimize the interaction force applied by the human. Although a repetitive human–robot collaboration task is considered, the task period is subject to uncertainty introduced by the human. To address this issue, a novel learning mechanism is proposed to achieve the control objective. Theoretical analysis is performed to prove the performance of the learning algorithm and robot controller. Selective simulations and experiments on a robotic arm are carried out to show the effectiveness of the proposed method in human–robot collaboration.

A review of motion planning algorithms for intelligent robots

Article Open access 25 November 2021

A systematic review on cooperative dual-arm manipulators: modeling, planning, control, and vision strategies

Article 14 July 2023

Performance-based Assistance Control for Upper Limb Robotic Mirror Therapy

Article 25 June 2024

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

1 Introduction

Human–robot interaction (HRI) is a rapidly expanding research field with a great need for robotic assistance in various applications such as assembly, teleoperated surgery, heavy load transport and so on [1,2,3,4,5,6]. HRI has been mainly studied in two complementary branches, such as divisible shared control (DSC) that has been discussed in [7,8,9] and interactive shared control (ISC) in [10,11,12]. DSC enables clear division of the human partner’s and the robot’s subtasks so that they can work independently, while ISC allows for flexible intervention to improve the collaborative performance which is the focus of this paper. How to design a simple, effective and safe robotic controller has been a focus in the field of HRI [13], especially in the subfield of physical human–robot interaction (pHRI).

A critical yet typical problem in pHRI is how to obtain the human’s desired trajectory or motion intention, as the knowledge of this information enables a robot to adapt its behaviour according to the human partner and to complete a task cooperatively [14, 15]. In the literature of pHRI, researchers have tried to extract this information from the human partner by utilizing the haptic information, especially the interaction force that can be measured from the force/torque sensor at the interaction point. In [10], neural networks are used to build up a mapping between the human’s desired trajectory and the interaction force. In [11], the robot’s desired trajectory is deformed according to the interaction force, while the human’s desired trajectory is not explicitly estimated. A different approach is adopted in [12], which develops a controller for lifting a beam with human–robot collaboration, using a probabilistic framework based on Gaussian Mixture Models (GMM) and Gaussian Mixture Regression (GMR) to encode force/torque at the end-effector with the demonstrated velocity. After learning the robot could perform the collaborative lifting by conditioning the GMM on the perceived force/torque and generate the velocity response using GMR. The strategy of using interaction force as stimuli is also applied in [16] for teaching multiple solution trajectories by means of a Hidden Markov Model (HMM) and GMR. The perceived force/torque could be displayed to the human partner by using a haptic device, and the human partner demonstrates motion response. In addition, without using a force sensor, [17] proposes a human intent detection method by observing the impact of the physical interaction on the robot’s control effort. From the control point of view, these works use the estimated human information to alter the robot’s controller, but they do not evaluate how it affects the performance of the closed-loop system. This is important as human is involved in the system so the system stability should be ensured at all instances.

Impedance control provides a useful framework for analyzing a pHRI system [18], which is adopted as a basis of many works. In [19,20,21], adaptive control is developed to deal with uncertainty of the parameters in robot dynamics to achieve a target impedance model. In [22], a desired impedance model is learned for a robot through repetitive interaction with its environment. It is important to note that impedance control is useful for not only regulating the interaction between the robot and the environment, but also for establishing a human motor control model. Although describing an individual human’s behavior in physical interaction with an environment is challenging, impedance control is found to be an effective and robust computational model [23, 24]. With a unified model of impedance control for robot control and human motor control, it is natural to use it for skill transfer from the human to the robot. In [25], by measuring the sensory outputs from the robot and the corresponding motor commands generated by the human tutor, a proper mapping between the selected muscle EMG and robot’s end-effector impedance is established. Moreover, the coupled HRI system can be explicitly described within the same framework of impedance control, so it becomes possible to evaluate the system performance with estimation of the human state and adaptation of the robot’s controller. In the literature, a similar idea is found for impedance-based force control [20], where the robot is under impedance control and its environment is treated as a spring model. In this way, the robot-environment system can be analyzed within the same control framework. Differently, in pHRI, the human dynamics cannot be described as a passive system by a spring model. In this paper, a human model with an actively planned trajectory is investigated.

On the other hand, it is found that many pHRI processes are repetitive, so it is a natural idea to introduce iterative learning control (ILC) for the robot to gain skills from the human partner. In the past decades, ILC has been developed for improving performance of systems that carry out repetitive operations [26,27,28]. Its main goal is to generate a control that tracks a given reference or rejects a periodic disturbance [29]. However, in the pHRI problem under study in this paper, the human’s desired trajectory is unknown to the robot so the control objective is different from the traditional ILC. It will be shown that the control problem is formulated in a way that is different from conventional trajectory tracking, although ILC is employed.

Furthermore, due to human variance during repetitive pHRI, the same period cannot be guaranteed for every iteration. Therefore, this paper develops a period-varying ILC that ensures learning convergence in the presence of human uncertainty. In the literature of ILC, some recent works [30,31,32,33,34,35] investigate the problem of period-varying ILC which lay the foundation of our controller. In particular, [33] proposes an ILC scheme with an iteration-average operator for discrete-time linear systems where the iteration period could be randomly varying. [34] introduces an ILC scheme based on an iteratively moving average operator for nonlinear dynamic systems with randomly varying iterative periods. Different from the above two methods, [35] applies the adaptive design framework for continuous-time parametric nonlinear systems with varying iteration periods. Similar to other ILC methods as discussed above, these three methods also require a predefined desired output trajectory. In this paper, the human’s unknown desired trajectory will be first learned by using a human limb model and by formulating a zero force regulation problem. Then, it will be tracked by the robot so it can reduce human’s control effort.

Through the above discussion, this paper has the following contributions:

1.
Based on a simplified human limb model, estimation of the human partner’s desired trajectory is formulated as a force regulation problem.
2.
Different from traditional ILC which has been mainly used for trajectory tracking, we develop ILC to achieve force regulation and estimation of human trajectory, enabling the robot to proactively cooperate with the human through repetitive pHRI.
3.
A period-varying ILC is developed to deal with the problem of interactions with uncertain time durations due to human variance, so it can be applied to a wider range of pHRI applications.

The rest of the paper is arranged as follows: Sect. 2 introduces the main problem formulation, including the system description and the novel control objective. In Sect. 3, a non-standard ILC is designed to solve the problem of varying iterative periods and theoretical analysis is conducted to show the learning convergence and system stability. Furthermore, the proposed algorithm is testified by simulations and experiments in Sect. 4. Finally, conclusions are drawn in Sect. 5.

2 Problem formulation

2.1 System description

In this paper, we consider a typical human–robot collaboration scenario that is composed of a robotic manipulator and its human partner. The human partner guides the robotic manipulator along a trajectory to complete a task, e.g. painting a surface with a certain curvature. This trajectory is determined by the human but cannot be preprogrammed to the robotic manipulator. The interaction force between the human hand and the robotic manipulator is measured by a force sensor at the end-effector of the robotic manipulator.

The dynamics model of the n-degrees-of-freedom (n-DOF) robotic manipulator is given as below:

$$\begin{aligned} H(\theta )\ddot{\theta }+C(\theta ,{\dot{\theta }}){\dot{\theta }}+G(\theta )=u-J^{T}F_{h} \end{aligned}$$

(1)

where $\theta ,{\dot{\theta }},\ddot{\theta }\in {\mathbb {R}}^{n}$ represent the joint position, velocity and acceleration vectors, respectively; $H(\theta ) \in {\mathbb {R}}^{n\times n}$ is the symmetric positive definite mass matrix; $C(\theta , {\dot{\theta }}){\dot{\theta }} \in {\mathbb {R}}^{n}$, $G(\theta ) \in {\mathbb {R}}^{n}$ denote the torques due to centrifugal and gravity; $u \in {\mathbb {R}}^{n}$ is the joint torque applied by robotic actuators; $J\in {\mathbb {R}}^{n\times n}$ is the Jacobian matrix that relates the joint velocity to the linear and angular velocities of the end-effector; and $F_{h} \in {\mathbb {R}}^{n}$ is the robot/human interaction force that can be measured by a force sensor.

It is often desirable to describe the manipulator dynamics in the Cartesian space for the convenience of analysis, when the interaction takes place at the end-effector. The robot dynamics in the Cartesian space are given by

$$\begin{aligned} H_{x}{\ddot{X}} + C_{x}{\dot{X}}+ G_{x} = J^{-T}u - F_{h} \end{aligned}$$

(2)

where $X\in {\mathbb {R}}^{n}$ represents the position of the end effector. The velocity and acceleration satisfy

$$\begin{aligned} {\dot{X}}= & \,{} J {\dot{\theta }} \nonumber \\ {\ddot{X}}= & \, {} J\ddot{\theta } +{\dot{J}}{\dot{\theta }} \end{aligned}$$

(3)

Then, we obtain

$$\begin{aligned} H_{x}= & \, {} J^{-T}HJ^{-1} \nonumber \\ C_{x}= & \, {} J^{-T}(C-HJ^{-1}{\dot{J}})J^{-1} \nonumber \\ G_{x}= & \, {} J^{-T}G \end{aligned}$$

(4)

According to [20], dynamics of a robot’s environment can be described by a spring model. This corresponds to the equilibrium point control model that describes human motor control [36], which is given below:

$$\begin{aligned} F_{h} = K_{h}(X- X_{h} ), \end{aligned}$$

(5)

where $K_{h}$ is the $n\times n$ equivalent stiffness matrix and $X_{h}$ is the desired trajectory of the human. With Eqs. (2) and (5), we have a full description of the system dynamics and are ready to discuss the control objective.

2.2 Control objective

In traditional control tasks, the reference trajectory of the robot is available for the control design. However, in this paper, it should be designed according to the desired trajectory that is determined by the human partner and unknown to the robot. In the rest of the paper, we propose an approach to estimate the human partner’s desired trajectory and control the robot to track it.

For this purpose, impedance control is adopted with the following target impedance model:

$$\begin{aligned} M_{d}({\ddot{X}}-{\ddot{X}}_{r})+C_{d}({\dot{X}}-{\dot{X}}_{r})+K_{d}(X-X_{r})=-F_{h} \end{aligned}$$

(6)

where $X_{r}$ is the robot’s reference trajectory. $M_{d}$, $C_{d}$ and $G_{d}$ are inertia, damping and stiffness matrices that are specified by the control designer, respectively.

To make Eq. (2) identical to Eq. (6), the robot’s joint torque is given by $u=J^{T}F$ with

$$\begin{aligned} F= & \, {} C_{x}{\dot{X}} + G_{x}+F_{h}-H_{x}M_{d}^{-1}\{C_{d}({\dot{X}}-{\dot{X}}_{r}) \nonumber \\&+K_{d}(X-X_{r})+F_{h}+{\ddot{X}}_{r}\} \end{aligned}$$

(7)

To estimate the human partner’s desired trajectory, we observe from Eq. (5) that the interaction force $F_h$ becomes 0 when the human’s desired trajectory is perfectly tracked, i.e. $X=X_h$. This indicates that we can minimize $F_h$ so that the robot’s trajectory X gets close to $X_h$ as much as possible. Based on this idea, the control objective becomes

$$\begin{aligned} F_h=0 \end{aligned}$$

(8)

The next question is how to design the reference trajectory $X_r$ to achieve the above control objective. As discussed in the Introduction, this reference trajectory can be learned through repetitive interaction between the robot and the human. In particular, we will adopt ILC to iteratively update $X_r$ so that the control objective $F_h=0$ is eventually achieved.

The control flow is illustrated in Fig. 1, where i represents the number of iterations. The “Impedance Control” part shows that the robot dynamics (2) are controlled by the robot controller (7), which has inputs $X_{r_i}$ and $F_{h_i}$ measured by a force sensor in the “Measurement and Storage” part. In the “Update Reference Position” part, (5) is an internal human model that relates the position information and interaction force $F_{h_i}$, which is also used to derive the dynamics model (12). Then, the developed ILC (26) uses the dynamics model (12) to update the reference trajectory $X_{r_{i+1}}$ and store it for the use in the next iteration, which will be detailed in the following section (Fig. 1).

3 Controller design

3.1 System model transformation

For clarity of presentation, in the subsequent analysis we consider each Cartesian variable independently and replace $M_{d}$, $C_{d}$, $K_{d}$, E, $F_{h}$, $F_{r}$ by $m_{d}$, $c_{d}$, $k_{d}$, e, $f_{h}$, $f_{r}$, respectively. In this way, the target impedance model in a single direction becomes

$$\begin{aligned} m_{d}(\ddot{x}-\ddot{x}_{r})+c_{d}({\dot{x}}-{\dot{x}}_{r})+k_{d}(x-x_{r})=-f_{h} \end{aligned}$$

(9)

By using the Laplace transform of (9), we have

$$\begin{aligned} (m_{d}s^{2}+c_{d}s+k_{d})(x(s)-x_{r}(s))=-f_{h}(s) \end{aligned}$$

(10)

Using the environment model (5), x(s) can be described by $x(s)=\frac{f_{h}(s)}{k_{h}}+x_{h}(s)$. Substituting for x(s) in (10) yields

$$\begin{aligned} f_{h}(s)+ \frac{k_h f_h(s)}{m_{d}s^{2}+c_{d}s+k_{d}}=k_h(x_{r}(s)-x_{h}(s)) \end{aligned}$$

(11)

The above equation combines the robot’s impedance model and the human motor control model. By defining $f'_{h}(s)=f_{h}(s)/(m_{d}s^{2}+c_{d}s+k_{d})$, we have

$$\begin{aligned} m_{d}s^2 f'_{h}(s)+c_{d}sf'_{h}(s)+(k_{d}+k_{h})f'_{h}(s)=k_{h}(x_{r}(s)-x_{h}(s)) \end{aligned}$$

(12)

$f'_{h}$ can be simply interpreted as a filtered signal of $f_{h}$. Then, we convert the system in Eq. (12) from the frequency domain back to the time domain, and obtain the time-domain state-space form as follows:

$$\begin{aligned} {\dot{\varphi }}(t)= & \, {} A\varphi (t) + B e(t),~~ \varphi (t)= \left[ \begin{array}{cccc} f'_{h}(t) \\ {\dot{f}}'_{h}(t)\\ \end{array} \right] ,\nonumber \\ A= & \, {} \left[ \begin{array}{cccc} 0 &{} 1 \\ \frac{-(k_{d}+k_{h})}{m_{d}} &{} \frac{-c_{d}}{m_{d}} \\ \end{array} \right] , ~~B = \left[ \begin{array}{cccc} 0 \\ \frac{k_h}{m_d}\\ \end{array} \right] , \end{aligned}$$

(13)

where $e(t)=x_{r}(t)-x_{h}(t)$. In order to apply ILC, this state-space form is further written for each iteration as below:

$$\begin{aligned} {\left\{ \begin{array}{ll} {\dot{\varphi }}_{i}(t) = A \varphi _{i}(t) + B e_i(t) \\ \rho _{i}(t)=C \varphi _{i}(t) \end{array}\right. } \end{aligned}$$

(14)

where $t\in [0,T_{i}]$ and $i\in [1,\mathrm {N}]$ is the iteration index. $\mathrm {N}$ is the total number of iterations and $T_{i}$ is the trial length of the ith iteration. $C=[1,1]$ and $e_i(t)$, $\varphi _{i}(t)$, $\rho _{i}(t)$ denote input, state and output of the system (14), respectively. Note that $T_i$ may be different in a different iteration, as the human introduces uncertainty to the interaction. Therefore, traditional ILC with a fixed period cannot apply to this case. In the following subsection, we will develop an ILC with varying periods.

Moreover, it is important to emphasize that the control objective is to make the interaction force $F_h$ be zero, indicating that the human partner’s desired trajectory will be tracked. This is denoted as $\lim \nolimits _{i\rightarrow \infty }f_{h_i}(t) = 0$ and $\lim \nolimits _{i\rightarrow \infty }e_i(t) = 0$ in the framework of ILC. Thus, for the system (12), $e_{d}(t) = 0$ and $\rho _{d}(t) = 0$ are the desired input and output, respectively.

3.2 ILC design

The main problem about the ILC design for the system in Eq. (14) is that the period $T_{i}$ is uncertain and usually different from a fixed period $T_{d}$. In order to address it, we first give some definitions and assumptions that will be used in the later design.

Definition 1

${\mathbf {P}}[\cdot ]$, ${\mathbf {E}}[\cdot ]$ and ${\mathbf {F}}[\cdot ]$ denote the occurrence probability of the event, the expectation of a random variable and the probability distribution function.

Assumption 1

Assume that$T_{i}$is a stochastic variable, and its probability distribution function is

$$\begin{aligned} {\mathbf {F}}_{T_{i}}(t)={\mathbf {P}}[T_i\le t]={\left\{ \begin{array}{ll}0, &{}t \in [0,T_{min}] \\ p(t),&{}t\in [T_{min},T_{max}] \\ 1,&{}t> T_{max} \end{array}\right. } \end{aligned}$$

(15)

where $0 \le p(t) \le 1$ is a known continuous function. $T_{min}$ and $T_{max}$ are the minimum and maximum of $T_i$ that can be obtained through multiple iterative experiments.

Assumption 2

$\varphi _{i}(0)= \varphi _{d}(0)$, which indicates that the initial state is the same at every iteration.

If $T_{i}=T_{d}$ and initial conditions are consistent, an open loop D-type ILC for the system (14) is usually designed as

$$\begin{aligned} e_{i+1}(t)= e_{i}(t)+\omega {\dot{\mu }}_{i}(t) \end{aligned}$$

(16)

where $\mu _{i}(t) = \rho _{i}(t) -\rho _{d}(t)=f'_{h_{i}}(t)+\dot{f'}_{h_{i}}(t)$ is the tracking error on the interval $[0,T_{d}]$, ${\dot{\mu }}_{i}(t)$ is the derivative of $\mu _{i}(t)$ on $[0,T_{d}]$, and $\omega$ is the learning rate. Considering that $x_{h}(t)$ does not change with the iteration, i.e. $x_{h_{i}}(t)=x_{h_{i+1}}(t)$, Eq. (16) can be rewritten as

$$\begin{aligned} x_{r_{i+1}}(t) = x_{r_{i}}(t) + \omega {\dot{\mu }}_{i}(t) \end{aligned}$$

(17)

which gives an updating law for ${x}_{r}$. When the actual period $T_{i} \ne T_{d}$, which is the case of the non-standard ILC, the updating law (17) has to be redesigned.

Since $T_{i}$ changes with iteration, two cases need to be discussed: $T_{i} < T_{d}$ and $T_{i} \ge T_{d}$. On the one hand, $T_{i} < T_{d}$ means that the ith iteration ends before the desired period, so both the output $\rho _{i}(t)$ and $\mu _{i}(t)$ on the time interval $[T_{i}, T_{d}]$ are missing, which thus cannot be used for learning. On the other hand, $T_{i} \ge T_{d}$ means that the ith iteration is still running after the desired period, the data from $(T_{i},T_{d}]$ are redundant and useless. In order to deal with those missing data or redundant data in different cases, a sequence of stochastic variables is defined to satisfy Binomial distribution, so that a newly defined force error $\mu _{i}(t)$ is introduced to improve the traditional ILC.

The main improvement process has the following four steps:

1.
Define the stochastic variable $\gamma _{i}(t)$ to satisfy the Binomial distribution.

In this paper, we define $\gamma _{i}(t)$, $t\in [0,T_{max}]$ as a stochastic variable satisfying Binomial distribution and taking binary values 1 and 0. $\gamma _{i}(t)=1$ indicates that the control process of (14) can arrive at time t in the ith iteration. The probability of $\gamma _{i}(t)=1$ is q(t), where $0 < q(t) \le 1$ is a distribution probability function of time t. $\gamma _{i}(t)=0$ indicates that the control process of (14) cannot arrive at time t, which occurs with a distribution probability of $1 - q(t)$.
2.
Compute the probability ${\mathbf {P}}[\gamma _{i}(t)=1]$ and the expectation ${\mathbf {E}}[\gamma _{i}(t)]$.

Because of $T_{min} \le T_{i} \le T_{max}$ the control process (14) will not stop before $T_{min}$. $\gamma _{i}(t)=1$ is the inevitable event when $t \in [0,T_{min}]$, which implies that $q(t)=1$, $\forall t \in [0,T_{min})$. For the scenario of $t \in [T_{min},T_{max}]$, the event $\gamma _{i}(t)=1$ represents that the control process (14) stops at or after time t, which means that $T_{i} \ge t$. ${\mathbf {P}}[T_{i}=t]=0$ is applied, thus
$$\begin{aligned} {\mathbf {P}}[\gamma _{i}(t)=1]= & \, {} {\mathbf {P}}[T_{i}\ge t] \nonumber \\= & \, {} 1 - {\mathbf {P}}[T_{i}\le t]\nonumber \\= & \, {} 1 - F_{T_{i}}(t) \end{aligned}$$
(18)
Combining Eqs. (15) and (18), we can obtain that
$$\begin{aligned} q(t) = 1- F_{T_{i}}={\left\{ \begin{array}{ll}1, &{}t \in [0,T_{min}] \\ 1-p(t),&{}t\in [T_{min},T_{max}] \\ 0,&{}t> T_{max} \end{array}\right. } \end{aligned}$$
(19)
Because $\gamma _{i}(t)$ satisfies the Binomial distribution and taking binary values 1 and 0, the expectation ${\mathbf {E}}[\gamma _{i}(t)] = q(t)\cdot 1 + (1-q(t))\cdot 0=q(t)$.
3.
Define a modified force error.

Denote the modified tracking error as
$$\begin{aligned} \mu ^{*}_{i}(t)= \gamma _{i}(t)\mu _{i}(t), ~~ t\in [0,T_{d}] \end{aligned}$$
(20)
Taking derivative on both sides of the equation leads to
$$\begin{aligned} {\dot{\mu }}^{*}_{i}(t)= \gamma _{i}(t){\dot{\mu }}_{i}(t), ~~ t\in [0,T_{d}] \end{aligned}$$
(21)
If $T_{i} < T_{d}$, Eq. (21) can be rewritten to
$$\begin{aligned} {\dot{\mu }}^{*}_{i}(t)= {\left\{ \begin{array}{ll} {\dot{\mu }}_{i}(t), ~~t\in [0,T_{i}] \\ 0, ~~~~~~t\in (T_{i},T_{d}] \end{array}\right. } \end{aligned}$$
(22)
If $T_{i} \ge T_{d}$, Eq. (21) can be rewritten to
$$\begin{aligned} {\dot{\mu }}^{*}_{i}(t)= {\dot{\mu }}_{i}(t),~~t\in [0,T_{d}] \end{aligned}$$
(23)
4.
Design a new ILC updating law.

In order to compensate for the missing data and reduce the error of single learning, an iteratively moving average operator is introduced, as below
$$\begin{aligned} \Phi \{e_{i}(t)\}= \frac{1}{m+1} \sum _{j=0}^m e_{i-j}(t), \end{aligned}$$
(24)
for a sequence $e_{i-m}(t), ~ e_{i-m+1}(t), ~ e_{i-m+2}(t)\ldots , ~ e_{i}(t)$ with $m \ge 1$, which includes only the recent $m + 1$ iterations that could provide more accurate control information for learning process. The ILC updating law is given as follows:
$$\begin{aligned} e_{i+1}(t)= \Phi \{e_{i}(t) \} +\sum _{j=0}^m \beta _{j}{\dot{\mu }}^{*}_{i-j}(t), ~~t\in [0,T_{d}] \end{aligned}$$
(25)
where the learning rates $\beta _{j}\in {\mathbf {R}}$, $j=0,1,2,3 \ldots ,m$ and $e_{i}(t)=0, i<0$. Considering Eq. (17), we develop the following updating law for $x_{r}$:
$$\begin{aligned} x_{r_{i+1}}(t) = \Phi \{x_{r_{i}}(t)\}+\sum _{j=0}^m \beta _{j}{\dot{\mu }}^{*}_{i-j}(t), ~~t\in [0,T_{d}] \end{aligned}$$
(26)

With the above design, the main result of this paper is presented in the following theorem and its proof is in the Appendix.

Theorem 1

For the system (14) and the ILC scheme (26), we choose the appropriate learning rates$\beta _{j}$, $j = 0,1,2,3\ldots ,m$so that for any constant$0 \le \epsilon < 1$,

$$\begin{aligned} \sum _{j=0}^m \alpha _{j}\le \epsilon \end{aligned}$$

(27)

where $\alpha _{j}=\sup \nolimits _{t\in [0,T_{d}]}\left\{ |\frac{1}{m+1}- \beta _{j}CB|q(t)+ \frac{1-q(t)}{m+1}\right\}$, then the tracking error$\mu _{i}(t)$, $t\in [0,T_{d}]$will converge to zero iteratively.

Remark 1

While the probability distribution of $T_{i}$ is unknown, it could be estimated based on preliminary experiments. Thus, the probability distribution function $F_{T_{i}} (t)$ in Assumption 1 is known and q(t) is available for controller design that can be calculated by Eq. (19).

4 Simulations

4.1 Simulation I

In this section, we consider a single joint application scenario with an exoskeleton robot for gait tracking through iterative learning. Related works have been done for lower-limb exoskeleton for gait tracking and rehabilitation, such as [37, 38]. As shown in Fig. 2a, in the simulation we suppose that the interaction force could be measured by the force sensor mounted on the exoskeleton.

The desired joint trajectory of the user’s gait is set as $q_{h}(t)=\frac{\pi t}{20} -\frac{\pi }{2}$, $t\in [0,10]$ s and $T_{d}=10$ s, indicating that the rotational joint is moving at a constant speed. The parameters of the desired impedance model are set as: $m_d=0.1$ kg, $c_d=25$ N/(m/s) and $k_d=20$ N/m. The human leg’s stiffness is set as $k_h=18$ N/m. In the Cartesian space, we take into account the change of the endpoint position in the X direction. The link length is 0.5 m, so according to the mapping between joint angles and endpoint position, we can obtain that the desired endpoint position with time is $x_h=-0.5cos(\frac{\pi t}{20})$.

The reference trajectory before learning is $x_{r_{0}}(t)=(-0.5,0)$ m, $t\in [0,10]$ and the initial position is $(-0.5,0)$ m for all iteration trails. The learning period of each iteration will change because of the user’s influence, so the periods of the 4th, 5th, 6th, 7th, 8th and 10th are different from $T_d$. And we assume that the period obeys a normal distribution on the time range [6, 10] s with a mean of 8 and a standard deviation of 0.2. As is shown in Fig. 2b–d, the end of each iteration is marked with a dot. In order to obtain better simulation results, we take the first three results as the iteratively moving average operator’s elements, so m is set as 3 and $\beta _0=\beta _1=\beta _2=\beta _3=0.25$.

The tracking performance of the trajectory and joint angle are shown in Fig. 2b, c. It can be seen from the two figures that the actual position and joint angle trajectory (the solid lines in different colors) track the desired trajectory (the curve shown by the dotted red line) rapidly and precisely. In addition, the interaction force converges to zero after 20 learning iterations which can be observed from Fig. 2d. These results show that the whole learning process is also a skill transfer process. When the robot learns the user’s gait, it becomes easier for the user to cooperate with the exoskeleton robot.

4.2 Simulation II

Different from simulation I, in this section, a 2-DOF manipulator is used to verify the proposed period-varying ILC algorithm. A typical trajectory learning process is considered in this simulation which can emulate an application of object manipulation, as is shown in Fig. 3a. The robot moves in the $X-Y$ plane with an initial position (0, 0)m. With the guidance of a human partner, the robot gradually learns to track the desired trajectory that is determined by the human partner. It is assumed that the interaction force is only exerted in the Y direction and the robot’s speed in the X direction is 0.1 m/s. The desired trajectory in the Cartesian space is $x_{h}=\frac{sin0.2 \pi t}{2}+\frac{sin0.1 \pi t}{2}m$, $t\in [0,10]$ and thus $T_d=10$ s is the desired period. The parameters of the desired impedance model are set as: $m_d=0.5$ , $c_d=28N/_{(m/s)}$ and $k_d=26$ N/m. The human arm’s stiffness is set as $k_h=20$ .

Without loss of generality, the input of the initial iteration is $x_{r_0}(t)=[0,0]$, $t\in [0,10]$ s. In this simulation, we assume that the iteration period obeys the Gaussian distribution with a mean of 10 and a standard deviation of 0.2. We further set $m=3$ and $\beta _0=\beta _1=\beta _2=\beta _3=0.2$. The algorithm runs for 20 iterations.

The tracking performance of the trajectory is shown in Fig. 3b, where the lengths of the 4th, 5th, 6th, 8th and 10th periods are different from the desired one. It is obvious that the robot can gradually reach the human’s desired trajectory $x_h$. The interaction forces at the 4th, 5th, 6th, 8th, 10th, 15th and 20th are shown in Fig. 3c. It is observed that the interaction force is reduced to zero in the 20th iteration, so the robot fully learns the desired trajectory and it can eventually complete the task alone.

In addition, we change the value of $k_h$ to represent the stiffness of different human users, varying from 10 to 28 N/m, i.e. $k_{h_{user1}}=10$ N/m, $k_{h_{user2}}=12$ N/m $\ldots k_{h_{user10}}=28$ N/m. According to the above task and parameters, simulations are designed to show the tracking performance of ten different users. Furthermore, we introduce traditional impedance control with zero stiffness for comparison. In order to facilitate comparison of the results, the values of inertia and damping parameters are the same as in the proposed ILC. For the proposed ILC, we record the different users’ total iteration numbers, maximum errors and maximum interaction forces after convergence. For the impedance control, the maximum errors and maximum interaction forces are recorded.

The tracking performance and interaction force of impedance control with different users are shown in Fig. 4a, b where different colors represent different users, while it can be found that different users’ tracking trajectories almost overlap. With the proposed method, the number of iterations is different for each user, which decreases as $k_h$ increases in an appropriate range as shown in Fig. 5a. From Fig. 5b, c, it can be found that the proposed ILC achieves smaller maximum error and interaction force compared with impedance control for different users. Thus, the proposed method improves the tracking accuracy and dramatically reduces the interaction force so less human effort is required.

Simulation I and II demonstrate that the proposed method for tracking repeated trajectories is feasible. However, parameters of the desired impedance model and learning rates of the ILC are different, as the choice of parameters will affect the accuracy of tracking and the speed of convergence. Based on the desired impedance parameters in reference [22], we adjust them appropriately and find that they should not be chosen too large, otherwise the system will become unstable. For iterative learning rates, they must satisfy the condition in Theorem 1. It is worth noting that large learning rates $\beta _i$ lead to faster convergence but larger tracking errors, whereas small learning rates will make the iteration number increase. The influence of the impedance parameters and the iterative learning rates have been extensively studied in the literature [22, 33, 34], respectively, so this paper does not discuss it further.

5 Experiment verification

In this section, to verify the validity of the proposed algorithm we carry out an experiment on the Sawyer robot, which is a 7-DOF manipulator equipped with a motor encoder and a torque sensor in each joint. As is shown in Fig. 6, a human user guides the robot manipulator to track a circular trajectory according to his intent but this trajectory is unknown to the robot. This experiment setup emulates an industrial application where the robot learns a motion from the human partner through repetitive collaboration. In the experiment, joints 1 and 2 are used which make the experimental platform similar to that in the simulation in Fig. 3a.

The experiment process is shown in Fig. 6, with a circular trajectory that starts from A(0.26, 0) m and has a diameter of 0.4 m. We set the desired period as $T_d=20$ s that is estimated by preliminary experiments. In the X-axis direction, we set

$$\begin{aligned} x_r={\left\{ \begin{array}{ll}\frac{0.8}{T_d}t+0.26, &{}0\le t < \frac{T_d}{2} \\ 0.66-\frac{0.8}{T_d}t,&{}\frac{T_d}{2}\le t \le T_i \end{array}\right. } \end{aligned}$$

(28)

where $T_i$ is the period of each iteration determined by the human user. In the Y-axis direction, there is a vertical interaction force on the robot’s end-effecter. One key issue of the experiment is to measure the interaction force, since there is no force sensor that can directly measure the force at the end-effector. Therefore, we convert the joint torque that can be measured by the torque sensor at each joint to obtain the approximate interaction force. The parameters used in the experiment are summarized in Table 1. Because the physical meaning of each parameter is introduced above, we do not repeat them here.

Table 1 Experiment parameters

Full size table

Table 2 Learning periods (s)

Full size table

During the process of the experiment, due to continuous change of human–robot interaction force and the interference of friction in each joint, the learning period $T_i$ changes accordingly. We record the learning periods of typical iterations in Table 2. As is shown in Fig. 7a, we present the change of the interaction force over the iterations. It is worth noting that due to the force measurement noise, the measured data must be smoothed through a filter. And then we take the interaction force into the designed ILC algorithm to constantly update the robot’s reference trajectory. In order to control the motors, we convert the end-effector positions to joint angles through inverse kinematics. To further illustrate the effectiveness of the proposed algorithm, we also fit the filtered force data to get better tracking performance as is shown in Fig. 7b. Finally, the position change of the end-effector in the Y-direction and the trajectory in the vertical plane are given in Fig. 7d, f.

From Fig. 7c, e, it can be seen that the robotic manipulator accomplishes the trajectory tracking task under the effect of the actual interaction force, and the maximum error in the 15th iteration is less than 1 cm. In addition, we get a better tracking performance through the fitted interaction force as in Fig. 7d, f. Correspondingly, the interaction force gradually decreases to 0 and the trajectory gradually converges. These results clearly demonstrate that the robot is able to track unknown human user’s desired trajectory and reduce human force, in presence of time-varying periods.

The proposed method is based on impedance control where the interaction force is modulated by refining the robot’s reference trajectory. However, traditional impedance control generates a virtual reference trajectory but its real reference trajectory does not change. In comparison, the proposed ILC updates the robot’s real reference trajectory and enables the robot to proactively move to the human partner’s desired trajectory with less effort.

6 Conclusions

Unlike iterative learning control in conventional applications such as motion control, the reference trajectory cannot be programmed for the robot in the coupled system of human–robot interaction. Therefore, we introduce a simplified interaction force model to estimate the human partner’s desired trajectory and apply the filtered interaction force to update the robot’s reference trajectory. Stability of the non-standard iterative learning control system has been shown to be guaranteed by rigorous analysis, where the condition of a fixed period is relaxed for the iterative learning control design. The validity of the proposed method has been verified through two simulations of gait tracking and trajectory learning. A further experiment of trajectory learning on a physical robot proves its feasibility. These results show that the proposed method has a potential to reduce human effort in human–robot collaboration and transfer human skills to robots, which can be applied to applications such as heavy load transport in automotive industry and compliant machining.

References

Sheridan, T.B.: Human–robot interaction: status and challenges. Human Fact. 4, 58 (2016)
Google Scholar
Bauer, A., Wollherr, D., Buss, M.: Human-robot collaboration: a survey. Int. J. Humanoid Rob. 5(1), 47–66 (2008)
Article Google Scholar
Tsarouchi, P., Makris, S., Chryssolouris, G.: Human-robot interaction review and challenges on task planning and programming. Int. J. Comput. Integr. Manuf. 29(8), 916–931 (2016)
Article Google Scholar
Brown, J.D., O’Brien, C.E., Leung, S.C., Dumon, K.R., Lee, D.I., Kuchenbecker, K.J.: Using contact forces and robot arm accelerations to automatically rate surgeon skill at peg transfer. IEEE Trans. Biomed. Eng. 64(9), 2263–2275 (2017)
Article Google Scholar
Jakopin, B., Mihelj, M., Munih, M.: An unobtrusive measurement method for assessing physiological response in physical human–robot interaction. IEEE Trans. Human Mach. Syst. 47(4), 474–485 (2017)
Article Google Scholar
Yu, H., Huang, S., Chen, G., Pan, Y., Zhao, G.: Human-robot interaction control of rehabilitation robots with series elastic actuators. IEEE Trans. Rob. 31(5), 1089–1100 (2015)
Article Google Scholar
Wang, W., Li, R., Chen, Y., Diekel, Z.Max, Jia, Y.: Facilitating human-robot collaborative tasks by teaching–learning-collaboration from human demonstrations. IEEE Trans. Autom. Sci. Eng. 16(2), 640–653 (2018)
Article Google Scholar
Zhang, Z., Wang, W., Chen, Y., Jia, Y., Peng, G.: Prediction of human actions in assembly process by a spatial–temporal end-to-end learning model. SAE Technical Paper, no. 2019-01-0509 (2019)
Chen, Y., Wang, W., Zhang, Z., Kroi, V., Jia, Y.: Modeling and learning of object placing tasks from human demonstrations in smart manufacturing. SAE Technical Paper, no. 2019-01-0700 (2019)
Li, Y., Ge, S.S.: Human–robot collaboration based on motion intention estimation. IEEE/ASME Trans. Mechatron. 19(3), 1007–1014 (2013)
Article Google Scholar
Losey, D., O’Malley, M.: Trajectory deformations from physical human–robot interaction. IEEE Trans. Rob. 34(1), 126–138 (2017)
Article Google Scholar
Evrard, P., Gribovskaya, E., Calinon, S., Billard, A., Kheddar, A.: Teaching physical collaborative tasks: object-lifting case study with a humanoid. In: IEEE-RAS International Conference on Humanoid Robots (2010)
Villani, V., Pini, F., Leali, F., Secchi, C.: Survey on human–robot collaboration in industrial settings: safety, intuitive interfaces and applications. Mechatronics (2018)
Rozo, L., Calinon, S., Caldwell, D.G., Jimenez, P., Torras, C.: Learning physical collaborative robot behaviors from human demonstrations. IEEE Trans. Rob. 32(3), 513–527 (2016)
Article Google Scholar
Cherubini, A., Passama, R., Crosnier, e a: Collaborative manufacturing with physical human–robot interaction. Robot. Comput. Integrat. Manuf. 40, 1–13 (2016)
Article Google Scholar
Rozo, L., Jimenez, P., Torras, C.: Robot learning from demonstration of force-based tasks with multiple solution trajectories. In: International Conference on Advanced Robotics (2011)
Erden, M.S., Tomiyama, T.: Human-intent detection and physically interactive control of a robot without force sensors. IEEE Trans. Rob. 26(2), 370–382 (2010)
Article Google Scholar
Hogan, N.: Impedance control: an approach to manipulation-part 1: Theory; part 2: Implementation; part 3: Applications. J. Dyn. Syst. Meas. Contr. 107(1), 1–24 (1985)
Article Google Scholar
Sam Ge, S., Li, Y., Wang, C.: Impedance adaptation for optimal robot–environment interaction. Int. J. Control 87(2), 249–263 (2014)
Article MathSciNet MATH Google Scholar
Seraji, H., Colbaugh, R.: Force tracking in impedance control. Int. J. Robot. Res. 16(2), 499–506 (1993)
Google Scholar
Lu, W.S., Meng, Q.H.: Impedance control with adaptation for robotic manipulations. IEEE Trans. Robot. Autom. 7(3), 408–415 (1991)
Article Google Scholar
Li, Y., Ge, S.S.: Impedance learning for robots interacting with unknown environments. IEEE Trans. Control Syst. Technol. 22(4), 1422–1432 (2014)
Article Google Scholar
Burdet, E., Osu, R., Franklin, D.W., Milner, T.E., Kawato, M.: The central nervous system stabilizes unstable dynamics by learning optimal impedance. Nature 414(6862), 446–449 (2001)
Article Google Scholar
Noohi, E., Zefran, M., Patton, J.L.: A model for human-human collaborative object manipulation and its application to human–robot interaction. IEEE Trans. Robot. 32(4), 1–17 (2016)
Article Google Scholar
Peternel, L., Petri, T., Oztop, E., Babi, J.: Teaching robots to cooperate with humans in dynamic manipulation tasks based on multi-modal human-in-the-loop approach. Auton. Robots 36, 1–14 (2014)
Article Google Scholar
Tayebi, A.: Adaptive iterative learning control for robot manipulators (2004)
Aubin, P.M., Cowley, M.S., Ledoux, W.R.: Gait simulation via a 6-dof parallel robot with iterative learning control. IEEE Trans. Biomed. Eng. 55(3), 1237–1240 (2008)
Article Google Scholar
He, X., Qin, Z., Wu, X.: Adaptive iterative learning control of robot manipulators in the presence of environmental constraints. Control Theory Appl. 20, 133–150 (2012)
MATH Google Scholar
Bristow, D.A., Tharayil, M., Alleyne, A.G.: A survey of iterative learning. IEEE Control Syst. 26(3), 96–114 (2006)
Article Google Scholar
Wang, L., Li, X., Shen, D.: Sampled-data iterative learning control for continuous-time nonlinear systems with iteration-varying lengths. Int. J. Robust Nonlinear Control 28(8), 3073–3091 (2018)
Article MathSciNet MATH Google Scholar
Bu, X., Wang, S., Hou, Z., Liu, W.: Model free adaptive iterative learning control for a class of nonlinear systems with randomly varying iteration lengths. J. Franklin Inst. 356(5), 2491–2504 (2019)
Article MathSciNet MATH Google Scholar
Liu, C., Shen, D., Wang, J.: Adaptive learning control for general nonlinear systems with nonuniform trial lengths, initial state deviation, and unknown control direction. Int. J. Robust Nonlinear Control 20, 20 (2019)
MathSciNet MATH Google Scholar
Li, X., Xu, J., Huang, D.: An iterative learning control approach for linear systems with randomly varying trial lengths. IEEE Trans. Autom. Control 59(7), 1954–1960 (2014)
Article MathSciNet MATH Google Scholar
Li, X., Xu, J.-X., Huang, D.: Iterative learning control for nonlinear dynamic systems with randomly varying trial lengths. Int. J. Adapt. Control Signal Process. 29(11), 1341–1353 (2015)
Article MathSciNet MATH Google Scholar
Shen, D., Xu, J.: Adaptive learning control for nonlinear systems with randomly varying iteration lengths. IEEE Trans. Neural Netw. Learn. Syst. 30(4), 1119–1132 (2019)
Article MathSciNet Google Scholar
Feldman, A.G.: Once more on the equilibrium-point hypothesis ($\lambda$ model) for motor control. J. Mot. Behav. 18(1), 17–54 (1986)
Article Google Scholar
Chen, Y., Lin, S., Li, R.H., Zhang, L.D.: Kinematics characteristic of exoskeleton walking robot for older people. Appl. Mech. Mater. 644, 167–170 (2014)
Article Google Scholar
Veneman, J.F., Rik, K., Hekman, E.E.G., et al.: Design and evaluation of the lopes exoskeleton robot for interactive gait rehabilitation. IEEE Trans. Neural Syst. Rehabil. Eng. 15(3), 379–386 (2007)
Article Google Scholar

Download references

Author information

Authors and Affiliations

Southwest Jiaotong University, Chengdu, 610031, People’s Republic of China
Jingkang Xia, Deqing Huang & Na Qin
University of Sussex, Brighton, BN1 9RH, UK
Yanan Li

Authors

Jingkang Xia
View author publications
You can also search for this author in PubMed Google Scholar
Deqing Huang
View author publications
You can also search for this author in PubMed Google Scholar
Yanan Li
View author publications
You can also search for this author in PubMed Google Scholar
Na Qin
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Yanan Li.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Due to the control objectives of $f_{h}(t)=0$ and $e(t)=0$, we have $f_{h_{d}}(t)=0$ and $\varphi _{d}(t)=0$, then we obtain

$$\begin{aligned} {\dot{\mu }}_{i}(t)= & \, {} {\dot{\rho }}_{i}(t) -{\dot{\rho }}_{d}(t) \nonumber \\= & \, {} C{\dot{\varphi }}_{i}(t) \nonumber \\= & \, {} C[A \varphi _{i}(t) + B e_{i}(t)] \end{aligned}$$

(29)

From Eqs. (25) and (21), the following relationship is obtained

$$\begin{aligned} e_{i+1}(t)= & \, {} \Phi \{e_{i}(t) \} +\sum _{j=0}^m \beta _{j}\gamma _{i-j}(t){\dot{\mu }}_{i-j}(t)\nonumber \\= & \, {} \Phi \{e_{i}(t) \} + \sum _{j=0}^m \beta _{j}\gamma _{i-j}(t)C[A \varphi _{i-j}(t)\nonumber \\&+ B e_{i-j}(t)] \nonumber \\= & \, {} \sum _{j=0}^m \left[ \frac{1}{m+1}-\beta _{j}\gamma _{i-j}(t)CB\right] e_{i-j}(t)\nonumber \\&+\sum _{j=0}^m \beta _{j}\gamma _{i-j}(t)CA \varphi _{i-j}(t) \end{aligned}$$

(30)

Taking norm on both sides of Eq. (30) yields

$$\begin{aligned} \Vert e_{i+1}(t)\Vert\le & {} \sum _{j=0}^m \left| \frac{1}{m+1}-CB\beta _{j}\gamma _{i-j}(t)\right| \Vert e_{i-j}(t)\Vert \nonumber \\&+\sum _{j=0}^m a|\beta _{j}|\gamma _{i-j}(t)\Vert \varphi _{i-j}(t)\Vert \end{aligned}$$

(31)

where a is the upper bound of $\Vert A\Vert$.

Taking norm on both sides of system Eq. (14) yields

$$\begin{aligned} \Vert \varphi _{i}(t)\Vert \le \int _{ 0 }^{ t } a\Vert \varphi _{i}(\tau )\Vert + b\Vert e_{i}(\tau )\Vert d\tau \end{aligned}$$

(32)

where $b\ge \parallel B \parallel$. By applying Bellman–Gronwall lemma, we have

$$\begin{aligned} \Vert \varphi _{i}(t)\Vert\le & {} be^{at}\int _{ 0 }^{ t } \Vert e_{i}(\tau )\Vert d\tau \nonumber \\= & \, {} be^{at}\int _{ 0 }^{ t } \Vert e_{i}(\tau )\Vert e^{-\lambda \tau }e^{\lambda \tau }d\tau \nonumber \\\le & {} be^{at}\Vert e_{i}(t)\Vert _{\lambda } \int _{ 0 }^{ t }e^{\lambda \tau }d\tau \nonumber \\= & \, {} be^{at}\frac{e^{\lambda t}-1}{\lambda } \Vert e_{i}(t)\Vert _{\lambda } \end{aligned}$$

(33)

According to the definition of $\lambda$-norm, it can be obtained that $\Vert e_{i}(t)\Vert _{\lambda } = \sup \nolimits _{t\in [0,T_{d}]}\{e^{-\lambda t} \Vert e_{i}(t)\Vert _{\infty }\}$, $\lambda > 0$. Then, substituting (33) into (31) implies that

$$\begin{aligned} \Vert e_{i+1}(t)\Vert\le & {} \sum _{j=0}^m \left| \frac{1}{m+1}-CB\beta _{j}\gamma _{i-j}(t)\right| \Vert e_{i-j}(t)\Vert \nonumber \\&+abe^{at}\frac{e^{\lambda t}-1}{\lambda }\sum _{j=0}^m |\beta _{j}|\gamma _{i-j}(t) \Vert e_{i-j}(t)\Vert _{\lambda } \end{aligned}$$

(34)

Introducing the expectation operator ${\mathbf {E}}[\cdot ]$ to Eq. (34) and noting that only $\gamma _{i}(t)$ is the stochastic variable, we can rewrite Eq. (34) as

$$\begin{aligned} \Vert e_{i+1}(t)\Vert\le & {} \sum _{j=0}^m {\mathbf {E}}\left\{ \left| \frac{1}{m+1}-CB\beta _{j}\gamma _{i-j}(t)\right| \right\} \Vert e_{i-j}(t)\Vert \nonumber \\&+abe^{at}\frac{e^{\lambda t}-1}{\lambda }\sum _{j=0}^m |\beta _{j}|{\mathbf {E}}\{\gamma _{i-j}(t)\}\Vert e_{i-j}(t)\Vert _{\lambda } \end{aligned}$$

(35)

Based on the definition of stochastic variable $\gamma _{i}(t)$ and the mathematical expectation, it follows

$$\begin{aligned} {\mathbf {E}}\left\{ \left| \frac{1}{m+1}-CB\beta _{j}\gamma _{i-j}(t)\right| \right\}= & \, {} \left| \frac{1}{m+1}-1\cdot \beta _{j}CB\right| q(t) \nonumber \\&+\left| \frac{1}{m+1}-0\cdot \beta _{j}CB\right| (1-q(t)) \nonumber \\= & \, {} \left| \frac{1}{m+1}-\beta _{j}CB\right| q(t)+\frac{1-q(t)}{m+1} \end{aligned}$$

(36)

then

$$\begin{aligned} \Vert e_{i+1}(t)\Vert\le & {} \sum _{j=0}^m \left\{ \left| \frac{1}{m+1}-CB\beta _{j}\right| q(t)+\frac{1-q(t)}{m+1}\right\} \Vert e_{i-j}(t)\Vert \nonumber \\&+abq(t)e^{at}\frac{e^{\lambda t}-1}{\lambda }\sum _{j=0}^m |\beta _{j}|\Vert F_{h_{i-j}}(t)\Vert _{\lambda } \end{aligned}$$

(37)

From (37) and the definition of $\lambda$-norm, we obtain

$$\begin{aligned} \Vert e_{i+1}(t)\Vert _{\lambda }\le & {} \sum _{j=0}^m \sup \limits _{t\in [0,T_{d}]} \left\{ \left| \frac{1}{m+1}-CB\beta _{j}\right| q(t)+\frac{1-q(t)}{m+1}\right\} \Vert e_{i-j}(t)\Vert _{\lambda } \nonumber \\&+abq(t)e^{aT_{d}}\frac{e^{\lambda T_{d}}-1}{\lambda }\sum _{j=0}^m |\beta _{j}|\Vert e_{i-j}(t)\Vert _{\lambda } \end{aligned}$$

(38)

Define

$$\begin{aligned} \delta \triangleq abq(t)e^{aT_{d}}\frac{e^{\lambda T_{d}}-1}{\lambda }\sum _{j=0}^m |\beta _{j}| \end{aligned}$$

(39)

From Theorem 1 with (38) and (39), we have

$$\begin{aligned} \Vert e_{i+1}(t)\Vert _{\lambda } \le \left( \sum _{j=0}^m\alpha _{j}+ \delta \right) max\{\Vert e_{i}\Vert _{\lambda },\Vert e_{i-1}\Vert _{\lambda },\ldots ,\Vert e_{i-m+1}\Vert _{\lambda }\} \end{aligned}$$

(40)

where $0< q(t)< 1$ is applied. When $\lambda$ is sufficiently large, $\delta$ can be made as small as possible. According to Eq. (27), it yields that $\sum _{j=0}^m \alpha _{j}+\delta \le \epsilon +\delta <1$, so we have $\lim \nolimits _{i\rightarrow \infty }e_{i}=0$ and the control objective $x_{d}(t)=x_{h}(t)$ is achieved.

According to the convergence of $e_{i}$ and the inequality (33), it is obvious that

$$\begin{aligned} \lim \limits _{i\rightarrow \infty }\Vert \varphi _{i}(t)\Vert =0 \end{aligned}$$

(41)

so we can get $\lim \nolimits _{i\rightarrow \infty }f_{h_{i}}(t)=0$ which indicates that $\lim \nolimits _{i\rightarrow \infty }x_{i}(t)=x_{h}(t)$ and the robot tracks the human’s desired trajectory.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Xia, J., Huang, D., Li, Y. et al. Iterative learning of human partner’s desired trajectory for proactive human–robot collaboration. Int J Intell Robot Appl 4, 229–242 (2020). https://doi.org/10.1007/s41315-020-00132-5

Download citation

Received: 10 October 2019
Accepted: 23 April 2020
Published: 05 June 2020
Issue Date: June 2020
DOI: https://doi.org/10.1007/s41315-020-00132-5

Keywords

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Iterative learning of human partner’s desired trajectory for proactive human–robot collaboration

Abstract

Similar content being viewed by others

A review of motion planning algorithms for intelligent robots

A systematic review on cooperative dual-arm manipulators: modeling, planning, control, and vision strategies

Performance-based Assistance Control for Upper Limb Robotic Mirror Therapy

1 Introduction

2 Problem formulation

2.1 System description

2.2 Control objective

3 Controller design

3.1 System model transformation

3.2 ILC design

Definition 1

Assumption 1

Assumption 2

Theorem 1

Remark 1

4 Simulations

4.1 Simulation I

4.2 Simulation II

5 Experiment verification

6 Conclusions

References

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher's Note

Appendix

Rights and permissions

About this article

Cite this article

Keywords

Navigation

Iterative learning of human partner’s desired trajectory for proactive human–robot collaboration

Abstract

Similar content being viewed by others

A review of motion planning algorithms for intelligent robots

A systematic review on cooperative dual-arm manipulators: modeling, planning, control, and vision strategies

Performance-based Assistance Control for Upper Limb Robotic Mirror Therapy

1 Introduction

2 Problem formulation

2.1 System description

2.2 Control objective

3 Controller design

3.1 System model transformation

3.2 ILC design

Definition 1

Assumption 1

Assumption 2

Theorem 1

Remark 1

4 Simulations

4.1 Simulation I

4.2 Simulation II

5 Experiment verification

6 Conclusions

References

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher's Note

Appendix

Appendix

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation