Introduction

With the development of artificial intelligence (AI) and modern engineering technology, robots are widely used in industrial and military domains [1, 2] to complete dexterous manipulation tasks such as grasping and holding objects [3]. Compared with a single-arm robot, a dual-arm robot can complete more complex tasks and carry heavier objects through the cooperative actions of its two arms [4], which has drawn great attention in both academia and industry [5,6,7].

Different from traditional analytical methods, which spend considerable time on system modelling, trajectory planning, and force control, learning from demonstration (LfD) is a paradigm in which robots acquire new skills by imitating an expert; it is easy to extend and adapt to novel situations and has drawn increasing attention in recent years [8,9,10,11]. Some researchers used a dual-arm robot for robotic skill learning and training, exploiting the isomorphic structure of the two arms: the operator uses one arm to record robotic motions while the other arm acts as an actuator [12,13,14,15]. There is a great variety of robotic skill learning methods, such as the hidden Markov model (HMM), the Gaussian mixture model with Gaussian mixture regression (GMM-GMR), and dynamic movement primitives (DMPs). Compared with HMM and GMM-GMR, DMPs are easier to interpret, are linear in their parameters, and offer robustness and continuity.

Due to these advantages, DMPs have been widely used for dual-arm robot skill learning. Kulvicius et al. proposed sensory feedback together with a predictive learning mechanism that allows tightly coupled dual-agent systems to learn an adaptive, sensor-driven interaction based on DMPs [16]. Gams et al. argued that the original DMPs formulation should be modified by adding not only an acceleration term but also a velocity term to obtain smoother interaction; they therefore proposed coupling originally independent robotic trajectories by extending the DMPs framework, which enables tightly coupled bimanual execution of cooperative tasks [17]. Zhao et al. presented a reinforcement learning (RL) algorithm, policy improvement with path integrals for sequences of DMPs (SDMPs), to learn and adjust recorded trajectories of dual-arm cooperative manipulation [18]. Colome et al. studied simultaneously learning a DMP-characterized robot motion and the joint couplings through linear dimensionality reduction (DR), which provides valuable qualitative information leading to a reduced and intuitive algebraic description of such motion [19]. Lee et al. integrated a similar method for cooperative aerial transportation with the rapidly-exploring random tree star (RRT*) to enable cooperative aerial manipulators to carry a common object and keep reducing the interaction force between multiple robots while avoiding obstacles in unstructured environments [20].

From the above methods, it is easy to see that the original DMPs function is modified by adding coupling terms computed from relative distance or force errors to change the path and ensure that the relative distance tracking errors converge to zero. However, the dynamic performance of the trajectories during the moving process, such as how to enlarge and reduce the relative distance flexibly and how to avoid obstacles, is not considered. If we only use a fixed relative distance limitation for the two arms, object-approaching skills such as changes of speed and contact forces will be ignored.

In this paper, we integrate the control strategy of barrier Lyapunov functions (BLFs) with DMPs so that the generated trajectories satisfy predesigned transient performance for the relative distance of the robot end effectors. The trajectory generalized by DMPs is determined by three variables: the start point, the end point, and the sampling time interval. Although measured data containing errors and noise are processed (e.g. aligned and filtered) before skill learning, even for the processed data the data-driven learned results may violate physical limitations. The proposed method, based on the integration of BLFs and DMPs, addresses this problem. Similar ideas combining control and motion planning methods for manipulation have been explored in previous work [21,22,23].

Additionally, we combine the generated actions in the joint space with the redundancy solution of a redundant dual-arm robot to perform human-like operations. Although a similar study on human-like coordinative learning in the Cartesian and joint spaces for a redundant dual-arm robot has been presented by Qu et al. [24], we propose another idea by defining a “swivel angle” and combining the null-space method with the results of DMPs under distance constraints. With dual-arm demonstration data acquired through a Kinect, experiments are conducted in Matlab and Gazebo to verify the effectiveness of the proposed method.

The rest of the paper is organized as follows: “Problem description” briefly introduces DMPs and the problems of skill learning for dual-arm redundant robots. “BLFs-based improved DMPs for human-like skill learning and redundancy resolution” presents the DMPs and BLFs framework and the three related calculation modules. “Experiment” presents three experiments to verify the effectiveness and applicability of the proposed method. Finally, the conclusions are summarized in “Conclusion”.

Problem description

General DMPs model

The DMPs model was first proposed in [25] and updated by Ijspeert et al. [26]; its function is expressed as

$$ \left\{ \begin{gathered} \tau \dot{v} = \alpha _{z} \left( {\beta _{z} \left( {g - x} \right) - v} \right) + f\left( s \right) \hfill \\ \tau \dot{x} = v \hfill \\ \end{gathered} \right., $$
(1)

where \(\alpha _{z} ,\beta _{z} > 0\) are coefficients of the second-order linear part of (1), ensuring convergence of the generated trajectory to the unique attractor point \(x = g, v = 0\); \(f\left( s \right) = \theta ^{T} \Psi (s)\) is a forcing function, a linear combination of nonlinear radial basis functions, with \(\theta = \left[ {w_{1} ,w_{2} , \ldots ,w_{n} } \right]^{{\text{T}}}\), \(\Psi (s) = \left[ {\psi _{1} ,\psi _{2} , \ldots ,\psi _{n} } \right]^{{\text{T}}}\) and

$$ \psi _{k} = \frac{{\varphi _{k} (s)\,s}}{{\sum\nolimits_{{i = 1}}^{n} {\varphi _{i} (s)} }},\quad \varphi _{k} (s) = \exp ( - h_{k} (s - c_{k} )^{2} ),\quad k = 1,2, \ldots ,n, $$

where \(c_{k}\) and \(h_{k} > 0\) are the centers and widths of the radial basis functions, respectively, \(\tau > 0\) is a timing parameter that adjusts the execution speed of movements, and \(s\) is a phase variable that removes the explicit time dependency of the function \(f\left( s \right)\). The dynamics of \(s\) are expressed by a canonical system

$$ \tau \dot{s} = - \gamma s,\;\;\;\;\;\;\gamma > 0. $$
(2)

The phase \(s\) is implicitly related to time, so the convergence time can be modified by changing \(\gamma\), and \(\theta\) can be learned by supervised learning algorithms, e.g. locally weighted regression (LWR). The purpose of the learning process is to minimize the error function:

$$ \min \sum {\left( {f^{{{\text{Tar}}}} \left( s \right) - f\left( s \right)} \right)^{2} } ,\quad x = g,\;v = 0,\;\tau > 0, $$
(3)

where \(f\left( s \right)\) is the forcing function in (1), and \(f_{{}}^{{{\text{Tar}}}} \left( s \right)\) represents the target value of \(f\left( s \right)\):

$$ f_{{}}^{{Tar}} \left( s \right) = \tau \dot{v} - \alpha _{z} \left( {\beta _{z} \left( {g - x} \right) - v} \right). $$
(4)
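The learning and reproduction pipeline of (1)–(4) can be sketched as follows (our own Python illustration, not the authors' code; the RBF placement, the widths, and the gains \(\alpha_z = 25\), \(\beta_z = 6.25\), \(\gamma = 4\) are assumed heuristic values, and \(\theta\) is fitted by one weighted least-squares solve per basis function, as in LWR):

```python
import numpy as np

def fit_dmp(x_demo, dt, n=20, alpha_z=25.0, beta_z=6.25, gamma=4.0, tau=1.0):
    """Learn the RBF weights theta of the forcing term f(s) from one demonstration."""
    v = tau * np.gradient(x_demo, dt)
    vdot = np.gradient(v, dt)
    g = x_demo[-1]
    t = np.arange(len(x_demo)) * dt
    s = np.exp(-gamma * t / tau)                                  # canonical system, Eq. (2)
    f_tar = tau * vdot - alpha_z * (beta_z * (g - x_demo) - v)    # target forcing, Eq. (4)
    c = np.exp(-gamma * np.linspace(0.0, 1.0, n))                 # centers spread along s
    h = 1.0 / np.diff(c, append=c[-1] / 2.0) ** 2                 # widths from center spacing
    phi = np.exp(-h * (s[:, None] - c) ** 2)                      # basis activations
    # per-basis weighted regression of f_tar on s (squared-error criterion, Eq. (3))
    w = (phi * (s * f_tar)[:, None]).sum(0) / ((phi * (s ** 2)[:, None]).sum(0) + 1e-12)
    return w, c, h, g

def rollout(w, c, h, x0, g, dt, T, alpha_z=25.0, beta_z=6.25, gamma=4.0, tau=1.0):
    """Integrate Eq. (1) forward with the learned forcing function f(s)."""
    x, v, s, traj = x0, 0.0, 1.0, [x0]
    for _ in range(int(T / dt)):
        phi = np.exp(-h * (s - c) ** 2)
        f = (phi * w).sum() / (phi.sum() + 1e-12) * s             # f(s) = theta^T Psi(s)
        v += dt * (alpha_z * (beta_z * (g - x) - v) + f) / tau
        x += dt * v / tau
        s += dt * (-gamma * s) / tau                              # Eq. (2)
        traj.append(x)
    return np.array(traj)
```

Running the rollout slightly past the demonstration duration lets the linear part of (1) pull the state to the attractor \(x = g, v = 0\), since \(f(s) \to 0\) as \(s \to 0\).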

DMP-based dual-arm robot manipulation

To go from human demonstrations to skills learned and generalized by robots, we must solve the following three problems: joint distance restriction, redundant joint resolution, and relative distance limitation (Fig. 1).

Fig. 1

Problems for dual-arm robot manipulation

Joint distance restriction arises from the bone lengths between adjacent joints, such as the elbow and the wrist or the elbow and the shoulder; these lengths are constant. Redundant joint resolution plans the robot arm joints to achieve human-like motions, and relative distance limitation provides constraints on the robot end effectors.

Meanwhile, measurement noise is acquired together with the raw data. If these data are used directly for skill learning, the learned results will be influenced by noise and measurement errors. Therefore, signals such as EMG measurements are processed before the learning procedures, and images and videos affected by occlusions are reconstructed, which may introduce new uncertainties and errors. The example shown in Fig. 2 reveals that the measurement errors of the hand positions are larger than those of the shoulders and elbows. Based on prior knowledge such as the size of the object, the bone length between the elbow and the shoulder, or the exact position of the object, we can re-plan the joints and end-effector positions to suit new cases. Here, we regard these known conditions as fixed constraints and propose a new DMP-based framework for dual-arm robot cooperative skill learning that takes the above three problems into account.

Fig. 2

Position measuring errors of shoulders, elbows and hands measured by Kinect

Robotic human-like manipulation has been studied for several decades. By using the null-space method for a redundant robotic arm, robots can avoid collisions between their own arms and with outer obstacles through multiple joint motion planning results. As the skills for robot end effectors and joints are generated in both the Cartesian and joint spaces, previous studies such as [9] and [13], which address only the joint-space or Cartesian-space case, are not applicable. We therefore combine the constrained skills with the null-space method for skill generalization on redundant dual-arm robots.

BLFs-based improved DMPs for human-like skill learning and redundancy resolution

In this section, we propose three solutions, in three subsections, for the above relative distance limitation, joint distance restriction, and redundant joint resolution. The mathematical symbols used in the following paragraphs are specified in Table 1.

Table 1 Mathematical symbols and meanings

Integrated BLFs and DMPs skills learning for relative distance limitation

Interactive actions of robot end effectors can be seen as the combined effect of relative distance and posture changes. For cooperative actions, e.g. folding clothes or grasping and placing objects, the relative distance keeps changing during the interaction with environmental objects. Overly large relative tracking errors may cause operational failure, such as losing control of the object or colliding with an obstacle.

For the desired relative distance \(d_{j}^{{}} (t)\), we set predesigned error boundaries \(\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\sigma } _{j}^{{}}\) and \(\bar{\sigma }_{j}^{{}}\), which means that boundary violation is not allowed throughout the cooperative manipulation process; then the relationship between the \(j{\text{th}}\) hand \({\mathbf{x}}_{{wj}}^{{}}\) and its cooperative counterpart \({\mathbf{x}}_{{w\bar{j}}}^{{}}\) is

$$ \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\sigma } _{j}^{{}} \le \left\| {{\mathbf{x}}_{{w\bar{j}}}^{{}} - {\mathbf{x}}_{{wj}}^{{}} } \right\|_{2} - d_{j}^{{}} (t) \le \bar{\sigma }_{j}^{{}} . $$
(5)
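For a recorded pair of end-effector paths, condition (5) can be checked sample by sample; a minimal sketch (the array shapes and tolerance values in the usage are our assumptions):

```python
import numpy as np

def relative_distance_ok(x_wj, x_wjbar, d, sig_lo, sig_hi):
    """Check Eq. (5): sig_lo <= ||x_wjbar - x_wj||_2 - d_j(t) <= sig_hi at every sample.

    x_wj, x_wjbar: (T, 3) end-effector positions; d: desired relative distance
    (scalar or length-T array); sig_lo, sig_hi: predesigned error boundaries.
    """
    err = np.linalg.norm(x_wjbar - x_wj, axis=1) - d
    return bool(np.all((sig_lo <= err) & (err <= sig_hi)))
```

A check of this form can flag demonstrations whose raw measurements already violate the boundaries before any learning is attempted.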

The DMPs model can be rewritten as a strict-feedback nonlinear system:

$$ \left\{ \begin{gathered} \dot{x} = {v \mathord{\left/ {\vphantom {v \tau }} \right. \kern-\nulldelimiterspace} \tau } \hfill \\ \dot{v} = {{\alpha _{z} \left( {\beta _{z} \left( {g - x} \right) - v} \right)} \mathord{\left/ {\vphantom {{\alpha _{z} \left( {\beta _{z} \left( {g - x} \right) - v} \right)} \tau }} \right. \kern-\nulldelimiterspace} \tau } + u \hfill \\ u = {{f\left( s \right)} \mathord{\left/ {\vphantom {{f\left( s \right)} \tau }} \right. \kern-\nulldelimiterspace} \tau } \hfill \\ y = x \hfill \\ \end{gathered} \right., $$
(6)

where \(u\) plays the same role as the forcing function \(f\left( s \right)\) in (6), while in (1) of [27] it represents the input signal that modifies the output \(y\). Set \(\bar{\kappa }_{j}^{i} = k_{z} \left( {\bar{\sigma }_{j}^{i} + d_{j}^{i} } \right)\), \(\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\kappa } _{j}^{i} = k_{z} \left( {\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\sigma } _{j}^{i} + d_{j}^{i} } \right)\), and \(z_{1}^{i} = k_{z} \left( {x_{{\bar{j}}}^{i} - x_{j}^{i} } \right)\), where \(k_{z}\) is a positive factor, \(\bar{\sigma }_{j}^{i}\), \(\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\sigma } _{j}^{i}\) and \(d_{j}^{i}\) are the decomposed values in the \(i{\text{th}}\) dimension, and \(x_{j}^{i}\) and \(x_{{\bar{j}}}^{i}\) are general variables that can be instantiated as \(x_{{wj}}^{i}\) and \(x_{{w\bar{j}}}^{i}\) in (5). Then (5) can be expressed as \(\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\kappa } _{j}^{i} \le z_{1}^{i} \le \bar{\kappa }_{j}^{i}\). Inspired by the asymmetric BLF candidate in [27], we define a new variable \(z_{2}^{i} = d_{z} \left( {\alpha _{2} - v_{j}^{i} } \right),d_{z} > 0\), where \(\alpha _{2}\) is a function to be designed, and build the Lyapunov candidate

$$ V^{c} = \frac{1}{2}\log \frac{{\left( {\bar{\kappa }_{j}^{i} - \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\kappa } _{j}^{i} } \right)^{2} }}{{4\left( {\bar{\kappa }_{j}^{i} - z_{1}^{i} } \right)\left( {z_{1}^{i} - \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\kappa } _{j}^{i} } \right)}} + \frac{1}{2}\left( {z_{2}^{i} } \right)^{2} . $$
(7)

The time derivative of \(V^{c}\) can be calculated as

$$ \begin{gathered} \dot{V}^{c} = \frac{{z_{1}^{i} - {{\left( {\bar{\kappa }_{j}^{i} + \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\kappa } _{j}^{i} } \right)} \mathord{\left/ {\vphantom {{\left( {\bar{\kappa }_{j}^{i} + \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\kappa } _{j}^{i} } \right)} 2}} \right. \kern-\nulldelimiterspace} 2}}}{{\left( {\bar{\kappa }_{j}^{i} - z_{1}^{i} } \right)\left( {z_{1}^{i} - \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\kappa } _{j}^{i} } \right)}}\dot{z}_{1}^{i} + z_{2}^{i} \dot{z}_{2}^{i} \hfill \\ \;\;\;\; = \frac{{z_{1}^{i} - {{\left( {\bar{\kappa }_{j}^{i} + \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\kappa } _{j}^{i} } \right)} \mathord{\left/ {\vphantom {{\left( {\bar{\kappa }_{j}^{i} + \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\kappa } _{j}^{i} } \right)} 2}} \right. \kern-\nulldelimiterspace} 2}}}{{\left( {\bar{\kappa }_{j}^{i} - z_{1}^{i} } \right)\left( {z_{1}^{i} - \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\kappa } _{j}^{i} } \right)}}k_{z} \tau \left( {\dot{x}_{{\bar{j}}}^{i} - \dot{x}_{j}^{i} } \right) + z_{2}^{i} d_{z} \tau \left( {\dot{\alpha }_{2} - \dot{v}_{j}^{i} } \right). \hfill \\ \end{gathered} $$
(8)

Theorem 1

Consider the DMPs function (6) under the condition (5). If the initial conditions satisfy \(\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\kappa } _{j}^{i} (0) \le z_{1}^{i} (0) \le \bar{\kappa }_{j}^{i} (0)\) and (9) holds, then the output constraint is never violated and all closed-loop signals are bounded.

$$ \left\{ \begin{gathered} \alpha _{2} = \frac{{\tau d_{z} \dot{x}_{{\bar{j}}}^{i} + z_{1}^{i} - {{\left( {\bar{\kappa }_{j}^{i} + \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\kappa } _{j}^{i} } \right)} \mathord{\left/ {\vphantom {{\left( {\bar{\kappa }_{j}^{i} + \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\kappa } _{j}^{i} } \right)} 2}} \right. \kern-\nulldelimiterspace} 2}}}{{d_{z} }} \hfill \\ u_{j}^{i} = \dot{\alpha }_{2} - {{\left( {\alpha _{z} \left( {\beta _{z} \left( {g_{j}^{i} - x_{j}^{i} } \right) - v_{j}^{i} } \right)} \right)} \mathord{\left/ {\vphantom {{\left( {\alpha _{z} \left( {\beta _{z} \left( {g_{j}^{i} - x_{j}^{i} } \right) - v_{j}^{i} } \right)} \right)} \tau }} \right. \kern-\nulldelimiterspace} \tau } + k_{n} z_{2}^{i} \hfill \\ k_{n} = \frac{{k_{z} }}{{4\tau d_{z}^{2} \left( {\bar{\kappa }_{j}^{i} - z_{1}^{i} } \right)\left( {z_{1}^{i} - \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\kappa } _{j}^{i} } \right)}} \hfill \\ \end{gathered} \right.. $$
(9)

Proof

Substituting the expression of \(z_{2}^{i}\) and (6) into (8), we have

$$ \begin{gathered} \dot{V}^{c} = \frac{{z_{1}^{i} - {{\left( {\bar{\kappa }_{j}^{i} + \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\kappa } _{j}^{i} } \right)} \mathord{\left/ {\vphantom {{\left( {\bar{\kappa }_{j}^{i} + \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\kappa } _{j}^{i} } \right)} 2}} \right. \kern-\nulldelimiterspace} 2}}}{{\left( {\bar{\kappa }_{j}^{i} - z_{1}^{i} } \right)\left( {z_{1}^{i} - \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\kappa } _{j}^{i} } \right)}}k_{z} \tau \left( {\dot{x}_{{\bar{j}}}^{i} - {{\alpha _{2} } \mathord{\left/ {\vphantom {{\alpha _{2} } \tau }} \right. \kern-\nulldelimiterspace} \tau } + {{z_{2}^{i} } \mathord{\left/ {\vphantom {{z_{2}^{i} } {\left( {\tau d_{z} } \right)}}} \right. \kern-\nulldelimiterspace} {\left( {\tau d_{z} } \right)}}} \right) + z_{2} d_{z} \tau \left( {\dot{\alpha }_{2} - {{\alpha _{z} \left( {\beta _{z} \left( {g_{j}^{i} - x_{j}^{i} } \right) - v_{j}^{i} } \right)} \mathord{\left/ {\vphantom {{\alpha _{z} \left( {\beta _{z} \left( {g_{j}^{i} - x_{j}^{i} } \right) - v_{j}^{i} } \right)} \tau }} \right. \kern-\nulldelimiterspace} \tau } - u_{j}^{i} } \right) \hfill \\ \;\;\;\; = k_{z} z_{2}^{i} \frac{{z_{1}^{i} - {{\left( {\bar{\kappa }_{j}^{i} + \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\kappa } _{j}^{i} } \right)} \mathord{\left/ {\vphantom {{\left( {\bar{\kappa }_{j}^{i} + \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\kappa } _{j}^{i} } \right)} 2}} \right. 
\kern-\nulldelimiterspace} 2}}}{{d_{z} \left( {\bar{\kappa }_{j}^{i} - z_{1}^{i} } \right)\left( {z_{1}^{i} - \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\kappa } _{j}^{i} } \right)}} + \frac{{z_{1}^{i} - {{\left( {\bar{\kappa }_{j}^{i} + \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\kappa } _{j}^{i} } \right)} \mathord{\left/ {\vphantom {{\left( {\bar{\kappa }_{j}^{i} + \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\kappa } _{j}^{i} } \right)} 2}} \right. \kern-\nulldelimiterspace} 2}}}{{\left( {\bar{\kappa }_{j}^{i} - z_{1}^{i} } \right)\left( {z_{1}^{i} - \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\kappa } _{j}^{i} } \right)}}k_{z} \tau \left( {\dot{x}_{{\bar{j}}}^{i} - {{\alpha _{2} } \mathord{\left/ {\vphantom {{\alpha _{2} } \tau }} \right. \kern-\nulldelimiterspace} \tau }} \right) \hfill \\ \;\;\;\;\;\; + \;z_{2}^{i} d_{z} \tau \left( {{{\dot{\alpha }_{2} - \alpha _{z} \left( {\beta _{z} \left( {g_{j}^{i} - x_{j}^{i} } \right) - v_{j}^{i} } \right)} \mathord{\left/ {\vphantom {{\dot{\alpha }_{2} - \alpha _{z} \left( {\beta _{z} \left( {g_{j}^{i} - x_{j}^{i} } \right) - v_{j}^{i} } \right)} \tau }} \right. \kern-\nulldelimiterspace} \tau } - u_{j}^{i} } \right) \hfill \\ \;\;\;\; = - \frac{{k_{z} \left( {z_{1}^{i} - {{\left( {\bar{\kappa }_{j}^{i} + \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\kappa } _{j}^{i} - z_{2}^{i} } \right)} \mathord{\left/ {\vphantom {{\left( {\bar{\kappa }_{j}^{i} + \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\kappa } _{j}^{i} - z_{2}^{i} } \right)} 2}} \right. 
\kern-\nulldelimiterspace} 2}} \right)^{2} }}{{d_{z} \left( {\bar{\kappa }_{j}^{i} - z_{1}^{i} } \right)\left( {z_{1}^{i} - \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\kappa } _{j}^{i} } \right)}} + k_{z} \frac{{z_{1}^{i} - {{\left( {\bar{\kappa }_{j}^{i} + \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\kappa } _{j}^{i} } \right)} \mathord{\left/ {\vphantom {{\left( {\bar{\kappa }_{j}^{i} + \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\kappa } _{j}^{i} } \right)} 2}} \right. \kern-\nulldelimiterspace} 2}}}{{\left( {\bar{\kappa }_{j}^{i} - z_{1}^{i} } \right)\left( {z_{1}^{i} - \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\kappa } _{j}^{i} } \right)}}\left[ {\frac{{z_{1}^{i} - {{\left( {\bar{\kappa }_{j}^{i} + \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\kappa } _{j}^{i} } \right)} \mathord{\left/ {\vphantom {{\left( {\bar{\kappa }_{j}^{i} + \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\kappa } _{j}^{i} } \right)} 2}} \right. \kern-\nulldelimiterspace} 2}}}{{d_{z} }} + \tau \dot{x}_{{\bar{j}}}^{i} - \alpha _{2} } \right] \hfill \\ \;\;\;\;\;\; + z_{2}^{i} d_{z} \tau \left( {{{\dot{\alpha }_{2} - \alpha _{z} \left( {\beta _{z} \left( {g_{j}^{i} - x_{j}^{i} } \right) - v_{j}^{i} } \right)} \mathord{\left/ {\vphantom {{\dot{\alpha }_{2} - \alpha _{z} \left( {\beta _{z} \left( {g_{j}^{i} - x_{j}^{i} } \right) - v_{j}^{i} } \right)} \tau }} \right. \kern-\nulldelimiterspace} \tau } - u_{j}^{i} } \right. + \left. {\;\;\frac{{k_{z} z_{2}^{i} }}{{4\tau d_{z}^{2} \left( {\bar{\kappa }_{j}^{i} - z_{1}^{i} } \right)\left( {z_{1}^{i} - \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\kappa } _{j}^{i} } \right)}}} \right), \hfill \\ \end{gathered} $$

As \(k_{z} > 0\) and the terms \(z_{1}^{i} - {{\left( {\bar{\kappa }_{j}^{i} + \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\kappa } _{j}^{i} } \right)} \mathord{\left/ {\vphantom {{\left( {\bar{\kappa }_{j}^{i} + \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\kappa } _{j}^{i} } \right)} 2}} \right. \kern-\nulldelimiterspace} 2}\) and \(z_{2}^{i} d_{z}\) are not always \(0\), a sufficient condition for \(\dot{V}^{c} < 0\) is

$$ \left\{ \begin{gathered} \tau \dot{x}_{{\bar{j}}}^{i} - \alpha _{2} + \frac{{z_{1}^{i} - {{\left( {\bar{\kappa }_{j}^{i} + \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\kappa } _{j}^{i} } \right)} \mathord{\left/ {\vphantom {{\left( {\bar{\kappa }_{j}^{i} + \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\kappa } _{j}^{i} } \right)} 2}} \right. \kern-\nulldelimiterspace} 2}}}{{d_{z} }} = 0 \hfill \\ \dot{\alpha }_{2} - {{\left( {\alpha _{z} \left( {\beta _{z} \left( {g_{j}^{i} - x_{j}^{i} } \right) - v_{j}^{i} } \right)} \right)} \mathord{\left/ {\vphantom {{\left( {\alpha _{z} \left( {\beta _{z} \left( {g_{j}^{i} - x_{j}^{i} } \right) - v_{j}^{i} } \right)} \right)} \tau }} \right. \kern-\nulldelimiterspace} \tau } - u_{j}^{i} + \hfill \\ \;\;\;\;\frac{{k_{z} z_{2}^{i} }}{{4\tau d_{z}^{2} \left( {\bar{\kappa }_{j}^{i} - z_{1}^{i} } \right)\left( {z_{1}^{i} - \underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\kappa } _{j}^{i} } \right)}} = 0 \hfill \\ \end{gathered} \right. $$
(10)

Then we obtain the expressions of \(\alpha _{2}\), \(u_{j}^{i}\) and \(k_{n}\) in (9). According to Lemma 1 in [27], \(z_{1}^{i}\) will be kept within the range \(\left( {\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\kappa } _{j}^{i} (t),\bar{\kappa }_{j}^{i} (t)} \right)\). Following the expression of \(f^{{Tar}}\) in (4), the target value of \(u_{j}^{i}\) in (6) can be expressed as

$$ \begin{gathered} \left( {u_{j}^{i} } \right)^{{Tar}} = \dot{\alpha }_{2} - \dot{v}_{j}^{i} + {{\left( {f_{j}^{i} } \right)^{{Tar}} } \mathord{\left/ {\vphantom {{\left( {f_{j}^{i} } \right)^{{Tar}} } \tau }} \right. \kern-\nulldelimiterspace} \tau } + k_{n} z_{2}^{i} \hfill \\ \;\;\;\;\;\;\;\;\; = {{\left( {f_{j}^{i} } \right)^{{Tar}} } \mathord{\left/ {\vphantom {{\left( {f_{j}^{i} } \right)^{{Tar}} } \tau }} \right. \kern-\nulldelimiterspace} \tau } + k_{n} z_{2}^{i} + {{\dot{z}_{2}^{i} } \mathord{\left/ {\vphantom {{\dot{z}_{2}^{i} } {d_{z} }}} \right. \kern-\nulldelimiterspace} {d_{z} }}. \hfill \\ \end{gathered} $$
(11)

Furthermore, we use \(f(s)\) to replace \(\left( {f_{j}^{i} } \right)^{{Tar}}\) based on the error function (3) and obtain

$$ u_{j}^{i} = {{f(s)} \mathord{\left/ {\vphantom {{f(s)} \tau }} \right. \kern-\nulldelimiterspace} \tau } + k_{n} z_{2}^{i} + {{\dot{z}_{2}^{i} } \mathord{\left/ {\vphantom {{\dot{z}_{2}^{i} } {d_{z} }}} \right. \kern-\nulldelimiterspace} {d_{z} }} = {{f(s)} \mathord{\left/ {\vphantom {{f(s)} \tau }} \right. \kern-\nulldelimiterspace} \tau } + \Delta u_{j}^{i} , $$
(12)

where \(\Delta u_{j}^{i} = k_{n} z_{2}^{i} + {{\dot{z}_{2}^{i} } \mathord{\left/ {\vphantom {{\dot{z}_{2}^{i} } {d_{z} }}} \right. \kern-\nulldelimiterspace} {d_{z} }}\); that is, \(u_{j}^{i}\) is calculated in two steps: first, (3) is used to obtain the forcing function \(f(s)\); second, \(z_{1}^{i}\) and \(z_{2}^{i}\) are computed online and added through (12). The output \(y\) is then determined by the combined effect of \(f(s)\) and the term \(z_{2}^{i}\) generated by the BLF.
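As a one-dimensional sanity check of this scheme, the following sketch (our own illustration; the gains \(k_z = d_z = 1\), \(\alpha_z = 25\), \(\beta_z = 6.25\), the zero forcing term, and the sinusoidal partner trajectory are all assumptions) integrates (6) with the terms \(\alpha_2\), \(u_j^i\), and \(k_n\) of (9), approximating \(\dot\alpha_2\) by a finite difference:

```python
import numpy as np

def blf_dmp_follow(x_bar_fn, xdot_bar_fn, T=5.0, dt=1e-3,
                   sig_lo=-0.2, sig_hi=0.2, k_z=1.0, d_z=1.0,
                   alpha_z=25.0, beta_z=6.25, tau=1.0, g=0.0):
    """Keep x within (sig_lo, sig_hi) of a moving partner x_bar (d_j(t) = 0 here)."""
    kap_lo, kap_hi = k_z * sig_lo, k_z * sig_hi          # barrier bounds on z1
    x = x_bar_fn(0.0)                                    # start on the partner
    z1 = k_z * (x_bar_fn(0.0) - x)
    a2_prev = (tau * d_z * xdot_bar_fn(0.0) + z1 - (kap_hi + kap_lo) / 2.0) / d_z
    v = a2_prev                                          # start with z2 = 0
    errs = []
    for k in range(1, int(T / dt) + 1):
        t = k * dt
        z1 = k_z * (x_bar_fn(t) - x)
        a2 = (tau * d_z * xdot_bar_fn(t) + z1 - (kap_hi + kap_lo) / 2.0) / d_z  # Eq. (9)
        a2dot = (a2 - a2_prev) / dt                       # finite-difference d(alpha_2)/dt
        z2 = d_z * (a2 - v)
        k_n = k_z / (4.0 * tau * d_z**2 * (kap_hi - z1) * (z1 - kap_lo))        # Eq. (9)
        u = a2dot - alpha_z * (beta_z * (g - x) - v) / tau + k_n * z2           # Eq. (9)
        v += dt * (alpha_z * (beta_z * (g - x) - v) / tau + u)                  # Eq. (6)
        x += dt * v / tau
        a2_prev = a2
        errs.append(x_bar_fn(t) - x)
    return np.array(errs)
```

With \(x_{\bar j}(t) = \sin t\), the relative error never leaves the \((-0.2, 0.2)\) band, because the gain \(k_n\) grows without bound as \(z_1^i\) approaches either barrier.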

Remark 1

Similar to the improved DMPs for multi-agent formation [16,17,18,19,20], by adding the term \(\Delta u_{j}^{i}\), the original path point \(x\) (or output \(y\)) is modified to satisfy the constraints in (5). However, the proposed method in (9) is more general than previous special designs, which only provide limitations for the end effectors or the point shape of a multi-agent formation. The following subsection extends this method to the case of joint distance restriction.

Integrated BLFs and DMPs skills learning with joint distance restriction

The challenge of joint distance restriction is to maintain the distance between adjacent joints, such as the elbow and the shoulder or the elbow and the wrist. Following the definitions in Table 2, the distance error ranges (analogous to \(\sigma _{{ij}}^{{}}\) for the relative distance limitation) are set as \(\sigma _{e}^{{}}\) and \(\sigma _{s}^{{}}\). Similar to (5), we can use the following inequalities to reshape the elbow and shoulder positions:

$$ \left\{ \begin{gathered} \left| {\left\| {{\mathbf{x}}_{{ej}}^{{}} - {\mathbf{x}}_{{wj}}^{{}} } \right\|_{2} - d_{e}^{{}} } \right| < \sigma _{e}^{{}} \hfill \\ \left| {\left\| {{\mathbf{x}}_{{ej}}^{{}} - {\mathbf{x}}_{{sj}}^{{}} } \right\|_{2} - d_{s}^{{}} } \right| < \sigma _{s}^{{}} \hfill \\ \end{gathered} \right.. $$
(13)

However, as mentioned in “Problem description”, if we want to re-plan the positions of the elbow and the shoulder, we should take their real measurements as reference, but these are limited by the hands’ relative distance conditions. Therefore, we reshape the elbow and shoulder distances to satisfy both (5) and (13) by calculating a common result for the two conditions.

Additionally, as shown in Fig. 2, the measurement errors of the hands are larger than those of the elbows, and there are measurement mistakes due to occlusions. The errors are therefore processed in the following two steps:

  1. Satisfy limitations for the hands’ motions \(^{1} {\mathbf{x}}_{{wj}}^{{}}\):

    $$ ^{1} {\mathbf{x}}_{{wj}}^{{}} : = \left| {\left\| {{\mathbf{x}}_{{ej}}^{{}} - ^{1} {\mathbf{x}}_{{wj}}^{{}} } \right\|_{2} - d_{e}^{{}} (t)} \right| < \sigma _{e}^{{}} , $$
    (14)

    where \(^{1} {\mathbf{x}}_{{wj}}^{{}}\) represents the result of the first round. Equation (14) first filters the hand data using the elbow measurements.

  2. Synchronized constraints for the hands, elbows and shoulders:

    $$ ^{2} {\mathbf{x}}_{{wj}}^{{}} : = \left\{ \begin{gathered} \sigma _{{1j}}^{{}} < \left\| {^{2} {\mathbf{x}}_{{w\bar{j}}}^{{}} - ^{2} {\mathbf{x}}_{{wj}}^{{}} } \right\|_{2} - d_{j}^{{}} (t) < \sigma _{{2j}}^{{}} \hfill \\ - \sigma _{e}^{{}} < \left\| {{\mathbf{x}}_{{ej}}^{{}} - ^{2} {\mathbf{x}}_{{wj}}^{{}} } \right\|_{2} - d_{e}^{{}} (t) < \sigma _{e}^{{}} \hfill \\ \end{gathered} \right., $$
    (15)
    $$ {\mathbf{x}}_{{ej}}^{{}} : = \left\{ \begin{gathered} - \sigma _{s}^{{}} < \left\| {{\mathbf{x}}_{{ej}}^{{}} - {\mathbf{x}}_{{sj}}^{{}} } \right\|_{2} - d_{s}^{{}} (t) < \sigma _{s}^{{}} \hfill \\ - \sigma _{e}^{{}} < \left\| {{\mathbf{x}}_{{ej}}^{{}} - {\mathbf{x}}_{{wj}}^{{}} } \right\|_{2} - d_{e}^{{}} (t) < \sigma _{e}^{{}} \hfill \\ \end{gathered} \right., $$
    (16)
    $$ {\mathbf{x}}_{{sj}}^{{}} : = \left| {\left\| {{\mathbf{x}}_{{ej}}^{{}} - {\mathbf{x}}_{{sj}}^{{}} } \right\|_{2} - d_{s}^{{}} (t)} \right| < \sigma _{s}^{{}} , $$
    (17)

    where \(^{2} {\mathbf{x}}_{{wj}}^{{}}\) represents the result of the second round, and Eqs. (15) to (17) consider both the relative distance and joint distance limitations to rebuild the trajectory.
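A minimal sketch of the first-round filtering in (14) is given below (our own illustration; the radial projection onto the nearest admissible shell is our assumption, since the paper only requires that the inequality hold, and the wrist and elbow are assumed never to coincide):

```python
import numpy as np

def filter_wrist(x_w, x_e, d_e, sigma_e):
    """First-round filter, Eq. (14): keep the wrist within d_e +/- sigma_e of the elbow."""
    r_vec = x_w - x_e
    r = np.linalg.norm(r_vec)                      # measured elbow-wrist distance
    lo, hi = d_e - sigma_e, d_e + sigma_e
    if lo < r < hi:
        return x_w                                 # measurement already admissible
    r_clip = np.clip(r, lo + 1e-9, hi - 1e-9)      # nearest admissible radius
    return x_e + r_vec / r * r_clip                # rescale onto the shell around x_e
```

The second-round conditions (15)–(17) can be enforced by iterating projections of this kind for each inequality in turn.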

Remark 2

The calculation basis for (15) and (16) is to find a common result satisfying both inequalities. However, sometimes there is no overlap between the two conditions. Here, we consider the priority of the joint distance restriction to be higher than that of the relative distance, so the joint distance inequality is considered first and \(\sigma _{j}^{{}}\) is then modified to satisfy the condition \(\underset{\raise0.3em\hbox{$\smash{\scriptscriptstyle-}$}}{\kappa } _{j}^{i} (k) \le z_{1}^{i} (k) \le \bar{\kappa }_{j}^{i} (k)\) for the \(k\)th calculation.

Remark 3

Similar to the methods in [16,17,18,19,20], the skill learning process is handled first, and the trajectory is then generalized by adding the new term \(\Delta u_{j}^{i}\). Here, we first learn and generalize the angle skills for the elbow and the shoulder without any limitations; then the integrated BLFs and DMPs skill learning is used to modify the generalized trajectory to satisfy inequalities (15) to (17). The improved DMPs retain the properties of normal DMPs, with dynamic performance determined by the start and end points and the sampling interval.

Based on the BLFs and DMPs integrated skill learning method, we present the general calculation procedure as follows:

In Fig. 3, \(f(s)\) in (12) of the DMPs and the constraints of the BLFs (such as (5) and (13)–(17)) are first designed separately. After initializing \(\Delta u_{j}^{i}\) in (12), \(u_{j}^{i}\) is calculated to generate new trajectories, and the new position and velocity information is used to update \(\Delta u_{j}^{i}\) for the next cycle until the destination is reached.

Fig. 3

Procedure of the integrated skills learning method

Robotic human-like redundancy resolution

To re-plan robotic actions based on skills learned from demonstrations, some previous studies proposed human-like swivel motion exploiting the redundant degrees of freedom of the manipulator [28, 29]. After rebuilding the positions of the hand, elbow, and shoulder, we can generalize the joints and end effectors of the redundant robots. Following the description in [28], the arm plane is the plane built from the three joint points of the wrist, elbow and shoulder, and the reference plane is the vertical plane of the human body. We define the swivel angle \(\psi _{j} ,j = l,r\) as the angle between the arm plane and the reference plane, as shown in Fig. 4.

Fig. 4

Definition of swivel angle of dual arms

Set the joint velocities of the 7-DoF redundant robot arm as \({\mathbf{\dot{q}}}_{j} \in R^{{7 \times 1}} ,\, j = l,r\), and the generalized end effectors’ positions as \(^{2} {\mathbf{\bar{x}}}_{{wj}}^{{}} \in R^{{3 \times 1}} ,\, j = l,r\); the distance between the hand and the wrist is ignored for the two-arm holding actions, i.e. \({\mathbf{\bar{x}}}_{{hj}}^{{}} \approx ^{2} {\mathbf{\bar{x}}}_{{wj}}^{{}}\) (in fact, due to object occlusion, the hand positions cannot be measured well). Then, by extending \({\mathbf{\bar{x}}}_{{hj}}^{{}}\) with the gestures of the end effectors to \({\mathbf{X}}_{{hj}}^{{}}\), we have \({\mathbf{\dot{X}}}_{{hj}}^{{}} = {\mathbf{J}}_{j} {\mathbf{\dot{q}}}_{j}\), where \({\mathbf{J}}_{j} \in R^{{6 \times 7}}\) is a Jacobian matrix. Following the null-space projection, we calculate the redundancy solution of a redundant robot arm as

$$ {\mathbf{\dot{q}}}_{j} = {\mathbf{J}}_{j}^{ + } {\mathbf{\dot{X}}}_{{hj}}^{{}} + \left( {{\mathbf{I}} - {\mathbf{J}}_{j}^{ + } {\mathbf{J}}_{j}^{{}} } \right){\mathbf{J}}_{{Ej}}^{ + } \psi _{j} {\mathbf{U}}_{{\psi _{j} }} , $$
(18)

where \({\mathbf{J}}_{{Ej}} \in R^{{3 \times 7}}\) is the Jacobian matrix from the elbow of the robot to the robot base, serving as the mapping between the swivel angle and the joint velocities [28], and \({\mathbf{U}}_{{\psi _{j} }}\) is the velocity direction of the swivel motion:

$$ {\mathbf{U}}_{{\psi _{j} }} = \frac{{\overrightarrow {{{\text{SE}}}} \times \overrightarrow {{{\text{EW}}}} }}{{\left\| {\overrightarrow {{{\text{SE}}}} \times \overrightarrow {{{\text{EW}}}} } \right\|}}, $$
(19)

where \(\overrightarrow {{{\text{SE}}}} = {\mathbf{x}}_{{sj}}^{{}} - {\mathbf{x}}_{{ej}}^{{}}\) is the vector from the shoulder to the elbow and \(\overrightarrow {{{\text{EW}}}} = {\mathbf{x}}_{{ej}}^{{}} - ^{2} {\mathbf{\bar{x}}}_{{wj}}^{{}}\) is the vector from the elbow to the wrist of the generalized human demonstrations. These vectors are then used for the robot joint planning and for the calculation of \(\theta\) shown in Fig. 1.
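As a concrete illustration of (18) and (19), the null-space redundancy resolution can be sketched in a few lines of Python. The Jacobians `J` and `J_E` below are placeholders standing in for the robot-specific \({\mathbf{J}}_{j}\) and \({\mathbf{J}}_{Ej}\), and the swivel rate `psi_dot` is an assumed input:

```python
import numpy as np

def swivel_direction(x_s, x_e, x_w):
    """Unit velocity direction of the swivel motion, Eq. (19):
    the normal of the arm plane spanned by shoulder->elbow and elbow->wrist."""
    se = x_s - x_e          # vector from the shoulder to the elbow
    ew = x_e - x_w          # vector from the elbow to the wrist
    n = np.cross(se, ew)
    return n / np.linalg.norm(n)

def redundant_joint_velocity(J, J_E, x_dot, psi_dot, u_psi):
    """Null-space redundancy resolution, Eq. (18).
    Primary task: end-effector twist x_dot (6,), with J of shape (6, 7).
    Secondary task: swivel rotation psi_dot about u_psi, with J_E of shape (3, 7)."""
    J_pinv = np.linalg.pinv(J)            # (7, 6) Moore-Penrose pseudoinverse
    N = np.eye(J.shape[1]) - J_pinv @ J   # (7, 7) null-space projector
    q_dot_task = J_pinv @ x_dot           # tracks the hand trajectory
    q_dot_swivel = np.linalg.pinv(J_E) @ (psi_dot * u_psi)
    return q_dot_task + N @ q_dot_swivel  # swivel term cannot disturb the hand
```

Because the secondary term is projected through \({\mathbf{I}} - {\mathbf{J}}_{j}^{ + } {\mathbf{J}}_{j}\), it lies in the null space of the primary task, so the swivel motion leaves the end-effector velocity unchanged.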

Experiment

In the experiment, we acquire demonstration data with a Kinect and verify the manipulation effect on a virtual model in Gazebo. The skeleton data of a task of holding and placing a box are recorded as demonstrations. The positions of the joints (the elbow, shoulder, and wrist) are used for skill training in Cartesian space, as shown in Fig. 5. The upper figures (a) to (g) illustrate the human demonstrations, and the lower figures (a) to (g) show the human skeleton data acquired by the Kinect. However, the acquired sampling data contain errors, such as the left-leg motions in (b) and (d) and the hand motions in (e), which are caused by object occlusion.

Fig. 5
figure 5

Human demonstrations and skeleton data

Figures 6 and 7 show the results based on the initialization of the hands' positions using (14). Figure 6 presents the changes of the trajectories and skeletons of the original and modified DMPs. Figure 7 presents the tracking errors with respect to the predesigned reference trajectory satisfying the constraints in (14) to (17). Combining with Fig. 7, we can see that the trajectories of both the left and right arms are changed to avoid violating (14) to (17). As seen from the blue lines (results of the original DMPs) in Fig. 7, the amplitudes of the errors in each axis vary from − 0.1 to about 0.05, causing the final distance between the elbow and the wrist to vary within the large range \((0.12,0.25)\), which seriously contradicts the fact that the forearm length is constant.

Fig. 6
figure 6

Desired and real shoulder, elbow and wrist movements. a Changes of trajectories, b changes of skeletons

Fig. 7
figure 7

Errors between the desired and real elbow and wrist positions. a Results of the left arm, b results of the right arm

The distance between the elbow and the hand contact point (palm) is measured to be about 0.28 m, namely \(d_{e}^{{}} (\infty ) = 0.28\). Setting \(\sigma _{e}^{{}} = 0.01\), the generalized results show that the errors to the desired position decrease to the range \(( - 0.01,0.01)\) after the first few steps, and the distance converges to a value around the desired 0.28 for both hands. The initial distance errors are large: about 15 iterations are needed to drive them down to the desired conditions in (14), which are further processed in step 2 by using (15) to (17). Figure 8a, b presents the original distances of the joints (the elbow and wrist, and the elbow and shoulder). It shows that the joint distances vary within a range of \((0.13,0.32)\) for both the elbow and the shoulder, and the elbow and the wrist of both arms. Using the proposed two-step method, we obtain fixed distances for the joint links and a varying relative distance for the dual hands' manipulation, both of which converge to a stable interval (Figs. 9a, b, 10).
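The distance bookkeeping in this step can be sketched as follows. The reference distance 0.28 m and tolerance \(\sigma _{e} = 0.01\) come from the measurements quoted above, while the skeleton samples used to exercise the functions are synthetic:

```python
import numpy as np

D_E = 0.28      # measured elbow-to-palm distance (m), d_e(inf) in the text
SIGMA_E = 0.01  # allowed error band, sigma_e

def link_distances(elbow, wrist):
    """Per-sample Euclidean distance between the elbow and wrist points
    of a generalized skeleton trajectory (both arrays of shape (N, 3))."""
    return np.linalg.norm(elbow - wrist, axis=1)

def converged(distances, d_ref=D_E, tol=SIGMA_E):
    """True once every sample stays inside (d_ref - tol, d_ref + tol),
    i.e. the constraint in (14) is satisfied."""
    return bool(np.all(np.abs(distances - d_ref) < tol))
```

The same check applies to the elbow-shoulder link with its own reference length; iterating the generalization until `converged` returns True mirrors the roughly 15 iterations reported above.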

Fig. 8
figure 8

Joint distances of the right and left arms. a Results of the left arm, b results of the right arm

Fig. 9
figure 9

Modified joint distances of the right and left arms. a Modified distance of the joints of the left arm. b Modified distance of the joints of the right arm

Fig. 10
figure 10

Original, modified and compared relative distance of the wrists

As we set the desired joint distances as \(d_{e}^{{}} \equiv 0.28\) and \(d_{s}^{{}} \equiv 0.22\) by actual measurement, with an error range of 0.01, the results presented in Fig. 9 verify the effectiveness of the proposed method even for large initial errors. The performance functions for the upper and lower boundaries are set as

$$ \left\{ \begin{gathered} \sigma _{{2j}}^{{}} = 0.02e^{{ - k/10}} + 0.01 \hfill \\ \sigma _{{1j}}^{{}} = - 0.02e^{{ - k/10}} - 0.01 \hfill \\ \end{gathered} \right., $$
(20)

where \(k\) denotes the sampling index. Additionally, we compare the results with the method in [17], setting the desired distance as \(d_{j}^{{}} (t) \equiv 0.3\). Figure 10 shows the measured relative distances of the two human hands (blue lines). The black dashed lines present the boundaries of the relative distance, which decrease from 0.32 to the interval \((0.29,0.31)\). The method in [17] enables the relative distance to decrease quickly and stay around 0.305, but it cannot shape the relative distance with a dynamic performance function such as (20).
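A minimal sketch of the performance boundaries in (20): with \(k\) as the sampling index, the bands start at \(\pm 0.03\) and shrink exponentially toward the steady-state band \(\pm 0.01\).

```python
import numpy as np

def boundaries(k):
    """Exponentially shrinking performance boundaries of Eq. (20);
    k is the sampling index (scalar or NumPy array)."""
    sigma2 = 0.02 * np.exp(-k / 10.0) + 0.01   # upper boundary sigma_2j
    sigma1 = -0.02 * np.exp(-k / 10.0) - 0.01  # lower boundary sigma_1j
    return sigma1, sigma2
```

A tracking error that stays between `sigma1(k)` and `sigma2(k)` for every sample satisfies the prescribed transient and steady-state performance.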

The last step is extending the learned skills to a new task. We select two Franka robots as manipulators to hold and place an object with a width of 0.3 m; their bases are located at \([0.2,0,0.4]\) and \([ - 0.2,0,0.4]\). The simulations are performed in Matlab, and the results are then transferred to and verified in the Gazebo environment. Here, we select the 4th joint as the elbow joint to solve the redundant solution. Figure 11 presents the simulation process in which the two Franka robots are controlled under PD control to move the object along the trajectories generated in Matlab. During the simulation, the object and the robot end effectors are given a certain degree of deformation to counteract the influence of the relative-distance tracking errors.
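The PD trajectory tracking used in the simulation can be sketched in a simplified joint-space form. The double-integrator joint model and the gains `kp`, `kd` below are illustrative assumptions, not the actual Franka controller:

```python
import numpy as np

def pd_track(q_ref, q0, kp=40.0, kd=8.0, dt=0.001):
    """Track a joint-space reference trajectory q_ref (steps x n_joints)
    with a PD law on a double-integrator joint model; gains are assumed."""
    q, dq = q0.astype(float).copy(), np.zeros_like(q0, dtype=float)
    out = []
    for qr in q_ref:
        ddq = kp * (qr - q) - kd * dq  # PD law: spring toward qr, damping on dq
        dq = dq + ddq * dt             # Euler integration of the joint model
        q = q + dq * dt
        out.append(q.copy())
    return np.array(out)
```

With these gains the closed loop is well damped, so the joints settle onto the Matlab-generated reference within about a second of simulated time; in the Gazebo run the same reference is fed to the built-in joint controllers instead.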

Fig. 11
figure 11

Simulated verification based on Matlab and Gazebo

Conclusion

In this paper, we proposed a new DMP-based skill learning and generalization framework for dual-arm redundant cooperative manipulation. The framework provides three functions: skill learning and generalization under the relative-distance limitation, trajectory replanning under the joint-distance restriction, and a redundant solution for multi-DoF robots based on the generalized dual-arm skills. The first two skill/trajectory learning and generalization methods are built on the integration of BLFs and DMPs. Using demonstration data acquired by a Kinect, the effectiveness of the proposed framework is verified on a task of holding and placing an object in Matlab and Gazebo simulations. Each technical method is validated and explained by the simulation results. Future work will deploy the framework on a real robotic system to complete skill learning autonomously.