1 Introduction

Robot manipulators have been widely used in a number of fields, such as industrial assembly [1], space exploration [2] and medical surgery [3]. Specifically, they have been utilised to perform tasks in specific and structured environments owing to their low cost, efficiency and safety. However, it is hard to program robots for diverse scenarios, and programming each robot manually is time-consuming. With the rapid development of machine learning techniques, robot skill learning has attracted increasing attention. Several machine learning techniques, e.g., reinforcement learning, imitation learning and deep learning [4, 5], have been successfully employed in robotic skill learning. Among these methods, learning from demonstration (LfD), also named programming by demonstration (PbD), has proved to be an effective way to transfer manipulation skills from humans to robots [6]. Moreover, humans often have substantial advantages over robots in terms of complex manipulation skills. In contrast to traditional robot programming methods, which require coding expertise and a significant time investment, an attractive aspect of LfD is its capability to facilitate non-expert robot programming. Thus, LfD has the potential to significantly benefit a variety of industries, such as manufacturing and health care.

Currently, industrial robots commonly perform accurate position-control tasks, but preparing the work environment and the robot programs carefully is time-consuming. The trajectory often needs to be replanned whenever a variation occurs, such as a change of object position or a deviation between the real object and the programmed position, which limits the application of automation, e.g., assembly tasks in an industrial plant. As another example, object handover [7] is a common task in human–robot interaction and collaboration, and its generalisation as well as its temporal and spatial scaling remain very challenging. The method proposed in [8] can generate a trajectory for the handover task that satisfies both shape-driven and goal-driven requirements: it is guaranteed to reach the goal while trying to maintain the demonstrated trajectory shape.

The LfD process consists of three phases: human demonstration, model learning and skill reproduction. In the demonstration stage, humans teach robots how to execute tasks through various approaches, such as kinesthetic teaching, teleoperation and passive observation, and the movement profiles of robots and humans are recorded. In the subsequent learning stage, the manipulation skill models are trained, which has a significant impact on the performance of robot skill learning and generalisation in practice. The skill model is expected to be modular, compact and adaptive for robotic manipulation skills. Much work already exists on skill modelling for human–robot skill transfer, such as dynamic movement primitives (DMPs) [9, 10], the Gaussian mixture model (GMM) [11], the stable estimator of dynamical systems (SEDS) [12], kernelised movement primitives (KMPs) [13], probabilistic movement primitives (ProMPs) [14] and the hidden semi-Markov model (HSMM) [15]. Some of these approaches have been combined, for instance integrating the HSMM with the GMM to model robot skills and the perception mechanism [16]. Based on the modelling principle, they can generally be divided into two branches: dynamical-system methods and statistical approaches. The statistics-based methods include GMM, KMP, ProMPs and HSMM, which can easily represent multimodal sensory information. The DMP, in contrast, is a general framework for movement planning and online trajectory modification in LfD, originally proposed by Ijspeert et al. [17]. Since DMPs have several desirable characteristics, such as robustness to perturbations and uncertainties and spatial and temporal scaling, they have been gaining much attention. The DMP approach can generalise learnt skills to new initial and goal positions while maintaining the desired kinematic pattern. Since the original version of DMPs was proposed, a number of modified versions have been studied to improve their performance. Most of these works focus on two issues: how to improve the generalisation ability of DMPs and how to overcome their inherent drawbacks. More recently, DMPs have also been used to encode other modalities, such as stiffness and force profiles. For example, DMPs with a perceptual term have been proposed to execute physical interaction tasks, which require robots to regulate the contact force and torque as well as the desired trajectory [18]. Besides, some researchers proposed coupled DMPs to realise obstacle avoidance, interaction with objects and bimanual manipulation by modifying the DMP formulation or adding control methods [19]. Reinforcement learning has also been used to optimise the parameters of DMPs; such RL-based DMPs were proposed to further improve the generalisation ability of the original DMPs [20].

An essential aspect of LfD is how to generalise learnt skills to novel environments and situations. Since demonstrations cannot cover all working environments and situations, robots need the ability to adapt their behaviours to environmental changes. The adaptability of robot skills often refers to spatial and temporal scaling, i.e., adjusting behaviours based on perception information. For example, when tracking a moving target, the robot needs to modify its trajectory according to the position and velocity of the target [21]. Besides, many specific tasks require the generalisation of robot skills, such as obstacle avoidance and performing tasks in dynamic environments. Hoffmann et al. modified the original DMP framework using biologically inspired dynamical systems to increase generalisation, achieving real-time goal adaptation and obstacle avoidance [22]. Sensory information has been integrated into the DMP framework to increase online generalisation, generating trajectories that are robust to external perturbations and perception uncertainty [23]. Neural network techniques have been utilised to learn the perceptual term in DMPs to realise reactive planning and control, paving the way for robots working in dynamic environments. Further, the modulation of DMPs has been exploited using force and tactile feedback to increase the interaction ability and execute bimanual tasks [23]. In addition, a task-oriented regression algorithm with radial basis functions has been proposed to increase the generalisation of DMPs. For dynamic tasks, such as tracking moving targets, many researchers have proposed modified versions of DMPs to deal with moving goals. For example, the authors of [21] modified the DMP by adjusting the temporal scaling parameter online to follow a moving goal, although they only considered the position trajectory in Cartesian space. A single DMP cannot produce complex behaviour; merging different DMPs is thus very important to deal with this challenge [24], and building a motion-skill library from which robots can compose complex behaviour is a useful tool. Complex trajectories involving several actions can be reproduced by sequencing multiple motion primitives, each represented as a DMP, and various approaches have been investigated to connect the motion primitives seamlessly [24]. However, most of the current work focuses on position motion primitives in Cartesian or joint space, and there is a lack of research on orientation primitive trajectories in Cartesian space.

Most recently, various versions of DMPs have been proposed to increase online adaptability to uncertainties and novel tasks. However, spatial scaling is limited because the position trajectory is encoded for each coordinate separately. Most existing work focuses on DMPs representing position skills in Cartesian space, ignoring the orientation requirements of some applications, such as obstacle avoidance [22, 25], picking and placing [10] and cutting [9]. For some tasks, however, orientation is essential: in robot-assisted ultrasonography, for instance, the probe orientation has a significant impact on image quality, so orientation and position need to be considered simultaneously. The forcing term in DMPs is often approximated by Gaussian basis functions, with the locally weighted regression (LWR) technique used to learn the weight of each basis function. However, [26] showed that the forcing-term approximation can influence the accuracy and performance of DMPs, and different basis functions have been studied to improve their performance.

In this work, a composite DMPs-based skill learning framework is studied, which considers not only position constraints but also orientation requirements. Both the temporal and the spatial generalisation capability are increased. Besides, the DMP-based framework can adapt temporally to moving targets with specific orientation requirements. Further, RBFNNs are utilised to learn the nonlinear functions in the composite DMPs.

The contributions of this work are: (1) combining DMPs and RBFNNs to improve the generalisation of robot manipulation skills, where radial basis function neural networks are employed to approximate the forcing terms in the composite DMPs; (2) modelling a basic skill involving position and orientation simultaneously with the composite DMPs, coupled through the temporal parameter; and (3) enabling the composite DMPs to reach moving goals with temporal and spatial generalisation, guaranteeing convergence to moving goals even under perturbations caused by obstacles.

The rest of the paper is organised as follows. Section 2 provides an overview of position and orientation DMPs in Cartesian space and their limitations. Section 3 presents the composite DMP framework based on RBFNNs, in which RBFNNs learn the nonlinear functions of the combined DMPs, together with the stability analysis of the DMP-based model. Section 4 presents simulation and experimental results validating the temporal and spatial generalisation. Section 5 concludes the paper.

2 Preliminaries and motivations

2.1 Radial basis function neural networks (RBFNNs)

Neural networks have been proven to be an effective approach in robot applications, and much related work has been studied, including the stability of neural networks [27, 28]. RBFNNs are a useful tool for approximating nonlinear functions in robot control and robot skill learning. For instance, RBFNNs have been combined with the broad learning framework to learn and generalise basic skills [29], and they have been employed to approximate the nonlinear dynamics of manipulators to improve tracking performance [16, 30]. Therefore, RBFNNs can approximate the nonlinear forcing term in the DMP framework. A radial basis function network consists of three layers: an input layer, a hidden layer with nonlinear RBF activation functions and a linear output layer. It is an effective approach to approximate any continuous function \(h:{R^n} \rightarrow R\),

$$\begin{aligned} h(x) = {W^T}S(x) + \varepsilon (x) \end{aligned}$$
(1)

where \(x \in {R^n}\) is the input vector and \(W = {[{\omega _1},{\omega _2},\ldots ,{\omega _N}]^T} \in {R^N}\) denotes the weight vector of the N neural network nodes. The approximation error \(\varepsilon (x)\) is bounded. \(S(x) = {[{s_1}(x),{s_2}(x),\ldots ,{s_N}(x)]^T}\) is a nonlinear vector function whose components \({s_i}(x)\) are radial basis functions,

$$\begin{aligned} {s_i}(x) = \exp ( - {h_i}{(x - {c_i})^T}(x - {c_i}))\qquad i = 1,2,\ldots ,N \end{aligned}$$
(2)

where \({c_i} = {[{c_{i1}},{c_{i2}},\ldots ,{c_{in}}]^T} \in {R^n}\) denotes the centre of the i-th Gaussian function and \({h_i} = 1 \big / {\chi _{i}^{2}}\), where \({\chi _i}\) denotes the variance. The ideal weight vector W is defined as,

$$\begin{aligned} W = \mathop {\arg \min }\limits _{{\hat{W}} \in {R^N}} \left\{ {\sup \left| {h(x) - {{{\hat{W}}}^T}S(x)} \right| } \right\} \end{aligned}$$
(3)

which minimises the approximation error of the nonlinear function. The nonlinear functions in DMPs can be learnt by RBFNNs from demonstration data; in this work, RBFNNs will be utilised to parameterise them.
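To make the approximation concrete, the following minimal Python sketch implements Eqs. (1)–(3) for a scalar input. The target function, the uniform centre grid and the least-squares fit are illustrative assumptions of ours; the paper itself learns the weights by gradient descent (Sect. 3.2).

```python
import numpy as np

# A minimal sketch of Eqs. (1)-(3): h(x) ~ W^T S(x). The target function,
# centre placement and least-squares fit are illustrative choices, not
# the training procedure of the paper (Sect. 3.2 uses gradient descent).

def rbf_features(x, centres, widths):
    """S(x): vector of Gaussian radial basis functions, Eq. (2)."""
    return np.exp(-widths * (x - centres) ** 2)

target = lambda x: np.sin(3.0 * x) * np.exp(-x)    # placeholder h(x)

N = 20
centres = np.linspace(0.0, 1.0, N)                 # c_i on a uniform grid
spacing = centres[1] - centres[0]
widths = np.full(N, 1.0 / spacing ** 2)            # h_i = 1 / chi_i^2

# One way to obtain the ideal weights W of Eq. (3): linear least squares
# over sampled inputs, since the model is linear in W.
xs = np.linspace(0.0, 1.0, 200)
Phi = rbf_features(xs[:, None], centres, widths)   # 200 x N feature matrix
W, *_ = np.linalg.lstsq(Phi, target(xs), rcond=None)

print("max approximation error:", np.max(np.abs(Phi @ W - target(xs))))
```

Because the model is linear in W, the minimisation in Eq. (3) reduces to a linear least-squares problem over the sampled inputs.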

2.2 Position and orientation DMP in Cartesian space

DMPs are a useful tool to encode movement profiles via a second-order dynamical system with a nonlinear forcing term. Robot skill learning with DMPs aims to model the forcing term in such a way that the trajectory can be generalised to a new start and goal position while maintaining the shape of the learnt trajectory. DMPs can model both periodic and discrete motion trajectories; in this work, we focus on discrete motions. Currently, most research on DMPs concentrates on position DMPs and their modifications, which can represent arbitrary movements for robots in Cartesian or joint space by adding a nonlinear term that adjusts the trajectory shape. For one dimension of a multi-dimensional dynamical system, the transformation system of the position DMP can be modelled as follows [31],

$$\begin{aligned}&{\tau _s}\dot{v} = {\alpha _z}({\beta _z}({p_g} - p) - v) + F_{p}(x) \end{aligned}$$
(4)
$$\begin{aligned}&{\tau _s}\dot{p} = v \end{aligned}$$
(5)

where \({p_g}\) is the desired (goal) position, p is the current position, v is the scaled velocity, \( \tau _s \) is the temporal scaling parameter, and \( \alpha _{z}, \beta _{z}\) are design parameters, usually chosen as \( \alpha _z =4 \beta _z \). \( F_{p}(x)\) is the nonlinear forcing term responsible for shaping the trajectory, and it can be approximated by a set of radial basis functions,

$$\begin{aligned}&F_{p}(x) = \frac{{\sum _{i = 1}^N {{\psi _i}(x){w_i}} }}{{\sum _{i = 1}^N {{\psi _i}(x)} }}x({p_g} - {p_0}) \end{aligned}$$
(6)
$$\begin{aligned}&{\psi _i}(x) = \exp ( - {h_i}{(x - {c_i})^2}) \end{aligned}$$
(7)

where \(\psi _i (x)\) is a Gaussian radial basis function with centre \( c_i\) and width \(h_i \), \( p_0 \) is the initial position, and \( w_i \) is a weight learned from the demonstration. The phase variable x is governed by the canonical system, which can be represented as follows,

$$\begin{aligned} {\tau _s}\dot{x} = - {\alpha _x}x,\qquad x \in \left[ {0,1} \right] ;\qquad x(0) = 1 \end{aligned}$$
(8)

where \( \alpha _{x} \) is a positive gain coefficient, \( \tau _s \) is the temporal scaling parameter, and \( x(0)=1\) is the initial value of x, which then converges to 0 exponentially. For a multiple-degree-of-freedom (DoF) dynamical system, each dimension is modelled by its own transformation system, but all dimensions share a common canonical system that synchronises them.
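For illustration, a minimal Python sketch of a one-dimensional discrete position DMP, Eqs. (4)–(8), integrated with explicit Euler steps, is given below. The gains, step size and stopping threshold are illustrative assumptions, and the kernel parameters w, c, h are assumed to have been learned from a demonstration beforehand.

```python
import numpy as np

# A minimal sketch of a 1-D discrete position DMP, Eqs. (4)-(8), with
# explicit Euler integration. Gains, step size and the stopping threshold
# are illustrative; w, c, h are assumed learned from a demonstration.

def forcing_term(x, w, c, h, p0, pg):
    """F_p(x) of Eq. (6) with Gaussian kernels psi_i of Eq. (7)."""
    psi = np.exp(-h * (x - c) ** 2)
    return (psi @ w) / (psi.sum() + 1e-10) * x * (pg - p0)

def run_position_dmp(w, c, h, p0, pg, tau_s=1.0, alpha_z=25.0,
                     alpha_x=1.0, dt=0.001):
    beta_z = alpha_z / 4.0            # the usual choice alpha_z = 4 beta_z
    p, v, x = p0, 0.0, 1.0            # x(0) = 1, Eq. (8)
    traj = [p]
    while x > 1e-3:                   # phase has essentially converged
        F = forcing_term(x, w, c, h, p0, pg)
        v_dot = (alpha_z * (beta_z * (pg - p) - v) + F) / tau_s   # Eq. (4)
        p_dot = v / tau_s                                         # Eq. (5)
        x_dot = -alpha_x * x / tau_s                              # Eq. (8)
        v, p, x = v + v_dot * dt, p + p_dot * dt, x + x_dot * dt
        traj.append(p)
    return np.array(traj)
```

Since the forcing term is multiplied by x, it vanishes as the phase decays, so the system reduces to a stable linear attractor around \(p_g\) at the end of the motion.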

The orientation DMP, which is vital to robot learning and control, was first proposed in [32]. Orientation in DMPs is often represented by rotation matrices or quaternions. For example, in [33] unit quaternions are used to model the orientation, and the unit quaternion set minus a single point has been proved to be contractible [34]; this property guarantees the convergence of orientation DMPs. In addition, as the quaternion formulation has fewer variables than the rotation matrix, it has been widely used to represent orientation in robot learning and control. Following [32], the unit quaternion-based transformation system can be described as,

$$\begin{aligned}&{\tau _s}\dot{z} = {\alpha _z}({\beta _z}2\log ({q_g}*{\bar{q}}) - z) + F_o(x) \end{aligned}$$
(9)
$$\begin{aligned}&{\tau _s}\dot{q} = \frac{1}{2}\left[ \begin{array}{l} 0\\ z \end{array} \right] *q \end{aligned}$$
(10)

where \(q \in {S^3}\) denotes the orientation as a unit quaternion and \({q_g}\in S^3\) represents the goal orientation. \( \omega \) denotes the angular velocity, \(z = \tau _s \omega \in {R^3}\) is the scaled angular velocity, ‘*’ denotes the quaternion product, \({\bar{q}}\) represents the quaternion conjugate, which equals the inverse for unit quaternions, and \(2\log ({q_2}*{{\bar{q}}_1}) \in {R^3}\) denotes the rotation from \({q_1}\) to \({q_2}\) around a fixed axis. The forcing term \(F_{o}(x) \in {R^3}\), with one component per orientation coordinate, learns the desired orientation skill from the demonstration data.
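The quaternion operations used above can be made explicit with the following sketch, written in a scalar-first convention [w, x, y, z] that is our own assumption, as the paper does not fix a component ordering. The Euler step and the renormalisation of q are illustrative numerical choices rather than the paper's integration scheme.

```python
import numpy as np

# Quaternion utilities for the orientation DMP, Eqs. (9)-(10), in a
# scalar-first convention [w, x, y, z] (our own assumption). The Euler
# step and the renormalisation of q are illustrative numerical choices.

def q_mult(q1, q2):
    """Quaternion product q1 * q2."""
    w1, v1 = q1[0], q1[1:]
    w2, v2 = q2[0], q2[1:]
    return np.concatenate(([w1 * w2 - v1 @ v2],
                           w1 * v2 + w2 * v1 + np.cross(v1, v2)))

def q_conj(q):
    """Conjugate, equal to the inverse for unit quaternions."""
    return np.concatenate(([q[0]], -q[1:]))

def q_log(q):
    """Logarithmic map S^3 -> R^3 of a unit quaternion."""
    v_norm = np.linalg.norm(q[1:])
    if v_norm < 1e-12:
        return np.zeros(3)
    return np.arccos(np.clip(q[0], -1.0, 1.0)) * q[1:] / v_norm

def orientation_dmp_step(q, z, qg, Fo, tau_s, dt, alpha_z=25.0):
    """One Euler step of Eqs. (9)-(10)."""
    beta_z = alpha_z / 4.0
    eo = 2.0 * q_log(q_mult(qg, q_conj(q)))                        # 2 log(qg * q_bar)
    z_dot = (alpha_z * (beta_z * eo - z) + Fo) / tau_s             # Eq. (9)
    q_dot = q_mult(np.concatenate(([0.0], z)), q) / (2.0 * tau_s)  # Eq. (10)
    q_new = q + q_dot * dt
    return q_new / np.linalg.norm(q_new), z + z_dot * dt
```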

3 Composite position and orientation dynamic movement primitives

Currently, separate position or orientation DMPs have been studied widely [35]; however, research on composite DMPs that model position and orientation simultaneously is not common. In practice, most manipulation skills mix position and orientation, requiring robots to satisfy specific position as well as orientation constraints in many tasks, such as polishing, spraying and assembly [36, 37]. In addition, in human–robot interaction tasks, such as two partners collaborating on an object handover, the target position is constantly changing, and guaranteeing the various orientation requirements remains an open problem. Studying DMPs that can handle moving goals is likewise vital for practical applications. Therefore, inspired by the improvements in orientation DMPs, we propose composite DMPs, coupling the position and orientation modelling in one framework with great generalisation and adaptability to novel tasks and situations, and we use RBFNNs to learn the nonlinear terms of the models.

3.1 The composite DMP formulation

As shown in Fig. 1, manipulation skill modelling by position and orientation DMPs consists of recording the demonstration data, training the RBFNNs and reproducing the skills. The demonstration data include the position and orientation trajectories, and the output of skill reproduction is the reference position and orientation trajectory associated with the specific task. The canonical system coordinates the position and orientation constraints in the composite DMPs. The nonlinear forcing terms of each DMP are trained with RBFNNs from the position and orientation demonstration data; six RBFNNs parameterise the nonlinear functions of the position and orientation DMPs, respectively. After the DMPs have learned the demonstration, the following dynamic and multiple constraints can be guaranteed: (1) the goal and initial position and orientation can be changed; (2) the target can move while the velocity profiles of the DMP output stay within a safe bound; and (3) the position and orientation requirements are satisfied simultaneously.

Fig. 1

Structure of human–robot skill transfer using the composite DMP model

The position DMP formulation can be described as,

$$\begin{aligned}&{\tau _s}\dot{v} = - {\alpha _z}({\beta _z}{e_p} + v) + diag({p_g} - {p_0})f_p(x) \end{aligned}$$
(11)
$$\begin{aligned}&{\tau _s}{{\dot{e}}_p} = v \end{aligned}$$
(12)

where \({e_p} = {p_g} - p \in {R^3}\) is the position error and \(v \in {R^3}\) is the scaled error velocity. \({\alpha _z},{\beta _z}\) are positive gains, and \( f_{p}(x)\) is trained by an RBFNN for each DMP dimension. The system is trained on a demonstration from the initial position \({p_{0,d}}\) to the stationary goal \({p_{g,d}}\) with temporal scaling \({\tau _d}\).
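A minimal sketch of one integration step of this error-coordinate formulation, Eqs. (11)–(12), is given below; the gain values and Euler integration are illustrative assumptions, and f_p denotes the (already trained) RBFNN output of Sect. 3.2.

```python
import numpy as np

# One Euler step of the error-coordinate position DMP, Eqs. (11)-(12).
# f_p is the trained RBFNN output of Sect. 3.2; gains and dt are
# illustrative assumptions.

def composite_position_step(e_p, v, p0, pg, f_p, tau_s, dt, alpha_z=25.0):
    beta_z = alpha_z / 4.0
    v_dot = (-alpha_z * (beta_z * e_p + v)
             + np.diag(pg - p0) @ f_p) / tau_s        # Eq. (11)
    e_p_dot = v / tau_s                               # Eq. (12)
    return e_p + e_p_dot * dt, v + v_dot * dt
```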

The orientation DMP formulation can be described as [33],

$$\begin{aligned}&{\tau _s}\dot{z} = - {\alpha _z}({\beta _z}{e_o} + z) + diag(2\log ({q_g}*{{{\bar{q}}}_0}))f_o(x) \end{aligned}$$
(13)
$$\begin{aligned}&{\tau _s}{{\dot{e}}_o} = z \end{aligned}$$
(14)
$$\begin{aligned}&{e_o} = 2\log ({q_g}*{\bar{q}}) \end{aligned}$$
(15)

where \( e_{o}\) is the quaternion (orientation) error and z is the scaled error velocity. To obtain the orientation, we solve Eq. (15),

$$\begin{aligned} q = \overline{\exp (\frac{1}{2}{e_o})} *{q_g} \end{aligned}$$
(16)

The angular velocity is

$$\begin{aligned} \omega = 2vec(\dot{q}*{\bar{q}}) \end{aligned}$$
(17)

where \(\dot{q}\) can be obtained from the following relations,

$$\begin{aligned}&\dot{q} = - \frac{1}{2}q*\bar{q}_g*J_{\log }({q_g}*{\bar{q}})\,{\dot{e}_o}*q \end{aligned}$$
(18)
$$\begin{aligned}&{\dot{e}_o} = - 2{J_q}({q_g}*{\bar{q}})({q_g}*{\bar{q}}*\dot{q}*{\bar{q}}) \end{aligned}$$
(19)
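The recovery of the orientation and angular velocity from the error state, Eqs. (16)–(17), may be sketched as follows, reusing q_mult and q_conj from the snippet in Sect. 2.2; the finite-difference estimate of \(\dot{q}\) is an illustrative choice of ours.

```python
# Recovering q and omega from the error state, Eqs. (16)-(17), reusing
# q_mult and q_conj from the sketch in Sect. 2.2. The finite-difference
# estimate of q_dot is an illustrative choice.
import numpy as np

def q_exp(r):
    """Exponential map R^3 -> S^3, inverse of q_log on its domain."""
    angle = np.linalg.norm(r)
    if angle < 1e-12:
        return np.array([1.0, 0.0, 0.0, 0.0])
    return np.concatenate(([np.cos(angle)], np.sin(angle) * r / angle))

def orientation_from_error(e_o, qg):
    """Eq. (16): q = conj(exp(e_o / 2)) * qg."""
    return q_mult(q_conj(q_exp(0.5 * e_o)), qg)

def angular_velocity(q, q_prev, dt):
    """Eq. (17): omega = 2 vec(q_dot * q_bar)."""
    q_dot = (q - q_prev) / dt                  # finite-difference q_dot
    return 2.0 * q_mult(q_dot, q_conj(q))[1:]  # vec(.) = vector part
```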

Inspired by the work in [38], the temporal scaling can be adjusted according to the task and velocity constraints. The target position and velocity update the temporal parameter shared by the position and orientation DMPs, which may be described as [21],

$$\begin{aligned} \dot{\tau }_s = -\gamma ({\tau _s} - {\tau _a}) + \dot{\tau }_a \end{aligned}$$
(20)

where \(\gamma \) is a design parameter and \({\tau _a}\) is determined by

$$\begin{aligned}&{\tau _a} = \frac{{\left\| e \right\| }}{{\left\| {{e_d}} \right\| }}\,{\tau _d} \end{aligned}$$
(21)
$$\begin{aligned}&\begin{array}{l} e = {\left[ {e_p^T ,\qquad e_o^T} \right] ^T} \\ {e_d} = {\left[ {e_{p,d}^T , e_{o,d}^T} \right] ^T} \end{array} \end{aligned}$$
(22)

where \({e_p}\) is the position error between the goal and the initial point and \({e_o}\) is the orientation error between the goal and the start; \({e_{p,d}}\) and \({e_{o,d}}\) are the corresponding errors in the demonstration, and \( \tau _d \) is the temporal scaling coefficient of the demonstration. The temporal parameter update law has been proved to converge to moving goals in [21].
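A sketch of this adaptation law, Eqs. (20)–(22), discretised with an Euler step, is given below. The value of \(\gamma\), the step size and the finite-difference estimate of \(\dot{\tau}_a\) are illustrative assumptions.

```python
import numpy as np

# A sketch of the temporal-parameter adaptation, Eqs. (20)-(22). gamma,
# dt and the finite-difference estimate of tau_a_dot are illustrative.

def tau_update(tau_s, e_p, e_o, e_p_d, e_o_d, tau_d, tau_a_prev,
               gamma=2.0, dt=0.001):
    e = np.concatenate((e_p, e_o))                            # Eq. (22)
    e_d = np.concatenate((e_p_d, e_o_d))
    tau_a = np.linalg.norm(e) / np.linalg.norm(e_d) * tau_d   # Eq. (21)
    tau_a_dot = (tau_a - tau_a_prev) / dt
    tau_s_dot = -gamma * (tau_s - tau_a) + tau_a_dot          # Eq. (20)
    return tau_s + tau_s_dot * dt, tau_a
```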

3.2 The training of DMPs by RBFNNs

Take one dimension of the position and orientation DMPs as an example. The nonlinear forcing terms of the position and orientation DMPs can each be approximated by an RBFNN,

$$\begin{aligned}&{f_p}(s) = \sum \limits _i {w_p^i} {\psi ^i}(s) \end{aligned}$$
(23)
$$\begin{aligned}&{f_o}(s) = \sum \limits _j {w_o^j} {\psi ^j}(s) \end{aligned}$$
(24)

where \(w_p^i,w_o^j\) are the weight coefficients and \({\psi ^i}(s)\), \({\psi ^j}(s)\) are Gaussian activation functions, defined as,

$$\begin{aligned}&{\psi ^i}(s) = \exp ( - {h^i}{(s - {c^i})^2}) \end{aligned}$$
(25)
$$\begin{aligned}&{\psi ^j}(s) = \exp ( - {h^j}{(s - {c^j})^2}) \end{aligned}$$
(26)

In the demonstration phase, a position trajectory \({p_d},{\dot{p}_d},{\ddot{p}_d}\) is recorded from the starting position \({p_{0,d}}\) to the target position \({p_{g,d}}\). According to the position DMP transformation system, Eqs. (11) and (12), and the demonstration data, the desired forcing function is

$$\begin{aligned} f_p^d(s) = \frac{1}{{{p_{g,d}} - {p_{0,d}}}}({\alpha _z}({\beta _z}({p_{g,d}} - {p_d}) - \tau _d {\dot{p}_d}) - {\tau _d ^2}{\ddot{p}_d}) \end{aligned}$$
(27)

where \( \tau _d\) is the temporal scaling during the demonstration. Similarly, the forcing term of the orientation DMP can be described as,

$$\begin{aligned}f_o^d(s) &= {(diag(2\log ({q_{g,d}}*{{{\bar{q}}}_{0,d}})))^{ - 1}}\nonumber \\&\quad *({\alpha _z}({\beta _z}2\log ({q_{g,d}}*{{{\bar{q}}}_d}) - \tau _d {\omega _d}) - {\tau _d ^2}{{{{\dot{\omega }}} }_d}) \end{aligned}$$
(28)
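For the position case, Eq. (27) can be evaluated directly on the recorded demonstration arrays, as the following sketch shows for one Cartesian coordinate (the orientation term of Eq. (28) is computed analogously). It assumes the recorded demonstration starts at \(p_{0,d}\) and ends at the goal \(p_{g,d}\).

```python
import numpy as np

# Evaluating the desired position forcing term, Eq. (27), on recorded
# demonstration arrays (position, velocity, acceleration) for one
# coordinate; the demonstration is assumed to end at the goal p_{g,d}.

def desired_position_forcing(p_d, dp_d, ddp_d, tau_d, alpha_z=25.0):
    beta_z = alpha_z / 4.0
    p0, pg = p_d[0], p_d[-1]          # demonstrated start and goal
    return (alpha_z * (beta_z * (pg - p_d) - tau_d * dp_d)
            - tau_d ** 2 * ddp_d) / (pg - p0)         # Eq. (27)
```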

The following error function between the desired forcing terms and their approximations is the objective of the optimisation problem, which is minimised to learn the RBFNN parameters of the DMPs:

$$\begin{aligned} E = \frac{1}{2}({(f_p^d({s^t}) - {f_p}({s^t}))^2} + {(f_o^d({s^t}) - {f_o}({s^t}))^2}) \end{aligned}$$
(29)

where \( s^{t}\) is the value of the phase variable s at time step t. A gradient descent approach is used to derive the weight update laws [39],

$$\begin{aligned}&w_p^i(t + 1) = w_p^i(t) - {\lambda _1}\frac{{\partial E}}{{\partial w_p^i}} \end{aligned}$$
(30)
$$\begin{aligned}&w_o^j(t + 1) = w_o^j(t) - {\lambda _2}\frac{{\partial E}}{{\partial w_o^j}} \end{aligned}$$
(31)
$$\begin{aligned}&\begin{array}{l} \frac{{\partial E}}{{\partial w_p^i}} = (f_p^d({s^t}) - {f_p}({s^t}))\frac{\partial }{{\partial w_p^i}}( - \sum \limits _i {w_p^i{\psi ^i}(s)} )\\ \quad= (f_p^d({s^t}) - {f_p}({s^t}))( - {\psi ^i}({s^t})) \end{array} \end{aligned}$$
(32)

The weight update law of \(w_p^i \) is given as,

$$\begin{aligned} w_p^i(t + 1) = w_p^i(t) + {\lambda _1}(f_p^d({s^t}) - {f_p}({s^t})){\psi ^i}({s^t}) \end{aligned}$$
(33)

Similarly, the weight \(w_o^j\) is updated by,

$$\begin{aligned} w_o^j(t + 1) = w_o^j(t) + {\lambda _2}(f_o^d({s^t}) - {f_o}({s^t})){\psi ^j}({s^t}) \end{aligned}$$
(34)

The weights of the RBFNNs can thus be obtained through the gradient descent approach and the demonstration data.
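The complete update loop of Eqs. (33)–(34) may be sketched as follows for one dimension; the learning rates, epoch count and shared kernels for the position and orientation networks are illustrative assumptions.

```python
import numpy as np

# A sketch of the gradient-descent updates, Eqs. (33)-(34), over the phase
# samples s[t] of one demonstration; learning rates, epochs and the shared
# kernels for the two networks are illustrative assumptions.

def train_weights(s, f_p_d, f_o_d, c, h, lam1=0.5, lam2=0.5, epochs=100):
    w_p, w_o = np.zeros(len(c)), np.zeros(len(c))
    for _ in range(epochs):
        for t in range(len(s)):
            psi = np.exp(-h * (s[t] - c) ** 2)           # Eqs. (25)-(26)
            w_p += lam1 * (f_p_d[t] - psi @ w_p) * psi   # Eq. (33)
            w_o += lam2 * (f_o_d[t] - psi @ w_o) * psi   # Eq. (34)
    return w_p, w_o
```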

4 Experimental results

Since a complex task can be hierarchically decomposed into subtasks involving multiple primitive actions and manipulated objects, several basic motion skills can be synthesised into complex tasks. Thus, in this paper, we conduct several typical motion skills in simulation and experiments. As shown in Fig. 2, the Omni Phantom is an input device for human–robot skill transfer that has been used in teleoperation applications. This haptic device provides the operator with force feedback when interacting with objects or the environment. In this paper, the Omni Phantom is used to acquire human demonstration data in 3D Cartesian space for training the DMP model.

Fig. 2

Six-DoF Omni Phantom

We use the Omni Phantom to demonstrate trajectories, including position and orientation in Cartesian space; the device records the position and orientation of its stylus. During the demonstration, both the position and orientation trajectories are recorded and used to train the skill model. We then use the demonstrated data to train all DMPs and execute them with new start and goal positions and orientations; during execution, we modify the desired task to test the generalisation performance. The parameters of the DMPs are shown in Table 1.

Table 1 Parameters in DMP

4.1 Spatial scaling of composite DMP

To demonstrate the spatial generalisation ability, we carried out simulation experiments on the composite DMPs. While the DMP reproduces the trajectory, we set a new goal position, and the proposed DMPs converge to the desired position. We test the spatial generalisation through the task shown in Fig. 3, which simulates a pick-and-place skill in an industrial setting. First, we demonstrate an obstacle-free trajectory from point A to point B. When the robot later performs the task, the target moves from point B to point C; the experiment assumes the target moves at a known constant velocity. The position DMP then generates a trajectory online to adapt to the dynamic task.

Fig. 3

a The human demonstration from point A to point B; b the target moving from point B to point C

Fig. 4

In (a), the blue line is the human demonstration trajectory, the green line is the trajectory reproduced by the DMPs, and the red dashed line represents the moving target. b The position trajectories generated by the demonstration and the DMP; the red dashed line is the goal’s position trajectory. c The velocity trajectories generated by the demonstration and the DMP; the red dashed line is the goal’s velocity trajectory. d The evolution of the temporal coefficient and its derivative

In Fig. 4, the trajectory generated by the DMPs reaches the desired position of the moving goal even though the DMP was learned with a static goal. Panel (a) shows the human demonstration trajectory and the trajectory reproduced by the DMP; although the goal is moving, the reproduced trajectory maintains the shape of the demonstration. The red dashed line in (c) represents the target velocity, and the green line is the velocity trajectory generated by the DMP. From (d), it can be seen that the temporal scaling parameter \( \tau _s \) increases. In the beginning, since the target velocity is relatively high, the rate of change of \(\tau _s \) is also relatively large, and it then decreases to zero; when the target stops moving, \( \tau _s \) no longer changes. Since the temporal scaling coefficient is tuned based on the goal’s position and velocity, the DMP reaches the target while maintaining the demonstrated shape. In the original DMP, the temporal scaling parameter is fixed, so it is hard to deal with dynamic disturbances such as a moving target or being stopped by an obstacle. The composite DMP can therefore adapt to dynamic environments and tasks based on the position and velocity of the goal.

4.2 The temporal scaling of orientation DMP

Robots are often required to satisfy orientation requirements when coordinating with humans or other robots in industrial applications. To demonstrate the temporal scaling ability of the orientation part of the composite DMP, we modify the execution duration while the DMP reproduces the trajectory. As shown in Fig. 5, a human demonstrates changing the Omni’s orientation from (a) to (b), and the demonstration data are recorded for training the orientation DMP. During reproduction, we set the duration to twice that of the demonstration; the result is shown in Fig. 6.

Fig. 5

A human demonstration changing the pose of the Omni from (a) to (b) through the three orientation joints (red arrows)

Fig. 6

a Pose of the Omni: the blue line is the demonstration trajectory; the green one is the DMP output when the execution time is set to twice that of the demonstration; b the angular velocities generated by the demonstration and the DMP

Fig. 7

a 3D trajectory. b The blue and green lines show the demonstration and DMP trajectories in each direction; the red dashed line is the goal trajectory in the X, Y and Z directions. c The orientation error between the current and the goal orientation. d The evolution of the temporal coefficient and the phase variable x over time

From panel (a) of Fig. 6, we find that the orientation DMP can be scaled temporally: since the execution time is longer, the angular velocity is slower than in the demonstration. Orientation scaling is achieved by adjusting the temporal parameter; when the temporal coefficient \(\tau _s \) is doubled, the execution time doubles while the trajectory shape is maintained. From panel (b), the angular velocity trajectory follows the same pattern as the demonstrated one when the execution time is modified. The trajectory is also smooth and can be adjusted temporally based on the task requirements and the perception of the external environment. This property can be used to adjust the orientation dynamically and satisfy orientation requirements. When the DMP couples position and orientation, the temporal coefficient is adjusted based on the position and orientation tasks and the external environment.

4.3 The performance of composite DMP for a moving goal

For tasks with position and orientation constraints, coupling the position and orientation DMPs is necessary, so we test the performance of the composite DMPs on tasks requiring position and orientation simultaneously. In this case, we first demonstrate a trajectory involving both position and orientation and then train the composite DMPs on the demonstration data. During reproduction, the DMPs must generate position and orientation trajectories for a moving goal while satisfying the orientation constraints. The performance of reaching a moving target under an orientation constraint is shown in Fig. 7.

From (a) and (b) in Fig. 7, the trajectory generated by the DMPs reaches the moving goal with the desired orientation. Because of the moving goal, the temporal scaling \( \tau _s \) increases; although the target moves at a constant velocity, the shapes of the position and orientation trajectories remain consistent with the demonstration, and the increase in temporal scaling keeps the velocity profile similar to the learned pattern. The position and orientation constraints are satisfied simultaneously. In the composite DMP, the goal’s motion influences the temporal scaling and phase variable and thereby the shape of the trajectory, so position and orientation are coupled and adjusted according to the task and the external environment through the temporal scaling. The proposed composite DMP thus handles the moving goal and the orientation requirements simultaneously.

5 Conclusion

This paper proposed composite DMPs that couple the position and orientation representations and use RBFNNs to approximate the nonlinear forcing terms. The composite DMPs can track moving goals while keeping the velocity within a safe range. The temporal and spatial generalisation performance is validated on several primitive skills. In the future, we will consider extending the DMPs to model various types of sensory information, enabling them to interact with the environment, and applying them to cooperative manipulation tasks for bimanual manipulators.