A Shared Control Approach Based on First-Order Dynamical Systems and Closed-Loop Variable Stiffness Control

In this paper, we present a novel learning-based shared control framework. This framework deploys first-order Dynamical Systems (DS) as motion generators providing the desired reference motion, and a Variable Stiffness Dynamical Systems (VSDS) controller (Chen et al. 2021) for haptic guidance. We show how to shape several features of our controller in order to achieve authority allocation and local motion refinement, in addition to the inherent ability of the controller to automatically synchronize with the human state during joint task execution. We validate our approach in a teleoperated task scenario, where we also showcase the ability of our framework to deal with situations that require updating task knowledge due to possible changes in the task scenario or in the environment. Finally, we conduct a user study to compare the performance of our VSDS controller for guidance generation to two state-of-the-art controllers in a target reaching task. The results show that our VSDS controller has the highest success rate of task execution among all conditions. Moreover, our VSDS controller helps reduce the execution time and task load significantly, and was selected as the most favorable controller by participants.


Introduction
Despite the recent advancements in robot motion planning and control, teleoperation is still a viable solution in domains such as surgical procedures, which involve delicate or dynamic environments and can therefore benefit from human cognitive and problem-solving abilities. Nevertheless, teleoperating a robot can still be a mental burden that requires a lot of time and practice.
To that end, the notion of shared control was introduced and proved to be useful in many applications such as surgical robotics, autonomous driving and nuclear sites. The basic idea in shared control is that a human interacts with an autonomous agent that encodes some form of task knowledge, thereby reducing the operator workload and facilitating task execution. For instance, the control space can be partitioned such that the autonomy controls a subset of the degrees of freedom, while the human is in charge of the rest [2,3]. Another possibility is to fuse human inputs with the outputs of the autonomous agent depending on some authority allocation metric [4,5]. Alternatively, virtual fixtures can be devised to provide haptic guidance rendered on the master interface, which can guide the operator along a desired path [6], avoid certain areas of the environment (forbidden region virtual fixtures) [7], reach optimal grasping poses [8] and enforce task-related geometrical constraints [9].
Recently, with the increasing popularity of machine learning, Learning from Demonstrations (LfD) has been introduced for the design of shared control techniques, where task knowledge is obtained through demonstrations provided by an expert, which are then encoded by a regression model that can be deployed to guide a novice user to achieve the desired task. This can be the case, for example, in surgical procedures to help train a novice surgeon to perform certain surgical maneuvers [10]. For instance, in [11], two shared control architectures relying on LfD in the form of Gaussian Mixture Models (GMM) were compared for a teleoperated protection cover replacement task. GMM were also deployed in [12] and [13] to design virtual fixtures that guide the user to one of several possible goal locations, depending on the probability of each. Along the same lines, in [14], probabilistic movement primitives were combined with the flow controller from [15] for guidance generation. In [16], the authors suggest Locally Weighted Regression (LWR) to encode human demonstrations in order to provide a time-indexed trajectory for an impedance controller that provides a guiding force, with a spring stiffness inversely proportional to the variance in demonstrations. They also propose incremental learning for refining the desired motions. While these works mostly rely on haptic guidance generation, the work in [3] exploits Dynamic Movement Primitives (DMPs) to predict the evolution of one translational DOF, depending on the human state which controls the other DOF. Incremental learning is also used to refine task knowledge due to a change in the environment.
The aforementioned techniques mainly employ LfD to infer a desired motion plan, which can be subsequently used for haptic guidance. To the best of our knowledge, first-order dynamical systems (DS) [17][18][19][20][21] have not been considered before in shared control for motion generation. Therefore, it was not possible to benefit from their asymptotic stability properties in terms of convergence to the desired equilibrium, regardless of the initial position or possible perturbations along the robot motion. Such features cannot be guaranteed, for instance, in GMM or LWR techniques. Furthermore, DS motion generators do not rely on a clock signal, as DMPs do for example, which makes them well suited to handle temporal perturbations.
In this regard, the DS formulation, being essentially a velocity field, lends itself nicely to closed-loop configuration control formulations, where motion generation and control are combined in one loop, eliminating the notion of "tracking" a time-indexed trajectory. This was shown in [22], where a flow controller was developed to follow the integral curves of a first-order DS. In [1,23], the so-called symmetric attraction behavior is also enforced in the DS, which refers to the robot's ability to be attracted back to a desired path once perturbed. In addition to the inherent robustness and safety, such a closed-loop formulation can be highly beneficial for designing the haptic guidance in shared control frameworks, due to the fact that the controller is always aware of the current human state. Therefore, there is no need for the human to actively think about matching the speed of an open-loop time trajectory as in [3], or to attempt the synchronization of the DMP clock variable to that of the human [3]. Instead, the synchronization is automatically ensured by the controller configuration.
In this work, we present a new shared control architecture that builds on the use of first-order DS as motion generators, and on closed-loop control to generate haptic guidance. In particular, we exploit our recently developed Variable Stiffness Dynamical Systems (VSDS) controller [1], which takes as input any desired first-order DS representing a motion plan and a desired (constant or possibly varying) stiffness profile, and generates a force field that allows following the desired path, while symmetrically attracting locally to it with an interactive behavior dictated by the desired stiffness, in a spring-like manner. VSDS is constructed as the non-linear weighted sum of linear spring systems, centered around a set of equidistant attractors sampled from a first-order DS, where the weights are determined via Gaussian kernels. While in [1] we demonstrated the benefits of our controller for autonomous task execution, in this work we show how to exploit and adapt our controller features to develop a new shared control approach. For instance, it can be used with any DS, which offers the flexibility to benefit from existing learning/regression techniques available for DS in the literature. The controller is in closed loop, and therefore synchronizes automatically with the human state. In addition, the ability to encode variable stiffness profiles can be used to adjust the strength of the guidance depending on the human confidence or the model knowledge. Moreover, the symmetric attraction behavior means the user is always pulled to a desired path, which can be crucial to successful task execution, in addition to convergence to the global attractor. Finally, this attraction only holds locally, which means that the width of the attraction region can be adjusted to be consistent with the stiffness, and can therefore be designed such that the human can escape the guidance when needed. To summarize, we show how first-order DS and VSDS can be effectively employed in a shared control architecture, for the purposes of motion and guidance generation, authority allocation and incremental motion refinement. To the best of our knowledge, this was not explored before. We further verify our approach in experiments in multiple scenarios, and in a user study.

Fig. 1 The overall architecture of the proposed shared control approach. $f_r$ is the motion generator first-order DS that provides reference motions to VSDS. $\sigma^2$ is the predictive variance calculated by Gaussian Process Regression, and $K_{des}$ defines the desired stiffness profile for VSDS. $u_c$ represents the control input generated by the VSDS controller, while $u_h$ are the external forces from the human operator. $x_m$ is the position of the master device in Cartesian space, and $x_r$ is the position of the remote robot.
The rest of this work is organized as follows: Section 2 explains the different components of our proposed shared control framework. In Section 3, we evaluate our approach in several scenarios and also conduct a user study to compare with other state-of-the-art controllers. In Section 4, we discuss the results of the user study and the proposed approach. Finally, Section 5 concludes and provides future work directions.

Proposed framework
In this work, we consider a teleoperation scenario where a human physically interacts with a master robot to control the motion of a remote manipulator, to complete a desired task. The results, however, can be straightforwardly extended to the case where the human directly interacts with a robot, e.g., in a cooperative manipulation scenario. In the following, we present the fundamental building blocks of our shared control architecture, illustrated in Fig. 1. For a complete shared control solution, such a framework would consist of a motion generator that outputs a desired motion plan, and naturally a controller that provides haptic guidance depending on the desired motion. Furthermore, the strength of this guidance should be adjusted given some criteria, in such a way that the authority is arbitrated between the human and the autonomous agent. Finally, the framework should provide an option for the human to locally adapt generated motions depending on changes in the environment or task scenario.

Motion Generation
The first part of the proposed framework is the motion generator, which outputs the desired path for a specific task. In this work, this is provided by a first-order time-invariant DS. While in principle any state-of-the-art DS approach can be used, we chose a DS based on the formulation proposed in [20], since it can be seamlessly extended with incremental learning. We deploy LfD to learn an initial DS model from demonstrations provided by the user. We assume that the demonstrations are given by position-velocity pairs, and describe point-to-point motions that converge to the same final goal location. Furthermore, we assume that the demonstrations do not feature intersections or self-loops, due to the inability of a first-order DS representation to learn such features. In such cases, representations based on second-order DS can be sought [17], which is however outside the scope of this work.
To learn the initial DS model, let the original DS be

$$\dot{x}_{d,o} = f_o(x_r), \tag{1}$$

where $x_r \in \mathbb{R}^n$ is the robot state variable, chosen here as the Cartesian end-effector position ($n = 2$ in this paper), $f_o$ represents a linear globally asymptotically stable DS, and $\dot{x}_{d,o}$ is the desired velocity. Naturally, the velocity of the demonstrations will differ from the velocity field described by $f_o$. By rotating and scaling $f_o$ according to (2), it is possible to reshape it to match the demonstrated velocity field. Therefore, LfD becomes the task of learning to reshape the original DS based on demonstrations. The rotation and scaling parameters can be combined to form a modulation field

$$T(x_r) = \left(1 + \kappa(x_r)\right) R(x_r), \tag{2}$$

where $\kappa(x_r)$ is the scaling factor, and $R(x_r)$ is the rotation matrix. In two-dimensional space, the rotation matrix has the form

$$R(x_r) = \begin{bmatrix} \cos\phi(x_r) & -\sin\phi(x_r) \\ \sin\phi(x_r) & \cos\phi(x_r) \end{bmatrix}, \tag{3}$$

where $\phi(x_r)$ represents the state-dependent rotation angle. The reshaped DS is then expressed as

$$\dot{x}_d = f_r(x_r) = T(x_r) f_o(x_r), \tag{4}$$

and does not lead to any spurious attractors or cause divergent behaviors [20]. Learning the reshaped DS from demonstrations is equivalent to learning the state-dependent parameters $\phi(x_r)$ and $\kappa(x_r)$, termed modulation parameters. The raw demonstration data, consisting of positions and velocities, can be converted to positions and modulation parameters, where the positions are the inputs and the modulation parameters are the outputs. The detailed conversion process is explained in [20]. As in [20], we choose a Gaussian Process (GP) to fit the training data, because it enables incremental learning by simply enlarging the training dataset. The squared exponential covariance function (5) between two positions $x$ and $x'$ is chosen to construct the covariance matrix, where $\sigma_f, l > 0$ are hyperparameters. Additionally, random Gaussian noise is added to the covariance matrix. In this work, we set the hyperparameters to pre-fixed values.
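To make the reshaping in (4) concrete, the following minimal Python sketch applies a rotation and a scaling to a linear DS in the plane. It assumes the modulation takes the form $T = (1 + \kappa) R(\phi)$ used for locally modulated DS, and it uses the gain 0.4 reported later for the linear DS; the function names and the callables `phi_fn` and `kappa_fn` (stand-ins for the learnt modulation parameters) are hypothetical and only serve the illustration.

```python
import math

def f_o(x, x_star, gain=0.4):
    """Linear, globally asymptotically stable DS pointing towards x_star."""
    return (-gain * (x[0] - x_star[0]), -gain * (x[1] - x_star[1]))

def modulation(phi, kappa):
    """2x2 modulation matrix, assumed here to be T = (1 + kappa) * R(phi)."""
    s = 1.0 + kappa
    c, sn = math.cos(phi), math.sin(phi)
    return ((s * c, -s * sn), (s * sn, s * c))

def f_r(x, x_star, phi_fn, kappa_fn):
    """Reshaped DS f_r(x) = T(x) f_o(x) with state-dependent phi and kappa."""
    T = modulation(phi_fn(x), kappa_fn(x))
    v = f_o(x, x_star)
    return (T[0][0] * v[0] + T[0][1] * v[1],
            T[1][0] * v[0] + T[1][1] * v[1])

# With phi = 0 and kappa = 0 the modulation is the identity, so the
# reshaped DS falls back to the original linear DS.
unmodulated = f_r((1.0, 0.0), (0.0, 0.0), lambda x: 0.0, lambda x: 0.0)
```

Under this assumed form, zero modulation parameters leave the original DS untouched, which matches the later observation that far from the demonstrations the governing dynamics are those of the linear DS.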
After fitting the training dataset into the GP model, we use Gaussian Process Regression (GPR) to compute the predicted modulation parameters $\phi(x_r)$ and $\kappa(x_r)$, given a certain position $x_r$. GPR outputs a predictive mean value $\mu(x_r)$ and a predictive variance $\sigma^2(x_r)$, which is computed by following the standard expression in GPR [20]. The variance indicates the certainty of the GPR about the prediction, i.e., a low variance means the model is confident about its prediction, while a high variance means the model is less certain. Finally, we obtain the reshaped DS as in (4), which outputs a motion plan to the global attractor given any starting position. An example of this DS is shown in Fig. 2, left.
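The behavior of the predictive variance can be illustrated with a small, dependency-free GPR sketch. It assumes a zero prior mean and the common squared exponential form $\sigma_f^2 \exp(-\|x - x'\|^2 / (2 l^2))$, which may differ in detail from the kernel used in this work; the helper names and the default length scale are hypothetical choices for illustration only.

```python
import math

def sq_exp_kernel(a, b, sigma_f=1.0, length=0.1):
    """Assumed squared exponential kernel between two positions."""
    d2 = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return sigma_f ** 2 * math.exp(-d2 / (2.0 * length ** 2))

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - tail) / M[i][i]
    return z

def gpr_predict(X, y, x_query, sigma_n2=0.01):
    """Return predictive mean and variance of a zero-mean GP at x_query."""
    n = len(X)
    # Covariance matrix of the training inputs, with noise on the diagonal.
    K = [[sq_exp_kernel(X[i], X[j]) + (sigma_n2 if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    k_star = [sq_exp_kernel(xi, x_query) for xi in X]
    alpha = solve(K, y)
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve(K, k_star)
    var = sq_exp_kernel(x_query, x_query) - sum(k * vi for k, vi in zip(k_star, v))
    return mean, var

# Two training positions with scalar targets (e.g. a rotation angle).
X_train = [(0.0, 0.0), (0.1, 0.0)]
y_train = [0.5, 0.4]
mean_near, var_near = gpr_predict(X_train, y_train, (0.0, 0.0))
mean_far, var_far = gpr_predict(X_train, y_train, (1.0, 1.0))
```

In this sketch the variance is small at a query that coincides with a training point and approaches the prior variance far from the data, while the mean returns to zero there. This is the property that the variance-based stiffness setting relies on.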

Haptic guidance
Once the DS is learnt, a controller is needed to provide haptic guidance along the desired motion. This is rendered on the master device, as done with virtual fixtures in the shared control literature. The DS model, however, represents a motion on the remote manipulator side, where the task goal is expressed. To solve this problem, given a desired Cartesian position $x_{k,r}$ or velocity $\dot{x}_{k,r}$ on the remote robot side, we map it to the master side via (6), where $x_{k,m}$, $\dot{x}_{k,m}$ are the corresponding positions/velocities on the master side, $x_{0,m}$ and $x_{0,r}$ are the initial positions of the master and remote robots at the start of the teleoperation, and $\beta$ is a scaling factor due to possible differences in workspace. This is needed, for example, in our case, where the motion range of the master is much smaller than the motion range of the remote robot, and therefore master motions need to be scaled up before commanding them to the remote robot.
The considered Cartesian-space gravity-compensated dynamics of the master robot can be expressed as

$$M(x_m)\,\ddot{x}_m + C(x_m, \dot{x}_m)\,\dot{x}_m = u_c + u_h,$$

where $M(x_m)$ is the inertia matrix, $C(x_m, \dot{x}_m)$ is the Coriolis matrix, $u_c$ are the controller forces providing haptic guidance, while $u_h$ are the external forces applied by the human. The remote robot is assumed to perfectly track the motion of the master $x_m$, after mapping it according to (6).
To compute $u_c$, our VSDS controller [1] is used. The controller provides symmetric attraction towards a path generated from one of the integral curves of $f_r$, dictated by the initial robot position, as shown in Fig. 2, right. This is achieved by a nonlinear weighted sum of linear DS, with dynamics $f_i(x_m) = A_i(x_m - x_i)$ centered around a local attractor $x_i$. These attractors are crucial to realize the spring-like attraction behavior shown in Fig. 2, right. They can be computed regardless of the form of $f_r$, and are obtained by simulating $f_r$ to obtain a preliminary sequence of via-points. Then, we re-sample the preliminary via-points into $N$ via-points chosen to be equidistant to ensure a smooth velocity profile, such that $x_0$ is the initial position, while $x_N = x^*$ is the global attractor. The attractors are initially obtained on the remote robot side (illustrated as red dots in Fig. 2, right), and mapped to the master according to (6). The stiffness matrix of the $i$-th local system, $A_i$, is computed as

$$A_i = -Q_i K_{des,i} Q_i^{\top},$$

where $K_{des,i}$ is a diagonal positive definite matrix, sampled from a desired stiffness profile $K_{des}(x_m)$. The eigenvalues of $K_{des,i}$ are interpreted as stiffness values along and perpendicular to the motion direction, which is computed as $f_r(x_i) / \|f_r(x_i)\|$. To realize this, $Q_i$ projects $K_{des,i}$ onto these directions.
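The attractor construction described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the implementation of [1]: the path is integrated with forward Euler, "equidistant" is approximated by equal arc-length spacing along the simulated polyline, and the local stiffness is assembled as $-Q K Q^{\top}$ with the columns of $Q$ being the unit motion direction and its normal. All function names and step sizes are hypothetical.

```python
import math

def simulate_path(f, x0, x_star, dt=0.01, tol=1e-3, max_steps=100000):
    """Forward-Euler integration of a first-order DS until close to x_star."""
    x = tuple(x0)
    path = [x]
    for _ in range(max_steps):
        if math.dist(x, x_star) < tol:
            break
        v = f(x)
        x = (x[0] + dt * v[0], x[1] + dt * v[1])
        path.append(x)
    path.append(tuple(x_star))  # close the path exactly at the attractor
    return path

def resample_equidistant(path, n_segments):
    """Return n_segments + 1 points equally spaced in arc length."""
    cum = [0.0]
    for p, q in zip(path, path[1:]):
        cum.append(cum[-1] + math.dist(p, q))
    total = cum[-1]
    out, j = [], 0
    for k in range(n_segments + 1):
        s = total * k / n_segments
        while j < len(cum) - 2 and cum[j + 1] < s:
            j += 1
        seg = cum[j + 1] - cum[j]
        t = 0.0 if seg == 0.0 else (s - cum[j]) / seg
        p, q = path[j], path[j + 1]
        out.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
    return out

def local_stiffness(direction, k_along, k_perp):
    """A_i = -Q K Q^T for a planar motion direction (assumed construction)."""
    n = math.hypot(direction[0], direction[1])
    e1 = (direction[0] / n, direction[1] / n)  # along the motion
    e2 = (-e1[1], e1[0])                       # perpendicular to it
    a11 = -(k_along * e1[0] * e1[0] + k_perp * e2[0] * e2[0])
    a12 = -(k_along * e1[0] * e1[1] + k_perp * e2[0] * e2[1])
    a22 = -(k_along * e1[1] * e1[1] + k_perp * e2[1] * e2[1])
    return ((a11, a12), (a12, a22))

# Example with a plain linear DS converging to the origin.
linear_ds = lambda x: (-0.4 * x[0], -0.4 * x[1])
via_points = resample_equidistant(simulate_path(linear_ds, (1.0, 0.0), (0.0, 0.0)), 5)
A_example = local_stiffness((1.0, 0.0), 250.0, 1100.0)
```

For a motion direction aligned with the first axis, the resulting matrix is diagonal with the negative along-path and perpendicular stiffness values, so each local system pulls towards its attractor with the intended stiffness in each direction.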
To combine the linear DS, we define a Gaussian kernel $\omega_i(x_m)$ for the $i$-th linear DS, centered at $x_{cen,i} = \frac{1}{2}(x_i + x_{i-1})$, with a smoothing parameter $\epsilon_i$ proportional to the distance between sampled points. The actual weight with which each linear DS affects the dynamics at the current position is then defined by (9). Finally, the control force sent to the master robot is computed according to (10), where $\alpha(x_m)$ is a position-dependent scaling function that avoids large robot accelerations at the beginning of the motion, while $D$ is the damping matrix.
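A rough Python sketch of how the local springs could be blended is given below. It assumes kernels of the form $\exp(-\|x - x_{cen,i}\|^2 / (2\epsilon_i^2))$, weights obtained by normalizing the kernels, and a scalar damping gain; the scaling function $\alpha(x_m)$ is omitted (taken as 1). These choices and all names are assumptions for illustration and may differ from the exact expressions in [1].

```python
import math

def gaussian_kernels(x, attractors, eps_list):
    """Unnormalized kernel value of each local system i = 1..N at position x."""
    values = []
    for i in range(1, len(attractors)):
        centre = [0.5 * (a + b) for a, b in zip(attractors[i], attractors[i - 1])]
        d2 = sum((xj - cj) ** 2 for xj, cj in zip(x, centre))
        values.append(math.exp(-d2 / (2.0 * eps_list[i - 1] ** 2)))
    return values

def vsds_force(x, x_dot, attractors, A_list, eps_list, damping=0.0):
    """Weighted sum of local springs A_i (x - x_i) minus a damping term."""
    kernels = gaussian_kernels(x, attractors, eps_list)
    total = sum(kernels)
    force = [-damping * v for v in x_dot]
    if total == 0.0:
        # All kernels underflowed: only damping remains.
        return tuple(force), [0.0] * len(kernels)
    weights = [k / total for k in kernels]
    for w, A, x_i in zip(weights, A_list, attractors[1:]):
        err = [xj - aj for xj, aj in zip(x, x_i)]
        for r in range(len(force)):
            force[r] += w * sum(A[r][c] * err[c] for c in range(len(err)))
    return tuple(force), weights

# Straight reference path along the first axis with isotropic local stiffness.
attractors = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
A_iso = ((-100.0, 0.0), (0.0, -100.0))
force, weights = vsds_force((0.5, 0.1), (0.0, 0.0), attractors,
                            [A_iso, A_iso], [0.5, 0.5])
```

In this example the position is displaced sideways from the path, so the second force component points back towards the path while the first component pushes forward along it, which is the symmetric attraction behavior described in the text.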

Authority Allocation
Another important aspect in shared control is authority allocation. In this work, this is realized by adjusting the strength of the guidance forces. While several metrics can be used, here we show how the commonly used idea of variance-based authority allocation (e.g., [16]) can be integrated in our framework. Since the GPR outputs the prediction with a mean and a variance $\sigma^2(x_r)$, we use this variance information to set the stiffness of our VSDS. We set a high stiffness in regions having low variances, since a low variance output by GPR indicates closeness to demonstrations. This limits the freedom of the human in deviating from the desired motion. Conversely, we set a low stiffness in regions that have high variances, which are far from demonstrated motions. This makes it easier for the human to overrule the guidance forces. Therefore, authority allocation is implicitly achieved by adjusting the stiffness.
Taking the $i$-th local attractor of VSDS as an example, the desired stiffness profile for a planar motion is expressed as

$$K_{des,i} = \begin{bmatrix} k_{i,1} & 0 \\ 0 & k_{i,2} \end{bmatrix}, \tag{11}$$

where $k_{i,1}$ is the stiffness along the direction of motion, and hence the strength with which the user is pulled along the trajectory, while $k_{i,2}$ is the stiffness perpendicular to the motion direction and penalizes deviations from the path. We chose to set $k_{i,1}$ to a fixed value, while $k_{i,2}$ is computed according to (12), where $a_1$, $a_2$, $\sigma_l^2$, $\sigma_u^2$ are pre-defined thresholds and $\sigma_i^2(x_i)$ is the predictive variance from GPR at the $i$-th local attractor. The second condition of (12) ensures a smooth transition between the low and high variance states, as shown in Fig. 3, left.
We illustrate our stiffness setting based on variances in Fig. 3, right. The green path is the demonstrated motion, and naturally the variances along this trajectory are very low, resulting in high stiffness values at all the local attractors of VSDS. On the other hand, for the red and blue paths, we can see that the stiffness is low at local attractors far away from the demonstrated trajectory, and increases when the position of the local attractor is closer to or coincides with the demonstrations.
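The variance-to-stiffness idea can be sketched as a saturated, smoothly interpolated map. The exact form of (12) is not reproduced here: the cosine blend below, as well as the parameter names `k_high` and `k_low`, are assumptions chosen only to show a high stiffness below the lower variance threshold, a low stiffness above the upper threshold, and a smooth transition in between.

```python
import math

def perpendicular_stiffness(var, var_low, var_high, k_high, k_low):
    """Map a predictive variance to a stiffness perpendicular to the path."""
    if var <= var_low:
        return k_high  # close to demonstrations: strong guidance
    if var >= var_high:
        return k_low   # far from demonstrations: weak guidance
    t = (var - var_low) / (var_high - var_low)
    blend = 0.5 * (1.0 + math.cos(math.pi * t))  # 1 at var_low, 0 at var_high
    return k_low + (k_high - k_low) * blend

# Hypothetical numbers, used only to exercise the function.
profile = [perpendicular_stiffness(v, 0.0, 0.85, 1000.0, 400.0)
           for v in (0.0, 0.2, 0.425, 0.7, 1.0)]
```

The resulting profile decreases monotonically with the variance, so the human gains authority gradually as the reference path moves away from the demonstrated region.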

Incremental Learning
We complement our shared control architecture with online incremental learning in order to refine learnt motions, or to update task knowledge in regions of the state space not demonstrated before. This implies that the human might need to temporarily escape the guidance in order to provide new demonstrations. Therefore, we exploit the fact that our VSDS controller can provide local symmetric attraction in a tunnel region around the reference path. When the human operator moves out of the tunnel, no further guidance is applied, and the master interface goes into gravity compensation mode, where the human is completely free to manipulate the robot. The new demonstrated path is then used for incremental learning.
The tunnel of VSDS is determined by properly setting a threshold value $\omega_{th}$. For each position $x_m$, we check the weights of all local attractors, computed by (9). The VSDS controller only takes effect when the largest weight $\omega_{max} = \max_i \omega_i$, $i = 1 \ldots N$, is not smaller than $\omega_{th}$. In this work, we set the threshold value proportionally to the variance of the reference path. First, we average the predictive variance from GPR over all attractors along the reference path,

$$\bar{\sigma}^2 = \frac{1}{N} \sum_{i=1}^{N} \sigma_i^2(x_i),$$

where $N$ represents the number of local attractors of VSDS. The threshold value $\omega_{th}$ is then set proportionally to this average. Incremental learning is enabled when the trajectory leaves the tunnel of VSDS, which means $\omega_{max} < \omega_{th}$. Incremental learning under the GP framework simply amounts to expanding the training dataset for GPR. However, a matrix inversion is performed in GPR every time a new data point is added, which can be computationally inefficient. To deal with this issue, we adapt the trajectory-based sparsity criteria [20] to our context. In particular, we check 1) if new data points should be added to the GP dataset, and 2) if some old data points need to be discarded. This comes from the intuition that each data point in the GP is responsible for a certain region around it, named the knowledge region in this paper. In the two-dimensional case, this region can be imagined as a circle centered at that point. If a new data point lies within the knowledge region of an old data point, it implies that the old knowledge needs to be updated. The details of the incremental learning are shown in Algorithm 1.
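The tunnel test and the knowledge-region bookkeeping can be sketched as follows. This is not a reproduction of Algorithm 1: the proportionality constant `gain`, the circular knowledge region of fixed `radius`, and the rule for thinning the new demonstration are all assumptions made for illustration.

```python
import math

def tunnel_threshold(variances, gain):
    """Threshold proportional to the average variance along the reference path."""
    return gain * sum(variances) / len(variances)

def guidance_active(max_weight, threshold):
    """Guidance is applied only while the largest weight reaches the threshold."""
    return max_weight >= threshold

def update_dataset(old_points, new_demo, radius):
    """Merge a new demonstration into the dataset of (position, target) pairs.

    Old points whose knowledge region contains a new demonstration point are
    discarded, and new points are thinned so that no two kept new points lie
    within the same knowledge region.
    """
    kept_new = []
    for pos, target in new_demo:
        if all(math.dist(pos, p) >= radius for p, _ in kept_new):
            kept_new.append((pos, target))
    kept_old = [(p, t) for p, t in old_points
                if all(math.dist(p, q) >= radius for q, _ in new_demo)]
    return kept_old + kept_new

old = [((0.0, 0.0), "a"), ((1.0, 0.0), "b")]
new = [((0.05, 0.0), "c"), ((0.06, 0.0), "d"), ((0.5, 0.0), "e")]
merged = update_dataset(old, new, radius=0.1)
```

In this toy example the old point at the origin is replaced by the nearby new demonstration, the redundant new sample is dropped, and the old point far from the new demonstration is kept, which keeps the GP dataset (and the matrix to invert) small.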

Evaluation
We evaluate our shared control approach in a teleoperation scenario, where we use an Omega.3 haptic device from Force Dimension © as a master interface to control a 7-DOF KUKA robot in Gazebo that serves as our remote manipulator (Fig. 5). Given that our algorithm is implemented entirely on the master interface, and considering that the remote robot is programmed with a stiff position control mode to simply follow the motion commands from the master, utilizing a simulated remote robot seems to be a reasonable choice in our case. A similar setting was adopted in other shared control works, e.g., [24]. The task is for the human to teleoperate the KUKA to reach a target object inside the box. First, we show normal task execution, then we demonstrate several scenarios where task knowledge needs to be updated or refined through incremental learning. Finally, we conduct a user study to compare the performance of our VSDS to other haptic guidance controllers used in previous works, namely an impedance controller tracking a time-indexed trajectory and a flow controller. For simplicity, we constrain the robot motion in the x-direction, and all the considered motions are in the y-z plane.

Normal execution
In this section, we test the ability of our VSDS controller to generate haptic guidance. A human is asked to reach the target object with the robot end-effector via teleoperation, while being guided through the force cues. To provide the motion plan, we use the linear DS $\dot{x}_{d,o} = -0.4(x_r - x^*)$, and then locally modulate it with an initial demonstration, with $\sigma_f = 1$ and $l = 0.001$ for the kernel function expressed in (5), and $\sigma_n^2 = 0.01$ for the Gaussian noise. The streamlines of the used $f_r$ are shown in Fig. 2, left. As for VSDS construction (Fig. 2, right), the local attractors are sampled equidistantly from the reference path generated by $f_r$, where we set the length between two attractors to $\Delta_l = 0.04$ m. The stiffness setting is chosen to ensure stable motions on the Omega.3 haptic device, where we set $k_{i,1} = 250$ N/m, $a_1 = 1100$ N/m, $a_2 = 700$ N/m, $\sigma_l^2 = 0$, $\sigma_u^2 = 0.85$. As Fig. 6, right, shows, the human operator is guided to follow the reference path, completing the task without hitting the wall of the box. Fig. 6, left, shows another scenario where the starting position is different from the demonstration; however, in this particular case the motion plan output by $f_r$ is feasible, and is followed by the human towards the goal location inside the box.

Incremental Learning
In this section, we test the ability of our framework to deal with situations where it is desired to update the task knowledge, or to adapt it due to possible changes in the environment. In the first scenario, the human attempts the reaching task from an initial position far away from demonstrations, and therefore the governing dynamics are those of the linear DS. This is problematic since, while the dynamics converge to the attractor, the generated path leads to collisions with the walls of the box (Fig. 7(a)). As soon as the task execution starts, the human quickly realizes that the guidance is leading him/her in a wrong manner, and therefore exerts a force to escape from the tunnel region of local attraction, where he/she can then freely manipulate the master device to demonstrate the successful task execution. After the refinement, when the human starts from the same initial position, he/she is guided correctly to achieve the task (Fig. 7(b)).
We showcase the second scenario in a situation where an obstacle is introduced in a region demonstrated before, and therefore model knowledge should be adapted. As can be seen from Fig. 7(c), the streamlines lead to collision with the placed obstacle. The human realizes that he/she is being guided in the wrong manner, escapes the tunnel region of the guidance (Fig. 7(e)) and adds a new demonstration of how the collision with the obstacle should be avoided. After the refinement, the human is properly guided along a path that avoids the obstacle (Fig. 7(d)).
It should be noted that, due to the variable stiffness and the tunnel settings, the required force to escape from the VSDS tunnel differs depending on the region of the state space. In the first case, the human attempts to update task knowledge in a region far away from demonstrations. Therefore, the stiffness is lower and the tunnel region is narrower, and in consequence the force needed to escape the guidance is much lower compared to the second case, where the obstacle is placed in an area demonstrated before, resulting in a much higher force necessary to escape the guidance (Fig. 7(f)).
Finally, it is worth noting that we can handle motion refinement in both cases because of our specific choice of the incremental learning method, as described in Section II.D. More specifically, we assign a knowledge region to each data point, and discard old data points if their knowledge region is shared with new demonstration points. This implies that existing task knowledge is obsolete and should be refined, which is the case for the obstacle scenario (Case 2 in Fig. 7).

Fig. 7 In Fig. 7(a) to Fig. 7(d), the blue path shows the robot motion, the red path is the reference path generated by $f_r$, while the pink dotted lines show the borders of the VSDS tunnel. Fig. 7(e) shows the escaping trajectory of Case 2, where the red dotted lines represent the reference path, and the blue and green lines are the real trajectory in the y and z directions. They deviate from the reference, then escape from the VSDS tunnel and stop at the escaping point. Case 1 has the same pattern as Case 2, and therefore is not shown in the plot. Fig. 7(f) shows the corresponding escaping force, where the red plot corresponds to Case 1, while the blue is for Case 2.

User Study
In this section, we conduct a user study to compare the performance of several controllers for haptic guidance generation, in a target-reaching task.

Methods
The DS shown in Fig. 2, left, is used to provide the motion plan, where the user starts from an initial position close to the start of the demonstrations to make the remote robot end-effector reach a desired goal location. To provide force cues, we compare the following controllers:
• Our VSDS controller, with streamlines shown in Fig. 2, right.
• The flow controller presented in [22], with streamlines according to Fig. 2, left, in which $D_f$ is a feedback gain while $v_d$ is the mapping of $f_r(x_r)$ on the master side. Note that the controller formulation is also similar to the commonly used flow controllers in the exoskeleton literature (e.g., [15]).
• An impedance controller tracking a trajectory $x_d(t)$ integrated in open loop from $f_r(x_r)$ and mapped to the master, starting from the initial robot position.
For the first and third conditions, we use the same constant stiffness matrix. Also, for the VSDS controller, we deactivate the tunnel region effect, since incremental learning is not needed during the user study. As a result, the symmetric attraction is active in the entire state space. For the second condition, we noticed that high gains cause unstable vibrations, and therefore limited the eigenvalues of $D_f$ to 45 and 20. We tested 12 participants in total, aged from 20 to 30, with no previous experience in teleoperation. We asked them to interact with the master device to teleoperate the remote robot end-effector and guide it to the pink object inside the box, as shown in Fig. 5. Subjects could visually observe the motion of the KUKA LWR in Gazebo during teleoperation in real time. Subjects were instructed to focus primarily on attempting task execution without any collisions and, if possible, to be quick, while roughly following a continuous curve towards the goal. Before starting the experiment, we showed each participant how to do the task, and gave them a familiarisation trial under each condition. During the experiment, participants were asked to conduct three trials for each condition, the sequence of which was randomly shuffled across subjects. After each condition, subjects were requested to fill in the NASA TLX and a questionnaire on Guidance Quality (GQ questionnaire from hereon). After finishing all the trials, we asked participants which condition they preferred the most.
The GQ questionnaire is based on [11] and aims to reflect how participants judge the guidance. We ask the following questions:
• Q1: Do you find the guidance useful?
The participants have five options for each question, namely: absolutely no, no, neutral, yes, absolutely yes. We then map the answers into 5 discrete values in the range [0, 5] for analysis, where 0 represents "absolutely no" and 5 represents "absolutely yes". To further evaluate the performance, we additionally compute the following metrics: success rate of execution, execution time, the task load computed by using NASA TLX scores, and the jerk of the remote robot movement. We define a trial as successful if the robot reaches the target without hitting the box or the ground; otherwise it is defined as a failure.

Data Analysis
With respect to the success rate, we count the number of successful trials as a percentage of the total number of trials for each controller. For the remaining metrics, we computed the mean across trials for further statistical analysis [25]. We first tested the data for normality using the Shapiro-Wilk test. Then, we computed repeated measures ANOVAs for normally distributed data, and the Friedman test otherwise. We also used the Friedman test to analyze the results of the GQ questionnaire, since the data is not continuous. This was followed by Bonferroni-corrected post-hoc pairwise comparisons to compare the individual conditions. A Greenhouse-Geisser correction was used when the assumption of sphericity was violated, where we used the Mauchly test for sphericity. For the GQ questionnaire, although we recorded the subject responses for all conditions, we thought it would be meaningful to analyze the results only for the conditions where the guidance is activated, therefore excluding the Free mode. We set the alpha level to 0.05, where p < 0.05 is considered statistically significant, while p < 0.1 indicates a statistical tendency.
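As an illustration of two of the quantities above, the sketch below computes a success rate and the Friedman chi-square statistic from per-subject scores. It is a simplified sketch: tied values receive average ranks but no tie correction is applied (statistical packages usually apply one), no p-value or post-hoc test is computed, and the function names and example data are hypothetical.

```python
def success_rate(outcomes):
    """Percentage of successful trials, given a list of booleans."""
    return 100.0 * sum(outcomes) / len(outcomes)

def average_ranks(values):
    """Rank values in ascending order (1-based); ties get their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def friedman_statistic(data):
    """Friedman chi-square for data[s][c]: score of subject s in condition c."""
    n, k = len(data), len(data[0])
    rank_sums = [0.0] * k
    for row in data:
        for c, r in enumerate(average_ranks(row)):
            rank_sums[c] += r
    return 12.0 / (n * k * (k + 1)) * sum(R * R for R in rank_sums) - 3.0 * n * (k + 1)

# Made-up scores of four subjects under three conditions.
scores = [[1.0, 2.0, 3.0], [1.5, 2.5, 3.5], [0.9, 2.2, 2.8], [1.1, 1.9, 3.2]]
chi2 = friedman_statistic(scores)
```

When every subject ranks the conditions in the same order, as in this made-up example, the statistic reaches its maximum of n(k - 1); when all conditions tie, it is zero.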

Results
The results of the user study are shown in Fig. 8 as bar plots showing the mean across conditions and the standard deviation, as well as the statistically significantly different conditions. The Friedman test revealed that all three guidance conditions reduced the jerk compared to the free mode (χ² = 13.8, p = 0.003), with no significant differences across these conditions (Fig. 8(d)). The Friedman test for the execution time also showed significant effects (χ² = 15.7, p = 0.0013064), where the VS (VSDS) condition was found to reduce the execution time compared to the FR (free mode) condition (p = 0.003), the OL (open-loop impedance) condition (p = 0.04) and the FL (flow controller) condition (p = 0.003) (Fig. 8(b)). For the TLX load, the repeated measures ANOVA also showed significant effects (F(3, 44) = 5.8323, p = 0.0019111), which were mainly due to the VS condition reducing the task load compared to the other conditions (Fig. 8(c)). The evaluation of the GQ questionnaire is shown in Fig. 8(e). The response to the first question (χ² = 4.7692, p = 0.092), regarding guidance usefulness, indicates that the VS condition has no significant difference in comparison with the others, with p = 0.13 compared to OL, and p = 0.11 compared to FL. For Q2, we had χ² = 7.0556, p = 0.02937, mainly caused by a tendency for the OL condition to have higher scores compared to the VS (p = 0.075) and the FL (p = 0.0553) conditions. On the other hand, no significant differences among conditions were found for Q3, regarding the degree to which subjects felt in control. Finally, the answers of the participants regarding their guidance preference were as follows: VSDS controller (75%), flow controller (17%), and free mode (8%).

Discussion
The results of the user study are in line with previous shared control literature showing that haptic guidance improves teleoperation performance [3,11], as revealed mainly by higher success rates and lower jerk. The VSDS controller shows the highest success rate in comparison with the other two controllers. The relatively higher failure rate for the open-loop impedance controller could be due to the fact that this controller lacks timing freedom; therefore, if the user does not attempt to synchronize with the guidance or passively follow it, the results might be unpredictable. The flow controller, on the other hand, does not attempt to pull the user to a specific path that successfully achieves the task, but rather follows the streamlines of f_r to reach the target, which makes it more likely to follow a streamline that collides with the outside of the box.
We also think that these are the reasons why the open-loop and flow controllers had higher NASA TLX load scores compared to VSDS. The lack of timing freedom in the open-loop impedance controller meant that the subjects had to spend additional effort to actively synchronize with, or at times even fight against, the guidance. This is also reflected in the open-loop controller receiving the highest score for Q2 (Fig. 8(e)), which relates to fighting the guidance. On the other hand, the higher NASA TLX score recorded for the flow controller could be due to the fact that subjects did not feel enough restriction to move along a particular path, and therefore needed to focus more on moving the end-effector along a collision-free path. The NASA TLX results also appear consistent with the results of Q1 on the usefulness of guidance, where a tendency was noticed for the VS condition to have higher scores. Relatedly, the more natural guidance provided by VSDS seems to have resulted in a lower score for Q2, as subjects did not feel the need to fight the guidance as much as with FL and OL, owing to the aforementioned shortcomings of these approaches.
While the VSDS controller generally seemed to perform better, in our view the choice of one haptic guidance approach or another should depend on the given scenario. The OL and VSDS controllers rely essentially on a spring action to provide guidance, storing potential energy for large errors from the reference path, which makes them more restrictive for the subjects. This would be suitable, for example, for novice surgeons during training who might lack experience in teleoperation. The flow controller is more forgiving in this regard, since the guidance rather assists motion along the direction of the flow, but it places a higher mental demand on the operator, who must focus on following a collision-free path; in consequence, it could be useful for more experienced subjects.
In our user study evaluation, we mainly focused on controllers dedicated to DS, which nevertheless share some similarities with other controllers in the literature. For example, the flow controller has the same working principle as the velocity field controller of [14], essentially closing the loop around the velocity error. The open-loop impedance controller, on the other hand, is a classical approach that has also been used in the shared control context [16]. Future work will also focus on comparisons with control approaches that are not necessarily based on DS, for example the path control paradigm [26].
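To make the contrast between the two baseline principles concrete, the following Python sketch writes each guidance force in a generic textbook form. It is only a sketch under stated assumptions: the function names, the scalar gains, and the linear field f_r used in the example are hypothetical and are not the exact controller equations or parameters of the user study.

```python
def flow_force(x, x_dot, f_r, damping):
    """Velocity-error feedback: drive the velocity toward the DS flow f_r(x)."""
    desired = f_r(x)
    return [damping * (d - v) for d, v in zip(desired, x_dot)]


def open_loop_impedance_force(x, x_dot, x_ref, x_ref_dot, K_o, D_o):
    """Spring-damper toward a time-indexed reference state (x_ref, x_ref_dot)."""
    return [
        K_o * (r - p) + D_o * (rd - v)
        for p, v, r, rd in zip(x, x_dot, x_ref, x_ref_dot)
    ]


# Hypothetical linear DS with its attractor at the origin.
def f_r(x):
    return [-xi for xi in x]


# Moving exactly along the flow yields no corrective force, wherever the user is.
print(flow_force([1.0, 0.0], [-1.0, 0.0], f_r, 2.0))

# The spring term pulls toward the current reference regardless of the user's pace.
print(open_loop_impedance_force([1.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0], 10.0, 1.0))
```

In this form, the flow law has no notion of a preferred path, while the open-loop law depends on a reference that advances with time, which matches the qualitative behavior discussed above.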
Finally, regarding the passivity of our closed-loop system, it should be noted that, since we consider unilateral teleoperation, the only source of potential activity in the system is the haptic guidance controller. Ensuring the passivity of the controller would therefore be sufficient to guarantee an overall stable operation. Current work in progress [27] explores the use of energy tanks, adapted from [28], to ensure the passivity and the asymptotic stability of VSDS controllers, which would guarantee convergence to the global attractor.

Conclusion
In this work, we presented a new shared control approach based on first-order time-invariant DS. We use LfD to learn a globally stable DS as a motion generator, and deploy our previously proposed VSDS controller to generate haptic guidance. The variance-based stiffness setting of the VSDS controller realizes the authority allocation implicitly. Additionally, our proposed approach enables incremental learning to adapt motions when necessary, by properly setting the region of local attraction provided by VSDS. We validated our shared control approach in a teleoperation task, where the human controls the haptic device and interacts with the VSDS controller to execute the target reaching task. The results show that our approach works well in normal execution and is also suitable for refining old task knowledge. Moreover, we conducted a user study comparing the performance of the VSDS controller to state-of-the-art controllers used for haptic guidance generation. The results showed that the VSDS controller yielded the highest success rate, and was the shared control method most preferred by the subjects.
In the future, in addition to the directions mentioned in the previous section, we aim to extend our shared control approach to also include orientations for higher flexibility. We will also consider other shared control settings more extensively, such as collaborative tasks.

Fig. 2 Left: Streamlines of the locally reshaped DS f_r around demonstration data points shown in blue. Right: Streamlines of VSDS that symmetrically attract around the reference path simulated from f_r, shown in blue. The red points lying on the reference path are local attractors of VSDS, also sampled from f_r. The rhombus in both plots is the global attractor x*

Fig. 3 Left: An example plot showing how the stiffness changes according to the variance. Right: The stiffness along the path shown as ellipses, where a wide ellipse means a high stiffness. The paths generated by the reshaped DS are shown in Fig. 2, where the green path is the demonstrated one, while the blue and red paths are obtained starting from two different positions

Fig. 4 Tunnel region effect of VSDS, where the highlighted area is the region where the symmetric attraction effect is activated, while the purple area is the rest of the state space, where streamlines follow f_r. The left figure shows a relatively wide region with ω_th = 0.1, while the region in the right figure is narrower with ω_th = 0.8

where b_1, b_2, σ_l^2, σ_u^2 are set to constant values. The second condition again ensures smooth transitions between the lower and upper limits of ω_th. As shown in Fig. 4 left, a path close to the demonstrations (i.e. low variance) has a wider tunnel region than that in Fig. 4 right, which represents an area not demonstrated before. Incremental learning is enabled when the trajectory leaves the tunnel of VSDS, which means ω_max < ω_th. Incremental learning under the GP framework simply amounts to expanding the training dataset for GPR. However, GPR requires a matrix inversion every time a new data point is added, which can be computationally inefficient. To deal with this issue, we adapt the trajectory-based sparsity criteria [20] to our context. In particular, we check 1) whether new data points should be added to the GP dataset, and 2) whether some old data points need to be discarded. This comes from the intuition that each data point in the GP is responsible for a certain region around it, called the knowledge region in this paper. In the two-dimensional case, this region can be imagined as a circle centered at that point. If a new data point lies within the knowledge region of an old data point, the old knowledge needs to be updated. The details of the incremental learning are shown in Algorithm 1.
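The knowledge-region idea described above can be sketched in a few lines of Python. This is one simple reading of the text, not a reproduction of Algorithm 1: the function names, the fixed radius, and the rule of discarding every old point whose knowledge region contains the new input are assumptions made for illustration.

```python
import math


def outside_tunnel(omega_max, omega_th):
    """Incremental learning is triggered once omega_max drops below omega_th."""
    return omega_max < omega_th


def update_dataset(dataset, new_sample, radius):
    """Insert new_sample into a GP training set of (input, output) pairs.

    Old samples whose knowledge region (a ball of the given radius, an
    assumed constant) contains the new input are discarded, so the set
    stays sparse and outdated knowledge is overwritten.
    """
    new_x, _ = new_sample
    kept = [(x, y) for (x, y) in dataset if math.dist(x, new_x) > radius]
    kept.append(new_sample)
    return kept


# Hypothetical 2D example: inputs are positions, outputs are velocities.
dataset = [((0.0, 0.0), (1.0, 0.0)), ((1.0, 0.0), (1.0, 0.0))]
new_sample = ((0.05, 0.0), (0.0, 1.0))

if outside_tunnel(omega_max=0.05, omega_th=0.1):
    dataset = update_dataset(dataset, new_sample, radius=0.1)
print(dataset)
```

Keeping the dataset size bounded in this way limits the cost of the matrix inversion that GPR performs whenever the training set changes.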

Fig. 5 Experiment settings. Left: The 3-DOF omega.3 haptic device used as the master interface. Right: The task scenario in Gazebo, with the KUKA LWR as the remote robot; the top surface of the pink object inside the box is the target to reach

Fig. 6 The robot motion for the target-reaching task starting from different initial positions. The blue line is the real robot motion. The red dotted line is the reference motion generated by f_r. The pink dotted lines show the border of the VSDS tunnel. Left: Starting from a position far away from the demonstration. Right: Starting from a position near the demonstration.

Fig. 7 Results of motion refinement in two different scenarios, where in one the starting point is far away from the demonstration (Case 1, Fig. 7(a) to Fig. 7(b)), while in the other the environment is changed by adding an obstacle (Case 2, Fig. 7(c) to Fig. 7(d)). For Fig. 7(a) to Fig. 7(d), the blue path shows the robot motion, the red path is the reference path generated by f_r, and the pink dotted lines show the borders of the VSDS tunnel. Fig. 7(e) shows the escaping trajectory of Case 2, where the red dotted lines represent the reference path, and the blue and green lines are the real trajectory in the y and z directions. They deviate from the reference, then escape from the VSDS tunnel and stop at the escaping point. Case 1 has the same pattern as Case 2, and is therefore not shown in the plot. Fig. 7(f) shows the corresponding escaping force, where the red plot corresponds to Case 1, while the blue is for Case 2

with K_o and D_o as stiffness and damping.

• Free mode: Teleoperation without guidance.
Panels of Fig. 8: (a) Success rate of execution, (b) Average execution time, (c) Average task load, (d) Average jerk, (e) Subjective evaluation

Fig. 8 Results of the user study. FR: Free mode (no guidance), OL: Open-Loop impedance controller, FL: Flow controller, VS: VSDS controller. Q1, Q2 and Q3 refer to the three questions of the GQ questionnaire. Error bars indicate the standard deviation. To indicate significance between conditions, '**' represents p < 0.01, '*' represents p < 0.05, and '+' represents p < 0.1. In Fig. 8(a), the y-axis indicates the success rate of execution as a normalized percentage. In Fig. 8(c), the y-axis indicates the weighted NASA-TLX scores from 0 to 100, where lower scores indicate better performance. Finally, Fig. 8(e) shows the scores of the GQ questionnaire, where the answers to the questions are mapped onto a scale from 0 to 5.

• Q2: Do you have to fight the guidance?
• Q3: Do you feel in control while being assisted?