DMP-Based Reactive Robot-to-Human Handover in Perturbed Scenarios

While seemingly simple, handover requires jointly coordinated effort from both partners, commonly in dynamic collaborative scenarios. Practically, humans are able to adapt and react to their partner’s movement to ensure seamless interaction against perturbations or interruptions. However, literature on robotic handover usually considers straightforward scenarios. We propose an online trajectory generation method based on Dynamic Movement Primitives to enable reactive robot behavior in perturbed scenarios. Thus, the robot is able to adapt to human motion (stopping should the handover be interrupted while persisting through minor disturbances on the partner’s trajectory). Qualitative analysis is conducted to demonstrate the capability of the proposed controller with different parameter settings and against a non-reactive implementation. This analysis shows that controllers with reactive parameter settings produce robot trajectories that can be deemed more coordinated under perturbation. Additionally, a randomized trial with participants is conducted to validate the approach by assessing subjective perception through a questionnaire while measuring task completion and robot idle time. Our method has been shown to significantly increase the subjective perception of the interaction with no statistically significant deterioration in task performance metrics under one of the two sets of parameters analyzed. This paper represents a first step towards the introduction of reactive controllers in handover tasks that explicitly consider perturbations and interruptions.


Introduction
Handover represents a quintessential cooperative task. As such, humans perform it with ease, exchanging objects seamlessly and sometimes even without explicit communication. However, successful cooperation depends on prior knowledge and anticipatory behavior between the partners.
In human-to-human handovers, agents anticipate the movement of the partner and adjust accordingly [1,2]. Furthermore, studies have shown that humans also prefer robots that can replicate such behavior [1,3,4]. Most notably, humans place a great deal of importance on temporal precision in handover tasks [5], as it is integral to satisfying user experience. In a simple handover setting, without any disturbances, this results in smooth and synchronized trajectories from both partners. Naturally, if the interaction includes disturbances or interruptions, controller design can become significantly more complex. Even in the interaction between humans, anticipatory control can fail on a higher level. As an example, imagine a practical joke of interrupting a handover mid-way, while the partner's hand still reaches for the air. To overcome this, the receiver would have to react quickly, either to stop or target a new position, possibly on a lower control level. Namely, by promptly and intuitively adjusting the trajectory of the hand, as opposed to a higher level of control which would be more akin to a pre-planned state machine (i.e. recognizing the change in the state of the joint task and updating its behavior accordingly). Nevertheless, the implications of disturbances or perturbations on joint tasks are noticeable in real-world scenarios where unexpected events happen or partners get engaged in other tasks [6]. More formally, addressing these issues could improve Human-Robot Collaboration (HRC) safety and satisfaction. In an industrial setting, a robot could endanger human workers by invading their workspace when they are not ready to receive an object. Hence, the development of controllers that are robust to perturbations and thus improve cooperation and safety metrics, is warranted. To summarize, such a controller should be capable of stopping the handover should the partner stop or move away. 
However, if there are some minor disturbances, and the partner persists in reaching for the object, the robot should comply. Additionally, the HRC problem is further constrained by safety requirements and desired performance. The robot should be compliant to human contact, and should not generate trajectories that humans find threatening. Conversely, the generated trajectories should be smooth and efficient, representing the expected behavior of a human partner.
To this end, in this work, we propose an online trajectory generation method for robust robot-to-human handover, based on Dynamic Movement Primitives (DMP) [7]. The DMP is used to generate a trajectory online and to modulate the speed of such trajectory to react to disturbances and coordinate with the human partner. This allows for an interaction that is safe, responsive, and robust to disturbances or interruptions of the handover. Furthermore, the added safety and coordination can be realized without deterioration of task performance metrics commonly discussed in the handover literature [8], while improving the quality of the interaction. To the best of our knowledge, the problem of being robust to interruptions or unexpected movements with online trajectory generation has never been addressed explicitly. Nevertheless, it can be considered a necessary step for the application of robot-human handover in practical contexts, and our work aims to address it.
The paper is outlined as follows. First, in Sect. 2 a brief overview of the related work is presented. Then, in Sect. 3 a mathematical framework for the proposed approach is formalized and Sect. 4 presents a detailed report of the robotic implementation. Next, in Sect. 5 methods for experimental evaluation of the proposed method are specified, and results are presented in Sect. 6. Finally, in Sect. 7 the results and their implications on robotic handover and HRC in general are discussed.

Related Work
As there is an ever-increasing interest in the development of HRC applications, robotic handover has also garnered popularity in recent robotic literature as a fundamental action required in most contexts with physical collaboration. However, the implementation of robotic agents that can hand over objects seamlessly (as humans do) is still an open problem. Ortenzi et al. [8] present a comprehensive review of handover in a recent survey focused on robotic applications.

Toward Human-Like Handover
By definition, handover is a joint action between a giver and a receiver. Accordingly, when humans hand over objects they are cooperating, both temporally and spatially [8]. Successful joint interaction is dependent on properly perceiving and anticipating the partner's actions. Furthermore, a study by Koene et al. [9] has shown that humans place more relative importance on temporal precision, as opposed to spatial.
Communication has a great influence on cooperation, and thus several studies explore explicit forms such as speech [5], gaze [10], tactile feedback [11] or gestures [12] to communicate the intent. Humans are capable of predicting the partner's intention without explicit communication [2] and adjusting their own motion accordingly [1]. However, studies examining the exploitation of human motion to communicate intent are not as prevalent. Data-driven approaches have been used to exploit kinematic features for handover detection [13], and for the identification of important physical cues during the interaction [14]. Even more implicitly, humans use sensorimotor communication to convey intent in joint actions by adjusting the motion of their trajectories [15]. As the proposed method is online, the implicit communication conveyed by the motion of the partner's trajectories is suitable for fast adaptation of the robotic trajectory.

Reacting to Perturbations
Literature on robotic handover rarely considers cases in which the handover could be interrupted due to other tasks or some perturbations. Most notably, Huang et al. [3] propose an adaptive controller for the robotic giver which considers the occupancy of the human receiver. Similarly, they explore the influence of adaptive robot behavior on subjective and task performance metrics during a handover. In contrast to our work, the adaptive approach the authors propose is based on a finite state machine (FSM). As the examined setting (unloading dishes) consists of multiple robot tasks (reach, grasp, retrieve, give, handover, retract) which are repeated throughout, a high-level control strategy is warranted. Instead, the focus of our method is on reacting to perturbation given the permanence of the handover intention on the robot's side. As an example, consider the case when a high-level controller is too slow (or fails) to recognize when the handover should be interrupted or adjusted, or when it starts the handover at the wrong time (e.g. too soon).
The approach we propose aims at modulating the interaction continuously, rather than on a symbolic level, acting directly on how the trajectory is generated. As such, the controller is robust to perturbations or stoppages that might hinder the human receiver, while increasing low-level safety should the receiver disengage from the handover. Furthermore, the proposed method can be integrated with others that act on a higher level of the control loop, using more elaborate methods to interpret human actions, like the one proposed by Huang et al. [3]. In such a context, the proposed approach could lower the requirements of the FSM in terms of reaction time and precision in the interpretation of the human activity (e.g. to start/stop the handover), enabling a smoother application to more complex HRC tasks. Thus, by considering the control on a lower level we can more closely address coordination, safety, and efficiency between the partners when the robot is engaged in the "give" phase.

Online Trajectory Generation for Robotic Handover
Successful HRC implementation depends on robustness, reactivity, and context awareness [8,16]. Adaptable behavior is a requirement if changes in the environment and partner behavior are considered. A pre-planned approach could satisfy these requirements only if all aspects of interaction are known, which is not the case considered in this work. Thus, we propose an online method for trajectory generation. The problem of generating a suitable trajectory for a robot-human handover has been addressed with a wide range of different methods by the research community. Additionally, numerous methods of generating suitable trajectories have been proposed also in the more general context of human-robot interactions, such as minimum-jerk methods [17] or optimization-based methods [18] over legibility or predictability metrics.
Another notable line of work can be identified in the application of techniques that try to learn an underlying correlation between different parts of trajectories (either of those of the same agent, between two agents, or both). During the execution, observing the action of the partner, the robot can adapt reactively by exploiting the learned correlation. To this end, Probabilistic Movement Primitives (ProMP) [19], Interaction Primitives [20], and Interaction Meshes [21] have all been applied to generate suitable trajectories for typical HRI tasks, including handover [19,20,22].
Given that one of the main objectives of the proposed controller is to adapt through unexpected perturbations, it would be complex to directly apply such methods: the fundamental issue is that a general perturbation is unpredictable, and can happen in any possible way. As such, learning what to do in such cases from recorded human-human interactions can become quickly impractical due to an exponential growth of possible cases. To make a practical example, let's consider that the human reaching for the object receives an unexpected push against his arm: considering variations of the relative positions of the two agents, the direction of the push, and its magnitude, would lead to one specific perturbation requiring many different training examples. A different way of exploiting learned correlations could be to detect whether the human is currently engaging or not in a handover, interpreting his actions. In such case, the controller could then implement a (learned or hard-coded) reaction to a situation when it detects that the human is not moving for handover. This strategy can be seen as akin to the work performed by Huang et al. [3], where the robot has to decide when and how to engage in the handover. The method proposed here follows similar reasoning, albeit aiming to address a lower level of reactivity, prioritizing simplicity and efficiency. This focus also comes from the practical consideration that an analytical and simple approach can lead to a more standardized and predictable behavior of the system, which may be desired in the case considered by this work.

DMP
Both DMP and dynamical systems have already been applied in this context, with varying degrees of success. The main benefit of this type of approach is the ability to generate a trajectory online and in real-time. Additionally, this allows the robot to react to changes in the environment without having to re-plan the whole motion. In Ref. [23] the authors developed a controller based on coupled dynamical systems, which is inspired by and modeled after data recorded from human-human handover experiments. However, the controller relies on an estimation of the shared weight of the object, which can prove difficult to extend to different objects or handover directions. In Ref. [24] the coupling of hand transport and finger motion is analyzed in humans. Then, the results are applied to develop a controller based on coupled dynamical systems, which results in smooth reach-to-grasp motions. Finally, in Ref. [25] DMP have been directly applied to the problem of robot-to-human handover. Dependence on the human's motion is included only as a target for the DMP, which is considered to always be the partner's hand instead of a predicted target position. However, should the human retract the hand before taking the object, the robot would still converge to the hand of the partner, representing a potential danger if the robot cannot recognize it and stop the handover fast enough. This is also potentially in contrast with the preference for faster handover shown in the same work.

Mathematical Framework
Dynamic Movement Primitives are a method of generating trajectories as the evolution of virtual dynamical systems. The method, introduced by Ijspeert et al. [7], allows the learning and reproduction of both non-periodic and periodic trajectories and has seen several extensions for different purposes. Here, we introduce briefly the basis of the method for the generation of non-periodic trajectories; for further details, the reader is referred to [26].
For simplicity, let's consider a mono-dimensional trajectory with just one degree of freedom (DOF) $x(t)$, with initial state $x(t_0) = x_0$ and desired final goal $x(t_f) = g$. The DMP method considers the evolution of this trajectory as generated by a second-order dynamical system similar to the classical mass-spring-damper model, called the transformation system:

$$\tau^2 \ddot{x} = \alpha_x \left( \beta_x (g - x) - \tau \dot{x} \right) + f(s),$$

where $\alpha_x$ and $\beta_x$ are positive parameters of the system and $\tau$ is its time constant. In the rest of this paper we will consider a critically damped system, with $\beta_x = \alpha_x / 4$. The forcing term $f(s)$ can be used to shape the evolution of the trajectory and is usually defined as

$$f(s) = f_{nl}(s)\, s\, (g - x_0),$$

where $f_{nl}$ is a generic non-linear function approximator, $(g - x_0)$ is a scaling factor and $s$ is a phase variable that decreases monotonically from 1 to 0 as the system evolves. As $s$ vanishes, the effect of $f(s)$ on the system does too, and $x(t)$ converges stably toward $g$. The evolution of $s$ is determined by the canonical system, typically chosen as

$$\tau \dot{s} = -\alpha_s s,$$

with $\tau$ again as the time constant of the system and $\alpha_s$ a positive parameter. This framework can easily be extended to multiple DOFs by considering one transformation system for each DOF and a single common canonical system. In this way, different DOFs are kept coordinated by the shared phase variable.
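As an illustration of the formulation above, the following is a minimal sketch of a one-DOF DMP integrated with a simple Euler scheme. It assumes the standard second-order form of the transformation system with $\beta_x = \alpha_x / 4$ and an exponentially decaying phase; the function name `integrate_dmp`, the numerical values of `alpha_x`, `alpha_s`, `tau`, and the step size are hypothetical choices for the example, not the values used in the experiments. The forcing term defaults to zero, so the trajectory reduces to a critically damped reach toward the goal.

```python
def integrate_dmp(x0, g, tau=1.0, alpha_x=25.0, alpha_s=4.0,
                  f_nl=lambda s: 0.0, dt=0.001, t_end=3.0):
    """Euler integration of a one-DOF discrete DMP (illustrative sketch).

    All default parameter values are assumptions made for this example.
    """
    beta_x = alpha_x / 4.0          # critical damping
    x, v, s = x0, 0.0, 1.0          # position, velocity, phase
    traj = [x]
    for _ in range(int(round(t_end / dt))):
        f = f_nl(s) * s * (g - x0)  # forcing term, vanishes as s -> 0
        # transformation system: tau^2 * x'' = alpha_x*(beta_x*(g - x) - tau*x') + f
        a = (alpha_x * (beta_x * (g - x) - tau * v) + f) / tau ** 2
        v += a * dt
        x += v * dt
        # canonical system: tau * s' = -alpha_s * s
        s += (-alpha_s * s / tau) * dt
        traj.append(x)
    return traj, s


traj, s_final = integrate_dmp(x0=0.0, g=0.5)
```

With a zero forcing term the state converges to `g` while the phase decays toward zero; a learned `f_nl` would only reshape the transient.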

Coupling Terms
A major advantage of the DMP framework is the possibility to obtain a complex behavior of the system by adding seemingly simple coupling terms, either spatial or temporal, to the original formulation. In the following, we will refer to a DMP augmented with two such terms, $C_s$ and $C_t$, which represent the spatial and temporal coupling terms, respectively. Ideally, the evolution of the trajectory should react to what the human partner is doing. This is needed in situations where, for example, the human interrupts the handover and retreats their hand. The same holds in situations where they stop the hand and go back to the execution of the handover after a few moments, or reengage in the handover after executing a secondary task. Furthermore, it may be of interest to adjust the speed of the movement depending on the speed of the human partner [25]. These objectives can be accounted for by considering coupling terms that depend on the speed at which the partner's hand is approaching the final handover position. However, the system should distinguish between a case where the partner's hand is stationary because it has already reached the final position and a case where it has stopped midway for some other reason. Furthermore, it shouldn't have to rely excessively on the accuracy of the estimation of the final handover position.
As in this work we focus on the temporal coordination of the interaction, we assume that we have an estimate of the final handover position $g$. This is in line with the findings that humans place more importance on temporal coordination, compared to spatial, during handovers [9]. Given $g$, we define $d$ as the distance between the partner's hand and the final handover position, with an initial value $d(t_0) = d_0$ at the start of the handover. The proposed coupling terms are built from the following elements: Eq. (7) is a second-order low-pass filter for the measured distance $d$, $k_t$ and $k_s$ are positive values, and $\sigma_i(y)$ is a sigmoid function with steepness coefficient $a_i$ and x-axis offset $\delta_i$.
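The two building blocks named above can be sketched as follows. This is a hedged illustration only: the logistic form of the sigmoid, the critically damped form of the filter, the class name `SecondOrderLowPass`, and the cutoff parameter `omega` are all assumptions introduced here, and the way the resulting signals enter the coupling terms is not reproduced. A convenient side effect of a second-order filter is that it also yields a smoothed estimate of the rate of change of the distance.

```python
import math


def sigmoid(y, a, delta):
    """Logistic sigmoid with steepness a and x-axis offset delta (assumed form)."""
    return 1.0 / (1.0 + math.exp(-a * (y - delta)))


class SecondOrderLowPass:
    """Critically damped second-order low-pass filter (assumed form).

    Tracks the measured distance d and, as a by-product, its smoothed
    time derivative. `omega` is a hypothetical cutoff in rad/s.
    """

    def __init__(self, d_init, omega=20.0):
        self.omega = omega
        self.d_f = d_init      # filtered distance
        self.d_f_dot = 0.0     # filtered rate of change of the distance

    def update(self, d_meas, dt):
        acc = (self.omega ** 2 * (d_meas - self.d_f)
               - 2.0 * self.omega * self.d_f_dot)
        self.d_f_dot += acc * dt
        self.d_f += self.d_f_dot * dt
        return self.d_f, self.d_f_dot


# Example: the hand jumps from 0.6 m to 0.3 m from the goal and stays there.
filt = SecondOrderLowPass(d_init=0.6)
for _ in range(2000):
    d_f, d_f_dot = filt.update(0.3, dt=0.001)
```

After the transient, the filtered distance settles on the measured value and the estimated approach speed returns to zero, which is the situation of a hand that has stopped moving.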

Intuition Behind the Coupling Terms
The proposed coupling terms are based on two assumptions:
1. If the human partner is moving with "enough" speed toward the final location of the handover (or to a point in proximity), it can be interpreted as if he is engaging in the handover. As such, the robot should accelerate towards the final location. This effect is given by the sigmoid $\sigma_{\dot{d}}$, and modulated by its parameters.
2. The closer the human's hand is to the final location, the less we should use its approach speed to evaluate his engagement in the handover. An extreme example of this is when the human partner has already reached out completely and is waiting for the robot to pass the object.
Fig. 1 The evolution of the trajectory of the robot, $x(t)$, is modulated by the distance $d(t)$ between the hand of the participant and the final handover position, $g$
The mathematical effect of the two coupling terms can be considered intuitively as increasing the mass (with $C_t$) and the damping (with $C_s$). The extent to which they increase is modulated by the two sigmoids and the values of $k_t$ and $k_s$. The presented formulation has the added benefit that the coupling terms cannot generate an unsafe evolution of the dynamical system, as they can only slow down and dampen the system. Finally, the adaptation to the human partner is modulated mainly by $\sigma_{\dot{d}}$, while the main objective of $\sigma_d$ is to reduce the prevalence of the other term once the hand closes in on the final position. For this reason, during all the experiments the values of $a_d$ and $\delta_d$ were kept fixed to 13.0 and −0.35, respectively, to produce an almost linear response. These values have been chosen intuitively and do not represent an optimum of any kind. If needed, this term can also be easily adapted to consider absolute distances, instead of relative ones. Figure 1 shows an example of the physical setup, highlighting the distance from the goal and the trajectory of the robot.

Virtual Compliance
Both to guarantee the safety of the participant and to enable a more human-like interaction during the exchange of the object, a virtual compliance method is also implemented as an additional extension to the DMP. In particular, as the joints of the robot used in the experiments are controlled with position/velocity commands, more advanced methods like computed torque control or backstepping control are not applicable. In addition, compliance cannot be implemented by directly controlling the joints' torques.
The DMP framework presents a way of naturally including a compliant behavior of the end-effector. As the trajectory is generated through a simulated dynamical system, corresponding to a mass-spring-damper system, external forces can be sensed and applied to the virtual system. A compliant behavior at the end-effector is naturally obtained by modifying the forcing term as in Eq. (11), where $f_{ext}$ are the sensed external forces and $k_{ext}$ is a gain term that can be adjusted to modulate the compliance.
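The effect of such a term can be illustrated with a short simulation. The sketch below assumes that the sensed force enters additively, scaled by the gain, next to the usual forcing term; this additive form, the function name `simulate_compliance`, and all numerical values are assumptions made for the example. The virtual system starts at rest on the goal (where the phase, and hence the learned forcing term, has vanished), is pushed with a constant force for one second, and is then released.

```python
def simulate_compliance(g=0.5, k_ext=1.0, alpha_x=25.0, tau=1.0,
                        push_force=10.0, push_start=0.5, push_end=1.5,
                        dt=0.001, t_end=3.0):
    """One-DOF virtual mass-spring-damper pushed by an external force.

    Assumed model: tau^2 * x'' = alpha_x*(beta_x*(g - x) - tau*x') + k_ext*f_ext.
    All values are hypothetical and chosen only for illustration.
    """
    beta_x = alpha_x / 4.0
    x, v = g, 0.0                     # start at rest on the goal
    xs = []
    for k in range(int(round(t_end / dt))):
        t = k * dt
        f_ext = push_force if push_start <= t < push_end else 0.0
        a = (alpha_x * (beta_x * (g - x) - tau * v) + k_ext * f_ext) / tau ** 2
        v += a * dt
        x += v * dt
        xs.append(x)
    return xs


xs = simulate_compliance()
# Steady-state deflection under a constant push: k_ext * f_ext / (alpha_x * beta_x)
expected_deflection = 1.0 * 10.0 / (25.0 * 25.0 / 4.0)
```

Under these assumptions the end-effector yields by a distance proportional to `k_ext` while the force is applied and returns to the goal once the force is removed, so a larger gain gives a softer virtual spring.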

ROS Setup and Overall Architecture
The proposed control architecture has been implemented using the Robot Operating System (ROS) [27]. To this end, ROS nodes have been used for:
• Reading the input data regarding the position of the human hand;
• Collecting and processing the data from the force-torque sensor;
• Controlling the opening/closing of the robot's hand;
• Generating the trajectory as presented above;
• Running the inverse kinematics of the robot to follow such trajectory.
The block diagram of the proposed implementation is shown in Fig. 2.

Robot
A UR5 CB-series robot from Universal Robots has been used to perform experiments. Controlling the robot via a ROS interface was possible by using the official drivers developed and provided by the manufacturer, which allow connecting with Ethernet to the robot.
Closed-loop inverse kinematics has been implemented as a variation of the classical one based on the Jacobian damped pseudo-inverse [28].
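For readers unfamiliar with the classical scheme, the sketch below shows one closed-loop inverse kinematics step based on the damped pseudo-inverse, $\dot{q} = J^T (J J^T + \lambda^2 I)^{-1} K e$, on a planar two-link arm. It is not the variation used on the UR5: the arm model, link lengths, gain, damping factor, and function names are hypothetical, and the 2x2 inverse is written out by hand to keep the example free of dependencies.

```python
import math


def fk(q, l1=0.4, l2=0.3):
    """Forward kinematics of a planar two-link arm (hypothetical model)."""
    x = l1 * math.cos(q[0]) + l2 * math.cos(q[0] + q[1])
    y = l1 * math.sin(q[0]) + l2 * math.sin(q[0] + q[1])
    return x, y


def jacobian(q, l1=0.4, l2=0.3):
    s1, c1 = math.sin(q[0]), math.cos(q[0])
    s12, c12 = math.sin(q[0] + q[1]), math.cos(q[0] + q[1])
    return [[-l1 * s1 - l2 * s12, -l2 * s12],
            [l1 * c1 + l2 * c12, l2 * c12]]


def damped_pinv_step(q, target, gain=5.0, damping=0.05, dt=0.01):
    """One closed-loop IK step: dq = J^T (J J^T + damping^2 I)^-1 (gain * error)."""
    J = jacobian(q)
    px, py = fk(q)
    ex, ey = gain * (target[0] - px), gain * (target[1] - py)
    # A = J J^T + damping^2 I (symmetric 2x2 matrix)
    a11 = J[0][0] ** 2 + J[0][1] ** 2 + damping ** 2
    a12 = J[0][0] * J[1][0] + J[0][1] * J[1][1]
    a22 = J[1][0] ** 2 + J[1][1] ** 2 + damping ** 2
    det = a11 * a22 - a12 * a12
    # y = A^-1 * (gain * error), using the closed-form 2x2 inverse
    y0 = (a22 * ex - a12 * ey) / det
    y1 = (-a12 * ex + a11 * ey) / det
    # joint velocity dq = J^T y, integrated over one time step
    dq0 = J[0][0] * y0 + J[1][0] * y1
    dq1 = J[0][1] * y0 + J[1][1] * y1
    return [q[0] + dq0 * dt, q[1] + dq1 * dt]


q = [0.3, 1.0]
target = (0.4, 0.3)
for _ in range(1000):
    q = damped_pinv_step(q, target)
```

The damping term keeps the matrix invertible near singular configurations, at the cost of a slightly slower convergence of the task-space error.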

End Effector and Force Sensor
To hold the object, the robot was equipped with an IH2 Azzurra hand (Prensilia SRL) [29] as the end-effector. A six-axis force-torque sensor (model HEX-70-XE, OnRobot) was mounted between the robot and the hand, to measure the forces applied at the robot hand. As the focus of this work is on the trajectory generation, a simple strategy based on a force threshold was used to release the object: when the absolute magnitude of the force read from the sensor exceeded a preset value, the opening of the hand was triggered. Furthermore, to avoid opening the hand mid-way due to inertial forces from the robot hand, a second threshold was added on the distance from the final position. The opening was triggered when both conditions (proximity to the final position and force readings above a certain value) were satisfied. The parameters for the release were kept constant during all the experiments.
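The release rule described above amounts to a conjunction of two threshold checks, which can be sketched as follows. The function name and both threshold values are hypothetical; the preset values used in the experiments are not reported here.

```python
import math


def should_release(force_xyz, dist_to_goal,
                   force_threshold=5.0, dist_threshold=0.05):
    """Return True when the hand should open.

    force_xyz: force components read from the force-torque sensor (N).
    dist_to_goal: distance of the end-effector from the final position (m).
    Threshold values are placeholders chosen for illustration.
    """
    force_mag = math.sqrt(sum(f * f for f in force_xyz))
    near_goal = dist_to_goal < dist_threshold
    pulled = force_mag > force_threshold
    return near_goal and pulled
```

Requiring proximity to the final position in addition to the force condition prevents inertial forces during the motion from opening the hand mid-way.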

Perception
To read the position of the human partner's hand a "Vicon" motion capture system with 8 active cameras (2 Vero cameras, 6 Bonita cameras, Vicon Ltd.) has been used. A custom motion capture skeleton was created for the right arm of participants. Before each experiment, it was refitted to every new participant, following the pipeline suggested by the manufacturer. The only measure fed to the controller is the hand position, which is used to compute the distance and the speed of approach to the final handover position.

Methods for Experimental Evaluation
In this section, methods for two analyses are presented. Firstly, a framework for demonstration of robot coordina-

Parameter Settings
To perform the trials, three different configurations of the controller were used: Non-Reactive (NR), Reactive-1 (R1), and Reactive-2 (R2). The first corresponds to a controller with no coupling term to coordinate with the human, similar to a general motion primitive for a handover, while the other two configurations differ in the coupling terms.
The specific values of R1 and R2 are reported in Table 1 and were chosen after a pilot study [30] was performed to examine the proposed approach. This study highlighted a spread in the preferences of participants, with some preferring faster robot reactions and others preferring a smoother and slower behavior. Furthermore, it showed how different parameters could produce similar behaviors and perceptions of them. For these reasons, the two sets of parameters were hand-picked by the authors, corresponding to a (subjectively) slower reacting (R1) and faster reacting (R2) robot.
During all the trials the base values of the DMP were kept at: The non-linear part of the forcing term $f_{nl}$ was learned with a K-nearest neighbors method to produce minimum-jerk-like trajectories that converge to the target in ∼1.3 s. This method has been chosen as it is simple, produces a smooth interpolation, and exhibits a predictable behavior outside the range of the data used to generate it. Furthermore, it is already available in common scientific computing libraries.
The implementation used is the one provided by the scikit-learn Python library [31]. The reference minimum-jerk trajectory to be learned has been generated analytically. In all the presented cases, the initial distance $d_0$ is measured at the start of the interaction. As such, during the randomized trials with participants, it is measured considering the Home position shown in Fig. 3a.
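The data preparation step implied by this procedure can be sketched as follows: a minimum-jerk reference is generated analytically, and the transformation system is inverted to obtain the values that the non-linear term should take at each phase value. The sketch assumes the standard second-order transformation system with $\beta_x = \alpha_x / 4$ and an exponential phase; the function names and the values of `alpha_x`, `alpha_s`, and `tau` are hypothetical. The regression itself is not reproduced; the resulting `(s, f_nl)` pairs could be passed to a regressor such as scikit-learn's `KNeighborsRegressor`.

```python
import math


def min_jerk(t, x0, g, T):
    """Analytic minimum-jerk position, velocity and acceleration at time t."""
    u = min(max(t / T, 0.0), 1.0)
    d = g - x0
    pos = x0 + d * (10 * u ** 3 - 15 * u ** 4 + 6 * u ** 5)
    vel = d / T * (30 * u ** 2 - 60 * u ** 3 + 30 * u ** 4)
    acc = d / T ** 2 * (60 * u - 180 * u ** 2 + 120 * u ** 3)
    return pos, vel, acc


def forcing_targets(x0=0.0, g=1.0, T=1.3, tau=1.0,
                    alpha_x=25.0, alpha_s=4.0, n=100):
    """Invert the transformation system along the reference trajectory.

    Returns (s, f_nl) pairs; parameter values are assumptions for the example.
    """
    beta_x = alpha_x / 4.0
    samples = []
    for i in range(n + 1):
        t = T * i / n
        x, v, a = min_jerk(t, x0, g, T)
        s = math.exp(-alpha_s * t / tau)           # phase at time t
        f = tau ** 2 * a - alpha_x * (beta_x * (g - x) - tau * v)
        samples.append((s, f / (s * (g - x0))))    # f = f_nl * s * (g - x0)
    return samples


samples = forcing_targets()
```

Because the phase decreases monotonically, it provides a one-dimensional input over which the target values can be interpolated by a nearest-neighbors regressor.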

Robot Coordination in Different Handover Scenarios
To demonstrate the capabilities of the proposed method, and to show the influence on the coordination with a human, the controller has been applied in three typical handover scenarios:
• Straightforward handover;
• Handover with a stop;
• Handover with an external push (while reaching out).
These scenarios are performed by one of the experimenters with the robotic setup described in Sect. 4. In the "straightforward handover" scenario both the human and the robot reach out to the handover location at the same time. In the "handover with a stop" scenario, however, after both the robot and the participant start, the human pauses briefly and then continues. Finally, in the "handover with an external push" scenario, there is an external disturbance (a small push) from an external agent (another one of the experimenters) on the user's arm while they are reaching the handover location. In the "straightforward handover" scenario, both the human and robot are following minimum-jerk-like trajectories and are expected to reach the goal at the same time. On the other hand, the other scenarios represent perturbations that might impede the interaction and require adaptation on the robot side.
To characterize the coordination effects, three metrics are considered, as they can provide insight into the correlation between partners' movements:
1. Normalized distance of the robot from the goal, as a function of the human distance from the goal.
2. Normalized distance of the robot and the human from the goal over time.
3. Normalized speed of the robot and the human towards the goal over time.

Randomized Trials
To assess and validate the efficacy of the proposed method, 21 participants were asked to perform an experiment collaborating with the robot with one within-participant factor, i.e., the behavior of the controller.

Experimental Protocol
Twenty-one participants (right-handed, 10 females and 11 males, aged 22-34) took part in the experiment. None of the participants reported any history of sensory or motor impairments and all of them claimed to have normal or corrected to normal vision. Informed consent in accordance with the Declaration of Helsinki was obtained from each participant before conducting the experiments.
To evaluate the behavior of the proposed controller, the experiment was set up as a perturbed handover scenario, inspired by an industrial assembly task where either:
• The robot starts at the wrong time, or
• The robot starts correctly, but the participant has changed their mind.
The object used for the handover was an empty 0.5 l plastic bottle (cylinder diameter ∼60 mm, length ∼210 mm). Participants, starting from a pre-defined home position (Fig. 3a), were required to perform a task consisting of multiple phases. The box is intentionally placed close to (∼25 cm) the final handover location. The task and the capability of the robot to move at different times and speeds were explained to each participant. Then, each participant was trained to perform the task by executing some training attempts with the robot stationary in the final position. Before the experiment, the motion capture skeleton was calibrated for each participant. A verbal signal was given to the participant to let them start performing their task. Just before the participant was instructed to start, the robot was given the command to begin performing the handover. As a result, the robot was allowed to move along the generated trajectory even before the participant received their signal.
Fig. 3 From left to right: the robot initiates the execution of the handover and the human is given the signal to initiate the task; as the human moves for an object close to the handover location, the robot accelerates (gently) toward the final position; the human interrupts the handover to place the object, the robot slows down; as the human moves back for the handover, the robot accelerates toward the final position
Three parameter settings discussed in Sect. 5.1 were randomized over 15 trials (five per setting). In total, the whole procedure lasted approximately 45 min. This study was approved by the local ethical committee of the Scuola Superiore Sant'Anna, Pisa, Italy (approval number 02/2017).

Data Analysis
For the purposes of data analysis, task performance metrics and a subjective survey were collected during the experiment.
Two objective metrics were recorded to assess the robot's performance:
1. Task completion time.
2. Percentage robot idle time.
Task completion time was defined through the phases presented in Sect. 5.3.1. Thus, it was the time from the onset of Phase 1 (human initiates the motion) until the end of Phase 3 (hands collide). Percentage robot idle time is calculated as the percentage of time the robot spent idling ($V_r \approx 0$) out of the total task completion time.
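The two objective metrics can be computed from a logged speed profile as in the following sketch. The function name, the sampling layout, and the idle threshold `idle_eps` (the numerical meaning of a robot speed of approximately zero) are hypothetical and introduced only for illustration.

```python
def task_metrics(times, robot_speeds, t_start, t_end, idle_eps=0.01):
    """Task completion time and percentage robot idle time.

    times: sample timestamps in seconds (increasing).
    robot_speeds: robot speed at each timestamp (m/s).
    t_start, t_end: onset of Phase 1 and end of Phase 3.
    idle_eps: assumed speed threshold below which the robot counts as idle.
    """
    completion = t_end - t_start
    idle = 0.0
    for i in range(1, len(times)):
        t0, t1 = times[i - 1], times[i]
        inside = t0 >= t_start and t1 <= t_end
        if inside and abs(robot_speeds[i]) < idle_eps:
            idle += t1 - t0   # the whole sample interval counts as idle
    return completion, 100.0 * idle / completion


# Made-up log, not data from the study: 1 s sampling over a 10 s task.
times = [float(t) for t in range(11)]
speeds = [0.0, 0.0, 0.0, 0.2, 0.2, 0.2, 0.2, 0.0, 0.0, 0.0, 0.0]
completion, idle_pct = task_metrics(times, speeds, t_start=0.0, t_end=10.0)
```

In this toy log the robot is below the threshold for six of the ten one-second intervals, so the idle percentage is 60 over a completion time of 10 s.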
After each trial participants were asked to give their evaluation based on a survey similar to Ref. [25]: • It was easy to receive the object.
• I was satisfied with the interaction.
• The interaction was comfortable.
• I felt safe during the interaction.
Answers to the survey were given on a 9-point scale. It is worth noting that it was explained to participants that the object would always arrive at the same location and that the release of the robotic hand would always be the same.
As an additional check for differences across the repeated trials, data were separated into three groups (first, middle, and last 5 trials) and the correlation between these groups was analyzed.

Robot Coordination in Different Handover Scenarios
To qualitatively evaluate whether the proposed method could coordinate the robot even in the presence of perturbations, we consider the plots shown in Figs. 4, 5, and 6. Figure 4 shows a comparison of the two normalized distances over time, for each of the considered cases. Each point of the curve is defined from $d_h$ and $d_r$, the distances of the human and the robot from the final handover position $g$, respectively. Intuitively, the handover starts in (0, 0), and ends in (1, 1), when both agents reach the goal. The center diagonal, shown as a dashed line in Fig. 5, is considered as a hypothetical reference of a "perfectly coordinated" handover, with both agents having covered the same relative distance at each instant. This choice considers the typical case of a human-to-human (unperturbed) handover as reported by Shibata et al. [32], with the derivative of the distance between the two hands also following a bell-shaped minimum-jerk profile as the two hands converge. Considering also the typical minimum-jerk-like trajectory of both hands, reported in the same study, such a case would produce the center diagonal line. Thus, we considered that line as representative of a hypothetical reference interaction. This approach to illustratively describe coordination borrows from a similar approach in the neuroscience literature, which presents equivalent ways to assess motor coordination during grasp [33] and to compare grasping force, e.g. healthy versus impaired participants, adults versus children, unimpaired digits versus anesthetized ones [34,35].
By observing normalized distance to goal (Fig. 5) and normalized speed (Fig. 6) we can inspect how does human move- represents the delay in speed peaks between the human and the robot in respective trials. Crosses represent the points of external push ment influence the robot's behavior. Correlation between changes in robot (dashed line) and human (continuous) represents adaptive behavior from the controller.
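The coordinates used in such a coordination plot can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes that each distance to the goal is normalized by its value at the start of the motion and expressed as covered progress, 1 - d/d_0, so that a trajectory runs from 0 to 1. The function names, the sample trajectories, and the mean-gap summary `diagonal_deviation` are hypothetical.

```python
import math


def normalized_progress(positions, goal):
    """Map a trajectory to covered progress: 0 at the start, 1 at the goal.

    Assumption: progress = 1 - d / d0, with d the current distance to the
    goal and d0 the distance at the first sample.
    """
    distances = [math.dist(p, goal) for p in positions]
    d0 = distances[0]
    if d0 == 0.0:
        raise ValueError("trajectory starts at the goal")
    return [1.0 - d / d0 for d in distances]


def diagonal_deviation(human_progress, robot_progress):
    """Mean absolute gap between the two curves (0 means on the diagonal)."""
    gaps = [abs(h - r) for h, r in zip(human_progress, robot_progress)]
    return sum(gaps) / len(gaps)


# Made-up hand positions in metres, sampled at the same instants.
goal = (0.0, 0.0, 0.0)
human = [(0.4, 0.0, 0.0), (0.3, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.0, 0.0)]
robot = [(0.0, 0.8, 0.0), (0.0, 0.6, 0.0), (0.0, 0.4, 0.0), (0.0, 0.0, 0.0)]

h = normalized_progress(human, goal)
r = normalized_progress(robot, goal)
print(list(zip(h, r)))            # points of the coordination curve
print(diagonal_deviation(h, r))   # average distance from the diagonal
```

Progress is deliberately not clamped, so an agent that is pushed away from the goal produces values that decrease again, which is the kind of perturbation the plot is meant to expose.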

Task Performance Metrics
Since the distribution is assumed to be non-normal, the bias-corrected and accelerated (BCa) bootstrap method was used to calculate the 95% confidence interval around the mean. Then, the Kruskal-Wallis non-parametric ANOVA test was used to calculate p-values between the parameter settings. Detailed results are reported in Table 2. Figure 7 compares the mean task-completion time between the three sets of parameters. While there was a significant difference between NR and R1 (p = 0.002), no statistical significance could be claimed between NR and R2 (p = 0.060).
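For readers unfamiliar with the BCa bootstrap, the following Python sketch shows how such an interval around the mean can be computed with the standard library only. It is an illustration under stated assumptions, not the analysis code of this study: the sample values, the number of resamples, and the seed are made up, and in practice a library routine such as `scipy.stats.bootstrap` with `method="BCa"` would normally be preferred.

```python
import random
from statistics import NormalDist, mean


def bca_ci(data, stat=mean, n_boot=2000, alpha=0.05, seed=0):
    """Bias-corrected and accelerated bootstrap interval for stat(data)."""
    rng = random.Random(seed)
    n = len(data)
    theta_hat = stat(data)
    # Sorted statistics of n_boot resamples drawn with replacement.
    boot = sorted(stat(rng.choices(data, k=n)) for _ in range(n_boot))
    nd = NormalDist()

    # Bias correction: shift of the bootstrap distribution around theta_hat.
    below = sum(b < theta_hat for b in boot)
    below = min(max(below, 1), n_boot - 1)  # keep inv_cdf finite
    z0 = nd.inv_cdf(below / n_boot)

    # Acceleration from jackknife (leave-one-out) estimates.
    jack = [stat(data[:i] + data[i + 1:]) for i in range(n)]
    jack_mean = mean(jack)
    num = sum((jack_mean - j) ** 3 for j in jack)
    den = 6.0 * sum((jack_mean - j) ** 2 for j in jack) ** 1.5
    a = num / den if den > 0 else 0.0

    def adjusted(q):
        """Quantile level after bias and acceleration correction."""
        z = nd.inv_cdf(q)
        return nd.cdf(z0 + (z0 + z) / (1.0 - a * (z0 + z)))

    def pick(q):
        return boot[min(n_boot - 1, max(0, int(q * n_boot)))]

    return pick(adjusted(alpha / 2)), pick(adjusted(1 - alpha / 2))


# Hypothetical task-completion times in seconds (right-skewed on purpose).
times = [10.2, 11.5, 9.8, 14.9, 10.7, 12.3, 19.6, 10.1, 11.0, 13.4]
low, high = bca_ci(times)
print(f"mean = {mean(times):.2f} s, 95% BCa CI = [{low:.2f}, {high:.2f}] s")
```

Compared with a plain percentile interval, the two corrections move the interval limits to account for skew in the data, which is why the method suits task times that are not normally distributed.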
Percentage robot idle time is represented in Fig. 8. It can be seen that the robot with the NR controller spent considerable time idling. There was a significant difference between NR and both reactive controllers (p < 0.001). Additionally, the Kruskal-Wallis non-parametric ANOVA test was used to calculate p-values between the early, middle, and late 5 trials for the task performance metrics. In terms of the percentage robot idle time, there was no significant difference across the trial repetitions. In terms of the task completion time, there was a statistically significant difference between the first and middle (p = 0.037), and between the first and last (p = 0.002) 5 trials. This is, however, to be expected, as participants improve and become accustomed to the task.
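The Kruskal-Wallis comparison of three groups (for example the early, middle, and late trials) can be illustrated as follows. This is a hedged, standard-library sketch rather than the code used for the reported p-values: the function names and the sample data are hypothetical, and the p-value uses the chi-square approximation, which for three groups (two degrees of freedom) reduces to exp(-H/2). A library routine such as `scipy.stats.kruskal` would normally be used instead.

```python
import math


def average_ranks(values):
    """Rank values from 1 to N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_3(g1, g2, g3):
    """Return (H, p) for exactly three groups; values must not all be equal."""
    groups = [g1, g2, g3]
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)

    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3.0 * (n + 1)

    # Tie correction: divide H by 1 - sum(t^3 - t) / (N^3 - N).
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    ties = sum(c ** 3 - c for c in counts.values())
    h /= 1.0 - ties / (n ** 3 - n)

    # Chi-square survival function with 2 degrees of freedom.
    return h, math.exp(-h / 2.0)


# Made-up values for three fully separated groups.
h_stat, p_value = kruskal_wallis_3([1, 2, 3], [4, 5, 6], [7, 8, 9])
print(h_stat, p_value)
```

Because the test works on ranks only, it makes no assumption about the shape of the underlying distribution, which matches the non-normality assumption stated above.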

Subjective Metrics
Again, the Kruskal-Wallis non-parametric ANOVA test was used to calculate p-values between the parameter settings, and BCa bootstrapping was used to calculate the 95% confidence interval around the mean. The results are reported in Table 3. Figure 9 presents the differences in subjective response between the parameter settings. There was no significant difference in the evaluation of how easy it was to receive the object (p > 0.05). For all the other criteria, there was a significant preference for reactive controllers. This was particularly the case in terms of satisfaction and safety.
Additionally, Kruskal-Wallis non-parametric ANOVA test was used to calculate p-values between the early, middle, and late 5 trials for the subjective metrics. There was no significant difference across the repeated trials for all the subjective metrics.

Coordination Analysis
By examining Fig. 4, it can be noted that in the "stop" and "external push" scenarios the controller is able to adapt to the unexpected disturbances by modulating the speed, thus performing the final part of the trajectory in synchrony with the human. It is also interesting to consider the "straightforward" handover scenario, where the robot and the human are almost perfectly coordinated when the controller is not active (NR). This is to be expected, as the handover can proceed with no interruptions. Furthermore, in the same case, the plots of Fig. 4 suggest that the controller with R1 and R2 parameters produced uncoordinated behavior on the robot's part. However, as the plot does not show time-relevant information, in a faster execution (as is the case for the "straight" scenario) the plot is more sensitive to smaller delays on one part. This can be verified by observing Fig. 5.
After the experiments, we noticed that the reactive controllers (with either R1 or R2 parameters) consistently produced a robot trajectory whose speed-profile peaks were delayed with respect to the human ones (as shown in Fig. 6). The trajectory reaches peak speed around 0.3-0.5 s later (Fig. 6), a value similar to a typical human sensorimotor delay [36]. Further investigation showed that no single part of the implemented architecture (in particular, the low-pass filtering) can directly produce such a delay. As such, the behavior can be attributed to the parameters of the coupling terms in R1 and R2. Furthermore, these parameters were selected from the two best sets learned from participants' preferences in Ref. [30] (i.e. the delay was never explicitly considered). Thus, this finding is in line with the human preference for human-like sensorimotor delays, as shown in Ref. [36]. Further investigation is warranted to see if such metrics could drive the learning of parameters in such systems.
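A delay between speed peaks of this kind can be estimated directly from two speed profiles sampled on a common time base. The Python sketch below is only an illustration: the profiles are synthetic minimum-jerk bells, the 0.4 s offset and the 100 Hz sampling are made-up values, and the helper names are hypothetical rather than part of the presented architecture.

```python
def min_jerk_speed(t, start, duration, distance=1.0):
    """Bell-shaped speed of a minimum-jerk reach that begins at `start`."""
    tau = (t - start) / duration
    if tau <= 0.0 or tau >= 1.0:
        return 0.0
    return 30.0 * distance / duration * tau ** 2 * (1.0 - tau) ** 2


def peak_time(times, speeds):
    """Time stamp of the largest speed sample."""
    i = max(range(len(speeds)), key=speeds.__getitem__)
    return times[i]


def peak_delay(times, human_speed, robot_speed):
    """Positive when the robot reaches its peak speed after the human."""
    return peak_time(times, robot_speed) - peak_time(times, human_speed)


# Synthetic example: 2 s sampled at 100 Hz, robot motion starting 0.4 s later.
times = [i * 0.01 for i in range(200)]
human_speed = [min_jerk_speed(t, start=0.0, duration=1.2) for t in times]
robot_speed = [min_jerk_speed(t, start=0.4, duration=1.2) for t in times]

delay = peak_delay(times, human_speed, robot_speed)
print(f"robot peak speed lags the human peak by {delay:.2f} s")
```

With noisy recordings a single maximum is fragile, so the profiles would typically be low-pass filtered first, or the lag that maximizes their cross-correlation would be used instead.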

Randomized Trials
The task is designed to represent a challenging setting for the proposed controller, as well as to highlight the aspects of collaboration that depend on mutual cooperative efforts between the agents (i.e. coordination, safety, fluency). Thus, the initial position of the box was purposely placed close (∼ 25 cm) to the handover location. Given the proposed coupling terms, this represents an explicitly bad case for the controller, as the initial target of phase 1 requires the hand also to approach the handover location.
In doing so, a more elaborate HRC task is set up. By engaging the human receiver in a secondary task prior to handover, additional constraints are placed on the interaction. This not only represents a realistic scenario from a real-world cooperative setting, but the additional coordination efforts required from the agents also better demonstrate how different parameters influence the perceived interaction and task metrics. Furthermore, as the human cannot actively pay attention to the robot for the whole duration of the task, the perceived safety of the interaction becomes more prominent.
As defined in Sect. 5.3.1, for practical purposes, the task completion time is reported until the hands collide. This is done to give a clear and precise end-point and to restrict the analysis to the part of the task that this work addresses (i.e. trajectory generation). Thus, the last phase (Phase 4) serves only as a natural conclusion of a multi-phased task, giving a final goal to the participants.
While some objective task performance metrics have been suggested for robot handover scenarios [8], many of them are difficult to apply to the proposed scenario. For example, concurrent activity could only be fairly analyzed in the handover phase of the task and only for the reactive trial types. Otherwise, simply reporting the percentage of concurrent activity would lead to a great mismatch in phases when the robot is not intended to move. For this reason, we focused on concurrent movement and related metrics in Sect. 6.1. For similar reasons, the percentage robot idle time should be inspected cautiously, especially in the NR setting, as the robot spends the majority of the interaction time idling. This does not necessarily deteriorate performance, as the robot does not have any further assignments after reaching the final location. However, the robot idling in the shared workspace could be deemed an obstruction. Additionally, robot idling could deteriorate the perceived quality of collaboration, which might correlate with the results of the subjective metrics in Sect. 6.2.2. Further, as the human is active during the whole length of the interaction, human idle time is virtually non-existent.
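To make the two fluency metrics discussed here concrete, the Python sketch below computes a percentage idle time and a percentage of concurrent activity from sampled hand speeds. It is an assumption-laden illustration and not the definition used in Sect. 5.3.1: an agent is treated as idle whenever its speed falls below a hypothetical threshold, and all sample values are made up.

```python
def percent_idle(speeds, threshold=0.01):
    """Share of samples (in %) in which the agent moves slower than threshold."""
    if not speeds:
        raise ValueError("empty speed profile")
    idle = sum(1 for v in speeds if v < threshold)
    return 100.0 * idle / len(speeds)


def percent_concurrent(human_speeds, robot_speeds, threshold=0.01):
    """Share of samples (in %) in which both agents move at the same time."""
    if len(human_speeds) != len(robot_speeds) or not human_speeds:
        raise ValueError("profiles must be non-empty and equally long")
    both = sum(
        1
        for h, r in zip(human_speeds, robot_speeds)
        if h >= threshold and r >= threshold
    )
    return 100.0 * both / len(human_speeds)


# Made-up speeds in m/s: a non-reactive robot that arrives early and waits.
robot = [0.0, 0.2, 0.3, 0.0, 0.0, 0.0, 0.0, 0.0]
human = [0.1, 0.2, 0.2, 0.3, 0.2, 0.1, 0.2, 0.0]

print(percent_idle(robot))               # robot waits for most of the task
print(percent_idle(human))               # human is active almost throughout
print(percent_concurrent(human, robot))  # overlap of the two motions
```

The example mirrors the caveat in the text: a robot that moves early and then waits scores a high idle percentage even though the task itself is completed quickly.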
From Fig. 7 it can be noted that the NR setting is somewhat faster when compared to reactive controllers. This is partially expected, as it was similarly reported in [3], because the robot heads straight to the final handover location and waits there for the human to finish their respective tasks and grasp the bottle. However, when compared to R2 there is no significant deterioration in task completion time, while both reactive controllers outperform the NR in terms of percentage robot idle time.
According to the results of the subjective survey, in terms of satisfaction, only 4 out of 21 participants preferred, on average, the NR controller over the reactive ones. Similarly, only 4 preferred the NR controller in terms of comfort. Most notably, no participant deemed the NR controller safer than the reactive ones. Thus, as far as subjective metrics are concerned, the inclination towards reactive and coordinated control is evident.
Results of the subjective survey are consistent with related works [3,9,25] on HRC and coordination, as humans place importance on adaptive efforts from their partners. Thus, further research aiming to improve temporal coordination between the agents is warranted for successful HRC implementations. In Sect. 5.2, a way to quantitatively assess this type of coordination is presented. This is in line with the previous pilot study [30], where participants did not show significant preferences between non-reactive and reactive parameter settings for the "straightforward" scenario. This was similarly presented in the studies [1,25], which have shown that humans prefer faster handovers. However, as the task gets obstructed and perturbed, fast and non-reactive controllers can deteriorate the subjective perception of the interaction.
While physiological measurements were not recorded due to practical constraints, experiment length, and focus on the technical feasibility of the method, we report having observed noticeable discomfort in certain participants, with the NR parameter setting. For example, some participants were startled by the robot which was heading directly to the final goal, while the participant was still in the first phase of the task. More drastically, certain participants, in later trials, decided to let the NR controller reach the final location before continuing with their respective tasks. Deterioration in human experience can probably also be attributed to the fact that fast robotic movements can induce stress in handover scenarios [37]. This could warrant future studies which could more accurately measure induced stress in compound HRC scenarios that involve controllers with different adaptive capabilities.
Somewhat in contrast with [3], we have shown in Sect. 6.2.1 that reactive (or more akin to their proposed "adaptive") controllers do not necessarily deteriorate task performance metrics to a significant degree and might even improve on certain metrics. This could be due to the fact that this work considers control on a lower level, leading to faster and more robust trajectory generation, as opposed to high-level task planning. As discussed, considering perturbations and interruptions in a handover setting limits the number of objective metrics that apply. Nevertheless, the considered metrics represent a suitable baseline for objective performance analysis and encourage further considerations for metrics that could be employed in similar scenarios. As HRC tasks can range in complexity, and the experience is shaped by multiple criteria (as explored in the subjective survey), additional objective metrics are needed to accurately assess the interaction. As stated, literature on less-structured handover tasks is sparse, and as it develops, hopefully more objective metrics will be proposed.
Considering the results of task performance metrics and the subjective survey, the "fast reactive" (R2) controller could provide a good trade-off between quality and efficacy. The proposed controller does not significantly deteriorate task completion time, while significantly improving the percentage robot idle time and the subjective metrics.

Conclusion
In this paper, a method for robust handovers with online trajectory generation using DMP is proposed. This method allows for dynamic handovers which take cues from human hand motion, and thus allow the robot to coordinate accordingly. The approach has been shown to be capable of stopping when appropriate, persisting through minor disturbances on the partner's trajectory, and completing the task. Then, the method was assessed through coordination analysis and validated through objective task performance metrics and a subjective survey on user experience.
While this paper was focused on the temporal aspect of coordination, and thus handover location was predetermined, extension with a goal-free implementation is in development. Thus, a complete framework for online trajectory generation for dynamic and robust handover can be presented. Furthermore, as simplistic hand release was used in this work, reactive capabilities could be added to the release controller to improve the fluency of the task. While implemented virtual compliance made the release more natural as the robotic hand was compliant to human pull, the release triggered by force threshold could be deemed as "stiff" by human partners.
It is apparent that HRC research can benefit from methods that can use implicit cues to be more robust against uncertainty or unexpected external factors. This is particularly the case for methods where the robot is made to learn some behavior: it is easy to showcase a correct execution, but it is difficult to learn all the ways in which it can go wrong. Extended research on this topic could lead to a more standardized analysis of managing disturbances and unexpected behavior. Having better-suited objective metrics to assess these kinds of scenarios would greatly benefit the research community and robotic development for real-world HRC in general.

Conflict of interest
The authors declare they have no competing interests.

Ethical approval
All procedures performed in the study involving human participants were in accordance with the ethical standards of the institutional committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards. The study was approved by the local ethical committee of the Scuola Superiore Sant'Anna, Pisa, Italy (approval number 02/2017).

Consent to participate
Informed consent was obtained from the study participants. Participants enrolled in the study on a voluntary basis.

Consent for publication
Not applicable.

Open Access
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

strategies. Her M.Sc. degree thesis won the student award of "Luigi Divieti e Marisa Marazana" offered by Politecnico di Milano during the XXXVIII Annual School of the Italian national bioengineering group (GNB) held in 2017. In 2018, she spent three months at the Australian Centre of Artificial Vision, Queensland University of Technology, Australia, where she investigated the principles that guide the human grasp choice during a collaborative action and that can be implemented on robotic platforms. Dr. Cini authored 5 peer reviewed scientific papers on international journals in the field of robotics and neuroscience. Since 2022, she is working as Control Software Engineer for Prensilia s.r.l., a spin-off of the Scuola Superiore Sant'Anna. Her research focuses on assistive strategies for teleoperation for the collection of marine samples. Currently, she is working on APRIL project, studying human behavior in grasping deformable objects to develop algorithms for robotic grasping. Her research interests are in the design of robotic hardware and control algorithms for dexterous robotic manipulation and teleoperation.

Angela
Egidio Falotico received the dual Ph.D. degree in innovative technologies from Scuola Superiore Sant'Anna, Pisa, Italy, and in cognitive science from Pierre et Marie Curie University Paris, France, in 2013, and the M.Sc. degree in computer science from the University of Pisa, Italy, in 2008. He is currently a tenure-track assistant professor with the BioRobotics Institute, Scuola Superiore Sant'Anna. Since his early studies, he has developed a strong interest in the domain of neuroscience and, through his double Ph.D. degree, he had the chance to explore the potential of neuroscience knowledge applied to robotics. He has also put into practice his deep expertise in artificial intelligence and computational neuroscience for the control of soft and rigid robots. He is head of the BRAin-Inspired Robotics (BRAIR) laboratory at the BioRobotics Institute and serves as PI for Scuola Superiore Sant'Anna in some European projects, such as Proboscis, GrowBot and Human Brain Project.

Srl. His research focuses on the design of robotic systems interacting with human beings. He worked on mechatronic design, development, and controllability issues of dexterous robotic upper limbs to be used as thought-controlled prostheses and to be used also as smart end-effectors for a new generation of robotic systems that will be able to cooperate and support humans in a wide range of activities.