
A dynamical system approach to task-adaptation in physical human–robot interaction

Abstract

The goal of this work is to enable robots to intelligently and compliantly adapt their motions to the intention of a human during physical Human–Robot Interaction in a multi-task setting. We employ a class of parameterized dynamical systems that allows for smooth and adaptive transitions between encoded tasks. To comply with human intention, we propose a mechanism that adapts generated motions (i.e., the desired velocity) to those intended by the human user (i.e., the real velocity), thereby switching to the most similar task. We provide a rigorous analytical evaluation of our method in terms of stability, convergence, and optimality, yielding an interaction behavior which is safe and intuitive for the human. We investigate our method through experimental evaluations across different setups: a 3-DoF haptic device, a 7-DoF manipulator and a mobile platform.

References

  1. Aarno, D., & Kragic, D. (2008). Motion intention recognition in robot assisted applications. Robotics and Autonomous Systems, 56(8), 692–705.

  2. Bandyopadhyay, T., Chong, Z. J., Hsu, D., Ang Jr, M. H., Rus, D., & Frazzoli, E. (2012). Intention-aware pedestrian avoidance. In ISER (pp. 963–977).

  3. Berger, E., Sastuba, M., Vogt, D., Jung, B., & Ben Amor, H. (2015). Estimation of perturbations in robotic behavior using dynamic mode decomposition. Advanced Robotics, 29(5), 331–343.

  4. Billard, A. (2017). On the mechanical, cognitive and sociable facets of human compliance and their robotic counterparts. Robotics and Autonomous Systems, 88, 157–164.

  5. Burdet, E., Osu, R., Franklin, D. W., Milner, T. E., & Kawato, M. (2001). The central nervous system stabilizes unstable dynamics by learning optimal impedance. Nature, 414(6862), 446–449.

  6. Bussy, A., Gergondet, P., Kheddar, A., Keith, F., & Crosnier, A. (2012a). Proactive behavior of a humanoid robot in a haptic transportation task with a human partner. In IEEE RO-MAN (pp. 962–967).

  7. Bussy, A., Kheddar, A., Crosnier, A., & Keith, F. (2012b). Human-humanoid haptic joint object transportation case study. In IEEE/RSJ international conference on intelligent robots and systems (IROS) (pp. 3633–3638).

  8. Calinon, S., Bruno, D., & Caldwell, D. G. (2014). A task-parameterized probabilistic model with minimal intervention control. In IEEE international conference on robotics and automation (ICRA) (pp. 3339–3344).

  9. Cherubini, A., Passama, R., Crosnier, A., Lasnier, A., & Fraisse, P. (2016). Collaborative manufacturing with physical human–robot interaction. Robotics and Computer-Integrated Manufacturing, 40, 1–13.

  10. Corteville, B., Aertbeliën, E., Bruyninckx, H., De Schutter, J., & Van Brussel, H. (2007). Human-inspired robot assistant for fast point-to-point movements. In IEEE international conference on robotics and automation (pp. 3639–3644).

  11. Davidson, P. R., & Wolpert, D. M. (2003). Motor learning and prediction in a variable environment. Current Opinion in Neurobiology, 13(2), 232–237.

  12. De Santis, A., Siciliano, B., De Luca, A., & Bicchi, A. (2008). An atlas of physical human–robot interaction. Mechanism and Machine Theory, 43(3), 253–270.

  13. Dragan, A. D., Lee, K. C., & Srinivasa, S. S. (2013). Legibility and predictability of robot motion. In 2013 8th ACM/IEEE international conference on human–robot interaction (HRI) (pp. 301–308). IEEE.

  14. Duchaine, V., & Gosselin, C. M. (2007). General model of human–robot cooperation using a novel velocity based variable impedance control. In Second joint EuroHaptics conference and symposium on haptic interfaces for virtual environment and teleoperator systems (pp. 446–451).

  15. Evrard, P., & Kheddar, A. (2009). Homotopy switching model for dyad haptic interaction in physical collaborative tasks. In Third joint EuroHaptics conference and symposium on Haptic interfaces for virtual environment and teleoperator systems (pp. 45–50).

  16. Ewerton, M., Neumann, G., Lioutikov, R., Amor, H. B., Peters, J., & Maeda, G. (2015). Learning multiple collaborative tasks with a mixture of interaction primitives. In IEEE international conference on robotics and automation (ICRA) (pp. 1535–1542).

  17. Ganesh, G., Albu-Schäffer, A., Haruno, M., Kawato, M., & Burdet, E. (2010). Biomimetic motor behavior for simultaneous adaptation of force, impedance and trajectory in interaction tasks. In IEEE international conference on robotics and automation (ICRA) (pp. 2705–2711).

  18. Ganesh, G., Takagi, A., Osu, R., Yoshioka, T., Kawato, M., & Burdet, E. (2014). Two is better than one: Physical interactions improve motor performance in humans. Scientific Reports, 4, 3824.

  19. Ghadirzadeh, A., Bütepage, J., Maki, A., Kragic, D., & Björkman, M. (2016). A sensorimotor reinforcement learning framework for physical human–robot interaction. In IEEE/RSJ international conference on intelligent robots and systems (IROS) (pp. 2682–2688).

  20. Gribovskaya, E., Kheddar, A., & Billard, A. (2011). Motion learning and adaptive impedance for robot control during physical interaction with humans. In IEEE international conference on robotics and automation (ICRA) (pp. 4326–4332).

  21. Haddadin, S., Albu-Schäffer, A., De Luca, A., & Hirzinger, G. (2008). Collision detection and reaction: A contribution to safe physical human–robot interaction. In Intelligent robots and systems, 2008. IEEE/RSJ international conference on IROS 2008 (pp. 3356–3363). IEEE.

  22. Hogan, N. (1988). On the stability of manipulators performing contact tasks. IEEE Journal on Robotics and Automation, 4(6), 677–686.

  23. Jarrassé, N., Paik, J., Pasqui, V., & Morel, G. (2008). How can human motion prediction increase transparency? In IEEE international conference on robotics and automation (ICRA) (pp. 2134–2139).

  24. Khansari-Zadeh, S. M., & Billard, A. (2011). Learning stable nonlinear dynamical systems with Gaussian mixture models. IEEE Transactions on Robotics, 27(5), 943–957.

  25. Khoramshahi, M., Shukla, A., & Billard, A. (2014). Cognitive mechanism in synchronized motion: An internal predictive model for manual tracking control. In IEEE international conference on systems, man and cybernetics (pp. 765–771).

  26. Kim, W., Lee, J., Peternel, L., Tsagarakis, N., & Ajoudani, A. (2017). Anticipatory robot assistance for the prevention of human static joint overloading in human–robot collaboration. IEEE Robotics and Automation Letters.

  27. Kouris, A., Dimeas, F., & Aspragathos, N. (2018). A frequency domain approach for contact type distinction in human–robot collaboration. IEEE Robotics and Automation Letters.

  28. Kronander, K., & Billard, A. (2016). Passive interaction control with dynamical systems. IEEE Robotics and Automation Letters, 1(1), 106–113.

  29. Landi, C. T., Ferraguti, F., Sabattini, L., Secchi, C., & Fantuzzi, C. (2017). Admittance control parameter adaptation for physical human–robot interaction. arXiv:1702.08376.

  30. Lee, S. Y., Lee, K. Y., Lee, S. H., Kim, J. W., & Han, C. S. (2007). Human–robot cooperation control for installing heavy construction materials. Autonomous Robots, 22(3), 305.

  31. Lee, S. H., Suh, I. H., Calinon, S., & Johansson, R. (2015). Autonomous framework for segmenting robot trajectories of manipulation task. Autonomous Robots, 38(2), 107–141.

  32. Leica, P., Roberti, F., Monllor, M., Toibero, J. M., & Carelli, R. (2017). Control of bidirectional physical human–robot interaction based on the human intention. Intelligent Service Robotics, 10(1), 31–40.

  33. Li, Y., Yang, C., & He, W. (2016). Towards coordination in human-robot interaction by adaptation of robot’s cost function. In International conference on advanced robotics and mechatronics (ICARM) (pp. 254–259).

  34. Li, Y., Tee, K. P., Chan, W. L., Yan, R., Chua, Y., & Limbu, D. K. (2015). Continuous role adaptation for human–robot shared control. IEEE Transactions on Robotics, 31(3), 672–681.

  35. Maeda, Y., Hara, T., & Arai, T. (2001). Human-robot cooperative manipulation with motion estimation. In IEEE/RSJ international conference on intelligent robots and systems (IROS) (pp. 2240–2245).

  36. Maeda, G. J., Neumann, G., Ewerton, M., Lioutikov, R., Kroemer, O., & Peters, J. (2017). Probabilistic movement primitives for coordination of multiple human–robot collaborative tasks. Autonomous Robots, 41(3), 593–612.

  37. Medina, J. R., Lawitzky, M., Mörtl, A., Lee, D., & Hirche, S. (2011). An experience-driven robotic assistant acquiring human knowledge to improve haptic cooperation. In IEEE/RSJ international conference on intelligent robots and systems (IROS) (pp. 2416–2422).

  38. Medina, J. R., Lee, D., & Hirche, S. (2012). Risk-sensitive optimal feedback control for haptic assistance. In IEEE international conference on robotics and automation (ICRA) (pp. 1025–1031).

  39. Modares, H., Ranatunga, I., Lewis, F. L., & Popa, D. O. (2016). Optimized assistive human–robot interaction using reinforcement learning. IEEE Transactions on Cybernetics, 46(3), 655–667.

  40. Noohi, E., Žefran, M., & Patton, J. L. (2016). A model for human–human collaborative object manipulation and its application to human-robot interaction. IEEE Transactions on Robotics, 32(4), 880–896.

  41. Peternel, L., Petrič, T., & Babič, J. (2017). Robotic assembly solution by human-in-the-loop teaching method based on real-time stiffness modulation. Autonomous Robots, 42, 1–17.

  42. Peternel, L., Petrič, T., Oztop, E., & Babič, J. (2014). Teaching robots to cooperate with humans in dynamic manipulation tasks based on multi-modal human-in-the-loop approach. Autonomous Robots, 36(1–2), 123–136.

  43. Petrič, T., Babič, J., et al. (2016). Cooperative human-robot control based on Fitts’ law. In 2016 IEEE-RAS 16th international conference on humanoid robots (humanoids) (pp. 345–350). IEEE.

  44. Pistillo, A., Calinon, S., & Caldwell, D. G. (2011). Bilateral physical interaction with a robot manipulator through a weighted combination of flow fields. In IEEE/RSJ international conference on intelligent robots and systems (IROS) (pp. 3047–3052).

  45. Ravichandar, H. C., & Dani, A. (2015). Human intention inference and motion modeling using approximate EM with online learning. In 2015 IEEE/RSJ international conference on intelligent robots and systems (IROS) (pp. 1819–1824). IEEE.

  46. Rozo, L., Calinon, S., Caldwell, D., Jiménez, P., & Torras, C. (2013). Learning collaborative impedance-based robot behaviors. In AAAI conference on artificial intelligence (pp. 1422–1428).

  47. Sartori, L., Becchio, C., & Castiello, U. (2011). Cues to intention: The role of movement information. Cognition, 119(2), 242–252.

  48. Sawers, A., Bhattacharjee, T., McKay, J. L., Hackney, M. E., Kemp, C. C., & Ting, L. H. (2017). Small forces that differ with prior motor experience can communicate movement goals during human–human physical interaction. Journal of Neuroengineering and Rehabilitation, 14(1), 8.

  49. Sebanz, N., & Knoblich, G. (2009). Prediction in joint action: What, when, and where. Topics in Cognitive Science, 1(2), 353–367.

  50. Stefanov, N., Peer, A., & Buss, M. (2010). Online intention recognition for computer-assisted teleoperation. In 2010 IEEE international conference on robotics and automation (ICRA) (pp. 5334–5339). IEEE.

  51. Strabala, K. W., Lee, M. K., Dragan, A. D., Forlizzi, J. L., Srinivasa, S., Cakmak, M., et al. (2013). Towards seamless human–robot handovers. Journal of Human–Robot Interaction, 2(1), 112–132.

  52. Takeda, T., Kosuge, K., & Hirata, Y. (2005). HMM-based dance step estimation for dance partner robot -ms dance-. In 2005 IEEE/RSJ international conference on intelligent robots and systems, 2005. (IROS 2005) (pp. 3245–3250). IEEE.

  53. van der Wel, R. P., Knoblich, G., & Sebanz, N. (2011). Let the force be with us: Dyads exploit haptic coupling for coordination. Journal of Experimental Psychology: Human Perception and Performance, 37(5), 1420.

  54. Vanderborght, B., Albu-Schäffer, A., Bicchi, A., Burdet, E., Caldwell, D. G., Carloni, R., et al. (2013). Variable impedance actuators: A review. Robotics and Autonomous Systems, 61(12), 1601–1614.

  55. Vesper, C., Butterfill, S., Knoblich, G., & Sebanz, N. (2010). A minimal architecture for joint action. Neural Networks, 23(8), 998–1003.

  56. Wang, W., Li, R., Chen, Y., & Jia, Y. (2018). Human intention prediction in human–robot collaborative tasks. In Companion of the 2018 ACM/IEEE international conference on human–robot interaction (pp. 279–280). ACM.

Acknowledgements

We acknowledge support from the European Community's Horizon 2020 Research and Innovation programme ICT-23-2014, grant agreements 644727-CogIMon and 643950-SecondHands. Thanks to José R. Medina, Klas Kronander, and Guillaume deChambrier for their help with the controller implementations on the Kuka LWR 4+ and the Clearpath Ridgeback.

Author information

Correspondence to Mahdi Khoramshahi.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (mp4 14313 KB)

Supplementary material 2 (mp4 18183 KB)

Appendices

Appendix A: Mathematical details

A.1 An implementation of Winner-take-all

In the following, we introduce a simple implementation of the winner-take-all (WTA) process used in this work; see Algorithm 1. The algorithm takes two inputs: the vector of current beliefs and the vector of their updates (computed by the adaptation mechanism). We assume that the beliefs lie between 0 and 1 and sum to unity. In the first step (line 1), the element with the greatest update is detected as the winner. In case of multiple maxima, the winner can be picked randomly. The following lines (2–5) handle the case where the winner is already saturated at 1, in which no update is necessary. Lines 6–8 make sure that only the winner has a positive update. This is done by detecting the second-biggest update value and setting the baseline midway between it and the winner's update. Again, in case of multiple maxima, one can pick randomly. The rest of the algorithm ensures that the belief updates sum to zero, which guarantees that the sum of the beliefs stays constant. To do this, we compute the sum of the current updates (S), excluding those components that are saturated at zero and have negative updates (line 11), since they do not influence the process. Based on the previous steps (lines 6–8), S is guaranteed to be non-positive. By adding the magnitude of this value to the winner component, we ensure that the updates of the active components sum to zero; thus, the sum of the beliefs stays one.

[Algorithm 1: Winner-take-all (WTA) process; listing not reproduced here]
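
The steps described above can be sketched in a few lines of Python. This is a minimal illustration written from the prose description only, not the authors' Algorithm 1: the function name `winner_take_all`, the tolerance `tol`, and the deterministic tie-breaking (lowest index instead of a random pick) are assumptions. The final step subtracts the non-positive sum S from the winner's update, which is the sign needed for the active updates to cancel.

```python
def winner_take_all(beliefs, updates, tol=1e-12):
    """Return belief updates after one winner-take-all step (hypothetical helper).

    beliefs: floats in [0, 1] that sum to 1.
    updates: raw belief updates from the adaptation rule.
    """
    n = len(beliefs)
    if n < 2:
        return [0.0] * n

    # The component with the largest raw update is the winner
    # (ties go to the lowest index here; the text allows a random pick).
    winner = max(range(n), key=lambda i: updates[i])

    # A winner already saturated at 1 needs no update.
    if beliefs[winner] >= 1.0 - tol:
        return [0.0] * n

    # Place the baseline midway between the winner and the runner-up,
    # so that only the winner keeps a positive update.
    runner_up = max((i for i in range(n) if i != winner),
                    key=lambda i: updates[i])
    baseline = 0.5 * (updates[winner] + updates[runner_up])
    shifted = [u - baseline for u in updates]

    # Sum the updates of the active components; components stuck at zero
    # with a negative update are skipped because they cannot decrease.
    total = sum(s for b, s in zip(beliefs, shifted) if b > tol or s > 0.0)

    # total is non-positive, so this raises the winner's update until
    # the active updates sum to zero.
    shifted[winner] -= total
    return shifted


beliefs = [0.5, 0.3, 0.2, 0.0]
raw = [1.0, 0.4, -0.2, -1.0]
wta = winner_take_all(beliefs, raw)
active_sum = sum(u for b, u in zip(beliefs, wta) if b > 0.0 or u > 0.0)
print(wta, active_sum)
```

The caller is still expected to integrate these updates and clamp each belief to the interval [0, 1]; the inactive component in the example keeps a negative update that the clamp absorbs.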

A.2 Pairwise competitions

The adaptation rule (Eq. 10) can be rewritten as follows:

$$\begin{aligned} \begin{aligned} \epsilon ^{-1}~\dot{\hat{b}}_i&= - |\dot{x}_r - f_i|^2 - 2 \sum \limits _{j\ne i} b_j f_j^Tf_i \\&= - |\dot{x}_r - f_i|^2 - 2 \sum \limits _{j}b_j f_j^Tf_i + 2b_i |f_i|^2 \\&= - |\dot{x}_r - f_i|^2 - 2 ~\dot{x}_d^Tf_i + 2b_i |f_i|^2 \end{aligned} \end{aligned}$$
(22)

A pairwise distance between the beliefs of two arbitrary DS (e.g., k and l), which takes on values between -1 and 1, can be defined as follows.

$$\begin{aligned} \begin{aligned} \varDelta b_{kl}&= b_k - b_l \end{aligned} \end{aligned}$$
(23)

Since the WTA process preserves the pairwise distances among the beliefs (Eq. 12), the dynamics of the beliefs after WTA can be approximated by those before WTA (which have slower dynamics).

$$\begin{aligned} \begin{aligned} \varDelta \dot{b}_{kl} =&~ \dot{b}_k - \dot{b}_l \simeq \dot{\hat{b}}_k - \dot{\hat{b}}_l \\ \end{aligned} \end{aligned}$$
(24)

Using Eq. (22), we can write

$$\begin{aligned} \begin{aligned} \epsilon ^{-1} \varDelta \dot{b}_{kl}=&- |\dot{x}_r - f_k|^2 + |\dot{x}_r - f_l|^2 \\&- 2 \dot{x}_d^T (f_k - f_l) + 2b_k |f_k|^2 - 2b_l |f_l|^2\\ =&-f_k^2 + f_l^2 + 2\dot{x}_r^T(f_k - f_l) \\&- 2 \dot{x}_d^T (f_k - f_l) + 2b_k |f_k|^2 - 2b_l |f_l|^2\\ =&~2 \dot{e}^T (f_k - f_l) + (2b_k-1)|f_k|^2 + (1-2b_l)|f_l|^2\\ \end{aligned} \end{aligned}$$
(25)
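
As a numerical sanity check of Eq. (25), the sketch below compares the difference of two raw belief updates, written in the form that appears in the last line of Eq. (29), against the closed-form right-hand side of Eq. (25). The helper names, the random task velocities, and the value of \(\epsilon\) are illustrative assumptions; only the two formulas come from the text.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def raw_update(i, beliefs, fields, xr_dot, eps):
    """-eps * (|xr_dot - f_i|^2 + 2 * sum_{j != i} b_j f_j^T f_i)."""
    f_i = fields[i]
    diff = [a - b for a, b in zip(xr_dot, f_i)]
    cross = sum(b * dot(f, f_i)
                for j, (b, f) in enumerate(zip(beliefs, fields)) if j != i)
    return -eps * (dot(diff, diff) + 2.0 * cross)


def pairwise_rate(k, l, beliefs, fields, xr_dot, eps):
    """eps times the last line of Eq. (25)."""
    dim = len(xr_dot)
    xd_dot = [sum(b * f[d] for b, f in zip(beliefs, fields)) for d in range(dim)]
    e_dot = [r - d for r, d in zip(xr_dot, xd_dot)]
    f_k, f_l = fields[k], fields[l]
    df = [a - b for a, b in zip(f_k, f_l)]
    return eps * (2.0 * dot(e_dot, df)
                  + (2.0 * beliefs[k] - 1.0) * dot(f_k, f_k)
                  + (1.0 - 2.0 * beliefs[l]) * dot(f_l, f_l))


random.seed(0)
fields = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(4)]
xr_dot = [random.uniform(-1, 1) for _ in range(3)]
beliefs = [0.4, 0.3, 0.2, 0.1]
eps = 4.0

lhs = (raw_update(0, beliefs, fields, xr_dot, eps)
       - raw_update(1, beliefs, fields, xr_dot, eps))
rhs = pairwise_rate(0, 1, beliefs, fields, xr_dot, eps)
print(abs(lhs - rhs))
```

The two quantities agree up to floating-point rounding, because the \(|\dot{x}_r|^2\) contributions cancel in the difference.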

A.3 Optimality principle

First, we reformulate the cost function using the expansion of \(|\dot{x}_d|^2\).

$$\begin{aligned} |\dot{x}_d|^2 = \sum \limits _{i=1}^{N} b_i^2 |f_i|^2 + \sum \limits _{i \ne j} (b_i f_i)^T (b_j f_j) \end{aligned}$$
(26)

This leads to

$$\begin{aligned} \begin{aligned} J(B) =&|\dot{e}|^2 + \sum \limits _{i=1}^{N} b_i (1-b_i) |f_i|^2 \\ =&|\dot{e}|^2 + \sum \limits _{i=1}^{N} b_i |f_i|^2 - |\dot{x}_d|^2 + \sum \limits _{i \ne j} (b_i f_i)^T (b_j f_j) \end{aligned} \end{aligned}$$
(27)

By expanding \(|\dot{e}|^2=|\dot{x}_r - \dot{x}_d|^2\), we have

$$\begin{aligned} \begin{aligned} J(B) =&-2 \dot{x}_r^T \dot{x}_d + \sum \limits _{i=1}^{N} b_i (|f_i|^2 + |\dot{x}_r|^2) + \sum \limits _{i \ne j} (b_i f_i)^T (b_j f_j) \end{aligned} \end{aligned}$$
(28)

We can show that the presented adaptive mechanism minimizes this cost function by moving along its negative gradient:

$$\begin{aligned} \begin{aligned} \dot{\hat{b}}_i&= -\epsilon ~ \frac{\partial J}{\partial b_i} \\&= -\epsilon ~\left( -2\dot{x}_r^T~ \frac{\partial \dot{x}_d}{\partial b_i} + |f_i|^2 + |\dot{x}_r|^2 + 2 \sum \limits _{j \ne i} (b_j f_j)^T f_i \right) \\&= -\epsilon ~ \left( -2\dot{x}_r^T f_i + |f_i|^2 + |\dot{x}_r|^2 + 2 \sum \limits _{j \ne i} b_j f_j^T f_i \right) \\&= -\epsilon ~ \left( |\dot{x}_r - f_i|^2 + 2 \sum \limits _{j \ne i} b_j f_j^T f_i\right) \end{aligned} \end{aligned}$$
(29)

In this derivation, we assume \({\partial \dot{x}_r}/{\partial b_i}=0\) since the real velocity is the given input to the adaptive mechanism. Moreover, a simple approximation of this cost function (Eq. 27) can be obtained as

$$\begin{aligned} \begin{aligned} \tilde{J}(B)&\simeq |\dot{e}|^2 + |\tilde{f}|^2 \sum \limits _{i=1}^{N} b_i (1-b_i) \\&\simeq |\dot{e}|^2 + (1-b^{\star }) |\tilde{f}|^2 \\ \end{aligned} \end{aligned}$$
(30)

where \(|f_i|\simeq |\tilde{f}|\) and the summation is approximated by \(1-b^{\star }\), \(b^{\star }\) being the maximum \(b_i\). To simplify further, we can scale the cost by \(|\tilde{f}|^2\) and remove the offset.

$$\begin{aligned} \begin{aligned} \bar{J}(B) = \tilde{J}(B)/|\tilde{f}|^2 - 1 = |\dot{e}|^2/|\tilde{f}|^2 - b^{\star } \end{aligned} \end{aligned}$$
(31)

which shows the adaptation is a trade-off between minimizing the scaled-error and maximizing the maximum-belief. Moreover, in cases without perturbations (i.e., \(\dot{e}=0\) such as the autonomous mode), adaptation maximizes the belief of the most certain task.
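
As a cross-check that is not part of the original derivation, one can differentiate the first line of Eq. (27) directly, using \(\partial \dot{e}/\partial b_i = -f_i\), and compare the result with the bracket in the last line of Eq. (29):

$$\begin{aligned} \begin{aligned} \frac{\partial }{\partial b_i}\Big ( |\dot{e}|^2 + \sum \limits _{j=1}^{N} b_j (1-b_j) |f_j|^2 \Big )&= -2 \dot{e}^T f_i + (1-2b_i)|f_i|^2 \\ |\dot{x}_r - f_i|^2 + 2 \sum \limits _{j \ne i} b_j f_j^T f_i&= |\dot{x}_r|^2 - 2 \dot{e}^T f_i + (1-2b_i)|f_i|^2 \end{aligned} \end{aligned}$$

The two expressions differ only by \(|\dot{x}_r|^2\), which enters because Eq. (28) uses \(\sum b_i = 1\) to write \(|\dot{x}_r|^2\) as \(\sum b_i |\dot{x}_r|^2\). This offset is the same for every task, so it cancels in pairwise differences such as Eq. (25).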

A.4 Convergence to demonstration

Substituting the error \(\dot{e} = f_k - \dot{x}_d\) (i.e., the human demonstrates task k, \(\dot{x}_r = f_k\)) and the definition of \(\dot{x}_d\) into Eq. (25), and defining \(\delta _{kl} = \sum \limits _{i \ne k,l} b_i f_i \), we have

$$\begin{aligned} \begin{aligned} \epsilon ^{-1} \varDelta \dot{b}_{kl}&= 2 (f_k - \dot{x}_d)^T (f_k - f_l) + (2b_k-1)|f_k|^2 + (1-2b_l)|f_l|^2 \\&= 2 \Big ( (1-b_k)f_k - b_l f_l - \sum \limits _{i\ne k,l} b_i f_i \Big )^T (f_k - f_l) \\&\quad + (2b_k-1)|f_k|^2 + (1-2b_l)|f_l|^2 \\&= |f_k|^2 + |f_l|^2 - 2 (1+b_l-b_k)f_k^Tf_l - 2 \delta _{kl}^T (f_k - f_l) \\&= |f_k - (f_l + \delta _{kl})|^2 - 2(b_l - b_k) f_k^T f_l - |\delta _{kl}|^2 \end{aligned} \end{aligned}$$
(32)

To have convergence to \(b_k=1\), it is required that \(\varDelta \dot{b}_{kl} >0\); therefore:

$$\begin{aligned} |f_k - (f_l + \delta _{kl})|^2 > 2(b_l - b_k) f_k^T f_l + |\delta _{kl}|^2 \end{aligned}$$
(33)

A.5 Convergence in the autonomous condition

In the absence of human perturbations on the real velocity (i.e., \(F_h =0\) in Eq. 3), and with the assumption of perfect tracking (i.e., \(\dot{e}=0\)), Eq. (25) can be simplified to

$$\begin{aligned} \epsilon ^{-1} \varDelta \dot{b}_{kl} = (2b_k-1)|f_k|^2 + (1-2b_l) |f_l|^2 \end{aligned}$$
(34)

In this case, when the belief of the dominant task (\(b_k\)) is bigger than 0.5, all other beliefs are necessarily less than 0.5 (since \(\sum b_i =1\)); therefore, both terms on the right-hand side are positive and, consequently, \(\varDelta \dot{b}_{kl} >0\). This means that the difference between \(b_k\) and \(b_l\) increases over time until the saturation points \(b_k=1\) and \(b_l=0\). Assuming \(|f|^2=\min (|f_k|^2,|f_l|^2)\), we have

$$\begin{aligned} \epsilon ^{-1} \varDelta \dot{b}_{kl} > (2b_k-1) |f|^2 + (1-2b_l) |f|^2 = 2 \varDelta b_{kl} |f|^2 \end{aligned}$$
(35)

which shows that the beliefs converge exponentially at a rate of \(2\epsilon |f|^2\). By assuming \(b_k+b_l=1\), we have \(\varDelta {b}_{kl} = 2b_k-1\), which changes Eq. (35) to

$$\begin{aligned} \dot{b}_k \simeq 2\epsilon |f|^2 (b_k - 0.5) \end{aligned}$$
(36)

The solution to this equation is

$$\begin{aligned} {b}_k(t) \simeq 0.5 + (b_k(0) - 0.5) \exp (2\epsilon |f|^2 t) \end{aligned}$$
(37)

Therefore, the convergence time \(T_{auto}\), defined by \(b_k(T_{auto})=1\), is

$$\begin{aligned} T_{auto} \simeq \log \left( \frac{0.5}{b_k(0)-0.5}\right) /(2 \epsilon |f|^2) \end{aligned}$$
(38)

Moreover, in Eq. (35), the particular case of two tasks with equal beliefs (\(b_k=b_l=0.5\)) is an unstable equilibrium point for the adaptation, where the system generates motions based on \(0.5 (f_k+f_l)\). Therefore, the adaptation in the autonomous condition is only guaranteed if there is a task with \(b_i>0.5\), which requires human supervision to ensure that the robot has received enough demonstrations before the human retracts from the physical interaction; e.g., the human retracts only if he/she is confident that the robot has switched to the intended task.
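
To make Eqs. (36)–(38) concrete, the following sketch evaluates the closed-form convergence time and compares it with a forward-Euler integration of Eq. (36). The function names, the initial belief of 0.6, and the step size are assumptions chosen for illustration; \(\epsilon = 4\) and \(|f|^2 = 0.1\) are the values quoted for Sect. 6.2 in Appendix A.7.

```python
import math


def t_auto(b0, eps, f_sq):
    """Eq. (38): time for b_k to reach 1, starting from b0 > 0.5."""
    if not 0.5 < b0 <= 1.0:
        raise ValueError("b0 must lie in (0.5, 1]")
    return math.log(0.5 / (b0 - 0.5)) / (2.0 * eps * f_sq)


def simulate_t_auto(b0, eps, f_sq, dt=1e-4):
    """Integrate Eq. (36) with forward Euler until b_k saturates at 1."""
    if not 0.5 < b0 <= 1.0:
        raise ValueError("b0 must lie in (0.5, 1]")
    b, t = b0, 0.0
    while b < 1.0:
        b += dt * 2.0 * eps * f_sq * (b - 0.5)
        t += dt
    return t


closed_form = t_auto(0.6, eps=4.0, f_sq=0.1)
simulated = simulate_t_auto(0.6, eps=4.0, f_sq=0.1)
print(closed_form, simulated)
```

The closer the initial belief is to 0.5, the longer the autonomous convergence takes, which is the numerical counterpart of the unstable equilibrium discussed above.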

A.6 Stability in the autonomous condition

Assuming there is a task with \(b_k>0.5\), we can use its Lyapunov function (\(V_k(x)\)) to investigate the stability of the motion generation in the autonomous condition as follows:

$$\begin{aligned} \begin{aligned} \dot{V}_k&= \left( \frac{\partial V_k}{\partial x}\right) ^T \dot{x}_d = \left( \frac{\partial V_k}{\partial x}\right) ^T (b_k f_k + \sum _{i \ne k} b_i f_i) \\&= b_k \left( \frac{\partial V_k}{\partial x}\right) ^T f_k + \sum _{i \ne k} b_i \left( \frac{\partial V_k}{\partial x}\right) ^T f_i \end{aligned} \end{aligned}$$
(39)

Based on the stability of the DS (Assumption 1), \((\frac{\partial V_k}{\partial x})^T f_k <0\). We further assume that the perturbations are bounded: \(|(\frac{\partial V_k}{\partial x})^T f_i| < \psi (x)\). Using \(\sum _{i \ne k} b_i = 1-b_k\), these bounds lead to

$$\begin{aligned} \begin{aligned} \dot{V}_k&< b_k \left( \frac{\partial V_k}{\partial x}\right) ^T f_k + (1-b_k)\psi (x) \end{aligned} \end{aligned}$$
(40)

Due to the exponential convergence of \(b_k\) (Eq. 37), for \(t>T_{auto}\), the second term vanishes and the stability of kth DS is restored.

A.7 Convergence speed

To investigate how \(\epsilon \) affects the convergence speed, we consider the case where the current task is \(\dot{x}_d = f_l\) and the human demonstration is \(\dot{x}_r = f_k\). This simplifies Eq. (25) into

$$\begin{aligned} \epsilon ^{-1} \varDelta \dot{b}_{kl} = (1+2b_k)|f_k|^2 + (3-2b_l)|f_l|^2 - 4 f_k^Tf_l \end{aligned}$$
(41)

To reach a simple estimation of the convergence speed, we assume \(f_k^Tf_l = 0\) (i.e., the two tasks are distinguishable) and that the tasks operate at the same speed (\(|f_k|^2 =| f_l|^2 = |f|^2\)). This yields

$$\begin{aligned} \varDelta \dot{b}_{kl} = \epsilon |f|^2 (4 + 2 \varDelta {b}_{kl}) \end{aligned}$$
(42)

The analytical solution to this equation with initial condition \(\varDelta {b}_{kl}=-1\) (\(b_l =1\) and \(b_k=0\)) can be computed as

$$\begin{aligned} \varDelta {b}_{kl}(t) = \exp (2 \epsilon |f|^2 t) -2 \end{aligned}$$
(43)

Then the reaching time \(T_{reach}\) to \(\varDelta {b}_{kl} = 1\) (\(b_l =0\) and \(b_k=1\)) is

$$\begin{aligned} T_{reach} = \frac{\log (3)}{ 2 \epsilon |f|^2} \end{aligned}$$
(44)

For example, for tasks operating around \(|f|^2 = 0.1\) and \(\epsilon =4\) as in Sect. 6.2, we have \(T_{reach} = 1.37\), which can be verified in Fig. 14a. In real-world settings, given the time-scale of noise and other undesirable dynamics (approximated by \(T_{noise}\)), to avoid noise-driven adaptation and chattering between undesirable tasks, one should aim for

$$\begin{aligned} T_{reach} \gg T_{noise} \end{aligned}$$
(45)

For example, considering 30 Hz noise (\(T_{noise}=1/30\)) for a case operating at \(|f|^2 = 0.1\) leads to \(\epsilon < 164.7\) as the upper bound. A better approach to tuning \(\epsilon \) is to aim for a \(T_{reach}\) that corresponds to a natural human–robot interaction. For example, expecting the robot to recognize and adapt to the human intention in 1 s leads to \(\epsilon = 5.5\). Thereafter, the approximated value can be re-adjusted in the real experiment to achieve the desired behavior.
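
The tuning rule above reduces to evaluating Eq. (44) in both directions. The sketch below does so for the numbers quoted in the text; the function names are hypothetical helpers, and the formulas inherit the simplifying assumptions of Eq. (42) (orthogonal tasks with equal speed).

```python
import math


def t_reach(eps, f_sq):
    """Eq. (44): time to switch from b_l = 1 to b_k = 1."""
    return math.log(3.0) / (2.0 * eps * f_sq)


def eps_for_reach_time(t_target, f_sq):
    """Invert Eq. (44): adaptation rate that gives a target switching time."""
    return math.log(3.0) / (2.0 * t_target * f_sq)


f_sq = 0.1
print(round(t_reach(4.0, f_sq), 2))             # setting quoted for Sect. 6.2
print(round(eps_for_reach_time(1.0, f_sq), 1))  # switch in about 1 s
print(eps_for_reach_time(1.0 / 30.0, f_sq))     # bound from 30 Hz noise
```

Any \(\epsilon \) well below the noise-driven bound and close to the value obtained for the desired switching time is a reasonable starting point before tuning on the real system.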

A.8 Null DS

Definition 5

(Null DS) It is possible to include a DS encoding zero velocity (i.e., \(f_0(x_r) \equiv 0\)) in Eq. (8) with its corresponding belief \(b_0\). In this case, the constraints in Eq. (9) should be modified to include \(b_0\) as well.

To obtain the dynamics of the competition between the null-DS and another DS in the autonomous condition, we insert \(f_0 =0\) (with \(l=0\)) in Eq. (34), which results in

$$\begin{aligned} \epsilon ^{-1} \varDelta \dot{b}_{k0} = (2b_k-1)|f_k|^2 \end{aligned}$$
(46)

This equation shows that the belief of any DS that is lower than 0.5 decreases and saturates at 0. Only the confident task, if one exists, converges to 1. Therefore, the human can change the task of the robot to a desired one by providing enough demonstrations to pass this threshold.

A.9 Resulting compliance at the force-level

In DS-based impedance control framework (Eq. 6), the observed stiffness for the human-user can be computed as

$$\begin{aligned} K = - \frac{\partial F_h}{\partial x_r} = - D \sum \limits _{i=1}^{N} b_i ~ \frac{\partial f_i(x_r)}{\partial x_r} \end{aligned}$$
(47)

where \(K \in \mathbb {R}^{3\times 3}\). It can be seen that the stiffness is not only affected by the control gain D, but also by the properties of the DS (i.e., \(\partial f_i(x_r)/\partial x_r\) which denotes the convergence rates of the DS). The stiffness in a particular direction, namely \(x_s\) with unit norm, can be calculated by the following Rayleigh quotient.

$$\begin{aligned} K(x_s) = x_s^T K x_s = - D \sum \limits _{i=1}^{N} b_i ~ x_s^T\frac{\partial f_i(x_r)}{\partial x_r} x_s \end{aligned}$$
(48)

Considering that the stiffness of each DS in the \(x_s\) direction is \(K_i(x_s) = x_s^T K_i x_s = - D x_s^T {\partial f_i(x_r)}/{\partial x_r} x_s\), we have the following property.

$$\begin{aligned} K(x_s) = \sum \limits _{i=1}^{N} b_i ~ K_i(x_s) \le \sum \limits _{i=1}^{N} b_i ~ K_{max}(x_s) = K_{max}(x_s) \end{aligned}$$
(49)

where \(K_{max}(x_s)\) denotes the stiffness of the stiffest DS in the \(x_s\) direction. This conservative upper bound shows that, in transitory states where several DS are active with low \(b_i\), the resulting stiffness of the system does not exceed that of the stiffest candidate. With the null-DS introduced in Appendix A.8, the resulting stiffness is different since the null-DS has no stiffness (\(K_0(x_s)\equiv 0\)).

$$\begin{aligned} \begin{aligned} K(x_s)&= \sum \limits _{i=0}^{N} b_i ~ K_i(x_s) = b_0 K_0(x_s) + \sum \limits _{i=1}^{N} b_i ~ K_i(x_s) \\&\le \sum \limits _{i=1}^{N} b_i ~ K_{max}(x_s) = (1-b_0) K_{max}(x_s) \\ \end{aligned} \end{aligned}$$
(50)

This upper bound shows that the stiffness can be reduced by adapting to the null-DS. The advantage of this property is twofold. First, the lower stiffness (i.e., higher compliance) allows the user to provide demonstrations or guidance more easily. Second, by sensing this compliance, the user can infer the confidence level of the robot, resulting in a richer haptic communication.
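
The bound in Eq. (50) can be illustrated numerically. In the sketch below, each DS is assumed to be linear with an isotropic Jacobian, the gain D is treated as a scalar, and the numerical values (gain of 100, convergence rates of 2.7 and 1.0, beliefs of 0.5, 0.3, and 0.2) are made up for the example; only Eqs. (48) and (50) come from the text.

```python
def quadratic_form(x_s, matrix):
    """Return x_s^T * matrix * x_s for a square matrix given as nested lists."""
    n = len(x_s)
    return sum(x_s[r] * matrix[r][c] * x_s[c]
               for r in range(n) for c in range(n))


def directional_stiffness(beliefs, jacobians, x_s, gain):
    """Eq. (48): K(x_s) = -D * sum_i b_i * x_s^T (df_i/dx) x_s."""
    return -gain * sum(b * quadratic_form(x_s, jac)
                       for b, jac in zip(beliefs, jacobians))


def scaled_identity(value, dim=3):
    return [[value if r == c else 0.0 for c in range(dim)] for r in range(dim)]


gain = 100.0                          # assumed scalar gain D
jacobians = [scaled_identity(0.0),    # null DS: f_0 = 0, so no stiffness
             scaled_identity(-2.7),   # illustrative converging DS
             scaled_identity(-1.0)]   # illustrative converging DS
beliefs = [0.5, 0.3, 0.2]             # b_0 is the belief of the null DS
x_s = [1.0, 0.0, 0.0]                 # unit direction of interest

k_total = directional_stiffness(beliefs, jacobians, x_s, gain)
k_each = [directional_stiffness([1.0], [jac], x_s, gain) for jac in jacobians]
k_bound = (1.0 - beliefs[0]) * max(k_each)
print(k_total, k_bound)
```

Raising the belief of the null DS lowers both the stiffness felt by the user and its upper bound, which is the compliance effect described above.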

Moreover, the observed damping for the human-user (B) can be computed using Eq. (6) as follows.

$$\begin{aligned} B = - \frac{\partial F_h}{\partial \dot{x}_r} = D \end{aligned}$$
(51)

It can be seen that the resulting damping depends solely on the controller. To reduce the human effort in the interaction, a lower controller gain should be used.

A.10 Tracking performance

The tracking performance of the impedance controller for the execution of one DS (\(f_i(x_r)\)) can be investigated using Eq. (6) as follows.

$$\begin{aligned} \ddot{e} = - M^{-1} D \dot{e} - f'(x_r)\dot{x}_r + M^{-1} F_h \end{aligned}$$
(52)

where \(f'=\partial f(x) / \partial x\). In the first term, \(M^{-1} D \succ 0\) guarantees vanishing errors. However, the other terms (especially the external forces), which can be seen as disturbances, introduce biases. The control gain (D) can be increased in order to reduce the effect of such disturbances and improve the tracking behavior. However, one should note that in a discrete control loop, there is an upper bound on the gain for the stability of the system. Discretizing Eq. (52) with time step \(\varDelta t\), ignoring the disturbances, and studying the eigenvalues provides an approximation of this upper bound; i.e., \(D < 2M \varDelta t^{-1}\).
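
The gain bound can be reproduced with a scalar toy model. The sketch below assumes a forward-Euler discretization of the disturbance-free part of Eq. (52), a scalar mass of 1, and a 200 Hz loop (the rate quoted for the LWR controller in Appendix B.1); the text does not state which discretization scheme was used, so these choices are assumptions.

```python
def max_stable_gain(mass, dt):
    """Upper bound D < 2 * M / dt from the discretized error dynamics."""
    return 2.0 * mass / dt


def error_after(gain, mass, dt, steps=200, e0=1.0):
    """Iterate e_dot[n+1] = (1 - gain * dt / mass) * e_dot[n]."""
    e = e0
    for _ in range(steps):
        e += dt * (-(gain / mass) * e)
    return e


mass, dt = 1.0, 1.0 / 200.0
bound = max_stable_gain(mass, dt)
below = error_after(0.75 * bound, mass, dt)  # gain inside the bound
above = error_after(1.25 * bound, mass, dt)  # gain beyond the bound
print(bound, abs(below), abs(above))
```

With a gain below the bound the velocity error decays, and with a gain above it the error oscillates with growing amplitude, which matches the eigenvalue argument.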

Appendix B: Supplementary materials

B.1 Technical details

The adaptation and motion generation run at 300 Hz for both experiments. The control loops of the impedance controller of the LWR and the velocity controller of the UR5 run at 200 and 125 Hz, respectively. The motion planning for all cases is considered in the Cartesian space, i.e., the position and the linear velocity of the end-effector (xyz). The orientation of the end-effector is controlled on a set-point. Moreover, the measured velocities are low-pass-filtered with a cutoff frequency of around 30 Hz. In both experiments, we set the control gains experimentally to have a practical balance between compliance and tracking.

B.1.1 DS parametrization for manipulation tasks

The linear polish is generated by the following dynamics:

$$\begin{aligned} \begin{aligned} \dot{x}_d&= 0.1 \overrightarrow{p} - 0.8 e^{\perp } \\ \end{aligned} \end{aligned}$$
(53)

where the first term induces a velocity in the direction of the line and the second term generates a velocity (saturated at 0.25 m/s) to correct for deviations from the line. The direction \(\overrightarrow{p}\) between the two end-points (\([-.54 , .25 , .1 ]\) and \([-.54 , -.25 , 0.1 ]\)) switches when one is reached.

The circular polish is encoded in the cylindrical coordinates:

$$\begin{aligned} \begin{aligned} \dot{r}&= -2.7 (r- 0.025) \\ \dot{\theta }&= 2.5 \\ \dot{z}&= -2.7 (z - 0.12) \end{aligned} \end{aligned}$$
(54)

where \(r^2 = x^2+ y^2\), and \(\theta =atan2(y,x)\), and the center of rotation is \([-.55 ,0 ,.1]\).
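
Equation (54) can be evaluated in Cartesian coordinates as in the sketch below. It assumes that x and y are measured relative to the center of rotation and that z is used without an offset; the text does not say whether z is also relative to the center, so that choice, like the function and constant names, is an assumption.

```python
import math

CENTER = (-0.55, 0.0, 0.1)  # center of rotation given in the text
RADIUS = 0.025              # target radius (m) from Eq. (54)
HEIGHT = 0.12               # target height (m) from Eq. (54)


def circular_polish_velocity(position):
    """Cartesian velocity of the circular polish DS at position (x, y, z)."""
    x = position[0] - CENTER[0]
    y = position[1] - CENTER[1]
    z = position[2]  # assumption: z is not offset by the center

    r = math.hypot(x, y)
    theta = math.atan2(y, x)

    r_dot = -2.7 * (r - RADIUS)
    theta_dot = 2.5
    z_dot = -2.7 * (z - HEIGHT)

    # Map the cylindrical rates back to Cartesian velocities.
    x_dot = r_dot * math.cos(theta) - r * theta_dot * math.sin(theta)
    y_dot = r_dot * math.sin(theta) + r * theta_dot * math.cos(theta)
    return (x_dot, y_dot, z_dot)


on_circle = circular_polish_velocity((-0.525, 0.0, 0.12))
outside = circular_polish_velocity((-0.45, 0.0, 0.12))
print(on_circle, outside)
```

On the limit cycle the velocity is purely tangential with magnitude \(r \dot{\theta }\), while a point outside the circle also receives an inward radial component.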

The other two tasks (push down and retreat) are created by SEDS (Khansari-Zadeh and Billard 2011) with the following parameters.

$$\begin{aligned} \pi _1= & {} 0.35, ~~\pi _2=0.20, ~~\pi _3=0.45 \nonumber \\ \mu _1= & {} [35.7, -\,5.8, -\,11.4, \,-2.4, 4.3, 18.0]\nonumber \\ \mu _2= & {} [0.6, -\,34.8, 37.4, -\,0.2, 12.9, 3.1]\nonumber \\ \mu _3= & {} [-33.6, 10.9, -\,2.7, 2.6,-\,0.3, 17.8]\nonumber \\ \varSigma _1= & {} \begin{bmatrix} 1.3&-\,0.2&3.0 \\ -\,0.2&0.1&-\,5.4 \\ 3.0&-\,5.4&721.1 \\ -\,0.8&1.2&-\,160.7 \\ -\,0.2&0.5&-\,79.5 \\ -\,0.6&2.0&-\,282.2 \end{bmatrix} ~\varSigma _2=\begin{bmatrix} 22.4&5.4&-\,4.3 \\ 5.4&6.7&21.1 \\ -\,4.3&21.1&136.6 \\ -\,5.1&-\,1.3&0.4 \\ -\,1.7&4.8&33.3 \\ -\,0.7&-\,10.6&-\,58.8 \end{bmatrix}\nonumber \\ ~\varSigma _3= & {} \begin{bmatrix} 1.1&-\,0.3&-\,2.1 \\ -\,0.3&0.2&-\,10.3 \\ -\,2.1&-\,10.3&922.3 \\ -\,0.6&-\,2.4&222.4 \\ 0.2&0.9&-\,89.9 \\ 0.2&4.0&-\,348.4 \\ \end{bmatrix} \end{aligned}$$
(55)

The attractor of push-down is at \([-.4, 0, .08]\), while the attractor of retreat is at \( [-.32, .28, .36]\).

B.1.2 Admittance control parametrization

The parameters used in the admittance control for the mobile-robot are as follows.

$$\begin{aligned} \begin{aligned} M_a&= \text {diag}(1,1,1,.5,.5,.5) \\ D_a&= \text {diag}(25,25,25,5,5,5) \\ K_a&= \text {diag}(10,150,10,5,5,5) \\ M_p&= \text {diag}(100,10,0,0,0,500) \\ D_p&= \text {diag}(500,50,0,0,0,101) \end{aligned} \end{aligned}$$
(56)

Here, diag denotes a diagonal matrix with the given values, where the coordinate system is \((x,y,z,\theta _x,\theta _y,\theta _z)\).

B.1.3 DS parametrization for carrying task

The four tasks have the same dynamics, \(\dot{x}_d=- (x_r - x_{g})\), with the velocity saturated at 0.12 m/s. However, the location of the attractor (\(x_g\)) is set differently for each task as follows.

$$\begin{aligned} \begin{aligned} x_{MF}&= [0.05, 0.47, 0.50]\\ x_{MB}&= [0.05, 0.32, 0.50]\\ x_{PL}&= [-\,0.3, 0.35, 0.1]\\ x_{PR}&= [0.3, 0.35, 0.1]\\ \end{aligned} \end{aligned}$$
(57)
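
The four carrying-task DS and their belief-weighted combination can be sketched as follows. The attractors and the speed limit are taken from the text; applying the saturation to the norm of the velocity, the dictionary layout, and the function names are assumptions made for illustration.

```python
import math

ATTRACTORS = {
    "MF": (0.05, 0.47, 0.50),
    "MB": (0.05, 0.32, 0.50),
    "PL": (-0.3, 0.35, 0.1),
    "PR": (0.3, 0.35, 0.1),
}
MAX_SPEED = 0.12  # m/s


def task_velocity(task, x_r):
    """Velocity -(x_r - x_g), with its norm clipped at MAX_SPEED (assumed)."""
    goal = ATTRACTORS[task]
    vel = [-(x - g) for x, g in zip(x_r, goal)]
    speed = math.sqrt(sum(v * v for v in vel))
    if speed > MAX_SPEED:
        vel = [v * MAX_SPEED / speed for v in vel]
    return vel


def desired_velocity(beliefs, x_r):
    """Belief-weighted sum of the task velocities."""
    total = [0.0, 0.0, 0.0]
    for task, belief in beliefs.items():
        for d, v in enumerate(task_velocity(task, x_r)):
            total[d] += belief * v
    return total


x_r = (0.0, 0.0, 0.0)
far = task_velocity("PL", x_r)
at_goal = task_velocity("MF", ATTRACTORS["MF"])
blended = desired_velocity({"MF": 0.0, "MB": 0.0, "PL": 0.7, "PR": 0.3}, x_r)
print(far, at_goal, blended)
```

Because the two "place" attractors are mirrored in x, a belief split between them produces a smaller lateral velocity than either task alone.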
Fig. 15

Snapshots of the task-adaptation in the manipulation task. The robot is initially in the retreat task. Starting around \(t=1s\), the human starts to demonstrate the linear polishing task. From \(t=4s\), the robot starts to perform the linear polish autonomously

Fig. 16

Snapshots of the task-adaptation in the carrying task. The robot is initially performing the “forward” task when the human demonstrates a motion that is similar to “place left”. Therefore, the robot switches to this task. The robot performs all the tasks autonomously

B.2 Videos

The experimental results of the manipulation tasks using the Kuka LWR 4+ (Sect. 6.1) can be watched here: https://youtu.be/oqHJ8crB5KY. Snapshots of the interaction over a short period are shown in Fig. 15.

The results of the carrying task using the mobile robot (Sect. 6.2) can be viewed here: https://youtu.be/7BjHhV-BkwE. Snapshots of the interaction over a short period are shown in Fig. 16.

B.3 Source codes

A C++ implementation of our method can be found at https://github.com/epfl-lasa/task_adaptation.

About this article

Cite this article

Khoramshahi, M., Billard, A. A dynamical system approach to task-adaptation in physical human–robot interaction. Auton Robot 43, 927–946 (2019) doi:10.1007/s10514-018-9764-z

Keywords

  • Physical human–robot interaction
  • Adaptive behavior
  • Compliant control
  • Dynamical systems
  • Predictive models