1 Introduction

In the last two decades, the scientific community has shown a great interest in understanding and shaping the interaction mechanisms between humans and robots. Human–robot interaction, in fact, implies communication between two dyadic agents and, if the type of interaction is ‘physical’, the communication is represented by the set of forces mutually exchanged. Within this context, the step of quantifying these forces becomes of pivotal importance for understanding the interaction mechanisms. At the current scientific stage, classical robots are built to act for humans, but in order to adapt their functionality to the current technological demand, the new generation of robots are required to collaborate with humans. This implies that the robots will be endowed with the capability to control physical collaboration through intentional interaction with humans. Future robots have to be aware of the human partner kinematics and dynamics (contact and exchanged forces, internal forces, joint torques). While several well-established technologies make nowadays possible the estimation of the real-time human motion (e.g., marker-based motion capture as Vicon or marker-less systems as Xsens and Microsoft Kinect), the estimation of exchanged forces with the external environment is still an outstanding problem. Most of the studies on the physical human–robot interaction (pHRI) derive from the intrinsic mutual adaptive nature that automatically occurs when two humans are cooperating together to accomplish a common task. Based on the pioneering study on the minimum-jerk model for human manipulation tasks (Flash and Hogan 1985), a method based on the minimization of the jerk can be used as a suitable approximation for estimating the human partner motions during a human–robot cooperative manipulation in Maeda et al. (2001). Here the authors’ attempt is to incorporate human characteristics into the control strategy of the robot. The weakness of this type of approach lies in the pre-determination of the task and in the role that the robot has to play in the task execution. Furthermore, the minimum-jerk model reliability decreases considerably if the human partner decides to apply non-scheduled trajectory changes during the task (Miossec and Kheddar 2009).

Another suitable route for pHRI is the imitation learning approach, justified by a wide variety of computational algorithms that have been recently developed [see an exhaustive overview in Schaal et al. (2003)]. This approach yields to the teacher–student concept and consists in (i) collecting movements of a human agent (i.e., the teacher in the imitation process), gathered by motion capture techniques and clustered in motion databases (Guerra-Filho and Biswas 2011; Kuehne et al. 2011; Mandery et al. 2015; Wojtusch and Stryk 2015; ii) classifying motions as a set of primitive actions (i.e., motion primitives) for model abstraction (Kulic et al. 2008; iii) mapping human models into a robot platform, i.e., the student in the imitation process. The retargeting implies the development of robot control algorithms capable of learning motion models from these primitives (Amor et al. 2014; Terlemez et al. 2014; Mandery et al. 2016) in order to emulate human-like movements. In general, although in the imitation approach the task pre-determination does not represent a drawback anymore, a relevant problem (termed, correspondence problem) concerns the comparability of the models (teacher vs. student). Human motion primitives are exploitable only if their knowledge could be properly mapped into a compatible robot kinematic/dynamic model. This inevitably narrows down the potential student-role candidate to anthropomorphic robots and it requires an accurate analysis on the imitation mechanism assumptions.

Unlike the current leaning, the focus is here shifted on the key role that a proper sensing technology for human beings (body-mounted IMUs, force plates on the ground, force/torque sensors on the iCub robot) together with a dynamics estimation algorithm may offer for retrieving whole-body motion and interaction forces. The importance in retrieving this set of information can be exemplified by the scenario in Fig. 1: once the human dynamics is completely known, the human feedback may be provided to the robot controllers. As a straightforward consequence, the robot may accordingly adjust the strategy of the interaction to enhance in this way the common aim of the task. This kind of online awareness in the robot controllers will be the key to bridge the current limitation of robots in observing the human dynamics. This specific case falls within the scope of assistive collaboration (e.g., a humanoid robot is helping a human in removing a tabletop) but in general this framework can be applied in any field where a proficient collaboration between humans and robots is required (e.g., in an industrial scenario where a manipulator is helping an assembly-line worker, or in a rehabilitative field where an actuated exoskeleton is assisting an injured person).

Fig. 1
figure 1

A possible pHRI scenario: the human agent is provided with a wearable technology and an estimation algorithm allows to compute information about his dynamics. By properly embedding human estimations in the control loop of a human-aware robot, the intentional collaboration may be enhanced (Color figure online)

The paper is built on the theoretical framework described in our previous work (Latella et al. 2016) from which it inherits both the notation and formulation. This paper represents the real-time evolution of Latella et al. (2016) and a first attempt at advancing the current state of the art in pHRI by designing an estimation tool for monitoring the dynamics of the human while physically interacting with a robot, in real-time.

The paper is structured as follows. Section 2 introduces the theoretical framework which the paper is based on. Section 3 presents the modelling of the human body as an articulated multi-body system. In Sect. 4 the adopted Gaussian probabilistic domain for the sensor fusion methodology is briefly recalled. Section 5 outlines the experimental setup of the framework followed by the offline algorithm validation in Sect. 6. In Sect. 7 the methodology for the real-time estimation is defined. Conclusions, current limitations and several considerations on the pivotal role of further control and estimation developments are discussed in Sect. 8.

2 Background

The humanoid robots formalism is adopted throughout the paper for modelling the human body as an articulated rigid multi-body system. The advantage of this choice lies in the possibility to handle both the systems (i.e., human and robot) with the same mathematical tool. In this domain, the application of the Euler–Poincaré formalism (Marsden and Ratiu 1999) leads to three sets of equations describing: (i) the motion of the human, (ii) the motion of the robot, (iii) the linking equations characterizing the contacts between the systems.

$$\begin{aligned}&(i) \qquad \varvec{\mathrm {M}}(\varvec{{q}}) \dot{\varvec{\nu }} + \varvec{\mathrm {C}}(\varvec{{q}}, \varvec{\nu }) \varvec{\nu } + \varvec{\mathrm {G}}(\varvec{{q}}) = \begin{bmatrix} \varvec{0} \\ {\varvec{\tau }} \end{bmatrix} + \varvec{\mathrm {J}}^\top (\varvec{{q}}) \varvec{\mathrm {f}}\\&(ii) \qquad \mathbb {M}(\bar{\varvec{q}}) \dot{\bar{\varvec{\nu }}} + \mathbb {C}(\bar{\varvec{q}}, \bar{\varvec{\nu }}) \bar{\varvec{\nu }} + \mathbb {G}(\bar{\varvec{q}}) = \begin{bmatrix} \varvec{0} \\ \overline{\varvec{\tau }} \end{bmatrix} + \mathbb {J}^\top (\bar{\varvec{q}}) \varvec{\mathrm {f}}\\&(iii) \qquad \begin{bmatrix} \varvec{\mathrm {J}}(\varvec{{q}})&~-\mathbb {J}(\bar{\varvec{q}}) \end{bmatrix} \begin{bmatrix} \dot{\varvec{\nu }} \\ \dot{\bar{\varvec{\nu }}} \end{bmatrix} + \begin{bmatrix} \varvec{\dot{\mathrm {J}}}(\varvec{{q}})&~-\dot{\mathbb {J}}(\bar{\varvec{q}}) \end{bmatrix} \begin{bmatrix}{\varvec{\nu }} \\ {\bar{\varvec{\nu }}} \end{bmatrix} = \varvec{0} \end{aligned}$$

Equations (i) and (ii) are floating-base representations of the dynamics of the human and robot models, respectively. In general, a floating-base system is defined by its configuration \({\varvec{q}} = \left( {^{\mathcal I}}\varvec{H}_{\mathcal B},\varvec{q}_j\right) \in SE(3)\times \mathbb R^{n}\) and its velocity \({\varvec{\nu }} = \left( {^{\mathcal I}}\varvec{v}_{\mathcal B},\dot{\varvec{q}}_j\right) \in \mathbb R^{n+6}\), where n is the (internal) number of Degrees of Freedoms (DoFs) of the system, \(\mathcal I\) is an inertial frame and \(\mathcal B\) the base frame. \({^{\mathcal I}}\varvec{H}_{\mathcal B}\) is the homogeneous transformation from \(\mathcal B\) to \(\mathcal I\) and \({^{\mathcal I}}\varvec{v}_{\mathcal B}\) is the base velocity with respect to (w.r.t.) \(\mathcal I\). The configuration and the velocity of the internal DoFs are denoted with \(\varvec{q}_j\) and \(\dot{\varvec{q}}_j\), respectively. The matrices \(\varvec{\mathrm {M}}\), \(\varvec{\mathrm {C}}\), \(\varvec{\mathrm {G}}\) and \(\mathbb {M}\), \(\mathbb {C}\), \(\mathbb {G}\) denote the mass matrix, the Coriolis matrix and the gravity bias term for the human and the robot systems, respectively. The exchanged forces between the two systems are denoted by the vector \(\varvec{\mathrm {f}} \in \mathbb R^{6k}\), where k is the number of the wrenchesFootnote 1 exchanged during the interaction.Footnote 2 It is worth remarking that while all the variables are different for the two systems, the exchanged wrenches \(\varvec{\mathrm {f}}\) are mandatorily the same quantities. Matrices \(\varvec{\mathrm {J}}(\varvec{q})\) and \(\mathbb {J}(\bar{\varvec{q}})\) are the proper Jacobians for the human and the robot, respectively. Equation (iii) is due to the rigid contacts assumption between the two systems. Since in this paper the human performs the experiments with the feet fixed on the force plates, we consider the human system to be fixed base, i.e, \({^{\mathcal I}}\varvec{H}_{\mathcal B} = const\) and known a priori and \({^{\mathcal I}}\varvec{v}_{\mathcal B}=\varvec{0}\). This choice implies \(\varvec{q} = \varvec{q}_j\) and reduces remarkably the framework complexity.

3 Human body modelling

Inspired by the biomechanical model developed for the Xsens MVN motion capture system (Roetenberg et al. 2009), we propose a human body model as a mechanical system represented by an oriented kinematic tree with \(N_B = 23\) moving links and n internal DoFs. Note that \(n = \sum _{i = 1}^{N_B} n_i\) is the total number of DoFs of the system. The generic ith link and its parent are coupled with a ith joint by following the topological Denavit–Hartenberg convention for joint numbering (Denavit and Hartenberg 1955). The origin of each link is located at the parent joint origin (i.e., the joint that connects the link to its parent). We model the links with simple geometric shapes (parallelepiped, cylinder, sphere) whose dimensions are estimated by readings coming from the motion capture acquisition (Fig. 2).

Fig. 2
figure 2

Human body reference model with labels for links and joints. 17 IMUs are distributed along the body, embedded in a lycra suit. Reference frames are shown using RGB (Red–Green–Blue) convention for xyz axes (Color figure online)

The dynamic properties, such as center of mass (CoM) and inertia tensor for each link, are not embedded in the Xsens output data since they are usually computed in a post-processing phase. Since our aim is to go towards the direction of a real-time estimation for the human dynamic variables, the knowledge of the dynamic properties during the acquisition phase is mandatory (Drillis et al. 1964). Since it is impractical to retrieve these quantities in-vivo for humans, we relied on the available anthropometric data concerning the relation between the total body mass and the mass of each body (Winter 2009; Herman 2007), under the assumption of geometric approximations and homogeneous density for the rigid bodies (Hanavan 1964; Yeadon 1990).

4 Probabilistic sensor fusion algorithm

In this section we briefly recall the probabilistic method for estimating dynamic variables of an articulated mechanical system by exploiting the so-called sensor fusion information, already presented in our previous work [the reader should refer to Latella et al. (2016) for a more thorough description]. The following description is applied within the context of the fixed-base representation of equations in Sect. 2.

Here we are interested in computing an estimation of a vector of dynamics variables \(\varvec{d}\) defined as follows:

$$\begin{aligned} \varvec{d}= & {} \begin{bmatrix} \varvec{d}_{1}^\top&\varvec{d}_{2}^\top&\ldots&\varvec{d}_{N_B}^\top \end{bmatrix}^\top \in \mathbb R^{24N_B+2n}~, \end{aligned}$$
(1a)
$$\begin{aligned} \varvec{d}_i= & {} \begin{bmatrix} \varvec{a}_{i}^\top&{\varvec{f}_{i}^B}^\top&\varvec{f}_{i}^\top&\tau _{i}&{\varvec{f}_{i}^x}^\top&\ddot{{q}}_{j_i} \end{bmatrix}^\top \in \mathbb R^{24+2n_i}~, \end{aligned}$$
(1b)

where \(\varvec{a}_i\) is the ith body spatial acceleration, \(\varvec{f_i}^B\) is the net wrench, \(\varvec{f}_i\) is the internal wrench exchanged from the parent link to the ith link through the ith joint, \( \tau _i ,~\ddot{q}_{j_i} \in \mathbb R^{n_i}\) are the torque and the acceleration at the ith joint, respectively. The system can interact with the surrounding environment, and the result of this interaction is reflected in the presence of the external wrenches \(\varvec{f_i}^x\) acting on the ith link.

The dynamics of the mechanical system is represented by the Newton–Euler equationsFootnote 3 (Featherstone 2008) whose matrix form yields to the following linear system of equations in the variable \(\varvec{d}\):

$$\begin{aligned} \varvec{D}(\varvec{q}_j, \dot{\varvec{q}}_j) \varvec{d} + \varvec{b}_D (\varvec{q}_j, \dot{\varvec{q}}_j)= \varvec{0}~, \end{aligned}$$
(2)

where \(\varvec{D}\) is a block matrixFootnote 4\(\in \mathbb R^{(18 N_B+n) \times d}\) and the bias vector \(\varvec{b}_D \in \mathbb R^{18 N_B+n}\). We now consider the presence of \(N_S\) measurements coming from different sensors (e.g., accelerometers, force/torque sensors) and we denote with \(\varvec{y} \in \mathbb R^{y}\) the vector containing all the measurements. The dynamic variables and the values measured by the sensors can be coupled with \(\varvec{d}\) by the following set of equations:

$$\begin{aligned} \varvec{Y}(\varvec{q}_j, \dot{\varvec{q}}_j) \varvec{d} + \varvec{b}_Y (\varvec{q}_j, \dot{\varvec{q}}_j)= \varvec{y}~, \end{aligned}$$
(3)

with the block matrix \(\varvec{Y} \in \mathbb R^{N_S \times d}\) and the bias vector \(\varvec{b}_Y \in \mathbb R^{N_S}\). By stacking together (2) and (3) we obtain a linear system of equations in the variable \(\varvec{d}\), such that

$$\begin{aligned} \begin{bmatrix} \varvec{Y}(\varvec{q}_j, \dot{\varvec{q}}_j) \\ \varvec{D}(\varvec{q}_j, \dot{\varvec{q}}_j) \\ \end{bmatrix}\varvec{d} + \begin{bmatrix} \varvec{b}_Y(\varvec{q}_j, \dot{\varvec{q}}_j) \\ \varvec{b}_D(\varvec{q}_j, \dot{\varvec{q}}_j) \end{bmatrix} = \begin{bmatrix} \varvec{y}\\ \varvec{0} \end{bmatrix}~. \end{aligned}$$
(4)

Equation (4) describes, for most cases, an overdetermined linear system. The bottom part [corresponding to (2)] represents the Newton–Euler equations, while the upper part contains the information coming from the, possibly noisy or redundant, sensors. It is possible to compute the whole-body dynamics estimation by solving the system in (4) for \(\varvec{d}\). One possible approach is to solve (4) in the least-square sense, by using a Moore–Penrose pseudo-inverse or a weighted pseudo-inverse.

In the following we perform a different choice. We frame the estimation of \(\varvec{d}\) given the knowledge of \(\varvec{y}\) and the prior information about the model and the sensors in a Gaussian domain by means of a Maximum-A-Posteriori (MAP) estimatorFootnote 5 such that

$$\begin{aligned} \varvec{d}_{\small {MAP}}=\arg \max _{\varvec{d}} p(\varvec{d}| \varvec{y})~. \end{aligned}$$

Since in this framework the probability distributions are associated to both the measurements and the model, it suffices to compute the expected value and the covariance matrix of \(\varvec{d}\) given \(\varvec{y}\), i.e.,

$$\begin{aligned} \varvec{\varSigma }_{d|y}= & {} \left( \overline{\varvec{\varSigma }}_D^{-1}+ \varvec{Y}^\top \varvec{\varSigma }_{y}^{-1} \varvec{Y} \right) ^{-1}~, \end{aligned}$$
(5a)
$$\begin{aligned} \varvec{\mu }_{d|y}= & {} \varvec{\varSigma }_{d|y} \left[ \varvec{Y} ^\top \varvec{\varSigma }_{y}^{-1} (\varvec{y} - \varvec{b}_Y) + \overline{\varvec{\varSigma }}_D^{-1} \overline{\varvec{\mu }}_D \right] ~, \end{aligned}$$
(5b)

where \(\overline{\varvec{\mu }}_D\) and \(\overline{\varvec{\varSigma }}_D\) are the mean and covariance of the probability distribution \({p(\varvec{d}) \sim \mathcal N \left( \overline{\varvec{\mu }}_D, \overline{\varvec{\varSigma }}_D\right) }\) of the model, respectively; \(\varvec{\varSigma }_{y}\) is the covariance matrix of the measurements distribution \({p(\varvec{y})\sim \mathcal N \left( {\varvec{\mu }}_y, {\varvec{\varSigma }}_y\right) }\). In the Gaussian framework, (5b) corresponds to the estimation of \(\varvec{d}_{\small {MAP}}~\).

5 Experimental setup

This section describes the experimental setup for the algorithm analysis. The MAP framework requires: (1) the computation of the human kinematics (possible through whole-body distributed sensor measurements), (2) the estimation of the human dynamics (i.e., the solution [5b)].

To this purpose, the experimental setup encompassed the following different sensor modules: (i) a wearable suit for the motion tracking, (ii) two standard force platforms for the ground reaction wrenches tracking, (iii) the force/torque sensors of an iCub robot arms. Ten healthy adult subjects (7 female, 3 male) have been recruited in the experimental session, height (\(166.6\pm 4.5\,\hbox {cm}\)) and mass (\(61.14\pm 5.76\,\hbox {kg}\)). Data were collected at Istituto Italiano di Tecnologia (IIT), Genoa, Italy. Each subject was provided of a written informative consent before starting the experiment. Kinematics data were acquired by using a full-body wearable lycra suit provided by Xsens Technologies. The wearable suit is composed of 17 wired trackers (i.e., inertial sensor units-IMUs including an accelerometer, a gyroscope and a magnometer). The suit has signal transmitters that send measurements to the acquisition unit through a wireless receiver which collects data at a frequency of \(240\,\hbox {Hz}\). The human subject performed the required task standing with the feet on two standard AMTI OR6 force platforms mounted on the ground, while interacting with the robot. Each platform acquired a wrench sample at a frequency of \(1\,\hbox {kHz}\) by using AMTI acquisition units.

Experiments were conducted on the iCub (Metta et al. 2010), a full-body humanoid robot (Fig. 3a) with 53 DoFs: 6 in the head, 16 in each arm, 3 in the torso and 6 in each leg. The iCub is endowed with whole-body distributed force/torque sensors, accelerometers, gyroscopes and tactile sensors. Specifically, the limbs are equipped with six force/torque sensors placed in the upper arms, in the upper legs and in the ankles (Fig. 3b). Internal joint torques and external wrenches are estimated through a whole-body estimation algorithm (Nori et al. 2015). Robot data were collected at a frequency of \(100\,\hbox {Hz}\).

Fig. 3
figure 3

a The humanoid iCub robot. b Model of the iCub with the force/torque sensors embedded in the limbs structure (Color figure online)

Each subject was asked to wear the sensing suit and to stand on the two force plates by positioning one foot per platform. The robot was located in front of the subject facing him at a known distance from the human foot location (Fig. 4). The mutual feet position was fixed for all the trials. The interaction implied that the human grasped and pushed down both the robot arms while was performing a five-repetion block of a bowing task (BT).

Fig. 4
figure 4

Subject grasps and pushes down the robot arms. The figure shows the reference frames (in RGB convention) for the force/torque sensor of the robot (iCubFT), the robot fixed base (iCubFB), the force plate (FP), the human fixed base (hFB), the human foot and hand (hFOOT, hHAND) respectively. The mutual feet position between human and robot is fixed during the experiment (Color figure online)

6 Algorithm validation

In this section we discuss about the validatin of the MAP as a tool for estimating quantities related to the joints and links of the human model. The offline validation is mainly performed in Matlab by implementing the algorithm for the framework in Fig. 5. Data coming from the force plates and from the robot (\({\varvec{f}^x}\)) are considered as acquired from a particular class of net external wrench sensors. Each link linear acceleration \(\varvec{a}\) and angular velocity \(\varvec{\omega }\) are acquired by Xsens IMUs. Xsens data are used as input for the OpenSim (Delp et al. 2007). Inverse Kinematics (IK) toolbox that allowed to retrieve the joint angles \(\varvec{q}_j\). Joint velocities \(\varvec{\dot{q}}_j\) and accelerations \(\varvec{\ddot{q}}_j\) are computed by using a weighted sum of windows of elements, with a third-order polynomial Savitzky–Golay filtering (Savitzky and Golay 1964). Also joint accelerations are considered as acquired from a class of ‘fictitious’ DoF acceleration sensors. Since the acquisition sampling rate was different among the sources, data are linearly interpolated in order to guarantee the synchronization. In general, by considering the data acquired from all above-mentioned sensors and the state \((\varvec{q}_j, \dot{\varvec{q}}_j)\), the MAP estimator provides the estimation of \(\varvec{d}\) given \(\varvec{y}\). The reader should refer to the Algorithm 1 for the description of the algorithm procedure.

Fig. 5
figure 5

Schematic pipeline of the MAP estimation framework

It is worth remarking that the estimated vector \(\varvec{d}\) , i.e., the solution (5b), contains variables that can be directly measured (\(\varvec{a}\), \(\varvec{f}^x\), \(\ddot{\varvec{q}}_j\)) and variables that can not be directly measured in humans (\(\varvec{f}^B\), \(\varvec{f}\), \(\varvec{\tau }\)) but only estimated via algorithms. The MAP algorithm represents, in a sense, the probabilistic way to estimate those quantities for which a direct measure does not exist. The leading validation idea is to compare the same variables (measured and then estimated) in order to prove the quality of the proposed algorithm for certainFootnote 6 links/joints of the model. Figure 6a shows the comparison between the linear acceleration measured by IMUs (in red) and the related MAP estimation (in blue) for the right foot, upper leg and hand, respectively. These results show that the MAP algorithm is a reliable tool for estimating the human link acceleration. We also investigated the reliability for the external forces (Fig. 6b) by considering only the links (right foot, right hand and left hand) where external forces are applied. Even in this case we obtained a suitable estimation of the quantities measured by the force/torque sensors. The same analysis was computed for the joint acceleration (in Fig. 6c). By relying on the previous results we exploited the MAP approach for estimating the joint torques in humans. Figure 6d shows the estimation of the joint torque during a BT for the right ankle, hip and shoulder, respectively.

Fig. 6
figure 6

Comparison between the variables measured (mean and standard deviation, in red) and their MAP estimation (mean, in blue) for (a) the linear acceleration \(\varvec{a}\), (b) the external force \(\varvec{f}^x\) and (c) the joint acceleration \(\ddot{\varvec{q}}_j\). The analysis shows the goodness of the MAP algorithm in estimating those variables that can be directly measured by sensors. By relying on this result, the estimation of the joint torque \(\varvec{\tau }\) (mean and standard deviation) is shown in (d) (Color figure online)

Fig. 7
figure 7

Mean and standard deviation of \(\varvec{\tau }\) (i.e., the sum of the hips torque estimation (i.e.,the sum of the torques estimated at the hips) among five repetitions of BT performed by a subject computed for CASE A (red) and CASE B (blue) by means of (a) the MAP algorithm and (b) the OpenSim ID toolbox. (c) Box plots of the torque estimation error \(\varvec{\varepsilon }_{\varvec{\tau }}\) computed in (7) with MAP (on the right) and with OpenSim (on the left). It shows that MAP is a method more robust to the modelling errors since it gives the possibility of weighting the reliability of the model by properly setting the related covariance matrix (Color figure online)

figure i

6.1 Robustness test

To test the robustness of the approach w.r.t. modelling errors, one subject was asked to perform the BT in two different configurations, i.e., with and without an additional mass (W) of \(6\,\hbox {kg}\) roughly positioned in correspondence of the torso CoM. The MAP computation was performed by considering the following cases as algorithm inputs (see Table 1):

Table 1 Cases for the MAP validation

Both the cases are performed with the model of the subject without W. In order to highlight a lower reliability for the model used in CASE A, it is assigned to the model a varianceFootnote 7 equal to \(10^{-1}\) (differently from the \(10^{-4}\) value assigned for the CASE B).

By exploiting the linearity property of the system we consider the following expression for the torques:

$$\begin{aligned} \varvec{\tau }_{(model + 6\,\hbox {kg})} - \varvec{\tau }_{model} = \varvec{\tau }_{6\,\hbox {kg}}~, \end{aligned}$$
(6)

where \(\varvec{\tau }_{6\,\hbox {kg}}\) is the theoretical torque due to the additional W positioned on the torso.Footnote 8 Given (6), it is possible to retrieve the error \(\varvec{\varepsilon }_{\varvec{\tau }}\) on the \(\varvec{\tau }\) estimation for the subject with W, due to the CASE A:

$$\begin{aligned} \varvec{\varepsilon }_{\varvec{\tau }} = |\varvec{\tau }_{(model + 6\,\hbox {kg})} -\varvec{\tau }_{model}| - \varvec{\tau }_{6\,\hbox {kg}}~. \end{aligned}$$
(7)

We compute (7) by using the OpenSim Inverse Dynamics (ID) toolbox as well, in order to evaluate its effectiveness w.r.t. the modelling errors. Figure 7a, b provide the mean and the standard deviation of the torque estimation by means of the MAP algorithm and the OpenSim ID, respectively. Figure 7c shows the comparison between the computation of the error in (7) for both the above methods: the error is higher in OpenSim computation than in MAP since OpenSim does not offer the possibility of setting the model reliability in the computation.

6.2 Incremental sensor fusion analysis

The novelty of the MAP framework consists in replacing the classical boundary conditions for the Newton–Euler equations with multi-sensor readings. In this section, we discuss the effects on the estimation when different measurements are progressively added into the algorithm. Let us consider all the sensor sources involved in this analysis, such that

$$\begin{aligned} \varvec{y}_{FP}= & {} {}^{{{{FP }}}} \varvec{X}_{{{hFOOT }}}~\varvec{f}_ {{{hFOOT }}}~, \end{aligned}$$
(8a)
$$\begin{aligned} \varvec{y}_{IMU}= & {} {}^{{{IMU }}}\varvec{X}_{{{L }}}~ \varvec{a}_{{{L }}}~, \end{aligned}$$
(8b)
$$\begin{aligned} \varvec{y}_{iCubFT}= & {} {}^{{{iCubFT }}}\varvec{X}_ {{{hHAND }}}~ \varvec{f}_{{{hHAND }}}~. \end{aligned}$$
(8c)
Fig. 8
figure 8

10-subjects mean torque variance [Nm] of five repetitions of the BT by using three different combinations of the measurements equation: CASE 1 for (9a), CASE 2 for (9b) and CASE 3 for (9c), for both left and right ankle, hip joints, respectively (Color figure online)

Fig. 9
figure 9

The YARP architecture for estimating and visualizing real-time human dynamics during a physical interaction with a robot. The software architecture is able to estimate every 10 ms the dynamic variables in \(\varvec{d}\) for each ith link/joint in the model (pink area) and to visualize the information about the kinematics and the dynamics of the human via RViz (gray area) (Color figure online)

Equation (8a) is built for each force plate and it is used the trasformation matrixFootnote 9\(\varvec{X}\) between the human foot reference frame (\({{\small {hFOOT }}}\)) and the force plate frame (\({{\small {FP }}}\)). Equation (8b) is built for each IMU and it exploits the trasformation between each human link frame \({{\small {L }}}\) on which the IMU is attached and that particular \({{\small {IMU }}}\) reference frame. Equation (8c) is used for the iCub force/torque sensors for which it is necessary the transformation from each human hand frame (\({{\small {hHAND }}}\)) to the robot sensor frame (\({{\small {iCubFT }}}\)). A detailed frames view in Fig. 4. In particular, we build (3) for three different cases:

$$\begin{aligned} {CASE 1 } \quad \quad \varvec{y}= & {} \begin{bmatrix} \varvec{\ddot{q}}_j,~\varvec{y}_{{{FP }}} \end{bmatrix} \end{aligned}$$
(9a)
$$\begin{aligned} {CASE 2 } \quad \quad \varvec{y}= & {} \begin{bmatrix} \varvec{\ddot{q}}_j,~\varvec{y}_{{{FP }}},~ \varvec{y}_{{{IMUs }}} \end{bmatrix}\end{aligned}$$
(9b)
$$\begin{aligned} {CASE 3 } \quad \quad \varvec{y}= & {} \begin{bmatrix} \varvec{\ddot{q}}_j,~\varvec{y}_{{{FP }}},~ \varvec{y}_{{{IMUs }}},~\varvec{y}_ {{{iCubFT }}} \end{bmatrix} \end{aligned}$$
(9c)

The MAP algorithm is performed thrice by including (as set of sensors) Eqs. (9a), (9b), (9c), respectively. This analysis is to prove that, by progressively adding (i.e., increasing) sensors data, the variance associated to the estimated dynamic variables consequently decreases at each MAP computation, by making the estimation more reliable. Among the estimated variables in \(\varvec{d}\) we consider again the torque \(\varvec{\tau }\) (e.g., both the right and left ankle and hip) and the variance along the axis of major relevance (i.e., y axis). By passing progressively from CASE 1 to CASE 3 the variance associated to the mean torques decreases.Footnote 10 Figure 8 shows the decreasing behaviour of the mean variance of the torque at the hips and the ankles computed among ten subjects. The variance values at the ankles do not vary significantly among the three sensor configurations since the ankle torque estimation depends mostly on the contribution of the force plates that are included in all the cases of the computation. Conversely, a significant decreasing behaviour is evident in the values associated to the hips. In this case the contribution of the three sensors becomes important since the torque estimation at the hips is affected by the complete set of sensors (i.e., contribution of the force plates weights lower, contribution of IMUs + iCub sensors weights more).

7 Towards the real-time estimation

We exploit in this section the YARP middleware (Metta et al. 2006) to perform in a real-time domain the human dynamics estimation. The YARP interface is mainly used to minimize the effort in the infrastructure-level software development by facilitating code reusability and modularity. The algorithm of Fig. 5 is hereafter arranged to fit the online requirements, Fig. 9. In this specific case, YARP allows to split the algorithm into three different modules, such that

  1. 1.

    The human-state-provider module reads the links information from a human model (built in a pre-acquisition phase) and returns the human state \((\varvec{q}_j, \dot{\varvec{q}}_j)\). Note that the results will depend on the complexity of the model (e.g., few-DoF model may lead to worst approximations) but the formulation does not depend on the model complexity.

  2. 2.

    The human-forces-provider module reads forces coming from different YARP-based sources and transforms them into quantities expressed in human reference frames.

  3. 3.

    The human-dynamics-estimator module estimates the dynamics vector \(\varvec{d}\) every 10 ms given the output data of the previous two modules.

Furthermore, YARP allows to visualize the real-time estimation (whether a pHRI is occurring or not) by means of some structured modules and a ROS-based visualizer (gray part in Fig. 9). The human-viz module is composed of different sub-modules that are in charge of reading information from the Xsens system, the human state, the estimated vector \(\varvec{d}\), and of publishing information in the form of ROS-topic messages to be visualized in the RViz toolit (Kam et al. 2015). The advantage in adopting this tool is due to its versatility and agnosticism to the type of the data structure or algorithm. Figure 10 shows two different interaction tasks with their real-time visualizations, for a bowing (on top sequence) and a squat task (on middle sequence), respectively. In addition to providing a visual feedback to the estimation, the visualizer gives us an important information about how much the human joint ‘effort’ is. The joint torques (estimated via MAP) are represented with balls whose colour can be considered as an indicator of the effort: in a gray scale, a light ball means a high effort, a dark ball a minor effort.

8 Conclusions and future works

The paper endeavours to design an estimation tool for monitoring the real-time dynamics of a human being physically involved in a collaboration task with a robot. Indeed, it introduces a description of the steps needed to get towards the real-time estimation: starting from an offline validation of the algorithm (performed in Matlab, Sect. 6), we designed a (C++ based) tool for the estimation of the online human joints torques (Sect. 7). Within the real-time framework any kind of pHRI can be visualized as long as it obeys the initial assumptions (i.e, fixed base and rigid contact conditions). Even if a mathematical formalism already exists for the floating-base representation (Sect. 2), the existing software tool does not support it yet. The upgrade to a floating-base representation is mandatory for the implementation of the algorithm in more complex experimental setups where both humans and robots could move while interacting.

Fig. 10
figure 10

Subject performing three different tasks with the related real-time RViz sequence visualization: a bowing (on top), a squat (on middle), a task where the human is helping the iCub to stand up from a rigid support (Romano et al. 2018) (on bottom). The visualizer shows the forces expressed in human frames: the ground reaction forces (in yellow) and the forces measured by the force/torque sensors of the robot at the human hands (in light blue). It visualizes also how much effort (in terms of joint torques) the human is doing during the pHRI by means of gray-scale balls placed at the joints: a light ball means a high effort, a dark ball a minor effort (Color figure online)

In this work, we applied the proposed approach to human models composed of 1-DoF revolute joints, by using the classical formalism widespread in robotics (Featherstone 2008). In particular, we combined this type of joints to obtain a series of joints with a high number of DoFs, that however are only a rough approximation of the complexity exhibited by real-life biomechanical joints. Despite we chose this joint model for an initial exploration, the proposed algorithm is not limited to this particular choice. In particular, the properties of any joint (with an arbitrary number of DoFs) can be encapsulated in an interface where the relative position, velocity and acceleration of the two bodies connected by it are described by arbitrary functions of joint coordinates variables and their derivatives that can then be directly inserted in (2) and (3). A possible implementation to generalize our method to arbitrarily complex musculoskeletal models is presented in Seth et al. (2010), where it is used for biomechanical simulations.

Another important investigation will concern the estimation of the human inertial parameters (mass, CoM, inertias). The use of the anthropometric tables is currently the most widespread tool for estimating such values. Even though they allow to scale the model along with the subject, this could be a rough approximation for the non-standard population samples (children, elderly, obese, individual with prostheses) and this is pushing the scientific community towards the development of new alternative ways to estimate these parameters directly from data (Venture et al. 2009; Robert et al. 2017).

The long-term objective of the entire work is represented by the possibility to extend the human dynamics estimation framework to a new framework that encompasses an active collaboration with a robot. In the experimental setup of Sect. 5, the robot is considered only as a passive forces measurer, i.e., as a tool for estimating the external forces acting on the human hands. Since the human dynamics is of pivotal importance for a control design aimed at considering the human in the loop, forthcoming developments will provide (online) the robot with the human force feedback. This information could be used as a tool for reactive human–robot collaboration (implying a robot reactive control) and, in a long-term perspective, for predictive collaboration. A first investigation towards this direction has been done in Romano et al. (2018). The paper attempts to answer the question “How can we predict human intentions so as to synthesize robot controllers that are aware of and can react to the human presence?” by considering an interaction scenario between a human and a human-aware iCub robot. In the paper, the momentum-based balancing controller of the robot has been modified [see Sect. 2 of Romano et al. (2018)] to take into account and exploit the human forces. A task for the robot stand-up from a rigid support has been performed with (bottom sequence of Fig. 10) and without the human help showing that the robot needs to provide less torque when helped by the human since it is capable to exploit the human assistance.