1 Introduction

Motion capture, in its multiple forms, is possibly the most widely used technique in biomechanical analysis. The data it provides can be directly used in many cases to perform basic studies, focused only on kinematics. However, it is also possible to carry out more advanced analyses, based on additional information such as joint torques, which are in turn obtained by applying the acquired motion to skeletal models. Going even further, the use of complex musculoskeletal models also allows the estimation of joint-reaction forces and muscle efforts. There are currently multiple software packages, both open source and commercial, that provide such models, among which OpenSim [1] (NCSRR, Stanford CA, USA), AnyBody [2] (AnyBody Technology A/S, Aalborg, Denmark), or BoB [3] (BoB Biomechanics, Bromsgrove, UK) are well-known examples. However, they are essentially designed as postprocessing tools, so they do not provide the results in real time.

Obtaining the musculoskeletal analysis results in real time can offer many benefits. On the one hand, it enables for early detection of motion-capture problems, thus avoiding the problem of finding that a take had errors when the subject is no longer available to repeat it. On the other hand, the availability of musculoskeletal data in real time can serve not only to improve the biofeedback-based methodologies currently used in sports and rehabilitation, but also to enable the creation of new and more advanced applications. Giggins et al. [4] conducted a literature review on the benefits of real-time biofeedback in rehabilitation, including many studies based on different technologies, such as electromyography (EMG), inertial measurement units (IMUs) and optical motion-capture systems. Some of the reviewed studies demonstrated that the use of real-time biofeedback improved rehabilitation outcomes, whereas others were less conclusive. There are specific applications, such as lower-back postural control, where the use of simple motion sensors to provide real-time feedback has been shown to be effective as a tool to reduce pain [5]. Later studies used IMUs to provide real-time feedback for more complex movements. For example, Carpinella et al. [6] used six IMUs attached to the lower limbs and the trunk to provide visual and acoustic feedback to Parkinson’s patients during rehabilitation, which helped to improve their balance. Cerqueira et al. [7] used a similar setup for the upper extremities, this time providing vibrotactile feedback whenever the ranges of motion of the trunk, neck, and arm joints reached compromised values, effectively reducing ergonomic risk. Real-time biofeedback has also demonstrated its advantages in sports applications. For example, Chan et al. [8] used force plates to retrain the gait of novice long-distance runners, and the results showed that participants who trained with real-time feedback had a lower injury rate than those who received the same training without it.

Obtaining musculoskeletal analysis results in real time from optical motion capture is a challenging task. However, among the multiple technologies available for motion capture, systems based on IR cameras and reflective markers are currently the most accurate, which makes them remain the gold standard [9]. Inertial systems are known to enable simpler and faster reconstruction of kinematics [10], but at the cost of being less accurate, as well as suffering from translational drift, as they lack an absolute position reference. Nowadays, real-time motion reconstruction and visualization has become a standard feature of optical systems, although the motion-reconstruction algorithms developed by manufacturers are not usually published. The traditional approach for working with optical markers requires a rather complex manual postprocessing [11, 12], although there exist more automatic methods in the literature, such as algorithms based on nonlinear optimization [13], or on the Kalman filter [1417], but none of them are intended for real-time use. Among the published motion-reconstruction methods targeted at real-time applications, very few use reflective markers [18].

Methods already exist that provide real-time inverse dynamics at the skeletal level. Pizzolato et al. [19] adapted OpenSim to be able to estimate joint torques while receiving motion-capture data in real time. They achieved processing times as low as 0.5 ms per frame on a 23-DoF model, by solving the kinematics with a multithreaded implementation of the nonlinear optimization algorithm described by Lu and O’Connor [13]. In addition, the paper presents a comprehensive study of the effects on the joint torques caused by the delay induced by real-time low-pass filtering of kinematic data. Giarmatzis et al. [20] resorted to a totally different approach, by using machine-learning techniques to achieve real-time estimation of the joint torques produced at the lower limbs during gait, regardless of the technology used for the motion capture. The main drawback of the method stems from its data-driven nature, since the analysis is limited to the specific activity used to train the algorithm.

Most implementations of real-time musculoskeletal dynamics require the use of EMG. For instance, Sartori et al. [21] used an EMG-driven muscle model to estimate muscle activations in real-time at several degrees of freedom. Murai et al. [22] achieved real-time musculoskeletal analysis and visualization in a large-scale model by using EMG measurements where possible, and assuming certain activation correlations to estimate the forces of the remaining nonmeasured muscles. There are also real-time musculoskeletal dynamics implementations that do not require EMG. Van den Bogert et al. [23] used the same nonlinear least-squares method as Pizzolato et al. to solve the kinematics at every time step, followed by inverse dynamics and static optimization for the muscle-redundancy problem. This work also used a large-scale musculoskeletal model, with 44 degrees of freedom and 300 muscles, achieving real-time performance at 120 Hz. In order to obtain such performance, the authors limited the iterations of both the kinematics solver and the muscle optimizer, and used an approximated method to compute the moment arms of the muscles. More recently, Stanev et al. [24] also implemented real-time musculoskeletal analysis and visualization, in this case using OpenSim. The authors focused again on the filtering delay, and proposed a new method to minimize it. The described motion capture and musculoskeletal analysis algorithm is theoretically able to work with either optical or inertial-based motion-capture systems. However, its real-time capabilities were only tested using the latter, with a reduced model including only the lower limbs, and using approximated muscle-moment arms.

In this work, the objective is to develop and implement a highly efficient algorithm for whole-body motion capture using IR cameras and optical markers, along with force-plate measurements, with the ability to estimate and visualize muscle efforts in real time. This is achieved by combining a Kalman filter to estimate the skeletal kinematics, with a multibody formulation to calculate the joint torques, and an optimization algorithm for obtaining the muscle efforts producing such torques. The Kalman filter is derived from an existing EKF developed for real-time motion capture [18], with some modifications to include the estimation of accelerations.

The resulting algorithm is highly robust and efficient, thus leaving enough computational headroom to increase the complexity of the musculoskeletal model, either by including muscle wrapping [2527], more sophisticated physiological muscle models [2830], or improved joint kinematics [25], including closed-loop models [31, 32] that can be solved iteratively or using lookup tables. It is also possible to incorporate additional sensors into the Kalman filter, such as IMUs, to reinforce the robustness and accuracy of the motion-capture process, or to integrate the equations of motion into the plant model [3335], thus allowing to also add the force plates as sensors, which would be beneficial for both motion acquisition and joint-torques estimation.

2 Real-time motion capture with musculoskeletal analysis

This section describes in detail all the elements involved in the motion-capture system with real-time musculoskeletal analysis developed in this paper: the definition of the biomechanical model, the Kalman filter used for reconstructing the motion, the procedure for solving the inverse dynamics at every time step, and the optimization approach for estimating the muscular forces.

2.1 Biomechanical model

The biomechanical model used here, which is shown in Fig. 1, is the same one employed in the previously developed EKF [18] that, in turn, derives from the work by Silva and Ambrósio [36]. It consists of \(n_{b} = 18\) rigid bodies: pelvis, trunk, neck, head, arms, forearms, hands, thighs, shanks, feet, and toes. The axes of the global reference frame are defined as follows: \(x\)-axis in the anteroposterior direction, \(y\)-axis in the mediolateral direction, and \(z\)-axis in the vertical direction. The local axes of the body segments are defined parallel to the global ones, when the subject is in a standing position, with the arms lowered down and the palms of the hands facing inward.

Fig. 1
figure 1

Multibody model showing joints and optical markers

The joint between the pelvis and the trunk is located between T12 and L1 vertebrae, and the neck base is placed between T3 and T4, to keep it at the same height as the shoulder joints. Most body segments are connected together through spherical joints, with three exceptions: there is a universal joint between the trunk and the base of the neck, and each toe segment is attached to its corresponding foot by means of a revolute joint. This is done to avoid placing extra markers on the neck and toes, which would be required to capture the eliminated degrees of freedom. According to this model definition, the total number of degrees of freedom \(n_{z}\) is equal to 52: three translations and three rotations at the pelvis, two rotations at the base of the neck, one rotation at each metacarpophalangeal joint, and 42 absolute Euler angles at the remaining spherical joints. These variables are grouped into a vector of independent positions \(\mathbf{z}\), which will constitute the base of both the motion-capture EKF and the equations of motion used for the inverse dynamics.

In the previous EKF implementation, the model coordinates were defined using absolute angles for the bodies with spherical proximal joints, because this allowed us to define a simpler and more efficient observation function, with a very sparse Jacobian matrix. However, using absolute angles is not the best option when solving the dynamics, so most formulations in independent coordinates rely on relative coordinates to better exploit recursivity when assembling the dynamic terms [30, 37]. Nevertheless, this set of coordinates will be kept here also for the dynamics, mainly for two reasons. First, keeping the same model variables allows reuse of the code previously developed in [18] for the observation function and its Jacobian matrix. Secondly, the dynamic terms obtained by using absolute rotations are easier to linearize. Although this advantage is not relevant to this work, it will facilitate one of its future goals: the implementation of an EKF that incorporates the equations of motion into the plant model [3335], using the force plates as sensors.

2.2 EKF for motion capture

The dynamic system used in this work has a linear plant model, but the observation model is highly nonlinear, so it is not possible to use a conventional Kalman filter. There are many complex and accurate ways to deal with nonlinearity when using Kalman filters, such as the unscented Kalman Filter (UKF) [38, 39], however, since in this work the goal is to meet real-time requirements, the simpler extended Kalman filter (EKF) [3840] is chosen for the task.

The equations of the predictor–corrector algorithm used by the EKF are well known, so they will not be repeated here for the sake of brevity. Instead, the definitions of all the involved terms will be addressed, namely the state vector \(\mathbf{x}\), the state propagation matrix \(\boldsymbol {\Phi}\), the process noise covariance matrix \(\boldsymbol{\mathcal {Q}}\), the observation function \(\mathbf{h}\left (\mathbf{x}\right )\), and the sensors noise covariance matrix \(\boldsymbol{\mathcal {R}}\).

2.2.1 Plant model

The filter uses a kinematic state-propagation model, similar to the second-order one used in [18]. However, in this case, since the calculation of the inverse dynamics needs the accelerations, a third-order discrete Wiener process acceleration model (DWPA) is used instead [40]. The state propagation is defined directly as a discrete model, where the acceleration increment at every time step is modeled as discrete Gaussian noise. For each degree of freedom \(i\) of the model, a state vector \(\mathbf{x}_{i}\) containing its position, velocity and acceleration propagates in time between two consecutive time steps \(k\) and \(k+1\) as follows:

$$ \mathbf{x}_{i}^{k+1} = \boldsymbol {\Phi}_{i}\mathbf{x}_{i}^{k} + \boldsymbol {\Gamma}_{i} w_{i}^{k} \qquad i = 1, \ldots , n_{z}, $$
(1)

where \(w_{i}^{k}\) represents the increment of acceleration (or angular acceleration, depending on the nature of the corresponding degree of freedom) along time step \(k\). The state-propagation matrix \(\boldsymbol {\Phi}_{i}\) and the noise gain vector \(\boldsymbol {\Gamma}_{i}\) are both constant and equal for all degrees of freedom, with the following structure:

$$ \begin{bmatrix} z_{i} \\ \dot{z}_{i} \\ \ddot{z}_{i} \end{bmatrix} _{k+1} = \begin{bmatrix} 1 && \Delta t&& \tfrac{1}{2}\Delta t^{2} \\ 0 && 1 && \Delta t \\ 0 && 0 && 1 \end{bmatrix} \begin{bmatrix} z_{i} \\ \dot{z}_{i} \\ \ddot{z}_{i} \end{bmatrix} _{k} + \begin{bmatrix} \tfrac{1}{2}\Delta t^{2} \\ \Delta t \\ 1 \end{bmatrix} w_{i}^{k} \qquad i = 1, \ldots , n_{z}. $$
(2)

The time step \(\Delta t\) is fixed to 10 ms, since the motion-capture system provides the marker coordinates at a 100 Hz rate. The noise value \(w_{i}^{k}\) represents a random acceleration increment, which is assumed to remain constant along the whole time step. The state vector \(\mathbf{x}\) and propagation matrix \(\boldsymbol {\Phi}\) of the whole system are obtained by assembling all the individual \(\mathbf{x}_{i}\) and \(\boldsymbol {\Phi}_{i}\) terms, leading to a system of dimension \(3\times n_{z}\).

Assuming that the discrete Gaussian noise \(w_{i}\) has a variance \(\sigma _{w_{i}}^{2}\), the process noise covariance matrix \(\boldsymbol{\mathcal {Q}}_{i}\) can be obtained as [40]:

$$ \boldsymbol{\mathcal {Q}}_{i} = \mathbf{\Gamma}_{i}\sigma _{w_{i}}^{2} \mathbf{\Gamma}_{i}^{\mathsf {T}} = \begin{bmatrix} \tfrac{1}{4}\Delta t^{4} && \tfrac{1}{2}\Delta t^{3} && \tfrac{1}{2} \Delta t^{2} \\ \tfrac{1}{2}\Delta t^{3} && \Delta t^{2} && \Delta t \\ \tfrac{1}{2}\Delta t^{2} && \Delta t&& 1 \end{bmatrix} \sigma _{w_{i}}^{2} \qquad i = 1, \ldots , n_{z}, $$
(3)

which is also constant, although not necessarily equal for all degrees of freedom. The complete \(\boldsymbol{\mathcal {Q}}\) matrix for the whole system is obtained by assembling the individual \(\boldsymbol{\mathcal {Q}}_{i}\) matrices, according to the arrangement of variables used in the state vector \(\mathbf{x}\).

2.2.2 Observation model

The observation function \(\mathbf{h}\left (\mathbf{x}\right )\) must provide the theoretical values of the sensors, i.e., a vector \(\mathbf{m}\) containing the absolute positions of the \(n_{m}\) optical markers, as a function of the system states \(\mathbf{x}\). Each virtual marker \(i\) is considered as rigidly attached to an underlying body segment \(b\), at a fixed location \(\bar{\mathbf{m}}_{i}\) within its local axes. The body segment \(b\) itself is defined by the position vector \(\mathbf{r}_{b}\) of its proximal joint, and a rotation matrix \(\mathbf{A}_{b}\). As shown in Fig. 2, the absolute position of a given marker \(i\) is the sum of the position vector of the origin of its underlying body segment, and the local coordinates of the marker transformed into the global axes by \(\mathbf{A}_{b}\):

$$ \mathbf{m}_{i} =\mathbf {r}_{b} + \mathbf {A}_{b}\bar{\mathbf {m}}_{i} \qquad i = 1,\ldots ,n_{m} \quad b = 1,\ldots ,n_{b}. $$
(4)

The position of the origin of body \(b\), in turn, can be calculated recursively in the same way, as the origin \(\mathbf{r}_{b-1}\) of its proximal body, plus the local coordinates \(\bar{\mathbf{r}}_{b}\) of the proximal joint of body \(b\) within the frame of reference of body \(b-1\), also rotated to the global frame:

$$ \mathbf{r}_{b} =\mathbf {r}_{b-1} + \mathbf {A}_{b-1}\bar{\mathbf {r}}_{b} \qquad b = 1,\ldots ,n_{b}. $$
(5)

Both the local coordinates of joints and optical markers are assumed to be constant, although their values will depend on the size of the subject.

Fig. 2
figure 2

Recursive calculation of absolute marker positions

The \(3\times n_{m}\) coordinates of the optical markers used as sensors in this work are considered independent, thus leading to a diagonal sensor-noise covariance matrix \(\boldsymbol{\mathcal {R}}\). The total noise affecting each marker measurement is the result of two effects: the noise introduced by the motion-capture system, and the skin-motion artifact [41]. Whereas the former is very small, and equal for all markers, the latter is significantly large and, moreover, it does not behave as Gaussian noise, since it is correlated to the motion of the underlying bodies, and does not contain a flat frequency spectrum. This fact, along with the use of a Kalman filter on a nonlinear system, renders the resulting estimation nonoptimal in a statistical sense.

The skin-motion artifact does not affect all markers equally, so their noise variances will differ. Therefore, it is possible to finely tune \(\boldsymbol{\mathcal {R}}\) by setting lower variances for markers that are closely attached to the bone, and higher values for the most unreliable ones. However, for the sake of simplicity, and taking into account that this work is mostly focused on obtaining a working real-time application, all markers will be assigned the same noise variance \(\sigma _{m}^{2}\).

2.2.3 Model scaling

The local coordinates of the joints \(\bar{\mathbf{r}}_{b}\) present in the observation function must match the dimensions of the subject, so they need to be properly scaled before running the Kalman filter. The reference unscaled dimensions, which are detailed in Table 1, are determined by the geometry of the 3D bone models used for the graphics output, which are in turn taken form the BodyParts3D digital library [42].

Table 1 Unscaled joint coordinates

A first approach to scaling a body segment could be multiplying the components of all the \(\bar{\mathbf{r}}_{b}\) and \(\bar{\mathbf{m}}_{i}\) vectors attached to it by a set of three independent scaling factors, defined along the directions of its local axes. However, using three independent factors per segment leads to inaccurate results in many situations. For instance, when the marker distances along a certain local axis are too short, the scaling factor on that direction is very sensitive to small marker displacements. For this reason, a reduced set \(\mathbf{k}\) of independent scaling factors is used instead. The relationship between the independent scale factors and their dependent counterparts is defined in Table 2.

Table 2 Scaling factors

Scaling is carried out by solving a standard nonlinear least-squares problem, where the design variables are the independent position coordinates of the model, together with the independent scale factors, and the objective function is the squared norm of the measurement residual:

$$ \min _{\mathbf{z}, \mathbf{k}} f\left (\mathbf{z}, \mathbf{k}\right ) = \left [\mathbf{m}_{c} - \mathbf{h}_{s}\left (\mathbf{x}, \mathbf{k} \right )\right ]^{\mathsf{T}}\left [\mathbf{m}_{c} - \mathbf{h}_{s} \left (\mathbf{x}, \mathbf{k}\right )\right ]. $$
(6)

This optimization problem is essentially the one proposed by Lu and O’Connor [13], to which the independent scale factors are added here as additional design variables. The residual is defined as the difference between the measured marker coordinates \(\mathbf{m}_{c}\) and their estimated or virtual counterparts, provided by the observation function \(\mathbf{h}_{s}\) of the EKF, which is augmented here in order to include the scale factors \(\mathbf{k}\) as additional input arguments. Likewise, the Jacobian matrix of the measurement residual mostly coincides with that of the observation function, with additional columns corresponding to the derivatives with respect to the components of \(\mathbf{k}\). Having the exact Jacobian matrix enables us to solve the optimization in a very robust and efficient way by using the Levenberg–Marquardt algorithm [43, 44].

2.2.4 Marker labeling

The motion-capture system provides a set of 3D marker coordinates every 10 ms. Ideally, each dataset should only contain the coordinates of the \(n_{m}\) markers attached to the subject, with the markers always sorted in the same order. This would allow for properly identifying or labeling the markers, so they can be assembled correctly within the measurement vector \(\mathbf{m}_{c}\). However, it is very frequent to lose markers due to occlusions, or to have spurious reflections appear as additional markers. Moreover, the markers are not guaranteed to be sorted in any particular order. In traditional, postprocessing-based methods, these problems are addressed after the capture process, in a manual or semiautomatic way. However, in the case of a real-time application, where it is not possible to perform any manual data correction, these problems must be addressed on-the-fly.

In the present work, this is solved using the same method described in [18], which is based on two separate algorithms: the first one performs an initial marker identification and, in the subsequent time steps, the second one takes advantage of the observation function to keep labeling markers on-the-fly.

Initial labeling is approached by assuming that the subject is standing close to a known reference pose, which enables the identification of the measured markers based on their relative spatial locations. Once the experimental markers are identified, the vector of sorted coordinates \(\mathbf{m}_{c}\) is used as an input to solve the scaling problem described in Eq. (6). If the objective function reaches a sufficiently small value after the optimization, and all the resulting scale factors are positive, the solution is regarded as valid. On the contrary, if the posture of the subject is too far from the reference, markers are likely bo be mislabeled by the initial sorting algorithm, thus leading to a large measurement residual and/or negative scaling factors after the optimization. In this case, the process is repeated at every time step in a loop, until the subject adopts an adequate posture. In the real-time application described in Sect. 3, each initial sorting and optimization process takes less than 2.5 ms, so initialization is perceived as instantaneous as soon as the reference pose is reached.

After a valid solution has been achieved, the optimal scale factors are stored and permanently applied to the model. However, before starting the EKF, the local position vectors \(\bar{\mathbf{m}}_{i}\) of the virtual markers require a further adjustment. For the current pose of the already scaled model, their components are corrected to make the virtual markers match the measured ones. Performing such correction decreases the measurement error or innovation while running the Kalman filter. This allows us to reduce the marker search radius used in the continuous labeling algorithm described below, thus minimizing the risk of marker swapping.

Once the model is correctly scaled, and the local marker coordinates are adjusted, the recursive predictor–corrector algorithm of the EKF can start, using the independent positions \(\mathbf{z}\) obtained from the optimization as a starting point (setting the initial velocities and accelerations to zero). As with the scaling algorithm, the EKF also requires an ordered set of experimental marker coordinates \(\mathbf{m}_{c}\) to work. However, the initial labeling method described above can no longer be used, since identifying the markers according to their relative positions is only possible for certain poses of the subject. Conveniently, when the filter is running, the observation function provides a set of estimated marker positions, which can be exploited to identify the measured markers based on their proximity to them, through a nearest-neighbor search. This can be done very efficiently by computing a matrix of squared crossdistances between the real and the virtual marker set, as explained in [18]. If a virtual marker does not have any measured one within a predefined search radius, it is regarded as lost for that time step, so it is not used in the EKF corrector. If the algorithm detects a severe marker loss, or the innovation of the EKF becomes too large, the process must be rebooted. This means that the pose-based labeling and optimization loop must be carried out again, but this time only the positions \(\mathbf{z}\) are used as design variables, since the scaling factors are already determined.

2.3 Inverse dynamics

Apart from the kinematics provided by the EKF, solving the inverse-dynamics problem requires knowing the body-segment parameters and the ground reactions, along with a set of equations of motion. These elements will be explained in detail in this section.

2.3.1 Body-segment parameters

The body-segment parameters (BSP) comprise the mass of each body segment, the location of its center of gravity (CoG), and its tensor of inertia. These parameters have a significant effect in the inverse dynamics [45], however, since it is very difficult to obtain their exact values, they are normally scaled using reference values taken from the literature. There exist methods to obtain these values more accurately, such as performing some previous measurements and exercises [12], using optimization techniques [46, 47], or even resorting to clinical measurements [48]. However, in this work, the simpler scaling method is chosen instead, since it allows for obtaining results immediately after placing the markers on the subject, which poses an interesting advantage for real-time biofeedback applications.

The reference values used here are a combination of those published by Silva and Ambrósio [36] and Dumas et al. [49]. The latter has more recent data, so its values are used whenever possible. However, they used a kinematic model in which the lumbar joint was placed at a different location, so the pelvis and trunk data must be taken from the first cited paper. In order to match the geometry of the 3D graphics model defined in Table 1, the values from the referred papers are scaled, to obtain the reference body-segment parameters displayed in Table 3.

Table 3 Unscaled body-segment parameters in SI units (kg, m, kg⋅m2)

2.3.2 Ground-reaction forces and torques

Force plates measure the forces and moments produced by foot–ground interactions. Each plate provides the six components of the corresponding ground reaction, in the form of a 3D wrench located at its center. In the case of a single support, force plates are not strictly necessary, since inverse dynamics can provide the six components of the net external reaction. However, in the frequent case of having more than one external contact, they become essential to eliminate the resulting indeterminacy [50]. In order to correctly transfer the reactions to the feet, the location and orientation of the force plates within the motion capture volume must be accurately determined.

During the motion-capture session, it is important to make sure that, whenever a foot steps on a plate, it stays completely within its bounds, and that two feet never step on the same plate at the same time. In the standard motion-capture workflow, when the data is postprocessed, it is necessary to determine which foot stepped on each plate, in order to correctly assign the measured reactions during the inverse dynamics analysis. In real-time biofeedback applications, however, this information is not necessarily known in advance, so an algorithm to assign the force-plate reactions to the feet in real time becomes necessary. The method used in this work performs the following steps:

  • Calculate the absolute position of the centers of gravity of both feet.

  • Check which force plate is bearing the highest load.

  • Locate the center of pressure of said plate in global coordinates.

  • Determine which foot has its center of gravity closest to this point.

  • Assign that foot to the most loaded plate.

  • Assign the remaining foot to the least loaded plate.

The centers of pressure are initially set to the geometric centers of the force plates. They are only recalculated when the vertical force exceeds a 1.0 N threshold, which is slightly above the amplitude of the noise present in the signal. Otherwise, if the plate is unloaded or bearing a very small load, the noisy values of forces and moments can lead to random locations of the center of pressure, usually outside the bounds of the force plate.

Assigning the most loaded plate first avoids problems when one of the feet is airborne and far away from the ground. In such a situation, the center of gravity of the standing foot is the closest one to both centers of pressure, so the results can be wrong depending on the assignment order. Once the plates have been assigned, their measured forces and moments can be translated to the center of gravity of the corresponding foot, and then added to the Cartesian forces vector \(\bar{\mathbf{Q}}\), as will be explained in the next section.

2.3.3 Equations of motion

The equations of motion for the inverse-dynamics problem are stated in the same set of independent coordinates \(\mathbf{z}\) used for the EKF. In order to assemble the dynamic terms in a systematic way, an intermediate set of dependent Cartesian velocities \(\mathbf{v}_{i}\) is defined for each body \(i\). It contains the velocity of its center of mass in global coordinates, along with its angular velocity in local coordinates:

$$ \mathbf{v}_{i} = \begin{bmatrix} \dot{\mathbf{r}}_{G_{i}} \\ \bar{\boldsymbol{\upomega }}_{i} \end{bmatrix} \qquad i = 1,\ldots ,n_{b}. $$
(7)

This is similar to the set of Cartesian velocities used in the efficient semirecursive formulation described in [37]. In that work, the authors used the velocity of the point of body \(i\) that instantaneously coincides with the global origin, and the angular velocity of the same body in global coordinates, with the goal of being able to recursively accumulate the dynamic terms. In this work, however, they are defined in such a way that they lead to a constant block-diagonal mass matrix. This results in simpler and easier to linearize expressions for the dynamic terms, which has other advantages, as mentioned in Sect. 2.1.

According to the previous definition of the Cartesian velocities of body segment \(i\), its corresponding constant mass matrix is defined as follows:

$$ \bar{\mathbf{M}}_{i} = \begin{bmatrix} m_{i}\mathbf{I}&& \boldsymbol{0} \\ \boldsymbol{0} && \bar{\mathbb{I}}_{G_{i}}\end{bmatrix} \qquad i = 1,\ldots ,n_{b}, $$
(8)

where \(m_{i}\) is the total mass of the segment, \(\mathbf{I}\) is the 3 × 3 identity matrix, and \(\bar{\mathbb{I}}_{G_{i}}\) is the tensor of inertia of the segment in local axes, defined with respect to its center of gravity. When the products of inertia are zero, this matrix becomes purely diagonal. The forces and torques associated to these Cartesian velocities have the following form:

$$ \bar{\mathbf{Q}}_{i} = \begin{bmatrix} \mathbf{f}_{i} + m_{i}\mathbf{g} \\ \bar{\mathbf{n}}_{G_{i}} - \bar{\boldsymbol{\upomega }}_{i} \times \bar{\mathbb{I}}_{G_{i}}\bar{\boldsymbol{\upomega }}_{i} \end{bmatrix} \qquad i = 1,\ldots ,n_{b}, $$
(9)

where \(\bar{\boldsymbol{\upomega }}_{i}\) is the angular velocity of body \(i\), expressed in its local reference frame, and \(\mathbf{g}\) is the acceleration of gravity. The vector \(\mathbf{f}_{i}\) corresponds to the net external force applied to body \(i\) (excluding the weight), and \(\bar{\mathbf{n}}_{G_{i}}\) is the net external moment, calculated with respect to the center of gravity, and expressed in local coordinates. These external forces and moments include the reactions and motor torques at the joints, in addition to the ground reactions in the case of the feet.

If the velocities of all bodies are assembled into a vector \(\mathbf{v}\), and the same is done to the individual dynamic terms \(\bar{\mathbf{M}}_{i}\) and \(\bar{\mathbf{Q}}_{i}\), thus obtaining the mass matrix \(\bar{\mathbf{M}}\) and force vector \(\bar{\mathbf{Q}}\) of the whole system, the Newton–Euler equations of motion can be written in matrix form as:

$$ \bar{\mathbf{M}}\dot{\mathbf{v}} = \bar{\mathbf{Q}}. $$
(10)

In order to obtain the equations of motion in independent coordinates, the first step is to apply the principle of virtual power to these equations. This principle states that, for any given set of virtual velocities \(\tilde{\mathbf{v}}\), provided that they are compatible with the kinematical constrains of the multibody system, the corresponding virtual power of all the forces acting on the system (including D’Alembert’s inertia forces), is always null [51]. It can be expressed in matrix form as follows:

$$ \tilde{\mathbf{v}}^{\mathsf{T}}\left (\bar{\mathbf{M}} \dot{\mathbf{v}} - \bar{\mathbf{Q}}\right ) = 0. $$
(11)

For any allowed motion of the system, the joint reactions do not produce any power, since the constrained degrees of freedom have no relative motion. Therefore, they can be removed from \(\bar{\mathbf{Q}}\) in this equation. This means that all the unknown \(\mathbf{f}_{i}\) and \(\bar{\mathbf{n}}_{G_{i}}\) terms corresponding to joint reactions in Eq. (9) can be eliminated, leaving only those corresponding to motor forces and torques.

The next step for obtaining the equations of motion in independent coordinates is to define a position-dependent velocity-transformation matrix \(\mathbf{R}\) [51], such that:

$$ \mathbf{v}= \mathbf{R}\dot{\mathbf{z}}. $$
(12)

For scleronomous systems, as is the case of the multibody model of the human body used in this work, the dependent accelerations are directly obtained by differentiating the previous equation in time:

$$ \dot{\mathbf{v}}= \mathbf{R}\ddot{\mathbf{z}}+ \dot{\mathbf{R}} \dot{\mathbf{z}}. $$
(13)

This transformation can be applied to both the real and virtual velocities in Eq. (11), yielding:

$$ \dot{\tilde{\mathbf{z}}}^{\mathsf{T}}\mathbf{R}^{\mathsf{T}}\left [ \bar{\mathbf{M}}\mathbf{R}\ddot{\mathbf{z}}- \left (\bar{\mathbf{Q}}- \bar{\mathbf{M}}\dot{\mathbf{R}}\dot{\mathbf{z}}\right )\right ] = 0. $$
(14)

As opposed to what happened in Eq. (11), the virtual velocities \(\dot{\tilde{\mathbf{z}}}\) appearing in this equation are independent, which means they can take any arbitrary value. Since Eq. (14) must be fulfilled for any possible set of virtual velocities, the term multiplying them must be equal to zero. This leads to the desired equations of motion of the system in independent coordinates:

$$ \mathbf{R}^{\mathsf{T}}\bar{\mathbf{M}}\mathbf{R}\ddot{\mathbf{z}}= \mathbf{R}^{\mathsf{T}}\left (\bar{\mathbf{Q}}- \bar{\mathbf{M}} \dot{\mathbf{R}}\dot{\mathbf{z}}\right ), $$
(15)

which can be rewritten as:

$$ \mathbf{M}\ddot{\mathbf{z}}= \mathbf{Q}, $$
(16)

where \(\mathbf{M}\) and \(\mathbf{Q}\) are the mass matrix and generalized forces vector:

$$ \mathbf{M}= \mathbf{R}^{\mathsf{T}}\bar{\mathbf{M}}\mathbf{R}\qquad \mathbf{Q}= \mathbf{R}^{\mathsf{T}}\left (\bar{\mathbf{Q}}- \bar{\mathbf{M}}\dot{\mathbf{R}}\dot{\mathbf{z}}\right ). $$
(17)

As explained above, the joint reactions do not affect the result of this equation, so their \(\mathbf{f}_{i}\) and \(\bar{\mathbf{n}}_{G_{i}}\) terms can be omitted when using Eq. (9) to assemble \(\bar{\mathbf{Q}}\).

2.3.4 Inverse dynamics

In order to solve the inverse-dynamics problem, the generalized forces vector from Eq. (16) must be split into two separate terms, as follows:

$$ \mathbf{M}\ddot{\mathbf{z}}= \mathbf{Q}_{0} + \mathbf{Q}_{e}. $$
(18)

The vector \(\mathbf{Q}_{0}\) includes only the known part of the generalized forces, i.e., those due to the measured ground reactions, together with the gravitational and velocity-dependent inertia forces. It is assembled as described in Eqs. (9) and (17), with the feet being the only bodies with nonzero \(\mathbf{f}_{i}\) and \(\bar{\mathbf{n}}_{G_{i}}\) terms, due to the ground reactions. All remaining external forces and torques have been now translated into the vector of unknown motor efforts \(\mathbf{Q}_{e}\). As mentioned in Sect. 2.3.2, prior to being incorporated into \(\mathbf{Q}_{0}\), the ground reactions measured by the force plates must be transformed to make them consistent with the definition of the Cartesian velocities \(\mathbf{v}\). This implies translating the reaction moments from the force plate centers to the centers of gravity of the feet, and transforming them to the corresponding local axes.

The generalized forces included in \(\mathbf{Q}_{e}\) are acting directly on the independent coordinates \(\mathbf{z}\), which makes them difficult to interpret. Therefore, it is better to establish a more meaningful set of external efforts \(\mathbf{T}\), and map them into the independent coordinates by using a position-dependent input matrix:

$$ \mathbf{Q}_{e} = \mathbf{B}^{\mathsf{T}}\mathbf{T}. $$
(19)

The vector of motor efforts \(\mathbf{T}\) contains the motor torques driving all the joints, expressed in the local axes of the corresponding proximal bodies. The torque at the base of the neck only has local \(x\) and \(y\) components, due to the use of a 2-DOF universal joint, and the torques at the toes are scalar, since they correspond to revolute joints. Along with the motor torques, \(\mathbf{T}\) includes a residual wrench acting at the pelvis, which is intended to absorb the inconsistency between the measured ground reactions and the net external wrench predicted by inverse dynamics. This wrench should ideally be zero, but in practice it never is, due to the many sources of error present in the process [28, 52].

To facilitate its derivation, the input matrix \(\mathbf{B}\) can be interpreted as a velocity transform such that:

$$ \dot{\mathbf{y}}= \mathbf{B}\dot{\mathbf{z}}, $$
(20)

where \(\dot{\mathbf{y}}\) is a set of independent Cartesian velocities, defined consistently with the power produced by the motor efforts \(\mathbf{T}\). Therefore, it is formed by the relative angular velocities at the joints, expressed in the local axes of their corresponding proximal bodies, along with the velocity of the origin of the pelvis and its angular velocity, both in global axes. The computation of matrix \(\mathbf{B}\) is very efficient, since the \(\dot{\mathbf{y}}\) velocities are closely related to \(\mathbf{v}\), so the blocks forming matrix \(\mathbf{B}\) are easily derived from terms already present in matrix \(\mathbf{R}\). In the real-time software implementation described in Sect. 3, the computation of the input matrix takes an average of 23 μs per time step.

At every time step, after the independent positions, velocities, and accelerations have been provided by the EKF, and the ground reactions have been measured by the force plates, the equations of motion can be assembled, so the external efforts \(\mathbf{T}\) are obtained by solving the following linear system:

$$ \mathbf{B}^{\mathsf{T}}\mathbf{T}= \mathbf{M}\ddot{\mathbf{z}}- \mathbf{Q}_{0}. $$
(21)

2.4 Estimation of muscle efforts

As a first approach, muscular efforts are estimated in this work using simple static optimization. This implies that muscles are introduced in the model as plain contractile actuators, ignoring the presence of elastic tendons, and neglecting most of their physiological limitations, such as the delay produced by excitation–activation dynamics, and the effect of muscle length and contraction velocity [28, 29]. Although it is a very simplistic way of modeling muscle forces, it provides reasonable results for slow movements with small muscle elongations [29, 30]. Under these assumptions, the \(n_{F}\) muscle forces \(\mathbf{F}\) can be estimated by solving an optimization problem:

$$ \begin{aligned} \min _{\mathbf{F}} g(\mathbf{F}) = \sum _{i = 1}^{n_{F}} \left ( \frac{F_{i}}{F_{i}^{0}}\right )^{2}, \\ \begin{aligned} \text{s.t.} \quad &\mathbf{J}^{\mathsf{T}}\mathbf{F}= \mathbf{T}, \\ &0 \leq F_{i} \leq F_{i}^{0}, \end{aligned} \end{aligned} $$
(22)

where \(F_{i}^{0}\) is the maximum isometric force of muscle \(i\), and \(\mathbf{J}\) is the matrix of moment arms, defined such that the product \(\mathbf{J}^{\mathsf{T}}\mathbf{F}\) is mechanically equivalent to the joint torques provided by the inverse dynamics. The moment arms depend on the placement of the muscle-insertion points, whose reference local coordinates are adapted from the “Gait2392” model included in OpenSim [1]. The local coordinates of the insertion points are affected by the model-scaling process described in Sect. 2.2.3 in the same way as those of joints and optical markers.

The moment-arm matrix \(\mathbf{J}\) can be derived by using the same reasoning used to obtain the input matrix \(\mathbf{B}\), i.e., by understanding it as a transformation matrix that turns the Cartesian velocities \(\dot{\mathbf{y}}\) into the shortening rates of the active muscle segments. As with the input matrix, most of the terms involved in the assembly of \(\mathbf{J}\) are already computed at this point, since they are required for solving the inverse-dynamics problem. Therefore, obtaining the exact moment-arm matrix has very little computational cost (about 34 μs per time step), making approximated methods [23, 24] unnecessary in this case.

It must be pointed out that the joint-torque constraints are only imposed to the degrees of freedom that are considered as being actually actuated by muscles. For instance, the knees are modeled here as spherical joints for the sake of simplicity, so the inverse dynamics will output relatively large abduction–adduction torques at them. However, it is obvious that these torques are indeed joint reactions, essentially produced by ligaments and surface contacts, so it does not make sense to enforce the muscles to provide them. To avoid parasitic work by the reaction forces, the knees should be modeled using a more physiologically accurate model, either by using a revolute joint with an optimized rotation axis [47], or a macro joint replicating the actual kinematics [25]. These model improvements would not have a significant impact on computational efficiency, so using them should not be a problem in a real-time application.

The optimization problem stated in Eq. (22) can be solved at a very small computational cost, because it is a standard quadratic programming problem with linear equality and inequality constraints, for which there exist several well-known solvers [43]. In order for the problem to be addressed by a standard quadratic programming algorithm, the objective function \(g\) can be rewritten in the typical quadratic form:

$$ g\left (\mathbf{F}\right ) = \frac{1}{2}\mathbf{F}^{\mathsf{T}} \mathbf{C}\mathbf{F}+ \mathbf{D}^{\mathsf{T}}\mathbf{F}, $$
(23)

where the vector \(\mathbf{D}\) is not present in this case, and the matrix \(\mathbf{C}\) is a constant diagonal matrix such that:

$$ C_{ii} = \left (\frac{\sqrt{2}}{F_{i}^{0}}\right )^{2} \qquad i = 1, \ldots , n_{F}. $$
(24)

In this work, a modified version of the interior-point algorithm proposed by Mehrotra [43, 53] is implemented. The original algorithm has been tuned for this specific problem, in order to take advantage of the fact that there is no \(\mathbf{D}\) vector and \(\mathbf{C}\) is diagonal. This makes the optimization very efficient: with 86 muscles, the algorithm needs less than 0.15 ms to converge in a standard desktop PC, when the results obtained at each time step are fed as the initial guess to the next one.

In order to facilitate convergence, a set of additional external torques is added to the vector of design variables \(\mathbf{F}\), in a similar fashion to the reserve actuators used by OpenSim [1]. These actuators are introduced directly at the joints, so they can always provide the missing torque required to achieve convergence. To prevent the optimizer from relying on them too much, their maximum capacity is set to a small value, thus penalizing their use in the objective function. In an ideal situation, the reserve actuators should not be necessary but, if their activation is kept sufficiently low, they can help achieving convergence, while maintaining a reasonably low impact on the results.

3 Implementation and testing

The laboratory has a motion-capture system composed by 18 OptiTrack Flex 3 infrared cameras (OptiTrack, Corvallis OR, USA). These cameras have a resolution of 640x480 pixels, with a refresh rate of 100 Hz. In order to measure the ground reactions, two AMTI AccuGait force plates (Advanced Mechanical Technology Inc., Watertown MA, USA) are embedded into a ground platform.

3.1 Software architecture

The software for real-time motion capture, analysis, and visualization is programmed in C++. The graphical user interface (GUI), 3D graphics, multithreading and serial-port-related tasks are implemented using the Qt multiplatform framework (The Qt Company, Espoo, Finland). The Levenberg–Marquardt algorithm for model scaling, all Kalman filter computations, along with the inverse-dynamics and muscle-optimization solvers, are all implemented using the Eigen C++ template library for linear algebra [44], which is in turn configured to take advantage of the Intel® MKL libraries (Intel Corporation, Santa Clara, CA, USA).

Since there are different hardware devices involved, the software has an asynchronous multithreaded structure. The 3D coordinates of the optical markers are acquired by a separate thread using the C library provided by the manufacturer of the cameras. This thread launches a callback function every 10 ms, sending the raw marker coordinates to the main thread. In a parallel subprocess, the force plates are read through a serial-port interface. They also send the data at a 100 Hz rate, with another callback function being executed every time a data packet is received at the serial port. Since the ADC of the force plates does not provide an input for hardware clock synchronization, the data from force plates and cameras is not completely synchronized at a hardware level.

In the main thread, there are two possible states. When the program starts, the system is uninitialized, since the markers are not yet labeled and the model is unscaled. The program loops until \(n_{m}\) markers are found, and, when this happens, it tries to label them by using the pose-based method described in Sect. 2.2.4. Then, the scaling problem is run, and the value of the objective function is checked. If the error is below a certain threshold, it is assumed that the labeling is correct, so the EKF can start. Otherwise, the loop continues until a valid solution is found.

When the program is running the EKF, the sequence is as follows: as soon as a new dataset arrives from the cameras, the markers are labeled on-the-fly using the nearest-neighbor approach. Then, these ordered sensor measurements are used, together with the latest estimated markers and the EKF gain matrix, to correct the state vector \(\mathbf{x}\). Once the updated kinematics are available, the inverse-dynamics problem is solved, using the positions, velocities, and accelerations provided by the EKF, and the most current force-plate measurements available as inputs. When the motor torques \(\mathbf{T}\) are ready, the muscle optimization is carried out, leading to the estimated muscle forces \(\mathbf{F}\). In order to visualize the individual activations, each muscle is assigned a color, which is mapped from blue when the muscle is deactivated, to bright red when it is fully activated. Then, this color information, along with the transformation matrices of all rigid bodies, is sent to the 3D graphics engine, which takes care of the rendering in an asynchronous way. At this point, the EKF prepares to receive the next incoming dataset from the cameras: by using the state-propagation matrix, it calculates the next a priori estimation of the states that, in turn, provides the estimated markers by evaluating the observation function, along with its Jacobian matrix, necessary to compute the new EKF gain matrix.

Figure 3 shows a screenshot of the GUI of the software, which displays in real time the musculoskeletal model, with the measured and estimated markers, along with the ground-reaction forces and the muscle activations.

Fig. 3
figure 3

Real-time motion capture software GUI

3.2 Experimental results

The tests were carried out using the biomechanical model described in Sect. 2.1, with 36 optical markers and 86 muscles (43 at each leg). With this model size, an AMD Ryzen™ 7 3700X processor can run the algorithm in real time with an ample margin, since it needs about 0.8 ms per time step to perform all the calculations (marker labeling, EKF, inverse dynamics, and muscle optimization), and the cameras and force plates provide new data every 10 ms.

There are several sources of latency in the system: the IR cameras have to identify the markers and perform the triangulation, then the data is processed through all the algorithm steps and, finally, the 3D graphics output is generated and sent to the screen. However, the total lag of the 3D visualization is still small enough to be almost imperceptible, and the application has a very good real-time feedback. A demonstration of the application running in a motion capture laboratory is shown in Fig. 4.

Fig. 4
figure 4

Real-time demonstration of the software

The algorithm has proven fast enough to enable real-time musculoskeletal analysis, but the computational cost is not the only concern when processing a signal on-the-fly. The raw position data provided by the motion-capture system contains a significant amount of high-frequency noise, which needs to be filtered out of the signal prior to calculating velocities and accelerations. In the traditional postprocessing approach, the positions are passed through a discrete-time low-pass filter first, and velocities and accelerations are then obtained by using finite differences. In the EKF, both filtering and time differentiation occur within the predictor–corrector algorithm. The former method has the advantage that the entire time history is available at the time of processing, which allows for an additional backwards filter pass that compensates for the phase delay [12]. On the contrary, a real-time filter is causal, i.e., its output depends only on past and current inputs, so the delay is completely unavoidable [23, 24, 54]. The EKF, in this particular application where the system has no inputs and everything is driven by position sensors, is not free from this problem, so its impact in the results must be studied.

The experiment carried out to study such effects consisted of an unloaded squat exercise, performed by a 1.65 m, 56.5 kg young female, who gave written informed consent for her participation. In this movement, the torque constraints are violated when the joint flexion angles reach high values, as muscle wrapping has not yet been accounted for in the model, so muscle-moment arms can be very inaccurate in certain postures. However, this movement is interesting from the point of view of inverse dynamics, since it involves large joint-motion ranges and produces significant joint torques. Thus, the experiment focuses primarily on the inverse dynamics, rather than the muscular efforts since, once the joint torques obtained from the real-time algorithm are proven to be correct, any further muscular analysis results will be equivalent, regardless of how the torques were obtained.

In order to obtain a lag-free reference solution, the raw sensor data from the real-time experiment was also processed through the original EKF described in [18]. Then, as was done in the referred paper, the positions obtained from the second-order EKF were filtered using a forward–backward second-order Butterworth filter, before computing the velocities and accelerations by using central differences. The reference inverse dynamics solution was finally obtained by applying the same equations of motion used in the real-time method, described in Sect. 2.3.3. In the previous work [18], the accelerations obtained from the optical system were compared to those obtained by direct measurement, in order to determine the EKF and Butterworth filter settings providing the best match. The best results were obtained for a sensor variance of \(10^{-6}\) m2, a process noise variance of 1.0 rad2/s4 (or m2/s4, depending on the coordinate), and a Butterworth filter cutoff frequency of 20 Hz, so these were the values used here to obtain the reference solution.

It must be pointed out that two passes of a second-order Butterworth filter with a cutoff frequency \(f_{c}\) have the equivalent effect of a single-pass fourth-order filter, with a lower effective cutoff frequency \(f_{e}\). This frequency is obtained by squaring the gain function of a single-pass second-order Butterworth filter, and calculating at which frequency the 3 dB attenuation occurs:

$$ f_{e} = \left (\sqrt{2} - 1\right )^{\frac{1}{4}}f_{c}. $$
(25)

According to this expression, two passes of a 20-Hz filter result in an effective cutoff frequency of 16.04 Hz.

In the EKF, the variables most affected by the delay are the accelerations. Figure 5 displays the comparison of the reference vertical acceleration of the lumbar joint to the same magnitude obtained by the third-order EKF. Although the process noise variance has the same units in both EKFs, its physical meaning is different in each filter: in the second-order filter, the process noise represents the acceleration value during each time step whereas, in the third-order one, it represents the acceleration increment. Therefore, the parameters of the third-order EKF required some adjusting to better match the reference solution. In order to obtain more meaningful values, the sensor-noise variance \(\sigma _{m}^{2}\) has been set in this work to \(10^{-4}\) m2, which is more consistent with the observed values. For this sensor-noise variance, the process-noise variance \(\sigma _{w}^{2}\) providing the best results is 625 rad2/s4 (or m2/s4).

Fig. 5
figure 5

Vertical acceleration of the lumbar joint: 2nd-order EKF + BW (gray) vs. 3rd-order EKF (black)

When using these filter settings, the accelerations provided by the third-order EKF are delayed by about 10 ms. This delay is essentially the same that is obtained if the reference solution is passed through a single-pass, second-order Butterworth filter with a cutoff frequency of 16.04 Hz. If the variance of the process noise of the third-order EKF is lowered, the accelerations become smoother, but the delay quickly increases to 20 or 30 ms.

To assess the effect of the acceleration delay on the joint torques, Fig. 6 shows the comparison of flexion–extension torques at the knees. As can be seen in the plots, the torques have no noticeable delay: the RMS error between both methods is about 0.5 N m for both curves. This is due to the fact that the position-dependent gravitational forces are dominant in this movement, so the delay in inertia forces does not affect the total torques in a significant way.

Fig. 6
figure 6

Knee flexion–extension torques: 2nd-order EKF + BW (gray) vs. 3rd-order EKF (black)

4 Conclusions

In this work, a complete algorithm for real-time motion capture and musculoskeletal analysis is described, implemented, and experimentally tested. It is shown that using a third-order EKF with positions, velocities, and accelerations as state variables, along with 3D optical marker coordinates as sensors, combined with ground-reaction measurements, it is possible to reconstruct the kinematics, solve the inverse-dynamics problem, and obtain and visualize the muscle efforts in real time. The approach has been successfully implemented in a fully functional software application, with a fast and responsive visual feedback. The main improvement of the method presented here over related literature references, in which kinematics is solved by optimization methods that rely on the availability of proper marker data, is that the Kalman filter allows us to cope with marker occlusions in a very robust way, while maintaining the accuracy of real-time inverse dynamics, since accelerations are delayed about the same as when using the more common filter-and-differentiate approach.

The implementation used in this work presents some limitations. The most obvious one is the lack of muscle wrapping, which leads to inaccurate muscle-moment arms in certain postures. This can be easily solved by introducing a simple wrapping model based on dynamic via points [25, 26], which would greatly improve the accuracy at a small computational cost, or using more elaborate approaches, such as geometric surface-contact models [27]. If the application requires a complex wrapping model that cannot be run in real time, lookup tables [30] or polynomial fitting techniques [23, 24] can be used to provide approximate results more efficiently. Another relevant issue is the use of plain static optimizaton for estimating muscle forces. However, since the algorithm requires only 0.8 ms for evaluating each time step, which may be significantly reduced by using a more efficient multibody formulation [30, 37], there is still headroom for introducing a simplified physiological muscle model, as was done by Murai et al. [22], or even using a more complex static physiological-optimization approach [28, 29]. In the latter case, the inclusion of activation dynamics and tendon elasticity in the muscle model requires the integration of first-order ODEs, so the computational cost may become prohibitive for a large number of muscles. Nevertheless, there are many situations in which acceptable results can be efficiently obtained by assuming some simplifications [30]. The last important limitation is the acceleration delay. In very dynamic movements, such as a golf swing, the inertia forces may become dominant, so the delay will be translated to the joint torques and, consequently, to the muscle forces. However, being a real-time application, some kind of filtering delay is unavoidable, and 10 ms should be acceptable in most applications.

Apart from implementing an improved muscle model with wrapping, and possibly even fatigue [55], the next further development will be incorporating the equations of motion within the EKF plant model [3335], with the force plates acting as sensors, such that the Kalman filter will include the motor torques as state variables. By including the equations of motion along with the external reactions in the plant model, the system will be no longer driven by position sensors alone, thus improving the quality of the estimation.