1 Introduction

Dynamic simulation of multibody systems mainly refers to inverse and forward problems. Inverse dynamics deals with the determination of the driving forces that generate a given motion, as well as the constraint reaction forces. The solution to the inverse dynamics problem can be obtained for example using the Newton–Euler (N-E) method [1] that results in efficient recursive algorithms, e.g., Rigid Body Dynamics Library (RBDL) [2] and similar methods [3]. Conversely, forward dynamics involves the motion estimation of a multibody system over time under the given applied forces and initial conditions. Therefore, in a direct dynamic problem, the motion is the result of the force system that generates it. From a mathematical perspective, forward dynamics is computationally intensive as it entails the integration of a system of nonlinear ordinary differential equations. The most common formulations that deal with forward dynamic are the Composite Rigid Body Algorithm (CRBA) [4] and the Articulated Body Algorithm (ABA) [5]. However, the above-mentioned methods are not suited for systems with closed loops like parallel robots, for which more complex and expensive procedures are required to solve both inverse and forward dynamics, as described, for example, in [6]. Therefore, a single algorithm that can solve all types of dynamics problems has not been fully established. One notable effort is the work by Rodriguez [7], who proposed a unified approach based on Kalman filtering and on the concept of smoothing to solve dynamics problems. This approach has been further extended to solve the forward dynamic problem of closed kinematic chains [8]. Another effort in defining a unifying framework has been given by [9], who analyzed various algorithms for serial chain dynamics. In [10], an attempt is proposed to unify the CRBA and ABA derivation as two elimination methods to solve forward dynamics. To the best of our knowledge, factor graphs have not been applied to solve the general multibody kinematics and dynamics in the existing literature. The closest works are the preprints [11, 12], where factor graphs are indeed employed to solve kinematic and dynamic problems, although their proposed graph structure is applicable to open-loop robot arms only.

The present work proposes factor graphs as a unifying graphical language to express kinematics and dynamics for general multibody systems. As it will be shown, classical problems such as direct and inverse dynamics can then be easily solved as especial instances of those factor graphs. Moreover, this new paradigm has the potential to develop novel and advanced state estimators. Therefore, the main contributions of this work are:

  • a factor graph-based representation of dynamics problems, which is a insightful visualization of the underlying sparse relationship between all involved variables,

  • a unified method, which can solve inverse and forward dynamics for either open or closed kinematic chains,

  • a flexible framework that can be expressed and solved for both dependent and independent coordinates.

The implementation described in this manuscript is focused on planar mechanisms, although it is perfectly suitable to spatial systems without any change at all at the level of factor graph structures. Additionally, our approach allows more powerful and flexible schemes for state and parameter estimation to be implemented in contrast with standard methods based on Kalman filtering [13,14,15,16,17]. Such applications are left for future extensions of this work to keep the present manuscript focused on the key ideas on how to apply factor graphs to multibody motion problems.

The proposed approach draws on the formalism of graphical models, a powerful tool borrowing concepts from statistics and graph theory [18, 19]. By addressing the multibody simulation problem from the perspective of the variable structure, graphical models allow us creating efficient estimators for any combination of observed and hidden variables, effectively unifying the problems of kinematic and forward dynamic analysis (predicting or estimating the trajectory of a MB system), inverse dynamics, and parameter identification (e.g., inertial properties of the bodies involved, external disturbance forces, friction in the joints, etc.). All those problems end up to be formulated as a sparse nonlinear cost function built from a library of reusable “building blocks” (the factors) on which efficient solvers can then be applied. The framework is suitable for either offline batch analysis or online real-time operation that represents another clear advantage of the proposed approach.

The rest of the paper is structured as follows. Sections 2 and 3 first provide the required background on multibody dynamics and graphical models, respectively. Next, Sect. 4 presents a methodology for the application of Bayesian networks to multibody dynamics problems, whereas Sect. 5 particularizes such networks as factor graphs for a number of practical problems. Individual factors used in those graphs are described in detail in Sect. 6. Numerical examples are provided in Sect. 7, and we end sketching some conclusions in Sect. 8.

2 Review of multibody dynamics

In this section, fundamentals of multibody system motion analysis are briefly recalled, whereas the interested reader can refer for more details to the wide related literature, e.g., [20]. A multibody system is an assembly of two or more bodies (or elements) constrained to each other to fulfill a given motion law. In many practical applications, these elements may be considered rigid and, throughout this paper, we will work under this assumption even though the proposed framework may be further extended to include body flexibility.

One of the key decisions to take when modeling a MBS is selecting the set of generalized coordinates that will be used to represent it. Using independent coordinates \({{\mathbf {z}}}\) allows one to deal with the lowest number of parameters, i.e., the number of DOFs of the system. A second choice is to adopt dependent coordinates \({{\mathbf {q}}}\), in a number larger than that of DOFs but able to describe all multibody system points univocally. When dependent coordinates are used, the corresponding set of constraint equations must be included as well for a complete system analysis. There are different kinds of dependent coordinates [20]: Relative Coordinates, References Point Coordinates, Natural Coordinates, and a combination of the previous ones (Mixed Coordinates). Natural coordinates, mixed with relative coordinates where needed will be assumed in this work.

In the remainder of this section, the MBS motion equations are developed and expressed in terms of both dependent and independent coordinates.

2.1 Dependent coordinates formulation

For any given MBS with f dofs, the use of n dependent coordinates expressed by the vector \({{\mathbf {q}}}\) requires \(m = n-f\) constraint equations, which form the following set of equations

$$\begin{aligned} {\mathbf {\Phi }}\left( {{\mathbf {q}}}(t),t\right) ={\mathbf {0}} \end{aligned}$$
(1)

Thus, it is assumed that there are at least as many equations as there are unknown coordinates. To solve the kinematic problem, the time derivative of Eq. (1) is required, one time for velocity analysis and two times for acceleration analysis, leading to the following set of equations

$$\begin{aligned}&\mathbf {\Phi _q}({{\mathbf {q}}}(t),t) ~ {\dot{{\mathbf {q}}}}= -{\mathbf {\Phi }}_t \equiv {\mathbf {b}} \end{aligned}$$
(2)
$$\begin{aligned}&\mathbf {\Phi _q}({{\mathbf {q}}}(t),t) ~ {\ddot{{\mathbf {q}}}}= -{\dot{{\mathbf {\Phi }}}}_t-{\dot{{\mathbf {\Phi }}}}_{{{\mathbf {q}}}}{\dot{{\mathbf {q}}}}\equiv {\mathbf {c}} \end{aligned}$$
(3)

where \(\dot{{\mathbf {q}}}\) is the vector of dependent velocities, \(\mathbf {\Phi _q}\in {\mathbb {R}}^{m\times n}\) the Jacobian matrix of Eq. (1) respect to \({{\mathbf {q}}}\), and \({\mathbf {\Phi }}_t\) the time derivative of constraint equations that is equal to the null vector for time-independent constraints.

To solve the dynamic problem, external forces and inertia forces need to be considered. From the classical Newton’s law, one obtains the following set of differential equations that expresses the force equilibrium equations

$$\begin{aligned} {{\mathbf {M}}}{\ddot{{\mathbf {q}}}}+ \mathbf {\Phi _q}^\top \varvec{\lambda } = {{\mathbf {Q}}} \end{aligned}$$
(4)

where \(\varvec{\lambda }\) is the vector of Lagrange multipliers, \({{\mathbf {M}}}\) is the system mass matrix and vector \({{\mathbf {Q}}}\) contains the generalized external forces. Since Eq. (4) is a system of n equations in \(n+m\) variables, by adding the m Eq. (3) one obtains the following system

$$\begin{aligned} \begin{bmatrix} {\mathbf {M}} &{}&{} \mathbf {\Phi _q}^\top \\ \mathbf {\Phi _q} &{}&{} {\mathbf {0}} \end{bmatrix} \left[ \begin{array}{c} {\ddot{{\mathbf {q}}}}\\ \varvec{\lambda } \end{array} \right] = \left[ \begin{array}{c} {{\mathbf {Q}}}\\ {\mathbf {c}} \end{array} \right] \end{aligned}$$
(5)

Normally [20, 21], this equation is solved for the extended vector of unknowns that includes both, \({\ddot{{\mathbf {q}}}}\) and \(\varvec{\lambda }\). In our framework, we are specifically interested in the generalized accelerations \({\ddot{{\mathbf {q}}}}\). Therefore, by applying the block matrix inversion lemma (see 8) to Eq. (5) and keeping the first row only, it leads to the following equation of motion expressed in terms of dependent coordinates:

$$\begin{aligned} {\ddot{{\mathbf {q}}}}= & {} \left( {{\mathbf {M}}}^{-1} - {{\mathbf {M}}}^{-1}\mathbf {\Phi _q}^\top \mathbf {\Gamma }^{-1}\mathbf {\Phi _q}{{\mathbf {M}}}^{-1} \right) {{\mathbf {Q}}}\nonumber \\&+\left( {{\mathbf {M}}}^{-1}\mathbf {\Phi _q}^\top \mathbf {\Gamma }^{-1} \right) {\mathbf {c}} \end{aligned}$$
(6)

where we defined the auxiliary term \(\mathbf {\Gamma }\left( {{\mathbf {q}}}\right) \) as the \(m\times m\) square symmetric matrix

$$\begin{aligned} \mathbf {\Gamma }= \mathbf {\Phi _q}{{\mathbf {M}}}^{-1}\mathbf {\Phi _q}^\top , \end{aligned}$$
(7)

introduced for convenience in subsequent derivations.

2.2 Independent coordinates formulation

Independent coordinates \({{\mathbf {z}}}\), ensure the minimum number of variables, i.e., the number of DOFs. However, multibody systems can be more difficult to analyze with respect to dependent ones. First, the matrix \({{\mathbf {R}}}\) is introduced as

$$\begin{aligned} {\dot{{\mathbf {q}}}}= \frac{d{{\mathbf {q}}}}{dt} = \frac{\partial {{\mathbf {q}}}}{\partial {{\mathbf {z}}}}\frac{\partial {{\mathbf {z}}}}{\partial t} = {\mathbf {R}}({{\mathbf {q}}}){\dot{{\mathbf {z}}}} \end{aligned}$$
(8)

where the columns of \({{\mathbf {R}}}\) are f linearly independent vectors that constitute a basis of the nullspace of \(\mathbf {\Phi _q}\). Then, we express the independent velocities \({\dot{{\mathbf {z}}}}\) as the projection of the dependent velocities \({\dot{{\mathbf {q}}}}\) on the rows of a constant matrix \({{\mathbf {B}}}\) that we assume satisfying the condition of having f rows linearly independent from each other and from the m rows of \(\mathbf {\Phi _q}\)

$$\begin{aligned} {\dot{{\mathbf {z}}}}= {{\mathbf {B}}}{\dot{{\mathbf {q}}}} \end{aligned}$$
(9)

By combining Eq. (2) with Eq. (9), one obtains

$$\begin{aligned}&\begin{bmatrix} \mathbf {\Phi _q}\\ {{\mathbf {B}}}\end{bmatrix} {\dot{{\mathbf {q}}}}= \begin{bmatrix} {\mathbf {b}} \\ {\dot{{\mathbf {z}}}}\end{bmatrix} \end{aligned}$$
(10)
$$\begin{aligned}&{\dot{{\mathbf {q}}}}= \begin{bmatrix} \mathbf {\Phi _q}\\ {{\mathbf {B}}}\end{bmatrix}^{-1} \begin{bmatrix} {\mathbf {b}} \\ {\dot{{\mathbf {z}}}}\end{bmatrix} = {\mathbf {S}}{\mathbf {b}} + {{\mathbf {R}}}{\dot{{\mathbf {z}}}} \end{aligned}$$
(11)

The columns of matrix \({{\mathbf {R}}}\) are the partial velocities with respect to the generalized coordinates \({{\mathbf {z}}}\), and the term \(\mathbf {Sb}\) represents the partial velocities with respect to time.

By differentiating Eq. (9) and by taking into account Eq. (3), one gets

$$\begin{aligned} \begin{bmatrix} {\mathbf {c}} \\ {\ddot{{\mathbf {z}}}}\end{bmatrix} = \begin{bmatrix} \mathbf {\Phi _q}\\ {{\mathbf {B}}}\end{bmatrix} {\ddot{{\mathbf {q}}}}\Rightarrow {\ddot{{\mathbf {q}}}}= \left[ {\mathbf {S}} {{\mathbf {R}}}\right] \begin{bmatrix} {\mathbf {c}} \\ {\ddot{{\mathbf {z}}}}\end{bmatrix} = {\mathbf {S}}{\mathbf {c}} + {{\mathbf {R}}}{\ddot{{\mathbf {z}}}} \end{aligned}$$
(12)

From Eq. (4), pre-multiplied by \({{\mathbf {R}}}^\top \)

$$\begin{aligned} {{\mathbf {R}}}^\top {{\mathbf {M}}}{\ddot{{\mathbf {q}}}}= {{\mathbf {R}}}^\top {{\mathbf {Q}}}\end{aligned}$$
(13)

and by substituting \({\ddot{{\mathbf {q}}}}\) in Eq. (12)

$$\begin{aligned} {{\mathbf {R}}}^\top {{\mathbf {M}}}{{\mathbf {R}}}{\ddot{{\mathbf {z}}}}= {{\mathbf {R}}}^\top \left( {{\mathbf {Q}}}-{{\mathbf {M}}}\mathbf {Sc} \right) \end{aligned}$$
(14)

Introducing the reduced mass matrix \(\overline{{\mathbf {M}}}= {{\mathbf {R}}}^\top {{\mathbf {M}}}{{\mathbf {R}}}\) and force vector \(\overline{{\mathbf {Q}}}= {{\mathbf {R}}}^\top \left( {{\mathbf {Q}}}-{{\mathbf {M}}}\mathbf {Sc} \right) \), the following equation of motion in terms of independent coordinates can be finally obtained

$$\begin{aligned} {\ddot{{\mathbf {z}}}}= \overline{{\mathbf {M}}}^{ -1}\overline{{\mathbf {Q}}}\end{aligned}$$
(15)

3 Background on graphical models

A factor graph is a particular type of probabilistic graphical model that can be used to describe the structure of an estimation problem [22]. Factor graphs have found applications in many fields, for example in robot perception [23], information theory [24], signal processing [25], and in other areas of robotics, mostly SLAM [26] and computer vision [27], state estimation in legged robots [28], and kinematic motion planning [29].

3.1 Dynamic Bayesian networks

It is instructive to start our discussion by considering a Dynamic Bayesian Network (DBN), a type of graphical model where variables are represented as nodes and directed edges stand for causal relationships [22, 30]. The existence of directed edges in a DBN allows us to encode an expert’s knowledge in causality relationships between all involved variables in the graph.

The rules to convert a DBN into the kind of graphs we are actually using in this work, factor graphs (FGs), are discussed in Sect. 5. A relevant point regarding such conversion is that one single DBN can be mapped into different FGs depending on which subset of the original DBN variables are known data and which unknowns are required to be found. Let us remark that whereas a DBN displays all variables as nodes, in a FG only unknown variables are variable nodes.

3.2 Factor graphs

Factor graphs (FGs) are bipartite graphs comprising two kinds of nodes: variable nodes and factor nodes. Variables are the unknown data to be estimated, and factors represent any constraint (a cost function to be minimized) or relationship between variables. A crucial aspect of a FG is its sparsity: each factor node is only connected to the variables that appear in its cost function. Sparse graph optimization algorithms exist with computational cost almost linear with the number of edges in a FG, as opposed to, for example, the cubic cost of a naive implementation of Kalman Filtering [31]. Incremental (e.g., [32]) and sliding window (e.g., [33]) approaches exist as well, enabling the efficient estimation of problems with thousands to millions of variables [34,35,36].

Once a FG is formulated for a given problem, estimating the most-likely value of all the unknowns becomes a numerical nonlinear minimization problem, for which very efficient algorithms exist. As a probabilistic estimator, these optimal values can be assigned an uncertainty. In general, retrieving the covariance \(\varvec{\Sigma }\) of the estimated variables involves inverting the Hessian of the linearized problem evaluated at the optimal solution, such that \(\varvec{\Sigma } = ({\mathbf {J}}^\top {{\varvec{\Lambda }}}{\mathbf {J}})^{-1} \) with \({\mathbf {J}}\) the sparse Jacobian of all constraints (factors) and \({{\varvec{\Lambda }}}\) the weight matrix representing our level of certainty about each constraint. The matrix inversion operation is, in general, very costly since it is cubic with the problem dimension. However, optimized methods exist for the kind of sparse problem at hand, see, for example, [37, §B.4] or [32].

4 DBN for multibody dynamics

Since DBNs allow a human expert to specify the causality relationships between variables, we propose the two DBNs in Fig. 1 as the underlying structure of a generic multibody system that is the basis for our proposed framework. Note the use of discrete time steps t, exactly as simulations are commonly done following numerical integration schemes. A set of observations or measurements \({\mathbf {o}}\) are available from sensors installed on the system. Note, however, that the graphical model formalism is flexible enough to allow each sensor to be available at a different or even sampling rate. When using this graphical model to achieve state estimation, sensors may provide partial information on the system dynamics, which will be then fused with the rest of graph constraints over the system trajectory to reach the most likely values of the estimated variables. In non-state estimation problems, the set of observations \({\mathbf {o}}\) may be empty. Additionally, system parameters such as masses, stiffness, or friction coefficients (constant or variable over time) could be added as additional nodes, although this possibility is left out of the scope of this manuscript for the sake of conciseness.

Fig. 1
figure 1

Dynamic Bayesian Network (DBN) models representing the causality relationship between the variables involved in general multibody dynamics problems over discrete time steps t. Notation: \({{\mathbf {q}}},{\dot{{\mathbf {q}}}},{\ddot{{\mathbf {q}}}}\) are the dependent coordinates, \({{\mathbf {z}}},{\dot{{\mathbf {z}}}},{\ddot{{\mathbf {z}}}}\) are independent coordinates, \({\mathbf {o}}\) are sensor observations, C is the active branch of the mechanism, and \({{\mathbf {Q}}}\) is the external forces. Dashed boxes represent groups of variables for which input or output directed edges affect all members. See Sect. 3 for a discussion

From Fig. 1, some observations can be drawn:

  • A central role in the DBN for independent coordinates (Fig. 1b) is played by the independent coordinates \({{{\mathbf {z}}},{\dot{{\mathbf {z}}}},{\ddot{{\mathbf {z}}}}}\) as they represent the degrees of freedom (d.o.f) that govern the evolution of the MBS. This role is assumed by the dependent coordinate variables \({{{\mathbf {q}}},{\dot{{\mathbf {q}}}},{\ddot{{\mathbf {q}}}}}\) in Fig. 1a.

  • The branch variable C, required uniquely in the independent-coordinate formulation, allows us to disambiguate between, e.g., the two possible configurations for a four-bar linkage if we are only given the information about the crank angle. This variable could be part of the unknowns to be estimated, as already done in [38], although that work did not use the more powerful FG representation proposed here.

  • Typically, nodes in a DBN are depicted as shaded or unshaded depending of whether they are “hidden” or “observed” variables, respectively, e.g., see [18, 38]. The former are estimated from the latter. Since this work focuses on FGs instead, where observed variables are subsumed into factor nodes and unknowns become variable nodes [23], different FGs will be generated from the same DBNs in Fig. 1 for different multibody simulation problems; hence, it is not convenient to establish such a distinction at the DBN level yet.

  • Observations \({\mathbf {o}}\) (sensor readings) are a function of (all, or a subset of) dependent coordinates. Typical observations that may be useful in a MBD problem are measurements obtained from gyroscopes, accelerometers, and load cells.

  • External forces \({{\mathbf {Q}}}_t\) act by means of modifying accelerations \({\ddot{{\mathbf {q}}}}_t\) or \({\ddot{{\mathbf {z}}}}_t\), according to the system dynamics, expressed by Eq. (6) or Eq. (15), respectively.

Once the model of the dynamics MBS has been expressed as a graphical model, the time evolution of the system can be obtained by converting the graph into a maximum-a-posteriori (MAP) estimation problem. To this aim, one could write down the joint probability distribution \(p(\phi )\) of all the involved variables \(\phi =\{{{\mathbf {q}}}_{0:t}, {\dot{{\mathbf {q}}}}_{0:t}, {\ddot{{\mathbf {q}}}}_{0:t}, {{\mathbf {o}}}_{0:t}, {{\mathbf {Q}}}_{0:t}\}\) for time steps 0 to t exploiting the conditional probability encoded by the DBN (refer to, e.g., [18, 22]), which for dependent coordinates (i.e., Fig. 1a) becomes:

$$\begin{aligned} p(\phi )= & {} p({{\mathbf {q}}}_0) p({\dot{{\mathbf {q}}}}_0) \left( \prod _{i=0}^t p({{\mathbf {Q}}}_i) \right) \left( \prod _{i=1}^t p({{\mathbf {o}}}_i|{{\mathbf {q}}}_i,{\dot{{\mathbf {q}}}}_i,{\ddot{{\mathbf {q}}}}_i) \right. \nonumber \\&p({\ddot{{\mathbf {q}}}}_i|{{\mathbf {q}}}_i,{\dot{{\mathbf {q}}}}_i,{{\mathbf {Q}}}_i) p({{\mathbf {q}}}_i|{{\mathbf {q}}}_{i-1},{\dot{{\mathbf {q}}}}_{i-1}) p({\dot{{\mathbf {q}}}}_i|{\ddot{{\mathbf {q}}}}_{i-1},{\dot{{\mathbf {q}}}}_{i-1}) \bigg )\nonumber \\ \end{aligned}$$
(16)

where each conditional probability term in Eq. (16), taking negative logarithms and up to an irrelevant proportionality constant, becomes a factor in this alternative form of an overall cost function \(c(\phi )\) to be minimized:

$$\begin{aligned} c(\phi )= & {} f_{prior}({{\mathbf {q}}}_0) f_{prior}({\dot{{\mathbf {q}}}}_0) \left( \prod _{i=0}^t f_{prior}({{\mathbf {Q}}}_i) \right) \nonumber \\&\left( \prod _{i=1}^t f_{obs}({{\mathbf {o}}}_i,{{\mathbf {q}}}_i,{\dot{{\mathbf {q}}}}_i,{\ddot{{\mathbf {q}}}}_i) f^d_{dyn}({\ddot{{\mathbf {q}}}}_i,{{\mathbf {q}}}_i,{\dot{{\mathbf {q}}}}_i,{{\mathbf {Q}}}_i) \right. \nonumber \\&f_{ei}({{\mathbf {q}}}_i,{{\mathbf {q}}}_{i-1},{\dot{{\mathbf {q}}}}_{i-1}) f_{ei}({\dot{{\mathbf {q}}}}_i,{\ddot{{\mathbf {q}}}}_{i-1},{\dot{{\mathbf {q}}}}_{i-1}) \bigg ) \end{aligned}$$
(17)

A fundamental feature of graphical models is how they enable factorizing functions of all problem variables such as \(c(\phi )\propto -\log p(\phi )\) into the product of a large number of smaller functions (in terms of the number of variables involved in each expression) called factors in Eq. (17), which are discussed individually in Sect. 6. This is what keeps the estimation problem tractable even for hundreds of thousands of variables, something intractable for estimators in the family of the Kalman filter not exploiting the sparsity of the problem structure. Note that the goal of an estimator is searching for the optimal set of unknowns \(\phi ^\star \) that maximize the likelihood of all observed data according to the model, that is:

$$\begin{aligned} \phi ^\star = \arg \max _{\phi } p(\phi ) = \arg \min _{\phi } c(\phi ) \end{aligned}$$
(18)

Depending on the division of the DBN variables into observed and unknowns, we would arrive at different factorizations since factors are considered to be functions of the unknowns only. For example, if all variables in one given cost factor in Eq. (17) are known data, it just evaluates to a constant which can be absorbed by the proportionality relationship \(c(\phi )\propto - \log p(\phi )\) and hence can be ignored. A graphical representation of the remaining relevant factors leads us to factor graphs themselves.

5 Factor graphs for multibody dynamics

The conversion from DBN to FG is known to be achievable as follows, without the need to explicitly writing down probabilistic equations like Eq. (16)–Eq. (17):

Every Bayes net [DBN] node splits in both a variable node and a factor node in the corresponding factor graph. The factor is connected to the variable node, as well as the variable nodes corresponding to the parent nodes in the Bayes net. If some nodes in the Bayes net are evidence nodes, i.e., they are given as known variables, we omit the corresponding variable nodes: the known variable simply becomes a fixed parameter in the corresponding factor. [23, p. 12]

Applying these rules to the DBN reflecting the structure of variables involved in the dynamic simulation of multibody systems with dependent coordinates in Fig. 1a, we derive next the FGs for a couple of practical problems depending on which are the known and unknown variables. Factors will be only briefly discussed here regarding their purpose and meaning, whereas their detailed implementations are described in Sect. 6.

5.1 FG for forward dynamics in dependent coordinates

Fig. 2
figure 2

Factor graphs for the forward dynamic simulation problem using dependent generalized coordinates. Circle nodes are problem unknowns; solid square nodes are factors. See discussion in Sect. 5.1

In forward dynamics simulation, we are given a known initial state of a mechanical system (position \({{\mathbf {q}}}_0\) and velocity \({{\mathbf {q}}}_0\)), their geometric and inertial properties, and the external forces that act on it (\({{\mathbf {Q}}}_i\), \(i=0,...,N\)). For simplicity, we assume no sensors are installed in the system since this problem instance does not need them, but predicting sensor outputs could be also possible adding the corresponding nodes and factors.

The resulting FG when using dependent coordinates is shown in Fig. 2, and it involves the following factor nodes (or plain factors):

  • \(f_{prior}\): Factors imposing a given a priori knowledge about the attached variables, e.g., initial conditions. Prior factors can be defined for both, position and velocity.

  • \(f^d_{pc}\): Factor for position constraints in dependent coordinates. It ensures the fulfillment of mechanical holonomic constraints by keeping the state vector \({{\mathbf {q}}}\) on the manifold \({\mathbf {\Phi }}({{\mathbf {q}}},t)={\mathbf {0}}\); hence, this factor is repeated for each position node \({{\mathbf {q}}}_i\). It could be omitted for \({{\mathbf {q}}}_0\) if the enforced pose imposed by the prior factor is known to be a correct mechanism position complying with \({\mathbf {\Phi }}(\cdot )={\mathbf {0}}\); alternatively, the prior factor can be made to affect a minimum number of variables in \({{\mathbf {q}}}_0\) (the number of degrees of freedom), leaving \(f^d_{pc}\) in charge of recalculating the rest of the generalized coordinates. This factor, on its own, solves the so-called position problem [20] in multibody dynamics.

  • \(f^d_{vc}\): Factor for velocity constraints in dependent coordinates, enforcing generalized velocities \({\dot{{\mathbf {q}}}}\) to remain on the manifold \(\mathbf {\Phi _q}{\dot{{\mathbf {q}}}}+ {\mathbf {\Phi }}_t={\mathbf {0}}\).

  • \(f^d_{dyn}\): Factor for the dynamics equation of motion: it links external forces (known data in this FG, hence not reflected as variable nodes) with the instantaneous acceleration \({\ddot{{\mathbf {q}}}}\). It also depends on \({{\mathbf {q}}}\) and \({\dot{{\mathbf {q}}}}\) since acceleration is always a function of them too. Note in the graph how acceleration for each timestep depends on position and velocity of the same timestep only.

  • \(f_{ti}\): Trapezoidal numerical integrator factor is used twice per timestep: to integrate velocities into positions, and accelerations into velocities. Implicit integrators as the trapezoidal one are often employed in MBS simulations for their stability. However, the Euler integrator version is also devised (see Sect. 6), since explicit integrators are commonly used in real-time applications as well.

5.2 FG for forward dynamics in independent coordinates

Fig. 3
figure 3

Factor graph model for the forward dynamic simulation problem using independent generalized coordinates. Circle nodes are problem unknowns; solid square nodes are factors. See discussion in Sect. 5.2

Alternatively, one could devise a FG implementation for the forward dynamics problem when using independent coordinates, leading to the graph depicted in Fig. 3. In this case, the following factors are used:

  • \(f_{prior}\): In this case, they are used to impose both, an initial known dynamic state (\({{\mathbf {z}}}_0\), \({\dot{{\mathbf {z}}}}_0\)) and approximated initial values for the full vector of dependent coordinates \({{\mathbf {q}}}_0\). The latter may be required to solve ambiguities in closed kinematic topologies, e.g., a four bar linkage, where knowing the minimum set of independent variables still leaves more than one feasible kinematic configuration.

  • \(f_{eq}\): An equality factor, used to impose a soft constraint between state vectors consecutive in time. The rationale behind this factor is to impose a soft constraint, which can be easily violated (its weight is small in comparison to all other factors) but still provides a solid starting point for nonlinear numerical solvers to exploit the fact that mechanism coordinates cannot vary abruptly between consecutive time steps. This factor is particularly important to solve ambiguities in mechanisms with more than one branch, exactly as argued by the end of the former paragraph.

  • \(f^i_{dyn}\): The independent coordinate version of \(f^d_{dyn}\) discussed above, it implements the dynamics equation of motion.

  • \(f^i_{pc}\), \(f^i_{vc}\): Like their dependent coordinate counterparts in the former section, these factors keep the position and velocity vector within their corresponding constraint manifolds. Note how, in this case, the factors not only affect \({{\mathbf {q}}}\) and \({\dot{{\mathbf {q}}}}\) coordinates, but their independent coordinate versions \({{\mathbf {z}}}\) and \({\dot{{\mathbf {z}}}}\) too, respectively. These factors actually correspond to the so-called position problem and velocity problem in multibody mechanics [20].

5.3 FG for inverse dynamics in dependent coordinates

Fig. 4
figure 4

Factor graph for the inverse dynamics problem using dependent generalized coordinates. See discussion in Sect. 5.3

Another problem that can be formalized as a FG is inverse dynamics, as shown in Fig. 4. The problem consists of specifying a desired trajectory over time for the set of degrees of freedom of a mechanism, then solving for the required forces and torques to generate such trajectory. The most relevant factors here are:

  • \(f_{prior}\): Different prior factors are used to define initial known values for the dynamic state (\({{\mathbf {q}}}_0\), \({\dot{{\mathbf {q}}}}_0\)) and to enforce a value of zero in all components of the generalized force vectors \({{\mathbf {Q}}}_i\) where it is known that no external force is acting. In other words, the latter factors are required to leave only part of \({{\mathbf {Q}}}_i\) as an unknown, e.g., for those degrees of freedom where a motor is exerting a torque.

  • \(f^d_{idyn}\): Similar to the dynamics equation factor \(f^d_{dyn}\) discussed above, but with the force \({{\mathbf {Q}}}_i\) as an additional variable. Note that \(f^d_{dyn}\) (see Sect. 5.2) also used a value for \({{\mathbf {Q}}}_i\), but it was treated as a known constant, whereas for \(f^d_{idyn}\) the \({{\mathbf {Q}}}_i\) are unknowns and as such the factor provides a Jacobian of the error term with respect to them as well.

From all discussed problems so far, inverse dynamics is the hardest to solve for nonlinear iterative solvers from initial values that are far from the optimum. Therefore, it requires a proper initialization strategy to enable numerical nonlinear solvers to cope with it effectively, as discussed in the experiments section later on.

6 Factors definition

In a factor graph, factors establish the relations between variables. This section provides an insightful practical guide to those factors MBS problems are made of.

Each factor defines an error \({\mathbf {e}}({\mathbf {x}})\) between predicted and measured data. To apply nonlinear optimization algorithms (e.g., Gauss–Newton or Levenberg–Marquardt), the Jacobian of such error with respect to all involved variables \({\mathbf {x}}\) is required. DBN variables whose values are known do not become FG nodes and hence are constant parameters of factors. The goal of the optimization is to search for the best values of all variables \({\mathbf {x}}^\star \) taking the weighted sum of error functions (one per factor) as close to zero as possible:

$$\begin{aligned} {\mathbf {x}}^\star = \arg \min _{{\mathbf {x}}} \sum _{i} \left\Vert {\mathbf {e}}_i({\mathbf {x}})\right\Vert ^2_{\varvec{\Sigma }} \end{aligned}$$
(19)

where \(\left\Vert {\mathbf {e}}_i({\mathbf {x}})\right\Vert ^2_{\varvec{\Sigma }}\) is a form of whitening already integrating the probabilistic noise model (or weight) of each factor. The most common model, used in the present work, is the assumption of additive zero mean Gaussian noise \({\mathbf {n}} \sim {\mathcal {N}}(\mathbf{0 }, \varvec{\Sigma })\) with covariance matrix \(\varvec{\Sigma }\). Taking the negative logarithm of the corresponding probability density can be shown to give us the nonlinear least-squares problem in Eq.(19), where

$$\begin{aligned} \left\Vert {\mathbf {e}}({\mathbf {x}})\right\Vert ^2_{\varvec{\Sigma }} = \frac{1}{2} {\mathbf {e}}({\mathbf {x}})^\top \varvec{\Lambda } {\mathbf {e}}({\mathbf {x}}) \end{aligned}$$
(20)

with information matrix \(\varvec{\Lambda }=\varvec{\Sigma }^{-1}\). Larger information values in \(\varvec{\Lambda }\) (or smaller variances in \(\varvec{\Sigma }\)) imply that the factor must be considered with a higher weight during the optimization in comparison to other factors with smaller information values (or larger variances). Note that each component of multidimensional \({\mathbf {x}}\) vectors may have a different weight for a given single factor, e.g., as proposed in the priors over external forces in Sect. 5.3. The interested reader is deferred to [23] for a more in-depth discussion on this topic.

In the following, note the use of the superscripts \(f^{d}_{\cdot }\) and \(f^{i}_{\cdot }\) for factors with differentiated implementations for dependent and independent coordinates, respectively.

6.1 \(f_{prior}\): Prior factor

Prior factors \(f_{prior}({\mathbf {x}})\) over a variable \({\mathbf {x}}\) are the only unary factor used in our proposed graphs. They represent a priori belief (hence the name) about the state of a given variable.

Since the problems addressed in this manuscript use variables that are either (a) state vectors with generalized coordinates of planar mechanisms, or (b) their velocities or (c) their accelerations, all variables can be treated as vectors in the group of real numbers; hence, the error function \({\mathbf {e}}(\cdot )\) of this factor for a vector of dimensionality d has a simple form:

$$\begin{aligned}&f_{prior}\text { error function:} \quad {\mathbf {e}}(\cdot ) = {\mathbf {x}}- {\mathbf {x}}_0 \end{aligned}$$
(21a)
$$\begin{aligned}&\text {Variables:} \quad {\mathbf {x}}\end{aligned}$$
(21b)
$$\begin{aligned}&\text {Fixed parameters:} \quad {\mathbf {x}}_0 \end{aligned}$$
(21c)
$$\begin{aligned}&\text {Jacobian:} \quad \dfrac{\partial {\mathbf {e}}}{\partial {\mathbf {x}}} = {\mathbf {I}}_d \end{aligned}$$
(21d)

where \({\mathbf {x}}_0\) is the “desired” value for the variable \({\mathbf {x}}\).

Fig. 5
figure 5

ab Numerical integration factors discussed in Sects. 6.2-6.3. Factors used to enforce \({{\mathbf {q}}}\) and \({\dot{{\mathbf {q}}}}\) to remain within their corresponding manifolds: c holonomic position constraints, (b) velocity constraints (see Sects. 6.4-6.5)

6.2 \(f_{ei}\): Euler integrator factor

The graphs proposed in former sections work over temporal sequences of variables, sampled at a fixed sample rate \(f_s=1/\Delta t\). Imposing the continuity of ordinary differential equations in discrete time can be done by means of numerical integration, of which the Euler integrator is the simplest instance.

Given two consecutive samples for time steps t and \(t+1\) of a given variable \({\mathbf {x}}\) and its time derivative \(\dot{{\mathbf {x}}}\) at time t, the Euler integrator factor \(f_{ei}\) (shown in Fig. 5a) can be defined as:

$$\begin{aligned}&f_{ei}\text { error function:} \quad {\mathbf {e}}(\cdot ) = {\mathbf {x}}_{t+1} - {\mathbf {x}}_t - \Delta t \dot{{\mathbf {x}}}_t \end{aligned}$$
(22a)
$$\begin{aligned}&\text {Variables:} \quad {\mathbf {x}}_t,{\mathbf {x}}_{t+1},\dot{{\mathbf {x}}}_t \end{aligned}$$
(22b)
$$\begin{aligned}&\text {Fixed parameters:} \quad \Delta t \end{aligned}$$
(22c)
$$\begin{aligned}&\text {Jacobians:} \quad \dfrac{\partial {\mathbf {e}}}{\partial {\mathbf {x}}_t} = -{\mathbf {I}}_d \quad \dfrac{\partial {\mathbf {e}}}{\partial {\mathbf {x}}_{t+1}} = {\mathbf {I}}_d \nonumber \\& \dfrac{\partial {\mathbf {e}}}{\partial \dot{{\mathbf {x}}}_t} = -\Delta t {\mathbf {I}}_d \end{aligned}$$
(22d)

where the state space is assumed to be \({\mathbb {R}}^d\).

6.3 \(f_{ti}\): Trapezoidal integrator factor

Another well-known numerical integration scheme follows the Trapezoidal Rule. Given two consecutive samples for time steps t and \(t+1\), both for a given variable \({\mathbf {x}}\) and its time derivative \(\dot{{\mathbf {x}}}\), the trapezoidal integrator factor \(f_{ti}\), shown in Fig. 5b, can be defined as:

$$\begin{aligned}&f_{ti}\text { error function:} \quad {\mathbf {e}}(\cdot ) \nonumber \\& = {\mathbf {x}}_{t+1} - {\mathbf {x}}_t - \dfrac{\Delta t}{2} \dot{{\mathbf {x}}}_t - \dfrac{\Delta t}{2} \dot{{\mathbf {x}}}_{t+1} \end{aligned}$$
(23a)
$$\begin{aligned}&\text {Variables:} \quad {\mathbf {x}}_t,{\mathbf {x}}_{t+1},\dot{{\mathbf {x}}}_t,\dot{{\mathbf {x}}}_{t+1} \end{aligned}$$
(23b)
$$\begin{aligned}&\text {Fixed parameters:} \quad \Delta t \end{aligned}$$
(23c)
$$\begin{aligned}&\text {Jacobians:} \quad \dfrac{\partial {\mathbf {e}}}{\partial {\mathbf {x}}_t} = -{\mathbf {I}}_d \quad \dfrac{\partial {\mathbf {e}}}{\partial {\mathbf {x}}_{t+1}} = {\mathbf {I}}_d \nonumber \\& \dfrac{\partial {\mathbf {e}}}{\partial \dot{{\mathbf {x}}}_t} = -\dfrac{\Delta t}{2} {\mathbf {I}}_d \quad \dfrac{\partial {\mathbf {e}}}{\partial \dot{{\mathbf {x}}}_{t+1}} = -\dfrac{\Delta t}{2} {\mathbf {I}}_d\nonumber \\ \end{aligned}$$
(23d)

As mentioned in Sect. 5.1, this integrator is preferred for its better accuracy in comparison with the Euler method.

6.4 \(f^d_{pc}\): Factor for position constraints in dependent coordinates

This factor, depicted in Fig. 5c, ensures the fulfillment of mechanical holonomic constraints of Eq. (1). As explained in Sect. 2.1, modeling a mechanism in dependent coordinates leads to a number of constraint equations largest or equal to the number of generalized coordinates.

This factor has the following general form:

$$\begin{aligned}&f^d_{pc}\text { error function:} \quad {\mathbf {e}}(\cdot ) = {\mathbf {\Phi }}({{\mathbf {q}}}_t) \end{aligned}$$
(24a)
$$\begin{aligned}&\text {Variables:} \quad {{\mathbf {q}}}_t \end{aligned}$$
(24b)
$$\begin{aligned}&\text {Fixed parameters:} \quad \text {Mechanism model} \end{aligned}$$
(24c)
$$\begin{aligned}&\text {Jacobians:} \quad \dfrac{\partial {\mathbf {e}}}{\partial {{\mathbf {q}}}_t} = \mathbf {\Phi _q}\left( {{\mathbf {q}}}_t \right) \end{aligned}$$
(24d)

where both \({\mathbf {\Phi }}\) and \( \mathbf {\Phi _q}\) can be automatically built out of pre-designed blocks (see Appendix III), according to a formal description of the topology of the mechanical system and the specific joints connecting each pair of adjacent bodies. In particular, since each physical constraint becomes one or more entries in the \({\mathbf {\Phi }}({{\mathbf {q}}})\) vector, each such constraint fully determines the corresponding rows in the Jacobian \(\mathbf {\Phi _q}({{\mathbf {q}}})\), easing its automated construction from the elemental expressions exposed in Appendix III.

6.5 \(f^d_{vc}\): Factor for velocity constraints in dependent coordinates

Velocity constraint factors further improve the quality of MBS simulation. These kind of factors are modeled by Eq. (2) and depend on both, \({{\mathbf {q}}}\) and \({\dot{{\mathbf {q}}}}\), as illustrated in Fig. 5d. Its definition is as follows:

(25a)
$$\begin{aligned}&\text {Variables:} \quad {{\mathbf {q}}}_t, {\dot{{\mathbf {q}}}}_t \end{aligned}$$
(25b)
$$\begin{aligned}&\text {Fixed parameters:} \quad \text {Mechanism model} \end{aligned}$$
(25c)
$$\begin{aligned}&\text {Jacobians:} \quad \dfrac{\partial {\mathbf {e}}}{\partial {{\mathbf {q}}}_t} = \dfrac{\partial \mathbf {\Phi _q}\left( {{\mathbf {q}}}_t \right) {\dot{{\mathbf {q}}}}_t}{\partial {{\mathbf {q}}}_t} \nonumber \\& \dfrac{\partial {\mathbf {e}}}{\partial {\dot{{\mathbf {q}}}}_t} = \mathbf {\Phi _q}({{\mathbf {q}}}_t) \end{aligned}$$
(25d)

where we assumed that no constraint depends explicitly on time; hence, \({\mathbf {\Phi }}_t({{\mathbf {q}}}_t)\) can be safely neglected. Again, each row in the Jacobians comes from exactly one constraint in the definition of the mechanical system, with most common cases described in Appendix III.

6.6 \(f^d_{dyn}\), \(f^d_{idyn}\) and \(f^i_{dyn}\), \(f^i_{idyn}\): Factors for dynamics

These factors minimize the error between the actual acceleration estimates (\({\ddot{{\mathbf {q}}}}\) and \({\ddot{{\mathbf {z}}}}\)) and the corresponding equations of motion Eqs. (6) and  (15) for dependent and independent coordinates, respectively. To deal both forward and inverse dynamics, this factor connects the generalized positions, velocities, and accelerations with the forces (\({{\mathbf {Q}}}_t\)) for each single time step t. In the case of forward dynamics, both the variable \({{\mathbf {Q}}}_t\) and the edge connecting it to the factor are dropped, becoming \({{\mathbf {Q}}}_t\) an internal known parameter to the factor. For this reason, two factors for forward and inverse dynamics must be defined independently:

  • Forward dynamics

    For dependent coordinates:

    $$\begin{aligned}&f^d_{dyn}\text { error function:} \quad {\mathbf {e}}(\cdot ) = {\ddot{{\mathbf {q}}}}({{\mathbf {q}}}_t,{\dot{{\mathbf {q}}}}_t)-{\ddot{{\mathbf {q}}}}_t\qquad \end{aligned}$$
    (26a)
    $$\begin{aligned}&\text {Variables:} \quad {{\mathbf {q}}}_t, {\dot{{\mathbf {q}}}}_t, {\ddot{{\mathbf {q}}}}_t \end{aligned}$$
    (26b)
    $$\begin{aligned}&\text {Fixed parameters:} (\text {masses, external forces}) \nonumber \\ \end{aligned}$$
    (26c)
    $$\begin{aligned}&\text {Jacobians:} \quad \dfrac{\partial {\mathbf {e}}}{\partial {{\mathbf {q}}}_t} = \dfrac{\partial {\ddot{{\mathbf {q}}}}({{\mathbf {q}}}_t,{\dot{{\mathbf {q}}}}_t)}{\partial {{\mathbf {q}}}_t} \quad \dfrac{\partial {\mathbf {e}}}{\partial {\ddot{{\mathbf {q}}}}_t} = -{\mathbf {I}} \nonumber \\& \dfrac{\partial {\mathbf {e}}}{\partial {\dot{{\mathbf {q}}}}_t} = \dfrac{\partial {\ddot{{\mathbf {q}}}}({{\mathbf {q}}}_t,{\dot{{\mathbf {q}}}}_t)}{\partial {\dot{{\mathbf {q}}}}_t} \end{aligned}$$
    (26d)

    and for independent coordinates:

    $$\begin{aligned}&f^i_{dyn}\text {error function:} \quad {\mathbf {e}}(\cdot ) = {\ddot{{\mathbf {z}}}}({{\mathbf {z}}}_t,{\dot{{\mathbf {z}}}}_t)-{\ddot{{\mathbf {z}}}}_t \end{aligned}$$
    (27a)
    $$\begin{aligned}&\text {Variables:} \quad {{\mathbf {z}}}_t, {\dot{{\mathbf {z}}}}_t, {\ddot{{\mathbf {z}}}}_t \end{aligned}$$
    (27b)
    $$\begin{aligned}&\text {Fixed parameters:} (\text {masses, external forces}) \nonumber \\ \end{aligned}$$
    (27c)
    $$\begin{aligned}&\text {Jacobians:} \quad \dfrac{\partial {\mathbf {e}}}{\partial {{\mathbf {z}}}_t} = \dfrac{\partial {\ddot{{\mathbf {z}}}}({{\mathbf {z}}}_t,{\dot{{\mathbf {z}}}}_t)}{\partial {{\mathbf {z}}}_t} \quad \dfrac{\partial {\mathbf {e}}}{\partial {\ddot{{\mathbf {z}}}}_t} = -{\mathbf {I}} \nonumber \\& \dfrac{\partial {\mathbf {e}}}{\partial {\dot{{\mathbf {z}}}}_t} = \dfrac{\partial {\ddot{{\mathbf {z}}}}({{\mathbf {z}}}_t,{\dot{{\mathbf {z}}}}_t)}{\partial {\dot{{\mathbf {z}}}}_t} \end{aligned}$$
    (27d)
  • Inverse dynamics

    For dependent coordinates:

    $$\begin{aligned}&f^d_{dyn}\text { error function:} \quad {\mathbf {e}}(\cdot ) \nonumber \\& = {\ddot{{\mathbf {q}}}}({{\mathbf {q}}}_t,{\dot{{\mathbf {q}}}}_t,{{\mathbf {Q}}}_t)-{\ddot{{\mathbf {q}}}}_t \end{aligned}$$
    (28a)
    $$\begin{aligned}&\text {Variables:} \quad {{\mathbf {q}}}_t, {\dot{{\mathbf {q}}}}_t, {\ddot{{\mathbf {q}}}}_t, {{\mathbf {Q}}}_t \end{aligned}$$
    (28b)
    $$\begin{aligned}&\text {Fixed parameters:} \quad (\text {masses}) \end{aligned}$$
    (28c)
    $$\begin{aligned}&\text {Jacobians:} \quad \dfrac{\partial {\mathbf {e}}}{\partial {{\mathbf {q}}}_t} = \dfrac{\partial {\ddot{{\mathbf {q}}}}({{\mathbf {q}}}_t,{\dot{{\mathbf {q}}}}_t,{{\mathbf {Q}}}_t)}{\partial {{\mathbf {q}}}_t} \quad \end{aligned}$$
    (28d)
    $$\begin{aligned}& \dfrac{\partial {\mathbf {e}}}{\partial {\dot{{\mathbf {q}}}}_t} = \dfrac{\partial {\ddot{{\mathbf {q}}}}({{\mathbf {q}}}_t,{\dot{{\mathbf {q}}}}_t,{{\mathbf {Q}}}_t)}{\partial {\dot{{\mathbf {q}}}}_t} \end{aligned}$$
    (28e)
    $$\begin{aligned}& \dfrac{\partial {\mathbf {e}}}{\partial {\ddot{{\mathbf {q}}}}_t} = -{\mathbf {I}} \end{aligned}$$
    (28f)
    $$\begin{aligned}& \dfrac{\partial {\mathbf {e}}}{\partial {{\mathbf {Q}}}_t} = \dfrac{\partial {\ddot{{\mathbf {q}}}}({{\mathbf {q}}}_t,{\dot{{\mathbf {q}}}}_t,{{\mathbf {Q}}}_t)}{\partial {{\mathbf {Q}}}_t} \end{aligned}$$
    (28g)

    and for independent coordinates:

    $$\begin{aligned}&f^i_{dyn}\text { error function:} \quad {\mathbf {e}}(\cdot ) \nonumber \\& ={\ddot{{\mathbf {z}}}}({{\mathbf {z}}}_t,{\dot{{\mathbf {z}}}}_t,{{\mathbf {Q}}}_t)-{\ddot{{\mathbf {z}}}}_t \end{aligned}$$
    (29a)
    $$\begin{aligned}&\text {Variables:} \quad {{\mathbf {z}}}_t, {\dot{{\mathbf {z}}}}_t, {\ddot{{\mathbf {z}}}}_t, {{\mathbf {Q}}}_t \end{aligned}$$
    (29b)
    $$\begin{aligned}&\text {Fixed parameters:} \quad (\text {masses}) \end{aligned}$$
    (29c)
    $$\begin{aligned}&\text {Jacobians:} \quad \dfrac{\partial {\mathbf {e}}}{\partial {{\mathbf {z}}}_t} = \dfrac{\partial {\ddot{{\mathbf {z}}}}({{\mathbf {z}}}_t,{\dot{{\mathbf {z}}}}_t,{{\mathbf {Q}}}_t)}{\partial {{\mathbf {z}}}_t} \end{aligned}$$
    (29d)
    $$\begin{aligned}& \dfrac{\partial {\mathbf {e}}}{\partial {\dot{{\mathbf {z}}}}_t} = \dfrac{\partial {\ddot{{\mathbf {z}}}}({{\mathbf {z}}}_t,{\dot{{\mathbf {z}}}}_t,{{\mathbf {Q}}}_t)}{\partial {\dot{{\mathbf {z}}}}_t} \end{aligned}$$
    (29e)
    $$\begin{aligned}& \dfrac{\partial {\mathbf {e}}}{\partial {\ddot{{\mathbf {z}}}}_t} = -{\mathbf {I}} \end{aligned}$$
    (29f)
    $$\begin{aligned}& \dfrac{\partial {\mathbf {e}}}{\partial {{\mathbf {Q}}}_t} = \dfrac{\partial {\ddot{{\mathbf {z}}}}({{\mathbf {z}}}_t,{\dot{{\mathbf {z}}}}_t,{{\mathbf {Q}}}_t)}{\partial {{\mathbf {Q}}}_t} \end{aligned}$$
    (29g)

Refer to Eqs. (6) and  (15) for the evaluation of the error functions Eqs. (28a28b28c28d28e28f) and  (29a29b29c29d29e29f), respectively. Since inverse dynamics problems seek the generalized forces in a small subset of components of the vector \({{\mathbf {Q}}}\) (i.e., those actuated with motors) given known values of positions \({{\mathbf {q}}}\) and velocities \({\dot{{\mathbf {q}}}}\), in our implementation we opted for implementing only the Jacobians with respect to \({{\mathbf {Q}}}_t\), that is, Eqs. (28g) and  (29g). Devising closed-form expressions for these Jacobians, reported in Appendix IV, is another contribution of this work.

Not defining the other Jacobians and assuming they are zero amount to disregarding the information on \({{\mathbf {q}}}\) and \({\dot{{\mathbf {q}}}}\) directly inferred by the external forces \({{\mathbf {Q}}}_t\), which is a reasonable assumption in practice. Note that indirect information on \({{\mathbf {q}}}\) and \({\dot{{\mathbf {q}}}}\) is never lost via their time dependency on accelerations, which are indeed taken into account in this factor.

6.7 \(f^i_{pc}\): Factor for position constraints in independent coordinates

Figure 3 shows three sets of factors linking dependent coordinates (position, velocities, and accelerations) with their corresponding degrees of freedom, that is, with their counterparts in independent coordinates. The first one of these factor is \(f^i_{pc}\), in charge of imposing the simultaneous fulfillment of: (a) position constraints in \({\mathbf {\Phi }}({{\mathbf {q}}})\), and (b) that independent coordinates \({{\mathbf {z}}}\) are integrated into its corresponding positions within \({{\mathbf {q}}}\). Therefore, its error function and Jacobians read:

$$\begin{aligned}&f^i_{pc}\text { error function:} \quad {\mathbf {e}}({{\mathbf {q}}}_t, {{\mathbf {z}}}_t) \nonumber \\& = \begin{bmatrix} {\mathbf {\Phi }}({{\mathbf {q}}}_t) \\ {{\mathbf {q}}}_t(\{idxs\}) - {{\mathbf {z}}}_t \end{bmatrix}_{(m+d) \times 1} \end{aligned}$$
(30a)
$$\begin{aligned}&\text {Variables:} \quad {{\mathbf {q}}}_t, {{\mathbf {z}}}_t \end{aligned}$$
(30b)
$$\begin{aligned}&\text {Fixed parameters:} \quad \{idxs\}, \nonumber \\& \text {constant distances, angles, etc.} \end{aligned}$$
(30c)
$$\begin{aligned}&\text {Jacobians:} \quad \dfrac{\partial {\mathbf {e}}}{\partial {{\mathbf {q}}}_t} = \begin{bmatrix} \mathbf {\Phi _q}\left( {{\mathbf {q}}}_t \right) \\ {\mathbf {I}}_{idxs} \end{bmatrix}_{(m+d) \times n} \nonumber \\& \dfrac{\partial {\mathbf {e}}}{\partial {{\mathbf {z}}}_t} = \begin{bmatrix} {\mathbf {0}}_{m \times d} \\ -{\mathbf {I}}_{d} \\ \end{bmatrix}_{(m+d) \times d} \end{aligned}$$
(30d)

where \(\{idxs\}=\{y_1,y_2,...,y_d\}\) stands for the fixed sequence of indices of each \({{\mathbf {z}}}\) coordinate inside the n-vector \({{\mathbf {q}}}\) and the coefficients \(I_{i,j}\) of the binary matrix \({\mathbf {I}}_{idxs}\) are defined as 1 if \(j=y_i\), or as 0 otherwise, where \(i={1,...,d}\) and \(j={1,...,n}\).

6.8 \(f^i_{vc}\): Factor for velocity constraints in independent coordinates

This factor ensures the fulfillment of the velocity constraints for independent coordinates:

$$\begin{aligned}&f^i_{vc}\text { error function:} \quad {\mathbf {e}}({{\mathbf {q}}}_t, {\dot{{\mathbf {q}}}}_t, {\dot{{\mathbf {z}}}}_t) \nonumber \\&= \begin{bmatrix} \mathbf {\Phi _q}({{\mathbf {q}}}_t){\dot{{\mathbf {q}}}}_t \\ {\dot{{\mathbf {q}}}}_t(\{idxs\}) - {\dot{{\mathbf {z}}}}_t \end{bmatrix}_{(m+d) \times 1} \end{aligned}$$
(31a)
$$\begin{aligned}&\text {Variables:} \quad {{\mathbf {q}}}_t, {\dot{{\mathbf {q}}}}_t, {\dot{{\mathbf {z}}}}_t \end{aligned}$$
(31b)
$$\begin{aligned}&\text {Fixed parameters:} \quad \{idxs\}, \nonumber \\& \text {constant distances, angles, etc.} \end{aligned}$$
(31c)
$$\begin{aligned}&\text {Jacobians:} \quad \dfrac{\partial {\mathbf {e}}}{\partial {{\mathbf {q}}}_t} = \begin{bmatrix} \mathbf {\Phi _{{{\mathbf {q}}}{{\mathbf {q}}}}}({{\mathbf {q}}}_t) {\dot{{\mathbf {q}}}}_t \\ {\mathbf {0}}_{d \times n} \end{bmatrix}_{(m+d) \times n} \nonumber \\& \dfrac{\partial {\mathbf {e}}}{\partial {\dot{{\mathbf {q}}}}_t} = \begin{bmatrix} \mathbf {\Phi _q}({{\mathbf {q}}}_t) \\ {\mathbf {I}}_{idx} \end{bmatrix}_{(m+d) \times n} \nonumber \\& \dfrac{\partial {\mathbf {e}}}{\partial {\dot{{\mathbf {z}}}}_t} = \begin{bmatrix} {\mathbf {0}}_{m \times d} \\ -{\mathbf {I}}_{d} \\ \end{bmatrix}_{(m+d) \times d} \end{aligned}$$
(31d)

where the tensor vector product \(\mathbf {\Phi _{{{\mathbf {q}}}{{\mathbf {q}}}}}{\dot{{\mathbf {q}}}}_t\) results in the 2-D matrix \({\dot{{\mathbf {\Phi }}}}_{{\mathbf {q}}}\) (see Eq. (40a)) which can be systematically built row by row from a description of the mechanism from the building blocks described in Appendix III.

6.9 \(f^i_{ac}\): Factor for acceleration constraints in independent coordinates

This factor ensures the fulfillment of the acceleration constraints for independent coordinates:

$$\begin{aligned}&f^i_{vc}\text { error function:} \quad {\mathbf {e}}({{\mathbf {q}}}_t, {\dot{{\mathbf {q}}}}_t,{\ddot{{\mathbf {q}}}}_t, {\ddot{{\mathbf {z}}}}_t) \nonumber \\&= \begin{bmatrix} {\dot{{\mathbf {\Phi }}}}_{{{\mathbf {q}}}}({{\mathbf {q}}}_t){\dot{{\mathbf {q}}}}_t + \mathbf {\Phi _q}({{\mathbf {q}}}_t){\ddot{{\mathbf {q}}}}_t \\ {\ddot{{\mathbf {q}}}}_t(\{idxs\}) - {\ddot{{\mathbf {z}}}}_t \end{bmatrix}_{(m+d) \times 1} \end{aligned}$$
(32a)
$$\begin{aligned}&\text {Variables:} \quad {{\mathbf {q}}}_t, {\dot{{\mathbf {q}}}}_t, {\ddot{{\mathbf {q}}}}_t, {\ddot{{\mathbf {z}}}}_t \end{aligned}$$
(32b)
$$\begin{aligned}&\text {Fixed parameters:} \quad \text {(constant distances,} \nonumber \\& \text {angles, etc.)} \end{aligned}$$
(32c)
$$\begin{aligned}&\text {Jacobians:} \quad \dfrac{\partial {\mathbf {e}}}{\partial {{\mathbf {q}}}_t} = \begin{bmatrix} ({\dot{{\mathbf {\Phi }}}}_{{{\mathbf {q}}}})_{{{\mathbf {q}}}}({{\mathbf {q}}}_t) {\dot{{\mathbf {q}}}}_t + \mathbf {\Phi _{{{\mathbf {q}}}{{\mathbf {q}}}}}({{\mathbf {q}}}_t) {\ddot{{\mathbf {q}}}}_t \\ {\mathbf {0}}_{d \times n} \end{bmatrix}_{(m+d) \times n} \nonumber \\& \dfrac{\partial {\mathbf {e}}}{\partial {\dot{{\mathbf {q}}}}_t} = \begin{bmatrix} 2\mathbf {\Phi _{{{\mathbf {q}}}{{\mathbf {q}}}}}({{\mathbf {q}}}_t){\dot{{\mathbf {q}}}}_t \\ {\mathbf {0}}_{d \times n} \end{bmatrix}_{(m+d) \times n} \nonumber \\& \dfrac{\partial {\mathbf {e}}}{\partial {\ddot{{\mathbf {q}}}}_t} = \begin{bmatrix} \mathbf {\Phi _q}({{\mathbf {q}}}_t) \\ {\mathbf {I}}_{idx} \end{bmatrix}_{(m+d) \times n} \nonumber \\& \dfrac{\partial {\mathbf {e}}}{\partial {\ddot{{\mathbf {z}}}}_t} = \begin{bmatrix} {\mathbf {0}}_{m \times d} \\ -{\mathbf {I}}_{d} \\ \end{bmatrix}_{(m+d) \times d} \end{aligned}$$
(32d)

where the tensor vector products \(({\dot{{\mathbf {\Phi }}}}_{{{\mathbf {q}}}})_{{{\mathbf {q}}}}({{\mathbf {q}}}_t) {\dot{{\mathbf {q}}}}_t\), \(\mathbf {\Phi _{{{\mathbf {q}}}{{\mathbf {q}}}}}{\ddot{{\mathbf {q}}}}_t\), and \(2\mathbf {\Phi _{{{\mathbf {q}}}{{\mathbf {q}}}}}{\dot{{\mathbf {q}}}}_t = 2{\dot{{\mathbf {\Phi }}}}_{{\mathbf {q}}}\) can be systematically built as described in Appendix III.

Fig. 6
figure 6

a First planar mechanism used in the numeric simulations described in Sect. 7. b Table of mechanism properties (length and mass) for each link. Inertias \(I_i\) are given with respect to the center of mass of each link, placed at its center

7 Test cases validation

Firstly, a four-bar linkage is employed to exemplify the implementation and the performance of the proposed FG-based framework. The mechanism is shown in Fig. 6a, where two revolute joints, \(P_1\) and \(P_2\), can move in the motion plane, whereas A and D are fixed.

Geometric and inertial properties are summarized in Fig. 6b. The motion of the mechanism has been simulated using the commercial MBS environment Adams / View from MSC.Software with a fixed-step integrator at 1/10 of the time step used in the factor graph; hence, its results can be considered the ground truth for comparison purposes.

A companion open-source reference C++ implementationFootnote 1 of the proposed factor graph approach has been released along with the paper, together with tools to easily define and simulate mechanisms via human-friendly YAML files, as in the example of Listing 1.

Note that all factor closed-form Jacobians have been thoroughly validated by means of unit tests against numerical finite differences.

figure a
Fig. 7
figure 7

Simulation-based validation of the proposed FG approach to forward dynamics for a four-bar linkage under the effects of gravity. Generalized coordinates (a) and velocities (b) are shown for our approach (using a fixed-lag smoother with \(N_w=2\), time step of 1 ms) and for MSC.ADAMS as ground truth. c and d show the error as the difference between ground truth and the output of the proposed FG-based method. Overall RMSE for this case is \(\approx 0.0031\) m and \(\approx 0.026\) m/s for \({{\mathbf {q}}}\) and \({\dot{{\mathbf {q}}}}\), respectively. See discussion in Sect. 7.1

7.1 Forward dynamic simulation

Free motion of the four-bar mechanism under the sole effect of gravity (\(g=9.8~ \text {m}/\text {s}^2\)) has been simulated, with an initial crank angle \(\theta \) (refer to Fig. 6a) of zero and null initial velocities.

The FG in Fig. 2 using dependent coordinates has been implemented and then solved numerically in an incremental fashion using a fixed lag smoother, when only the factors of the last \(N_w\) time steps (the window length) up to time t are considered, with discrete time t ranging from 0 up to 5 seconds with steps of \(dt=1~\text {ms}\). A Levenberg–Marquardt nonlinear iterative solver algorithm has been selected for these runs, with a maximum number of iterations of 15. Note, however, that virtually all calls to the optimizer required only 3 to 5 iterations until convergence.

Figure 7a, b shows the time history of the dependent generalized coordinates of points \(P_1\) and \(P_2\) and their corresponding velocities, respectively, obtained from both, the proposed FG-based approach and from MSC Adams. Both series accurately coincide; hence, a more detailed comparison is in order in Fig. 7c, d, where the differences in \({{\mathbf {q}}}\) and \({\dot{{\mathbf {q}}}}\) are reported between the proposed FG-based method and the ground truth. Similar good agreement (omitted for brevity’s sake) is obtained when adopting the independent coordinates FG, as explained in Fig. 3, thus verifying our two proposed schemes. The noise models used in all factors involved are summarized in Table 2, whose absolute values are not relevant and have no special physical meaning, since all that matters is their relative weight with respect to each other. Note that FG optimal solutions are robust against changes in those noise models by at least one order of magnitude; hence, there is no need for fine tuning those parameters. The fulfillment of the constraints \(|{\mathbf {\Phi }}({{\mathbf {q}}})|\) over time, illustrated in Fig. 8, is ensured by means of the addition of position constraint factors to the FG.

Fig. 8
figure 8

Evolution over time of the \(L_2\) norm of the constraints vector \({\mathbf {\Phi }}({{\mathbf {q}}})\) for the forward dynamics experiment for the four-bar linkage

In order to quantify the accuracy and computational cost of both schemes (dependent and independent-based FGs) and to explore the importance of the fixed lag window length parameter \(N_w\) for this problem, the root mean square error (RMSE) of the overall \({{\mathbf {q}}}\) trajectories is shown in Table 1 for window sizes varying from 2 to 10 time steps, together with the average execution time for each simulation time step measured in a single thread of an Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz. Three conclusions can be drawn from these data: (i) both schemes, dependent and independent coordinates, seem to provide comparable accuracy, (ii) the dependent coordinates FG is more efficient (which comes at no surprise since it has less variables and factors than the independent coordinate version), and (ii) using large window lengths does not help increasing the accuracy of the estimation. This latter point can be explained by the fact that the present experiments focus on forward dynamics only, not estimation from noisy sensor measurements, a case where it should be expected that larger windows lead to better results. This topic, however, falls out of the scope of the present paper, and it should be studied in the future. Finally, please note a summary of the noise models employed in this experiment is shown in Table 2.

Table 1 RMSE for \({{\mathbf {q}}}\) trajectories estimated by the dependent coordinates (DC) and independent coordinates (IC) factor graphs (FG) with respect to ground truth for the simulation described in the text for different window lengths of the fixed lag smoother estimator, together with the execution time of each time step
Table 2 Noise models (covariance matrices) employed in the forward dynamics simulations
Fig. 9
figure 9

Evaluation of the inverse dynamics test with the four-bar mechanism controlled with a torque at the input crank. Refer to discussion in Sect. 7.2

7.2 Inverse dynamics simulation

The FG proposed in Fig. 4 for inverse dynamic calculation is firstly put at test with the same four-bar mechanism described above. Here, the crank angle \(\theta \) trajectory over time is specified as an arbitrary smooth curve, and the goal is to solve for the unknown torque at the crank (point A) that would be required to balance gravity and inertial forces as accurately following the prescribed path.

In this case, we use a Levenberg–Marquardt nonlinear solver on the FG in Fig. 4 in batch mode, as opposite to filtering or fixed-lag smoother mode. However, the position problem is strongly nonlinear, preventing solvers to easily find the global optimal of the whole FG in Fig. 4 for arbitrary initial values for all variables. Instead, we propose attacking the inverse dynamics FG in the following four stages (running the batch optimizer once after each stage), which ensure that the global optimum is always easy to find by the nonlinear solver:

  • Stage 1: Include \({{\mathbf {q}}}\) variables, prior factors for the desired trajectory (\(f_{prior}\)), position constraint factors (\(f^d_{pc}\)), and soft equality factors (\(f_{eq}\)).

  • Stage 2: Add variables \({\dot{{\mathbf {q}}}}\), \({\ddot{{\mathbf {q}}}}\), and numerical integration factors \(f_{ti}\).

  • Stage 3: Add velocity constraint factors (\(f^d_{vc}\)).

  • Stage 4: Add generalized force variables \({{\mathbf {Q}}}\) and inverse dynamics factors (\(f^d_{idyn}\)).

Numerical results for this approach are shown in Fig. 9 for the four-bar mechanism, where the trajectories prescribed and actually followed by the mechanism crank are depicted in Fig. 9a for both the proposed approach and the commercial software MSC ADAMS. Both methods successfully solve the inverse dynamics problem by finding the torque sequence in Fig. 9b. In this case, both our method and MSC lead to an excellent stable following error, less than \(10^{-3}\) degrees, as shown in Fig. 9c.

Table 3 Summary of performance parameters for the inverse dynamics experiments
Fig. 10
figure 10

Second mechanism used as a benchmark for the proposed methods: a parallel planar robot with two degrees of freedom, modeled as 11 bodies with 14 points. The robot is actuated via two motors in points #1 and #12. Kinematic chain based on [39]

Secondly, we evaluated the inverse dynamics FG formulation against MSC.ADAMS on a more complex mechanism: a planar parallel robot featuring two degrees of freedom, based on [39]. It is controlled via two motors (unknown torques) at joints numbered as #1 and #12 in Fig. 10, and the goal is for the end-effector body to describe an “L”-like trajectory. Figure 11a shows the reference trajectory of the end-effector point #6, with a detailed view in Fig. 11b. As can be seen in the latter, the trajectory is accurately honored near the corner with a maximum following error of \(\sim 0.02~\text {mm}\). Unlike the case of the four-bar mechanism above, the fact that the reference trajectory is given for a coordinate (the end-effector) that does not match where actuators are implies that inverse kinematics must be solved first to obtain the desired trajectories (angles \(\theta _1\) and \(\theta _2\)) for the two motors at joints #1 and #12, respectively. The reference and the achieved trajectories for our method are represented in Fig. 11c. Regarding the inverse dynamics results themselves, the torques computed by MSC.ADAMS and our approach are plotted in Fig. 12. It is noteworthy that our approach achieves a smoother solution, finding optimal transitions with finite jerk (i.e., acceleration derivative) in contrast with the more noisy and unrealistic acceleration obtained from MSC, which would be more problematic to transfer into an actual physical device.

A summary of relevant data to evaluate the computational cost of these inverse dynamics experiments is found in Table 3.

8 Conclusions and future works

This work has settled the theoretical bases for formulating kinematics and dynamics problems from computational mechanics in the form of factor graphs and demonstrated the validity of our reference open-source implementation. The results showed the practical feasibility of the proposed approach and its accuracy in comparison with a commercial MBS software based on classical approaches. One additional advantage of our formalism lies at its excellent suitability to design state observers and parameter identification systems for arbitrary mechanical systems, ranging from robots to vehicles. This aspect will be part of future works.

Fig. 11
figure 11

Inverse dynamics results for the pick-and-place parallel robot mechanism. The reference (black dotted) and FG-based (solid blue) trajectories for point #6 are shown in (a)–(b). The reference angles for the two motors are shown in (c)

Fig. 12
figure 12

Inverse dynamics results for the pick-and-place parallel robot mechanism (continuation)