Convergence of Lobatto-type Runge–Kutta methods for partitioned differential-algebraic systems of index 2

In this paper a numerical scheme for partitioned systems of index 2 DAEs, such as those arising from nonholonomic mechanical problems, is proposed and its order for a certain class of Runge–Kutta methods we call of Lobatto-type is proven.


Introduction
Let N = R n and M ⊆ N a submanifold of codimM = m. Let M be defined as the null-set of φ : N → R m . A generic explicit differential equation on M can be recast into a semi-explicit index 2 differential algebraic equation (DAE) on N taking the form: ẏ = f (y, λ) 0 = φ(y) (1.1) where y ∈ N and λ ∈ V , with V a vector space such that dim V = m. Studies on the numerical solution of initial value problems (IVP) for such systems can be found in the bibliography that serves as foundation for this paper, such as [2] or [5]. We are interested in a subset of such problems, which will be referred to as partitioned, where y = (q, p), dim Q = n q , dim P = n p , and λ ∈ R m .
⎧ ⎨ ⎩q = f (q, p) p = g(q, p, λ) 0 = φ(q, p). (1.2) Such is the case of the equations of motion of nonholonomic mechanical systems which motivates our study, where dim Q = dim P = n (thus, in this case dim N = 2n). For a Hamiltonian function H : N → R and linear nonholonomic constraints μ α i (q)q i = 0, i = 1, . . . , n, α = 1, . . . , m, we get An IVP for this partitioned DAE is defined by an initial condition (q 0 , p 0 , λ 0 ) ∈ N | M × R m . The development and application of the methods shown here in the case of nonholonomic mechanical systems will be the subject of a follow-up paper where further numerical experiments will be performed [1].
For the remainder of the paper we will assume that f , g and φ are sufficiently differentiable and that D p φ D λ g (q, p, λ) remains invertible in a neighbourhood of the exact solution. Here D p means the derivative with respect to the p variables, and similarly with q and λ.

Lobatto-type methods
A numerical solution of an IVP for (1.1) can be found using an s-stage Runge-Kutta (RK) method with coefficients (a i j , b j ). Writing the corresponding equations is a relatively trivial matter, taking the form: Note that these l j are not given explicitly and must instead be solved for with the help from the constraint equations. In fact, provided the RK coefficients satisfy certain conditions, the second set of equations in (2.1a) and (2.1b) may be discarded. Now, a numerical solution of an IVP for Eq. (1.2) can also be found using an sstage partitioned RK method but already the correct application of such a scheme is non-trivial. One could naively write: Again, U j are not given explicitly and, as above, in some cases, it may also be possible to discard the third set of equations in (2.2a) and (2.2b). Such a system of equations may have certain issues, both from a solvability and a numerical convergence point of view. This is especially true for the particular case of partitioned RK methods that we will consider. In [5] the author considers RK methods satisfying the hypotheses: H1 implies that c 1 = s j=1 a 1 j = 0 and for Eq. (2.1) Y 1 = y 0 , 1 = λ 0 . H3 implies that y 1 = Y s , λ 1 = s . Furthermore, if the method is consistent, i.e. j b j = 1, then H3 implies c s = 1. For Eq. (2.2), if (ǎ i j ,b j ) also satisfies the hypotheses, then Q 1 = q 0 , 1 = λ 0 , Q s = q 1 and s = λ 1 . The most salient example of these methods is the Lobatto IIIA, which is a continuous collocation method.
The Lobatto IIIB is a family of discontinuous collocation methods which are symplectically conjugate to the IIIA methods. Two RK methods, (a i j , b j ) and (â i j ,b j ), are symplectically conjugate if they satisfy: Together they form the Lobatto IIIA-IIIB family of symplectic partitioned Runge-Kutta methods, which is precisely the one we want to study (see also [3,9]). Note that Lobatto IIIB methods do not satisfy hypotheses H1, H2 and H3. In fact, any symplectic conjugate method to a method satisfying those hypotheses must necessarily be such that: Obviously, the submatrixÃ := (â i j ) i, j≥2 is never invertible because of H1', and this is the culprit of the solvability issues of (2.2).
We shall consider a further compatibility hypothesis: H*ĉ i = c i for i = 1, . . . , s. This hypothesis ensures that the stages of both methods are concurrent. The Lobatto IIIA-IIIB family satisfies H*, but one should note that for s = 2,ĉ i = 2 j=1â i j . For such methods, we propose the following equations for the numerical solution of the partitioned IVP: together with 1 = λ 0 and s = λ 1 . From H1, we have that L 1 = p 0 , and from H3 and Eq. (2.3), L s = p 1 . The intuition behind the proposed scheme is that P i are not as good an approximation to the continuous p as Q i are to q, and in order to impose the constraint we need a more accurate estimate of p. Such new variables can be added to partitioned non-DAE systems, where they become decoupled and can be computed as a post-processing step to have better approximations of the p variables inside the interval. The auxiliary L i variables can be eliminated altogether, leading to It should be noted that, although similar, these methods do not generally coincide with the SPARK methods proposed by L. O. Jay in [7]. (After talking to prof. Jay, he noted the approach suggested here is similar to that of Murua [8] which was not previously known by the author.) There are several simplifying assumptions that a given RK scheme satisfies: When referring to these assumptions for a RK method (â i j ,b i ) we will write them as X (ŷ). Note that if (a i j , b i ) and (â i j ,b i ) are two symplectically conjugate methods, each satisfying the symplifying assumptions B( p), C(q), D(r ) and B(p), C(q), D(r ) thenp = p, C(q) impliesr = q, and conversely D(r ) impliesq = r . Also in the symplectically conjugate case, whenever both p, r ≥ 1, H* is automatically satisfied. Apart from these, there are a few more simplifying assumptions that pairs of compatible methods satisfy (see [6]): Lastly, there is a function associated to a RK method that we need to define. Consider the linear problemẏ = λy, and apply one step of the given method for an initial value y 0 . The function R(z) defined by y 1 = R(hλ)y 0 is the so-called stability function of the method.
For an arbitrary RK method we have that . . , b s ) T and 1 s = s (1, . . . , 1) T . In the particular case of a method satisfying that a s j = b j , i.e. e T s A = b T , which is the case of Lobatto methods, this can be reduced to:

Existence, uniqueness and influence of perturbations
Before proceeding, let us introduce the following notational conventions: We will mainly be concerned with derivatives of the functions that define our partitioned vector field and the constraint, Eq. (1.2), namely f , g and φ, evaluated at the different stages of our RK methods, (2.5) and (2.5). For this we define: where D q f 1 = D q f (Q 1 , P 1 ) and the tilde indicates the elimination of the derivative with respect to the first stage. The same applies to derivatives of g and φ, where the latter will be evaluated at (Q i , L i ). In analogy with our notation for A andÂ, we write Similar use of tilde for the elimination of the first stage components applies also to other vectors such as δ Q , δ P , θ in Theorem 3.2.
, a set of h-dependent starting values, and assume: Assume also that the Runge-Kutta coefficients A verify the hypotheses H1 and H2, and thatÂ satisfies H1' and H*. Then, for h ≤ h 0 , there exists a locally unique solution to: Proof The proof of existence differs little from what is already offered in [2] (for invertible A matrix) or [5] (for A satisfying the hypotheses H1 and H2). The idea is to consider a homotopic deformation of the equations which leads to a system of differential equations where the existence of a solution for the corresponding IVP implies the existence of a solution to the original system.
The key of the proof is the use of the invertibility of D p φ(Ã ⊗ I )D λ g, which is a term arising from the constraint equation. As stated in the former section, if the system were described by Eq. (2.2) we would instead have D p φ(Ã ⊗ I )D λ g, which is not invertible by H1', rendering the system unsolvable.
The proof of uniqueness remains essentially the same.
withˆ 1 =λ 0 , and where δ Q,i , δ P,i and θ i are perturbation terms. Additionally, assume that:q Then, using the notation X :=X − X and X := max i X i , for small h we have: Proof To tackle this problem we first subtract Eq. (3.1) from Eq. (3.2) and linearize. If we temporarily introduce back the auxiliary L i variables defined in Eq. (2.5), we get: We can thus rewrite the system as Thus, isolating the rest of the variables, we can rewrite the system as Our mission is to solve for , but due to the singularity of A it will not be possible to solve for the entire vector and instead we will solve only for ˜ . We will first insert Q and L in the constraint equation. This eliminates the latter variables from the analysis.
From the hypotheses we have that D pφ Ã ⊗ I n D λg is invertible, so we can indeed solve for ˜ in terms of Q, P and 1 .
Substituting this back into the equations we can read off the results from the theorem.

Lemma 3.1 In addition to the hypotheses of Theorem 3.1, suppose that C(q),Ĉ(q) and that
Then the solution of Eq. (3.1), Q i , P i and i satisfies: , and D Q (i) , DP (i) and D (i) are functions composed by products of the derivatives of f , g and φ of i-th order evaluated at (q 0 , p 0 , λ 0 (q 0 , p 0 )).
Proof Following [5], Lemma 4.3, we can use the implicit function theorem to obtain where we have made use of the fact that λ 0 = O(h κ ). What remains is to compute δ Q , δ L , δ P to obtain the result we are after.
Inserting the exact solution into Eq. (3.2) we obtain: Now, expanding in Taylor series about t 0 and taking into account that for a RK satisfying C(q) we have that we get: Finally, we obtain: which proves our lemma.

Remark 3.1
For the Lobatto IIIA-B methods we have thatq + 2 = Q = q = s and this result implies that: with κ ≥ 1. Then we have: Proof This proof follows closely that of [5], Theorem 4.4. The idea is to take the results from Theorem 3.2, insert them in Eqs. (3.4a) and (3.4c) and solve for Q and P. Later, we insert the results into where ˜ is a function of (Q, P, 1 ,Q,P,ˆ 1 ), and perform a Taylor expansion of each term. Just as in [5], the important result here is the h m+2 factor in front of λ 0 , which means that we need to pay special attention to 1 . In our case λ 1 = s coincides with Z s in [5] of the same theorem without changes. The differences appear in the rest of the components, where having two sets of RK coefficients makes the Taylor expansion of the terms and the tracking of each component much more difficult.
From here on we will forget about all terms except for the ones with 1 , as the rest vary little from what was found in theorem 3.2 and they can be easily obtained, thus barring the need to carry them around any longer.
Inserting Eq. (3.5) into Eq. (3.10) we can collect all terms multiplying by P, Q and 1 separately. Let where i = q, p. Thus, we can write ) + · · · (3.12) Similar groupings can be done for P after inserting Eq. (3.5) into Eq. (3.4c) ] + · · · (3.13) In order to account for the implicit dependence on 1 in Eq. (3.12), we need to solve the system formed by Eq. (3.13) together with (3.14) Before solving that system, let us first expand the terms multiplying 1 . Insidẽ X p , we find the product D pφ Ã ⊗ I n D λg Ã ⊗ I m −1 composed of: where, from Lemma 3.1, χ = min(μ, ν) and ω = min(μ, ν, ξ). This results in: Inversion of this product can be carried out as a Taylor expansion resulting in a so-called von Neumann series (I − T ) −1 = ∞ i=0 T i . To do this, first, let us rewrite the former expression as In the last line we have introduced the multi-index α of dim α = 2, i.e. α ∈ N 2 . We will denote the k-th component of α by α (k) . If α = (i, j), where now β can be interpreted as: -a multi-index of even dimension, dim β ≤ 2|β|, such that for i odd β (i) and β (i+1) are never both 0, or -a multi-index of α-multi-indices, i.e. β = (α 1 , . . . , α n ), running for 1 ≤ n ≤ |β|, and 1 ≤ |α i | ≤ |β|, the inverse of this expression may be written as We can sandwich the expression between D λg Ã ⊗ I m and D pφ to obtain: where: with γ multi-index of dim γ ≤ 2ω even, and such that for i even, γ (i) and γ (i+1) are never both 0.
. Furthermore, their contributions have opposite signs and therefore, they vanish. Thus, the only terms of the expansion that can survive are those such that γ (dim γ −1) = 0, γ (dim γ ) = 0 and this in turn means that all surviving RK symbol combinations R γ must end inÃ −1 ( no freeC at the end).
Using the notation and with our previous analysis, the expansion of the terms explicitly multiplying 1 in Eq. (3.13) takes the form: where O ρ is a term composed by multiplication of and derivatives of g and φ evaluated at the initial condition.
For the remaining expansions we do not need to be as precise as with this last one as there will not be cancellations due to signs. Thus, we will only care about the different symbol combinations that arise. Now, we need to solve the system formed by Eqs. (3.14) and (3.13). This involves inverting a matrix of the form .

If we write
where the matrices J , K , M, N are O(h), then its inverse is: The only terms we are interested in are U and W , as those are the only ones that connect with 1 . The Taylor expansion of any of these terms is a daunting task given the amount of nested expansions involved, but it will suffice to see which types of symbol combinations can appear.
We can see that U = (1 − J ) −1 K W . This means that once we know the behaviour of W , the behaviour of U will be easy to derive. Also from this, we can see that all the resulting symbol combinations of U must start with the coefficient matrix A, while for W they must start with the coefficient matrixÂ with the exception of the zero-th order term.
Given the expression of W , its expansion must contain A, C,Â or A − C A to the right ofÂ. As for U , all the terms in W will show up multiplied by (1 − J ) −1 K . The symbols this factor adds at order n are A × [(n − 1)-element variations of {A, C}], thus, no A − C A can appear in U until after the firstÂ coming from W is in place.
Finally we can go back to Eq. (3.12). Expanding the terms involving Q and P is essentially the same as expandingX q andX p , the latter of which we have already done and not much differs. It is important to note that as we are multiplying those terms with A on the right, we will always have one A − less than the number of As, which prevents AC k A − terms from appearing at the very end of a symbol combination.
Performing a Taylor expansion of q 1 in terms of Q and P we get whereH i are linear combinations of derivatives of g and f evaluated at the initial condition. Doing the same for p 1 , we get more variety, whereJ α,β are terms involving and derivatives of f , g and φ evaluated at the initial condition. The main difference here is that we can have Cs to the left of the first A, as well as the possibility of having C → A − C A substitutions to its right. Putting everything together, and keeping in mind that ω = min(μ, ν, ξ), the respective 1 -dependent terms resulting from the expansion of Eqs. (3.9) and (3.10) can be brought to the form: where eachŪ j,α i is again a combination of products of the derivatives of f , g, φ with evaluated at the initial condition, andK j,α i is a RK symbol combination of order |α i | as in Lemma 3.2. The difference betweenK Q,α i andK p,α i lies in the fact that K Q,α i cannot begin with C i and there cannot be C → A − C A substitutions between the initial A and the firstÂ, while onK p,α i there can be. Applying the result of Lemma 3.2, all these terms vanish which is what we set out to prove.

where M i and N i can be C, A,Â, A − C A, AC A − for any i except k where M k = AC A − cannot occur.
Proof Multiplying D(r ) by A − we may obtain that: As A satisfies H3, we also have that e T s A = b T , and consequently b T A − = e T s . The vanishing of the different symbol terms rests in both the vanishing of the following reduced combinations and the fact that any symbol combination that appears in the expansion can be brought to one of these.
This is said to be of order k − 1, as that is the number of times C appears. It vanishes because: The application of the simplifying assumptionD(r ) in the second line and D(r ) in the fourth line are the limiting factors for the order.
This is said to be of order k, as that is the number of times C appears Again, the application of the simplifying assumption D(r ) in the first line is the limiting factor for the order. (3.20c) Proof The proof of this theorem is similar to that of [5], Theorem 5.1, which follows that of [2], Theorem 5.9, and [4], Theorem 8.10. The arguments are essentially the same as those used in [2] for A invertible, but using a bi-colored tree extension (see Fig. 1). The inverses that appear only need to be swapped by A − . In these results two trees are used, t and u trees, referring to y and z equations respectively. In our case we will have both t Q and t P for Q and P equations, plus u for λ equations.
The key difference with respect to both this and [5] is that instead of only needing to set the limit such that for [t, u] y either t or u are above the maximum reduction order by C(q) (q + 1 and q − 1), which leads to 2q, we need to be careful because we have two types of trees and two simplifying assumptions C(q) andĈ(r ). First of all, it is impossible to have [t Q , u] Q as f does not depend on λ, and we can only have [t Q , t P ] Q , which pushes the limit to q + r + 2. On the other hand, [t P , u] P also sets a limit, which as it turns out is q + r . For both there is also the limit q + r + 1 set by D(r ), which is more restrictive than the limit set for Q equations but less so than the limit set for P, so this last one prevails. Theorem 3.5 Consider the IVP posed by the partitioned differential-algebraic system of Eq. (1.2), together with consistent initial values and the Runge-Kutta method (2.6). In addition to the hypotheses of Theorem 3.4, suppose that |R A (∞)| ≤ 1 and q ≥ 1 if R A (∞) = 1. Then, for t n − t 0 = nh ≤ C, where C is a constant, the global error satisfies: (3.21c) Proof Following the steps of [5], Theorem 5.2, for |R A (∞)| < 1 and |R A (∞)| = 1, λ n − λ(t n ) can be found to be of order O(h q ) and O(h q−1 ) respectively. As stated there, the result for R A (∞) = −1 can actually be improved to O(h q ) by considering a perturbed asymptotic expansion. Now, we proceed as in [4], Theorem VI.7.5, applying (3.8)(3.6)(3.7) to two neighbouring RK solutions, q n ,p n ,λ n and q n ,p n ,λ n , with δ i = 0, θ = 0. Using the notation x n =x n −x n , we can write: where 1,n and 2,n are the projectors defined in the statement of Theorem 3.3, evaluated atq n ,p n ,λ n , and m = min We can follow the same philosophy of [2], Lemma 4.5, and try to relate { q n , p n , λ n } with { q 0 , p 0 , λ 0 }. For this, we make use of the fact that i,n+1 = i,n + O(h), 2,k 2 = 2,k and 2,k 1,k = 0 (these latter facts can be readily derived from their definition).
This leads to: Thus, the error estimates become: Proceeding as in [2] to use the Lady Windermere's Fan construction and using the results from Theorem 3.4 for δq h (t k ), δ p h (t k ), and the results we derived for δλ h (t k ), with m = min(q − 1, r , p − q, p − r ) for −1 ≤ R A (∞) < 1 as well as m = min(q − 2, r , p − q, p − r ) for R A (∞) = 1, we find the global error by addition of local errors, which gives the result we were looking for.

Numerical experiment: nonholonomic particle in an harmonic potential
For the purposes of testing, we performed a series of simulations of a system with the Lobatto IIIA-B family with 2, 3, 4 and 5 stages and different values of the step h.
Simulations with h = 1e-4 were taken as ground truth and we produce log-log plots of the error w.r.t. ground truth versus step to check the order. The system in question is known as the nonholonomic particle in a harmonic potential, a classic academic example. Its equations are of the form of Eq. (1.3). We have N = Q × P, Q = P = R 3 , (x, y, z) ∈ Q representing position, ( p x , p y , p z ) ∈ P representing canonical momenta, λ ∈ R, and Hamiltonian (energy) and constraint H (x, y, z, p x , p y , p z ) = 1 2 ( p 2 x + p 2 y + p 2 z ) + φ(x, y, z, p x , p y , p z ) = p z − yp x The initial condition chosen for the experiments is (1, 0, 0) ∈ Q, (0, 1, 0) ∈ P. As it can be seen in Fig. 2, the numerical order obtained coincides with the expected one.

Conclusion
In this paper we have proposed a new numerical scheme for partitioned index 2 DAEs proving its order. The method opens the possibility to construct high-order methods for nonholonomic systems in a systematic way, preserving the nonholonomic constraints exactly. So far, geometric methods to numerically integrate a given nonholonomic system were constructed using discrete gradient techniques or modifications of variational integrators based on discrete versions of the Lagrange-d'Alembert's principle. Integrators in the latter category, in which our method falls, tend to display a certain amount of arbitrariness, particularly in the way constraints are discretized or imposed. In most cases, with the exception of SPARK methods [7], the resulting methods are limited to low order unless composition is applied, and without a general framework for error analysis. However, our methods offer a clear and natural way to construct them to arbitrary order. Further considerations about our construction, particularly with respect to its interpretation will be left for [1].