1 Introduction

Higher-index DAEs represent not only integration problems but also differentiation problems (see, e.g., [4, 22, 23, 28]). Therefore, it seems worthwhile to solve an associated ODE with classical integration schemes and the differentiation problems using Automatic Differentiation (AD). However, depending on the structure, differentiations and integrations may be intertwined in such a complex manner that this plausible idea may be difficult to realize in general.

In this context, different approaches have been considered for DAEs in order to combine AD with ODE integration schemes. By construction, the approaches are based on the corresponding index definitions and therefore lead to quite different algorithms.

  • In [26, 27] and the related work, the structural index was used to determine the degree of freedom.

  • In [13], we used the tractability matrix sequence to solve the inherent ODE for DAEs of index up to two. The generalization to higher-index DAEs seemed rather complicated.

  • In [17], we described briefly how an approach based on the differentiation index defined in [14, 16] leads to explicit Taylor series methods for DAEs. An analysis of the corresponding projected explicit ODE can be found in [18]. These methods can be considered as projected Taylor series methods.

In this paper, we analyze more general classes of these projected Taylor series methods. In particular, we discuss how projected implicit Taylor series methods can be defined for DAEs, generalizing the approach from [17]. Here we focus on the methods from [7, 21].

The main idea in this context is that the computation of Taylor coefficients of a solution of an implicit ODE can be considered as the solution of a nonlinear system of equations. In this sense, we will see that a generalization for DAEs can be obtained by solving a nonlinear optimization problem [16]. The obtained solution corresponds to a projected method. There are several advantages of this approach:

  • We assume only weak structural properties of the DAE (1), such that ODEs and semi-explicit DAEs are simple special cases. Theoretically, we can consider DAEs of any index.

  • An explicit description of the inherent dynamics is not required for the algorithmic realization. Indeed, due to our formulation as an optimization problem, implicitly we consider the orthogonally projected explicit ODE introduced in [18].

  • We can use higher-order integration schemes, including schemes suitable for stiff ODEs/DAEs.

The described methods were implemented in a prototype and first numerical tests for DAEs up to index 5 and integration schemes up to order 8 were successful.

The purpose of our implementation is the improvement of our diagnosis software [17]. Therefore, we do not focus on high performance, but on information about the reliability of the numerical result, especially with regard to higher-index issues and the diagnosis of singularities as in [19]. Due to their stability and order properties, the new higher-order methods presented here are a considerable improvement over [17], where only explicit Taylor series methods were considered.

The paper is organized as follows. In Section 2, we introduce DAEs and summarize some of our previous results that are crucial for the approach presented here.

The properties of Taylor coefficients in the derivative array of a DAE are discussed in Section 3. Using these properties, in Section 4, we define the general projected explicit/implicit Taylor series methods that are, indeed, a generalization of the explicit method from [17].

Within this framework, in Section 5, we present four different types of projected methods: explicit methods, fully implicit methods, two-halfstep (TH) schemes, and higher-order Padé (HOP) schemes.

The properties of the considered optimization problems that provide the projection are discussed in Section 6 and some practical considerations for the implementation are addressed in Section 7.

Our prototype implementation of the proposed projected methods for DAEs is tested on several well-known examples and benchmarks from the literature in Section 8. An outlook discussing directions for further investigations concludes this paper.

For completeness, in the Appendix, we summarize the stability functions and stability regions for the considered Taylor series methods, since they are essential for the development of HOP methods. We also summarize some linear algebra results for decoupling DAEs and provide the DAE formulation of the tested examples resulting from servo-constraint problems for multi-body systems.

2 DAEs: index, consistent values, and decoupling

In this article, we consider general DAEs:

$$ f(x^{\prime},x,t)=0, $$
(1)

for \(f: \mathcal G_{f} \rightarrow {\mathbb {R}}^{n}\), \(\mathcal G_{f} \subset {\mathbb {R}}^{n} \times {\mathbb {R}}^{n} \times {\mathbb {R}}\), where the partial Jacobian \(f_{x^{\prime }}\) is singular and \(\ker f_{x^{\prime }}(x^{\prime },x,t)\) is constant. For our purposes, we define the constant orthogonal projector Q onto \(\ker f_{x^{\prime }}\) as well as the complementary orthogonal projector P := I − Q.
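
For constant \(f_{x^{\prime}}\), these projectors can be computed from an orthonormal null-space basis. A minimal sketch in Python (the helper name orthogonal_projectors is illustrative, not part of InitDAE):

```python
import numpy as np
from scipy.linalg import null_space

def orthogonal_projectors(A):
    """Return (Q, P): Q is the orthogonal projector onto ker(A),
    P = I - Q its orthogonal complement."""
    N = null_space(A)                  # orthonormal basis of ker(A)
    Q = N @ N.T
    P = np.eye(A.shape[1]) - Q
    return Q, P

# leading matrix of the index-4 DAE (23) from Example 1 below:
A = np.array([[1., 0., 0., 0., 0.],
              [0., 0., 1., 0., 0.],
              [0., 0., 0., 1., 0.],
              [0., 0., 0., 0., 1.],
              [0., 0., 0., 0., 0.]])
Q, P = orthogonal_projectors(A)        # yields P = diag(1, 0, 1, 1, 1)
```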

Recall that the singularity of \( f_{x^{\prime }}\) means that (1) contains derivative-free equations, called explicit constraints, and that the differentiation of (1) may lead to further derivative-free equations, called hidden constraints. A consistent initial value x0 has to fulfill all explicit and hidden constraints. The characterization of all these constraints motivated the following definition of the differentiation index, cf. [4].

Definition 1

[14] The differentiation index is the smallest integer μ such that:

$$ \begin{array}{@{}rcl@{}} f(x^{\prime},x,t)&=&0, \end{array} $$
(2)
$$ \begin{array}{@{}rcl@{}} \frac{d}{dt} f(x^{\prime},x,t)&=&0, \end{array} $$
(3)
$$ \begin{array}{@{}rcl@{}} &\vdots& \\ \frac{d^{\mu-1}}{dt^{\mu-1}}f(x^{\prime},x,t)&=&0, \end{array} $$
(4)

uniquely determines Qx as a function of (Px,t).

If μ is the differentiation index according to Definition 1, then the conventional differentiation index (see, e.g., [4]) turns out to be μ as well. According to this definition, in the following we never prescribe initial values for Qx0, since we may compute Qx0 by evaluating a function at (Px0,t0). Moreover, in the higher-index case, Eqs. (2)–(4) contain explicit and hidden constraints that restrict the choice of Px0.

According to [16], for an initial guess \(\alpha \in {\mathbb {R}}^{n}\), consistent initial values x0 can be computed solving the following constrained optimization problem:

$$ \begin{array}{@{}rcl@{}} \quad \min &\quad \left\| P(x_{0} - \alpha) \right\|_{2} \end{array} $$
(5)
$$ \begin{array}{@{}rcl@{}} \text{subject to} & \text{ all explicit and hidden constraints}. \end{array} $$
(6)

Equivalently, we can solve the system of equations:

$$ \begin{array}{@{}rcl@{}} & \quad \quad {\Pi} (x_{0}-\alpha)=0 \end{array} $$
(7)
$$ \begin{array}{@{}rcl@{}} &\text{all explicit and hidden constraints} \end{array} $$
(8)

for a suitable orthogonal projector Π with rank Π = d ≤ rank P, where d is the number of degrees of freedom of (1), cf. [15]. However, in particular for nonlinear DAEs, Π may be difficult to compute; therefore, in practice, (5) may be more convenient than (7), cf. the Appendix. The approach (5)–(6) was implemented in InitDAE, a Python program to determine the index and consistent initial values of DAEs [12, 17]; a simplified illustration is sketched below.
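
To make (5)–(6) concrete, the following minimal sketch computes a consistent initial value for the index-3 pendulum used later in Section 8.2.1 (m = l = g = 1, positive y axis upwards, (x1,x2) position, (x3,x4) velocity, x5 the Lagrange parameter). In contrast to InitDAE, which obtains the hidden constraints automatically from the derivative array, they are derived by hand here; all names are illustrative:

```python
import numpy as np
from scipy.optimize import minimize

def constraints(x):
    """Explicit constraint of the pendulum and the two hidden constraints
    obtained by differentiating it twice along the equations of motion
    x3' = -x1*x5, x4' = -x2*x5 - 1."""
    x1, x2, x3, x4, lam = x
    return np.array([
        x1**2 + x2**2 - 1.0,                        # position level
        x1*x3 + x2*x4,                              # velocity level (hidden)
        x3**2 + x4**2 - lam*(x1**2 + x2**2) - x2,   # acceleration level (hidden)
    ])

# P projects onto the differentiated components; the multiplier x5
# carries no derivative, so it spans ker f_{x'}
P = np.diag([1.0, 1.0, 1.0, 1.0, 0.0])

def consistent_initial_value(alpha):
    obj = lambda x: np.linalg.norm(P @ (x - alpha))**2   # squared for smoothness
    res = minimize(obj, alpha, method="SLSQP",
                   constraints={"type": "eq", "fun": constraints},
                   options={"ftol": 1e-12})
    return res.x

# at rest, 45 degrees to the vertical, as in Section 8.2.1
s = np.sin(np.pi / 4)
alpha = np.array([s, -s, 0.0, 0.0, 0.0])
print(consistent_initial_value(alpha))   # lambda is adjusted to approx. 0.7071
```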

Furthermore, for linear DAEs of the form:

$$ A(t)x^{\prime}+B(t)x=q(t), $$
(9)

in [18], it was shown that using the derivative array (2)–(4), the decoupling:

$$ \begin{array}{@{}rcl@{}} ({\Pi} x)'&=&\varphi_{1}\left( {\Pi} x, q, q^{\prime}, \ldots, q^{(\mu)}\right), \end{array} $$
(10)
$$ \begin{array}{@{}rcl@{}} (I-{\Pi}) x &=&\varphi_{2}\left( {\Pi} x, q, q^{\prime}, \ldots, q^{(\mu-1)}\right), \end{array} $$
(11)

can be obtained for suitable functions φ1, φ2, where (10) is an ODE in the invariant subspace im Π. Therefore, theoretically, we can set up (10) and solve it with an integration scheme for ODEs. Subsequently, (I − Π)x can be computed at each time point using (11). Notice that, in doing so, the error in (I − Π)x depends only on the error made solving (10) and the properties of φ2 from (11). Moreover, the values of (I − Π)x at previous time points do not influence (10).
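
For illustration (a toy example in the above notation, not taken from [18]), consider the index-2 DAE

$$ \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} x^{\prime} + \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} x = \begin{pmatrix} q_{1} \\ q_{2} \end{pmatrix}. $$

Here \(\ker f_{x^{\prime}}\) is spanned by \(e_{1}\), the second equation yields x2 = q2, and one differentiation gives x1 = q1 − q2′, so that μ = 2, Π = 0, and d = 0. The ODE (10) is void, and (11) reads \(x = \varphi_{2}(q,q^{\prime}) = (q_{1}-q_{2}^{\prime}, q_{2})^{T}\), which makes the differentiation problem contained in the DAE explicit.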

For nonlinear DAEs, analogous considerations can be undertaken considering the linearization along a solution. However, of course, the properties of f are decisive in practice. Since a detailed analysis of the nonlinear case goes far beyond the scope of this article, we focus on a general formulation of projected Taylor methods using (5)–(6), keeping in mind that, at present, the theoretical basis has been developed for linear DAEs only. At least, the numerical tests from Section 8 suggest the applicability to some classes of nonlinear DAEs.

3 Taylor series and DAEs

Since we wish to analyze one-step methods, we consider the computation of an approximation of the solution x(t) of the ODE/DAE (1) at time tj+1, given an approximation of the solution at time-point tj. Consequently, in order to describe our method in terms of Taylor expansion coefficients, for \(k_{c} \in {\mathbb {N}}\), we suppose that a suitable approximation:

$$ [(c_{0})_{j}, (c_{1})_{j}, (c_{2})_{j},\ldots, (c_{k_{c}})_{j} ] \approx [ x(t_{j}), x^{\prime}(t_{j}), \frac{1}{2}x^{\prime\prime}(t_{j}), \ldots, \frac{1}{k_{c}!}x^{(k_{c})}(t_{j})] $$
(12)

is given and that we look at adequate methods to compute:

$$ [(c_{0})_{j+1},\! (c_{1})_{j+1}, \ldots,\! (c_{k_{c}})_{j+1} ] \!\approx\! [ x(t_{j+1}), x^{\prime}(t_{j+1}),\! \frac{1}{2}x^{\prime\prime}(t_{j+1}),\! \ldots, \frac{1}{k_{c}!}x^{(k_{c})}(t_{j+1})]. $$

If we suppose that the ODE/DAE is described by (1), we require that:

$$ f((c_{1})_{j},(c_{0})_{j},t_{j})=0 \quad \text{and} \quad f((c_{1})_{j+1},(c_{0})_{j+1},t_{j+1})=0 $$

are fulfilled. For our purposes, given K ≥ 1, we further consider the order-(K − 1) derivative array [4] containing (1) and K − 1 derivatives of (1):

$$ \begin{array}{@{}rcl@{}} \begin{pmatrix} f(x^{\prime},x,t) \\ \frac{d}{dt}f(x^{\prime},x,t) \\ \frac{d^{2}}{dt^{2}}f(x^{\prime},x,t)\\ \vdots\\ \frac{d^{(K-1)}}{dt^{(K-1)}}f(x^{\prime},x,t)\\ \end{pmatrix}= \begin{pmatrix} f(x^{\prime},x,t) \\ f_{x^{\prime}}(x^{\prime},x,t)x^{\prime\prime}+ f_{x}(x^{\prime},x,t)x^{\prime} + f_{t}(x^{\prime},x,t)\\ \vdots\\ \vdots\\ \\ \end{pmatrix}. \end{array} $$
(13)

Therefore, we suppose that for the corresponding function:

$$ \begin{array}{@{}rcl@{}} \!\!\!\!\!\!\!\!\!r(c_{0}, c_{1}, \ldots, c_{K},t)&:=&\!\begin{pmatrix} r_{0}(c_{0},c_{1},t)\\ r_{1}(c_{0},c_{1},c_{2},t)\\ \vdots\\ r_{K-1}(c_{0},c_{1}, \ldots,c_{K},t) \end{pmatrix} \\ &:=&\!\begin{pmatrix} f(c_{1},c_{0},t)\\ 2 f_{x^{\prime}}(c_{1},c_{0},t)c_{2} +f_{x}(c_{1},c_{0},t)c_{1}+ f_{t}(c_{1},c_{0},t)\\ \vdots \end{pmatrix} \end{array} $$
(14)

it holds:

$$ r((c_{0})_{j}, (c_{1})_{j}, (c_{2})_{j},\ldots, (c_{K})_{j} ,t_{j})=0 $$
(15)

and

$$ r((c_{0})_{j+1}, (c_{1})_{j+1}, (c_{2})_{j+1},\ldots, (c_{K})_{j+1},t_{j+1})=0 . $$
(16)

In practice, the function r can be provided by automatic differentiation (AD) [17, 31].
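
To illustrate what such an r looks like, the following sketch evaluates (14) by truncated Taylor series arithmetic for the illustrative scalar equation f(x′,x,t) = x′ + x² − t, using plain numpy instead of an AD tool; here rℓ is ℓ! times the ℓ-th Taylor coefficient of t ↦ f(x′(t),x(t),t):

```python
import numpy as np
from math import factorial

def r_taylor(c, t_j):
    """Derivative array function r from (14) for f(x', x, t) = x' + x**2 - t.
    c = [c_0, ..., c_K]: Taylor coefficients of x at t_j;
    returns [r_0, ..., r_{K-1}]."""
    K = len(c) - 1
    x  = np.asarray(c[:K], dtype=float)                    # truncated series of x
    xp = np.array([(l + 1) * c[l + 1] for l in range(K)])  # series of x'
    t  = np.zeros(K)
    t[0] = t_j
    if K > 1:
        t[1] = 1.0                                         # series of t about t_j
    x_sq = np.convolve(x, x)[:K]                           # truncated series of x**2
    f_series = xp + x_sq - t                               # series of f along x
    return np.array([factorial(l) * f_series[l] for l in range(K)])

# r_0 = c_1 + c_0**2 - t_j and r_1 = 2*c_2 + 2*c_0*c_1 - 1, cf. (14)
print(r_taylor([1.0, 0.5, 0.25, 0.1], t_j=0.0))            # [1.5, 0.5, 2.1]
```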

Using this notation, the index from Definition 1 is the smallest integer μ such that for K = μ, the derivative array r uniquely determines Qc0 as a function of (Pc0,t). For this purpose, the 1-fullness of the Jacobian of r is verified in [16, 17], cf. the Appendix. We emphasize that the main difference to the conventional differentiation index [4] is precisely that for 1-fullness the columns corresponding to c0 (and not c1) are considered. With this index definition in mind, we can define consistency for the Taylor coefficients.

Definition 2

For K ≥ μ and 0 ≤ kc ≤ K − μ, the Taylor coefficients up to kc are consistent if they are in the set:

$$ \mathcal{T}_{k_{c}}^{j}:=\left\{ [(c_{0})_{j}, (c_{1})_{j},\ldots, (c_{k_{c}})_{j} ] \in {\mathbb{R}}^{(k_{c}+1) \cdot n}\biggm| \begin{array}{l} \text{ There exist } (\tilde{c}_{k_{c}+1})_{j},\ldots, (\tilde{c}_{K})_{j} \text{ such that }\\ r((c_{0})_{j}, \ldots, (c_{k_{c}})_{j}, (\tilde{c}_{k_{c}+1})_{j}, \ldots, (\tilde{c}_{K})_{j} ,t_{j})=0 \end{array} \right\} $$

Note that \(\mathcal {T}_{0}^{0}\) corresponds to the set of consistent initial values and that if sufficient smoothness of f is given, we can suppose that for all \(c_{0} \in \mathcal {T}_{0}^{0}\) in regularity regions, there is a unique solution of the initial value problem. For a discussion of regularity regions and singularities within a projector-based analysis, we refer to [14, 23].

For sufficiently smooth regular linear DAEs (9), Theorem 1 from [18] implies that for any \(c_{0} \in \mathcal {T}_{0}^{0}\), there is a unique solution fulfilling x(t0) = c0. Analogously, for \([(c_{0})_{j}, (c_{1})_{j},\ldots , (c_{k_{c}})_{j} ] \in \mathcal {T}_{k_{c}}^{j}\), there exists a unique solution x(t) such that \( \frac {x^{(k)}}{k !}(t_{j}) =(c_{k})_{j}\), 0 ≤ k ≤ kc. Indeed, Theorem 1 from [18] provides a general description of the inherent dynamics in terms of the associated orthogonally projected explicit ODE (10).

Let us focus on the relation between kc, K, the DAE-index μ and the computation of consistent initial values and Taylor coefficients at a particular t0 considering:

$$ \begin{array}{@{}rcl@{}} r((c_{0})_{0}, (c_{1})_{0}, (c_{2})_{0},\ldots, (c_{K})_{0},t_{0})=0. \end{array} $$
(17)
  • For initial value problems x(t0) = c0 for ODEs \(g(x^{\prime }(t),x(t),t)=0\) with regular \(g_{x^{\prime }}\), if we consider (17), then we can compute K consistent coefficients:

    $$ (c_{1})_{0}, (c_{2})_{0}, (c_{3})_{0},\ldots, (c_{K})_{0}, $$

    at t0, since c0 is given. In the above notation, the maximal value for kc is kc = K.

  • If we consider a uniquely solvable nonlinear time-dependent equation g(x(t),t) = 0 with regular gx and a corresponding system of equations (17), then, at t0, we cannot prescribe c0 and therefore compute the K consistent coefficients:

    $$ (c_{0})_{0}, (c_{1})_{0}, (c_{2})_{0},\ldots, (c_{K-1})_{0}. $$

    In this case, the maximal value is kc = K − 1. For the coefficients (cK)0, no equations are given, since in this particular case, they do not appear in (17). Note that in principle, g(x,t) = 0 can be considered an index-1 DAE. In this sense, it fits into the case below.

  • For DAEs (1), if we consider (17) and fix the free initial conditions of c0, then in general, we may compute K + 1 − μ consistent coefficients:

    $$ \begin{array}{@{}rcl@{}} (c_{0})_{0}, (c_{1})_{0}, (c_{2})_{0},\ldots, (c_{K-\mu})_{0}, \end{array} $$
    (18)

    cf. [17]. In this case, we have at most kc = K − μ. In general, the coefficients cK+1−μ,…,cK cannot be computed considering (17). Another crucial aspect is that not all components of c0 can be prescribed, since all the constraints have to be satisfied.

Note that according to (5)–(6), for an arbitrary initial guess α that, in general, may not be consistent, the optimization problem:

$$ \begin{array}{@{}rcl@{}} \min &\quad \left\| P\left( (c_{0})_{0}-\alpha \right) \right\|_{2} \end{array} $$
(19)
$$ \begin{array}{@{}rcl@{}} \text{subject to} & \quad r((c_{0})_{0}, (c_{1})_{0}, (c_{2})_{0},\ldots, (c_{K})_{0},t_{0})=0, \end{array} $$
(20)

provides consistent initial values (18). Moreover, in terms of (7)–(8), this minimization problem is equivalent to the system of equations:

$$ \begin{array}{@{}rcl@{}} {\Pi} \left( (c_{0})_{0}-\alpha\right) &=& 0, \end{array} $$
(21)
$$ \begin{array}{@{}rcl@{}} r((c_{0})_{0}, (c_{1})_{0}, (c_{2})_{0},\ldots, (c_{K})_{0},t_{0})&=&0, \end{array} $$
(22)

where Π describes an appropriate orthogonal projector whose rank coincides with the number of degrees of freedom of the DAE [15, 18]. Note further that, in general, the coefficients [(cK+1−μ)0,…,(cK)0] are uniquely determined neither by (19)–(20) nor by (21)–(22). In our implementation from [17], the minimum-norm solution \([(\tilde {c}_{K+1-\mu })_{0},\ldots , (\tilde {c}_{K})_{0} ]\) is computed.

Example 1

Consider the index-4 DAE:

$$ \begin{array}{@{}rcl@{}} \begin{pmatrix} 1 & 0 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix} x^{\prime} + \begin{pmatrix} 1 & 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix} x = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ \mathrm{e}^{t} \end{pmatrix} \end{array} $$
(23)

with the general solution:

$$ \begin{array}{@{}rcl@{}} x(t;C)=\begin{pmatrix} \mathrm{C} \mathrm{e}^{- t} - \frac{\mathrm{e}^{t}}{2} \\ -\mathrm{e}^{t}\\ \mathrm{e}^{t}\\ -\mathrm{e}^{t}\\ \mathrm{e}^{t} \end{pmatrix}. \end{array} $$
(24)

For this clearly structured example, the projector-based approach will lead to:

$$ P=\begin{pmatrix} 1 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 0 & 1\\ \end{pmatrix}, \quad {\Pi}=\begin{pmatrix} 1 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0\\ \end{pmatrix}. $$

This means that Qx corresponds to x2 and according to the notation introduced in [18], the EOPE-ODE (the essential orthogonally projected ODE describing the dynamics) that corresponds to (10) will be formulated in terms of x1.

Let us have a closer look at the derivative array with respect to the index determination and the computation of consistent initial values, both related to the computation of Π, see the Appendix.

According to (12), for n = 5 and K = 4, we consider:

$$ (c_{\ell})_{t_{j}} = \begin{pmatrix} c_{1 \ell} \ c_{2 \ell} \ c_{3 \ell} \ c_{4 \ell} \ c_{5 \ell} \end{pmatrix}^{T}, \quad \ell =0, \ldots, 4, $$

such that the equations on the left, which are formulated for functions as described in (13), correspond to the equations on the right, which can be formulated at some t = tj for the scalar numbers \(c_{k \ell }=\frac {x_{k}^{(\ell )}(t_{j};C)}{\ell !}\) (cf. (14)):

[Display omitted: the derivative array equations and their color-coded scalar counterparts.]

The colors visualize the chains of calculation by which each entry of (I − Π)c0 is uniquely determined by the equations r = 0, cf. (15). In particular, the red equations permit the computation of Qc0, since:

[Display omitted: the red chain of equations expressing Qc0 = c20.]

Since no representation of c20 is possible with fewer differentiations, the index is μ = 4. Moreover, the violet, blue, and green expressions provide values for components of Pc0, in particular c50, c40, and c30, respectively. This means that we cannot prescribe initial values for (P − Π)c0.

Indeed, the EOPE-ODE reads:

$$ \begin{array}{@{}rcl@{}} x_{1}^{\prime} + x_{1} = \mathrm{e}^{t}. \end{array} $$
(25)

Summarizing, we see that to compute (c0)0, we have to prescribe a value for x1(t0) and consider derivatives up to order three at least (with K = μ = 4) in the derivative array, i.e., r((c0)0,…,(c3)0,(c4)0,t0) = 0.

We emphasize that the gray items must vanish in order to satisfy r = 0, but, for K = μ = 4, they do not uniquely determine the coefficients (cℓ)0 for any ℓ > 0. This means that the value for kc from Definition 2 is kc = 0 = K − μ.

If we increase the number of derivatives to K = μ + 1 = 5 and consider r((c0)0,(c1)0,(c2)0,…,(c5)0,t0) = 0 together with an initial value for c10, then correct values for (c0)0 and (c1)0 can be computed. In general, for K ≥ μ, consistent (c0)0,…,(cK−μ)0 can be obtained.

Since, with the approach (19)–(20), the projector Π is not computed explicitly and we deal, in general, with nonlinear under-determined systems of equations, these systems are solved in a minimum-norm sense. Therefore, the solver obtains values for all higher derivatives, whereas we cut off \((\tilde {c}_{K-\mu +1})_{0}, \ldots , (\tilde {c}_{K})_{0}\), since only (c0)0,…,(cK−μ)0 are consistent in the sense of Definition 2.

In Table 1, we present the results of the computation of consistent initial values with InitDAE [12], which solves (19)–(20). For the considered initial value x1(0) = 1, the solution is \(x_{1}(t)=\cosh (t)\). We can appreciate that for K = 5, only the Taylor coefficients (c0)0 and (c1)0 are consistent, i.e., kc = 1 = K − μ. Increasing K, correspondingly more consistent Taylor coefficients could be computed.

Table 1 Numerical solution of the initialization problem for system (23) from Example 1 for t0 = 0 and α = [1,0,0,0,0] using Taylor coefficients with K = 5

The numerical solution delivered by the methods defined in the following corresponds to:

  • the numerical solution obtained by Taylor series methods applied to the projected explicit ODE (10) for Πx, and

  • corresponding values for the components (I − Π)x that result from (11).

Therefore, the stability and order properties of the integration methods defined below can be transferred from ODEs to DAEs. Due to the formulation as an optimization problem, the inherent dynamics of the DAE, which can be expressed in terms of Πx, are not considered explicitly, but implicitly.

4 General definition of explicit/implicit methods

Recall that:

  • P = Π = I holds for ODEs and, therefore, for ODEs, the approach (19)–(20) amounts to computing the Taylor coefficients if c0 is prescribed.

  • We assumed that \( \ker f_{x^{\prime }}(c_{1},c_{0},t) \), and therefore also P, does not depend on (c1,c0,t). Therefore, the Taylor coefficients of Px(t) at tj correspond to:

    $$ [P(c_{0})_{j}, P(c_{1})_{j}, P(c_{2})_{j},\ldots, P(c_{k_{c}})_{j} ]. $$

With these two properties in mind, we can present a very general formulation of implicit and explicit Taylor series methods for ODEs and DAEs by defining suitable objective functions instead of (19).

In a first step, we focus on consistency.

Lemma 1

Consider:

$$ [(c_{0})_{j}, (c_{1})_{j}, (c_{2})_{j},\ldots, (c_{K})_{j} ], $$

\(k_{e}, k_{i} \in {\mathbb {N}}\) with 0 ≤ ke, ki ≤ K, and weights \(\omega _{\ell _{e}}^{e}, \omega _{\ell _{i}}^{i} \in {\mathbb {R}}\) to define the objective function

$$ \begin{array}{@{}rcl@{}} p\left( (c_{0})_{j+1},\ldots, (c_{k_{i}})_{j+1}\right) \!:=\!P\!\left( \sum\limits_{\ell_{i}=0}^{k_{i}} \omega_{\ell_{i}}^{i}(c_{\ell_{i}})_{j+1}\left( -h_{j} \right)^{\ell_{i}} - \sum\limits_{\ell_{e}=0}^{k_{e}} \omega_{\ell_{e}}^{e} (c_{\ell_{e}})_{j}\left( h_{j}\right)^{\ell_{e}}\!\right) . \end{array} $$

Then for any solution:

$$ \begin{array}{@{}rcl@{}} [(c_{0})_{j+1}, (c_{1})_{j+1}, (c_{2})_{j+1},\ldots, (c_{K})_{j+1} ], \end{array} $$
(26)

of the minimization problem:

$$ \begin{array}{@{}rcl@{}} \min &\quad \left\| p\left( (c_{0})_{j+1}, (c_{1})_{j+1}, (c_{2})_{j+1},\ldots, (c_{k_{i}})_{j+1}\right) \right\|_{2} \end{array} $$
(27)
$$ \begin{array}{@{}rcl@{}} \text{subject to} & \quad r((c_{0})_{j+1}, (c_{1})_{j+1}, (c_{2})_{j+1},\ldots, (c_{K})_{j+1},t_{j+1})=0 \end{array} $$
(28)

the values:

$$ [(c_{0})_{j+1}, (c_{1})_{j+1}, (c_{2})_{j+1},\ldots, (c_{k_{i}})_{j+1} ], $$
(29)

are consistent at tj+1 for all ki ≤ K − μ.

Proof

According to Definition 2, the coefficients (29) are consistent, since the constraints (28) are fulfilled. Recall further that, under suitable assumptions, the solvability of (27)–(28) follows from the Definition 1 of the index μ and has been discussed in [16]. □

Corollary 1

Consider linear DAEs (9) and suppose that consistent values:

$$ [(c_{0})_{j}, (c_{1})_{j}, (c_{2})_{j},\ldots, (c_{k_{e}})_{j} ], $$

are given. Consider further an integration method defined by (27)–(28) for suitable weights \(\omega _{\ell _{e}}^{e}, \omega _{\ell _{i}}^{i}\). Then the following two approaches provide the same consistent results (29):

  • the solution of (27)–(28) for the original DAE (9),

  • the solution of (27)–(28) for the ODE (10), which is invariant in the subspace im Π, and the subsequent computation of the remaining components according to (11).

Proof

On the one hand, according to (21)–(22), the solution of (27)–(28) for the original DAE delivers the same result (29) as:

$$ \begin{array}{@{}rcl@{}} {\Pi}(t_{j+1} )\left( p\left( (c_{0})_{j+1},\ldots, (c_{k_{i}})_{j+1}\right)\right) &=& 0, \end{array} $$
(30)
$$ \begin{array}{@{}rcl@{}} r((c_{0})_{j+1}, (c_{1})_{j+1}, (c_{2})_{j+1},\ldots, (c_{K})_{j+1},t_{j+1})&=&0. \end{array} $$
(31)

On the other hand, the derivative array (31) for the DAE contains the derivative array of the ODE (10) and the nonlinear equations (11). □

With the notation of the objective function, different Taylor integration methods can be described by corresponding weights \(\omega _{\ell _{e}}^{e}\), \(\omega _{\ell _{i}}^{i}\). This provides us with a very flexible way to implement schemes with different order and stability properties; a generic sketch of one such step is given below.
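
The following minimal sketch (illustrative names; the callable r is assumed to evaluate the derivative array (14), e.g., via AD) implements one generic step (27)–(28) with SciPy:

```python
import numpy as np
from scipy.optimize import minimize

def projected_step(C_j, t_next, h, r, P, w_e, w_i):
    """One generic projected step (27)-(28): minimize ||p||_2^2 subject to
    r(c_0, ..., c_K, t_{j+1}) = 0.

    C_j      : (K+1, n) array of Taylor coefficients at t_j,
    r        : callable r(C, t) evaluating the derivative array (14),
    P        : constant orthogonal projector from Section 2,
    w_e, w_i : weights selecting the scheme (explicit, TH, HOP, ...)."""
    Kp1, n = C_j.shape
    rhs = sum(w * C_j[l] * h**l for l, w in enumerate(w_e))   # fixed during step

    def p(X):
        C = X.reshape(Kp1, n)
        lhs = sum(w * C[l] * (-h)**l for l, w in enumerate(w_i))
        return P @ (lhs - rhs)

    res = minimize(lambda X: np.linalg.norm(p(X))**2, C_j.ravel(),
                   method="SLSQP",
                   constraints={"type": "eq",
                                "fun": lambda X: r(X.reshape(Kp1, n), t_next)},
                   options={"ftol": 1e-12})
    # only (c_0)_{j+1}, ..., (c_{K-mu})_{j+1} are consistent (Definition 2);
    # the trailing coefficients are cut off by the caller
    return res.x.reshape(Kp1, n)
```

In this sketch, the explicit method of Section 5.1 corresponds to w_i = [1] and w_e = [1, …, 1]; the TH and HOP weights are given in Sections 5.3 and 5.4.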

5 Projected Taylor integration methods

5.1 Explicit Taylor series method for DAEs

In terms of the above notation, the explicit Taylor series method for ODEs corresponds to ke ≥ 1, \(\omega _{\ell _{e}}^{e}=1\) for 0 ≤ ℓe ≤ ke, ki = 0, \({\omega _{0}^{i}}=1\). Recall that the approach from [17] for DAEs consists of the following steps:

  • Initialization: Solve the optimization problem:

    $$ \begin{array}{@{}rcl@{}} \min &\quad \left\| P\left( (c_{0})_{0}-\alpha\right) \right\|_{2} \end{array} $$
    (32)
    $$ \begin{array}{@{}rcl@{}} \text{subject to} & \quad r((c_{0})_{0}, (c_{1})_{0}, (c_{2})_{0},\ldots, (c_{K})_{0},t_{0})=0, \end{array} $$
    (33)

    for an initial guess α.

  • For time-points tj+ 1, j ≥ 0, hj = tj+ 1tj: Solve the optimization problems:

    $$ \begin{array}{@{}rcl@{}} \min &\quad || P \big((c_{0})_{j+1}-\underbrace{\sum\limits_{\ell=0}^{k_{e}} (c_{\ell})_{j} h_{j}^{\ell}}_{\approx x(t_{j}+h_{j})}\big) ||_{2} , \end{array} $$
    (34)
    $$ \begin{array}{@{}rcl@{}} \text{subject to} & \quad r((c_{0})_{j+1}, (c_{1})_{j+1}, (c_{2})_{j+1},\ldots, (c_{K})_{j+1},t_{j+1})=0 , \end{array} $$
    (35)

    successively for ke ≤ K − μ, where ke is the order of the series at tj.

This method is called explicit, since (34) is an explicit equation for (c0)j+1 that does not involve any value (cℓ)j+1 for ℓ ≥ 1. In contrast to explicit ODEs, where Taylor coefficients may be obtained by function evaluation, with this approach for DAEs, in general, we have to solve a nonlinear system of equations. Therefore, it seems reasonable to consider also implicit Taylor approximations in the integration scheme.

5.2 Fully implicit Taylor series methods for DAEs

The implicit counterpart of the explicit Taylor series method for ODEs corresponds to ki ≥ 1, \(\omega _{\ell _{i}}^{i}=1\) for 0 ≤ ℓi ≤ ki, ke = 0, \({\omega _{0}^{e}}=1\). Our generalization for DAEs consists, therefore, of the following steps.

  • Initialization as in Section 5.1, Eqs. (32)–(33).

  • For time-points tj+ 1, j ≥ 0, hj = tj+ 1tj: Solve the optimization problems:

    $$ \begin{array}{@{}rcl@{}} \min &\quad || P \big(\underbrace{\sum\limits_{\ell=0}^{k_{i}} (c_{\ell})_{j+1} (-h_{j})^{\ell}}_{\approx x(t_{j+1}-h_{j})}-(c_{0})_{j}\big) ||_{2} \end{array} $$
    (36)
    $$ \begin{array}{@{}rcl@{}} \text{subject to} & \quad r((c_{0})_{j+1}, (c_{1})_{j+1}, (c_{2})_{j+1},\ldots, (c_{K})_{j+1},t_{j+1})=0 , \end{array} $$
    (37)

    successively for ki ≤ K − μ, where ki is the order at tj+1.

Obviously, if, instead of (34) and (36), more general conditions of the type (27) are considered, then the dimension of the system that has to be solved remains the same. Therefore, it seems natural to consider more general schemes with better convergence and stability properties than the explicit and the fully implicit Taylor series methods.

5.3 Two-halfstep explicit/implicit schemes

One straightforward combination of the explicit and implicit integration schemes is to approximate x(tj + σhj) = x(tj+1 − (1 − σ)hj) for 0 ≤ σ ≤ 1 as follows:

$$ \begin{array}{@{}rcl@{}} x(t_{j}+\sigma h_{j}) &\approx & \sum\limits_{\ell_{e}=0}^{k_{e}} (c_{\ell_{e}})_{j}\left( \sigma h_{j}\right)^{\ell_{e}}, \end{array} $$
(38)
$$ \begin{array}{@{}rcl@{}} x(t_{j+1}-(1-\sigma) h_{j}) &\approx & \sum\limits_{\ell_{i}=0}^{k_{i}} (c_{\ell_{i}})_{j+1}\left( -(1-\sigma)h_{j}\right)^{\ell_{i}}, \end{array} $$
(39)

and to equate the expressions on the two right-hand sides. The properties of the methods (38)–(39) are described in [21]. The choice \(\sigma = \frac {1}{2}\), which can be interpreted as a generalization of the trapezoidal rule, turns out to be convenient. For ke = ki and \(\sigma = \frac {1}{2}\), it coincides with the scheme tested in [2, 9].

Remark 1

Note that another closely related implicit/explicit scheme is described in the literature, see, e.g., [29]. There, the first step is implicit and the second one explicit, in contrast to the approach from above. According to the extensive analysis from [29], \(\sigma = \frac {1}{2}\) is convenient also in that case. However, for 0 < σ < 1, these methods are less suitable for our DAE scheme, since the Taylor coefficients would be considered at tj + σhj, whereas the constraints have to be fulfilled at tj and tj+1.

In the notation from Section 4, choosing \(\sigma = \frac {1}{2}\) in (38)–(39) means considering:

$$ \begin{array}{@{}rcl@{}} p :=&P\left( \sum\limits_{\ell_{i}=0}^{k_{i}}(c_{\ell_{i}})_{j+1}\left( -\frac{h_{j}}{2}\right)^{\ell_{i}} - \sum\limits_{\ell_{e}=0}^{k_{e}} (c_{\ell_{e}})_{j}\left( \frac{h_{j}}{2}\right)^{\ell_{e}}\right) \end{array} $$
(40)

for 0 ≤ ke, ki ≤ K − μ, i.e.:

$$ \omega_{\ell_{e}}^{e}= \left( \frac{1}{2}\right)^{\ell_{e}}, \ \ell_{e} =0, {\ldots} , k_{e}, \quad \quad \omega_{\ell_{i}}^{i}=\left( \frac{1}{2}\right)^{\ell_{i}}, \ell_{i} =0, {\ldots} , k_{i}. $$

For brevity, we denote these two-halfstep methods by (ke,ki)-TH; the corresponding weights are generated in code as sketched below.
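
A small sketch of the TH weights for general σ (for use, e.g., with the projected_step sketch from Section 4):

```python
def th_weights(k_e, k_i, sigma=0.5):
    """Weights of the (k_e, k_i)-TH scheme (38)-(39); sigma = 1/2 yields (40)."""
    w_e = [sigma**l for l in range(k_e + 1)]
    w_i = [(1.0 - sigma)**l for l in range(k_i + 1)]
    return w_e, w_i
```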

Recall that, in general, the stability function R(z) (cf. the Appendix) of a (ke,ki)-TH method is not a Padé approximation of the exponential function. Consequently, the maximal achievable order of an integration method for fixed ke and ki is, in general, not attained with these particular weights. Therefore, further higher-order schemes for stiff ODEs have been developed, namely the HOP methods described in [7].

5.4 Higher-order Padé methods

According to [7], HOP may be interpreted as Hermite-Obrechkoff-Padé or higher-order Padé. The corresponding integration schemes may be considered as implicit Taylor series methods based on Hermite quadratures.

In our notation, a (ke,ki)-HOP scheme means choosing:

$$ \begin{array}{@{}rcl@{}} \omega^{e}_{\ell_{e}}&:=& \frac{k_{e} ! (k_{e}+k_{i}-\ell_{e})!}{(k_{e}+k_{i})!(k_{e}-\ell_{e})!}, \quad \ell_{e} =0, \ldots, k_{e},\\ \omega^{i}_{\ell_{i}}&:=& \frac{k_{i} ! (k_{e}+k_{i}-\ell_{i})!}{(k_{e}+k_{i})!(k_{i}-\ell_{i})!}, \quad \ell_{i} =0, \ldots, k_{i}. \end{array} $$

These coefficients correspond to the (ke,ki)-Padé approximation of the exponential function, such that the stability function R(z) is precisely this approximation, see the Appendix. Indeed, (ke,ki)-HOP methods have the following properties, cf. [7]:

  • the order of consistency is ke + ki,

  • the order of the local error is ke + ki + 1,

  • they are A-stable for ki − 2 ≤ ke ≤ ki,

  • they are L-stable for ki − 2 ≤ ke ≤ ki − 1.

Note that also in this case, the trapezoidal rule corresponds to ke = ki = 1 and the implicit Euler method to ke = 0, ki = 1. In this sense, the methods with ke = ki could be viewed as a generalization of the trapezoidal rule and those with ke = ki − 1 as a generalization of the implicit Euler method, cf. [7].
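
The HOP weights are equally easy to generate; the small sketch below also checks the two special cases just mentioned (again for use with the projected_step sketch from Section 4):

```python
from math import factorial

def hop_weights(k_e, k_i):
    """Weights of the (k_e, k_i)-HOP scheme from Section 5.4."""
    s = k_e + k_i
    w_e = [factorial(k_e) * factorial(s - l) / (factorial(s) * factorial(k_e - l))
           for l in range(k_e + 1)]
    w_i = [factorial(k_i) * factorial(s - l) / (factorial(s) * factorial(k_i - l))
           for l in range(k_i + 1)]
    return w_e, w_i

assert hop_weights(1, 1) == ([1.0, 0.5], [1.0, 0.5])   # trapezoidal rule
assert hop_weights(0, 1) == ([1.0], [1.0, 1.0])        # implicit Euler method
```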

In Section 8, we numerically verify the outstanding properties of these methods.

6 Properties of the minimization problems

In [16], we analyzed the properties of the minimization problem obtained when computing consistent initial values. That analysis can be applied directly to the explicit Taylor series method, cf. [17]. To appreciate the properties of implicit methods (i.e., ki > 0), we define, for \(k \geq \max \limits \left \{k_{e},k_{i}\right \}\), the matrices:

$$ \begin{array}{@{}rcl@{}} T_{e}&:=&\begin{pmatrix} P{\omega^{e}_{0}} & P{\omega^{e}_{1}}h_{j}&P{\omega^{e}_{2}}{h_{j}^{2}} & {\ldots} & P{\omega^{e}_{k}}{h_{j}^{k}} \end{pmatrix} \in {\mathbb{R}}^{n\times n \cdot (k+1)},\\ T_{i}&:=&\begin{pmatrix} P{\omega^{i}_{0}} & P{\omega^{i}_{1}}(-h_{j})&P{\omega^{i}_{2}}(-h_{j})^{2} & {\ldots} & P{\omega^{i}_{k}}(-h_{j})^{k} \end{pmatrix} \in {\mathbb{R}}^{n\times n \cdot (k+1)}, \end{array} $$

assuming \(\omega _{\ell _{i}}^{i}=0\) for ℓi > ki and \(\omega _{\ell _{e}}^{e}=0\) for ℓe > ke, and the vectors:

$$ X_{j}=\begin{pmatrix} (c_{0})_{j}, \ldots, (c_{k})_{j} \end{pmatrix}, \quad X_{j+1}=\begin{pmatrix} (c_{0})_{j+1}, \ldots, (c_{k})_{j+1} \end{pmatrix}. $$

With this notation, we write:

$$ \begin{array}{@{}rcl@{}} p &:=&P\left( \sum\limits_{\ell_{i}=0}^{k_{i}} \omega_{\ell_{i}}^{i}(c_{\ell_{i}})_{j+1}\left( -h_{j} \right)^{\ell_{i}} - \sum\limits_{\ell_{e}=0}^{k_{e}} \omega_{\ell_{e}}^{e} (c_{\ell_{e}})_{j}\left( h_{j}\right)^{\ell_{e}}\right) \\ &=& T_{i} X_{j+1}-T_{e} X_{j}. \end{array} $$

Therefore, as in [16], for \(\alpha := T_{e} X_{j}\), \(X := X_{j+1}\), we consider the objective function:

$$ \begin{array}{@{}rcl@{}} f(X) &:=& \frac{1}{2} \left\| T_{i} X-\alpha \right\|^{2}\\ &=& \frac{1}{2} \left\|P(T_{i} X-\alpha) \right\|^{2}\\ &=&\frac{1}{2} (T_{i} X-\alpha)^{T} P (T_{i} X-\alpha)\\ &=& \frac{1}{2} \left( X^{T} (T_{i})^{T}PT_{i} X - 2 \alpha^{T}PX+\alpha^{T}P\alpha \right). \end{array} $$

Observe that the matrix:

$$ \begin{array}{@{}rcl@{}} \widetilde{P}_{i}&:=&(T_{i})^{T}PT_{i}=(T_{i})^{T} T_{i}\\ &=& \begin{pmatrix} P({\omega^{i}_{0}})^{2} & P{\omega^{i}_{0}}{\omega^{i}_{1}}(-h_{j})& {\ldots} & P{\omega^{i}_{0}}{\omega^{i}_{k}}(-h_{j})^{k} \\ P{\omega^{i}_{0}}{\omega^{i}_{1}}(-h_{j}) & & & \\ {\vdots} & & &\\ P{\omega^{i}_{0}}{\omega^{i}_{k}}(-h_{j})^{k} & P{\omega^{i}_{1}}{\omega^{i}_{k}}(-h_{j})^{k+1}& {\ldots} & P({\omega^{i}_{k}})^{2}(-h_{j})^{2k} \end{pmatrix}\\ &=& \begin{pmatrix} P({\omega^{i}_{0}})^{2} & {\ldots} & P{\omega^{i}_{0}}\omega^{i}_{k_{i}}(-h_{j})^{k_{i}} & 0 \\ {\vdots} & & &{\vdots} \\ P{\omega^{i}_{0}}\omega^{i}_{k_{i}}(-h_{j})^{k_{i}} & {\ldots} & P(\omega^{i}_{k_{i}})^{2}(-h_{j})^{2k_{i}} & \\ 0 & {\ldots} & & 0 \end{pmatrix} \in {\mathbb{R}}^{n \cdot (k+1) \times n \cdot (k+1) } \end{array} $$

is, by construction, positive semi-definite. However, in general, it is not an orthogonal projector. Therefore, Theorem 1 and Corollary 1 of [16] cannot be applied directly, and the solvability of the optimization problem is more difficult to establish than for explicit Taylor methods. More precisely, we want to emphasize that, for:

$$ \widetilde{P}:=\begin{pmatrix} P & 0 \\ 0 & 0 \end{pmatrix} \in {\mathbb{R}}^{n \cdot (k+1) \times n \cdot (k+1) }, $$

the nullspaces:

$$ \ker \begin{pmatrix} \widetilde{P}_{i} & G^{T}\\ G & 0 \end{pmatrix} \quad \text{and} \quad \ker \begin{pmatrix} \widetilde{P} & G^{T}\\ G & 0 \end{pmatrix} $$

may be different. However, since \(\widetilde {P}_{i}\) depends on hj, it is reasonable to assume that a suitable stepsize hj can be found such that the optimization problem becomes solvable in the sense discussed in [16].
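
The structure of T_i and of \(\widetilde{P}_{i}\) is easy to inspect numerically; a small sketch (T_matrix is an illustrative helper; sign = +1 with the explicit weights gives T_e):

```python
import numpy as np

def T_matrix(P, w, h, k, sign=-1.0):
    """Block row (P*w_0, P*w_1*(sign*h), ..., P*w_k*(sign*h)**k) from Section 6;
    weights beyond len(w) - 1 are treated as zero."""
    blocks = [P * (w[l] if l < len(w) else 0.0) * (sign * h)**l
              for l in range(k + 1)]
    return np.hstack(blocks)                  # shape (n, n*(k+1))

# P from the index-4 DAE (23), implicit trapezoidal-type weights (1, 1/2)
P = np.diag([1.0, 0.0, 1.0, 1.0, 1.0])
T_i = T_matrix(P, [1.0, 0.5], h=0.1, k=3)
P_tilde = T_i.T @ T_i                         # equals (T_i)^T P T_i
print(np.all(np.linalg.eigvalsh(P_tilde) >= -1e-12))   # positive semi-definite
```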

7 Some practical considerations

7.1 Dimension of the nonlinear systems solved in each step

For a given \(K \in {\mathbb {N}}\), the Lagrange approach for solving (27)–(28) leads to a nonlinear system of equations of dimension 2n ⋅ (K + 1), cf. [16]. Thereby, consistent coefficients:

$$ (c_{0})_{j+1}, (c_{1})_{j+1}, (c_{2})_{j+1},\ldots, (c_{K-\mu})_{j+1} $$

are obtained. In contrast, the coefficients cK−μ+1,…,cK will, in general, not be consistent, and the introduced Lagrange multipliers are not even of interest.

However, increasing K by one means solving a nonlinear system containing 2n additional variables and equations.

7.2 Setting ke and ki in a simple implementation

Dealing with automatic differentiation (AD), the number \(K \in {\mathbb {N}}\) has to be prescribed a priori in order to consider (K + 1) Taylor coefficients. Since 0 ≤ ke, ki ≤ K − μ must be satisfied in general, for the (ke,ki) TH and HOP methods, we set:

$$ k_{i}:=K-\mu \quad \text{and}\quad k_{e}:=k_{i}, $$

by default. We further tested ki := K − μ, ke := ki − 1 for HOP methods. So far, we have considered schemes with constant order and stepsize only.

7.3 Jacobian matrices

To solve the optimization problems (27)–(28) numerically, we provide the corresponding Jacobians.

  • The Jacobian of the constraints (28) is described in [17], since it is also used for the computation of consistent initial values.

  • To describe the Jacobian of the objective function (27), which is a gradient, we define:

    $$ q\left( (c_{0})_{j+1},\ldots, (c_{k_{i}})_{j+1}\right):= \left\|p \left( (c_{0})_{j+1},\ldots, (c_{k_{i}})_{j+1}\right)\right\|_{2} $$

    and realize that:

    $$ \frac{\partial q}{ \partial (c_{\ell_{i}})_{j+1}} = \frac{1}{q\left( (c_{0})_{j+1},\!\ldots, \!(c_{k_{i}})_{j+1}\right)}\!\left( p\!\left( \!(c_{0})_{j+1},\ldots, (c_{k_{i}})_{j+1}\right)\right)^{T} \cdot \omega_{\ell_{i}}^{i}\cdot \left( \!-h_{j} \right)^{\ell_{i}}, $$

    for q ≠ 0 and 0 ≤ ℓi ≤ ki; a finite-difference check of this formula is sketched below.
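
A small sketch with hypothetical data (here b collects the terms from tj, which are constant during the step):

```python
import numpy as np

def q_and_grad(C_next, b, P, w_i, h):
    """Objective q = ||p||_2 from (27) and its gradient with respect to the
    Taylor coefficients at t_{j+1}, following Section 7.3 (valid for q != 0)."""
    k_i = C_next.shape[0] - 1
    p = P @ (sum(w_i[l] * C_next[l] * (-h)**l for l in range(k_i + 1)) - b)
    q = np.linalg.norm(p)
    grad = np.vstack([(p / q) * w_i[l] * (-h)**l for l in range(k_i + 1)])
    return q, grad

# finite-difference check of dq/d(c_1)_{j+1} for random data
rng = np.random.default_rng(0)
C, b = rng.standard_normal((3, 2)), rng.standard_normal(2)
P, w, h, eps = np.eye(2), [1.0, 0.5, 0.25], 0.1, 1e-7
q0, g = q_and_grad(C, b, P, w, h)
E = np.zeros_like(C)
E[1, 0] = eps
q1, _ = q_and_grad(C + E, b, P, w, h)
print((q1 - q0) / eps, g[1, 0])   # the two numbers should agree
```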

8 Numerical tests

8.1 Order validation

To visualize the order of the methods, we integrate Example 1 in the interval \(\left [0,1 \right ]\) with different stepsizes. The results can be found in Fig. 1. On the left-hand side, we show the results for the index-4 DAE. On the right-hand side, we report the results obtained for the corresponding ODE described in (25).

Fig. 1

Stepsize-error diagram for the error \(\left |x_{1}(1)-\cosh (1)\right |\) for the DAE (left) and the essential ODE (right) corresponding to Example 1 for different methods and tolerances ftol of the module minimize from SciPy. For ke = ki = 2, we included graphs of \(C h^{p}\) for p = 2, 3, 4 to make the order of the methods visible. The crossing of the (4,4)-HOP and (3,4)-HOP graphs for the DAE (left) is probably due to rounding errors, since the error is about 1e-14

Summarizing, we observe that:

  • For ki,ke ≤ 1, the methods coincide with the explicit and implicit Euler methods or the trapezoidal rule. Therefore, the graphs coincide up to effects resulting from rounding errors.

  • The similarity of the overall behavior for the DAE and the ODE is remarkable.

  • As expected, the HOP methods are considerably more accurate due to the higher order.

  • For small h and large ke, ki, scaling and rounding errors impede more accurate results, depending on the tolerance ftol of the module minimize from SciPy.

8.2 Numerical test for examples from the literature

8.2.1 Validation of known results

We report numerical results obtained by the methods (3,3)-HOP and (4,4)-HOP for the following examples from the literature:

  • Mass-on-car from [30], see the Appendix,

  • Extended mass-on-car from [25], see the Appendix,

  • Pendulum, index 3, which can be found in almost all introductions to DAEs, reduced to first-order form, with the positive y axis pointing upwards and the parameters m = 1.0, l = 1.0, and g = 1.0. (x1,x2) are the coordinates, (x3,x4) the corresponding velocities, and x5 the Lagrange parameter. In our computation, the system starts from rest at 45 degrees to the vertical.

  • Car axis, index-3 formulation with all parameters as given in [24]. In order to avoid a disadvantageous scaling of the Taylor coefficients, we changed the independent variable t to τ = 10t. This is advantageous, since the time-dependent input function is \(y_{b}=0.1\sin \limits (10 \ t)\). For large K, the corresponding higher Taylor coefficients lead to considerable scaling differences, which are avoided by the substitution with τ. For the details of our reformulation, we refer to [17].

For all examples (see also Table 2), we used ftol for the tolerance of the module minimize from SciPy. To estimate the error, we considered the difference between the results obtained by (3,3)-HOP and (4,4)-HOP.

Table 2 Overview of the examples from Section 8.2.1

All tests confirmed the applicability of the method. The solution graphs look identical to those given in the literature [24, 25, 30]. The graphs of the estimated errors of the (3,3)-HOP method in Fig. 2 confirm the order expectations.

Fig. 2

Numerical solutions of the examples from Section 8.2 obtained by (4,4)-HOP (left) and estimation of the error (right) considering the difference between the solution from (3,3)-HOP and (4,4)-HOP. The specifics of the integration can be found in Table 3

Since it is obvious that our implementation is not competitive with respect to runtime (see Table 3), we have not made a systematic comparison with other solvers here.

Table 3 Overview of the computations carried out for Fig. 2 with fixed stepsize h

8.2.2 A challenging index 5 DAE

We now consider the index-5 example from [26], resulting from a model of two pendula, where the Lagrange multiplier λ1 of the first one controls the length of the second one:

$$ \begin{array}{@{}rcl@{}} x_{1}^{\prime}&=&{v_{x_{1}}},\qquad\qquad \qquad x_{2}^{\prime}={v_{x_{2}}}, \\ y_{1}^{\prime} & = & {v_{y_{1}}},\qquad\qquad\qquad y_{2}^{\prime}={v_{y_{2}}},\\ {v_{x_{1}}}^{\prime}& = & -x_{1} \lambda_{1}, \qquad\qquad {v_{x_{2}}}^{\prime}= -x_{2} \lambda_{2},\\ {v_{y_{1}}}^{\prime} &=& -y_{1} \lambda_{1} + g, \quad\quad {\kern2pt} {v_{y_{2}}}^{\prime} = -y_{2} \lambda_{2} + g ,\\ 0& = & {x_{1}^{2}}+{y_{1}^{2}} -L^{2}, \qquad {\kern1.5pt} 0= {x_{2}^{2}}+{y_{2}^{2}} -(L+c\lambda_{1})^{2}. \end{array} $$

Note that in this formulation from [26], the positive y axis points downwards. The DAE has index 5 and four degrees of freedom. For the numerical tests, we use the gravity constant g = 1, the length of the first pendulum L = 1, the parameter c = 0.1, and the interval [0,80]. In [26], this example was integrated with constant stepsize h = 0.05 and order k = 7 as well as h = 0.025 and order k = 8. For the component x2, the two solutions were very close until about t = 30, clearly diverging from there up to about t = 50, and totally unrelated from t = 55 on.
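
In a residual-based formulation as required by (1), the system reads as follows (a sketch; the ordering of the unknowns follows the consistent initial value displayed below):

```python
import numpy as np

def f(xp, x, t, g=1.0, L=1.0, c=0.1):
    """Residual f(x', x, t) of the index-5 two-pendula DAE from Section 8.2.2;
    x = (x1, y1, x2, y2, vx1, vy1, vx2, vy2, lambda1, lambda2)."""
    x1, y1, x2, y2, vx1, vy1, vx2, vy2, lam1, lam2 = x
    return np.array([
        xp[0] - vx1,
        xp[1] - vy1,
        xp[2] - vx2,
        xp[3] - vy2,
        xp[4] + x1 * lam1,
        xp[5] + y1 * lam1 - g,
        xp[6] + x2 * lam2,
        xp[7] + y2 * lam2 - g,
        x1**2 + y1**2 - L**2,
        x2**2 + y2**2 - (L + c * lam1)**2,
    ])
```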

Our implementation leads to very good results in the sense that the two solutions differ only slightly up to much larger t. We compare the solutions for K = 9 (HOP method with ki = ke = 4 and order 8) and K = 8 (HOP method with ki = ke = 3 and order 6) for the (numerically) consistent initial value:

$$ c_{0}= \begin{pmatrix} x_{1}\\ y_{1}\\ x_{2} \\ y_{2}\\ v_{x_{1}}\\ v_{y_{1}}\\ v_{x_{2}} \\ v_{y_{2}}\\ \lambda_{1} \\ \lambda_{2} \end{pmatrix}_{0} =\begin{pmatrix} ~1.000000000000000e+00 \\ -6.346337564282729e-09 \\ ~1.000000000000000e+00 \\ ~3.713317265246974e-01 \\ ~5.183756806486933e-09 \\ ~8.168107595885199e-01 \\ -9.661740336543358e-02 \\ ~9.641228990309292e-01 \\ ~6.671798106332355e-01 \\ ~8.174254817186853e-01 \end{pmatrix}. $$

The quality of our results is visualized for x2 in Fig. 3. For slightly perturbed initial values of x2 and y2, the obtained solutions behave analogously. For arbitrarily perturbed initial values, convergence difficulties may appear during the computation of consistent initial values.

Fig. 3

Difference of the result obtained with (3,3)-HOP with h = 0.05 and (4,4)-HOP with h = 0.025 for the component x2 of the index-5 example from Section 8.2.2 in the interval [0,40] (left) and [0,80] (right)

8.2.3 Andrews squeezing mechanism

Finally, we report the behavior we obtained for a well-known index-3 benchmark problem with extreme scaling. According to [20, 24], the problem is of the form:

$$ \begin{pmatrix} q^{\prime}\\ v^{\prime} \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} v \\ w \\ M(q)w -f(q,v) + G^{T}(q) \lambda\\ g(q) \end{pmatrix} $$

for \(q \in {\mathbb {R}}^{7}, \lambda \in {\mathbb {R}}^{6}\). To our surprise, using for α the initial value given in the literature, which there leads to a dynamic behavior, our computation of consistent initial values delivers a stationary solution, such that all our integration methods provide the same constant solution for:

$$ \begin{array}{@{}rcl@{}} q=\begin{pmatrix} \upbeta \\ {\Theta} \\ \gamma \\ {\Phi} \\ \delta \\ {\Omega} \\ \epsilon \end{pmatrix} &=&\begin{pmatrix} -1.2456368688861551e-01 \\ ~4.7359420595613766e-02\\ ~4.5493601541816081e-01 \\ ~2.2197419077471500e-01 \\ ~4.8744266257012064e-01 \\ -2.2197419077470643e-01 \\ ~1.2302849924800077e+00 \\ \end{pmatrix}, \end{array} $$
(41)
$$ \begin{array}{@{}rcl@{}} \begin{pmatrix} \lambda_{1} \\ \lambda_{2} \end{pmatrix}&=& \begin{pmatrix} ~9.9283318628552919e+01 \\ -7.6803614318109092e+00 \\ \end{pmatrix} \end{array} $$
(42)

and all other components are (numerically) equal to zero. Therefore, we explain here why this happened.

First of all, we want to mention that the condition number introduced in [14], corresponding to the DAE at α, is about \(10^{11}\), which gives a clear hint of the scaling difficulties. In contrast, at the given stationary solution, the condition number is about \(10^{6}\).

To simplify further considerations, we notice that the last four equations of g(q) = 0 are used to compute Φ, δ, Ω, 𝜖, such that we can neglect them and consider only g1,2 and f1,2,3 to determine β, Θ, γ, λ1,2. For a constant solution v = w = 0, the relevant equations are therefore:

$$ \begin{array}{@{}rcl@{}} 0&=&-\begin{pmatrix} mom\\ 0 \\ f_{3}(\gamma) \\ \end{pmatrix}+ \tilde{G}^{T}(\upbeta,{\Theta}, \gamma) \begin{pmatrix} \lambda_{1} \\ \lambda_{2} \\ \end{pmatrix}, \\ 0&=& \begin{pmatrix} g_{1}(\upbeta,{\Theta}, \gamma)\\ g_{2}(\upbeta,{\Theta}, \gamma) \end{pmatrix} =:\tilde{g}. \end{array} $$

At the equilibrium point corresponding to (41)–(42), the constant drive torque mom, the spring force f3, and the Lagrangian forces balance each other.

Therefore, it only remains to explain why the approach (19)–(20) delivers a stationary solution. Due to the high condition number, the Lagrange multipliers λ are numerically difficult to compute. In fact, for other numerical computations, the accuracy of λ is not controlled [20, p. 536ff]. In contrast, if the Jacobian \(\tilde {G}\) does not have (numerical) full row rank, then (19)–(20) computes a minimal-norm solution [15], which in this case minimizes the values of λ, leading to the stationary solution. To our knowledge, this stationary solution has not been reported before in the DAE literature.

We plan further investigations of this unexpected behavior. Indeed, for some perturbed initial values, we obtained a solution that converges towards the stationary solution (41)–(42). Moreover, with scaled equations and very different initial values, a nonconstant solution that behaves like the one described in [20, 24] has been obtained.

9 Summary and future work

In this article, we presented a projection approach that permits the extension of explicit/implicit Taylor integration schemes from ODEs to DAEs. As a result, we obtained higher-order methods that can be applied directly also to higher-index DAEs. The methods are relatively easy to implement using InitDAE and are convenient since, thanks to the formulation as an optimization problem, the inherent dynamics of the DAE are considered indirectly. We analyzed in detail explicit, fully implicit, two-halfstep (TH), and higher-order Padé (HOP) methods. In particular, the HOP methods present excellent stability and order properties.

The results obtained by a prototype in Python based on InitDAE [12] exceeded our expectations, in particular with regard to the accuracy for higher-index DAEs, cf. Section 8.2.2. Until now, our focus was on the extension from ODEs to DAEs in order to use higher-order and A-stable methods with InitDAE for our diagnosis purposes, monitoring singularities during the integration [17]. With these promising first results, we think that further development of these projected methods is worthwhile.

In fact, at present, our implementation is far from competitive. One reason is that setting up the nonlinear equations (27)–(28) and the corresponding Jacobians with AlgoPy, cf. [31], is still very costly. If the equations (27)–(28) and the corresponding Jacobians are supplied in a more efficient way, competitive solvers might be achieved. At present, we do not even consider the sparsity of the matrices. Furthermore, an improvement seems likely if we take advantage of specific structural properties, e.g., solving subsystems step-by-step, cf. [10, 11]. Another reason for our high computational costs is that the package minimize from SciPy often performs more iterations than we expected (often more than 30), although a good initial guess computed with the explicit Taylor series method is available in general. This behavior has to be inspected in more detail. For linear systems, a direct implementation considering the projector Π from (21) (or, more precisely, a corresponding basis) should deliver an efficient algorithm. This could be of interest, e.g., for the applications from [25, 30]. Last but not least, competitive solvers require adaptive order and stepsize strategies, a broad field for future work.

Although these algorithms open new possibilities to integrate higher-index DAEs, we want to emphasize that, in practice, a high index is often due to modelling assumptions that should be considered very carefully. The dependencies on higher derivatives should always be well founded.