1 Introduction

We consider the numerical solution of general nonlinear systems of differential-algebraic equations (DAEs)

$$\begin{aligned} F(t,x,\dot{x})=0,\quad F\in C({\mathbb I}\times {\mathbb D}_x\times {\mathbb D}_{\dot{x}},{\mathbb R}^n)\hbox { sufficiently smooth}, \end{aligned}$$
(1.1)

where \({\mathbb D}_x,{\mathbb D}_{\dot{x}}\subseteq {\mathbb R}^n\) are open domains and \({\mathbb I}\subseteq {\mathbb R}\) is a compact non-trivial interval, together with a given initial condition

$$\begin{aligned} x(t_0)=x_0,\quad t_0\in {\mathbb I},\ x_0\in {\mathbb D}_x. \end{aligned}$$
(1.2)

For this task, numerous discretization schemes that work directly on (1.1) or on some index-reduced reformulation have been proposed in the literature, see e.g. [7, 10, 13]. In this paper, we consider discretization schemes that work on a so-called inherent ordinary differential equation (ODE) of the given DAE. The advantage of such an approach is that we can make use of any discretization scheme suitable for the numerical integration of ODEs. In particular, if we are able to choose the inherent ODE in such a way that it inherits symmetry properties of the given DAE and thus special properties of its flow, we may be in a position to use geometric integration, i.e., to use special discretization schemes whose numerical flow possesses similar geometric properties, see [8].

In the context of geometric integration, we concentrate in this paper on linear problems. The basic principle of geometric integration in this special case is as follows. Given a linear initial value problem

$$\begin{aligned} \dot{x}=A(t)x+f(t),\quad x(t_0)=x_0, \end{aligned}$$
(1.3)

its solution can be written by means of the variation of constants formula as

$$\begin{aligned} x(t;t_0,x_0)=\Phi (t)x_0+\int _{t_0}^t\Phi (t)\Phi (s)^{-1}f(s)\,ds, \end{aligned}$$

where \(\Phi \in C^1({\mathbb I},{\mathbb R}^{n,n})\) is the solution of the matricial initial value problem

$$\begin{aligned} \dot{\Phi }=A(t)\Phi ,\quad \Phi (t_0)=I_n. \end{aligned}$$
(1.4)

In particular, we have that \(\frac{d}{dx_0}x(t;t_0,x_0)=\Phi (t)\). If A now lies pointwise in a Lie algebra then the flow \(\Phi \) lies pointwise in the corresponding Lie group. The idea of geometric integration is then to construct numerical integration methods that inherit this geometric property. If we write the numerical solution after one step as \(x_h(t_0+h;t_0,x_0)\), we therefore require \(\frac{d}{dx_0}x_h(t_0+h;t_0,x_0)\) also to lie in this Lie group.

In the case of a quadratic Lie group

$$\begin{aligned} {\mathbb {G}}=\{G\in \mathop {\textrm{GL}}\nolimits (n)| G^TXG=X\}, \end{aligned}$$
(1.5)

with some given \(X\in {\mathbb R}^{n,n}\) and \(\mathop {\textrm{GL}}\nolimits (n)\) denoting the general linear group of invertible matrices in \({\mathbb R}^{n,n}\), and its associated Lie algebra

$$\begin{aligned} {\mathbb A}=\{A\in {\mathbb R}^{n,n}| A^TX+XA=0\}, \end{aligned}$$
(1.6)

the above property that \(\Phi \) lies pointwise in \({\mathbb G}\) if A lies pointwise in \({\mathbb A}\) follows from \(\Phi (t_0)^TX\Phi (t_0)=X\) and

$$\begin{aligned} {\textstyle {d\over dt}}(\Phi ^TX\Phi )=\dot{\Phi }^TX\Phi +\Phi ^TX\dot{\Phi }= \Phi ^TA^TX\Phi +\Phi ^TXA\Phi =\Phi ^T(A^TX+XA)\Phi =0. \end{aligned}$$

In particular, \(\Phi ^TX\Phi \) is a quadratic invariant of (1.4). It can then be shown that all Runge–Kutta methods that conserve quadratic invariants constitute geometric integration methods in the case of quadratic Lie groups and their Lie algebras, see [8]. A prominent class of Runge–Kutta methods that conserve quadratic invariants is given by Gauß collocation, see again [8].
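As a small numerical illustration (not part of the original presentation), consider the one-stage Gauß method, i.e., the implicit midpoint rule. Applied to \(\dot{\Phi }=A\Phi \) with \(A\) frozen at the step midpoint, one step is the Cayley-type map \(R=(I-\frac{h}{2}A)^{-1}(I+\frac{h}{2}A)\), which lies exactly in the quadratic Lie group whenever \(A\) lies in the Lie algebra. The sketch below uses a randomly generated symmetric \(C\) and the Hamiltonian matrix \(A=J^{-1}C\) as stand-ins:

```python
import numpy as np

p = 3
n = 2 * p
J = np.block([[np.zeros((p, p)), np.eye(p)],
              [-np.eye(p), np.zeros((p, p))]])

rng = np.random.default_rng(0)
C = rng.standard_normal((n, n))
C = (C + C.T) / 2                      # symmetric stand-in for C(t)
A = np.linalg.solve(J, C)              # Hamiltonian: A^T J + J A = 0

# one implicit-midpoint step for Phi' = A Phi gives the Cayley-type map
h = 0.1
R = np.linalg.solve(np.eye(n) - (h / 2) * A, np.eye(n) + (h / 2) * A)

print(np.allclose(A.T @ J + J @ A, np.zeros((n, n))))  # A lies in the Lie algebra
print(np.allclose(R.T @ J @ R, J))                     # R lies in Sp(2p)
```

Both checks hold up to rounding error, independently of the step size h, reflecting the exact conservation of the quadratic invariant by Gauß methods.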

Writing linear time-varying DAEs in the form

$$\begin{aligned} E(t) \dot{x} =A(t) x+f(t),\quad E,A\in C({\mathbb I},{\mathbb R}^{n,n}),\ f\in C({\mathbb I},{\mathbb R}^{n})\hbox { sufficiently smooth}, \end{aligned}$$
(1.7)

we are interested in this paper in the following symmetry properties.

Definition 1.1

The DAE (1.7) and its associated pair \((E,A)\) of matrix functions are called self-adjoint if

$$\begin{aligned} E^T=-E,\quad A^T=A+\dot{E} \end{aligned}$$
(1.8)

as equality of functions.

Definition 1.2

The DAE (1.7) and its associated pair \((E,A)\) of matrix functions are called skew-adjoint if

$$\begin{aligned} E^T=E,\quad A^T=-A-\dot{E} \end{aligned}$$
(1.9)

as equality of functions.

For these two cases, it has been shown in [15] that the inherent ODE can be chosen in such a way that its flow possesses certain geometric properties, raising the question of possible geometric integration. In particular, we are here concerned with the quadratic Lie group \(\mathop {\textrm{Sp}}\nolimits (2p)\) of symplectic matrices related to

$$\begin{aligned} X=J,\quad J= \left[ \begin{array} {cc}0&{}I_p\\ -I_p&{}0\end{array}\right] \end{aligned}$$
(1.10)

and the associated Lie algebra of Hamiltonian matrices in the case of self-adjoint DAEs and with the quadratic Lie group of generalized orthogonal matrices related to

$$\begin{aligned} X=S,\quad S= \left[ \begin{array}{cc}I_p&{}0\\ 0 &{} -I_q\end{array}\right] \end{aligned}$$
(1.11)

in the case of skew-adjoint DAEs.

2 Preliminaries

In the following, we give a concise overview of the relevant theory on DAEs that we make use of, see e.g. [13]. The basis is formed by the so-called derivative array equations

$$\begin{aligned} F_\ell (t,x,\dot{x},\ldots ,x^{(\ell +1)})=0, \end{aligned}$$
(2.1)

see [3], where \(F_\ell \) has the form

$$\begin{aligned} F_\ell (t,x,\dot{x},\ldots ,x^{(\ell +1)}) = \left[ \begin{array}{c} F(t,x,\dot{x})\\ {\textstyle {d\over dt}}F(t,x,\dot{x})\\ \vdots \\ \big ({\textstyle {d\over dt}}\big )^\ell F(t,x,\dot{x}) \end{array}\right] \end{aligned}$$

with Jacobians (denoting the derivative of F with respect to the variable x by \(F_x\), and analogously for the other variables)

$$\begin{aligned} \begin{aligned} M_\ell (t,x,\dot{x},\dots ,x^{(\ell +1)})&= F_{\ell ;\dot{x},\dots ,x^{(\ell +1)}}(t,x,\dot{x},\dots ,x^{(\ell +1)}), \\ N_\ell (t,x,\dot{x},\dots ,x^{(\ell +1)})&= -[\>F_{\ell ;x}(t,x,\dot{x},\dots ,x^{(\ell +1)})\>\>0\>\>\ldots \>\>0\>]. \end{aligned} \end{aligned}$$
(2.2)

The following hypothesis then states sufficient conditions for the given DAE to describe a regular problem.

Hypothesis 2.1

There exist integers \(\mu \), a, and d such that the set

$$\begin{aligned} {\mathbb L}_\mu =\{(t,x,\dot{x},\ldots ,x^{(\mu +1)})\in {\mathbb R}^{(\mu +2)n+1}| F_{\mu }(t,x,\dot{x},\ldots ,x^{(\mu +1)})=0\} \end{aligned}$$
(2.3)

associated with F is nonempty and such that for every \((t_0,x_0,\dot{x}_0,\ldots ,x^{(\mu +1)}_0)\in {\mathbb L}_\mu \), there exists a (sufficiently small) neighborhood in which the following properties hold:

  1.

    We have \(\mathop \textrm{rank}\nolimits M_{\mu }(t,x,\dot{x},\ldots ,x^{(\mu +1)})=(\mu +1)n-a\) on \({\mathbb L}_\mu \) such that there exists a smooth matrix function \(Z_2\) of size \({(\mu +1)n\times a}\) and pointwise maximal rank, satisfying \(Z_2^TM_{\mu }=0\) on \({\mathbb L}_\mu \).

  2.

    We have \(\mathop \textrm{rank}\nolimits \hat{A}_2(t,x,\dot{x},\ldots ,x^{(\mu +1)})=a\), where \(\hat{A}_2=Z_2^TN_{\mu }[I_n\>0\>\cdots \>0]^T\) such that there exists a smooth matrix function \(T_2\) of size \({n\times d}\), \(d=n-a\), and pointwise maximal rank, satisfying \(\hat{A}_2T_2=0\).

  3.

    We have \(\mathop \textrm{rank}\nolimits F_{\dot{x}}(t,x,\dot{x})T_2(t,x,\dot{x},\ldots ,x^{(\mu +1)})=d\) such that there exists a smooth matrix function \(Z_1\) of size \({n\times d}\) and pointwise maximal rank, satisfying \(\mathop \textrm{rank}\nolimits \hat{E}_1T_2=d\), where \(\hat{E}_1=Z_1^TF_{\dot{x}}\).

Note that the local existence of functions \(Z_2,T_2,Z_1\) can be guaranteed by the application of the implicit function theorem, see [13, Theorem 4.3]. Moreover, we may assume that they possess (pointwise) orthonormal columns. Note also that due to the full rank requirement we may choose \(Z_1\) to be constant.

Following the presentation in [11], we use the shorthand notation \(y=(\dot{x},\ldots ,x^{(\mu +1)})\) and similarly \(\smash {y_0=(\dot{x}_0,\ldots ,x^{(\mu +1)}_0)}\). The system of nonlinear equations

$$\begin{aligned} H(t,x,y,\alpha )=\left[ \begin{array}{c} F_\mu (t,x,y)-Z_{2,0}\alpha \\ T_{1,0}^T(y-y_0) \end{array}\right] , \end{aligned}$$
(2.4)

with the columns of \(T_{1,0}\) forming an orthonormal basis of \(\mathop \textrm{kernel}\nolimits F_{\mu ;y}(t_0,x_0,y_0)\) and \(Z_{2,0}=Z_2(t_0,x_0,y_0)\) according to Hypothesis 2.1, is then locally solvable for \(y,\alpha \) in terms of (tx) due to the implicit function theorem. In particular, \(\alpha =\hat{F}_2(t,x)\) with some function \(\hat{F}_2\). One can show that \(\hat{F}_2(t,x)=0\) describes the whole set of algebraic constraints implied by the original DAE. Setting furthermore \(\hat{F}_1(t,x,\dot{x})=Z_1^TF(t,x,\dot{x})\) yields a so-called reduced DAE

$$\begin{aligned} \begin{array}{ll} \hat{F}_1(t,x,\dot{x})=0,\qquad &{}\hbox {(d differential equations)}\\ \hat{F}_2(t,x)=0,&{}\hbox {(a algebraic equations)} \end{array} \end{aligned}$$
(2.5)

in the sense that it satisfies Hypothesis 2.1 with \(\mu =0\).

Moreover, one can show that \(\hat{F}_{2;x}\) possesses full row rank, implying that we can split x, possibly after a renumbering of the components, according to \(x=(x_1,x_2)\) such that \(\hat{F}_{2;x_2}\) is nonsingular. The implicit function theorem then yields \(x_2={{\mathscr {R}}}(t,x_1)\) with some function \({{\mathscr {R}}}\). Differentiating this relation to eliminate \(x_2\) and \(\dot{x}_2\) in the first equation of (2.5), we can apply the implicit function theorem once more (requiring the solvability of the DAE), yielding \(\dot{x}_1={{\mathscr {L}}}(t,x_1)\), a so-called inherent ODE, with some function \({{\mathscr {L}}}\). Putting both parts together, we end up with a second kind of reduced DAE

$$\begin{aligned} \begin{array}{ll} \dot{x}_1={{\mathscr {L}}}(t,x_1),\qquad &{}\hbox {(d differential equations)}\\ x_2={{\mathscr {R}}}(t,x_1).&{}\hbox {(a algebraic equations)} \end{array} \end{aligned}$$
(2.6)

Note that, once we have fixed the splitting of the variables, the constructed functions \({{\mathscr {L}}}\) and \({{\mathscr {R}}}\) are unique. In particular, the set \({\mathbb L}_{\mu +1}\) can be locally parameterized according to

$$\begin{aligned} F_{\mu +1}(t,x_1,{{\mathscr {R}}}(t,x_1),{{\mathscr {L}}}(t,x_1), {{\mathscr {R}}}_t(t,x_1)+{{\mathscr {R}}}_{x_1}(t,x_1){{\mathscr {L}}}(t,x_1),{{\mathscr {W}}}(t,x_1,p))\equiv 0\nonumber \\ \end{aligned}$$
(2.7)

with a suitable parameter \(p\in {\mathbb R}^a\) and a related function \({{\mathscr {W}}}\).

Under some technical assumptions, see [13], the original DAE and the reduced DAEs (2.5) and (2.6) possess the same solutions. As a consequence, we may discretize the reduced DAEs instead of the original DAE, utilizing their better properties. However, this requires that we are able to evaluate the implicitly defined functions. In the case of \(\hat{F}_2\) in (2.5), the standard approach, see [13], is to go back to the definition of \(\hat{F}_2\) in such a way that we replace \(\hat{F}_2(t,x)=0\) by \(F_\mu (t,x,y)=0\).

In the special case of linear time-varying DAEs (1.7), the Jacobians \(M_\mu ,N_\mu \) used in Hypothesis 2.1 only depend on t such that the functions \(Z_2,T_2,Z_1\) can be chosen to depend also only on t. The corresponding reduced DAE (2.5) then takes the form

$$\begin{aligned} \begin{array}{rll} {\hat{E}}_{1}(t)\dot{x}&{}={\hat{A}}_{1}(t)x+\hat{f}_1(t),\qquad &{}\hbox {(d differential equations)}\\ 0&{}={\hat{A}}_{2}(t)x+\hat{f}_2(t),&{}\hbox {(a algebraic equations)} \end{array} \end{aligned}$$
(2.8)

where

$$\begin{aligned} \begin{array}{lll} {\hat{E}}_{1}=Z_1^TE,&{}{\hat{A}}_{1}=Z_1^TA,&{}{\hat{f}}_{1}=Z_1^Tf,\\ &{}{\hat{A}}_{2}=Z_2^TN_\mu [I_n\>0\>\cdots \>0]^T,&{}{\hat{f}}_{2}=Z_2^Tg_\mu \end{array} \end{aligned}$$
(2.9)

with

$$\begin{aligned} M_\mu =\left[ \begin{array}{cccc} E\\ \dot{E}-A&{}E\\ \ddot{E}-2\dot{A}&{}2\dot{E}-A&{}E\\ \vdots &{}\vdots &{}\ddots &{}\ddots \end{array}\right] ,\ N_\mu =\left[ \begin{array}{cccc} A&{}0&{}\cdots &{}0\\ \dot{A}&{}0&{}\cdots &{}0\\ \ddot{A}&{}0&{}\cdots &{}0\\ \vdots &{}\vdots &{}&{}\vdots \end{array}\right] ,\ g_\mu =\left[ \begin{array}{c} f\\ \dot{f}\\ \ddot{f}\\ \vdots \end{array}\right] .\nonumber \\ \end{aligned}$$
(2.10)

The splitting of the variables as \(x=(x_1,x_2)\) that leads to the second form of a reduced DAE corresponds to a splitting of \(\hat{A}_2=[\>A_{21}\>\>A_{22}\>]\) with the requirement that \(A_{22}\) is pointwise nonsingular. It is then obvious that we can solve the second equation of (2.8) for \(x_2\) in terms of \(x_1\), differentiate, and eliminate \(x_2\) and \(\dot{x}_2\) in the first equation of (2.8) to obtain a linear version of (2.6).
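The elimination just described can be sketched numerically. The following minimal example (with randomly generated constant coefficient blocks, so that all derivatives of the coefficient functions vanish; the names are illustrative, not from the text) builds the inherent ODE \(\dot{x}_1={{\mathscr {L}}}(t,x_1)\) from a constant-coefficient instance of (2.8) and verifies consistency with the reduced DAE at one point:

```python
import numpy as np

rng = np.random.default_rng(1)
d, a = 2, 2

# constant-coefficient instance of the reduced DAE (2.8):
#   E11 x1' + E12 x2' = A11 x1 + A12 x2 + f1(t)
#                   0 = A21 x1 + A22 x2 + f2(t)
E11 = rng.standard_normal((d, d)); E12 = rng.standard_normal((d, a))
A11 = rng.standard_normal((d, d)); A12 = rng.standard_normal((d, a))
A21 = rng.standard_normal((a, d)); A22 = rng.standard_normal((a, a))  # assumed nonsingular
f1 = lambda t: np.array([np.sin(t), t])
f2 = lambda t: np.array([np.cos(t), 1.0])
df2 = lambda t: np.array([-np.sin(t), 0.0])      # derivative of f2, needed for x2'

# eliminate x2 = -A22^{-1}(A21 x1 + f2) and x2' = -A22^{-1}(A21 x1' + f2')
S = np.linalg.solve(A22, A21)
El = E11 - E12 @ S                               # assumed nonsingular
Al = A11 - A12 @ S

def L(t, x1):
    """Inherent ODE x1' = L(t, x1) obtained by elimination."""
    rhs = (Al @ x1 + f1(t)
           - A12 @ np.linalg.solve(A22, f2(t))
           + E12 @ np.linalg.solve(A22, df2(t)))
    return np.linalg.solve(El, rhs)

# consistency check at one point: the eliminated quantities satisfy (2.8)
t, x1 = 0.7, rng.standard_normal(d)
x2 = -np.linalg.solve(A22, A21 @ x1 + f2(t))
dx1 = L(t, x1)
dx2 = -np.linalg.solve(A22, A21 @ dx1 + df2(t))
print(np.allclose(E11 @ dx1 + E12 @ dx2, A11 @ x1 + A12 @ x2 + f1(t)))
```

In the time-varying case, additional terms involving the derivatives of the coefficient functions enter the elimination, which is precisely why the automatic differentiation machinery of Sect. 3 is needed.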

In order to utilize global canonical forms as they were presented in [15], we observe that the construction of (2.8) transforms covariantly with global equivalence transformations as follows. Let \((\tilde{E},\tilde{A})\) be globally equivalent to (EA), i.e., let sufficiently smooth, pointwise nonsingular matrix functions \(P\in C({\mathbb I},{\mathbb R}^{n,n})\) and \(Q\in C^1({\mathbb I},{\mathbb R}^{n,n})\) be given such that

$$\begin{aligned} \tilde{E}=PEQ,\quad \tilde{A}=PAQ-PE\dot{Q}, \end{aligned}$$
(2.11)

describing scalings of the DAE (1.7) and the unknown x, respectively. The corresponding Jacobians are then related by

$$\begin{aligned} \tilde{M}_\mu =\Pi _\mu M_\mu \Theta _\mu ,\quad \tilde{N}_\mu =\Pi _\mu N_\mu \Theta _\mu -\Pi _\mu M_\mu \Psi _\mu \end{aligned}$$
(2.12)

with

$$\begin{aligned} \Pi _\mu =\left[ \begin{array}{cccc} P\\ \dot{P}&{}P\\ \ddot{P}&{}2\dot{P}&{}P\\ \vdots &{}\vdots &{}\ddots &{}\ddots \end{array}\right] ,\ \Theta _\mu =\left[ \begin{array}{cccc} Q\\ 2\dot{Q}&{}Q\\ 3\ddot{Q}&{}3\dot{Q}&{}Q\\ \vdots &{}\vdots &{}\ddots &{}\ddots \end{array}\right] ,\ \Psi _\mu =\left[ \begin{array}{cccc} \dot{Q}&{}0&{}\cdots &{}0\\ \ddot{Q}&{}0&{}\cdots &{}0\\ \dddot{Q}&{}0&{}\cdots &{}0\\ \vdots &{}\vdots &{}&{}\vdots \end{array}\right] .\nonumber \\ \end{aligned}$$
(2.13)

With given choices \(Z_2,T_2,Z_1\) for \((E,A)\) along Hypothesis 2.1 we may choose \(\tilde{Z}_2,\tilde{T}_2,\tilde{Z}_1\) for \((\tilde{E},\tilde{A})\) as

$$\begin{aligned} \tilde{Z}_2^T=Z_2^T\Pi _\mu ^{-1},\quad \tilde{T}_2=Q^{-1}T_2,\quad \tilde{Z}_1^T=Z_1^TP^{-1}. \end{aligned}$$
(2.14)

Having summarized the theory for general nonlinear and linear time-varying DAEs, the next section deals with the construction of suitable inherent ODEs for a given DAE.

3 Construction and evaluation of an inherent ODE

To get more flexibility into the choice of an inherent ODE, we introduce a (linear but in general time-dependent) transformation of the unknown x before we perform the splitting, i.e., we consider

$$\begin{aligned} x=Q(t)\left[ \begin{array}{c}x_1\\ x_2\end{array}\right] , \end{aligned}$$
(3.1)

where \(Q\in C^1({\mathbb I},{\mathbb R}^{n,n})\) is sufficiently smooth and pointwise nonsingular. According to [13, Lemma 4.6], the transformed DAE (1.1) then satisfies Hypothesis 2.1 as well, with the same characteristic values \(\mu ,a,d\). As before, the only requirement for Q is that we can solve the algebraic constraints for \(x_2\) in terms of \(x_1\). Writing

$$\begin{aligned} Q=[\>T_2 \>\>T_2'\>], \end{aligned}$$
(3.2)

the algebraic constraints read

$$\begin{aligned} \hat{F}_2(t,T_2 x_1+T_2'x_2)=0. \end{aligned}$$

Hence, in order to be able to solve for \(x_2\) we need \(\hat{F}_{2;x}T_2'\) to be pointwise nonsingular. If this is the case, then the chosen Q fixes a reduced DAE of the form (2.6) satisfying

$$\begin{aligned} \begin{array}{l} F_{\mu +1}\left( t,Q(t)\left[ \begin{array}{c}x_1\\ x_2\end{array}\right] , Q(t)\left[ \begin{array}{c}\dot{x}_1\\ \dot{x}_2\end{array}\right] + \dot{Q}(t)\left[ \begin{array}{c}x_1\\ x_2\end{array}\right] ,{{\mathscr {W}}}(t,x_1,p)\right) \equiv 0,\\ \quad \dot{x}_1={{\mathscr {L}}}(t,x_1),\ x_2={{\mathscr {R}}}(t,x_1),\ \dot{x}_2={{\mathscr {R}}}_t(t,x_1)+{{\mathscr {R}}}_{x_1}(t,x_1){{\mathscr {L}}}(t,x_1) \end{array} \end{aligned}$$
(3.3)

with a suitable parameter \(p\in {\mathbb R}^a\) and a related function \({{\mathscr {W}}}\).

For a numerical realization, we are confronted with two problems. First, we must be able to evaluate the implicitly defined functions \({{\mathscr {L}}}\) and \({{\mathscr {R}}}\). Second, for a nontrivial choice of Q we must have access to \(\dot{Q}\).

In the next subsections, we discuss how to overcome these problems.

3.1 Numerical evaluation of the inherent ODE

The first problem can be dealt with by solving the system of (nonlinear) equations

$$\begin{aligned} F_{\mu +1}(t,x,\dot{x},w)=0,\quad [\>I_d\>\>0\>]Q(t)^{-1}x=x_1 \end{aligned}$$
(3.4)

for given \((t,x_1)\). Because of the first part in (3.4), at a solution, the resulting \((t,x,\dot{x},w)\) must satisfy

$$\begin{aligned} x= & {} Q(t)\left[ \begin{array}{c}x_1\\ {{\mathscr {R}}}(t,x_1)\end{array}\right] ,\ \dot{x}=Q(t)\left[ \begin{array}{c}{{\mathscr {L}}}(t,x_1)\\ {{\mathscr {R}}}_t(t,x_1)+{{\mathscr {R}}}_{x_1}(t,x_1){{\mathscr {L}}}(t,x_1)\end{array}\right] \\{} & {} \qquad \qquad \qquad \qquad \qquad +\dot{Q}(t)\left[ \begin{array}{c}x_1\\ {{\mathscr {R}}}(t,x_1)\end{array}\right] . \end{aligned}$$

Because of the second part in (3.4), we regain the prescribed \(x_1\). Furthermore, we observe that

$$\begin{aligned} {{\mathscr {R}}}(t,x_1)=[\>0\>\>I_a\>]Q(t)^{-1}x,\ {{\mathscr {L}}}(t,x_1)=[\>I_d\>\>0\>]Q(t)^{-1}(\dot{x}-\dot{Q}(t)Q(t)^{-1}x) \end{aligned}$$

yielding the required evaluations of \({{\mathscr {L}}}\) and \({{\mathscr {R}}}\).

Since (3.4) constitutes an underdetermined system of equations, the method of choice for solving (3.4) numerically is the Gauß-Newton method. In order to show that the Gauß-Newton method will converge quadratically for sufficiently good starting values, we need to show that the Jacobian at a solution possesses full row rank, see e.g. [5].
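For illustration (on a toy system, not on (3.4) itself), a Gauß-Newton iteration for an underdetermined system \(H(z)=0\) with full-row-rank Jacobian \(J_H\) uses the minimum-norm correction \(\Delta z=J_H^T(J_HJ_H^T)^{-1}H(z)\). A minimal sketch, with a hypothetical constraint map:

```python
import numpy as np

# toy underdetermined system H(z) = 0, H: R^3 -> R^2, full-row-rank Jacobian
def H(z):
    return np.array([z[0]**2 + z[1]**2 + z[2]**2 - 1.0,
                     z[0] - z[1]])

def JH(z):
    return np.array([[2*z[0], 2*z[1], 2*z[2]],
                     [1.0, -1.0, 0.0]])

z = np.array([0.9, 0.7, 0.3])                 # sufficiently good starting value
for _ in range(8):
    Jz = JH(z)
    # minimum-norm Gauss-Newton correction: dz = Jz^T (Jz Jz^T)^{-1} H(z)
    dz = Jz.T @ np.linalg.solve(Jz @ Jz.T, H(z))
    z = z - dz

print(np.allclose(H(z), 0.0))                 # z lies on the solution set
```

Since the Jacobian has full row rank at the solution, the iteration converges locally quadratically, which is exactly the property established for (3.4) in Theorem 3.1 below.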

Theorem 3.1

Let (1.1) satisfy Hypothesis 2.1 both with \(\mu ,a,d\) and with \(\mu +1,a,d\). Then, the Jacobian of (3.4) possesses full row rank at every solution provided that \(\smash {\hat{F}_{2;x} T_2'}\) is pointwise nonsingular.

Proof

Due to (2.4) for \(\mu +1\) replacing \(\mu \) we have

$$\begin{aligned} F_{\mu +1;x}-Z_{2,0}\hat{F}_{2;x}=0, \end{aligned}$$

omitting for convenience the arguments here and later. Hence,

$$\begin{aligned} \hat{F}_{2;x}=(Z_2^TZ_{2,0} )^{-1}Z_2^TF_{\mu +1;x} \end{aligned}$$

in a sufficiently small neighborhood. Completing \(Z_2\) to a pointwise nonsingular matrix function \([\>Z_2'\>\>Z_2 \>]\), elementary row operations applied to the Jacobian of the first part in (3.4) yield

$$\begin{aligned} \left[ \begin{array}{cc}F_{\mu +1;x}&F_{\mu +1;\dot{x},\ldots ,x^{(\mu +2)}}\end{array}\right] \rightarrow \left[ \begin{array}{cc}Z_2'^{T}F_{\mu +1;x}&{}Z_2'^{T}F_{\mu +1;\dot{x},\ldots ,x^{(\mu +2)}}\\ Z_2^TF_{\mu +1;x}&{}0\end{array}\right] . \end{aligned}$$

According to Hypothesis 2.1 the entry \(\smash {Z_2'^{T}F_{\mu +1;\dot{x},\ldots ,x^{(\mu +2)}}}\) possesses full row rank such that we are left with the entry \(Z_2^TF_{\mu +1;x}\) together with the Jacobian \([\>I_d\>\>0\>]Q^{-1}\) of the second equation in (3.4). Multiplying the first part with \((Z_2^TZ_{2,0} )^{-1}\) from the left and both parts with Q from the right yields the matrix function

$$\begin{aligned} \left[ \begin{array}{cc}\hat{F}_{2;x}T_2 &{}\hat{F}_{2;x}T_2'\\ I_d&{}0\end{array}\right] \end{aligned}$$

which is pointwise nonsingular provided that \(\hat{F}_{2;x}T_2'\) is pointwise nonsingular. \(\square \)

3.2 Numerical construction of the transformation

It remains to discuss how we can deal with \(\dot{Q}\) when extracting the evaluation of \({{\mathscr {L}}}(t,x_1)\). In particular, we are interested in applications where a trivial choice such as a constant Q, or a Q given beforehand with implemented functions to evaluate both Q(t) and \(\dot{Q}(t)\), is not possible, but where Q has to be chosen numerically during the integration of the DAE. The main problem in this context is that we must choose Q in a smooth way, at least on the current interval \([t_0,t_0+h]\) of the numerical integration with \(h>0\) sufficiently small, and that we must be able to evaluate \(\dot{Q}\).

The approach we will follow here is automatic differentiation, see [6]. This means that we work not only with the value of a variable but with a pair of numbers that represent the value and the derivative of a variable. Operations on such pairs are then defined by means of the known differentiation rules. If we use the notation \(\langle x,\dot{x}\rangle \) for such a pair, the typical operations used in linear algebra then read

$$\begin{aligned} \begin{array}{ll} \mathrm{(a)}\quad \langle x,\dot{x}\rangle +\langle y,\dot{y}\rangle =\langle x+y,\dot{x}+\dot{y}\rangle ,\\ \mathrm{(b)}\quad \langle x,\dot{x}\rangle -\langle y,\dot{y}\rangle =\langle x-y,\dot{x}-\dot{y}\rangle ,\\ \mathrm{(c)}\quad \langle x,\dot{x}\rangle \cdot \langle y,\dot{y}\rangle =\langle x\cdot y,\dot{x}\cdot y+x\cdot \dot{y}\rangle ,\\ \mathrm{(d)}\quad \langle x,\dot{x}\rangle /\langle y,\dot{y}\rangle =\langle x/y,(\dot{x}-x\cdot \dot{y}/y)/y\rangle ,\\ \mathrm{(e)}\quad \sqrt{\langle x,\dot{x}\rangle }=\langle \sqrt{x},{\textstyle \frac{1}{2}}\dot{x}/\sqrt{x}\rangle . \end{array} \end{aligned}$$
(3.5)

These operations can obviously be extended componentwise to vector and matrix operations.
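The pair arithmetic (3.5) can be sketched as a small operator-overloading class (written here in Python for brevity; the class name and seeding convention are illustrative):

```python
import math

class Pair:
    """A value-derivative pair <x, xdot> with the arithmetic rules (3.5)."""
    def __init__(self, x, xdot=0.0):
        self.x, self.xdot = x, xdot
    def __add__(self, other):                       # rule (a)
        return Pair(self.x + other.x, self.xdot + other.xdot)
    def __sub__(self, other):                       # rule (b)
        return Pair(self.x - other.x, self.xdot - other.xdot)
    def __mul__(self, other):                       # rule (c)
        return Pair(self.x * other.x, self.xdot * other.x + self.x * other.xdot)
    def __truediv__(self, other):                   # rule (d)
        return Pair(self.x / other.x,
                    (self.xdot - self.x * other.xdot / other.x) / other.x)
    def sqrt(self):                                 # rule (e)
        r = math.sqrt(self.x)
        return Pair(r, 0.5 * self.xdot / r)

# evaluate t -> sqrt(t*t + 1) together with its derivative at t = 2
t = Pair(2.0, 1.0)                                  # seed dt/dt = 1
y = (t * t + Pair(1.0)).sqrt()
print(y.x, y.xdot)                                  # sqrt(5) and 2/sqrt(5)
```

One evaluation sweep thus produces the value and the exact derivative simultaneously, which is precisely what is needed below to obtain pairs such as \(\langle Q,\dot{Q}\rangle \) from matrix factorizations.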

Note that in a programming language like C++ this approach can be implemented by defining a corresponding new class and overloading the above operations to work with this class. In this way it is possible to perform tasks of linear algebra like Cholesky decomposition \(A=L\cdot L^T\) in a smooth way yielding \(\langle L,\dot{L}\rangle \) for given \(\langle A,\dot{A}\rangle \). This is valid for all numerical algorithms that do not include if-clauses. If there are if-clauses, as for example in the QR decomposition \(A\cdot \Pi =Q\cdot R\), then we can at least locally get a smooth version. To do this for the QR decomposition, we may proceed as follows. For a reference point, typically \(t_0\), we perform a standard QR decomposition \(A(t_0)\cdot \Pi _0=Q_0\cdot R_0\). We then freeze all if-clauses and use automatic differentiation in the evaluation of the QR decomposition \(A\cdot \Pi _0=Q\cdot R\). In this way, we get \(\langle Q,\dot{Q}\rangle \) and \(\langle R,\dot{R}\rangle \) for given \(\langle A,\dot{A}\rangle \).

In particular, we can use this approach to perform the construction of reduced DAEs for linear time-varying systems as described in Sect. 2 with the aim to get not only values for the involved transformations but also values for their derivatives.

To start the construction of the reduced system (2.8), we need \(\dot{M}_\mu ,\dot{N}_\mu ,\dot{g}_\mu \) besides \(M_\mu ,N_\mu ,g_\mu \). Writing MNg for the formally infinite extensions of \(M_\mu ,N_\mu ,g_\mu \) and defining

$$\begin{aligned} S=\left[ \begin{array}{cccc}0\\ I_n&{}0\\ {} &{}I_n&{}0\\ {} &{}&{}\ddots &{}\ddots \end{array}\right] ,\quad V=\left[ \begin{array}{c}I_n\\ 0\\ 0\\ \vdots \end{array}\right] \end{aligned}$$

we have the relations

$$\begin{aligned} \dot{M}=S^TM-MS^T+N,\quad \dot{N}=S^TN,\quad \dot{g}=S^Tg, \end{aligned}$$

see [4]. Hence, from the evaluations \(M_{\mu +1},N_{\mu +1},g_{\mu +1}\) we can actually retrieve the desired \(\langle M_\mu ,\dot{M}_\mu \rangle \), \(\langle N_\mu ,\dot{N}_\mu \rangle \), and \(\langle g_\mu ,\dot{g}_\mu \rangle \). A first locally smooth QR decomposition then yields \(\langle Z_2,\dot{Z}_2\rangle \) and thus \(\smash {\langle \hat{A}_2,{\textstyle {d\over dt}}{\hat{A}}_2\rangle }\). A second locally smooth QR decomposition then gives \(\langle T_2,\dot{T}_2\rangle \) and with a third locally smooth QR decomposition for \(\langle E,\dot{E}\rangle \cdot \langle T_2,\dot{T}_2\rangle \) we finally get \(\langle Z_1,\dot{Z}_1\rangle \). In the latter case we can also use a standard QR decomposition once at \(t_0\) and use the resulting \(Z_{1,0}\) to set \(\langle Z_1,\dot{Z}_1\rangle =\langle Z_{1,0},0\rangle \) if this seems more suitable. The remaining quantities of the reduced DAE are then given by automatic differentiation along the lines of (2.9).

With a given choice \(\langle Q,\dot{Q}\rangle \) for fixing an inherent ODE, transforming the reduced DAE (2.5) by means of (3.1) yields

$$\begin{aligned} \begin{array}{rl} {\hat{E}}_{11}(t)\dot{x}_1+{\hat{E}}_{12}(t)\dot{x}_2&{}={\hat{A}}_{11}(t)x_1+{\hat{A}}_{12}(t)x_2+{\hat{f}}_{1}(t),\\ 0&{}={\hat{A}}_{21}(t)x_1+{\hat{A}}_{22}(t)x_2+{\hat{f}}_{2}(t), \end{array} \end{aligned}$$

where

$$\begin{aligned} \begin{array}{ll} {\hat{E}}_{11}={\hat{E}}_{1}T_{2},&{}{\hat{E}}_{12}={\hat{E}}_{1}T_{2}',\\ {\hat{A}}_{11}={\hat{A}}_{1}T_{2}-{\hat{E}}_{1}\dot{T}_{2},&{}{\hat{A}}_{12}={\hat{A}}_{1}T_{2}'-{\hat{E}}_{1}\dot{T}_{2}',\\ {\hat{A}}_{21}={\hat{A}}_{2}T_{2},&{}{\hat{A}}_{22}={\hat{A}}_{2}T_{2}', \end{array} \end{aligned}$$

and we are in the same situation as in the special case described in Sect. 2. In particular, we can solve for \(x_2\), differentiate, eliminate, and solve for \(\dot{x}_1\) to get the fixed inherent ODE.

A special choice of Q can be obtained by a locally smooth QR decomposition of \(\smash {\langle \hat{E}_1^T,{\textstyle {d\over dt}}{\hat{E}}_1^T\rangle }\) leading to \(\hat{E}_{12}=0\). Hypothesis 2.1 then guarantees that \(\hat{A}_{22}\) is pointwise nonsingular. If we set \(Q_0=Q(t_0)\) and \(\dot{Q}_0=\dot{Q}(t_0)\), we may also replace Q by the constant version \(Q(t)=Q_0\) or by the linearized version \(Q(t)=Q_0+(t-t_0)\dot{Q}_0\). The latter corresponds to the construction of so-called spin-stabilized integrators introduced in [14]. In the case that \(\mu =0\), the constructions can be simplified by using E instead of \(\hat{E}_1\) since no construction of a reduced system is required.

4 Symmetries and geometric integration

In this section we treat linear time-varying DAEs that are self-adjoint or skew-adjoint. The aim is to utilize the symmetry in the construction of a suitable inherent ODE such that it inherits certain properties of the original DAE. Note that self-adjointness and skew-adjointness are invariant under so-called congruence, i.e., under global equivalence (2.11) with \(P=Q^T\), see e.g. [15]. As there, we will write \((\tilde{E},\tilde{A})\equiv (E,A)\) to indicate that the pairs are congruent. Note also that regularity of a pair \((E,A)\) of sufficiently smooth matrix functions \(E,A\in C({\mathbb I},{\mathbb R}^{n,n})\) is necessary and sufficient for the associated DAE (1.7) to satisfy Hypothesis 2.1, see e.g. [13].

4.1 Self-adjoint DAEs

Assuming (1.8) for (1.7), we will make use of the following global canonical form taken from [15] in a slightly rephrased version.

Theorem 4.1

Let \((E,A)\) with \(E,A\in C({\mathbb I},{\mathbb R}^{n,n})\) be sufficiently smooth and let the associated DAE (1.7) satisfy Hypothesis 2.1. If \((E,A)\) is self-adjoint, then we have that

$$\begin{aligned} (E,A)\equiv \left( \left[ \begin{array}{ccc} 0&{}I_{p}&{}0\\ -I_{p}&{}0&{}0\\ 0&{}0&{}E_{33} \end{array}\right] , \left[ \begin{array}{ccc} 0&{}0&{}0\\ 0&{}A_{22}&{}A_{23}\\ 0&{}A_{32}&{}A_{33} \end{array}\right] \right) , \end{aligned}$$
(4.1)

where

$$\begin{aligned} E_{33}(t)\dot{x}_3=A_{33}(t)x_3+f_3(t), \end{aligned}$$
(4.2)

is uniquely solvable for every sufficiently smooth \(f_3\) without specifying initial conditions. Furthermore,

$$\begin{aligned} E_{33}^T=-E_{33},\quad A_{22}^T=A_{22},\quad A_{32}^T=A_{23},\quad A_{33}^T=A_{33}+\dot{E}_{33}. \end{aligned}$$
(4.3)

In order to construct a suitable reduced DAE (2.8), we follow the lines of Hypothesis 2.1 for the global canonical form, indicated by tildes, and start with

$$\begin{aligned} \tilde{M}_\mu =\left[ \begin{array}{ccc|ccc|ccc|c} 0&{}I_{p}&{}0&{}&{}&{}&{}&{}&{}&{}\\ -I_{p}&{}0&{}0&{}&{}&{}&{}&{}&{}&{}\\ 0&{}0&{}E_{33}&{}&{}&{}&{}&{}&{}&{}\\ \hline 0&{}0&{}0&{}0&{}I_{p}&{}0&{}&{}&{}&{}\\ 0&{}-A_{22}&{}-A_{23}&{}-I_{p}&{}0&{}0&{}&{}&{}&{}\\ 0&{}-A_{32}&{}\dot{E}_{33}-A_{33}&{}0&{}0&{}E_{33}&{}&{}&{}&{}\\ \hline 0&{}0&{}0&{}0&{}0&{}0&{}0&{}I_{p}&{}0\\ 0&{}-2\dot{A}_{22}&{}-2\dot{A}_{23}&{}0&{}-A_{22}&{}-A_{23}&{}-I_{p}&{}0&{}0\\ 0&{}-2\dot{A}_{32}&{}\ddot{E}_{33}-2\dot{A}_{33}&{}0&{}-A_{32}&{}2\dot{E}_{33}-A_{33}&{}0&{}0&{}E_{33}\\ \hline \vdots &{}\vdots &{}\vdots &{}\vdots &{}\vdots &{}\vdots &{}\vdots &{}\vdots &{}\vdots &{}\ddots \end{array}\right] . \end{aligned}$$

Due to the identity blocks, the only possible rank deficiency is related to the part belonging to the pair \((E_{33},A_{33})\). The properties of (4.2) then imply that \(d=2p\) and \(a=n-2p\) in Hypothesis 2.1. Furthermore, the left null space of \(\tilde{M}_\mu \) is described by

$$\begin{aligned} \tilde{Z}_2^T=\left[ \begin{array}{ccc|ccc|ccc|c} *&0&\tilde{Z}_{2,0}^T&*&0&\tilde{Z}_{2,1}^T&*&0&\tilde{Z}_{2,2}^T&\cdots \end{array}\right] . \end{aligned}$$

Observing that

$$\begin{aligned} \tilde{N}_\mu [I_n\>0\>\cdots \>0]^T=\left[ \begin{array}{ccc} 0&{}0&{}0\\ 0&{}A_{22}&{}A_{23}\\ 0&{}A_{32}&{}A_{33}\\ \hline 0&{}0&{}0\\ 0&{}\dot{A}_{22}&{}\dot{A}_{23}\\ 0&{}\dot{A}_{32}&{}\dot{A}_{33}\\ \hline 0&{}0&{}0\\ 0&{}\ddot{A}_{22}&{}\ddot{A}_{23}\\ 0&{}\ddot{A}_{32}&{}\ddot{A}_{33}\\ \hline \vdots &{}\vdots &{}\vdots \end{array}\right] , \end{aligned}$$

we get

$$\begin{aligned} \hat{A}_2=\left[ \begin{array}{ccc}0&\hat{A}_{32}&I_a\end{array}\right] \end{aligned}$$

for the second part of Hypothesis 2.1, where the identity comes from a special choice of \(\tilde{Z}_2^T\). Choosing

$$\begin{aligned} \tilde{T}_2=\left[ \begin{array}{cc}I_p&{}0\\ 0&{}I_p\\ 0&{}-\hat{A}_{32}\end{array}\right] \end{aligned}$$

and \(\tilde{Z}_1=\tilde{T}_2\) yields

$$\begin{aligned} \tilde{Z}_1^T\tilde{E}\tilde{T}_2 = \left[ \begin{array}{ccc}I_p&{}0&{}0\\ 0&{}I_p&{}-\hat{A}_{32}^T\end{array}\right] \left[ \begin{array}{ccc}0&{}I_{p}&{}0\\ -I_{p}&{}0&{}0\\ 0&{}0&{}E_{33}\end{array}\right] \left[ \begin{array}{cc}I_p&{}0\\ 0&{}I_p\\ 0&{}-\hat{A}_{32}\end{array}\right] = \left[ \begin{array}{cc}0&{}I_{p}\\ -I_{p}&{}\hat{A}_{32}^TE_{33} \hat{A}_{32} \end{array}\right] , \end{aligned}$$

which is indeed pointwise nonsingular, thus satisfying the third part of Hypothesis 2.1. In particular, the special choice \(\tilde{Z}_1=\tilde{T}_2\) is possible. According to (2.14) with \(P=Q^T\) we can also choose \(Z_1=T_2\) for the original pair such that the reduced DAE inherits some symmetry properties of the original DAE. Note also that we may assume that \(T_2\) possesses pointwise orthonormal columns.

By construction, the matrix function \(T_2^TET_2 \) is not only pointwise skew-symmetric but also pointwise nonsingular. We can then proceed similarly to [16]. Setting

$$\begin{aligned} T_2^TET_2 =\left[ \begin{array}{cc}\bar{E}&{}c\\ -c^T&{}0\end{array}\right] , \end{aligned}$$

there exists a smooth pointwise orthogonal transformation U with \(U^Tc=\alpha e_1\), \(\alpha \ne 0\), where \(e_1\) denotes the first canonical basis vector of appropriate size, see e.g. [13, Theorem 3.9]. It follows that

$$\begin{aligned} \left[ \begin{array}{cc}U\\ {} &{}1\end{array}\right] ^T\! \left[ \begin{array}{cc}\bar{E}&{}c\\ -c^T&{}0\end{array}\right] \left[ \begin{array}{cc}U\\ {} &{}1\end{array}\right] = \left[ \begin{array}{cc}U^T\bar{E}U&{}\alpha e_1\\ -\alpha e_1^T&{}0\end{array}\right] = \left[ \begin{array}{ccc}*&{}*&{}\alpha \\ {}*&{}\bar{\bar{E}}&{}0\\ -\alpha &{}0&{}0\end{array}\right] , \end{aligned}$$

where \(\bar{\bar{E}}\) is again skew-symmetric and pointwise nonsingular. Thus, inductively after p steps, we arrive at

$$\begin{aligned} W_1^TT_2^TET_2 W_1 =\left[ \begin{array}{cc}\tilde{E}_{11}&{}\tilde{E}_{12}\\ -\tilde{E}_{12}^T&{}0\end{array}\right] , \end{aligned}$$

where \(W_1\) collects all the applied transformations. By construction, \(\tilde{E}_{11}\) is skew-symmetric and \(\tilde{E}_{12}\) is anti-triangular and pointwise nonsingular. Finally, setting

$$\begin{aligned} W_2=\left[ \begin{array}{cc}I_p&{}0\\ -\frac{1}{2}\tilde{E}_{12}^{-1}\tilde{E}_{11} &{}\tilde{E}_{12}^{-1}\end{array}\right] \end{aligned}$$

yields

$$\begin{aligned} W_2^TW_1^TT_2^TET_2 W_1 W_2 =\left[ \begin{array}{cc}0&{}I_p\\ -I_p&{}0\end{array}\right] =J. \end{aligned}$$
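This final transformation step can be checked numerically. The following sketch (assuming NumPy; random blocks stand in for the matrix functions \(\tilde{E}_{11}\), \(\tilde{E}_{12}\)) verifies that the congruence with \(W_2\) produces J:

```python
import numpy as np

rng = np.random.default_rng(0)
p = 4

# random skew-symmetric E11 and a (generically) nonsingular E12
B = rng.standard_normal((p, p))
E11 = B - B.T                                       # skew-symmetric block
E12 = rng.standard_normal((p, p)) + p * np.eye(p)   # diagonally shifted, nonsingular

# block matrix [[E11, E12], [-E12^T, 0]] as in the text
M = np.block([[E11, E12], [-E12.T, np.zeros((p, p))]])

# W2 = [[I, 0], [-1/2 E12^{-1} E11, E12^{-1}]]
E12inv = np.linalg.inv(E12)
W2 = np.block([[np.eye(p), np.zeros((p, p))],
               [-0.5 * E12inv @ E11, E12inv]])

J = np.block([[np.zeros((p, p)), np.eye(p)],
              [-np.eye(p), np.zeros((p, p))]])

# the congruence with W2 yields the canonical skew-symmetric form J
assert np.allclose(W2.T @ M @ W2, J)
```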

For convenience, we write again \(T_2\) instead of the transformed \(T_2W_1W_2\). Completing \(T_2\) to a pointwise nonsingular Q according to (3.2), we get

$$\begin{aligned} Q^TEQ=\left[ \begin{array}{cc}J&{}\hat{E}_{12}\\ {}*&{}*\end{array}\right] ,\quad Q^TAQ-Q^TE\dot{Q}=\left[ \begin{array}{cc}C&{}\hat{A}_{12}\\ {}*&{}*\end{array}\right] . \end{aligned}$$

Since self-adjointness is invariant under congruence and J is constant, the matrix function C is pointwise symmetric. With (3.1) the reduced DAE transforms to

$$\begin{aligned} \begin{array}{rl} J{\dot{x}}_{1}+{\hat{E}}_{12}(t){\dot{x}}_{2}&{}=C(t)x_{1}+ {\hat{A}}_{12}(t)x_{2}+T_{2}(t)^Tf(t),\\ 0&{}={\hat{A}}_{22}(t)x_{2}+{\hat{f}}_{2}(t), \end{array} \end{aligned}$$

where \({\hat{A}}_{22} ={\hat{A}}_{2} T_{2}'\) is pointwise nonsingular. Solving the second equation for \(x_2\), differentiating, and eliminating \(x_2\) and \(\dot{x}_2\) from the first equation yields the inherent ODE

$$\begin{aligned} \dot{x}_1=J^{-1}C(t)x_1+\tilde{f}_1(t) \end{aligned}$$
(4.4)

with some transformed inhomogeneity \(\tilde{f}_1\).
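Since C is symmetric and \(J^{-1}=-J\), the coefficient matrix \(J^{-1}C\) is Hamiltonian, so the flow of (4.4) is symplectic. A minimal numerical sketch (assuming NumPy/SciPy; a constant symmetric C is used as an illustrative stand-in for the matrix function \(C(t)\)):

```python
import numpy as np
from scipy.linalg import expm

p = 2
n = 2 * p
rng = np.random.default_rng(1)

Zp, Ip = np.zeros((p, p)), np.eye(p)
J = np.block([[Zp, Ip], [-Ip, Zp]])

B = rng.standard_normal((n, n))
C = B + B.T                         # symmetric C (frozen coefficient)

# flow of xdot = J^{-1} C x over a time span; note J^{-1} = -J
Phi = expm(0.5 * (-J) @ C)

# symplecticity of the flow: Phi^T J Phi = J
assert np.allclose(Phi.T @ J @ Phi, J)
```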

Theorem 4.2

Let \((E,A)\) with \(E,A\in C({\mathbb I},{\mathbb R}^{n,n})\) be sufficiently smooth and let the associated DAE (1.7) satisfy Hypothesis 2.1. If \((E,A)\) is self-adjoint, then Q in (3.1) can be chosen from a restricted class of transformations in such a way that the ODE (1.4) belonging to the so constructed inherent ODE possesses a symplectic flow.

Proof

The above construction shows that it is possible to fix an inherent ODE such that the associated ODE (1.4) has a symplectic flow. It is special in the sense that it works with pointwise orthogonal transformations, with the exception of \(W_2\), which transforms within one half of the variables and adapts the other half to obtain the matrix J and thus a set of variables for which the inherent ODE is Hamiltonian. \(\square \)

In the special case \(\mu =0\) a slightly simplified construction is possible. Here, Hypothesis 2.1 says that E has constant rank, allowing us to choose Q in the form (3.2) such that

$$\begin{aligned} Q^TEQ=\left[ \begin{array}{cc}\hat{E}_{11}&{}0\\ 0&{}0\end{array}\right] \end{aligned}$$

with \(\hat{E}_{11} =T_2^TET_2 \) pointwise nonsingular. Then, the same modifications of \(T_2\) as before are possible leading to a modified \(T_2\) with \(\hat{E}_{11}=J\). With the corresponding modified Q, observing \(ET_2'=0\), we get that

$$\begin{aligned} Q^TEQ=\left[ \begin{array}{cc}J&{}0\\ 0&{}0\end{array}\right] ,\quad Q^TAQ-Q^TE\dot{Q}=\left[ \begin{array}{cc}\hat{A}_{11}&{}\hat{A}_{12}\\ \hat{A}_{21}&{}\hat{A}_{22}\end{array}\right] . \end{aligned}$$

Since congruence conserves self-adjointness, see e.g. [16], we have \(\hat{A}_{11}^T=\hat{A}_{11} \), \(\hat{A}_{12}^T=\hat{A}_{21}\), and \(\hat{A}_{22}^T=\hat{A}_{22} \). Moreover, Hypothesis 2.1 with \(\mu =0\) requires that \(\hat{A}_{22}\) is pointwise nonsingular. The corresponding reduced DAE, which is here just the original DAE, transforms to

$$\begin{aligned} \begin{array}{rl} J\dot{x}_1&{}={\hat{A}}_{11}(t)x_1+{\hat{A}}_{12}(t)x_2+T_2(t)^Tf(t),\\ 0&{}={\hat{A}}_{12}(t)^Tx_1+{\hat{A}}_{22}(t)x_2+T_2'(t)^Tf(t). \end{array} \end{aligned}$$

Solving the second equation for \(x_2\) and eliminating it from the first equation, we again obtain an inherent ODE of the form (4.4), where

$$\begin{aligned} C={\hat{A}}_{11} -{\hat{A}}_{12} {\hat{A}}_{22}^{-1}{\hat{A}}_{12}^T \end{aligned}$$

is pointwise symmetric.
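Indeed, C is the Schur complement of a symmetric block matrix and therefore symmetric. A quick numerical sanity check (assuming NumPy, with random blocks of illustrative sizes):

```python
import numpy as np

rng = np.random.default_rng(2)
p, a = 3, 2

B = rng.standard_normal((p, p))
A11 = B + B.T                        # symmetric diagonal block
A12 = rng.standard_normal((p, a))
D = rng.standard_normal((a, a))
A22 = D + D.T + 4 * np.eye(a)        # symmetric and nonsingular

# C = A11 - A12 A22^{-1} A12^T
C = A11 - A12 @ np.linalg.solve(A22, A12.T)
assert np.allclose(C, C.T)           # the Schur complement is symmetric
```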

Theoretically, all constructions can be performed globally. For a numerical realization one typically uses locally smooth variants as described in Sect. 3, which in this case is straightforward on the basis of locally smooth QR decompositions.

4.2 Skew-adjoint DAEs

Assuming (1.9) for (1.7), we will make use of the following global canonical form taken from [15] in a slightly rephrased version.

Theorem 4.3

Let \((E,A)\) with \(E,A\in C({\mathbb I},{\mathbb R}^{n,n})\) be sufficiently smooth and let the associated DAE (1.7) satisfy Hypothesis 2.1. If \((E,A)\) is skew-adjoint, then we have that

$$\begin{aligned} (E,A)\equiv \left( \left[ \begin{array}{ccc} I_{p}&{}0&{}0\\ 0&{}-I_{q}&{}0\\ 0&{}0&{}E_{33} \end{array}\right] , \left[ \begin{array}{ccc} 0&{}0&{}0\\ 0&{}0&{}0\\ 0&{}0&{}A_{33} \end{array}\right] \right) , \end{aligned}$$
(4.5)

where

$$\begin{aligned} E_{33}(t)\dot{x}_3=A_{33}(t)x_3+f_3(t) \end{aligned}$$
(4.6)

is uniquely solvable for every sufficiently smooth \(f_3\) without specifying initial conditions. Furthermore,

$$\begin{aligned} E_{33}^T=E_{33},\quad A_{33}^T=-A_{33}-\dot{E}_{33}. \end{aligned}$$
(4.7)

In order to construct a suitable reduced DAE (2.8), we proceed as in the self-adjoint case using the same notation. For the canonical form, we have

$$\begin{aligned} \tilde{M}_\mu =\left[ \begin{array}{ccc|ccc|ccc|c} I_{p}&{}0&{}0&{}&{}&{}&{}&{}&{}&{}\\ 0&{}-I_{q}&{}0&{}&{}&{}&{}&{}&{}&{}\\ 0&{}0&{}E_{33}&{}&{}&{}&{}&{}&{}&{}\\ \hline 0&{}0&{}0&{}I_{p}&{}0&{}0&{}&{}&{}&{}\\ 0&{}0&{}0&{}0&{}-I_{q}&{}0&{}&{}&{}&{}\\ 0&{}0&{}\dot{E}_{33}-A_{33}&{}0&{}0&{}E_{33}&{}&{}&{}&{}\\ \hline 0&{}0&{}0&{}0&{}0&{}0&{}I_{p}&{}0&{}0\\ 0&{}0&{}0&{}0&{}0&{}0&{}0&{}-I_{q}&{}0\\ 0&{}0&{}\ddot{E}_{33}-2\dot{A}_{33}&{}0&{}0&{}2\dot{E}_{33}-A_{33}&{}0&{}0&{}E_{33}\\ \hline \vdots &{}\vdots &{}\vdots &{}\vdots &{}\vdots &{}\vdots &{}\vdots &{}\vdots &{}\vdots &{}\ddots \end{array}\right] . \end{aligned}$$

Due to the identities, the only possible rank-deficiency is related to the part belonging to the pair \((E_{33},A_{33})\). The properties of (4.6) then imply that \(d=p+q\) and \(a=n-(p+q)\) in Hypothesis 2.1. Furthermore, the left null space of \(\tilde{M}_\mu \) is described by

$$\begin{aligned} \tilde{Z}_2^T=\left[ \begin{array}{ccc|ccc|ccc|c} 0&0&\tilde{Z}_{2,0}^T&0&0&\tilde{Z}_{2,1}^T&0&0&\tilde{Z}_{2,2}^T&\cdots \end{array}\right] . \end{aligned}$$

Observing that

$$\begin{aligned} \tilde{N}_\mu [I_n\>0\>\cdots \>0]^T=\left[ \begin{array}{ccc} 0&{}0&{}0\\ 0&{}0&{}0\\ 0&{}0&{}A_{33}\\ \hline 0&{}0&{}0\\ 0&{}0&{}0\\ 0&{}0&{}\dot{A}_{33}\\ \hline 0&{}0&{}0\\ 0&{}0&{}0\\ 0&{}0&{}\ddot{A}_{33}\\ \hline \vdots &{}\vdots &{}\vdots \end{array}\right] , \end{aligned}$$

we get that

$$\begin{aligned} \hat{A}_2=\left[ \begin{array}{ccc}0&0&I_a\end{array}\right] \end{aligned}$$

for the second part of Hypothesis 2.1, where the identity comes from a special choice of \(\tilde{Z}_2^T\). Choosing

$$\begin{aligned} \tilde{T}_2=\left[ \begin{array}{cc}I_p&{}0\\ 0&{}I_q\\ 0&{}0\end{array}\right] \end{aligned}$$

and \(\tilde{Z}_1=\tilde{T}_2\) yields

$$\begin{aligned} \tilde{Z}_1^T\tilde{E}\tilde{T}_2 = \left[ \begin{array}{ccc}I_p&{}0&{}0\\ 0&{}I_q&{}0\end{array}\right] \left[ \begin{array}{ccc}I_{p}&{}0&{}0\\ 0&{}-I_{q}&{}0\\ 0&{}0&{}E_{33}\end{array}\right] \left[ \begin{array}{cc}I_p&{}0\\ 0&{}I_q\\ 0&{}0\end{array}\right] = \left[ \begin{array}{cc}I_{p}&{}0\\ 0&{}-I_{q}\end{array}\right] , \end{aligned}$$

which is indeed pointwise nonsingular, thus satisfying the third part of Hypothesis 2.1. In particular, the special choice \(\tilde{Z}_1=\tilde{T}_2\) is possible. According to (2.14) with \(P=Q^T\) we can also choose \(Z_1=T_2\) for the original pair such that the reduced DAE inherits some symmetry properties of the original DAE. Note also that we may assume that \(T_2\) possesses pointwise orthonormal columns.

By construction, the matrix function \(T_2^TET_2 \) is not only pointwise symmetric but also pointwise nonsingular. We can then apply the results of [12], which guarantee the existence of a smooth matrix function W with

$$\begin{aligned} W^TT_2^TET_2 W=\left[ \begin{array}{cc}I_p&{}0\\ 0&{}-I_q\end{array}\right] =S. \end{aligned}$$

For convenience, we write again \(T_2\) instead of the transformed \(T_2W\). Completing \(T_2\) to a pointwise nonsingular Q according to (3.2), we get

$$\begin{aligned} Q^TEQ=\left[ \begin{array}{cc}S&{}\hat{E}_{12}\\ {}*&{}*\end{array}\right] ,\quad Q^TAQ-Q^TE\dot{Q}=\left[ \begin{array}{cc}J&{}\hat{A}_{12}\\ {}*&{}*\end{array}\right] . \end{aligned}$$

Since skew-adjointness is invariant under congruence, see [1, 15], and S is constant, the matrix function J is pointwise skew-symmetric. With (3.1) the reduced DAE transforms to

$$\begin{aligned} \begin{array}{rl} S{\dot{x}}_{1}+{\hat{E}}_{12}(t){\dot{x}}_2&{}=J(t)x_1 +{\hat{A}}_{12}(t)x_{2}+T_{2}(t)^Tf(t),\\ 0&{}={\hat{A}}_{22}(t)x_{2}+{\hat{f}}_{2}(t), \end{array} \end{aligned}$$

where \(\hat{A}_{22} =\hat{A}_2 T_2'\) is pointwise nonsingular. Solving the second equation for \(x_2\), differentiating, and eliminating \(x_2\) and \(\dot{x}_2\) from the first equation yields the inherent ODE

$$\begin{aligned} \dot{x}_1=S^{-1}J(t)x_1+\tilde{f}_1(t) \end{aligned}$$
(4.8)

with a transformed inhomogeneity \(\tilde{f}_1\).
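The flow of (4.8) preserves the indefinite inner product induced by S, i.e., it satisfies \(\Phi ^TS\Phi =S\); this is the generalized orthogonality of the flow. A minimal sketch (assuming NumPy/SciPy; a constant skew-symmetric J stands in for the matrix function \(J(t)\)):

```python
import numpy as np
from scipy.linalg import expm

p, q = 2, 1
n = p + q
rng = np.random.default_rng(3)

S = np.diag([1.0] * p + [-1.0] * q)      # S = diag(I_p, -I_q)
B = rng.standard_normal((n, n))
J = B - B.T                              # skew-symmetric J (frozen coefficient)

# flow of xdot = S^{-1} J x over a time span
Phi = expm(0.7 * np.linalg.inv(S) @ J)

# generalized orthogonality: Phi^T S Phi = S
assert np.allclose(Phi.T @ S @ Phi, S)
```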

Theorem 4.4

Let \((E,A)\) with \(E,A\in C({\mathbb I},{\mathbb R}^{n,n})\) be sufficiently smooth and let the associated DAE (1.7) satisfy Hypothesis 2.1. If \((E,A)\) is skew-adjoint, then Q in (3.1) can be chosen from a restricted class of transformations in such a way that the ODE (1.4) belonging to the so constructed inherent ODE possesses a generalized orthogonal flow.

Proof

The above construction shows that it is possible to fix an inherent ODE such that the associated ODE (1.4) has a generalized orthogonal flow. It is special in the sense that it works with pointwise orthogonal transformations with the exception of W. \(\square \)

In the special case \(\mu =0\) a slightly simplified construction is possible. Here, Hypothesis 2.1 implies that E has constant rank, allowing us to choose Q in the form (3.2) such that

$$\begin{aligned} Q^TEQ=\left[ \begin{array}{cc}\hat{E}_{11}&{}0\\ 0&{}0\end{array}\right] \end{aligned}$$

with \(\hat{E}_{11} =T_2^TET_2 \) pointwise nonsingular. Then, the same modifications of \(T_2\) as before are possible leading to a modified \(T_2\) with \(\hat{E}_{11}=S\). With the corresponding modified Q, observing \(ET_2'=0\), we get

$$\begin{aligned} Q^TEQ=\left[ \begin{array}{cc}S&{}0\\ 0&{}0\end{array}\right] ,\quad Q^TAQ-Q^TE\dot{Q}=\left[ \begin{array}{cc}\hat{A}_{11}&{}\hat{A}_{12}\\ \hat{A}_{21}&{}\hat{A}_{22}\end{array}\right] . \end{aligned}$$

Since congruence transformations conserve skew-adjointness, we have \(\hat{A}_{11}^T=-\hat{A}_{11} \), \(\hat{A}_{12}^T=-\hat{A}_{21}\), and \(\hat{A}_{22}^T=-\hat{A}_{22} \). Moreover, Hypothesis 2.1 with \(\mu =0\) requires that \(\hat{A}_{22}\) is pointwise nonsingular. The corresponding reduced DAE, which is here just the original DAE, transforms to

$$\begin{aligned} \begin{array}{rl} S\dot{x}_1&{}={\hat{A}}_{11}(t)x_1+{\hat{A}}_{12}(t)x_2 +T_2(t)^Tf(t),\\ 0&{}=-{\hat{A}}_{12}(t)^Tx_1+{\hat{A}}_{22}(t)x_2+T_2'(t)^Tf(t). \end{array} \end{aligned}$$

Solving the second equation for \(x_2\) and eliminating it from the first equation, we again obtain an inherent ODE of the form (4.8), where

$$\begin{aligned} J={\hat{A}}_{11} +{\hat{A}}_{12} {\hat{A}}_{22}^{-1}{\hat{A}}_{12}^T \end{aligned}$$

is pointwise skew-symmetric.

Theoretically, all constructions can be performed globally. For a numerical realization one typically uses locally smooth variants as described in Sect. 3. The only exception is the construction of a suitable W, for which a locally smooth variant to be used within the integration is still needed. One possibility is given in the following, cp. [12].

We start with a reference factorization

$$\begin{aligned} W_0^T\hat{E}_{11}(t_0)W_0=S \end{aligned}$$

which may be obtained by solving the symmetric eigenvalue problem and then scaling by a congruence so that the eigenvalues become \(\pm 1\), or by a Cholesky-like factorization for indefinite matrices as given by [2]. We then consider the matrix function

$$\begin{aligned} W_0^T\hat{E}_{11} W_0 =\left[ \begin{array}{cc}\tilde{E}_{11}&{}\tilde{E}_{12}\\ \tilde{E}_{21}&{}\tilde{E}_{22}\end{array}\right] , \end{aligned}$$

where \(\tilde{E}_{11}^T=\tilde{E}_{11} \), \(\tilde{E}_{12}^T=\tilde{E}_{21} \), and \(\tilde{E}_{22}^T=\tilde{E}_{22} \). In a sufficiently small neighborhood, the entry \(\tilde{E}_{11}\) is close to \(I_p\), the entry \(\tilde{E}_{22}\) is close to \(-I_q\), and the entry \(\tilde{E}_{12}\) is small in norm. In particular, the entry \(\tilde{E}_{11}\) is symmetric positive definite allowing for a Cholesky factorization

$$\begin{aligned} \tilde{E}_{11} =L_{11} L_{11}^T, \end{aligned}$$

which is a smooth process. We then get

$$\begin{aligned} \left[ \begin{array}{cc}L_{11}^{-1}&{}0\\ -\tilde{E}_{12}^T\tilde{E}_{11}^{-1}&{}I_q\end{array}\right] \left[ \begin{array}{cc}\tilde{E}_{11} &{}\tilde{E}_{12} \\ \tilde{E}_{12}^T&{}\tilde{E}_{22} \end{array}\right] \left[ \begin{array}{cc}L_{11}^{-T}&{} - \tilde{E}_{11}^{-1}\tilde{E}_{12} \\ 0&{}I_q\end{array}\right] = \left[ \begin{array}{cc}I_p&{}0\\ 0&{}\tilde{E}_{22} -\tilde{E}_{12}^T\tilde{E}_{11}^{-1}\tilde{E}_{12} \end{array}\right] . \end{aligned}$$

In a sufficiently small neighborhood, the Schur complement \(\tilde{E}_{22} -\tilde{E}_{12}^T\tilde{E}_{11}^{-1}\tilde{E}_{12} \) is symmetric negative definite allowing for a Cholesky factorization

$$\begin{aligned} -(\tilde{E}_{22} -\tilde{E}_{12}^T\tilde{E}_{11}^{-1}\tilde{E}_{12})=L_{22} L_{22}^T, \end{aligned}$$

such that

$$\begin{aligned} \left[ \begin{array}{cc}I_p&{}0\\ 0&{}L_{22}^{-1}\end{array}\right] \left[ \begin{array}{cc}I_p&{}0\\ 0&{}\tilde{E}_{22} -\tilde{E}_{12}^T\tilde{E}_{11}^{-1}\tilde{E}_{12} \end{array}\right] \left[ \begin{array}{cc}I_p&{}0\\ 0&{}L_{22}^{-T}\end{array}\right] = \left[ \begin{array}{cc}I_p&{}0\\ 0&{}-I_q\end{array}\right] =S. \end{aligned}$$

Gathering all transformations gives the locally smooth

$$\begin{aligned} W=W_0\left[ \begin{array}{cc}L_{11}^{-T}&{}-\tilde{E}_{11}^{-1}\tilde{E}_{12} \\ 0&{}I_q\end{array}\right] \left[ \begin{array}{cc}I_p&{}0\\ 0&{}L_{22}^{-T}\end{array}\right] \end{aligned}$$

and all steps can be executed numerically in a smooth way using automatic differentiation.
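A sketch of this local construction (assuming NumPy; the block sizes, the reference point, and the perturbation are illustrative, and `smooth_signature_factor` is a hypothetical helper name, not from the paper):

```python
import numpy as np

def smooth_signature_factor(E11, W0, p, q):
    """Given symmetric nonsingular E11 close to the reference point and a
    reference factor W0 with W0^T E11(t0) W0 = S = diag(I_p, -I_q),
    return W with W^T E11 W = S via the two Cholesky steps of the text."""
    M = W0.T @ E11 @ W0
    M11, M12, M22 = M[:p, :p], M[:p, p:], M[p:, p:]
    L11 = np.linalg.cholesky(M11)                   # M11 ~ I_p is s.p.d. locally
    Schur = M22 - M12.T @ np.linalg.solve(M11, M12)
    L22 = np.linalg.cholesky(-Schur)                # Schur ~ -I_q is s.n.d. locally
    step1 = np.block([[np.linalg.inv(L11).T, -np.linalg.solve(M11, M12)],
                      [np.zeros((q, p)), np.eye(q)]])
    step2 = np.block([[np.eye(p), np.zeros((p, q))],
                      [np.zeros((q, p)), np.linalg.inv(L22).T]])
    return W0 @ step1 @ step2

# example: a symmetric matrix near the reference factorization
p, q = 2, 2
S = np.diag([1.0, 1.0, -1.0, -1.0])
rng = np.random.default_rng(4)
W0 = np.eye(p + q)                    # reference: E11(t0) = S itself
P = 0.05 * rng.standard_normal((p + q, p + q))
E11 = S + P + P.T                     # nearby symmetric, nonsingular matrix

W = smooth_signature_factor(E11, W0, p, q)
assert np.allclose(W.T @ E11 @ W, S)
```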

5 Numerical experiments

The presented numerical method has been implemented using automatic differentiation in order to be able to evaluate all needed derivatives and Jacobians. For the determination of \(\langle Q,\dot{Q}\rangle \) on the current interval \([t_0,t_0+h]\) one can choose between the following possibilities.

$$\begin{aligned} \begin{array}{|l|l|}\hline \texttt {INHERENT}&{}Q(t)=Q_0\\ \hline \texttt {SPIN}\_\texttt {STABILIZED}&{}Q(t)=Q_0+(t-t_0){\dot{Q}}_0\\ \hline \texttt {ROTATED}&{}Q=[\>T_{2} \>\>T_{2}'\>],\ {\hat{E}}_{1} T_{2}'=0\\ \hline \texttt {SELF}\_\texttt {ADJOINT}&{}Q\hbox { as described in Subsection } 4.1\\ \hline \texttt {SKEW}\_\texttt {ADJOINT}&{}Q\hbox { as described in Subsection } 4.2\\ \hline \texttt {PRESCRIBED}&{}Q\hbox { by user-provided routine}\\ \hline \end{array} \end{aligned}$$

In all cases except for the last one, one can choose between the general approach, which includes the transformation to a reduced DAE, and the simplified approach assuming that no such transformation is necessary. Schemes based on the direct discretization of (2.5) are labelled as DIRECT. As numerical integration methods we use standard discretization schemes such as the implicit Euler method, Gauss, Gauss-Lobatto, and Radau collocation schemes, and Dormand-Prince Runge-Kutta methods, see e.g. [10, 13].

Experiment 5.1

The linear DAE

$$\begin{aligned} \left[ \begin{array}{cc} \delta -1 &{} \delta t \\ 0 &{} 0 \end{array}\right] \left[ \begin{array}{c} \dot{x}_1 \\ \dot{x}_2 \end{array}\right] = \left[ \begin{array}{cc}-\eta (\delta -1) &{} -\eta \delta t \\ \delta -1 &{} \delta t-1 \end{array}\right] \left[ \begin{array}{c} x_1 \\ x_2 \end{array}\right] + \left[ \begin{array}{c} f_1(t) \\ f_2(t) \end{array}\right] , \end{aligned}$$

cp. [18], with real parameters \(\eta \) and \(\delta \ne 1\) is constructed in such a way that direct discretization by the implicit Euler method corresponds to the discretization of an inherent ODE by the explicit Euler method. Setting \(\delta =-10^5,\ \eta =0\) yields a stiff inherent ODE, and we expect stability problems when working directly with the implicit Euler method. For our numerical experiments we have chosen \(f_1,f_2\) and the initial condition so that the solution is given by \(x_1(t)=x_2(t)=\exp (-t)\). The integration interval was [0, 1] and the tolerance was \(10^{-5}\). The following table gives the cpu times and the number of integration steps for the various versions of the implicit Euler method.

$$\begin{aligned} \begin{array}{|l|c|c|}\hline \text {version}&{}\text {cpu time}&{}\text {steps}\\ \hline \hline \texttt {DIRECT}&{}\texttt {10.31}&{}\texttt {97840}\\ \hline \texttt {INHERENT}&{}\texttt {~0.73}&{}\texttt {~~~10}\\ \hline \texttt {SPIN}\_\texttt {STABILIZED}&{}\texttt {~0.60}&{}\texttt {~~~10}\\ \hline \texttt {ROTATED}&{}\texttt {~0.67}&{}\texttt {~~~10}\\ \hline \end{array} \end{aligned}$$

The stabilizing effect of discretizing an inherent ODE is obvious. The three different versions in the choice of the inherent ODE do not differ significantly.
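For illustration, one implicit Euler step for this linear DAE can be sketched as follows (assuming NumPy; the moderate value \(\delta =-10\) and the step size are illustrative, not the experiment's settings):

```python
import numpy as np

# one implicit Euler step for the linear DAE E(t) xdot = A(t) x + f(t)
# of Experiment 5.1; delta, eta and the step size are illustrative values
delta, eta = -10.0, 0.0

def E(t):
    return np.array([[delta - 1.0, delta * t], [0.0, 0.0]])

def A(t):
    return np.array([[-eta * (delta - 1.0), -eta * delta * t],
                     [delta - 1.0, delta * t - 1.0]])

def f(t):
    # inhomogeneity chosen so that x(t) = (exp(-t), exp(-t)) solves the DAE
    x = np.exp(-t) * np.ones(2)
    xdot = -x
    return E(t) @ xdot - A(t) @ x

def implicit_euler_step(t0, x0, h):
    # solve (E(t1) - h A(t1)) x1 = E(t1) x0 + h f(t1) for x1
    t1 = t0 + h
    return np.linalg.solve(E(t1) - h * A(t1), E(t1) @ x0 + h * f(t1))

x0 = np.ones(2)            # exact initial value x(0) = (1, 1)
h = 1e-3
x1 = implicit_euler_step(0.0, x0, h)

# the step satisfies the implicit Euler equation, including the constraint row
assert np.allclose(E(h) @ (x1 - x0) / h, A(h) @ x1 + f(h))
```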

Experiment 5.2

A mathematical model of a pendulum is given by the DAE

$$\begin{aligned} \begin{array}{rl} \dot{x}_3&{}=x_1,\\ \dot{x}_4&{}=x_2,\\ -\dot{x}_1&{}=2x_3x_5,\\ -\dot{x}_2&{}=1+2x_4x_5,\\ 0&{}=x_3^2+x_4^2-1, \end{array} \end{aligned}$$

which is known to satisfy Hypothesis 2.1 with \(\mu =2\), \(a=3\), and \(d=2\). The equations and unknowns are ordered in such a way that

$$\begin{aligned} F_{\dot{x}}(t,x,\dot{x})=\left[ \begin{array}{ccccc} 0&{}0&{}1&{}0&{}0\\ 0&{}0&{}0&{}1&{}0\\ -1&{}0&{}0&{}0&{}0\\ 0&{}-1&{}0&{}0&{}0\\ 0&{}0&{}0&{}0&{}0\end{array}\right] ,\quad F_{x}(t,x,\dot{x})=\left[ \begin{array}{ccccc} 1&{}0&{}0&{}0&{}0\\ 0&{}1&{}0&{}0&{}0\\ 0&{}0&{}2x_5&{}0&{}2x_3\\ 0&{}0&{}0&{}2x_5&{}2x_4\\ 0&{}0&{}2x_3&{}2x_4&{}0\end{array}\right] . \end{aligned}$$

Hence, \((F_{\dot{x}},F_{x})\) is self-adjoint for all arguments. The constructions of Sect. 4, however, are only valid for linear DAEs and are therefore not applicable here. The only valid use of an inherent ODE as presented here is by the versions INHERENT and PRESCRIBED, since in the nonlinear case the Jacobians do not depend only on t. The following table shows the performance of various discretization schemes when integrating over the interval [0, 10] with stepsize control, starting with \(x(0)=(0,0,1,0,0)^T\) and using a tolerance of \(10^{-5}\).

$$\begin{aligned} \begin{array}{|l|l|c|c|c|c|}\hline \text {method}&{}\text {version}&{}\text {stages}&{}\text {order}&{}\text {cpu time}&{}\text {steps}\\ \hline \hline \texttt {GAUSS-LOBATTO}&{}\texttt {DIRECT}&{}\texttt {2-3}&{}\texttt {4}&{}\texttt {~0.93}&{}\texttt {55}\\ \hline \texttt {RADAU}&{}\texttt {DIRECT}&{}\texttt {4}&{}\texttt {7}&{}\texttt {~0.99}&{}\texttt {28}\\ \hline \texttt {DORMAND-PRINCE}&{}\texttt {INHERENT}&{}\texttt {7}&{}\texttt {4}&{}\texttt {~1.33}&{}\texttt {47}\\ \hline \texttt {DORMAND-PRINCE}&{}\texttt {INHERENT}&{}\texttt {13}&{}\texttt {7}&{}\texttt {~1.29}&{}\texttt {28}\\ \hline \texttt {GAUSS}&{}\texttt {INHERENT}&{}\texttt {2}&{}\texttt {4}&{}\texttt {11.76}&{}\texttt {55}\\ \hline \texttt {RADAU}&{}\texttt {INHERENT}&{}\texttt {4}&{}\texttt {7}&{}\texttt {12.92}&{}\texttt {34}\\ \hline \end{array} \end{aligned}$$

In particular, we observe that we are able to solve the given problem by explicit schemes for the chosen inherent ODE with nearly the same efficiency as the standard direct methods. As in the standard ODE case, implicit methods applied to the inherent ODE are for this problem outperformed by explicit methods since the inherent ODE is non-stiff.
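The self-adjoint structure of the Jacobians can be checked directly: \(F_{\dot{x}}\) is pointwise skew-symmetric and \(F_{x}\) is pointwise symmetric. A minimal sketch (assuming NumPy):

```python
import numpy as np

def jacobians(x):
    """Jacobians F_xdot and F_x of the pendulum DAE; for this system
    they depend neither on t nor on xdot."""
    x3, x4, x5 = x[2], x[3], x[4]
    F_xdot = np.array([[0, 0, 1, 0, 0],
                       [0, 0, 0, 1, 0],
                       [-1, 0, 0, 0, 0],
                       [0, -1, 0, 0, 0],
                       [0, 0, 0, 0, 0]], dtype=float)
    F_x = np.array([[1, 0, 0, 0, 0],
                    [0, 1, 0, 0, 0],
                    [0, 0, 2 * x5, 0, 2 * x3],
                    [0, 0, 0, 2 * x5, 2 * x4],
                    [0, 0, 2 * x3, 2 * x4, 0]], dtype=float)
    return F_xdot, F_x

# self-adjoint structure: F_xdot skew-symmetric, F_x symmetric
Fd, Fx = jacobians(np.array([0.0, 0.0, 1.0, 0.0, 0.0]))
assert np.allclose(Fd, -Fd.T) and np.allclose(Fx, Fx.T)
```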

In the following experiments we measure the geometric error in the flow \(\Phi \) with respect to a quadratic Lie group (1.5), i.e., the deviation of the flow \(\Phi \) from being pointwise in the corresponding Lie group, by \(\Vert \Phi (t)^TX\Phi (t)-X\Vert \), where we use the matrix norm defined by \(\Vert \Delta \Vert =\max _{i,j=1,\ldots ,n}|\Delta _{ij}|\) for a matrix \(\Delta =[\Delta _{ij}]\in {\mathbb R}^{n,n}\).
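This error measure can be written as a small helper (assuming NumPy; `geometric_error` is a hypothetical name used only for illustration):

```python
import numpy as np

def geometric_error(Phi, X):
    """Max-norm deviation of Phi from the quadratic Lie group defined by X."""
    D = Phi.T @ X @ Phi - X
    return np.abs(D).max()

# example: a rotation lies in the orthogonal group (X = I), so the error vanishes
c, s = np.cos(0.4), np.sin(0.4)
R = np.array([[c, -s], [s, c]])
err = geometric_error(R, np.eye(2))
assert err < 1e-12
```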

Experiment 5.3

The self-adjoint DAE \(E(t)\dot{x}=A(t)x\) given by

$$\begin{aligned} E=Q^T\hat{E}Q,\quad A=Q^T\hat{A}Q-Q^T\hat{E}\dot{Q}, \end{aligned}$$

where

$$\begin{aligned} \hat{E}=\left[ \begin{array}{ccc}0&{}1&{}0\\ -1&{}0&{}0\\ 0&{}0&{}0\end{array}\right] ,\quad \hat{A}=\left[ \begin{array}{ccc}1&{}0&{}0\\ 0&{}1&{}0\\ 0&{}0&{}1\end{array}\right] ,\quad Q=\left[ \begin{array}{ccc}1&{}s&{}0\\ s&{}1&{}s\\ 0&{}s&{}1\end{array}\right] , \end{aligned}$$

with \(s(t)=\frac{1}{2}\sin \omega t,\ \omega =1\), possesses a symplectic flow with respect to the first two components of the transformed unknown \(\hat{x}=Qx\).
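The self-adjointness of the constructed pair, here in the sense \(E^T=-E\) and \(A^T=A+\dot{E}\) (the counterpart of (4.7)), can be verified numerically (assuming NumPy; the derivative of Q is computed analytically from \(s(t)\)):

```python
import numpy as np

omega = 1.0

def s(t):  return 0.5 * np.sin(omega * t)
def sd(t): return 0.5 * omega * np.cos(omega * t)

Ehat = np.array([[0, 1, 0], [-1, 0, 0], [0, 0, 0]], dtype=float)
Ahat = np.eye(3)
N = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)  # pattern of Q

def Q(t):  return np.eye(3) + s(t) * N
def Qd(t): return sd(t) * N

def E(t): return Q(t).T @ Ehat @ Q(t)
def A(t): return Q(t).T @ Ahat @ Q(t) - Q(t).T @ Ehat @ Qd(t)
def Ed(t):  # derivative of E by the product rule
    return Qd(t).T @ Ehat @ Q(t) + Q(t).T @ Ehat @ Qd(t)

# self-adjointness of the pair (E, A): E^T = -E and A^T = A + Edot
t = 0.3
assert np.allclose(E(t).T, -E(t))
assert np.allclose(A(t).T, A(t) + Ed(t))
```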

The following table shows the performance and the maximal geometric error in the flow for various discretization schemes when integrating over the interval \([0,200\pi ]\) using 1,000 equidistant steps. We used the simplified approach due to \(\mu =0\).

$$\begin{aligned} \begin{array}{|l|l|c|c|c|c|}\hline \text {method}&{}\text {version}&{}\text {stages}&{}\text {order}&{}\text {cpu time}&{}\text {error}\\ \hline \hline \texttt {GAUSS-LOBATTO}&{}\texttt {DIRECT}&{}\texttt {2-3}&{}\texttt {4}&{}\texttt {~1.44}&{}\texttt {1.380e-02}\\ \hline \texttt {DORMAND-PRINCE}&{}\texttt {INHERENT}&{}\texttt {7}&{}\texttt {4}&{}\texttt {~5.36}&{}\texttt {2.468e-01}\\ \hline \texttt {GAUSS}&{}\texttt {ROTATED}&{}\texttt {2}&{}\texttt {4}&{}\texttt {23.68}&{}\texttt {7.281e-04}\\ \hline \texttt {GAUSS}&{}\texttt {SELF}\_\texttt {ADJOINT}&{}\texttt {2}&{}\texttt {4}&{}\texttt {24.88}&{}{\texttt {1.927e-11}}\\ \hline \end{array} \end{aligned}$$

In particular, we see that the last method given in the table shows a geometric error only governed by roundoff effects, thus constituting a geometric integrator for this class of problems.

Experiment 5.4

The skew-adjoint DAE \(E(t)\dot{x}=A(t)x\) given by

$$\begin{aligned} E=Q^T\hat{E}Q,\quad A=Q^T\hat{A}Q-Q^T\hat{E}\dot{Q}, \end{aligned}$$

where

$$\begin{aligned} \hat{E}=\left[ \begin{array}{cccc}1&{}0&{}0&{}0\\ 0&{}1&{}0&{}0\\ 0&{}0&{}0&{}0\\ 0&{}0&{}0&{}0\end{array}\right] ,\quad \hat{A}=\left[ \begin{array}{cccc}0&{}1&{}0&{}0\\ -1&{}0&{}0&{}0\\ 0&{}0&{}0&{}1\\ 0&{}0&{}-1&{}0\end{array}\right] ,\quad Q=\left[ \begin{array}{cccc}1&{}s&{}0&{}0\\ s&{}1&{}s&{}0\\ 0&{}s&{}1&{}s\\ 0&{}0&{}s&{}1\end{array}\right] ,\quad \end{aligned}$$

with \(\smash {s(t)=\frac{1}{2}\sin \omega t,\ \omega =1}\), possesses an orthogonal flow with respect to the first two components of the transformed unknown \(\hat{x}=Qx\).

The following table shows the performance and the maximal geometric error in the flow for various discretization schemes when integrating over the interval \([0,200\pi ]\) using 1,000 equidistant steps. We used the simplified approach due to \(\mu =0\).

$$\begin{aligned} \begin{array}{|l|l|c|c|c|c|}\hline \text {method}&{}\text {version}&{}\text {stages}&{}\text {order}&{}\text {cpu time}&{}\text {error}\\ \hline \hline \texttt {GAUSS-LOBATTO}&{}\texttt {DIRECT}&{}\texttt {2-3}&{}\texttt {4}&{}\texttt {~2.05}&{}\texttt {1.226e-01}\\ \hline \texttt {DORMAND-PRINCE}&{}\texttt {INHERENT}&{}\texttt {7}&{}\texttt {4}&{}\texttt {12.52}&{}\texttt {1.965e-02}\\ \hline \texttt {GAUSS}&{}\texttt {ROTATED}&{}\texttt {2}&{}\texttt {4}&{}\texttt {74.85}&{}\texttt {9.363e+00}\\ \hline \texttt {GAUSS}&{}\texttt {SKEW}\_\texttt {ADJOINT}&{}\texttt {2}&{}\texttt {4}&{}\texttt {80.64}&{}{\texttt {3.814e-12}}\\ \hline \end{array} \end{aligned}$$

In particular, we see that the last method given in the table shows a geometric error only governed by roundoff effects, thus constituting a geometric integrator for this class of problems.

Experiment 5.5

The skew-adjoint DAE \(E(t)\dot{x}=A(t)x\) given by

$$\begin{aligned} E=Q^T\hat{E}Q,\quad A=Q^T\hat{A}Q-Q^T\hat{E}\dot{Q}, \end{aligned}$$

where

$$\begin{aligned} \hat{E}=\left[ \begin{array}{ccccc}1&{}0&{}0&{}0&{}0\\ 0&{}1&{}0&{}0&{}0\\ 0&{}0&{}-1&{}0&{}0\\ 0&{}0&{}0&{}0&{}0\\ 0&{}0&{}0&{}0&{}0\end{array}\right] ,\quad \hat{A}=\left[ \begin{array}{ccccc}0&{}1&{}0&{}0&{}0\\ -1&{}0&{}0&{}0&{}0\\ 0&{}0&{}0&{}0&{}0\\ 0&{}0&{}0&{}0&{}1\\ 0&{}0&{}0&{}-1&{}0\end{array}\right] ,\quad Q=\left[ \begin{array}{ccccc}1&{}s&{}0&{}0&{}0\\ s&{}1&{}s&{}0&{}0\\ 0&{}s&{}1&{}s&{}0\\ 0&{}0&{}s&{}1&{}s\\ 0&{}0&{}0&{}s&{}1\end{array}\right] ,\quad \end{aligned}$$

with \(\smash {s(t)=\frac{1}{2}\sin \omega t,\ \omega =1}\), possesses a generalized orthogonal flow with respect to the first three components of the transformed unknown \(\hat{x}=Qx\).

The following table shows the performance and the maximal geometric error in the flow for various discretization schemes when integrating over the interval \([0,200\pi ]\) using 1,000 equidistant steps. We used the simplified approach due to \(\mu =0\).

$$\begin{aligned} \begin{array}{|l|l|c|c|c|c|}\hline \text {method}&{}\text {version}&{}\text {stages}&{}\text {order}&{}\text {cpu time}&{}\text {error}\\ \hline \hline \texttt {GAUSS-LOBATTO}&{}\texttt {DIRECT}&{}\texttt {2-3}&{}\texttt {4}&{}\texttt {~~3.33}&{}\texttt {4.548e-01}\\ \hline \texttt {DORMAND-PRINCE}&{}\texttt {INHERENT}&{}\texttt {7}&{}\texttt {4}&{}\texttt {~32.40}&{}\texttt {8.957e-01}\\ \hline \texttt {GAUSS}&{}\texttt {ROTATED}&{}\texttt {2}&{}\texttt {4}&{}\texttt {226.55}&{}\texttt {6.912e-01}\\ \hline \texttt {GAUSS}&{}\texttt {SKEW}\_\texttt {ADJOINT}&{}\texttt {2}&{}\texttt {4}&{}\texttt {288.55}&{}{\texttt {1.096e-11}}\\ \hline \end{array} \end{aligned}$$

In particular, we see that again the last method given in the table shows a geometric error only governed by roundoff effects, thus constituting a geometric integrator for this class of problems.

6 Conclusions

We have presented discretization methods for DAEs that are based on the integration of an inherent ODE which is extracted from the derivative array equations associated with the given DAE utilizing automatic differentiation. We have shown that for this inherent ODE we can use classical discretization schemes for the numerical integration of ODEs that cannot be used for DAEs directly. For self-adjoint and skew-adjoint linear time-varying DAEs we have shown that the inherent ODE can be constructed in such a way that it inherits these symmetry properties of the given DAE and thus also the geometric properties of its flow. We then have exploited this property to construct geometric integration schemes with a numerical flow that preserves these geometric properties.