1 Introduction

Isospectral systems, for which the time evolution preserves the spectrum of a matrix state variable, arise in the mathematical description of various systems of interest. They are generically given by

$$\begin{aligned} {\dot{A}} = \big [ B(A) , A \big ] , \quad A(0) = A_0 \end{aligned}$$
(1.1)

where A, B(A) are matrices and the spectrum of A is preserved [9, IV.3.2]. See [6] for a detailed discussion of the properties of these systems.

A classical example for an isospectral system is the ideal rigid body in its reduced formulation [12], as described by Euler’s equations. In this case the angular momentum (as an anti-symmetric matrix in \(\mathfrak {so}^*(3)\)) is the isospectrally conserved quantity. The periodic Toda lattice in its Lax pair formulation is also isospectral with the isospectral state L in this case being symmetric and the generator of the dynamics B(L) being anti-symmetric. Another example is the discrete 2D barotropic vorticity equation on the torus and 2-sphere when the “matrix model” is used for spatial discretization, i.e. the representation theory of the discrete Heisenberg group and \(\mathrm {SO}(3)\), respectively, is used [10, 21, 22]. The isospectral property is in this case a discrete analogue (with N conserved quantities for \(N^{2}\) degrees of freedom) of the integrated powers of vorticity that are conserved in the continuous system [1].

Since conserved quantities are an important intrinsic characteristic of a physical system, it is natural to preserve isospectrality under discretization. An early work that accomplished this was that of Moser and Veselov [17] for rigid-body-like systems. Moore, Mahony, and Helmke [16] introduced isospectral time integrators for the solution of problems in linear algebra, where many classical questions, such as the QR or eigendecomposition, can be cast as isospectral flows. Isospectral numerical time integrators were also considered by Diele, Lopez and Politi who proposed the use of the Cayley transform [7]. Calvo, Iserles and Zanna [6] showed that Runge-Kutta methods applied to Eq. 1.1 break isospectrality. To overcome this, they introduced modified Gauss-Legendre Runge-Kutta methods that provide isospectral time integration schemes of arbitrarily high order. In [3], Bogfjellmo and Marthinsen proposed higher order symplectic integrators on the cotangent bundle \(T^*G\). Modin and Viviani [14] recently developed an alternative route to overcome the obstruction from [6] by applying a Runge-Kutta scheme to the cotangent bundle \(T^*G\) of a quadratic matrix Lie group G. Reduction to the dual Lie algebra \({\mathfrak {g}}^*\) with the momentum map was then used to obtain an isospectral integrator, analogous to how Lie-Poisson reduction works in the continuous case to obtain reduced dynamical equations, cf. [12, Ch. 13]. The ansatz exploits that many isospectral flows are Lie-Poisson, i.e. their phase space is a dual Lie algebra \({\mathfrak {g}}^*\), and the dynamics are reduced ones from \(T^*G\). Viviani [20] recently extended [14] and showed that the isospectral implicit midpoint (Runge-Kutta) scheme can be written and implemented efficiently with an intermediate time step (in the spirit of the Verlet method). He also pointed out that the Cayley transform then arises naturally and does not have to be introduced as a group map as in [7].
The integrators developed in [14, 20] also preserve isospectrality for systems that are not Lie-Poisson, e.g., those on the orthogonal complements of quadratic Lie algebras.

Another line of research on structure preserving discretizations is based on the Lagrangian formulation of mechanics and uses a discrete Hamilton’s principle [13]. This direction has also been developed for reduced dynamical systems on Lie groups and algebras [4, 5]. Gawlik et al. [8], in particular, developed a formulation for semi-direct products that, as a special case, is applicable to systems such as the rigid body.

In the present work, we will derive the Lagrangian analogue of the work of Modin and Viviani [14] and show that isospectral Runge-Kutta methods can be obtained from the reduced Hamilton’s principle of Bou-Rabee and Marsden [4] and Gawlik et al. [8]. We do so for the case of symplectic diagonally implicit Runge-Kutta methods (symplectic DIRKs), which are an important special case since they are the simplest and most easily implemented symplectic Runge-Kutta schemes. Similar to [14], we will consider isospectral Lie-Poisson systems for our derivation. We also extend the results of Viviani [20] by showing that the Cayley transform arises for any symplectic DIRK applied at the group level when it is written using intermediate time steps located between the Runge-Kutta steps. Another result of the present work is that the midpoint rule by Gawlik et al. [8] contains, in disguise, an isospectral method; more precisely, the scheme from [8] updates the midpoints of the isospectral midpoint rule of Viviani [20]. We exemplify our results by implementing our isospectral symplectic DIRKs for the rigid body, the Toda lattice, and the matrix model discretization of the 2D Euler fluid. We show that with a 7-stage \(4^{\text {th}}\)-order symplectic DIRK, the oscillations in the energy, which are standard for a symplectic integrator, are on the order of machine precision. Within the accuracy afforded by standard double precision, the integrator therefore preserves the invariants of the continuous system. Symplectic DIRKs provide here the advantage that these can be implemented more easily and can be computed more efficiently than general isospectral Runge-Kutta methods.

The remainder of the paper is structured as follows. In Sect. 2 we will recall the necessary background on Lie groups, their algebras and isospectral Lie-Poisson systems. In Sect. 3 we will show that the Cayley transform arises for any symplectic DIRK on a quadratic matrix Lie group by introducing intermediate time steps. The central result of the paper, namely a variational derivation of isospectral symplectic DIRKs, will be presented in Sect. 4. Numerical results for the rigid body, the Toda lattice, and the discrete 2D Euler fluid are given in Sect. 5.

2 Preliminaries

2.1 Lie groups, Lie algebras and retraction maps

Let G be a matrix Lie group with identity e and \({\mathfrak {g}}\) its Lie algebra. We will denote the conjugate transpose of an element \(g \in G\) with \(g^{\dagger }\). If conjugation is defined as a left action \(g \cdot h = g \,h \,g^{-1}\), the adjoint action \({\mathrm {Ad} }: G \times {\mathfrak {g}}\rightarrow {\mathfrak {g}}\) of the group G on its Lie algebra \({\mathfrak {g}}\) is given by \(\text {Ad}_{g} \cdot \xi = g\, \xi \, g^{-1}\). Infinitesimally, the adjoint action becomes \(\mathrm {ad}_{\xi }(\eta ) = [\xi ,\eta ]\). The dual of \(\mathrm {ad}\) is the infinitesimal coadjoint action \(\text {ad}^* : {\mathfrak {g}}\times {\mathfrak {g}}^* \rightarrow {\mathfrak {g}}^*\) defined through \(\smash {\langle \mathrm {ad}_{\xi }( \eta ) , \alpha \rangle = \langle \eta , \mathrm {ad}_{\xi }^* ( \alpha ) \rangle }\), which plays an important role for Lie-Poisson dynamics. We work with the pairing between \({\mathfrak {g}}\) and \({\mathfrak {g}}^*\) associated with the Frobenius inner-product on \({\mathfrak {g}}\) given by \(\langle \xi , \alpha \rangle = \mathrm {tr}( \xi ^{\dagger } \alpha )\), which also allows us to identify \({\mathfrak {g}}\) and \({\mathfrak {g}}^*\). This identification is used in Sect. 2.2 where it allows us to make the calculations clearer and isospectrality more explicit. In Sect. 4 the identification between \({\mathfrak {g}}\) and \({\mathfrak {g}}^*\) is only used when we refer to isospectrality or to Sect. 3. The discrete variational principle in Theorem 4.1 is independent of this identification.

A local diffeomorphism \(\tau :{\mathfrak {g}} \rightarrow G\) satisfying \(\tau (0)= e\), \(\tau (\xi )^{-1} = \tau (-\xi )\) and whose tangent map fulfills \(T_{0}\tau = \text {Id}\) is called a retraction map (or local group diffeomorphism). Such a map provides a local chart for the Lie group around the identity (with coordinates of the first kind) and induces an atlas for G through the translation action. The tangent \(T_\xi \tau \) of \(\tau \) maps \(T_\xi {\mathfrak {g}} \cong T_e G = {\mathfrak {g}}\) to \(T_{\tau (\xi )}G\). The same is accomplished with the tangent map \(T_e R_{\tau (\xi )}\) of the right translation \(R_g(h) = h \cdot g\). The maps \({ T_\xi \tau }\) and \(T_e R_{\tau (\xi )}\) are related by the right trivialization \(\mathrm {d}\tau _{\xi }: {\mathfrak {g}} \rightarrow {\mathfrak {g}}\) of \(\tau \),

$$\begin{aligned} T_\xi \tau = T_e R_{\tau (\xi )}\circ \mathrm {d}\tau _{\xi }. \end{aligned}$$

Since \(\tau \) is a local diffeomorphism, the map \(T_\xi \tau \) is invertible for \(\xi \) close enough to \(0 \in {\mathfrak {g}}\), which implies that \(\mathrm {d}\tau _{\xi }\) is also invertible. We will write \(\mathrm {d}\tau ^{-1}_{\xi } \equiv (\mathrm {d}\tau _{\xi })^{-1}: {\mathfrak {g}} \rightarrow {\mathfrak {g}}\) for the inverse. A basic property of the right trivialization used throughout the paper is \(\text {Ad}_{\tau (\xi )}= \mathrm {d}\tau _{\xi } \circ \mathrm {d}\tau ^{-1}_{-\xi }\), cf. [18].

In the following, we will consider quadratic matrix Lie groups, i.e. we will work with \(G\subset \mathrm {GL}(N)\) so that there exists a J such that \({ g^\dagger \, J \, g = J, \forall g}\). For \(\xi \in {\mathfrak {g}}\) it then holds that \(J\,\xi + \xi ^\dagger J = 0\). Examples of quadratic Lie groups are \(\mathrm {O}(n)\) and \(\mathrm {SO}(n)\), with \(J = \mathrm {Id}\), and the symplectic group \(\mathrm {Sp}(2N,{\mathbb {R}})\), with J the symplectic matrix. An expedient retraction map for quadratic matrix Lie groups is the Cayley transform [5]

$$\begin{aligned} \mathrm {cay}(\xi ) = \left( \text {Id}-\frac{\xi }{2} \right) ^{-1} \! \left( \text {Id}+\frac{\xi }{2} \right) . \end{aligned}$$
(2.1)

Note that the two factors in the definition above commute. The right trivialization \(\mathrm {d}\tau \) for the Cayley transform is given by [9]

$$\begin{aligned} \mathrm {d}\mathrm {cay}_{\xi }\cdot \eta&= \left( \text {Id}-\frac{\xi }{2} \right) ^{-1} \eta \left( \text {Id}+\frac{\xi }{2} \right) ^{-1} \\ \mathrm {d}\mathrm {cay}^{-1}_{\xi }\cdot \eta&= \left( \text {Id}-\frac{\xi }{2} \right) \eta \left( \text {Id}+\frac{\xi }{2} \right) . \end{aligned}$$
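The formulas above translate almost literally into code. The following is a minimal sketch, assuming real matrices, the group \(\mathrm {SO}(3)\) with \(J = \mathrm {Id}\), and NumPy; the helper `hat` and all function names are hypothetical and chosen only for illustration. It evaluates Eq. 2.1 and the two trivialization formulas and measures how well \(\mathrm {cay}\) maps \(\mathfrak {so}(3)\) into \(\mathrm {SO}(3)\) and how well the identity \(\text {Ad}_{\tau (\xi )}= \mathrm {d}\tau _{\xi } \circ \mathrm {d}\tau ^{-1}_{-\xi }\) holds numerically.

```python
import numpy as np

def cay(xi):
    """Cayley transform, Eq. 2.1: (Id - xi/2)^{-1} (Id + xi/2)."""
    I = np.eye(xi.shape[0])
    return np.linalg.solve(I - xi / 2, I + xi / 2)

def dcay(xi, eta):
    """Right trivialization: (Id - xi/2)^{-1} eta (Id + xi/2)^{-1}."""
    I = np.eye(xi.shape[0])
    left = np.linalg.solve(I - xi / 2, eta)
    # X A^{-1} is computed as solve(A^T, X^T)^T to avoid forming an inverse.
    return np.linalg.solve((I + xi / 2).T, left.T).T

def dcay_inv(xi, eta):
    """Inverse right trivialization: (Id - xi/2) eta (Id + xi/2)."""
    I = np.eye(xi.shape[0])
    return (I - xi / 2) @ eta @ (I + xi / 2)

def hat(w):
    """Hypothetical helper: map a 3-vector to a skew-symmetric matrix in so(3)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

xi = hat([0.3, -0.2, 0.5])
eta = hat([-1.0, 0.4, 0.7])
g = cay(xi)

# cay maps so(3) into SO(3): g^T g = Id (quadratic group with J = Id).
orthogonality_error = np.linalg.norm(g.T @ g - np.eye(3))
# Ad_{cay(xi)} eta = dcay_xi(dcay^{-1}_{-xi}(eta)).
ad_error = np.linalg.norm(g @ eta @ np.linalg.inv(g) - dcay(xi, dcay_inv(-xi, eta)))
```

Linear solves are used instead of explicit inverses; this is a design choice of the sketch, not something prescribed by the text.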

2.2 Lie-Poisson systems on reductive Lie algebras

We recall Lie-Poisson systems on the dual of a Lie algebra \({\mathfrak {g}} \subseteq \mathfrak {gl}(n)\) and the associated reconstruction process to the cotangent bundle of the Lie group \(T^*G\). The flow of a Lie-Poisson system on a dual Lie algebra \({\mathfrak {g}}^*\) is a solution to

$$\begin{aligned} {\dot{\mu }} = \text {ad}^*_{\nabla H(\mu )} \mu , \quad \mu \in {\mathfrak {g}}^* \end{aligned}$$
(2.2)

where \(H : {\mathfrak {g}}^* \rightarrow {\mathbb {R}}\) is a Hamiltonian function. Using the Frobenius inner product \(\langle \xi , \eta \rangle = \mathrm {tr}(\xi ^{\dagger } \eta )\) in \({\mathfrak {g}}\), we can identify \({\mathfrak {g}}\) with \({\mathfrak {g}}^*\) and the dual of the infinitesimal adjoint action then becomes \(\mathrm {ad}^*_\xi (\eta ) = \mathrm {\Pi }([\xi ^\dagger , \eta ])\) where \(\mathrm {\Pi }\) is the projection from \(\mathfrak {gl}(n)\) onto \({\mathfrak {g}}\). For the case where \({\mathfrak {g}}\) is a reductive Lie algebra, it is known that \([{\mathfrak {g}}^{\dagger }, {\mathfrak {g}}] \subset {\mathfrak {g}}\) and therefore the Lie-Poisson Eq. 2.2 becomes

$$\begin{aligned} {\dot{\mu }} = \left[ \nabla H(\mu )^{\dagger }, \mu \right] . \end{aligned}$$
(2.3)

An important property of Eq. 2.3 is the isospectrality in \(\mu \), i.e., the eigenvalues of the matrices \(\mu (t)\) are independent of t.
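One way to see this is through the power sums \(\mathrm {tr}(\mu ^k)\), which determine the characteristic polynomial of \(\mu \) via Newton's identities. The following short calculation is a sketch; the shorthand \(B = \nabla H(\mu )^{\dagger }\) is introduced here only for brevity. Using Eq. 2.3 and the cyclicity of the trace,

$$\begin{aligned} \frac{d}{dt} \mathrm {tr}(\mu ^k) = k \, \mathrm {tr}\big ( \mu ^{k-1} {\dot{\mu }} \big ) = k \, \mathrm {tr}\big ( \mu ^{k-1} [ B , \mu ] \big ) = k \, \mathrm {tr}\big ( \mu ^{k-1} B \mu \big ) - k \, \mathrm {tr}\big ( \mu ^{k} B \big ) = 0 . \end{aligned}$$

Hence all \(\mathrm {tr}(\mu ^k)\), and with them the eigenvalues of \(\mu (t)\), are constant in time.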

Every Lie-Poisson system can be reconstructed to a canonical Hamiltonian system on \(T^*G\). The main tool in Lie-Poisson reconstruction is the momentum map \({ {\mathscr {J}} : T^*G \rightarrow {\mathfrak {g}}^* \simeq {\mathfrak {g}}}\) which for the right action \((g,p)\cdot h = (gh, p(h^{-1})^\dagger )\) of the group \(G \subseteq \mathrm {GL}(n)\) on the phase space \(T^*G\) is given by \(\mu = {\mathscr {J}}(g,p) = g^{\dagger } p\). With \({\mathscr {J}}\) we can define a left invariant Hamiltonian \({\tilde{H}}(g,p)= H({\mathscr {J}}(g,p)) = H(g^{\dagger } p)\) and the canonical Hamilton equations on \(T^*G\) are

$$\begin{aligned} \begin{aligned} {\dot{g}}&= g \nabla H(g^\dagger p) = g B(g^\dagger p)^\dagger \\ {\dot{p}}&= -p \nabla H(g^\dagger p)^\dagger = - p B(g^\dagger p) \end{aligned} \end{aligned}$$
(2.4)

where we write \(B(\cdot ) = \nabla H (\cdot )^\dagger : {{\mathfrak {g}}^*\simeq {\mathfrak {g}}} \rightarrow {\mathfrak {g}}\) as in [14]. For the right invariant version of the Lie-Poisson reconstruction, see Appendix A.

2.3 Symplectic diagonally implicit Runge-Kutta methods

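The Butcher tableau for an s-stage symplectic diagonally implicit Runge-Kutta method (symplectic DIRK) is [9, Ch. V.2.1]

$$\begin{aligned} \begin{array}{c|cccc} c_1 & \frac{b_1}{2} & & & \\ c_2 & b_1 & \frac{b_2}{2} & & \\ \vdots & \vdots & \vdots & \ddots & \\ c_s & b_1 & b_2 & \cdots & \frac{b_s}{2} \\ \hline & b_1 & b_2 & \cdots & b_s \end{array} \end{aligned}$$
(2.5)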

Given a fixed time step h, we will write \(h_i = h \, b_i\) and \(r_i = \sum _{j=1}^i b_j\) for \(i =0, \ldots , s\) so that, through the diagonal entries in Eq. 2.5, \(c_i\) and \(r_{i-1}\) always differ by \(b_i / 2\). Hence \({ \sum ^s_{i=1} h_i = h}\) and \(r_s = 1\) and also \(b_0=0\). An important result for us is that an s-stage symplectic DIRK can be decomposed into s implicit midpoint rule steps with suitable step sizes [9, Theorem VI.4.4].
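The structure of the tableau can be made concrete with a small sketch. It builds the coefficient matrix of a symplectic DIRK from its weights \(b_i\) (diagonal entries \(b_i/2\), strictly lower triangular entries \(b_j\)), evaluates the standard symplecticity condition \(b_i a_{ij} + b_j a_{ji} - b_i b_j = 0\), and forms \(h_i\) and \(r_i\). The function names are hypothetical, and the three weights used are an illustrative assumption (the well-known composition of three midpoint steps), not the 2-stage or 7-stage schemes of [11].

```python
import numpy as np

def sdirk_tableau(b):
    """Build A and c of a symplectic DIRK from the weights b_1..b_s."""
    b = np.asarray(b, dtype=float)
    s = b.size
    A = np.zeros((s, s))
    for i in range(s):
        A[i, :i] = b[:i]       # strictly lower triangular part: a_ij = b_j
        A[i, i] = b[i] / 2     # diagonal: a_ii = b_i / 2
    c = A.sum(axis=1)          # c_i = sum_j a_ij
    return A, b, c

def symplecticity_defect(A, b):
    """Largest entry of b_i a_ij + b_j a_ji - b_i b_j (zero for symplectic RK)."""
    bA = b[:, None] * A
    return np.max(np.abs(bA + bA.T - np.outer(b, b)))

# Illustrative weights (an assumption, not the schemes of [11]).
gamma = 1.0 / (2.0 - 2.0 ** (1.0 / 3.0))
A, b, c = sdirk_tableau([gamma, 1.0 - 2.0 * gamma, gamma])

h = 0.01
h_i = h * b                                  # stage step sizes h_i = h b_i
r = np.concatenate(([0.0], np.cumsum(b)))    # r_0, ..., r_s with r_0 = 0
```

With these definitions, `c` agrees with `r[:-1] + b / 2`, which is the statement that \(c_i\) and \(r_{i-1}\) differ by \(b_i/2\).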

3 The Cayley transform and symplectic diagonally implicit Runge-Kutta methods

As discussed in the last section, any s-stage symplectic diagonally implicit Runge-Kutta method (symplectic DIRK) can be decomposed into s implicit midpoint rule steps [9, Theorem VI.4.4]. In this section we will show that this carries over to isospectral symplectic DIRKs. Hence, the simplicity of the isospectral minimal midpoint algorithm presented in [20] also holds for higher order isospectral symplectic DIRKs. We will also show that the Cayley transform appears naturally in this context. The key to this result is to introduce not only the usual intermediate time steps that appear in a Runge-Kutta method, i.e. \(k_{n,i}^g\) and \(k_{n,i}^p\) in the Hamiltonian setting, but also intermediate half time steps at \({ t_i^n = h n + h \sum _{j=1}^i b_j}\). These intermediate half points, denoted \(g_{n,r_i}\) and \(p_{n,r_i}\) in our work, can be seen as a generalization of the intermediate step that one considers for the Verlet integrator and they are defined through the strictly lower triangular part of the Butcher tableau in Eq. 2.5.

Although the Lie groups we consider are nonlinear, they are quadratic, and symplectic Runge-Kutta methods preserve quadratic invariants such as \(g^\dagger \, J \, g = J\). A symplectic Runge-Kutta method applied linearly to Hamilton’s equations in Eq. 2.4 therefore yields \(g_n \in G, \forall n\). Exploiting this for an s-stage symplectic DIRK on G, the Runge-Kutta steps can be written in terms of the intermediate Runge-Kutta points as

$$\begin{aligned} k_{n,i}^g&= \left( g_{n,r_{i-1}} + \frac{h_i}{2} \, k_{n,i}^g \right) B({\mu }_{n,c_i})^\dagger \end{aligned}$$
(3.1a)
$$\begin{aligned} k_{n,i}^p&= -\left( p_{n,r_{i-1}} + \frac{h_i}{2} \, k_{n,i}^p \right) B({\mu }_{n,c_i}) \end{aligned}$$
(3.1b)

where the reduced momentum is

$$\begin{aligned} {\mu }_{n,c_i} = \left( g_{n,r_{i-1}} + \frac{h_i}{2} \, k_{n,i}^g \right) ^{\! \dagger } \! \left( p_{n,r_{i-1}} + \frac{h_i}{2} \, k_{n,i}^p \right) \end{aligned}$$
(3.1c)

by the definition of the momentum map. We also introduce the intermediate points \(g_{n, r_i}\) and \(p_{n, r_i}\) given by

$$\begin{aligned} g_{n,r_i} = g_n + \sum _{j=0}^i h_j k_{n,j}^g \quad \quad p_{n,r_i} = p_n + \sum _{j=0}^i h_j k_{n,j}^p \end{aligned}$$

and they are always a half step away from \(k_{n,i+1}^g\) and \(k_{n,i+1}^p\), cf. Eq. 2.5. Here we have \(g_{n,r_0} = g_n\) and \(g_{n,r_s} = g_{n+1}\) (and analogous for the variable p).

The first intermediate point at \(t = h n + h_1 = h n + h \, b_1\) is therefore

$$\begin{aligned} g_{n,r_1} = g_n + h_1 \, k_{n,1}^g = g_n + h_1 \, \left( g_n + \frac{h_1}{2} k_{n,1}^g \right) B({\mu }_{n,c_1})^\dagger . \end{aligned}$$

Fully expanding the right hand side of the last equation and using from the left hand side that \({ k_{n,1}^g} = ( g_{n,r_1} - g_n) / h_1\) we obtain

$$\begin{aligned} g_{n,r_1} \left( \text {Id} - \frac{h_1}{2}B({\mu }_{n,c_1})^{\dagger } \right) = g_n \left( \text {Id} + \frac{h_1}{2}B({\mu }_{n,c_1})^{\dagger } \right) . \end{aligned}$$

The last equation states conceptually that taking a (linear) half step backward from \(g_{n,r_1}\) and a (linear) half step forward from \(g_n\) agree. With the definition of the Cayley transform in Eq. 2.1 and taking into account that the terms in the parenthesis commute, this is equivalently given by

$$\begin{aligned} g_{n,r_1} = g_n \, \mathrm {cay}(h_1 \, B({\mu }_{n,c_1})^{\dagger }). \end{aligned}$$

Following the same computations as above for \(g_{n,r_i}\) one obtains \(g_{n,r_i} = g_{n,{r_{i-1}}} \, \mathrm {cay}(h_i B({\mu }_{n,c_i})^\dagger )\), and we conclude that

$$\begin{aligned} g_{n+1} = g_n \prod _{i=1}^s \mathrm {cay}(h_i \, B({\mu }_{n,c_i})^\dagger ) \end{aligned}$$
(3.2)

where the product notation represents multiplication from left to right in the order given by the index bounds, i.e., \(\prod _{i=1}^3 a_i = a_1\,a_2\,a_3\) and \(\prod _{i=3}^1 a_i = a_3\,a_2\,a_1\). An analogous calculation shows that

$$\begin{aligned} p_{n+1} = p_n \prod _{i=1}^s \mathrm {cay}(h_i \, B({\mu }_{n,c_i}))^{-1}. \end{aligned}$$
(3.3)
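The appearance of the Cayley transform can be checked numerically for a single stage. The sketch below is an illustration under simplifying assumptions: real matrices (so the dagger is a transpose), \(G = \mathrm {O}(4)\), and a generator B that is held fixed instead of depending on \(\mu _{n,c_1}\). It solves the linear stage equations Eq. 3.1a and Eq. 3.1b for \(i = 1\) directly and stores the quantities needed to compare with the Cayley form of the update. The helper names are hypothetical.

```python
import numpy as np

def cay(xi):
    """Cayley transform, Eq. 2.1: (Id - xi/2)^{-1} (Id + xi/2)."""
    I = np.eye(xi.shape[0])
    return np.linalg.solve(I - xi / 2, I + xi / 2)

def right_solve(X, A):
    """Return X A^{-1} without forming the inverse explicitly."""
    return np.linalg.solve(A.T, X.T).T

rng = np.random.default_rng(0)
n, h1 = 4, 0.1
I = np.eye(n)

X = rng.standard_normal((n, n))
B = X - X.T                                          # frozen generator in so(4) (assumption)
g_n, _ = np.linalg.qr(rng.standard_normal((n, n)))   # an element of O(4)
p_n = rng.standard_normal((n, n))                    # arbitrary momentum matrix

# Eq. 3.1a with i = 1: k^g (Id - h1/2 B^T) = g_n B^T
k_g = right_solve(g_n @ B.T, I - h1 / 2 * B.T)
# Eq. 3.1b with i = 1: k^p (Id + h1/2 B) = -p_n B
k_p = right_solve(-p_n @ B, I + h1 / 2 * B)

g_r1 = g_n + h1 * k_g          # first intermediate point in the group
p_r1 = p_n + h1 * k_p          # first intermediate momentum

mu_n = g_n.T @ p_n             # reduced momentum via the momentum map
mu_r1 = g_r1.T @ p_r1
C = cay(h1 * B)
```

Under these assumptions, `g_r1` coincides with `g_n @ cay(h1 * B.T)`, `p_r1` with `p_n @ inv(C)`, and `mu_r1` with the similarity transform `C @ mu_n @ inv(C)`, up to rounding.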

Through the momentum map \({\mathscr {J}}\), the phase space points \((g_n,p_n)\) are related to a sequence of reduced momenta \(\mu _n \in {{\mathfrak {g}}^*\simeq {\mathfrak {g}}}\) by \(\mu _n = {\mathscr {J}}(g_n,p_n) = g_n^{\dagger } p_n\). Using Eq. 3.2 and Eq. 3.3 we have

$$\begin{aligned} \mu _{n+1} = \left( \prod _{i=s}^1\mathrm {cay}(h_i \, B({\mu }_{n,c_i}))\right) \, \mu _n \left( \prod _{i=1}^s\mathrm {cay}(h_i \, B({\mu }_{n,c_i}))^{-1}\right) . \end{aligned}$$
(3.4)

Using the momentum map also to define the reduced momentum variable at the intermediate point we obtain

$$\begin{aligned} \mu _{n,r_i} = g_{n,r_i}^\dagger p_{n,r_i} = \mathrm {cay}(h_{i}\, B({\mu }_{n,c_i})) \mu _{n, r_{i-1}} \mathrm {cay}(h_{i}\, B({\mu }_{n,c_i}))^{-1} \end{aligned}$$
(3.5)

and a calculation shows that

$$\begin{aligned} \mu _{n,c_i} = \mathrm {d}\mathrm {cay}_{h_{i}\, B({\mu }_{n,c_i})}\, \mu _{n, r_{i-1}} . \end{aligned}$$

Eq. 3.5 can be written as

$$\begin{aligned} \mu _{n,r_i} = \mathrm {Ad}_{\mathrm {cay}(h_{i}\, B({\mu }_{n,c_i}))} \, \mu _{n, r_{i-1}} = { \mathrm {d}\mathrm {cay}^{-1}_{-h_{i}\, B({\mu }_{n,c_i})}\, \mathrm {d}\mathrm {cay}_{h_{i}\, B({\mu }_{n,c_i})} \mu _{n, r_{i-1}}} . \end{aligned}$$

One thus has the following algorithm for updating the intermediate Lie algebra points

$$\begin{aligned} \mu _{n, r_{i-1}}&= \mathrm {d}\mathrm {cay}^{-1}_{h_{i}\, { B({\mu }_{n,c_i})}}\,\mu _{n,c_i} \end{aligned}$$
(3.6a)
$$\begin{aligned} \mu _{n, r_{i}}&= \mathrm {d}\mathrm {cay}^{-1}_{-h_{i}\, { B({\mu }_{n,c_i})}}\,\mu _{n,c_i} \end{aligned}$$
(3.6b)

where the first equation is solved implicitly to find \(\mu _{n,c_i}\) from the known \(\mu _{n, r_{i-1}}\) and the result is used in the second equation to explicitly update to \(\mu _{n, r_{i}}\). The \(\mu _{n, r_{i}}\) and \(\mu _{n,c_i}\) thus form a staggered grid that can be used in conjunction to advance in time to the full steps. The existence of a solution of the implicit step is proven for sufficiently small h in [20, Lemma 3]. See Appendix A for the right invariant version of the above algorithm. We summarize this section with the following theorem.

Theorem 3.1

Consider the reduced isospectral Hamiltonian system in Eq. 2.3 and the Butcher tableau for a symplectic diagonally implicit Runge-Kutta method in Eq. 2.5. Then Eq. 3.6 provides an isospectral integrator for the system.

The theorem follows directly from Eq. 3.4 and this provides an alternative proof of Theorem 4 in the recent work by Modin and Viviani [14] for the case of a symplectic diagonally implicit Runge-Kutta method. Previously, Viviani [20] already obtained an analogous result for the case of the implicit midpoint rule. Our extension to symplectic DIRKs partially verifies a conjecture that can be found in his work [20, Remark 4]. Furthermore, we showed that [9, Theorem VI.4.4], which states that an s-stage symplectic DIRK can be decomposed in s implicit midpoint rule steps with suitable step size, also holds in the isospectral reduced case.

Remark 3.1

In the work by Calvo, Iserles, and Zanna [6], one has that \(\mu _{n,1/2} = (\mu _n + \mu _{n+1})/2\), i.e., the midpoint used in the update equation is the average of two consecutive full algebra points. A similar interpretation can be given for the isospectral integrators in Eq. 3.6. Defining the variables

$$\begin{aligned} g_{n,c_i}&= g_{n,r_{i-1}} + \frac{h_i}{2} k_{n,i}^g = \frac{g_{n,r_{i-1}}+g_{n,r_i}}{2}\\ p_{n,c_i}&= p_{n,r_{i-1}} + \frac{h_i}{2} k_{n,i}^p = \frac{p_{n,r_{i-1}}+p_{n,r_i}}{2} \end{aligned}$$

we have that the intermediate half points \(\mu _{n,c_i} = g_{n,c_i}^\dagger p_{n,c_i}\) result from the averaging of the intermediate points \((g_{n,r_{i-1}}, p_{n,r_{i-1}})\) and \((g_{n,r_i}, p_{n,r_i})\) at the cotangent bundle level.

4 Variational symplectic diagonally implicit Runge-Kutta methods

From the general theory of reduction one expects that the isospectral symplectic Runge-Kutta methods of Modin and Viviani that were derived through Hamiltonian reduction and the momentum map can also be obtained through a reduced discrete variational principle. In the following theorem we show that this is indeed the case for isospectral symplectic DIRKs. The Cayley transform, which serendipitously appeared in the previous section, will be an integral part for this. Our result also has a surprising connection to an earlier variational Lie group integrator in the work by Gawlik et al. [8], see Remark 4.2.

Theorem 4.1

Let G be a matrix Lie group with Lie algebra \({\mathfrak {g}}\) and \(L:TG \rightarrow {\mathbb {R}}\) be a left invariant Lagrangian with left trivialization \(\ell :{\mathfrak {g}} \rightarrow {\mathbb {R}}\).

Let \(\{ g_{n,r_1}, g_{n,r_2}, \ldots , g_{n,r_{s}} \}_{n=0,\ldots ,N-1} \subset G\) for a positive integer s with \(g_n = g_{n,r_0} = g_{n-1,r_s}\) and let \(\{\xi _{n,c_1}, \ldots , \xi _{n,c_s}\}_{n=0,\ldots ,N-1} \subset {\mathfrak {g}}\) be the associated sequence in the Lie algebra such that

$$\begin{aligned} g_{n,r_i} = g_{n,r_{i-1}} \tau \big ( h_i \, \xi _{n,c_i} \big ) \end{aligned}$$
(4.1)

where h is a time step, \(h_i = h\,b_i\), \(\{b_i, c_i\}\) are the coefficients of the Butcher tableau of an s-stage symplectic DIRK scheme, and \(\tau \) is a retraction map. If the sequence \(\{g_{n,r_i}\}\) is stationary for the action \(s_d: G^{N} \rightarrow {\mathbb {R}}\)

$$\begin{aligned} s_d(\{g_{n,r_i}\}) = \sum _{n=0}^{N-1} \sum _{i = 1}^s h_i \, \ell ( \xi _{n,c_i}) \end{aligned}$$

under discrete variations of \(\{ g_{n,r_i} \}\) with fixed endpoints, then the sequence \(\{\xi _{n,c_i}\}\) and associated \(\mu _{n,c_i} = \frac{\delta \ell }{\delta \xi _{n,c_i}}\) have to satisfy the discrete Euler-Poincaré equations

$$\begin{aligned} \begin{aligned} \left( \mathrm {d}\tau ^{-1}_{h_{i+1} \xi _{n,c_{i+1}}}\right) ^* \mu _{n,c_{i+1}}&= \left( \mathrm {d}\tau ^{-1}_{-h_{i} \xi _{n,c_i}}\right) ^* \mu _{n,c_i} \qquad i = 1, \ldots , s-1 \\ \left( \mathrm {d}\tau ^{-1}_{h_1 \,\xi _{n+1,c_1}}\right) ^* \mu _{n+1,c_1}&= \left( \mathrm {d}\tau ^{-1}_{-h_s\, \xi _{n,c_s}}\right) ^* \mu _{n,c_s} \qquad i = s . \end{aligned} \end{aligned}$$
(4.2)

Furthermore, given an initial condition \(\mu _0\), the sequence \(\{\mu _n\}\) defined by

$$\begin{aligned} \mu _{n+1} = \left( \mathrm {Ad}_{\prod _{i=1}^s \tau (h_i \, \xi _{n,c_i})}\right) ^* \mu _n = \mathrm {Ad}^*_{\left( \prod _{i=1}^s \tau (h_i \, \xi _{n,c_i})\right) ^{-1}} \, \mu _n \end{aligned}$$
(4.3)

is an isospectral symplectic DIRK under the identification between \({\mathfrak {g}}\) and \({\mathfrak {g}}^*\).

Computationally, Eq. 4.2 allows one to implicitly update the intermediate reduced momenta \(\mu _{n,c_{i}}\), and with Eq. 4.3 the next full time step can be obtained. Eq. 4.2 is of Euler-Poincaré type by considering \(\mathrm {d}\tau _{-h_{i} \xi _{n,c_i}}^{-1}\) as the finite time analogue of the infinitesimal coadjoint action \(\mathrm {ad}_{\xi _{n,c_i}}^*\) in the continuous Euler-Poincaré equations. For the case of quadratic Lie algebras and with \(\tau = \mathrm {cay}\), the reduced Legendre transform \(\mu = \frac{\delta \ell }{\delta \xi }\) [12, Ch. 13.5, p. 439] is a diffeomorphism \({\mathfrak {g}} \rightarrow {\mathfrak {g}}^*\). Defining the reduced Hamiltonian \(h:{\mathfrak {g}}^* \rightarrow {\mathbb {R}}\) as

$$\begin{aligned} h(\mu ) = \langle \mu , \xi \rangle - \ell (\xi ) \end{aligned}$$
(4.4)

it follows \(B(\mu )^\dagger = \nabla h(\mu ) = \xi \) and Eq. 4.3 and Eq. 3.4 are equivalent.
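As a worked instance of Eq. 4.4, consider a quadratic reduced Lagrangian. The operator \({\mathscr {I}}\) below is an assumption made for illustration: a linear, invertible map \({\mathfrak {g}} \rightarrow {\mathfrak {g}}^*\) that is self-adjoint with respect to the (real) pairing, as for the inertia operator of the rigid body. Then

$$\begin{aligned} \ell (\xi ) = \tfrac{1}{2} \langle {\mathscr {I}} \xi , \xi \rangle \quad \Rightarrow \quad \mu = \frac{\delta \ell }{\delta \xi } = {\mathscr {I}} \xi , \quad h(\mu ) = \langle \mu , \xi \rangle - \ell (\xi ) = \tfrac{1}{2} \langle \mu , {\mathscr {I}}^{-1} \mu \rangle , \quad \nabla h(\mu ) = {\mathscr {I}}^{-1} \mu = \xi . \end{aligned}$$

Under these assumptions the reduced Legendre transform is linear and invertible, and the last equality reproduces the relation \(\nabla h(\mu ) = \xi \).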

Proof

The stationarity of the discrete action under variations of the group sequence \(\{ g_{n,r_i}\}\) with fixed endpoints means

$$\begin{aligned} 0 = \frac{d}{d\epsilon } \Big |_{\epsilon = 0} s_d\left( \{ g_{n,r_i}^\epsilon \}\right) =\sum _{n=0}^{N-1} \sum _{i = 1}^s\frac{d}{d\epsilon } h_i \, \ell ( \xi ^\epsilon _{n,c_i}) \, \Big |_{\epsilon = 0} = \sum _{n=0}^{N-1} \sum _{i = 1}^s h_i \, \left\langle \frac{\delta \ell }{\delta \xi _{n,c_i}}, \delta \xi _{n,c_i} \right\rangle \end{aligned}$$

where \(g^{\epsilon }_{n,r_i}\) and \(\xi _{n,c_i}^\epsilon \) are smooth curves with parameter \(\epsilon \) on G and \({\mathfrak {g}}\), respectively, with \(g^\epsilon _{0} = g_0\), \(g^\epsilon _{N} = g_N\), \(\{g^0_{n,r_i}\} = \{g_{n,r_i}\}\) and \(g_{n,r_i}^\epsilon = g_{n,r_{i-1}}^\epsilon \tau \big ( h_i \, \xi _{n,c_i}^\epsilon \big )\) for all \(n,i, \epsilon \). Through Eq. 4.1, a discrete variation in the group \(\{ \delta g_{n,r_i} \} = \{ \frac{d}{d\epsilon } g^\epsilon _{n,r_i} \big \vert _{\epsilon = 0} \}\) induces a variation in the corresponding algebra sequence. Writing \(\eta = g^{-1} \delta g \) and using the properties of retraction maps, the induced algebra variations are given by

$$\begin{aligned} \delta \xi _{n,c_i} = \frac{1}{h_i} \left( -\mathrm {d}\tau ^{-1}_{h_i \xi _{n,c_i}}(\eta _{n,r_{i-1}}) + \mathrm {d}\tau ^{-1}_{-h_i \xi _{n,c_i}}(\eta _{n,r_i}) \right) . \end{aligned}$$

Stationarity of the discrete action thus amounts to

$$\begin{aligned} 0&= \sum _{n=0}^{N-1} \sum _{i = 1}^s\, \left\langle \frac{\delta \ell }{\delta \xi _{n,c_i}}, \left( -\mathrm {d}\tau ^{-1}_{h_i \xi _{n,c_i}}(\eta _{n,r_{i-1}}) + \mathrm {d}\tau ^{-1}_{-h_i \xi _{n,c_i}}(\eta _{n,r_{i}}) \right) \right\rangle . \end{aligned}$$

Since \(g_0^\epsilon = g_0\) and \(g_N^\epsilon = g_N\) for every \(\epsilon \), we have \(\delta g_0 = \delta g_N = 0\) and \(\eta _0 = \eta _N =0\). Eq. 4.2 then follows by duality and using a standard summation by parts argument. Under the identification of \({\mathfrak {g}}\) and \({\mathfrak {g}}^*\), Eq. 4.3 implies that the matrices \(\mu _n\) are similar for all n and therefore have the same eigenvalues.
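The summation by parts step can be spelled out as follows; this is a sketch of the standard argument written with \(\mu _{n,c_i} = \frac{\delta \ell }{\delta \xi _{n,c_i}}\). Shifting the index in the terms containing \(\eta _{n,r_{i-1}}\) and collecting everything that multiplies a given \(\eta _{n,r_i}\) with \(1 \le i \le s-1\) gives

$$\begin{aligned} 0 = \left\langle \left( \mathrm {d}\tau ^{-1}_{-h_{i} \xi _{n,c_i}}\right) ^* \mu _{n,c_i} - \left( \mathrm {d}\tau ^{-1}_{h_{i+1} \xi _{n,c_{i+1}}}\right) ^* \mu _{n,c_{i+1}} , \, \eta _{n,r_i} \right\rangle . \end{aligned}$$

For \(i = s\) one uses \(\eta _{n,r_s} = \eta _{n+1,r_0}\), so that the second term comes from the first stage of step \(n+1\). Since the interior \(\eta _{n,r_i}\) are arbitrary, each bracket has to vanish separately, which yields the two lines of Eq. 4.2.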

\(\square \)

Fig. 1: Comparison of isospectral midpoint rule (left) and the implicit midpoint rule from Gawlik et al. [8] (right) for change in energy (top) and change in the eigenvalues (bottom)

Remark 4.1

In the continuous case and with \({\mathfrak {g}}\) and \({\mathfrak {g}}^*\) identified, the Euler-Poincaré equation for an isospectral flow is

$$\begin{aligned} {\dot{\mu }} = [\xi , \mu ] \end{aligned}$$

The solution \(\mu (t)\) and the group curve satisfying \(g^{-1}(t) {\dot{g}}(t) = \xi (t)\) are related by [9, IV.3.2]

$$\begin{aligned} g(t) \, \mu (t) g(t)^{-1} = g(0) \, \mu (0) g(0)^{-1} \quad \hbox { for all}\ t. \end{aligned}$$

This ensures that the \(\mu (t)\) are related by a similarity transformation for all t and therefore all possess the same spectrum. Eq. 4.3 is equivalent to the discrete equation

$$\begin{aligned} g_{n+1} \, \mu _{n+1} g_{n+1}^{-1} = g_n \, \mu _n g_n^{-1} \end{aligned}$$

which provides a direct verification that our integrator preserves the isospectral property.

Remark 4.2

The simplest integrator that results from the above theorem is the implicit midpoint rule studied by Viviani [20]. Interestingly, it also appeared in disguise in the work by Gawlik et al. [8] on variational discretizations of (possibly infinite dimensional) systems on Lie groups. Combining a forward half-step of the implicit midpoint rule from the intermediate point \({\tilde{\mu }}_k\) (which is \(\mu _{k,c_1}\) in our notation) with a backward half-step from \({\tilde{\mu }}_{k+1}\) one obtains the following implicit update rule for the half-steps

$$\begin{aligned} \mathrm {d}\mathrm {cay}^{-1}_{h B({\tilde{\mu }}_{k+1})} {\tilde{\mu }}_{k+1}&= \mathrm {d}\mathrm {cay}^{-1}_{-h B({\tilde{\mu }}_{k})} {\tilde{\mu }}_{k} \end{aligned}$$

or, equivalently,

$$\begin{aligned} \left( \mathrm {Id} - \frac{h B({\tilde{\mu }}_{k+1})}{2} \right) {\tilde{\mu }}_{k+1} \left( \mathrm {Id} + \frac{h B({\tilde{\mu }}_{k+1})}{2}\right)&= \left( \mathrm {Id} + \frac{h B({\tilde{\mu }}_{k})}{2} \right) {\tilde{\mu }}_{k} \left( \mathrm {Id} - \frac{h B({\tilde{\mu }}_{k})}{2}\right) . \end{aligned}$$

The last two equations are Eq. 4.9 and Eq. 4.12 in the work by Gawlik et al. [8]. However, since only the half time steps are considered in [8], their integrator is not isospectral and also yields significantly stronger oscillations in the energy, see Fig. 1.

5 Numerical experiments

In this section we present numerical results for the isospectral symplectic DIRKs from Theorem 4.1 for the rigid body, the Toda lattice, and the matrix model discretization of the barotropic vorticity equation on \(S^2\). We begin with some details on the computations. Our implementation is available online [http://graphics.cs.uni-magdeburg.de/projects/sdirk/].

5.1 Implementation of isospectral symplectic DIRKs for quadratic Lie groups

For the case of isospectral systems defined on quadratic Lie algebras, Eq. 4.2 and Eq. 4.3 are equivalent to Eq. 3.6. Given a symplectic DIRK and using the same notation as in Sect. 2.3, Theorem 4.1 amounts to

$$\begin{aligned} \begin{aligned} \mu _{n,r_{i-1}}&= \left( \text {Id}-\frac{h_i}{2}B\left( \mu _{n,c_i}\right) \right) \mu _{n,c_i} \left( \text {Id}+\frac{h_i}{2} B\left( \mu _{n,c_i}\right) \right) \\ \mu _{n,r_i}&= \left( \text {Id}+\frac{h_i}{2}B\left( \mu _{n,c_i}\right) \right) \mu _{n,c_i} \left( \text {Id}-\frac{h_i}{2} B\left( \mu _{n,c_i}\right) \right) \end{aligned} \end{aligned}$$
(5.1)

which allows one to update the \(\mu _{n,r_i}\) and \(\mu _{n,c_i}\) by leapfrogging between them. We summarize the overall computations in Algo. 5.1.

Algorithm 5.1 (figure a)
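The leapfrog update of Eq. 5.1 can be sketched in a few lines. The code below is not the authors' implementation; it is a minimal illustration under stated assumptions: real matrices, a plain fixed-point iteration for the implicit first line of Eq. 5.1 (adequate only for small h), a rigid body with assumed principal moments of inertia (1, 2, 3) represented through the hat map, and illustrative weights for a three-stage scheme that are not the coefficients of [11]. All function names are hypothetical.

```python
import numpy as np

def right_solve(X, A):
    """Return X A^{-1} without forming the inverse explicitly."""
    return np.linalg.solve(A.T, X.T).T

def solve_stage(mu_prev, B, hi, tol=1e-14, max_iter=100):
    """Solve the first line of Eq. 5.1 for mu_c by fixed-point iteration."""
    I = np.eye(mu_prev.shape[0])
    mu_c = mu_prev.copy()
    for _ in range(max_iter):
        Bc = B(mu_c)
        # Rearranged: mu_c = (Id - hi/2 B)^{-1} mu_prev (Id + hi/2 B)^{-1}
        new = right_solve(np.linalg.solve(I - hi / 2 * Bc, mu_prev), I + hi / 2 * Bc)
        converged = np.linalg.norm(new - mu_c) <= tol * max(1.0, np.linalg.norm(new))
        mu_c = new
        if converged:
            break
    return mu_c

def isospectral_sdirk_step(mu, B, h, b):
    """Advance mu by one step h, leapfrogging through the stages with weights b."""
    I = np.eye(mu.shape[0])
    for bi in b:
        hi = h * bi
        mu_c = solve_stage(mu, B, hi)                        # implicit half step
        Bc = B(mu_c)
        mu = (I + hi / 2 * Bc) @ mu_c @ (I - hi / 2 * Bc)    # explicit second line of Eq. 5.1
    return mu

def hat(w):
    """Map a 3-vector to a skew-symmetric 3x3 matrix."""
    return np.array([[0.0, -w[2], w[1]], [w[2], 0.0, -w[0]], [-w[1], w[0], 0.0]])

def vee(W):
    """Inverse of hat on skew-symmetric matrices."""
    return np.array([W[2, 1], W[0, 2], W[1, 0]])

inertia = np.array([1.0, 2.0, 3.0])   # assumed principal moments of inertia

def B_rigid(W):
    """B(W) = grad H(W)^dagger with grad H(W) = hat(inertia^{-1} vee(W))."""
    return hat(vee(W) / inertia).T

W0 = hat([0.3, -0.5, 0.8])
gamma = 1.0 / (2.0 - 2.0 ** (1.0 / 3.0))
schemes = {"midpoint": [1.0], "three_stage": [gamma, 1.0 - 2.0 * gamma, gamma]}
results = {}
for name, b in schemes.items():
    W = W0.copy()
    for _ in range(200):
        W = isospectral_sdirk_step(W, B_rigid, 0.01, b)
    results[name] = W
```

In this sketch the spectrum is preserved only up to the tolerance of the fixed-point solver, since the explicit update is an exact similarity transform only when the implicit equation is solved exactly. A Newton-type solver could replace the fixed-point iteration for larger time steps.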

In the experiments, we will consider the \(2^{\text {nd}}\)-order implicit midpoint rule and the isospectral integrators obtained from a 2-stage \(2^{\text {nd}}\)-order symplectic DIRK [11] and a 7-stage \(4^{\text {th}}\)-order symplectic DIRK [11].

5.2 Rigid body

The ideal rigid body is the classical example of an isospectral system. Its dynamics are described by Euler’s equations, which, in matrix form, are [12, Ch. 15]

$$\begin{aligned} {\dot{W}}= [ \nabla H(W)^\dagger , W] \end{aligned}$$
(5.2)

where \(W \in \mathfrak {so}^*(3)\) and the Hamiltonian is \(H(W) = \frac{1}{2} \left\langle {\mathscr {I}}^{-1}W, W \right\rangle \) with \({\mathscr {I}}\) a symmetric \(3 \times 3\) matrix. Figure 2 shows results for the three isospectral integrators we consider, using a random skew-symmetric matrix with entries in \([-1,1]\) as initial condition and a time step of \(h=0.01\). The superiority of the \(4^{\text {th}}\)-order integrator is evident in the conservation of the Hamiltonian, which is almost on the order of machine precision. The conservation of the angular momentum (not shown) is of the same order as that of the eigenvalues.

Fig. 2
figure 2

Change in energy and change in eigenvalues for the rigid body for implicit midpoint rule (top), a 2-stage \(2^{\text {nd}}\)-order scheme (middle) and a 7-stage \(4^{\text {th}}\)-order scheme (bottom)

5.3 Toda lattice

The Toda lattice describes the pairwise interactions of particles on a line exerting exponential forces on each other [19]. When the line is periodic, the Lax pair formulation of the Toda lattice [9, Ch. IV.3.2] results in the isospectral flow \({\dot{L}} = \left[ B(L), L\right] \) with Hamiltonian \(H(L) = 2 \text {Tr}(L^2)\) for

$$\begin{aligned} L= \begin{pmatrix} a_1 &{} b_1 &{} 0 &{} \ldots &{} b_n \\ b_1 &{} a_2 &{} b_2 &{} \ldots &{} 0 \\ 0 &{} b_2 &{} a_3 &{} \ldots &{} 0 \\ \vdots &{} \vdots &{} \vdots &{} \ddots &{} \vdots \\ b_n &{} 0 &{} 0 &{} \ldots &{} a_n \end{pmatrix} ,\quad B(L) = \begin{pmatrix} 0 &{} b_1 &{} 0 &{} \ldots &{} -b_n \\ -b_1 &{} 0 &{} b_2 &{} \ldots &{} 0 \\ 0 &{} -b_2 &{} 0 &{} \ldots &{} 0 \\ \vdots &{} \vdots &{} \vdots &{} \ddots &{} \vdots \\ b_n &{} 0 &{} 0 &{} \ldots &{} 0 \end{pmatrix} . \end{aligned}$$

Fig. 3
figure 3

Change in energy (left) and change in eigenvalues (right) for the isospectral implicit midpoint rule (top) and the \(4^{\text {th}}\)-order symplectic DIRK (bottom) for the Toda lattice

While the Toda lattice is isospectral, it is not Lie-Poisson. It can, however, be generalized to an isospectral flow in \(\mathfrak {gl}(n)\) by introducing

$$\begin{aligned} B(W) = \begin{pmatrix} 0 &{} W_{12} &{} 0 &{} \ldots &{} -W_{1n} \\ -W_{21} &{} 0 &{} W_{23} &{} \ldots &{} 0 \\ 0 &{} -W_{32} &{} 0 &{} \ldots &{} 0 \\ \vdots &{} \vdots &{} \vdots &{} \ddots &{} \vdots \\ W_{n1} &{} 0 &{} 0 &{} \ldots &{} 0 \end{pmatrix} \end{aligned}$$

which corresponds to the extended Hamiltonian \({\tilde{H}}(W) = -\text {Tr} \left( W^\dagger B(W) \right) + H(W) \). The dynamics of the extended Lie-Poisson system are \({\dot{W}} = { [\nabla {\tilde{H}}(W)^\dagger , W]}\) which has the form of Eq. 2.3.
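The two matrices of the Lax pair and the generalized generator \(B(W)\) can be assembled as in the following sketch. The helper names `toda_L` and `toda_B` are hypothetical, the periodic wrap-around is placed in the corner entries as in the matrices above, and \(n \ge 3\) is assumed so that the corners do not coincide with the off-diagonals.

```python
import numpy as np

def toda_L(a, b):
    """Periodic Toda Lax matrix: a_i on the diagonal, b_i coupling sites i and i+1 (mod n)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n = len(a)
    L = np.diag(a)
    for i in range(n):
        j = (i + 1) % n          # i = n-1 wraps around to the corner entries
        L[i, j] = b[i]
        L[j, i] = b[i]
    return L

def toda_B(W):
    """Generalized generator B(W) for W in gl(n); reduces to B(L) when W = L is symmetric."""
    n = W.shape[0]
    B = np.zeros_like(W)
    for i in range(n - 1):
        B[i, i + 1] = W[i, i + 1]    # superdiagonal keeps its sign
        B[i + 1, i] = -W[i + 1, i]   # subdiagonal is negated
    B[0, n - 1] = -W[0, n - 1]       # periodic corner entries
    B[n - 1, 0] = W[n - 1, 0]
    return B

def toda_rhs(W):
    """Right-hand side [B(W), W] of the isospectral flow."""
    B = toda_B(W)
    return B @ W - W @ B
```

For a symmetric \(L\), `toda_B(L)` is anti-symmetric and `toda_rhs(L)` is again symmetric, so the flow stays in the symmetric matrices; for example, `toda_L([-1, 1, -1, 1], [-1, 1, -1, 1])` builds a matrix from \(a_i = b_i = (-1)^i\).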

In Fig. 3 we show experimental results for the periodic Toda lattice for the implicit midpoint rule and the 7-stage \(4^{\text {th}}\)-order symplectic DIRK. In both cases we obtained excellent conservation of the eigenvalues and the extended Hamiltonian. The initial condition \(W_0\) is the same as used in [14], where \(a_i = b_i = (-1)^i, i = 1, \ldots ,4\), and we used \(h = 0.1\) for the time step.

5.4 Matrix model of barotropic vorticity equation on \(S^2\)

The barotropic vorticity equation on the sphere is given by

$$\begin{aligned} {\dot{\zeta }} = \{ \zeta , \varDelta ^{-1} \zeta \} , \end{aligned}$$
(5.3)

where \(\zeta \) is vorticity and \(\{ \, , \}\) the canonical Poisson bracket on \(S^2\). Eq. 5.3 is a simple model for weather and climate dynamics and plays an important role in various applications.

A structure-preserving discretization of it is the matrix model by Zeitlin [22], which goes back to early work for the torus [21] and results by Hoppe on matrix harmonics for \(S^2\) [10]. Recently, Modin and Viviani [15] used it to study turbulence on the sphere. The governing equation for the matrix model is

$$\begin{aligned} { {\dot{W}} = N^{3/2} \, [ {\hat{\varDelta }}^{-1} W , W ]} \end{aligned}$$
(5.4)

where \({\hat{\varDelta }}\) is a discrete Laplacian [10] and \(W \in {\mathfrak {su}(N)}\) with \(N=L+1\) is the representation of vorticity. It is related to the real spatial vorticity field by

$$\begin{aligned} W = \sum _{l=1}^L \sum _{m=-l}^l \zeta _{lm} \, T_{lm}^{(L)} \end{aligned}$$

where the \(\zeta _{lm}\) are the spectral coefficients of vorticity with respect to spherical harmonics and the \(T_{lm}^{(L)}\) are matrix harmonics [10].

Eq. 5.4 is an isospectral flow on \({\mathfrak {su}(N)}\) with conserved quantities

$$\begin{aligned} C_n = \mathrm {tr}( W^n ) , \quad n = 1 , \ldots , N, \end{aligned}$$

which are the discrete analogues of the integrated powers of vorticity that are conserved for the continuous barotropic vorticity equation [1].
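The Casimirs \(C_n\) are straightforward to monitor numerically. The sketch below is a hypothetical helper, not the code used for the experiments: it accumulates the matrix powers of \(W\) and returns their traces, and it evaluates the time derivative \(n \, \mathrm {tr}( W^{n-1} [B, W] )\) of \(C_n\), which vanishes for any generator \(B\) by cyclicity of the trace.

```python
import numpy as np

def casimirs(W, n_max=None):
    """Return the array (tr(W), tr(W^2), ..., tr(W^n_max)); n_max defaults to N."""
    N = W.shape[0]
    n_max = N if n_max is None else n_max
    values = []
    P = np.eye(N, dtype=W.dtype)
    for _ in range(n_max):
        P = P @ W                    # P now holds the next power of W
        values.append(np.trace(P))
    return np.array(values)

def casimir_rates(W, B, n_max=None):
    """d/dt C_n = n tr(W^(n-1) [B, W]) along dW/dt = [B, W]; zero up to round-off."""
    N = W.shape[0]
    n_max = N if n_max is None else n_max
    comm = B @ W - W @ B
    rates = []
    P = np.eye(N, dtype=W.dtype)     # W^(n-1), starting with n = 1
    for n in range(1, n_max + 1):
        rates.append(n * np.trace(P @ comm))
        P = P @ W
    return np.array(rates)
```

Comparing `casimirs(W)` before and after a time step gives the kind of diagnostic reported for the discrete enstrophy \(C_2\).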

In Fig. 4 we report our experimental results for the \(2^{\text {nd}}\)-order implicit midpoint rule and the \(4^{\text {th}}\)-order symplectic DIRK. The initial condition \(W_0\) was a randomly generated matrix in \(\mathfrak {sl}(N)\) and we used \(h = 0.0025\) (note that the matrix model results in a time rescaling, cf. [15, sec. 2.4]). Both isospectral time integration schemes preserve the eigenvalues of W up to machine precision, as expected, and this carries over to the Casimirs, as exemplified for the discrete enstrophy \(C_2\).

Fig. 4
figure 4

Change in energy (top), eigenvalues (middle) and enstrophy (bottom) for the isospectral implicit midpoint rule and the \(4^{\text {th}}\)-order symplectic DIRK for the matrix model of the barotropic vorticity equation. We used a truncation at \(N=33\)

6 Conclusion

In this work, we showed that isospectral symplectic diagonally implicit Runge-Kutta methods (symplectic DIRKs) can be derived from the discrete Hamilton’s principle of [5, 8]. The resulting numerical schemes are the simplest and most easily implemented isospectral higher order Runge-Kutta methods. We demonstrated this with our numerical results for the rigid body, the Toda lattice and the discrete 2D Euler fluid. Our results for a \(4^{\text {th}}\)-order symplectic DIRK yield a time integration that preserves energy and isospectrality almost up to machine precision. Within the available precision, it is hence a faithful discretization of the system.

Some interesting directions for future work are as follows. Theorem 4.1 only applies to symplectic DIRKs, and a generalization to arbitrary symplectic Runge-Kutta methods is hence open. It is currently also not clear to us why the derivation of Gawlik et al. [8] yields the midpoints of the isospectral implicit midpoint rules by Viviani [20]. We believe that understanding this might shed more light on the overall structure of discrete variational principles for isospectral systems. Another interesting objective is to extend our results to systems whose configuration space is a semi-direct product. In some cases, such as the heavy top, an infinite number of conserved quantities then also exists. Bobenko and Suris [2] already studied this for the special case of the Lagrange top, although integrability is no longer given in the general case.