Krylov integrators for Hamiltonian systems

We consider Arnoldi-like processes to obtain symplectic subspaces for Hamiltonian systems. Large systems are locally approximated by systems living in low dimensional subspaces; we especially consider Krylov subspaces and some extensions. These subspaces are utilized in two ways: either the local small dimensional system is solved numerically, or, in a given numerical (e.g. exponential) integrator, the subspace is used to approximate the necessary functions. In the former case one can expect excellent energy preservation; for the latter this holds for linear systems. For the second order exponential integrators we consider, these two approaches are shown to be equivalent. In numerical experiments with nonlinear Hamiltonian problems their behaviour is promising.


Introduction
Symplectic methods have been shown to be very effective in the long time integration of Hamiltonian systems (see [11]). Many of them are implicit and necessitate the solution of systems of equations. If the differential equation system is large and sparse, a natural approach is to use Krylov subspace techniques to approximate the solution of the algebraic equations.
A related approach is to use Krylov approximations of the matrix exponential in the so-called exponential integrators (see [13]). This has turned out to be a superior technique for many large systems of ordinary differential equations.
Krylov subspace techniques can be viewed as local low dimensional approximations of the large system. For Hamiltonian systems the standard Arnoldi type iterations produce low dimensional systems that are no longer Hamiltonian. In this paper special attention is paid to producing subspaces with symplectic bases. The time symmetry of the Hamiltonian systems is also taken into account when producing the bases. This is an extended version of the slides by Eirola, presented at a Workshop on Exponential Integrators in Innsbruck in 2004 (see [7]). Part of the material was introduced in the master's thesis of the second author [16], which was supervised by Eirola. The original ideas of Eirola came from considering linear Hamiltonian systems in R^{2n} as R-linear systems in C^n (see [8]). The slides [7] were using that language, but the present version is written in a more standard form.

Hamiltonian systems
Given a smooth function H : R^{2n} → R, consider the Hamiltonian system

x'(t) = J^{-1} ∇H(x(t)),    (1)

where J = [0 I; −I 0]. A matrix A ∈ R^{2n×2n} is called Hamiltonian if A^T = JAJ, and symplectic if A^T J A = J. The Jacobian of ∇H(x) is a symmetric matrix at every point. Thus D_x J^{-1} ∇H(x) is a Hamiltonian matrix.
We assume that (1) has a unique solution and write it x(t) = ϕ_t(x_0). Then:
• Energy is preserved: H(x(t)) is constant in t.
• For every t the mapping ϕ_t : R^{2n} → R^{2n} is symplectic (or canonical), that is, its derivative is a symplectic matrix at every point.
Symplectic integrators produce symplectic one step maps for Hamiltonian systems (see [11]); the implicit midpoint rule is an example. For linear systems, i.e., when H is of the form H(x) = (1/2) x^T S x + c^T x, the energy is also preserved in the numerical solution with this and many other symplectic methods. A one step method is called symmetric if the map given by the integrator is time symmetric, i.e., changing h to −h is equivalent to switching x_j and x_{j+1}. The implicit midpoint rule, for example, is symmetric.
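As an illustration of the implicit midpoint rule and its energy behaviour, here is a minimal sketch (the harmonic oscillator test problem and the fixed-point solver are our own illustrative choices, not part of the methods compared later):

```python
import numpy as np

def implicit_midpoint_step(f, x, h, tol=1e-14, maxit=100):
    """One step of the implicit midpoint rule  x+ = x + h f((x + x+)/2),
    solved here by fixed-point iteration (adequate for small h)."""
    x_next = x + h * f(x)                     # explicit Euler predictor
    for _ in range(maxit):
        x_new = x + h * f(0.5 * (x + x_next))
        if np.linalg.norm(x_new - x_next) < tol:
            return x_new
        x_next = x_new
    return x_next

# Harmonic oscillator: H(q, p) = (q^2 + p^2)/2, so x' = J^{-1} grad H = (p, -q)
f = lambda x: np.array([x[1], -x[0]])
x = np.array([1.0, 0.0])
H0 = 0.5 * (x @ x)
for _ in range(1000):
    x = implicit_midpoint_step(f, x, 0.05)
energy_drift = abs(0.5 * (x @ x) - H0)
```

For this quadratic Hamiltonian the midpoint rule conserves the energy exactly (up to the fixed-point tolerance and roundoff), as stated above for linear systems.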
For large systems implicit methods may become expensive. In this paper we will consider several low dimensional Hamiltonian approximations and the use of implicit methods or exponential integrators for these.

Symplectic subspaces and low dimensional approximations
Recall some basic definitions and properties in R^{2n}. Denote by ω the nondegenerate skew-symmetric bilinear form ω(x, y) := x^T J y. A subspace V is isotropic if ω(x, y) = 0 for all x, y ∈ V, and a subspace W is symplectic if for every nonzero x ∈ W there exists y ∈ W such that ω(x, y) ≠ 0. Then the dimension of W is even. A basis e_1, ..., e_k, f_1, ..., f_k of W is called symplectic, or a Darboux basis, if for all i, j = 1, ..., k

ω(e_i, e_j) = ω(f_i, f_j) = 0   and   ω(e_i, f_j) = δ_{ij}.

If V is an isotropic subspace with an orthonormal basis e_1, ..., e_k, then e_1, ..., e_k are also ω-orthogonal, W = V ⊕ JV is a symplectic subspace, and e_1, ..., e_k, J^{-1}e_1, ..., J^{-1}e_k is a symplectic basis of W.
We also call a matrix U ∈ R^{2n×2k} symplectic if U^T J U = J_{2k}, where J_{2k} denotes the corresponding matrix of dimension 2k (pointing out the dimension when needed). We will consider local approximations of the Hamiltonian system (1). Assume that at a point x_0 ∈ R^{2n} we are given a symplectic matrix U ∈ R^{2n×2k}. Consider the Hamiltonian system in R^{2k} corresponding to the function η(ξ) = H(x_0 + Uξ). Then we get

ξ'(t) = J_{2k}^{-1} ∇η(ξ(t)) = J_{2k}^{-1} U^T ∇H(x_0 + Uξ(t)),    ξ(0) = 0.    (2)

One strategy is to solve (2) numerically from ξ_0 = 0 up to ξ_1 ≈ ξ(t_1) and set x_1 = x_0 + Uξ_1. Clearly, if we use an energy preserving scheme for the system (2), we will conserve the energy of the large system too, i.e., H(x_0 + Uξ_1) = η(ξ_1) = η(ξ_0) = H(x_0).
Note that if the sets of constant energy of the original system are bounded, then they are such for the small dimensional approximations too. This implies that the approximations inherit stability of equilibria in a natural way.
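The construction above can be sketched numerically. The following illustrative fragment (an arbitrarily chosen quadratic Hamiltonian and a simple coordinate-subspace U, both our own choices) verifies that U is symplectic and that the reduced system matrix is again Hamiltonian:

```python
import numpy as np

def Jmat(m):
    """The canonical symplectic matrix J of size 2m x 2m."""
    return np.block([[np.zeros((m, m)), np.eye(m)],
                     [-np.eye(m), np.zeros((m, m))]])

n, k = 4, 2
J, Jk = Jmat(n), Jmat(k)

# A simple symplectic U in R^{2n x 2k}: an orthonormal basis E of an
# isotropic subspace (two "q"-coordinate directions) together with J^{-1} E.
E = np.zeros((2 * n, k))
E[0, 0] = E[1, 1] = 1.0
U = np.hstack([E, np.linalg.solve(J, E)])
symplectic_defect = np.linalg.norm(U.T @ J @ U - Jk)

# Quadratic Hamiltonian H(x) = x^T S x / 2 gives the large system x' = J^{-1} S x.
# The reduced Hamiltonian eta(xi) = H(U xi) has Hessian U^T S U, and the
# reduced system  xi' = Jk^{-1} U^T S U xi  is again Hamiltonian:
rng = np.random.default_rng(1)
S = rng.standard_normal((2 * n, 2 * n))
S = (S + S.T) / 2
A = np.linalg.solve(Jk, U.T @ S @ U)
hamiltonian_defect = np.linalg.norm(Jk @ A - (Jk @ A).T)
```

A matrix A is Hamiltonian exactly when J_{2k} A is symmetric, which is what the last line checks.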
In case that at x_0 we are given a matrix U with orthonormal columns, we set U^† = U^T and use the projected system ξ' = U^† f(x_0 + Uξ) in place of (2). Then the system is not necessarily Hamiltonian.
We will also consider another strategy: instead of solving low dimensional systems, we approximate suitable functions of a numerical method in the low dimensional space R^{2k}. As we will see, for the exponential integrators we consider these two approaches are equivalent.
The idea of approximating a Hamiltonian system by another of smaller dimension is not new. See, for example the discussion in [17]. A novelty here is to use local (later Krylov) approximations.
If U is symplectic and does not depend on x_0, then it is not difficult to prove, for example, that using the implicit midpoint rule for (2) induces a symplectic map ψ : x_0 ↦ x_1. But in order to get efficient algorithms we let U depend on x_0, and then this approach generally does not produce a symplectic map.

Exponential integrators
The use of approximations of the matrix exponential as a part of time propagation methods for differential equations has turned out to be very effective (see e.g. [13]). We will consider the application of three second order exponential integrators to the Hamiltonian system (1). In what follows, the matrix H will denote the Jacobian of the right hand side of (1) at x_0, i.e., H = Df(x_0). The methods can be seen as exponential integrators applied to semilinear equations which are local linearizations of (1). In the literature, methods of this type are also called exponential Rosenbrock methods [14].

Exponential Euler method
As a first method we consider the exponential Euler method (EE)

x_+ = x + h φ(hH) f(x),    (3)

where φ(z) = ∫_0^1 e^{tz} dt = z^{-1}(e^z − 1) and H = Df(x). Note that if f is linear, then x_+ = ϕ_h(x), i.e., the method gives exact values of the solution.
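A minimal sketch of the exponential Euler step, with φ evaluated by a truncated Taylor series (adequate only for small ‖hH‖; the harmonic oscillator test problem is an illustrative choice):

```python
import numpy as np

def phi1(M, terms=30):
    """phi(M) = M^{-1}(e^M - I) = sum_{j>=0} M^j/(j+1)!, by truncated
    Taylor series (fine for small ||M||; not a production implementation)."""
    out = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    fact = 1.0
    for j in range(1, terms):
        term = term @ M
        fact *= j + 1
        out = out + term / fact
    return out

def exponential_euler_step(f, H, x, h):
    """One exponential Euler step  x+ = x + h phi(hH) f(x)."""
    return x + h * (phi1(h * H) @ f(x))

# For linear f(x) = Hx the step is exact: with H = [[0,1],[-1,0]] the flow
# is a rotation, e^{hH} x0 = (cos h, -sin h) for x0 = (1, 0).
H = np.array([[0.0, 1.0], [-1.0, 0.0]])
x1 = exponential_euler_step(lambda x: H @ x, H, np.array([1.0, 0.0]), 0.3)
err = np.linalg.norm(x1 - np.array([np.cos(0.3), -np.sin(0.3)]))
```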
Assume now that the system is approximated locally as in Section 3. If we use the exponential Euler method (3) for the low dimensional system (2), we produce

x_+ = x + h U φ(hF) U^† f(x),    F = U^† H U.    (4)

For linear problems this will preserve energy exactly:

Lemma 1. Let f be linear and U symplectic. Then the exponential Euler method (4) preserves energy, i.e., H(x_+) = H(x).
Proof. The local problem (see (2)) is now linear, ξ' = F ξ + U^† f(x), i.e., the exponential Euler approximation gives the exact solution for the problem in R^{2k}. Hence the energy is preserved in the small system and consequently also for x_+ = x + Uξ(h).
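The mechanism behind Lemma 1 can be checked numerically. A sketch with a randomly chosen quadratic-plus-linear Hamiltonian (here the step is taken in the full space, which corresponds to the trivially symplectic choice U = I; φ is again evaluated by a truncated series):

```python
import numpy as np

def phi1(M, terms=40):
    """phi(M) = sum_{j>=0} M^j/(j+1)! by truncated Taylor series."""
    out = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    fact = 1.0
    for j in range(1, terms):
        term = term @ M
        fact *= j + 1
        out = out + term / fact
    return out

n = 3
J = np.block([[np.zeros((n, n)), np.eye(n)],
              [-np.eye(n), np.zeros((n, n))]])
rng = np.random.default_rng(2)
S = rng.standard_normal((2 * n, 2 * n))
S = (S + S.T) / 2
c = rng.standard_normal(2 * n)

def energy(x):
    """H(x) = x^T S x / 2 + c^T x."""
    return 0.5 * x @ S @ x + c @ x

# f(x) = J^{-1} grad H(x) = A x + b with A = J^{-1} S, b = J^{-1} c
A = np.linalg.solve(J, S)
b = np.linalg.solve(J, c)
x = rng.standard_normal(2 * n)
h = 0.1
x_plus = x + h * (phi1(h * A) @ (A @ x + b))   # exponential Euler step
energy_drift = abs(energy(x_plus) - energy(x))
```

Since the step reproduces the exact flow of the linear system, the energy drift is at the level of the series truncation and roundoff.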

Explicit exponential midpoint rule
We consider next the explicit exponential midpoint rule (see [13])

x_+ = e^{2hH} x_− + 2h φ(2hH) (f(x) − Hx),    (5)

where H = Df(x). For linear Hamiltonian problems f(x) = Hx + c this gives x_+ = e^{2hH} x_− + 2h φ(2hH) c, the exact solution over the double step. Hence the energy of the averages is preserved:

H((x + x_+)/2) = H((x_− + x)/2).    (6)

Remark 2. In [9] it was noticed that the explicit midpoint rule x_+ = x_− + 2hHx (for the homogeneous problem f(x) = Hx) preserves another quantity: ω(Hx, x_+) = ω(Hx_−, x). Equation (6) implies this, too.
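The conserved quantity of Remark 2 for the (non-exponential) explicit midpoint rule is easy to verify numerically; a sketch with an arbitrarily chosen Hamiltonian matrix H = J^{-1}S and illustrative starting data:

```python
import numpy as np

n = 3
J = np.block([[np.zeros((n, n)), np.eye(n)],
              [-np.eye(n), np.zeros((n, n))]])
rng = np.random.default_rng(3)
S = rng.standard_normal((2 * n, 2 * n))
S = (S + S.T) / 8                        # scaled down to keep norms modest
H = np.linalg.solve(J, S)                # Hamiltonian matrix: J H = S symmetric

def omega(x, y):
    """The symplectic form omega(x, y) = x^T J y."""
    return x @ (J @ y)

h = 0.01
x_prev = rng.standard_normal(2 * n)
x = x_prev + h * (H @ x_prev)            # starter step
q0 = omega(H @ x_prev, x)                # the quantity omega(H x_-, x)
max_drift = 0.0
for _ in range(200):
    x_next = x_prev + 2 * h * (H @ x)    # explicit (leapfrog) midpoint rule
    max_drift = max(max_drift, abs(omega(H @ x, x_next) - q0))
    x_prev, x = x, x_next
```

The conservation is exact in exact arithmetic: ω(Hx, x_+) = ω(Hx, x_−) because ω(Hx, Hx) = 0, and ω(Hx, x_−) = ω(Hx_−, x) because H is Hamiltonian.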
Again we approximate (5) with the corresponding low dimensional scheme, which we denote by (7). For this we have

Theorem 3. Let f be linear, U ∈ R^{2n×2k} symplectic, and assume that x − x_− is in the range of U. Then (6) holds for the scheme (7).

Proof. From (7) and zφ(z) = e^z − 1 we get that the x-vectors propagate according to the exponential Euler method (4), and the result follows by Lemma 1.
In case the column space of U contains the vector x − − x , we also have the following.

Lemma 4. Assume that U is a full rank matrix at x with a left inverse U^†, and that R(U) contains x_− − x. Then the approximate explicit exponential midpoint rule is symmetric. Thus the steps backward can be taken by replacing h with −h.

Implicit exponential midpoint rule
As a third exponential integrator, we consider the implicit exponential midpoint rule (IEMP) (see [3]). This gives a symmetric method when the linear part H of f is fixed. When H comes from a linearization of a nonlinear Hamiltonian system, care is needed in order to retain the symmetry; this leads to the scheme (8). For linear systems of the form f(x) = Hx + c, the second equation of (8) can be written equivalently as an exponential Euler step. Then x_+ propagates according to the exponential Euler method and the energy is preserved in case U is symplectic (Lemma 1).
When we apply (8), the total approximation is symmetric if H is evaluated at the midpoint x̄.

Theorem 5. Consider the approximation

x_+ = x̄ + Uξ_+,    (9)

where ξ_+ is obtained from applying (8) to the local system. Then (9) gives a symmetric method.
Proof. Applying (8) to the local system (2) gives the pair of equations (10). We show that (10) leads to a symmetric approximation of the full system. Multiplying the upper equation of (10) by e^{hF}, using the relation zφ(z) = e^z − 1, and combining this with both equations of (10) gives an expression for ξ_+. Multiplying this from the left by U and adding x̄ gives the update for x_+. Replacing here h with −h and multiplying from the left by U e^{hF} U^† shows the symmetry.
Crucial for the symmetry of IEMP is that the Jacobian H and the basis U are the same when considering stepping from x to x_+ and vice versa. This is the case if H is evaluated at x̄, and U is generated using the Krylov subspace methods described in Section 5.
Our numerical strategy is to perform one time step h using the exponential Euler method from x in order to approximate the midpoint x̄. Then, after evaluating the Jacobian H and forming the basis U at x̄, we solve the implicit equation using fixed point iteration and perform the step of size 2h to obtain ξ_+ and x_+ = x̄ + Uξ_+.
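The fixed point iteration used for the implicit equation is standard; a generic sketch (the scalar test equation is purely illustrative and not related to (8)):

```python
import numpy as np

def fixed_point(g, xi0, tol=1e-12, maxit=200):
    """Solve xi = g(xi) by fixed-point iteration; this converges when g is
    a contraction, which for the integrator means h small enough."""
    xi = xi0
    for _ in range(maxit):
        xi_new = g(xi)
        if np.linalg.norm(xi_new - xi) < tol:
            return xi_new
        xi = xi_new
    return xi

# Toy scalar example: xi = cos(xi) has the "Dottie" fixed point 0.739085...
root = fixed_point(np.cos, np.array([0.5]))
```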

Forming the local basis using Krylov subspace methods
We discuss next the approximation of matrix valued φ functions using Krylov subspace methods and show how they are naturally connected to the local approximations discussed in Section 4. When the matrix A is large but the operation v ↦ Av is inexpensive, it is reasonable to use Krylov subspace methods. These work in the Krylov subspaces

K_k(A, v) = span{v, Av, ..., A^{k−1} v}.

The Arnoldi iteration uses the Gram-Schmidt process and produces an orthonormal basis q_1, ..., q_k for K_k(A, v). Writing Q_k = [q_1 ⋯ q_k] and F_k = Q_k^T A Q_k, we have in particular: if ϕ(z) = Σ_j a_j z^j has convergence radius larger than the spectral radius of A and w ∈ R(Q_k), then ϕ(A) w can be approximated by Q_k ϕ(F_k) Q_k^T w. The effectiveness of Krylov subspace methods is based on the fact that if the component of A^k v orthogonal to K_k(A, v) is small, then things are approximately as above, and this can happen already for a reasonable size k. Thus it is reasonable to consider the approximation

ϕ(A) v ≈ Q_k ϕ(F_k) Q_k^T v = ‖v‖ Q_k ϕ(F_k) e_1,    (11)

which was used already in [5] and [10]. We refer to [12] for a detailed error analysis.
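The Arnoldi iteration and the approximation (11) can be sketched as follows (the matrix sizes and the eigendecomposition-based evaluation of φ on the small matrix are illustrative choices):

```python
import numpy as np

def arnoldi(A, v, k):
    """Standard Arnoldi iteration: returns Q with orthonormal columns
    spanning K_k(A, v) and the small matrix F = Q^T A Q (upper Hessenberg)."""
    n = v.size
    Q = np.zeros((n, k))
    F = np.zeros((k, k))
    Q[:, 0] = v / np.linalg.norm(v)
    for j in range(k):
        w = A @ Q[:, j]
        for i in range(j + 1):            # modified Gram-Schmidt
            F[i, j] = Q[:, i] @ w
            w = w - F[i, j] * Q[:, i]
        if j + 1 < k:
            F[j + 1, j] = np.linalg.norm(w)
            Q[:, j + 1] = w / F[j + 1, j]
    return Q, F

def phi_dense(M):
    """phi(M) = M^{-1}(e^M - I) for a small dense M, via eigendecomposition."""
    w, V = np.linalg.eig(M)
    return (V * ((np.exp(w) - 1) / w)) @ np.linalg.inv(V)

# Krylov approximation (11):  phi(A) v  ~  ||v|| Q_k phi(F_k) e_1
rng = np.random.default_rng(0)
n, k = 50, 12
A = rng.standard_normal((n, n)) / n       # modest norm so k = 12 suffices
v = rng.standard_normal(n)
Q, F = arnoldi(A, v, k)
approx = np.linalg.norm(v) * (Q @ phi_dense(F)[:, 0])
exact = phi_dense(A) @ v
rel_err = np.linalg.norm(approx - exact) / np.linalg.norm(exact)
```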
We show next how the Krylov approximation (11) is naturally connected to the strategy of applying exponential integrators to the local system (2).

Equivalence of the Krylov and the local system approximations
Consider the local system (2) corresponding to the basis U = Q_k, where Q_k gives an orthonormal basis for K_k(H, f(x_0)). Recall from Section 4 the strategy of solving the local system (2) numerically from ξ_0 = 0 up to ξ_1 ≈ ξ(t_1) and setting x_1 = x_0 + Uξ_1. As shown in Subsection 4.1, applying the exponential Euler method to the local system gives the approximation x_1 = x_0 + h Q_k φ(hF_k) Q_k^T f(x_0). We immediately see from (11) that this is the Krylov subspace approximation of the exponential Euler step (3). As shown in Subsection 4.2, applying the explicit exponential midpoint rule gives a corresponding formula, which can likewise be seen as a Krylov subspace approximation (11) of the EEMP step (5). In order to satisfy the assumptions of Lemma 4, the vector x_− − x has to be in the range of U. This is addressed in Subsection 5.2.1.
Similarly, if we perform a Krylov approximation of the IEMP step (8), and denote ξ = x − x̄ and ξ_+ = x_+ − x̄, we get the small dimensional system (10).
The present concern, however, is that if A above is a Hamiltonian matrix, this does not necessarily bring any special structure to Q k or F k .

Symplectic Krylov processes
In order to obtain good local approximations for a Hamiltonian system with linear part H we would like to have:
a) a symplectic subspace W with a corresponding basis;
b) K_k(H, f) ⊂ W in order to have polynomials of H applied to f represented in W.
We expect this to be worth pursuing for approximations of ϕ(H)f.
Consider first the Krylov subspace corresponding to H and v. The construction of a symplectic basis for W_k is slightly more complicated than the standard Arnoldi process: we reorthogonalize, with respect to both ⟨·,·⟩ and ω(·,·), the ⟨·,·⟩-orthogonal vectors provided by the standard Arnoldi iteration. The result is a symplectic and orthonormal matrix.
Here the columns of Q form an orthonormal basis for K k (H, v) and those of U a symplectic basis for W k .
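A simplified sketch of the reorthogonalization idea (not the exact algorithm of the paper): Gram-Schmidt against both the previously accepted vectors and their J-images yields an orthonormal and isotropic V, and U = [V, J^{-1}V] is then an orthonormal symplectic basis of V ⊕ JV; the input Q here is an arbitrary orthonormal matrix standing in for the Arnoldi basis.

```python
import numpy as np

def isotropic_orthonormalize(Q, J, tol=1e-12):
    """Gram-Schmidt the columns of Q against both previous vectors v and
    their images J v, so the result V is orthonormal and isotropic."""
    V = []
    for j in range(Q.shape[1]):
        w = Q[:, j].copy()
        for _ in range(2):                # two passes for numerical safety
            for v in V:
                w = w - (v @ w) * v       # <.,.>-orthogonalize
                Jv = J @ v
                w = w - (Jv @ w) * Jv     # omega-orthogonalize
        nrm = np.linalg.norm(w)
        if nrm > tol:
            V.append(w / nrm)
    return np.column_stack(V)

n = 6
J = np.block([[np.zeros((n, n)), np.eye(n)],
              [-np.eye(n), np.zeros((n, n))]])
rng = np.random.default_rng(5)
Q = np.linalg.qr(rng.standard_normal((2 * n, 3)))[0]   # orthonormal, not isotropic
V = isotropic_orthonormalize(Q, J)
U = np.hstack([V, -J @ V])               # J^{-1} V = -J V: basis of V + J V
k = V.shape[1]
Jk = np.block([[np.zeros((k, k)), np.eye(k)],
               [-np.eye(k), np.zeros((k, k))]])
symplectic_defect = np.linalg.norm(U.T @ J @ U - Jk)
orthonormal_defect = np.linalg.norm(U.T @ U - np.eye(2 * k))
```

Note that span[V, J^{-1}V] contains span(Q) by construction, matching requirement b) above.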

Remark 6.
There is a way to construct the matrix F more economically from the computations of step 2, but the reorthogonalization remains the costly part of this approach.

Isotropic Arnoldi
Mehrmann and Watkins [18] suggest the isotropic Arnoldi process, which performs a direct ⟨·,·⟩- and ω(·,·)-orthogonalization in the Arnoldi process. Here we obtain a symplectic matrix with orthonormal columns. However, its range does not necessarily contain the Krylov subspace K_k(H, v). Thus, generally, this iteration does not have the property that p(H)v is in the span of u_1, ..., u_k for every polynomial p of degree k − 1. Since our present aim is to approximate operator functions that have converging power series, this iteration can be expected to be less effective for our purposes. (Mehrmann and Watkins use the iteration for a quite different purpose: computing eigenpairs of skew-Hamiltonian/Hamiltonian pencils.) We will see this in the numerical tests. Also there is a possibility of a breakdown: after orthogonalization we may get r = 0 without obtaining any useful information about K_k(H, v).

Hamiltonian Lanczos
Benner, Faßbender, and Watkins have several versions of Hamiltonian Lanczos processes (see [1], [20]). The following is a promising one from Watkins (Algorithm 4 of [20]).
Due to short recursion this is an economic iteration. But it has similar problems as the usual biorthogonal Lanczos, e.g., near breakdowns and loss of orthogonality. These can be partly circumvented. For small k this may be a good choice.
By the very construction of these symplectic processes we get the following:

Proposition 7. Combining any of the symplectic Krylov processes with a method that preserves energy for the small dimensional system (2) will preserve the energy of the original system, too.
In the numerical experiments we will use four algorithms to produce Krylov subspaces: the standard Arnoldi iteration in R^{2n} and the three symplectic ones: symplectic Arnoldi, isotropic Arnoldi, and the Hamiltonian Lanczos process. With respect to their costs to produce a space of fixed dimension they can be ordered as

Hamiltonian Lanczos < Arnoldi < Isotropic Arnoldi < Symplectic Arnoldi.

The main weaknesses of each of these are:
• Arnoldi in R^{2n}: the approximation is not Hamiltonian.
• Isotropic Arnoldi: does not include a Krylov subspace.
• Symplectic Arnoldi: the reorthogonalization is costly.
• Hamiltonian Lanczos: possible (near) breakdowns and loss of orthogonality.

Adding a vector to the basis
When using the EEMP method (5), the vector x_− − x needs to be added to the basis U_k at each time step. For an orthogonal and/or isotropic basis this is straightforward. For the symplectic basis, a symplectic version of the Gram-Schmidt algorithm adds x and Jx to the basis U = [V W]; here we denote ω(·,·) = ⟨J·,·⟩. The symplectic orthogonalization can also be performed in the modified Gram-Schmidt manner, one vector at a time. Notice also that in the second step the vector x can be scaled by any constant.
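A hedged sketch of such a symplectic Gram-Schmidt extension step (our own simplified variant, not the paper's exact pseudocode: it orthogonalizes against both the basis vectors and their J-images and appends the pair (v, J^{-1}v)):

```python
import numpy as np

def add_pair(U, J, x, tol=1e-12):
    """Extend an orthonormal-symplectic basis U = [V W] by the pair
    (v, J^{-1} v) obtained from x via Gram-Schmidt against both the
    columns u of U and their images J u."""
    w = x.astype(float)
    for _ in range(2):                    # two passes for numerical safety
        for u in U.T:
            w = w - (u @ w) * u           # <.,.>-orthogonalize
            Ju = J @ u
            w = w - (Ju @ w) * Ju         # omega-orthogonalize
    nrm = np.linalg.norm(w)
    if nrm < tol:
        return U                          # x already (numerically) in range
    v = w / nrm
    k = U.shape[1] // 2
    V, W = U[:, :k], U[:, k:]
    return np.hstack([V, v[:, None], W, np.linalg.solve(J, v)[:, None]])

n = 4
J = np.block([[np.zeros((n, n)), np.eye(n)],
              [-np.eye(n), np.zeros((n, n))]])
E = np.zeros((2 * n, 1))
E[0, 0] = 1.0
U0 = np.hstack([E, np.linalg.solve(J, E)])   # one symplectic pair to start
rng = np.random.default_rng(4)
U1 = add_pair(U0, J, rng.standard_normal(2 * n))
J2 = np.block([[np.zeros((2, 2)), np.eye(2)],
               [-np.eye(2), np.zeros((2, 2))]])
symplectic_defect = np.linalg.norm(U1.T @ J @ U1 - J2)
orthonormal_defect = np.linalg.norm(U1.T @ U1 - np.eye(4))
```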

Numerical tests
We compare numerically the three exponential time integrators of Section 4 and the four Arnoldi-like processes of Section 5 used to produce the local basis U_k. We apply the methods to large sparse Hamiltonian systems which are obtained from finite difference discretizations of one dimensional nonlinear wave equations. For ease of presentation we first illustrate by an example our approach of deriving large sparse Hamiltonian systems from Hamiltonian PDEs. For further examples we refer to [4].

Spatial discretization of Hamiltonian PDEs
As an example consider the nonlinear Klein-Gordon equation in one dimension,

u_tt = u_xx − f(u),    (12)

where u(x, t) is a scalar valued periodic function (u(0, t) = u(L, t) for some L > 0) and f is a smooth function. Setting v = u_t and u = (u, v)^T, the equation (12) can be viewed as a Hamiltonian system (13), where ∂H/∂u and ∂H/∂v denote the functional derivatives of the Hamiltonian functional. To obtain a Hamiltonian system approximating the equation (12), we perform a discretization with respect to the spatial variable x on an equidistant grid with interval ∆x = L/n, n ∈ N_+, and denote by q_i(t) and p_i(t), i = 1, ..., n, the approximations to u(i∆x, t) and v(i∆x, t). For the second derivative u_xx we use the central difference approximation

u_xx(i∆x, t) ≈ (q_{i+1}(t) − 2 q_i(t) + q_{i−1}(t)) / ∆x².

Expressing the approximations as vectors q(t), p(t) ∈ R^n we get the approximation of the PDE in matrix form as

q'(t) = p(t),    p'(t) = ∆_n q(t) − f(q(t)),

where {f(q)}_i = f(q_i) and ∆_n ∈ R^{n×n} is the discretized Laplacian with periodic boundary conditions,

∆_n = (1/∆x²) tridiag(1, −2, 1), with corner entries (∆_n)_{1,n} = (∆_n)_{n,1} = 1/∆x².    (14)

Defining the Hamiltonian function

H(q, p) = (1/2) p^T p − (1/2) q^T ∆_n q + Σ_{i=1}^n F(q_i),    F' = f,    (15)

we see that p'(t) = −∇_q H(q(t), p(t)), and by setting x(t) = (q(t), p(t))^T we have the Hamiltonian system (16), where u_0 and v_0 come from the discretizations of the initial values of (12). Notice that (15) is the discrete counterpart of (13).
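The discretized Laplacian ∆_n with periodic boundary conditions can be formed as follows (the grid size and the test function are illustrative choices; the sanity check verifies second order accuracy on a smooth periodic function):

```python
import numpy as np

def periodic_laplacian(n, dx):
    """Central-difference discretization of d^2/dx^2 with periodic
    boundary conditions (the matrix Delta_n)."""
    D = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
    D[0, -1] = 1.0                        # periodic wrap-around entries
    D[-1, 0] = 1.0
    return D / dx**2

# Sanity check on sin(x) over [0, 2*pi): u_xx = -sin(x), error ~ dx^2 / 12
L, n = 2 * np.pi, 64
dx = L / n
Dn = periodic_laplacian(n, dx)
xg = dx * np.arange(n)
err = np.max(np.abs(Dn @ np.sin(xg) + np.sin(xg)))
```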

Linear wave equation
As a first numerical example we consider the linear wave equation u_tt = u_xx with periodic boundary conditions. Performing a spatial discretization on an equidistant grid of size n using central differences leads to a Hamiltonian system of the form (16) with the corresponding initial data. We set L = 2, n = 400, ∆x = L/n, and we integrate up to T = 50 with time step size h = T/n_t, where n_t = 2000.
Using this linear example we illustrate the differences between the iterative processes of Section 5 to produce the basis U k ∈ R 2n×2k . We apply the exponential Euler method (4) to the small dimensional system (2) obtained from the projection using U k . Note that for linear systems all the three integrators of Section 4 propagate as the exponential Euler method.
As illustrated in Figure 1, the approximation obtained using the Arnoldi iteration results generally in a linear growth of the energy error, whereas the symplectic basis gives a bounded energy error. Figure 2 shows that, as opposed to the Hamiltonian Lanczos approximation, the energy error of the Arnoldi approximation depends on the accuracy of the approximation. Notice that in both cases

K_ℓ(H, f_0) ⊂ Range(U_k), where ℓ = dim(U_k),    (17)

and f_0 is the right hand side of (16) evaluated at x(0). Property (17) means that these processes give a polynomial approximation of degree ℓ for the exponential Euler step, which gives the exact solution at t = h. This effect is also seen in Figure 3, which depicts the solution errors for the Arnoldi iteration and the Hamiltonian Lanczos process. When dim(U_k) = 16, the methods give errors not far from each other; however, for smaller basis sizes the symplectic alternative gives more accurate results. When increasing the basis size, also the isotropic Arnoldi and the symplectic Arnoldi start to perform better (see Figure 4). The need for a larger dimension is expected for the symplectic Arnoldi since, instead of (17), only K_{ℓ/2}(H, f_0) ⊂ Range(U_k), where ℓ = dim(U_k). Isotropic Arnoldi performs worse, as expected, due to its poor polynomial approximation properties. However, both processes give bounded energy errors, as in both cases U_k is symplectic (Figure 5).

Figure 1: Linear wave equation and relative energy errors for the exponential Euler method, when U_k is produced by the Arnoldi iteration (U_k ∈ R^{2n×16}) and the Hamiltonian Lanczos process (U_k ∈ R^{2n×12}).

Nonlinear Schrödinger equation
Consider next a one dimensional nonlinear Schrödinger equation (NLS) on [−4π, 4π] with periodic boundary conditions (see [2]). The initial value is given by a soliton profile whose phase function θ(x) satisfies tan(θ(x)) = ±√(1 + V_0/B) tan(x) (see Figure 6). We set V_0 = B = 1.0, which gives a stable soliton solution (see [2]). The equation can be derived from the energy functional (19). As in the example of Subsection 6.1, we first carry out a spatial discretization on an equidistant grid with grid size ∆x = 8π/n and denote by q_i(t) and p_i(t) the approximations to Re ψ(i∆x, t) and Im ψ(i∆x, t). Expressing these approximations as vectors we get the discrete counterpart (20) of the energy functional (19), where ∆_n is the discretized Laplacian (14). Setting x(t) = (q(t), p(t))^T, we get from (20) the Hamiltonian system (16), where • denotes the Hadamard product, (u • v)_i = u_i v_i, and sin²(x)_i = sin²(x_i).
We set n = 500 and integrate up to T = 40π with step size h = T /n t , where n t = 8000.
The benefits obtained from the symmetry properties of EEMP (Lemma 4) are illustrated in Figures 7 and 8, which depict the relative energy errors and solution errors given by the exponential Euler method and EEMP, when U_k ∈ R^{2n×20}. The nonsymmetric EE shows a linear growth in the energy error and a quadratic growth in the solution error, whereas EEMP gives a bounded energy error and a linear growth in the solution error.
Next we set T = 80π, n_t = 10000 and h = T/n_t, which implies that the norm of hH is now bigger and thus a larger dimension for U_k is required. Differences resulting from the symplecticity of the basis U_k can be seen in Figure 9, where we compare the IEMP when U_k ∈ R^{2n×24} is given by the Arnoldi iteration and by the Hamiltonian Lanczos process. The Arnoldi iteration gives a growth of the energy error, whereas the Hamiltonian Lanczos iteration shows a bounded energy error.

Nonlinear Klein-Gordon equation
As a last numerical example we consider the nonlinear Klein-Gordon equation with periodic boundary conditions. The equation is now of the form (12) for f(u) = m²u + gu³, and after a spatial discretization on the interval [0, L] using finite differences with n points we get a Hamiltonian system with the Hamiltonian (15), where F(u) = (m²/2)u² + (g/4)u⁴.

Figure 9: NLS and energy error for IEMP when the basis U_k ∈ R^{2n×24} is produced by the Arnoldi and by the Hamiltonian Lanczos iteration.
Here, applying EEMP using the basis U_k generated by the Arnoldi iteration results in an unstable method. However, the Hamiltonian Lanczos process gives a stable alternative. Figure 10 shows the relative energy and solution errors for EEMP combined with the Hamiltonian Lanczos process, and for EE combined with the Arnoldi iteration. The symmetric EEMP with the symplectic basis U_k gives a bounded energy error and also a smaller solution error than EE with the orthogonal basis given by the Arnoldi iteration.
When applying IEMP, the effect of the symplecticity of U k shows up. Figure 11 shows the relative energy errors when U k ∈ R 2n×22 is produced using the Arnoldi iteration and the Hamiltonian Lanczos process.

Conclusions and outlook
The theoretical background of this numerical exploration was the following. By backward error analysis (see [11, Ch. IX]) it can be shown that applying a symplectic integrator to an integrable Hamiltonian system gives a symplectic map x_j → x_{j+1} and also

(#) a small error in energy uniformly for long times, with error growth linear in t for exponentially long times (see [11, Ch. X]).

Behaviour (#) can also be shown for symmetric time integrators when applied to integrable Hamiltonian systems (see [11, Ch. XI]).

Figure 11: Klein-Gordon equation and the relative energy errors for IEMP, when the basis U_k ∈ R^{2n×22} is produced using the Arnoldi iteration and the Hamiltonian Lanczos process.
Here we have given exponential integration methods which give symplectic maps when applied to linear Hamiltonian systems. When applying the approximations to nonlinear systems the resulting maps are not symplectic, but (#) can anyway be observed numerically. This is especially true when using the explicit exponential midpoint rule in the symmetric way: the range of U contains x_{j−1} − x_j, so that x_{j−1} is obtained from x_j and x_{j+1} by the same formula with h changed to −h. The effect of the symplecticity of U can be seen numerically when applying the method IEMP (Subsection 4.3) to nonlinear Hamiltonian problems (see e.g. Figure 11).
The numerical experiments clearly show that both preserving the Hamiltonian structure and the time symmetry are important when applying exponential integrators with Krylov approximations to large scale Hamiltonian systems. The Hamiltonian Lanczos method appears to be the most efficient method to produce a symplectic basis U k among those alternatives that provide the needed Krylov subspace of a given dimension. However, further study is needed to find an iteration with short ω-orthogonalization recursions that is more efficient and numerically stable for approximation of the φ functions.