1 Introduction

We are concerned in this work with the numerical integration of the linear ordinary differential equation

$$\begin{aligned} i \frac{d u}{dt} + H u = 0, \qquad \quad u(0) = u_0, \end{aligned}$$
(1.1)

where \(u \in \mathbb {C}^N\) and \(H \in \mathbb {R}^{N \times N}\) is a real matrix. A particularly important example leading to Eq. (1.1) is the time-dependent Schrödinger equation once it is discretized in space. In that case H (related to the Hamiltonian of the system) can typically be split into two parts, \(H = A + B\). The equation

$$\begin{aligned} y^{\prime \prime } + K y = 0 \end{aligned}$$

with \(y \in \mathbb {R}^d\), \(K \in \mathbb {R}^{d \times d}\) can also be recast in the form (1.1) if the matrix K satisfies certain conditions [6].

Although the solution of (1.1) is given by \(u(t) = \textrm{e}^{it H} u_{0}\), very often the dimension of H is so large that directly evaluating the action of the matrix exponential on \(u_0\) is computationally very expensive, so that other approximation techniques are desirable. When \(H = A + B\) and \(\textrm{e}^{i t A} u_0\), \(\textrm{e}^{i t B} u_0\) can be efficiently evaluated, splitting methods constitute a natural option [17]. They are of the form

$$\begin{aligned} S_h = \textrm{e}^{i h a_0 A} \, \textrm{e}^{i h b_0 B} \, \cdots \, \textrm{e}^{i h b_{2n-1} B} \, \textrm{e}^{i h a_{2n} A} \end{aligned}$$
(1.2)

for a time step h. Here \(a_j\), \(b_j\) are coefficients chosen in such a way that \(S_h = \textrm{e}^{i h H} + \mathcal {O}(h^{p+1})\) when \(h \rightarrow 0\) for a given \(p \ge 1\). After applying the Baker–Campbell–Hausdorff (BCH) formula, \(S_h\) can be formally expressed as \(S_h=\exp \left( i h H_h \right) \), with \(iH_h=i H_h^o+H_h^e\) and

$$\begin{aligned} H^o_h&= (g_{1,1}A+g_{1,2}B)+h^2\left( g_{3,1}[A,[A,B]]+g_{3,2}[B,[A,B]]\right) +\cdots \\ H^e_h&= h\, g_{2,1}[A,B]+h^3 \left( g_{4,1}[A,[A,[A,B]]]+\cdots \right) +\cdots \end{aligned}$$

Here \([A,B]:= AB-BA\), \(g_{k,j}\) are polynomials of degree k in the coefficients \(a_i,b_i\) verifying \(g_{1,1}=g_{1,2}=1\) (for consistency), and \(g_{k,j}=0, \ k=1,2,\ldots ,p, \ \forall j\) for achieving order p.

If A and B are real symmetric matrices, then [A, B] is skew-symmetric and [A, [A, B]] is symmetric. In general, all nested commutators involving an even number of the matrices A, B are skew-symmetric and those involving an odd number are symmetric, so that \((H_h^o)^T=H_h^o\) and \((H_h^e)^T=-H_h^e\).
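These symmetry properties are easy to check numerically. A minimal NumPy sketch (the dimension and the seed are arbitrary choices, not taken from the text):

```python
import numpy as np

rng = np.random.default_rng(0)

# Two random real symmetric matrices, as in the setting above.
A = rng.standard_normal((6, 6)); A = A + A.T
B = rng.standard_normal((6, 6)); B = B + B.T

def comm(X, Y):
    return X @ Y - Y @ X          # [X, Y] = XY - YX

C2 = comm(A, B)                   # two matrices involved: skew-symmetric
C3 = comm(A, comm(A, B))          # three matrices involved: symmetric

skew_err = np.linalg.norm(C2 + C2.T)   # ~0 iff C2 is skew-symmetric
sym_err = np.linalg.norm(C3 - C3.T)    # ~0 iff C3 is symmetric
```

Both residuals vanish up to round-off, in agreement with the algebraic identities \([A,B]^T=-[A,B]\) and \([A,[A,B]]^T=[A,[A,B]]\).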

When the coefficients \(a_j, b_j\) are real, then the \(g_{k,j}\) are also real and therefore \(S_h=\textrm{e}^{ih H_h}\) is a unitary matrix. In addition, if the composition (1.2) is palindromic, i.e., \(a_{2n-j} = a_j\), \(b_{2n-1-j} = b_j\), \(j=1,2,\ldots \), then \(g_{2k,j}=0\) and \(H_{-h}=H_h\), thus leading to a time-reversible method, \(S_{-h}=S_h^{-1}\). In other words, if \(u_n\) denotes the approximation at time \(t = n h\), then \(S_{-h}(u_{n+1}) = u_n\). As a result, one gets a very favorable long-time behavior of the error for integrators of this type [16]. Thus, in particular,

$$\begin{aligned} \mathcal {M}(u):= |u|^2 \qquad \qquad \text{(norm) } \end{aligned}$$

and

$$\begin{aligned} \mathcal {H}(u):= \bar{u}^T H u \qquad \qquad \text{(expected value of the energy)} \end{aligned}$$

are almost globally preserved.
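This near-conservation is easy to observe numerically. The sketch below propagates a random state with the palindromic second-order Strang composition \(\textrm{e}^{ihA/2}\,\textrm{e}^{ihB}\,\textrm{e}^{ihA/2}\) (real coefficients); the concrete scheme, matrix sizes, scaling, step size and iteration count are our own illustrative choices, not values from the text:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
N = 8
A = rng.standard_normal((N, N)); A = 0.2 * (A + A.T)   # real symmetric
B = rng.standard_normal((N, N)); B = 0.2 * (B + B.T)
H = A + B

h = 0.01
# Palindromic composition with real coefficients (Strang splitting, order 2):
Sh = expm(1j * h * A / 2) @ expm(1j * h * B) @ expm(1j * h * A / 2)

u = rng.standard_normal(N) + 1j * rng.standard_normal(N)
u /= np.linalg.norm(u)
M0 = np.vdot(u, u).real          # M(u) = |u|^2
H0 = np.vdot(u, H @ u).real      # H(u) = conj(u)^T H u

v = u.copy()
dM = dH = 0.0
for _ in range(500):
    v = Sh @ v
    dM = max(dM, abs(np.vdot(v, v).real - M0))
    dH = max(dH, abs(np.vdot(v, H @ v).real - H0))
```

Since A and B are real symmetric and the coefficients real, each factor is unitary, so the norm is preserved to round-off, while the energy deviation stays uniformly small, of size \(\mathcal {O}(h^2)\).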

Recently, preliminary results have been reported for a different class of splitting methods (1.2) applied to the semi-discretized Schrödinger equation [4]. These schemes are characterized by the fact that the coefficients in (1.2) are complex numbers. Notice, however, that in this case the polynomials \(g_{k,j}\in \mathbb {C}\), so that \(S_h=\textrm{e}^{ih H_h}\) is not unitary in general. This is so even for palindromic compositions, since the \(g_{2\ell + 1,j}\) are complex in any case.

There is nevertheless a special symmetry in the coefficients, namely

$$\begin{aligned} a_{2n-j} = \overline{a}_j \qquad \text{ and } \qquad b_{2n-1-j} = \overline{b}_j, \qquad j=1,2,\ldots , \end{aligned}$$
(1.3)

which is worth considering. Methods of this class can properly be called symmetric-conjugate compositions. In that case, a straightforward computation shows that the resulting composition satisfies

$$\begin{aligned} \overline{S}_h = S_h^{-1} \end{aligned}$$
(1.4)

for real matrices A and B, and in addition

$$\begin{aligned} (\overline{S}_h)^T = S_{-h} \end{aligned}$$
(1.5)

if A and B are real symmetric. In consequence,

$$\begin{aligned} iH_h = i(H+{\hat{H}}_h^o) + i {\hat{H}}_h^e \end{aligned}$$

for certain real matrices \({\hat{H}}_h^o\) (symmetric) and \({\hat{H}}_h^e\) (skew-symmetric). Since \(i {\hat{H}}_h^e\) is not real, unitarity is lost. In spite of that, the examples collected in [4] seem to indicate that this class of schemes behaves like compositions with real coefficients as far as preservation properties are concerned, at least for sufficiently small values of h. Intuitively, this can be traced back to the fact that \(i {\hat{H}}_h^e= \mathcal{O}(h^{p})\) and is purely imaginary.

One of the purposes of this paper is to provide a rigorous justification of this behavior by generalizing the treatment done in [4] for the problem (1.1) defined in the group SU(2), i.e., when H is a linear combination of Pauli matrices. In particular, we prove here that, typically, any consistent symmetric-conjugate splitting method applied to (1.1), when H is real symmetric, is conjugated to a unitary method for sufficiently small values of h. In fact, this property can be related to the reversibility of the map \(S_h\) with respect to complex conjugation, as specified next.

Let C be the linear transformation defined by \(C(u) = \overline{u}\) for all \(u \in \mathbb {C}^N\). Then, the differential equation (1.1) is C-reversible, in the sense that \(C(i H u) = -i H(C(u))\) [12, section V.1]. Moreover, since (1.4) holds, then \(C \circ S_h = S_h^{-1} \circ C\). In other words, the map \(S_h(u)\) is C-reversible [12] (or reversible for short). Notice that this also holds for palindromic compositions (1.2) with real coefficients.

In the sequel we will refer to compositions verifying (1.3) as symmetric-conjugate or reversible methods.

Splitting and composition methods with complex coefficients also have interesting properties concerning the magnitude of the successive terms in the asymptotic expansion of the local truncation error. In contrast to methods with real coefficients, higher order error terms in the expansion of a given method are essentially of the same size as lower order terms [3]. In addition, an integrator of a given order with the minimum number of flows typically achieves good efficiency, whereas with real coefficients one has to introduce additional parameters (and therefore more flows in the composition) for optimization purposes. It makes sense, then, to apply this class of schemes to equation (1.1) and to compare their performance with splitting methods involving real coefficients, since in any case the presence of complex coefficients does not lead to an increase in the overall computational cost.

The structure of the paper goes as follows. In Sect. 2 we provide further experimental evidence of the preservation properties exhibited by C-reversible splitting methods applied to different classes of matrices H by considering several illustrative numerical examples. In Sect. 3 we analyze in detail this type of methods and validate theoretically the observed results by stating two theorems concerning consistent reversible maps. Then, in Sect. 4 we present new symmetric-conjugate schemes up to order 6 specifically designed for the semi-discretized Schrödinger equation and other problems with the same algebraic structure. Finally, these new methods are tested in Sect. 5 for a specific potential.

2 Symmetric-conjugate splitting methods in practice: some illustrative examples

To illustrate the preservation properties exhibited by symmetric-conjugate (or reversible) methods when applied to (1.1) with \(H = A + B\), we consider some low order compositions of this type. Specifically, the tests will be carried out with the following schemes:

Order 3. The simplest symmetric-conjugate method corresponds to

$$\begin{aligned} S_h^{[3,1]} = \textrm{e}^{i h \overline{b}_0 B} \, \textrm{e}^{i h \overline{a}_1 A} \, \textrm{e}^{i h b_1 B} \, \textrm{e}^{i h a_1 A} \, \textrm{e}^{i h b_0 B}, \end{aligned}$$
(2.1)

with \(a_1 = \frac{1}{2} + i \frac{\sqrt{3}}{6}\), \(b_0 = \frac{a_1}{2}\), \(b_1 = \frac{1}{2}\); it was first obtained in [1]. In addition, and as a representative of the schemes considered in Sect. 4, we also use the following method, with \(a_j > 0\) and \(b_j \in \mathbb {C}\), \(\Re (b_j) > 0\):

$$\begin{aligned} S_h^{[3,2]} = \textrm{e}^{i h \overline{b}_0 B} \, \textrm{e}^{i h a_1 A} \, \textrm{e}^{i h \overline{b}_1 B} \, \textrm{e}^{i h a_2 A} \, \textrm{e}^{i h b_1 B} \, \textrm{e}^{i h a_1 A} \, \textrm{e}^{i h b_0 B}, \end{aligned}$$
(2.2)

where

$$\begin{aligned} a_1 = \frac{3}{10}, \quad a_2 = \frac{2}{5}, \quad b_0 = \frac{13}{126} - i \frac{\sqrt{59/2}}{63}, \quad b_1 = \frac{25}{63} + i \frac{5 \sqrt{59/2}}{126}. \end{aligned}$$

Order 4. The scheme has the same exponentials as (2.2),

$$\begin{aligned} S_h^{[4]} = \textrm{e}^{i h \overline{b}_0 B} \, \textrm{e}^{i h \overline{a}_1 A} \, \textrm{e}^{i h \overline{b}_1 B} \, \textrm{e}^{i h a_2 A} \, \textrm{e}^{i h b_1 B} \, \textrm{e}^{i h a_1 A} \, \textrm{e}^{i h b_0 B}, \end{aligned}$$
(2.3)

but now

$$\begin{aligned} a_1 = \frac{1}{12} (3 + i \sqrt{15}), \quad a_2 = \frac{1}{2}, \quad b_0 = \frac{a_1}{2}, \quad b_1 = \frac{1}{24} (9 + i \sqrt{15}). \end{aligned}$$

When the matrix H results from a space discretization of the time-dependent Schrödinger equation (for instance, by means of a pseudo-spectral method), it is real symmetric and A, B are also symmetric (in fact, B is diagonal). It makes sense, then, to start by analyzing this situation, where, in addition, all the eigenvalues of H are simple. To proceed, we generate an \(N \times N\) real matrix with \(N=10\) and uniformly distributed elements in the interval (0, 1), and take H as its symmetric part. The symmetric matrix A is generated analogously, and finally we fix \(B = H - A\). Next we compute the approximations obtained by \(S_h^{[3,1]}\), \(S_h^{[3,2]}\) and \(S_h^{[4]}\) for different values of h, determine their eigenvalues \(\omega _j\) and compute the quantity

$$\begin{aligned} D_h = \max _{1 \le j \le N} (\big | |\omega _j|-1 \big |) \end{aligned}$$

for each h. Finally, we depict \(D_h\) as a function of h.
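The procedure just described can be sketched as follows, here only for \(S_h^{[3,1]}\) (the seed and the sampled value of h are arbitrary):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(3)
N = 10
M = rng.random((N, N))            # entries uniformly distributed in (0, 1)
H = (M + M.T) / 2                 # H: its symmetric part
M = rng.random((N, N))
A = (M + M.T) / 2                 # symmetric A, generated analogously
B = H - A                         # so that H = A + B

a1 = 0.5 + 1j * np.sqrt(3) / 6    # scheme S^{[3,1]} of (2.1)
b0, b1 = a1 / 2, 0.5

def S31(h):
    return (expm(1j * h * np.conj(b0) * B) @ expm(1j * h * np.conj(a1) * A)
            @ expm(1j * h * b1 * B) @ expm(1j * h * a1 * A)
            @ expm(1j * h * b0 * B))

def D(h):
    # D_h = max_j | |omega_j| - 1 | over the eigenvalues omega_j of S_h
    w = np.linalg.eigvals(S31(h))
    return float(np.max(np.abs(np.abs(w) - 1.0)))

D_small = D(0.005)   # expected at round-off level for small enough h
```

For h well inside the interval \(0< h < h^*\) all eigenvalue moduli are 1 up to round-off, in agreement with the left panel of Fig. 1.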

Figure 1 (left) is representative of the results obtained in all cases we have tested: all \(|\omega _j|\) are 1 (up to round-off) for some interval \(0< h < h^*\), and beyond it there is always some \(\omega _{\ell }\) such that \(|\omega _{\ell }| > 1\). In other words, \(S_h^{[3,1]}\), \(S_h^{[3,2]}\) and \(S_h^{[4]}\) behave as unitary maps in this interval. This is precisely what happens in the group SU(2), as shown in [4].

The right panel of Fig. 1 is obtained in the same situation (i.e., H real symmetric with simple eigenvalues), but now both A and B are no longer symmetric: essentially the same behavior as before is observed. Of course, when \(h < h^*\), both the norm of u, \(\mathcal {M}(u)\), and the expected value of the energy, \(\mathcal {H}(u)\) are preserved for long times, as shown in [4].
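The long-time preservation of \(\mathcal {M}(u)\) and \(\mathcal {H}(u)\) below the threshold can be checked along the same lines; in the sketch below (matrix sizes, scaling, step size and number of steps are our own illustrative choices) the state is propagated with the non-unitary symmetric-conjugate scheme \(S_h^{[3,1]}\):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(4)
N = 8
A = rng.standard_normal((N, N)); A = 0.2 * (A + A.T)   # real symmetric
B = rng.standard_normal((N, N)); B = 0.2 * (B + B.T)
H = A + B

a1 = 0.5 + 1j * np.sqrt(3) / 6    # scheme S^{[3,1]} of (2.1)
b0, b1 = a1 / 2, 0.5
h = 0.02
Sh = (expm(1j * h * np.conj(b0) * B) @ expm(1j * h * np.conj(a1) * A)
      @ expm(1j * h * b1 * B) @ expm(1j * h * a1 * A)
      @ expm(1j * h * b0 * B))

u = rng.standard_normal(N) + 1j * rng.standard_normal(N)
u /= np.linalg.norm(u)
M0 = 1.0                           # M(u) after normalization
H0 = np.vdot(u, H @ u).real        # H(u)

v = u.copy()
dM = dH = 0.0
for _ in range(2000):
    v = Sh @ v
    dM = max(dM, abs(np.vdot(v, v).real - M0))
    dH = max(dH, abs(np.vdot(v, H @ v).real - H0))
```

Although \(S_h\) is not unitary, for h small enough both deviations remain uniformly bounded and of size \(\mathcal {O}(h^p)\) over thousands of steps, as predicted by the analysis of the next section.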

Fig. 1

Absolute value of the largest eigenvalue of the approximations \(S_h^{[3,1]}\) (black solid line), \(S_h^{[3,2]}\) (red dash-dotted line) and \(S_h^{[4]}\) (blue dashed line) for different values of h when \(H=A+B\) is a real symmetric matrix with simple eigenvalues. Left: A and B are also real symmetric. Right: A and B are real, but not symmetric (color figure online)

Our next simulation concerns a real (but not symmetric) matrix H with all its eigenvalues real and simple. Again, there exists a threshold \(h^* > 0\) such that for \(h < h^*\) the schemes render unitary approximations. This is clearly visible in Fig. 2 (left panel). If we consider instead a completely arbitrary real matrix H, then the outcome is rather different: \(D_h > 0\) for any \(h>0\) (right panel; for this example \(D_h = 9.79 \cdot 10^{-4}\) already for \(h=0.001\)).

Fig. 2

Same as Fig. 1 when \(H=A+B\) is a real (but not symmetric) matrix. Left: the eigenvalues of H are real and simple. Right: the eigenvalues of H are arbitrary

Next we illustrate the situation when the real matrix H has multiple eigenvalues but is still diagonalizable. As before, we consider first the analogue of Fig. 1, namely: H is symmetric, with A and B symmetric matrices (Fig. 3, left panel), and A and B real but not symmetric (right panel). In the first case we notice that, whereas all the eigenvalues of the approximations rendered by \(S_h^{[3,1]}\) and \(S_h^{[4]}\) still have absolute value 1 for some interval \(0< h < h^*\), this is clearly not the case for \(S_h^{[3,2]}\). If, on the other hand, the splitting is done in such a way that A and B are not symmetric (but still real), then \(D_h > 0\) even for very small values of h. The same behavior is observed when H is taken as a real (but not symmetric), diagonalizable matrix with multiple real eigenvalues.

Fig. 3

Same as Fig. 1 when \(H=A+B\) is a real symmetric matrix with multiple eigenvalues. Left: A and B are real symmetric matrices. Right: A and B are real, but not symmetric

The different phenomena exhibited by these examples thus call for a detailed numerical analysis of this class of schemes, explaining in particular the role played by the eigenvalues of the matrix H in the final outcome, as well as the different behavior of \(S_h^{[3,1]}\) and \(S_h^{[3,2]}\). This is the subject of the next section.

3 Numerical analysis of reversible integration schemes

3.1 Main results

We next state two theorems and two additional corollaries that, generally speaking, justify the previous experiments and explain the good behavior exhibited by reversible methods.

Theorem 3.1

Let \(H \in \mathbb {R}^{N\times N}\) be a real matrix and let \(S_h\in \mathbb {C}^{N\times N}\) be a family of complex matrices depending smoothly on \(h\in \mathbb {R}\) such that

  • \(S_h\) is a reversible map in the previous sense, so that

    $$\begin{aligned} \overline{S}_h= S_h^{-1}; \end{aligned}$$
  • \(S_h\) is consistent with \(\exp (ihH)\), i.e. there exists \(p\ge 1\) such that

    $$\begin{aligned} S_h\mathop {=}_{h\rightarrow 0} \textrm{e}^{ihH} + \mathcal {O}(h^{p+1}); \end{aligned}$$
    (3.1)
  • the eigenvalues of H are real and simple.

Then there exist

  • \(D_h\), a family of real diagonal matrices depending smoothly on \(h\),

  • \(P_h\), a family of real invertible matrices depending smoothly on \(h\),

such that \(P_h= P_0 + \mathcal {O}(h^p)\), \(D_h= D_0 + \mathcal {O}(h^p)\) and, provided that \(|h|\) is small enough,

$$\begin{aligned} S_h= P_h\, \textrm{e}^{ihD_h} \, P_h^{-1}. \end{aligned}$$
(3.2)

Corollary 3.2

In the setting of Theorem 3.1, there exists a constant \(C>0\) such that, provided that \(|h|\) is small enough, for all \(u\in \mathbb {C}^N\) and all eigenvalues \(\omega \in \sigma (H)\), one has

$$\begin{aligned} \sup _{n\ge 0} \, \Big ||\Pi _\omega S_h^{n} u| - |\Pi _\omega u| \Big | \le C |h|^p |u|, \end{aligned}$$
(3.3)

where \(\Pi _\omega \) denotes the spectral projector onto \(\textrm{Ker}(H-\omega I_N)\). Moreover, if H is symmetric, the norm and the energy are almost conserved, in the sense that, for all \(u\in \mathbb {C}^N\), it holds that

$$\begin{aligned} \sup _{n \in \mathbb {Z}} \, \big | \mathcal {M}(S_h^n u) - \mathcal {M}( u) \big |\le C |h|^p |u|^2 \quad \textrm{ and } \quad \sup _{n \in \mathbb {Z}} \, \big | \mathcal {H}(S_h^n u) - \mathcal {H}( u) \big | \le C |h|^p |u|^2,\nonumber \\ \end{aligned}$$
(3.4)

where \(\mathcal {M}( u) = |u|^2 \) and \(\mathcal {H}( u) = \overline{u}^T H u \).

Proof of Corollary 3.2

First, we focus on (3.3). We note that by consistency, we have

$$\begin{aligned} D_0= P_0^{-1} H P_0. \end{aligned}$$

Since the eigenvalues of H are simple, it follows that the spectral projectors are all of the form

$$\begin{aligned} \Pi ^{(j)}= P_0 (e_j \otimes e_j) P_0^{-1}, \end{aligned}$$
(3.5)

where \(e_1,\ldots ,e_N\) denotes the canonical basis of \(\mathbb {R}^N\). Then, we note that for all \(n\in \mathbb {Z}\), we have

$$\begin{aligned} S_h^n = P_h\, \textrm{e}^{inhD_h} \, P_h^{-1}. \end{aligned}$$

Therefore, since \(\textrm{e}^{inhD_h}\) is uniformly bounded with respect to \(h\) and n (because \(D_h\) is a real diagonal matrix) and \(P_h= P_0 + \mathcal {O}(h^p)\), it follows that

$$\begin{aligned} S_h^n = P_0 \, \textrm{e}^{inhD_h} \, P_0^{-1} + \mathcal {O}(h^p), \end{aligned}$$

where the implicit constant in the \(\mathcal {O}\) term does not depend on n (here and later). Therefore, it is enough to use the explicit formula (3.5) to prove that

$$\begin{aligned} \Pi ^{(j)} S_h^n = P_0 (e_j \otimes e_j) P_0^{-1} P_0 \, \textrm{e}^{inhD_h} P_0^{-1} + \mathcal {O}(h^p) = \textrm{e}^{inh(D_h)_{j,j}} \Pi ^{(j)} + \mathcal {O}(h^p). \end{aligned}$$

As a consequence, the estimate (3.3) follows directly from the triangle inequality:

$$\begin{aligned} |\Pi ^{(j)} S_h^{n} u|&= |\textrm{e}^{inh(D_h)_{j,j}} \Pi ^{(j)} u + \mathcal {O}(h^p)(u)| \le |\textrm{e}^{inh(D_h)_{j,j}} \Pi ^{(j)} u | + |u|\, \mathcal {O}(h^p)\\&= | \Pi ^{(j)} u | + |u|\, \mathcal {O}(h^p). \end{aligned}$$

Now, we focus on (3.4). Here, since H is assumed to be symmetric, its eigenspaces are orthogonal. Therefore by the Pythagorean theorem, we have

$$\begin{aligned} \mathcal {M}(u) = \sum _{\omega \in \sigma (H)} |\Pi _\omega (u)|^2 \quad \textrm{and} \quad \mathcal {H}(u) = \sum _{\omega \in \sigma (H)} \omega |\Pi _\omega (u)|^2. \end{aligned}$$

As a consequence, (3.4) follows directly from (3.3). \(\square \)

The main limitation of Theorem 3.1 is the assumption on the simplicity of the eigenvalues of H. Indeed, although this assumption is typically satisfied, it depends only on the equation we aim to solve and not on the numerical method one uses. The following theorem, which is a refinement of Theorem 3.1, remedies this point by making an assumption on the leading term of the consistency error (which is typically satisfied for generic choices of numerical integrators).

Theorem 3.3

Let \(H \in \mathbb {R}^{N\times N}\) be a real matrix and let \(S_h\in \mathbb {C}^{N\times N}\) be a family of complex matrices depending smoothly on \(h\) such that

  • \(S_h\) is a reversible map, i.e.

    $$\begin{aligned} \overline{S}_h= S_h^{-1}; \end{aligned}$$
  • \(S_h\) is consistent with \(\exp (ihH)\), i.e.

    $$\begin{aligned} S_h\mathop {=}_{h\rightarrow 0} \textrm{e}^{ihH} + i h^{p+1} R + \mathcal {O}(h^{p+2}), \end{aligned}$$
    (3.6)

    where \(p\ge 1\) is the order of consistency and R is a real matrix;

  • H is diagonalizable and its eigenvalues are real;

  • for all \(\omega \in \sigma (H)\), the eigenvalues of \(\Pi _\omega R_{| E_{\omega }(H)}\) are real and simple, where \(\Pi _\omega \) denotes the spectral projector on \(E_\omega (H):= \textrm{Ker}(H-\omega I_N)\).

Then there exist

  • \(D_h\), a family of real diagonal matrices depending smoothly on \(h\),

  • \(P_h\), a family of real invertible matrices depending smoothly on \(h\),

such that both \(P_0^{-1} B P_0\) and \(P_0^{-1} H P_0\) are diagonal, where \(B:= \sum _{\omega \in \sigma (H)} \Pi _\omega \, R \, \Pi _\omega \), and, provided that \(|h|\) is small enough, it holds that

$$\begin{aligned} S_h= P_h\, \textrm{e}^{ihD_h} \, P_h^{-1}. \end{aligned}$$
(3.7)

Corollary 3.4

In the setting of Theorem 3.3, there exists a constant \(C>0\) such that, provided that \(|h|\) is small enough, for all \(u\in \mathbb {C}^N\), all \(\omega \in \sigma (H)\) and all \(\lambda \in \sigma (\Pi _\omega R_{| E_{\omega }(H)})\), we have

$$\begin{aligned} \sup _{n\ge 0} \, \Big ||\mathcal {P}_{\lambda ,\omega } S_h^{n} u| - |\mathcal {P}_{\lambda ,\omega } u| \Big | \le C |h| |u|, \end{aligned}$$

where \(\mathcal {P}_{\lambda ,\omega }\) denotes the projector along \(\bigoplus _{(\eta ,\mu )\ne (\lambda ,\omega )} E_\eta (\Pi _\mu R_{| E_{\mu }(H)})\) onto \(E_\lambda (\Pi _\omega R_{| E_{\omega }(H)})\).

Moreover, if H and R are symmetric, for all \(\omega \in \sigma (H)\), one gets

$$\begin{aligned} \sup _{n\ge 0} \, \Big ||\Pi _\omega S_h^{n} u|^2 - |\Pi _\omega u|^2 \Big | \le C |h| |u|^2, \end{aligned}$$

and the mass and the energy are almost conserved, i.e. for all \(u\in \mathbb {C}^N\), it holds that

$$\begin{aligned} \sup _{n \in \mathbb {Z}} \, \big | \mathcal {M}(S_h^n u) - \mathcal {M}( u) \big |\le C |h| |u|^2 \qquad \textrm{ and } \qquad \sup _{n \in \mathbb {Z}} \, \big | \mathcal {H}(S_h^n u) - \mathcal {H}( u) \big | \le C |h| |u|^2, \end{aligned}$$

where, as before, \(\mathcal {M}( u) = |u|^2 \) and \(\mathcal {H}( u) = \overline{u}^T H u \).

Proof of Corollary 3.4

The proof is almost identical to that of Corollary 3.2. The key point is that, since both \(P_0^{-1} B P_0\) and \(P_0^{-1} H P_0\) are diagonal, the projectors \(\mathcal {P}_{\lambda ,\omega }\) are exactly the projectors \(\Pi ^{(j)}\), \(1\le j \le N\) (given by (3.5)). Note that, contrary to Theorem 3.1, in Theorem 3.3 one does not claim that \(P_h= P_0 + \mathcal {O}(h^p)\). In general, the best estimate one can expect is \(P_h= P_0 + \mathcal {O}(h)\) (which follows directly from the smoothness of \(P_h\) with respect to \(h\)). It is this loss that explains why, in Corollary 3.4, the error terms are of order \(\mathcal {O}(h)\), whereas they are of order \(\mathcal {O}(h^p)\) in Corollary 3.2. \(\square \)

Remark

Before starting the proof of these theorems, let us provide some comments about the context and the ideas involved.

  • In Theorem 3.1 and its proof, we are just putting \(S_h\) in Birkhoff normal form. The fact that \(S_h\) can be diagonalized is due to the simplicity of the eigenvalues of H, while the fact that its eigenvalues are complex numbers of modulus 1 is due to the reversibility of \(S_h\). This approach is robust and well known; in particular, it can be extended to the nonlinear setting (see e.g. [12, section V.1]). Note that here we reach convergence of the Birkhoff normal form because the system is linear.

  • Theorem 3.3 is a refinement of Theorem 3.1. To prove the absence of resonances due to the multiplicity of the eigenvalues of H, we use the first correction to the frequencies generated by the perturbation of H (i.e., the projections of R in Theorem 3.3). This approach is typical of what one does in the proof of Nekhoroshev theorems or KAM theorems (see also [12]).

  • In order to give some intuition about the proof and the assumptions of Theorem 3.1, let us simply prove that, provided h is small enough, \(S_h\) is conjugated to a unitary matrix. Indeed, since \(S_h\) is reversible, it can be written as

    $$\begin{aligned} S_h = e^{ihH_h}, \end{aligned}$$

    where \(H_h = H + \mathcal {O}(h^p)\) is a real matrix (provided that h is small enough). Now, since the set of real matrices whose eigenvalues are simple and real is open in the space of real matrices (by continuity of the eigenvalues) and \(H_h\) is a real perturbation of such a matrix (H, by assumption), we deduce that, provided h is small enough, its eigenvalues are simple and real. This implies that \(H_h\) is conjugated to a real diagonal matrix, and hence that \(S_h\) is conjugated to a unitary matrix.

3.2 Technical lemmas

In the proof of the previous theorems we will make use of the following three lemmas.

Lemma 3.5

Let M be a complex matrix and let P be a complex invertible matrix. Then \(\textrm{ad}_{P^{-1} M P}\) and \(\textrm{ad}_{M}\) are similar. More precisely,

$$\begin{aligned} \textrm{ad}_{\textrm{int}_P \, M} = (\textrm{int}_P) \textrm{ad}_{M} (\textrm{int}_P)^{-1}, \end{aligned}$$

where \(\textrm{int}_P M:= P^{-1} M P\). Here \(\textrm{ad}_M\) stands for the adjoint operator: \(\textrm{ad}_M X:= [M, X] = M X - X M\), for any matrix X.

Proof

A straightforward calculation shows that, for any X,

$$\begin{aligned} (\textrm{int}_P)\, \textrm{ad}_{M} X&= P^{-1} [M, X] P = [P^{-1} M P, P^{-1} X P] = \textrm{ad}_{\textrm{int}_P \, M} (P^{-1} X P) \\&= \textrm{ad}_{\textrm{int}_P \, M} (\textrm{int}_P) X. \end{aligned}$$

\(\square \)
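The similarity identity of Lemma 3.5 can be sanity-checked numerically; a minimal sketch (sizes, seed and the diagonal shift making P invertible are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(5)
N = 5
M = rng.standard_normal((N, N))
X = rng.standard_normal((N, N))
P = rng.standard_normal((N, N)) + N * np.eye(N)   # diagonally shifted, invertible
P_inv = np.linalg.inv(P)

def ad(Y, Z):
    return Y @ Z - Z @ Y          # ad_Y Z = [Y, Z]

def int_P(Y):
    return P_inv @ Y @ P          # int_P Y = P^{-1} Y P

# Lemma 3.5: ad_{int_P M} (int_P X) = int_P (ad_M X)
err = np.linalg.norm(ad(int_P(M), int_P(X)) - int_P(ad(M, X)))
```

The residual vanishes up to round-off for any choice of M, X and invertible P.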

Lemma 3.6

Let M be a complex matrix. Then M is diagonalizable if and only if the kernel and the image of \(\textrm{ad}_M\) intersect trivially, i.e.

$$\begin{aligned} \textrm{Ker}_{\mathbb {C}} \ \textrm{ad}_M \cap \textrm{Im}_{\mathbb {C}} \ \textrm{ad}_M = \{0\}. \end{aligned}$$
(3.8)

Proof

We can assume, in virtue of Lemma 3.5 and without loss of generality, that M is in Jordan normal form. On the one hand, if M is diagonal, we have \(\textrm{ad}_M A = ((m_{i,i} - m_{j,j})\, a_{i,j})_{i,j}\), and so the supports of the matrices in \(\textrm{Ker}_{\mathbb {C}} \ \textrm{ad}_M\) and \(\textrm{Im}_{\mathbb {C}} \ \textrm{ad}_M\) are clearly disjoint (which implies (3.8)). Conversely, computing by blocks, it is enough to consider the case where \(M= \lambda I_N + \mathcal {N}\) is a Jordan matrix (i.e. \(\lambda \in \mathbb {C}\) and \(\mathcal {N}\) nilpotent). Then we just have to note that \(\textrm{ad}_{\lambda I_N + \mathcal {N}} = \textrm{ad}_{\mathcal {N}}\) and that, since \(\textrm{ad}_{\mathcal {N}}\) is nilpotent and nonzero, necessarily \(\textrm{Ker}_{\mathbb {C}} \ \textrm{ad}_{\mathcal {N}} \cap \textrm{Im}_{\mathbb {C}} \ \textrm{ad}_{\mathcal {N}} \ne \{0\}\). \(\square \)
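Lemma 3.6 can be illustrated by representing \(\textrm{ad}_M\) as an \(N^2\times N^2\) matrix and using that, for any linear operator L on a finite-dimensional space, \(\textrm{Ker}\,L \cap \textrm{Im}\,L = \{0\}\) holds exactly when \(\textrm{rank}(L^2) = \textrm{rank}(L)\). The rank-based reformulation and the test matrices are our own illustrative choices:

```python
import numpy as np

def ad_matrix(M):
    # Matrix of ad_M = [M, .] acting on vec(X) (column-major vectorization):
    # vec(M X - X M) = (kron(I, M) - kron(M^T, I)) vec(X)
    n = M.shape[0]
    I = np.eye(n)
    return np.kron(I, M) - np.kron(M.T, I)

def ker_im_trivial(M):
    # Ker(ad_M) and Im(ad_M) intersect trivially  <=>  rank(L^2) == rank(L)
    L = ad_matrix(M)
    return bool(np.linalg.matrix_rank(L @ L) == np.linalg.matrix_rank(L))

D = np.diag([1.0, 2.0, 3.0])             # diagonalizable: criterion holds
J = np.array([[0.0, 1.0], [0.0, 0.0]])   # nontrivial Jordan block: it fails
```

For the diagonal matrix the criterion holds, while for the nilpotent Jordan block one finds a nonzero matrix lying in both the kernel and the image of \(\textrm{ad}_{\mathcal {N}}\), exactly as in the proof above.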

Lemma 3.7

Let \(M_h\) be a family of real matrices depending smoothly on \(h\) and of the form

$$\begin{aligned} M_h= M_0 + \mathcal {O}(h^p), \quad \text{ where } \quad p\ge 1. \end{aligned}$$

If \(M_0\) is diagonalizable on \(\mathbb {C}\), then there exists a family of real matrices \(\chi _h\), depending smoothly on \(h\), such that, if \(|h|\) is small enough, \(\textrm{e}^{h^p \chi _h} M_h\, \textrm{e}^{-h^p\chi _h}\) commutes with \(M_0\), i.e.

$$\begin{aligned} \left[ \textrm{e}^{h^p\chi _h} M_h\, \textrm{e}^{-h^p\chi _h}, M_0 \right] =0. \end{aligned}$$

Proof

We aim at designing the family \(\chi _h\) as solution of the equation

$$\begin{aligned} \textrm{ad}_{M_0} \left( \textrm{e}^{h^p\chi _h} M_h\, \textrm{e}^{-h^p\chi _h} \right) =0. \end{aligned}$$

Thanks to the well-known identity \(\textrm{e}^{A} B \, \textrm{e}^{-A} = \textrm{e}^{\textrm{ad}_A} B\), this equation can be rewritten as

$$\begin{aligned} \textrm{ad}_{M_0} \left( \textrm{e}^{h^p \textrm{ad}_{\chi _h}} M_h\right) = 0. \end{aligned}$$
(3.9)

Next we write the Taylor expansion of \(M_h\) at order p as

$$\begin{aligned} M_h= M_0 + h^p R_h, \end{aligned}$$

where \(R_h\) is a family of real matrices depending smoothly on \(h\). Then, isolating the terms of order 0 (and dividing by \(h^p\)), Eq. (3.9) leads to

$$\begin{aligned} f(h,\chi _h):= \textrm{ad}_{M_0} \left( \textrm{e}^{h^p \textrm{ad}_{\chi _h}} R_h-\varphi _1(h^p \textrm{ad}_{\chi _{h}}) \, \textrm{ad}_{M_0} \chi _h\right) = 0, \end{aligned}$$

where \(\varphi _1(z):= \frac{e^z - 1}{z}\). We restrict ourselves to \(\chi _h\) in \(\textrm{Im}_{\mathbb {R}} \, \textrm{ad}_{M_0}\) and consider f as a smooth map from \(\mathbb {R} \times \textrm{Im}_{\mathbb {R}} \, \textrm{ad}_{M_0}\) to \(\textrm{Im}_{\mathbb {R}} \, \textrm{ad}_{M_0}\). To solve the equation \(f(h,\chi _h)=0\) using the implicit function theorem, we just have to design \(\chi _0\) so that

$$\begin{aligned} f(0,\chi _0) = \textrm{ad}_{M_0} R_0- \textrm{ad}_{M_0}^2 \chi _0 = 0 \end{aligned}$$

and prove that \(\textrm{d}_\chi f(0,\chi _0) = - \textrm{ad}_{M_0}^2: \textrm{Im}_{\mathbb {R}} \, \textrm{ad}_{M_0} \rightarrow \textrm{Im}_{\mathbb {R}} \, \textrm{ad}_{M_0}\) is invertible. Actually, these properties are clear because the first one is a consequence of the second one, whereas the second follows directly from Lemma 3.6. \(\square \)

3.3 Proofs of the theorems

We are now in a position to prove Theorems 3.1 and 3.3. Without loss of generality, and to simplify notation, we assume that H is diagonal,

$$\begin{aligned} H = \begin{pmatrix} \omega _1 I_{n_1} & & \\ & \ddots & \\ & & \omega _d I_{n_d} \end{pmatrix}, \end{aligned}$$

where \(\omega _1<\cdots < \omega _d\) denote the eigenvalues of H and \(n_1,\ldots ,n_d\) are positive integers satisfying \(n_1+\cdots +n_d=N\).

Thanks to the consistency assumption (3.6) (which is equivalent to (3.1)), provided that \(|h|\) is small enough, \(S_h\) rewrites as

$$\begin{aligned} S_h= \textrm{e}^{i hH_h}, \quad \text{ where } \quad H_h= H+h^p R + \mathcal {O}(h^{p+1}). \end{aligned}$$

Moreover, the reversibility assumption \(S_h^{-1} = \overline{S}_h\) implies that \(H_h\) is a real matrix (provided that \(|h|\) is small enough); in particular, R is also a real matrix. Then, applying Lemma 3.7 to \(H_h\), we get a family of real matrices \(\chi _h\) such that, provided that \(|h|\) is small enough,

$$\begin{aligned} \left[ W_h, H \right] =0, \qquad \text{ where } \qquad W_h= \textrm{e}^{h^p\chi _h} H_h\, \textrm{e}^{-h^p\chi _h}. \end{aligned}$$

We conclude that \(W_h\) is block-diagonal (with the same block structure as H), i.e. there exist real \(n_j \times n_j\) matrices \(W_h^{(j)}\) such that

$$\begin{aligned} W_h= \begin{pmatrix} W_h^{(1)} & & \\ & \ddots & \\ & & W_h^{(d)} \end{pmatrix}. \end{aligned}$$
(3.10)

As a consequence, if the eigenvalues of H are simple (i.e. \(d=N\) and \(n_j=1\) for all j), then \(W_h\) is diagonal. Therefore, in this case, it is enough to set \(P_h=\textrm{e}^{-h^p\chi _h}\) and \(D_h= W_h\) to conclude the proof of Theorem 3.1.

From now on, we focus on the proof of Theorem 3.3. First, we aim at identifying the matrices in the blocks in (3.10). The Taylor expansion of \(W_h\) is clearly

$$\begin{aligned} W_h= H + h^{p} B + \mathcal {O}(h^{p+1}), \qquad \text{ with } \qquad B:= R+ [\chi _0,H]. \end{aligned}$$

Since \([W_h,H]=0\), we deduce that \([B,H]=0\), and so B is block-diagonal. Moreover, since the matrix \([\chi _0,H]\) vanishes identically on the diagonal blocks, the diagonal blocks of B are exactly those of R. As a consequence, with a slight abuse of notation, we may write

$$\begin{aligned} W_h^{(j)} = \omega _j I_{n_j} + h^p B^{(j)} + h^{p+1}Y_h^{(j)}, \qquad \text{ where } \qquad B^{(j)}:= \Pi _{\omega _{j}} R_{| E_{\omega _j}(H)} \end{aligned}$$

and \(Y_h^{(j)}\) is a family of real matrices depending smoothly on \(h\).

Next we aim at diagonalizing these blocks. By assumption, the eigenvalues of each matrix \(B^{(j)}\) are real and simple. Therefore, all \(B^{(j)}\) are diagonalizable. As a consequence, and again by applying Lemma 3.7, we get a family of real matrices \(\Upsilon ^{(j)}_h\) such that if \(|h|\) is small enough, for all \(j\in \llbracket 1,d\rrbracket \) we have

$$\begin{aligned} \Big [ \textrm{e}^{h\Upsilon ^{(j)}_h} (B^{(j)} + hY_h^{(j)}) \textrm{e}^{-h\Upsilon ^{(j)}_h}, B^{(j)} \Big ] = 0. \end{aligned}$$

This means that the eigenspaces of \(B^{(j)}\) are invariant under \( \textrm{e}^{h\Upsilon ^{(j)}_h} (B^{(j)} + hY_h^{(j)}) \textrm{e}^{-h\Upsilon ^{(j)}_h}\). By assumption, these eigenspaces are one-dimensional. Therefore, if \(Q^{(j)}\) is a real invertible matrix such that \( (Q^{(j)})^{-1} B^{(j)} Q^{(j)}\) is diagonal, then \( (Q^{(j)})^{-1} \textrm{e}^{h\Upsilon ^{(j)}_h} (B^{(j)} + hY_h^{(j)}) \textrm{e}^{-h\Upsilon ^{(j)}_h} Q^{(j)} \) is also diagonal.

Finally, as a consequence, setting

$$\begin{aligned} P_h:= \textrm{e}^{-h^p\chi _h} \begin{pmatrix} \textrm{e}^{-h\Upsilon ^{(1)}_h} Q^{(1)} & & \\ & \ddots & \\ & & \textrm{e}^{-h\Upsilon ^{(d)}_h} Q^{(d)} \end{pmatrix} \end{aligned}$$

we have proven that \(D_h:= P_h^{-1} H_hP_{h}\) is real diagonal, which concludes the proof of Theorem 3.3.

3.4 Applications to reversible splitting and composition methods

Theorems 3.1 and 3.3 shed light on the behavior observed in the examples collected in Sect. 2. Thus, suppose \(H = A + B\) is a real symmetric matrix, with A, B also real. Furthermore, consider a splitting scheme \(S_h\) of the form (1.2) with coefficients satisfying the symmetry conditions (1.3) and consistency,

$$\begin{aligned} a_0 + \cdots + a_{2n} = 1, \qquad \qquad b_0 + \cdots + b_{2n-1} =1. \end{aligned}$$

Clearly, \(S_h\) is a reversible map and moreover, it is consistent with \(\textrm{e}^{ihH}\) at least at order 1, so that (3.1) holds with \(p \ge 1\). Since H is real symmetric, it is diagonalizable. Therefore, if the eigenvalues of H are simple, the dynamics of \((S_h^n)_{n\in \mathbb {Z}}\) is given by Theorem 3.1: for sufficiently small h, there exist real matrices \(D_h\) (diagonal) and \(P_h\) (invertible) so that \(S_h^n = P_h \, \textrm{e}^{i n D_h} P_h^{-1}\), all the eigenvalues of \(S_h\) verify \(|\omega _j| = 1\) and \(\mathcal {M}(u)\) and \(\mathcal {H}(u)\) are almost preserved for long times. This corresponds to the examples of Fig. 1. The same conclusions apply as long as H is a real matrix with all its eigenvalues real and simple (Fig. 2, left), whereas the general case of complex eigenvalues is not covered by the theorem, and no preservation is ensured (Fig. 2, right).
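Both conclusions of Theorem 3.1 (reversibility and unit-modulus eigenvalues) are easy to observe numerically. The following sketch (ours, not from the paper) uses the Strang splitting with real coefficients applied to random real symmetric A and B:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
N = 8
A = rng.standard_normal((N, N)); A = (A + A.T) / 2   # real symmetric
B = rng.standard_normal((N, N)); B = (B + B.T) / 2

h = 0.1
# Strang splitting: palindromic, real coefficients (1/2, 1, 1/2)
S = expm(1j * h / 2 * A) @ expm(1j * h * B) @ expm(1j * h / 2 * A)

# Reversibility: the complex conjugate of S_h is its inverse
assert np.allclose(np.conj(S), np.linalg.inv(S))
# All eigenvalues of S_h lie on the unit circle
assert np.allclose(np.abs(np.linalg.eigvals(S)), 1.0)
```

For real symmetric A and B each factor is unitary, so the spectrum lies exactly on the unit circle; the theorem covers the less obvious non-symmetric case as well.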

Suppose now that the real matrix H has multiple real eigenvalues, but is still diagonalizable, and that A and B are real and symmetric. In that case, a symmetric-conjugate splitting method satisfies both conditions (1.4) and (1.5), so that it can be written as

$$\begin{aligned} S_h= e^{ihH_h}, \end{aligned}$$

where \(H_h\) is a family of real matrices whose even terms in h are symmetric and whose odd terms are skew-symmetric. Suppose in addition that \(S_h\) is of even order (i.e., p is even in (3.6)). In that case the matrix R in Theorem 3.3 is symmetric, and so its eigenvalues are real. Moreover, since R depends strongly on the coefficients \(a_j, b_j\) and on the decomposition \(H=A+B\), the eigenvalues of the operators \(\Pi _\omega R_{| E_{\omega }(H)}\) are typically simple, so that the dynamics of \((S_h^n)_{n\in \mathbb {Z}}\) is given by Theorem 3.3 and is therefore similar to that of \((\textrm{e}^{inhH})_{n\in \mathbb {Z}}\). Notice that this does not necessarily hold if the scheme is of odd order and/or A and B are not symmetric. This phenomenon is clearly illustrated by methods \(S_h^{[3,2]}\) and \(S_h^{[4]}\) in the examples of Fig. 3.

Notice, however, that method \(S_h^{[3,1]}\), although of odd order, works in fact better than expected from the previous considerations. The reason for this behavior resides in the following

Proposition 3.8

The 3rd-order symmetric-conjugate splitting method

$$\begin{aligned} S_h^{[3,1]} = \textrm{e}^{i h \overline{b}_0 B} \, \textrm{e}^{i h \overline{a}_1 A} \, \textrm{e}^{i h b_1 B} \, \textrm{e}^{i h a_1 A} \, \textrm{e}^{i h b_0 B}, \end{aligned}$$

with \(a_1 = \frac{1}{2} + i \frac{\sqrt{3}}{6}\), \(b_0 = \frac{a_1}{2}\), \(b_1 = \frac{1}{2}\), is indeed conjugate to a reversible integrator \(V_h\) of order 4, i.e., there exists a real near-identity transformation \(F_h\) such that \(F_h \, S_h^{[3,1]} \, F_h^{-1} = V_h = \textrm{e}^{i h H} + \mathcal {O}(h^5)\) and \(\overline{V}_h = V_h^{-1} \).

Proof

Method \(S_h^{[3,1]}\) is in fact a particular case of a composition \(\psi _h = \mathcal {S}_{\bar{\alpha } h}^{[2]} \, \mathcal {S}_{\alpha h}^{[2]}\), where \(\mathcal {S}_{h}^{[2]}\) is a time-symmetric 2nd-order method and \(\alpha = a_1\). Specifically, \(S_h^{[3,1]}\) is recovered when \(\mathcal {S}_{h}^{[2]} = \textrm{e}^{i \frac{h}{2} B} \, \textrm{e}^{i h A} \, \textrm{e}^{i \frac{h}{2} B}\). Since \(\mathcal {S}_{h}^{[2]}\) is time-symmetric, it can be written as

$$\begin{aligned} \mathcal {S}_{h}^{[2]} = \exp ( i h H - i h^3 F_3 + i h^5 F_5 + \cdots ) \end{aligned}$$

for certain real matrices \(F_{2j+1}\). In consequence, by applying the BCH formula, one gets \(\psi _h = \textrm{e}^{W(h)}\), with

$$\begin{aligned} W(h) = i h H &+ \frac{1}{2} h^4 |\alpha |^2(\alpha ^2-\bar{\alpha }^2) [H, F_3] \\&\quad + i h^5 \big ( w_{5,1} F_5 + w_{5,2} [H,[H, F_3]] \big ) + \mathcal {O}(h^6). \end{aligned}$$

Here \(w_{5,j}\) are polynomials in \(\alpha \). Now let us consider

$$\begin{aligned} V_h = \textrm{e}^{V(h)} = \textrm{e}^{\lambda h^3 F_3} \, \textrm{e}^{W(h)} \, \textrm{e}^{-\lambda h^3 F_3} \end{aligned}$$

for a given parameter \(\lambda \). Then, clearly,

$$\begin{aligned} V(h) = \textrm{e}^{\lambda h^3 \textrm{ad}_{F_3}} W(h) = i h H + h^4 \left( \frac{1}{2} \alpha ^3 - i \lambda \right) [H, F_3] + \mathcal {O}(h^5), \end{aligned}$$

so that by choosing \(\lambda = -\frac{i}{2} \alpha ^3 = \frac{\sqrt{3}}{18}\), we have \(V(h) = i h H + \mathcal {O}(h^5)\) and the stated result is obtained, with \(F_h = \textrm{e}^{\lambda h^3 F_3} \). \(\square \)
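As a sanity check (ours, not part of the proof), the classical order of \(S_h^{[3,1]}\) can be estimated numerically: for random real symmetric A and B, the local error \(\Vert S_h - \textrm{e}^{ihH}\Vert \) should decay as \(h^{p+1} = h^4\), so halving h should divide the error by roughly 16:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
N = 6
A = rng.standard_normal((N, N)); A = (A + A.T) / 2   # real symmetric
B = rng.standard_normal((N, N)); B = (B + B.T) / 2
H = A + B

a1 = 0.5 + 1j * np.sqrt(3) / 6                       # coefficients of S^{[3,1]}
b0, b1 = a1 / 2, 0.5

def S(h):
    # factors in the same order as the displayed formula
    return (expm(1j * h * np.conj(b0) * B) @ expm(1j * h * np.conj(a1) * A)
            @ expm(1j * h * b1 * B) @ expm(1j * h * a1 * A)
            @ expm(1j * h * b0 * B))

errs = [np.linalg.norm(S(h) - expm(1j * h * H)) for h in (0.05, 0.025)]
order = np.log2(errs[0] / errs[1])   # observed local order, close to 4
```

The conjugacy to a 4th-order reversible scheme does not change this local order; its effect is only visible in the long-time behavior.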

This result can be generalized as follows: given a time-symmetric method \(\mathcal {S}_{h}^{[2k]}\) of order 2k, if \(\alpha \) is chosen so that the composition \(\psi _h = \mathcal {S}_{\bar{\alpha } h}^{[2k]} \, \mathcal {S}_{\alpha h}^{[2k]}\) is of order \(2k+1\), then \(\psi _h\) is conjugate to a reversible method of order \(2k+2\).

Theorems 3.1 and 3.3 also allow one to explain the good behavior shown by symmetric-conjugate composition methods for this type of problem. In fact, suppose H is a real symmetric matrix and \(\Phi _H^z\) is a family of linear maps which are consistent with \(\textrm{e}^{izH}\) at least at order 1 and satisfy

$$\begin{aligned} (\Phi _H^z)^{-1} = \overline{\Phi _H^{\overline{z}}}. \end{aligned}$$

If we define \(S_h\) as the symmetric-conjugate composition

$$\begin{aligned} S_h= \Phi _H^{\alpha _0 h} \cdots \Phi _H^{\alpha _n h}, \end{aligned}$$

where \(\alpha _j\) are some complex coefficients satisfying the symmetry condition

$$\begin{aligned} \alpha _{n-j} = \overline{\alpha }_j, \qquad j=0,1,\ldots ,n \end{aligned}$$

and the consistency condition

$$\begin{aligned} \alpha _0 + \cdots + \alpha _n = 1, \end{aligned}$$

then \(S_h\) is a reversible map. Moreover, it is consistent with \(\textrm{e}^{ihH}\) at least at order 1. Therefore, one can apply Theorems 3.1 and 3.3 also in this case. Notice, in particular, that even if the maps \(\textrm{e}^{i h a_j A}\) and/or \(\textrm{e}^{i h b_j B}\) in the symmetric-conjugate splitting method (1.2) are not computed exactly, but only conveniently approximated, the previous theorems still apply, so that one can expect good long-term behavior from the resulting approximation.
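A minimal sketch of this mechanism, taking for \(\Phi _H^z\) a Strang-type map built from random real symmetric A and B (an illustrative choice of ours; any family satisfying the condition above would do):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(2)
n = 5
A = rng.standard_normal((n, n)); A = (A + A.T) / 2   # real symmetric
B = rng.standard_normal((n, n)); B = (B + B.T) / 2

def Phi(z):
    # a Strang-type map, consistent with e^{izH} for H = A + B
    return expm(1j * z / 2 * A) @ expm(1j * z * B) @ expm(1j * z / 2 * A)

# basic condition: (Phi^z)^{-1} = conjugate of Phi^{conj(z)}
z = 0.3 + 0.1j
assert np.allclose(np.linalg.inv(Phi(z)), np.conj(Phi(np.conj(z))))

# symmetric-conjugate, consistent coefficients: alpha_{n-j} = conj(alpha_j)
alphas = [0.25 + 0.3j, 0.5, 0.25 - 0.3j]             # they sum to 1
h = 0.1
S = np.eye(n, dtype=complex)
for al in alphas:
    S = S @ Phi(al * h)
assert np.allclose(np.conj(S), np.linalg.inv(S))     # S_h is reversible
```

The last assertion is exactly the reversibility used in the theorems: the symmetry of the coefficient sequence transfers from the maps \(\Phi _H^z\) to the composition.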

4 Symmetric-conjugate splitting methods for the Schrödinger equation

An important application of the previous results is the numerical integration of the time-dependent Schrödinger equation (\(\hbar = m = 1\))

$$\begin{aligned} i \frac{\partial }{\partial t} \psi (x,t) = \hat{H} \psi (x,t), \qquad \quad \psi (x,0)=\psi _0(x), \end{aligned}$$
(4.1)

where \(\psi : \mathbb {R}^3 \times \mathbb {R} \longrightarrow \mathbb {C}\). The Hamiltonian operator \(\hat{H}\) is the sum \(\hat{H} = \hat{T} + \hat{V}\) of the kinetic energy operator \(\hat{T}\) and the potential \(\hat{V}\). Specifically,

$$\begin{aligned} (\hat{T} \psi )(x) = -\frac{1}{2} \Delta \psi (x), \qquad \quad (\hat{V} \psi )(x) = V(x) \psi (x). \end{aligned}$$

In addition, a simple computation shows that \([\hat{V},[\hat{T},\hat{V}]] \, \psi = | \nabla V|^2 \psi \), and therefore

$$\begin{aligned}{}[\hat{V},[\hat{V},[\hat{V},\hat{T}]]] \ \psi = 0. \end{aligned}$$
(4.2)

Assuming \(d=1\) and periodic boundary conditions, the application of a pseudo-spectral method in space (with N points) leads to the N-dimensional system (1.1), where \(u(0) = u_{0} \in \mathbb {C}^N\) and H represents the (real symmetric) \(N \times N\) matrix associated with the operator \(-\hat{H}\) [16]. Now

$$\begin{aligned} H = A + B, \end{aligned}$$

where A is the (minus) differentiation matrix corresponding to the discretization of \(\hat{T}\) (a real and symmetric matrix) and B is the diagonal matrix associated with \(-\hat{V}\) at the grid points. Since \(\exp (t A)\) can be efficiently computed with the fast Fourier transform (FFT) algorithm, it is common practice to use splitting methods of the form (1.2) to integrate this problem. In this respect, notice that property (4.2) is inherited by the matrices A and B only if the number of discretization points N is sufficiently large to achieve spectral accuracy, i.e.,

$$\begin{aligned} {[}B,[B,[B,A]]] u= 0 \qquad \text{if } N \text{ is large enough.} \end{aligned}$$
(4.3)

Assuming this is satisfied, the number of conditions necessary to construct a method (1.2) of a given order p is reduced [2, 12]. Integrators of this class are sometimes called Runge–Kutta–Nyström (RKN) splitting methods [5].
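In practice the exponentials in (1.2) are never formed as matrices: each stage acts on u with pointwise multiplications and FFTs. The following sketch (ours; the grid and the quadratic potential are placeholders) evaluates one pair of exponentials \(\textrm{e}^{i h a A}\,\textrm{e}^{i h b B} u\):

```python
import numpy as np

N, L = 256, 16.0
x = np.linspace(-L / 2, L / 2, N, endpoint=False)
k = 2 * np.pi * np.fft.fftfreq(N, d=L / N)

V = x**2 / 2                  # placeholder smooth potential
b_diag = -V                   # B = diag(-V) at the grid points
a_symbol = -k**2 / 2          # Fourier symbol of A (minus the kinetic part)

def stage(u, h, a, b):
    """One pair of exponentials e^{i h a A} e^{i h b B} applied to u."""
    u = np.exp(1j * h * b * b_diag) * u                  # diagonal in x
    return np.fft.ifft(np.exp(1j * h * a * a_symbol) * np.fft.fft(u))

u0 = np.exp(-x**2).astype(complex)
u1 = stage(u0, 0.01, 0.5, 0.5)
# with real a, b both factors are unitary, so the norm is preserved
assert np.isclose(np.linalg.norm(u1), np.linalg.norm(u0))
```

Each stage therefore costs two FFTs, which is the unit used below to measure computational effort.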

Two further points are worth remarking. First, the computational cost of evaluating (1.2) is not significantly increased by incorporating complex coefficients into the scheme, since one has to use complex arithmetic anyway. Second, since \(\sum _j a_j = 1\) for a consistent method, if \(a_j \in \mathbb {C}\), then both positive and negative imaginary parts are present, and this can lead to severe instabilities due to the unboundedness of the Laplace operator [8, 14]. On the other hand, the spurious effects introduced by complex \(b_j\) can be eliminated (at least for sufficiently small values of h) by introducing an artificial cut-off bound in the potential when necessary.

In view of these considerations, we next limit our exploration to symmetric-conjugate splitting methods of the form (1.2) with \(0< a_j < 1\) and \(b_j \in \mathbb {C}\) with \(\Re (b_j) > 0\) to try to reduce the size of the error terms appearing in the asymptotic expansion of the modified Hamiltonian \(H_h\) associated with the integrator.

For simplicity, we denote the symmetric-conjugate splitting schemes \(S_h\) by their sequence of coefficients as

$$\begin{aligned} (a_0, b_0, a_1, b_1, \ldots , a_r, b_r, a_r, \ldots , \overline{b}_1, a_1, \overline{b}_0, a_0). \end{aligned}$$
(4.4)

As a matter of fact, since A and B are sought to verify (4.3), sequences starting with B may lead to schemes with a different efficiency, so that we also analyze methods of the form

$$\begin{aligned} (b_0, a_0, b_1, a_1, \ldots , b_r, a_r, \overline{b}_r, \ldots , a_1, \overline{b}_1, a_0, \overline{b}_0). \end{aligned}$$
(4.5)

Schemes (4.4) and (4.5) include integrators where the central exponential corresponds to A (when \(b_r=0\)) and B (when \(a_r=0\)), respectively. The method has s stages if the number of exponentials of A is precisely s for the scheme (4.5) or \(s+1\) for the scheme (4.4).

The construction process of methods within this class is detailed elsewhere (e.g. [5, 7] and references therein), so it is only summarized here. First, we derive the order conditions a symmetric-conjugate scheme has to satisfy to achieve a given order \(p=4, 5\) and 6. These are polynomial equations in the coefficients \(a_j\), \(b_j\), and can be obtained by identifying a basis in the Lie algebra generated by \(\{A, B\}\) and repeatedly applying the BCH formula to express the splitting method as \(S_h = \exp (i h H_h)\), with \(H_h\) written in terms of A, B and their nested commutators. The order conditions up to order p are obtained by requiring that \(S_h = \textrm{e}^{i h H} + \mathcal {O}(h^{p+1})\); their number is 7, 11 and 16 for orders 4, 5 and 6, respectively.

Second, we take compositions (4.4) and (4.5) involving the minimum number of stages required to solve the order conditions and eventually obtain all possible solutions with the appropriate symmetry. Sometimes no such solutions exist, and one has to add extra stages (and thus free parameters). In particular, there are no 4th-order schemes with 4 stages with both \(a_j > 0\) and \(\Re (b_j) > 0\).

Even when there are appropriate solutions, it may be convenient to explore compositions with additional stages so as to have free parameters for optimization. This strategy usually pays off when purely real coefficients are involved, and so it is worth exploring in this context as well. Of course, some optimization criterion related to the error terms and the computational effort has to be adopted. In our study we look at the error terms in the expansion of \(H_h\) at successive orders and at the size of the \(b_j\) coefficients. Specifically, we compute for each method of order, say, p, the quantities

$$\begin{aligned} \Delta _b:= \sum _j |b_j| \qquad \text{ and } \qquad E_f^{(r+1)}:= s \, \big ( \mathcal {E}_{r+1} \big )^{1/r}, \qquad r = p, p+1, \ldots \end{aligned}$$
(4.6)

Here s is the number of stages and \(\mathcal {E}_{r+1}\) is the Euclidean norm of the vector of error coefficients in \(H_h\) at higher orders than the method itself. In particular, for a method of order 6, \(E_f^{(7)}\) gives an estimate of the efficiency of the scheme by considering only the error at order 7. By computing \(E_f^{(8)}\) and \(E_f^{(9)}\) for this method we get an idea of how the higher order error terms behave. It will be of interest, of course, to reduce these quantities as much as possible to get efficient schemes.

Solving the polynomial equations required to construct splitting methods with additional stages is not a trivial task, especially for orders 5 and 6. In these cases we have used the Python function fsolve of the SciPy library, with a large number of initial points in the space of parameters to start the procedure. From the total number of valid solutions thus obtained, we have selected those leading to reasonably small values of all quantities (4.6) and checked them on numerical examples.
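The multi-start strategy can be sketched as follows. The polynomial system below is a hypothetical stand-in (the actual order conditions are considerably lengthier); what it illustrates is the pattern of launching `fsolve` from many random points and keeping the distinct converged roots:

```python
import numpy as np
from scipy.optimize import fsolve

def conditions(z):
    # stand-in system: a consistency-type condition and a
    # higher-order-type polynomial condition in two unknowns
    x, y = z
    return [x + y - 1.0, x**3 + y**3 - 0.625]

rng = np.random.default_rng(42)
roots = []
for _ in range(200):
    z, info, ok, msg = fsolve(conditions, rng.uniform(-2, 2, size=2),
                              full_output=True)
    if ok == 1 and not any(np.allclose(z, r, atol=1e-6) for r in roots):
        roots.append(z)
# this particular system has exactly two real solutions,
# x = (1 ± sqrt(1/2))/2 with y = 1 - x
```

The surviving roots are then ranked by the quantities (4.6) and only the best candidates are retained for numerical testing.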

The corresponding values for the most efficient methods we have found by following this approach have been collected in Table 1, where \(\mathcal{N}\mathcal{A}_{s}^{*[p]}\) refers to a symmetric-conjugate method of type (4.4) of order p involving s stages, and \(\mathcal{N}\mathcal{B}_{s}^{*[p]}\) is a similar scheme of type (4.5). For completeness, we have also included the most efficient integrators of order 4, 6 and 8 with real coefficients for systems satisfying the condition (4.3) (same notation without \(*\)) and also the symmetric-conjugate splitting schemes presented in [10, 11] (denoted by \(\mathcal{G}\mathcal{B}_{s}^{*[p]}\)). The latter do not take property (4.3) into account in their formulation.

In Table 1 we also write the value of \(\Delta _a:= \sum _j |a_j|\) and \(\Delta _b:= \sum _j |b_j|\) for each method. Of course, by construction, \(\Delta _a = 1\) for all symmetric-conjugate integrators. The coefficients of the most efficient schemes we have found (in boldface) are collected in Table 2.

Table 1 1-norm and effective errors for several splitting methods of order 4, 5 and 6 designed for problems satisfying the condition (4.3). In boldface, the most efficient schemes
Table 2 Coefficients of the most efficient symmetric-conjugate RKN splitting methods of order 4, 5 and 6

In the “Appendix” we provide analogous information for general schemes of orders 3, 4, 5 and 6, i.e., splitting methods for general problems of the form \(H = A + B\), with \(a_j > 0\) and \(b_j \in \mathbb {C}\) with \(\Re (b_j) > 0\). They typically involve more stages, but can be applied in more general contexts.

One should take into account, however, that all these symmetric-conjugate methods have been obtained by considering the ordinary differential equation (1.1) in finite dimension, whereas the time-dependent Schrödinger equation is a prototypical example of an evolutionary PDE involving unbounded operators (the Laplacian and possibly the potential). In consequence, one might question the viability of using the above schemes in this setting. That this is indeed possible follows from previous results obtained in the context of PDEs whose operators generate analytic semigroups.

Specifically, Eq. (4.1) can be written in the generic form

$$\begin{aligned} u' = \hat{L} u = (\hat{A}+\hat{B}) u, \qquad u(0) = u_0, \end{aligned}$$
(4.7)

with \(\hat{A} = \frac{i}{2} \Delta \) and \(\hat{B} = -i \hat{V}\). It has been shown in [13] (see also [15, 18]) that, under the two assumptions stated below, a splitting method of the form

$$\begin{aligned} S_h = \textrm{e}^{h a_0 \hat{A}} \, \textrm{e}^{ h b_0 \hat{B}} \, \cdots \, \textrm{e}^{ h b_{2n-1} \hat{B}} \, \textrm{e}^{ h a_{2n} \hat{A}} \end{aligned}$$
(4.8)

is of order p for problem (4.7) if and only if it is of classical order p in the finite dimensional case. The assumptions are as follows:

  1. Semi-group property: \(\hat{A}\), \(\hat{B}\) and \(\hat{L}\) generate \(C^0\)-semigroups on a Banach space X with norm \(\Vert \cdot \Vert \) and, in addition, they satisfy the bounds

    $$\begin{aligned} \Vert \textrm{e}^{t \hat{A}} \Vert \le \textrm{e}^{\omega t}, \qquad \Vert \textrm{e}^{t \hat{B}} \Vert \le \textrm{e}^{\omega t} \end{aligned}$$

    for some positive constant \(\omega \) and all \(t \ge 0\).

  2. Smoothness property: For any pair of multi-indices \((i_1,\ldots ,i_m)\) and \((j_1,\ldots ,j_m)\) with \(i_1+ \cdots + i_m + j_1 + \cdots + j_m = p+1\), and for all \(t \in [0,T]\),

    $$\begin{aligned} \Vert \hat{A}^{i_1} \hat{B}^{j_1} \ldots \hat{A}^{i_m} \hat{B}^{j_m} \, \textrm{e}^{t \hat{L}}u_0\Vert \le C \end{aligned}$$

    for a positive constant C.

These conditions restrict the coefficients \(a_j\), \(b_j\) in (4.8) to be positive, however, and thus the method to be of second order at most. Nevertheless, it has been shown in [8, 14] that, if in addition \(\hat{L}\), \(\hat{A}\) and \(\hat{B}\) generate analytic semigroups on X defined in the sector \(\Sigma _{\phi } = \{ z \in \mathbb {C}: |\arg z| < \phi \}\), for a given angle \(\phi \in (0, \pi /2]\) and the operators \(\hat{A}\) and \(\hat{B}\) verify

$$\begin{aligned} \Vert \textrm{e}^{z \hat{A}} \Vert \le \textrm{e}^{\omega |z|}, \qquad \Vert \textrm{e}^{z \hat{B}} \Vert \le \textrm{e}^{\omega |z|} \end{aligned}$$

for some \(\omega \ge 0\) and all \(z \in \Sigma _{\phi }\), then any splitting method of the form (4.8) of classical order p with all its coefficients \(a_j\), \(b_j\) in the sector \(\Sigma _{\phi } \subset \mathbb {C}\) satisfies

$$\begin{aligned} \Vert (S_h^n - \textrm{e}^{n h \hat{L}}) u_0 \Vert \le C h^p, \qquad 0 \le n h \le T \end{aligned}$$

where C is a constant independent of n and h.

5 Numerical illustration: Modified Pöschl–Teller potential

The so-called modified Pöschl–Teller potential takes the form

$$\begin{aligned} V(x) = -\frac{\alpha ^2}{2} \frac{\lambda (\lambda -1)}{\cosh ^2 \alpha x}, \end{aligned}$$
(5.1)

with \(\lambda > 1\), and admits an analytic treatment to compute explicitly the eigenvalues for negative energies [9]. For the simulations we take \(\alpha = 1\), \(\lambda (\lambda -1) = 10\) and the initial condition \(\psi _0(x) = \sigma \, \textrm{e}^{-x^2/2}\), with \(\sigma \) a normalizing constant. We discretize the interval \(x \in [-8,8]\) with \(N=256\) equispaced points and apply Fourier spectral methods. With this value of N it turns out that \(\Vert [B,[B,[B,A]]] u_0\Vert \) is sufficiently close to zero to be negligible, so that we can safely apply the schemes of Table 2. If N is not sufficiently large, then the corresponding matrices A and B do not satisfy (4.3), and as a consequence the schemes are only of order three. This is indeed observed in practice.
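This smallness claim is easy to reproduce with the pseudo-spectral matrices of Sect. 4 (our own sketch; A is built column by column through the FFT only for the purpose of this check):

```python
import numpy as np

N, L = 256, 16.0
x = np.linspace(-8.0, 8.0, N, endpoint=False)
k = 2 * np.pi * np.fft.fftfreq(N, d=L / N)

# dense pseudo-spectral matrix A (minus the kinetic part) and diagonal B
F = np.fft.fft(np.eye(N), axis=0)
A = -np.real(np.fft.ifft((k**2 / 2)[:, None] * F, axis=0))
V = -5.0 / np.cosh(x)**2          # alpha = 1, lambda(lambda-1) = 10
B = np.diag(-V)

u0 = np.exp(-x**2 / 2)
u0 = u0 / np.linalg.norm(u0)      # normalized Gaussian initial condition

C = B @ A - A @ B                 # [B, A]
C = B @ C - C @ B                 # [B, [B, A]]
C = B @ C - C @ B                 # [B, [B, [B, A]]]
res = np.linalg.norm(C @ u0)      # negligible at this resolution
```

At N = 256 the residual `res` is at the level of roundoff, so condition (4.3) holds for all practical purposes.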

Fig. 4

Error in norm \(\mathcal {M}(u)\) (left) and in energy \(\mathcal {H}(u)\) (right) as a function of time for complex-conjugate and palindromic methods involving complex coefficients

Fig. 5

Maximum error in the expectation value of the energy along the integration for several 4th-, 5th- and 6th-order symmetric-conjugate splitting methods for the modified Pöschl–Teller potential

Fig. 6

Maximum error in the expectation value of the energy along the integration as a function of the computational cost for the new symmetric-conjugate splitting methods intended for general problems of the form \(H = A + B\) (modified Pöschl–Teller potential)

We first check how the errors in the norm \(\mathcal {M}(u)\) and in the energy \(\mathcal {H}(u)\) evolve with time for each type of integrator. To this end we integrate numerically until the final time \(t_f = 10^4\) with three 6th-order compositions involving complex coefficients: (i) the new symmetric-conjugate scheme \(\mathcal{N}\mathcal{B}_{11}^{*[6]}\) collected in Table 2 \((h = 100/909 \approx 0.11)\), (ii) the palindromic scheme denoted by \(\mathcal {B}_{16}^{[6]}\) with all \(a_j\) taking the same value \(a_j = 1/16\), \(j=1,\ldots , 8\) and complex \(b_j\) with positive real part \((h=0.16)\), and (iii) the method obtained by composing \(\mathcal {B}_{16}^{[6]}\) with its complex conjugate \((\mathcal {B}_{16}^{[6]})^*\), resulting in a symmetric-conjugate integrator \((h=0.32)\). The step size is chosen in such a way that all the methods require the same number of FFTs. The results are depicted in Fig. 4. We see that, in accordance with the previous analysis, the error in both unitarity and energy furnished by the new scheme \(\mathcal{N}\mathcal{B}_{11}^{*[6]}\) does not grow with time, in contrast with palindromic compositions involving complex coefficients. Notice also that the composition of the palindromic scheme \(\mathcal {B}_{16}^{[6]}\) with its complex conjugate leads to a new (symmetric-conjugate) integrator with good preservation properties. On the other hand, composing a symmetric-conjugate method with its complex conjugate results in a palindromic scheme showing a drift in the error of both the norm and the energy [4] (Fig. 6).

In our second experiment, we test the efficiency of the different schemes. To this end we integrate until the final time \(t_f = 100\), compute the expectation value of the energy, \(\mathcal {H}(u_{\textrm{app}}(t))\), and measure the error as the maximum difference with respect to the exact value along the integration:

$$\begin{aligned} \max _{0 \le t \le t_f} \quad |\mathcal {H}(u_{\textrm{app}}(t)) - \mathcal {H}(u_0)|. \end{aligned}$$
(5.2)
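A minimal sketch of how this error measure can be evaluated along the integration, using a plain Strang step as a stand-in for the methods of Table 2 (ours; the energy is the discrete expectation value \(\langle u, \hat{H} u\rangle \) computed pseudo-spectrally):

```python
import numpy as np

N, L = 256, 16.0
x = np.linspace(-8.0, 8.0, N, endpoint=False)
k = 2 * np.pi * np.fft.fftfreq(N, d=L / N)
V = -5.0 / np.cosh(x)**2          # modified Poschl-Teller potential

u = np.exp(-x**2 / 2).astype(complex)
u /= np.linalg.norm(u)            # normalized initial condition

def energy(u):
    # <u, H u>, with H u = T u + V u evaluated pseudo-spectrally
    Hu = np.fft.ifft(k**2 / 2 * np.fft.fft(u)) + V * u
    return np.real(np.vdot(u, Hu))

E0 = energy(u)
h, err = 0.05, 0.0
for _ in range(2000):                                  # up to t_f = 100
    u = np.exp(-1j * h / 2 * V) * u                    # half potential step
    u = np.fft.ifft(np.exp(-1j * h * k**2 / 2) * np.fft.fft(u))
    u = np.exp(-1j * h / 2 * V) * u                    # half potential step
    err = max(err, abs(energy(u) - E0))                # running max of (5.2)
```

Replacing the Strang step by the higher-order symmetric-conjugate compositions, and counting two FFTs per kinetic exponential, yields the efficiency diagrams discussed below.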

The corresponding results are displayed in Fig. 5 (in log–log scale) as a function of the computational cost, measured by the number of FFTs necessary to carry out the calculations. Notice how the new symmetric-conjugate schemes offer better efficiency than standard splitting methods for this problem. The improvement is particularly significant in the 6th-order case.