We illustrate our theoretical results for two different skew-Hermitian problems in Sect. 8.1, a Hermitian problem in Sect. 8.2, and a non-normal problem in Sect. 8.3. We also compare the performance of different error estimates for practical step size control (Sect. 7) in Sect. 8.1. To show that our error estimate (3.6) is efficient in practice, we also compare it with results delivered by the standard package Expokit [47] and with a priori error estimates.
The skew-Hermitian case
For our tests we use different types of matrices.
Free Schrödinger equation We consider
$$\begin{aligned} H=\tfrac{1}{4}\, \text {tridiag}(-1,2,-1) \in {\mathbb {R}}^{n\times n}, \end{aligned}$$
(8.1)
with dimension \(n=10\,000\). The matrix H is associated with a finite difference or finite element discretization of the one-dimensional negative Laplacian. With \(A=H\) and \(\sigma =-\mathrm{i}\) in (2.1), we obtain the free Schrödinger equation. The eigenvalue decomposition of H is well known, and we can use the discrete sine transform with high precision arithmetic in Matlab to compute the exact solution E(t)v, see (2.1). The starting vector v is chosen randomly. To compute the Krylov subspace approximation \(S_m(t)v\), see (2.6), we use the eigenvalue decomposition of the tridiagonal matrix \(T_m\).
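The computation described above can be sketched in a few lines. The following NumPy sketch uses a reduced dimension and a dense eigendecomposition of H in place of the discrete sine transform; the parameter values (n, m, t) are chosen for illustration only and differ from the experiments.

```python
import numpy as np

def lanczos(H, v, m):
    """Hermitian Lanczos with full reorthogonalization:
    returns V_m (n x m) and the real symmetric tridiagonal T_m."""
    n = len(v)
    V = np.zeros((n, m))
    alpha, beta = np.zeros(m), np.zeros(m - 1)
    V[:, 0] = v / np.linalg.norm(v)
    for j in range(m):
        w = H @ V[:, j]
        alpha[j] = V[:, j] @ w
        w -= alpha[j] * V[:, j]
        if j > 0:
            w -= beta[j - 1] * V[:, j - 1]
        w -= V[:, :j + 1] @ (V[:, :j + 1].T @ w)  # reorthogonalization
        if j < m - 1:
            beta[j] = np.linalg.norm(w)
            V[:, j + 1] = w / beta[j]
    return V, np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)

n, m, t = 200, 30, 1.0
H = 0.25 * (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1))  # matrix (8.1)
v = np.random.default_rng(0).standard_normal(n)
v /= np.linalg.norm(v)

# reference solution e^{-i t H} v via dense eigendecomposition
lam, Q = np.linalg.eigh(H)
exact = Q @ (np.exp(-1j * t * lam) * (Q.T @ v))

# Krylov approximation S_m(t) v = V_m e^{-i t T_m} e_1,
# using the eigenvalue decomposition of T_m
V, T = lanczos(H, v, m)
mu, P = np.linalg.eigh(T)
approx = V @ (P @ (np.exp(-1j * t * mu) * P[0, :]))

err = np.linalg.norm(approx - exact)
```

For this small configuration the Krylov approximation already reproduces the solution to machine precision, since \(t\,\Vert H\Vert _2\approx 1\) is far inside the superlinear convergence regime for \(m=30\).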
Discrete Hubbard model For the description of the Hubbard model we employ a self-contained notation. The Hubbard model was first introduced in [28] and has since been used in many papers and books, e.g., [32, 44]. It describes the electron density on a given number of sites, which correspond to a Wannier discretization of orbitals, with spin up or down. We consider the following Hubbard Hamiltonian, in second quantization and without chemical potential:
$$\begin{aligned} H=\frac{1}{2} \sum _{i,j,\sigma } v_{ij} c_{j\sigma }^{\dagger } c_{i\sigma } + \sum _{j,\sigma } U {\hat{n}}_{j\sigma }{\hat{n}}_{j\sigma '}, \end{aligned}$$
(8.2)
where the indices i, j run over the sites \(1,\ldots ,n_{\text {sites}}\), and \(\sigma ,\sigma ' \in \{\uparrow ,\downarrow \}\) denote the spins, with \(\sigma '\) the spin opposite to \(\sigma \). The entries \(v_{ij}\) with \(i,j=1,\ldots ,n_{\text {sites}}\) describe electron hopping from site i to site j. In (8.2), \(c_{j\sigma }^\dagger \) and \(c_{i\sigma }\) denote the creation and annihilation operators of second quantization, and \({\hat{n}}_{j\sigma }=c_{j\sigma }^\dagger c_{j\sigma }\) is the occupation number operator. For details on the notation in (8.2) we refer to, e.g., [28, 29, 32, 44].
For our tests we model 8 electrons at 8 sites (\(n_{\text {sites}}=8\)) with spin up and down for each site; this leads to 16 possible single-particle states. Such an electron distribution is also referred to as half-filled in the literature. We further restrict our model by fixing the number of electrons with spin up and with spin down to \(n_{\text {sites}}/2\) each. This leads to \(n=(\text {binomial}(8,4))^2=4900\) occupation states, which form a discrete basis. For the numerical implementation of the basis we use 16-bit integers, where each bit describes a position which is occupied if the bit equals 1 and empty otherwise. The set of occupation states can be ordered by the value of the integers, which leads to a unique representation of the Hubbard Hamiltonian (8.2) by a matrix \(H\in {\mathbb {C}}^{n\times n}\). Such an implementation of the Hubbard Hamiltonian is also described in [29, Section 3].
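The bitmask construction of the occupation basis can be sketched as follows; this is an illustrative transcription of the description above, not the code used for the experiments.

```python
from itertools import combinations

n_sites = 8
n_up = n_down = n_sites // 2   # half filling with fixed spin populations

def occupation_states(n_sites, n_el):
    """All n_sites-bit integers with exactly n_el bits set
    (bit k = 1 means site k is occupied)."""
    states = []
    for occupied in combinations(range(n_sites), n_el):
        s = 0
        for k in occupied:
            s |= 1 << k
        states.append(s)
    return sorted(states)   # order by integer value -> unique basis ordering

up_states = occupation_states(n_sites, n_up)
down_states = occupation_states(n_sites, n_down)

# a basis state combines an up pattern and a down pattern into one 16-bit integer
basis = [(u << n_sites) | d for u in up_states for d in down_states]
dim = len(basis)   # binomial(8,4)**2 = 4900
```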
In our test setting we use \(U=5\) and parameter-dependent values for electron hopping \(v_{ij}=v_{ij}(\omega )\in {\mathbb {C}}\) with \(\omega \in (0,2\pi ]\):
$$\begin{aligned}&v_{11}=v_{88}=-1.75,\quad v_{jj}=-2\quad \text {for}\quad j=2,\ldots ,7,\\&v_{j,j+1}={\bar{v}}_{j+1,j}=-\cos \omega +\mathrm{i}\,\sin \omega \quad \text {for}\quad j=1,\ldots ,7\quad \text {and}\quad v_{ij}=0\quad \text {otherwise}. \end{aligned}$$
For this choice of \(v_{ij}(\omega )\) we obtain a Hermitian matrix \(H_\omega \in {\mathbb {C}}^{n\times n}\) with 43,980 nonzero entries (for a general choice of \(\omega \)) and \(\text {spec}(H_\omega )\subseteq (-19.1,8.3)\). The spectrum of \(H_\omega \) is independent of \(\omega \).
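The \(\omega \)-independence of the spectrum can be made plausible already at the level of the single-particle hopping matrix \((v_{ij}(\omega ))\): a diagonal gauge transformation \(D=\text {diag}(\mathrm{e}^{-\mathrm{i}\,j\,\omega })\) removes the phases of the off-diagonal entries, so all \(v(\omega )\) are unitarily similar; the many-body case behaves analogously. A small sketch of the \(8\times 8\) matrix \(v(\omega )\), transcribed from the entries above:

```python
import numpy as np

def hopping_matrix(omega, n_sites=8):
    """Single-particle hopping matrix (v_{ij}(omega)) for the open chain above."""
    diag = np.array([-1.75] + [-2.0] * (n_sites - 2) + [-1.75])
    hop = -np.cos(omega) + 1j * np.sin(omega)          # v_{j,j+1}
    return (np.diag(diag).astype(complex)
            + hop * np.eye(n_sites, k=1)
            + np.conj(hop) * np.eye(n_sites, k=-1))    # v_{j+1,j} = conj(v_{j,j+1})

# the gauge transform D = diag(e^{-i j omega}) maps v(omega) to the real matrix v(0),
# so the eigenvalues do not depend on omega
spec = np.linalg.eigvalsh(hopping_matrix(0.123))
```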
A relevant application in which the Hubbard Hamiltonian (8.2) is of importance is the simulation of oxide solar cells, with the goal of identifying candidate materials that promise a gain in solar cell efficiency, see [21]. The study of solar cells considers time-dependent electron hoppings \(v_{ij}=v_{ij}(t)\) to model time-dependent potentials, which lead to Hamiltonian matrices H(t). The time-dependent Hamiltonian can be parameterized via \(\omega \). Time propagation of a linear, non-autonomous ODE system can be approximated by Magnus-type integrators, which are based on one or more evaluations of matrix exponentials applied to different starting vectors at several times t, see for instance [5, 6]. Our test setting for the Hubbard Hamiltonian with arbitrary \(\omega \) is then obtained from (2.1) with the matrix \(A=H_\omega \) as described above and \(\sigma =-\mathrm{i}\).
In the following Sect. 8.1 we focus on the skew-Hermitian case. For tests on the Hermitian case see Sect. 8.2 below.
Verification of upper error bound In the following Figs. 1 and 2 we compare the error \(\Vert L_m(t)v\Vert _2\) with the error estimates \({\mathrm {Err}}_{1}\) and \({\mathrm {Err}}_{a}\). Figure 1 refers to the matrix (8.1) of the free Schrödinger problem and Fig. 2 to the Hubbard Hamiltonian (8.2) with \(\omega =0.123\). For both cases we show results with Krylov subspace dimensions \( m=10 \) and \( m=30 \), respectively.
We observe that the error estimate \({\mathrm {Err}}_{1}\) is a good approximation to the error, but it is not an upper bound in general. In contrast, \({\mathrm {Err}}_{a}\) is a proven upper error bound. Up to round-off error, for \(m=10\) we observe the correct asymptotic behavior of \({\mathrm {Err}}_{a}\) and \({\mathrm {Err}}_{1}\). For larger choices of m the asymptotic regime starts at time steps for which the error is already close to round-off precision. Therefore, for larger choices of m, the Krylov approximation, as a time integrator, cannot achieve its full order for typical time steps in double precision.
The matrix (8.1) has been scaled such that \(\text {spec}(H)\subseteq (0,1)\) and \(\Vert H\Vert _2\approx 1\). In accordance with (7.1) stagnation of the error is observed for times \(t \lessapprox m\), see Fig. 1.
We verify the error estimates in the skew-Hermitian setting of the free Schrödinger Eq. (8.1) for the standard Krylov approximation of the \(\varphi _1\) function in Fig. 3 and the corrected Krylov approximation of the matrix exponential function in Fig. 4. In Fig. 3 the error estimator \({\mathrm {Err}}_{1}\) refers to formula (5.8a) and \({\mathrm {Err}}_{a}\) shows the upper error bound (4.4b) from Theorem 2, both for the case \(p=1\). In Fig. 4, \({\mathrm {Err}}_{1}\) is from formula (5.8b) and \({\mathrm {Err}}_{a}\) denotes the upper error bound (5.9b) from Theorem 3, both for the case \(p=0\).
Illustration of defect-based quadrature error estimates from Section 6 We first illustrate the performance of the estimates based on Hermite quadrature according to (6.3) and improved Hermite quadrature according to (6.5) for the Hubbard model, see Fig. 5. Both estimates are asymptotically correct, whereas the improved quadrature (6.5) is slightly better for larger time steps t, with the drawback of one additional matrix–vector multiplication. (See Remark 7 below for cost efficiency of more expensive error estimates.)
Figure 6 refers to the generalized residual estimate (6.6), to estimates based on the effective order quadrature according to Remark 5, and to the Hermite quadrature (6.3). For our test problems the assumptions from Sect. 6 on the defect and its effective order are satisfied for a significant range of values of t. We also observe that the inequalities (6.8) are satisfied. The effective order and Hermite quadrature estimates behave in an asymptotically correct way, while the generalized residual estimate leads to an upper error bound which is, however, not sharp for \(t\rightarrow 0\).
For the skew-Hermitian case we use \(\sigma =-\mathrm{i}\) and \(T_m\in {\mathbb {R}}^{m\times m}\) in (6.10) to obtain
$$\begin{aligned} \rho (t)=t\,(T_m)_{m-1,m}\,\text {Re}\bigg ( \frac{-\mathrm{i}\,(\mathrm{e}^{-\mathrm{i}\,t\,T_m}e_1)_{m-1}}{(\mathrm{e}^{-\mathrm{i}\,t\,T_m}e_1)_m} \bigg ). \end{aligned}$$
For computing the effective order we only consider time steps with \(\rho (t)>0\) for which \(\rho \) is indeed monotonically decreasing over the computed discrete time steps. This restriction is compatible with our assumptions in Sect. 6.
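The quantity \(\rho (t)\) above can be evaluated directly from the eigendecomposition of the real symmetric matrix \(T_m\). A minimal sketch, using a small artificial \(T_m\) for illustration (in practice \(T_m\) is produced by the Lanczos process):

```python
import numpy as np

def rho_skew_hermitian(T, t):
    """Evaluate rho(t) from (6.10) for sigma = -i;
    T is the real symmetric Lanczos matrix T_m."""
    m = T.shape[0]
    mu, P = np.linalg.eigh(T)
    y = P @ (np.exp(-1j * t * mu) * P[0, :])     # y = e^{-i t T_m} e_1
    # 1-based indices (m-1, m) and (m-1), (m) correspond to [m-2, m-1] here
    return t * T[m - 2, m - 1] * np.real(-1j * y[m - 2] / y[m - 1])

# small artificial T_m for illustration
m = 4
T = 0.5 * np.eye(m) + 0.25 * (np.eye(m, k=1) + np.eye(m, k=-1))
rho_values = [rho_skew_hermitian(T, t) for t in (0.5, 1.0, 2.0)]
```

One can check that \(\rho (t)\) coincides with the logarithmic derivative \(t\,\tfrac{\mathrm{d}}{\mathrm{d}t}\log |(\mathrm{e}^{-\mathrm{i}\,t\,T_m}e_1)_m|\), i.e., the effective order of the last component of \(\mathrm{e}^{-\mathrm{i}\,t\,T_m}e_1\), which tends to \(m-1\) for \(t\rightarrow 0\).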
Corrected Krylov approximation and mass conservation We remark that error estimates for the corrected Krylov approximation usually require one additional matrix–vector multiplication, and applying a standard Krylov approximation of dimension \(m+1\) seems to be a more favorable choice in our approach to error estimation.
In contrast to the corrected Krylov approximation, the standard Krylov approximation of the matrix exponential conserves the mass in the skew-Hermitian case. Whether this is a real drawback of the corrected Krylov approximation depends on the emphasis placed on mass conservation. In the following examples we focus on the standard Krylov approximation, with some exceptions which serve for comparisons with the original Expokit code, which is based on the corrected Krylov approximation.
In exact arithmetic we obtain mass conservation in the skew-Hermitian case: for \(\Vert v\Vert _2=1\) and the standard Krylov approximation \(S_m(t)v\) we have
$$\begin{aligned} \Vert S_m(t)v\Vert _2 = \Vert V_m \mathrm{e}^{-\mathrm{i}\,t\,T_m}e_1\Vert _2 = e_1^*\mathrm{e}^{\mathrm{i}\,t\,T_m} V_m^*V_m\mathrm{e}^{-\mathrm{i}\,t\,T_m}e_1=1. \end{aligned}$$
(8.3)
The requirement \(V_m^*V_m=I\) is essential to obtain mass conservation in (8.3). In computer arithmetic the loss of orthogonality of the Krylov basis \(V_m\) has been studied earlier, see also [40]. To preserve the property of mass conservation a reorthogonalization, see [43], may be advisable in this case.
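The identity (8.3) and its sensitivity to the loss of orthogonality can be checked numerically. In the sketch below, \(V_m\) is an exactly orthonormal basis (as delivered by Lanczos with reorthogonalization), and an artificial perturbation of \(V_m\) plays the role of orthogonality loss in finite precision:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, t = 500, 20, 2.7

# orthonormal V_m and real symmetric T_m, as produced by Lanczos
# with reorthogonalization (here generated artificially via QR)
V, _ = np.linalg.qr(rng.standard_normal((n, m)))
T = rng.standard_normal((m, m))
T = 0.5 * (T + T.T)

mu, P = np.linalg.eigh(T)
y = P @ (np.exp(-1j * t * mu) * P[0, :])      # e^{-i t T_m} e_1, unit norm
Smv = V @ y                                   # S_m(t) v = V_m e^{-i t T_m} e_1
mass_defect = abs(np.linalg.norm(Smv) - 1.0)  # = 0 in exact arithmetic by (8.3)

# if V_m^* V_m deviates from I, mass conservation fails
V_bad = V + 1e-4 * rng.standard_normal((n, m))
mass_defect_bad = abs(np.linalg.norm(V_bad @ y) - 1.0)
```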
Krylov approximation of the matrix exponential in computer arithmetic It has been shown in [11, 13, 19] that a priori error estimates for the Krylov approximation of the matrix exponential remain valid when the effects of finite precision arithmetic are taken into account. Such results imply that in computer arithmetic the convergence of the Krylov approximation is in general not precluded, and round-off errors are not critical. In practice, round-off errors may in some cases delay convergence, which can make a reorthogonalization relevant. Stability of the Krylov approximation has been discussed by many authors, see also [38], but is not further discussed here in detail. In the next paragraph we give an argument, following [13], that the a posteriori error estimates which are the topic of this work are robust with respect to round-off errors.
We recall that the Krylov subspace constructed in computer arithmetic satisfies the Krylov identity (2.2) with a small perturbation, see also [40] for the Lanczos case and [8, 51] for the Arnoldi case, which can both be extended to complex-valued problems using results from [22]. Following results from [13] we conclude that a small perturbation of the Krylov identity leads to a small perturbation of the defect (residual) \(\delta _m(t)\) in (3.1a) and the integral representation of the error in (3.1b). Thus the error estimates given in Sect. 6 remain stable with respect to round-off.
We further use that, by construction, the computed \(T_m\) is still upper Hessenberg with a positive subdiagonal and, in the Lanczos case, also real-valued and symmetric. Then, following Proposition 6 in the Hermitian (Lanczos) case, the integral representation of the error in (3.1b) results in the upper error bound \({\mathrm {Err}}_{1}\), which is not critically affected by round-off errors. For the upper bound \({\mathrm {Err}}_{a}\) we further assume that the spectral properties of (3.4) still hold mutatis mutandis under a small perturbation, see [41] for such results in the Lanczos case, to obtain stability of this upper error bound also with round-off.
Numerical tests for step size control The idea of choosing discrete time steps for the Krylov approximation is described in Sect. 7. The following tests are applied to the matrix exponential of the Hubbard Hamiltonian. We first clarify the notation used for our test setting.
Expokit and \(\hbox {Expokit}^{\star }\) The original Expokit code uses the corrected Krylov approximation with heuristic step size control and an error estimator which is based on the error expansion (5.1), see [47, Algorithm 3.2] for details. Since the standard Krylov approximation is not part of the Expokit package, we have slightly adapted the code and its error estimate so that the standard Krylov approximation is used. We refer to the adapted package as \(\hbox {Expokit}^{\star }\). With \(\hbox {Expokit}^{\star }\) the comparison can be drawn with the standard Krylov approximation, which may in some cases be the method of choice, as discussed above.
Step size based on \({\mathrm {Err}}_{a}\) In another test code the upper error bound \({\mathrm {Err}}_{a}\) from Theorem 1 is used. With \({\mathrm {Err}}_{a}\) we obtain proven upper bounds on the error and reliable step sizes (7.5).
By gen.res, eff.o.quad, and \({\mathrm {Err}}_{1}\) we refer to the generalized residual estimate (6.6), the effective order quadrature (6.9), and \({\mathrm {Err}}_{1}\), respectively. Because these error estimates cannot be inverted directly, we need to apply heuristic ideas for the step size control, see (7.7). In addition, we use the iteration (7.9) to improve step sizes. For the test problems we have solved, iteration (7.9) converges in fewer than 2 iterations for \(m=10\) and fewer than 5 iterations for \(m=30\). We simply choose \(N_j=5\) for our tests.
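Since the formulas (7.7) and (7.9) are not reproduced here, the following sketch only illustrates the generic principle behind such step size iterations: assuming a local power-law error model \(\mathrm{err}(t)\approx C\,t^{q}\), the step is rescaled until the estimate matches the tolerance. The function `err_est` and the order `q` are placeholders for the actual estimators and effective orders of Sect. 6.

```python
def choose_step(err_est, t0, tol, q, max_iter=5):
    """Iteratively adjust the step t so that err_est(t) ~ tol,
    assuming the local model err_est(t) ~ C * t**q."""
    t = t0
    for _ in range(max_iter):
        e = err_est(t)
        if e == 0.0:
            break
        t_new = t * (tol / e) ** (1.0 / q)
        if abs(t_new - t) < 0.01 * t:   # step has settled to within ~1%
            t = t_new
            break
        t = t_new
    return t

# toy error model err(t) = C t^m with m = 10 (an actual estimator would be used in practice)
m, C, tol = 10, 3.0, 1e-8
t = choose_step(lambda t: C * t**m, t0=1.0, tol=tol, q=m)
```

For an exact power-law model the iteration settles after two evaluations; for the real estimators a few more iterations may be needed, consistent with the iteration counts reported above.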
The a priori estimates (7.8), [24, Theorem 4] and [34, eq. (20)] are given in the corresponding references. Formula (7.8) taken from the Expokit code directly provides a step size. In [34, eq. (20)] the computation of the step size is described. For the error estimate given in [24, Theorem 4] we apply Newton iteration to determine an appropriate step size. For tests on the Hubbard model we use \((\lambda _{\text {max}} - \lambda _{\text {min}}) = 27.4\) as suggested in the description of the Hubbard Hamiltonian.
In Remark 7 below we also investigate the following variants:
Step size based on \({\mathrm {Err}}_{a}^+\) By \({\mathrm {Err}}_{a}^+\) we denote the upper error bound for the corrected Krylov approximation as given in Theorem 3 with \(p=0\). The corresponding step size is given by (7.6).
By i.H.quad we refer to the improved Hermite quadrature (6.5). Similarly to other quadrature error estimates we use heuristic step size control and iteration (7.9) to determine adequate step sizes.
Remark 6
In the Expokit code the step sizes are rounded to 2 digits in every step. Rounding the step size can lead to errors that are too large in some steps. This makes it necessary to include safety parameters in Expokit, which in turn slow down the performance of the code. It seems advisable to avoid any kind of rounding of step sizes.
In Table 1 we compare the total time t for the Krylov approximation with \(m=10\) and \(m=30\) after \(N=10\) steps obtained with the different step size control strategies. For the local error we choose the tolerance \(\mathrm{tol}=10^{-8}\). The original Expokit code seems to give larger step sizes, but it also uses a larger number of matrix–vector multiplications, see Remark 7. The error estimate \({\mathrm {Err}}_{a}\) leads to optimal step sizes for \(m=10\) and close to optimal step sizes for \(m=30\). For any choice of m the error estimate \({\mathrm {Err}}_{a}\) gives reliable step sizes. The generalized residual estimate overestimates the error and, therefore, step sizes are smaller. The effective order quadrature and \({\mathrm {Err}}_{1}\) give optimal step sizes. With the assumptions from Sect. 6 (which apply to our test examples), the generalized residual estimate and effective order quadrature give reliable step sizes. For the error estimate \({\mathrm {Err}}_{1}\) we do not have results on the reliability of the step sizes, since \({\mathrm {Err}}_{1}\) does not lead to an upper bound of the error in general. The tested a priori estimates (7.8), [24, Th. 4], and [34, (20)] overestimate the error and lead to conservative step size choices. For all the tested versions the accumulated error \( L_m^{\star } \) (see (7.2)) satisfies \(\Vert L_m^{\star } v\Vert _2/t \le \mathrm{tol}\).
Table 1 The displayed step size t is the sum of \(N=10\) substeps computed by different versions of step size control, as described above

Table 2 With a test setting similar to Table 1, we now compute up to a fixed time \(t=0.3\) and choose the number N of steps according to the step size control

Apart from step size control, the upper error bound \({\mathrm {Err}}_{a}\) can be used on the fly to test if the dimension of the Krylov subspace is already sufficiently large to solve the problem in a single time step with the required accuracy. For our test problems this stopping criterion is applied to the \({\mathrm {Err}}_{a}\) estimate. We refer to Table 2, in which we observe the Krylov method with error estimate \({\mathrm {Err}}_{a}\) to stop after 17 steps instead of computing the full Krylov subspace of dimension 30. In comparison, the original Expokit package needs a total of 62 matrix–vector multiplications.
Remark 7
Error estimates for the corrected Krylov approximation or improved error estimates such as the improved Hermite quadrature (6.5) require additional matrix–vector multiplications. Instead of investing computational effort in improving the error estimate, one may as well increase the dimension of the standard Krylov subspace. For comparison we test the original Expokit code, the corrected Krylov approximation with error estimate \({\mathrm {Err}}_{a}^+\) and the improved Hermite quadrature (6.5) with Krylov subspace \(m-1\). Table 3 shows that a standard Krylov approximation with dimension m leads to better results, although all considered versions use the same number of matrix–vector multiplications. Since the reliability of error estimates such as \({\mathrm {Err}}_{a}\) has been demonstrated earlier, it appears that additional cost to improve the error estimate is not justified.
Table 3 All variants shown use exactly m matrix–vector multiplications
The Hermitian case
To obtain a more complete picture, we also briefly consider the case of a Hermitian matrix \(A=H\) with \(\sigma =-1\) in (2.1). Such a model is typical of the discretization of a parabolic PDE. Thus, the results may depend on the regularity of the initial data, which is chosen to be random in our experiments.
Heat equation To obtain the heat equation in (2.1) we choose \(A=H\) in (8.1) and \(\sigma =-1\). Details on the test setting are already given in Sect. 8.1.
For the heat equation, H given in (8.1), we can also verify the error estimates, see Fig. 7. In comparison to the skew-Hermitian case we do not observe a large time regime for which the error is of the asymptotic order m. As shown in Proposition 6 we do obtain an upper error bound using \({\mathrm {Err}}_{1}\) for the heat equation.
Similarly to the skew-Hermitian case, we can also apply the effective order quadrature according to Remark 5 to the Hermitian case. With \(\sigma =-1\) and \(T_m\in {\mathbb {R}}^{m\times m}\) in (6.10) we obtain
$$\begin{aligned} \rho (t)=-t\,\bigg ((T_m)_{m,m} + (T_m)_{m,m-1}\, \frac{(\mathrm{e}^{-t\,T_m}e_1)_{m-1}}{(\mathrm{e}^{-t\,T_m}e_1)_m} \bigg ). \end{aligned}$$
For computing the effective order we only consider time steps with \(\rho (t)>0\) for which \(\rho \) is indeed monotonically decreasing over the computed discrete time steps. This restriction is compatible with our assumptions in Sect. 6.
A non-normal problem
For a more general case we consider a convection–diffusion equation (see [14, 36]):
$$\begin{aligned}&\partial _t u = \varDelta u - \tau _1 \partial _{x_1} u - \tau _2 \partial _{x_2} u,\nonumber \\&\quad \tau _1,\tau _2\in {\mathbb {R}},\quad u=u(t,x),~~t\ge 0,\quad x\in \varOmega =[0,1]^3,\nonumber \\&u(0,x)=v(x)\quad \text {for}\quad x\in \varOmega ,\quad u(t,x)=0\quad \text {for}\quad x\in \partial \varOmega . \end{aligned}$$
(8.4)
Following [14, 36] we use a central finite difference scheme to discretize the partial differential operator in (8.4). The grid is chosen uniformly with \((n+2)^3\) points and mesh width \(h=1/(n+1)\). The dimension N of the discrete operator is \(N=n^3\). Choosing \(n=15\) we obtain \(N=3375\). The discretized operator is given by
$$\begin{aligned} A&= I_{n\times n} \otimes (I_{n\times n} \otimes C_1) + ( B \otimes I_{n\times n} + I_{n\times n}\otimes C_2 ) \otimes I_{n\times n} \in {\mathbb {R}}^{N\times N},\quad \text {with} \nonumber \\ B&=\tfrac{1}{h^2} \text {tridiag}(1,-2,1)\in {\mathbb {R}}^{n\times n},\nonumber \\ C_i&=\tfrac{1}{h^2} \text {tridiag}(1+\mu _i,-2,1-\mu _i)\in {\mathbb {R}}^{n\times n},\quad i=1,2, \end{aligned}$$
(8.5)
and \(\mu _i=\tau _i\,(h/2)\). The spectrum of the non-normal matrix A in (8.5) (see [36]) satisfies
$$\begin{aligned} \text {spec}(A)&\subseteq \tfrac{1}{h^2}\,[-6-2\cos (\pi \,h)\text {Re}(\theta ),-6+2\cos (\pi \,h)\text {Re}(\theta )]\\&\times \tfrac{1}{h^2}\,\mathrm{i}\,[-2\cos (\pi \,h)\text {Im}(\theta ),2\cos (\pi \,h)\text {Im}(\theta )], \end{aligned}$$
with \(\theta = 1 + \sqrt{1-\mu _1^2} + \sqrt{1-\mu _2^2}\). Therefore, the eigenvalues are complex-valued if \(\mu _i>1\) for at least one i. The matrix A depends on the parameters \(\mu _i\), or equivalently \(\tau _i\), for which we consider two different cases,
$$\begin{aligned} \mu _1=0.9, \mu _2=1.1,\quad \text {with}\quad \text {spec}(h^2\,A) \subseteq [-9,-3] \times \mathrm{i}[-1,1], \end{aligned}$$
(8.6)
and
$$\begin{aligned} \mu _1=\mu _2=10,\quad \text {with}\quad \text {spec}(h^2\,A) \subseteq [-8,-4] \times \mathrm{i}[ -39,39]. \end{aligned}$$
(8.7)
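The matrix (8.5) is straightforward to assemble from Kronecker products. The sketch below uses a reduced grid size for illustration (the experiments use \(n=15\)) and the parameter choice (8.6):

```python
import numpy as np

def build_A(n, mu1, mu2):
    """Discretized convection-diffusion operator (8.5) on an n^3 grid."""
    h = 1.0 / (n + 1)
    I = np.eye(n)

    def tridiag(lo, d, up):
        """(1/h^2) tridiag(lo, d, up) in the notation of (8.5)."""
        return (np.diag(lo * np.ones(n - 1), -1) + np.diag(d * np.ones(n))
                + np.diag(up * np.ones(n - 1), 1)) / h**2

    B = tridiag(1.0, -2.0, 1.0)
    C1 = tridiag(1.0 + mu1, -2.0, 1.0 - mu1)
    C2 = tridiag(1.0 + mu2, -2.0, 1.0 - mu2)
    return (np.kron(I, np.kron(I, C1))
            + np.kron(np.kron(B, I) + np.kron(I, C2), I))

A = build_A(6, 0.9, 1.1)   # small n for illustration only
```

For this parameter choice the spectral inclusion (8.6) can be verified numerically on the small grid, since the bound holds for any n.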
In the following numerical experiments we apply the Krylov approximation to \(\mathrm{e}^{t\,A}v\) (\(\sigma =1\) in (2.1)) for different time steps t and starting vector \(v=(1,\ldots ,1)^*\in {\mathbb {R}}^{N}\) as in [36]. For non-normal A we use the Arnoldi method based on a modified Gram–Schmidt procedure (see [46, Algorithm 6.2]) to generate the Krylov subspace.
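A minimal sketch of the Arnoldi method with modified Gram–Schmidt (cf. [46, Algorithm 6.2]) and of the resulting Krylov approximation; for compactness it is applied here to a small non-normal tridiagonal matrix in the spirit of the blocks \(C_i\) in (8.5), without the \(1/h^2\) scaling, and `scipy.linalg.expm` serves as the reference solution.

```python
import numpy as np
from scipy.linalg import expm

def arnoldi(A, v, m):
    """Arnoldi with modified Gram-Schmidt: A V_m = V_{m+1} H, H of size (m+1) x m."""
    n = len(v)
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = v / np.linalg.norm(v)
    for j in range(m):
        w = A @ V[:, j]
        for i in range(j + 1):        # orthogonalize against v_0..v_j, one vector at a time
            H[i, j] = V[:, i] @ w
            w -= H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        V[:, j + 1] = w / H[j + 1, j]
    return V, H

# toy non-normal tridiagonal matrix (no breakdown handling in this sketch)
n, m, t, mu = 100, 25, 0.5, 1.1
A = (np.diag((1 + mu) * np.ones(n - 1), -1) - 2 * np.eye(n)
     + np.diag((1 - mu) * np.ones(n - 1), 1))
v = np.random.default_rng(3).standard_normal(n)
beta = np.linalg.norm(v)

V, H = arnoldi(A, v, m)
approx = beta * (V[:, :m] @ expm(t * H[:m, :m])[:, 0])   # S_m(t) v
exact = expm(t * A) @ v
err = np.linalg.norm(approx - exact)
```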
The error estimates \({\mathrm {Err}}_{a}\) and \({\mathrm {Err}}_{1}\) are compared to the exact error norm \(\Vert L_m(t) v\Vert _2\) in Fig. 8 for the case (8.6) and in Fig. 9 for the case (8.7). As shown in Theorem 1 the error estimate \({\mathrm {Err}}_{a}\) constitutes an upper error bound. The error estimate \({\mathrm {Err}}_{1}\) gives a good approximation of the error but has not been proven to give an upper bound in general.
Compared to (8.7), the spectrum for (8.6) is closer to the Hermitian case. The spectrum for (8.7), on the other hand, is dominated by large imaginary parts, similar to the skew-Hermitian case.
In Fig. 8 we observe effects similar to the Hermitian case. The asymptotic order m of the error does not hold for a large time regime, and the error estimate \({\mathrm {Err}}_{a}\) is not as sharp as in the skew-Hermitian case. On the other hand, in Fig. 9, we observe that the performance of the error estimates is closer to the skew-Hermitian case. Therefore, the upper error bound \({\mathrm {Err}}_{a}\) is sharp for a larger range of time steps. As already observed for the Hermitian and skew-Hermitian cases, the error of the Krylov approximation is closer to its asymptotic order m for smaller choices of m.