Computable upper error bounds for Krylov approximations to matrix exponentials and associated \({\varvec{\varphi }}\)-functions
Abstract
An a posteriori estimate for the error of a standard Krylov approximation to the matrix exponential is derived. The estimate is based on the defect (residual) of the Krylov approximation and is proven to constitute a rigorous upper bound on the error, in contrast to existing asymptotic approximations. It can be computed economically in the underlying Krylov space. In view of time-stepping applications, assuming that the given matrix is scaled by a time step, it is shown that the bound is asymptotically correct (with an order related to the dimension of the Krylov space) for the time step tending to zero. This means that the deviation of the error estimate from the true error tends to zero faster than the error itself. Furthermore, this result is extended to Krylov approximations of \(\varphi \)-functions and to improved versions of such approximations. The accuracy of the derived bounds is demonstrated by examples and compared with different variants known from the literature, which are also investigated more closely. Alternative error bounds are tested on examples, in particular a version based on the concept of effective order. For the case where the matrix exponential is used in time integration algorithms, a step size selection strategy is proposed and illustrated by experiments.
Keywords
Matrix exponential · Krylov approximation · A posteriori error estimation · Upper bound
Mathematics Subject Classification
15A16 · 65F15 · 65F60
1 Introduction
Overview of existing approaches and results The approximate evaluation of large matrix exponential functions is a topic which has been treated extensively in the numerical analysis literature, for basic reference see e.g., [18, 35]. A standard approach is to project the given matrix M to a low-dimensional Krylov space via the Arnoldi or Lanczos iteration, and to directly exponentiate the projected small matrix. A first mention of the Lanczos approach can be found in [42], where it is also recognized that for the method to perform satisfactorily, the time steps have to be controlled. However, the control mechanism from [42] is not very elaborate and is based on a series expansion of the error which is only valid in the asymptotic regime, see for instance [39]. For discretizations of parabolic problems, [17] uses an error estimator to choose the step size; this approach is improved in [48] and has been generalized in [34]. Notably, in the latter reference a strict error bound is used to estimate the time step instead of asymptotic techniques. It is argued in [34] that the strategy proposed there performs better than that of [33], which in turn performs better than that of [42].
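To make the projection approach concrete, the following is a minimal sketch (not the authors' implementation) of the standard Krylov approximation \(\mathrm{e}^{tM}v \approx \beta \, V_m \, \mathrm{e}^{t T_m} e_1\): an Arnoldi iteration builds an orthonormal basis \(V_m\) of the Krylov space and the small projected matrix \(T_m\), whose exponential is then computed directly. A dense real matrix and SciPy's `expm` for the small exponential are assumed for simplicity.

```python
import numpy as np
from scipy.linalg import expm

def arnoldi(M, v, m):
    """Arnoldi iteration (real case for simplicity): orthonormal basis V_m of
    the Krylov space K_m(M, v) and the projected upper-Hessenberg T_m = V* M V."""
    n = len(v)
    V = np.zeros((n, m + 1))
    T = np.zeros((m + 1, m))
    V[:, 0] = v / np.linalg.norm(v)
    for j in range(m):
        w = M @ V[:, j]
        for i in range(j + 1):              # modified Gram-Schmidt
            T[i, j] = np.vdot(V[:, i], w)
            w -= T[i, j] * V[:, i]
        T[j + 1, j] = np.linalg.norm(w)
        if T[j + 1, j] == 0.0:              # "lucky" breakdown: result exact
            return V[:, :j + 1], T[:j + 1, :j + 1]
        V[:, j + 1] = w / T[j + 1, j]
    return V[:, :m], T[:m, :m]

def krylov_expm(M, v, t, m):
    """Standard Krylov approximation  e^{tM} v ~ beta * V_m e^{t T_m} e_1."""
    beta = np.linalg.norm(v)
    V, T = arnoldi(M, v, m)
    return beta * V @ expm(t * T)[:, 0]
```

Only the small \(m \times m\) exponential is formed; the large matrix enters solely through matrix–vector products.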
A first systematic study of Krylov-based methods for the matrix exponential function was given in [45]. The error is analyzed theoretically, yielding both a priori and computable a posteriori estimates. The analysis there relies on approximation theory and yields a priori error bounds which are asymptotically optimal in the dimension of the Krylov subspace in important situations. The analysis moreover implies correction schemes to lift the convergence order which are cheap to compute based on the already available information. The error expansion also suggests a posteriori error estimators resorting to the leading error term. This approach relies on the assumption of a sufficiently rapid decay of the series representation of the error. A recent generalization of this work together with a more rigorous justification is given in [30]. For early studies of a priori error estimates see also [10, 12].
A thorough theoretical analysis of the error of Krylov methods for the exponential of a Hermitian or skew-Hermitian (anti-Hermitian) matrix was given in [24]. The analysis derives an asymptotic error expansion and shows superlinear error decay in the dimension m of the approximation subspace for sufficiently large m. These results are further improved in [4]. In [24], a posteriori error estimation is also discussed. This topic is furthermore addressed in [31]. There, the Krylov approximation method is interpreted as a Galerkin method, whence an error bound can be obtained from an error representation for this variational approximation. This yields a computable estimate via a quadrature approximation of the error integral involving the defect of the numerical approximation. The a priori error analysis reveals a step size restriction for the convergence of the method, which is less stringent when the subspace dimension is larger.
Further work in the direction of controlling the Lanczos process through information gained from the defect is given in [7]. The defect is a scalar multiple of the successive Krylov vector arising in the iteration and can be evaluated efficiently. If the error is approximated by a Galerkin approach, the resulting estimator corresponds to the difference of two Lanczos iterates. For the purpose of practical error estimation, in [7] it is seen as preferable to continue the original Krylov process. Some other defect-based upper bounds for the error of the matrix exponential are given in [30], including a closer analysis of the error estimate of [45]. These results still require some a priori information on the matrix spectrum.
Various improved methods for computing the matrix exponential function are given in the literature, for example restarted methods, deflated restarting methods, or quadrature-based restarting methods, see [1, 15, 16].
It has also been advocated in [49] to use preconditioning in the Lanczos method by a shifted inverse in order to obtain a good approximation of the leading invariant subspaces. The shift-and-invert approach (a specific choice for constructing a rational Krylov subspace) for the matrix exponential function was introduced earlier in [37]. However, the choice of the shift is critical for the success of this procedure. This strategy amounts to a transformation of the spectrum which grants a convergence speed independent of the norm of the given matrix. In [49], a posteriori error estimation based on the asymptotic expansion of the error is advocated as well. We note that our results do not immediately carry over to the shift-and-invert approach, see Remark 3.
Overview of present work In Sect. 2 we introduce the Krylov approximation and the integral representation of the approximation error in terms of its defect. In Sect. 3 we derive a new computable upper bound for the error using data available from the Krylov process at negligible additional computational effort (Theorem 1). This upper bound is cheap to evaluate and update on the fly during the Lanczos iteration. It is also asymptotically correct, i.e., for \( t \rightarrow 0 \) the error of the error estimator tends to zero faster than the error itself. In Sect. 4 these results are extended to the case where the Krylov approach is employed to approximate the \(\varphi \)-functions of matrices (generalizing the exponential function), see Theorem 2. In Sect. 5, improved approximations derived from a corrected Krylov process [45] are discussed, and corresponding error estimators are analyzed, including an asymptotically correct true upper bound on the error (Theorem 3). This approach can be used to increase the order, but it has the drawback of violating mass conservation. In Proposition 6 the error estimates are particularized to the Hermitian case. Another view on defect-based error estimation is presented in Sect. 6.
Section 7 is devoted to the practical application of the various error estimators for the control of the time steps t, including smaller substeps \( \varDelta t \) where indicated. In Sect. 8 we present numerical results for a finite difference discretization of the free Schrödinger equation, a Hubbard model of solar cells, the heat equation, and a convection–diffusion problem, illustrating our theoretical results. Additional practical aspects are also investigated: a priori estimates and the role of restarting are discussed in particular in the context of practical step size adaptation. Finally, we demonstrate the computational efficiency of our adaptive strategy.
2 Problem setting, Krylov approximation, and defect-based representation of the approximation error
Remark 1
We are assuming that the Arnoldi iteration is executed until the desired dimension m. Then, by construction, all lower diagonal entries of \( T_m \) are positive [46]. If this is not the case, i.e., if a breakdown occurs, it is known that this breakdown is "lucky", i.e., the approximation (2.6) below obtained in the step before the breakdown is already exact, see [45].
For the case of a Hermitian matrix A the Krylov subspace can be constructed using the Lanczos iteration, which is a special case of the Arnoldi iteration, resulting in a tridiagonal matrix \( T_m \in {\mathbb {R}}^{m \times m} \). In the following we discuss the general case and comment on the case of a Hermitian matrix A whenever appropriate.
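The reduction to a real symmetric tridiagonal \(T_m\) in the Hermitian case can be illustrated by the Lanczos three-term recurrence. The following is a minimal sketch under the assumption of exact arithmetic; in practice, reorthogonalization may be needed for larger m.

```python
import numpy as np

def lanczos(A, v, m):
    """Lanczos three-term recurrence for Hermitian A: returns an orthonormal
    basis V_m and the real symmetric tridiagonal T_m = V* A V (no
    reorthogonalization; a sketch assuming no breakdown occurs)."""
    n = len(v)
    V = np.zeros((n, m), dtype=complex)
    alpha = np.zeros(m)       # diagonal of T_m
    beta = np.zeros(m - 1)    # off-diagonal of T_m (positive by construction)
    V[:, 0] = v / np.linalg.norm(v)
    w = A @ V[:, 0]
    alpha[0] = np.vdot(V[:, 0], w).real
    w = w - alpha[0] * V[:, 0]
    for j in range(1, m):
        beta[j - 1] = np.linalg.norm(w)
        V[:, j] = w / beta[j - 1]
        w = A @ V[:, j]
        alpha[j] = np.vdot(V[:, j], w).real
        w = w - alpha[j] * V[:, j] - beta[j - 1] * V[:, j - 1]
    T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    return V, T
```

Note that only two basis vectors need to be kept in memory if \(T_m\) alone is required.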
Proposition 1
Proof
Remark 2
3 An upper error bound for the non-expansive case in (2.1)
Proposition 2
Proof
Error estimate and asymptotic correctness Now we apply Proposition 2 in the context of our Krylov approximation.
Theorem 1
Proof
Proposition 3
Proof
Remark 3
In [49, Section 4] a defect-based error formulation is given for the shift-and-invert Krylov approximation of the matrix exponential function. In contrast to the standard Krylov method, the defect is not of order \(m-1\) for \(t\rightarrow 0\) there. Hence, our new results do not directly apply to shift-and-invert Krylov approximations. A study of a posteriori error estimates for the shift-and-invert approach is a topic of future investigations.
4 Krylov approximation to \(\varphi \)-functions
Theorem 2
Proof
5 Corrected Krylov approximation for the exponential and \(\varphi \)-functions
Let us recall the well-known error representation given in [45].
Proposition 4
In [30] it is even shown that \({\mathrm {Err}}_{1}\) is an upper bound up to a factor depending on spectral properties of the matrix \(A\). For the case of Hermitian \(\sigma A\) we show \(\Vert L_m(t)v\Vert _2 \le {\mathrm {Err}}_{1}\) in Proposition 6 below.
In Remark 4 below we show that \({\mathrm {Err}}_{1}\) is also an asymptotically correct approximation for the error norm (in the sense of Proposition 3). Furthermore, the error estimate \({\mathrm {Err}}_{1}\) is computable at nearly no extra cost, see [45, Proposition 2.1].
Proposition 5
The following remark will be used later on.
Remark 4
We also obtain true upper bounds for the matrix exponential (\(p=0\)) and general \(\varphi \)-functions with \(p \ge 1\).
Theorem 3
Proof
If the error estimate (5.9b) is to be evaluated, the effort of the computation of \(\Vert Av_{m+1}\Vert _2\) is comparable to one additional step of the Krylov iteration.
As mentioned before, we can also show that for Hermitian \(\sigma A\) the estimate \({\mathrm {Err}}_{1}\) gives a true upper bound:
Proposition 6
Proof
6 Defect-based quadrature error estimates revisited
For a better understanding of the approximation (6.6) we consider the effective order of \(\delta _m(t)\) as a function of t. Let us denote \(f(t):=\delta _m(t)\) and assume \(f(t)>0\) in a sufficiently small interval (0, T]. For the Hermitian case this assumption is fulfilled for all \(t>0\), see Proposition 6.
Remark 5
Up to now we have referred to the effective order of the defect \(\delta _m(t)\). For \(t \rightarrow 0\) the effective order of the error is given by \(\rho (t)+1\).
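The effective order can be estimated numerically; the following hypothetical helper assumes the usual logarithmic-derivative definition \(\rho (t) = t\,f'(t)/f(t)\) (the paper's precise definition is (6.10)) and approximates it by a central difference of \(\log f\) in \(\log t\).

```python
import numpy as np

def effective_order(f, t, h=1e-6):
    """Estimate the effective order rho(t) = t f'(t)/f(t) of a positive
    function f, i.e. the local slope of f in log-log coordinates
    (an illustrative helper, assuming the logarithmic-derivative definition)."""
    return (np.log(f(t * (1 + h))) - np.log(f(t * (1 - h)))) \
        / np.log((1 + h) / (1 - h))
```

For a pure power \(f(t) = c\,t^q\) the effective order equals q for all t; away from the asymptotic regime it varies with t.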
7 The matrix exponential as a time integrator
For simplicity we assume the non-expansive case of (2.1) in this section.
In general a large time step t would necessitate a large m or a restart of the Krylov method. For higher-dimensional problems, memory requirements can limit the choice of m and make a restart necessary. Considering global computational cost, it may also be favorable to use a moderate value of m in combination with restarts. Even if increasing m permits a larger time step t, the increase in computational cost can lead to a decrease in total performance in some cases. This issue is particularly relevant for rather large choices of m, especially when computational cost scaling with \(m^2\) or worse becomes noticeable. We further discuss effects of computer arithmetic on the Krylov approximation of the matrix exponential in Sect. 8 without going into details.
For the matrix exponential seen as a time propagator, a simple restart is possible. The following procedure has been introduced in [47] and is recapitulated here to fix the notation.
The aim of the iteration (7.9) is to determine a step size \(\varDelta t_{j,\infty }\) with \(\mathrm {Err}^{[j,\infty ]}=\varDelta t_{j,\infty }\mathrm{tol}\), see (7.3). The convergence behavior of iteration (7.9) depends on the structure of the corresponding error estimate. The idea of the heuristic step size control is based on the asymptotic order of the error for \(\varDelta t\rightarrow 0\), which in (7.7) and (7.9) is assumed to be m, see (2.11). By substituting the asymptotic order m by the effective order \(\rho (\varDelta t)+1\), which is introduced in (6.10), the iteration (7.9) could be improved for a step size \(\varDelta t\) away from the asymptotic regime. In our practical examples this iteration does not seem to be sensitive with respect to the effective order of the error and converges in a small number of steps using the asymptotic order m.
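A fixed-point iteration of this kind can be sketched in a few lines. The following is an illustrative model of such a step size iteration, not the paper's exact formula (7.9): assuming the error behaves like \(\mathrm{Err}(\varDelta t) \approx C\,\varDelta t^{\,q}\) with order q, the step size satisfying \(\mathrm{Err}(\varDelta t) = \varDelta t\,\mathrm{tol}\) is approached by rescaling with the measured error.

```python
def adapt_step(err_fn, dt0, tol, order, n_iter=5):
    """Heuristic fixed-point iteration for a step size (a sketch in the spirit
    of the iteration described in the text, under the model assumption
    Err(dt) ~ C * dt**order): seek dt with Err(dt) = dt * tol."""
    dt = dt0
    for _ in range(n_iter):
        dt = dt * (dt * tol / err_fn(dt)) ** (1.0 / (order - 1))
    return dt

# toy error model with a hypothetical asymptotic order m = 10
C, m, tol = 2.0, 10, 1e-8
dt = adapt_step(lambda s: C * s**m, dt0=0.5, tol=tol, order=m)
```

For an exact power-law error the iteration converges in a single step; for real error estimates a few iterations suffice, consistent with the behavior reported in the text.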
For the following remarks on \(T_m\) we neglect the index j in \(T_m^{[j]}\) to simplify the notation. In the case of a Hermitian matrix \(A\) the matrix \(T_m\) is symmetric, tridiagonal and realvalued which allows cheap and robust computation of its eigenvalue decomposition. The eigenvalue decomposition of \(T_m\) is independent of the step size \(\varDelta t\) and allows cheap evaluation of \(\mathrm{e}^{\sigma \,\varDelta t\,T_m} e_1 \) or \(\varphi _1(\sigma \,\varDelta t\,T_m) e_1 \) and corresponding error estimates for multiple choices of \(\varDelta t\).
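The reuse of one eigendecomposition for several step sizes can be sketched as follows; `eigh_tridiagonal` from SciPy is assumed, taking the diagonal and off-diagonal of the real symmetric tridiagonal \(T_m\).

```python
import numpy as np
from scipy.linalg import eigh_tridiagonal

def expm_T_e1(alpha, beta, sigma, dts):
    """Evaluate e^{sigma*dt*T_m} e_1 for several step sizes dt, reusing a
    single eigendecomposition of the real symmetric tridiagonal T_m with
    diagonal alpha and off-diagonal beta (a sketch of the idea in the text)."""
    lam, Q = eigh_tridiagonal(alpha, beta)   # T_m = Q diag(lam) Q^T
    q1 = Q[0, :]                             # Q^T e_1, i.e. the first row of Q
    return [Q @ (np.exp(sigma * dt * lam) * q1) for dt in dts]
```

The eigendecomposition is computed once; each additional step size only costs one diagonal scaling and one small matrix–vector product.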
For a non-Hermitian matrix \(A\), computing \(\mathrm {Err}^{[j,l]}\) for multiple choices of l, and hence different step sizes \(\varDelta t_{j,l}\), only leads to slightly larger computational cost, which is usually negligible.
8 Numerical considerations and examples
We give an illustration of our theoretical results for two different skew-Hermitian problems in Sect. 8.1, a Hermitian problem in Sect. 8.2, and a non-normal problem in Sect. 8.3. We also compare the performance of different error estimates for practical step size control (Sect. 7) in Sect. 8.1. To show that our error estimate (3.6) is efficient in practice we also compare it with results delivered by the standard package Expokit [47] and a priori error estimates.
8.1 The skew-Hermitian case
For our tests we use different types of matrices.
For our tests we model 8 electrons at 8 sites (\(n_{\text {sites}}=8\)), with spin up and down possible at each site; this leads to 16 possible single-particle states. Such an electron distribution is referred to as half-filled in the literature. We further restrict our model by fixing the number of electrons with spin up and the number with spin down to \(n_{\text {sites}}/2\) each. This leads to \(n=\binom{8}{4}^2=4900\) occupation states, which constitute a discrete basis. For the numerical implementation of the basis we use 16-bit integers, where each bit describes a position which is occupied if the bit equals 1 and empty otherwise. The set of occupation states can be ordered by the value of these integers, which leads to a unique representation of the Hubbard Hamiltonian (8.2) by a matrix \(H\in {\mathbb {C}}^{n\times n}\). Such an implementation of the Hubbard Hamiltonian is also described in [29, Section 3].
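The enumeration of the occupation basis can be sketched directly with bit patterns; the helper name below is hypothetical, but the counting matches the setting described above.

```python
from math import comb
from itertools import combinations

def half_filled_patterns(n_sites):
    """All n_sites-bit integers with exactly n_sites//2 bits set;
    the 1-bits mark occupied positions."""
    pats = []
    for occ in combinations(range(n_sites), n_sites // 2):
        bits = 0
        for p in occ:
            bits |= 1 << p
        pats.append(bits)
    return sorted(pats)            # ordering by integer value fixes the basis

n_sites = 8
up = half_filled_patterns(n_sites)          # binomial(8, 4) = 70 patterns
basis = [(u, d) for u in up for d in up]    # (spin-up, spin-down) pairs
```

The 70 spin-up patterns combined with the 70 spin-down patterns yield the \(70^2 = 4900\) basis states, in the unique order induced by the integer values.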
A relevant application where the Hubbard Hamiltonian (8.2) is of importance is the simulation of oxide solar cells with the goal of finding candidates for new materials promising a gain in the efficiency of the solar cell, see [21]. The study of solar cells considers time-dependent electron hoppings \(v_{ij}=v_{ij}(t)\) to model time-dependent potentials which lead to Hamiltonian matrices H(t). The time-dependent Hamiltonian can be parameterized via \(\omega \). Time propagation of a linear, non-autonomous ODE system can be approximated by Magnus-type integrators, which are based on one or more evaluations of matrix exponentials applied to different starting vectors at several times t, see for instance [5, 6]. Our test setting for the Hubbard Hamiltonian with arbitrary \(\omega \) is then obtained by (2.1) with the matrix \(A=H_\omega \) as described above and \(\sigma =\mathrm{i}\).
Here in Sect. 8.1 we focus on the skew-Hermitian case; for tests on the Hermitian case see Sect. 8.2 below.
Verification of upper error bound In the following Figs. 1 and 2 we compare the error \(\Vert L_m(t)v\Vert _2\) with the error estimates \({\mathrm {Err}}_{1}\) and \({\mathrm {Err}}_{a}\). Figure 1 refers to the matrix (8.1) of the free Schrödinger problem and Fig. 2 to the Hubbard Hamiltonian (8.2) with \(\omega =0.123\). For both cases we show results with Krylov subspace dimensions \( m=10 \) and \( m=30 \), respectively.
We observe that the error estimate \({\mathrm {Err}}_{1}\) is a good approximation to the error, but it is not an upper bound in general. In contrast, \({\mathrm {Err}}_{a}\) is a proven upper error bound. Up to roundoff error, for \(m=10\) we observe the correct asymptotic behavior of \({\mathrm {Err}}_{a}\) and \({\mathrm {Err}}_{1}\). For larger choices of m the asymptotic regime starts at time steps for which the error is already close to roundoff precision. Therefore, for larger choices of m, the Krylov approximation, as a time integrator, cannot achieve its full order for typical time steps in double precision.
Illustration of defectbased quadrature error estimates from Section 6 We first illustrate the performance of the estimates based on Hermite quadrature according to (6.3) and improved Hermite quadrature according to (6.5) for the Hubbard model, see Fig. 5. Both estimates are asymptotically correct, whereas the improved quadrature (6.5) is slightly better for larger time steps t, with the drawback of one additional matrix–vector multiplication. (See Remark 7 below for cost efficiency of more expensive error estimates.)
Figure 6 refers to the generalized residual estimate (6.6), and estimates based on the effective order quadrature according to Remark 5, and the Hermite quadrature (6.3). For our test problems the assumptions from Sect. 6 on the defect and its effective order are satisfied for a significant range of values of t. We also observe that the inequalities (6.8) are satisfied. The effective order and Hermite quadrature estimates behave in an asymptotically correct way, while the generalized residual estimate leads to an upper error bound which is, however, not sharp for \(t\rightarrow 0\).
Corrected Krylov approximation and mass conservation We remark that error estimates for the corrected Krylov approximation usually require one additional matrix–vector multiplication, and applying a standard Krylov approximation of dimension \(m+1\) seems to be a more favorable choice in our approach to error estimation.
The Krylov approximation of the matrix exponential conserves the mass in the skew-Hermitian case, in contrast to the corrected Krylov approximation. Whether this is a real drawback of the corrected Krylov approximation depends on the emphasis placed on mass conservation. In the following examples we focus on the standard Krylov approximation, with some exceptions which serve for comparisons with the original Expokit code, which is based on the corrected Krylov approximation.
Krylov approximation of the matrix exponential in computer arithmetic It has been shown in [11, 13, 19] that a priori error estimates for the Krylov approximation of the matrix exponential remain valid when effects of computer arithmetic are taken into account. Such results imply that, in general, convergence of the Krylov approximation is not precluded in computer arithmetic and roundoff errors are not critical. In practice, roundoff errors may in some cases lead to a delay of convergence, which can make reorthogonalization relevant. Stability of the Krylov approximation has been discussed by many authors, see also [38], but is not treated here in detail. In the next paragraph we give an argument, following [13], that the a posteriori error estimates which are the topic of this work are robust with respect to roundoff errors.
We recall that the Krylov subspace constructed in computer arithmetic satisfies the Krylov identity (2.2) with a small perturbation, see also [40] for the Lanczos case and [8, 51] for the Arnoldi case, which can both be extended to complexvalued problems using results from [22]. Following results from [13] we conclude that a small perturbation of the Krylov identity leads to a small perturbation of the defect (residual) \(\delta _m(t)\) in (3.1a) and the integral representation of the error in (3.1b). Thus the error estimates given in Sect. 6 remain stable with respect to roundoff.
We further use that by construction the computed \(T_m\) is still upper Hessenberg with a positive lower diagonal and in the Lanczos case also realvalued and symmetric. Then following Proposition 6 in the Hermitian (Lanczos) case, the integral representation of the error in (3.1b) results in the upper error bound \({\mathrm {Err}}_{1}\), which is not critically affected by roundoff errors. For the upper bound \({\mathrm {Err}}_{a}\) we further assume that spectral properties of (3.4) still hold mutatis mutandis under a small perturbation, see [41] for such results for the Lanczos case, to obtain stability of this upper error bound also with roundoff.

Expokit and \(\hbox {Expokit}^{\star }\) The original Expokit code uses the corrected Krylov approximation with heuristic step size control and an error estimator which is based on the error expansion (5.1), see [47, Algorithm 3.2] for details. Since the standard Krylov approximation is not part of the Expokit package, we have slightly adapted the code and its error estimate such that the standard Krylov approximation is used. We refer to the adapted package as \(\hbox {Expokit}^{\star }\). With \(\hbox {Expokit}^{\star }\) our comparison can be drawn with the standard Krylov approximation which may in some cases be the method of choice as discussed above.

Step size based on \({\mathrm {Err}}_{a}\) In another test code the upper error bound \({\mathrm {Err}}_{a}\) from Theorem 1 is used. With \({\mathrm {Err}}_{a}\) we obtain proven upper bounds on the error and reliable step sizes (7.5).

By gen.res, eff.o.quad, and \({\mathrm {Err}}_{1}\) we refer to the generalized residual estimate (6.6), the effective order quadrature (6.9), and \({\mathrm {Err}}_{1}\), respectively. Because these error estimates cannot be inverted directly, we need to apply heuristic ideas for the step size control, see (7.7). In addition, we use the iteration (7.9) to improve step sizes. For the test problems we have solved, iteration (7.9) converges in less than 2 iterations for \(m=10\) and less than 5 iterations for \(m=30\). We simply choose \(N_j=5\) for our tests.

The a priori estimates (7.8), [24, Theorem 4] and [34, eq. (20)] are given in the corresponding references. Formula (7.8), taken from the Expokit code, directly provides a step size. In [34, eq. (20)] the computation of the step size is described. For the error estimate given in [24, Theorem 4] we apply a Newton iteration to determine an appropriate step size. For tests on the Hubbard model we use \(\lambda _{\text {max}} - \lambda _{\text {min}} = 27.4\) as suggested in the description of the Hubbard Hamiltonian.

Step size based on \({\mathrm {Err}}_{a}^+\) By \({\mathrm {Err}}_{a}^+\) we denote the upper error bound for the corrected Krylov approximation as given in Theorem 3 with \(p=0\). The corresponding step size is given by (7.6).

By i.H.quad we refer to the improved Hermite quadrature (6.5). Similarly to other quadrature error estimates we use heuristic step size control and iteration (7.9) to determine adequate step sizes.
Remark 6
In the Expokit code the step sizes are rounded to 2 digits in every step. Rounding the step size can lead to excessively large errors in some steps. This makes it necessary to include safety parameters in Expokit, which on the other hand slow down the performance of the code. It seems advisable to avoid any kind of rounding of step sizes.
The displayed step size t is the sum of \(N=10\) substeps computed by different versions of step size control, as described above
|  | Expokit | \(\text {Expokit}^{\star }\) | \({\mathrm {Err}}_{a}\) | gen.res | eff.o.quad | \({\mathrm {Err}}_{1}\) | (7.8) | [24, Th. 4] | [34, (20)] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| \(m=10\) |  |  |  |  |  |  |  |  |  |
| t | 0.9020 | 0.6850 | 0.8468 | 0.6568 | 0.8488 | 0.8489 | 0.1918 | 0.4918 | 0.6879 |
| N | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 |
| # m–v | 110 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| \(\Vert L_m^{\star } v\Vert _2/t\) | \(3.5\times 10^{-9}\) | \(2.9\times 10^{-9}\) | \(9.8\times 10^{-9}\) | \(1.0\times 10^{-9}\) | \(1.0\times 10^{-8}\) | \(1.0\times 10^{-8}\) | \(3.0\times 10^{-14}\) | \(7.5\times 10^{-11}\) | \(1.5\times 10^{-9}\) |
| \(m=30\) |  |  |  |  |  |  |  |  |  |
| t | 8.5700 | 8.2500 | 9.7248 | 9.0091 | 10.2127 | 10.2222 | 2.1131 | 8.2642 | 8.8111 |
| N | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 |
| # m–v | 310 | 300 | 300 | 300 | 300 | 300 | 300 | 300 | 300 |
| \(\Vert L_m^{\star } v\Vert _2/t\) | \(2.6\times 10^{-10}\) | \(2.9\times 10^{-10}\) | \(2.6\times 10^{-9}\) | \(3.5\times 10^{-10}\) | \(9.5\times 10^{-9}\) | \(9.7\times 10^{-9}\) | \(2.9\times 10^{-15}\) | \(3.4\times 10^{-11}\) | \(1.9\times 10^{-10}\) |
With a test setting similar to Table 1, we now compute up to a fixed time \(t=0.3\) and choose the number N of steps according to the step size control
| \({m=10}\) | Expokit | \(\text {Expokit}^{\star }\) | \({\mathrm {Err}}_{a}\) | gen.res | \({\mathrm {Err}}_{1}\) | (7.8) | [24, Th. 4] | [34, (20)] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| t | 0.3 | 0.3 | 0.3 | 0.3 | 0.3 | 0.3 | 0.3 | 0.3 |
| N | 2 | 2 | 1 | 1 | 1 | 2 | 1 | 1 |
| # m–v | 62 | 60 | 17 | 30 | 30 | 60 | 30 | 30 |
| \(\Vert L_m^{\star } v\Vert _2/t\) | \(8.4\times 10^{-15}\) | \(8.4\times 10^{-15}\) | \(1.0\times 10^{-9}\) | \(9.7\times 10^{-15}\) | \(9.7\times 10^{-15}\) | \(1.0\times 10^{-14}\) | \(9.7\times 10^{-15}\) | \(9.7\times 10^{-15}\) |
Apart from step size control, the upper error bound \({\mathrm {Err}}_{a}\) can be used on the fly to test if the dimension of the Krylov subspace is already sufficiently large to solve the problem in a single time step with the required accuracy. For our test problems this stopping criterion is applied to the \({\mathrm {Err}}_{a}\) estimate. We refer to Table 2, in which we observe the Krylov method with error estimate \({\mathrm {Err}}_{a}\) to stop after 17 steps instead of computing the full Krylov subspace of dimension 30. In comparison, the original Expokit package needs a total of 62 matrix–vector multiplications.
Remark 7
Error estimates for the corrected Krylov approximation or improved error estimates such as the improved Hermite quadrature (6.5) require additional matrix–vector multiplications. Instead of investing computational effort in improving the error estimate, one may as well increase the dimension of the standard Krylov subspace. For comparison we test the original Expokit code, the corrected Krylov approximation with error estimate \({\mathrm {Err}}_{a}^+\), and the improved Hermite quadrature (6.5), each with Krylov subspace dimension \(m-1\). Table 3 shows that a standard Krylov approximation with dimension m leads to better results, although all considered versions use the same number of matrix–vector multiplications. Since the reliability of error estimates such as \({\mathrm {Err}}_{a}\) has been demonstrated earlier, it appears that the additional cost to improve the error estimate is not justified.
All variants shown use exactly m matrix–vector multiplications
|  | Expokit | \({\mathrm {Err}}_{a}^+\) | i.H.quad | \({\mathrm {Err}}_{a}\) | eff.o.quad | \({\mathrm {Err}}_{1}\) |
| --- | --- | --- | --- | --- | --- | --- |
| \(m=10\) |  |  |  |  |  |  |
| t | 0.6620 | 0.7828 | 0.5863 | 0.8346 | 0.8366 | 0.8368 |
| N | 10 | 10 | 10 | 10 | 10 | 10 |
| # m–v | 100 | 100 | 100 | 100 | 100 | 100 |
| \(\Vert L_m^{\star } v\Vert _2/t\) | \(4.1\times 10^{-9}\) | \(8.8\times 10^{-9}\) | \(1.0\times 10^{-8}\) | \(9.8\times 10^{-9}\) | \(1.0\times 10^{-8}\) | \(1.0\times 10^{-8}\) |
| \(m=30\) |  |  |  |  |  |  |
| t | 8.1900 | 9.5763 | 9.6591 | 9.7482 | 10.2378 | 10.2473 |
| N | 10 | 10 | 10 | 10 | 10 | 10 |
| # m–v | 100 | 100 | 100 | 100 | 100 | 100 |
| \(\Vert L_m^{\star } v\Vert _2/t\) | \(3.6\times 10^{-10}\) | \(2.7\times 10^{-9}\) | \(9.2\times 10^{-9}\) | \(2.6\times 10^{-9}\) | \(9.5\times 10^{-9}\) | \(9.7\times 10^{-9}\) |
8.2 The Hermitian case
To obtain a more complete picture, we also briefly consider the case of a Hermitian matrix \(A=H\) with \(\sigma =1\) in (2.1). Such a model is typical of the discretization of a parabolic PDE. Thus, the result may depend on the regularity of the initial data, which is chosen to be random in our experiments.
Heat equation To obtain the heat equation in (2.1) we choose \(A=H\) in (8.1) and \(\sigma =1\). Details on the test setting are already given in Sect. 8.1.
For the heat equation, with H given in (8.1), we can also verify the error estimates, see Fig. 7. In comparison to the skew-Hermitian case we do not observe a large time regime for which the error is of the asymptotic order m. As shown in Proposition 6 we do obtain an upper error bound using \({\mathrm {Err}}_{1}\) for the heat equation.
8.3 A non-normal problem
The error estimates \({\mathrm {Err}}_{a}\) and \({\mathrm {Err}}_{1}\) are compared to the exact error norm \(\Vert L_m(t) v\Vert _2\) in Fig. 8 for the case (8.6) and in Fig. 9 for the case (8.7). As shown in Theorem 1 the error estimate \({\mathrm {Err}}_{a}\) constitutes an upper error bound. The error estimate \({\mathrm {Err}}_{1}\) gives a good approximation of the error but has not been proven to give an upper bound in general.
Compared to (8.7), the spectrum for (8.6) is closer to the Hermitian case. The spectrum for (8.7), on the other hand, is dominated by large imaginary parts, similarly to the skew-Hermitian case.
9 Summary and outlook
We have studied a new reliable error estimate \({\mathrm {Err}}_{a}\) for Krylov approximations to the matrix exponential and \( \varphi \)-functions. This error estimate constitutes an upper bound on the error, and it can be computed on the fly at nearly no additional cost. The Krylov process can be stopped as soon as the error estimate satisfies a given tolerance. \({\mathrm {Err}}_{a}\) is asymptotically correct for \( t \rightarrow 0 \) and very tight in the asymptotic regime. Our numerical experiments illustrate that the asymptotic regime is more relevant for the skew-Hermitian case (compared to the Hermitian case) and for smaller choices of m and tolerances. The non-normal examples appear to lie between the skew-Hermitian and Hermitian cases.
In our numerical experiments the defect (residual) is seen to behave nicely close to the asymptotic regime and the generalized residual estimate is observed to constitute an upper bound. The generalized residual estimate can be tightened by applying an effective order quadrature.
For the Hermitian case we have shown that the error estimate \({\mathrm {Err}}_{1}\) constitutes an upper bound and, compared to other error estimates, seems to be the most appropriate choice for Hermitian problems.
Step size control for a simple restarted scheme is an important application. The upper error bound \({\mathrm {Err}}_{a}\) is an appropriate tool for this task, since the optimal step size for a given tolerance can be computed directly. This is not the case for other error estimates for the Krylov approximation, which usually employ heuristic schemes to compute optimal step sizes in the restarting approach. We have shown that the step size can be cheaply improved by using a heuristic step size approach in an iterative manner. Also the use of a priori bounds is not optimal in most cases.
Footnotes
1. In this case the matrix A is usually named H (Hamiltonian).
2. Here, \( {e_m = (0,\ldots ,0,1)^*\in {\mathbb {C}}^m} \), and in the sequel we also denote \( {e_1 = (1,0,\ldots ,0)^*\in {\mathbb {C}}^m} \).
3. See also Sect. 6 below.
4. In the sequel, the argument of \( \delta _m(\cdot ) \) is again denoted by t instead of s.
5. In the setting of [3] (higher-order splitting methods), such an improved error estimate was not taken into account, since it cannot be evaluated with reasonable effort in that context.
Acknowledgements
Open access funding provided by Austrian Science Fund (FWF). This work was supported by the Doctoral College TUD, Technische Universität Wien, and by the Austrian Science Fund (FWF) under grant P 30819-N32.
References
1. Afanasjew, M., Eiermann, M., Ernst, O., Güttel, S.: Implementation of a restarted Krylov subspace method for the evaluation of matrix functions. Linear Algebra Appl. 429(10), 2293–2314 (2008)
2. Al-Mohy, A.H., Higham, N.J.: Computing the action of the matrix exponential, with an application to exponential integrators. SIAM J. Sci. Comput. 33(2), 488–511 (2011)
3. Auzinger, W., Koch, O., Thalhammer, M.: Defect-based local error estimators for splitting methods, with application to Schrödinger equations, Part II: higher-order methods for linear problems. J. Comput. Appl. Math. 255, 384–403 (2013)
4. Beckermann, B., Reichel, L.: Error estimation and evaluation of matrix functions via the Faber transform. SIAM J. Numer. Anal. 47, 3849–3883 (2009)
5. Blanes, S., Moan, P.C.: Fourth- and sixth-order commutator-free Magnus integrators for linear and nonlinear dynamical systems. Appl. Numer. Math. 56, 1519–1537 (2005)
6. Blanes, S., Casas, F., Oteo, J.A., Ros, J.: The Magnus expansion and some of its applications. Phys. Rep. 470, 151–238 (2008)
7. Botchev, M., Grimm, V., Hochbruck, M.: Residual, restarting and Richardson iteration for the matrix exponential. SIAM J. Sci. Comput. 35, A1376–A1397 (2013)
8. Braconnier, T., Langlois, P., Rioual, J.: The influence of orthogonality on the Arnoldi method. Linear Algebra Appl. 309(1), 307–323 (2000)
9. Celledoni, E., Moret, I.: A Krylov projection method for systems of ODEs. Appl. Numer. Math. 24, 365–378 (1997)
10. Druskin, V., Knizhnerman, L.: Two polynomial methods of calculating functions of symmetric matrices. Zh. Vychisl. Mat. Mat. Fiz. 29(6), 112–121 (1989)
11. Druskin, V., Knizhnerman, L.: Error bounds in the simple Lanczos procedure for computing functions of symmetric matrices and eigenvalues. Comput. Math. Math. Phys. 31(7), 20–30 (1992)
12. Druskin, V., Knizhnerman, L.: Krylov subspace approximation of eigenpairs and matrix functions in exact and computer arithmetic. Numer. Linear Algebra Appl. 2(3), 205–217 (1995)
13. Druskin, V., Greenbaum, A., Knizhnerman, L.: Using nonorthogonal Lanczos vectors in the computation of matrix functions. SIAM J. Sci. Comput. 19, 38–54 (1998)
14. Eiermann, M., Ernst, O.: A restarted Krylov subspace method for the evaluation of matrix functions. SIAM J. Numer. Anal. 44, 2481–2504 (2006)
15. Eiermann, M., Ernst, O., Güttel, S.: Deflated restarting for matrix functions. SIAM J. Matrix Anal. Appl. 32(2), 621–641 (2011)
16. Frommer, A., Güttel, S., Schweitzer, M.: Efficient and stable Arnoldi restarts for matrix functions based on quadrature. SIAM J. Matrix Anal. Appl. 35(2), 661–683 (2014)
17. Gallopoulos, E., Saad, Y.: Efficient solution of parabolic equations by Krylov approximation methods. SIAM J. Sci. Statist. Comput. 13, 1236–1264 (1992)
18. Golub, G.H., Van Loan, C.F.: Matrix Computations, 2nd edn. The Johns Hopkins University Press, Baltimore (1989)
19. Greenbaum, A.: Behavior of slightly perturbed Lanczos and conjugate-gradient recurrences. Linear Algebra Appl. 113, 7–63 (1989)
20. Gustafsson, K.: Control theoretic techniques for stepsize selection in explicit Runge–Kutta methods. ACM Trans. Math. Softw. 17, 533–554 (1991)
21. Held, K.: Electronic structure calculations using dynamical mean field theory. Adv. Phys. 56, 829–926 (2007)
22. Higham, N.: Accuracy and Stability of Numerical Algorithms. Society for Industrial and Applied Mathematics, Philadelphia (2002)
23. Higham, N.: Functions of Matrices: Theory and Computation. SIAM, Philadelphia (2008)
24. Hochbruck, M., Lubich, C.: On Krylov subspace approximations to the matrix exponential operator. SIAM J. Numer. Anal. 34, 1911–1925 (1997)
25. Hochbruck, M., Lubich, C., Selhofer, H.: Exponential integrators for large systems of differential equations. SIAM J. Sci. Comput. 19, 1552–1574 (1998)
26. Hochbruck, M., Ostermann, A.: Exponential integrators. Acta Numer. 19, 209–286 (2010)
27. Horn, R.A., Johnson, C.R.: Matrix Analysis. Cambridge University Press, Cambridge (1985)
28. Hubbard, J.: Electron correlations in narrow energy bands. Proc. R. Soc. Lond. A 276, 238–257 (1963)
29. Jafari, S.: Introduction to Hubbard model and exact diagonalization. Iran. J. Phys. Res. 8 (2008). Available from http://ijpr.iut.ac.ir/article1279en.pdf
30. Jia, Z., Lv, H.: A posteriori error estimates of Krylov subspace approximations to matrix functions. Numer. Algorithms 69, 1–28 (2015)
31. Lubich, C.: From Quantum to Classical Molecular Dynamics: Reduced Models and Numerical Analysis. Zurich Lectures in Advanced Mathematics. European Mathematical Society, Zurich (2008)
32. Mahan, G.D.: Many-Particle Physics. Physics of Solids and Liquids, 2nd edn. Plenum Press, New York (1993)
33. Mohankumar, N., Auerbach, S.M.: On time-step bounds in unitary quantum evolution using the Lanczos method. Comput. Phys. Commun. 175, 473–481 (2006)
34. Mohankumar, N., Carrington, T.: A new approach for determining the time step when propagating with the Lanczos algorithm. Comput. Phys. Commun. 181, 1859–1861 (2010)
35. Moler, C., Van Loan, C.F.: Nineteen dubious ways to compute the exponential of a matrix, twenty-five years later. SIAM Rev. 45(1), 3–49 (2003)
36. Moret, I., Novati, P.: An interpolatory approximation of the matrix exponential based on Faber polynomials. J. Comput. Appl. Math. 131(1), 361–380 (2001)
37. Moret, I., Novati, P.: RD-rational approximations of matrix exponentials. BIT 44, 595–615 (2004)
38. Musco, C., Musco, C., Sidford, A.: Stability of the Lanczos method for matrix function approximation. In: Czumaj, A. (ed.) SODA '18: Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1605–1624 (2018). Preprint available from arXiv:1708.07788
39. Niesen, J., Wright, W.: Algorithm 919: a Krylov subspace algorithm for evaluating the \(\varphi \)-functions appearing in exponential integrators. ACM Trans. Math. Softw. 38, 22 (2012)
40. Paige, C.: Error analysis of the Lanczos algorithm for tridiagonalizing a symmetric matrix. IMA J. Appl. Math. 18(3), 341–349 (1976)
41. Paige, C.: Accuracy and effectiveness of the Lanczos algorithm for the symmetric eigenproblem. Linear Algebra Appl. 34, 235–258 (1980)
42. Park, T.J., Light, J.C.: Unitary quantum time evolution by iterative Lanczos reduction. J. Chem. Phys. 85, 5870–5876 (1986)
43. Parlett, B.: The Symmetric Eigenvalue Problem. Society for Industrial and Applied Mathematics, Philadelphia (1998)
44. Pavarini, E., Koch, E., van den Brink, J., Sawatzky, G.: Quantum Materials: Experiments and Theory. Modeling and Simulation, vol. 6. Forschungszentrum Jülich, Jülich (2016)
45. Saad, Y.: Analysis of some Krylov subspace approximations to the matrix exponential operator. SIAM J. Numer. Anal. 29(1), 209–228 (1992)
46. Saad, Y.: Iterative Methods for Sparse Linear Systems, 2nd edn. SIAM, Philadelphia (2003)
47. Sidje, R.: Expokit: a software package for computing matrix exponentials. ACM Trans. Math. Softw. 24(1), 130–156 (1998)
48. Stewart, D.E., Leyk, T.S.: Error estimates for Krylov subspace approximations of matrix exponentials. J. Comput. Appl. Math. 72, 359–369 (1996)
49. van den Eshof, J., Hochbruck, M.: Preconditioning Lanczos approximations to the matrix exponential. SIAM J. Sci. Comput. 27, 1438–1457 (2006)
50. Wang, H., Ye, Q.: Error bounds for the Krylov subspace methods for computations of matrix exponentials. SIAM J. Matrix Anal. Appl. 38(1), 155–187 (2017)
51. Zemke, J.: Krylov Subspace Methods in Finite Precision: A Unified Approach. Ph.D. thesis, Technische Universität Hamburg (2003)
Copyright information
Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.