1 Introduction

The low-rank alternating direction implicit (LR-ADI) [42, 54] method is one of the state-of-the-art methods for the numerical solution of large-scale Lyapunov equations [19, 65]. This linear matrix equation arises in many applications: control and system theory [34, 66], especially certain model reduction techniques for dynamical systems [3, 15], but also the discretization of certain partial differential equations (PDEs) [71], and many more.

We consider Lyapunov equations of the form

$$AXE^{\mathsf{T}}+EXA^{\mathsf{T}}+BB^{\mathsf{T}}=0,$$
(1)

where \(A, E\in \mathbb {R}^{n\times n}\) and \(B\in \mathbb {R}^{n\times q}\), \(q\ll n\). Moreover, E is assumed to be symmetric positive definite (SPD) and the matrix pencil (A,E) asymptotically stable, i.e., its spectrum is contained in the open left half plane \(\mathbb {C}_{-}\). This guarantees that a unique solution X exists and that it is symmetric positive semidefinite [53].

A special case of (1) is attained whenever E = I, namely the equation of interest is

$$AX+XA^{\mathsf{T}}+BB^{\mathsf{T}}=0.$$
(2)

Oftentimes the coefficient matrix E possesses a structured sparsity pattern. For instance, it is (block) diagonal when the matrices stem from a finite element discretization that uses mass-lumping. In this case, we can easily transform (1) and obtain an equation of the form (2). This can, for example, be achieved by simply pre- and post-multiplying (1) by \(E^{-\frac {1}{2}}\), which also preserves a possible symmetry of A. For the sake of simplicity, we thus focus on (2) in the following.
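For a diagonal E, the transformation above can be sketched in a few lines. The snippet below uses small synthetic test data (a stable nonsymmetric A with negative definite symmetric part, and a diagonal, lumped-mass-like SPD E); SciPy's `solve_continuous_lyapunov` merely serves as a dense Lyapunov solver for the transformed equation of the form (2).

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

rng = np.random.default_rng(0)
n, q = 60, 2

G = rng.standard_normal((n, n))
A = 0.5 * (G - G.T) - 2.0 * np.eye(n)      # A + A^T = -4I < 0: (A, E) stable
E = np.diag(rng.uniform(1.0, 2.0, n))      # diagonal (lumped-mass-like) SPD E
B = rng.standard_normal((n, q))

# pre- and post-multiply by E^{-1/2}; trivial here since E is diagonal
Einv_half = np.diag(1.0 / np.sqrt(np.diag(E)))
A2 = Einv_half @ A @ Einv_half
B2 = Einv_half @ B

# solve the transformed equation of the form (2): A2 Y + Y A2^T + B2 B2^T = 0
Y = solve_continuous_lyapunov(A2, -B2 @ B2.T)
X = Einv_half @ Y @ Einv_half              # recover the solution of (1)

res = A @ X @ E.T + E @ X @ A.T + B @ B.T  # residual of the original equation
print(np.linalg.norm(res) / np.linalg.norm(B @ B.T))
```

The recovered X solves the original generalized equation (1) up to the accuracy of the dense solver.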

In case of very large problem dimensions, the solution X cannot be stored since this matrix is, in general, dense. However, it is well known that its singular values quickly decay to zero under suitable assumptions, see, e.g., [5, 13, 33, 55], so that accurate low-rank approximations \(ZZ^{\mathsf{T}}\approx X\), \(Z\in \mathbb {R}^{n\times t}\), \(t\ll n\), can be constructed. The efficient computation of the low-rank factor Z is the task of LR-ADI and of all other low-rank methods (see, e.g., the survey papers [19, 65] for further details on different low-rank methods for linear matrix equations).
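The singular value decay is easy to observe numerically. The sketch below (illustrative data: a 1D Laplacian-type stable matrix and a single random right-hand side column) solves a small Lyapunov equation densely and inspects the spectrum of X.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

n = 200
# 1D Laplacian: symmetric negative definite, a typical PDE-type test matrix
A = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
rng = np.random.default_rng(1)
B = rng.standard_normal((n, 1))

# dense solution of AX + XA^T + BB^T = 0
X = solve_continuous_lyapunov(A, -B @ B.T)
s = np.linalg.svd(X, compute_uv=False)
print(s[20] / s[0])   # rapid decay: a low-rank factor Z captures X accurately
```

For this symmetric example the decay is very fast, so a factor Z with a few dozen columns already approximates X to high accuracy.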

It is well known that the convergence rate of the LR-ADI method is strictly connected to the selection of some parameters \({\{p_{i}\}}_{i=1,\ldots ,j}\subset \mathbb {C}_{-}\) called shifts. The computation of effective shifts is a highly non-trivial task and has been a rather active research topic over the last decades. Many strategies are available in the literature and these can be divided into two categories: offline routines [54, 60, 73], where the shifts are computed a priori, before LR-ADI starts, and then, potentially, cyclically reused, and online schemes [12, 37], where the shifts are computed on the fly within the iterative procedure. The name shifts for the values pj comes from the fact that in each LR-ADI iteration we need to solve a shifted linear system with a coefficient matrix of the form \(A + p_{j}E\) or \(A + p_{j}I\) in case of (1) or (2), respectively. Notice that since (A,E) (or A in case of (2)) is asymptotically stable and \(\{p_{j}\}\subset \mathbb {C}_{-}\), all the linear systems involved in the LR-ADI scheme are well defined.

In Algorithm 1, we report an implementation of the LR-ADI scheme for the solution of (1). Notice that Algorithm 1 is designed to drastically reduce the amount of complex arithmetic calculations. Indeed, even though A and B in (2) are real, the shifts pj are often complex if A is nonsymmetric, so that complex arithmetic may occur (see [11], [36, Chapter 4], and references therein for details and derivations).

One of the most computationally expensive steps of Algorithm 1 is the solution of the shifted linear systems with q right-hand sides in line 3. This task has to be carried out at each LR-ADI iteration. In this contribution, we propose to employ state-of-the-art block Krylov subspace methods for this task. In particular, for (2), we illustrate how to efficiently reuse the approximation space employed at the jth LR-ADI iteration and utilize it also in the next one. To this end, it is crucial that the right-hand side of the linear system we need to solve at the (j +1)-st iteration can be represented in terms of the basis of the subspace employed in the previous iteration. This simple but critical observation lets us design a novel, efficient procedure that can lead to noticeable savings in the running time for the solution of (2). Indeed, all the LR-ADI steps can be completely merged into the Krylov routine so that the LR-ADI iteration is only implicitly performed. Moreover, the LR-ADI shift computation can also be incorporated into the framework proposed in this paper.

Algorithm 1

LR-ADI for Lyapunov equations
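To fix ideas, the following is a minimal dense-matrix sketch of the LR-ADI iteration for (2), restricted to real negative shifts (so none of the complex-arithmetic-avoiding machinery of Algorithm 1 is needed). The test matrix, the logarithmically spaced shifts, and the tolerances are purely illustrative choices, not the shift strategies discussed later in the paper.

```python
import numpy as np

def lr_adi(A, B, shifts, tol=1e-8, maxit=200):
    """Basic LR-ADI for AX + XA^T + BB^T = 0 with real negative shifts.

    Returns Z with Z @ Z.T ~= X; the shifts are cycled if more
    iterations than shifts are needed.
    """
    n = A.shape[0]
    W = B.copy()                       # residual factor: R_j = W_j W_j^T
    Z = np.zeros((n, 0))
    nrm0 = np.linalg.norm(B.T @ B)
    for j in range(maxit):
        p = shifts[j % len(shifts)]
        S = np.linalg.solve(A + p * np.eye(n), W)   # shifted solve (line 3)
        Z = np.hstack([Z, np.sqrt(-2.0 * p) * S])
        W = W - 2.0 * p * S                         # W_j = W_{j-1} - 2 Re(p_j) S_j
        if np.linalg.norm(W.T @ W) <= tol * nrm0:   # cheap Lyapunov residual
            break
    return Z

n = 100
A = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)   # stable, symmetric
rng = np.random.default_rng(2)
B = rng.standard_normal((n, 2))
# log-spaced real shifts covering the spectral interval of A (illustrative)
shifts = -np.logspace(np.log10(4.0), np.log10(1e-3), 16)

Z = lr_adi(A, B, shifts)
res = A @ Z @ Z.T + Z @ Z.T @ A.T + B @ B.T
print(np.linalg.norm(res) / np.linalg.norm(B @ B.T))
```

Note that the stopping criterion only touches the small matrix \(W^{\mathsf{T}}W\), exploiting the fact that the Lyapunov residual of LR-ADI is exactly \(W_jW_j^{\mathsf{T}}\).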

The following is a synopsis of the paper. Section 2 is devoted to recalling the general (block) Krylov subspace framework for shifted linear systems. In particular, some details about the extended Krylov subspace method presented in [64] are given in Section 2.1. In Section 3, we present the main contribution of the paper and we show how to fully merge the LR-ADI iteration into the projection method adopted for the linear system solution. The selection of effective shifts is crucial for attaining a fast convergence in terms of number of LR-ADI iterations, and numerous strategies have been proposed in the literature to accomplish this task (see, e.g., [12, 16, 37, 54, 58, 60, 71, 73]). In Section 4, we illustrate how many of these routines can be integrated into our novel framework with no additional cost. The potential of our strategy is illustrated in Section 5, where several numerical results are reported. We close the paper with our conclusions in Section 6.

Throughout the paper, we adopt the following notation. The matrix inner product is defined as \(\langle X, Y\rangle _{F}:= \text{trace}(Y^{\mathsf{T}}X)\) so that the induced norm is \(\|X\|_{F}^{2}= \langle X, X\rangle _{F}\). The Kronecker product is denoted by ⊗ whereas In and On×m denote the identity matrix of order n and the n × m zero matrix, respectively. Only one subscript is used for a square zero matrix, i.e., On×n = On, and the subscript is omitted whenever the dimension of I and O is clear from the context. Moreover, ei is the ith basis vector of the canonical basis of \(\mathbb {R}^{n}\). The brackets [⋅] are used to concatenate matrices of conformal dimensions. In particular, a MATLAB-like notation is adopted and [M,N] denotes the matrix obtained by putting M on the left of N whereas [M;N] the one obtained by putting M on top of N, i.e., \([M;N] = [M^{\mathsf{T}},N^{\mathsf{T}}]^{\mathsf{T}}\). If \(w\in \mathbb {R}^{n}\), diag(w) denotes the n × n diagonal matrix whose ith diagonal entry corresponds to the ith component of w. Given \(X\in \mathbb {C}^{n\times m}\), we write X = Re(X) + ı Im(X), where Re(X) and Im(X) are its real and imaginary parts, respectively, and ı is the imaginary unit. The complex conjugate of X is denoted by \(\overline {X}= \text {Re}(X)-\imath \text {Im}(X)\).

2 Block Krylov methods for shifted linear systems

The literature about the numerical solution of shifted linear systems by Krylov subspace methods is rather vast. Indeed, sequences of shifted linear systems arise in many applications belonging to different research areas like control theory [23, 41], wave propagation problems [8], mechanical systems [27], and quantum chromodynamics [32].

This algebraic problem is trickier than it looks and many researchers have contributed to its understanding providing important insights on its properties and designing efficient, robust algorithms for its solution. Here is an incomplete list of contributions on numerical schemes for sequences of shifted linear systems and their analysis [7, 28, 29, 48, 62, 67, 68, 70].

In this section, we consider sequences of shifted linear systems of the form

$$(A+p_{j}I)S_{j}=W,\quad W\in\mathbb{R}^{n\times q},$$
(3)

where the right-hand side W does not depend on the index j, even though, in line 3 of Algorithm 1, Wj−1 does change at every LR-ADI iteration. In Section 3, we show how to adapt the machinery presented here to the case of linear systems of the form \((A + p_{j}I)S_{j} = W_{j-1}\), arising within the LR-ADI scheme.

Any Krylov routine for (3) computes a numerical solution of the form \(S_{m}^{(j)}=S_{0}+V_{m}Y_{m}^{(j)}\approx S_{j}\), \(V_{m}=[\mathcal {V}_{1},\ldots ,\mathcal {V}_{m}]\in \mathbb {R}^{n\times m\ell q}\), \(\ell \geqslant 1\), \(\mathcal {V}_{i}\in \mathbb {R}^{n\times \ell q}\), i =1,…,m, \(Y_{m}^{(j)}\in \mathbb {C}^{m\ell q\times q}\), where the orthonormal columns of Vm span a suitable subspace \(\mathcal {K}_{m}\), namely, \(\text{Range} (V_{m})=\mathcal {K}_{m}\), S0 is an initial guess, and the matrix \(Y_{m}^{(j)}\) can be computed by imposing different conditions. In particular, \(Y_{m}^{(j)}\) is often computed by either imposing a Galerkin condition on the residual or minimizing the residual norm. For the sake of simplicity, we consider S0 = O in the following.

One of the most common choices for the approximation space \(\mathcal {K}_{m}\) is the block Krylov subspace

$$\mathbf{K}_{m}^{\square}(A,W)=\text{Range} ([W,AW,\ldots,A^{m-1}W]).$$
(4)

See, e.g., [30, 50, 61] and the references therein for further details on the block polynomial Krylov subspace \(\mathbf {K}_{m}^{\square }(A,W)\) and related methods.

However, Simoncini showed in [64] that the extended Krylov subspace [24]

$$\mathbf{E}\mathbf{K}_{m}^{\square}(A,W)=\text{Range} ([W,A^{-1}W,AW,A^{-2}W,\ldots,A^{m-1}W,A^{-m}W]),$$
(5)

can be a powerful alternative for the solution of (3) in many cases, for instance, when A is large and real while the pj’s are complex (see also Section 2.1).

The basis Vm of both the polynomial and extended Krylov subspace can be constructed by means of the (extended) Arnoldi process and the following Arnoldi relation is fulfilled

$$AV_{m}=V_{m}T_{m}+\mathcal{V}_{m+1}E_{m+1}^{\mathsf{T}}\underline{T}_{m},$$
(6)

where \(\underline {T}_{m}=V_{m+1}^{\mathsf {T}}AV_{m}\in \mathbb {R}^{(m+1)\ell q\times m\ell q}\), Tm is its principal square submatrix, and \(E_{m+1}=e_{m+1}\otimes I_{\ell q}\) (see, e.g., [56, 63]).
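The Arnoldi relation (6) is easy to verify numerically. The following sketch implements a plain block Arnoldi process for the polynomial space (4) (i.e., ℓ = 1; the extended case is analogous but also requires solves with A) on random illustrative data, and checks \(AV_m = V_{m+1}\underline{T}_m\).

```python
import numpy as np

def block_arnoldi(A, W, m):
    """Block Arnoldi: returns V_{m+1} (orthonormal columns) and underline{T}_m
    such that A V_m = V_{m+1} underline{T}_m (no breakdown assumed)."""
    n, q = W.shape
    V0, _ = np.linalg.qr(W)
    blocks = [V0]
    T = np.zeros(((m + 1) * q, m * q))
    for k in range(m):
        U = A @ blocks[k]
        for i, Vi in enumerate(blocks):          # block modified Gram-Schmidt
            H = Vi.T @ U
            T[i*q:(i+1)*q, k*q:(k+1)*q] = H
            U = U - Vi @ H
        Q, R = np.linalg.qr(U)                   # normalize the new block
        T[(k+1)*q:(k+2)*q, k*q:(k+1)*q] = R
        blocks.append(Q)
    return np.hstack(blocks), T

rng = np.random.default_rng(3)
n, q, m = 60, 2, 5
A = rng.standard_normal((n, n))
W = rng.standard_normal((n, q))

V, T = block_arnoldi(A, W, m)
Vm = V[:, :m*q]
# Arnoldi relation (6): A V_m = V_m T_m + V_{m+1} E_{m+1}^T underline{T}_m
print(np.linalg.norm(A @ Vm - V @ T))
```

Here \(V_mT_m + \mathcal{V}_{m+1}E_{m+1}^{\mathsf{T}}\underline{T}_m\) is written compactly as \(V_{m+1}\underline{T}_m\), which is the form the code checks.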

The Arnoldi relation (6) is one of the most crucial tools in the solution of (3) by Krylov methods. Indeed, it can be used to show the fundamental shift-invariance property of the Krylov subspaces (4) and (5), and the following relation holds true

$$(A+p_{j}I_{n})V_{m}=V_{m}(T_{m}+p_{j}I_{m\ell q})+\mathcal{V}_{m+1}E_{m+1}^{\mathsf{T}}\underline{T}_{m}.$$
(7)

See, e.g., [62, Equation (2.1)], [64, Equation (3.1)].

Equation (7) says that we can compute only one approximation space for solving (3). In particular, the space constructed using A, i.e., \(\mathbf {K}_{m}^{\square }(A,W)\) or \(\mathbf {EK}_{m}^{\square }(A,W)\), can be employed, by possibly being expanded, to solve all the shifted linear systems in the sequence (3).

Polynomial Krylov subspace methods often need many iterations to achieve the prescribed accuracy, so that a large subspace is constructed. This increases both the storage demand and the computational effort of the selected solution procedure. Different strategies have been developed to avoid the construction of an excessively large subspace.

With the goal of achieving a fast convergence in terms of number of iterations, the linear system (3) can be preconditioned, namely, it is transformed into an equivalent problem with better spectral properties. However, designing effective preconditioning operators for a sequence of shifted linear systems is a difficult task and often highly problem dependent. Very sophisticated schemes have been proposed in the literature (see, e.g., [4, 9, 20, 21, 46]).

Restarted routines are an alternative solution. In this framework, the approximation space \(\mathcal {K}_{m}\) is expanded until it reaches a prescribed maximum dimension. If the desired level of accuracy is not achieved, the last computed basis block \(\mathcal {V}_{m+1}\) is employed as initial block in the construction of a new subspace \(\mathcal {K}_{m}^{\prime }\). This procedure is iterated until a stopping criterion is fulfilled (see, e.g., [29, 62] and [30, Section 3.2.1]). However, in our framework, the LR-ADI shifts pj’s are often computed on the fly and, thus, are not all available at the same time. Therefore, to fully take advantage of the computational efforts needed to solve the linear system \((A + p_{j-1}I)S_{j-1} = W\), we would have to store all the bases computed during the employed restarted Krylov procedure and use them to solve the jth linear system as well. Unfortunately, this would destroy all the benefits in terms of storage complexity gained from the restart paradigm.

In [64], Simoncini showed that the employment of the extended Krylov subspace (5), in place of (4), often leads to a faster convergence in terms of iterations, to the point that the constructed subspace is usually smaller than the polynomial counterpart needed to reach the same level of accuracy. We thus decide to use such an approximation space for the solution of the shifted linear systems within the LR-ADI method, and in the next section we recall some details of the extended Krylov subspace method.

Notice that the faster convergence of the extended Krylov subspace (5) comes at a price. Indeed, at each iteration, a linear system with A has to be solved during the basis construction. Nevertheless, the increase in the overall workload of the solution process can generally be limited. If a direct solver is used, for instance, the LU factors of A can be computed once and for all before the LR-ADI scheme starts. On the other hand, if an iterative procedure is employed, a single preconditioner for A analogously has to be designed only once.
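The "factor once, solve many times" idea can be sketched with SciPy's sparse LU interface; the Laplacian test matrix and block sizes are illustrative.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu

n = 1000
# sparse stable A (1D Laplacian), stored in CSC format as required by splu
A = sp.diags([1.0, -2.0, 1.0], [-1, 0, 1], shape=(n, n), format="csc")
lu = splu(A)                       # LU factors computed once and for all

rng = np.random.default_rng(4)
W = rng.standard_normal((n, 3))
U1 = lu.solve(W)                   # A^{-1} W for the extended-Krylov basis
U2 = lu.solve(U1)                  # A^{-2} W: same factors reused, no refactorization
print(np.linalg.norm(A @ U1 - W))
```

Every solve with A during the basis construction then costs only two sparse triangular substitutions.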

As already mentioned, in the formulation (3), the right-hand side W is fixed, namely it does not depend on the shift index j. However, in line 3 of Algorithm 1, the linear systems we need to solve are of the form

$$(A+p_{j}I)S_{j}=W_{j-1}.$$

At first glance, having a nonconstant right-hand side does not allow for the employment of the shifted Krylov framework we briefly described above. A larger class of solvers, the so-called recycling Krylov methods, seems more appropriate (see, e.g., [31, 52, 67, 69, 70] for general sequences of shifted linear systems, and [1, 2, 26] for some recycling Krylov techniques applied in a model reduction context). However, in Section 3, we show that, in the LR-ADI context for j > 1, the residual factor Wj−1 belongs to the subspace \(\mathcal {K}_{m}\) employed in the solution of the (j −1)-st linear system \((A + p_{j-1}I)S_{j-1} = W_{j-2}\). Along with the shift-invariance property of the Krylov subspace, this observation allows us to utilize only one subspace for the solution of all the shifted linear systems within the LR-ADI method. In turn, as shown in Section 5, we can notably reduce the computational effort of the overall procedure.

2.1 The extended Krylov subspace method for shifted linear systems

In this section, we recall the extended Krylov subspace method for shifted linear systems presented in [64].

Given the sequence of shifted linear systems (3), the extended Krylov subspace method computes a solution of the form \(S_{m}^{(j)}=V_{m}Y_{m}^{(j)}\), where the 2mq orthonormal columns of Vm span the extended Krylov subspace (5), whereas the 2mq × q matrix \(Y_{m}^{(j)}\) can be computed in different manners.

For instance, \(Y_{m}^{(j)}\) can be computed by imposing a Galerkin condition on the residual \(R_{m}^{(j)}=(A+p_{j}I)V_{m}Y_{m}^{(j)}-W\), namely by imposing \(V_{m}^{\mathsf {T}}R_{m}^{(j)}=0\). Thanks to the shifted Arnoldi relation (7), it is easy to show that such a Galerkin condition is equivalent to solving the projected linear systems

$$(T_{m}+p_{j}I)Y_{m}^{(j)}=E_{1}\gamma,$$
(8)

where \(E_{1} = e_{1}\otimes I_{2q}\), and \(\gamma \in \mathbb {R}^{2q\times q}\) is such that W = V1γ.

With \(Y_{m}^{(j)}\) at hand, the Frobenius norm of the residual \(\|R_{m}^{(j)}\|_{F}\) can be computed at low cost, as

$$\|R_{m}^{(j)}\|_{F}=\|E_{m+1}^{\mathsf{T}}\underline{T}_{m}Y_{m}^{(j)}\|_{F},$$
(9)

following [64, Equation (3.2)].
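The cheap residual formula (9) is a direct consequence of the Arnoldi relation. The sketch below checks it for the single-vector case q = 1 with the polynomial Krylov space, where the identities (8) and (9) hold verbatim; the stable test matrix and the shift are illustrative.

```python
import numpy as np

def arnoldi(A, w, m):
    """Arnoldi process (q = 1 for brevity): returns V_{m+1} and underline{T}_m."""
    n = w.size
    V = np.zeros((n, m + 1))
    T = np.zeros((m + 1, m))
    V[:, 0] = w / np.linalg.norm(w)
    for k in range(m):
        u = A @ V[:, k]
        for i in range(k + 1):                 # modified Gram-Schmidt
            T[i, k] = V[:, i] @ u
            u = u - T[i, k] * V[:, i]
        T[k + 1, k] = np.linalg.norm(u)
        V[:, k + 1] = u / T[k + 1, k]
    return V, T

rng = np.random.default_rng(5)
n, m = 120, 30
G = rng.standard_normal((n, n))
A = 0.5 * (G - G.T) - 2.0 * np.eye(n)          # stable test matrix
w = rng.standard_normal(n)

V, T = arnoldi(A, w, m)
gamma = np.linalg.norm(w)                      # w = V_1 * gamma

p = -1.0                                       # a shift in C_-
e1 = np.zeros(m); e1[0] = 1.0
Y = np.linalg.solve(T[:m, :m] + p * np.eye(m), gamma * e1)   # Galerkin system (8)
S = V[:, :m] @ Y                               # approximate solution of (3)

cheap = abs(T[m, m - 1] * Y[m - 1])            # residual formula (9)
true = np.linalg.norm((A + p * np.eye(n)) @ S - w)
print(cheap, true)
```

The two numbers agree to roundoff, so the residual can be monitored without ever forming the large vector \((A+p_jI)S-w\).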

Alternatively, following the discussion in [68, Section 4.1], the matrix \(Y_{m}^{(j)}\) can be computed also by minimizing the residual norm, i.e.,

$$Y_m^{(j)}=\underset{Y\in\mathbb{R}^{2mq\times q}}{\text{argmin}}\;{\left\|(A+p_jI)V_mY-W\right\|}_F.$$
(10)

Once again, thanks to the shifted Arnoldi relation (7), the minimization problem in (10) simplifies, and we can compute \(Y_{m}^{(j)}\) as

$$Y_m^{(j)}=\underset{Y\in\mathbb{R}^{2mq\times q}}{\text{argmin}}\;{\left\|({\underline T}_m+p_j\lbrack I_{2mq};O_{2q\times2mq}\rbrack)Y-E_1\gamma\right\|}_F.$$
(11)

Note the abuse of notation: in (11), \(E_{1}\in \mathbb {R}^{2(m+1)q\times 2q}\), whereas \(E_{1}\in \mathbb {R}^{2mq\times 2q}\) in (8).

If \(QP=\underline {T}_{m}+p_{j}[I_{2mq};O_{2q\times 2mq}]\) denotes the QR factorization of \(\underline {T}_{m}+p_{j}[I_{2mq};O_{2q\times 2mq}]\), and we consider the following partition

$$Q=[Q_{1},Q_{2}],\quad Q_{1}\in\mathbb{R}^{2(m+1)q\times 2mq},\quad Q_{2}\in\mathbb{R}^{2(m+1)q\times 2q},\quad P=\left[\begin{array}{c} P_{1}\\ O_{2q\times 2mq} \end{array}\right],\quad P_{1}\in\mathbb{R}^{2mq\times 2mq},$$

then the matrix \(Y_{m}^{(j)}\) in (11) can be computed as

$$Y_{m}^{(j)}=P_{1}^{-1}Q_{1}^{\mathsf{T}}E_{1}\gamma,$$
(12)

and the residual norm is given by

$$\|R_{m}^{(j)}\|_{F}=\|Q_{2}^{\mathsf{T}}E_{1}\gamma\|_{F}.$$
(13)

The overall procedure is summarized in Algorithm 2, where Σ contains the indices of all yet unsolved systems, whereas \({\Sigma }^{C}\) contains the indices of all the systems that have already been solved. The basis block \(\mathcal V_{m+1}\) can be computed by following [63]. This operation involves both matrix-vector products and linear system solves with A. Moreover, the basis Vm is real whenever A and W are so. Complex arithmetic may occur in the computation of \(Y_{m}^{(j)}\), if Im(pj)≠0.

Algorithm 2

Extended Krylov subspace method for shifted linear systems

Notice that as soon as the jth linear system has converged, namely the related relative residual norm is sufficiently small, we stop solving the jth projected problem. Once all the linear systems have converged, we terminate the iterative process.

To conclude, we would like to point out that, to the best of our knowledge, this is the first time the minimal residual condition (11) is proposed within the extended Krylov subspace method for shifted linear systems.

3 Merging the two iterative procedures

In this section, we show how the LR-ADI iteration and the extended Krylov subspace method for shifted linear systems can be merged together into a novel, efficient iterative procedure for the solution of (2).

As already mentioned, in the sequence of shifted linear systems in line 3 of Algorithm 1, also the right-hand side Wj−1 depends on the current LR-ADI iteration j. Therefore, at first glance, we seemingly have to build a new subspace at each iteration j, by employing the current Wj−1 as initial block. However, in the following theorem, we show that Wj−1 belongs to the subspace constructed to solve the (j −1)-st linear system so that such a space can be used, by being possibly expanded, also in the solution of the subsequent linear system.

Theorem 3.1

Let \(S_{j}=V_{m_{j}}Y_{m_{j}}\), \(j\geqslant 1\), \(\text{Range} (V_{m_{j}})=\mathbf {EK}_{m_{j}}^{\square }(A,B)\) for some \(m_{j}\geqslant 1\). Then

$$\text{Range} (W_{j})\subseteq\mathbf{EK}_{m_{j}}^{\square}(A,B).$$

Proof

We are going to show the statement by induction on j.

The first linear system to be solved within the LR-ADI method is \((A + p_{1}I)S_{1} = B\) and the extended Krylov subspace \(\mathbf {EK}_{m_{1}}^{\square }(A,B)\) can be employed to this end. The computed solution is of the form \(S_{1}=V_{m_{1}}Y_{m_{1}}\), m1 >0, where \(\text{Range} (V_{m_{1}})=\mathbf {EK}_{m_{1}}^{\square }(A,B)\) and \(Y_{m_{1}}\in \mathbb {C}^{2m_{1}q\times q}\). It is thus easy to show that \(W_{1}=B-2\text {Re}(p_{1})S_{1}=V_{m_{1}}(E_{1}\gamma -2\text {Re}(p_{1})Y_{m_{1}})\) is such that \(\text{Range} (W_{1})\subseteq \mathbf {EK}_{m_{1}}^{\square }(A,B)\).

We now assume the statement holds for a certain \(j-1\geqslant 1\), and we show it holds for j as well. Since \(S_{j}=V_{m_{j}}Y_{m_{j}}\) by assumption and \(\text{Range} (W_{j-1})\subseteq \mathbf {EK}_{m_{j-1}}^{\square }(A,B)\) by inductive hypothesis, namely we can write \(W_{j-1}=V_{m_{j-1}}{\Upsilon }_{j-1}\) for a certain \({\Upsilon }_{j-1}\in \mathbb {R}^{2m_{j-1}q\times q}\), we have

$$\begin{array}{@{}rcl@{}} W_{j}&=&W_{j-1}-2\text{Re}(p_{j})S_{j}=V_{m_{j-1}}{\Upsilon}_{j-1}-2\text{Re}(p_{j})V_{m_{j}}Y_{m_{j}}\\ &=&V_{m_{j}}\left([{\Upsilon}_{j-1};O_{2(m_{j}-m_{j-1})q\times q}]-2\text{Re}(p_{j})Y_{m_{j}}\right). \end{array}$$

Therefore, \(\text{Range} (W_{j})\subseteq \mathbf {EK}_{m_{j}}^{\square }(A,B)\). □
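The mechanism of the proof is simple to reproduce numerically: whenever B lies in the subspace and the approximate solution is drawn from it, the updated residual factor stays inside as well. The sketch below uses a crude three-block space containing B (a stand-in for \(\mathbf{EK}_1^{\square}(A,B)\)), a Galerkin solve, and a real shift; all data are illustrative.

```python
import numpy as np

rng = np.random.default_rng(7)
n, q = 80, 2
G = rng.standard_normal((n, n))
A = 0.5 * (G - G.T) - 2.0 * np.eye(n)     # stable test matrix
B = rng.standard_normal((n, q))
p1 = -1.0                                 # real shift in C_-

# orthonormal basis of a small space containing B (EK_1-like: B, AB, A^{-1}B)
K = np.hstack([B, A @ B, np.linalg.solve(A, B)])
V, _ = np.linalg.qr(K)

# Galerkin solution of (A + p1 I) S1 = B in Range(V)
T = V.T @ A @ V
Y = np.linalg.solve(T + p1 * np.eye(T.shape[0]), V.T @ B)
S1 = V @ Y

W1 = B - 2.0 * p1 * S1                    # next LR-ADI residual factor
# Theorem 3.1: W1 has no component outside the subspace
print(np.linalg.norm(W1 - V @ (V.T @ W1)))
```

The orthogonal projection leaves W1 unchanged up to roundoff, which is precisely the inclusion \(\text{Range}(W_1)\subseteq\text{Range}(V)\) used in the induction.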

Theorem 3.1 shows that Wj is exactly represented in \(\mathbf {EK}_{m_{j}}^{\square }(A,B)\). This means that the latter subspace can still be employed for the computation of Sj+1 by being possibly expanded. Indeed, no components of Wj are annihilated when either the Galerkin or the minimal residual condition is imposed. In the following corollary, we show how to easily write down the projected problems (8) and (11) along with the corresponding residual norm computation.

Corollary 3.1

Assume the prerequisites of Theorem 3.1 hold. If a Galerkin condition is imposed for the computation of \(S_{j}=V_{m_{j}}Y_{m_{j}}\), then the matrix \(Y_{m_{j}}\) amounts to the solution of the projected linear system

$$(T_{m_{j}}+p_{j}I_{2m_{j}q})Y_{m_{j}}= [{\Upsilon}_{j-1};O_{2(m_{j}-m_{j-1})q\times q}],$$
(14)

where \({\Upsilon }_{j-1}\in \mathbb {R}^{2m_{j-1}q\times q}\) is such that \(W_{j-1}=V_{m_{j-1}}{\Upsilon }_{j-1}\), \(m_{j-1}\leqslant m_{j}\). The related residual norm can be computed by

$$\|R_{m_{j}}\|_{F}=\|E_{m_{j}+1}^{\mathsf{T}}\underline{T}_{m_{j}}Y_{m_{j}}\|_{F}.$$
(15)

Similarly, if a minimal residual norm condition is imposed, we have

$$Y_{m_{j}}=\underset{Y\in\mathbb{C}^{2m_{j}q\times q}}{\text{argmin}}\|(\underline{T}_{m_{j}}+p_{j}[I_{2m_{j}q};O_{2q\times 2m_{j}q}] )Y-[{\Upsilon}_{j-1};O_{2(m_{j}-m_{j-1}+1)q\times q}]\|_{F},$$
(16)

so that

$$\|R_{m_{j}}\|_{F}=\|Q_{2}^{\mathsf{T}}[{\Upsilon}_{j-1};O_{2(m_{j}-m_{j-1}+1)q\times q}]\|_{F},$$
(17)

where the 2q orthonormal columns of Q2 are a basis of the kernel of \(\underline {T}_{m_{j}}+p_{j}[I_{2m_{j}q};O_{2q\times 2m_{j}q}]\).

Proof

Since \(W_{j-1}=V_{m_{j-1}}{\Upsilon }_{j-1}\) and we look for a solution \(S_{j}=V_{m_{j}}Y_{m_{j}}\) to \((A + p_{j}I)S_{j} = W_{j-1}\), we can write

$$\begin{array}{lll} R_{m_{j}}&=&(A+p_{j}I)S_{j}-W_{j-1}= (A+p_{j}I)V_{m_{j}}Y_{m_{j}}-V_{m_{j-1}}{\Upsilon}_{j-1}\\ &=& V_{m_{j}}\left((T_{m_{j}}+p_{j}I_{2m_{j}q})Y_{m_{j}}- [{\Upsilon}_{j-1};O_{2(m_{j}-m_{j-1})q\times q}]\right)+\mathcal{V}_{m_{j}+1}E_{m_{j}+1}^{\mathsf{T}}\underline{T}_{m_{j}}Y_{m_{j}}\\ &=&V_{m_{j}+1}\left((\underline{T}_{m_{j}}+p_{j} [I_{2m_{j}q};O_{2q\times 2m_{j}q}] )Y_{m_{j}}- [{\Upsilon}_{j-1};O_{2(m_{j}-m_{j-1}+1)q\times q}]\right). \end{array}$$

If a Galerkin condition is imposed, namely \(V_{m_{j}}^{\mathsf {T}}R_{m_{j}}=0\), then \(Y_{m_{j}}\) is the solution of the linear system in (14) and the related residual norm \(\|R_{m_{j}}\|_{F}\) can be computed as in (15).

Similarly, if a minimal residual condition is imposed, \(Y_{m_{j}}\) solves the minimization problem (16) and \(\|R_{m_{j}}\|_{F}\) fulfills (17). □

Once \(S_{j}=V_{m_{j}}Y_{m_{j}}\) is computed, namely the related residual norm \(\|R_{m_{j}}\|_{F}\) is sufficiently small, we proceed with the remaining LR-ADI operations.

We would like to point out that the expression of Wj, i.e., \(W_{j}=V_{m_{j}}{\Upsilon }_{j}\), can be exploited for the Lyapunov residual norm as well. Indeed,

$$\|W_{j}^{*}W_{j}\|_{F}=\|{\Upsilon}_{j}^{*}{\Upsilon}_{j}\|_{F}.$$
(18)

This means that the Lyapunov residual norm can also be computed by manipulating small matrices of dimension 2mjq × q. Similarly, the solution Zj can be assembled at the very end of the LR-ADI procedure once the residual norm in (18) is sufficiently small. Indeed,

$$\begin{aligned}Z_{j}&=[Z_{j-1},\sqrt{-2\text{Re}(p_{j})}S_{j}]=[\sqrt{-2\text{Re}(p_{1})}S_{1},\sqrt{-2\text{Re}(p_{2})}S_{2},\ldots,\sqrt{-2\text{Re}(p_{j})}S_{j}]\\ \\ &=[\sqrt{-2\text{Re}(p_{1})}V_{m_{1}}Y_{m_{1}},\sqrt{-2\text{Re}(p_{2})}V_{m_{2}}Y_{m_{2}},\ldots,\sqrt{-2\text{Re}(p_{j})}V_{m_{j}}Y_{m_{j}}]\\ \\ &=V_{m_{j}}[[Y_{m_{1}};O_{2(m_{j}-m_{1})q\times q}],[Y_{m_{2}};O_{2(m_{j}-m_{2})q\times q}],\ldots,Y_{m_{j}}]\\ &\quad\cdot(\sqrt{-2\text{diag}(\text{Re}(p_{1}),\ldots,\text{Re}(p_{j}))}\otimes I_{q}). \end{aligned}$$
(19)
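The identity (18) only uses the orthonormality of the basis, so it can be checked with generic stand-ins for \(V_{m_j}\) and \({\Upsilon}_j\):

```python
import numpy as np

rng = np.random.default_rng(8)
n, mq, q = 500, 24, 2
# orthonormal V (stand-in for V_{m_j}) and a small coefficient matrix Upsilon
V, _ = np.linalg.qr(rng.standard_normal((n, mq)))
Ups = rng.standard_normal((mq, q))
W = V @ Ups                          # W_j = V_{m_j} Upsilon_j

# (18): the large-scale residual norm from a small q x q computation
full = np.linalg.norm(W.T @ W)       # = ||W_j W_j^T||_F as well
small = np.linalg.norm(Ups.T @ Ups)
print(full, small)
```

Since \(W^{\mathsf{T}}W={\Upsilon}^{\mathsf{T}}V^{\mathsf{T}}V{\Upsilon}={\Upsilon}^{\mathsf{T}}{\Upsilon}\), the two norms agree to roundoff, and the n-dimensional factor W never has to be formed for the stopping criterion.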

The overall procedure combining the LR-ADI iteration with the extended Krylov subspace method for shifted linear systems is depicted in Algorithm 3.

Algorithm 3

LR-ADI-EKSM for Lyapunov equations

As in Algorithm 1, if Im(pj)≠0, in lines 15 to 19 we set \(p_{j+1}=\overline p_{j}\), and we follow the implementation suggested in [11, 36] to reduce the amount of complex arithmetic. In particular, \(Y_{m_{j+1}}\) can be obtained from \(Y_{m_{j}}\) without solving (14) or (16). Moreover, the adopted scheme results in a real Zj (see [11] and [36, Algorithm 4.3] for further details).

Remark 4

Theorem 3.1 shows that \(\text {Range}(W_{j})\subseteq \mathbf {EK}_{m_{j}}^{\square }(A,B)\) whenever Wj is updated as Wj = Wj−1 −2Re(pj)Sj, namely whenever all the employed shifts are real. In case of shifts with nonzero imaginary part, the LR-ADI implementation we adopt sets

$$W_{j+1}=W_{j-1}-4\text{Re}(p_{j})(\text{Re}(S_{j})+\beta\text{Im}(S_{j})).$$

Therefore, we need to show that Wj+1 defined as above is still such that \(\text {Range}(W_{j+1})\subseteq \mathbf {EK}_{m_{j}}^{\square }(A,B)\). This can be done by applying the same exact arguments used in the proof of Theorem 3.1. In particular, the result follows by noticing that the basis Vm is real, as we assumed A and B to be real matrices, and that we can write

$$\begin{array}{@{}rcl@{}} W_{j+1}&=&W_{j-1}-4\text{Re}(p_{j})(\text{Re}(S_{j})+\beta\text{Im}(S_{j}))\\ &=&V_{m_{j}}\left([{\Upsilon}_{j-1};O_{2(m_{j}-m_{j-1})q\times q}]-4\text{Re}(p_{j})(\text{Re}(Y_{m_{j}})+\beta\text{Im}(Y_{m_{j}}))\right). \end{array}$$

Notice that two tolerances \(\varepsilon _{\mathtt {inn}}^{(j)}\), and ε are employed in Algorithm 3. In particular, ε is used to assess the accuracy of the computed solution in terms of the Lyapunov residual norm, whereas \(\varepsilon _{\mathtt {inn}}^{(j)}\) is employed to determine whether the solution of the current linear system is sufficiently accurate. In principle, the user can provide a fixed value for the inner tolerance, i.e., \(\varepsilon _{\mathtt {inn}}^{(j)}\equiv \overline \varepsilon _{\mathtt {inn}}\) for all j. However, the theory developed in [38] can be used to adaptively compute \(\varepsilon _{\mathtt {inn}}^{(j)}\) as the LR-ADI iterations proceed. The relaxation strategy presented in [38, Section 3] allows us to increase \(\varepsilon _{\mathtt {inn}}^{(j)}\) as j grows. Therefore, especially when \(\varepsilon _{\mathtt {inn}}^{(j)}\) is rather large, there is no need to expand the current extended Krylov subspace in general. In all the results reported in Section 5, we employ such a strategy and \(\varepsilon _{\mathtt {inn}}^{(j)}\) is computed according to [38, Equation (3.18b)] (see also [44] for similar results in case of Sylvester equations).

We would like to point out that lines 23 to 28 in Algorithm 3 and the use of the flag flag_noexpand are crucial to reduce the computational cost of the overall procedure. Indeed, those lines check whether the current subspace already contains enough spectral information to solve the current linear system. If this is the case, we do not expand the current space, thus avoiding unnecessary growth in memory requirements and computational effort.

If \(\mathbf {Y}:= [[Y_{m_{1}};O_{2(m_{j}-m_{1})q\times q}],[Y_{m_{2}};O_{2(m_{j}-m_{2})q\times q}],\ldots ,Y_{m_{j}}]\), (19) shows that the numerical solution computed by the proposed LR-ADI implementation is of the form

$$Z_{j}Z_{j}^{\mathsf{T}}=-2V_{m_{j}}(\mathbf{Y}(\text{diag}(\text{Re}(p_{1}),\ldots,\text{Re}(p_{j}))\otimes I_{q}) \mathbf{Y}^{\mathsf{T}})V_{m_{j}}^{\mathsf{T}}.$$
(20)

The right-hand side in (20) has the typical form of an approximate solution computed by a projection method applied to (2). In particular, if the extended Krylov subspace method (K-PIK) presented in [63] is applied to solve (2), the computed approximation is of the form \(X_{m}=V_{m}L_{m} V_{m}^{\mathsf {T}}\), where the orthonormal columns of Vm are a basis of \(\mathbf {EK}_{m}^{\square }(A,B)\) and Lm is computed by imposing a Galerkin condition on the residual matrix \(AV_{m}L_{m}V_{m}^{\mathsf {T}}+V_{m}L_{m}V_{m}^{\mathsf {T}}A^{\mathsf {T}}+BB^{\mathsf {T}}\). Therefore, the proposed LR-ADI implementation can be seen as a novel projection method where the coefficient matrix of the approximate solution in terms of the basis vectors, namely \(\mathbf {Y}(\text {diag}(\text {Re}(p_{1}),\ldots ,\text {Re}(p_{j}))\otimes I_{q}) \mathbf {Y}^{\mathsf {T}}\), is computed as outlined above and not by imposing a Galerkin condition on the residual matrix. This perspective may provide new insights on the relation between LR-ADI and K-PIK. However, this is beyond the scope of this paper. Similar investigations, relating LR-ADI and rational Krylov subspace methods, have been reported in [25, 74, 75].

The expression (20) resembles the LDLT-form of the LR-ADI solution. This formulation, while being more natural for projection-based solvers, also turned out to be advantageous when LR-ADI is employed as linear solver for differential matrix equations (see [39]).

4 Shift computation

Many of the procedures available in the literature for the ADI shift computation need the explicit construction of a basis of Range(Zj) or a subspace thereof. For instance, in [12], the authors suggest using, as shifts pj, a subset of the Ritz values of A with respect to \(\mathcal {Z}_{j} = \text{Range} (\widetilde Z_{j})\), where \(\widetilde Z_{j}\in \mathbb {R}^{n\times h}\) consists of the last h >0 columns of Zj that have been orthogonalized with respect to each other. However, (19) shows that Algorithm 3 provides us with a matrix Zj such that \(\text{Range} (Z_{j})\subseteq \mathbf {EK}_{m_{j}}^{\square }(A,B)\) so that the Ritz values of A with respect to \(\mathbf {EK}_{m_{j}}^{\square }(A,B)\) can be employed as shifts. Moreover, in standard LR-ADI implementations, one has to explicitly compute the projection of A onto \(\mathcal {Z}_{j}\), increasing the computational cost of the overall procedure. In our approach, the projection of A onto \(\mathbf {EK}_{m_{j}}^{\square }(A,B)\) comes for free, as it amounts to \(T_{m_{j}}\), and no additional operations are required.

The observation above can be applied to many schemes for the shift computation. In the following, we give some details for the residual-Hamiltonian-based shifts and the residual norm-minimizing shifts presented in [37].

In [37, Section 2.1.3], at the  jth LR-ADI iteration, the Hamiltonian matrix \({\mathscr{H}}_{j}=\left [\begin {array}{ll} A^{\mathsf {T}} & O \\ W_{j}W_{j}^{\mathsf {T}} & -A \end {array}\right ]\) is considered and its projection onto \(\mathcal {Z}_{j}\), namely \(\widetilde{\mathcal {H}}_{j}=\left [\begin {array}{ll} {(\widetilde Z_{j}^{\mathsf {T}}A\widetilde Z_{j})}^{\mathsf {T}} & O \\ \widetilde Z_{j}^{\mathsf {T}}W_{j}W_{j}^{\mathsf {T}}\widetilde Z_{j} & -\widetilde Z_{j}^{\mathsf {T}}A\widetilde Z_{j} \end {array}\right ]\), is constructed. In our case, we can easily construct the projection of \({\mathscr{H}}_{j}\) onto \(\mathbf {EK}_{m_{j}}^{\square }(A,B)\) and this is given by

$$\widetilde{\mathcal{H}}_{j}=\left[\begin{array}{ll} T_{m_{j}}^{\mathsf{T}} & O \\ {\Upsilon}_{j}^{\mathsf{T}}{\Upsilon}_{j} & -T_{m_{j}} \end{array}\right]\in\mathbb{R}^{4m_{j}q\times 4m_{j}q}.$$
(21)

With (21) at hand, we compute its stable eigenpairs \(\left (\lambda _{k},\left [\begin {array}{ll}s_{k}\\ t_{k} \end {array}\right ]\right )\), Re(λk) < 0, \(s_{k},t_{k}\in \mathbb {R}^{2m_{j}q}\), and the (j + 1)st residual-Hamiltonian-based shift pj+1 is selected as the eigenvalue \(\lambda _{\widehat k}\) with \(\widehat k=\operatorname{argmax}_{k}\{\|t_{k}\|\}\).
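As a small illustration, this selection rule can be sketched in NumPy. The matrices T and Ups below are random stand-ins for \(T_{m_{j}}\) and Υj (not data from the paper); the function builds the projected Hamiltonian of (21), keeps the stable eigenpairs, and returns the eigenvalue whose lower eigenvector block has maximal norm.

```python
import numpy as np

def hamiltonian_shift(T, Ups):
    """Pick the next residual-Hamiltonian-based shift from the projected
    Hamiltonian (21): among its stable eigenvalues, return the one whose
    eigenvector has the lower-block component t_k of maximal norm."""
    k = T.shape[0]
    H = np.block([[T.T, np.zeros((k, k))],
                  [Ups.T @ Ups, -T]])
    vals, vecs = np.linalg.eig(H)
    stable = vals.real < 0                  # keep Re(lambda_k) < 0 only
    vals, vecs = vals[stable], vecs[:, stable]
    t_norms = np.linalg.norm(vecs[k:, :], axis=0)  # norms of the t_k blocks
    return vals[np.argmax(t_norms)]

# small illustrative data (random stand-ins, hypothetical sizes)
rng = np.random.default_rng(0)
T = -np.eye(4) + 0.1 * rng.standard_normal((4, 4))
Ups = rng.standard_normal((2, 4))
p_next = hamiltonian_shift(T, Ups)
```

Note that H is block lower triangular, so its spectrum is the union of the spectra of \(T^{\mathsf{T}}\) and \(-T\); for a stable T, the stable eigenvalue set is nonempty.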

For the computation of residual-norm-minimizing shifts, in [37, Section 3], a rather involved optimization procedure is presented. In particular, the real and imaginary parts of pj+1 = 𝜃j+1 + ıξj+1 are computed by solving the following minimization problem

$$\lbrack\theta_{j+1},\xi_{j+1}\rbrack = \underset{\theta\in{\mathbb{R}}_-,\xi\in\mathbb{R}}{\operatorname{argmin}}\;\left\|W_j-2\theta{(A+(\theta+\imath\xi)I)}^{-1}W_j\right\|^2.$$
(22)

The objective function in (22) is expensive to evaluate, often making the shift computation more expensive than a single LR-ADI iteration. To overcome this issue, Kürschner proposes employing smaller matrices \(\widetilde A\) and \(\widetilde W_{j}\) in place of A and Wj. Once again, \(\widetilde A\) and \(\widetilde W_{j}\) are the projections of A and Wj onto a suitable subspace. This subspace is chosen to be \(\mathbf {EK}_{\ell }^{\square }(A,B)\cup \text{Range} (Z_{j})\) for a certain, usually small, ℓ > 0. In our setting, \(\mathbf {EK}_{\ell }^{\square }(A,B)\cup \text{Range} (Z_{j})\subseteq \mathbf {EK}_{m_{j}}^{\square }(A,B)\) whenever \(\ell \leqslant m_{j}\). Therefore, we can set \(\widetilde A=T_{m_{j}}\) and \(\widetilde W_{j}={\Upsilon }_{j}\) to approximate [𝜃j+1, ξj+1] in (22).
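A minimal sketch of this projected minimization is given below, assuming T and W stand in for the projected \(\widetilde A\) and \(\widetilde W_{j}\). It is not Kürschner's actual optimization procedure: we simply hand the objective of (22) to a generic bound-constrained quasi-Newton solver, with the bound on θ enforcing θ < 0.

```python
import numpy as np
from scipy.optimize import minimize

def norm_min_shift(T, W):
    """Approximate the residual-norm-minimizing shift by solving (22)
    on projected data: T plays the role of A, W that of W_j."""
    n = T.shape[0]
    I = np.eye(n)

    def objective(x):
        theta, xi = x
        # ||W - 2*theta*(T + (theta + i*xi)*I)^{-1} W||^2
        R = W - 2.0 * theta * np.linalg.solve(T + (theta + 1j * xi) * I,
                                              W.astype(complex))
        return np.linalg.norm(R) ** 2

    res = minimize(objective, x0=[-1.0, 0.0], method="L-BFGS-B",
                   bounds=[(None, -1e-12), (None, None)])  # keep theta < 0
    theta, xi = res.x
    return theta + 1j * xi

# random stand-ins for the projected quantities (hypothetical sizes)
rng = np.random.default_rng(1)
T = -np.eye(5) + 0.05 * rng.standard_normal((5, 5))
W = rng.standard_normal((5, 1))
p = norm_min_shift(T, W)
```

Since the projected matrices are small, each evaluation of the objective costs only a dense solve of size \(2m_{j}q\), which is the point of the projection.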

5 Numerical examples

In this section, we illustrate the potential of the scheme proposed in this paper. The two variants of the LR-ADI-EKSM method illustrated in Section 3 are denoted by LR-ADI-EKSM(G) and LR-ADI-EKSM(MR). In particular, in LR-ADI-EKSM(G), we solve the linear systems by imposing a Galerkin condition, i.e., the matrix Y is computed by solving the reduced problem (14). In LR-ADI-EKSM(MR), Y solves the least squares problem (16).

We test Algorithm 3 on different instances of (2) coming from the discretization of certain PDEs, and we study how the computational cost of the main steps of Algorithm 3 depends on the problem dimension n and on the rank q of the right-hand side.

The results achieved by Algorithm 3 are also compared to the ones obtained by running a standard implementation of the LR-ADI method. In particular, we employed the MATLAB function mess_lradi available in the M-M.E.S.S. package [59]. Notice that mess_lradi is intended to be a black-box routine, so many checks and inspections are performed before the actual solution process starts. This may increase the overall running time of mess_lradi. Therefore, to ensure fair comparisons, we also report the results obtained by running a standard implementation of LR-ADI where the overhead cost mentioned above is not present. Such a routine is simply denoted by lradi in the tables that follow.

For a better understanding, in Table 1, we summarize the linear system solver adopted within each of the tested routines for the numerical experiments that follow. Similarly, in Table 1, we indicate whether a given scheme is equipped with the relaxation strategy from [38] for the selection of \(\varepsilon^{(j)}_{\mathtt{inn}}\).

Table 1 Solver: solver employed for solving the linear systems with A in LR-ADI-EKSM and K-PIK and with \(A + p_{j}I\) in lradi and mess_lradi. In the column Relaxation, we indicate whether a certain scheme is equipped with the relaxation strategy proposed in [38]

For all experiments, the tolerance ε for the relative residual norm is set to \(10^{-8}\). Moreover, except for Experiment 3, we always employ the residual-Hamiltonian-based shifts presented in [37] and computed as illustrated in the previous section.

All results were obtained by running MATLAB R2020b [47] on a standard node of the Linux cluster mechthild hosted at the Max Planck Institute for Dynamics of Complex Technical Systems in Magdeburg, Germany.

Experiment 1

In the first experiment, we consider a Lyapunov equation where

$$A=I_{h}\otimes D_{h}+D_{h}\otimes I_{h},\quad D_{h}=\text{tridiag}(1,-2,1)\in\mathbb{R}^{h\times h}.$$

Therefore, \(A\in \mathbb {R}^{n\times n}\), \(n = h^{2}\), is symmetric and stable. We first consider a matrix \(B\in \mathbb {R}^{n\times q}\) with random entries and unit norm, and in Table 2, we report how the overall solution time is distributed among the main steps of our algorithm for different values of n and q.
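The test matrix above is straightforward to assemble with sparse Kronecker products; a minimal SciPy sketch (the helper name build_A is ours):

```python
import numpy as np
import scipy.sparse as sp

def build_A(h):
    """Assemble A = I_h (x) D_h + D_h (x) I_h,
    with D_h = tridiag(1, -2, 1) of size h x h (Experiment 1)."""
    e = np.ones(h - 1)
    D = sp.diags([e, -2.0 * np.ones(h), e], offsets=[-1, 0, 1], format="csr")
    I = sp.identity(h, format="csr")
    return (sp.kron(I, D) + sp.kron(D, I)).tocsr()

A = build_A(10)  # n = h^2 = 100
```

Since D_h is symmetric negative definite, A inherits symmetry and stability, matching the assumptions on (2).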

Table 2 Experiment 1. Computational timings devoted to the different main steps of LR-ADI-EKSM(G) for different values of the problem size n and rank of the right-hand side q

In both LR-ADI-EKSM(G) and LR-ADI-EKSM(MR), the linear systems with A required for the basis construction are solved by means of the MATLAB sparse direct solver “backslash.” In particular, A is factorized once and for all before the iterative procedures start so that only triangular systems are actually solved during the basis construction. The computational time for the factorization of A is always included in the results that follow.

In this experiment, LR-ADI-EKSM(G) and LR-ADI-EKSM(MR) perform very similarly. We thus report only the results achieved by the former.

As expected, the time devoted to the basis construction represents the majority of the overall computational effort, as is usual in Krylov projection algorithms. This cost increases as q grows. Indeed, a larger subspace is computed, making the basis construction, and in particular the orthogonalization step, rather demanding. Having a large-dimensional approximation space also leads to a more expensive shift computation.

In Fig. 1 (left y-axis), we illustrate how the dimension of the computed extended Krylov subspace grows in terms of j for n =360 000 and different values of q.

Fig. 1

Experiment 1. Dimension of the constructed extended Krylov subspace and computed normalized residual norms as j grows, i.e., the ADI progresses, for problem size n =360 000

In this experiment, we can notice that the subspace constructed to solve the second shifted linear system, namely \((A + p_{2}I)S_{2} = W_{1}\), is a very rich approximation space in terms of spectral information. Indeed, we need to only slightly expand it to solve the subsequent linear systems without compromising the decrease in the Lyapunov residual norm (see Fig. 1 (right y-axis)). This means that the majority of the computational efforts are dedicated to solving the second linear system, and we can capitalize on them for j > 2, reducing the overall workload of the solution process. We would like to mention that such a phenomenon is partially due to the adaptive selection of the inner tolerance \(\varepsilon _{\mathtt {inn}}^{(j)}\) coming from [38].

We now compare LR-ADI-EKSM(G) with the function mess_lradi of the M-M.E.S.S. package [59], an abstract function handle-based implementation of the LR-ADI, and lradi, a plain matrix-based implementation of the same algorithm.

To this end, we let \(B\in \mathbb {R}^{n}\) be the normalized vector of all ones. To ensure a fair comparison, we employ the shifts computed by LR-ADI-EKSM(G) in all the different implementations. This leads to a very similar trend in the relative residual norms achieved by the routines, even though the shifted linear systems in mess_lradi and lradi are solved to very high accuracy, whereas the relaxation strategy of [38] is implemented in LR-ADI-EKSM(G). In Fig. 2, we report the relative difference between the relative residual norms computed by LR-ADI-EKSM(G) and mess_lradi throughout all the necessary iterations j for different problem dimensions n, along with the values of \(\varepsilon ^{(j)}_{\mathtt {inn}}\) we employed. In agreement with the results presented in [38], we can notice that the distance between the computed relative residual norms is always rather moderate and smaller than \(\varepsilon ^{(j)}_{\texttt {inn}}\). Very similar results are obtained by comparing with the residual norms attained by lradi in place of mess_lradi.

Fig. 2

Experiment 1. Relative gap \(\dfrac{|r_j^{LR-ADI-EKSM}-r_j^{\mathtt{mess\_lradi}}|}{r_j^{\mathtt{mess\_lradi}}}\) between the residual norms \(r_{j}^{\text {LR-ADI-EKSM}}\) and \(r_{j}^{\mathtt {mess\_lradi}}\) computed by LR-ADI-EKSM(G) and mess_lradi, respectively, as j grows, i.e., ADI converges, and different problem sizes n, together with the corresponding inner inexact solver tolerance \(\varepsilon^{(j)}\), denoted \(\varepsilon_{n}\) to relate the problem sizes

We also compare the routines in terms of computation time. The results are collected in Table 3. Since we employ the shifts computed within LR-ADI-EKSM(G) also for mess_lradi and lradi, we do not consider the time devoted to the shift computation when reporting the performances of LR-ADI-EKSM(G) in Table 3.

Table 3 Experiment 1. Computational timings achieved by LR-ADI-EKSM(G), lradi, and mess_lradi for different problem sizes n

The results in Table 3 show that, for this experiment, our proposed scheme combined with the relaxation strategy presented in [38] leads to a remarkable speed-up of the solution process, up to 50%, when compared to a standard implementation of the LR-ADI method.

Experiment 2

In the second experiment, we consider a problem similar to [51, Example 6]. In particular, the matrix A comes from the centered finite difference discretization of the 3-dimensional convection-diffusion operator

$$\mathcal{L}(u)=-\zeta{\Delta} u+\mathbf{w}\cdot\nabla u,$$

on the unit cube with zero Dirichlet boundary conditions. The convection vector w is given by \(\mathbf{w} = (\phi_1(x)\psi_1(y)\pi_1(z),0,\pi_3(z)) = ((1-x^{2})yz,0,e^{z})\), and ζ > 0 denotes the diffusivity. By employing h nodes in each direction, the discretization phase leads to a matrix A that can be written as

$$A=(D_{h}+{\Pi}_{3}N^{\mathsf{T}})\otimes I_{h}\otimes I_{h}+I_{h}\otimes D_{h}\otimes I_{h}+I_{h}\otimes I_{h}\otimes D_{h}+{\Pi}_{1}\otimes{\Psi}_{1}\otimes{\Phi}_{1}N,$$

where \(D_{h}=\zeta {(h-1)}^{2}\cdot \text {tridiag}(-1,2,-1)\in \mathbb {R}^{h\times h}\), \(N=-\frac {(h-1)}{2}\cdot \text {tridiag}(-1,0,1)\in \mathbb {R}^{h\times h}\), and Φi, Ψi, and Πi are diagonal matrices whose diagonal entries correspond to the nodal values of the corresponding functions ϕi, ψi, and πi (see [51] for further details). \(B\in \mathbb {R}^{n}\), \(n = h^{3}\), is a vector with random entries.
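The Kronecker structure above can again be assembled directly with SciPy. The sketch below follows the formula term by term; the placement of the h nodes (equispaced interior points of (0, 1)) and the helper name are our assumptions, so the nodal diagonal matrices are only illustrative.

```python
import numpy as np
import scipy.sparse as sp

def build_convdiff(h, zeta):
    """Assemble the 3D convection-diffusion matrix of Experiment 2
    from its Kronecker form (node placement is an assumption)."""
    x = np.linspace(0.0, 1.0, h + 2)[1:-1]           # assumed interior nodes
    e = np.ones(h - 1)
    D = zeta * (h - 1) ** 2 * sp.diags([-e, 2.0 * np.ones(h), -e], [-1, 0, 1])
    N = -(h - 1) / 2.0 * sp.diags([-e, e], [-1, 1])  # scaled tridiag(-1, 0, 1)
    I = sp.identity(h)
    Phi1, Psi1 = sp.diags(1.0 - x**2), sp.diags(x)   # nodal values of phi_1, psi_1
    Pi1, Pi3 = sp.diags(x), sp.diags(np.exp(x))      # nodal values of pi_1, pi_3
    A = (sp.kron(D + Pi3 @ N.T, sp.kron(I, I))
         + sp.kron(I, sp.kron(D, I))
         + sp.kron(I, sp.kron(I, D))
         + sp.kron(Pi1, sp.kron(Psi1, Phi1 @ N)))
    return sp.csr_matrix(A)

A = build_convdiff(h=8, zeta=0.05)  # n = h^3 = 512
```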

Due to the 3D nature of the problem, the nonsymmetric linear systems with A involved in the basis construction in LR-ADI-EKSM are solved by GMRES [57]. In particular, we employ the GMRES implementation written by Lund et al. [35], namely the function bgmres in [45]. GMRES is stopped whenever the computed relative residual norm gets smaller than \(10^{-10}\).

It is well known that (polynomial) Krylov methods for linear systems need to be preconditioned to achieve a fast convergence in terms of number of iterations. To this end, as suggested in [51], we employ the following preconditioning operator when solving the linear systems with A,

$$\mathcal{P}=(D_{h}+{\Pi}_{3}N^{\mathsf{T}})\otimes I_{h}\otimes I_{h}+I_{h}\otimes D_{h}\otimes I_{h}+I_{h}\otimes I_{h}\otimes D_{h}+\overline\pi_{1}I_{h}\otimes{\Psi}_{1}\otimes{\Phi}_{1}N,$$

where \(\overline \pi _{1}\) is the mean value of the function π1 in [0,1]. At each GMRES iteration, we thus have to invert \(\mathcal {P}\), namely we have to compute \(\overline v=\mathcal {P}^{-1}v\) for \(v\in \mathbb {R}^{n}\). This operation is performed by solving the Sylvester equation

$$(D_{h}\otimes I_{h}+I_{h}\otimes D_{h}+\overline\pi_{1}{\Psi}_{1}\otimes{\Phi}_{1}N) \mathbf{\overline V}+\mathbf{\overline V}{(D_{h}+{\Pi}_{3}N^{\mathsf{T}})}^{\mathsf{T}}=\mathbf{V},$$

where \(\mathbf {\overline V},\mathbf {V}\in \mathbb {R}^{h^{2}\times h}\) are such that \(\text {vec}(\mathbf {\overline V})=\overline v\) and vec(V) = v. Since the coefficient matrices in the equation above have moderate dimensions, the Bartels-Stewart method [6] is employed for its solution and the Schur decompositions of the coefficient matrices are computed once and for all before the iterative procedure starts. We always employ a right preconditioning scheme in order to easily have access to the actual residual norm.
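This preconditioner application can be sketched with SciPy's Bartels-Stewart solver solve_sylvester. The small matrices below are random stand-ins for the two coefficient matrices of the Sylvester equation above, not the actual Kronecker factors; note also that, unlike the paper's implementation, solve_sylvester recomputes the Schur decompositions at every call.

```python
import numpy as np
from scipy.linalg import solve_sylvester

rng = np.random.default_rng(2)
h = 6
# random stand-ins: one coefficient of size h^2 x h^2, one of size h x h
A1 = np.eye(h * h) + 0.01 * rng.standard_normal((h * h, h * h))
A2 = np.eye(h) + 0.01 * rng.standard_normal((h, h))

v = rng.standard_normal(h ** 3)
V = v.reshape(h * h, h, order="F")       # vec(V) = v (column-major stacking)
Vbar = solve_sylvester(A1, A2.T, V)      # solves A1 Vbar + Vbar A2^T = V
vbar = Vbar.flatten(order="F")           # vbar = P^{-1} v
```

The column-major reshape mirrors the vec(·) convention used in the paper, so applying the preconditioner to a vector of length \(h^{3}\) costs only a dense Sylvester solve with coefficients of sizes \(h^{2}\) and h.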

Also, for the shifted linear systems with \(A + p_{j}I\), within mess_lradi and lradi, we employ preconditioned GMRES equipped with the preconditioning operator \(\mathcal {P}+p_{j}I\). Once again, this preconditioner is applied by solving the Sylvester equation

$$(D_{h}\otimes I_{h}+I_{h}\otimes D_{h}+\overline\pi_{1}{\Psi}_{1}\otimes{\Phi}_{1}N) \mathbf{\overline V}+\mathbf{\overline V}{(D_{h}+{\Pi}_{3}N^{\mathsf{T}}+p_{j}I_{h})}^{\mathsf{T}}=\mathbf{V}.$$

Even though this is in general a better preconditioner for \(A + p_{j}I\) than \(\mathcal {P}\), its application involves complex arithmetic whenever Im(pj)≠0, with a consequent increase in the computational effort devoted to the preconditioning step.

For this experiment, lradi is equipped with the relaxation strategy presented in [38].

Also for this experiment, LR-ADI-EKSM(G) and LR-ADI-EKSM(MR) perform very similarly, with LR-ADI-EKSM(MR) achieving slightly better results in terms of computational time. We thus report only the performance of LR-ADI-EKSM(MR).

The results are collected in Table 4 for different values of n and ζ. In Table 4, we also report the number of shifts with nonzero imaginary part.

Table 4 Experiment 2. Computational timings achieved by LR-ADI-EKSM(MR), lradi, and mess_lradi for different problem sizes n and diffusivities ζ

We would like to mention that we ran some experiments with mess_lradi where the shifted linear systems were solved by means of the MATLAB sparse direct solver “backslash” in place of preconditioned GMRES. However, for this example, the potentially higher accuracy of the direct solves did not benefit the computation and the execution times we achieved with “backslash” could not keep up with the ones reported for GMRES in Table 4. We, thus, decided to omit them here.

From the results in Table 4, we can see that LR-ADI-EKSM(MR) is very competitive and always achieves computational timings that are significantly smaller than the ones required by mess_lradi. Thanks to the relaxation procedure coming from [38], lradi performs better than mess_lradi.

The performance of all the tested routines is strictly related to the number of complex shifts needed to converge. When this is sizable with respect to the total number of iterations, many of the n × n linear systems \(A + p_{j}I\) within mess_lradi and lradi involve complex arithmetic, whereas this is needed only in the solution of the small dimensional least squares problem for the computation of Y in LR-ADI-EKSM(MR).

We notice that, for a fixed n, the computational time of LR-ADI-EKSM(MR) generally decreases as ζ is reduced, even though the number of LR-ADI iterations that are implicitly performed increases. This is due to the computational effort required by the solution of the linear systems with A during the basis construction. Indeed, for ζ =0.05, many more GMRES iterations are required than are necessary for ζ =0.005. In Fig. 3, we report the number of GMRES iterations needed to solve the linear system with A at each m, namely every time a new basis vector of the adopted extended Krylov subspace needs to be computed.

Fig. 3

Experiment 2. Number of GMRES iterations needed to solve the linear systems with A during the basis construction in LR-ADI-EKSM(MR) for different values of the diffusivity ζ and n =125 000

A rather large number of GMRES iterations is required for solving the linear systems with A in the case ζ =0.05, making the construction of the basis of \(\mathbf {E}\mathbf {K}^{\square }_{m}(A,B)\) more demanding. On the other hand, few GMRES iterations suffice to meet the prescribed accuracy for ζ =0.005, and the overall solution procedure turns out to be very successful.

Experiment 3

In this experiment, we compare LR-ADI-EKSM also with K-PIK [63], since the two routines construct the same subspace. We consider the thermal part of the thermo-elastic modeling of a building block of an experimental machine tool, given by the following heat equation

$$\left\{\begin{array}{ll} c_{p}\rho\frac{\partial T}{\partial t}&=\lambda{\Delta} T,\quad \text{in }{\Omega},\\ \lambda\frac{\partial T}{\partial \textbf{n}}&=f,\quad \text{on }{\Gamma}_{c}\subset\partial{\Omega}, \\ \lambda\frac{\partial T}{\partial \textbf{n}}&=\alpha(T_{ext}-T),\quad \text{on }{\Gamma}_{ext}\subset\partial{\Omega}, \\ T(0)&=0. \end{array}\right.$$
(23)

The discretization in space using the finite element method (here applying the proprietary tool ANSYS) on the three-dimensional domain, given by the machine frame indicated in Fig. 4, leads to the LTI system

$$E\dot T=\left(A-\sum\limits_{i=1}^{t} \alpha_{i}F_{i}\right) T+Bu(t).$$
(24)

where A represents the discretized Laplacian together with the Robin boundary contributions from Γext, which are represented by the Fi, while B results from the external control inputs (heat fluxes, e.g., induced by the drive motors) on Γc. Note that the elastic part of the thermo-elastic model can be encoded entirely in the output equation of the corresponding dynamical system and is, thus, not relevant here [40]. The algebraic problem resulting from this system amounts to a Lyapunov equation of the form (1). However, due to mass lumping in ANSYS, the mass matrix E is diagonal and SPD. We can, thus, easily invert its square root and consider the Lyapunov equation

Fig. 4

Experiment 3. Finite element grid of the machine frame indicated on the CAD model of the full machine. (Source: DFG CRC/TR-96 (https://transregio96.de))

$$E^{-\frac{1}{2}}\left(A-\sum\limits_{i=1}^{t} \alpha_{i}F_{i}\right)E^{-\frac{1}{2}}\widetilde X + \widetilde XE^{-\frac{1}{2}}{\left(A-\sum\limits_{i=1}^{t} \alpha_{i}F_{i}\right)}^{\mathsf{T}}E^{-\frac{1}{2}}+E^{-\frac{1}{2}}BB^{\mathsf{T}}E^{-\frac{1}{2}}=0,\quad \widetilde X=E^{\frac{1}{2}}XE^{\frac{1}{2}}.$$

So, again, we can efficiently reduce the problem to the form (2). Once a low-rank approximation \(\widetilde Z\widetilde Z^{\mathsf {T}}\) of \(\widetilde X\) is computed, the low-rank factor Z such that \(ZZ^{\mathsf{T}}\approx X\) can be retrieved via \(Z=E^{-\frac {1}{2}}\widetilde Z\).
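For a diagonal SPD mass matrix, this transformation amounts to row and column scalings and can be sketched in a few lines of NumPy (the function name and the dense A are illustrative; in practice A is sparse and only its nonzeros are scaled):

```python
import numpy as np

def to_standard_form(A, B, e_diag):
    """For a diagonal SPD mass matrix E (stored as its diagonal), return
    the standard-form coefficients E^{-1/2} A E^{-1/2} and E^{-1/2} B,
    plus a map recovering Z = E^{-1/2} Ztilde from a computed factor."""
    s = 1.0 / np.sqrt(e_diag)                 # diagonal of E^{-1/2}
    A_std = s[:, None] * A * s[None, :]       # E^{-1/2} A E^{-1/2}
    B_std = s[:, None] * B                    # E^{-1/2} B
    recover = lambda Ztilde: s[:, None] * Ztilde
    return A_std, B_std, recover

# small random stand-ins for A, B, and diag(E)
rng = np.random.default_rng(3)
A = rng.standard_normal((5, 5))
B = rng.standard_normal((5, 2))
e = rng.uniform(1.0, 2.0, size=5)
A_std, B_std, recover = to_standard_form(A, B, e)
```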

The actual machine frame in Fig. 4 consists of several parts itself, which are discretized separately. This leads to differently sized models of the structure in (24), reflected by the rows of Table 5. Accordingly, we solve the Lyapunov equation considering different configurations of the PDE (23) and, correspondingly, of the LTI system (24). In particular, this allows us to vary the number of degrees of freedom employed in the discretization phase, leading to different problem dimensions n, to modify the Neumann boundary conditions, obtaining diverse matrices Fi, and to consider different values for the rank q of B. Moreover, we set αi =10 for all i =1,…,t.

Table 5 Experiment 3. Computational timings achieved by LR-ADI-EKSM(G), K-PIK, and mess_lradi for different values of problem size n, number of Robin boundary conditions t, and rank of the right-hand side q

The results are collected in Table 5. It turns out that the Wachspress ADI shifts [42, 73] are particularly effective for this experiment, since A as well as all the Fi, and thus \(E^{-\frac{1}{2}}(A-\sum_{i=1}^{t}\alpha_{i}F_{i})E^{-\frac{1}{2}}\), are symmetric, i.e., the spectrum is real. These are the ideal circumstances for Wachspress shifts. We, thus, employ those shifts in LR-ADI-EKSM(G) and mess_lradi.

For this experiment, the LR-ADI method, either based on our new formulation or on a standard scheme as the one in mess_lradi, turns out to be more efficient in terms of computational time than K-PIK. Indeed, in spite of the smaller number of iterations K-PIK needs to converge, the large dimension of the extended Krylov subspace it constructs leads to a rather costly solution of the projected equations. LR-ADI-EKSM(G) also requires the construction of an extended Krylov subspace whose dimension is similar to the one computed by K-PIK. However, if \({\dim }\left (\mathbf {E}\mathbf {K}^{\square }_{m}(E^{-\frac {1}{2}}\left (A-{\sum }_{i=1}^{t} \alpha _{i}F_{i}\right )E^{-\frac {1}{2}},E^{-\frac {1}{2}}B)\right )=2mq\), the computational cost of solving the inner problems within LR-ADI-EKSM(G) is \(\mathcal {O}(4m^{2}q^{2})\) floating-point operations (FLOPs), whereas it amounts to \(\mathcal {O}(8m^{3}q^{3})\) FLOPs for K-PIK.

We conclude by mentioning that in this experiment, we relied on the ease of computing \(E^{-\frac {1}{2}}\). However, it may happen that the mass matrix E cannot be easily manipulated, e.g., it may be singular, so that the routine presented in this paper cannot be readily applied as we have done in this experiment. We plan to extend the LR-ADI-EKSM framework to this more challenging class of equations in the near future.

Experiment 4

In the last experiment, we show that the proposed framework still needs some further improvements to efficiently deal with generalized Lyapunov equations of the form (1) where the mass matrix E is not diagonal. To this end, we consider the Steel Profile data set [18, 49] from the MORwiki repository [72].

We compute the observability Gramian of the system, namely the solution X to the equation

$$A^{\mathsf{T}}XE+E^{\mathsf{T}}XA+C^{\mathsf{T}}C=0,$$
(25)

where \(A\in \mathbb {R}^{n\times n}\) is symmetric negative definite, \(C\in \mathbb {R}^{q\times n}\), q =6, and \(E\in \mathbb {R}^{n\times n}\) is SPD but not diagonal (see [17] for further details on the model).

If E = LLT denotes the Cholesky factorization of E, we consider the transformed equation

$$(L^{-1}A^{\mathsf{T}}L^{-\mathsf{T}})\widetilde X+\widetilde X(L^{-1}AL^{-\mathsf{T}})+L^{-1}C^{\mathsf{T}}CL^{-\mathsf{T}}=0,\quad \widetilde X=L^{\mathsf{T}}XL,$$
(26)

and, due to the symmetry of A, employ the extended Krylov subspace \(\mathbf {EK}_{m}^{\square }(L^{-1}AL^{-\mathsf {T}},L^{-1}C^{\mathsf {T}})\) as approximation space. Notice that the matrix \(L^{-1}AL^{-\mathsf{T}}\) does not need to be explicitly constructed (see, e.g., [63, Example 5.4]). As before, once \(\widetilde Z\widetilde Z^{\mathsf {T}}\approx \widetilde X\) is computed, we obtain a low-rank approximation to the original X via \(Z=L^{-\mathsf {T}}\widetilde Z\).
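Avoiding the explicit construction of \(L^{-1}AL^{-\mathsf{T}}\) amounts to two triangular solves per application; a minimal sketch with random SPD stand-ins for E and A (the helper name is ours):

```python
import numpy as np
from scipy.linalg import cholesky, solve_triangular

def apply_transformed(A, L, V):
    """Apply L^{-1} A L^{-T} to a block V without forming the matrix
    explicitly, where E = L L^T is the Cholesky factorization of E."""
    W = solve_triangular(L, V, lower=True, trans="T")   # W = L^{-T} V
    return solve_triangular(L, A @ W, lower=True)       # L^{-1} (A W)

# small random stand-ins (hypothetical data, not the Steel Profile model)
rng = np.random.default_rng(4)
n = 6
M = rng.standard_normal((n, n))
E = M @ M.T + n * np.eye(n)                  # SPD stand-in for the mass matrix
A = -(M + M.T) - n * np.eye(n)               # symmetric stand-in for A
L = cholesky(E, lower=True)
V = rng.standard_normal((n, 2))
Y = apply_transformed(A, L, V)
```

When A is sparse, each application thus costs one sparse matrix-block product plus two sparse triangular solves, so the transformed operator stays as cheap as the original one.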

In Table 6, we report the results achieved by LR-ADI-EKSM(G) and \({\tt mess\_lradi}\) for different values of n.

Table 6 Experiment 4. Computational timings achieved by LR-ADI-EKSM(G) and mess_lradi for different values of the problem size n

From the results in Table 6, we can readily see that the standard scheme of the LR-ADI method implemented in mess_lradi is much faster than LR-ADI-EKSM(G). This is due to the fact that the latter algorithm needs to construct a rather large subspace to achieve the prescribed accuracy, with a consequent increase in the computational effort of the overall procedure.

We also mention that the rank of the approximate solution computed by LR-ADI-EKSM(G) is much lower than the dimension of the constructed subspace. We believe that the transformation we performed in (26), and thus the use of \(\mathbf {E}\mathbf {K}_{m}^{\square }(L^{-1}AL^{-\mathsf {T}},L^{-1}C^{\mathsf{T}})\), may lead to some spectral redundancy in the adopted approximation subspace and to a slower convergence of the method. On the other hand, mess_lradi is able to deal with the original formulation (25) of the problem.

To address generalized equations of the form (1), we plan to study the employment of different techniques within the Krylov LR-ADI framework we presented in this paper. In particular, the use of nonstandard inner products and (extended) generalized Krylov subspaces [43] will be explored.

6 Conclusions

A new formulation of the LR-ADI algorithm for large-scale standard Lyapunov equations has been proposed. The computational core of the LR-ADI scheme consists of the solution of a shifted linear system at each iteration. We showed that the extended Krylov subspace method is a valid candidate for this task. In particular, we described how only one extended Krylov subspace needs to be constructed to solve all the linear systems required by the LR-ADI method. The LR-ADI iteration has been completely merged into the extended Krylov subspace method for shifted linear systems, resulting in a novel, efficient solution procedure. We also showed that many state-of-the-art algorithms for the shift computation can be easily integrated into our new scheme. Numerical results demonstrate the potential of our novel algorithm, especially when it is equipped with the relaxation strategy proposed in [38] and many complex shifts are needed to converge.

In future work, we will consider more involved Lyapunov equations of the form (1) that cannot be easily transformed into (2). While standard implementations of the LR-ADI method naturally address such a scenario by solving linear systems of the form \(A + p_{j}E\), further care has to be taken to employ the scheme we presented in this paper. Indeed, the shifted Arnoldi relation (7) can no longer be exploited. The use of non-standard inner products and generalized Krylov subspace methods [43] will be investigated.

The framework presented in this paper can be generalized to enhance other LR-ADI-like algorithms for matrix equations. For instance, the LR-ADI method for Sylvester equations [14] or LR-RADI schemes for Riccati equations [10, 22] can be equipped with a procedure similar to the one we proposed here.