1 Introduction

Computing all eigenvalues of a small to medium-sized matrix pencil \(H-\lambda K\) is nowadays a routine task that shows up in many applications. The method of choice is the QZ algorithm, which uses implicit QZ-type steps implementing a bulge chasing technique. On the other hand, projection methods are often used to compute a subset of the eigenvalues of sparse, large-scale eigenproblems, and Krylov subspace methods are probably the most widely used methods within this class. Even though the algorithms are totally different and target different problems, Krylov and QZ methods are intimately linked; the convergence of the QZ algorithm can be supported and interpreted entirely by relying on Krylov theory. The rational QZ algorithm (which we will abbreviate as RQZ) is a numerical scheme that extends the ideas of the QZ algorithm and links to rational Krylov methods. It has been shown to be quite competitive with the QZ algorithm [3] because of its enhanced convergence behavior. It uses so-called RQZ steps, which are pole swapping techniques on a Hessenberg-Hessenberg pencil; these not only resemble bulge chasing, but also incorporate rational Krylov subspace ideas [2, 3].

The perfect shift strategy for Hessenberg pencils arises naturally in the downdating setting of orthogonal rational functions as described by Van Buggenhout, Van Barel, and Vandebril [10]. Consider a given finite discrete inner product

$$\begin{aligned} \langle f,g\rangle _{m} := \sum _{i=1}^{m} \vert w_i\vert ^2 \overline{g(z_i)} f(z_i), \end{aligned}$$
(1)

with nodes \(z_i\) and weights \(w_i\). One wishes to construct a set of orthogonal rational functions, with prescribed poles, for this inner product. Instead of constructing the orthogonal rational functions, it is often numerically more reliable to store the recurrences for generating these functions. These recurrences are stored in a Hessenberg pencil \(H-\lambda K\), satisfying

$$\begin{aligned} VH = \Lambda V K, \quad V^H V = I, \quad Ve_1 = w/\Vert w\Vert , \end{aligned}$$
(2)

where w contains the weights \(w_i\), \(\Lambda \) is a diagonal matrix with the nodes \(z_i\) on its diagonal, and \(H-\lambda K\) is a Hessenberg pencil whose subdiagonal ratios equal the poles. The relations (2) express that the rows of V are the left eigenvectors of the pencil \(H-\lambda K\) and that the diagonal elements of \(\Lambda \) are the corresponding eigenvalues. The chosen nodes, weights, and poles are of course problem specific and could possibly change when, for instance, the problem changes over time. To add or remove nodes, weights, and poles, we refer to the work of Van Buggenhout, Van Barel, and Vandebril [8,9,10]. For removing nodes, one downdates the problem. Say we want to remove node \(z_j\), for some \(j\in \{1,\ldots ,m\}\). Then we need to construct unitary transformations Z and Q such that the transformed relations

$$\begin{aligned} (VZ) (Z^H H Q) = \Lambda (V Z) (Z^HK Q) \end{aligned}$$

allow us to deflate an eigenvalue in the upper left corner of the pencil \(Z^H H Q-\lambda Z^HK Q\). The remaining lower right \((n-1)\times (n-1)\) part \(\tilde{H}-\lambda \tilde{K}\) satisfies the relation

$$\begin{aligned} \tilde{V} \tilde{H} = \tilde{\Lambda } \tilde{V} \tilde{K}, \quad \tilde{V}^H \tilde{V} = I, \quad \tilde{V}e_1 = \tilde{w}/\Vert \tilde{w}\Vert , \end{aligned}$$

providing the recurrences for the inner product

$$\begin{aligned} \langle f,g\rangle _{m-1} := \sum _{i=1, i\ne j}^{m} \vert w_i\vert ^2 \overline{g(z_i)} f(z_i), \end{aligned}$$
(3)

where \(\tilde{\Lambda }\) and \(\tilde{w}\) have node \(z_j\) and weight \(w_j\) removed. The exact deflation of the removed eigenvalue corresponds to the problem of deflating a perfect shift using a backward rational QZ step.
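To make the downdating concrete, the following sketch (with ad hoc nodes, weights, and function names of our own choosing) evaluates the inner product (1) and its downdated version (3) after one node is removed:

```python
import numpy as np

def discrete_inner_product(f, g, z, w):
    """Evaluate <f, g> = sum_i |w_i|^2 conj(g(z_i)) f(z_i), as in (1)."""
    return np.sum(np.abs(w) ** 2 * np.conj(g(z)) * f(z))

# example nodes and weights (ad hoc, m = 5)
z = np.linspace(-1.0, 1.0, 5)
w = np.ones(5)
f = lambda t: t            # f(z) = z
g = lambda t: t ** 2 + 1   # g(z) = z^2 + 1

ip_full = discrete_inner_product(f, g, z, w)

# downdating: dropping node z_j (here j = 2) yields the inner product (3)
mask = np.arange(z.size) != 2
ip_down = discrete_inner_product(f, g, z[mask], w[mask])

# the downdated value is the full sum minus the removed term
assert np.isclose(ip_down,
                  ip_full - np.abs(w[2]) ** 2 * np.conj(g(z[2])) * f(z[2]))
```

The algorithm discussed in this paper performs the same removal implicitly, by working on the recurrence pencil \(H-\lambda K\) rather than on the inner product itself.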

We consider only real matrix pencils and the deflation of a real eigenvalue or of a pair of complex conjugate eigenvalues. Using complex arithmetic avoids the problem of treating complex conjugate pairs together and is thus simpler; the extension to complex pencils is therefore not treated here. We will use the following notations. Matrices and submatrices are denoted by capital letters, i.e., \(A, B, H\). The entry \((i,j)\) of the matrix H is denoted by the lowercase letter \(h_{i,j}. \) Vectors are denoted by bold letters, i.e., \( \textbf{a,b,} \ldots \). The identity matrix of order n is denoted by \(I_n \) and its i-th column by \( {\textbf{e}}_i^{(n)}, \) or, if there is no ambiguity, simply by I and \( {\textbf{e}}_i,\) respectively. Generic entries different from zero in matrices or vectors are denoted by “\(\times \).” The machine precision is denoted by \(\epsilon _M\). We denote a Givens rotation between two adjacent rows or columns i and \(i+1\) by

$$ G_{i} = \left[ \begin{array}{cccc} I_{i-1} &{} &{} &{} \\ &{} c &{} -s &{} \\ {} &{} s &{} c &{} \\ {} &{} &{} &{} I_{n-i-1} \end{array} \right] , \quad \left[ \begin{array}{cc} c &{} -s \\ s &{} c \end{array} \right] \left[ \begin{array}{cc} c &{} -s \\ s &{} c \end{array} \right] ^T =I_2. $$
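In floating point arithmetic, such a rotation is computed from the pair (a, b) it must annihilate. A minimal sketch (the helper name is ours), following the sign convention adopted later in the paper (\(s>0\) whenever \(s\ne 0\), and \(c=1\) when \(s=0\)):

```python
import numpy as np

def givens(a, b):
    """Compute (c, s) with [[c, -s], [s, c]] @ (a, b) = (r, 0), following
    the convention s > 0 whenever s != 0, and c = 1 when s = 0."""
    r = np.hypot(a, b)
    if r == 0.0 or b == 0.0:
        return 1.0, 0.0
    c, s = a / r, -b / r
    return (-c, -s) if s < 0.0 else (c, s)

c, s = givens(3.0, 4.0)
G = np.array([[c, -s], [s, c]])
assert np.allclose(G @ G.T, np.eye(2))             # orthogonality, as above
assert abs((G @ np.array([3.0, 4.0]))[1]) < 1e-12  # second entry annihilated
assert s > 0.0                                     # sign convention
```

Embedding this \(2\times 2\) block into rows/columns i and \(i+1\) of the identity gives the matrix \(G_i\) displayed above.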

The rest of the paper is organized as follows. In Sect. 2, we discuss the special form of a Hessenberg-Hessenberg pencil, which is the basis for performing a perfect shift RQZ-step. In Sect. 3, we give the main result of this paper: we derive a more robust method for implementing the RQZ step so that the perfect shift can be deflated at the top of the pencil. In Sects. 4 and 5, we look at two important aspects of our algorithm, namely how to improve the accuracy of an eigenvalue/eigenvector pair and how to scale the pencil in order to improve the residual of this approximation. In Sect. 6, we illustrate the performance of our algorithm with several numerical experiments.

2 Preliminary Hessenberg–Hessenberg form

The rational QZ algorithm for the generalized eigenvalue problem of a regular pencil \(A-\lambda B\) assumes that one first reduces the pencil to a Hessenberg–Hessenberg form. This form can be obtained using orthogonal transformations U and V such that the transformed pencil \(H-\lambda K:=V^T(A-\lambda B)U\) consists of two Hessenberg matrices:

$$\begin{aligned} H-\lambda K := \left[ \begin{array}{ccccc} h_{1,1} &{} \ldots &{} \ldots &{} h_{1,n} \\ h_{2,1} &{} \ddots &{} &{} \vdots \\ &{} \ddots &{} \ddots &{} \vdots \\ &{} &{} h_{n,n-1} &{} h_{n,n} \end{array} \right] - \lambda \left[ \begin{array}{ccccc} k_{1,1} &{} \ldots &{} \ldots &{} k_{1,n} \\ k_{2,1} &{} \ddots &{} &{} \vdots \\ &{} \ddots &{} \ddots &{} \vdots \\ &{} &{} k_{n,n-1} &{} k_{n,n} \end{array} \right] . \end{aligned}$$
(4)

Such a form can be obtained by direct construction or by running a rational Krylov algorithm [2, 4]. These will be called HH pencils. The fact that the pencil \(H-\lambda K\) is unreduced is equivalent to asking that the subpencil

$$ H_p-\lambda K_p:= \left[ \begin{array}{cc} 0&I_{n-1} \end{array} \right] ( H-\lambda K ) \left[ \begin{array}{cc} I_{n-1} \\ 0 \end{array} \right] $$

is regular, or that the scalar pencils \(h_{i+1,i}-\lambda k_{i+1,i}\) are regular for \(1\le i < n\). The subpencil \(H_p-\lambda K_p\) is called the “pole pencil” of \(H-\lambda K\), as its eigenvalues are the poles of the RQZ algorithm [3]. In the next section, we will analyze the construction of a backward RQZ step and compare different ways to compute such a step. We first go over a number of assumptions that are used in our analysis.

  • Assumption (A1): \(\det (H-\lambda K)\ne 0\) for almost all \(\lambda \). This is well-known to hold generically and is necessary and sufficient for the definition of the generalized eigenvalues of \(H-\lambda K\). Such a pencil \(H-\lambda K\) is said to be regular.

  • Assumption (A2): \(\det (H_p-\lambda K_p)\ne 0\) for almost all \(\lambda \), meaning that the “pole pencil” \(H_p-\lambda K_p\) is regular, which also holds generically and is necessary and sufficient for the definition of the poles of the HH pencil. We call such an HH pencil unreduced.

  • Assumption (A3): \(H-\lambda K\) is proper, meaning that the subpencil

    $$ \left[ \begin{array}{cc} h_{n,n-1}&h_{n,n} \end{array} \right] - \lambda \left[ \begin{array}{cc} k_{n,n-1}&k_{n,n} \end{array} \right] $$

    has no zeros. Again, this holds generically.

  • Assumption (A4): The perfect shift \(\lambda _0\) is not a pole of \(H-\lambda K\), i.e., \(\det (H_p-\lambda _0 K_p)\ne 0\). This also holds generically.

Assumptions (A1) and (A2) will be assumed throughout the paper, since this is needed for the definition of generalized eigenvalues and poles of the HH pencil.
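These assumptions can be probed numerically. For an unreduced HH pencil the pole pencil is upper triangular, so the poles are simply the ratios \(h_{i+1,i}/k_{i+1,i}\); the following sketch (the function name and the tolerances are ad hoc, not from the paper) checks (A1)-(A4) in this spirit:

```python
import numpy as np

def check_hh_assumptions(H, K, lam0, tol=1e-12):
    """Heuristic numerical check of assumptions (A1)-(A4) for a real
    HH pencil H - lambda*K (illustrative sketch only)."""
    mu = 0.987654321                          # arbitrary evaluation point
    # (A1): det(H - lambda*K) not identically zero -> regular pencil
    a1 = abs(np.linalg.det(H - mu * K)) > tol
    # (A2): all scalar subdiagonal pencils h_{i+1,i} - lambda*k_{i+1,i}
    # are regular, i.e., (h_{i+1,i}, k_{i+1,i}) != (0, 0)  -> unreduced
    hs, ks = np.diag(H, -1), np.diag(K, -1)
    a2 = bool(np.all(np.hypot(hs, ks) > tol))
    # (A3): the trailing 1x2 pencil has no zeros <=> its two rows are
    # linearly independent (properness)
    a3 = abs(H[-1, -2] * K[-1, -1] - H[-1, -1] * K[-1, -2]) > tol
    # (A4): the shift differs from every pole, since
    # det(H_p - lam0*K_p) = prod_i (h_{i+1,i} - lam0*k_{i+1,i})
    a4 = bool(np.all(np.abs(hs - lam0 * ks) > tol))
    return a1, a2, a3, a4

rng = np.random.default_rng(0)
H = np.triu(rng.standard_normal((5, 5)), -1)
K = np.triu(rng.standard_normal((5, 5)), -1)
print(check_hh_assumptions(H, K, 0.123))              # generically all hold
print(check_hh_assumptions(H, K, H[1, 0] / K[1, 0]))  # (A4) fails: shift = pole
```

Such checks only detect exact degeneracies; near-degenerate cases are precisely what the robust method of Sect. 3 is designed to handle.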

A possible extension of the above Hessenberg-Hessenberg structure occurs when the pole pencil \(H_p-\lambda K_p\) is block upper triangular, with diagonal sub-blocks of dimensions \(k_i\times k_i\). Such a block structure will be called a block-HH form. It will be discussed later on, but only for the case that the diagonal blocks have sizes \(k_i=1\) or 2.

3 Perfect shift of an unreduced HH pencil

In this section, we consider the case that the pole pencil \(H_p-\lambda K_p\) has all block-sizes \(k_i\) equal to 1. This is the simplest case and it allows us to compare the standard RQZ approach with the eigenvector method presented in this paper.

3.1 Deflating a real eigenvalue \(\lambda _0\)

We assume here that we are given a regular pencil \(H-\lambda K\) that is already in HH form, and that it is unreduced. If not, the operations described below can be applied to each unreduced subpencil of a general HH pencil. We also assume that assumptions (A3) and (A4) hold.

Let \(\lambda _0\) be a real eigenvalue of \(H-\lambda K\), then we represent it as

$$\lambda _0:=\alpha _0/\beta _0, \quad \alpha _0^2+\beta _0^2=1, \quad \beta _0\ge 0 \quad (\alpha _0=1, \beta _0=0 \;\textrm{if}\; \lambda _0=\infty ). $$

In exact arithmetic, if we then perform one backward RQZ step with shift \(\lambda _0\), the pencil

$$\hat{H}- \lambda \hat{K}:= Z^T (H - \lambda K) Q $$

is still in HH form with its first column proportional to \({\textbf{e}}_1\), and Q and Z are both unreduced Hessenberg matrices formed by the product of \(n-1\) Givens rotations. Unfortunately, the shift \((\alpha _0-\lambda \beta _0)\) may ultimately fail to appear accurately in the (1, 1) position because of a phenomenon known as “blurring of the shift.” Therefore, we need to consider an alternative construction of the RQZ step, which we describe in the following theorem. Since we want to relate the rotations used in this theorem to those of the RQZ algorithm, we make them unique by choosing the sign of s always positive when \(s\ne 0\), and by choosing \(c=1\) when \(s=0\).

Theorem 1

Let \(H-\lambda K\) be a real proper HH pencil with real eigenvalue \(\lambda _0=\alpha _0/\beta _0\) of absolute value \(\mid \lambda _0 \mid \) bounded by 1 and normalized using \(\alpha _0^2+\beta _0^2=1\). Let \(\lambda _0\) not be a pole of \(H-\lambda K\) and define the Hessenberg matrix \(M:= (\beta _0 H - \alpha _0 K)\). Then

  1.

    the pencil \(H-\lambda K\) has a normalized real eigenvector \({\textbf{x}}\) corresponding to \(\lambda _0=\alpha _0/\beta _0\):

    $$ (\beta _0 H - \alpha _0 K){\textbf{x}}=M{\textbf{x}}=0, \quad \mid \mid {\textbf{x}}\mid \mid _2=1, $$

    which is unique up to a scale factor \(\pm 1\), and has a nonzero last component \(x_n\); therefore, there is an “essentially unique” orthogonal transformation \(Q:=G^{(r)}_{1} \ldots G^{(r)}_{n-1}\) that transforms \({\textbf{x}}\) to \(Q{\textbf{x}}=\pm {\textbf{e}}_1\);

  2.

    there is a corresponding “essentially unique” sequence of rotations \(G^{(\ell )}_{n-\!1}, \ldots , G^{(\ell )}_{1}\) guaranteeing that the products

    $$\begin{aligned} Q:= G^{(r)}_{1}G^{(r)}_{2}\cdots G^{(r)}_{n-1}, \quad Z :=G^{(\ell )}_{1}G^{(\ell )}_{2}\cdots G^{(\ell )}_{n-1}, \end{aligned}$$
    (5)

    are both Hessenberg and transform the triple \( (H,K,{\textbf{x}})\) to an equivalent one

    $$ (\hat{H},\hat{K},\hat{{\textbf{x}}}):=(ZHQ^T, ZKQ^T,Q{\textbf{x}}) $$

    where

    $$ \hat{{\textbf{x}}} = \pm {\textbf{e}}_1, \quad (\beta _0 \hat{H}-\alpha _0 \hat{K}) {\textbf{e}}_1 = 0, $$

    and \(\hat{H}-\lambda \hat{K}\) is in HH form.

Proof

To prove item 1, we point out that the normalized eigenvector \({\textbf{x}}\) is unique (up to a scaling factor \(\pm 1\)) because it is the solution of \(M{\textbf{x}}=0\), where M has rank \(n-1\): M is an unreduced Hessenberg matrix, since the pencil \(H-\lambda K\) satisfies assumption (A4). For the same reason, the last component \(x_n\) is nonzero, since otherwise the whole vector \({\textbf{x}}\) would be zero. The reduction of \({\textbf{x}}\) to \( \hat{{\textbf{x}}} = Q{\textbf{x}}= \pm {\textbf{e}}_1\) then requires a sequence of Givens rotations

$$ G^{(r)}_{i-1} \in {\mathbb {R}}^{ n \times n}, \quad i =n, n-1, \ldots ,2, $$

in order to eliminate the entries \(x_i,\; i=n,n-1, \ldots , 2 \) of the vector \( {\textbf{x}}. \) By choosing the sign of s in these Givens rotations positive, we make them unique.

For item 2, we point out that after the first transformation \(G^{(r)}_{n-1}\) we have an updated pole pencil

$$\begin{aligned} \tilde{H}_p-\lambda \tilde{K}_p = \left[ \begin{array}{cc} 0&I_{n-1} \end{array} \right] ( H-\lambda K )G^{(r)T}_{n-1} \left[ \begin{array}{cc} I_{n-1} \\ 0 \end{array} \right] \end{aligned}$$
(6)

that is still in generalized Schur form, but its last column has been changed and has the shift \(\lambda _0\) as new pole in the bottom position. This follows from (6) which implies

$$ \left[ \begin{array}{cc} \tilde{h}_{n,n-1}-\lambda _0 \tilde{k}_{n,n-1}&\tilde{h}_{n,n}-\lambda _0 \tilde{k}_{n,n} \end{array}\right] \left[ \begin{array}{c} x_{n-1} \\ 0 \end{array}\right] =0, \quad \textrm{where} \quad x_{n-1}\ne 0. $$

It can also be viewed as a special case of Lemma 3 with \(k=1\) and \(n=1\). Each subsequent pair of rotations \(G^{(\ell )}_{i+1}\) and \(G^{(r)T}_i\) then moves the perfect shift \(\lambda _0\) one position up in the pole pencil \(\tilde{H}_p -\lambda \tilde{K}_p\). First, \(G^{(r)}_i\) moves the trailing nonzero element of \({\textbf{x}}\) one position up. Then \(G^{(r)T}_i\) is applied to the columns of the pencil, creating a bulge in the Hessenberg matrices H and K, which is then annihilated by the left Givens transformation \(G^{(\ell )}_{i+1}\). The fact that the Hessenberg form is restored in both H and K follows from Lemma 4 with \(k=1\) and \(n=1\). Therefore, the pole pencil

$$\left[ \begin{array}{cc} 0&I_{n-1} \end{array} \right] G^{(\ell )}_{2}\cdots G^{(\ell )}_{n-1}(H_p-\lambda K_p) G^{(r)T}_{n-1}\cdots G^{(r)T}_{1}\left[ \begin{array}{cc} I_{n-1} \\ 0 \end{array} \right] $$

has the perfect shift in its top diagonal, and then the final left rotation \(G^{(\ell )}_{1}\) moves it to the top diagonal position of the pencil \(\hat{H} -\lambda \hat{K}\) (see Lemma 5 with \(n=1\)). Therefore, all the poles moved one position down, and the bottom one disappeared. All these transformations are “essentially” unique, since they implement the swapping of the eigenvalue \(\lambda _0\) with one of the eigenvalues of \(\hat{H}_p-\lambda \hat{K}_p\).\(\square \)

The reduction described in Theorem 1, transforming an eigenvector \( {\textbf{x}}\) corresponding to a real eigenvalue \( \lambda _0 \) into a multiple of \( {\textbf{e}}_1, \) and modifying the matrices H and K, is graphically depicted in Fig. 1, for \( n=6.\) We display the evolution of the triple \((H,K,{\textbf{x}})\).

Fig. 1

Graphical description of the reduction of an eigenvector \( {\textbf{x}}\) to a multiple of \( {\textbf{e}}_1\). The matrices H and K were scaled to have norm 1, and \(\epsilon _M\)-small elements were put equal to zero

In particular, a generic nonzero entry is denoted by “\(\times \),” an entry to be annihilated by “\(\otimes \)” and the entries becoming zero, as a consequence of the multiplication by a Givens matrix, by “\(\boxtimes \).”
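The proof above translates almost line by line into the following sketch (an illustrative dense implementation for \(\vert \lambda _0 \vert \le 1\); for clarity the eigenvector is computed from a dense SVD of M, whereas a structure-exploiting solver would be preferred, and the entries that vanish in exact arithmetic are set to zero explicitly, as in Fig. 1):

```python
import numpy as np

def givens(a, b):
    """(c, s) such that [[c, -s], [s, c]] @ (a, b) = (r, 0), with s >= 0."""
    r = np.hypot(a, b)
    if r == 0.0 or b == 0.0:
        return 1.0, 0.0
    c, s = a / r, -b / r
    return (-c, -s) if s < 0.0 else (c, s)

def deflate_real_shift(H, K, lam0):
    """Backward RQZ step via the eigenvector method of Theorem 1 for a
    real shift with |lam0| <= 1 (dense illustrative sketch)."""
    n = H.shape[0]
    b0 = 1.0 / np.hypot(lam0, 1.0)
    a0 = lam0 * b0                           # lam0 = a0/b0, a0^2 + b0^2 = 1
    M = b0 * H - a0 * K                      # singular Hessenberg matrix
    x = np.linalg.svd(M)[2][-1]              # null vector: M @ x ~ 0
    H, K, x = H.copy(), K.copy(), x.copy()
    for i in range(n - 2, -1, -1):           # right rotations G_i^{(r)}
        c, s = givens(x[i], x[i + 1])
        G = np.array([[c, -s], [s, c]])
        x[i:i + 2] = G @ x[i:i + 2]
        x[i + 1] = 0.0                       # annihilated by construction
        H[:, i:i + 2] = H[:, i:i + 2] @ G.T  # bulge appears at (i+2, i)
        K[:, i:i + 2] = K[:, i:i + 2] @ G.T
        if i + 2 < n:                        # left rotation G_{i+1}^{(l)}
            c, s = givens(K[i + 1, i], K[i + 2, i])
            G = np.array([[c, -s], [s, c]])
            H[i + 1:i + 3, :] = G @ H[i + 1:i + 3, :]
            K[i + 1:i + 3, :] = G @ K[i + 1:i + 3, :]
            K[i + 2, i] = 0.0                # K bulge zeroed by the rotation
            H[i + 2, i] = 0.0                # H bulge: zero in exact arithmetic
    # final left rotation G_1^{(l)}: the first columns of H and K are now
    # parallel, so zeroing k_{2,1} deflates h_{2,1} as well
    c, s = givens(K[0, 0], K[1, 0])
    G = np.array([[c, -s], [s, c]])
    H[0:2, :] = G @ H[0:2, :]
    K[0:2, :] = G @ K[0:2, :]
    K[1, 0] = 0.0
    H[1, 0] = 0.0
    return H, K

# demo: plant the real eigenvalue 0.5 in a random 7x7 HH pencil
rng = np.random.default_rng(0)
n = 7
H0 = np.triu(rng.standard_normal((n, n)), -1)
K0 = np.triu(rng.standard_normal((n, n)), -1)
x0 = rng.standard_normal(n)
x0[-1] = 1.0
H0[:, -1] += 0.5 * (K0 @ x0) - H0 @ x0   # now H0 @ x0 = 0.5 * K0 @ x0
Hh, Kh = deflate_real_shift(H0, K0, 0.5)
print(Hh[0, 0] / Kh[0, 0])               # approximately 0.5, deflated at the top
```

The key invariant is \(\tilde{M}\tilde{{\textbf{x}}}=0\) throughout: it forces the bulges of H and K to be proportional to \((\alpha _0,\beta _0)\), so the single left rotation that zeroes the K bulge also zeroes the H bulge in exact arithmetic.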

Remark 1

The implicit Q theorem for regular HH pencils is closely related to Theorem 1. It implies that the transformations Q and Z can also be determined from the first rotation \(G_{n-1}^{(r)} \) that computes

$$\begin{aligned} \left[ \begin{array}{@{}cc@{}} m_{n,n-1}&m_{n,n} \end{array} \right] G_{n-1}^{(r)T} = \left[ \begin{array}{cc} 0&\times \end{array} \right] \end{aligned}$$
(7)

and from the fact that the pair \( (ZHQ^T,ZKQ^T) \) is still in HH form. This is known as “swapping the poles” and corresponds to “chasing the bulge” [12] in the QZ algorithm.

Remark 2

Theorem 1 gives an alternative way to determine the sequences of right rotations \(Q:=G^{(r)}_{1}G^{(r)}_{2}\cdots G^{(r)}_{n-1}\) and left rotations \(Z:=G^{(\ell )}_{1}G^{(\ell )}_{2}\cdots G^{(\ell )}_{n-1}\) to implement an implicit RQZ-step. First one determines Q from \(Q{\textbf{x}}=\pm {\textbf{e}}_1\), and then one determines Z from the restoration of the Hessenberg form of K if \(\mid \lambda _0 \mid \le 1\) and of H if \(\mid \lambda _0 \mid > 1\), as indicated in Lemma 4. These particular choices are made to ensure numerical stability, as will be shown later on.

Although these different approaches are equivalent under exact arithmetic, their numerical behavior is different. We refer for this to Example 2.1 of [6], where a \(3\times 3\) Hessenberg matrix H of a standard eigenvalue problem is given which can be seen as a special case of a Hessenberg–Hessenberg pencil \(H-\lambda K\) with \(K=I\) and all its poles at infinity. The RQZ algorithm then reduces to the standard QR and will yield the same results. It was shown in [6] that the eigenvector approach is then the more reliable method for implementing the perfect shift.

3.2 Importance of the assumptions

In this subsection, we give two examples to illustrate the differences between the RQZ and the eigenvector method. The first example shows that when assumption (A4) is dropped, these two methods are not equivalent anymore.

Example 1

Consider the pencil

$$ \left[ \begin{array}{cccc} 1 &{} 1 &{} 0 &{} 0 \\ 1 &{} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} 2 &{} 0 \end{array} \right] - \lambda \left[ \begin{array}{cccc} 0 &{} 0 &{} 0 &{} 1 \\ 1 &{} 0 &{} 0 &{} 0 \\ 0 &{} 1 &{} 0 &{} 0 \\ 0 &{} 0 &{} 1 &{} 1 \end{array} \right] $$

and the shift \(\lambda _0=0\). Its eigenvalues are 0, 0, 1 and 2, and the two eigenvalues 0 belong to one Jordan block. Therefore, \(\lambda _0\) has only one eigenvector \({\textbf{x}}={\textbf{e}}_4\). Assumption (A3) is satisfied, but assumption (A4) is not. The eigenvector method will then use three adjacent permutations to transform \({\textbf{e}}_4\) to \({\textbf{e}}_1\), yielding the transformed HH pencil (where \(c=\sqrt{2}/2\))

$$ \left[ \begin{array}{c|rrc} 0 &{} -c &{} -c &{} -2c \\ \hline 0 &{} c &{} c &{} -2c \\ 0 &{} 1 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 \end{array} \right] - \lambda \left[ \begin{array}{c|ccc} 2c &{} 0 &{} 0 &{} -c \\ \hline 0 &{} 0 &{} 0 &{} -c \\ 0 &{} 1 &{} 0 &{} 0 \\ 0 &{} 0 &{} \;1\; &{} 0 \end{array} \right] . $$

The RQZ method, on the other hand, will obtain after the first column rotation \(G^{(r)}_3\) the pencil

$$ \left[ \begin{array}{c|cc|c} 1 &{} 1 &{} 0 &{} 0 \\ 1 &{} 0 &{} 0 &{} 0 \\ \hline 0 &{} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 &{} 2 \end{array} \right] - \lambda \left[ \begin{array}{c|cc|c} 0 &{} 0 &{} 1 &{} 0 \\ 1 &{} 0 &{} 0 &{} 0 \\ \hline 0 &{} 1 &{} 0 &{} 0 \\ 0 &{} 0 &{} 1 &{} 1 \end{array} \right] $$

and has to perform swapping on the \(2\times 2\) subpencil \( \left[ \begin{array}{cc} 0 &{} 0 \\ 0 &{} 0 \end{array} \right] - \lambda \left[ \begin{array}{cc} 1 &{} 0 \\ 0 &{} 1 \end{array} \right] \), which is an ill-posed problem and has non-unique solutions. Therefore the RQZ method does not have a unique way to proceed further when Assumption (A4) does not hold.

Remark 3

When the pencil \(H-\lambda K\) has one or more poles coalescent with the shift \(\lambda _0\), the matrix \(M:=\beta _0H-\alpha _0K\) is no longer unreduced, and the proof that \(x_n\) is nonzero does not hold anymore. But it is easy to verify that if the eigenvector \({\textbf{x}}\) is unique, then one (and only one) of the unreduced Hessenberg blocks of M is singular, and that \(x_n\ne 0\) if and only if this is the last block. In the above example, this was indeed the case. But even when \(x_n=0\), the eigenvector method would still work, provided one starts with the unreduced Hessenberg block that is singular, since the eigenvector corresponding to that subblock will have a trailing nonzero component.

In the next example, assumption (A3) is dropped, and again the two methods are no longer equivalent.

Example 2

Consider the pencil

$$ \left[ \begin{array}{cccc} 1 &{} 1 &{} 0 &{} 0 \\ 1 &{} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} 2 &{} 0 \end{array} \right] - \lambda \left[ \begin{array}{cccc} 0 &{} 0 &{} 0 &{} 1 \\ 1 &{} 0 &{} 0 &{} 0 \\ 0 &{} 1 &{} 0 &{} 0 \\ 0 &{} 0 &{} 1 &{} 0 \end{array} \right] . $$

Its eigenvalues are still 0, 0, 1 and 2 and the two eigenvalues 0 belong to one Jordan block. Therefore, the perfect shift \(\lambda _0=0\) has a single eigenvector \({\textbf{x}}={\textbf{e}}_4\). The eigenvector method will then use three adjacent permutations to transform \({\textbf{e}}_4\) to \({\textbf{e}}_1\) yielding the transformed HH pencil

$$ \left[ \begin{array}{c|rrc} 0 &{} -1 &{} -1 &{} 0 \\ \hline 0 &{} 0 &{} 0 &{} -2 \\ 0 &{} 1 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 \end{array} \right] - \lambda \left[ \begin{array}{c|ccc} 1 &{} 0 &{} 0 &{} 0 \\ \hline 0 &{} 0 &{} 0 &{} -1 \\ 0 &{} 1 &{} 0 &{} 0 \\ 0 &{} 0 &{} 1 &{} 0 \end{array} \right] . $$

The RQZ method, on the other hand, will obtain after the first column rotation \(G^{(r)}_3\) the pencil

$$ \left[ \begin{array}{ccc|c} 1 &{} 1 &{} 0 &{} 0 \\ 1 &{} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 \\ \hline 0 &{} 0 &{} 0 &{} 2 \end{array} \right] - \lambda \left[ \begin{array}{ccc|c} 0 &{} 0 &{} 1 &{} 0 \\ 1 &{} 0 &{} 0 &{} 0 \\ 0 &{} 1 &{} 0 &{} 0 \\ \hline 0 &{} 0 &{} 0 &{} 1 \end{array} \right] . $$

Since Assumption (A3) does not hold, the RQZ method will not be able to introduce the shift properly in order to move it to the top. Instead, it will attempt an early deflation of the bottom eigenvalue 2.

This example shows that the eigenvector method still works when assumption (A3) fails, provided \(x_n\ne 0\).

3.3 Deflating a complex conjugate pair \((\lambda _0,\overline{\lambda }_0)\)

We assume now that we are given two complex conjugate eigenvalues \((\lambda _0,\overline{\lambda }_0)\) of a regular and unreduced pencil \(H-\lambda K\) that is in HH form and therefore has only real poles. This implies that assumption (A4) holds. Let us represent the eigenvalues and eigenvectors by their real and imaginary parts: \(\alpha _0 \pm \imath \beta _0\) and \({\textbf{x}}\pm \imath {\textbf{y}}\). Then the eigenvector/eigenvalue equations \((H-(\alpha _0\pm \imath \beta _0) K)({\textbf{x}}\pm \imath {\textbf{y}})=0\) can be expressed as

$$\begin{aligned} H X = KX\Lambda , \quad \textrm{where} \quad \Lambda :=\left[ \begin{array}{cc} \alpha _0 &{} \beta _0 \\ -\beta _0 &{} \alpha _0\end{array}\right] ,\quad X:=\left[ \begin{array}{cc} {\textbf{x}}&{\textbf{y}}\end{array}\right] \end{aligned}$$
(8)

indicating that X spans a two-dimensional deflating subspace of the pencil \(H-\lambda K\). When multiplying X with an invertible matrix R, the new basis XR can be made orthonormal and the matrix \(\Lambda \) then becomes \(R^{-1}\Lambda R\), which preserves its eigenvalues, as expected.
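Relation (8) can be verified numerically. The sketch below uses, as an assumed example, the companion matrix of \((z^2+1)(z-2)(z-3)\) together with \(K=I\), so the pencil is an unreduced HH pencil with the complex pair \(\pm \imath \):

```python
import numpy as np

# Companion matrix of (z^2+1)(z-2)(z-3): unreduced Hessenberg with
# eigenvalues +-i, 2, 3; together with K = I it forms a real HH pencil.
H = np.array([[5., -7., 5., -6.],
              [1.,  0., 0.,  0.],
              [0.,  1., 0.,  0.],
              [0.,  0., 1.,  0.]])
K = np.eye(4)

vals, vecs = np.linalg.eig(H)          # K = I: ordinary eigenproblem
j = int(np.argmax(vals.imag))          # pick the eigenvalue alpha0 + i*beta0
lam, v = vals[j], vecs[:, j]
alpha0, beta0 = lam.real, lam.imag

X = np.column_stack([v.real, v.imag])  # X = [x, y]
Lam = np.array([[alpha0, beta0], [-beta0, alpha0]])

# the complex eigenpair equations reduce to the real relation (8)
assert np.allclose(H @ X, K @ X @ Lam, atol=1e-10)
```

Separating the real and imaginary parts of \(H{\textbf{v}}=\lambda K{\textbf{v}}\) gives \(H{\textbf{x}}=K(\alpha _0{\textbf{x}}-\beta _0{\textbf{y}})\) and \(H{\textbf{y}}=K(\beta _0{\textbf{x}}+\alpha _0{\textbf{y}})\), which is exactly (8) columnwise.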

The following theorem essentially extends the ideas of Theorem 1 to the case of a complex conjugate pair of shifts. Therefore, we restrict the proof to the issues in which the two proofs differ.

Theorem 2

Let \(H-\lambda K\) be a real, regular, proper and unreduced HH pencil with two complex conjugate eigenvalues \((\alpha _0 \pm \imath \beta _0)\) of absolute value \(\mid \lambda _0 \mid \) bounded by 1. Then the following holds:

  1.

    There exists an “essentially unique” basis X of the two-dimensional deflating subspace of the eigenvalue pair \(\alpha _0\pm \imath \beta _0\) such that

    $$ X^TX=I_2, \quad X=\left[ \begin{array}{cc} x_1 &{} y_1 \\ \vdots &{} \vdots \\ x_{n-1} &{} y_{n-1} \\ 0 &{} y_n \end{array}\right] , \quad \textrm{where} \quad x_{n-1}\ne 0, \; y_{n}\ne 0. $$

    Moreover there exists a matrix \(Q:=G^{(r,2)}G^{(r,1)}\), consisting of 2 essentially unique sequences of Givens rotations

    $$G^{(r,i)}=G^{(r,i)}_{i} \ldots G^{(r,i)}_{n+i-3}, \quad i=1,2$$

    such that their product Q gives the QR factorization \(X=Q^T R\);

  2.

    There is a matrix \(Z:=G^{(\ell ,2)} G^{(\ell ,1)}\), consisting of 2 essentially unique sequences of Givens rotations

    $$G^{(\ell ,i)}=G^{(\ell ,i)}_{i} \ldots G^{(\ell ,i)}_{n+i-3}, \quad i=1,2,$$

    and a matrix \(Q:=G^{(r,2)}G^{(r,1)}\), consisting of 2 essentially unique sequences of Givens rotations

    $$G^{(r,i)}=G^{(r,i)}_{i} \ldots G^{(r,i)}_{n+i-3}, \quad i=1,2,$$

    such that the triple (HKX), is transformed into an equivalent one

    $$ (\hat{H},\hat{K},\hat{X}):=(ZHQ^T, ZKQ^T,QX), $$

    where \(\hat{X}\) is upper triangular, and \((\hat{H} - \lambda \hat{K})\) is in HH form with a leading \(2\times 2\) block \([I_2,0](\hat{H}-\lambda \hat{K}) [I_2,0]^T\) that is decoupled and contains the eigenvalues \(\alpha _0\pm \imath \beta _0\).

Proof

Clearly the complex eigenvector \({\textbf{x}}+\imath {\textbf{y}}\) has a non-zero last component because the pencil \(H-\lambda K\) is unreduced; hence, the last row of X is nonzero. After the normalization, this is still the case, and hence there exists a rotation on the columns of X that annihilates \(x_n\). The fact that \(x_{n-1}\) is then non-zero follows from the properness assumption (A3): if \(x_{n-1}=0\), then there exists a rotation such that

\(G_{n-1}\left[ \begin{array}{cc} 0 &{} y_{n-1} \\ 0 &{} y_n \end{array}\right] =\left[ \begin{array}{cc} 0 &{} \hat{y}_{n-1} \\ 0 &{} 0 \end{array}\right] \) implying

$$\left[ \begin{array}{cc} h_{n,n-1}&h_{n,n} \end{array}\right] G_{n-1}^T\left[ \begin{array}{cc} 0 &{} \hat{y}_{n-1} \\ 0 &{} 0 \end{array}\right] = \left[ \begin{array}{cc} k_{n,n-1}&k_{n,n} \end{array}\right] G_{n-1}^T\left[ \begin{array}{cc} 0 &{} \hat{y}_{n-1} \\ 0 &{} 0 \end{array}\right] \Lambda . $$

So both \(\left[ \begin{array}{cc} h_{n,n-1}&h_{n,n} \end{array}\right] \) and \(\left[ \begin{array}{cc} k_{n,n-1}&k_{n,n} \end{array}\right] \) are parallel to the last row of \(G_{n-1}\), and this violates assumption (A3). The only degree of freedom left over is a scaling of the columns of X with \(\pm 1\). Once the properties of X are established, the existence of the sequences of Givens rotations \( G^{(r,i)}=G^{(r,i)}_{i} \ldots G^{(r,i)}_{n+i-3}\), for \(i=1,2\), follows: these are the rotations needed for the classical QR factorization of X. This completes the proof of Item 1.

The proof of Item 2 is very similar to that of Item 2 in Theorem 1, except that \(n=2\) when using Lemmas 3, 4, and 5, and that we need two rotations \(G^{(r,2)}_{i+1}G^{(r,1)}_i\) to annihilate the two bottom positions of the matrix \(\hat{X}\), and then two rotations \(G^{(\ell ,2)}_{i+1}G^{(\ell ,1)}_i\) to restore the Hessenberg form of K if \(\mid \lambda _0 \mid \le 1\) and of H otherwise.\(\square \)
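The construction of the basis X in Item 1 can be mimicked numerically: orthonormalize \([{\textbf{x}},{\textbf{y}}]\) and apply one column rotation annihilating the last entry of the first column. The sketch below (illustrative only, reusing the companion-matrix example with \(K=I\)) checks the claimed properties:

```python
import numpy as np

H = np.array([[5., -7., 5., -6.],      # eigenvalues +-i, 2, 3 (companion form)
              [1.,  0., 0.,  0.],
              [0.,  1., 0.,  0.],
              [0.,  0., 1.,  0.]])
vals, vecs = np.linalg.eig(H)           # K = I case of the pencil
v = vecs[:, int(np.argmax(vals.imag))]  # eigenvector of the complex pair

X, _ = np.linalg.qr(np.column_stack([v.real, v.imag]))  # orthonormal basis
u, w = X[-1, 0], X[-1, 1]               # last row is nonzero (unreduced pencil)
rho = np.hypot(u, w)
c, s = w / rho, -u / rho
X = X @ np.array([[c, -s], [s, c]])     # column rotation: zero X[-1, 0]

assert np.allclose(X.T @ X, np.eye(2))                 # X^T X = I_2
assert abs(X[-1, 0]) < 1e-12 and abs(X[-1, 1]) > 1e-8  # zero pattern of Item 1
assert abs(X[-2, 0]) > 1e-8                            # x_{n-1} != 0 (properness)
P = X @ X.T                             # span unchanged: still deflating
assert np.allclose(P @ (H @ X), H @ X, atol=1e-8)
```

The column rotation stays within the span of \(\{{\textbf{x}},{\textbf{y}}\}\), so X still spans the deflating subspace, now with the zero pattern required by the theorem.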

The reduction described in Theorem 2, transforming an orthogonal basis of the real and imaginary parts of an eigenvector corresponding to a complex conjugate eigenvalue \( \lambda _0 \) into a multiple of \( [{\textbf{e}}_1, {\textbf{e}}_2], \) and modifying the matrices H and K, is graphically depicted in Fig. 2, for \( n=6.\) We display the evolution of the triple \((H,K,X)\).

Fig. 2

Graphical description of the reduction of the real and the imaginary parts of an eigenvector corresponding to a complex conjugate eigenvalue to a multiple of \( [{\textbf{e}}_1, {\textbf{e}}_2] \). The matrices H and K were scaled to have norm 1, and \(\epsilon _M\)-small elements were put equal to zero

3.4 Perfect shift of a block HH pencil

Let us now consider the case of complex conjugate poles in the pencil \(H-\lambda K\). We then have a real block-HH pencil where the diagonal blocks of the pole pencil \(H_p-\lambda K_p\) are \(1\times 1\) or \(2\times 2\). Again, we describe the method for a shift \(\lambda _0\) of modulus smaller than or equal to 1. In that case, we assume \(K_p\) to be upper triangular (and hence K is Hessenberg), while \(H_p\) is block triangular (and hence H is block Hessenberg):

$$ H= \left[ \begin{array}{ccccc|c} H_{1,1} &{} H_{1,2} &{} \ldots &{} \ldots &{} H_{1,n-1} &{} H_{1,n} \\ \hline H_{2,1} &{} H_{2,2} &{} \ldots &{} \ldots &{} H_{2,n-1} &{} H_{2,n} \\ &{} H_{3,2} &{} \ddots &{} &{} H_{3,n-1} &{} H_{3,n} \\ &{} &{} \ddots &{} \ddots &{} \vdots &{} \vdots \\ &{} &{} &{} H_{n-1,n-2} &{} H_{n-1,n-1} &{} H_{n-1,n} \\ &{} &{} &{} &{} H_{n,n-1} &{} H_{n,n} \end{array}\right] $$

Theorems 1 and 2 are still valid, except that the HH form is now replaced by a block HH form for the pencil \(H-\lambda K\). We briefly discuss the differences of the algorithm for both the case of a real shift and a complex conjugate pair. The proofs of our arguments follow from Lemmas 3, 4, and 5.

3.5 A single real shift

If the bottom block \(H_{n,n-1}\) is \(1\times 1\) then a single Givens rotation \(G^{(r)T}_{n-1}\) will rotate the shift \(\lambda _0\) to position \((n,n-1)\). If, on the other hand, the bottom block \(H_{n,n-1}\) is \(2\times 2\), two Givens transformations \(G^{(r)T}_{n-1}\) and \(G^{(r)T}_{n-2}\) and one Givens rotation \(G^{(\ell )}_{n-1}\) have to be applied to \(H-\lambda K\) to move the shift to position \((n-1,n-2)\).

After this preliminary step, \(\lambda _0\) is swapped with the next block on the diagonal of the pole pencil. If this block is \(1 \times 1\), a rotation \(G^{(r)T}_{i-1}\) followed by a rotation \(G^{(\ell )}_{i}\) moves the shift one position up. If this block is \(2 \times 2\), two rotations \(G^{(r)T}_{i-1}\) and \(G^{(r)T}_{i-2}\) followed by two rotations \(G^{(\ell )}_{i}\) and \(G^{(\ell )}_{i-1}\) move the shift two positions up.

The RQZ step is finalized by a single rotation \(G^{(\ell )}_{1}\) moving the shift to position (1, 1) in \(H-\lambda K\).

3.6 Two complex conjugate shifts

Here again there are two different starting scenarios. If the bottom block \(H_{n,n-1}\) is \(2\times 2\) or if the two bottom blocks are both \(1\times 1\), then a pair of Givens rotations \(G^{(r,1)T}_{n-2}G^{(r,2)T}_{n-1}\) will move the shifts \((\lambda _0,\overline{\lambda }_0)\) to a new \(2\times 2\) block \(H_{n,n-1}\). If this is not the case, two pairs of Givens rotations \(G^{(r,1)T}_{n-2}G^{(r,2)T}_{n-1}\) and \(G^{(r,1)T}_{n-3}G^{(r,2)T}_{n-2}\) and one pair of Givens rotations \(G^{(\ell ,2)}_{n-2}G^{(\ell ,1)}_{n-1}\) have to be applied to \(H-\lambda K\) to move the pair \((\lambda _0,\overline{\lambda }_0)\) to position \((n-1,n-2)\).

After this preliminary step, the pair \((\lambda _0,\overline{\lambda }_0)\) is swapped with the next block on the diagonal of the pole pencil. If this block is \(1 \times 1\), a pair of rotations \(G^{(r,1)T}_{i-2}G^{(r,2)T}_{i-1}\) followed by a pair of rotations \(G^{(\ell ,2)}_{i-1}G^{(\ell ,1)}_{i}\) moves the pair one position up. If this block is \(2 \times 2\), two such pairs of rotations are used to move the pair \((\lambda _0,\overline{\lambda }_0)\) two positions up.

The RQZ step is finalized by a single pair of rotations \(G^{(\ell ,1)}_{1}\) \(G^{(\ell ,2)}_{2}\) moving the pair \((\lambda _0,\overline{\lambda }_0)\) to the (1, 1) block in \(H-\lambda K\).

The graphical description of this reduction, transforming an orthogonal basis of the real and imaginary parts of an eigenvector corresponding to a complex conjugate eigenvalue \( \lambda _0 \) into a multiple of \( [{\textbf{e}}_1, {\textbf{e}}_2], \) and modifying the matrices H and K, is depicted in Fig. 3, for \( n=7.\) The evolution of the triple \((H,K,X)\) is displayed in that figure.

Fig. 3

Graphical description of the reduction of the real and the imaginary parts of an eigenvector corresponding to a complex conjugate eigenvalue to a multiple of \( [{\textbf{e}}_1, {\textbf{e}}_2], \) with K in Hessenberg form and H in block Hessenberg form. The matrices H and K were scaled to have norm 1, and \(\epsilon _M\)-small elements were set to zero

4 Approximation of eigenvalue/eigenvector pair

In this section, we show how to find or improve an approximation \((\tilde{\lambda },\tilde{\textbf{x}})\) to an exact eigenpair \((\lambda ,{\textbf{x}})\) of a pencil \(H-\lambda K\) that is in proper HH form. This section applies to both real and complex eigenvalues. The eigenvalue \(\tilde{\lambda }\) is given as the ratio

$$\begin{aligned} \widetilde{\lambda }=\widetilde{\alpha }/\widetilde{\beta }\quad \textrm{with}\quad {\left| \widetilde{\alpha }\right| }^{2}+{\left| \widetilde{\beta }\right| }^{2}=1 \end{aligned}$$

and the eigenvector \(\tilde{\textbf{x}}\) is assumed to have unit norm \(\Vert \tilde{\textbf{x}}\Vert _2=1\). We want to improve this approximation by reducing the norm \(\Vert {\textbf{r}}\Vert _2\) of the residual \({\textbf{r}}\) defined by

$$\begin{aligned} (\tilde{\alpha }H - \tilde{\beta }K ) \tilde{\textbf{x}}=: {\textbf{r}}, \end{aligned}$$
(9)

where \({\textbf{r}}\) is assumed to be small, but nonzero. If the vector \(\tilde{\textbf{x}}\) is given, then the minimization of \(\Vert {\textbf{r}}\Vert _2\) is equivalent to

$$\underset{{\Vert \left[ \begin{array}{c}\widetilde{\alpha }\\ \widetilde{\beta }\end{array}\right] \Vert }_{2}=1}{\textrm{min}}{\left\Vert \left[ \begin{array}{cc} H\tilde{\textbf{x}}&-K\tilde{\textbf{x}}\end{array}\right] \left[ \begin{array}{c}\widetilde{\alpha }\\ \widetilde{\beta }\end{array}\right] \right\Vert }_{2}, $$

which is a total least squares problem [1] that can be solved by choosing \(\left[ \begin{array}{c}\widetilde{\alpha }\\ \widetilde{\beta }\end{array}\right] ={\textbf{v}}_{2}\), the right singular vector corresponding to the smallest singular value \(\sigma _2\) in the singular value decomposition of the matrix

$$\left[ \begin{array}{cc} H\tilde{\textbf{x}}&- K\tilde{\textbf{x}}\end{array}\right] = \left[ \begin{array}{cc} {\textbf{u}}_1&{\textbf{u}}_2 \end{array}\right] \left[ \begin{array}{cc} \sigma _1 &{} 0 \\ 0 &{} \sigma _2 \end{array}\right] \left[ \begin{array}{cc} {\textbf{v}}_1&{\textbf{v}}_2 \end{array}\right] ^H. $$

If the vectors \(H\tilde{\textbf{x}}\) and \(K\tilde{\textbf{x}}\) are not parallel, this update is guaranteed to decrease the norm \(\Vert {\textbf{r}}\Vert _2\) (see [1]).
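Since the matrix \(\left[ H\tilde{\textbf{x}}\;\; -K\tilde{\textbf{x}}\right] \) has only two columns, its right singular vectors are the eigenvectors of its \(2\times 2\) Gram matrix, so \({\textbf{v}}_2\) can be obtained in closed form. A minimal real-arithmetic sketch (hypothetical function names, not the authors' code):

```python
import math

def matvec(M, x):
    return [sum(mij * xj for mij, xj in zip(row, x)) for row in M]

def best_alpha_beta(H, K, x):
    """Unit (alpha, beta) minimizing || alpha*H*x - beta*K*x ||_2:
    the eigenvector of the 2x2 Gram matrix of [H*x, -K*x] for its
    smallest eigenvalue (= smallest squared singular value)."""
    u, v = matvec(H, x), [-t for t in matvec(K, x)]
    g11 = sum(t * t for t in u)
    g12 = sum(s * t for s, t in zip(u, v))
    g22 = sum(t * t for t in v)
    tr, det = g11 + g22, g11 * g22 - g12 * g12
    lam = tr / 2.0 - math.sqrt(max(tr * tr / 4.0 - det, 0.0))
    if abs(g12) > 0.0:
        a, b = g12, lam - g11
    else:  # Gram matrix already diagonal
        a, b = (1.0, 0.0) if g11 <= g22 else (0.0, 1.0)
    n = math.hypot(a, b)
    return a / n, b / n

# x is an exact eigenvector here, so the optimal residual is zero.
H = [[2.0, 1.0], [0.0, 6.0]]
K = [[1.0, 1.0], [0.0, 3.0]]
x = [1.0, 0.0]
alpha, beta = best_alpha_beta(H, K, x)
res = math.sqrt(sum((alpha * h - beta * k) ** 2
                    for h, k in zip(matvec(H, x), matvec(K, x))))
```

The Gram-matrix route avoids forming a full SVD of an \(n\times 2\) matrix; for ill-conditioned data a genuine thin SVD is the numerically safer choice.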

Suppose now that \(\tilde{\lambda }\) is given. The best choice for \(\tilde{\textbf{x}}\) to reduce the residual norm \(\Vert {\textbf{r}}\Vert _2\) in (9) is then the right singular vector \({\textbf{v}}_n\) corresponding to the smallest singular value of \( \tilde{M}:=\tilde{\alpha }H - \tilde{\beta }K \), but computing a full singular value decomposition may be too expensive when incorporated in an iteration where \(\tilde{\lambda }\) and \(\tilde{\textbf{x}}\) are updated recursively. A simpler scheme is to apply one step of inverse iteration

$$ {\textbf{z}}:= \tilde{M}^{-1}(\tilde{M}^{-1})^H\tilde{\textbf{x}}, \quad \tilde{\textbf{x}}_{new}:= {\textbf{z}}/\Vert {\textbf{z}}\Vert _2, $$

which is again guaranteed to decrease the norm of the residual if the singular values of \(\tilde{M}\) are distinct (see [5]).
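For a real pencil (so \(\tilde{M}^H=\tilde{M}^T\)) this inverse iteration step amounts to two linear solves followed by a normalization. A hedged sketch with a plain dense solver (hypothetical names; a practical implementation would exploit the Hessenberg structure of \(\tilde{M}\)):

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting
    (dense, intended only for small n)."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]   # augmented system
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def inverse_iteration_step(M, x):
    """One step of z = M^{-1} M^{-T} x, then normalize (real case)."""
    Mt = [list(col) for col in zip(*M)]   # transpose of M
    y = solve(Mt, x)                      # y = M^{-T} x
    z = solve(M, y)                       # z = M^{-1} y
    nz = math.sqrt(sum(t * t for t in z))
    return [t / nz for t in z]

# M is nearly singular; its near-null right singular vector is ~e1,
# and a single step pulls the iterate onto it.
M = [[1.0e-8, 1.0], [0.0, 1.0]]
x1 = inverse_iteration_step(M, [0.6, 0.8])
```

One such step is exactly one iteration of the power method on \(\tilde{M}^{-1}\tilde{M}^{-H}\), whose dominant eigenvector is the singular vector \({\textbf{v}}_n\) sought above.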

The procedure explained in this section, to refine the approximate pair \((\tilde{\lambda },\tilde{\textbf{x}})\) towards the exact pair \((\lambda ,{\textbf{x}})\), is primarily aimed at improving the residual of a scaled eigenvalue problem, as explained in the next section.

5 Improving the scaled residual

Let us suppose now that the pair \((\tilde{\lambda },\tilde{\textbf{x}})\) yields a residual (9) that is of the order of \(\epsilon _M\Vert (H,K)\Vert _F\). The backward error analysis of [6, 7] then shows that we need the stronger bounds

$$\begin{aligned} \mid r_{i+1}\mid \le \epsilon _M\Vert (H,K)\Vert _F \Vert \tilde{\textbf{x}}(i:n)\Vert _2, \quad i=1,\ldots ,n-1 \end{aligned}$$
(10)

to ensure that the structured backward error of the RQZ step with perfect shift is also of the order of \(\epsilon _M\Vert (H,K)\Vert _F\). This can be achieved as follows. We first update the eigenvalue using the procedure explained in Sect. 4, which already reduces the residual; for simplicity, we keep the same notation after this step. Then define the vector \({\textbf{d}}\) as

$$ d_1=1, \quad d_{i+1}= 2^{\textrm{round}(\log _2\Vert \tilde{\textbf{x}}(i:n)\Vert _2)}, \quad i=1,\ldots ,n-1; $$

then \(d_{i+1}/\sqrt{2}\le \Vert \tilde{\textbf{x}}(i:n)\Vert _2\le d_{i+1}\sqrt{2} \) and the vector \({\textbf{d}}\) is non-increasing (i.e., \(d_{i+1}\le d_i\)) since \(\Vert \tilde{\textbf{x}}(i:n)\Vert _2\le \Vert \tilde{\textbf{x}}(i-1:n)\Vert _2\). Moreover, the pencil matrices

$$ H_d:= D^{-1}HD, \quad K_d:= D^{-1}KD, \quad \textrm{with} \quad D:=\textrm{diag}(d_1, \ldots , d_n) $$

satisfy the bounds

$$ \Vert (H_d,K_d)\Vert _F \le \gamma \Vert (H,K)\Vert _F, \quad \textrm{where} \quad \gamma := \max _{1\le i \le n-1} d_i/d_{i+1} \ge 1, $$

and the equation

$$ (\tilde{\alpha }H_d -\tilde{\beta }K_d) D^{-1}\tilde{\textbf{x}}= D^{-1}{\textbf{r}}. $$
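The construction of \({\textbf{d}}\) and of the scaled pencil can be sketched as follows (hypothetical helper names). Since the entries of \(D\) are powers of two, forming \(D^{-1}HD\) and \(D^{-1}KD\) introduces no rounding errors in binary floating point:

```python
import math

def scaling_vector(x):
    """d_1 = 1 and d_{i+1} = 2^round(log2 ||x(i:n)||_2): a non-increasing
    vector of powers of two tracking the tail norms of x."""
    n = len(x)
    tails = [math.sqrt(sum(t * t for t in x[i:])) for i in range(n - 1)]
    return [1.0] + [2.0 ** round(math.log2(t)) for t in tails]

def scale_pencil(H, K, d):
    """Form H_d = D^{-1} H D and K_d = D^{-1} K D elementwise:
    entry (i, j) is multiplied by d_j / d_i."""
    Hd = [[h * d[j] / d[i] for j, h in enumerate(row)]
          for i, row in enumerate(H)]
    Kd = [[k * d[j] / d[i] for j, k in enumerate(row)]
          for i, row in enumerate(K)]
    return Hd, Kd

x = [0.9, 0.4, 0.15, 0.02]        # an (approximately unit) eigenvector
d = scaling_vector(x)             # here d = [1.0, 1.0, 0.5, 0.125]
H1 = [[1.0] * 4 for _ in range(4)]
Hd, Kd = scale_pencil(H1, H1, d)  # entry (i, j) becomes d_j / d_i
```

Note how each subdiagonal entry \( (i+1,i) \) is amplified by \(d_i/d_{i+1}\le \gamma \), which is exactly where the factor \(\gamma \) in the bound on \(\Vert (H_d,K_d)\Vert _F\) originates.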

The scaled subvectors of \(\tilde{\textbf{x}}_d:=D^{-1}\tilde{\textbf{x}}\) then have approximately the same norm (see Appendix “Scaling”):

$$ \frac{1}{\gamma \sqrt{2}} \le \Vert \tilde{\textbf{x}}_d(n-1:n)\Vert _2 \le \ldots \le \Vert \tilde{\textbf{x}}_d(1:n)\Vert _2 \le \sqrt{2n}, $$

which implies that the norm of \(\tilde{\textbf{x}}_d\) is of the order of 1. After performing one step of inverse iteration on \(\tilde{\textbf{x}}_d\) to improve that computed eigenvector, we obtain a scaled residual \({\textbf{r}}_{d,new}=(\tilde{\alpha }H_d -\tilde{\beta }K_d) \tilde{\textbf{x}}_{d,new}\) satisfying (for a moderate value of c)

$$ \Vert {\textbf{r}}_{d,new} \Vert _2 \le c\epsilon _M\Vert (H_d,K_d)\Vert _F \le c\gamma \epsilon _M\Vert (H,K)\Vert _F. $$

Multiplying the above equation by D yields, in the original coordinate system,

$$ \tilde{\textbf{x}}_{new}:= D\tilde{\textbf{x}}_{d,new}, \quad \tilde{\textbf{r}}_{new}:= D{\textbf{r}}_{d,new}, \quad (\tilde{\alpha }H -\tilde{\beta }K)\tilde{\textbf{x}}_{new} = \tilde{\textbf{r}}_{new}. $$

The \((i+1)\)-th element of \(\tilde{\textbf{r}}_{new}\) then satisfies the required bound since

$$ \mid {\textbf{e}}_{i+1}^T\tilde{\textbf{r}}_{new}\mid = d_{i+1}\mid {\textbf{e}}_{i+1}^T{\textbf{r}}_{d,new}\mid \le d_{i+1}\Vert {\textbf{r}}_{d,new}\Vert _2\le c\gamma d_{i+1} \epsilon _M\Vert (H,K)\Vert _F. $$

If the constant factor \(c\gamma \) is large, the scaled refinement step may not yield the expected error bound (10) and an additional refinement step may be needed. In the numerical experiments section, we show that one step of refinement often yields a satisfactory result.

The above method can also be applied to complex eigenvectors, but its impact on the properties of a real deflating subspace X used in the case of complex conjugate pairs of eigenvalues is not clear.

The efficacy of the above approximations, and their use for complex conjugate pairs, is verified in Sect. 6.

6 Numerical results

In this section, we report some numerical experiments. All the computations were performed with Matlab ver. R2022a with machine precision \( \epsilon _M \approx 2.22 \times 10^{-16}. \)

We consider 10,000 HH matrix pencils \( (H^{(i)},K^{(i)})\) of size 100, with pseudo-random entries drawn from the standard normal distribution (generated by the Matlab function randn) and scaled such that \( \Vert H^{(i)} \Vert _2=\Vert K^{(i)} \Vert _2=1.\) For each pencil, we randomly pick a real and a complex conjugate eigenpair \((\lambda ^{(i)}, {\textbf{x}}^{(i)}) \) and apply the perfect shift technique to deflate that particular eigenvalue, obtaining the new HH matrix pencils \( (\tilde{H}^{(i)},\tilde{K}^{(i)})\). Furthermore, we also apply the improved scaled residual approach described in Sect. 5 to compute a better approximation of the eigenpair, obtaining \((\hat{\lambda }^{(i)}, \hat{{\textbf{x}}}^{(i)}) \), and deflate it by means of the perfect shift technique, obtaining the HH matrix pencils \( (\hat{H}^{(i)},\hat{K}^{(i)})\).

We define \(b^{(i)}=1/\sqrt{1+ \lambda ^{(i)^{2}}},\) \(a^{(i)}=\lambda ^{(i)} \ b^{(i)}\), \( \hat{b}^{(i)}=1/\sqrt{1+ \hat{\lambda }^{(i)^{2}}}\), \(\hat{a}^{(i)}=\hat{\lambda }^{(i)} \ \hat{b}^{(i)}. \) Moreover, we adopt the Matlab notation \(\texttt{tril} (F, k)\) to denote the part of the matrix F on and below the kth diagonal.
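For reference, Matlab's convention for \(\texttt{tril}(F,k)\), keeping the entries on and below the k-th diagonal (negative k selects subdiagonals), can be mimicked as follows; this is only an illustrative sketch:

```python
def tril(F, k):
    """Entries of F on and below the k-th diagonal (Matlab convention):
    entry (i, j) is kept when j - i <= k, zeroed otherwise."""
    return [[fij if j - i <= k else 0.0 for j, fij in enumerate(row)]
            for i, row in enumerate(F)]

F = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0],
     [7.0, 8.0, 9.0]]
T = tril(F, -2)   # only the entry in position (3, 1) survives
```

With \(k=-2\), as used in the histograms below, \(\texttt{tril}(F,-2)\) thus measures exactly the part of F strictly below the first subdiagonal.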

The results are depicted in the histograms displayed in the following pictures. In each figure, the histogram to the left refers to the matrices \( (\tilde{H}^{(i)},\tilde{K}^{(i)})\), while the one to the right refers to the HH matrix pencils \( (\hat{H}^{(i)},\hat{K}^{(i)})\).

The first five figures concern the deflation of a real eigenpair, while the last three figures refer to the complex conjugate case.

In Fig. 4, the histograms of \(\log _{10}\sqrt{\tilde{c}_1^{(i)}+\tilde{c}_2^{(i)}} \) (left), with \(\tilde{c}_1^{(i)}=\Vert \texttt{tril}(\tilde{H}^{(i)},-2)\Vert _2^2+\Vert \texttt{tril}(\tilde{K}^{(i)},-2)\Vert _2^2 \) and \(\tilde{c}_2^{(i)}=\mid \tilde{H}^{(i)}_{2,1}\mid ^2+\mid \tilde{K}^{(i)}_{2,1}\mid ^2\), and of \(\log _{10} \sqrt{\hat{c}_1^{(i)}+\hat{c}_2^{(i)}}\) (right), with \(\hat{c}_1^{(i)}=\Vert \texttt{tril}(\hat{H}^{(i)},-2)\Vert _2^2+\Vert \texttt{tril}(\hat{K}^{(i)},-2) \Vert _2^2\) and \(\hat{c}_2^{(i)}=\mid \hat{H}^{(i)}_{2,1}\mid ^2+\mid \hat{K}^{(i)}_{2,1}\mid ^2, \) are displayed. It can be noticed that if the improved scaled residual approach is not applied, the part below the first subdiagonal of the computed HH matrices often gets blurred.

Fig. 4

Histograms of \(\log _{10}\sqrt{\tilde{c}_1^{(i)}+\tilde{c}_2^{(i)}} \) (left), with \(\tilde{c}_1^{(i)}=\Vert \texttt{tril}(\tilde{H}^{(i)},-2)\Vert _2^2+\Vert \texttt{tril}(\tilde{K}^{(i)},-2)\Vert _2^2 \) and \(\tilde{c}_2^{(i)}=\mid \tilde{H}^{(i)}_{2,1}\mid ^2+\mid \tilde{K}^{(i)}_{2,1}\mid ^2\), and of \(\log _{10} \sqrt{\hat{c}_1^{(i)}+\hat{c}_2^{(i)}}\) (right), with \(\hat{c}_1^{(i)}=\Vert \texttt{tril}(\hat{H}^{(i)},-2)\Vert _2^2+\Vert \texttt{tril}(\hat{K}^{(i)},-2) \Vert _2^2\) and \(\hat{c}_2^{(i)}=\mid \hat{H}^{(i)}_{2,1}\mid ^2+\mid \hat{K}^{(i)}_{2,1}\mid ^2\); real eigenvalue

In Fig. 5, the histograms of \(\log _{10} \mid b \tilde{H}_{1,1}-a\tilde{K}_{1,1} \mid \) (left) and \(\log _{10} \mid \hat{b} \hat{H}_{1,1}-\hat{a}\hat{K}_{1,1} \mid \) (right), are displayed.

Fig. 5

Histograms of \(\log _{10} \mid b \tilde{H}_{1,1}-a\tilde{K}_{1,1} \mid \) (left) and \(\log _{10} \mid \hat{b} \hat{H}_{1,1}-\hat{a}\hat{K}_{1,1} \mid \) (right), real eigenvalue

In Fig. 6, the histograms of the logarithms of the residuals \(\log _{10} \Vert (a \tilde{K}- b \tilde{H} ){\textbf{x}}\Vert _2 \) (left) and \(\log _{10} \Vert ( \hat{a}\hat{K}- \hat{b} \hat{H}) \hat{{\textbf{x}}}\Vert _2\) (right), are displayed.

Fig. 6

Histograms of the logarithms of the residuals, \(\log _{10} \Vert (a K-b H ){\textbf{x}}\Vert _2 \) (left) and \(\log _{10} \Vert ( \hat{a} {K}- \hat{b} {H}) \hat{{\textbf{x}}}\Vert _2\) (right), real eigenvalue

The histograms in Fig. 7 show the accuracy of the poles in the HH matrices after deflation. In particular, using the definition \( p_j = H_{j+1,j} / K_{j+1,j}, \) \( \tilde{p}_j = \tilde{H}_{j+2,j+1} / \tilde{K}_{j+2,j+1}, \) and \( \hat{p}_j =\hat{H}_{j+2,j+1} /\hat{K}_{j+2,j+1}, \) \( j = 1,\ldots ,n-2, \) the values of \(\log _{10} \max _{j} \frac{ \mid p_j - \tilde{p}_j \mid }{ \mid p_j \mid } \) (left) and \(\log _{10} \max _{j} \frac{ \mid p_j - \hat{p}_j \mid }{ \mid p_j \mid } \) (right), are displayed.

Fig. 7

Accuracy of the poles in the HH matrices after deflation, real eigenvalue

Fig. 8

Histograms of \(\log _{10}\sqrt{\tilde{c}_1^{(i)}+\tilde{c}_2^{(i)}} \) (left), with \(\tilde{c}_1^{(i)}=\Vert \texttt{tril}(\tilde{H}^{(i)},-2)\Vert _2^2+\Vert \texttt{tril}(\tilde{K}^{(i)},-2)\Vert _2^2 \) and \(\tilde{c}_2^{(i)}=\mid \tilde{H}^{(i)}_{2,1}\mid ^2+\mid \tilde{K}^{(i)}_{2,1}\mid ^2\), and of \(\log _{10} \sqrt{\hat{c}_1^{(i)}+\hat{c}_2^{(i)}}\) (right), with \(\hat{c}_1^{(i)}=\Vert \texttt{tril}(\hat{H}^{(i)},-2)\Vert _2^2+\Vert \texttt{tril}(\hat{K}^{(i)},-2) \Vert _2^2\) and \(\hat{c}_2^{(i)}=\mid \hat{H}^{(i)}_{2,1}\mid ^2+\mid \hat{K}^{(i)}_{2,1}\mid ^2\); complex conjugate eigenpair

The next three figures report the histograms corresponding to the complex conjugate eigenvalue pair \((\lambda ^{(i)}, \bar{\lambda }^{(i)})\). The histograms of \(\log _{10}\sqrt{\tilde{c}_1^{(i)}+\tilde{c}_2^{(i)}} \) (left), with \(\tilde{c}_1^{(i)}=\Vert \texttt{tril}(\tilde{H}^{(i)},-2)\Vert _2^2+\Vert \texttt{tril}(\tilde{K}^{(i)},-2)\Vert _2^2 \) and \(\tilde{c}_2^{(i)}=\mid \tilde{H}^{(i)}_{2,1}\mid ^2+\mid \tilde{K}^{(i)}_{2,1}\mid ^2\), and of \(\log _{10} \sqrt{\hat{c}_1^{(i)}+\hat{c}_2^{(i)}}\) (right), with \(\hat{c}_1^{(i)}=\Vert \texttt{tril}(\hat{H}^{(i)},-2)\Vert _2^2+\Vert \texttt{tril}(\hat{K}^{(i)},-2) \Vert _2^2\) and \(\hat{c}_2^{(i)}=\mid \hat{H}^{(i)}_{2,1}\mid ^2+\mid \hat{K}^{(i)}_{2,1}\mid ^2, \) are displayed in Fig. 8. As in the real case, it can be noticed that if the improved scaled residual approach is not applied, the part below the first subdiagonal of the computed HH matrices gets blurred.

In Fig. 9, the histograms of the logarithms of the residuals \(\log _{10} \Vert (a {K}-b {H} ){\textbf{x}}\Vert _2 \) (left) and \(\log _{10} \Vert ( \hat{a}{K}- \hat{b} {H}) \hat{{\textbf{x}}}\Vert _2\) (right), are displayed.

Fig. 9

Histograms of the logarithms of the residuals \(\log _{10} \Vert (\tilde{a} K- \tilde{b} H ){\textbf{x}}\Vert _2 \) (left) and \(\log _{10} \Vert ( \hat{a}\hat{K}- \hat{b} \hat{H}) \hat{{\textbf{x}}}\Vert _2\) (right), complex conjugate eigenpair

Fig. 10

Accuracy of the poles in the HH matrices after deflation, complex conjugate eigenpair

The histograms in Fig. 10 display the accuracy of the poles in the HH matrices after deflation. In particular, using the definition \( p_j = H_{j+1,j} / K_{j+1,j}, \) \( \tilde{p}_j = \tilde{H}_{j+3,j+2} / \tilde{K}_{j+3,j+2}, \) and \( \hat{p}_j =\hat{H}_{j+3,j+2} /\hat{K}_{j+3,j+2}, \) \( j = 1,\ldots ,n-3, \) the values of \(\log _{10} \max _{j} \frac{ \mid p_j - \tilde{p}_j \mid }{ \mid p_j \mid } \) (left) and \(\log _{10} \max _{j} \frac{ \mid p_j - \hat{p}_j \mid }{ \mid p_j \mid } \) (right), are displayed.