1 Introduction

Computing all eigenvalues of a small to medium-sized matrix pencil \(H-\lambda K\) is nowadays a routine task that shows up in many applications. The method of choice is the QZ algorithm, which uses implicit QZ-type steps implementing a bulge chasing technique. On the other hand, projection methods are often used to compute a subset of the eigenvalues of sparse, large-scale eigenproblems, and Krylov subspace methods are probably the most widely used methods within this class. Even though the algorithms are totally different and target different problems, Krylov and QZ methods are intimately linked; the convergence of the QZ algorithm can be supported and interpreted entirely by relying on Krylov theory. The rational QZ algorithm (which we will abbreviate as RQZ) is a numerical scheme that extends the ideas of the QZ algorithm and links to rational Krylov methods. It has been shown to be quite competitive with the QZ algorithm [3] because of its enhanced convergence behavior. It uses so-called RQZ steps, which are pole swapping techniques on a Hessenberg-Hessenberg pencil; these not only resemble bulge chasing, but also incorporate rational Krylov subspace ideas [2, 3].

The perfect shift strategy for Hessenberg pencils arises naturally in the downdating setting of orthogonal rational functions as described by Van Buggenhout, Van Barel, and Vandebril [10]. Consider a given finite discrete inner product

$$\begin{aligned} \langle f,g\rangle _{m} := \sum _{i=1}^{m} \vert w_i\vert ^2 \overline{g(z_i)} f(z_i), \end{aligned}$$
(1)

with nodes \(z_i\) and weights \(w_i\). One wishes to construct a set of orthogonal rational functions, with prescribed poles, for this inner product. Instead of constructing the orthogonal rational functions, it is often numerically more reliable to store the recurrences for generating these functions. These recurrences are stored in a Hessenberg pencil \(H-\lambda K\), satisfying

$$\begin{aligned} VH = \Lambda V K, \quad V^H V = I, \quad Ve_1 = w/\Vert w\Vert , \end{aligned}$$
(2)

where w contains the weights \(w_i\), \(\Lambda \) is a diagonal matrix with the nodes \(z_i\) on its diagonal, and \(H-\lambda K\) is a Hessenberg pencil whose subdiagonal ratios equal the poles. The relations (2) express that the rows of V are the left eigenvectors of the pencil \(H-\lambda K\) and that the diagonal elements of \(\Lambda \) are the corresponding eigenvalues. The chosen nodes, weights, and poles are of course problem specific and could possibly change when, for instance, the problem changes over time. To add or remove nodes, weights, and poles, we refer to the work of Van Buggenhout, Van Barel, and Vandebril [8,9,10]. For removing nodes, one downdates the problem. Say we want to remove node \(z_j\), for some \(j\in \{1,\ldots ,m\}\). Then we need to construct unitary transformations Z and Q such that the transformed relations

$$\begin{aligned} (VZ) (Z^H H Q) = \Lambda (V Z) (Z^HK Q) \end{aligned}$$

allow us to deflate an eigenvalue in the upper left corner of the pencil \(Z^H H Q-\lambda Z^HK Q\). The remaining lower right \((n-1)\times (n-1)\) part \(\tilde{H}-\lambda \tilde{K}\) satisfies the relation

$$\begin{aligned} \tilde{V} \tilde{H} = \tilde{\Lambda } \tilde{V} \tilde{K}, \quad \tilde{V}^H \tilde{V} = I, \quad \tilde{V}e_1 = \tilde{w}/\Vert \tilde{w}\Vert , \end{aligned}$$

providing the recurrences for the inner product

$$\begin{aligned} \langle f,g\rangle _{m-1} := \sum _{i=1, i\ne j}^{m} \vert w_i\vert ^2 \overline{g(z_i)} f(z_i), \end{aligned}$$
(3)

where \(\tilde{\Lambda }\) and \(\tilde{w}\) have node \(z_j\) and weight \(w_j\) removed. The exact deflation of the removed eigenvalue corresponds to the problem of deflating a perfect shift using a backward rational QZ step.
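To make the downdating concrete, the following sketch (with ad hoc nodes, weights, and function names of our own choosing) evaluates the inner product (1) and its downdated version (3) after one node is removed:

```python
import numpy as np

def discrete_inner_product(f, g, z, w):
    """Evaluate <f, g> = sum_i |w_i|^2 conj(g(z_i)) f(z_i), as in (1)."""
    return np.sum(np.abs(w) ** 2 * np.conj(g(z)) * f(z))

# example nodes and weights (ad hoc, m = 5)
z = np.linspace(-1.0, 1.0, 5)
w = np.ones(5)
f = lambda t: t            # f(z) = z
g = lambda t: t ** 2 + 1   # g(z) = z^2 + 1

ip_full = discrete_inner_product(f, g, z, w)

# downdating: dropping node z_j (here j = 2) yields the inner product (3)
mask = np.arange(z.size) != 2
ip_down = discrete_inner_product(f, g, z[mask], w[mask])

# the downdated value is the full sum minus the removed term
assert np.isclose(ip_down,
                  ip_full - np.abs(w[2]) ** 2 * np.conj(g(z[2])) * f(z[2]))
```

The algorithm discussed in this paper performs the same removal implicitly, by working on the recurrence pencil \(H-\lambda K\) rather than on the inner product itself.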

We consider only real matrix pencils and the deflation of a real eigenvalue or of a pair of complex conjugate eigenvalues. Using complex arithmetic avoids the problem of treating complex conjugate pairs together and is thus simpler; the extension to complex pencils is therefore not treated here. We will use the following notations. Matrices and submatrices are denoted by capital letters, i.e., \(A, B, H\). The entry \((i,j)\) of the matrix H is denoted by the lowercase letter \(h_{i,j}. \) Vectors are denoted by bold letters, i.e., \( \textbf{a,b,} \ldots \). The identity matrix of order n is denoted by \(I_n \) and its i-th column by \( {\textbf{e}}_i^{(n)}, \) or, if there is no ambiguity, simply by I and \( {\textbf{e}}_i,\) respectively. Generic entries different from zero in matrices or vectors are denoted by “\(\times \).” The machine precision is denoted by \(\epsilon _M\). We denote a Givens rotation between two adjacent rows or columns i and \(i+1\) by

$$ G_{i} = \left[ \begin{array}{cccc} I_{i-1} &{} &{} &{} \\ &{} c &{} -s &{} \\ {} &{} s &{} c &{} \\ {} &{} &{} &{} I_{n-i-1} \end{array} \right] , \quad \left[ \begin{array}{cc} c &{} -s \\ s &{} c \end{array} \right] \left[ \begin{array}{cc} c &{} -s \\ s &{} c \end{array} \right] ^T =I_2. $$
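In floating point arithmetic, such a rotation is computed from the pair (a, b) it must annihilate. A minimal sketch (the helper name is ours), following the sign convention adopted later in the paper (\(s>0\) whenever \(s\ne 0\), and \(c=1\) when \(s=0\)):

```python
import numpy as np

def givens(a, b):
    """Compute (c, s) with [[c, -s], [s, c]] @ (a, b) = (r, 0), following
    the convention s > 0 whenever s != 0, and c = 1 when s = 0."""
    r = np.hypot(a, b)
    if r == 0.0 or b == 0.0:
        return 1.0, 0.0
    c, s = a / r, -b / r
    return (-c, -s) if s < 0.0 else (c, s)

c, s = givens(3.0, 4.0)
G = np.array([[c, -s], [s, c]])
assert np.allclose(G @ G.T, np.eye(2))             # orthogonality, as above
assert abs((G @ np.array([3.0, 4.0]))[1]) < 1e-12  # second entry annihilated
assert s > 0.0                                     # sign convention
```

Embedding this \(2\times 2\) block into rows/columns i and \(i+1\) of the identity gives the matrix \(G_i\) displayed above.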

The rest of the paper is organized as follows. In Sect. 2, we discuss the special form of a Hessenberg-Hessenberg pencil, which is the basis for performing a perfect shift RQZ-step. In Sect. 3, we give the main result of this paper: we derive a more robust method for implementing the RQZ step so that the perfect shift can be deflated at the top of the pencil. In Sects. 4 and 5, we look at two important aspects of our algorithm, namely how to improve the accuracy of an eigenvalue/eigenvector pair and how to scale the pencil in order to improve the residual of this approximation. In Sect. 6, we illustrate the performance of our algorithm with several numerical experiments.

2 Preliminary Hessenberg–Hessenberg form

The rational QZ algorithm for the generalized eigenvalue problem of a regular pencil \(A-\lambda B\) assumes that one first reduces the pencil to a Hessenberg–Hessenberg form. This form can be obtained using orthogonal transformations U and V such that the transformed pencil \(H-\lambda K:=V^T(A-\lambda B)U\) consists of two Hessenberg matrices:

$$\begin{aligned} H-\lambda K := \left[ \begin{array}{ccccc} h_{1,1} &{} \ldots &{} \ldots &{} h_{1,n} \\ h_{2,1} &{} \ddots &{} &{} \vdots \\ &{} \ddots &{} \ddots &{} \vdots \\ &{} &{} h_{n,n-1} &{} h_{n,n} \end{array} \right] - \lambda \left[ \begin{array}{ccccc} k_{1,1} &{} \ldots &{} \ldots &{} k_{1,n} \\ k_{2,1} &{} \ddots &{} &{} \vdots \\ &{} \ddots &{} \ddots &{} \vdots \\ &{} &{} k_{n,n-1} &{} k_{n,n} \end{array} \right] . \end{aligned}$$
(4)

Such a form can be obtained by direct construction or by running a rational Krylov algorithm [2, 4]. These will be called HH pencils. The fact that the pencil \(H-\lambda K\) is unreduced is equivalent to asking that the subpencil

$$ H_p-\lambda K_p:= \left[ \begin{array}{cc} 0&I_{n-1} \end{array} \right] ( H-\lambda K ) \left[ \begin{array}{cc} I_{n-1} \\ 0 \end{array} \right] $$

is regular, or that the scalar pencils \(h_{i+1,i}-\lambda k_{i+1,i}\) are regular for \(1\le i < n\). The subpencil \(H_p-\lambda K_p\) is called the “pole pencil” of \(H-\lambda K\), as its eigenvalues are the poles of the RQZ algorithm [3]. In the next section, we will analyze the construction of a backward RQZ step and compare different ways to compute such a step. We first go over a number of assumptions that are used in our analysis.

  • Assumption (A1): \(\det (H-\lambda K)\ne 0\) for almost all \(\lambda \). This is well-known to hold generically and is necessary and sufficient for the definition of the generalized eigenvalues of \(H-\lambda K\). Such a pencil \(H-\lambda K\) is said to be regular.

  • Assumption (A2): \(\det (H_p-\lambda K_p)\ne 0\) for almost all \(\lambda \), meaning that the “pole pencil” \(H_p-\lambda K_p\) is regular, which also holds generically and is necessary and sufficient for the definition of the poles of the HH pencil. We call such an HH pencil unreduced.

  • Assumption (A3): \(H-\lambda K\) is proper, meaning that the subpencil

    $$ \left[ \begin{array}{cc} h_{n,n-1}&h_{n,n} \end{array} \right] - \lambda \left[ \begin{array}{cc} k_{n,n-1}&k_{n,n} \end{array} \right] $$

    has no zeros. Again, this holds generically.

  • Assumption (A4): The perfect shift \(\lambda _0\) is not a pole of \(H-\lambda K\), i.e., \(\det (H_p-\lambda _0 K_p)\ne 0\). This also holds generically.

Assumptions (A1) and (A2) will be assumed throughout the paper, since this is needed for the definition of generalized eigenvalues and poles of the HH pencil.
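These assumptions can be probed numerically. For an unreduced HH pencil the pole pencil is upper triangular, so the poles are simply the ratios \(h_{i+1,i}/k_{i+1,i}\); the following sketch (the function name and the tolerances are ad hoc, not from the paper) checks (A1)-(A4) in this spirit:

```python
import numpy as np

def check_hh_assumptions(H, K, lam0, tol=1e-12):
    """Heuristic numerical check of assumptions (A1)-(A4) for a real
    HH pencil H - lambda*K (illustrative sketch only)."""
    mu = 0.987654321                          # arbitrary evaluation point
    # (A1): det(H - lambda*K) not identically zero -> regular pencil
    a1 = abs(np.linalg.det(H - mu * K)) > tol
    # (A2): all scalar subdiagonal pencils h_{i+1,i} - lambda*k_{i+1,i}
    # are regular, i.e., (h_{i+1,i}, k_{i+1,i}) != (0, 0)  -> unreduced
    hs, ks = np.diag(H, -1), np.diag(K, -1)
    a2 = bool(np.all(np.hypot(hs, ks) > tol))
    # (A3): the trailing 1x2 pencil has no zeros <=> its two rows are
    # linearly independent (properness)
    a3 = abs(H[-1, -2] * K[-1, -1] - H[-1, -1] * K[-1, -2]) > tol
    # (A4): the shift differs from every pole, since
    # det(H_p - lam0*K_p) = prod_i (h_{i+1,i} - lam0*k_{i+1,i})
    a4 = bool(np.all(np.abs(hs - lam0 * ks) > tol))
    return a1, a2, a3, a4

rng = np.random.default_rng(0)
H = np.triu(rng.standard_normal((5, 5)), -1)
K = np.triu(rng.standard_normal((5, 5)), -1)
print(check_hh_assumptions(H, K, 0.123))              # generically all hold
print(check_hh_assumptions(H, K, H[1, 0] / K[1, 0]))  # (A4) fails: shift = pole
```

Such checks only detect exact degeneracies; near-degenerate cases are precisely what the robust method of Sect. 3 is designed to handle.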

A possible extension of the above Hessenberg-Hessenberg structure occurs when the pole pencil \(H_p-\lambda K_p\) is block upper triangular, with diagonal sub-blocks of dimensions \(k_i\times k_i\). Such a block structure will be called a block-HH form. It will be discussed later on, but only for the case that the diagonal blocks have sizes \(k_i=1\) or 2.

3 Perfect shift of an unreduced HH pencil

In this section, we consider the case that the pole pencil \(H_p-\lambda K_p\) has all block-sizes \(k_i\) equal to 1. This is the simplest case and it allows us to compare the standard RQZ approach with the eigenvector method presented in this paper.

3.1 Deflating a real eigenvalue \(\lambda _0\)

We assume here that we are given a regular pencil \(H-\lambda K\) that is already in HH form, and that it is unreduced. If not, the operations described below can be applied to each unreduced subpencil of a general HH pencil. We also assume that assumptions (A3) and (A4) hold.

Let \(\lambda _0\) be a real eigenvalue of \(H-\lambda K\), then we represent it as

$$\lambda _0:=\alpha _0/\beta _0, \quad \alpha _0^2+\beta _0^2=1, \quad \beta _0\ge 0 \quad (\alpha _0=1, \beta _0=0 \;\textrm{if}\; \lambda _0=\infty ). $$

In exact arithmetic, if we then perform one backward RQZ step with shift \(\lambda _0\), the pencil

$$\hat{H}- \lambda \hat{K}:= Z^T (H - \lambda K) Q $$

is still in HH form with its first column proportional to \({\textbf{e}}_1\), and Q and Z are both unreduced Hessenberg matrices formed by the product of \(n-1\) Givens rotations. Unfortunately, the shift \((\alpha _0-\lambda \beta _0)\) may ultimately fail to appear accurately in the (1, 1) position because of a phenomenon known as “blurring of the shift.” Therefore, we need to consider an alternative construction of the RQZ step, which we describe in the following theorem. Since we want to relate the rotations used in this theorem to those of the RQZ algorithm, we make them unique by choosing the sign of s always positive when \(s\ne 0\), and by choosing \(c=1\) when \(s=0\).

Theorem 1

Let \(H-\lambda K\) be a real proper HH pencil with real eigenvalue \(\lambda _0=\alpha _0/\beta _0\) of absolute value \(\mid \lambda _0 \mid \) bounded by 1 and normalized using \(\alpha _0^2+\beta _0^2=1\). Let \(\lambda _0\) not be a pole of \(H-\lambda K\) and define the Hessenberg matrix \(M:= (\beta _0 H - \alpha _0 K)\). Then

  1.

    the pencil \(H-\lambda K\) has a normalized real eigenvector \({\textbf{x}}\) corresponding to \(\lambda _0=\alpha _0/\beta _0\):

    $$ (\beta _0 H - \alpha _0 K){\textbf{x}}=M{\textbf{x}}=0, \quad \mid \mid {\textbf{x}}\mid \mid _2=1, $$

    which is unique up to a scale factor \(\pm 1\), and has a nonzero last component \(x_n\); therefore, there is an “essentially unique” orthogonal transformation \(Q:=G^{(r)}_{1} \ldots G^{(r)}_{n-1}\) that transforms \({\textbf{x}}\) to \(Q{\textbf{x}}=\pm {\textbf{e}}_1\);

  2.

    there is a corresponding “essentially unique” sequence of rotations \(G^{(\ell )}_{n-\!1}, \ldots , G^{(\ell )}_{1}\) guaranteeing that the products

    $$\begin{aligned} Q:= G^{(r)}_{1}G^{(r)}_{2}\cdots G^{(r)}_{n-1}, \quad Z :=G^{(\ell )}_{1}G^{(\ell )}_{2}\cdots G^{(\ell )}_{n-1}, \end{aligned}$$
    (5)

    are both Hessenberg and transform the triple \( (H,K,{\textbf{x}})\) to an equivalent one

    $$ (\hat{H},\hat{K},\hat{{\textbf{x}}}):=(ZHQ^T, ZKQ^T,Q{\textbf{x}}) $$

    where

    $$ \hat{{\textbf{x}}} = \pm {\textbf{e}}_1, \quad (\beta _0 \hat{H}-\alpha _0 \hat{K}) {\textbf{e}}_1 = 0, $$

    and \(\hat{H}-\lambda \hat{K}\) is in HH form.

Proof

To prove item 1, we point out that the normalized eigenvector \({\textbf{x}}\) is unique (up to a scaling factor \(\pm 1\)) because it is the solution of \(M{\textbf{x}}=0\), where M has rank \(n-1\): M is an unreduced Hessenberg matrix, since the pencil \(H-\lambda K\) satisfies assumption (A4). For the same reason, the last component \(x_n\) is nonzero, since otherwise the whole vector \({\textbf{x}}\) would be zero. The reduction of \({\textbf{x}}\) to \( \hat{{\textbf{x}}} = Q{\textbf{x}}= \pm {\textbf{e}}_1\) then requires a sequence of Givens rotations

$$ G^{(r)}_{i-1} \in {\mathbb {R}}^{ n \times n}, \quad i =n, n-1, \ldots ,2, $$

in order to eliminate the entries \(x_i,\; i=n,n-1, \ldots , 2 \) of the vector \( {\textbf{x}}. \) By choosing the sign of s in these Givens rotations positive, we make them unique.

For item 2, we point out that after the first transformation \(G^{(r)}_{n-1}\) we have an updated pole pencil

$$\begin{aligned} \tilde{H}_p-\lambda \tilde{K}_p = \left[ \begin{array}{cc} 0&I_{n-1} \end{array} \right] ( H-\lambda K )G^{(r)T}_{n-1} \left[ \begin{array}{cc} I_{n-1} \\ 0 \end{array} \right] \end{aligned}$$
(6)

that is still in generalized Schur form, but its last column has been changed and has the shift \(\lambda _0\) as new pole in the bottom position. This follows from (6) which implies

$$ \left[ \begin{array}{cc} \tilde{h}_{n,n-1}-\lambda _0 \tilde{k}_{n,n-1}&\tilde{h}_{n,n}-\lambda _0 \tilde{k}_{n,n} \end{array}\right] \left[ \begin{array}{c} x_{n-1} \\ 0 \end{array}\right] =0, \quad \textrm{where} \quad x_{n-1}\ne 0. $$

It can also be viewed as a special case of Lemma 3 with \(k=1\) and \(n=1\). Each subsequent pair of rotations \(G^{(\ell )}_{i+1}\) and \(G^{(r)T}_i\) then moves the perfect shift \(\lambda _0\) one position up in the pole pencil \(\tilde{H}_p -\lambda \tilde{K}_p\). First, \(G^{(r)}_i\) moves the trailing nonzero element of \({\textbf{x}}\) one position up. Then \(G^{(r)T}_i\) is applied to the columns of the pencil, creating a bulge in the Hessenberg matrices H and K, which is then annihilated by the left Givens transformation \(G^{(\ell )}_{i+1}\). The fact that the Hessenberg form is restored in both H and K follows from Lemma 4 with \(k=1\) and \(n=1\). Therefore, the pole pencil

$$\left[ \begin{array}{cc} 0&I_{n-1} \end{array} \right] G^{(\ell )}_{2}\cdots G^{(\ell )}_{n-1}(H_p-\lambda K_p) G^{(r)T}_{n-1}\cdots G^{(r)T}_{1}\left[ \begin{array}{cc} I_{n-1} \\ 0 \end{array} \right] $$

has the perfect shift in its top diagonal, and then the final left rotation \(G^{(\ell )}_{1}\) moves it to the top diagonal position of the pencil \(\hat{H} -\lambda \hat{K}\) (see Lemma 5 with \(n=1\)). Therefore, all the poles moved one position down, and the bottom one disappeared. All these transformations are “essentially” unique, since they implement the swapping of the eigenvalue \(\lambda _0\) with one of the eigenvalues of \(\hat{H}_p-\lambda \hat{K}_p\).\(\square \)

The reduction described in Theorem 1, transforming an eigenvector \( {\textbf{x}}\) corresponding to a real eigenvalue \( \lambda _0 \) into a multiple of \( {\textbf{e}}_1, \) and modifying the matrices H and K, is graphically depicted in Fig. 1, for \( n=6.\) We display the evolution of the triple \((H,K,{\textbf{x}})\).

Fig. 1

Graphical description of the reduction of an eigenvector \( {\textbf{x}}\) to a multiple of \( {\textbf{e}}_1\). The matrices H and K were scaled to have norm 1, and \(\epsilon _M\)-small elements were put equal to zero

In particular, a generic nonzero entry is denoted by “\(\times \),” an entry to be annihilated by “\(\otimes \)” and the entries becoming zero, as a consequence of the multiplication by a Givens matrix, by “\(\boxtimes \).”
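The proof above translates almost line by line into the following sketch (an illustrative dense implementation for \(\vert \lambda _0 \vert \le 1\); for clarity the eigenvector is computed from a dense SVD of M, whereas a structure-exploiting solver would be preferred, and the entries that vanish in exact arithmetic are set to zero explicitly, as in Fig. 1):

```python
import numpy as np

def givens(a, b):
    """(c, s) such that [[c, -s], [s, c]] @ (a, b) = (r, 0), with s >= 0."""
    r = np.hypot(a, b)
    if r == 0.0 or b == 0.0:
        return 1.0, 0.0
    c, s = a / r, -b / r
    return (-c, -s) if s < 0.0 else (c, s)

def deflate_real_shift(H, K, lam0):
    """Backward RQZ step via the eigenvector method of Theorem 1 for a
    real shift with |lam0| <= 1 (dense illustrative sketch)."""
    n = H.shape[0]
    b0 = 1.0 / np.hypot(lam0, 1.0)
    a0 = lam0 * b0                           # lam0 = a0/b0, a0^2 + b0^2 = 1
    M = b0 * H - a0 * K                      # singular Hessenberg matrix
    x = np.linalg.svd(M)[2][-1]              # null vector: M @ x ~ 0
    H, K, x = H.copy(), K.copy(), x.copy()
    for i in range(n - 2, -1, -1):           # right rotations G_i^{(r)}
        c, s = givens(x[i], x[i + 1])
        G = np.array([[c, -s], [s, c]])
        x[i:i + 2] = G @ x[i:i + 2]
        x[i + 1] = 0.0                       # annihilated by construction
        H[:, i:i + 2] = H[:, i:i + 2] @ G.T  # bulge appears at (i+2, i)
        K[:, i:i + 2] = K[:, i:i + 2] @ G.T
        if i + 2 < n:                        # left rotation G_{i+1}^{(l)}
            c, s = givens(K[i + 1, i], K[i + 2, i])
            G = np.array([[c, -s], [s, c]])
            H[i + 1:i + 3, :] = G @ H[i + 1:i + 3, :]
            K[i + 1:i + 3, :] = G @ K[i + 1:i + 3, :]
            K[i + 2, i] = 0.0                # K bulge zeroed by the rotation
            H[i + 2, i] = 0.0                # H bulge: zero in exact arithmetic
    # final left rotation G_1^{(l)}: the first columns of H and K are now
    # parallel, so zeroing k_{2,1} deflates h_{2,1} as well
    c, s = givens(K[0, 0], K[1, 0])
    G = np.array([[c, -s], [s, c]])
    H[0:2, :] = G @ H[0:2, :]
    K[0:2, :] = G @ K[0:2, :]
    K[1, 0] = 0.0
    H[1, 0] = 0.0
    return H, K

# demo: plant the real eigenvalue 0.5 in a random 7x7 HH pencil
rng = np.random.default_rng(0)
n = 7
H0 = np.triu(rng.standard_normal((n, n)), -1)
K0 = np.triu(rng.standard_normal((n, n)), -1)
x0 = rng.standard_normal(n)
x0[-1] = 1.0
H0[:, -1] += 0.5 * (K0 @ x0) - H0 @ x0   # now H0 @ x0 = 0.5 * K0 @ x0
Hh, Kh = deflate_real_shift(H0, K0, 0.5)
print(Hh[0, 0] / Kh[0, 0])               # approximately 0.5, deflated at the top
```

The key invariant is \(\tilde{M}\tilde{{\textbf{x}}}=0\) throughout: it forces the bulges of H and K to be proportional to \((\alpha _0,\beta _0)\), so the single left rotation that zeroes the K bulge also zeroes the H bulge in exact arithmetic.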

Remark 1

The implicit Q theorem for regular HH pencils is closely related to Theorem 1. It implies that the transformations Q and Z can also be determined from the first rotation \(G_{n-1}^{(r)} \) that computes

$$\begin{aligned} \left[ \begin{array}{@{}cc@{}} m_{n,n-1}&m_{n,n} \end{array} \right] G_{n-1}^{(r)T} = \left[ \begin{array}{cc} 0&\times \end{array} \right] \end{aligned}$$
(7)

and from the fact that the pair \( (ZHQ^T,ZKQ^T) \) is still in HH form. This is known as “swapping the poles” and corresponds to “chasing the bulge” [12] in the QZ algorithm.

Remark 2

Theorem 1 gives an alternative way to determine the sequences of right rotations \(Q:=G^{(r)}_{1}G^{(r)}_{2}\cdots G^{(r)}_{n-1}\) and left rotations \(Z:=G^{(\ell )}_{1}G^{(\ell )}_{2}\cdots G^{(\ell )}_{n-1}\) to implement an implicit RQZ-step. First one determines Q from \(Q{\textbf{x}}=\pm {\textbf{e}}_1\), and then one determines Z from the restoration of the Hessenberg form of K if \(\mid \lambda _0 \mid \le 1\) and of H if \(\mid \lambda _0 \mid > 1\), as indicated in Lemma 4. These particular choices are made to ensure numerical stability, as will be shown later on.

Although these different approaches are equivalent under exact arithmetic, their numerical behavior is different. We refer for this to Example 2.1 of [6], where a \(3\times 3\) Hessenberg matrix H of a standard eigenvalue problem is given which can be seen as a special case of a Hessenberg–Hessenberg pencil \(H-\lambda K\) with \(K=I\) and all its poles at infinity. The RQZ algorithm then reduces to the standard QR and will yield the same results. It was shown in [6] that the eigenvector approach is then the more reliable method for implementing the perfect shift.

3.2 Importance of the assumptions

In this subsection, we give two examples to illustrate the differences between the RQZ and the eigenvector method. The first example shows that when assumption (A4) is dropped, these two methods are not equivalent anymore.

Example 1

Consider the pencil

$$ \left[ \begin{array}{cccc} 1 &{} 1 &{} 0 &{} 0 \\ 1 &{} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} 2 &{} 0 \end{array} \right] - \lambda \left[ \begin{array}{cccc} 0 &{} 0 &{} 0 &{} 1 \\ 1 &{} 0 &{} 0 &{} 0 \\ 0 &{} 1 &{} 0 &{} 0 \\ 0 &{} 0 &{} 1 &{} 1 \end{array} \right] $$

and the shift \(\lambda _0=0\). Its eigenvalues are 0, 0, 1 and 2, and the two eigenvalues 0 belong to one Jordan block. Therefore, \(\lambda _0\) has only one eigenvector \({\textbf{x}}={\textbf{e}}_4\). Assumption (A3) is satisfied, but assumption (A4) is not. The eigenvector method will then use three adjacent permutations to transform \({\textbf{e}}_4\) to \({\textbf{e}}_1\), yielding the transformed HH pencil (where \(c=\sqrt{2}/2\))

$$ \left[ \begin{array}{c|rrc} 0 &{} -c &{} -c &{} -2c \\ \hline 0 &{} c &{} c &{} -2c \\ 0 &{} 1 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 \end{array} \right] - \lambda \left[ \begin{array}{c|ccc} 2c &{} 0 &{} 0 &{} -c \\ \hline 0 &{} 0 &{} 0 &{} -c \\ 0 &{} 1 &{} 0 &{} 0 \\ 0 &{} 0 &{} \;1\; &{} 0 \end{array} \right] . $$

The RQZ method, on the other hand, will obtain after the first column rotation \(G^{(r)}_3\) the pencil

$$ \left[ \begin{array}{c|cc|c} 1 &{} 1 &{} 0 &{} 0 \\ 1 &{} 0 &{} 0 &{} 0 \\ \hline 0 &{} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 &{} 2 \end{array} \right] - \lambda \left[ \begin{array}{c|cc|c} 0 &{} 0 &{} 1 &{} 0 \\ 1 &{} 0 &{} 0 &{} 0 \\ \hline 0 &{} 1 &{} 0 &{} 0 \\ 0 &{} 0 &{} 1 &{} 1 \end{array} \right] $$

and has to perform swapping on the \(2\times 2\) subpencil \( \left[ \begin{array}{cc} 0 &{} 0 \\ 0 &{} 0 \end{array} \right] - \lambda \left[ \begin{array}{cc} 1 &{} 0 \\ 0 &{} 1 \end{array} \right] \), which is an ill-posed problem and has non-unique solutions. Therefore the RQZ method does not have a unique way to proceed further when Assumption (A4) does not hold.

Remark 3

When the pencil \(H-\lambda K\) has one or more poles coalescent with the shift \(\lambda _0\), the matrix \(M:=\beta _0H-\alpha _0K\) is no longer unreduced, and the proof that \(x_n\) is nonzero does not hold anymore. But it is easy to verify that if the eigenvector \({\textbf{x}}\) is unique, then one (and only one) of the unreduced Hessenberg blocks of M is singular, and that \(x_n\ne 0\) if and only if this is the last block. In the above example, this was indeed the case. But even when \(x_n=0\), the eigenvector method would still work, provided one starts with the unreduced Hessenberg block that is singular, since the eigenvector corresponding to that subblock will have a trailing nonzero component.

In the next example, assumption (A3) is dropped, and again the two methods are no longer equivalent.

Example 2

Consider the pencil

$$ \left[ \begin{array}{cccc} 1 &{} 1 &{} 0 &{} 0 \\ 1 &{} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} 2 &{} 0 \end{array} \right] - \lambda \left[ \begin{array}{cccc} 0 &{} 0 &{} 0 &{} 1 \\ 1 &{} 0 &{} 0 &{} 0 \\ 0 &{} 1 &{} 0 &{} 0 \\ 0 &{} 0 &{} 1 &{} 0 \end{array} \right] . $$

Its eigenvalues are still 0, 0, 1 and 2 and the two eigenvalues 0 belong to one Jordan block. Therefore, the perfect shift \(\lambda _0=0\) has a single eigenvector \({\textbf{x}}={\textbf{e}}_4\). The eigenvector method will then use three adjacent permutations to transform \({\textbf{e}}_4\) to \({\textbf{e}}_1\) yielding the transformed HH pencil

$$ \left[ \begin{array}{c|rrc} 0 &{} -1 &{} -1 &{} 0 \\ \hline 0 &{} 0 &{} 0 &{} -2 \\ 0 &{} 1 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 \end{array} \right] - \lambda \left[ \begin{array}{c|ccc} 1 &{} 0 &{} 0 &{} 0 \\ \hline 0 &{} 0 &{} 0 &{} -1 \\ 0 &{} 1 &{} 0 &{} 0 \\ 0 &{} 0 &{} 1 &{} 0 \end{array} \right] . $$

The RQZ method, on the other hand, will obtain after the first column rotation \(G^{(r)}_3\) the pencil

$$ \left[ \begin{array}{ccc|c} 1 &{} 1 &{} 0 &{} 0 \\ 1 &{} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 \\ \hline 0 &{} 0 &{} 0 &{} 2 \end{array} \right] - \lambda \left[ \begin{array}{ccc|c} 0 &{} 0 &{} 1 &{} 0 \\ 1 &{} 0 &{} 0 &{} 0 \\ 0 &{} 1 &{} 0 &{} 0 \\ \hline 0 &{} 0 &{} 0 &{} 1 \end{array} \right] . $$

Since Assumption (A3) does not hold, the RQZ method will not be able to introduce the shift properly in order to move it to the top. Instead, it will attempt an early deflation of the bottom eigenvalue 2.

This example shows that the eigenvector method still works when assumption (A3) fails, provided \(x_n\ne 0\).

3.3 Deflating a complex conjugate pair \((\lambda _0,\overline{\lambda }_0)\)

We assume now that we are given two complex conjugate eigenvalues \((\lambda _0,\overline{\lambda }_0)\) of a regular and unreduced pencil \(H-\lambda K\) that is in HH form and therefore has only real poles. This implies that assumption (A4) holds. Let us represent the eigenvalues and eigenvectors by their real and imaginary parts: \(\alpha _0 \pm \imath \beta _0\) and \({\textbf{x}}\pm \imath {\textbf{y}}\). Then the eigenvector/eigenvalue equations \((H-(\alpha _0\pm \imath \beta _0) K)({\textbf{x}}\pm \imath {\textbf{y}})=0\) can be expressed as

$$\begin{aligned} H X = KX\Lambda , \quad \textrm{where} \quad \Lambda :=\left[ \begin{array}{cc} \alpha _0 &{} \beta _0 \\ -\beta _0 &{} \alpha _0\end{array}\right] ,\quad X:=\left[ \begin{array}{cc} {\textbf{x}}&{\textbf{y}}\end{array}\right] \end{aligned}$$
(8)

indicating that X spans a two-dimensional deflating subspace of the pencil \(H-\lambda K\). When multiplying X with an invertible matrix R, the new basis XR can be made orthonormal and the matrix \(\Lambda \) then becomes \(R^{-1}\Lambda R\), which preserves its eigenvalues, as expected.
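Relation (8) can be verified numerically. The sketch below uses, as an assumed example, the companion matrix of \((z^2+1)(z-2)(z-3)\) together with \(K=I\), so the pencil is an unreduced HH pencil with the complex pair \(\pm \imath \):

```python
import numpy as np

# Companion matrix of (z^2+1)(z-2)(z-3): unreduced Hessenberg with
# eigenvalues +-i, 2, 3; together with K = I it forms a real HH pencil.
H = np.array([[5., -7., 5., -6.],
              [1.,  0., 0.,  0.],
              [0.,  1., 0.,  0.],
              [0.,  0., 1.,  0.]])
K = np.eye(4)

vals, vecs = np.linalg.eig(H)          # K = I: ordinary eigenproblem
j = int(np.argmax(vals.imag))          # pick the eigenvalue alpha0 + i*beta0
lam, v = vals[j], vecs[:, j]
alpha0, beta0 = lam.real, lam.imag

X = np.column_stack([v.real, v.imag])  # X = [x, y]
Lam = np.array([[alpha0, beta0], [-beta0, alpha0]])

# the complex eigenpair equations reduce to the real relation (8)
assert np.allclose(H @ X, K @ X @ Lam, atol=1e-10)
```

Separating the real and imaginary parts of \(H{\textbf{v}}=\lambda K{\textbf{v}}\) gives \(H{\textbf{x}}=K(\alpha _0{\textbf{x}}-\beta _0{\textbf{y}})\) and \(H{\textbf{y}}=K(\beta _0{\textbf{x}}+\alpha _0{\textbf{y}})\), which is exactly (8) columnwise.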

The following theorem essentially extends the ideas of Theorem 1 to the case of a complex conjugate pair of shifts. Therefore, we restrict the proof to the issues in which the two proofs differ.

Theorem 2

Let \(H-\lambda K\) be a real, regular, proper and unreduced HH pencil with two complex conjugate eigenvalues \((\alpha _0 \pm \imath \beta _0)\) of absolute value \(\mid \lambda _0 \mid \) bounded by 1. Then the following holds:

  1.

    There exists an “essentially unique” basis X of the two-dimensional deflating subspace of the eigenvalue pair \(\alpha _0\pm \imath \beta _0\) such that

    $$ X^TX=I_2, \quad X=\left[ \begin{array}{cc} x_1 &{} y_1 \\ \vdots &{} \vdots \\ x_{n-1} &{} y_{n-1} \\ 0 &{} y_n \end{array}\right] , \quad \textrm{where} \quad x_{n-1}\ne 0, \; y_{n}\ne 0. $$

    Moreover there exists a matrix \(Q:=G^{(r,2)}G^{(r,1)}\), consisting of 2 essentially unique sequences of Givens rotations

    $$G^{(r,i)}=G^{(r,i)}_{i} \ldots G^{(r,i)}_{n+i-3}, \quad i=1,2$$

    such that their product Q gives the QR factorization \(X=Q^T R\);

  2.

    There is a matrix \(Z:=G^{(\ell ,2)} G^{(\ell ,1)}\), consisting of 2 essentially unique sequences of Givens rotations

    $$G^{(\ell ,i)}=G^{(\ell ,i)}_{i} \ldots G^{(\ell ,i)}_{n+i-3}, \quad i=1,2,$$

    and a matrix \(Q:=G^{(r,2)}G^{(r,1)}\), consisting of 2 essentially unique sequences of Givens rotations

    $$G^{(r,i)}=G^{(r,i)}_{i} \ldots G^{(r,i)}_{n+i-3}, \quad i=1,2,$$

    such that the triple (HKX), is transformed into an equivalent one

    $$ (\hat{H},\hat{K},\hat{X}):=(ZHQ^T, ZKQ^T,QX), $$

    where \(\hat{X}\) is upper triangular, and \((\hat{H} - \lambda \hat{K})\) is in HH form with a leading \(2\times 2\) block \([I_2,0](\hat{H}-\lambda \hat{K}) [I_2,0]^T\) that is decoupled and contains the eigenvalues \(\alpha _0\pm \imath \beta _0\).

Proof

Clearly the complex eigenvector \({\textbf{x}}+\imath {\textbf{y}}\) has a non-zero last component because the pencil \(H-\lambda K\) is unreduced; hence, the last row of X is nonzero. After the normalization, this is still the case, and hence there exists a rotation on the columns of X that annihilates \(x_n\). The fact that \(x_{n-1}\) is then non-zero follows from the properness assumption (A3): if \(x_{n-1}=0\), then there exists a rotation such that

\(G_{n-1}\left[ \begin{array}{cc} 0 &{} y_{n-1} \\ 0 &{} y_n \end{array}\right] =\left[ \begin{array}{cc} 0 &{} \hat{y}_{n-1} \\ 0 &{} 0 \end{array}\right] \) implying

$$\left[ \begin{array}{cc} h_{n,n-1}&h_{n,n} \end{array}\right] G_{n-1}^T\left[ \begin{array}{cc} 0 &{} \hat{y}_{n-1} \\ 0 &{} 0 \end{array}\right] = \left[ \begin{array}{cc} k_{n,n-1}&k_{n,n} \end{array}\right] G_{n-1}^T\left[ \begin{array}{cc} 0 &{} \hat{y}_{n-1} \\ 0 &{} 0 \end{array}\right] \Lambda . $$

So both \(\left[ \begin{array}{cc} h_{n,n-1}&h_{n,n} \end{array}\right] \) and \(\left[ \begin{array}{cc} k_{n,n-1}&k_{n,n} \end{array}\right] \) are parallel to the last row of \(G_{n-1}\), and this violates assumption (A3). The only degree of freedom left over is a scaling of the columns of X with \(\pm 1\). Once the properties of X are established, the existence of the sequences of Givens rotations \( G^{(r,i)}=G^{(r,i)}_{i} \ldots G^{(r,i)}_{n+i-3}\), for \(i=1,2\), follows: these are the rotations needed for the classical QR factorization of X. This completes the proof of Item 1.

The proof of Item 2 is very similar to that of Item 2 in Theorem 1, except that \(n=2\) when using Lemmas 3, 4, and 5, and that we need two rotations \(G^{(r,2)}_{i+1}G^{(r,1)}_i\) to annihilate the two bottom positions of the matrix \(\hat{X}\), and then two rotations \(G^{(\ell ,2)}_{i+1}G^{(\ell ,1)}_i\) to restore the Hessenberg form of K if \(\mid \lambda _0 \mid \le 1\) and of H otherwise.\(\square \)
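The construction of the basis X in Item 1 can be mimicked numerically: orthonormalize \([{\textbf{x}},{\textbf{y}}]\) and apply one column rotation annihilating the last entry of the first column. The sketch below (illustrative only, reusing the companion-matrix example with \(K=I\)) checks the claimed properties:

```python
import numpy as np

H = np.array([[5., -7., 5., -6.],      # eigenvalues +-i, 2, 3 (companion form)
              [1.,  0., 0.,  0.],
              [0.,  1., 0.,  0.],
              [0.,  0., 1.,  0.]])
vals, vecs = np.linalg.eig(H)           # K = I case of the pencil
v = vecs[:, int(np.argmax(vals.imag))]  # eigenvector of the complex pair

X, _ = np.linalg.qr(np.column_stack([v.real, v.imag]))  # orthonormal basis
u, w = X[-1, 0], X[-1, 1]               # last row is nonzero (unreduced pencil)
rho = np.hypot(u, w)
c, s = w / rho, -u / rho
X = X @ np.array([[c, -s], [s, c]])     # column rotation: zero X[-1, 0]

assert np.allclose(X.T @ X, np.eye(2))                 # X^T X = I_2
assert abs(X[-1, 0]) < 1e-12 and abs(X[-1, 1]) > 1e-8  # zero pattern of Item 1
assert abs(X[-2, 0]) > 1e-8                            # x_{n-1} != 0 (properness)
P = X @ X.T                             # span unchanged: still deflating
assert np.allclose(P @ (H @ X), H @ X, atol=1e-8)
```

The column rotation stays within the span of \(\{{\textbf{x}},{\textbf{y}}\}\), so X still spans the deflating subspace, now with the zero pattern required by the theorem.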

The reduction described in Theorem 2, transforming an orthogonal basis of the real and imaginary parts of an eigenvector corresponding to a complex conjugate eigenvalue \( \lambda _0 \) into a multiple of \( [{\textbf{e}}_1, {\textbf{e}}_2], \) and modifying the matrices H and K, is graphically depicted in Fig. 2, for \( n=6.\) We display the evolution of the triple \((H,K,X)\).

Fig. 2

Graphical description of the reduction of the real and the imaginary parts of an eigenvector corresponding to a complex conjugate eigenvalue to a multiple of \( [{\textbf{e}}_1, {\textbf{e}}_2] \). The matrices H and K were scaled to have norm 1, and \(\epsilon _M\)-small elements were put equal to zero

3.4 Perfect shift of a block HH pencil

Let us now consider the case of complex conjugate poles in the pencil \(H-\lambda K\). We then have a real block-HH pencil where the diagonal blocks of the pole pencil \(H_p-\lambda K_p\) are \(1\times 1\) or \(2\times 2\). Again, we describe the method for a shift \(\lambda _0\) of modulus smaller than or equal to 1. In that case, we assume \(K_p\) to be upper triangular (and hence K is Hessenberg), while \(H_p\) is block triangular (and hence H is block Hessenberg):

$$ H= \left[ \begin{array}{ccccc|c} H_{1,1} &{} H_{1,2} &{} \ldots &{} \ldots &{} H_{1,n-1} &{} H_{1,n} \\ \hline H_{2,1} &{} H_{2,2} &{} \ldots &{} \ldots &{} H_{2,n-1} &{} H_{2,n} \\ &{} H_{3,2} &{} \ddots &{} &{} H_{3,n-1} &{} H_{3,n} \\ &{} &{} \ddots &{} \ddots &{} \vdots &{} \vdots \\ &{} &{} &{} H_{n-1,n-2} &{} H_{n-1,n-1} &{} H_{n-1,n} \\ &{} &{} &{} &{} H_{n,n-1} &{} H_{n,n} \end{array}\right] $$

Theorems 1 and 2 are still valid, except that the HH form is now replaced by a block HH form for the pencil \(H-\lambda K\). We briefly discuss the differences of the algorithm for both the case of a real shift and a complex conjugate pair. The proofs of our arguments follow from Lemmas 3, 4, and 5.

3.5 A single real shift

If the bottom block \(H_{n,n-1}\) is \(1\times 1\) then a single Givens rotation \(G^{(r)T}_{n-1}\) will rotate the shift \(\lambda _0\) to position \((n,n-1)\). If, on the other hand, the bottom block \(H_{n,n-1}\) is \(2\times 2\), two Givens transformations \(G^{(r)T}_{n-1}\) and \(G^{(r)T}_{n-2}\) and one Givens rotation \(G^{(\ell )}_{n-1}\) have to be applied to \(H-\lambda K\) to move the shift to position \((n-1,n-2)\).

After this preliminary step, \(\lambda _0\) is swapped with the next block on the diagonal of the pole pencil. If this block is \(1 \times 1\), a rotation \(G^{(r)T}_{i-1}\) followed by a rotation \(G^{(\ell )}_{i}\) moves the shift one position up. If this block is \(2 \times 2\), two rotations \(G^{(r)T}_{i-1}\) and \(G^{(r)T}_{i-2}\) followed by two rotations \(G^{(\ell )}_{i}\) and \(G^{(\ell )}_{i-1}\) move the shift two positions up.

The RQZ step is finalized by a single rotation \(G^{(\ell )}_{1}\) moving the shift to position (1, 1) in \(H-\lambda K\).

3.6 Two complex conjugate shifts

Here again there are two different starting scenarios. If the bottom block \(H_{n,n-1}\) is \(2\times 2\) or if the two bottom blocks are both \(1\times 1\), then a pair of Givens rotations \(G^{(r,1)T}_{n-2}G^{(r,2)T}_{n-1}\) will move the shifts \((\lambda _0,\overline{\lambda }_0)\) to a new \(2\times 2\) block \(H_{n,n-1}\). If this is not the case, two pairs of Givens rotations \(G^{(r,1)T}_{n-2}G^{(r,2)T}_{n-1}\) and \(G^{(r,1)T}_{n-3}G^{(r,2)T}_{n-2}\) and one pair of Givens rotations \(G^{(\ell ,2)}_{n-2}G^{(\ell ,1)}_{n-1}\) have to be applied to \(H-\lambda K\) to move the pair \((\lambda _0,\overline{\lambda }_0)\) to position \((n-1,n-2)\).

After this preliminary step, the pair \((\lambda _0,\overline{\lambda }_0)\) is swapped with the next block on the diagonal of the pole pencil. If this block is \(1 \times 1\), a pair of rotations \(G^{(r,1)T}_{i-2}G^{(r,2)T}_{i-1}\) followed by a pair of rotations \(G^{(\ell ,2)}_{i-1}G^{(\ell ,1)}_{i}\) moves the pair one position up. If this block is \(2 \times 2\), two such pairs of rotations are used to move the pair \((\lambda _0,\overline{\lambda }_0)\) two positions up.

The RQZ step is finalized by a single pair of rotations \(G^{(\ell ,1)}_{1}\) \(G^{(\ell ,2)}_{2}\) moving the pair \((\lambda _0,\overline{\lambda }_0)\) to the (1, 1) block in \(H-\lambda K\).

The graphical description of this reduction, transforming an orthogonal basis of the real and imaginary parts of an eigenvector corresponding to a complex conjugate eigenvalue \( \lambda _0 \) into a multiple of \( [{\textbf{e}}_1, {\textbf{e}}_2], \) and modifying the matrices H and K, is depicted in Fig. 3, for \( n=7.\) The evolution of the triple \((H,K,X)\) is displayed in that figure.

Fig. 3

Graphical description of the reduction of the real and the imaginary parts of an eigenvector corresponding to a complex conjugate eigenvalue to a multiple of \( [{\textbf{e}}_1, {\textbf{e}}_2], \) with K in Hessenberg form and H in block Hessenberg form. The matrices H and K were scaled to have norm 1, and \(\epsilon _M\)-small elements were set to zero

4 Approximation of eigenvalue/eigenvector pair

In this section, we show how to find or improve an approximation \((\tilde{\lambda },\tilde{\textbf{x}})\) to an exact eigenpair \((\lambda ,{\textbf{x}})\) of a pencil \(H-\lambda K\) that is in proper HH form. This section applies to both real and complex eigenvalues. The eigenvalue \(\tilde{\lambda }\) is given as the ratio

$$\begin{aligned} \widetilde{\lambda }=\widetilde{\alpha }/\widetilde{\beta }\quad \textrm{with}\quad {\left| \widetilde{\alpha }\right| }^{2}+{\left| \widetilde{\beta }\right| }^{2}=1 \end{aligned}$$

and the eigenvector \(\tilde{\textbf{x}}\) is assumed to have unit norm \(\Vert \tilde{\textbf{x}}\Vert _2=1\). We want to improve this approximation by reducing the norm \(\Vert {\textbf{r}}\Vert _2\) of the residual \({\textbf{r}}\) defined by

$$\begin{aligned} (\tilde{\alpha }H - \tilde{\beta }K ) \tilde{\textbf{x}}=: {\textbf{r}}, \end{aligned}$$
(9)

where \({\textbf{r}}\) is assumed to be small, but nonzero. If the vector \(\tilde{\textbf{x}}\) is given, then the minimization of \(\Vert {\textbf{r}}\Vert _2\) is equivalent to

$$\underset{{\Vert \left[ \begin{array}{c}\widetilde{\alpha }\\ \widetilde{\beta }\end{array}\right] \Vert }_{2}=1}{\textrm{min}}{\left\Vert \left[ \begin{array}{cc} H\tilde{\textbf{x}}&-K\tilde{\textbf{x}}\end{array}\right] \left[ \begin{array}{c}\widetilde{\alpha }\\ \widetilde{\beta }\end{array}\right] \right\Vert }_{2}, $$

which is a total least squares problem [1] that can be solved by choosing \(\left[ \begin{array}{c}\widetilde{\alpha }\\ \widetilde{\beta }\end{array}\right] ={\textbf{v}}_{2}\), the right singular vector corresponding to the smallest singular value \(\sigma _2\) in the singular value decomposition of the matrix

$$\left[ \begin{array}{cc} H\tilde{\textbf{x}}&- K\tilde{\textbf{x}}\end{array}\right] = \left[ \begin{array}{cc} {\textbf{u}}_1&{\textbf{u}}_2 \end{array}\right] \left[ \begin{array}{cc} \sigma _1 &{} 0 \\ 0 &{} \sigma _2 \end{array}\right] \left[ \begin{array}{cc} {\textbf{v}}_1&{\textbf{v}}_2 \end{array}\right] ^H. $$

If the vectors \(H\tilde{\textbf{x}}\) and \(K\tilde{\textbf{x}}\) are not parallel, this update is guaranteed to decrease the norm \(\Vert {\textbf{r}}\Vert _2\) (see [1]).
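Since the matrix \(\left[ H\tilde{\textbf{x}}\;\; -K\tilde{\textbf{x}}\right] \) has only two columns, its right singular vectors are the eigenvectors of its \(2\times 2\) Gram matrix, so \({\textbf{v}}_2\) can be obtained in closed form. A minimal real-arithmetic sketch (hypothetical function names, not the authors' code):

```python
import math

def matvec(M, x):
    return [sum(mij * xj for mij, xj in zip(row, x)) for row in M]

def best_alpha_beta(H, K, x):
    """Unit (alpha, beta) minimizing || alpha*H*x - beta*K*x ||_2:
    the eigenvector of the 2x2 Gram matrix of [H*x, -K*x] for its
    smallest eigenvalue (= smallest squared singular value)."""
    u, v = matvec(H, x), [-t for t in matvec(K, x)]
    g11 = sum(t * t for t in u)
    g12 = sum(s * t for s, t in zip(u, v))
    g22 = sum(t * t for t in v)
    tr, det = g11 + g22, g11 * g22 - g12 * g12
    lam = tr / 2.0 - math.sqrt(max(tr * tr / 4.0 - det, 0.0))
    if abs(g12) > 0.0:
        a, b = g12, lam - g11
    else:  # Gram matrix already diagonal
        a, b = (1.0, 0.0) if g11 <= g22 else (0.0, 1.0)
    n = math.hypot(a, b)
    return a / n, b / n

# x is an exact eigenvector here, so the optimal residual is zero.
H = [[2.0, 1.0], [0.0, 6.0]]
K = [[1.0, 1.0], [0.0, 3.0]]
x = [1.0, 0.0]
alpha, beta = best_alpha_beta(H, K, x)
res = math.sqrt(sum((alpha * h - beta * k) ** 2
                    for h, k in zip(matvec(H, x), matvec(K, x))))
```

The Gram-matrix route avoids forming a full SVD of an \(n\times 2\) matrix; for ill-conditioned data a genuine thin SVD is the numerically safer choice.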

Suppose now that \(\tilde{\lambda }\) is given. The best choice for \(\tilde{\textbf{x}}\) to reduce the residual norm \(\Vert {\textbf{r}}\Vert _2\) in (9) is then the right singular vector \({\textbf{v}}_n\) corresponding to the smallest singular value of \( \tilde{M}:=\tilde{\alpha }H - \tilde{\beta }K \), but computing a full singular value decomposition may be too expensive when incorporated in an iteration where \(\tilde{\lambda }\) and \(\tilde{\textbf{x}}\) are updated recursively. A simpler scheme is to apply one step of inverse iteration

$$ {\textbf{z}}:= \tilde{M}^{-1}(\tilde{M}^{-1})^H\tilde{\textbf{x}}, \quad \tilde{\textbf{x}}_{new}:= {\textbf{z}}/\Vert {\textbf{z}}\Vert _2, $$

which is again guaranteed to decrease the norm of the residual if the singular values of \(\tilde{M}\) are distinct (see [5]).
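For a real pencil (so \(\tilde{M}^H=\tilde{M}^T\)) this inverse iteration step amounts to two linear solves followed by a normalization. A hedged sketch with a plain dense solver (hypothetical names; a practical implementation would exploit the Hessenberg structure of \(\tilde{M}\)):

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting
    (dense, intended only for small n)."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]   # augmented system
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def inverse_iteration_step(M, x):
    """One step of z = M^{-1} M^{-T} x, then normalize (real case)."""
    Mt = [list(col) for col in zip(*M)]   # transpose of M
    y = solve(Mt, x)                      # y = M^{-T} x
    z = solve(M, y)                       # z = M^{-1} y
    nz = math.sqrt(sum(t * t for t in z))
    return [t / nz for t in z]

# M is nearly singular; its near-null right singular vector is ~e1,
# and a single step pulls the iterate onto it.
M = [[1.0e-8, 1.0], [0.0, 1.0]]
x1 = inverse_iteration_step(M, [0.6, 0.8])
```

One such step is exactly one iteration of the power method on \(\tilde{M}^{-1}\tilde{M}^{-H}\), whose dominant eigenvector is the singular vector \({\textbf{v}}_n\) sought above.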

The procedure explained in this section, to refine the approximate pair \((\tilde{\lambda },\tilde{\textbf{x}})\) towards the exact pair \((\lambda ,{\textbf{x}})\), is primarily aimed at improving the residual of a scaled eigenvalue problem, as explained in the next section.

5 Improving the scaled residual

Let us suppose now that the pair \((\tilde{\lambda },\tilde{\textbf{x}})\) yields a residual (9) that is of the order of \(\epsilon _M\Vert (H,K)\Vert _F\). The backward error analysis of [6, 7] then shows that we need the stronger bounds

$$\begin{aligned} \mid r_{i+1}\mid \le \epsilon _M\Vert (H,K)\Vert _F \Vert \tilde{\textbf{x}}(i:n)\Vert _2, \quad i=1,\ldots ,n-1 \end{aligned}$$
(10)

to ensure that the structured backward error of the RQZ step with perfect shift is also of the order of \(\epsilon _M\Vert (H,K)\Vert _F\). This can be achieved as follows. We first update the eigenvalue using the procedure explained in Sect. 4, which already reduces the residual; for simplicity, we keep the same notation after this step. Then define the vector \({\textbf{d}}\) as

$$ d_1=1, \quad d_{i+1}= 2^{\textrm{round}(\log _2\Vert \tilde{\textbf{x}}(i:n)\Vert _2)}, \quad i=1,\ldots ,n-1; $$

then \(d_{i+1}/\sqrt{2}\le \Vert \tilde{\textbf{x}}(i:n)\Vert _2\le d_{i+1}\sqrt{2} \) and the vector \({\textbf{d}}\) is non-increasing (i.e., \(d_{i+1}\le d_i\)) since \(\Vert \tilde{\textbf{x}}(i:n)\Vert _2\le \Vert \tilde{\textbf{x}}(i-1:n)\Vert _2\). Moreover, the pencil matrices

$$ H_d:= D^{-1}HD, \quad K_d:= D^{-1}KD, \quad \textrm{with} \quad D:=\textrm{diag}(d_1, \ldots , d_n) $$

satisfy the bounds

$$ \Vert (H_d,K_d)\Vert _F \le \gamma \Vert (H,K)\Vert _F, \quad \textrm{where} \quad \gamma := \max _{1\le i \le n-1} d_i/d_{i+1} \ge 1, $$

and the equation

$$ (\tilde{\alpha }H_d -\tilde{\beta }K_d) D^{-1}\tilde{\textbf{x}}= D^{-1}{\textbf{r}}. $$
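The construction of \({\textbf{d}}\) and of the scaled pencil can be sketched as follows (hypothetical helper names). Since the entries of \(D\) are powers of two, forming \(D^{-1}HD\) and \(D^{-1}KD\) introduces no rounding errors in binary floating point:

```python
import math

def scaling_vector(x):
    """d_1 = 1 and d_{i+1} = 2^round(log2 ||x(i:n)||_2): a non-increasing
    vector of powers of two tracking the tail norms of x."""
    n = len(x)
    tails = [math.sqrt(sum(t * t for t in x[i:])) for i in range(n - 1)]
    return [1.0] + [2.0 ** round(math.log2(t)) for t in tails]

def scale_pencil(H, K, d):
    """Form H_d = D^{-1} H D and K_d = D^{-1} K D elementwise:
    entry (i, j) is multiplied by d_j / d_i."""
    Hd = [[h * d[j] / d[i] for j, h in enumerate(row)]
          for i, row in enumerate(H)]
    Kd = [[k * d[j] / d[i] for j, k in enumerate(row)]
          for i, row in enumerate(K)]
    return Hd, Kd

x = [0.9, 0.4, 0.15, 0.02]        # an (approximately unit) eigenvector
d = scaling_vector(x)             # here d = [1.0, 1.0, 0.5, 0.125]
H1 = [[1.0] * 4 for _ in range(4)]
Hd, Kd = scale_pencil(H1, H1, d)  # entry (i, j) becomes d_j / d_i
```

Note how each subdiagonal entry \( (i+1,i) \) is amplified by \(d_i/d_{i+1}\le \gamma \), which is exactly where the factor \(\gamma \) in the bound on \(\Vert (H_d,K_d)\Vert _F\) originates.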

The scaled subvectors of \(\tilde{\textbf{x}}_d:=D^{-1}\tilde{\textbf{x}}\) then have approximately the same norm (see Appendix “Scaling”):

$$ \frac{1}{\gamma \sqrt{2}} \le \Vert \tilde{\textbf{x}}_d(n-1:n)\Vert _2 \le \ldots \le \Vert \tilde{\textbf{x}}_d(1:n)\Vert _2 \le \sqrt{2n}, $$

which implies that the norm of \(\tilde{\textbf{x}}_d\) is of the order of 1. After performing one step of inverse iteration on \(\tilde{\textbf{x}}_d\) to improve that computed eigenvector, we obtain a scaled residual \({\textbf{r}}_{d,new}=(\tilde{\alpha }H_d -\tilde{\beta }K_d) \tilde{\textbf{x}}_{d,new}\) satisfying (for a moderate value of c)

$$ \Vert {\textbf{r}}_{d,new} \Vert _2 \le c\epsilon _M\Vert (H_d,K_d)\Vert _F \le c\gamma \epsilon _M\Vert (H,K)\Vert _F. $$

Multiplying the above equation by D yields, in the original coordinate system,

$$ \tilde{\textbf{x}}_{new}:= D\tilde{\textbf{x}}_{d,new}, \quad \tilde{\textbf{r}}_{new}:= D{\textbf{r}}_{d,new}, \quad (\tilde{\alpha }H -\tilde{\beta }K)\tilde{\textbf{x}}_{new} = \tilde{\textbf{r}}_{new}. $$

The \((i+1)\)-th element of \(\tilde{\textbf{r}}_{new}\) then satisfies the required bound since

$$ \mid {\textbf{e}}_{i+1}^T\tilde{\textbf{r}}_{new}\mid = d_{i+1}\mid {\textbf{e}}_{i+1}^T{\textbf{r}}_{d,new}\mid \le d_{i+1}\Vert {\textbf{r}}_{d,new}\Vert _2\le c\gamma d_{i+1} \epsilon _M\Vert (H,K)\Vert _F. $$

If the constant factor \(c\gamma \) is large, the scaled refinement step may not yield the expected error bound (10) and an additional refinement step may be needed. In the numerical experiments section, we show that one step of refinement often yields a satisfactory result.

The above method can also be applied to complex eigenvectors, but its impact on the properties of a real deflating subspace X used in the case of complex conjugate pairs of eigenvalues is not clear.

The efficacy of the above approximations, and their use for complex conjugate pairs, is verified in Sect. 6.

6 Numerical results

In this section, we report some numerical experiments. All the computations were performed with Matlab ver. R2022a with machine precision \( \epsilon _M \approx 2.22 \times 10^{-16}. \)

We consider 10,000 HH matrix pencils \( (H^{(i)},K^{(i)})\) of size 100, with pseudo-random entries drawn from the standard normal distribution (generated by the Matlab function randn) and scaled such that \( \Vert H^{(i)} \Vert _2=\Vert K^{(i)} \Vert _2=1.\) For each pencil, we randomly pick a real and a complex conjugate eigenpair \((\lambda ^{(i)}, {\textbf{x}}^{(i)}) \) and apply the perfect shift technique to deflate that particular eigenvalue, obtaining the new HH matrix pencils \( (\tilde{H}^{(i)},\tilde{K}^{(i)})\). Furthermore, we also apply the improved scaled residual approach described in Sect. 5 to compute a better approximation of the eigenpair, obtaining \((\hat{\lambda }^{(i)}, \hat{{\textbf{x}}}^{(i)}) \), and deflate it by means of the perfect shift technique, obtaining the HH matrix pencils \( (\hat{H}^{(i)},\hat{K}^{(i)})\).

We define \(b^{(i)}=1/\sqrt{1+ \lambda ^{(i)^{2}}},\) \(a^{(i)}=\lambda ^{(i)} \ b^{(i)}\), \( \hat{b}^{(i)}=1/\sqrt{1+ \hat{\lambda }^{(i)^{2}}}\), \(\hat{a}^{(i)}=\hat{\lambda }^{(i)} \ \hat{b}^{(i)}. \) Moreover, we adopt the Matlab notation \(\texttt{tril} (F, k)\) to denote the part of the matrix F on and below the kth diagonal.
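For reference, Matlab's convention for \(\texttt{tril}(F,k)\), keeping the entries on and below the k-th diagonal (negative k selects subdiagonals), can be mimicked as follows; this is only an illustrative sketch:

```python
def tril(F, k):
    """Entries of F on and below the k-th diagonal (Matlab convention):
    entry (i, j) is kept when j - i <= k, zeroed otherwise."""
    return [[fij if j - i <= k else 0.0 for j, fij in enumerate(row)]
            for i, row in enumerate(F)]

F = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0],
     [7.0, 8.0, 9.0]]
T = tril(F, -2)   # only the entry in position (3, 1) survives
```

With \(k=-2\), as used in the histograms below, \(\texttt{tril}(F,-2)\) thus measures exactly the part of F strictly below the first subdiagonal.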

The results are depicted in the histograms displayed in the following pictures. In each figure, the histogram to the left refers to the matrices \( (\tilde{H}^{(i)},\tilde{K}^{(i)})\), while the one to the right refers to the HH matrix pencils \( (\hat{H}^{(i)},\hat{K}^{(i)})\).

The first five figures concern the deflation of a real eigenpair, while the last three figures refer to the complex conjugate case.

In Fig. 4, the histograms of \(\log _{10}\sqrt{\tilde{c}_1^{(i)}+\tilde{c}_2^{(i)}} \) (left), with \(\tilde{c}_1^{(i)}=\Vert \texttt{tril}(\tilde{H}^{(i)},-2)\Vert _2^2+\Vert \texttt{tril}(\tilde{K}^{(i)},-2)\Vert _2^2 \) and \(\tilde{c}_2^{(i)}=\mid \tilde{H}^{(i)}_{2,1}\mid ^2+\mid \tilde{K}^{(i)}_{2,1}\mid ^2\), and of \(\log _{10} \sqrt{\hat{c}_1^{(i)}+\hat{c}_2^{(i)}}\) (right), with \(\hat{c}_1^{(i)}=\Vert \texttt{tril}(\hat{H}^{(i)},-2)\Vert _2^2+\Vert \texttt{tril}(\hat{K}^{(i)},-2) \Vert _2^2\) and \(\hat{c}_2^{(i)}=\mid \hat{H}^{(i)}_{2,1}\mid ^2+\mid \hat{K}^{(i)}_{2,1}\mid ^2, \) are displayed. It can be noticed that if the improved scaled residual approach is not applied, the part below the first subdiagonal of the computed HH matrices often gets blurred.

Fig. 4

Histograms of \(\log _{10}\sqrt{\tilde{c}_1^{(i)}+\tilde{c}_2^{(i)}} \) (left), with \(\tilde{c}_1^{(i)}=\Vert \texttt{tril}(\tilde{H}^{(i)},-2)\Vert _2^2+\Vert \texttt{tril}(\tilde{K}^{(i)},-2)\Vert _2^2 \) and \(\tilde{c}_2^{(i)}=\mid \tilde{H}^{(i)}_{2,1}\mid ^2+\mid \tilde{K}^{(i)}_{2,1}\mid ^2\), and of \(\log _{10} \sqrt{\hat{c}_1^{(i)}+\hat{c}_2^{(i)}}\) (right), with \(\hat{c}_1^{(i)}=\Vert \texttt{tril}(\hat{H}^{(i)},-2)\Vert _2^2+\Vert \texttt{tril}(\hat{K}^{(i)},-2) \Vert _2^2\) and \(\hat{c}_2^{(i)}=\mid \hat{H}^{(i)}_{2,1}\mid ^2+\mid \hat{K}^{(i)}_{2,1}\mid ^2\); real eigenvalue

In Fig. 5, the histograms of \(\log _{10} \mid b \tilde{H}_{1,1}-a\tilde{K}_{1,1} \mid \) (left) and \(\log _{10} \mid \hat{b} \hat{H}_{1,1}-\hat{a}\hat{K}_{1,1} \mid \) (right), are displayed.

Fig. 5

Histograms of \(\log _{10} \mid b \tilde{H}_{1,1}-a\tilde{K}_{1,1} \mid \) (left) and \(\log _{10} \mid \hat{b} \hat{H}_{1,1}-\hat{a}\hat{K}_{1,1} \mid \) (right), real eigenvalue

In Fig. 6, the histograms of the logarithms of the residuals \(\log _{10} \Vert (a \tilde{K}- b \tilde{H} ){\textbf{x}}\Vert _2 \) (left) and \(\log _{10} \Vert ( \hat{a}\hat{K}- \hat{b} \hat{H}) \hat{{\textbf{x}}}\Vert _2\) (right), are displayed.

Fig. 6

Histograms of the logarithms of the residuals, \(\log _{10} \Vert (a K-b H ){\textbf{x}}\Vert _2 \) (left) and \(\log _{10} \Vert ( \hat{a} {K}- \hat{b} {H}) \hat{{\textbf{x}}}\Vert _2\) (right), real eigenvalue

The histograms in Fig. 7 show the accuracy of the poles in the HH matrices after deflation. In particular, using the definition \( p_j = H_{j+1,j} / K_{j+1,j}, \) \( \tilde{p}_j = \tilde{H}_{j+2,j+1} / \tilde{K}_{j+2,j+1}, \) and \( \hat{p}_j =\hat{H}_{j+2,j+1} /\hat{K}_{j+2,j+1}, \) \( j = 1,\ldots ,n-2, \) the values of \(\log _{10} \max _{j} \frac{ \mid p_j - \tilde{p}_j \mid }{ \mid p_j \mid } \) (left) and \(\log _{10} \max _{j} \frac{ \mid p_j - \hat{p}_j \mid }{ \mid p_j \mid } \) (right), are displayed.

Fig. 7

Accuracy of the poles in the HH matrices after deflation, real eigenvalue

Fig. 8

Histograms of \(\log _{10}\sqrt{\tilde{c}_1^{(i)}+\tilde{c}_2^{(i)}} \) (left), with \(\tilde{c}_1^{(i)}=\Vert \texttt{tril}(\tilde{H}^{(i)},-2)\Vert _2^2+\Vert \texttt{tril}(\tilde{K}^{(i)},-2)\Vert _2^2 \) and \(\tilde{c}_2^{(i)}=\mid \tilde{H}^{(i)}_{2,1}\mid ^2+\mid \tilde{K}^{(i)}_{2,1}\mid ^2\), and of \(\log _{10} \sqrt{\hat{c}_1^{(i)}+\hat{c}_2^{(i)}}\) (right), with \(\hat{c}_1^{(i)}=\Vert \texttt{tril}(\hat{H}^{(i)},-2)\Vert _2^2+\Vert \texttt{tril}(\hat{K}^{(i)},-2) \Vert _2^2\) and \(\hat{c}_2^{(i)}=\mid \hat{H}^{(i)}_{2,1}\mid ^2+\mid \hat{K}^{(i)}_{2,1}\mid ^2\); complex conjugate eigenpair

The next three figures report the histograms corresponding to the complex conjugate eigenvalue pair \((\lambda ^{(i)}, \bar{\lambda }^{(i)})\). The histograms of \(\log _{10}\sqrt{\tilde{c}_1^{(i)}+\tilde{c}_2^{(i)}} \) (left), with \(\tilde{c}_1^{(i)}=\Vert \texttt{tril}(\tilde{H}^{(i)},-2)\Vert _2^2+\Vert \texttt{tril}(\tilde{K}^{(i)},-2)\Vert _2^2 \) and \(\tilde{c}_2^{(i)}=\mid \tilde{H}^{(i)}_{2,1}\mid ^2+\mid \tilde{K}^{(i)}_{2,1}\mid ^2\), and of \(\log _{10} \sqrt{\hat{c}_1^{(i)}+\hat{c}_2^{(i)}}\) (right), with \(\hat{c}_1^{(i)}=\Vert \texttt{tril}(\hat{H}^{(i)},-2)\Vert _2^2+\Vert \texttt{tril}(\hat{K}^{(i)},-2) \Vert _2^2\) and \(\hat{c}_2^{(i)}=\mid \hat{H}^{(i)}_{2,1}\mid ^2+\mid \hat{K}^{(i)}_{2,1}\mid ^2, \) are displayed in Fig. 8. As in the real case, it can be noticed that if the improved scaled residual approach is not applied, the part below the first subdiagonal of the computed HH matrices gets blurred.

In Fig. 9, the histograms of the logarithms of the residuals \(\log _{10} \Vert (a {K}-b {H} ){\textbf{x}}\Vert _2 \) (left) and \(\log _{10} \Vert ( \hat{a}{K}- \hat{b} {H}) \hat{{\textbf{x}}}\Vert _2\) (right), are displayed.

Fig. 9

Histograms of the logarithms of the residuals \(\log _{10} \Vert (\tilde{a} K- \tilde{b} H ){\textbf{x}}\Vert _2 \) (left) and \(\log _{10} \Vert ( \hat{a}\hat{K}- \hat{b} \hat{H}) \hat{{\textbf{x}}}\Vert _2\) (right), complex conjugate eigenpair

Fig. 10

Accuracy of the poles in the HH matrices after deflation, complex conjugate eigenpair

The histograms in Fig. 10 display the accuracy of the poles in the HH matrices after deflation. In particular, using the definition \( p_j = H_{j+1,j} / K_{j+1,j}, \) \( \tilde{p}_j = \tilde{H}_{j+3,j+2} / \tilde{K}_{j+3,j+2}, \) and \( \hat{p}_j =\hat{H}_{j+3,j+2} /\hat{K}_{j+3,j+2}, \) \( j = 1,\ldots ,n-3, \) the values of \(\log _{10} \max _{j} \frac{ \mid p_j - \tilde{p}_j \mid }{ \mid p_j \mid } \) (left) and \(\log _{10} \max _{j} \frac{ \mid p_j - \hat{p}_j \mid }{ \mid p_j \mid } \) (right), are displayed.