A note on the convergence theorem of the tridiagonal QR algorithm with Wilkinson’s shift

Aishima, Kensuke

doi:10.1007/s13160-015-0171-y

Original Paper
Area 2
Published: 12 June 2015

Volume 32, pages 465–487, (2015)
Japan Journal of Industrial and Applied Mathematics

Kensuke Aishima¹


Abstract

We discuss the convergence rate of the QR algorithm with Wilkinson’s shift for tridiagonal symmetric eigenvalue problems. It is well known that the convergence rate is theoretically at least quadratic, and practically better than cubic for most matrices. In an effort to derive the convergence rate, the limiting patterns of some lower right submatrices have been intensively investigated. In this paper, we first describe a new limiting pattern of the lower right 3-by-3 submatrix with a concrete example, and then prove that the convergence rate of this new pattern is strictly cubic. In addition, we stress that our analysis identifies three classes of the limiting patterns of the tridiagonal QR algorithm with Wilkinson’s shift.


References

1. Hoffmann, W., Parlett, B.N.: A new proof of global convergence for the tridiagonal QL algorithm. SIAM J. Numer. Anal. 15, 929–937 (1978)
2. Huang, H., Tam, T.: On the QR iterations of real matrices. Linear Algebra Appl. 408, 161–176 (2005)
3. Jiang, E., Zhang, Z.: A new shift of the QL algorithm for irreducible symmetric matrices. Linear Algebra Appl. 65, 261–272 (1985)
4. Leite, R.S., Saldanha, N.C., Tomei, C.: The asymptotics of Wilkinson’s shift: loss of cubic convergence. Found. Comput. Math. 10, 15–36 (2010)
5. Parlett, B.N.: Canonical decomposition of Hessenberg matrices. Math. Comput. 21, 223–227 (1967)
6. Parlett, B.N.: Global convergence of the basic QR algorithm on Hessenberg matrices. Math. Comput. 22, 803–817 (1968)
7. Parlett, B.N.: The symmetric eigenvalue problem. Prentice-Hall, Englewood Cliffs, New Jersey, 1980; SIAM, Philadelphia (1998)
8. Wang, T.: Convergence of the tridiagonal QR algorithm. Linear Algebra Appl. 322, 1–17 (2001)
9. Wilkinson, J.H.: The algebraic eigenvalue problem. Clarendon Press, Oxford (1965)
10. Wilkinson, J.H.: Global convergence of tridiagonal QR algorithm with origin shifts. Linear Algebra Appl. 1, 409–420 (1968)
11. Zhang, G.: On the convergence rate of the QL algorithm with Wilkinson’s shift. Linear Algebra Appl. 113, 131–137 (1989)

Acknowledgments

The author is grateful to Professor Takayasu Matsuo and Professor Beresford Parlett for their valuable comments and suggestions. The author also thanks the anonymous reviewer for the helpful comments.

Author information

Authors and Affiliations

The University of Tokyo, Hongo 7-3-1, Bunkyo-ku, Tokyo, Japan
Kensuke Aishima

Corresponding author

Correspondence to Kensuke Aishima.

Additional information

The author is supported by JSPS Grant-in-Aid for Young Scientists (Grant Number 25790096).

Appendices

Appendix A: Proof of convergence

We prove here that all the tridiagonal elements of $T^{(n)}$ converge for any initial matrix $T^{(0)}$. To this end, we consider a general shift $s^{(n)}$ satisfying the following conditions:

(i) The shift $s^{(n)}$ converges to a certain eigenvalue $s^{(\infty )}=\lambda _{l}$;
(ii) $|s^{(n)}-\lambda _{l}|=\mathrm{o}(c^{n})$ for a positive constant $c<1$.

Note that Wilkinson’s shift satisfies both conditions. Condition (i) was proved in [8]. For condition (ii), we have $|s^{(n)}-\lambda _{l}|\le |s^{(n)}-\alpha _{m}^{(n)}|+|\alpha _{m}^{(n)}-\lambda _{l}| \le 2|\beta _{m-1}^{(n)}|$ by the definition of Wilkinson’s shift and Gershgorin’s circle theorem, and $|\beta _{m-1}^{(n+1)}|=\mathrm{O}(|\beta _{m-1}^{(n)}|^2)$ by Wilkinson’s proof. Hence the convergence $s^{(n)}\rightarrow \lambda _{l}$ of the shifts is at least quadratic, which implies (ii).
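The bound $|s^{(n)}-\alpha _{m}^{(n)}|\le |\beta _{m-1}^{(n)}|$ used above can be illustrated with a minimal sketch. It assumes the usual definition of Wilkinson’s shift as the eigenvalue of the trailing 2-by-2 block closest to $\alpha _{m}$; the function name `wilkinson_shift` and the sample entries are illustrative choices, not taken from the paper.

```python
import numpy as np

def wilkinson_shift(a_prev, a_last, b):
    """Eigenvalue of [[a_prev, b], [b, a_last]] closest to a_last.

    Assumed (standard) form of Wilkinson's shift; ties are broken by
    treating sign(0) as +1.
    """
    d = 0.5 * (a_prev - a_last)
    if d == 0.0 and b == 0.0:
        return a_last
    sign = 1.0 if d >= 0.0 else -1.0
    # Cancellation-free form: the correction is at most |b| in magnitude.
    return a_last - sign * b * b / (abs(d) + np.hypot(d, b))

# Hypothetical trailing entries alpha_{m-1} = 2, alpha_m = 1, beta_{m-1} = 1.
s = wilkinson_shift(2.0, 1.0, 1.0)
distance_to_alpha_m = abs(s - 1.0)  # bounded by |beta_{m-1}| = 1
```

The denominator $|d|+\sqrt{d^2+b^2}$ is at least $|b|$, which is exactly why the distance from the shift to $\alpha _{m}$ never exceeds $|\beta _{m-1}|$.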

In what follows, we prove that all the tridiagonal elements of $T^{(n)}$ always converge for such general shifts. The following convergence proof may already be known to experts in this research field because it is almost the same as the proofs in [2, 6, 9] for the unshifted QR algorithm. However, to the best of the author’s knowledge, the proof for the shifted algorithm is not explicitly stated in any reference. For the readers’ convenience, we give it below.

Theorem 5

Suppose that the QR algorithm with a general shift satisfying the above conditions (i) and (ii) is applied to an irreducible tridiagonal matrix T. Then, $T^{(n)}$ converges to a block diagonal matrix whose block size is at most 2. Further, the lower right diagonal element $T^{(n)}(m,m)$ converges to $\lambda _{l}$ and the lower right off-diagonal element $T^{(n)}(m,m-1)$ converges to 0.

Proof

First of all, we show several important facts for the convergence analysis. Similarly to the discussion in [6] and [9, Chapter 8, §28], let $\tilde{Q}^{(n)},\ \tilde{R}^{(n)}$ be

$$\begin{aligned} \tilde{Q}^{(n)}= & {} Q^{(0)}\ldots Q^{(n-1)},\\ \tilde{R}^{(n)}= & {} R^{(n-1)}\ldots R^{(0)}. \end{aligned}$$

By the orthogonal matrix $\tilde{Q}^{(n)}$, $T^{(n)}$ is described as

$$\begin{aligned} T^{(n)} =\left( \tilde{Q}^{(n)}\right) ^\mathrm{T}T\tilde{Q}^{(n)} \end{aligned}$$

(38)

in view of $T^{(n+1)}=R^{(n)}Q^{(n)}+s^{(n)}I=(Q^{(n)})^\mathrm{T}(Q^{(n)}R^{(n)}+s^{(n)}I)Q^{(n)}=(Q^{(n)})^\mathrm{T}T^{(n)}Q^{(n)}$. Using (38), we see

$$\begin{aligned} \tilde{Q}^{(n)}\tilde{R}^{(n)}= & {} \tilde{Q}^{(n-1)}\left( T^{(n-1)}-s^{(n-1)}I\right) \tilde{R}^{(n-1)} \nonumber \\= & {} \left( T-s^{(n-1)}I\right) \tilde{Q}^{(n-1)}\tilde{R}^{(n-1)} \nonumber \\= & {} \left( T-s^{(n-1)}I\right) \ldots \left( T-s^{(0)}I\right) . \end{aligned}$$

(39)
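Relations (38) and (39) can be checked numerically. The following is a minimal sketch, assuming the explicit shifted QR step $T^{(n)}-s^{(n)}I=Q^{(n)}R^{(n)}$, $T^{(n+1)}=R^{(n)}Q^{(n)}+s^{(n)}I$ and the standard form of Wilkinson’s shift; the 4-by-4 test matrix and all variable names are hypothetical.

```python
import numpy as np

def wilkinson_shift(T):
    """Eigenvalue of the trailing 2-by-2 block of T closest to T[-1, -1]."""
    a_prev, a_last, b = T[-2, -2], T[-1, -1], T[-1, -2]
    d = 0.5 * (a_prev - a_last)
    if d == 0.0 and b == 0.0:
        return a_last
    sign = 1.0 if d >= 0.0 else -1.0
    return a_last - sign * b * b / (abs(d) + np.hypot(d, b))

# Hypothetical irreducible symmetric tridiagonal test matrix T = T^{(0)}.
alpha = [4.0, 3.0, 2.0, 1.0]
beta = [1.0, 1.0, 1.0]
T0 = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
I = np.eye(T0.shape[0])

T = T0.copy()
Q_acc = I.copy()   # Q~^{(n)} = Q^{(0)} ... Q^{(n-1)}
R_acc = I.copy()   # R~^{(n)} = R^{(n-1)} ... R^{(0)}
shifts = []
for n in range(3):
    s = wilkinson_shift(T)
    Q, R = np.linalg.qr(T - s * I)
    T = R @ Q + s * I
    Q_acc = Q_acc @ Q
    R_acc = R @ R_acc
    shifts.append(s)

# (38): T^{(n)} is orthogonally similar to T through Q~^{(n)}.
err38 = np.linalg.norm(T - Q_acc.T @ T0 @ Q_acc)

# (39): Q~^{(n)} R~^{(n)} equals the product of the shifted matrices.
P = I.copy()
for s in shifts:
    P = (T0 - s * I) @ P
err39 = np.linalg.norm(Q_acc @ R_acc - P) / np.linalg.norm(P)
```

Both residuals should be at the level of rounding error. The identities do not depend on the sign convention that `np.linalg.qr` uses for the diagonal of $R^{(n)}$.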

Let p(l) denote a permutation of the indices l $(l = 1, \ldots , m)$. Then in view of the condition (i) we can place the shifted eigenvalues in a descending order as

$$\begin{aligned} \left| \lambda _{p(1)}-s^{(\infty )}\right| \ge \cdots \ge \left| \lambda _{p(m-1)}-s^{(\infty )}\right| > \left| \lambda _{p(m)}-s^{(\infty )}\right| =0. \end{aligned}$$

(40)

The last inequality follows from (i) (note that all the eigenvalues are distinct since T is irreducible).

Next, we focus on the eigendecomposition

$$\begin{aligned} T=X\varLambda X^\mathrm{T}, \end{aligned}$$

(41)

where X is the orthogonal matrix consisting of the eigenvectors and $\varLambda $ is the diagonal matrix with the eigenvalues: $\mathrm{diag}(\lambda _{p(1)},\ldots ,\lambda _{p(m)})$. Then we see

$$\begin{aligned} \left( T-s^{(0)}I\right) \ldots \left( T-s^{(n-1)}I\right) =X\varLambda ^{(n)} X^\mathrm{T}, \end{aligned}$$

(42)

where

$$\begin{aligned} \varLambda ^{(n)} =\left( \varLambda -s^{(0)}I\right) \ldots \left( \varLambda -s^{(n-1)}I\right) . \end{aligned}$$

(43)

From (38) and (41), we have $T^{(n)}=(\tilde{Q}^{(n)})^\mathrm{T}X\varLambda X^\mathrm{T}\tilde{Q}^{(n)}$. If all the inequalities are strict in (40), $T^{(n)}$ converges to $\varLambda $. In general, $T^{(n)}=(\tilde{Q}^{(n)})^\mathrm{T}X\varLambda X^\mathrm{T}\tilde{Q}^{(n)}$ always converges to a block diagonal matrix whose block size is at most 2. In order to prove this, we first apply the LU factorization $X^\mathrm{T}=LU$. Note that $X^\mathrm{T}$, constructed from the normalized eigenvectors of an irreducible tridiagonal matrix, always admits an LU factorization [2, 5]. Hence, we see

$$\begin{aligned} \left( T-s^{(0)}I\right) \ldots \left( T-s^{(n-1)}I\right) = X\varLambda ^{(n)} L\left( \varLambda ^{(n)}\right) ^{-1} \varLambda ^{(n)} U. \end{aligned}$$

(44)

Combining it with (39) we have

$$\begin{aligned} \tilde{Q}^{(n)}\tilde{R}^{(n)}= X\varLambda ^{(n)} L(\varLambda ^{(n)})^{-1} \varLambda ^{(n)} U. \end{aligned}$$

(45)

To reveal the relationship between $\tilde{Q}^{(n)}$ and X, we discuss the behavior of the QR factorization of $\varLambda ^{(n)} L(\varLambda ^{(n)})^{-1}$ as follows. Let $D_{\varLambda ^{(n)}}$ be an orthogonal matrix

$$\begin{aligned} \mathrm{diag}\left( \prod _{l=0}^{n-1}\frac{|\lambda _{p(1)} -s^{(l)}|}{(\lambda _{p(1)}-s^{(l)})},\ldots ,\prod _{l=0}^{n-1}\frac{|\lambda _{p(m)} -s^{(l)}|}{(\lambda _{p(m)}-s^{(l)})}\right) . \end{aligned}$$

(46)

It is easy to see that

$$\begin{aligned} D_{\varLambda ^{(n)}} \varLambda ^{(n)}= \mathrm{diag}\left( \prod _{l=0}^{n-1}|\lambda _{p(1)}-s^{(l)}|, \ldots ,\prod _{l=0}^{n-1}|\lambda _{p(m)}-s^{(l)}|\right) . \end{aligned}$$

(47)

In the right-hand side of (45), we have

$$\begin{aligned} \varLambda ^{(n)} L\left( \varLambda ^{(n)}\right) ^{-1}= D_{\varLambda ^{(n)}}^{-1}\left( D_{\varLambda ^{(n)}}\varLambda ^{(n)} L\left( D_{\varLambda ^{(n)}}\varLambda ^{(n)}\right) ^{-1}\right) D_{\varLambda ^{(n)}}, \end{aligned}$$

(48)

and by applying the QR factorization we see

$$\begin{aligned} D_{\varLambda ^{(n)}}\varLambda ^{(n)} L\left( D_{\varLambda ^{(n)}}\varLambda ^{(n)}\right) ^{-1} =P^{(n)}\varGamma ^{(n)}, \end{aligned}$$

(49)

where $P^{(n)}$ is an orthogonal matrix and $\varGamma ^{(n)}$ is an upper triangular matrix whose diagonal elements are positive. Let $D_{U}$ be the orthogonal matrix $D_{U}=\mathrm{diag}(|u_{11}|/u_{11}, \ldots ,|u_{mm}|/u_{mm})$. Then we see

$$\begin{aligned} \tilde{Q}^{(n)}= & {} XD_{\varLambda ^{(n)}}^{-1}P^{(n)}D_{U}^{-1}, \end{aligned}$$

(50)

$$\begin{aligned} \tilde{R}^{(n)}= & {} D_{U}\varGamma ^{(n)} D_{\varLambda ^{(n)}} \varLambda ^{(n)} U \end{aligned}$$

(51)

from (45), (48), and (49). Therefore, we have

$$\begin{aligned} T^{(n)}=D_{U}\left( P^{(n)}\right) ^\mathrm{T}\varLambda P^{(n)}D_{U}^{-1} \end{aligned}$$

(52)

from (38) and (41).

Since our aim is to prove the convergence of all the elements of $T^{(n)}$, let us discuss the behavior of the orthogonal matrix $P^{(n)}$ as $n\rightarrow \infty $. To this end, we focus on (49). The lower left elements are

$$\begin{aligned} \left( D_{\varLambda ^{(n)}}\varLambda ^{(n)} L\left( D_{\varLambda ^{(n)}}\varLambda ^{(n)}\right) ^{-1}\right) _{ij}=l_{ij} \prod _{l=0}^{n-1}\left| \frac{\lambda _{p(j)} -s^{(l)}}{\lambda _{p(i)}-s^{(l)}}\right| \quad (i>j) \end{aligned}$$

(53)

from (47). Obviously $\lim _{n\rightarrow \infty }(D_{\varLambda ^{(n)}}\varLambda ^{(n)} L(D_{\varLambda ^{(n)}}\varLambda ^{(n)})^{-1})_{ij}=0$, when $|\lambda _{p(i)}-s^{(\infty )}| > |\lambda _{p(j)}-s^{(\infty )}|$. Otherwise, from (40) and the condition (ii), we have

$$\begin{aligned} \left| \frac{\lambda _{p(j)}-s^{(l)}}{\lambda _{p(i)}-s^{(l)}}\right| =\left| \frac{\lambda _{p(j)}-\lambda _{p(m)} +\lambda _{p(m)}-s^{(l)}}{\lambda _{p(i)} -\lambda _{p(m)}+\lambda _{p(m)}-s^{(l)}}\right| =1+\mathrm{o}(c^{l}). \end{aligned}$$

(54)

Since a sequence of the size $\mathrm{o}(c^{l})$ with $0<c<1$ absolutely converges, $(D_{\varLambda ^{(n)}}\varLambda ^{(n)} L(D_{\varLambda ^{(n)}}\varLambda ^{(n)})^{-1})_{ij}$ represented by the infinite product (53) is convergent:

$$\begin{aligned} \lim _{n\rightarrow \infty }D_{\varLambda ^{(n)}}\varLambda ^{(n)} L\left( D_{\varLambda ^{(n)}}\varLambda ^{(n)}\right) ^{-1}=\tilde{L}. \end{aligned}$$

(55)
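The convergence of the infinite product rests on a standard argument, sketched here for completeness. Write each factor in (53) as $1+a_{l}$ with $a_{l}=\mathrm{o}(c^{l})$, where the symbol $a_{l}$ is introduced only for this remark. Then

$$\begin{aligned} \sum _{l=0}^{\infty }|a_{l}|<\infty \quad \text{and}\quad \left| \log (1+a_{l})\right| \le 2|a_{l}| \quad \text{whenever}\ |a_{l}|\le 1/2, \end{aligned}$$

so $\sum _{l}\log (1+a_{l})$ converges absolutely, and the partial products $\prod _{l=0}^{n-1}(1+a_{l})$ tend to a finite positive limit as $n\rightarrow \infty $.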

The resulting matrix $\tilde{L}$ is not only unit lower triangular, but also block diagonal with the block sizes at most 2, because if the equality $|\lambda _{p(k)}-s^{(\infty )}|=|\lambda _{p(k+1)}-s^{(\infty )}|$ in (40) holds, then both inequalities $|\lambda _{p(k-1)}-s^{(\infty )}|>|\lambda _{p(k)}-s^{(\infty )}|$ and $|\lambda _{p(k+1)}-s^{(\infty )}|>|\lambda _{p(k+2)}-s^{(\infty )}|$ are satisfied thanks to the fact that the eigenvalues are all distinct. Hence, the orthogonal matrix $P^{(n)}$ given by the QR factorization of $D_{\varLambda ^{(n)}}\varLambda ^{(n)} L(D_{\varLambda ^{(n)}}\varLambda ^{(n)})^{-1}$ is convergent:

$$\begin{aligned} \lim _{n\rightarrow \infty }{P}^{(n)}=\tilde{P}, \end{aligned}$$

(56)

where $\tilde{P}$ is a block diagonal matrix whose block size is at most 2. It then follows that

$$\begin{aligned} \lim _{n\rightarrow \infty }T^{(n)} =D_{U}{\tilde{P}}^\mathrm{T}\varLambda \tilde{P}D_{U}^{-1} \end{aligned}$$

(57)

from (52). Therefore, $T^{(n)}$ converges to a block diagonal matrix whose block size is at most 2. $\square $
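The statement of Theorem 5 for the lower right corner can be observed numerically. The sketch below is illustrative only: it assumes the explicit shifted QR step and the standard form of Wilkinson’s shift, runs without deflation, and uses a hypothetical 4-by-4 matrix; `shifted_qr` and `history` are names chosen for this example.

```python
import numpy as np

def wilkinson_shift(T):
    """Eigenvalue of the trailing 2-by-2 block of T closest to T[-1, -1]."""
    a_prev, a_last, b = T[-2, -2], T[-1, -1], T[-1, -2]
    d = 0.5 * (a_prev - a_last)
    if d == 0.0 and b == 0.0:
        return a_last
    sign = 1.0 if d >= 0.0 else -1.0
    return a_last - sign * b * b / (abs(d) + np.hypot(d, b))

def shifted_qr(T0, steps):
    """Run `steps` explicit QR steps with Wilkinson's shift (no deflation).

    Returns the final iterate and the history of |T^{(n)}(m, m-1)|.
    """
    T = T0.copy()
    I = np.eye(T.shape[0])
    history = []
    for _ in range(steps):
        s = wilkinson_shift(T)
        Q, R = np.linalg.qr(T - s * I)
        T = R @ Q + s * I
        history.append(abs(T[-1, -2]))
    return T, history

# Hypothetical irreducible symmetric tridiagonal test matrix.
alpha = [4.0, 3.0, 2.0, 1.0]
beta = [1.0, 1.0, 1.0]
T0 = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)

T_final, history = shifted_qr(T0, steps=10)
eigenvalues = np.linalg.eigvalsh(T0)
corner_error = np.min(np.abs(eigenvalues - T_final[-1, -1]))
```

Printing `history` shows how quickly the lower right off-diagonal element decays. In double precision the decay stalls near machine epsilon, so the cubic rate is visible only in the first few steps.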

Appendix B: Proofs of Lemmas 2 and 3

We prove here Lemmas 2 and 3 in turn.

By the discussion in Appendix A, $\lambda _{p(m-1)},\ \lambda _{p(m-2)}$ are the eigenvalues of the $2\times 2$ submatrix

$$\begin{aligned} \left( \begin{array}{c@{\quad }c} \alpha _{m-2}^{(\infty )} &{} \beta _{m-2}^{(\infty )} \\ \beta _{m-2}^{(\infty )} &{} \alpha _{m-1}^{(\infty )} \\ \end{array} \right) . \end{aligned}$$

(58)

In view of $\beta _{m-2}^{(\infty )}\not = 0$, we see $|\lambda _{p(m-2)}-\lambda _{p(m)}|=|\lambda _{p(m-1)}-\lambda _{p(m)}|$. The eigenvalues are real and distinct, which implies $(\lambda _{p(m-2)}-\lambda _{p(m)})+(\lambda _{p(m-1)}-\lambda _{p(m)})=0$. Since the sum of the eigenvalues of the matrix (58) equals its trace, we have $(\alpha _{m-2}^{(\infty )}-s^{(\infty )}) +(\alpha _{m-1}^{(\infty )}-s^{(\infty )})=0$. In other words,

$$\begin{aligned} \lim _{n\rightarrow \infty }\left( \begin{array}{c@{\quad }c} \alpha _{m-2}^{(n)}-s^{(n)} &{} \beta _{m-2}^{(n)} \\ \beta _{m-2}^{(n)} &{} \alpha _{m-1}^{(n)}-s^{(n)} \\ \end{array} \right) = \left( \begin{array}{c@{\quad }c} -D &{} C \\ C &{} D \\ \end{array} \right) \end{aligned}$$

(59)

holds for constants C and D. Obviously, the eigenvalues of the matrix (59) are $\pm \sqrt{C^2+D^2}$. It then follows that

$$\begin{aligned} \sqrt{C^2+D^2}=\left| \lambda _{p(m-2)}-\lambda _{p(m)}\right| =\left| \lambda _{p(m-1)}-\lambda _{p(m)}\right| . \end{aligned}$$

(60)

This completes the proof of Lemma 2.
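For completeness, the eigenvalues quoted for the limit matrix in (59) follow directly from its characteristic polynomial, where $\mu $ denotes the eigenvalue variable (a symbol introduced only for this check):

$$\begin{aligned} \det \left( \begin{array}{c@{\quad }c} -D-\mu &{} C \\ C &{} D-\mu \\ \end{array} \right) =(-D-\mu )(D-\mu )-C^2=\mu ^2-\left( C^2+D^2\right) . \end{aligned}$$

The roots are $\mu =\pm \sqrt{C^2+D^2}$, and their sum is zero, in agreement with the vanishing trace of the matrix.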

Next, we prove Lemma 3 based on Lemma 1. Actually, in Lemma 1,

$$\begin{aligned} \lim _{n\rightarrow \infty }\left| \gamma ^{(n)}\right| =\prod _{1\le i \le m-2}\left| \lambda _{p(i)}-\lambda _{p(m)}\right| >0 \end{aligned}$$

(61)

holds. We prove (61) below. If $\lim _{n\rightarrow \infty }\beta _{m-2}^{(n)}= 0$, then $\lim _{n\rightarrow \infty }\alpha _{m-1}^{(n)}= \lambda _{p(m-1)}$ holds. Therefore, we obtain

$$\begin{aligned} \lim _{n\rightarrow \infty }\left| \gamma ^{(n)}\right| =\lim _{n\rightarrow \infty }\left| d_{m-2}^{(n)}\right| =\lim _{n\rightarrow \infty }\prod _{1\le i \le m-2}\left| \lambda _{p(i)}-\lambda _{p(m)}\right| \end{aligned}$$

from (17) and (19). Next, we consider the situation $\lim _{n\rightarrow \infty }\beta _{m-2}^{(n)} \not =0$. It is easy to see that, if $\lim _{n\rightarrow \infty }\beta _{m-2}^{(n)} \not =0$, then $\lim _{n\rightarrow \infty }\beta _{m-3}^{(n)}= 0$ because the block size of $T^{(\infty )}$ is at most 2. Combining it with (17), we have

$$\begin{aligned} \lim _{n\rightarrow \infty }d_{m-3}^{(n)}=\prod _{1\le i\le m-3}\left( \lambda _{p(i)}-\lambda _{p(m)}\right) \not =0. \end{aligned}$$

(62)

In addition, we see

$$\begin{aligned} \lim _{n\rightarrow \infty }(\gamma ^{(n)})^2=\lim _{n\rightarrow \infty }\left[ \left( d_{m-2}^{(n)}\right) ^2 +\left( \beta _{m-2}^{(n)}d_{m-3}^{(n)}\right) ^2\right] \end{aligned}$$

(63)

in (19). Noting $\beta _{m-3}^{(\infty )}=0$ and (17), we see that

$$\begin{aligned} \lim _{n\rightarrow \infty }|\gamma ^{(n)}|= & {} \lim _{n\rightarrow \infty }\sqrt{\left( \alpha _{m-2}^{(n)}-s^{(n)}\right) ^{2} +\left( \beta _{m-2}^{(n)}\right) ^{2}}|d_{m-3}^{(n)}|\nonumber \\= & {} \sqrt{\left( \alpha _{m-2}^{(\infty )}-\lambda _{p(m)}\right) ^2 +\left( \beta _{m-2}^{(\infty )}\right) ^2}\prod _{1\le i \le m-3}|\lambda _{p(i)}-\lambda _{p(m)}| \end{aligned}$$

(64)

holds. Therefore, we obtain (61) from (59) and (60).

Obviously,

$$\begin{aligned} \lim _{n\rightarrow \infty }d_{m-1}^{(n)}=\lim _{n\rightarrow \infty }\prod _{1\le i \le m-1}\left( \lambda _{p(i)}-s^{(n)}\right) \not =0 \end{aligned}$$

(65)

holds. Moreover, if $D\not =0$, then

$$\begin{aligned} \left| d_{m}^{(n)}\right| \sim \frac{\left| \beta _{m-2}^{(n)}\right| ^2\left| \beta _{m-1}^{(n)}\right| ^2 \left| d_{m-3}^{(n)}\right| }{|D|} \end{aligned}$$

(66)

in the same way as (22).

From (18), (61), (65), and (66), we have

$$\begin{aligned} \lim _{n\rightarrow \infty }\frac{\left| \beta _{m-1}^{(n+1)}\right| }{\left| \beta _{m-2}^{(n)}\right| ^2\left| \beta _{m-1}^{(n)}\right| ^3} =\frac{\left| d_{m-3}^{(\infty )}\right| }{|D|\left| \lambda _{p(m-1)} -\lambda _{p(m)}\right| ^2\prod _{1\le i \le m-2}\left| \lambda _{p(i)}-\lambda _{p(m)}\right| }. \end{aligned}$$

(67)

Noting $|\rho _{1}|$ is defined as (25), we have

$$\begin{aligned} \lim _{n\rightarrow \infty }\frac{\left| \beta _{m-1}^{(n+1)}\right| }{\left| \beta _{m-2}^{(n)}\right| ^2\left| \beta _{m-1}^{(n)}\right| ^3} =\frac{\left| d_{m-3}^{(\infty )}\right| }{\left| \rho _{1}\right| \left| \lambda _{p(m-1)} -\lambda _{p(m)}\right| ^3\prod _{1\le i \le m-2}\left| \lambda _{p(i)}-\lambda _{p(m)}\right| } \end{aligned}$$

(68)

from (67).

We investigate the behavior of $|d_{m-3}^{(n)}|$. First of all, we assume $\beta _{m-3}^{(\infty )}\not =0$. Then $\beta _{m-4}^{(\infty )}=0$ and $\beta _{m-2}^{(\infty )}=0$. Similarly to (59) and (60),

$$\begin{aligned} \left| \alpha _{m-3}^{(\infty )}-\lambda _{p(m)}\right| ^2 +\left| \beta _{m-3}^{(\infty )}\right| ^2=\left| \lambda _{p(m-3)} -\lambda _{p(m)}\right| ^2=\left| \lambda _{p(m-2)}-\lambda _{p(m)}\right| ^2 \end{aligned}$$

(69)

holds. Noting $\beta _{m-4}^{(\infty )}=0$, we have

$$\begin{aligned} \left| d_{m-3}^{(\infty )}\right|= & {} \left| \alpha _{m-3}^{(\infty )}-\lambda _{p(m)}\right| \prod _{1\le i \le m-4}\left| \lambda _{p(i)}-\lambda _{p(m)}\right| \\= & {} \sqrt{\left| \lambda _{p(m-3)}-\lambda _{p(m)}\right| ^2 -\left| \beta _{m-3}^{(\infty )}\right| ^2}\prod _{1\le i \le m-4}\left| \lambda _{p(i)}-\lambda _{p(m)}\right| \end{aligned}$$

in the same way as (59) and (60). Moreover, noting $\left| \lambda _{p(m-3)}-\lambda _{p(m)}\right| =\left| \lambda _{p(m-2)}-\lambda _{p(m)}\right| $, we obtain

$$\begin{aligned} \left| d_{m-3}^{(\infty )}\right| = \sqrt{1-\left| \beta _{m-3}^{(\infty )}\right| ^2/\left| \lambda _{p(m-2)} -\lambda _{p(m)}\right| ^2}\prod _{1\le i \le m-3}\left| \lambda _{p(i)}-\lambda _{p(m)}\right| . \end{aligned}$$

(70)

Actually, it is easy to see that (70) covers the case $\beta _{m-3}^{(\infty )}=0$. Therefore, we obtain

$$\begin{aligned} \lim _{n\rightarrow \infty }\frac{\left| \beta _{m-1}^{(n+1)}\right| }{\left| \beta _{m-2}^{(n)}\right| ^2\left| \beta _{m-1}^{(n)}\right| ^3}= & {} \frac{\sqrt{1-\left| \beta _{m-3}^{(\infty )}\right| ^2/\left| \lambda _{p(m-2)} -\lambda _{p(m)}\right| ^2}}{\left| \rho _{1}\right| \left| \lambda _{p(m-1)} -\lambda _{p(m)}\right| ^3\left| \lambda _{p(m-2)}-\lambda _{p(m)}\right| }\nonumber \\= & {} \frac{\left| \rho _{2}\right| }{\left| \rho _{1}\right| \left| \lambda _{p(m-1)} -\lambda _{p(m)}\right| ^3\left| \lambda _{p(m-2)}-\lambda _{p(m)}\right| }, \end{aligned}$$

(71)

where the first equality is due to (68) and (70), the second equality is due to the definition of $|\rho _{2}|$ in (26). This completes the proof of (24).

The final task is to derive (28) in the case $\alpha _{m-1}^{(\infty )}=\alpha _{m}^{(\infty )}$. In (21), $|\beta _{m-1}^{(n)}|/|\alpha _{m-1}^{(n)}-s^{(n)}|\le 1$ in view of the definition of Wilkinson’s shift. Hence,

$$\begin{aligned} \limsup _{n\rightarrow \infty }\frac{\left| d_{m}^{(n)}\right| }{\left| \beta _{m-1}^{(n)}\right| }\le \left| \beta _{m-2}^{(\infty )}\right| ^2\left| d_{m-3}^{(\infty )}\right| \end{aligned}$$

from (21). Also noting $\left| \beta _{m-2}^{(\infty )}\right| =\left| \lambda _{p(m-1)} -\lambda _{p(m)}\right| =\left| \lambda _{p(m-2)}-\lambda _{p(m)}\right| $, $\beta _{m-3}^{(\infty )}=0$, we have

$$\begin{aligned} \limsup _{n\rightarrow \infty }\frac{\left| d_{m}^{(n)}\right| }{\left| \beta _{m-1}^{(n)}\right| }\le \prod _{1\le i \le m-1}\left| \lambda _{p(i)}-\lambda _{p(m)}\right| . \end{aligned}$$

Therefore, we obtain (28) from (18), (61), and (65).

Appendix C: Convergence analysis based on perturbation theory

In this section, using perturbation theory we prove Corollary 1. In other words, we prove the following facts:

if $\lim _{n\rightarrow \infty }(\alpha _{m-1}^{(n)}-\alpha _{m}^{(n)})=0$, then $|\beta _{m-1}^{(n+1)}|=\mathrm{O}(|\beta _{m-1}^{(n)}|^2)$;
if $\lim _{n\rightarrow \infty }(\alpha _{m-1}^{(n)}-\alpha _{m}^{(n)})=D\not =0$, then $|\beta _{m-1}^{(n+1)}|=\mathrm{O}(|\beta _{m-2}^{(n)}|^2|\beta _{m-1}^{(n)}|^3)$.

In what follows, we use the so-called gap theorem [7, Theorem 11.7.1].

Lemma 4

([7]) Let y be a unit vector, A be a symmetric matrix, and $\lambda _{l}$ be the eigenvalue of A closest to $y^\mathrm{T}Ay$. Then

$$\begin{aligned} \left| y^\mathrm{T}Ay-\lambda _{l}\right| \le \frac{\left\| Ay-\left( y^\mathrm{T}Ay\right) y\right\| ^2}{\min _{i\not =l}\left| \lambda _{i}-y^\mathrm{T}Ay\right| } \end{aligned}$$

holds.
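Lemma 4 can be tried out on a small example. The sketch below evaluates both sides of the inequality for $y=(0,\ldots ,0,1)^\mathrm{T}$ and a tridiagonal matrix, in which case $y^\mathrm{T}Ay$ is the last diagonal element and the residual norm is the last off-diagonal element. The helper name `gap_theorem_sides` and the test matrix are hypothetical.

```python
import numpy as np

def gap_theorem_sides(A, y):
    """Return (lhs, rhs) of the gap-theorem inequality for a unit vector y."""
    rho = y @ A @ y                           # Rayleigh quotient y^T A y
    residual = np.linalg.norm(A @ y - rho * y)
    eigs = np.linalg.eigvalsh(A)
    l = np.argmin(np.abs(eigs - rho))         # eigenvalue closest to rho
    gap = np.min(np.abs(np.delete(eigs, l) - rho))
    return abs(rho - eigs[l]), residual**2 / gap

# Hypothetical tridiagonal matrix with a small last off-diagonal entry.
alpha = [4.0, 3.0, 2.0, 1.0]
beta = [1.0, 1.0, 0.01]
T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)

y = np.zeros(4)
y[-1] = 1.0                                   # y = e_m
lhs, rhs = gap_theorem_sides(T, y)
residual_norm = np.linalg.norm(T @ y - (y @ T @ y) * y)  # equals |beta_{m-1}|
```

Because the residual enters squared, an off-diagonal element of size $10^{-2}$ already pins the last diagonal element to an eigenvalue within roughly $10^{-4}$.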

First of all, we note

$$\begin{aligned} \left| \beta _{m-1}^{(n+1)}\right| \sim \frac{\left| \lambda _{p(m)}-s^{(n)}\right| }{\left| \lambda _{p(m-1)} -s^{(n)}\right| }\left| \beta _{m-1}^{(n)}\right| \end{aligned}$$

(72)

as $n\rightarrow \infty $ for the general shifts satisfying $\beta _{m-1}^{(\infty )}=0$ and the condition (i) in Appendix A. Although this fact may be known to experts, the author is not aware of any reference where its proof is written explicitly. Hence, we prove (72) below. Recall that $T_{k}^{(n)}$ for $k=1,\ldots ,m$ is the $k \times k$ leading principal submatrix of $T^{(n)}$ and that $d_{k}^{(n)}=\det (T_{k}^{(n)}-s^{(n)}I)$ for $k=1,\ldots ,m$, as defined in (17). Then, (65) holds for the general shifts in view of the condition (i) and $\lim _{n\rightarrow \infty }\beta _{m-1}^{(n)}= 0$. Obviously, $d_{m}^{(n)}=\prod _{1\le i \le m}(\lambda _{p(i)}-s^{(n)})$. Furthermore, noting that $\gamma ^{(n)}$ in (19) is bounded and $\lim _{n\rightarrow \infty }\beta _{m-1}^{(n)}= 0$, in (18) we have

$$\begin{aligned} \left| \beta _{m-1}^{(n+1)}\right| \sim \frac{\left| (\lambda _{p(m)}-s^{(n)})\gamma ^{(n)}\right| }{\prod _{1\le i \le m-1}\left| \lambda _{p(i)}-s^{(n)}\right| }\left| \beta _{m-1}^{(n)}\right| \end{aligned}$$

(73)

as $n\rightarrow \infty $. From (61), we have (72).

For the discussion below, we describe the relation (72) more precisely. For any $\epsilon _{1} >0$,

$$\begin{aligned} \left| \beta _{m-1}^{(n+1)}\right| \le \frac{\left| \lambda _{p(m)}-s^{(n)}\right| }{\left| \lambda _{p(m-1)} -s^{(n)}\right| }\left| \beta _{m-1}^{(n)}\right| \left( 1+\epsilon _{1}\right) \end{aligned}$$

(74)

holds for all sufficiently large n.

In order to reveal the convergence rate, let us estimate $|\lambda _{p(m)}-s^{(n)}|$. To this end, suppose we apply one step of the Jacobi method to the lower right 2-by-2 submatrix of $T^{(n)}$. Then we see from [7, Chapter 9] that the angle $\theta ^{(n)}$ of the Givens rotation for annihilating $\beta _{m-1}^{(n)}$ satisfies

$$\begin{aligned} \tan \left( 2\theta ^{(n)}\right) =\frac{2\beta _{m-1}^{(n)}}{\alpha _{m-1}^{(n)} -\alpha _{m}^{(n)}}, \end{aligned}$$

(75)

where $\theta ^{(n)}$ is chosen in the interval $[-\pi /4,\ \pi /4]$. It means that the transformed matrix can be described as

$$\begin{aligned} \left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c@{\quad }c} \ddots &{} \ddots &{} &{} &{} \\ \ddots &{} * &{} * &{} &{} \\ &{} * &{} * &{} * &{} z^{(n)} \\ &{} &{} * &{} * &{} 0 \\ &{} &{} z^{(n)} &{} 0 &{} s^{(n)} \\ \end{array} \right) , \end{aligned}$$

where

$$\begin{aligned} z^{(n)}=\beta _{m-2}^{(n)}\sin (\theta ^{(n)}). \end{aligned}$$

(76)
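The structure of the transformed matrix can be reproduced with a short sketch. It assumes one particular sign convention for the plane rotation, so only $|z^{(n)}|$ is compared with (76); the test matrix and variable names are hypothetical, and $\alpha _{m-1}\ne \alpha _{m}$ is assumed so that (75) is well defined.

```python
import numpy as np

# Hypothetical symmetric tridiagonal matrix with alpha_{m-1} != alpha_m.
alpha = np.array([4.0, 3.0, 2.0, 1.0])
beta = np.array([1.0, 1.0, 0.5])
T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
m = T.shape[0]

# Angle from (75); np.arctan keeps theta inside (-pi/4, pi/4).
theta = 0.5 * np.arctan(2.0 * beta[-1] / (alpha[-2] - alpha[-1]))
c, s = np.cos(theta), np.sin(theta)

# Plane rotation acting on the last two coordinates (assumed convention).
G = np.eye(m)
G[-2:, -2:] = [[c, -s], [s, c]]
B = G.T @ T @ G

annihilated = B[-1, -2]   # position of beta_{m-1} after the rotation
z = B[-3, -1]             # fill-in coupling rows m-2 and m
corner = B[-1, -1]        # new (m, m) element

# Wilkinson's shift computed independently as the eigenvalue of the
# trailing 2-by-2 block closest to alpha_m.
w = np.linalg.eigvalsh(T[-2:, -2:])
shift = w[np.argmin(np.abs(w - alpha[-1]))]
```

With this convention the fill-in equals $-\beta _{m-2}\sin (\theta )$, so it agrees with (76) up to sign. The sketch also lets one confirm that $|\sin (\theta )|\le |\beta _{m-1}|/|\alpha _{m-1}-\alpha _{m}|$ for this choice of angle.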

Note that $s^{(\infty )}=\lambda _{p(m)}$. Let

$$\begin{aligned} \delta =\min _{i\not =m}\left| \lambda _{p(i)}-s^{(\infty )}\right| >0. \end{aligned}$$

(77)

For any $\epsilon _{2} >0$, noting Lemma 4 with $y:=(0,0,\ldots ,1)^\mathrm{T}$, we have

$$\begin{aligned} \left| \lambda _{p(m)}-s^{(n)}\right| \le \frac{|z^{(n)}|^{2}}{\delta -\epsilon _{2}} \end{aligned}$$

(78)

for all sufficiently large n. In addition, it is easy to see that, for any $\epsilon _{3} > 0$,

$$\begin{aligned} \left| \lambda _{p(m-1)}-s^{(n)}\right| \ge \delta - \epsilon _{3} \end{aligned}$$

(79)

for all sufficiently large n.

Now we consider the case of $|\alpha _{m-1}^{(n)} - \alpha _m^{(n)}| \rightarrow D \ne 0$. We obtain

$$\begin{aligned} \left| \beta _{m-1}^{(n+1)}\right|\le & {} \frac{\left| \beta _{m-1}^{(n)}\right| \left| z^{(n)}\right| ^{2}\left| 1+\epsilon _{1}\right| }{\left| \delta -\epsilon _{2}\right| \left| \delta -\epsilon _{3}\right| } \nonumber \\= & {} \frac{\left| \beta _{m-1}^{(n)}\right| \left| \beta _{m-2}^{(n)} \sin (\theta ^{(n)})\right| ^{2}\left| 1+\epsilon _{1}\right| }{\left| \delta -\epsilon _{2}\right| \left| \delta -\epsilon _{3}\right| } \nonumber \\\le & {} \frac{\left| \beta _{m-1}^{(n)}\right| ^{3}\left| \beta _{m-2}^{(n)}\right| ^{2}\left| 1 +\epsilon _{1}\right| }{\left| \alpha _{m-1}^{(n)}-\alpha _{m}^{(n)}\right| ^{2}\left| \delta -\epsilon _{2}\right| \left| \delta -\epsilon _{3}\right| } \end{aligned}$$

(80)

for all sufficiently large n by using (74), (79), (78), (76), (75) in turn, where the last step uses $|\sin (\theta ^{(n)})|\le |\tan (2\theta ^{(n)})|/2$ for $\theta ^{(n)}\in [-\pi /4,\ \pi /4]$. Since $\epsilon _{1}, \epsilon _{2}, \epsilon _{3}$ can be taken arbitrarily small and $|\alpha _{m-1}^{(n)}-\alpha _{m}^{(n)}| \rightarrow |D| > 0$ as $n\rightarrow \infty $, we obtain $|\beta _{m-1}^{(n+1)}|=\mathrm{O}(|\beta _{m-2}^{(n)}|^{2}|\beta _{m-1}^{(n)}|^{3})$.

Finally, we prove the quadratic convergence in the case $|\alpha _{m-1}^{(n)}-\alpha _{m}^{(n)}|\rightarrow 0$. Since the estimate (80) based on the Jacobi transformation does not yield quadratic convergence in this case, we give another estimate based on Lemma 4. For any $\epsilon _{4} >0$, we see

$$\begin{aligned} \left| \lambda _{p(m)}-\alpha _{m}^{(n)}\right| \le \frac{\left| \beta _{m-1}^{(n)}\right| ^{2}}{\delta -\epsilon _{4}} \end{aligned}$$

(81)

for all sufficiently large n from Lemma 4 with $y:=(0,0,\ldots ,1)^\mathrm{T}$, $A:=T^{(n)}$. From the definition of Wilkinson’s shift, $|\alpha _{m}^{(n)}-s^{(n)}|\le |\beta _{m-1}^{(n)}|$ holds. Hence,

$$\begin{aligned} \left| \lambda _{p(m)}-s^{(n)}\right| \le \left| \lambda _{p(m)}-\alpha _{m}^{(n)}\right| +\left| \alpha _{m}^{(n)}-s^{(n)}\right| \le \frac{\left| \beta _{m-1}^{(n)}\right| ^{2}}{\delta -\epsilon _{4}}+\left| \beta _{m-1}^{(n)}\right| \end{aligned}$$

(82)

for all sufficiently large n. Thus, we have

$$\begin{aligned} \left| \beta _{m-1}^{(n+1)}\right| \le \frac{\left| \beta _{m-1}^{(n)}\right| ^{2}\left( \delta -\epsilon _{4}+\left| \beta _{m-1}^{(n)}\right| \right) \left| 1+\epsilon _{1}\right| }{\left| \delta -\epsilon _{3}\right| \left| \delta -\epsilon _{4}\right| } \end{aligned}$$

for all sufficiently large n from (74), (79), (82). We see $\epsilon _{3}, \epsilon _{4} \rightarrow 0$ and $\beta _{m-1}^{(n)} \rightarrow 0$ as $n\rightarrow \infty $. Therefore, $|\beta _{m-1}^{(n+1)}|=\mathrm{O}(|\beta _{m-1}^{(n)}|^{2})$. This completes the proof.

Although the convergence analysis above is readily accessible to readers in numerical linear algebra, we also note that the right-hand side of (80) in our proof is an overestimate in the case of $|\lambda _{p(m-1)}-\lambda _{p(m)}|< |\lambda _{p(m-2)}-\lambda _{p(m)}|$ because we have

$$\begin{aligned} \lim _{n\rightarrow \infty }\frac{1}{\left| \alpha _{m-1}^{(n)}-\alpha _{m}^{(n)}\right| ^{2}\left| \delta \right| ^2}= & {} \frac{1}{|D|^2\left| \lambda _{p(m-1)}-\lambda _{p(m)}\right| ^2}\nonumber \\> & {} \frac{1}{|D|\left| \lambda _{p(m-2)} -\lambda _{p(m)}\right| \left| \lambda _{p(m-1)}-\lambda _{p(m)}\right| ^2}\nonumber \\= & {} \frac{1}{\left| \lambda _{p(m-1)} -\lambda _{p(m)}\right| ^3\left| \lambda _{p(m-2)}-\lambda _{p(m)}\right| \left| \rho _{1}\right| }\nonumber \\\ge & {} \frac{\left| \rho _{2}\right| }{\left| \lambda _{p(m-1)} -\lambda _{p(m)}\right| ^3\left| \lambda _{p(m-2)}-\lambda _{p(m)}\right| \left| \rho _{1}\right| }\nonumber \\= & {} \lim _{n\rightarrow \infty }\frac{\left| \beta _{m-1}^{(n+1)}\right| }{\left| \beta _{m-2}^{(n)}\right| ^2\left| \beta _{m-1}^{(n)}\right| ^3}, \end{aligned}$$

(83)

where the first equality is due to the assumption $\alpha _{m-1}^{(\infty )}-\alpha _{m}^{(\infty )}=D$ and the definition of $\delta $ in (77), the next inequality is due to (27) and the above condition $|\lambda _{p(m-1)}-\lambda _{p(m)}|< |\lambda _{p(m-2)}-\lambda _{p(m)}|$, the next equality is due to (25), the next inequality is due to (26), and the last equality is due to (24). Also note that, in the case of $\beta _{m-2}^{(\infty )}=C\not =0$, the above relation (83) holds because we have the strict inequality in (83) from (60). Therefore, the right-hand side of (80) is an overestimate.

About this article

Cite this article

Aishima, K. A note on the convergence theorem of the tridiagonal QR algorithm with Wilkinson’s shift. Japan J. Indust. Appl. Math. 32, 465–487 (2015). https://doi.org/10.1007/s13160-015-0171-y

Received: 02 December 2014
Revised: 23 March 2015
Published: 12 June 2015
Issue Date: July 2015
DOI: https://doi.org/10.1007/s13160-015-0171-y
