1 Introduction

In this paper we study a conjecture of Forsythe from 1968 [7] about the “asymptotic directions of the s-dimensional optimum gradient method” or, in modern terms, the asymptotic behavior of the conjugate gradient (CG) method [10] when it is restarted every s steps. Based on what he could prove in the simplest case of restart length \(s=1\), and what he observed numerically for \(s=2\), Forsythe conjectured that for every fixed s the normalized error vectors (or normalized residual vectors) of the restarted method eventually cycle back and forth between two limiting directions. Stiefel had called these limiting directions a “cage” already in 1952 [18]. Such oscillatory or cyclic behavior is unwanted, since it asymptotically leads to slow (at most linear) convergence of the method.

In light of the importance and wide-spread use of the CG method, surprisingly few results about the Forsythe conjecture have been published beyond the case \(s=1\), and the conjecture remains largely open after more than 50 years. In addition to Forsythe in [7], proofs of the oscillatory behavior for \(s=1\) were published by Akaike [2] and Gonzaga and Schneider [8], as well as Afanasjew, Eiermann, Ernst, and Güttel [1] (for a closely related algorithm). The nature of the oscillations for \(s=1\) was studied by Nocedal, Sartenaer, and Zhu [13]. Pronzato, Wynn, and Zhigljavsky [16] analyzed the rate of convergence of the s-dimensional optimum gradient method, re-derived several results of Forsythe in a different formulation, and numerically studied the limiting behavior particularly in the case \(s=2\). This case was also analyzed by Zhuk and Bondarenko [26], and we will comment on their work after the proof of our Theorem 5 below. The papers [8, 13, 16] were all written in the context of gradient descent methods for minimizing quadratic functionals. The ongoing interest in the limiting behavior of such methods is apparent from a recent survey paper by Zou and Magoulès [27], which contains a summary of results of Akaike [2] as well as later developments in its Sect. 4.

The results of Forsythe on the restarted CG method, the setting of his original conjecture, and many results on the conjecture that were derived for gradient descent methods, are valid only for symmetric positive definite matrices. Forsythe already suspected that his results could hold also for a wider class of restarted gradient-type methods when applied to sufficiently smooth functions; see [7, p. 58]. And indeed, an oscillatory limiting behavior has sometimes been observed also in other restarted Krylov subspace methods such as GMRES [17], particularly when applied to symmetric or normal matrices; see, e.g., [4, 20].

In this paper we consider the Forsythe conjecture in a generalized setting that is valid for general (possibly nonsymmetric) matrices A. To this end, we introduce the Arnoldi cross iteration, which performs s iterations of the Arnoldi process with A and then s iterations with its transpose \(A^T\). Such cross iterations have previously been used in eigenvalue methods for nonsymmetric matrices, for example in Ostrowski’s two-sided iteration [14] and Parlett’s alternating Rayleigh quotient iteration [15]. The latter appear to be related to our Arnoldi cross iteration for \(s=1\), but the precise nature of these relations is yet to be explored. In addition, cross iterations of Krylov subspace methods have been used for analyzing the worst-case behavior of the underlying projection process. For example, motivated by work of Zavorin [23] as well as Zavorin, O’Leary, and Elman [24] (who studied complete stagnation of the GMRES method [17] for A and \(A^T\)), we have previously introduced a GMRES cross iteration [6] in order to analyze worst-case GMRES. In this paper we do not focus on the worst-case behavior of any method, but we will nevertheless show that the Arnoldi cross iteration with \(s=1\) solves the ideal Arnoldi problem for real orthogonal matrices; see (12) and the end of Sect. 4.3.

For symmetric matrices the iterative steps with A and \(A^T\) in the Arnoldi cross iteration coincide, and we recover essentially the same algorithm as originally studied by Forsythe, only with a different (namely the Arnoldi or the Lanczos) projection process. We conjecture that the Arnoldi cross iteration in general has the same oscillatory limiting behavior as the one conjectured by Forsythe for the restarted CG method. We point out that usually such a behavior does not exist in restarted Krylov subspace methods for nonsymmetric matrices which use only A (and not \(A^T\)); see [21, 25].

We generalize or extend several results from the original Forsythe formulation and from our paper [6] to the Arnoldi cross iteration. We also prove several new results about the limiting behavior of this iteration. In the simplest case \(s=1\) and \(A=A^T\), the iteration reduces to the one studied in [1], and we give an alternative (and in particular simpler) proof of the limiting behavior in this case. Another new result in this paper is the proof of the limiting behavior of the Arnoldi cross iteration for \(s=1\) and orthogonal matrices with eigenvalues having only positive (or only negative) real parts.

The paper is organized as follows. As mentioned above, in Sect. 2 we present the original Forsythe conjecture in modern terms and discuss previous attempts to prove it. In Sect. 3 we introduce the Arnoldi cross iteration, prove several results about this iteration and its limiting behavior, and generalize the conjecture. In Sect. 4 we consider the Arnoldi cross iteration for symmetric and orthogonal matrices. In Sect. 5 we give concluding remarks.

Notation. Throughout the paper we consider the behavior of all algorithms in exact arithmetic, and only real matrices for notational simplicity. Many results can be easily extended to the complex case. The degree of the minimal polynomial of a matrix \(A\in {{\mathbb {R}}}^{n\times n}\) is denoted by d(A), and the grade of a vector \(v\in {{\mathbb {R}}}^{n}\) with respect to A by d(A,v); cf. [11, Definition 4.2.1]. For each \(k\ge 1\), the kth Krylov subspace of \(A\in \mathbb {R}^{n\times n}\) and \(v\in \mathbb {R}^n\) is denoted by \(\mathcal {K}_{k}(A,v):=\textrm{span} \{v,Av,\dots ,A^{k-1}v\}\). The Euclidean norm on \(\mathbb {R}^n\) is denoted by \(\Vert \cdot \Vert \), and for a symmetric positive definite matrix \(A\in \mathbb {R}^{n \times n}\) the A-norm on \(\mathbb {R}^n\) is denoted by \(\Vert \cdot \Vert _A\), where \(\Vert v\Vert _A:=(v^TAv)^{1/2}\).

2 The original Forsythe conjecture

Forsythe [7] considered minimizing functions of the form

$$\begin{aligned} f(x)=\frac{1}{2} x^TAx-x^Tb, \end{aligned}$$
(1)

where \(A\in {{\mathbb {R}}}^{n\times n}\) is symmetric positive definite, by an iterative method he called the optimum s-gradient method. We remark that Forsythe actually considered \(b=0\) in (1) for simplicity, so that the unique minimizer of f is given by \(x=0\). In order to avoid any possible confusion of the reader that may be caused by studying a method for computing a solution that is already known trivially, we here consider a possibly nonzero vector \(b\in \mathbb {R}^n\) in (1). In any case, the unique minimizer of f is equal to the uniquely determined solution of the linear algebraic system \(Ax=b\).

Using modern notation, the iterate \(x_1\) of the optimum s-gradient method starting with \(x_0\) is then defined by the relations

$$\begin{aligned} x_1\in x_0+\mathcal {K}_s(A,r_0)\quad \hbox {such that}\quad x-x_1\perp _A \mathcal {K}_s(A,r_0), \end{aligned}$$
(2)

where \(r_0:=b-Ax_0\). Equivalently, the iterate \(x_1\in x_0+\mathcal {K}_s(A,r_0)\) satisfies

$$\begin{aligned} \Vert x-x_1\Vert _A=\min _{z\in x_0+\mathcal {K}_s(A,r_0)}\Vert x-z\Vert _A. \end{aligned}$$

This is exactly the mathematical characterization of the sth iterate of the CG method applied to \(Ax=b\) with the initial vector \(x_0\); see, e.g., [11, Theorem 2.3.1]. Forsythe was of course aware that the method he considered is mathematically equivalent to the CG method. He also pointed out that the implementation of Hestenes and Stiefel “in practice... may usually be superior to the optimum s-gradient methods” [7, p. 58].
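To make the characterization (2) concrete, the following sketch computes \(x_1\) directly from the Galerkin condition by solving a small \(s\times s\) system in a Krylov basis. It is a dense NumPy illustration under our own naming, not the Hestenes–Stiefel implementation, and it is not meant to be numerically efficient.

```python
import numpy as np

def optimum_s_gradient_step(A, b, x0, s):
    """One cycle of the optimum s-gradient method, i.e., the s-th CG iterate from x0.

    Implements (2): x1 in x0 + K_s(A, r0) with x - x1 A-orthogonal to K_s(A, r0).
    Dense illustration only; A is assumed symmetric positive definite.
    """
    r0 = b - A @ x0
    # Columns of K span the Krylov subspace K_s(A, r0).
    K = np.column_stack([np.linalg.matrix_power(A, j) @ r0 for j in range(s)])
    # Galerkin condition: K^T A (x - x1) = 0 with x1 = x0 + K y and A (x - x0) = r0,
    # which gives the s x s system (K^T A K) y = K^T r0.
    y = np.linalg.solve(K.T @ A @ K, K.T @ r0)
    return x0 + K @ y

# Small usage example with a symmetric positive definite matrix.
A = np.diag([1.0, 2.0, 3.0, 4.0, 5.0])
b = np.ones(5)
x1 = optimum_s_gradient_step(A, b, np.zeros(5), s=2)
print(np.linalg.norm(np.linalg.solve(A, b) - x1))  # error after one restart cycle
```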

Forsythe was interested in the behavior of the iterates in the optimum s-gradient method when it is applied multiple times or, in modern terms, restarted. It is well known that the restarted method converges to the uniquely determined minimizer of f, given by \(x=A^{-1}b\). The interesting question in this context is from which directions the iterates approach their limit.

In order to study this behavior, one considers an integer s with \(1\le s< d(A)\), an initial vector \(x_0\in {{\mathbb {R}}}^{n}\) with \(d(A,x_0)\ge s+1\), and a sequence of vectors constructed using (2) with an additional normalization:

$$\begin{aligned}&\text{ For } \; {k=0,1,2,\dots }\nonumber \\ {}&{y_k=r_k/\Vert r_k\Vert ,\quad \text{ where }\ r_k:=b-Ax_k,} \end{aligned}$$
(3)
$$\begin{aligned}&{x_{k+1}\in x_k+{\mathcal {K}_s(A,y_k)} \quad \hbox {such that}\ x-x_{k+1}\perp _A {\mathcal {K}_s(A,y_k)}.} \end{aligned}$$
(4)

In the case \(s=1\) we have \(x_{k+1}=x_k+\alpha _k y_k\), where \(\alpha _k\) is uniquely determined by the orthogonality condition \(x-x_{k+1}\perp _A \textrm{span}\{y_k\}\). This yields \(\alpha _k=(y_k^Tr_k)/(y_k^TAy_k)\), and hence

$$\begin{aligned} x_{k+1}=x_k+\frac{y_k^Tr_k}{y_k^TAy_k}y_k=x_k+\frac{r_k^Tr_k}{r_k^TAr_k}r_k, \end{aligned}$$
(5)

which is nothing but the steepest descent method.

Forsythe and Motzkin had conjectured already in 1951 that in the case \(s=1\) the two sequences of normalized vectors with even and odd indices, i.e., \(\{y_{2k}\}\) and \(\{y_{2k+1}\}\), alternate asymptotically between two limit vectors that are determined by the eigenvectors of A corresponding to its smallest and its largest eigenvalue. In the words of Forsythe, “the iteration behaves asymptotically, as \(k\rightarrow \infty \), as though it were entirely in the two-space \(\pi _{1,n}\)” [7, p. 64]. The conjecture of Forsythe and Motzkin (for \(s=1\)) was proven by Akaike in 1959 using methods from probability theory [2], and Forsythe gave another proof in [7] using orthogonal polynomials. Based on numerical evidence he suspected that the behavior of the optimum s-gradient method is similar for all s, and we can therefore state his conjecture from [7, p. 66] as follows:

Forsythe conjecture. For \(2\le s<d(A)\), each of the two subsequences \(\{y_{2k}\}\) and \(\{y_{2k+1}\}\) in (3)–(4) has a single limit vector.

We stress that the Forsythe conjecture is only about the existence of limit vectors, and not about the speed of convergence of the iteration (3)–(4), or of the restarted CG method. However, if the conjecture holds, then asymptotically the vectors with the even indices become arbitrarily close to being collinear, and the same happens for the vectors with the odd indices. Therefore the error norms of the restarted CG iteration can converge to zero (only) linearly; see [7, pp. 63–64] for Forsythe’s original discussion of this important observation.

Fig. 1  Plot of the computed limit vectors \(y_*\in \mathbb {R}^{10}\) for the 15 different right hand sides

Fig. 2  Error norms \(\Vert x-x_k\Vert _A/\Vert x\Vert _A\), \(k=0,1,\dots ,150\), and \(\Vert y_*-y_{2k}\Vert \), \(k=1,2,\dots ,75\), for the 15 different right hand sides

Example 1

We illustrate the behavior of the iteration (3)–(4) for \(s=1\), where \(x_{k+1}\) is computed as in (5), with a numerical example. We consider the matrix \(A=\textrm{diag}(1,2,\dots ,10)\in \mathbb {R}^{10\times 10}\), the initial vector \(x_0=0\in \mathbb {R}^{10}\), and we run the iteration for 15 different random right hand sides \(b\in \mathbb {R}^{10}\) generated with MATLAB’s randn function. In order to minimize the effect of rounding errors we perform the computations with 128 digits of accuracy using MATLAB’s Symbolic Math Toolbox. For each of the 15 different right hand sides we run 150 iterations of the algorithm. We consider the final vector \(y_{150}\) as the limit vector \(y_*\) of the corresponding sequence \(\{y_{2k}\}\), which is known to exist.

Each line (dashed or solid) in Fig. 1 shows one of the 15 limit vectors \(y_*\). These vectors are all different, but we see that (as described above) each of them is of the form \(y_*=\alpha e_1+\beta e_{10}\) with \(\alpha ^2+\beta ^2=1\).

The left and right sides of Fig. 2 show the error norms \(\Vert x-x_k\Vert _A/\Vert x\Vert _A\), \(k=0,1,\dots ,150\), and \(\Vert y_*-y_{2k}\Vert \), \(k=1,2,\dots ,75\), respectively, for the 15 right hand sides. In each case the speed of convergence is linear, sometimes after a short transient phase.
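A minimal double-precision sketch of this experiment (the extended-precision computation with the Symbolic Math Toolbox is not reproduced here; the seed, the number of right hand sides, and the iteration count are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.diag(np.arange(1.0, 11.0))          # A = diag(1, 2, ..., 10)

for trial in range(3):                     # a few random right hand sides (15 in the example)
    b = rng.standard_normal(10)
    x = np.zeros(10)
    for k in range(100):                   # fewer steps than in the example, to stay
        r = b - A @ x                      # well above the double-precision level
        x = x + (r @ r) / (r @ (A @ r)) * r    # steepest descent step, Eq. (5)
    r = b - A @ x
    y_even = r / np.linalg.norm(r)         # proxy for the limit of {y_{2k}}
    # In exact arithmetic the limit has the form alpha*e_1 + beta*e_10.
    print(np.round(y_even, 3))
```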

3 The Arnoldi cross iteration

Since the (oblique) projection process (4) on which the Forsythe conjecture is based uses the A-inner product, the matrix A must be symmetric positive definite. Our goal in this section is to generalize the context of the Forsythe conjecture to symmetric and nonsymmetric matrices. In order to do so, we consider the Arnoldi projection process, which is also well known in the area of Krylov subspace methods, and which is well defined for general A.

Given \(A\in {\mathbb {R}}^{n\times n}\), an integer \(s\ge 1\), and a vector \(v\in {\mathbb {R}}^{n}\), we define the vector \(w\in {\mathbb {R}}^{n}\) such that

$$\begin{aligned} w\in A^{s}v+{\mathcal {K}}_{s}(A,v)\quad \hbox {and}\quad w\perp \mathcal {K}_{s}(A,v). \end{aligned}$$
(6)

The construction of the vector w is the sth step in the orthogonalization of the Krylov sequence \(v,Av,A^{2}v,\dots \), where the vector \(A^{s}v\) is orthogonalized with respect to the previous vectors \(v,\dots ,A^{s-1}v\). Since the standard method for computing orthogonal Krylov subspace bases is the Arnoldi algorithm [3], we call w the Arnoldi projection of v with respect to A and s.
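A dense sketch of how (6) can be computed (the function name is ours; an actual implementation would use the Arnoldi algorithm rather than a QR factorization of the monomial Krylov basis):

```python
import numpy as np

def arnoldi_projection(A, v, s):
    """Arnoldi projection of v with respect to A and s, cf. (6):
    returns w in A^s v + K_s(A, v) with w orthogonal to K_s(A, v)."""
    K = np.column_stack([np.linalg.matrix_power(A, j) @ v for j in range(s)])
    Q, _ = np.linalg.qr(K)                 # orthonormal basis of K_s(A, v)
    Asv = np.linalg.matrix_power(A, s) @ v
    return Asv - Q @ (Q.T @ Asv)           # component orthogonal to K_s(A, v)
```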

By construction, \(w=p(A)v\) for some polynomial \(p\in \mathcal {M}_{s}\), which is the set of (real) monic polynomials of degree s. Further basic properties of the Arnoldi projection are shown in the next result.

Lemma 1

If \(A\in {\mathbb {R}}^{n\times n}\), \(1\le s<d(A)\), \(v\in {\mathbb {R}}^{n}\), and w is given by (6), then the following hold:

(1) The vector w satisfies

    $$\begin{aligned} \Vert w\Vert =\min _{z\in \mathcal {K}_{s}(A,v)}\Vert A^{s}v+z\Vert =\min _{q\in \mathcal {M}_{s}}\Vert q(A)v\Vert , \end{aligned}$$

    and hence \(w=0\) if and only if \(d(A,v)\le s\).

(2) If \(d(A,v)\ge s\), then \(w={P_s(A;v)\,v}\) for a uniquely determined polynomial \(P_s(z;v)\in \mathcal {M}_{s}\).

Proof

(1) By construction, \(w=A^{s}v+u\) for some \(u\in \mathcal {K}_{s}(A,v)\). Hence, using the orthogonality condition (6), we obtain for any \(z\in \mathcal {K}_{s}(A,v)\)

$$\begin{aligned} \Vert A^{s}v+z\Vert ^{2}= \Vert w-(u-z)\Vert ^{2}=\Vert w\Vert ^{2}+\Vert u-z\Vert ^{2} \ge \Vert w\Vert ^{2}, \end{aligned}$$

with equality if and only if \(z=u\). This proves the minimization property of w. From \(\Vert w\Vert =\min _{q\in \mathcal {M}_{s}}\Vert q(A)v\Vert \) we see that \(w=0\) holds if and only if \(d(A,v)\le s\).

(2) Suppose that \(d(A,v)\ge s\). If \(w=p(A)v=q(A)v\) with \(p,q\in \mathcal {M}_{s}\), then \((p-q)(A)v=0\). Since the polynomials p and q are both monic, the polynomial \(p-q\) has degree at most \(s-1\). But then \(d(A,v)\ge s\) implies \(p-q=0\). \(\square \)

In order to generalize the Forsythe conjecture to nonsymmetric matrices we will focus on the limiting behavior of a sequence of vectors obtained from repeatedly computing (6) with A and then with \(A^T\), and with additional normalizations. The algorithm we consider here is similar to the GMRES cross iteration we considered in [6] (cf. Algorithm 1 in that paper), but it is based on the Arnoldi projection (6) rather than the projection process used in the GMRES method.

Given \(A\in {\mathbb {R}}^{n\times n}\), an integer s with \(1\le s<d(A)\), and a vector \(v_{0}\in {\mathbb {R}}^{n}\) with \(\Vert v_0\Vert =1\) and \(d(A,v_0)\ge s+1\), we consider the following algorithm:

$$\begin{aligned}&\hbox {For} \ {k=0,1,2,\dots }\nonumber \\&{{\widetilde{w}_{k}={P_s(A;v_k)\,v_k}},} \end{aligned}$$
(7)
$$\begin{aligned}&{{w_{k}=\widetilde{w}_{k}/\Vert \widetilde{w}_{k}\Vert },} \end{aligned}$$
(8)
$$\begin{aligned}&{{\widetilde{v}_{k+1}={P_s(A^T;w_k)\,w_k}},} \end{aligned}$$
(9)
$$\begin{aligned}&{{v_{k+1}=\widetilde{v}_{k+1}/\Vert \widetilde{v}_{k+1}\Vert }.} \end{aligned}$$
(10)

We call this algorithm the Arnoldi cross iteration with restart length s, or shortly ACI(s).
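A minimal sketch of the ACI(s) following (7)–(10) (the names are ours; the dense helper repeats the sketch of (6) and the loop assumes \(d(A,v_k)\ge s+1\) throughout, so that all normalizations are well defined):

```python
import numpy as np

def arnoldi_projection(A, v, s):
    # Dense sketch of (6): w in A^s v + K_s(A, v), orthogonal to K_s(A, v).
    K = np.column_stack([np.linalg.matrix_power(A, j) @ v for j in range(s)])
    Q, _ = np.linalg.qr(K)
    Asv = np.linalg.matrix_power(A, s) @ v
    return Asv - Q @ (Q.T @ Asv)

def aci(A, v0, s, num_steps):
    """Arnoldi cross iteration ACI(s), Eqs. (7)-(10); returns the sequences {v_k}, {w_k}."""
    v = v0 / np.linalg.norm(v0)
    vs, ws = [v], []
    for _ in range(num_steps):
        w_tilde = arnoldi_projection(A, v, s)        # Eq. (7)
        w = w_tilde / np.linalg.norm(w_tilde)        # Eq. (8)
        v_tilde = arnoldi_projection(A.T, w, s)      # Eq. (9)
        v = v_tilde / np.linalg.norm(v_tilde)        # Eq. (10)
        ws.append(w)
        vs.append(v)
    return vs, ws
```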

In Theorems 1 and 2 below we will transfer some known results about the GMRES cross iteration from [6, Section 2] to the ACI(s). Because of these mathematical similarities between the two iterations we expect that their limiting behavior is similar, but we have not investigated this in detail.

The paper [6] is about worst-case GMRES, and here we need an analogous definition for the worst case in the projection process (6).

Definition 1

Let \(A\in {\mathbb {R}}^{n\times n}\) and an integer s with \(1\le s<d(A)\) be given. Denote

$$\begin{aligned} \Phi _s(A):=\max _{\begin{array}{c} v\in \mathbb {R}^n\\ \Vert v\Vert =1 \end{array}}\min _{p\in \mathcal {M}_s}\Vert p(A)v\Vert = \max _{\begin{array}{c} v\in \mathbb {R}^n\\ \Vert v\Vert =1 \end{array}} \Vert {P_s(A;v)\,v}\Vert . \end{aligned}$$
(11)

A unit norm vector \(v\in \mathbb {R}^n\) and a corresponding monic polynomial \(P_s(z;v)\) for which the value \(\Phi _s(A)\) is attained are called a worst-case Arnoldi vector and a worst-case Arnoldi polynomial for A and s, respectively.

Since a worst-case vector or the corresponding worst-case polynomial for the GMRES method need not be unique in general (see [6, Theorem 4.1]), we expect the same to be true for the worst-case Arnoldi problem (11). Note that in general we have

$$\begin{aligned} \Phi _s(A)=\max _{\begin{array}{c} v\in \mathbb {R}^n\\ \Vert v\Vert =1 \end{array}} \min _{p\in \mathcal {M}_s}\Vert p(A)v\Vert \le \min _{p\in \mathcal {M}_s} \max _{\begin{array}{c} v\in \mathbb {R}^n\\ \Vert v\Vert =1 \end{array}}\Vert p(A)v\Vert = \min _{p\in \mathcal {M}_s} \Vert p(A)\Vert . \end{aligned}$$
(12)

The expression on the right hand side is called the ideal Arnoldi problem for A and s; see [9]. If equality holds in (12), then the worst-case Arnoldi polynomial in (11) is unique; see [19, Lemma 2.4] and [12].

The next result transfers [6, Theorem 2.2] and the first part of [6, Theorem 2.5] to the ACI(s).

Theorem 1

Let \(A\in {\mathbb {R}}^{n\times n}\), let s be an integer with \(1\le s<d(A)\), and let \(v_{0}\in {\mathbb {R}}^{n}\) be such that \(\Vert v_0\Vert =1\) and \(d(A,v_{0})\ge s+1\). Then the vectors in (7)–(10) are all well defined, and

$$\begin{aligned} \Vert \widetilde{w}_k\Vert \le \Vert \widetilde{v}_{k+1}\Vert \le \Vert \widetilde{w}_{k+1}\Vert \le \Vert \widetilde{v}_{k+2}\Vert \le \Phi _s(A^T)=\Phi _s(A),\quad k=0,1,2,\dots .\nonumber \\ \end{aligned}$$
(13)

Equality holds in the first inequality if and only if \(v_{k}=\alpha v_{k+1}\) for some \(\alpha \ne 0\), and in the second if and only if \(w_{k}=\beta w_{k+1}\) for some \(\beta \ne 0\).

Proof

We start by showing inductively that the vectors in (7)–(10) are all well defined and satisfy (13). Suppose that for some \(k\ge 0\) we have \(\widetilde{w}_{k}={P_s(A;v_k)\,v_k}\ne 0\) for some uniquely determined polynomial \(P_s(z;v_k)\in \mathcal {M}_{s}\), so that \(w_{k}=\widetilde{w}_{k}/\Vert \widetilde{w}_{k}\Vert \) is well defined. By construction we have \(\widetilde{w}_{k}\perp \mathcal {K}_{s}(A,v_{k})\), and therefore using orthogonality

$$\begin{aligned} \Vert \widetilde{w}_{k}\Vert ^{2} =\langle {P_s(A;v_k)\,v_k},\widetilde{w}_{k}\rangle =\langle A^{s}v_{k},\widetilde{w}_{k}\rangle = \langle q(A)v_{k},\widetilde{w}_{k}\rangle&=\langle v_{k},q(A^T)\widetilde{w}_{k}\rangle , \end{aligned}$$

where \(q\in \mathcal {M}_{s}\) is arbitrary. Using \(w_{k}=\widetilde{w}_{k}/\Vert \widetilde{w}_{k}\Vert \) we obtain

$$\begin{aligned} \Vert \widetilde{w}_{k}\Vert = \langle v_{k},q(A^T)w_{k}\rangle . \end{aligned}$$
(14)

Since \(0<\Vert \widetilde{w}_{k}\Vert \) and \(q\in \mathcal {M}_{s}\) is arbitrary, we have \(d(A^T,w_k)\ge s+1\). Using Lemma 1, there is a unique polynomial \(P_s(z;w_k)\in \mathcal {M}_{s}\) such that \(\widetilde{v}_{k+1}={P_s(A^T;w_k)\,w_k} \ne 0\), and, using (14) and choosing \(q(z)=P_s(z;w_k)\) yields

$$\begin{aligned} \Vert \widetilde{w}_{k}\Vert = \langle v_{k},\widetilde{v}_{k+1}\rangle \le \Vert \widetilde{v}_{k+1}\Vert , \end{aligned}$$
(15)

where we have used the Cauchy-Schwarz inequality and \(\Vert v_{k}\Vert =1\). Therefore, \(v_{k+1} = \widetilde{v}_{k+1}/\Vert \widetilde{v}_{k+1}\Vert \) is well defined.

Similarly, we can prove that \(\Vert \widetilde{v}_{k+1}\Vert =\langle w_{k},q(A) v_{k+1}\rangle \) where \(q\in \mathcal {M}_{s}\) is arbitrary. Since \(0<\Vert \widetilde{v}_{k+1}\Vert \), we have \(d(A,v_{k+1})\ge s+1\), and it holds that

$$\begin{aligned} \Vert \widetilde{v}_{k+1}\Vert =\langle w_{k},\widetilde{w}_{k+1}\rangle \le \Vert \widetilde{w}_{k+1} \Vert . \end{aligned}$$
(16)

By induction, all vectors in (7)–(10) are well defined and satisfy (13).

Moreover, since we have used the Cauchy-Schwarz inequality to obtain (15), we have \(\Vert \widetilde{w}_{k}\Vert \le \Vert \widetilde{v}_{k+1}\Vert \) with equality if and only if \(v_{k}\) and \(\widetilde{v}_{k+1}\) are linearly dependent, or equivalently \(v_{k}=\alpha v_{k+1}\) for some \(\alpha \ne 0\). Similarly, the equality in (16) holds if and only if \(w_k\) and \(\widetilde{w}_{k+1}\) are linearly dependent, or equivalently \(w_k=\beta w_{k+1}\) for some \(\beta \ne 0\).

It is obvious that \(\Vert \widetilde{w}_{k} \Vert \le \Phi _s(A)\) and \(\Vert \widetilde{v}_{k} \Vert \le \Phi _s(A^T)\) for each k. We only need to prove that \(\Phi _s(A) = \Phi _s(A^T)\). Let v be a unit norm worst-case Arnoldi vector for A and s, and denote \(\widetilde{w}={P_s(A;v)\,v}\) and \(w\equiv \widetilde{w}/\Vert \widetilde{w}\Vert \), so that \(\Vert \widetilde{w}\Vert = \Phi _s(A)\). Then

$$\begin{aligned} \Phi _s(A) = \left\langle {P_s(A;v)\,v}, w \right\rangle = \left\langle v, q(A^T) w \right\rangle \le \min _{p\in \mathcal {M}_s} \Vert p(A^T) w \Vert \le \Phi _s(A^T), \end{aligned}$$

where \(q\in \mathcal {M}_{s}\) is arbitrary. Exchanging the roles of A and \(A^T\) shows the reverse inequality and completes the proof. \(\square \)
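The interlacing (13) is easy to observe numerically; the following sketch (illustrative parameters, dense helper as in the sketch of (6) above) checks it for a random nonsymmetric matrix:

```python
import numpy as np

def arnoldi_projection(A, v, s):
    K = np.column_stack([np.linalg.matrix_power(A, j) @ v for j in range(s)])
    Q, _ = np.linalg.qr(K)
    Asv = np.linalg.matrix_power(A, s) @ v
    return Asv - Q @ (Q.T @ Asv)

rng = np.random.default_rng(4)
A = rng.standard_normal((8, 8))                    # generic nonsymmetric matrix
v = rng.standard_normal(8); v /= np.linalg.norm(v)
s, norms = 2, []
for k in range(20):
    w_tilde = arnoldi_projection(A, v, s)
    w = w_tilde / np.linalg.norm(w_tilde)
    v_tilde = arnoldi_projection(A.T, w, s)
    v = v_tilde / np.linalg.norm(v_tilde)
    norms += [np.linalg.norm(w_tilde), np.linalg.norm(v_tilde)]
# The merged sequence ||w~_0||, ||v~_1||, ||w~_1||, ||v~_2||, ... is nondecreasing, cf. (13).
print(np.all(np.diff(norms) >= -1e-12))
```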

As shown in (13), the two nondecreasing sequences \(\{\Vert \widetilde{w}_k\Vert \}\) and \(\{\Vert \widetilde{v}_k\Vert \}\) interlace each other and are both bounded from above by \(\Phi _s(A)=\Phi _s(A^T)<\infty \). Thus, the sequences converge to the same limit,

$$\begin{aligned} \Vert \widetilde{w}_k\Vert \rightarrow \tau \quad \hbox {and}\quad \Vert \widetilde{v}_k\Vert \rightarrow \tau \quad \hbox {for}\ k\rightarrow \infty , \end{aligned}$$
(17)

where \(\tau \le \Phi _s(A)\). This observation is helpful for proving the next result, which is the ACI(s) version of the second part of [6, Theorem 2.5].

Theorem 2

Let \(A\in {\mathbb {R}}^{n\times n}\), let s be an integer with \(1\le s<d(A)\), and let \(v_{0}\in {\mathbb {R}}^{n}\) be such that \(\Vert v_0\Vert =1\) and \(d(A,v_{0})\ge s+1\). Then the sequences of the normalized vectors in (7)–(10) satisfy

$$\begin{aligned} \lim _{k\rightarrow \infty }\Vert w_{k+1}-w_{k}\Vert =0\quad \hbox {and}\quad \lim _{k\rightarrow \infty }\Vert v_{k+1}-v_{k}\Vert =0. \end{aligned}$$

Proof

Using similar ideas as in the proof of Theorem 1, for any \(k\ge 0\) we have

$$\begin{aligned} \begin{aligned}\frac{1}{2}\Vert v_{k+1}-v_{k}\Vert ^{2}&=1-\langle v_{k+1},v_{k}\rangle \\ {}&=1-\frac{1}{\Vert \widetilde{v}_{k+1}\Vert }\langle {P_s(A^T;w_k)\,w_k},v_k\rangle \\ {}&=1-\frac{1}{\Vert \widetilde{v}_{k+1}\Vert }\langle w_k,q(A) v_k\rangle \qquad \text{(for } \text{ any }\; {q\in \mathcal {M}_s})\\ {}&=1-\frac{1}{\Vert \widetilde{v}_{k+1}\Vert }\langle w_k,{P_s(A;v_k)\,v_k}\rangle \\ {}&=1-\frac{1}{\Vert \widetilde{v}_{k+1}\Vert }\langle w_k,\widetilde{w}_k\rangle \\ {}&=1-\frac{\Vert \widetilde{w}_k\Vert }{\Vert \widetilde{v}_{k+1}\Vert }, \end{aligned} \end{aligned}$$

where we have used \(w_k\perp \mathcal {K}_s(A,v_k)\). Since the sequences \(\{\Vert \widetilde{v}_k\Vert \}\) and \(\{\Vert \widetilde{w}_k\Vert \}\) converge to the same limit for \(k\rightarrow \infty \), we have \(\frac{\Vert \widetilde{w}_k\Vert }{\Vert \widetilde{v}_{k+1}\Vert }\rightarrow 1\) and hence \(\Vert v_{k+1}-v_{k}\Vert \rightarrow 0\) for \(k\rightarrow \infty \). The proof for the sequence \(\{w_k\}\) is analogous. \(\square \)

In the following we will focus on the properties of the sequence \(\{v_k\}\), with the understanding that analogous results can be formulated also for the sequence \(\{w_k\}\). The vectors \(v_{k}\) all belong to the unit sphere in \({{\mathbb {R}}}^{n}\), and, therefore, they form an infinite bounded sequence in \({{\mathbb {R}}}^{n}\). By the Bolzano-Weierstrass Theorem this sequence has a convergent subsequence, and thus it has limit vectors. The existence of a single limit vector of the sequence \(\{y_{2k}\}\) in the iteration (3)–(4) is the content of the original Forsythe conjecture (see Sect. 2), and here we formulate the same conjecture for the ACI(s):

ACI(s) conjecture. Let \(A\in {\mathbb {R}}^{n\times n}\), let s be an integer with \(1\le s<d(A)\), and let \(v_{0}\in {\mathbb {R}}^{n}\) be such that \(\Vert v_0\Vert =1\) and \(d(A,v_{0})\ge s+1\). Then the sequence \(\{v_{k}\}\) in (7)–(10) has a single limit vector.

As in the Forsythe conjecture, we are here only concerned with the existence of limit vectors, and not with the speed of convergence of the iteration.

As shown in Theorem 2, the (Euclidean) distance between consecutive vectors of the sequence \(\{v_k\}\) shrinks to zero for \(k\rightarrow \infty \). Because of this property it may be difficult to find a counterexample for the conjecture numerically. On the other hand, this property is not sufficient for the existence of a single limit vector. For example, the complex points \(\mu _k=e^{\textbf{i} s_k\pi }\), where \(s_k=\sum _{j=1}^k 1/j\) for \(k\ge 1\), satisfy

$$\begin{aligned} |\mu _{k}-\mu _{k+1} |= |1-e^{\textbf{i} \pi /(k+1)} |\rightarrow 0\quad \hbox {for}\quad k\rightarrow \infty , \end{aligned}$$

but the sequence \(\{\mu _k\}\) does not converge, since \(\{s_k\}\) diverges.
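A few lines suffice to observe this behavior numerically (purely illustrative):

```python
import numpy as np

# Consecutive distances |mu_k - mu_{k+1}| = |1 - exp(i*pi/(k+1))| tend to zero,
# but the partial sums s_k of the harmonic series diverge, so the points mu_k
# keep circling the unit circle and the sequence has no limit.
s_k, mu_prev = 0.0, 1.0 + 0.0j
for k in range(1, 100001):
    s_k += 1.0 / k
    mu = np.exp(1j * s_k * np.pi)
    if k in (10, 100, 1000, 10000, 100000):
        print(f"k = {k:6d}:  |mu_k - mu_(k-1)| = {abs(mu - mu_prev):.1e},  s_k = {s_k:.2f}")
    mu_prev = mu
```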

Using a similar notation as Forsythe in [7], we define for the given (possibly nonsymmetric) matrix \(A\in \mathbb {R}^{n\times n}\) and integer s, where \(1\le s<d(A)\), the set

$$\begin{aligned} \Sigma ^{A}:=\{v\in {\mathbb {R}}^{n}:\,\Vert v\Vert =1\;\hbox {and}\;d(A,v)\ge s+1\}, \end{aligned}$$

and the transformation

$$\begin{aligned} T_{A}:\Sigma ^{A}\rightarrow \Sigma ^{A},\quad v\mapsto T_{A}(v):=\frac{{P_s(A;v)\,v}}{\Vert {P_s(A;v)\,v}\Vert }. \end{aligned}$$
(18)

In the notation \(\Sigma ^A\) and \(T_A\) we have suppressed the dependence on s for simplicity. However, for the analysis of the iteration (7)–(10), it is important to explicitly consider the dependence on the matrix, since we operate with both A and \(A^T\).

In terms of (7)–(10), if \(v_0\in \Sigma ^A\), then

$$\begin{aligned} v_{k+1}=T_{A^T}(T_{A}(v_{k}))\quad \hbox {for all} \ k\ge 0, \end{aligned}$$

where Theorem 1 shows that the transformation \(T_{A^T}\circ T_A:\Sigma ^A\rightarrow \Sigma ^A\) is indeed well defined. Moreover, both \(T_{A^T}\) and \(T_A\) are continuous, so that \(T_{A^T}\circ T_{A}\) is also continuous. The next result adapts [7, Theorem 3.8] to our context.

Theorem 3

Let \(A\in {\mathbb {R}}^{n\times n}\), let s be an integer with \(1\le s<d(A)\), and let \(v_{0}\in {\mathbb {R}}^{n}\) be such that \(\Vert v_0\Vert =1\) and \(d(A,v_{0})\ge s+1\). Then the set \(\Sigma _{*}^{A}\) of limit vectors of the sequence \(\{v_{k}\}\) in (7)–(10) satisfies:

(1) \(\Sigma _{*}^{A}\) is a closed and connected set in \({\mathbb {R}}^{n}\).

(2) \(\Sigma _{*}^{A}\subseteq \Sigma ^{A}\), and each \(v_{*}\in \Sigma _{*}^{A}\) satisfies \(v_{*}=T_{A^T}(T_{A}(v_{*}))\).

Proof

(1) Suppose that the sequence \(\{v_{k}\}\) has two isolated limit vectors, \(v_{*,1}\) and \(v_{*,2}\). Then there exist neighborhoods \(U_{1}\) and \(U_{2}\) of \(v_{*,1}\) and \(v_{*,2}\), respectively, which do not contain other limit vectors of the sequence. Let \(\{v_{k_{i}}\}\) be a set of vectors in the sequence that is contained in \(U_{1}\). Then the sequence \(\{v_{k_{i}}\}\) converges to \(v_{*,1}\), since this sequence can have only one limit vector. We know that the distance \(\Vert v_{k+1}-v_{k}\Vert \) converges to zero for \(k\rightarrow \infty \); see Theorem 2. Therefore almost all (i.e., all except finitely many) successors of vectors in the set \(\{v_{k_{i}}\}\) are contained in \(U_{1}\). But this implies that only a finite number of the vectors of the sequence \(\{v_{k}\}\) can be outside \(U_{1}\), which contradicts that \(v_{*,2}\) is a limit vector. Consequently, the sequence \(\{v_{k}\}\) cannot have two isolated limit vectors.

(2) If \(v_{*}\in \Sigma _{*}^{A}\), then there exists a subsequence \(\{v_{k_{i}}\}\) of the sequence \(\{v_{k}\}\) that converges to \(v_{*}\). For each vector \(v_{k_{i}}\) in the subsequence we have \(T_{A^T}(T_{A}(v_{k_{i}}))=v_{k_{i}+1}\), and therefore

$$\begin{aligned} \Vert T_{A^T}(T_{A}(v_{k_{i}}))-v_{k_{i}}\Vert =\Vert v_{k_{i}+1}-v_{k_{i}}\Vert \rightarrow 0\quad \hbox {for}\, k\rightarrow \infty ; \end{aligned}$$

see Theorem 2. Since \(T_{A^T}\circ T_{A}\) is continuous and \(v_{k_{i}}\rightarrow v_{*}\), we have \(T_{A^T}(T_{A}(v_{k_{i}}))\rightarrow T_{A^T}(T_{A}(v_{*}))\), and hence \(v_{*}=T_{A^T}(T_{A}(v_{*}))\).

It is clear that every \(v_{*}\in \Sigma _{*}^{A}\) satisfies \(\Vert v_{*}\Vert =1\). If \(d(A,v_{*})\le s\), then Lemma 1 implies \(T_{A^T}(T_{A}(v_{*}))=0\), in contradiction to \(T_{A^T}(T_{A}(v_{*}))=v_{*}\). Thus, \(d(A,v_{*})\ge s+1\), which shows that \(\Sigma _{*}^{A}\subseteq \Sigma ^{A}\). \(\square \)

4 Results for special cases

In this section we will first derive some general results about the ACI(s) for symmetric matrices. Then we prove the ACI(1) conjecture for symmetric matrices, and for real orthogonal matrices with \(d(A)=n\) and eigenvalues having only positive (or only negative) real parts.

4.1 The ACI(s) for symmetric matrices

If \(A=A^T\in \mathbb {R}^{n\times n}\), the steps (7)–(8) and (9)–(10) in the ACI(s) are identical, and hence we can write the algorithm in the following simpler form:

$$\begin{aligned}&\hbox {For }{k=0,1,2,\dots }\nonumber \\&{{\widetilde{v}_{k+1}={P_s(A;v_k)\,v_k}},} \end{aligned}$$
(19)
$$\begin{aligned}&{{v_{k+1}=\widetilde{v}_{k+1}/\Vert \widetilde{v}_{k+1}\Vert }.} \end{aligned}$$
(20)

Our conjecture now is that for each integer s with \(1\le s<d(A)\) and each unit norm initial vector \(v_{0}\in {\mathbb {R}}^{n}\) with \(d(A,v_{0})\ge s+1\), the sequence \(\{v_{2k}\}\) in (19)–(20) has a single limit vector.

Theorem 1 for \(A=A^T\) and the simplified algorithm (19)–(20) says that

$$\begin{aligned} \Vert \widetilde{v}_k\Vert \le \Vert \widetilde{v}_{k+1}\Vert ,\quad k=0,1,2,\dots , \end{aligned}$$

with equality if and only if \(v_k=\alpha v_{k+2}\) for some \(\alpha \ne 0\). Knowing that \(v_k\) and \(v_{k+2}\) have unit norm, and that \(\langle v_k,v_{k+2}\rangle > 0\) (see (15)), we must have \(\alpha =1\), i.e., \(v_k=v_{k+2}\). This can actually happen, as shown in the next result, which adapts [7, Theorem 4.8] to our context.

Theorem 4

Let \(A=A^{T}\in {\mathbb {R}}^{n\times n}\), let s be an integer with \(1\le s<d(A)\), and let \(v_{0}\in {\mathbb {R}}^{n}\) be such that \(\Vert v_0\Vert =1\) and \(d(A,v_{0})=s+1\). Then the vectors \(v_{0},v_{2},v_{4},\dots \) in (19)–(20) are all equal.

Proof

First note that if \(d(A,v_0)=s+1\), then the subspace \(\mathcal {K}_{s+1}(A,v_{0})\) is A-invariant. Therefore, all vectors \(v_k\) are contained in the same \(s+1\) dimensional space, and by Theorem 1 it holds that \(d(A,v_{k})=s+1\) for all \(k\ge 0\). As a consequence,

$$\begin{aligned} \dim \mathcal {K}_{s}(A,v_{k}) = s,\quad k = 0,1,\dots . \end{aligned}$$

It suffices to show that \(v_{0}= v_{2}\); the equality of all vectors \(v_{0},v_{2},v_{4},\dots \) then follows inductively. By construction, \(v_1\perp \mathcal {K}_{s}(A,v_{0})\) and \( v_2\perp \mathcal {K}_{s}(A,v_{1})\), where the second orthogonality condition is equivalent to \(v_1\perp \mathcal {K}_{s}(A,v_{2})\). Therefore,

$$\begin{aligned} v_1\;\perp \;\mathcal {K}_{s}(A,v_{0})\quad \hbox {and}\quad v_1\;\perp \;\mathcal {K}_{s}(A,v_{2}). \end{aligned}$$

Since \(\dim \mathcal {K}_{s}(A,v_{0}) = \dim \mathcal {K}_{s}(A,v_{2}) = s\) and since all vectors are contained in the same \(s+1\) dimensional space, we have \(\mathcal {K}_{s}(A,v_{0})=\mathcal {K}_{s}(A,v_{2})\). Hence,

$$\begin{aligned} v_{2}=\sum _{j=0}^{s-1}\gamma _{j}A^{j}v_{0}, \end{aligned}$$

for some coefficients \(\gamma _{0},\dots ,\gamma _{s-1}\in {\mathbb {R}}\). We will show that \(\gamma _{0}=1\) and \(\gamma _{j}=0\) for \(j>0\).

Using \(v_2\;\perp \;\mathcal {K}_{s}(A,v_{1})\), i.e., \(v_2 \perp A^i v_{1}\) for \(i=0,1,\dots ,s-1\) we obtain

$$\begin{aligned} 0=\langle A^i v_{1},v_{2}\rangle = \langle v_{1},A^i v_{2}\rangle = \sum _{j=0}^{s-1}\gamma _{j}\langle v_{1},A^{j+i}v_{0}\rangle , \quad i=0,1,\dots ,s-1. \end{aligned}$$

Now realize that \(\langle v_{1},A^{m}v_{0}\rangle = 0\) for \(m=0,1,\dots ,s-1\) and \(\langle v_{1},A^{m}v_{0}\rangle \ne 0\) for \(m\ge s\) because \(A^m v_0 \in {\mathcal K}_{s+1}(A,v_0)\,\backslash \,{{\mathcal {K}}}_{s}(A,v_0)\). In particular, for \(i=1\) we obtain

$$\begin{aligned} 0=\langle A v_{1},v_{2}\rangle =\gamma _{s-1}\langle v_{1},A^{s}v_{0}\rangle . \end{aligned}$$

Since \(\langle v_{1},A^{s}v_{0}\rangle \ne 0\), we get \(\gamma _{s-1}=0\). Proceeding by induction we obtain \(\gamma _{s-2}=\cdots =\gamma _{1}=0\). Therefore, \(v_{2}=\gamma _{0}v_{0}\). Since \(v_0\) and \(v_{2}\) are unit norm vectors and \(\langle v_0,v_{2}\rangle > 0\) (see (15)) we get \(\gamma _0=1\). \(\square \)
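The fixed-point behavior described in Theorem 4 can be observed numerically with a few lines (illustrative data; dense helper as in the sketch of (6)): for a diagonal A and a vector \(v_0\) supported on exactly \(s+1\) eigenvector positions we have \(d(A,v_0)=s+1\), and the even iterates reproduce \(v_0\).

```python
import numpy as np

def arnoldi_projection(A, v, s):
    K = np.column_stack([np.linalg.matrix_power(A, j) @ v for j in range(s)])
    Q, _ = np.linalg.qr(K)
    Asv = np.linalg.matrix_power(A, s) @ v
    return Asv - Q @ (Q.T @ Asv)

A = np.diag([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
s = 2
v0 = np.array([1.0, 0.0, 1.0, 0.0, 1.0, 0.0]) / np.sqrt(3.0)   # d(A, v0) = s + 1 = 3
v = v0.copy()
for k in range(4):                       # two double steps of (19)-(20)
    w = arnoldi_projection(A, v, s)
    v = w / np.linalg.norm(w)
print(np.linalg.norm(v - v0))            # v_4 = v_0 up to rounding, cf. Theorem 4
```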

We next prove a result about the limit vectors of the sequence (19)–(20). The part about their grades adapts [7, Theorem 4.7] to our context.

Theorem 5

Let \(A=A^T\in {\mathbb {R}}^{n\times n}\), let s be an integer with \(1\le s<d(A)\), and let \(v_{0}\in {\mathbb {R}}^{n}\) be such that \(\Vert v_0\Vert =1\) and \(d(A,v_{0})\ge s+1\). Then each limit vector \(v_*\) of (19)–(20) satisfies

$$\begin{aligned} ({Q_{2s}(A;v_*)}-\tau ^2 I)v_{*}=0, \end{aligned}$$
(21)

where \(\tau \) is the limit value of the sequence \(\{\Vert \widetilde{v}_k\Vert \}\) (as in (17)), so that, in particular, \(s<d(A,v_{*})\le 2s\).

Proof

We use the notation of Theorem 3. It is clear that each limit vector \(v_*\in \Sigma ^{A}_*\) satisfies \(d(A,v_*)>s\), so we only need to show that \(d(A,v_*)\le 2s\). Moreover, since \(A=A^T\), we have \(T_A^2(v_{*})=v_{*}\).

By construction,

$$\begin{aligned} T_A(v_*)=\frac{{P_s(A;v_*)\,v_*}}{\tau } \end{aligned}$$

for a uniquely determined polynomial \(P_s(z;v_*)\in \mathcal {M}_{s}\), and

$$\begin{aligned} T_A^{2}(v_*)=\frac{{P_s(A;w_*)\,w_*}}{\tau }, \quad w_*=T_A(v_*), \end{aligned}$$

for a uniquely determined polynomial \(P_s(z;w_*)\in \mathcal {M}_{s}\); cf. Lemma 1. Note that \(\tau \) is independent of the limit vector \(v_*\). Thus,

$$\begin{aligned} T_A^{2}(v_*)=\frac{{P_s(A;w_*)}\,{P_s(A;v_*)}\,v_*}{\tau ^2}=\frac{{Q_{2s}(A;v_*)}\,v_*}{\tau ^2} \end{aligned}$$

for a uniquely determined polynomial \({Q_{2s}(z;v_*)}\in \mathcal {M}_{2s}\). Finally, using \(T_A^{2}(v_{*})=v_{*}\) we obtain \(({Q_{2s}(A;v_*)}-{\tau ^2} I)v_{*}=0\), and hence \(d(A,v_{*})\le 2s\). \(\square \)

Example 2

For a numerical illustration of the bound \(s<d(A,v_*)\le 2s\) in Theorem 5 we consider \(A=\textrm{diag}(1,2,\dots ,10)\in \mathbb {R}^{10}\) and normalized random initial vectors \(v_0\in \mathbb {R}^{10}\) generated with MATLAB’s randn function. As in Example 1 we perform the computations with 128 digits of accuracy using MATLAB’s Symbolic Math Toolbox. We run the iteration (19)–(20) for \(s=2\) and \(s=3\), and each time with 5 different initial vectors and for 300 steps.

Figure 3 shows the computed limit vectors \(v_*\) for \(s=2\) (left) and \(s=3\) (right), which satisfy \(d(A,v_*)\in \{3,4\}\) and \(d(A,v_*)\in \{4,5\}\), respectively. (In this example none of the random initial vectors yielded a vector \(v_*\) with \(d(A,v_*)=6\) in the case \(s=3\).) The fact that for \(s\ge 2\) the grade of the limit vectors depends on the initial vector is one of the major obstacles for proving the ACI(s) conjecture for symmetric matrices.

Fig. 3  Plots of the computed limit vectors \(v_*\in \mathbb {R}^{10}\) for the 5 different initial vectors with \(s=2\) (left) and \(s=3\) (right)
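A double-precision sketch of this experiment (the extended-precision computation is not reproduced; seed, step count, and tolerance are illustrative choices). Since A is diagonal, \(d(A,v_*)\) is simply the number of nonzero entries of the computed limit vector:

```python
import numpy as np

def arnoldi_projection(A, v, s):
    K = np.column_stack([np.linalg.matrix_power(A, j) @ v for j in range(s)])
    Q, _ = np.linalg.qr(K)
    Asv = np.linalg.matrix_power(A, s) @ v
    return Asv - Q @ (Q.T @ Asv)

rng = np.random.default_rng(1)
A = np.diag(np.arange(1.0, 11.0))
for s in (2, 3):
    v = rng.standard_normal(10); v /= np.linalg.norm(v)
    for k in range(300):                         # iteration (19)-(20)
        w = arnoldi_projection(A, v, s)
        v = w / np.linalg.norm(w)
    # v approximates a limit vector of {v_{2k}}; count its (numerically) nonzero entries.
    print(s, int(np.sum(np.abs(v) > 1e-6)))      # expected: between s+1 and 2s
```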

Consider an orthogonal diagonalization of A, i.e., \(A=U\Lambda U^T\), where \(\Lambda =\textrm{diag}(\lambda _1,\dots ,\lambda _n)\) and U is orthogonal. Let \(v_* = U \nu \), where \(\nu = [\nu _1,\dots ,\nu _n]^T \in {\mathbb {R}}^n\) is the coordinate vector of \(v_*\) in the eigenvector basis. Then the condition (21) can also be written in the form

$$\begin{aligned} {Q_{2s}(\lambda _i;v_*)}\nu _i = {\tau ^2} \nu _i,\quad i=1,\dots ,n. \end{aligned}$$

If \(d(A,v_*)= d\), then there are exactly d nonzero coordinates \(\nu _{k_i}\), \(i=1,\dots ,d\), and the polynomial \({Q_{2s}(z;v_*)}\) satisfies the d interpolation conditions

$$\begin{aligned} {Q_{2s}(\lambda _{k_i};v_*)}= \tau ^2,\quad i=1,\dots ,d. \end{aligned}$$

In particular, if \(d=2s\), then \({Q_{2s}(z;v_*)}\) is uniquely determined by the corresponding interpolation conditions, so that

$$\begin{aligned} {Q_{2s}(z;v_*)} = \prod _{i=1}^{2s} (z-\lambda _{k_i}) + \tau ^2. \end{aligned}$$

Since there exist only finitely many combinations of 2s distinct eigenvalues of A, there are only finitely many polynomials \({Q_{2s}(z;v_*)}\) that correspond to limit vectors \(v_*\) having grade 2s.

The set \(\Sigma _{*}^{A}\) of limit vectors of the sequence \(\{v_{2k}\}\) is a closed and connected set on the unit sphere; see Theorem 3. Therefore, in order to show that \(\Sigma _{*}^{A}\) consists of a single vector, it is sufficient to show that \(\Sigma _{*}^{A}\) contains only finitely many vectors. The existence of only finitely many limit polynomials \({Q_{2s}(z;v_*)}\) can potentially be used to prove that there can be only finitely many limit vectors \(v_*\), thereby obtaining a proof of the ACI(s) conjecture for symmetric matrices and general s.

We point out that the convergence of the coefficients of the iteration polynomials and its consequences for the convergence of the sequence \(\{v_{2k}\}\) is an essential ingredient in the work of Zhuk and Bondarenko [26] on the original Forsythe conjecture and the case \(s=2\). In particular, they show that for \(s=2\) the coefficients of the monic polynomials \({Q_{2s}(z;v_k)}\) converge to their limit values. Quoting a paper by Zabolotskaya, they use as a proven fact that the convergence of the polynomial coefficients implies the existence of a single limit vector of the even iterates; see [26, property 4, p. 429]. However, the English translation of Zabolotskaya’s paper [22, p. 238] states that the convergence of the polynomial coefficients would only indicate the existence of a single limit vector. This may well be an imprecise translation of Zabolotskaya’s Russian original, but it would nevertheless be very useful to have a more transparent proof of this essential property. Until then we consider the Forsythe conjecture for the case \(s=2\) to be still open.

4.2 Proof of the ACI(1) conjecture for symmetric matrices

If \(s=1\), then Theorem 5 shows that every limit vector \(v_*\) of the sequence \(\{v_{2k}\}\) in (19)–(20) satisfies \(d(A,v_*)=2\), i.e., \(v_*\) is a linear combination of exactly two linearly independent eigenvectors of A. This observation is essential for proving the existence of a single limit vector of the sequence \(\{v_{2k}\}\) for \(s=1\).

Consider the ACI(1), i.e., \(s=1\), for \(A=A^T\in \mathbb {R}^{n \times n}\). In this case we have (cf. (6))

$$\begin{aligned} w={P_s(A;v)\,v}=Av-\alpha v\;\perp \;\textrm{span}\{v\}. \end{aligned}$$

Thus, \(0=v^{T}w=v^{T}Av-\alpha v^{T}v\), which yields

$$\begin{aligned} \alpha =\frac{v^{T}Av}{v^{T}v}, \end{aligned}$$

i.e., \(\alpha \) is the Rayleigh quotient of A and v. Therefore, the algorithm (19)–(20) can be written as follows:

$$\begin{aligned}&\hbox {For}\ {k=0,1,2,\dots }\nonumber \\&\widetilde{v}_{k+1}=(A-\rho _k I)v_{k},\quad \rho _{k}=v_{k}^{T}Av_{k}, \end{aligned}$$
(22)
$$\begin{aligned}&v_{k+1}=\widetilde{v}_{k+1}/\Vert \widetilde{v}_{k+1}\Vert . \end{aligned}$$
(23)

Let \(A=U\Lambda U^{T}\) be an orthogonal diagonalization of A with \(\Lambda =\textrm{diag}(\lambda _{1},\dots ,\lambda _{n})\). Then \(\rho _k=(U^Tv_k)^T\Lambda (U^T v_k)\), and in (22) we can write \(U^T\widetilde{v}_{k+1}=(\Lambda -\rho _{k}I)U^{T}v_{k}\). This shows that without loss of generality we can consider the behavior of (22)–(23) for a diagonal matrix A. Moreover, without loss of generality we will assume that A has n distinct eigenvalues.
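The following minimal sketch of (22)–(23) for a diagonal matrix (illustrative data) shows the behavior predicted by Theorems 5 and 6: after a moderate number of steps the iterate is supported on only two eigenvector positions.

```python
import numpy as np

Lam = np.diag(np.arange(1.0, 11.0))            # diagonal (symmetric) matrix
rng = np.random.default_rng(2)
v = rng.standard_normal(10); v /= np.linalg.norm(v)
for k in range(200):
    rho = v @ (Lam @ v)                        # Rayleigh quotient, Eq. (22)
    v_tilde = Lam @ v - rho * v
    v = v_tilde / np.linalg.norm(v_tilde)      # Eq. (23)
print(np.round(v, 3))                          # essentially two nonzero entries remain
```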

Theorem 6

Let \(A={\textrm{diag}}(\lambda _{1},\dots ,\lambda _{n})\in {\mathbb {R}}^{n\times n}\) with \(\lambda _{1}<\lambda _{2}<\cdots <\lambda _{n}\), and let \(v_{0}\) be a unit norm initial vector with \(d(A,v_0)\ge 2\). Then the sequence \(\{v_{2k}\}\) in (22)–(23) converges to a single limit vector.

Proof

Recall that by Theorem 3, the set \(\Sigma _{*}^{A}\) is closed and connected. Therefore, it is sufficient to show that \(\Sigma _{*}^{A}\) contains only finitely many vectors. Let \(v_*\in \Sigma _{*}^{A}\). We know that there exists a fixed \(\tau >0\) (independent of \(v_*\)) with \({\Vert \widetilde{v}_{k+1}\Vert }=\Vert Av_k-(v_k^TAv_k)v_k\Vert \rightarrow \tau \) for \(k\rightarrow \infty \); see (17). Thus,

$$\begin{aligned} \tau =\left\| Av_{*}-\left( v_{*}^{T}Av_{*}\right) v_{*}\right\| . \end{aligned}$$
(24)

Moreover, Theorem 5 implies that \(d(A,v_*)=2\), and hence \(v_*=\alpha e_i+\beta e_j\) for some canonical basis vectors \(e_{i}\) and \(e_{j}\), \(1\le i< j\le n\), and nonzero \(\alpha ,\beta \in \mathbb {R}\) that satisfy \(\alpha ^2+\beta ^2=1\).

Now suppose that some vector \(v=\alpha e_i+\beta e_j\) with \(\alpha ^2+\beta ^2=1\) satisfies the equation (24). Then

$$\begin{aligned} \tau ^{2}&=\left\| Av-\left( v^{T}Av\right) v\right\| ^{2}= v^{T}A^{2}v-\left( v^{T}Av\right) ^{2} \\&=\alpha ^{2}\lambda _{i}^{2}+\beta ^{2}\lambda _{j}^{2}-\left( \alpha ^{2}\lambda _{i}+\beta ^{2}\lambda _{j}\right) ^{2}\\&= \alpha ^{2}\lambda _{i}^{2}(1-\alpha ^{2})+\beta ^{2}\lambda _{j}^{2}(1-\beta ^{2})-2\alpha ^{2}\beta ^{2}\lambda _{i}\lambda _{j}\\&= \alpha ^{2}\beta ^{2}\left( \lambda _{i}^{2}-2\lambda _{i}\lambda _{j}+\lambda _{j}^{2}\right) \\&= \alpha ^{2}\beta ^{2}\left( \lambda _{i}-\lambda _{j}\right) ^{2}\\&= \alpha ^{2}\left( 1-\alpha ^{2}\right) \left( \lambda _{i}-\lambda _{j}\right) ^{2}. \end{aligned}$$

There are only finitely many combinations of distinct \(i,j\in \{1,2,\dots ,n\}\), and for each such combination there are only finitely many values of \(\alpha \in \mathbb {R}\) that satisfy

$$\begin{aligned} \alpha ^{2}\left( 1-\alpha ^{2}\right) =\frac{\tau ^2}{\left( \lambda _{i}-\lambda _{j}\right) ^{2}}. \end{aligned}$$

Consequently, there are only finitely many vectors of the form \(v=\alpha e_i+\beta e_j\) with \(\alpha ^2+\beta ^2=1\) that satisfy the Eq. (24). Therefore there can be only finitely many \(v_*\in \Sigma _{*}^{A}\), which shows that the sequence \(\{v_{2k}\}\) converges to a single limit vector. \(\square \)

Afanasjew, Eiermann, Ernst, and Güttel [1, Section 3] have also shown that the sequence \(\{v_{2k}\}\) in (22)–(23) converges to a single limit vector. Similar to the original proof of Forsythe [7], their proof is based on first showing that each limit vector is a linear combination of \(e_1\) and \(e_n\) (or eigenvectors corresponding to the smallest and largest eigenvalue of A), and then showing that there can be only finitely many such combinations. Variations of this approach have appeared also in other proofs for symmetric positive definite matrices and \(s=1\); see, e.g., [2, 8]. The approach is longer and more technical than our proof of Theorem 6, but the results give more information about the structure of the limiting vectors.

4.3 On the ACI(1) conjecture for orthogonal matrices

In this section we first prove the ACI(1) conjecture for orthogonal matrices \(A\in \mathbb {R}^{n\times n}\) with \(0\notin F(A)\), the field of values of A, and we then comment on the behavior when \(0\in F(A)\).

Let \(A\in \mathbb {R}^{n\times n}\) be orthogonal. Starting from a unit norm vector \(v_0\in \mathbb {R}^n\) with \(d(A,v_0)\ge 2\), the ACI(1) is as follows:

$$\begin{aligned}&\text{ For }\; k=0,1,2,\dots \nonumber \\ {}&\widetilde{w}_k = (A-\alpha _k I)v_k,\quad \alpha _k=v_{k}^TAv_k, \end{aligned}$$
(25)
$$\begin{aligned}&\hspace{1cm}w_k = \widetilde{w}_k / \Vert \widetilde{w}_k\Vert , \end{aligned}$$
(26)
$$\begin{aligned}&\hspace{1cm}\widetilde{v}_{k+1} = (A^T-\beta _k I)w_k,\quad \beta _k=w_{k}^TA^Tw_k=w_{k}^TAw_k, \end{aligned}$$
(27)
$$\begin{aligned}&\hspace{1cm}v_{k+1}=\widetilde{v}_{k+1}/\Vert \widetilde{v}_{k+1}\Vert . \end{aligned}$$
(28)

The Rayleigh quotients \(\alpha _k\) and \(\beta _k\) in the ACI(1) are real elements in F(A). Moreover,

$$\begin{aligned} \Vert \widetilde{w}_k\Vert ^2&= v_{k}^T (A^T-\alpha _k I)(A-\alpha _k I)v_k=1-\alpha _k^2, \end{aligned}$$
(29)
$$\begin{aligned} \Vert \widetilde{v}_{k+1}\Vert ^2&= w_{k}^T (A-\beta _k I)(A^T-\beta _k I)w_k=1-\beta _k^2, \end{aligned}$$
(30)

where we have used that A is orthogonal. By construction, \(\Vert \widetilde{w}_k\Vert >0\) and \(\Vert \widetilde{v}_{k+1}\Vert >0\), so that \(\alpha _k,\beta _k\in (-1,1)\). Since \(\Vert \widetilde{w}_k\Vert \le \Vert \widetilde{v}_{k+1}\Vert \), we always have

$$\begin{aligned} |\beta _k |\le |\alpha _k |. \end{aligned}$$
(31)

We know from (17) that the sequences \(\{\Vert \widetilde{w}_k\Vert \}\) and \(\{\Vert \widetilde{v}_k\Vert \}\) converge to the same limit. Therefore, \(\alpha _k^2-\beta _k^2\rightarrow 0\), so that there exist \(\alpha ,\beta \in (-1,1)\) with

$$\begin{aligned} |\alpha |=|\beta |,\quad \alpha _k\rightarrow \alpha ,\quad \hbox {and}\quad \beta _k\rightarrow \beta . \end{aligned}$$
(32)

Here \(\alpha \) and \(\beta \) are independent of the limit vectors of the sequences \(\{v_k\}\) and \(\{w_k\}\).

It is well known that an orthogonal matrix \(A\in \mathbb {R}^{n\times n}\) can be orthogonally block-diagonalized as \(A=UGU^T\) with

$$\begin{aligned} U=[U_1,\dots ,U_m,u_1,\dots ,u_k],\quad G=\textrm{diag}(G_1,\dots ,G_m,[\pm 1],\dots ,[\pm 1]), \end{aligned}$$

where \(U^TU=I\), \(U_1,\dots ,U_m\in \mathbb {R}^{n\times 2}\), \(u_1,\dots ,u_k\in \mathbb {R}^n\), and

$$\begin{aligned} G_j=\begin{bmatrix} c_j &{} s_j\\ -s_j &{} c_j\end{bmatrix}\in \mathbb {R}^{2\times 2}, \quad c_j^2+s_j^2=1,\quad s_j\ne 0,\quad j=1,\dots ,m. \end{aligned}$$
(33)

The blocks \(G_j\) in (33) correspond to the non-real eigenvalues of A, which occur in complex conjugate pairs \(c_j\pm \textbf{i}\, s_j\) with real parts \(c_j\in (-1,1)\). By transforming the iterates of the ACI(1) with \(U^T\) (similarly to the transformation for symmetric matrices in Sect. 4.2) we can assume without loss of generality that A is in the block-diagonal form, i.e., that \(A=\textrm{diag}(G_1,\dots ,G_m,[\pm 1],\dots ,[\pm 1])\).

We will now assume that \(A\in \mathbb {R}^{n\times n}\) is orthogonal with \(d(A)=n\) and \(0\notin F(A)\). Then in the orthogonal block-diagonalization of A there can be at most one block of size \(1\times 1\), either [1] or \([-1]\), and \(c_1,\dots ,c_m\) are pairwise distinct and either all positive or all negative. For simplicity, we will state and prove the next results only for the positive case.

Lemma 2

Let \(A=\textrm{diag}(G_1,\dots ,G_m,[1])\in \mathbb {R}^{n\times n}\), where possibly the block [1] does not occur, have blocks \(G_j\) as in (33) with \(0<c_1<\cdots <c_m\). If \(v_0\in \mathbb {R}^n\) is any unit norm initial vector with \(d(A,v_0)\ge 2\), then any limit vector \(v_*\in \Sigma _*^A\) of the sequence \(\{v_k\}\) in (25)–(28) satisfies \(d(A,v_*)=2\), and \(\alpha =\beta \) is equal to the real part of an eigenvalue of A.

Proof

Let \(v_*\) be any limit vector of the sequence \(\{v_k\}\). It is clear that \(d(A,v_*)\ge 2\), so it suffices to show that \(d(A,v_*)\le 2\).

According to Theorem 3 we have \(v_{*}=T_{A^T}(T_{A}(v_{*}))\), i.e.,

$$\begin{aligned}&\widetilde{w}_* = (A-\alpha I)v_*,\quad \alpha =v_{*}^TAv_*,\\&w_* = \widetilde{w}_* / \Vert \widetilde{w}_*\Vert ,\\&\widetilde{v}_{*} = (A^T-\beta I)w_*,\quad \beta =w_{*}^TA^Tw_*=w_{*}^TAw_*,\\&v_{*} =\widetilde{v}_*/\Vert \widetilde{v}_*\Vert . \end{aligned}$$

Moreover, (29)–(30) and (32) yield \(\Vert \widetilde{v}_*\Vert ^2=\Vert \widetilde{w}_*\Vert ^2=1-\alpha ^2\), so that

$$\begin{aligned} v_*&=T_{A^T}(T_{A}(v_{*}))= \frac{(A^T-\beta I)(A-\alpha I)v_*}{\Vert \widetilde{w}_*\Vert \, \Vert \widetilde{v}_*\Vert }=\frac{((1+\alpha \beta )I-\alpha A^T-\beta A)v_*}{1-\alpha ^2}. \end{aligned}$$

This yields the equation

$$\begin{aligned} (\alpha A^T+\beta A)v_*=(\alpha ^2+\alpha \beta )v_*. \end{aligned}$$
(34)

Since \(\alpha \) and \(\beta \) are real elements in F(A) with \(|\alpha |=|\beta |\), we must have \(\alpha =\beta \in (0,1)\). Then (34) implies that the limit vector \(v_*\) satisfies

$$\begin{aligned} \frac{1}{2} (A+A^T)v_*=\alpha v_*. \end{aligned}$$
(35)

Note that

$$\begin{aligned} \frac{1}{2} (A+A^T)={\textrm{diag}}(c_1 I_2,\dots , c_m I_2, [1]), \end{aligned}$$

where possibly the block [1] does not occur. Since \(c_1,\dots ,c_m\in (0,1)\) are pairwise distinct, we must have \(\alpha =c_j\) for exactly one index \(j\in \{1,\dots ,m\}\). Consequently, \(v_*=[0,\dots ,0,y^T,0,\dots ,0]^T\) for some unit norm vector \(y\in \mathbb {R}^2\) corresponding to the jth block, which shows that \(d(A,v_*)\le 2\). \(\square \)

For the given orthogonal matrix \(A=\textrm{diag}(G_1,\dots ,G_m,[1])\in \mathbb {R}^{n\times n}\), where possibly the block [1] does not occur, we will consider the corresponding block-partitioning of the vectors \(v_k\), i.e.,

$$\begin{aligned} v_k = \begin{bmatrix} v_k^{(1)}\\ \vdots \\ v_k^{(m)}\\ v_k^{(m+1)} \end{bmatrix},\quad v_k^{(j)}\in \mathbb {R}^2, \quad j=1,\dots ,m,\quad v_k^{(m+1)} \in \mathbb {R}, \end{aligned}$$
(36)

where possibly the block \(v_k^{(m+1)}\) does not occur.

Lemma 3

Let \(A=\textrm{diag}(G_1,\dots ,G_m,[1])\in \mathbb {R}^{n\times n}\), where possibly the block [1] does not occur, have blocks \(G_j\) as in (33) with \(0<c_1<\cdots <c_m\). If \(v_0\in \mathbb {R}^n\) is any unit norm initial vector with \(d(A,v_0)\ge 2\) and \(v_0^{(1)} \ne 0\), then any limit vector \(v_*\in \Sigma _*^A\) of the sequence \(\{v_k\}\) in (25)–(28) has zero entries except for the block \(v_*^{(1)}\), \(\Vert v_*^{(1)}\Vert =1\), and \(\alpha =\beta =c_1\).

Proof

The ACI(1) yields

$$\begin{aligned} v_{k+1}=\frac{(A^{T}-\beta _{k}I)(A-\alpha _{k}I)}{\Vert \widetilde{w}_{k}\Vert \,\Vert \widetilde{v}_{k+1}\Vert }\,v_{k}, \quad k=0,1,2,\dots . \end{aligned}$$

If the block [1] occurs, then for \(k=0,1,2,\dots \),

$$\begin{aligned} v_{k+1}^{(m+1)}=\frac{(1-\beta _k)(1-\alpha _k)}{\sqrt{1-\beta _{k}^{2}}\sqrt{1-\alpha _{k}^{2}}}v_k^{(m+1)} =\left( \frac{(1-\beta _k)(1-\alpha _k)}{(1+\beta _k)(1+\alpha _k)}\right) ^{1/2}v_k^{(m+1)}, \end{aligned}$$

where we have used (29)–(30). Since the factor that multiplies \(v_k^{(m+1)}\) is less than 1, we have \(v_k^{(m+1)}\rightarrow 0\) and hence \(v_*^{(m+1)}=0\).

The assertion is trivial if \(m=1\), so we can assume that \(m>1\). For the other blocks we have the equation

$$\begin{aligned} v_{k+1}^{(j)}= \frac{(G_{j}^{T}-\beta _{k}I)(G_{j}-\alpha _{k}I)}{\sqrt{1-\beta _{k}^{2}}\sqrt{1-\alpha _{k}^{2}}}\,v_{k}^{(j)}, \quad j=1,\dots ,m. \end{aligned}$$
(37)

Taking the squared norm in (37) and using the fact that

$$\begin{aligned} (G_{j}^{T}-\delta I)(G_{j}-\delta I)=(1+\delta ^{2}-2c_{j}\delta )I \end{aligned}$$

holds for any real \(\delta \), we obtain

$$\begin{aligned} \Vert v_{k+1}^{(j)}\Vert ^{2}&= \left( \frac{1+\beta _{k}^{2}-2\beta _{k}c_{j}}{1-\beta _{k}^{2}}\right) \left( \frac{1+\alpha _{k}^{2}-2\alpha _{k}c_{j}}{1-\alpha _{k}^{2}}\right) \Vert v_{k}^{(j)}\Vert ^{2}\\&=\left( 1+2\beta _{k}\frac{\beta _{k}-c_{j}}{1-\beta _{k}^{2}}\right) \left( 1+2\alpha _{k}\frac{\alpha _{k}-c_{j}}{1-\alpha _{k}^{2}}\right) \Vert v_{k}^{(j)}\Vert ^{2}. \end{aligned}$$

We will now prove by contradiction that \(\beta _{k}\rightarrow c_{1}\) and \(\alpha _{k}\rightarrow c_{1}\). Suppose that \(\alpha _{k}\rightarrow c_{\ell }\) and \(\beta _{k}\rightarrow c_{\ell }\) for some \(\ell >1\). Using (31) we know that

$$\begin{aligned} \alpha _{k}\ge \beta _{k} \ge \alpha _{k+1} \ge \beta _{k+1} \ge \dots \ge c_{\ell }, \end{aligned}$$

and hence, for \(j=1\),

$$\begin{aligned} \Vert v_{k+1}^{(1)}\Vert \ge \left( 1+2c_{\ell }\frac{c_{\ell }-c_{1}}{1-c_{\ell }^{2}} \right) \Vert v_{k}^{(1)}\Vert \ge \left( 1+2c_{\ell }\frac{c_{\ell }-c_{1}}{1-c_{\ell }^{2}} \right) ^{k+1} \Vert v_{0}^{(1)}\Vert . \end{aligned}$$

But since \(1+2c_{\ell }\frac{c_{\ell }-c_{1}}{1-c_{\ell }^{2}}>1\) and \(v_0^{(1)}\ne 0\), this implies \(\Vert v_{k}^{(1)}\Vert \rightarrow \infty \), in contradiction to the normalization of the vectors \(v_k\). Therefore, \(\beta _{k}\rightarrow c_{1}\) and \(\alpha _{k}\rightarrow c_{1}\), and Lemma 2 yields that \(v_*\) has the required form. \(\square \)

Our next goal is to show that under the assumptions of Lemma 3 there is only one uniquely determined vector in \(\Sigma _*^A\). In the following lemma we show that the norms of the blocks \(v_{k}^{(j)}\) for \(j>1\) converge to zero at least linearly.

Lemma 4

If A and \(v_0\) are as in Lemma 3, then there exist an index \(k_{0}\) and \(0<\varrho <1\), such that for all \(k\ge k_{0}\),

$$\begin{aligned} \Vert v_{k+1}^{(j)}\Vert \le \varrho \,\Vert v_{k}^{(j)}\Vert ,\quad j=2,\dots ,m+1. \end{aligned}$$

Proof

We know from the proof of Lemma 3 that

$$\begin{aligned} \Vert v_{k+1}^{(j)}\Vert ^{2} \,= \, \zeta _{k}^{(j)} \Vert v_{k}^{(j)}\Vert ^{2}, \quad j=1,\dots ,m+1, \end{aligned}$$

where

$$\begin{aligned} \zeta _{k}^{(j)} = \left( 1+2\beta _{k}\frac{\beta _{k}-c_{j}}{1-\beta _{k}^{2}}\right) \left( 1+2\alpha _{k}\frac{\alpha _{k}-c_{j}}{1-\alpha _{k}^{2}}\right) ,\quad j=1,\dots ,m, \end{aligned}$$
(38)

and

$$\begin{aligned} \zeta _{k}^{(m+1)} = \frac{(1-\beta _k)(1-\alpha _k)}{(1+\beta _k)(1+\alpha _k)}. \end{aligned}$$
(39)

Using that \(\alpha _{k}\ge \beta _{k}\ge \alpha _{k+1}\ge \beta _{k+1}\ge \dots \ge c_{1}\) we obtain

$$\begin{aligned} \begin{aligned} \zeta _k^{(j)} \rightarrow \left( 1+2c_{1}\frac{c_{1}-c_{j}}{1-c_{1}^{2}}\right) ^2, \;\; j=1,\dots ,m, \quad \text{ and }\quad \zeta _k^{(m+1)} \rightarrow \left( \frac{1-c_1}{1+c_1}\right) ^2, \end{aligned} \end{aligned}$$

where the limit values for \(j=2,\dots ,m+1\) are strictly less than 1. Consequently there exist \(k_{0}\ge 0\) and \(0<\varrho <1\) such that

$$\begin{aligned} \Vert v_{k+1}^{(j)}\Vert ^{2}\le \varrho ^{2}\Vert v_{k}^{(j)}\Vert ^{2},\quad j=2,\dots ,m+1, \end{aligned}$$

for all \(k\ge k_{0}\). \(\square \)

Since we are interested only in convergence of the sequence of vectors, we can assume for simplicity and without loss of generality that \(k_{0}=0\) in Lemma 4. For all \(k\ge 0\) we then have

$$\begin{aligned} 1= & {} \Vert v_{k+1}\Vert ^{2}=\Vert v_{k+1}^{(1)}\Vert ^{2}+\sum _{j=2}^{m+1}\Vert v_{k+1}^{(j)}\Vert ^{2} \le \Vert v_{k+1}^{(1)}\Vert ^{2}+\varrho ^{2}\sum _{j=2}^{m+1}\Vert v_{k}^{(j)}\Vert ^{2}\\\le & {} \Vert v_{k+1}^{(1)}\Vert ^{2}+\varrho ^{2(k+1)}\sum _{j=2}^{m+1}\Vert v_{0}^{(j)}\Vert ^{2}. \end{aligned}$$

Therefore,

$$\begin{aligned} 1-\Vert v_{k+1}^{(1)}\Vert ^{2}\le \varrho ^{2(k+1)}. \end{aligned}$$
(40)

The next result shows that the sequence \(\{v_{k}^{(1)}\}\subset {{\mathbb {R}}}^2\) converges.

Lemma 5

Let A and \(v_0\) be as in Lemma 3, and assume without loss of generality that \(k_0=0\) in Lemma 4. Then, for all \(k\ge 0\),

$$\begin{aligned} \Vert v_{k+1}^{(1)}-v_{k}^{(1)}\Vert \le \varrho ^{k}. \end{aligned}$$

Proof

We know that

$$\begin{aligned} \Vert v_{k+1}^{(1)}-v_{k}^{(1)}\Vert ^{2}=\Vert v_{k+1}^{(1)}\Vert ^{2}+\Vert v_{k}^{(1)}\Vert ^{2} -2\left( v_{k+1}^{(1)}\right) ^{T}\left( v_{k}^{(1)}\right) , \end{aligned}$$

where

$$\begin{aligned} \left( v_{k+1}^{(1)}\right) ^{T}\left( v_{k}^{(1)}\right)= & {} \left( v_{k}^{(1)}\right) ^{T}\frac{(G_{1}^{T}-\beta _{k}I)(G_{1} -\alpha _{k}I)}{\sqrt{1-\alpha _{k}^{2}}\sqrt{1-\beta _{k}^{2}}}\,v_{k}^{(1)}\\= & {} \frac{1-c_{1}\left( \alpha _{k}+\beta _{k}\right) +\alpha _{k}\beta _{k}}{\sqrt{1-\alpha _{k}^{2}}\sqrt{1-\beta _{k}^{2}}}\left\| v_{k}^{(1)}\right\| ^{2}\\\ge & {} \frac{1-\beta _k\left( \alpha _{k}+\beta _{k}\right) +\alpha _{k}\beta _{k}}{\sqrt{1-\alpha _{k}^{2}}\sqrt{1-\beta _{k}^{2}}}\left\| v_{k}^{(1)}\right\| ^{2} \ge \left\| v_{k}^{(1)}\right\| ^{2}. \end{aligned}$$

Note that the next to last inequality follows from \(\beta _k \ge c_1 > 0\), and the last one from \(\alpha _{k}\ge \beta _{k}\), since \(1-\beta _{k}(\alpha _{k}+\beta _{k})+\alpha _{k}\beta _{k}=1-\beta _{k}^{2}\) and \(\sqrt{1-\beta _{k}^{2}}\ge \sqrt{1-\alpha _{k}^{2}}\). Therefore

$$\begin{aligned} \Vert v_{k+1}^{(1)}-v_{k}^{(1)}\Vert ^{2}\le & {} \Vert v_{k+1}^{(1)}\Vert ^{2}-\Vert v_{k}^{(1)}\Vert ^{2} \le 1-\Vert v_{k}^{(1)}\Vert ^{2} \le \varrho ^{2k}, \end{aligned}$$

where we have used (40). \(\square \)

Now we are ready to prove the convergence theorem.

Theorem 7

Let \(A=\textrm{diag}(G_{1},\dots ,G_{m},[1])\in {\mathbb {R}}^{n\times n}\), \(m\ge 1\), be an orthogonal matrix with blocks \(G_{j}\) as in (33), and \(0<c_{1}<c_{2}<\cdots<c_{m}<1\), where possibly the block [1] does not occur. Let \(v_{0}\in {\mathbb {R}}^{n}\) be any unit norm initial vector with \(d(A,v_{0})\ge 2\) such that \(v_{0}^{(1)}\ne 0\). Then the sequence \(\{v_{k}\}\) in (25)–(28) converges to a single limit vector.

Proof

Using Lemmas 4 and 5, and assuming without loss of generality that \(k_0=0\) in Lemma 4, we obtain for all \(k\ge 0\),

$$\begin{aligned} \Vert v_{k+1}-v_{k}\Vert ^{2}= & {} \Vert v_{k+1}^{(1)}-v_{k}^{(1)}\Vert ^{2}+\sum _{j=2}^{m+1}\Vert v_{k+1}^{(j)}-v_{k}^{(j)}\Vert ^{2}\\\le & {} \varrho ^{2k}+2\varrho ^{2k}\sum _{j=2}^{m+1}\Vert v_{0}^{(j)}\Vert ^{2} \,\le \, 3\varrho ^{2k} \end{aligned}$$

where for \(j\ge 2\) we have used that \(\big (v_{k+1}^{(j)}\big )^{T}v_{k}^{(j)}\ge 0\) (by the same computation as in the proof of Lemma 5, with \(c_{1}\) replaced by \(c_{j}\), and \(1-c_{j}(\alpha _{k}+\beta _{k})+\alpha _{k}\beta _{k}>0\)), so that \(\Vert v_{k+1}^{(j)}-v_{k}^{(j)}\Vert ^{2}\le \Vert v_{k+1}^{(j)}\Vert ^{2}+\Vert v_{k}^{(j)}\Vert ^{2}\le 2\varrho ^{2k}\Vert v_{0}^{(j)}\Vert ^{2}\). This implies \(\sum _{k=0}^{\infty }\Vert v_{k+1}-v_{k}\Vert <\infty \), so that \(\{v_{k}\}\) is a Cauchy sequence, which finishes the proof. \(\square \)

As a consequence, under the assumptions of Lemma 3 and Theorem 7, the sequences \(\{v_k\}\) and \(\{w_k\}\) converge to single limit vectors \(v_*\) and \(w_*\), respectively, which are linear combinations of two eigenvectors corresponding to the eigenvalues \(c_1 \pm {\textbf {i}}\,s_1\). Recall that in the symmetric case, the limit vectors are linear combinations of the eigenvectors corresponding to the smallest and the largest eigenvalue of A.
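The behavior established in Lemmas 3–5 and Theorem 7 can be observed numerically. Equations (25)–(28) and (33) are not restated in this section, so the following Python/numpy sketch relies on two assumptions: that (25)–(28) consist of one normalized Arnoldi step with A followed by one with \(A^T\), i.e., \(\alpha _k=v_k^TAv_k\), \(w_k\) the normalization of \((A-\alpha _kI)v_k\), \(\beta _k=w_k^TAw_k\), and \(v_{k+1}\) the normalization of \((A^T-\beta _kI)w_k\) (as suggested by Example 3 below and by the computation in the proof of Lemma 5), and that the blocks \(G_j\) in (33) are plane rotations with cosine \(c_j\). The values \(c_1=0.3\) and \(c_2=0.7\) are hypothetical.

```python
import numpy as np

def rot(c):
    """2x2 orthogonal block, assumed to be the form of G_j in (33)."""
    s = np.sqrt(1.0 - c**2)
    return np.array([[c, s], [-s, c]])

# Block-diagonal orthogonal matrix with 0 < c_1 < c_2 < 1, so 0 is not in F(A).
c1, c2 = 0.3, 0.7
A = np.block([[rot(c1), np.zeros((2, 2))],
              [np.zeros((2, 2)), rot(c2)]])

rng = np.random.default_rng(0)
v = rng.standard_normal(4)
v /= np.linalg.norm(v)          # random unit initial vector, v^(1) != 0

for k in range(200):
    alpha = v @ A @ v           # Rayleigh quotient alpha_k
    w = A @ v - alpha * v       # one Arnoldi step with A, then normalize
    w /= np.linalg.norm(w)
    beta = w @ A @ w            # Rayleigh quotient beta_k
    v = A.T @ w - beta * w      # one Arnoldi step with A^T, then normalize
    v /= np.linalg.norm(v)

print(alpha, beta)              # both approach c_1 = 0.3 (Lemma 3)
print(np.round(v, 8))           # the second block of v_k vanishes (Lemma 4)
```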

We now consider an orthogonal matrix \(A\in \mathbb {R}^{n\times n}\) with \(d(A)=n\) and \(0\in F(A)\). Analogously to Lemma 3 we will show that (with appropriate assumptions on the initial vector \(v_0\)) the sequences of the Rayleigh quotients \(\{\alpha _k\}\) and \(\{\beta _k\}\) in (25)–(28) converge to the distance of \(F(A)\) from the origin, \(\min _{z\in F(A)} |z|\). Under the assumption \(0\in F(A)\) this means that \(\alpha =\beta =0\) in (32).

We can again assume without loss of generality that A is block diagonal,

$$\begin{aligned} A=\textrm{diag}([-1],G_{1},\dots ,G_{m},[1])\in {\mathbb {R}}^{n\times n},\quad m\ge 1, \end{aligned}$$

with blocks \(G_{j}\) as in (33), and \(-1<c_{1}<c_{2}<\cdots<c_{m}<1\), where the blocks \([-1]\) or [1] possibly do not occur. Let us also formally define \(c_0 = -1\) and \(c_{m+1}=1\). We will consider a block-partitioning of the vectors \(v_{k}\) as in (36), where we add the block \(v_{k}^{(0)}\) if the block \([-1]\) occurs.

Lemma 6

Let \(A=\textrm{diag}([-1],G_{1},\dots ,G_{m},[1])\in {\mathbb {R}}^{n\times n}\), \(m\ge 1\), be an orthogonal matrix with blocks \(G_{j}\) as in (33), and \(-1=c_0<c_{1}<c_{2}<\cdots<c_{m}<1=c_{m+1}\), where possibly the blocks \([-1]\) or [1] do not occur. Let \(v_{0}\in {\mathbb {R}}^{n}\) be a unit norm initial vector with \(d(A,v_{0})\ge 2\). Suppose that \(v_{0}^{(\ell )}\ne 0\) for some \(\ell \), \(1\le \ell \le m\), and that \(v_{0}^{(j)}\ne 0\) for some j, \(0\le j\le m+1\), such that \(c_{\ell }c_j \le 0\). Then \(\alpha =\beta =0\) in (32).

Proof

From (31) and (32) we know that \(|\alpha _{k}|\ge |\beta _{k}|\) and that \(|\alpha _{k}|\rightarrow \gamma \) and \(|\beta _{k}|\rightarrow \gamma \). Since \(d(A,v_{0})\ge 2\) and \(v_{0}^{(\ell )}\ne 0\) for some \(1\le \ell \le m\), we have \(|\alpha _{0}|<1\), and therefore \(0\le \gamma <1\). We will prove by contradiction that \(\gamma =0\). Suppose, on the contrary, that \(\gamma >0\), i.e., \(\alpha _{k}\rightarrow \alpha \) and \(\beta _{k}\rightarrow \beta \), where \(1>|\alpha |=|\beta |=\gamma >0\).

Suppose first that \(\alpha =-\beta \ne 0\), and consider a block for which \(v_{0}^{(\ell )}\ne 0\), \(1\le \ell \le m\). Then (see (38))

$$\begin{aligned} \zeta _k^{(\ell )} \rightarrow \left( 1+2\alpha \frac{\alpha +c_{\ell }}{1-\alpha ^{2}}\right) \left( 1+2\alpha \frac{\alpha -c_{\ell }}{1-\alpha ^{2}}\right) = 1+\frac{4\alpha ^{2}}{1-\alpha ^{2}}\left( \frac{1-c_{\ell }^{2}}{1-\alpha ^{2}}\right) >1. \end{aligned}$$

Therefore \(\Vert v_{k}^{(\ell )}\Vert \rightarrow \infty \), which is a contradiction.

Suppose now that \(1>\alpha =\beta >0\), and consider a block with \(c_j\le 0\) such that \(v_{0}^{(j)}\ne 0\), \(0\le j \le m\). If \(j=0\), then

$$\begin{aligned} \zeta _k^{(0)} \rightarrow \frac{(1+\alpha )^{2}(1+\alpha )^{2}}{\left( 1-\alpha ^{2}\right) \left( 1-\alpha ^{2}\right) }>1, \end{aligned}$$

and if \(j>0\), then

$$\begin{aligned} \zeta _k^{(j)} \rightarrow \left( 1+2\frac{\alpha ^{2}-\alpha c_{j}}{1-\alpha ^{2}}\right) ^{2}>1 \end{aligned}$$

giving a contradiction. Note that if \(-1<\beta =\alpha <0\), then we derive a contradiction by considering a block with \(c_j\ge 0\) such that \(v_{0}^{(j)}\ne 0\). \(\square \)
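Under the assumptions of Lemma 6 this limiting behavior can also be observed numerically. The sketch below uses the same assumed form of (25)–(28) and of the blocks in (33) as above, now with hypothetical values \(c_1=-0.4<0<c_2=0.6\), so that \(0\in F(A)\); the Rayleigh quotients then decrease in magnitude towards 0.

```python
import numpy as np

def rot(c):
    s = np.sqrt(1.0 - c**2)
    return np.array([[c, s], [-s, c]])

# Hypothetical blocks with c_1 < 0 < c_2, so that 0 lies in F(A).
A = np.block([[rot(-0.4), np.zeros((2, 2))],
              [np.zeros((2, 2)), rot(0.6)]])

rng = np.random.default_rng(1)
v = rng.standard_normal(4)
v /= np.linalg.norm(v)

for k in range(500):
    alpha = v @ A @ v
    w = A @ v - alpha * v
    w /= np.linalg.norm(w)
    beta = w @ A @ w
    v = A.T @ w - beta * w
    v /= np.linalg.norm(v)
    if k in (0, 10, 100, 499):
        print(k, abs(alpha), abs(beta))   # |alpha_k| and |beta_k| decrease towards 0
```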

In order to prove that the sequence \(\{v_{k}\}\) converges to a single limit vector also when \(0\in F(A)\), it would be sufficient to show that there exists \(0<\varrho <1\) such that

$$\begin{aligned} {\Vert v_{k+1}-v_{k}\Vert =\mathcal {O}\left( \varrho ^{k}\right) } \end{aligned}$$

for k sufficiently large; cf. the proof of Theorem 7. With some effort we can show that there is a constant \(C>0\) such that

$$\begin{aligned} \Vert v_{k+1}-v_{k}\Vert \le C |\alpha _{k}|\end{aligned}$$

for k sufficiently large. It remains to prove that the coefficients \(\alpha _{k}\) converge to zero linearly, but this step needs further investigation.

We will now show by an example that for an orthogonal matrix A with \(0\in F(A)\) a limit vector \(v_*\) of the sequence \(\{v_{k}\}\) in (25)–(28) may satisfy \(d(A,v_*)>2\). This is a significant difference to the case \(0\notin F(A)\), where \(d(A,v_*)=2\) holds for any limit vector \(v_*\); see Lemma 2.

Example 3

Let \(A=\textrm{diag}(G_1,G_2)\in \mathbb {R}^{4\times 4}\) with blocks \(G_{j}\) as in (33), where \(c_1\in (-1,0)\) and \(c_2=-c_1\), so that in particular \(0\in F(A)\). Let \(v_0\in \mathbb {R}^4\) be any unit norm vector with \(v_0^{(1)}=v_0^{(2)}\). Then \(d(A)=d(A,v_0)=4\), and in (25)–(28) we get \(\alpha _0=v_0^TAv_0=0\), and hence

$$\begin{aligned} \widetilde{w}_0&=Av_0,\quad \Vert \widetilde{w}_0\Vert =1,\quad w_0=\widetilde{w}_0=Av_0,\quad \beta _0 =w_0^TAw_0=v_0^TAv_0=0,\\ \widetilde{v}_1&=A^Tw_0=v_0,\quad v_1=v_0. \end{aligned}$$

Consequently, \(v_k=v_0\) and \(w_k=Av_0\) for all \(k\ge 0\). The unique limit point of the sequence \(\{v_k\}\) is given by \(v_*=v_0\), and \(d(A,v_*)=4\).

The observation in this example that a limit vector \(v_*\) satisfies \(d(A,v_*)=n\) is not surprising if we look at the proof of Lemma 6: Since \(\alpha =0\), we have

$$\begin{aligned} \zeta _{k}^{(j)}\rightarrow 1,\quad j=0,1,\dots ,m+1, \end{aligned}$$

which would indicate that \(v_*^{(j)}\ne 0\) if \(v_{0}^{(j)}\ne 0\), and even \(d(A,v_*)=d(A,v_0)\). However, we did not prove that this holds in general.
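Example 3 can be reproduced with the same assumed form of the iteration and of the blocks in (33); the value \(c_1=-0.4\) below is hypothetical and chosen only for illustration.

```python
import numpy as np

def rot(c):
    s = np.sqrt(1.0 - c**2)
    return np.array([[c, s], [-s, c]])

c1 = -0.4                                  # hypothetical; c_2 = -c_1, so 0 is in F(A)
A = np.block([[rot(c1), np.zeros((2, 2))],
              [np.zeros((2, 2)), rot(-c1)]])

b = np.array([1.0, 2.0])
v = np.concatenate([b, b])                 # equal blocks v_0^(1) = v_0^(2)
v /= np.linalg.norm(v)
v0 = v.copy()

for k in range(10):
    alpha = v @ A @ v                      # stays 0 (up to rounding)
    w = A @ v - alpha * v
    w /= np.linalg.norm(w)
    beta = w @ A @ w                       # stays 0 (up to rounding)
    v = A.T @ w - beta * w
    v /= np.linalg.norm(v)

print(np.allclose(v, v0))                  # True: the iteration stagnates at v_0
```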

Finally, recall that the ideal Arnoldi problem for a matrix \(A\in \mathbb {R}^{n\times n}\) and \(s=1\) is given by

$$\begin{aligned} \min _{\alpha \in \mathbb {R}}\Vert A-\alpha I\Vert ; \end{aligned}$$

see (12). A straightforward computation shows that

$$\begin{aligned} \min _{\alpha \in \mathbb {R}}\Vert A-\alpha I\Vert =\max _{\begin{array}{c} v\in \mathbb {R}^n\\ \Vert v\Vert =1 \end{array}} (\Vert Av\Vert ^2-\langle v,Av\rangle ^2)^{1/2}. \end{aligned}$$

If \(A\in \mathbb {R}^{n\times n}\) is normal and has the eigenvalues \(\lambda _1,\dots ,\lambda _n\in {{\mathbb {C}}}\), then

$$\begin{aligned} \min _{\alpha \in \mathbb {R}}\Vert A-\alpha I\Vert =\min _{\alpha \in \mathbb {R}}\max _{1\le j\le n} |\lambda _j-\alpha |, \end{aligned}$$

and the unique solution of this problem is given by the center of the closed disk of smallest radius in the complex plane that contains \(\lambda _1,\dots ,\lambda _n\); see [5, Section 2.4].

If A and \(v_0\) are as in Lemma 3, then \(c_1\) is the center of that disk, and for the unique limit vector \(v_*\) we have

$$\begin{aligned} \Vert Av_*\Vert ^2-\langle v_*,Av_*\rangle ^2=\Vert G_1v_*^{(1)}\Vert ^2-\langle v_*^{(1)},G_1v_*^{(1)}\rangle ^2=1-c_1^2, \end{aligned}$$

so that

$$\begin{aligned} \min _{\alpha \in \mathbb {R}}\Vert A-\alpha I\Vert =\Vert A-c_1 I\Vert =\Vert Av_*-c_1v_*\Vert =(1-c_1^2)^{1/2}. \end{aligned}$$

On the other hand, if A and \(v_0\) are as in Lemma 6, then 0 is the center of that disk. Any \(v_*\in \Sigma _*^A\) then satisfies \(v_*^TAv_*=0\), which gives \(\Vert Av_*\Vert ^2-\langle v_*,Av_*\rangle ^2=1\), and \(\min _{\alpha \in \mathbb {R}}\Vert A-\alpha I\Vert =\Vert A\Vert =\Vert Av_*\Vert =1\).
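As a sanity check of these identities, the following Python/numpy sketch (with the same assumed block form of (33) and the hypothetical values \(c_1=0.3\), \(c_2=0.7\) used above) approximates \(\min _{\alpha \in \mathbb {R}}\Vert A-\alpha I\Vert \) by a grid search and compares it with \((1-c_1^2)^{1/2}\) and with the right-hand side of the variational characterization evaluated at a unit vector supported on the first block.

```python
import numpy as np

def rot(c):
    s = np.sqrt(1.0 - c**2)
    return np.array([[c, s], [-s, c]])

c1, c2 = 0.3, 0.7                          # hypothetical values
A = np.block([[rot(c1), np.zeros((2, 2))],
              [np.zeros((2, 2)), rot(c2)]])
I = np.eye(4)

# Grid search for the minimum over real alpha of the spectral norm of A - alpha*I.
alphas = np.linspace(-1.0, 1.0, 2001)
norms = [np.linalg.norm(A - a * I, 2) for a in alphas]
i = int(np.argmin(norms))
print(alphas[i], norms[i])                 # approx. c_1 = 0.3 and sqrt(1 - c_1^2) = 0.9539...

# Right-hand side of the variational characterization at a unit vector in block 1.
v = np.array([1.0, 0.0, 0.0, 0.0])
print(np.sqrt(np.linalg.norm(A @ v)**2 - (v @ A @ v)**2))   # also sqrt(1 - c_1^2)
```

The same grid search for the matrix of Example 3 (where \(0\in F(A)\)) yields the minimizer \(\alpha \approx 0\) and the minimal value 1, in agreement with the discussion above.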

In short, any \(v_*\in \Sigma _*^A\) solves the ideal Arnoldi problem for an orthogonal matrix \(A\in \mathbb {R}^{n\times n}\) and \(s=1\) (when \(v_0\) satisfies the appropriate assumptions). Note that this observation gives no insight into the uniqueness of the limit vectors.

5 Concluding remarks

In this paper we have introduced and analyzed the ACI(s) in order to pose a generalized version of the Forsythe conjecture that applies to symmetric and nonsymmetric matrices. We were able to prove the existence of a single limit vector of the sequence \(\{v_{2k}\}\) in (22)–(23) (the case \(s=1\) and \(A=A^T\)), and of the sequence \(\{v_k\}\) in (25)–(28) (the case \(s=1\) and orthogonal A) when \(0\notin F(A)\). Our uniqueness proof for \(s=1\) and \(A=A^T\) is much simpler than the previously published proofs.

In the case \(s=1\), the property of monotonically increasing norms in Theorem 2, and its proof based on orthogonality and the Cauchy-Schwarz inequality, appear to be closely related to the property of monotonically decreasing residual norms in the Rayleigh quotient iteration (for symmetric matrices) as well as Ostrowski’s two-sided iteration and Parlett’s alternating Rayleigh quotient iteration (for general matrices). The monotonic residual norms ultimately yield the global convergence of these iterations; see, e.g., [15]. Working out the exact relations between the iterations studied in the context of the Forsythe conjecture and the Rayleigh quotient based iterations remains a subject of further research.

Also, the Forsythe conjecture and its generalized version for the ACI(s) still remain largely open.