Abstract
Reduced Rank Extrapolation (RRE) is a polynomial-type method used to accelerate the convergence of sequences of vectors \(\{\boldsymbol{x}_{m}\}\). It has been applied successfully in different disciplines of science and engineering to the solution of large and sparse systems of linear and nonlinear equations. If s is the solution to the system of equations x = f(x), first, a vector sequence \(\{\boldsymbol{x}_{m}\}\) is generated via the fixed-point iterative scheme \(\boldsymbol{x}_{m+1}=\boldsymbol{f}(\boldsymbol{x}_{m})\), m = 0, 1,…, and next, RRE is applied to this sequence to accelerate its convergence. RRE produces approximations \(\boldsymbol{s}_{n,k}\) to s that are of the form \(\boldsymbol{s}_{n,k}={\sum}_{i=0}^{k} \gamma_{i} \boldsymbol{x}_{n+i}\) for some scalars \(\gamma_{i}\) depending (nonlinearly) on \(\boldsymbol{x}_{n},\boldsymbol{x}_{n+1},\ldots,\boldsymbol{x}_{n+k+1}\) and satisfying \({\sum}_{i=0}^{k} \gamma_{i}=1\). The convergence properties of RRE applied in conjunction with linear f(x) have been analyzed in different publications. In this work, we discuss the convergence of the \(\boldsymbol{s}_{n,k}\) obtained from RRE with nonlinear f(x) (i) when \(n\to \infty\) with k fixed, and (ii) in two so-called cycling modes.
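As a concrete illustration of the definition above, the following is a minimal NumPy sketch of RRE (not the author's implementation; the function name `rre` and the Lagrange-multiplier solver are illustrative choices). The \(\gamma_{i}\) are obtained by minimizing \(\|{\sum}_{i=0}^{k}\gamma_{i}\boldsymbol{u}_{n+i}\|_{2}\) subject to \({\sum}_{i=0}^{k}\gamma_{i}=1\), where \(\boldsymbol{u}_{m}=\boldsymbol{x}_{m+1}-\boldsymbol{x}_{m}\):

```python
import numpy as np

def rre(X):
    """Reduced Rank Extrapolation (illustrative sketch).

    X has columns x_n, x_{n+1}, ..., x_{n+k+1}.  Returns
    s_{n,k} = sum_{i=0}^{k} gamma_i x_{n+i}, where the gamma_i minimize
    ||sum_i gamma_i u_{n+i}||_2 subject to sum_i gamma_i = 1, with
    u_m = x_{m+1} - x_m the first differences.
    """
    U = np.diff(X, axis=1)                 # columns u_n, ..., u_{n+k}
    k1 = U.shape[1]                        # k + 1
    # Lagrange (KKT) system for the equality-constrained least squares:
    #   (U^* U) gamma + lam * e = 0,   e^T gamma = 1.
    K = np.zeros((k1 + 1, k1 + 1))
    K[:k1, :k1] = U.conj().T @ U
    K[:k1, k1] = 1.0
    K[k1, :k1] = 1.0
    rhs = np.zeros(k1 + 1)
    rhs[k1] = 1.0
    gamma = np.linalg.lstsq(K, rhs, rcond=None)[0][:k1]
    return X[:, :k1] @ gamma
```

For linear f(x) = Tx + d with I − T nonsingular, this reproduces s exactly once k reaches the degree of the relevant minimal polynomial, which is the linear-case behavior analyzed in the publications cited above.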
Notes
It is clear that the integers n and k are chosen by the user and that M is determined by n, k, and the extrapolation method being used.
The approaches of [18] and [21] to RRE are almost identical, in the sense that \(\boldsymbol {s}_{n,k}={\sum }^{k}_{i=0}\gamma _{i} \boldsymbol {x}_{n+i}\) in [21], while \(\boldsymbol {s}_{n,k}={\sum }^{k}_{i=0}\gamma _{i} \boldsymbol {x}_{n+i+1}\) in [18], the γi being the same for both. The approaches of [11] and [21] are completely different, however; their equivalence was proved in the review paper of Smith, Ford, and Sidi [39].
Note that M = n + k + 1 for MPE, RRE, MMPE, and SVD-MPE, while M = n + 2k for SEA, VEA, and TEA.
Given a nonzero vector \(\boldsymbol {u}\in \mathbb {C}^{N}\), the monic polynomial P(λ) is said to be the minimal polynomial of the matrix \(\boldsymbol {T}\in \mathbb {C}^{N\times N}\) with respect to u if P(T)u = 0 and P(λ) is of smallest degree with this property.
The polynomial P(λ) exists and is unique. Moreover, if P1(T)u = 0 for some polynomial P1(λ), then P(λ) divides P1(λ). In particular, P(λ) divides the minimal polynomial of T, which in turn divides the characteristic polynomial of T. [Thus, the degree of P(λ) is at most N and its zeros are some or all of the eigenvalues of T.]
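As a hedged numerical illustration (not taken from the paper), the degree of P(λ) can be read off as the rank of the Krylov matrix \([\boldsymbol{u}, \boldsymbol{T}\boldsymbol{u}, \boldsymbol{T}^{2}\boldsymbol{u},\ldots]\), since that rank is the first d for which \(\boldsymbol{T}^{d}\boldsymbol{u}\) depends linearly on the earlier powers:

```python
import numpy as np

# T has characteristic polynomial (lam - 2)^2 (lam - 3), but the minimal
# polynomial of T with respect to u = (1, 1, 1)^T is (lam - 2)(lam - 3):
# its degree equals the rank of the Krylov matrix [u, Tu, ..., T^N u].
T = np.diag([2.0, 2.0, 3.0])
u = np.ones(3)
K = np.column_stack([np.linalg.matrix_power(T, j) @ u for j in range(4)])
d = np.linalg.matrix_rank(K)   # d = 2 here: less than N = 3
```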
It is clear that to apply any of the extrapolation methods in this mode, one needs to know the matrix F(s), for which one also needs to know the solution s.
Note that k is not necessarily fixed in this mode of cycling; it may vary from one cycle to the next. It always satisfies k ≤ N, however.
Quadratic convergence is relevant only when f(x) is nonlinear. When f(x) is linear, that is, f(x) = Tx + d, where T is a fixed N × N matrix and d is a fixed vector, hence F(s) = T, the solution s is already obtained at the end of step MC2 of the first cycle; that is, \(\boldsymbol{s}^{(1)}=\boldsymbol{s}\). Therefore, there is nothing to analyze when f(x) is linear.
See also Sidi and Shapira [37] concerning a modified version of restarted GMRES with prior Richardson iterations, which is very closely related to RRE.
Recall that, for any matrix K with rank(K) = r, we have \(\|\boldsymbol{K}\|_{2}\le\|\boldsymbol{K}\|_{F}\le\sqrt{r}\,\|\boldsymbol{K}\|_{2}\). See Golub and Van Loan [13].
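This pair of inequalities is easy to verify numerically; the following sketch (illustrative, with an arbitrarily chosen random matrix of rank 3) checks it:

```python
import numpy as np

rng = np.random.default_rng(1)
r = 3
# Product of a 6x3 and a 3x5 factor gives a 6x5 matrix of rank r = 3.
K = rng.standard_normal((6, r)) @ rng.standard_normal((r, 5))
n2 = np.linalg.norm(K, 2)        # spectral norm ||K||_2
nF = np.linalg.norm(K, 'fro')    # Frobenius norm ||K||_F
assert n2 <= nF <= np.sqrt(r) * n2
```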
Clearly, \(g(z)=z^{k}\) is in \(\tilde{\mathcal{P}}_{k}\), and \(\theta_{k}<1\) since L < 1. Next, in general, the polynomial g(z) that gives the optimum in (5.4) is different from \(z^{k}\); thus, generally speaking, \(\theta_{k}<L^{k}\).
For the linear system \(\boldsymbol {x}=\tilde {\boldsymbol {f}}(\boldsymbol {x})\), we have \(\boldsymbol {\epsilon }_{n+1}=\tilde {\boldsymbol {F}}\boldsymbol {\epsilon }_{n}\), n = 0, 1,…, which is a power iteration. Thus, in some cases, \(\boldsymbol {e}_{\infty }=\lim _{n\to \infty }\boldsymbol {e}_{n}\) exists and is an eigenvector of \(\tilde {\boldsymbol {F}}\), which forces \(\text {rank}(\boldsymbol {S}(\boldsymbol {e}_{\infty }))\le 1\). Clearly, this is a problem when rank(S(en)) = k > 1 for n = 0, 1,….
References
Anderson, D.G.: Iterative procedures for nonlinear integral equations. J. ACM 12, 547–560 (1965)
Ben-Israel, A.: On error bounds for generalized inverses. SIAM J. Numer. Anal. 3, 585–592 (1966)
Ben-Israel, A., Greville, T.N.E.: Generalized Inverses: Theory and Applications. CMS Books in Mathematics, 2nd edn. Springer, New York (2003)
Brezinski, C.: Application de l’𝜖-algorithme à la résolution des systèmes non linéaires. C. R. Acad. Sci. Paris 271 A, 1174–1177 (1970)
Brezinski, C.: Sur un algorithme de résolution des systèmes non linéaires. C. R. Acad. Sci. Paris 272 A, 145–148 (1971)
Brezinski, C.: Généralisations de la transformation de Shanks, de la table de Padé, et de l’𝜖-algorithme. Calcolo 12, 317–360 (1975)
Brezinski, C.: Accélération de la Convergence en Analyse Numérique. Number 584 in Lecture Notes in Mathematics, Springer, Berlin (1977)
Brezinski, C., Redivo Zaglia, M.: Extrapolation Methods: Theory and Practice. North-Holland, Amsterdam (1991)
Cabay, S., Jackson, L.W.: A polynomial extrapolation method for finding limits and antilimits of vector sequences. SIAM J. Numer. Anal. 13, 734–752 (1976)
Campbell, S.L., Meyer, C.D. Jr.: Generalized Inverses of Linear Transformations. Dover, New York (1991)
Eddy, R.P.: Extrapolating to the limit of a vector sequence. In: Wang, P.C.C. (ed.) Information Linkage Between Applied Mathematics and Industry, pp 387–396. Academic Press, New York (1979)
Gekeler, E.: On the solution of systems of equations by the epsilon algorithm of Wynn. Math. Comp. 26, 427–436 (1972)
Golub, G.H., Van Loan, C.F.: Matrix Computations, 4th edn. The Johns Hopkins University Press, Baltimore (2013)
Graves-Morris, P.R., Saff, E.B.: Row convergence theorems for generalised inverse vector-valued Padé approximants. J. Comp. Appl. Math. 23, 63–85 (1988)
Jbilou, K., Sadok, H.: Some results about vector extrapolation methods and related fixed-point iterations. J. Comp. Appl. Math. 36, 385–398 (1991)
Jbilou, K., Sadok, H.: Analysis of some vector extrapolation methods for linear systems. Numer. Math. 70, 73–89 (1995)
Jbilou, K., Sadok, H.: LU-implementation of the modified minimal polynomial extrapolation method. IMA J. Numer. Anal. 19, 549–561 (1999)
Kaniel, S., Stein, J.: Least-square acceleration of iterative methods for linear equations. J. Optim. Theory Appl. 14, 431–437 (1974)
Laurens, J., Le Ferrand, H.: Fonctions d’itérations vectorielles, itérations rationelles. C. R. Acad. Sci. Paris 321 I, 631–636 (1995)
Le Ferrand, H.: Convergence of the topological 𝜖-algorithm for solving systems of nonlinear equations. Numer. Algorithms 3, 273–283 (1992)
Mešina, M.: Convergence acceleration for the iterative solution of the equations X = AX + f. Comput. Methods Appl. Mech. Engrg. 10, 165–173 (1977)
Ortega, J., Rheinboldt, W.: Iterative Solution of Nonlinear Equations in Several Variables. Academic Press, New York (1970)
Pugachev, B.P.: Acceleration of the convergence of iterative processes and a method of solving systems of nonlinear equations. U.S.S.R. Comput. Math. Math. Phys. 17, 199–207 (1978)
Shanks, D.: Nonlinear transformations of divergent and slowly convergent sequences. J. Math. and Phys. 34, 1–42 (1955)
Sidi, A.: Convergence and stability properties of minimal polynomial and reduced rank extrapolation algorithms. SIAM J. Numer. Anal. 23, 197–209 (1986). Originally appeared as NASA TM-83443 (1983)
Sidi, A.: Extrapolation vs. projection methods for linear systems of equations. J. Comp. Appl. Math. 22, 71–88 (1988)
Sidi, A.: Efficient implementation of minimal polynomial and reduced rank extrapolation methods. J. Comp. Appl. Math. 36, 305–337 (1991). Originally appeared as NASA TM-103240 ICOMP-90-20 (1990)
Sidi, A.: Convergence of intermediate rows of minimal polynomial and reduced rank extrapolation tables. Numer. Algorithms 6, 229–244 (1994)
Sidi, A.: Extension and completion of Wynn’s theory on convergence of columns of the epsilon table. J Approx. Theory 86, 21–40 (1996)
Sidi, A.: Review of two vector extrapolation methods of polynomial type with applications to large-scale problems. J. Comput. Sci. 3, 92–101 (2012)
Sidi, A.: SVD-MPE: An SVD-based vector extrapolation method of polynomial type. Appl. Math. 7, 1260–1278 (2016). Special issue on Applied Iterative Methods
Sidi, A.: Minimal polynomial and reduced rank extrapolation methods are related. Adv. Comput. Math. 43, 151–170 (2017)
Sidi, A.: Vector Extrapolation Methods with Applications. Number 17 in SIAM Series on Computational Science and Engineering. SIAM, Philadelphia (2017)
Sidi, A., Bridger, J.: Convergence and stability analyses for some vector extrapolation methods in the presence of defective iteration matrices. J. Comp. Appl. Math. 22, 35–61 (1988)
Sidi, A., Ford, W.F., Smith, D.A.: Acceleration of convergence of vector sequences. SIAM J. Numer. Anal. 23, 178–196 (1986). Originally appeared as NASA TP-2193 (1983)
Sidi, A., Shapira, Y.: Upper bounds for convergence rates of vector extrapolation methods on linear systems with initial iterations. Technical Report 701, Computer Science Dept., Technion–Israel Institute of Technology, 1991. Appeared also as NASA TM-105608 ICOMP-92-09 (1992)
Sidi, A., Shapira, Y.: Upper bounds for convergence rates of acceleration methods with initial iterations. Numer. Algorithms 18, 113–132 (1998)
Skelboe, S.: Computation of the periodic steady-state response of nonlinear networks by extrapolation methods. IEEE Trans. Circuits Syst. 27, 161–175 (1980)
Smith, D.A., Ford, W.F., Sidi, A.: Extrapolation methods for vector sequences. SIAM Rev. 29, 199–233 (1987). Erratum: SIAM Rev. 30, 623–634 (1988)
Stewart, G.W.: On the continuity of the generalized inverse. SIAM J. Appl. Math. 17, 33–45 (1969)
Toth, A., Kelley, C.T.: Convergence analysis for Anderson acceleration. SIAM J. Numer. Anal. 53, 805–819 (2015)
Varga, R.S.: Matrix Iterative Analysis. Number 27 in Springer Series in Computational Mathematics, 2nd edn. Springer, New York (2000)
Walker, H.F., Ni, P.: Anderson acceleration for fixed-point iterations. SIAM J. Numer. Anal. 49, 1715–1735 (2011)
Wedin, P.Å.: Perturbation theory for pseudo-inverses. BIT 13, 217–232 (1973)
Wynn, P.: On a device for computing the em(Sn) transformation. Math. Tables Aids to Comput. 10, 91–96 (1956)
Wynn, P.: Acceleration techniques for iterated vector and matrix problems. Math. Comp. 16, 301–322 (1962)
Wynn, P.: On the convergence and stability of the epsilon algorithm. SIAM J. Numer. Anal. 3, 91–122 (1966)
Acknowledgements
The author would like to thank one of the anonymous referees for his/her remarks that helped to improve the presentation and results of this work substantially.
Appendix: Some properties of Moore–Penrose inverses
First, we recall the well-known facts that, for a matrix \(\boldsymbol{K}\in\mathbb{C}^{m\times n}\) with full column rank, \(\boldsymbol{K}^{+}\boldsymbol{K}=\boldsymbol{I}_{n\times n}\) and \(\|\boldsymbol{K}^{+}\|=1/\sigma_{\min}(\boldsymbol{K})\), where \(\sigma_{\min}(\boldsymbol{K})\) denotes the smallest singular value of K.
The following theorems on Moore–Penrose inverses of perturbed matrices can be found in Ben-Israel and Greville [2], Wedin [44], and Stewart [40]. Here we give independent proofs of two of them.
Remark 6
For convenience of notation, throughout this appendix only, we will use ∥⋅∥ to denote the l2 norm. (Thus, ∥⋅∥ here does not stand for the G norm we have used in Sections 1–6.)
Theorem A.1
Let \(\boldsymbol {A}\in \mathbb {C}^{m\times n}\) with rank(A) = n, let \(\boldsymbol {G}\in \mathbb {C}^{m\times m}\) be nonsingular, and define B = GA. Then rank(B) = n too, and \(\|\boldsymbol{B}^{+}\|\le\|\boldsymbol{G}^{-1}\|\,\|\boldsymbol{A}^{+}\|\).
Proof
That rank(B) = n is clear since G is nonsingular. Starting now with \(\boldsymbol{A}=\boldsymbol{G}^{-1}\boldsymbol{B}\), we first have
Let \(\boldsymbol {x}^{\prime }\) and \(\boldsymbol {x}^{\prime \prime }\), with \(\|\boldsymbol {x}^{\prime }\|=1\) and \(\|\boldsymbol {x}^{\prime \prime }\|=1\), be such that
where \(\sigma _{\min \nolimits }(\boldsymbol {K})\) denotes the smallest singular value of a matrix K. Then
The result follows by recalling that \(\|\boldsymbol {K}^{+}\|=1/\sigma _{\min \nolimits }(\boldsymbol {K}) \) when K has full column rank, which implies that \(\sigma _{\min \nolimits }(\boldsymbol {K})>0\). □
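Theorem A.1 bounds \(\|\boldsymbol{B}^{+}\|\) by \(\|\boldsymbol{G}^{-1}\|\,\|\boldsymbol{A}^{+}\|\), as the proof shows. A quick numerical check of this bound, with illustrative random data:

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 6, 4
A = rng.standard_normal((m, n))                  # full column rank (generic)
G = rng.standard_normal((m, m)) + 3 * np.eye(m)  # nonsingular (generic)
B = G @ A
assert np.linalg.matrix_rank(B) == n
# ||B^+|| <= ||G^{-1}|| ||A^+||, equivalently
# sigma_min(B) >= sigma_min(A) / ||G^{-1}||.
lhs = np.linalg.norm(np.linalg.pinv(B), 2)
rhs = np.linalg.norm(np.linalg.inv(G), 2) * np.linalg.norm(np.linalg.pinv(A), 2)
assert lhs <= rhs
```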
Theorem A.2
Let \(\boldsymbol {A}\in \mathbb {C}^{m\times n}\) and \((\boldsymbol {A}+\boldsymbol {E})\in \mathbb {C}^{m\times n}\), m ≥ n, be such that rank(A) = n and ∥EA+∥ < 1. Then \(\text{rank}(\boldsymbol{A}+\boldsymbol{E})=n\) and \(\|(\boldsymbol{A}+\boldsymbol{E})^{+}\|\le\|\boldsymbol{A}^{+}\|/(1-\|\boldsymbol{E}\boldsymbol{A}^{+}\|)\).
If Δ = ∥E∥∥A+∥ < 1 in addition, then this result can be expressed as \(\|(\boldsymbol{A}+\boldsymbol{E})^{+}\|\le\|\boldsymbol{A}^{+}\|/(1-{\Delta})\).
Proof
First, because A is of full column rank, we have \(\boldsymbol {A}^{+}\boldsymbol {A}=\boldsymbol {I}_{n\times n}\). Consequently, \(\boldsymbol{A}+\boldsymbol{E}=\boldsymbol{A}+\boldsymbol{E}\boldsymbol{A}^{+}\boldsymbol{A}=(\boldsymbol{I}+\boldsymbol{E}\boldsymbol{A}^{+})\boldsymbol{A}\).
Since ∥EA+∥ < 1 by assumption, the matrix G = I + EA+ is nonsingular. The first result now follows from Theorem A.1 and the fact that \(\|\boldsymbol{G}^{-1}\|\le 1/(1-\|\boldsymbol{E}\boldsymbol{A}^{+}\|)\). The second result follows by invoking ∥EA+∥≤∥E∥∥A+∥ = Δ and the additional assumption that Δ < 1. □
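A numerical sanity check of the bound of Theorem A.2 (illustrative data; A is taken with orthonormal columns so that \(\|\boldsymbol{A}^{+}\|=1\) and the hypothesis ∥EA+∥ < 1 is comfortably satisfied):

```python
import numpy as np

rng = np.random.default_rng(3)
m, n = 6, 4
A, _ = np.linalg.qr(rng.standard_normal((m, n)))  # orthonormal columns: ||A^+|| = 1
Ap = np.linalg.pinv(A)
E = 0.1 * rng.standard_normal((m, n))             # small perturbation
t = np.linalg.norm(E @ Ap, 2)
assert t < 1                                      # hypothesis ||E A^+|| < 1
# ||(A + E)^+|| <= ||A^+|| / (1 - ||E A^+||)
bound = np.linalg.norm(Ap, 2) / (1 - t)
assert np.linalg.norm(np.linalg.pinv(A + E), 2) <= bound
```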
Theorem A.3
Let A and E be as in Theorem A.2, let Δ = ∥E∥∥A+∥ < 1, and let \(\boldsymbol{H}=(\boldsymbol{A}+\boldsymbol{E})^{+}-\boldsymbol{A}^{+}\). Then
Proof
By Wedin [44, Theorem 4.1], there holds
Invoking now Theorem A.2, the result follows. □
The following theorem is due to Stewart [40].
Theorem A.4
Let \(\boldsymbol{A}_{1},\boldsymbol{A}_{2},\ldots\), and A be such that \(\lim _{n\to \infty }\boldsymbol {A}_{n}=\boldsymbol {A}\). Then \(\lim _{n\to \infty }\boldsymbol {A}_{n}^{+}=\boldsymbol {A}^{+}\) if and only if \(\text{rank}(\boldsymbol{A}_{n})=\text{rank}(\boldsymbol{A})\) for all n ≥ n0, for some integer n0.
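The rank condition in Theorem A.4 is essential, as the following illustrative example shows: here \(\boldsymbol{A}_{n}\to\boldsymbol{A}\) with rank(An) = 2 > 1 = rank(A), and \(\boldsymbol{A}_{n}^{+}\) diverges instead of tending to \(\boldsymbol{A}^{+}\):

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 0.0]])                # rank(A) = 1; here A^+ = A
for n in [10, 100, 1000]:
    An = np.array([[1.0, 0.0],
                   [0.0, 1.0 / n]])       # rank(An) = 2, and An -> A
    # pinv(An) = diag(1, n) blows up as n grows, so An^+ does not -> A^+:
    gap = np.linalg.norm(np.linalg.pinv(An) - np.linalg.pinv(A), 2)
    assert gap > n / 2
```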
Cite this article
Sidi, A. A convergence study for reduced rank extrapolation on nonlinear systems. Numer Algor 84, 957–982 (2020). https://doi.org/10.1007/s11075-019-00788-6
Keywords
- Vector extrapolation methods
- Minimal polynomial extrapolation (MPE)
- Reduced rank extrapolation (RRE)
- Krylov subspace methods
- Nonlinear equations
- Cycling mode