
A convergence study for reduced rank extrapolation on nonlinear systems

Original Paper · Published in Numerical Algorithms

Abstract

Reduced Rank Extrapolation (RRE) is a polynomial-type method used to accelerate the convergence of sequences of vectors {xm}. It has been applied successfully in various disciplines of science and engineering to the solution of large and sparse systems of linear and nonlinear equations. If s is the solution to the system of equations x = f(x), first a vector sequence {xm} is generated via the fixed-point iterative scheme xm+1 = f(xm), m = 0,1,…, and next RRE is applied to this sequence to accelerate its convergence. RRE produces approximations sn,k to s of the form \(\boldsymbol {s}_{n,k}={\sum }_{i=0}^{k} \gamma _{i} \boldsymbol {x}_{n+i}\) for some scalars γi that depend (nonlinearly) on xn, xn+1,…,xn+k+1 and satisfy \({\sum }_{i=0}^{k} \gamma _{i}=1\). The convergence properties of RRE applied in conjunction with linear f(x) have been analyzed in different publications. In this work, we discuss the convergence of the sn,k obtained from RRE with nonlinear f(x) (i) when \(n\to \infty \) with fixed k, and (ii) in two so-called cycling modes.
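The scheme just described can be sketched in a few lines of NumPy. This is only an illustrative implementation of the constrained least-squares definition of sn,k (the constraint Σγi = 1 is eliminated by substitution), run on a small linear fixed-point problem; it is not the numerically stable implementation studied in [27] and [33], and all names below are ours.

```python
import numpy as np

def rre(xs):
    """Reduced Rank Extrapolation from the iterates xs = [x_n, ..., x_{n+k+1}].

    Returns s_{n,k} = sum_i gamma_i * xs[i], where the gamma_i minimize
    ||sum_i gamma_i (xs[i+1] - xs[i])||_2 subject to sum_i gamma_i = 1.
    Writing gamma_0 = 1 - sum_{i>=1} gamma_i turns this into an
    unconstrained least-squares problem.
    """
    X = np.column_stack(xs)        # N x (k+2) matrix of iterates
    U = np.diff(X, axis=1)         # first differences u_0, ..., u_k
    V = U[:, 1:] - U[:, [0]]       # columns u_i - u_0, i = 1, ..., k
    eta, *_ = np.linalg.lstsq(V, -U[:, 0], rcond=None)
    gamma = np.concatenate(([1.0 - eta.sum()], eta))
    return X[:, :-1] @ gamma

# Linear test problem x = T x + d with a known fixed point (illustration only).
t = np.array([0.9, 0.7, 0.5, 0.3, 0.2, 0.1])
T, d = np.diag(t), np.ones(6)
s = d / (1.0 - t)                  # exact solution of (I - T) s = d

xs = [np.zeros(6)]
for _ in range(7):                 # generate x_1, ..., x_7
    xs.append(T @ xs[-1] + d)

err_plain = np.linalg.norm(xs[-1] - s)
err_rre = np.linalg.norm(rre(xs) - s)
print(err_plain, err_rre)          # the extrapolated error is far smaller
```

Here k = 6 equals the degree of the minimal polynomial of T with respect to the initial error, so the extrapolated vector recovers s to roundoff, while the plain iterate x7 is still far from s; this is the linear-case behavior whose nonlinear analogue the paper analyzes.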


Notes

  1. It is clear that the integers n and k are chosen by the user and that M is determined by n, k, and the extrapolation method being used.

  2. The approaches of [18] and [21] to RRE are almost identical, in the sense that \(\boldsymbol {s}_{n,k}={\sum }^{k}_{i=0}\gamma _{i} \boldsymbol {x}_{n+i}\) in [21], while \(\boldsymbol {s}_{n,k}={\sum }^{k}_{i=0}\gamma _{i} \boldsymbol {x}_{n+i+1}\) in [18], the γi being the same for both. The approaches of [11] and [21] are completely different, however; their equivalence was proved in the review paper of Smith, Ford, and Sidi [39].

  3. Note that M = n + k + 1 for MPE, RRE, MMPE, and SVD-MPE, while M = n + 2k for SEA, VEA, and TEA.

  4. Given a nonzero vector \(\boldsymbol {u}\in \mathbb {C}^{N}\), the monic polynomial P(λ) is said to be a minimal polynomial of the matrix \(\boldsymbol {T}\in \mathbb {C}^{N\times N}\) with respect to u if P(T)u = 0 and if P(λ) has smallest degree.

    The polynomial P(λ) exists and is unique. Moreover, if P1(T)u = 0 for some polynomial P1(λ) with \(\deg P_{1} > \deg P\), then P(λ) divides P1(λ). In particular, P(λ) divides the minimal polynomial of T, which in turn divides the characteristic polynomial of T. [Thus, the degree of P(λ) is at most N and its zeros are some or all of the eigenvalues of T.]

  5. It is clear that to apply any of the extrapolation methods in this mode, one needs to know the matrix F(s), for which one also needs to know the solution s.

  6. Note that k is not necessarily fixed in this mode of cycling; it may vary from one cycle to the next. It always satisfies k ≤ N, however.

  7. Quadratic convergence is relevant only when f(x) is nonlinear. When f(x) is linear, that is, f(x) = Tx + d, where T is a fixed N × N matrix and d is a fixed vector, hence F(s) = T, the solution s is obtained already at the end of step MC2 of the first cycle, that is, we have s(1) = s. Therefore, there is nothing to analyze when f(x) is linear.

  8. See also Sidi and Shapira [37] concerning a modified version of restarted GMRES with prior Richardson iterations, which is very closely related to RRE.

  9. Recall that, for any matrix K with rank(K) = r, we have ∥K∥2 ≤ ∥K∥F ≤ √r ∥K∥2. See Golub and Van Loan [13].

  10. Clearly, g(z) = z^k is in \(\tilde {\mathcal {P}}_{k}\) and 𝜃k < 1 since L < 1. Next, in general, the polynomial g(z) that gives the optimum in (5.4) is different from z^k. Thus, generally speaking, 𝜃k < L^k.

  11. For the linear system \(\boldsymbol {x}=\tilde {\boldsymbol {f}}(\boldsymbol {x})\), we have \(\boldsymbol {\epsilon }_{n+1}=\tilde {\boldsymbol {F}}\boldsymbol {\epsilon }_{n}\), n = 0, 1,…, as power iterations. Thus, in some cases, \(\boldsymbol {e}_{\infty }=\lim _{n\to \infty }\boldsymbol {e}_{n}\) exists and is an eigenvector of \(\tilde {\boldsymbol {F}}\), in which case \(\text {rank}(\boldsymbol {S}(\boldsymbol {e}_{\infty }))\) is at most 1. Clearly, this is a problem when rank(S(en)) = k > 1 for n = 0, 1,….
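The norm inequalities recalled in note 9 (∥K∥2 ≤ ∥K∥F ≤ √r ∥K∥2 for a rank-r matrix K) can be verified directly in NumPy; this is an illustrative check of a standard fact, not part of the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
r = 3
# Build a 10 x 8 matrix of rank exactly r as a product of thin factors.
K = rng.standard_normal((10, r)) @ rng.standard_normal((r, 8))

two = np.linalg.norm(K, 2)      # spectral norm = largest singular value
fro = np.linalg.norm(K, 'fro')  # Frobenius norm = l2 norm of all singular values
assert two <= fro <= np.sqrt(r) * two + 1e-12
```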

References

  1. Anderson, D.G.: Iterative procedures for nonlinear integral equations. J. ACM 12, 547–560 (1965)

  2. Ben-Israel, A.: On error bounds for generalized inverses. SIAM J. Numer. Anal. 3, 585–592 (1966)

  3. Ben-Israel, A., Greville, T.N.E.: Generalized Inverses: Theory and Applications. CMS Books in Mathematics, 2nd edn. Springer, New York (2003)

  4. Brezinski, C.: Application de l’𝜖-algorithme à la résolution des systèmes non linéaires. C. R. Acad. Sci. Paris 271 A, 1174–1177 (1970)

  5. Brezinski, C.: Sur un algorithme de résolution des systèmes non linéaires. C. R. Acad. Sci. Paris 272 A, 145–148 (1971)

  6. Brezinski, C.: Généralisations de la transformation de Shanks, de la table de Padé, et de l’𝜖-algorithme. Calcolo 12, 317–360 (1975)

  7. Brezinski, C.: Accélération de la Convergence en Analyse Numérique. Number 584 in Lecture Notes in Mathematics. Springer, Berlin (1977)

  8. Brezinski, C., Redivo Zaglia, M.: Extrapolation Methods: Theory and Practice. North-Holland, Amsterdam (1991)

  9. Cabay, S., Jackson, L.W.: A polynomial extrapolation method for finding limits and antilimits of vector sequences. SIAM J. Numer. Anal. 13, 734–752 (1976)

  10. Campbell, S.L., Meyer, C.D. Jr.: Generalized Inverses of Linear Transformations. Dover, New York (1991)

  11. Eddy, R.P.: Extrapolating to the limit of a vector sequence. In: Wang, P.C.C. (ed.) Information Linkage Between Applied Mathematics and Industry, pp. 387–396. Academic Press, New York (1979)

  12. Gekeler, E.: On the solution of systems of equations by the epsilon algorithm of Wynn. Math. Comp. 26, 427–436 (1972)

  13. Golub, G.H., Van Loan, C.F.: Matrix Computations, 4th edn. The Johns Hopkins University Press, Baltimore (2013)

  14. Graves-Morris, P.R., Saff, E.B.: Row convergence theorems for generalised inverse vector-valued Padé approximants. J. Comp. Appl. Math. 23, 63–85 (1988)

  15. Jbilou, K., Sadok, H.: Some results about vector extrapolation methods and related fixed-point iterations. J. Comp. Appl. Math. 36, 385–398 (1991)

  16. Jbilou, K., Sadok, H.: Analysis of some vector extrapolation methods for linear systems. Numer. Math. 70, 73–89 (1995)

  17. Jbilou, K., Sadok, H.: LU-implementation of the modified minimal polynomial extrapolation method. IMA J. Numer. Anal. 19, 549–561 (1999)

  18. Kaniel, S., Stein, J.: Least-square acceleration of iterative methods for linear equations. J. Optim. Theory Appl. 14, 431–437 (1974)

  19. Laurens, J., Le Ferrand, H.: Fonctions d’itérations vectorielles, itérations rationelles. C. R. Acad. Sci. Paris 321 I, 631–636 (1995)

  20. Le Ferrand, H.: Convergence of the topological 𝜖-algorithm for solving systems of nonlinear equations. Numer. Algorithms 3, 273–283 (1992)

  21. Mešina, M.: Convergence acceleration for the iterative solution of the equations X = AX + f. Comput. Methods Appl. Mech. Engrg. 10, 165–173 (1977)

  22. Ortega, J., Rheinboldt, W.: Iterative Solution of Nonlinear Equations in Several Variables. Academic Press, New York (1970)

  23. Pugachev, B.P.: Acceleration of the convergence of iterative processes and a method of solving systems of nonlinear equations. U.S.S.R. Comput. Math. Math. Phys. 17, 199–207 (1978)

  24. Shanks, D.: Nonlinear transformations of divergent and slowly convergent sequences. J. Math. and Phys. 34, 1–42 (1955)

  25. Sidi, A.: Convergence and stability properties of minimal polynomial and reduced rank extrapolation algorithms. SIAM J. Numer. Anal. 23, 197–209 (1986). Originally appeared as NASA TM-83443 (1983)

  26. Sidi, A.: Extrapolation vs. projection methods for linear systems of equations. J. Comp. Appl. Math. 22, 71–88 (1988)

  27. Sidi, A.: Efficient implementation of minimal polynomial and reduced rank extrapolation methods. J. Comp. Appl. Math. 36, 305–337 (1991). Originally appeared as NASA TM-103240 ICOMP-90-20 (1990)

  28. Sidi, A.: Convergence of intermediate rows of minimal polynomial and reduced rank extrapolation tables. Numer. Algorithms 6, 229–244 (1994)

  29. Sidi, A.: Extension and completion of Wynn’s theory on convergence of columns of the epsilon table. J. Approx. Theory 86, 21–40 (1996)

  30. Sidi, A.: Review of two vector extrapolation methods of polynomial type with applications to large-scale problems. J. Comput. Sci. 3, 92–101 (2012)

  31. Sidi, A.: SVD-MPE: An SVD-based vector extrapolation method of polynomial type. Appl. Math. 7, 1260–1278 (2016). Special issue on Applied Iterative Methods

  32. Sidi, A.: Minimal polynomial and reduced rank extrapolation methods are related. Adv. Comput. Math. 43, 151–170 (2017)

  33. Sidi, A.: Vector Extrapolation Methods with Applications. Number 17 in SIAM Series on Computational Science and Engineering. SIAM, Philadelphia (2017)

  34. Sidi, A., Bridger, J.: Convergence and stability analyses for some vector extrapolation methods in the presence of defective iteration matrices. J. Comp. Appl. Math. 22, 35–61 (1988)

  35. Sidi, A., Ford, W.F., Smith, D.A.: Acceleration of convergence of vector sequences. SIAM J. Numer. Anal. 23, 178–196 (1986). Originally appeared as NASA TP-2193 (1983)

  36. Sidi, A., Shapira, Y.: Upper bounds for convergence rates of vector extrapolation methods on linear systems with initial iterations. Technical Report 701, Computer Science Dept., Technion–Israel Institute of Technology (1991). Appeared also as NASA TM-105608 ICOMP-92-09 (1992)

  37. Sidi, A., Shapira, Y.: Upper bounds for convergence rates of acceleration methods with initial iterations. Numer. Algorithms 18, 113–132 (1998)

  38. Skelboe, S.: Computation of the periodic steady-state response of nonlinear networks by extrapolation methods. IEEE Trans. Circuits Syst. 27, 161–175 (1980)

  39. Smith, D.A., Ford, W.F., Sidi, A.: Extrapolation methods for vector sequences. SIAM Rev. 29, 199–233 (1987). Erratum: SIAM Rev. 30, 623–634 (1988)

  40. Stewart, G.W.: On the continuity of the generalized inverse. SIAM J. Appl. Math. 17, 33–45 (1969)

  41. Toth, A., Kelley, C.T.: Convergence analysis for Anderson acceleration. SIAM J. Numer. Anal. 53, 805–819 (2015)

  42. Varga, R.S.: Matrix Iterative Analysis. Number 27 in Springer Series in Computational Mathematics, 2nd edn. Springer, New York (2000)

  43. Walker, H.F., Ni, P.: Anderson acceleration for fixed-point iterations. SIAM J. Numer. Anal. 49, 1715–1735 (2011)

  44. Wedin, P.Å.: Perturbation theory for pseudo-inverses. BIT 13, 217–232 (1973)

  45. Wynn, P.: On a device for computing the em(Sn) transformation. Math. Tables Aids Comput. 10, 91–96 (1956)

  46. Wynn, P.: Acceleration techniques for iterated vector and matrix problems. Math. Comp. 16, 301–322 (1962)

  47. Wynn, P.: On the convergence and stability of the epsilon algorithm. SIAM J. Numer. Anal. 3, 91–122 (1966)


Acknowledgements

The author would like to thank one of the anonymous referees, whose remarks helped to improve the presentation and the results of this work substantially.

Author information


Correspondence to Avram Sidi.


Appendix: Some properties of Moore–Penrose inverses


First, we recall the well-known facts

$$ \boldsymbol{A}\in\mathbb{C}^{m\times n},\quad \text{rank}(\boldsymbol{A})=n\quad\Rightarrow\quad \boldsymbol{A}^{+}=(\boldsymbol{A}^{*}\boldsymbol{A})^{-1}\boldsymbol{A}^{*} \quad\Rightarrow\quad\boldsymbol{A}^{+}\boldsymbol{A}=\boldsymbol{I}_{n\times n}, $$
(A.1)
$$ \boldsymbol{A}\in\mathbb{C}^{m\times n},\quad \text{rank}(\boldsymbol{A})=m\quad\Rightarrow\quad \boldsymbol{A}^{+}=\boldsymbol{A}^{*}(\boldsymbol{A}\boldsymbol{A}^{*})^{-1} \quad\Rightarrow\quad\boldsymbol{A}\boldsymbol{A}^{+}=\boldsymbol{I}_{m\times m}, $$
(A.2)

and

$$ \boldsymbol{A}\in\mathbb{C}^{m\times n},\quad \boldsymbol{B}\in\mathbb{C}^{n\times p},\quad\text{rank}(\boldsymbol{A})=\text{rank}(\boldsymbol{B})=n\quad \Rightarrow\quad (\boldsymbol{A}\boldsymbol{B})^{+}=\boldsymbol{B}^{+}\boldsymbol{A}^{+}. $$
(A.3)

The following theorems on Moore–Penrose inverses of perturbed matrices can be found in Ben-Israel and Greville [3], Wedin [44], and Stewart [40]. Here we give independent proofs of two of them.

Remark 6

For convenience of notation, throughout this appendix only, we will use ∥⋅∥ to denote the l2 norm. (Thus, ∥⋅∥ here does not stand for the G norm we have used in Sections 1–6.)

Theorem A.1

Let \(\boldsymbol {A}\in \mathbb {C}^{m\times n}\), rank(A) = n, let \(\boldsymbol {G}\in \mathbb {C}^{m\times m}\) be nonsingular, and define B = GA. Then rank(B) = n too, and

$$ \|\boldsymbol{B}^{+}\|\leq \|\boldsymbol{G}^{-1}\|\|\boldsymbol{A}^{+}\|. $$

Proof

That rank(B) = n is clear since G is nonsingular. Starting now with A = G− 1B, we first have

$$ \boldsymbol{A}\boldsymbol{x}=\boldsymbol{G}^{-1}(\boldsymbol{B}\boldsymbol{x})\quad \Rightarrow \quad \|\boldsymbol{A}\boldsymbol{x}\| \leq\|\boldsymbol{G}^{-1}\| \|\boldsymbol{B}\boldsymbol{x}\|\quad \forall \boldsymbol{x}\in\mathbb{C}^{n},\quad \|\boldsymbol{x}\|=1. $$

Let \(\boldsymbol {x}^{\prime }\) and \(\boldsymbol {x}^{\prime \prime }\), with \(\|\boldsymbol {x}^{\prime }\|=1\) and \(\|\boldsymbol {x}^{\prime \prime }\|=1\), be such that

$$ \sigma_{\min}(\boldsymbol{A})=\min\limits_{\|\boldsymbol{x}\|=1}\|\boldsymbol{A}\boldsymbol{x}\|=\|\boldsymbol{A}\boldsymbol{x}^{\prime}\|\quad\text{and}\quad \sigma_{\min}(\boldsymbol{B})=\min\limits_{\|\boldsymbol{x}\|=1}\|\boldsymbol{B}\boldsymbol{x}\|=\|\boldsymbol{B}\boldsymbol{x}^{\prime\prime}\|, $$

where \(\sigma _{\min \nolimits }(\boldsymbol {K})\) denotes the smallest singular value of a matrix K. Then

$$ \sigma_{\min}(\boldsymbol{A})=\|\boldsymbol{A}\boldsymbol{x}^{\prime}\|\leq \|\boldsymbol{A}\boldsymbol{x}^{\prime\prime}\|\leq \|\boldsymbol{G}^{-1}\| \|\boldsymbol{B}\boldsymbol{x}^{\prime\prime}\| =\|\boldsymbol{G}^{-1}\| \sigma_{\min}(\boldsymbol{B}). $$

The result follows by recalling that \(\|\boldsymbol {K}^{+}\|=1/\sigma _{\min \nolimits }(\boldsymbol {K}) \) when K has full column rank, which implies that \(\sigma _{\min \nolimits }(\boldsymbol {K})>0\). □

Theorem A.2

Let \(\boldsymbol {A}\in \mathbb {C}^{m\times n}\) and \((\boldsymbol {A}+\boldsymbol {E})\in \mathbb {C}^{m\times n}\), m ≥ n, be such that rank(A) = n and ∥EA+∥ < 1. Then

$$ \|(\boldsymbol{A}+\boldsymbol{E})^{+}\|\leq\frac{\|\boldsymbol{A}^{+}\|}{1-\|\boldsymbol{E}\boldsymbol{A}^{+}\|}. $$

If Δ = ∥E∥∥A+∥ < 1 in addition, then this result can be expressed as

$$ \|(\boldsymbol{A}+\boldsymbol{E})^{+}\|\leq\frac{1}{1-{\Delta}}\|\boldsymbol{A}^{+}\|. $$

Proof

First, because A is of full column rank, we have that \(\boldsymbol {A}^{+}\boldsymbol {A}=\boldsymbol {I}_{n\times n}\). Consequently,

$$ \boldsymbol{A}+\boldsymbol{E}=(\boldsymbol{I}+\boldsymbol{E}\boldsymbol{A}^{+})\boldsymbol{A}. $$

Since ∥EA+∥ < 1 by assumption, the matrix G = I + EA+ is nonsingular. The first result now follows from Theorem A.1 and by the fact that ∥G− 1∥≤ 1/(1 −∥EA+∥). The second result follows by invoking ∥EA+∥≤∥E∥∥A+∥ = Δ and the additional assumption that Δ < 1. □
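The bound of Theorem A.2 is easy to confirm on a random example; the snippet below is an illustrative numerical check under a small perturbation, not part of the proof.

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((6, 3))          # full column rank, generically
E = 1e-3 * rng.standard_normal((6, 3))   # small perturbation
Ap = np.linalg.pinv(A)

# Hypothesis of Theorem A.2: ||E A^+|| < 1 in the spectral norm.
assert np.linalg.norm(E @ Ap, 2) < 1.0

lhs = np.linalg.norm(np.linalg.pinv(A + E), 2)
bound = np.linalg.norm(Ap, 2) / (1.0 - np.linalg.norm(E @ Ap, 2))
assert lhs <= bound + 1e-12              # ||(A+E)^+|| <= ||A^+|| / (1 - ||E A^+||)
```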

Theorem A.3

Let A and E be as in Theorem A.2 with Δ = ∥E∥∥A+∥ < 1, and let H = (A + E)+ − A+. Then

$$ \|\boldsymbol{H}\|\leq\sqrt{2}\frac{\Delta}{1-{\Delta}}\|\boldsymbol{A}^{+}\|.$$

Proof

By Wedin [44, Theorem 4.1], there holds

$$ \|\boldsymbol{H}\|\leq\sqrt{2} \|(\boldsymbol{A}+\boldsymbol{E})^{+}\| \|\boldsymbol{A}^{+}\| \|\boldsymbol{E}\|. $$

Invoking now Theorem A.2, the result follows. □
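The bound of Theorem A.3 on H = (A + E)+ − A+ can likewise be checked numerically; again, this is an illustration, not part of the argument.

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((8, 4))          # full column rank, generically
E = 1e-4 * rng.standard_normal((8, 4))   # small perturbation
Ap = np.linalg.pinv(A)

delta = np.linalg.norm(E, 2) * np.linalg.norm(Ap, 2)
assert delta < 1.0                       # hypothesis of Theorem A.3

H = np.linalg.pinv(A + E) - Ap
bound = np.sqrt(2) * delta / (1.0 - delta) * np.linalg.norm(Ap, 2)
assert np.linalg.norm(H, 2) <= bound + 1e-12
```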

The following theorem is due to Stewart [40].

Theorem A.4

Let A1, A2,…, and A be such that \(\lim _{n\to \infty }\boldsymbol {A}_{n}=\boldsymbol {A}\). Then \(\lim _{n\to \infty }\boldsymbol {A}_{n}^{+}=\boldsymbol {A}^{+}\) if and only if rank(An) = rank(A) for all n ≥ n0, for some integer n0.
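The rank condition in Theorem A.4 is essential: when the rank drops in the limit, the pseudoinverses diverge even though the matrices converge. The following small example (ours, for illustration) shows this on 2 × 2 diagonal matrices.

```python
import numpy as np

A = np.diag([1.0, 0.0])                  # rank(A) = 1
gaps = []
for n in (10, 100, 1000):
    An = np.diag([1.0, 1.0 / n])         # An -> A, but rank(An) = 2 > rank(A)
    # pinv(An) = diag(1, n), so the distance to pinv(A) = diag(1, 0) grows like n.
    gaps.append(np.linalg.norm(np.linalg.pinv(An) - np.linalg.pinv(A), 2))
print(gaps)                              # grows without bound as An -> A
```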


About this article


Cite this article

Sidi, A. A convergence study for reduced rank extrapolation on nonlinear systems. Numer Algor 84, 957–982 (2020). https://doi.org/10.1007/s11075-019-00788-6
