
Optimal error bounds for non-expansive fixed-point iterations in normed spaces

  • Full Length Paper
  • Series A

Mathematical Programming

Abstract

This paper investigates optimal error bounds and convergence rates for general Mann iterations for computing fixed points of non-expansive maps. We look for iterations that achieve the smallest fixed-point residual after n steps, by minimizing a worst-case bound \(\Vert x^n-Tx^n\Vert \le R_n\) derived from a nested family of optimal transport problems. We prove that this bound is tight, so that minimizing \(R_n\) yields optimal iterations. Inspired by numerical results, we identify iterations that attain the rate \(R_n=O(1/n)\), which we also show to be the best possible. In particular, we prove that the classical Halpern iteration achieves this optimal rate for several alternative stepsizes, and we determine analytically the optimal stepsizes that attain the smallest worst-case residuals at every step n, with a tight bound \(R_n\approx \frac{4}{n+4}\). We also determine the optimal Halpern stepsizes for affine non-expansive maps, for which we get exactly \(R_n=\frac{1}{n+1}\). Finally, we show that the best rate for the classical Krasnosel’skiĭ–Mann iteration is \(\varOmega (1/\sqrt{n})\), and present numerical evidence suggesting that even extended variants cannot reach a faster rate.
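
As a quick numerical illustration of these rates (a minimal sketch, not part of the paper), the snippet below runs the Krasnosel’skiĭ–Mann iteration with constant stepsize \(\alpha _n=\frac{1}{2}\) and the Halpern iteration anchored at \(x^0\) with the classical stepsize \(\beta _n=\frac{1}{n+1}\), both applied to the right-shift operator on \(\ell ^1\) (the worst-case example used in the appendix below), truncated to a finite vector. The map and the stepsize choices are illustrative assumptions; they are not the optimal stepsizes determined in the paper.

```python
# Minimal sketch (illustrative, not from the paper): fixed-point residuals of the
# Krasnosel'skii-Mann (KM) and Halpern iterations for the right-shift operator on
# l^1, truncated to a finite vector so that the computation is exact.
# Stepsizes: alpha_n = 1/2 (KM) and beta_n = 1/(n+1) (Halpern) -- classical choices,
# not the optimal stepsizes derived in the paper.

def right_shift(x):
    """T(x_0, x_1, ...) = (0, x_0, x_1, ...); non-expansive on l^1."""
    return [0.0] + x[:-1]

def residual_l1(x):
    tx = right_shift(x)
    return sum(abs(a - b) for a, b in zip(x, tx))

N = 1000
x0 = [1.0] + [0.0] * (N + 1)        # starting point e_0

km, hal = x0[:], x0[:]
for n in range(1, N + 1):
    # KM step: x^n = (1 - alpha) x^{n-1} + alpha T x^{n-1}
    t = right_shift(km)
    km = [0.5 * a + 0.5 * b for a, b in zip(km, t)]
    # Halpern step anchored at x^0: x^n = beta x^0 + (1 - beta) T x^{n-1}
    beta = 1.0 / (n + 1)
    t = right_shift(hal)
    hal = [beta * a + (1 - beta) * b for a, b in zip(x0, t)]

print(f"KM residual      after {N} steps: {residual_l1(km):.3e}   (order 1/sqrt(n))")
print(f"Halpern residual after {N} steps: {residual_l1(hal):.3e}   (2/(n+1) here, order 1/n)")
```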


Notes

  1. A first version appeared in 2017 at http://www.optimization-online.org/DB_FILE/2017/11/6336.pdf.

  2. A first version appeared in 2019 as the arXiv preprint arXiv:1905.05149.

References

  1. Aronszajn, N., Panitchpakdi, P.: Extension of uniformly continuous transformations and hyperconvex metric spaces. Pac. J. Math. 6(3), 405–439 (1956)


  2. Baillon, J.B., Bruck, R.E., Reich, S.: On the asymptotic behavior of non-expansive mappings and semigroups in Banach spaces. Houst. J. Math. 4(1), 1–10 (1978)


  3. Baillon, J.B., Bruck, R.E.: Optimal Rates of Asymptotic Regularity for Averaged Non-expansive Mappings. World Scientific Publishing Co. Pte. Ltd., Singapore (1992)


  4. Baillon, J.B., Bruck, R.E.: The rate of asymptotic regularity is \(O(1/\sqrt{n})\). In: Kartsatos A.G. (ed.) Theory and Applications of Nonlinear Operators of Accretive and Monotone Types, Lecture Notes in Pure and Applied Mathematics, vol. 178, pp. 51–81. Dekker, New York (1996)

  5. Bauschke, H.H., Combettes, P.L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces, CMS Books in Mathematics, vol. 408. Springer, Berlin (2011)


  6. Berinde, V.: Iterative Approximation of Fixed Points. Lecture Notes in Mathematics, vol. 1912. Springer, Berlin (2007)


  7. Borwein, J., Reich, S., Shafrir, I.: Krasnoselski–Mann iterations in normed spaces. Can. Math. Bull. 35(1), 21–28 (1992)


  8. Bravo, M., Cominetti, R.: Sharp convergence rates for averaged non-expansive maps. Israel J. Math. 227(1), 163–188 (2018)


  9. Bravo, M., Champion, T., Cominetti, R.: Universal bounds for fixed point iterations via optimal transport metrics, pp. 1–21 (2021). arXiv:2108.00300v1

  10. Browder, F.E.: Convergence of approximants to fixed points of non-expansive nonlinear mappings in Banach spaces. Arch. Ration. Mech. Anal. 24(1), 82–90 (1967)


  11. Browder, F.E., Petryshyn, W.V.: The solution by iteration of nonlinear functional equations in Banach spaces. Bull. Am. Math. Soc. 72(3), 571–575 (1966)


  12. Colao, V., Marino, G.: On the rate of convergence of Halpern iterations. J. Nonlinear Convex Analysis 22(12), 2639–2646 (2021)


  13. Cominetti, R., Soto, J.A., Vaisman, J.: On the rate of convergence of Krasnosel’skiĭ–Mann iterations and their connection with sums of Bernoullis. Israel J. Math. 199(2), 757–772 (2014)


  14. Darroch, J.N.: On the distribution of the number of successes in independent trials. Ann. Math. Stat. 35(3), 1317–1321 (1964)


  15. Diakonikolas, J.: Halpern iteration for near-optimal and parameter-free monotone inclusion and strong solutions to variational inequalities. Proc. Mach. Learn. Res. 125, 1–24 (2020)


  16. Drori, Y., Teboulle, M.: Performance of first-order methods for smooth convex minimization: a novel approach. Math. Program. 145(1–2), 451–482 (2014)


  17. Dutton, R.D., Brigham, R.C.: Computationally efficient bounds for the Catalan numbers. Eur. J. Combin. 7(3), 211–213 (1986)


  18. Halpern, B.: Fixed points of nonexpanding maps. Bull. Am. Math. Soc. 73(6), 957–961 (1967)


  19. Hoeffding, W.: On the distribution of the number of successes in independent trials. Ann. Math. Stat. 27(3), 713–721 (1956)


  20. Kim, D.: Accelerated proximal point method for maximally monotone operators. Math. Program. 190(1), 57–87 (2021)


  21. Kohlenbach, U.: On the logical analysis of proofs based on nonseparable Hilbert space theory. In: Fefferman, S., Sieg, W. (eds.) Proofs, Categories and Computations. Essays in Honor of Grigori Mints, pp. 131–143. College Publications, New York (2010)


  22. Kohlenbach, U.: On quantitative versions of theorems due to F.E. Browder and R. Wittmann. Adv. Math. 226(3), 2764–2795 (2011)


  23. Körnlein, D.: Quantitative results for Halpern iterations of non-expansive mappings. J. Math. Anal. Appl. 428(2), 1161–1172 (2015)


  24. Krasnosel’skiĭ, M.A.: Two remarks on the method of successive approximations. Uspekhi Matematicheskikh Nauk 10, 123–127 (1955)


  25. Leustean, L.: Rates of asymptotic regularity for Halpern iterations of non-expansive mappings. J. Univ. Comput. Sci. 13(11), 1680–1691 (2007)


  26. Lieder, F.: On the convergence rate of the Halpern-iteration. Optim. Lett. 15(2), 405–418 (2021)


  27. Mann, W.R.: Mean value methods in iteration. Proc. Am. Math. Soc. 4(3), 506–510 (1953)


  28. Nesterov, Y.: Lectures on Convex Optimization, Springer Optimization and Its Applications, vol. 137. Springer, Berlin (2018)


  29. Reich, S.: Fixed point iterations of non-expansive mappings. Pac. J. Math. 60(2), 195–198 (1975)


  30. Reich, S.: Weak convergence theorems for non-expansive mappings in Banach spaces. J. Math. Anal. Appl. 67, 274–276 (1979)


  31. Reich, S.: Strong convergence theorems for resolvents of accretive operators in Banach spaces. J. Math. Anal. Appl. 75, 287–292 (1980)


  32. Reich, S.: Approximating fixed points of nonexpansive mappings. Panam. Math. J. 4(2), 23–28 (1994)


  33. Ryu, E.K., Yin, W.: Large-Scale Convex Optimization via Monotone Operators. Cambridge University Press, Cambridge (2022). https://www.cambridge.org/core/books/largescale-convex-optimization/2A7F8E7428BFA4EDB8AFACA11AB97E4C

  34. Sabach, S., Shtern, S.: A first order method for solving convex bilevel optimization problems. SIAM J. Optim. 27(2), 640–660 (2017)


  35. Wittmann, R.: Approximation of fixed points of non-expansive mappings. Arch. Math. 58(5), 486–491 (1992)


  36. Xu, H.-K.: Iterative algorithms for nonlinear operators. J. Lond. Math. Soc. 66(1), 240–256 (2002)


  37. Xu, M., Balakrishnan, N.: On the convolution of heterogeneous Bernoulli random variables. J. Appl. Probab. 48(3), 877–884 (2011)



Acknowledgements

We thank Professor Simeon Reich (Israel Institute of Technology) for his interest in this paper, his valuable comments, and for pointing out relevant references that helped us improve the introductory section. We also thank the two anonymous referees who carefully read the paper and provided important insights that helped us improve the presentation. The work of Juan Pablo Contreras was supported by a doctoral scholarship from ANID-PFCHA/Doctorado Nacional/2019-21190161. Roberto Cominetti gratefully acknowledges the support provided by FONDECYT research grant 1171501.

Author information


Corresponding author

Correspondence to Juan Pablo Contreras.


Appendices

Tightness of optimal transport bounds

In this section we present the proof of Theorem 1, which establishes the tightness of the optimal transport bounds for general Mann iterations, and therefore the equality \(\varPsi _n(\pi )=R_n(\pi )\).

Proof of Theorem 1

Let \(z^{mn}\) and \(u^{mn}\) be optimal solutions for \(({\mathcal {P}}_{m,n})\) and \(({{\mathcal {D}}}_{m,n})\). Setting \(u_{i}^{mn}=\min _{0\le k\le n}\{u_{k}^{mn}+d_{k-1,i-1}\}\) for \(i>n\), and using the triangle inequality, we get

$$\begin{aligned} \vert u_{i}^{mn}-u_{j}^{mn}\vert \le d_{i-1,j-1} \text{ for } \text{ all } i,j\in {\mathbb N}. \end{aligned}$$
(27)

In particular all the \(u_i^{mn}\)’s are within a distance at most 1 and, since the objective function in \(({{\mathcal {D}}}_{m,n})\) is invariant by translation, we may further assume that \(u_i^{mn}\in [0,1]\) for all \(i\in {\mathbb N}\).
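
For completeness, here is a short sketch of why (27) also holds for the extended values (this derivation is ours, under the standard Kantorovich-duality assumption that feasibility for \(({{\mathcal {D}}}_{m,n})\) already gives \(\vert u_{j}^{mn}-u_{k}^{mn}\vert \le d_{j-1,k-1}\) for \(0\le j,k\le n\), together with the symmetry and triangle inequality of the distances \(d\)). For \(j\le n<i\), let \(k\le n\) attain the minimum defining \(u_{i}^{mn}\); then

$$\begin{aligned} u_{i}^{mn}-u_{j}^{mn}&\le \bigl (u_{j}^{mn}+d_{j-1,i-1}\bigr )-u_{j}^{mn}=d_{j-1,i-1},\\ u_{j}^{mn}-u_{i}^{mn}&=\bigl (u_{j}^{mn}-u_{k}^{mn}\bigr )-d_{k-1,i-1}\le d_{j-1,k-1}-d_{k-1,i-1}\le d_{i-1,j-1}. \end{aligned}$$

The case \(n<i,j\) follows in the same way by comparing the two minima and applying the triangle inequality once more.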

Let \({\mathcal {I}}\) be the set of all pairs of integers \((m,n)\) with \(-1\le m\le n\), and consider the unit cube \(C=[0,1]^{\mathcal {I}}\) in the space \((\ell ^\infty ({\mathcal {I}}),\Vert \cdot \Vert _\infty )\). For every integer \(k\in {\mathbb N}\) define \(y^{k}\in C\) as

$$\begin{aligned} \forall (m,n)\in {\mathcal {I}}\quad y^k_{m,n}=\left\{ \begin{array}{cl} d_{k-1,n}&{}\text{ if } -\!1=m\le n\\ u^{mn}_{k}&{}\text{ if } 0\le m\le n \end{array}\right. \end{aligned}$$
(28)

and a corresponding sequence \(x^{k}\in C\) given by

$$\begin{aligned} { x^k=\sum _{i=0}^k\pi _i^ky^{i}}. \end{aligned}$$
(29)

We claim that \(\Vert y^{m+1}-y^{n+1}\Vert _\infty \le d_{m,n} =\Vert x^m-x^n\Vert _\infty \) for all \(0\le m\le n\). Indeed, using the triangle inequality and (27) we get

$$\begin{aligned} \left\{ \begin{array}{ll} \vert y^{m+1}_{-1,n'}-y^{n+1}_{-1,n'}\vert =\vert d_{m,n'}-d_{n,n'}\vert \le d_{m,n}&{} \text{ if } -1=m'\le n'\\[1ex] \vert y^{m+1}_{m',n'}-y^{n+1}_{m',n'}\vert =\vert u^{m'n'}_{m+1}\!-u^{m'n'}_{n+1}\vert \le d_{m,n}&{} \text{ if } 0\le m'\le n' \end{array}\right. \end{aligned}$$

which together imply

$$\begin{aligned} \Vert y^{m+1}\!-y^{n+1}\Vert _\infty \le d_{m,n}. \end{aligned}$$
(30)

Also, selecting an optimal transport \(z^{mn}\) for \(({\mathcal {P}}_{m,n})\) we have

$$\begin{aligned} x^m-x^n= & {} \sum _{i=0}^{m}\pi _{i}^my^{i}-\sum _{j=0}^{n}\pi _j^ny^{j}\nonumber \\= & {} \sum _{i=0}^{m}\sum _{j=0}^{n}z^{mn}_{i,j}(y^{i}-y^{j}) \end{aligned}$$
(31)

so that the triangle inequality and (30) yield

$$\begin{aligned} \Vert x^m\!-x^n\Vert _\infty \le \sum _{i=0}^{m}\sum _{j=0}^{n}z^{mn}_{i,j}d_{i-1,j-1}=d_{m,n}. \end{aligned}$$
(32)

On the other hand, considering the (mn)-coordinate in (31), the complementary slackness (9) gives (recall that we are in the case \(0\le m\le n\))

$$\begin{aligned} \vert x_{m,n}^m-x_{m,n}^n\vert= & {} \left| \sum _{i=0}^{m}\sum _{j=0}^{n}z^{mn}_{i,j}(y_{m,n}^{i}-y_{m,n}^{j})\right| \\= & {} \left| \sum _{i=0}^{m}\sum _{j=0}^{n}z^{mn}_{i,j}(u^{mn}_i-u^{mn}_j)\right| \\= & {} \sum _{i=0}^{m}\sum _{j=0}^{n}z^{mn}_{i,j}d_{i-1,j-1}=d_{m,n} \end{aligned}$$

which combined with (32) yields \(\Vert x^m-x^n\Vert _\infty =d_{m,n}\) as claimed.

Define \(T{:}S\rightarrow C\) on the set \(S=\{x^k{:}k\in {\mathbb N}\}\subseteq C\) by \(Tx^k=y^{k+1}\), so that T is non-expansive. Since \(\ell ^\infty ({\mathcal {I}})\), as well as the unit cube C, is hyperconvex, Theorem 4 in Aronszajn & Panitchpakdi [1] allows T to be extended to a non-expansive map \(T{:}C\rightarrow C\), and then (29) is precisely a Mann sequence which attains all the bounds \(\Vert x^m-x^n\Vert _\infty =d_{m,n}\) with equality.

It remains to prove that \(\Vert x^n-Tx^n\Vert _\infty =R_n\). The upper bound follows again using the triangle inequality and (30) since

$$\begin{aligned} \Vert x^n-Tx^n\Vert _\infty =\left\| \sum _{i=0}^n\pi ^n_i(y^{i}-y^{n+1})\right\| _\infty \le \sum _{i=0}^n\pi _i^nd_{i-1,n}=R_n. \end{aligned}$$

For the reverse inequality, we look at the coordinate \((-1,n)\) so that

$$\begin{aligned} \Vert x^n-Tx^n\Vert _\infty= & {} \left\| \sum _{i=0}^n\pi ^n_iy^{i}-y^{n+1}\right\| _\infty \\\ge & {} \left| \sum _{i=0}^n\pi ^n_iy_{-1,n}^{i}-y_{-1,n}^{n+1}\right| \\= & {} \left| \sum _{i=0}^n\pi ^n_id_{i-1,n}-d_{n,n}\right| =R_n \end{aligned}$$

which completes the proof. \(\square \)

Remark 6

As in Bravo et al. [9] we observe that the map T must have a fixed point in C, and also that it can be extended to the full space \(\ell ^\infty ({\mathcal {I}})\).

Lower bound for Krasnosel’skiĭ–Mann iterations

In this appendix we prove Proposition 3 by exhibiting a non-expansive linear operator T for which the Krasnosel’skiĭ–Mann sequence \(x^{n+1} = (1-\alpha _n)x^n +\alpha _nTx^n\) satisfies \(\Vert x^n-Tx^n\Vert \ge \frac{1}{\sqrt{n+1}}\), regardless of the stepsizes \(\{\alpha _n\}_{n\ge 0}\).

Proof of Proposition 3

Let T be the right-shift operator (14) considered as a map acting on \((\ell ^1({\mathbb {N}}),\Vert \cdot \Vert _1)\). Fix an arbitrary sequence of stepsizes \(\alpha _n\) and consider the corresponding Krasnosel’skiĭ–Mann sequence started from \(x^0=(1,0,0,\ldots )\). It is easy to check inductively that the resulting iterates are given by \(x^n = (p^n_0,p^n_1,\ldots ,p^n_n,0,0,\ldots )\), where \(p^n_k = {\mathbb {P}}(S_n= k)\) is the distribution of a sum \(S_n=X_1+\cdots +X_n\) of independent Bernoulli random variables with \({\mathbb P}(X_i=1)=\alpha _i\).
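
As a sanity check of this inductive claim (an illustrative computation of ours, with an arbitrary stepsize sequence), one can compare the KM iterates for the right shift, computed coordinatewise, against the law of \(S_n\) obtained by brute-force enumeration:

```python
# Check (illustrative, not from the paper): for the right shift T on l^1 started
# at x^0 = e_0, the KM iterates are x^n = (p^n_0, ..., p^n_n, 0, ...), where
# p^n_k = P(S_n = k) for S_n a sum of independent Bernoulli(alpha_i) variables.
from itertools import product

alphas = [0.3, 0.8, 0.5, 0.25, 0.9, 0.6]   # arbitrary illustrative stepsizes
n = len(alphas)

# KM iteration, coordinatewise: (x^n)_k = (1-alpha_n)(x^{n-1})_k + alpha_n (x^{n-1})_{k-1}
x = [1.0] + [0.0] * n
for a in alphas:
    x = [(1 - a) * x[k] + a * (x[k - 1] if k > 0 else 0.0) for k in range(n + 1)]

# Independent computation of the law of S_n by enumerating all 2^n outcomes
p = [0.0] * (n + 1)
for outcome in product((0, 1), repeat=n):
    prob = 1.0
    for xi, a in zip(outcome, alphas):
        prob *= a if xi == 1 else 1 - a
    p[sum(outcome)] += prob

print(max(abs(u - v) for u, v in zip(x, p)))   # ~1e-16: the two computations agree
```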

A well-known result by Darroch [14] establishes that the distribution of \(S_n\) is bell-shaped, from which it follows that

$$\begin{aligned} \Vert x^n-Tx^n\Vert _1 = 2\max _{0\le k\le n} p^n_k \end{aligned}$$

the maximum being attained either at \(k=\lfloor \mu \rfloor \) or \(k=\lceil \mu \rceil \) (or both) where \(\mu =\alpha _1+\ldots +\alpha _n\). Moreover, taking \(B_n({{\bar{\alpha }}})\sim \text{ Binomial }(n,{{\bar{\alpha }}})\) with \({{\bar{\alpha }}} = \frac{1}{n}\sum _{i=1}^n \alpha _i\), a result from Hoeffding [19] shows that for \(0\le b \le n{{\bar{\alpha }}} \le c\le n\) we have (see also Xu & Balakrishnan [37])

$$\begin{aligned} {\mathbb {P}}(b\le S_n \le c) \ge {\mathbb {P}}(b\le B_n({{\bar{\alpha }}}) \le c). \end{aligned}$$

Taking \(b=\lfloor n{{\bar{\alpha }}} \rfloor \) and \(c=\lceil n{{\bar{\alpha }}} \rceil \) it follows that

$$\begin{aligned} 2\max _{0\le k\le n} p^n_k&\ge p^n_{\lfloor n{{\bar{\alpha }}} \rfloor }+p^n_{\lceil n{{\bar{\alpha }}} \rceil } \\&= {\mathbb {P}}(\lfloor n{{\bar{\alpha }}} \rfloor \le S_n \le \lceil n{{\bar{\alpha }}} \rceil ) \\&\ge {\mathbb {P}}(\lfloor n{{\bar{\alpha }}} \rfloor \le B_n({{\bar{\alpha }}}) \le \lceil n{{\bar{\alpha }}} \rceil ) . \end{aligned}$$
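
The following snippet (an illustrative check of ours, with randomly drawn stepsizes) verifies this chain of inequalities numerically: the residual equals \(2\max _k p^n_k\) by Darroch's unimodality, dominates the probability that \(S_n\) falls in \([\lfloor n{\bar{\alpha }}\rfloor ,\lceil n{\bar{\alpha }}\rceil ]\), which in turn dominates the corresponding binomial probability by Hoeffding's comparison.

```python
# Numerical check (illustrative, not from the paper) of the displayed inequalities.
import math
import random

random.seed(0)
n = 40
alphas = [random.uniform(0.1, 0.9) for _ in range(n)]   # illustrative stepsizes
abar = sum(alphas) / n

# pmf of S_n (Poisson-binomial), via the same coordinatewise recursion as the KM iterates
p = [1.0] + [0.0] * n
for a in alphas:
    p = [(1 - a) * p[k] + a * (p[k - 1] if k > 0 else 0.0) for k in range(n + 1)]

lo, hi = math.floor(n * abar), math.ceil(n * abar)
residual = sum(abs(p[k] - (p[k - 1] if k > 0 else 0.0)) for k in range(n + 1)) + p[n]
binom = sum(math.comb(n, k) * abar**k * (1 - abar) ** (n - k) for k in range(lo, hi + 1))

print(residual, "=", 2 * max(p))               # equal, by unimodality (Darroch)
print(sum(p[lo:hi + 1]), ">=", binom)          # Hoeffding's binomial comparison
print(2 * max(p), ">=", 1 / math.sqrt(n + 1))  # the claimed 1/sqrt(n+1) lower bound
```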

Let us define \(f_n{:}[0,1]\rightarrow [0,1]\) by \(f_n(x) = {\mathbb {P}}(\lfloor nx \rfloor \le B_n(x) \le \lceil nx \rceil )\). We want to compute the minimum value of \(f_n\). Firstly, we observe that \(f_n\) is symmetric with respect to \(x=\frac{1}{2}\), i.e. \(f_n(x) = f_n(1-x)\). Secondly, we note that \(f_n\) is discontinuous at the points of the form \(\frac{k}{n}\) for \(k=1,\ldots ,n-1\). In fact, one can check that \(f_n(\frac{k}{n})> f_n((\frac{k}{n})^-)\ge f_n((\frac{k}{n})^+)\) for all \(k = 1,\ldots ,\lfloor \frac{n}{2}\rfloor \), and symmetrically \(f_n(\frac{k}{n})> f_n((\frac{k}{n})^+)\ge f_n((\frac{k}{n})^-)\) for all \(k = \lceil \frac{n}{2}\rceil ,\ldots ,n-1\). On each open interval \(]\frac{k}{n},\frac{k+1}{n}[\) the function \(f_n\) is differentiable and concave (see Fig. 6), and its infimum is attained asymptotically by approaching the endpoint of the interval closest to \(\frac{1}{2}\), namely

$$\begin{aligned} x_{k,n}^*=\left\{ \begin{array}{ll} \frac{k+1}{n}&{}\quad \text{ if } \,\frac{k+1}{n}\le \frac{1}{2},\\ [0.5ex] \frac{k}{n}&{}\quad \text{ if } \,\frac{k}{n}\ge \frac{1}{2},\\ [0.5ex] \text{ both }&{}\quad \text{ if } \,\frac{k}{n}< \frac{1}{2}<\frac{k+1}{n}. \end{array}\right. \end{aligned}$$

After some straightforward computations, one can conclude that the infimum of \(f_n\) over the full interval [0, 1] occurs when x tends to \(\frac{1}{n}\lfloor \frac{n}{2}\rfloor \) from the right and/or when x tends to \(\frac{1}{n}\lceil \frac{n}{2}\rceil \) from the left (see Fig. 6).

In particular, when \(n=2m\) is even, the infimum is obtained when approaching \(x=\frac{1}{2}\) either from the right or the left, with \(\inf f_{2m} = \frac{2m+1}{m+1}\frac{1}{4^{m}} \left( {\begin{array}{c}2m\\ m\end{array}}\right) \). We observe that \(\frac{1}{m+1}\left( {\begin{array}{c}2m\\ m\end{array}}\right) \) is a Catalan number, so that using the bound in Dutton & Brigham [17] we get

$$\begin{aligned} {\Vert x^n-Tx^n\Vert _1 \ge \inf f_n\ge \frac{2m+1}{m+1}\sqrt{\frac{4m-1}{4m}}\frac{1}{\sqrt{\pi m}}\ge \frac{1}{\sqrt{n}}.} \end{aligned}$$

If \(n = 2m+1\) is odd, then \(\inf f_{2m+1}= \left( {\begin{array}{c}2m+1\\ m\end{array}}\right) \left( \frac{m(m+1)}{(2m+1)^2}\right) ^m\). Using this expression along with the expression for the even case, it is easy to check that \(\inf f_{2m+1}\ge \inf f_{2m+2}\), and therefore we conclude

$$\begin{aligned} \hbox { }\ \Vert x^n-Tx^n\Vert _1 \ge \inf f_{n} \ge \inf f_{n+1} \ge \frac{1}{\sqrt{n+1}} \end{aligned}$$

completing the proof. \(\square \)

Fig. 6  The function \(f_n\) for \(n=5\) and \(n=6\)
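
The following snippet (illustrative, not from the paper) evaluates \(f_n\) on a fine grid for \(n=5\) and \(n=6\), avoiding the discontinuity points \(\frac{k}{n}\), and compares the grid infimum with the closed-form expressions above and with the lower bound \(\frac{1}{\sqrt{n+1}}\):

```python
# Illustrative computation (not from the paper) of
#   f_n(x) = P( floor(nx) <= B_n(x) <= ceil(nx) ),  B_n(x) ~ Binomial(n, x),
# for n = 5 and n = 6, cf. Fig. 6.
import math

def f(n, x):
    lo, hi = math.floor(n * x), math.ceil(n * x)
    return sum(math.comb(n, k) * x**k * (1 - x) ** (n - k) for k in range(lo, hi + 1))

for n in (5, 6):
    # evaluate away from the points k/n, where f_n jumps and the infimum is only approached
    grid_inf = min(f(n, (j + 0.5) / 10000) for j in range(10000))
    if n % 2 == 0:
        m = n // 2
        closed = (2 * m + 1) / (m + 1) * math.comb(2 * m, m) / 4**m
    else:
        m = (n - 1) // 2
        closed = math.comb(2 * m + 1, m) * (m * (m + 1) / (2 * m + 1) ** 2) ** m
    # grid infimum ~ closed-form value, and both exceed 1/sqrt(n+1)
    print(n, round(grid_inf, 4), round(closed, 4), round(1 / math.sqrt(n + 1), 4))
```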


Cite this article

Contreras, J.P., Cominetti, R. Optimal error bounds for non-expansive fixed-point iterations in normed spaces. Math. Program. 199, 343–374 (2023). https://doi.org/10.1007/s10107-022-01830-7

