Abstract
This paper investigates optimal error bounds and convergence rates for general Mann iterations for computing fixed points of non-expansive maps. We look for iterations that achieve the smallest fixed-point residual after n steps, by minimizing a worst-case bound \(\Vert x^n-Tx^n\Vert \le R_n\) derived from a nested family of optimal transport problems. We prove that this bound is tight, so that minimizing \(R_n\) yields optimal iterations. Inspired by numerical results, we identify iterations that attain the rate \(R_n=O(1/n)\), which we also show to be the best possible. In particular, we prove that the classical Halpern iteration achieves this optimal rate for several alternative stepsizes, and we determine analytically the optimal stepsizes that attain the smallest worst-case residuals at every step n, with a tight bound \(R_n\approx \frac{4}{n+4}\). We also determine the optimal Halpern stepsizes for affine non-expansive maps, for which we get exactly \(R_n=\frac{1}{n+1}\). Finally, we show that the best rate for the classical Krasnosel’skiĭ–Mann iteration is \(\varOmega (1/\sqrt{n})\), and present numerical evidence suggesting that even extended variants cannot reach a faster rate.
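To make the Halpern iteration discussed above concrete, here is a minimal numerical sketch. It uses the classical stepsizes \(\beta_n=1/(n+2)\) (an illustrative choice, not the optimal stepsizes derived in the paper), and a hypothetical example map: a planar rotation, which is non-expansive with the origin as its unique fixed point.

```python
import numpy as np

# Halpern iteration: x^{n+1} = beta_n * x0 + (1 - beta_n) * T(x^n).
# The stepsizes beta_n = 1/(n+2) are the classical choice, used here
# only for illustration. T is a hypothetical non-expansive map: a
# rotation of the plane by 60 degrees, whose fixed point is the origin.

def T(x):
    theta = np.pi / 3
    c, s = np.cos(theta), np.sin(theta)
    return np.array([c * x[0] - s * x[1], s * x[0] + c * x[1]])

def halpern(x0, T, n_iters):
    x = x0.copy()
    for n in range(n_iters):
        beta = 1.0 / (n + 2)
        x = beta * x0 + (1.0 - beta) * T(x)
    return x

x0 = np.array([1.0, 0.0])
x = halpern(x0, T, 200)
residual = np.linalg.norm(x - T(x))
print(residual)  # the residual decays at the rate O(1/n)
```

Note that the plain iteration \(x^{n+1}=Tx^n\) would cycle forever on this rotation; the Halpern anchoring term \(\beta_n x^0\) is what forces convergence to the fixed point.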
Notes
A first version appeared in 2017 at http://www.optimization-online.org/DB_FILE/2017/11/6336.pdf.
A first version appeared in 2019 as the arXiv preprint arXiv:1905.05149.
References
Aronszajn, N., Panitchpakdi, P.: Extension of uniformly continuous transformations and hyperconvex metric spaces. Pac. J. Math. 6(3), 405–439 (1956)
Baillon, J.B., Bruck, R.E., Reich, S.: On the asymptotic behavior of non-expansive mappings and semigroups in Banach spaces. Houst. J. Math. 4(1), 1–10 (1978)
Baillon, J.B., Bruck, R.E.: Optimal Rates of Asymptotic Regularity for Averaged Non-expansive Mappings. World Scientific Publishing Co. Pte. Ltd., Singapore (1992)
Baillon, J.B., Bruck, R.E.: The rate of asymptotic regularity is \(O(1/\sqrt{n})\). In: Kartsatos A.G. (ed.) Theory and Applications of Nonlinear Operators of Accretive and Monotone Types, Lecture Notes in Pure and Applied Mathematics, vol. 178, pp. 51–81. Dekker, New York (1996)
Bauschke, H.H., Combettes, P.L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces, CMS Books in Mathematics, vol. 408. Springer, Berlin (2011)
Berinde, V.: Iterative Approximation of Fixed Points. Lecture Notes in Mathematics, vol. 1912. Springer, Berlin (2007)
Borwein, J., Reich, S., Shafrir, I.: Krasnoselski–Mann iterations in normed spaces. Can. Math. Bull. 35(1), 21–28 (1992)
Bravo, M., Cominetti, R.: Sharp convergence rates for averaged non-expansive maps. Israel J. Math. 227(1), 163–188 (2018)
Bravo, M., Champion, T., Cominetti, R.: Universal bounds for fixed point iterations via optimal transport metrics, pp. 1–21 (2021). arXiv:2108.00300v1
Browder, F.E.: Convergence of approximants to fixed points of non-expansive nonlinear mappings in Banach spaces. Arch. Ration. Mech. Anal. 24(1), 82–90 (1967)
Browder, F.E., Petryshyn, W.V.: The solution by iteration of nonlinear functional equations in Banach spaces. Bull. Am. Math. Soc. 72(3), 571–575 (1966)
Colao, V., Marino, G.: On the rate of convergence of Halpern iterations. J. Nonlinear Convex Analysis 22(12), 2639–2646 (2021)
Cominetti, R., Soto, J.A., Vaisman, J.: On the rate of convergence of Krasnosel’skiĭ–Mann iterations and their connection with sums of Bernoullis. Israel J. Math. 199(2), 757–772 (2014)
Darroch, J.N.: On the distribution of the number of successes in independent trials. Ann. Math. Stat. 35(3), 1317–1321 (1964)
Diakonikolas, J.: Halpern iteration for near-optimal and parameter-free monotone inclusion and strong solutions to variational inequalities. Proc. Mach. Learn. Res. 125, 1–24 (2020)
Drori, Y., Teboulle, M.: Performance of first-order methods for smooth convex minimization: a novel approach. Math. Program. 145(1–2), 451–482 (2014)
Dutton, R.D., Brigham, R.C.: Computationally efficient bounds for the Catalan numbers. Eur. J. Combin. 7(3), 211–213 (1986)
Halpern, B.: Fixed points of nonexpanding maps. Bull. Am. Math. Soc. 73(6), 957–961 (1967)
Hoeffding, W.: On the distribution of the number of successes in independent trials. Ann. Math. Stat. 27(3), 713–721 (1956)
Kim, D.: Accelerated proximal point method for maximally monotone operators. Math. Program. 190(1), 57–87 (2021)
Kohlenbach, U.: On the logical analysis of proofs based on nonseparable Hilbert space theory. In: Fefferman, S., Sieg, W. (eds.) Proofs, Categories and Computations. Essays in Honor of Grigori Mints, pp. 131–143. College Publications, New York (2010)
Kohlenbach, U.: On quantitative versions of theorems due to F.E. Browder and R. Wittmann. Adv. Math. 226(3), 2764–2795 (2011)
Körnlein, D.: Quantitative results for Halpern iterations of non-expansive mappings. J. Math. Anal. Appl. 428(2), 1161–1172 (2015)
Krasnosel’skiĭ, M.A.: Two remarks on the method of successive approximations. Uspekhi Matematicheskikh Nauk 10, 123–127 (1955)
Leustean, L.: Rates of asymptotic regularity for Halpern iterations of non-expansive mappings. J. Univ. Comput. Sci. 13(11), 1680–1691 (2007)
Lieder, F.: On the convergence rate of the Halpern-iteration. Optim. Lett. 15(2), 405–418 (2021)
Mann, W.R.: Mean value methods in iteration. Proc. Am. Math. Soc. 4(3), 506–510 (1953)
Nesterov, Y.: Lectures on Convex Optimization, Springer Optimization and Its Applications, vol. 137. Springer, Berlin (2018)
Reich, S.: Fixed point iterations of non-expansive mappings. Pac. J. Math. 60(2), 195–198 (1975)
Reich, S.: Weak convergence theorems for non-expansive mappings in Banach spaces. J. Math. Anal. Appl. 67, 274–276 (1979)
Reich, S.: Strong convergence theorems for resolvents of accretive operators in Banach spaces. J. Math. Anal. Appl. 75, 287–292 (1980)
Reich, S.: Approximating fixed points of nonexpansive mappings. Panam. Math. J. 4(2), 23–28 (1994)
Ryu, E.K., Yin, W.: Large-Scale Convex Optimization via Monotone Operators. Cambridge University Press, Cambridge (2022). https://www.cambridge.org/core/books/largescale-convex-optimization/2A7F8E7428BFA4EDB8AFACA11AB97E4C
Sabach, S., Shtern, S.: A first order method for solving convex bilevel optimization problems. SIAM J. Optim. 27(2), 640–660 (2017)
Wittmann, R.: Approximation of fixed points of non-expansive mappings. Arch. Math. 58(5), 486–491 (1992)
Xu, H.-K.: Iterative algorithms for nonlinear operators. J. Lond. Math. Soc. 66(1), 240–256 (2002)
Xu, M., Balakrishnan, N.: On the convolution of heterogeneous Bernoulli random variables. J. Appl. Probab. 48(3), 877–884 (2011)
Acknowledgements
We thank Professor Simeon Reich (Israel Institute of Technology) for his interest in this paper, his valuable comments, and for pointing out relevant references that helped us improve the introductory section. We also thank the two anonymous referees who carefully read the paper and provided important insights that helped us improve the presentation. The work of Juan Pablo Contreras was supported by a doctoral scholarship from ANID-PFCHA/Doctorado Nacional/2019-21190161. Roberto Cominetti gratefully acknowledges the support provided by the research grant FONDECYT 1171501.
Appendices
Tightness of optimal transport bounds
In this section we present the proof of Theorem 1, establishing the tightness of the optimal transport bounds in general Mann iterations, and therefore the equality \(\varPsi _n(\pi )=R_n(\pi )\).
Proof of Theorem 1
Let \(z^{mn}\) and \(u^{mn}\) be optimal solutions for \(({\mathcal {P}}_{m,n})\) and \(({{\mathcal {D}}}_{m,n})\). Setting \(u_{i}^{mn}={\displaystyle \min _{0\le k\le n}}\{u_{k}^{mn}+d_{k-1,i-1}\}\) for \(i>n\), and using the triangle inequality, we get
In particular all the \(u_i^{mn}\)’s are within a distance at most 1 and, since the objective function in \(({{\mathcal {D}}}_{m,n})\) is invariant by translation, we may further assume that \(u_i^{mn}\in [0,1]\) for all \(i\in {\mathbb N}\).
Let \({\mathcal {I}}\) be the set of all pairs of integers (m, n) with \(-1\le m\le n\), and consider the unit cube \(C=[0,1]^{\mathcal {I}}\) in the space \((\ell ^\infty ({\mathcal {I}}),\Vert \cdot \Vert _\infty )\). For every integer \(k\in {\mathbb N}\) define \(y^{k}\in C\) as
and a corresponding sequence \(x^{k}\in C\) given by
We claim that \(\Vert y^{m+1}-y^{n+1}\Vert _\infty \le d_{m,n} =\Vert x^m-x^n\Vert _\infty \) for all \(0\le m\le n\). Indeed, using the triangle inequality and (27) we get
which together imply
Also, selecting an optimal transport \(z^{mn}\) for \(({\mathcal {P}}_{m,n})\) we have
so that the triangle inequality and (30) yield
On the other hand, considering the (m, n)-coordinate in (31), the complementary slackness (9) gives (recall that we are in the case \(0\le m\le n\))
which combined with (32) yields \(\Vert x^m-x^n\Vert _\infty =d_{m,n}\) as claimed.
Define \(T{:}S\rightarrow C\) on the set \(S=\{x^k{:}k\in {\mathbb N}\}\subseteq C\) by \(Tx^k=y^{k+1}\), so that T is non-expansive. Since both \(\ell ^\infty ({\mathcal {I}})\) and the unit cube C are hyperconvex, Theorem 4 in Aronszajn and Panitchpakdi [1] allows us to extend T to a non-expansive map \(T{:}C\rightarrow C\), and then (29) is precisely a Mann sequence that attains all the bounds \(\Vert x^m-x^n\Vert _\infty =d_{m,n}\) with equality.
It remains to prove that \(\Vert x^n-Tx^n\Vert _\infty =R_n\). The upper bound follows again using the triangle inequality and (30) since
For the reverse inequality, we look at the coordinate \((-1,n)\) so that
which completes the proof. \(\square \)
Remark 6
As in Bravo et al. [9] we observe that the map T must have a fixed point in C, and also that it can be extended to the full space \(\ell ^\infty ({\mathcal {I}})\).
Lower bound for Krasnosel’skiĭ–Mann iterations
In this Appendix we prove Proposition 3 by exhibiting a non-expansive linear operator T for which the Krasnosel’skiĭ–Mann sequence \(x^{n+1} = (1-\alpha _n)x^n +\alpha _nTx^n\) satisfies \(\Vert x^n-Tx^n\Vert \ge \frac{1}{\sqrt{n+1}}\), independently of the stepsizes \(\{\alpha _n\}_{n\ge 0}\).
Proof of Proposition 3
Let T be the right-shift operator (14) considered as a map acting on \((\ell ^1({\mathbb {N}}),\Vert \cdot \Vert _1)\). Fix an arbitrary sequence of stepsizes \(\alpha _n\) and consider the corresponding Krasnosel’skiĭ–Mann sequence started from \(x^0=(1,0,0,\ldots )\). It is easy to check inductively that the resulting KM iterates are given by \(x^n = (p^n_0,p^n_1,\ldots ,p^n_n,0,0,\ldots )\), where \(p^n_k = {\mathbb {P}}(S_n= k)\) is the distribution of a sum \(S_n=X_1+\cdots +X_n\) of independent Bernoulli random variables with \({\mathbb P}(X_i=1)=\alpha _i\).
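The Bernoulli-sum structure of the KM iterates can be checked numerically. The following sketch (an illustration, not part of the proof) truncates the right-shift to the first \(N+1\) coordinates of \(\ell^1\) and compares the iterates against the convolution of Bernoulli laws; the constant stepsizes \(\alpha_n=1/2\) are an illustrative assumption.

```python
import numpy as np

# Finite-dimensional sketch: T is the right-shift truncated to the first
# N+1 coordinates, and the Krasnosel'skii-Mann iterates from x^0 = e_0
# are compared with the distribution of a sum of independent Bernoullis.
# The constant stepsizes alpha_n = 1/2 are an illustrative assumption.

N = 30
alphas = [0.5] * N

def shift(x):
    # truncated right-shift: (x_0, ..., x_N) -> (0, x_0, ..., x_{N-1})
    y = np.zeros_like(x)
    y[1:] = x[:-1]
    return y

# Krasnosel'skii-Mann iteration x^{n+1} = (1-alpha_n) x^n + alpha_n T x^n
x = np.zeros(N + 1)
x[0] = 1.0
for a in alphas:
    x = (1 - a) * x + a * shift(x)

# distribution of S_N = X_1 + ... + X_N via successive convolutions
p = np.array([1.0])
for a in alphas:
    p = np.convolve(p, [1 - a, a])

match = np.allclose(x, p)              # checks x^N_k = P(S_N = k)
residual = np.abs(x - shift(x)).sum()  # ||x^N - T x^N||_1
print(match, residual >= 1 / np.sqrt(N + 1))
```

For unimodal \(p^N\) the residual telescopes to roughly twice the modal probability, which is exactly the quantity bounded below in the rest of the proof.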
A well-known result by Darroch [14] establishes that the distribution of \(S_n\) is bell-shaped, from which it follows that
the maximum being attained either at \(k=\lfloor \mu \rfloor \) or \(k=\lceil \mu \rceil \) (or both) where \(\mu =\alpha _1+\ldots +\alpha _n\). Moreover, taking \(B_n({{\bar{\alpha }}})\sim \text{ Binomial }(n,{{\bar{\alpha }}})\) with \({{\bar{\alpha }}} = \frac{1}{n}\sum _{i=1}^n \alpha _i\), a result from Hoeffding [19] shows that for \(0\le b \le n{{\bar{\alpha }}} \le c\le n\) we have (see also Xu & Balakrishnan [37])
Taking \(b=\lfloor n{{\bar{\alpha }}} \rfloor \) and \(c=\lceil n{{\bar{\alpha }}} \rceil \) it follows that
Let us define \(f_n{:}[0,1]\rightarrow [0,1]\) by \(f_n(x) = {\mathbb {P}}(\lfloor nx \rfloor \le B_n(x) \le \lceil nx \rceil )\). We want to compute the minimum value of \(f_n\). Firstly, we observe that \(f_n\) is symmetric with respect to \(x=\frac{1}{2}\), i.e. \(f_n(x) = f_n(1-x)\). Secondly, we note that \(f_n\) is discontinuous at the points of the form \(\frac{k}{n}\) for \(k=1,\ldots ,n-1\). In fact, one can check that \(f_n(\frac{k}{n})> f_n((\frac{k}{n})^-)\ge f_n((\frac{k}{n})^+)\) for all \(k = 1,\ldots ,\lfloor \frac{n}{2}\rfloor \), and symmetrically \(f_n(\frac{k}{n})> f_n((\frac{k}{n})^+)\ge f_n((\frac{k}{n})^-)\) for all \(k = \lceil \frac{n}{2}\rceil ,\ldots ,n-1\). On each open interval \(]\frac{k}{n},\frac{k+1}{n}[\) the function \(f_n\) is differentiable and concave (see Fig. 6), and its infimum is approached asymptotically at the endpoint of the interval closest to \(\frac{1}{2}\), namely
After some straightforward computations, one can conclude that the infimum of \(f_n\) over the full interval [0, 1] occurs when x tends to \(\frac{1}{n}\lfloor \frac{n}{2}\rfloor \) from the right and/or when x tends to \(\frac{1}{n}\lceil \frac{n}{2}\rceil \) from the left (see Fig. 6).
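These properties of \(f_n\) are easy to probe numerically. The sketch below (an illustration, not part of the proof) checks the symmetry \(f_n(x)=f_n(1-x)\) and, for even \(n=2m\), that the value approached as \(x\downarrow \frac{1}{2}\) agrees with the closed form \(\frac{2m+1}{m+1}\frac{1}{4^m}\binom{2m}{m}\) stated next.

```python
from math import comb, floor, ceil

# Numerical sketch of f_n(x) = P(floor(nx) <= B_n(x) <= ceil(nx)) with
# B_n(x) ~ Binomial(n, x): symmetry about x = 1/2, and the limiting
# value as x approaches 1/2 from the right when n = 2m is even.

def binom_pmf(n, k, x):
    return comb(n, k) * x ** k * (1 - x) ** (n - k)

def f(n, x):
    lo, hi = floor(n * x), min(ceil(n * x), n)
    return sum(binom_pmf(n, k, x) for k in range(lo, hi + 1))

n, m = 10, 5
sym_ok = abs(f(n, 0.23) - f(n, 0.77)) < 1e-12
inf_formula = (2 * m + 1) / (m + 1) * comb(2 * m, m) / 4 ** m
inf_ok = abs(f(n, 0.5 + 1e-6) - inf_formula) < 1e-4
print(sym_ok, inf_ok)
```

Just to the right of \(x=\frac{1}{2}\) the window \([\lfloor nx\rfloor ,\lceil nx\rceil ]\) covers two bins, \(\{m,m+1\}\), which is where the two binomial coefficients in the closed form come from.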
In particular, when \(n=2m\) is even, the infimum is obtained when approaching \(x=\frac{1}{2}\) either from the right or the left, with \(\inf f_{2m} = \frac{2m+1}{m+1}\frac{1}{4^{m}} \left( {\begin{array}{c}2m\\ m\end{array}}\right) \). We observe that \(\frac{1}{m+1}\left( {\begin{array}{c}2m\\ m\end{array}}\right) \) is a Catalan number, so that using the bound in Dutton & Brigham [17] we get
If \(n = 2m+1\) is odd, then \(\inf f_{2m+1}= \left( {\begin{array}{c}2m+1\\ m\end{array}}\right) \left( \frac{m(m+1)}{(2m+1)^2}\right) ^m\). Using this expression along with the expression for the even case, it is easy to check that \(\inf f_{2m+1}\ge \inf f_{2m+2}\), and therefore we conclude
completing the proof. \(\square \)
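As a sanity check on the two closed-form infima used above, the following sketch (an illustration only) verifies numerically that both the even and odd expressions stay above \(1/\sqrt{n+1}\), consistent with the \(\varOmega(1/\sqrt{n})\) lower bound of Proposition 3.

```python
from math import comb, sqrt

# Numerical check that the closed-form infima of f_n derived in the
# proof dominate 1/sqrt(n+1) for a range of n.

def inf_f(n):
    if n % 2 == 0:  # n = 2m
        m = n // 2
        return (2 * m + 1) / (m + 1) * comb(2 * m, m) / 4 ** m
    m = (n - 1) // 2  # n = 2m + 1
    return comb(2 * m + 1, m) * (m * (m + 1) / (2 * m + 1) ** 2) ** m

ok = all(inf_f(n) >= 1 / sqrt(n + 1) for n in range(1, 200))
print(ok)
```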
Contreras, J.P., Cominetti, R. Optimal error bounds for non-expansive fixed-point iterations in normed spaces. Math. Program. 199, 343–374 (2023). https://doi.org/10.1007/s10107-022-01830-7