Abstract
Intrinsic noise in evaluations of the objective function and its derivatives may cause premature termination of optimization algorithms. Evaluation complexity bounds taking this situation into account are presented in the framework of a deterministic trust-region method. The results show that the presence of intrinsic noise may dominate these bounds, in contrast with what is known for methods in which the inexactness of function and derivative evaluations is fully controllable. Moreover, the new analysis provides estimates of the optimality level achievable should noise cause early termination. Numerical experiments supporting the theory are reported. The analysis finally sheds some light on the impact of inexact computer arithmetic on evaluation complexity.
Notes
Similar results are also known for the stochastic case which is outside the scope of this paper.
It is easy to verify that, irrespective of \(\delta \), (2.4) holds for \(j=1\) if and only if \(\Vert \nabla _x^1f(x)\Vert \le \epsilon _1\) and that, if \(\Vert \nabla _x^1f(x)\Vert =0\), \(\lambda _{\min }[\nabla _x^2 f(x)] \ge -\epsilon _2\) if and only if \(\phi _{f,2}^\delta (x) \le \epsilon _2\).
We could obviously use values of \(\zeta _d\) and \(\vartheta _d\) depending on the degree \(\ell \), but we prefer the above formulation to simplify notation. Values of \(\zeta _d\) depending on the degree \(\ell \) are used in [17].
With respect to the “full precision” case. This can be done by making the (known) maximum truncation error for a given precision level small enough.
References
Bandeira, A.S., Scheinberg, K., Vicente, L.N.: Convergence of trust-region methods based on probabilistic models. SIAM J. Optim. 24(3), 1238–1264 (2014)
Bellavia, S., Gurioli, G.: Complexity analysis of a stochastic cubic regularisation method under inexact gradient evaluations and dynamic Hessian accuracy. Optimization 71(1), 227–261 (2022)
Bellavia, S., Gurioli, G., Morini, B.: Adaptive cubic regularization methods with dynamic inexact Hessian information and applications to finite-sum minimization. IMA J. Numer. Anal. 41(1), 764–799 (2021)
Bellavia, S., Gurioli, G., Morini, B., Toint, P.L.: Adaptive regularization algorithms with inexact evaluations for nonconvex optimization. SIAM J. Optim. 29(4), 2881–2915 (2019)
Bellavia, S., Gurioli, G., Morini, B., Toint, P.L.: Adaptive regularization for nonconvex optimization using inexact function values and randomly perturbed derivatives. J. Complex. 68, 101591 (2022)
Berahas, A., Cao, L., Scheinberg, K.: Global convergence rate analysis of a generic line search algorithm with noise. SIAM J. Optim. 31(2), 1489–1518 (2021)
Birgin, E.G., Krejić, N., Martínez, J.M.: On the employment of inexact restoration for the minimization of functions whose evaluation is subject to errors. Math. Comput. 87, 1307–1326 (2018)
Birgin, E.G., Krejić, N., Martínez, J.M.: Iteration and evaluation complexity on the minimization of functions whose computation is intrinsically inexact. Math. Comput. 89, 253–278 (2020)
Blanchet, J., Cartis, C., Menickelly, M., Scheinberg, K.: Convergence rate analysis of a stochastic trust region method via supermartingales. INFORMS J. Optim. 1(2), 92–119 (2019)
Buckley, A.G.: Test functions for unconstrained minimization. Technical Report CS-3, Computing Science Division, Dalhousie University, Halifax, Canada (1989)
Carter, R.G.: On the global convergence of trust region methods using inexact gradient information. SIAM J. Numer. Anal. 28(1), 251–265 (1991)
Cartis, C., Gould, N.I.M., Toint, P.L.: Worst-case evaluation complexity of regularization methods for smooth unconstrained optimization using Hölder continuous gradients. Optim. Methods Softw. 6(6), 1273–1298 (2017)
Cartis, C., Gould, N.I.M., Toint, P.L.: Worst-case evaluation complexity and optimality of second-order methods for nonconvex smooth optimization. In: Sirakov, B., de Souza, P., Viana, M. (eds.) Invited Lectures, Proceedings of the 2018 International Conference of Mathematicians (ICM 2018), vol. 4, pp. 3729–3768. World Scientific Publishing Co Pte Ltd, Rio de Janeiro (2018)
Cartis, C., Gould, N.I.M., Toint, P.L.: Universal regularization methods: varying the power, the smoothness and the accuracy. SIAM J. Optim. 29(1), 595–615 (2019)
Cartis, C., Gould, N.I.M., Toint, P.L.: Sharp worst-case evaluation complexity bounds for arbitrary-order nonconvex optimization with inexpensive constraints. SIAM J. Optim. 30(1), 513–541 (2020)
Cartis, C., Gould, N.I.M., Toint, P.L.: Strong evaluation complexity bounds for arbitrary-order optimization of nonconvex nonsmooth composite functions. arXiv:2001.10802 (2020)
Cartis, C., Gould, N.I.M., Toint, P.L.: Strong evaluation complexity of an inexact trust-region algorithm for arbitrary-order unconstrained nonconvex optimization. arXiv:2011.00854 (2020)
Chen, R., Menickelly, M., Scheinberg, K.: Stochastic optimization using a trust-region method and random models. Math. Program. Ser. A 169(2), 447–487 (2018)
Conn, A.R., Gould, N.I.M., Toint, P.L.: Trust-Region Methods. MPS-SIAM Series on Optimization, SIAM, Philadelphia (2000)
de Klerk, E., Laurent, M.: Worst-case examples for Lasserre’s measure-based hierarchy for polynomial optimization on the hypercube. Math. Oper. Res. 45(1), 86–98 (2019)
de Klerk, E., Laurent, M.: Convergence analysis of a Lasserre hierarchy of upper bounds for polynomial minimization on the sphere. Math. Program. 1–21 (2020)
Grapiglia, G.N., Nesterov, Yu.: Regularized Newton methods for minimizing functions with Hölder continuous Hessians. SIAM J. Optim. 27(1), 478–506 (2017)
Gratton, S., Sartenaer, A., Toint, P.L.: Recursive trust-region methods for multiscale nonlinear optimization. SIAM J. Optim. 19(1), 414–444 (2008)
Gratton, S., Simon, E., Toint, P.L.: An algorithm for the minimization of nonsmooth nonconvex functions using inexact evaluations and its worst-case complexity. Math. Program. Ser. A 187(1), 1–24 (2021)
Gratton, S., Toint, P.L.: A note on solving nonlinear optimization problems in variable precision. Comput. Optim. Appl. 76(3), 917–933 (2020)
Gratton, S., Toint, P.L.: OPM, a collection of optimization problems in Matlab. arXiv preprint arXiv:2112.05636 (2021)
Higham, N.J.: The rise of multiprecision computations. Talk at SAMSI 2017, April (2017). https://bit.ly/higham-samsi17
Moré, J.J., Garbow, B.S., Hillstrom, K.E.: Testing unconstrained optimization software. ACM Trans. Math. Softw. 7(1), 17–41 (1981)
Nesterov, Yu.: Gradient methods for minimizing composite objective functions. Math. Program. Ser. A 140(1), 125–161 (2013)
Nesterov, Yu.: Universal gradient methods for convex optimization problems. Math. Program. Ser. A 152(1–2), 381–404 (2015)
Oztoprak, F., Byrd, R., Nocedal, J.: Constrained optimization in the presence of noise. arXiv:2110.04355 (2021)
Paquette, C., Scheinberg, K.: A stochastic line search method with convergence rate analysis. SIAM J. Optim. 30(1), 349–376 (2020)
Slot, L., Laurent, M.: Improved convergence analysis of Lasserre’s measure-based upper bounds for polynomial minimization on compact sets. Math. Program. 1–41 (2020)
Xu, P., Roosta-Khorasani, F., Mahoney, M.W.: Newton-type methods for non-convex optimization under inexact Hessian information. Math. Program. Ser. A 184, 35–70 (2020)
Yao, Z., Xu, P., Roosta-Khorasani, F., Mahoney, M.W.: Inexact non-convex Newton-type methods. INFORMS J. Optim. 3(2), 154–182 (2021)
Yuan, Y.: Recent advances in trust region algorithms. Math. Program. Ser. A 151(1), 249–281 (2015)
Acknowledgements
INdAM-GNCS partially supported the first, second and third authors under Progetti di Ricerca 2019 and 2020. The fourth author was partially supported by INdAM through a GNCS grant and by Università degli Studi di Firenze through Fondi di Internazionalizzazione.
Additional information
Communicated by Paulo J. S. Silva.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix: Details of the Proof of Theorem 5.5
We follow the argument of [17, proof of Theorem 3.8] (adapting the bounds to the new context) and derive an upper bound on the number of derivative evaluations. This requires counting the number of additional derivative evaluations caused by successive tightenings of the accuracy threshold \(\zeta _{d,i_\zeta }\). Observe that repeated evaluations at a given iterate \(x_k\) are only needed when the current value of this threshold is smaller than that used previously at the same iterate. The \(\{\zeta _{d,i_\zeta }\}\) decrease geometrically, by construction, with ratio \(\gamma _\zeta \). Indeed, \(\zeta _{d,i_\zeta }\) is initialized to \(\zeta _{d,0}\le \kappa _\zeta \) in Step 0 of the TRqDAN algorithm, decreased by a factor \(\gamma _\zeta \) in (2.13) each time the CHECK algorithm is invoked in Step 1.2 of Algorithm 3.1, down to the value \(\zeta _{d,i_\zeta }\) which is then passed to Step 2, and possibly decreased there further in (2.13) in the CHECK algorithm invoked in Step 2.1 of the STEP2 algorithm, again by successive multiplications by \(\gamma _\zeta \). We now use (4.14) in Lemma 4.1 and (3.8) in Lemma 3.1 to deduce that, even in the absence of noise, \(\zeta _{d,i_\zeta }\) will not be reduced below the value
at iteration k. Now define
so that (5.17) implies that
We also note that conditions (2.16) and (2.13) in the CHECK algorithm impose that any reduced value of \(\zeta _{d,i_\zeta }\) (before termination) must satisfy the bound \(\zeta _{d,i_\zeta } \ge \vartheta _d\). Hence, the bound (A.1) can be strengthened to
Thus, no further reduction of the \(\zeta _{d,i_\zeta }\), and hence no further approximation of \(\{\overline{\nabla _x^j f}(x_k)\}_{j=1}^q\), can possibly occur in any iteration once the largest initial absolute error \(\zeta _{d,0}\) has been reduced by successive multiplications by \(\gamma _\zeta \) sufficiently to ensure that
the second inequality being equivalent to asking
where the right-hand side is negative because of the inequalities \(\kappa _\textrm{acc}<1\) and \(\max [\epsilon _{\min }^{q+1},\vartheta _d] \le \kappa _\zeta \) (imposed in the initialization step of the TRqDAN algorithm). We now recall that Step 1 of this algorithm is only used (and derivatives evaluated) after successful iterations. As a consequence, we deduce that the number of evaluations of the derivatives of the objective function occurring during the course of the TRqDAN algorithm before termination is at most
i.e., the number of iterations in (5.16), plus
the largest value of \(i_\zeta \) that ensures (A.3). Adding one for the final evaluation at termination, this leads to the desired evaluation bound (5.12) with the coefficients
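The counting argument above hinges on a simple fact: a threshold that starts at \(\zeta _{d,0}\) and is cut by a fixed factor \(\gamma _\zeta \in (0,1)\) at each tightening can only be cut a logarithmic number of times before reaching a fixed floor, so the extra derivative evaluations are bounded accordingly. A minimal illustrative sketch of this count (not the authors' code; `zeta_d0`, `zeta_min` and `gamma_zeta` are generic stand-ins for \(\zeta _{d,0}\), the lower bound implied by (A.2) and \(\gamma _\zeta \)):

```python
import math

def max_threshold_reductions(zeta_d0: float, zeta_min: float,
                             gamma_zeta: float) -> int:
    """Smallest i >= 0 such that zeta_d0 * gamma_zeta**i <= zeta_min,
    i.e., the maximum number of times the accuracy threshold can be
    tightened before it hits the floor zeta_min."""
    if zeta_d0 <= zeta_min:
        return 0  # the floor is reached before any reduction
    # solve zeta_d0 * gamma**i <= zeta_min for integer i
    return math.ceil(math.log(zeta_min / zeta_d0) / math.log(gamma_zeta))

# Example: initial threshold 1.0, floor 1e-3, reduction factor 1/2
i_max = max_threshold_reductions(1.0, 1e-3, 0.5)
print(i_max)
```

The bound grows only logarithmically in \(\zeta _{d,0}/\zeta _{\min }\), which is why the tightening mechanism adds a modest, accuracy-dependent term to the evaluation count in (5.12).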
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Bellavia, S., Gurioli, G., Morini, B. et al. The Impact of Noise on Evaluation Complexity: The Deterministic Trust-Region Case. J Optim Theory Appl 196, 700–729 (2023). https://doi.org/10.1007/s10957-022-02153-5