
The Impact of Noise on Evaluation Complexity: The Deterministic Trust-Region Case


Abstract

Intrinsic noise in the evaluation of the objective function and its derivatives may cause premature termination of optimization algorithms. Evaluation complexity bounds taking this situation into account are presented in the framework of a deterministic trust-region method. The results show that the presence of intrinsic noise may dominate these bounds, in contrast with what is known for methods in which the inexactness of function and derivative evaluations is fully controllable. Moreover, the new analysis provides estimates of the optimality level achievable, should noise cause early termination. Numerical experiments supporting the theory are reported. Finally, the analysis sheds some light on the impact of inexact computer arithmetic on evaluation complexity.


Notes

  1. Similar results are also known for the stochastic case, which is outside the scope of this paper.

  2. It is easy to verify that, irrespective of \(\delta \), (2.4) holds for \(j=1\) if and only if \(\Vert \nabla _x^1f(x)\Vert \le \epsilon _1\) and that, if \(\Vert \nabla _x^1f(x)\Vert =0\), \(\lambda _{\min }[\nabla _x^2 f(x)] \ge -\epsilon _2\) if and only if \(\phi _{f,2}^\delta (x) \le \epsilon _2\).

  3. We could obviously use values of \(\zeta _d\) and \(\vartheta _d\) depending on the degree \(\ell \), but we prefer the above formulation to simplify notation. Values of \(\zeta _d\) depending on the degree \(\ell \) are used in [17].

  4. The CHECK algorithm is identical to the VERIFY algorithm of [17] (itself inspired by [4]) whenever accuracy is either absolute or relative.

  5. We keep Algorithms 3.1 and 3.2 distinct for ease of analysis.

  6. VERIFY in [17].

  7. With respect to the “full precision” case. This can be done by making the (known) maximum truncation error for a given precision level small enough; a short numeric illustration follows these notes.
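As a concrete illustration of footnote 7 (a sketch added here for illustration, not taken from the paper), the unit roundoff \(u\) of each IEEE-754 format gives the known maximum relative truncation error at that precision level:

```python
import numpy as np

# Known unit roundoff u for standard IEEE-754 formats: every rounded result
# satisfies fl(x) = x(1 + delta) with |delta| <= u, so the maximum relative
# truncation error at a given precision level is known a priori and can be
# made "small enough" by choosing a sufficiently high precision.
for dtype in (np.float16, np.float32, np.float64):
    u = np.finfo(dtype).eps / 2.0  # unit roundoff = half the machine epsilon
    print(f"{np.dtype(dtype).name:>8}: u = {u:.3e}")
```

Running this prints \(u \approx 4.9\cdot 10^{-4}\), \(6.0\cdot 10^{-8}\) and \(1.1\cdot 10^{-16}\) for half, single and double precision, respectively, so the truncation error can indeed be made as small as required by raising the precision level.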

References

  1. Bandeira, A.S., Scheinberg, K., Vicente, L.N.: Convergence of trust-region methods based on probabilistic models. SIAM J. Optim. 24(3), 1238–1264 (2014)

  2. Bellavia, S., Gurioli, G.: Complexity analysis of a stochastic cubic regularisation method under inexact gradient evaluations and dynamic Hessian accuracy. Optimization 71(1), 227–261 (2022)

  3. Bellavia, S., Gurioli, G., Morini, B.: Adaptive cubic regularization methods with dynamic inexact Hessian information and applications to finite-sum minimization. IMA J. Numer. Anal. 41(1), 764–799 (2021)

  4. Bellavia, S., Gurioli, G., Morini, B., Toint, P.L.: Adaptive regularization algorithms with inexact evaluations for nonconvex optimization. SIAM J. Optim. 29(4), 2881–2915 (2019)

  5. Bellavia, S., Gurioli, G., Morini, B., Toint, P.L.: Adaptive regularization for nonconvex optimization using inexact function values and randomly perturbed derivatives. J. Complex. 68, 101591 (2022)

  6. Berahas, A., Cao, L., Scheinberg, K.: Global convergence rate analysis of a generic line search algorithm with noise. SIAM J. Optim. 31(2), 1489–1518 (2021)

  7. Birgin, E.G., Krejić, N., Martínez, J.M.: On the employment of inexact restoration for the minimization of functions whose evaluation is subject to errors. Math. Comput. 87, 1307–1326 (2018)

  8. Birgin, E.G., Krejić, N., Martínez, J.M.: Iteration and evaluation complexity on the minimization of functions whose computation is intrinsically inexact. Math. Comput. 89, 253–278 (2020)

  9. Blanchet, J., Cartis, C., Menickelly, M., Scheinberg, K.: Convergence rate analysis of a stochastic trust region method via supermartingales. INFORMS J. Optim. 1(2), 92–119 (2019)

  10. Buckley, A.G.: Test functions for unconstrained minimization. Technical Report CS-3, Computing Science Division, Dalhousie University, Halifax, Canada (1989)

  11. Carter, R.G.: On the global convergence of trust region methods using inexact gradient information. SIAM J. Numer. Anal. 28(1), 251–265 (1991)

  12. Cartis, C., Gould, N.I.M., Toint, P.L.: Worst-case evaluation complexity of regularization methods for smooth unconstrained optimization using Hölder continuous gradients. Optim. Methods Softw. 32(6), 1273–1298 (2017)

  13. Cartis, C., Gould, N.I.M., Toint, P.L.: Worst-case evaluation complexity and optimality of second-order methods for nonconvex smooth optimization. In: Sirakov, B., de Souza, P., Viana, M. (eds.) Invited Lectures, Proceedings of the 2018 International Congress of Mathematicians (ICM 2018), vol. 4, pp. 3729–3768. World Scientific Publishing Co Pte Ltd, Rio de Janeiro (2018)

  14. Cartis, C., Gould, N.I.M., Toint, P.L.: Universal regularization methods: varying the power, the smoothness and the accuracy. SIAM J. Optim. 29(1), 595–615 (2019)

  15. Cartis, C., Gould, N.I.M., Toint, P.L.: Sharp worst-case evaluation complexity bounds for arbitrary-order nonconvex optimization with inexpensive constraints. SIAM J. Optim. 30(1), 513–541 (2020)

  16. Cartis, C., Gould, N.I.M., Toint, P.L.: Strong evaluation complexity bounds for arbitrary-order optimization of nonconvex nonsmooth composite functions. arXiv:2001.10802 (2020)

  17. Cartis, C., Gould, N.I.M., Toint, P.L.: Strong evaluation complexity of an inexact trust-region algorithm for arbitrary-order unconstrained nonconvex optimization. arXiv:2011.00854 (2020)

  18. Chen, R., Menickelly, M., Scheinberg, K.: Stochastic optimization using a trust-region method and random models. Math. Program. Ser. A 169(2), 447–487 (2018)

  19. Conn, A.R., Gould, N.I.M., Toint, P.L.: Trust-Region Methods. MPS-SIAM Series on Optimization, SIAM, Philadelphia (2000)

  20. de Klerk, E., Laurent, M.: Worst-case examples for Lasserre’s measure-based hierarchy for polynomial optimization on the hypercube. Math. Oper. Res. 45(1), 86–98 (2019)

  21. de Klerk, E., Laurent, M.: Convergence analysis of a Lasserre hierarchy of upper bounds for polynomial minimization on the sphere. Math. Program. 1–21 (2020)

  22. Grapiglia, G.N., Nesterov, Yu.: Regularized Newton methods for minimizing functions with Hölder continuous Hessians. SIAM J. Optim. 27(1), 478–506 (2017)

  23. Gratton, S., Sartenaer, A., Toint, P.L.: Recursive trust-region methods for multiscale nonlinear optimization. SIAM J. Optim. 19(1), 414–444 (2008)

  24. Gratton, S., Simon, E., Toint, P.L.: An algorithm for the minimization of nonsmooth nonconvex functions using inexact evaluations and its worst-case complexity. Math. Program. Ser. A 187(1), 1–24 (2021)

  25. Gratton, S., Toint, P.L.: A note on solving nonlinear optimization problems in variable precision. Comput. Optim. Appl. 76(3), 917–933 (2020)

  26. Gratton, S., Toint, P.L.: OPM, a collection of optimization problems in Matlab. arXiv:2112.05636 (2021)

  27. Higham, N.J.: The rise of multiprecision computations. Talk at SAMSI 2017, April (2017). https://bit.ly/higham-samsi17

  28. Moré, J.J., Garbow, B.S., Hillstrom, K.E.: Testing unconstrained optimization software. ACM Trans. Math. Softw. 7(1), 17–41 (1981)

  29. Nesterov, Yu.: Gradient methods for minimizing composite objective functions. Math. Program. Ser. A 140(1), 125–161 (2013)

  30. Nesterov, Yu.: Universal gradient methods for convex optimization problems. Math. Program. Ser. A 152(1–2), 381–404 (2015)

  31. Oztoprak, F., Byrd, R., Nocedal, J.: Constrained optimization in the presence of noise. arXiv:2110.04355 (2021)

  32. Paquette, C., Scheinberg, K.: A stochastic line search method with convergence rate analysis. SIAM J. Optim. 30(1), 349–376 (2020)

  33. Slot, L., Laurent, M.: Improved convergence analysis of Lasserre’s measure-based upper bounds for polynomial minimization on compact sets. Math. Program. 1–41 (2020)

  34. Xu, P., Roosta-Khorasani, F., Mahoney, M.W.: Newton-type methods for non-convex optimization under inexact Hessian information. Math. Program. Ser. A 184, 35–70 (2020)

  35. Yao, Z., Xu, P., Roosta-Khorasani, F., Mahoney, M.W.: Inexact non-convex Newton-type methods. INFORMS J. Optim. 3(2), 154–182 (2021)

  36. Yuan, Y.: Recent advances in trust region algorithms. Math. Program. Ser. A 151(1), 249–281 (2015)

Acknowledgements

INdAM-GNCS partially supported the first, second and third authors under Progetti di Ricerca 2019 and 2020. The fourth author was partially supported by INdAM through a GNCS grant and by Università degli Studi di Firenze through Fondi di Internazionalizzazione.

Author information

Correspondence to Benedetta Morini.

Additional information

Communicated by Paulo J. S. Silva.


Appendix: Details of the Proof of Theorem 5.5

We follow the argument of [17, proof of Theorem 3.8] (adapting the bounds to the new context) and derive an upper bound on the number of derivative evaluations. This requires counting the number of additional derivative evaluations caused by successive tightenings of the accuracy threshold \(\zeta _{d,i_\zeta }\). Observe that repeated evaluations at a given iterate \(x_k\) are only needed when the current value of this threshold is smaller than the value used previously at the same iterate. The \(\{\zeta _{d,i_\zeta }\}\) are, by construction, linearly decreasing with rate \(\gamma _\zeta \). Indeed, \(\zeta _{d,i_\zeta }\) is initialized to \(\zeta _{d,0}\le \kappa _\zeta \) in Step 0 of the TRqEDAN algorithm, decreased each time by a factor \(\gamma _\zeta \) in (2.13) in the CHECK algorithm invoked in Step 1.2 of Algorithm 3.1, down to the value \(\zeta _{d,i_\zeta }\) which is then passed to Step 2, and possibly decreased further there, again by successive multiplications by \(\gamma _\zeta \), in (2.13) in the CHECK algorithm invoked in Step 2.1 of the STEP2 algorithm. We now use (4.14) in Lemma 4.1 and (3.8) in Lemma 3.1 to deduce that, even in the absence of noise, \(\zeta _{d,i_\zeta }\) will not be reduced below the value

$$\begin{aligned} \min \left[ \frac{\omega }{4}\,\varsigma \,\epsilon _j\,\frac{\delta _k^{j-1}}{j!},\; \frac{\omega }{8(1+\omega )\max [1,\Delta _{\max }^j]}\,\epsilon _j\,\frac{\delta _k^j}{j!}\right] \ge \frac{\varsigma \,\omega }{8(1+\omega )\max [1,\Delta _{\max }^j]} \,\epsilon _j\,\frac{\delta _k^j}{j!} \end{aligned}$$
(A.1)

at iteration \(k\). Now define

$$\begin{aligned} \kappa _\textrm{acc} {\mathop {=}\limits ^\textrm{def}}\frac{\varsigma \omega (\varsigma \kappa _{\delta })^q}{8(1+\omega )\max [1,\Delta _{\max }^q]\,q!} \le \frac{\varsigma \omega }{8(1+\omega )\max [1,\Delta _{\max }^j]} \,\frac{(\varsigma \kappa _{\delta })^j}{j!}, \end{aligned}$$

so that (5.17) implies that

$$\begin{aligned} \kappa _\textrm{acc}\epsilon _{\min }^{q+1} \le \frac{\varsigma \omega \,\epsilon _j}{8(1+\omega )\max [1,\Delta _{\max }^j]} \,\frac{\delta _k^j}{j!}. \end{aligned}$$

We also note that conditions (2.16) and (2.13) in the CHECK algorithm impose that any reduced value of \(\zeta _{d,i_\zeta }\) (before termination) must satisfy the bound \(\zeta _{d,i_\zeta } \ge \vartheta _d\). Hence, the lower bound (A.1) can be strengthened to

$$\begin{aligned} \max \left[ \vartheta _d, \kappa _\textrm{acc}\epsilon _{\min }^{q+1}\right] . \end{aligned}$$

Thus, no further reduction of the \(\zeta _{d,i_\zeta }\), and hence no further approximation of \(\{\overline{\nabla _x^j f}(x_k)\}_{j=1}^q\), can possibly occur in any iteration once the largest initial absolute error \(\zeta _{d,0}\) has been reduced by successive multiplications by \(\gamma _\zeta \) sufficiently to ensure that

$$\begin{aligned} \gamma _\zeta ^{i_\zeta }\zeta _{d,0} \le \gamma _\zeta ^{i_\zeta }\kappa _\zeta \le \max [\vartheta _d,\kappa _\textrm{acc} \epsilon _{\min }^{q+1}], \end{aligned}$$
(A.2)

the second inequality being equivalent to asking

$$\begin{aligned} i_\zeta \log (\gamma _\zeta ) \le \max \left[ \log (\vartheta _d), (q+1)\log \left( \epsilon _{\min }\right) + \log (\kappa _\textrm{acc})\right] - \log \left( \kappa _\zeta \right) , \end{aligned}$$
(A.3)

where the right-hand side is negative because of the inequalities \(\kappa _\textrm{acc}<1\) and \(\max [\epsilon _{\min }^{q+1},\vartheta _d] \le \kappa _\zeta \) (imposed in the initialization step of the TRqEDAN algorithm). We now recall that Step 1 of this algorithm is only used (and derivatives evaluated) after successful iterations. As a consequence, we deduce that the number of evaluations of the derivatives of the objective function that occur during the course of the TRqEDAN algorithm before termination is at most

$$\begin{aligned} |\mathcal{S}_k| + i_{\zeta ,\max }, \end{aligned}$$
(A.4)

i.e., the number of iterations in (5.16), plus

$$\begin{aligned} \begin{array}{lcl} i_{\zeta ,\max } &{\mathop {=}\limits ^\textrm{def}}& \left\lfloor \frac{1}{\log (\gamma _\zeta )} \max \left\{ \log \left( \frac{\vartheta _d}{\zeta _{d,0}}\right) ,\; (q+1)\log \left( \epsilon _{\min }\right) +\log \left( \frac{\kappa _\textrm{acc}}{\zeta _{d,0}}\right) \right\} \right\rfloor \\ &<& \frac{1}{|\log (\gamma _\zeta )|} \left\{ \left| \log \left( \frac{\vartheta _d}{\zeta _{d,0}}\right) \right| + (q+1)\left| \log \left( \epsilon _{\min }\right) \right| + \left| \log \left( \frac{\kappa _\textrm{acc}}{\zeta _{d,0}}\right) \right| \right\} +1, \end{array} \end{aligned}$$

the largest value of \(i_\zeta \) that ensures (A.3). Adding one for the final evaluation at termination then yields the desired evaluation bound (5.12) with the coefficients

$$\begin{aligned} \kappa ^D_{{\mathrm{\mathsf TRqEDAN}}} {\mathop {=}\limits ^\textrm{def}}\frac{q+1}{|\log \gamma _\zeta |} \quad \hbox {and} \quad \kappa ^E_{{\mathrm{\mathsf TRqEDAN}}} {\mathop {=}\limits ^\textrm{def}}\frac{1}{|\log (\gamma _\zeta )|} \left\{ \left| \log \left( \frac{\kappa _\textrm{acc}}{\zeta _{d,0}}\right) \right| + \left| \log \left( \frac{\vartheta _d}{\zeta _{d,0}}\right) \right| \right\} +2. \end{aligned}$$
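To make the counting argument concrete, the following minimal sketch (added here for illustration and not part of the paper; all parameter values, and the names zeta_d0, gamma, theta_d, kappa_acc, eps_min standing in for \(\zeta _{d,0}\), \(\gamma _\zeta \), \(\vartheta _d\), \(\kappa _\textrm{acc}\), \(\epsilon _{\min }\), are assumptions) simulates the geometric tightening of \(\zeta _{d,i_\zeta }\) and checks the resulting count against the closed-form bound \(i_{\zeta ,\max }\):

```python
import math

# Illustrative parameter values (assumptions chosen for this sketch only).
zeta_d0   = 1.0e-1   # initial threshold zeta_{d,0} <= kappa_zeta
gamma     = 0.5      # reduction factor gamma_zeta in (0,1)
theta_d   = 1.0e-12  # noise floor vartheta_d on the threshold
kappa_acc = 1.0e-2   # constant kappa_acc defined above
eps_min   = 1.0e-3   # smallest accuracy tolerance epsilon_min
q         = 2        # highest derivative degree used

# Smallest value the threshold needs to reach, as in (A.2).
floor_val = max(theta_d, kappa_acc * eps_min ** (q + 1))

# Simulate the tightening loop: multiply by gamma_zeta while the reduced
# value would still exceed the floor, i.e. count the largest i with
# gamma^i * zeta_{d,0} > floor_val (the one floor-crossing reduction the
# algorithm may still perform is absorbed by the "+1" in the bound).
zeta, reductions = zeta_d0, 0
while zeta * gamma > floor_val:
    zeta *= gamma
    reductions += 1

# Closed-form count from (A.3): floor of log(floor_val/zeta_{d,0})/log(gamma).
i_zeta_max = math.floor(math.log(floor_val / zeta_d0) / math.log(gamma))
assert reductions == i_zeta_max

# Coefficients of the evaluation bound (5.12), mirroring the display above.
kappa_D = (q + 1) / abs(math.log(gamma))
kappa_E = (abs(math.log(kappa_acc / zeta_d0))
           + abs(math.log(theta_d / zeta_d0))) / abs(math.log(gamma)) + 2.0
print(reductions, i_zeta_max, round(kappa_D, 3), round(kappa_E, 3))
```

With these illustrative values both counts equal 33, matching \(\lfloor \log (\max [\vartheta _d,\kappa _\textrm{acc}\epsilon _{\min }^{q+1}]/\zeta _{d,0})/\log \gamma _\zeta \rfloor \) and showing that the number of repeated derivative evaluations grows only logarithmically with \(1/\epsilon _{\min }\).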

About this article

Cite this article

Bellavia, S., Gurioli, G., Morini, B. et al. The Impact of Noise on Evaluation Complexity: The Deterministic Trust-Region Case. J Optim Theory Appl 196, 700–729 (2023). https://doi.org/10.1007/s10957-022-02153-5

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10957-022-02153-5

Keywords

Navigation