Abstract
Intrinsic noise in evaluations of the objective function and its derivatives may cause premature termination of optimization algorithms. Evaluation complexity bounds taking this situation into account are presented in the framework of a deterministic trust-region method. The results show that the presence of intrinsic noise may dominate these bounds, in contrast with what is known for methods in which the inexactness of function and derivative evaluations is fully controllable. Moreover, the new analysis provides estimates of the optimality level achievable should noise cause early termination. Numerical experiments supporting the theory are reported. The analysis finally sheds some light on the impact of inexact computer arithmetic on evaluation complexity.
Notes
Similar results are also known for the stochastic case which is outside the scope of this paper.
It is easy to verify that, irrespective of \(\delta \), (2.4) holds for \(j=1\) if and only if \(\Vert \nabla _x^1f(x)\Vert \le \epsilon _1\) and that, if \(\Vert \nabla _x^1f(x)\Vert =0\), \(\lambda _{\min }[\nabla _x^2 f(x)] \ge -\epsilon _2\) if and only if \(\phi _{f,2}^\delta (x) \le \epsilon _2\).
We could obviously use values of \(\zeta _d\) and \(\vartheta _d\) depending on the degree \(\ell \), but we prefer the above formulation to simplify notation. Values of \(\zeta _d\) depending on the degree \(\ell \) are used in [17].
With respect to the “full precision” case. This can be done by making the (known) maximum truncation error for a given precision level small enough.
References
Bandeira, A.S., Scheinberg, K., Vicente, L.N.: Convergence of trust-region methods based on probabilistic models. SIAM J. Optim. 24(3), 1238–1264 (2014)
Bellavia, S., Gurioli, G.: Complexity analysis of a stochastic cubic regularisation method under inexact gradient evaluations and dynamic Hessian accuracy. Optimization 71(1), 227–261 (2022)
Bellavia, S., Gurioli, G., Morini, B.: Adaptive cubic regularization methods with dynamic inexact Hessian information and applications to finite-sum minimization. IMA J. Numer. Anal. 41(1), 764–799 (2021)
Bellavia, S., Gurioli, G., Morini, B., Toint, P.L.: Adaptive regularization algorithms with inexact evaluations for nonconvex optimization. SIAM J. Optim. 29(4), 2881–2915 (2019)
Bellavia, S., Gurioli, G., Morini, B., Toint, P.L.: Adaptive regularization for nonconvex optimization using inexact function values and randomly perturbed derivatives. J. Complex. 68, 101591 (2022)
Berahas, A., Cao, L., Scheinberg, K.: Global convergence rate analysis of a generic line search algorithm with noise. SIAM J. Optim. 31(2), 1489–1518 (2021)
Birgin, E.G., Krejić, N., Martínez, J.M.: On the employment of inexact restoration for the minimization of functions whose evaluation is subject to errors. Math. Comput. 87, 1307–1326 (2018)
Birgin, E.G., Krejić, N., Martínez, J.M.: Iteration and evaluation complexity on the minimization of functions whose computation is intrinsically inexact. Math. Comput. 89, 253–278 (2020)
Blanchet, J., Cartis, C., Menickelly, M., Scheinberg, K.: Convergence rate analysis of a stochastic trust region method via supermartingales. INFORMS J. Optim. 1(2), 92–119 (2019)
Buckley, A.G.: Test functions for unconstrained minimization. Technical Report CS-3, Computing Science Division, Dalhousie University, Halifax, Canada (1989)
Carter, R.G.: On the global convergence of trust region methods using inexact gradient information. SIAM J. Numer. Anal. 28(1), 251–265 (1991)
Cartis, C., Gould, N.I.M., Toint, P.L.: Worst-case evaluation complexity of regularization methods for smooth unconstrained optimization using Hölder continuous gradients. Optim. Methods Softw. 6(6), 1273–1298 (2017)
Cartis, C., Gould, N.I.M., Toint, P.L.: Worst-case evaluation complexity and optimality of second-order methods for nonconvex smooth optimization. In: Sirakov, B., de Souza, P., Viana, M. (eds.) Invited Lectures, Proceedings of the 2018 International Conference of Mathematicians (ICM 2018), vol. 4, pp. 3729–3768. World Scientific Publishing Co Pte Ltd, Rio de Janeiro (2018)
Cartis, C., Gould, N.I.M., Toint, P.L.: Universal regularization methods: varying the power, the smoothness and the accuracy. SIAM J. Optim. 29(1), 595–615 (2019)
Cartis, C., Gould, N.I.M., Toint, P.L.: Sharp worst-case evaluation complexity bounds for arbitrary-order nonconvex optimization with inexpensive constraints. SIAM J. Optim. 30(1), 513–541 (2020)
Cartis, C., Gould, N.I.M., Toint, P.L.: Strong evaluation complexity bounds for arbitrary-order optimization of nonconvex nonsmooth composite functions. arXiv:2001.10802 (2020)
Cartis, C., Gould, N.I.M., Toint, P.L.: Strong evaluation complexity of an inexact trust-region algorithm for arbitrary-order unconstrained nonconvex optimization. arXiv:2011.00854 (2020)
Chen, R., Menickelly, M., Scheinberg, K.: Stochastic optimization using a trust-region method and random models. Math. Program. Ser. A 169(2), 447–487 (2018)
Conn, A.R., Gould, N.I.M., Toint, P.L.: Trust-Region Methods. MPS-SIAM Series on Optimization, SIAM, Philadelphia (2000)
de Klerk, E., Laurent, M.: Worst-case examples for Lasserre’s measure-based hierarchy for polynomial optimization on the hypercube. Math. Oper. Res. 45(1), 86–98 (2019)
de Klerk, E., Laurent, M.: Convergence analysis of a Lasserre hierarchy of upper bounds for polynomial minimization on the sphere. Math. Program. 1–21 (2020)
Grapiglia, G.N., Nesterov, Yu.: Regularized Newton methods for minimizing functions with Hölder continuous Hessians. SIAM J. Optim. 27(1), 478–506 (2017)
Gratton, S., Sartenaer, A., Toint, P.L.: Recursive trust-region methods for multiscale nonlinear optimization. SIAM J. Optim. 19(1), 414–444 (2008)
Gratton, S., Simon, E., Toint, P.L.: An algorithm for the minimization of nonsmooth nonconvex functions using inexact evaluations and its worst-case complexity. Math. Program. Ser. A 187(1), 1–24 (2021)
Gratton, S., Toint, P.L.: A note on solving nonlinear optimization problems in variable precision. Comput. Optim. Appl. 76(3), 917–933 (2020)
Gratton, S., Toint, P.L.: OPM, a collection of optimization problems in Matlab. arXiv preprint arXiv:2112.05636 (2021)
Higham, N.J.: The rise of multiprecision computations. Talk at SAMSI 2017, April (2017). https://bit.ly/higham-samsi17
Moré, J.J., Garbow, B.S., Hillstrom, K.E.: Testing unconstrained optimization software. ACM Trans. Math. Softw. 7(1), 17–41 (1981)
Nesterov, Yu.: Gradient methods for minimizing composite objective functions. Math. Program. Ser. A 140(1), 125–161 (2013)
Nesterov, Yu.: Universal gradient methods for convex optimization problems. Math. Program. Ser. A 152(1–2), 381–404 (2015)
Oztoprak, F., Byrd, R., Nocedal, J.: Constrained optimization in the presence of noise. arXiv:2110.04355 (2021)
Paquette, C., Scheinberg, K.: A stochastic line search method with convergence rate analysis. SIAM J. Optim. 30(1), 349–376 (2020)
Slot, L., Laurent, M.: Improved convergence analysis of Lasserre’s measure-based upper bounds for polynomial minimization on compact sets. Math. Program. 1–41 (2020)
Xu, P., Roosta-Khorasani, F., Mahoney, M.W.: Newton-type methods for non-convex optimization under inexact Hessian information. Math. Program. Ser. A 184, 35–70 (2020)
Yao, Z., Xu, P., Roosta-Khorasani, F., Mahoney, M.W.: Inexact non-convex Newton-type methods. INFORMS J. Optim. 3(2), 154–182 (2021)
Yuan, Y.: Recent advances in trust region algorithms. Math. Program. Ser. A 151(1), 249–281 (2015)
Acknowledgements
INdAM-GNCS partially supported the first, second and third authors under Progetti di Ricerca 2019 and 2020. The fourth author was partially supported by INdAM through a GNCS grant and by Università degli Studi di Firenze through Fondi di Internazionalizzazione.
Additional information
Communicated by Paulo J. S. Silva.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix: Details of the Proof of Theorem 5.5
We follow the argument of [17, proof of Theorem 3.8] (adapting the bounds to the new context) and derive an upper bound on the number of derivative evaluations. This requires counting the number of additional derivative evaluations caused by successive tightenings of the accuracy threshold \(\zeta _{d,i_\zeta }\). Observe that repeated evaluations at a given iterate \(x_k\) are only needed when the current value of this threshold is smaller than that used previously at the same iterate. The \(\{\zeta _{d,i_\zeta }\}\) decrease geometrically, by construction, with ratio \(\gamma _\zeta \). Indeed, \(\zeta _{d,i_\zeta }\) is initialized to \(\zeta _{d,0}\le \kappa _\zeta \) in Step 0 of the TRqDAN algorithm, decreased by a factor \(\gamma _\zeta \) in (2.13) each time the CHECK algorithm is invoked in Step 1.2 of Algorithm 3.1, down to the value \(\zeta _{d,i_\zeta }\) which is then passed to Step 2, and possibly decreased there further in (2.13) in the CHECK algorithm invoked in Step 2.1 of the STEP2 algorithm, again by successive multiplications by \(\gamma _\zeta \). We now use (4.14) in Lemma 4.1 and (3.8) in Lemma 3.1 to deduce that, even in the absence of noise, \(\zeta _{d,i_\zeta }\) will not be reduced below the value
at iteration k. Now define
so that (5.17) implies that
We also note that conditions (2.16) and (2.13) in the CHECK algorithm impose that any reduced value of \(\zeta _{d,i_\zeta }\) (before termination) must satisfy the bound \(\zeta _{d,i_\zeta } \ge \vartheta _d\). Hence, the bound (A.1) can be strengthened to
Thus, no further reduction of the \(\zeta _{d,i_\zeta }\), and hence no further approximation of \(\{\overline{\nabla _x^j f}(x_k)\}_{j=1}^q\), can possibly occur in any iteration once the largest initial absolute error \(\zeta _{d,0}\) has been reduced by successive multiplications by \(\gamma _\zeta \) sufficiently to ensure that
the second inequality being equivalent to asking
where the right-hand side is negative because of the inequalities \(\kappa _\textrm{acc}<1\) and \(\max [\epsilon _{\min }^{q+1},\vartheta _d] \le \kappa _\zeta \) (imposed in the initialization step of the TRqDAN algorithm). We now recall that Step 1 of this algorithm is only used (and derivatives evaluated) after successful iterations. As a consequence, we deduce that the number of evaluations of the derivatives of the objective function occurring during the course of the TRqDAN algorithm before termination is at most
i.e., the number of iterations in (5.16), plus
the largest value of \(i_\zeta \) that ensures (A.3). Adding one for the final evaluation at termination, this leads to the desired evaluation bound (5.12) with the coefficients
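The counting argument above hinges on a simple fact: a threshold that starts at \(\zeta _{d,0}\) and is cut by a fixed factor \(\gamma _\zeta \in (0,1)\) at each tightening can only be cut a logarithmic number of times before reaching a fixed floor, so the extra derivative evaluations are bounded accordingly. A minimal illustrative sketch of this count (not the authors' code; `zeta_d0`, `zeta_min` and `gamma_zeta` are generic stand-ins for \(\zeta _{d,0}\), the lower bound implied by (A.2) and \(\gamma _\zeta \)):

```python
import math

def max_threshold_reductions(zeta_d0: float, zeta_min: float,
                             gamma_zeta: float) -> int:
    """Smallest i >= 0 such that zeta_d0 * gamma_zeta**i <= zeta_min,
    i.e., the maximum number of times the accuracy threshold can be
    tightened before it hits the floor zeta_min."""
    if zeta_d0 <= zeta_min:
        return 0  # the floor is reached before any reduction
    # solve zeta_d0 * gamma**i <= zeta_min for integer i
    return math.ceil(math.log(zeta_min / zeta_d0) / math.log(gamma_zeta))

# Example: initial threshold 1.0, floor 1e-3, reduction factor 1/2
i_max = max_threshold_reductions(1.0, 1e-3, 0.5)
print(i_max)
```

The bound grows only logarithmically in \(\zeta _{d,0}/\zeta _{\min }\), which is why the tightening mechanism adds a modest, accuracy-dependent term to the evaluation count in (5.12).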
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Bellavia, S., Gurioli, G., Morini, B. et al. The Impact of Noise on Evaluation Complexity: The Deterministic Trust-Region Case. J Optim Theory Appl 196, 700–729 (2023). https://doi.org/10.1007/s10957-022-02153-5