
Efficiency of higher-order algorithms for minimizing composite functions

Abstract

Composite minimization involves a collection of functions that are aggregated in a nonsmooth manner. It covers, as particular cases, smooth approximation of minimax games, minimization of max-type functions, and simple composite minimization problems, where the objective function has a nonsmooth component. We design a higher-order majorization algorithmic framework for fully composite problems (possibly nonconvex). Our framework replaces each component with a higher-order surrogate such that the corresponding error function has a higher-order Lipschitz continuous derivative. We present convergence guarantees for our method for composite optimization problems with (non)convex and (non)smooth objective function. In particular, we prove stationary point convergence guarantees for general nonconvex (possibly nonsmooth) problems, and under the Kurdyka–Łojasiewicz (KL) property of the objective function we derive improved rates depending on the KL parameter. For convex (possibly nonsmooth) problems we also provide sublinear convergence rates.

References

1. Attouch, H., Bolte, J.: On the convergence of the proximal algorithm for nonsmooth functions involving analytic features. Math. Program. 116(1–2), 5–16 (2009)

2. Birgin, E.G., Gardenghi, J.L., Martínez, J.M., Santos, S.A., Toint, P.L.: Worst-case evaluation complexity for unconstrained nonlinear optimization using high-order regularized models. Math. Program. 163(1), 359–368 (2017)

3. Birgin, E.G., Gardenghi, J.L., Martínez, J.M., Santos, S.A.: On the use of third-order models with fourth-order regularization for unconstrained optimization. Optim. Lett. 14, 815–838 (2020)

4. Bolte, J., Daniilidis, A., Lewis, A., Shiota, M.: Clarke subgradients of stratifiable functions. SIAM J. Optim. 18(2), 556–572 (2007)

5. Bolte, J., Chen, Z., Pauwels, E.: The multiproximal linearization method for convex composite problems. Math. Program. 182, 1–36 (2020)

6. Bolte, J., Sabach, S., Teboulle, M.: Proximal alternating linearized minimization for nonconvex and nonsmooth problems. Math. Program. 146, 459–494 (2014)

7. Cartis, C., Gould, N., Toint, P.L.: A concise second-order complexity analysis for unconstrained optimization using high-order regularized models. Optim. Methods Softw. 35, 243–256 (2020)

8. Drusvyatskiy, D., Paquette, C.: Efficiency of minimizing compositions of convex functions and smooth maps. Math. Program. 178(1–2), 503–558 (2019)

9. Doikov, N., Nesterov, Yu.: Optimization methods for fully composite problems. SIAM J. Optim. 32(3), 2402–2427 (2022)

10. Fletcher, R.: A model algorithm for composite NDO problems. Math. Program. Stud. 17, 67–76 (1982)

11. Gasnikov, A., Dvurechensky, P., Gorbunov, E., Vorontsova, E., Selikhanovych, D., Uribe, C., Jiang, B., Wang, H., Zhang, S., Bubeck, S., Jiang, Q.: Near optimal methods for minimizing convex functions with Lipschitz \(p\)th derivatives. In: Conference on Learning Theory, pp. 1392–1393 (2019)

12. Gould, N.I.M., Rees, T., Scott, J.: Convergence and evaluation-complexity analysis of a regularized tensor-Newton method for solving nonlinear least-squares problems. Comput. Optim. Appl. 73(1), 1–35 (2019)

13. Grapiglia, G., Nesterov, Yu.: Tensor methods for minimizing convex functions with Hölder continuous higher-order derivatives. SIAM J. Optim. 30(4), 2750–2779 (2020)

14. Hiriart-Urruty, J.-B.: New concepts in nondifferentiable programming. Mémoires de la Société Mathématique de France 60, 57–85 (1979)

15. Li, C., Ng, K.F.: Majorizing functions and convergence of the Gauss-Newton method for convex composite optimization. SIAM J. Optim. 18(2), 613–642 (2007)

16. Mairal, J.: Incremental majorization-minimization optimization with application to large-scale machine learning. SIAM J. Optim. 25(2), 829–855 (2015)

17. Mordukhovich, B.: Variational Analysis and Generalized Differentiation. Basic Theory. Springer, Berlin (2006)

18. Moré, J.J., Garbow, B.S., Hillstrom, K.E.: Testing unconstrained optimization software. ACM Trans. Math. Softw. 7(1), 17–41 (1981)

19. Necoara, I., Nesterov, Yu., Glineur, F.: Linear convergence of first-order methods for non-strongly convex optimization. Math. Program. 175, 69–107 (2019)

20. Necoara, I., Lupu, D.: General higher-order majorization-minimization algorithms for (non)convex optimization (2020). arXiv preprint arXiv:2010.13893

21. Nesterov, Yu., Nemirovskii, A.: Interior-Point Polynomial Algorithms in Convex Programming. SIAM, Philadelphia (1994)

22. Nesterov, Yu.: Smooth minimization of non-smooth functions. Math. Program. 103, 127–152 (2005)

23. Nesterov, Yu., Polyak, B.T.: Cubic regularization of Newton method and its global performance. Math. Program. 108, 177–205 (2006)

24. Nesterov, Yu.: Implementable tensor methods in unconstrained convex optimization. Math. Program. 186, 157–183 (2021)

25. Nesterov, Yu.: Inexact basic tensor methods for some classes of convex optimization problems. Optim. Methods Softw. 37, 878–906 (2022)

26. Nesterov, Yu.: Gradient methods for minimizing composite functions. Math. Program. 140(1), 125–161 (2013)

27. Pauwels, E.: The value function approach to convergence analysis in composite optimization. Oper. Res. Lett. 44, 790–795 (2016)

28. Wächter, A., Biegler, L.T.: On the implementation of a primal-dual interior point filter line search algorithm for large-scale nonlinear programming. Math. Program. 106(1), 25–57 (2006)

29. Yuan, Y.: Conditions for convergence of trust-region algorithms for nonsmooth optimization. Math. Program. 31, 220–228 (1985)

Acknowledgements

The research leading to these results has received funding from: ITN-ETN project TraDE-OPT funded by the EU, H2020 Research and Innovation Programme under the Marie Skłodowska-Curie grant agreement No. 861137; NO Grants 2014-2021, under project ELO-Hyp, contract no. 24/2020; UEFISCDI PN-III-P4-PCE-2021-0720, under project L2O-MOC, nr. 70/2022.

Author information

Corresponding author

Correspondence to Ion Necoara.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Proof of Lemma 3

Let us first prove that for \(p=2\), \(g(\cdot )=\max (\cdot )\) and \(h(\cdot ) = 0\), one can efficiently compute the global solution \(x_{k+1}\) of the subproblem (15). Indeed, in this particular case (15) is equivalent to the following subproblem:

$$\begin{aligned} \min _{x \in {\mathbb {R}}^n} \max \limits _{i=1:m}&\left\{ F_i(x_k) + \langle \nabla F_i(x_k), x - x_k \rangle + \frac{1}{2} \left\langle \nabla ^2 F_i(x_k)(x - x_k), x - x_k\right\rangle \right. \\&\quad +\left. \frac{M_i}{6} \Vert x - x_k \Vert ^3\right\} .\nonumber \end{aligned}$$
(39)

Further, this is equivalent to:

$$\begin{aligned} \min _{x \in {\mathbb {R}}^n} \max _{\begin{array}{c} u \in \Delta _m \end{array}}\;&\sum _{i=1}^{m} u_i F_i(x_k) + \left\langle \sum _{i=1}^{m}u_i \nabla F_i(x_k), x - x_k \right\rangle \\&\quad + \frac{1}{2} \left\langle \sum _{i=1}^{m} u_i \nabla ^2 F_i(x_k)(x - x_k), x - x_k \right\rangle + \frac{\sum _{i=1}^{m}u_i M_i}{6} \Vert x - x_k\Vert ^3, \end{aligned}$$

where \(u=(u_1,\cdots ,u_m)\) and \(\Delta _m:= \left\{ u\ge 0: \sum _{i=1}^{m} u_i = 1 \right\} \) is the standard simplex in \({\mathbb {R}}^m\). Further, this \(\min -\max \) problem can be written as follows:

$$\begin{aligned}&\min _{x \in {\mathbb {R}}^n} \max _{ \begin{array}{c} u\in \Delta _m \end{array}}\; \sum _{i=1}^{m} u_i F_i(x_k) + \left\langle \sum _{i=1}^{m}u_i \nabla F_i(x_k), x - x_k \right\rangle \\&\qquad + \frac{1}{2} \left\langle \sum _{i=1}^{m} u_i \nabla ^2 F_i(x_k) (x - x_k), (x - x_k) \right\rangle \\&\qquad + \max _{w\ge 0} \left( \frac{w}{4}\Vert x - x_k\Vert ^2 - \frac{1}{12(\sum _{i=1}^{m} u_i M_i)^2} w^3 \right) . \end{aligned}$$

Denote for simplicity \(H_k(u,w) = \sum _{i=1}^{m} u_i \nabla ^2 F_i(x_k) + \frac{w}{2} I\), \(g_k(u) = \sum _{i=1}^{m}u_i \nabla F_i(x_k)\), \(l_k(u) = \sum _{i=1}^{m} u_i F_i(x_k)\) and \(\tilde{M}(u) = \sum _{i=1}^{m}u_i M_i\). With these notations, the \(\min -\max \) problem above takes the form:

$$\begin{aligned} \min _{x \in {\mathbb {R}}^n} \max _{ \begin{array}{c} u\in \Delta _m \\ w\in {\mathbb {R}}_+ \end{array}}\; l_k(u) + \left\langle g_k(u), x - x_k \right\rangle + \frac{1}{2} \left\langle H_k(u,w) (x - x_k), (x - x_k) \right\rangle - \frac{w^3}{12\tilde{M}(u)^2}. \end{aligned}$$

Consider the following notation:

$$\begin{aligned} \theta (x,u)&= l_k(u) + \langle g_k(u), x - x_k \rangle + \frac{1}{2} \left\langle \left( \sum _{i=1}^{m} u_i \nabla ^2 F_i(x_k)\right) (x - x_k), x - x_k \right\rangle \\&\quad + \frac{\tilde{M}(u)}{6}\Vert x - x_k\Vert ^3, \\ \beta (u,w)&= l_k(u) -\frac{1}{2} \left\langle H_k(u,w)^{-1} g_k(u) , g_k(u)\right\rangle - \frac{1}{12\tilde{M}(u)^2} w^3, \\ D&= \left\{ (u,w)\in \Delta _m\times {\mathbb {R}}_+:\; \sum _{i=1}^{m} u_i \nabla ^2 F_i(x_k) + \frac{w}{2} I\succ 0 \right\} . \end{aligned}$$

Below, we prove that if \(M_i >0\) for some \(i = 1:m\), then the following relation holds:

$$\begin{aligned} \theta ^*:=\min _{x\in {\mathbb {R}}^n}\max _{u\in \Delta _m} \theta (x,u) = \max _{(u,w)\in D}\beta (u,w) = \beta ^*. \end{aligned}$$

Additionally, for any \((u,w)\in D\) the point \(x_{k+1} = x_k -H_k(u,w)^{-1}g_k(u)\) satisfies:

$$\begin{aligned} 0\le \theta (x_{k+1},u) - \beta (u,w) = \frac{\tilde{M}(u)}{12} \left( \frac{w}{\tilde{M}(u)} + 2r_k\right) \left( r_k - \frac{w}{\tilde{M}(u)}\right) ^2, \end{aligned}$$
(40)

where \(r_k:= \Vert x_{k+1} - x_k\Vert \). Indeed, let us first show that \(\theta ^*\ge \beta ^*\). Using reasoning similar to [23], we have:

$$\begin{aligned} \theta ^*&= \min _{x \in {\mathbb {R}}^n} \max _{ \begin{array}{c} u\in \Delta _m\\ w \in {\mathbb {R}}_+ \end{array}}\; l_k(u) + \left\langle g_k(u), x - x_k \right\rangle + \frac{1}{2} \left\langle H_k(u,w) (x - x_k), x - x_k\right\rangle - \frac{ w^3}{12\tilde{M}(u)^2} \\&\ge \max _{ \begin{array}{c} u\in \Delta _m\\ w \in {\mathbb {R}}_+ \end{array}} \min _{x \in {\mathbb {R}}^n}\;l_k(u) + \left\langle g_k(u), x - x_k \right\rangle + \frac{1}{2} \left\langle H_k(u,w) (x - x_k), x - x_k \right\rangle - \frac{ w^3}{12\tilde{M}(u)^2} \\&\ge \max _{(u,w)\in D} \min _{x \in {\mathbb {R}}^n}\;l_k(u) + \left\langle g_k(u), x - x_k \right\rangle + \frac{1}{2} \left\langle H_k(u,w) (x - x_k), x - x_k \right\rangle - \frac{ w^3}{12\tilde{M}(u)^2}\\&= \max _{(u,w)\in D} {l_k(u)} -\frac{1}{2} \left\langle H_k(u,w)^{-1} g_k(u) , g_k(u)\right\rangle - \frac{1}{12\tilde{M}(u)^2} w^3 = \beta ^*. \end{aligned}$$

Let \((u,w) \in D\). Then, we have \(g_k(u)= - H_k(u,w)(x_{k+1} - x_k)\) and thus:

$$\begin{aligned} \theta (x_{k+1},u)&= l_k(u) + \langle g_k(u), x_{k+1} - x_k \rangle \\&\quad + \frac{1}{2} \left\langle \left( \sum _{i=1}^{m} u_i \nabla ^2 F_i(x_k)\right) (x_{k+1}- x_k), x_{k+1} - x_k \right\rangle + \frac{\tilde{M}(u)}{6}r_k^3 \\&= l_k(u) - \left\langle H_k(u,w)(x_{k+1} - x_k), x_{k+1} - x_k \right\rangle \\&\quad + \frac{1}{2} \left\langle \left( \sum _{i=1}^{m} u_i \nabla ^2 F_i(x_k)\right) (x_{k+1}- x_k), x_{k+1} - x_k \right\rangle + \frac{\tilde{M}(u)}{6}r_k^3 \\&= l_k(u) - \frac{1}{2}\!\! \left\langle \!\! \left( \sum _{i=1}^{m} u_i \nabla ^2 F_i(x_k) + \frac{w}{2}I\right) (x_{k+1} - x_k), x_{k+1} - x_k \!\!\right\rangle \\&\quad - \frac{w}{4}r_k^2 + \frac{\tilde{M}(u)}{6}r_k^3 \\&= \beta (u,w) + \frac{1}{12\tilde{M}(u)^2}w^3 - \frac{w}{4}r_k^2 + \frac{\tilde{M}(u)}{6}r_k^3 \\&= \beta (u,w) + \frac{\tilde{M}(u)}{12}\!\left( \frac{w}{\tilde{M}(u)}\right) ^3 \!\!\!- \!\frac{\tilde{M}(u)}{4}\!\! \left( \frac{w}{\tilde{M}(u)}\right) \!r_k^2 + \frac{\tilde{M}(u)}{6}r_k^3 \\&= \beta (u,w) + \frac{\tilde{M}(u)}{12} \left( \frac{w}{\tilde{M}(u)} + 2r_k\right) \left( r_k - \frac{w}{\tilde{M}(u)}\right) ^2, \end{aligned}$$

which proves (40). Note that we have [23]:

$$\begin{aligned} \nabla _w \beta (u,w)&= \frac{1}{4} \Vert x_{k+1} - x_k\Vert ^2 - \frac{1}{4\tilde{M}(u)^2} w^2 = \frac{1}{4} \left( r_k + \frac{w}{\tilde{M}(u)}\right) \left( r_k - \frac{w}{\tilde{M}(u)}\right) . \end{aligned}$$

Therefore, if \(\beta ^*\) is attained at some \((u^*,w^*) \in D\), then we have \(\nabla _w \beta (u^*,w^*) = 0\). This implies \(\frac{w^*}{\tilde{M}(u^*)} = r_k\) and by (40) we conclude that \(\theta ^* = \beta ^*\).

Finally, if \(x_{k+1}\) is a global solution of the subproblem (15) (or equivalently (39)), then it satisfies the inexact condition (25) with \(\delta = 0\). Hence, using the proof of Lemma 2 with \(\delta = 0\) we can conclude that Assumption 2 holds with \(y_{k+1}\) given in (24), \(L^{1}_{p}=\left( C_{L^{e}_{p}}^{\mu _{p}}\right) ^{1/3}\) and \(L^{2}_{p}=\frac{\mu _{p}}{2}\). \(\square \)
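
For illustration only (this is not part of the original proof), the following Python sketch checks identity (40) on randomly generated data for the case \(p=2\), \(g(\cdot )=\max (\cdot )\), \(h(\cdot )=0\): for a feasible pair \((u,w)\in D\) it forms \(x_{k+1} = x_k - H_k(u,w)^{-1}g_k(u)\) and compares \(\theta (x_{k+1},u)-\beta (u,w)\) with the closed-form right-hand side of (40). All problem data below (dimensions, values and derivatives of \(F_i\) at \(x_k\), and the constants \(M_i\)) are hypothetical placeholders.

```python
# Illustrative numerical check of identity (40), p = 2, g = max, h = 0.
# All data (n, m, F_i(x_k), grad F_i(x_k), hess F_i(x_k), M_i) are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
n, m = 5, 3
x_k = rng.standard_normal(n)
F_vals = rng.standard_normal(m)                      # F_i(x_k)
grads = rng.standard_normal((m, n))                  # rows are grad F_i(x_k)
hessians = [0.5 * (A + A.T) for A in rng.standard_normal((m, n, n))]
M = np.array([1.0, 2.0, 0.5])                        # constants M_i > 0

def theta_model(x, u):
    """theta(x, u): weighted second-order model plus cubic regularization."""
    d = x - x_k
    H_u = sum(ui * Hi for ui, Hi in zip(u, hessians))
    return (u @ F_vals + (u @ grads) @ d + 0.5 * d @ H_u @ d
            + (M @ u) / 6.0 * np.linalg.norm(d) ** 3)

def beta_dual(u, w):
    """beta(u, w) for (u, w) in D, i.e. with H_k(u, w) positive definite."""
    H_kw = sum(ui * Hi for ui, Hi in zip(u, hessians)) + 0.5 * w * np.eye(n)
    g_u = u @ grads
    return (u @ F_vals - 0.5 * g_u @ np.linalg.solve(H_kw, g_u)
            - w ** 3 / (12.0 * (M @ u) ** 2))

# Pick any feasible (u, w): u in the simplex, w large enough that H_k(u, w) > 0.
u = np.array([0.2, 0.5, 0.3])
H_u = sum(ui * Hi for ui, Hi in zip(u, hessians))
w = max(0.0, -2.0 * np.linalg.eigvalsh(H_u).min()) + 1.0
H_kw = H_u + 0.5 * w * np.eye(n)

x_next = x_k - np.linalg.solve(H_kw, u @ grads)      # x_{k+1} = x_k - H_k(u,w)^{-1} g_k(u)
r_k = np.linalg.norm(x_next - x_k)
M_u = M @ u

gap = theta_model(x_next, u) - beta_dual(u, w)
closed_form = M_u / 12.0 * (w / M_u + 2.0 * r_k) * (r_k - w / M_u) ** 2
print(gap, closed_form)                              # both sides of (40) coincide
```

As (40) predicts, the printed gap is nonnegative and vanishes exactly when \(w = \tilde{M}(u)\,r_k\), which is the condition \(\nabla _w \beta (u,w)=0\) used in the argument above.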

Proof of Remark 2

If g is the identity function, then taking \(y_{k+1}=x_{k+1}\) one can see that Assumption 3 holds for any nonnegative constants \(\theta _{1,p}\) and \(\theta _{2,p}\). If g is a general function, then Assumption 3 holds, provided that \(x_{k+1}\) satisfies the inexact optimality condition (25). Indeed, in this case, we have:

$$\begin{aligned} f(x_{k+1})&\le g \Big (s(x_{k+1};x_{k})\Big ) + h(x_{k+1})\\&{\mathop {\le }\limits ^{(25)}} \min _{{y:\; \Vert y - x_k\Vert \le D_k}} g \Big (s(y;x_{k})\Big ) + h(y) + \delta \Vert x_{k+1} - x_k\Vert ^{p+1}\\&{\mathop {\le }\limits ^{(25),(8)}} \min _{{y:\;\Vert y - x_k\Vert \le D_k}} g\big (F(y)\big )+ h(y) + \dfrac{g(L^{e}_{p})}{(p+1)!}\Vert y-x_{k}\Vert ^{p+1}\\&\quad + \delta \Vert x_{k+1} - x_k\Vert ^{p+1}\\&\le f(y_{k+1}) + \dfrac{g(L^{e}_{p})}{(p+1)!}\Vert y_{k+1}-x_{k}\Vert ^{p+1} + \delta \Vert x_{k+1} - x_k\Vert ^{p+1}, \end{aligned}$$

where the last inequality follows by taking \(y = y_{k+1}\). Hence, Assumption 3 holds in this case for \(\theta _{1,p} = \dfrac{g(L^{e}_{p})}{(p+1)!}\) and \(\theta _{2,p} = \delta \). Finally, if \(p=2\) and \(g(\cdot ) = \max (\cdot )\), then \(x_{k+1}\) is the global solution of the subproblem (15) and hence, using similar arguments as above, we can prove that Assumption 3 also holds in this case. \(\square \)

Proof of Lemma 6

Note that the sequence \(\lambda _{k}\) is nonincreasing and nonnegative, hence convergent. Consider first the case \(\theta \le 1\). Since \(\lambda _{k}-\lambda _{k+1}\) converges to 0, there exists \(k_{0}\) such that \(\lambda _{k}-\lambda _{k+1} \le 1\) and \(\lambda _{k+1}\le (C_{1}+C_{2})\left( \lambda _{k}-\lambda _{k+1}\right) \) for all \(k \ge k_{0}\). It follows that:

$$\begin{aligned} \lambda _{k+1}\le \frac{C_{1}+C_{2}}{1+C_{1}+C_{2}}\lambda _{k}, \end{aligned}$$

which proves the first statement. If \(1<\theta \le 2\), then there also exists an integer \(k_{0}\) such that \(\lambda _{k}-\lambda _{k+1} \le 1\) for all \(k\ge k_{0}\). Then, we have:

$$\begin{aligned} \lambda _{k+1}^{\theta }\le (C_{1}+C_{2})^{\theta }\left( \lambda _{k}-\lambda _{k+1}\right) . \end{aligned}$$

Since \(1<\theta \le 2\), taking \(0 < \beta = \theta -1 \le 1\) we have:

$$\begin{aligned} \left( \frac{1}{C_{1}+C_{2}}\right) ^{\theta }\lambda _{k+1}^{1+\beta }\le \lambda _{k}-\lambda _{k+1}, \end{aligned}$$

for all \(k\ge k_{0}\). From Lemma 11 in [25], we further have:

$$\begin{aligned} \lambda _{k}\le \frac{\lambda _{k_{0}}}{(1+\sigma (k-k_{0}))^{\frac{1}{\beta }}} \end{aligned}$$

for all \(k\ge k_{0}\) and for some \(\sigma >0\). Finally, if \(\theta > 2\), define \(h(s)=s^{-\theta }\) and let \(R>1\) be fixed. Since \(1/\theta < 1\), there exists a \(k_{0}\) such that \(\lambda _{k}-\lambda _{k+1} \le 1\) for all \(k \ge k_{0}\). Then, we have \(\lambda _{k+1}\le (C_{1}+C_{2})\left( \lambda _{k}-\lambda _{k+1}\right) ^{\frac{1}{\theta }}\), or equivalently:

$$\begin{aligned} 1\le (C_{1}+C_{2})^{\theta }(\lambda _{k}-\lambda _{k+1})h(\lambda _{k+1}). \end{aligned}$$

If we assume that \(h(\lambda _{k+1})\le R h(\lambda _{k}) \), then:

$$\begin{aligned} 1\le R(C_{1}+C_{2})^{\theta }(\lambda _{k}-\lambda _{k+1})h(\lambda _{k})&\le \frac{R(C_{1}+C_{2})^{\theta }}{-\theta +1}\left( \lambda _{k}^{-\theta +1} -\lambda _{k+1}^{-\theta +1} \right) . \end{aligned}$$

Denote \(\mu =\frac{-R(C_{1}+C_{2})^{\theta }}{-\theta +1}\). Then:

$$\begin{aligned} 0<\mu ^{-1} \le \lambda _{k+1}^{1-\theta } - \lambda _{k}^{1-\theta }. \end{aligned}$$
(41)

If we assume that \(h(\lambda _{k+1})> R h(\lambda _{k}) \) and set \(\gamma = R^{-\frac{1}{\theta }}\), then it follows immediately that \(\lambda _{k+1}\le \gamma \lambda _{k}\). Since \(1-\theta \) is negative, we get:

$$\begin{aligned} \lambda _{k+1}^{1-\theta }&\ge \gamma ^{1-\theta }\lambda _{k}^{1-\theta } \quad \iff \quad \lambda _{k+1}^{1-\theta } - \lambda _{k}^{1-\theta } \ge (\gamma ^{1-\theta }-1)\lambda _{k}^{1-\theta }. \end{aligned}$$

Since \(1- \theta <0\), \(\gamma ^{1-\theta } > 1\) and \(\lambda _{k}\) has a nonnegative limit, there exists \({\bar{\mu }} > 0\) such that \((\gamma ^{1-\theta } - 1) \lambda _{k}^{1-\theta } > {\bar{\mu }}\) for all \(k \ge k_0\). Therefore, in this case we also obtain:

$$\begin{aligned} 0< {\bar{\mu }} \le \lambda _{k+1}^{1-\theta }-\lambda _{k}^{1-\theta }. \end{aligned}$$
(42)

If we set \({\hat{\mu }}=\min (\mu ^{-1},{\bar{\mu }})\) and combine (41) and (42), we obtain:

$$\begin{aligned} 0< {\hat{\mu }} \le \lambda _{k+1}^{1-\theta }-\lambda _{k}^{1-\theta }. \end{aligned}$$

Summing the last inequality from \(k_{0}\) to k, we obtain \(\lambda _{k}^{1-\theta }-\lambda _{k_{0}}^{1-\theta }\ge {\hat{\mu }}(k-k_{0})\), i.e.:

$$\begin{aligned} \lambda _{k} \le \frac{{\hat{\mu }}^{-\frac{1}{\theta -1}}}{(k-k_{0})^{\frac{1}{\theta -1}}} \end{aligned}$$

for all \(k > k_0\). This concludes our proof. \(\square \)
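
As a purely illustrative complement (not part of the proof), the short Python sketch below iterates the extreme case of the recursion analyzed above, \(\lambda _{k+1} = (C_{1}+C_{2})\left( \lambda _{k}-\lambda _{k+1}\right) ^{1/\theta }\), for a hypothetical constant \(C_{1}+C_{2}=2\) and several values of \(\theta \), and reports the quantities predicted by the three regimes: a contraction factor of \((C_{1}+C_{2})/(1+C_{1}+C_{2})\) when \(\theta \le 1\), and boundedness of \(\lambda _{k}\,k^{1/(\theta -1)}\) when \(\theta > 1\).

```python
# Illustrative only: iterate the extreme case of the recursion from the proof,
#   lambda_{k+1} = (C1 + C2) * (lambda_k - lambda_{k+1})**(1/theta),
# for a hypothetical constant C1 + C2 = 2 and several values of theta.
# For theta <= 1 the decay is (at least) linear with factor (C1+C2)/(1+C1+C2);
# for theta > 1 the decay is sublinear of order k**(-1/(theta-1)).

def next_lambda(lam, C, theta, bisect_steps=200):
    """Solve x = C * (lam - x)**(1/theta) for x in [0, lam] by bisection."""
    lo, hi = 0.0, lam
    for _ in range(bisect_steps):
        mid = 0.5 * (lo + hi)
        if mid <= C * (lam - mid) ** (1.0 / theta):
            lo = mid              # the fixed point lies to the right of mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

C = 2.0                           # plays the role of C1 + C2 (hypothetical value)
K = 200                           # number of outer iterations
for theta in (1.0, 1.5, 3.0):
    lam, lam_prev = 1.0, None
    for _ in range(K):
        lam_prev, lam = lam, next_lambda(lam, C, theta)
    if theta <= 1.0:
        # linear regime: the ratio equals (C1+C2)/(1+C1+C2) = 2/3 here
        print(theta, lam / lam_prev)
    else:
        # sublinear regime: lambda_K * K**(1/(theta-1)) stays bounded
        print(theta, lam * K ** (1.0 / (theta - 1.0)))
```

The bisection step simply solves the scalar fixed-point equation defining the extreme case of the recursion; running the script shows the contraction factor 2/3 for \(\theta = 1\) and a bounded normalized sequence for \(\theta = 1.5\) and \(\theta = 3\), matching the rates derived above.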

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Nabou, Y., Necoara, I. Efficiency of higher-order algorithms for minimizing composite functions. Comput Optim Appl 87, 441–473 (2024). https://doi.org/10.1007/s10589-023-00533-9
