
Gradient-Type Methods for Optimization Problems with Polyak-Łojasiewicz Condition: Early Stopping and Adaptivity to Inexactness Parameter

  • Conference paper
Advances in Optimization and Applications (OPTIMA 2022)

Abstract

The problem of minimizing a smooth function satisfying the Polyak-Łojasiewicz condition has received considerable attention from researchers due to its applications in machine learning and related engineering fields. Recently, for this problem, the authors of [14] proposed an adaptive gradient-type method that uses an inexact gradient, where the adaptivity is only with respect to the Lipschitz constant of the gradient. In this paper, for problems with the Polyak-Łojasiewicz condition, we propose a fully adaptive algorithm, meaning that the adaptivity is with respect to both the Lipschitz constant of the gradient and the level of noise in the gradient. We provide a detailed convergence analysis of the proposed algorithm and an estimate of the distance from the starting point to the point output by the algorithm. Numerical experiments and comparisons illustrate the advantages of the proposed algorithm on several examples.

The research was supported by the Russian Science Foundation and Moscow (project No. 22-21-20065, https://rscf.ru/project/22-21-20065/).
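As a rough illustration of the kind of method described above (this is not the authors' algorithm, whose precise step-size rule, acceptance test, and stopping rule are given in the paper), the sketch below implements a generic gradient descent with an inexact gradient: the Lipschitz estimate is adapted by backtracking, and iterations stop early once the norm of the inexact gradient drops to the noise level. All function names, the relaxed descent condition, and the stopping threshold are illustrative assumptions.

```python
import numpy as np

def adaptive_inexact_gd(f, inexact_grad, x0, delta, L0=1.0, max_iter=1000):
    """Generic sketch: gradient descent with an inexact gradient of noise level delta,
    backtracking adaptation of the Lipschitz estimate, and early stopping."""
    x = np.asarray(x0, dtype=float)
    L = L0
    for _ in range(max_iter):
        g = inexact_grad(x)                   # gradient known only up to additive noise of level delta
        if np.linalg.norm(g) <= 2.0 * delta:  # early stopping: noise dominates the gradient signal
            break
        L = max(L / 2.0, L0)                  # optimistically decrease the estimate before backtracking
        while True:
            x_new = x - g / L                 # standard gradient step with step size 1/L
            # Noise-relaxed sufficient-decrease test; if it fails, double L and retry.
            if f(x_new) <= f(x) - np.dot(g, g) / (2.0 * L) + delta * np.linalg.norm(x_new - x):
                break
            L *= 2.0
        x = x_new
    return x

# Toy usage: a quadratic (which satisfies the Polyak-Łojasiewicz condition) with noisy gradients.
rng = np.random.default_rng(0)
f = lambda x: 0.5 * np.dot(x, x)
noisy_grad = lambda x: x + 1e-3 * rng.standard_normal(x.shape)
print(adaptive_inexact_gd(f, noisy_grad, np.ones(5), delta=1e-3))
```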


Notes

  1.

    In the worst case, \(L_k\) at the beginning of a new iteration is already equal to \(L\). Then denote by \(I_1\) the minimal integer \(I \geqslant 1\) satisfying the inequality \(2^{\tau I} \Delta _{\min } \geqslant \Delta + \frac{\delta \sqrt{2}L}{\Delta } 2^I \). Then \(L_{\max } = 2^{I_1} L\). Similarly, we can obtain that

    $$ \Delta _{\max } = 2 \left( \Delta + \frac{\delta \sqrt{2} L_{\max }}{\Delta }\right) \cdot \max \left\{ \left( \frac{L}{L_{\min }}\right) ^{\tau }, 1\right\} . $$

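To make the footnote above concrete, the following small numeric check computes \(I_1\), \(L_{\max }\) and \(\Delta _{\max }\) directly from the displayed inequalities; all parameter values are purely hypothetical and chosen only so that the inequality has a small solution.

```python
import math

# Hypothetical values of the quantities appearing in the footnote.
L, L_min, tau = 10.0, 1.0, 2.0
Delta, Delta_min, delta = 0.1, 0.01, 1e-3

# I_1: smallest integer I >= 1 with 2^(tau*I) * Delta_min >= Delta + (delta*sqrt(2)*L/Delta) * 2^I.
I1 = next(I for I in range(1, 100)
          if 2 ** (tau * I) * Delta_min >= Delta + delta * math.sqrt(2) * L / Delta * 2 ** I)
L_max = 2 ** I1 * L  # worst-case Lipschitz estimate reached by the method

# Corresponding worst-case bound on Delta_max from the displayed formula.
Delta_max = 2 * (Delta + delta * math.sqrt(2) * L_max / Delta) * max((L / L_min) ** tau, 1.0)
print(I1, L_max, Delta_max)  # -> 4, 160.0, ~472.5
```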

References

  1. Beck, A.: First-Order Methods in Optimization. Society for Industrial and Applied Mathematics, Philadelphia (2017)


  2. Du, S.S., Lee, J.D., Li, H., Wang, L., Zhai, X.: Gradient Descent Finds Global Minima of Deep Neural Networks (2018). https://arxiv.org/pdf/1811.03804.pdf

  3. Devolder, O., Glineur, F., Nesterov, Yu.: First-order methods of smooth convex optimization with inexact oracle. Math. Program. 146(1–2), 37–75 (2014)


  4. Devolder, O.: Exactness, inexactness and stochasticity in first-order methods for large-scale convex optimization. Ph.D. thesis (2013)


  5. D’Aspremont, A.: Smooth optimization with approximate gradient. SIAM J. Opt. 19(3), 1171–1183 (2008)


  6. Fazel, M., Ge, R., Kakade, S., Mesbahi, M.: Global convergence of policy gradient methods for the linear quadratic regulator. In: Proceedings of the 35th International Conference on Machine Learning, PMLR, vol. 80, Stockholm, Sweden, pp. 1466–1475 (2018)


  7. Gasnikov, A.V.: Modern Numerical Optimization Methods. The Method of Universal Gradient Descent. A Textbook, 2nd edn. p. 272. MCCME (2021). (in Russian)


  8. Kabanikhin, S.I.: Inverse and Ill-Posed Problems: Theory and Applications. Walter de Gruyter, p. 475 (2011). https://doi.org/10.1515/9783110224016

  9. Karimi, H., Nutini, J., Schmidt, M.: Linear convergence of gradient and proximal-gradient methods under the Polyak-Łojasiewicz condition. In: Frasconi, P., Landwehr, N., Manco, G., Vreeken, J. (eds.) ECML PKDD 2016. LNCS (LNAI), vol. 9851, pp. 795–811. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46128-1_50


  10. Kuruzov, I.A., Stonyakin, F.S.: Sequential subspace optimization for quasar-convex optimization problems with inexact gradient. In: Olenev, N.N., Evtushenko, Y.G., Jaćimović, M., Khachay, M., Malkova, V. (eds.) OPTIMA 2021. CCIS, vol. 1514, pp. 19–33. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-92711-0_2


  11. Nesterov, Yu.: Universal gradient methods for convex optimization problems. Math. Program. A(152), 381–404 (2015)


  12. Nesterov, Yu.E., Skokov, V.A.: Testing unconstrained minimization algorithms. In: Computational Methods of Mathematical Programming, pp. 77–91. CEMI Academy of Sciences, Moscow (1980). (in Russian)


  13. Nesterov, Yu.: Introductory Lectures on Convex Optimization. Springer Optimization and Its Applications, vol. 137. Springer, Heidelberg (2018)


  14. Polyak, B.T., Kuruzov, I.A., Stonyakin, F.S.: Stopping Rules for Gradient Methods for Non-Convex Problems with Additive Noise in Gradient (2022). https://arxiv.org/pdf/2205.07544.pdf

  15. Polyak, B.T.: Gradient methods for minimizing functionals. Comput. Math. Math. Phys. 3(4), 864–878 (1963)


  16. Sergeyev, Y.D., Candelieri, A., Kvasov, D.E., Perego, R.: Safe global optimization of expensive noisy black-box functions in the \(\delta \)-Lipschitz framework. Soft Comput. 24(23), 17715–17735 (2020). https://doi.org/10.1007/s00500-020-05030-3


  17. Stonyakin, F., et al.: Inexact relative smoothness and strong convexity for optimization and variational inequalities by inexact model. Optim. Methods Softw. 36(6), 1155–1201 (2021)


  18. Strongin, R.G., Sergeyev, Y.D.: Global Optimization with Non-convex Constraints: Sequential and Parallel Algorithms, vol. 45. Springer, Cham (2013)


  19. Sun, J., Qu, Q., Wright, J.: A geometric analysis of phase retrieval. Found. Comput. Math. 18(5), 1131–1198 (2018)


  20. Vasilyev, F.: Optimization Methods. Fizmatlit, Moscow (2002). (in Russian)


  21. Vasin, A., Gasnikov, A., Spokoiny, V.: Stopping rules for accelerated gradient methods with additive noise in gradient (2021). https://arxiv.org/pdf/2102.02921.pdf


Author information

Corresponding author

Correspondence to Ilya A. Kuruzov.



Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Kuruzov, I.A., Stonyakin, F.S., Alkousa, M.S. (2022). Gradient-Type Methods for Optimization Problems with Polyak-Łojasiewicz Condition: Early Stopping and Adaptivity to Inexactness Parameter. In: Olenev, N., Evtushenko, Y., Jaćimović, M., Khachay, M., Malkova, V., Pospelov, I. (eds) Advances in Optimization and Applications. OPTIMA 2022. Communications in Computer and Information Science, vol 1739. Springer, Cham. https://doi.org/10.1007/978-3-031-22990-9_2


  • DOI: https://doi.org/10.1007/978-3-031-22990-9_2


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-22989-3

  • Online ISBN: 978-3-031-22990-9

  • eBook Packages: Computer Science, Computer Science (R0)
