Abstract
In 1963, Boris Polyak proposed a particular step size for gradient descent methods, now known as the Polyak step size, which he later adapted to subgradient methods. The Polyak step size requires knowledge of the optimal value of the minimization problem, a strong assumption, but one that holds for several important problems. In this paper we extend Polyak's method to handle constraints and, as a generalization of subgradients, general minorants: convex functions that tightly lower bound the objective and constraint functions. We refer to this algorithm as the Polyak Minorant Method (PMM). It is closely related to cutting-plane and bundle methods.
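To make the abstract's description concrete, the following is a minimal sketch of the classical Polyak step size for a subgradient method (the unconstrained special case, not the constrained PMM of the paper). At each iteration it takes the step t_k = (f(x_k) − f*)/‖g_k‖², which, as noted above, requires knowing the optimal value f*. The function and subgradient oracle below are illustrative choices, not from the paper.

```python
import numpy as np

def polyak_subgradient(f, subgrad, f_star, x0, iters=200):
    """Subgradient method with the Polyak step size.

    Each step uses t_k = (f(x_k) - f_star) / ||g_k||^2, which
    requires knowledge of the optimal value f_star.
    """
    x = np.asarray(x0, dtype=float)
    best = f(x)  # best objective value found so far
    for _ in range(iters):
        g = subgrad(x)
        gap = f(x) - f_star
        if gap <= 0 or not np.any(g):
            break  # optimal (up to the oracle's accuracy)
        t = gap / np.dot(g, g)  # the Polyak step size
        x = x - t * g
        best = min(best, f(x))
    return x, best

# Example: minimize the nonsmooth function f(x) = ||x||_1, whose
# optimal value f* = 0 is known, as Polyak's rule requires.
f = lambda x: np.sum(np.abs(x))
subgrad = lambda x: np.sign(x)  # a subgradient of the l1 norm
x, best = polyak_subgradient(f, subgrad, f_star=0.0,
                             x0=np.array([1.5, -2.0, 0.5]))
```

The standard subgradient-method bound guarantees that the best objective gap after k steps is at most ‖x₀ − x*‖·G/√(k+1), where G bounds the subgradient norms; here that certifies `best` is small after 200 iterations.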
Acknowledgements
This paper builds on notes written around 2010 for the Stanford course EE364B, Convex Optimization II, to which Lieven Vandenberghe, Almir Mutapcic, Jaehyun Park, Lin Xiao, and Jacob Mattingley contributed. We thank Tetiana Parshakova, Fangzhao Zhang, Parth Nobel, Logan Bell, and Thomas Schmeltzer for useful discussions. The authors thank an anonymous reviewer for suggesting the sharpness convergence result, as well as pointing us to some very relevant literature that we had missed in an earlier version of this paper. Stephen Boyd would like to dedicate this paper to Boris Polyak, his hero and friend.
Funding
Funding was provided by ACCESS (AI Chip Center for Emerging Smart Systems) and Office of Naval Research (Grant No. N00014-22-1-2121).
Additional information
Communicated by Arkadi Nemirovski.
Cite this article
Devanathan, N., Boyd, S. Polyak Minorant Method for Convex Optimization. J Optim Theory Appl (2024). https://doi.org/10.1007/s10957-024-02412-7