Abstract
We study the ridge method for min-max problems and investigate its convergence without any convexity, differentiability, or qualification assumptions. The central issue is to determine whether the “parametric optimality formula” provides a conservative gradient, a notion of generalized derivative well suited for optimization. The answer to this question is positive in a semi-algebraic, and more generally definable, context. As a consequence, the ridge method applied to definable objectives is proved to have a minimizing behavior and to converge to a set of equilibria satisfying an optimality condition. Definability is key to our proof: we show that for a more general class of nonsmooth functions, conservativity of the parametric optimality formula may fail, resulting in absurd behavior of the ridge method.
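To fix ideas, the ridge method for a min-max problem min_x max_y f(x, y) runs descent on the value function φ(x) = max_y f(x, y), using the parametric optimality (Danskin-type) direction ∇_x f(x, y*(x)) evaluated at an inner maximizer y*(x). The following is a minimal numerical sketch under simplifying smoothness assumptions, not the paper's nonsmooth setting; the function names and the test objective are illustrative.

```python
import numpy as np

def ridge_method(grad_x, argmax_y, x0, step=0.1, iters=200):
    """Hypothetical sketch: descend phi(x) = max_y f(x, y) via the
    parametric optimality direction grad_x f(x, y*(x))."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        y_star = argmax_y(x)               # inner maximizer y*(x)
        x = x - step * grad_x(x, y_star)   # Danskin-type descent step on phi
    return x

# Illustrative objective: f(x, y) = x**2 + 2*x*y - y**2 (concave in y),
# so y*(x) = x and phi(x) = 2*x**2, minimized at x = 0.
grad_x = lambda x, y: 2 * x + 2 * y
argmax_y = lambda x: x                     # solves 2x - 2y = 0
x_min = ridge_method(grad_x, argmax_y, x0=1.0)
```

In this smooth, strongly concave-in-y toy example the iteration contracts to the minimizer of φ; the paper's contribution is precisely to analyze what happens when such regularity is absent and the formula must be interpreted as a conservative gradient.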
Acknowledgements
The author would like to thank Jérôme Bolte and Rodolfo Rios-Zertuche for interesting discussions which helped put this work together. The author acknowledges the support of ANR-3IA Artificial and Natural Intelligence Toulouse Institute under the grant agreement ANR-19-PI3A-0004; of the Air Force Office of Scientific Research, Air Force Materiel Command, USAF, under grant numbers FA9550-19-1-7026 and FA8655-22-1-7012; and of ANR MaSDOL 19-CE23-0017-01.
Cite this article
Pauwels, E. Conservative Parametric Optimality and the Ridge Method for Tame Min-Max Problems. Set-Valued Var. Anal 31, 19 (2023). https://doi.org/10.1007/s11228-023-00682-3