Sum-of-squares relaxations for polynomial min–max problems over simple sets

Bach, Francis

doi:10.1007/s10107-024-02072-5

Full Length Paper
Series A
Published: 15 March 2024

Mathematical Programming

Francis Bach^1,2

Abstract

We consider min–max optimization problems for polynomial functions, where a multivariate polynomial is maximized with respect to a subset of variables, and the resulting maximal value is minimized with respect to the remaining variables. When the variables belong to simple sets (e.g., a hypercube, the Euclidean hypersphere, or a ball), we derive a sum-of-squares formulation based on a primal-dual approach. In the simplest setting, we provide a convergence proof when the degree of the relaxation tends to infinity and observe empirically that it can be finitely convergent in several situations. Moreover, our formulation leads to an interesting link with feasibility certificates for polynomial inequalities based on Putinar’s Positivstellensatz.

Notes

This assumption is mostly made to make the developments as simple as possible, but most of our developments would go through for any basic semi-algebraic sets $\mathcal {X}$ and $\mathcal {Y}$ through the use of adapted positivity certificates.
This feature is complex-valued but equivalent real-valued formulations with cosines and sines could be used. Since we only use kernel formulations, we do not need to pursue them explicitly.
For a monomial $X_1^{\alpha _1} \cdots X_d^{\alpha _d}$, its degree is $\alpha _1+\cdots +\alpha _d$ and its maximal degree is $\max \{\alpha _1,\dots ,\alpha _d\}.$
In this paper, we use the term “relaxation” for all our formulations, but, rigorously, they are “strengthenings” when replacing non-negative functions by sums-of-squares, and proper relaxations in their dual formulations, when relaxing moments to pseudo-moments later in this section.

References

[1] Lasserre, J.-B.: Min–max and robust polynomial optimization. J. Glob. Optim. 51(1), 1–10 (2011)
[2] Ben-Tal, A., El Ghaoui, L., Nemirovski, A.: Robust Optimization, vol. 28. Princeton University Press, Princeton (2009)
[3] Nie, J., Yang, Z., Zhou, G.: The saddle point problem of polynomials. Found. Comput. Math. 1–37 (2021)
[4] Lasserre, J.-B.: Global optimization with polynomials and the problem of moments. SIAM J. Optim. 11(3), 796–817 (2001)
[5] Parrilo, P.A.: Semidefinite programming relaxations for semialgebraic problems. Math. Program. 96(2), 293–320 (2003)
[6] Sion, M.: On general minimax theorems. Pac. J. Math. 8(1), 171–176 (1958)
[7] Jahn, J.: Introduction to the Theory of Nonlinear Optimization. Springer, New York (2020)
[8] Laraki, R., Lasserre, J.-B.: Semidefinite programming for min-max problems and games. Math. Program. 131, 305–332 (2012)
[9] Lasserre, J.-B.: Moments, Positive Polynomials and Their Applications, vol. 1. World Scientific, Singapore (2010)
[10] Henrion, D., Korda, M., Lasserre, J.-B.: The Moment-SOS Hierarchy: Lectures in Probability, Statistics, Computational Geometry, Control and Nonlinear PDEs. World Scientific, Singapore (2020)
[11] Bach, F., Rudi, A.: Exponential convergence of sum-of-squares hierarchies for trigonometric polynomials. Technical report, arXiv (2022)
[12] Efthimiou, C.S., Frye, C.: Spherical Harmonics in $p$ Dimensions. World Scientific, Singapore (2014)
[13] Schmüdgen, K.: The Moment Problem. Springer, Berlin (2017)
[14] Boyd, S., Vandenberghe, L.: Convex Optimization. Cambridge University Press, Cambridge (2004)
[15] Lofberg, J., Parrilo, P.A.: From coefficients to samples: a new approach to SOS optimization. In: Conference on Decision and Control, vol. 3, pp. 3154–3159 (2004)
[16] Rudi, A., Marteau-Ferey, U., Bach, F.: Finding global minima via kernel approximations. Technical Report 2012.11978, arXiv (2020)
[17] Morokoff, W.J., Caflisch, R.E.: Quasi-random sequences and their discrepancies. SIAM J. Sci. Comput. 15(6), 1251–1279 (1994)
[18] Helmberg, C., Rendl, F., Vanderbei, R.J., Wolkowicz, H.: An interior-point method for semidefinite programming. SIAM J. Optim. 6(2), 342–361 (1996)
[19] Fazel, M., Hindi, H., Boyd, S.P.: A rank minimization heuristic with application to minimum order system approximation. In: Proceedings of the American Control Conference, vol. 6, pp. 4734–4739 (2001)
[20] Henrion, D., Lasserre, J.-B.: Detecting global optimality and extracting solutions in Gloptipoly. Positive Polyn. Control 312, 293–310 (2005)
[21] Fang, K., Fawzi, H.: The sum-of-squares hierarchy on the sphere and applications in quantum information theory. Math. Program. 190(1), 331–360 (2021)
[22] Laurent, M., Slot, L.: An effective version of Schmüdgen’s Positivstellensatz for the hypercube. Optim. Lett. 1–16 (2022)
[23] Scherer, C.W., Hol, C.W.J.: Matrix sum-of-squares relaxations for robust semi-definite programs. Math. Program. 107(1–2), 189–211 (2006)
[24] Muzellec, B., Bach, F., Rudi, A.: Learning PSD-valued functions using kernel sums-of-squares. Technical Report 2111.11306, arXiv (2021)
[25] Golub, G.H., Loan, C.F.V.: Matrix Computations. Johns Hopkins University Press, Baltimore (1996)
[26] Ganzburg, M.I.: Multidimensional Jackson theorems. Sib. Math. J. 22(2), 223–231 (1981)
[27] Yu, Y.-L.: The strong convexity of von Neumann’s entropy. Unpublished note (2013). http://www.cs.cmu.edu/~yaoliang/mynotes/sc.pdf
[28] Lemaréchal, C., Sagastizábal, C.: Practical aspects of the Moreau–Yosida regularization: Theoretical preliminaries. SIAM J. Optim. 7(2), 367–385 (1997)
[29] Cover, T.M., Thomas, J.A.: Elements of Information Theory. Wiley, New Jersey (1999)
[30] De Klerk, E., Laurent, M.: On the Lasserre hierarchy of semidefinite programming relaxations of convex polynomial optimization problems. SIAM J. Optim. 21(3), 824–832 (2011)
[31] Putinar, M.: Positive polynomials on compact semi-algebraic sets. Indiana Univ. Math. J. 42(3), 969–984 (1993)

Acknowledgements

The comments and suggestions of the anonymous reviewers were greatly appreciated.

Author information

Authors and Affiliations

Inria, Paris, France
Francis Bach
Ecole Normale Supérieure, PSL Research University, Paris, France
Francis Bach

Authors

Francis Bach

Corresponding author

Correspondence to Francis Bach.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A Convergence rates of matrix-valued SOS

We extend the proof of [11, Theorem 1] to matrix-valued polynomials, using the same technique as [21] and closely following the notation of [11].

Proposition 3

Let $r>0$ and $s \geqslant 3r$, and $\varepsilon (s) = \big [ \big ( 1 - \frac{6r^2}{s^2} \big )^{-d} - 1 \big ] \sim _{s \rightarrow +\infty } \frac{ 6 r^2 d }{s^2}$. For any multivariate matrix-valued trigonometric polynomial f of degree less than 2r, written $f(x) = \sum _{\Vert \omega \Vert _\infty \leqslant 2r} \hat{f}(\omega ) e^{2i\pi \omega ^\top x}$,

$$\begin{aligned} \forall x \in [0,1]^d, \ f(x) \succcurlyeq \varepsilon (s) \sum _{\Vert \omega \Vert _\infty \leqslant 2r, \ \omega \ne 0} \Vert \hat{f}(\omega ) \Vert _{\textrm{op}} \quad \Rightarrow \quad f \text{ is a sum of squares of polynomials of degree } s. \end{aligned}$$
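As a quick numerical illustration of the constant in the proposition, the following minimal sketch evaluates $\varepsilon (s)$ and compares it with its stated equivalent $6r^2d/s^2$. The function names and the values of $r$, $d$ and $s$ are hypothetical choices made for this example only.

```python
def epsilon(s: int, r: int, d: int) -> float:
    """epsilon(s) = (1 - 6 r^2 / s^2)^(-d) - 1, as in Proposition 3."""
    if s < 3 * r:
        raise ValueError("Proposition 3 assumes s >= 3r")
    return (1.0 - 6.0 * r**2 / s**2) ** (-d) - 1.0


def epsilon_equivalent(s: int, r: int, d: int) -> float:
    """Equivalent 6 r^2 d / s^2 of epsilon(s) as s tends to infinity."""
    return 6.0 * r**2 * d / s**2


# Illustrative values (assumptions, not taken from the paper).
r, d = 2, 3
ratios = [epsilon(s, r, d) / epsilon_equivalent(s, r, d) for s in (10, 100, 1000)]
for s, ratio in zip((10, 100, 1000), ratios):
    print(f"s = {s:5d}   epsilon(s) / (6 r^2 d / s^2) = {ratio:.6f}")
```

The ratio decreases towards 1 as $s$ grows, which is what the equivalence states; for small $s$ the exact value of $\varepsilon (s)$ is noticeably larger than the equivalent.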

Proof

We consider the following integral operator on 1-periodic matrix-valued functions on $ [0,1]^d$, defined as

$$ Th(x) = \int _{[0, 1]^d} |q(x-y)|^2 h(y) dy, \tag{A1}$$

for a well-chosen 1-periodic function $q$ which is a trigonometric polynomial of degree $s$. For each $y$, the function $x \mapsto |q(x-y)|^2$ is an element of the finite-dimensional cone of SOS polynomials of degree $s$; thus, by design, if $h$ has positive semi-definite values, then $Th$ is a sum of squares of matrix polynomials of degree less than $s$. We will find $h$ such that $ Th = f $.
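To make the "by design" step explicit, here is a short derivation sketch. It uses notation that is not in the original and is assumed here: $h(y)^{1/2}$ is the positive semi-definite square root of $h(y)$, and $M^*$ is the conjugate transpose of $M$. Since $q$ is scalar-valued,

$$\begin{aligned} Th(x) = \int _{[0, 1]^d} \big ( q(x-y) \, h(y)^{1/2} \big )^* \big ( q(x-y) \, h(y)^{1/2} \big ) dy. \end{aligned}$$

For each fixed $y$, the integrand is a single matrix square $M_y(x)^* M_y(x)$, where $M_y(x) = q(x-y) \, h(y)^{1/2}$ is a matrix-valued trigonometric polynomial in $x$ of degree $s$. The integral is a nonnegative combination of such squares, so it remains in the cone of matrix SOS polynomials, provided this finite-dimensional convex cone is closed, which is taken for granted in this sketch.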

In the Fourier domain, since convolutions lead to pointwise multiplication and vice-versa, we have for all $\omega \in \mathbb {Z}^d$, where $\hat{q} *\hat{q}(\omega )$ is a shorthand for $(\hat{q} *\hat{q})(\omega )$:

$$\begin{aligned} {\widehat{Th}}(\omega ) = \hat{q} *\hat{q}(\omega ) \cdot \hat{h}(\omega ), \end{aligned}$$

and thus, the candidate $h$ is defined by its Fourier coefficients $\hat{h}(\omega )$, which are equal to zero for $ \Vert \omega \Vert _\infty > 2r$, and to

$$\begin{aligned} \frac{\hat{f}(\omega ) }{ \hat{q} *\hat{q}(\omega )}\end{aligned}$$

otherwise. If we impose that $ \hat{q} *\hat{q}(0)=1$, we then have

$$\begin{aligned} f - h &= \sum _{\omega \in \mathbb {Z}^d} \hat{f}(\omega ) \Big ( 1 - \frac{1}{\hat{q} *\hat{q}(\omega )} \Big ) \exp ( 2i \pi \omega ^\top \cdot ) \\ &= \sum _{\omega \ne 0 } \hat{f}(\omega ) \Big ( 1 - \frac{1}{\hat{q} *\hat{q}(\omega )} \Big ) \exp ( 2i \pi \omega ^\top \cdot ). \end{aligned}$$

We then get:

$$ \sup _{x \in [0,1]^d} \Vert f(x) - h(x)\Vert _{\textrm{op}} \leqslant \sum _{\omega \ne 0} \big \Vert \hat{f}(\omega ) \big \Vert _{\textrm{op}} \cdot \max _{ \Vert \omega \Vert _\infty \leqslant 2r} \Big | \frac{1}{\hat{q} *\hat{q}(\omega )} \, - 1\Big |. \tag{A2}$$

With the choice $ \hat{q}(\omega ) = a \prod \nolimits _{i=1}^d \Big ( 1 - \frac{|\omega _i|}{s} \Big )_+, $ where $a$ is a normalizing constant, we get $ \hat{q}*\hat{q}(0)=1$ and $ \max _{ \Vert \omega \Vert _\infty \leqslant 2r} \big | \frac{1}{\hat{q} *\hat{q}(\omega )} \, - 1\big | \leqslant \varepsilon (s)$ (see [11] for details). Thus, for all $x \in [0,1]^d$, using Eq. (A2) and the assumption on $f$:

$$\begin{aligned} h(x) = f(x) - ( f(x) - h(x)) \succcurlyeq \varepsilon (s) \sum _{ \omega \ne 0} \Vert \hat{f}(\omega ) \Vert _{\textrm{op}} - \varepsilon (s) \sum _{ \omega \ne 0} \Vert \hat{f}(\omega ) \Vert _{\textrm{op}} = 0, \end{aligned}$$

which leads to the desired result. $\square $
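The construction in the proof can be checked numerically in the simplest setting. The sketch below is an illustration only, not part of the paper: it assumes the scalar case with $d = 1$, a real even polynomial $f$ given by hypothetical coefficients, and the illustrative values $r = 1$, $s = 4$. It computes $\hat{q} *\hat{q}$ for the triangular choice of $\hat{q}$, checks the bound by $\varepsilon (s)$ for these values, builds $h$ by dividing Fourier coefficients, and verifies on a grid that $h$ is nonnegative.

```python
import math


def autocorrelation(s):
    """Values (q_hat * q_hat)(w) for w = 0..s in dimension d = 1, where
    q_hat(k) = a * (1 - |k|/s)_+ and a is set so that the value at w = 0 is 1."""
    tri = [max(0.0, 1.0 - abs(k) / s) for k in range(-s, s + 1)]  # index k + s

    def raw(w):
        # Unnormalized discrete convolution of the triangular sequence with itself.
        return sum(tri[k + s] * tri[k - w + s]
                   for k in range(-s, s + 1) if -s <= k - w <= s)

    c0 = raw(0)  # equals 1 / a^2
    return [raw(w) / c0 for w in range(s + 1)]


def epsilon(s, r, d=1):
    """epsilon(s) from Proposition 3."""
    return (1.0 - 6.0 * r**2 / s**2) ** (-d) - 1.0


def cosine_series(coeffs, x):
    """Evaluate c_0 + sum_{w >= 1} 2 c_w cos(2 pi w x), i.e. a trigonometric
    polynomial with real, symmetric Fourier coefficients c_w = c_{-w}."""
    return coeffs[0] + sum(2.0 * c * math.cos(2.0 * math.pi * w * x)
                           for w, c in enumerate(coeffs) if w > 0)


r, s = 1, 4                      # illustrative values with s >= 3r
c = autocorrelation(s)
eps = epsilon(s, r)
# Left-hand side of the bound used with Eq. (A2), restricted to |w| <= 2r.
deviation = max(abs(1.0 / c[w] - 1.0) for w in range(2 * r + 1))

f_hat = [1.7, 0.5, 0.25]         # hypothetical f_hat(0), f_hat(+-1), f_hat(+-2)
margin = eps * 2.0 * sum(abs(v) for v in f_hat[1:])  # eps * sum over w != 0
h_hat = [f_hat[w] / c[w] for w in range(2 * r + 1)]  # Fourier coefficients of h

grid = [i / 1000 for i in range(1000)]
min_f = min(cosine_series(f_hat, x) for x in grid)
min_h = min(cosine_series(h_hat, x) for x in grid)
print(f"deviation = {deviation:.4f}, epsilon(s) = {eps:.4f}")
print(f"min f = {min_f:.4f}, required margin = {margin:.4f}, min h = {min_h:.4f}")
```

For these values, the assumption of the proposition holds on the grid (the minimum of $f$ exceeds the margin) and the resulting $h$ stays positive, in line with the last display of the proof. A grid check of this kind is only a sanity test and does not replace the argument.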

Appendix B Alternating optimization for the two-stage approach

In this section, we briefly explore the possibility mentioned in Sect. 4.1 of trying to minimize Eq. () with respect to $\Sigma $ as well. This is a non-convex problem, for which alternating optimization has a particularly simple formulation. Indeed, in the kernelized version in Eq. (18), this corresponds to replacing $\mu $ by the previous value of $\alpha $ and iterating. Since the first upper-bound is minimized exactly, at the second iteration and all later ones, the matrix $\Sigma $ corresponds to a Dirac measure, and the upper-bounding polynomial is such that its value at this point is minimized. This is shown empirically in Fig. 5: even when started in the basin of attraction of the global optimum, the alternating optimization does not reach it.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Bach, F. Sum-of-squares relaxations for polynomial min–max problems over simple sets. Math. Program. (2024). https://doi.org/10.1007/s10107-024-02072-5

Received: 19 July 2023
Accepted: 06 February 2024
Published: 15 March 2024
DOI: https://doi.org/10.1007/s10107-024-02072-5
