Abstract
We show that (1) any constrained polynomial optimization problem (POP) has an equivalent formulation on a variety contained in a Euclidean sphere and (2) the resulting semidefinite relaxations in the moment-SOS hierarchy have the constant trace property (CTP) for the involved matrices. We then exploit the CTP to avoid solving the semidefinite relaxations via interior-point methods, and instead use ad hoc spectral methods that minimize the largest eigenvalue of a matrix pencil. Convergence to the optimal value of the semidefinite relaxation is guaranteed. As a result, we obtain a hierarchy of nonsmooth “spectral relaxations” of the initial POP. The efficiency and robustness of this spectral hierarchy are tested on several equality constrained POPs on a sphere as well as on a sample of randomly generated quadratically constrained quadratic problems.
Data availability
All data analyzed during this study are publicly available.
References
Bagirov, A., Karmitsa, N., Mäkelä, M.M.: Introduction to Nonsmooth Optimization: Theory, Practice and Software. Springer, New York (2014)
Ben-Tal, A., Nemirovski, A.: Lectures on Modern Convex Optimization: Analysis, Algorithms, and Engineering Applications, vol. 2. SIAM, Philadelphia (2001)
Bihain, A.: Optimization of upper semidifferentiable functions. J. Optim. Theory Appl. 44(4), 545–568 (1984)
Burer, S., Monteiro, R.D.: A nonlinear programming algorithm for solving semidefinite programs via low-rank factorization. Math. Program. 95(2), 329–357 (2003)
Burke, J.V., Lewis, A.S., Overton, M.L.: A robust gradient sampling algorithm for nonsmooth, nonconvex optimization. SIAM J. Optim. 15(3), 751–779 (2005)
Chandrasekaran, V., Shah, P.: Relative entropy relaxations for signomial optimization. SIAM J. Optim. 26(2), 1147–1173 (2016)
Curtis, F.E., Que, X.: A quasi-Newton algorithm for nonconvex, nonsmooth optimization with global convergence guarantees. Math. Program. Comput. 7(4), 399–428 (2015)
Curto, R.E., Fialkow, L.A.: Truncated K-moment problems in several variables. J. Oper. Theory 189–226 (2005)
Dahl, J.: Semidefinite Optimization using MOSEK. ISMP, Berlin (2012)
d’Aspremont, A., El Karoui, N.: A stochastic smoothing algorithm for semidefinite programming. SIAM J. Optim. 24(3), 1138–1177 (2014)
Ding, L., Yurtsever, A., Cevher, V., Tropp, J.A., Udell, M.: An optimal-storage approach to semidefinite programming using approximate complementarity. SIAM J. Optim. 31(4), 2695–2725 (2021)
Doherty, A.C., Wehner, S.: Convergence of SDP hierarchies for polynomial optimization on the hypersphere. arXiv preprint arXiv:1210.5048 (2012)
Dressler, M., Iliman, S., de Wolff, T.: A positivstellensatz for sums of nonnegative circuit polynomials. SIAM J. Appl. Algebra Geom. 1(1), 536–555 (2017)
Haarala, M., Miettinen, K., Mäkelä, M.M.: New limited memory bundle method for large-scale nonsmooth optimization. Optim. Methods Softw. 19(6), 673–692 (2004)
Haarala, N., Miettinen, K., Mäkelä, M.M.: Globally convergent limited memory bundle method for large-scale nonsmooth optimization. Math. Program. 109(1), 181–205 (2007)
Helmberg, C., Overton, M.L., Rendl, F.: The spectral bundle method with second-order information. Optim. Methods Softw. 29(4), 855–876 (2014)
Helmberg, C., Rendl, F.: A spectral bundle method for semidefinite programming. SIAM J. Optim. 10(3), 673–696 (2000)
Henrion, D., Lasserre, J.-B.: Detecting global optimality and extracting solutions in GloptiPoly. In: Positive Polynomials in Control, vol. 312, pp. 293–310. Springer (2005)
Henrion, D., Malick, J.: Projection methods in conic optimization. In: Handbook on Semidefinite, Conic and Polynomial Optimization, pp. 565–600 (2012)
Josz, C., Henrion, D.: Strong duality in Lasserre’s hierarchy for polynomial optimization. Optim. Lett. 10(1), 3–10 (2016)
Journée, M., Bach, F., Absil, P.-A., Sepulchre, R.: Low-rank optimization for semidefinite convex problems. arXiv preprint arXiv:0807.4423 (2008)
Karmitsa, N.: LMBM-FORTRAN subroutines for large-scale nonsmooth minimization: user’s manual. TUCS Tech. Rep. 856 (2007)
Kiwiel, K.C.: Proximity control in bundle methods for convex nondifferentiable minimization. Math. Program. 46(1–3), 105–122 (1990)
Kiwiel, K.C.: Convergence of the gradient sampling algorithm for nonsmooth nonconvex optimization. SIAM J. Optim. 18(2), 379–388 (2007)
Lasserre, J.B.: Global optimization with polynomials and the problem of moments. SIAM J. Optim. 11(3), 796–817 (2001)
Lasserre, J.-B.: Convergent SDP-relaxations in polynomial optimization with sparsity. SIAM J. Optim. 17(3), 822–843 (2006)
Lasserre, J.-B.: Moments, Positive Polynomials and Their Applications, vol. 1. World Scientific, Singapore (2010)
Lasserre, J.B.: An Introduction to Polynomial and Semi-algebraic Optimization, vol. 52. Cambridge University Press, Cambridge (2015)
Lasserre, J.B.: Homogeneous polynomials and spurious local minima on the unit sphere. Optim. Lett. 1–14 (2021)
Lasserre, J.B., Laurent, M., Rostalski, P.: Semidefinite characterization and computation of zero-dimensional real radical ideals. Found. Comput. Math. 8(5), 607–647 (2008)
Lasserre, J.B., Toh, K.-C., Yang, S.: A bounded degree SOS hierarchy for polynomial optimization. EURO J. Comput. Optim. 5(1–2), 87–117 (2017)
Laurent, M.: Revisiting two theorems of Curto and Fialkow on moment matrices. Proc. Am. Math. Soc. 133(10), 2965–2976 (2005)
Lehoucq, R.B., Sorensen, D.C., Yang, C.: ARPACK Users’ Guide: Solution of Large-Scale Eigenvalue Problems with Implicitly Restarted Arnoldi Methods. SIAM, Philadelphia (1998)
Lewis, A.S., Overton, M.L.: Nonsmooth optimization via quasi-Newton methods. Math. Program. 141(1–2), 135–163 (2013)
Mai, N.H.A., Bhardwaj, A., Magron, V.: The constant trace property in noncommutative optimization. In: Proceedings of the 2021 International Symposium on Symbolic and Algebraic Computation, ISSAC ’21, pp. 297–304. Association for Computing Machinery, New York, NY, USA (2021)
Mai, N.H.A., Lasserre, J.-B., Magron, V., Wang, J.: Exploiting constant trace property in large-scale polynomial optimization. ACM Trans. Math. Softw. 48(4), 1–39 (2022)
Mifflin, R.: An algorithm for constrained optimization with semismooth functions. Math. Oper. Res. 2(2), 191–207 (1977)
Navascués, M., Pironio, S., Acín, A.: A convergent hierarchy of semidefinite programs characterizing the set of quantum correlations. New J. Phys. 10(7), 073013 (2008)
Nemirovsky, A., Yudin, D.: Problem complexity and method efficiency in optimization. Nauka (1983)
Nie, J.: Optimality conditions and finite convergence of Lasserre’s hierarchy. Math. Program. 146(1–2), 97–121 (2014)
Nie, J., Schweighofer, M.: On the complexity of Putinar’s Positivstellensatz. J. Complex. 23(1), 135–150 (2007)
Nocedal, J.: Updating quasi-Newton matrices with limited storage. Math. Comput. 35(151), 773–782 (1980)
Nocedal, J., Wright, S.: Numerical Optimization. Springer, New York (2006)
Overton, M.L.: Large-scale optimization of eigenvalues. SIAM J. Optim. 2(1), 88–120 (1992)
Overton, M.L., Womersley, R.S.: Second derivatives for optimizing eigenvalues of symmetric matrices. SIAM J. Matrix Anal. Appl. 16(3), 697–718 (1995)
Saad, Y.: Numerical Methods for Large Eigenvalue Problems, revised edn. SIAM, Philadelphia (2011)
Schweighofer, M.: On the complexity of Schmüdgen’s positivstellensatz. J. Complex. 20(4), 529–543 (2004)
Shor, N.Z.: Quadratic optimization problems. Sov. J. Comput. Syst. Sci. 25, 1–11 (1987)
Trnovska, M.: Strong duality conditions in semidefinite programming. J. Electr. Eng. 56(12), 1–5 (2005)
Waki, H., Kim, S., Kojima, M., Muramatsu, M.: Sums of squares and semidefinite programming relaxations for polynomial optimization problems with structured sparsity. SIAM J. Optim. 17(1), 218–242 (2006)
Wang, J., Magron, V.: A second order cone characterization for sums of nonnegative circuits. In: Proceedings of the 45th International Symposium on Symbolic and Algebraic Computation, pp. 450–457 (2020)
Wang, J., Magron, V., Lasserre, J.-B.: Chordal-TSSOS: a moment-SOS hierarchy that exploits term sparsity with chordal extension. SIAM J. Optim. 31(1), 114–141 (2021)
Wang, J., Magron, V., Lasserre, J.-B.: TSSOS: A Moment-SOS hierarchy that exploits term sparsity. SIAM J. Optim. 31(1), 30–58 (2021)
Wang, J., Magron, V., Lasserre, J.B., Mai, N.H.A.: CS-TSSOS: correlative and term sparsity for large-scale polynomial optimization. ACM Trans. Math. Softw. 48(4), 1–26 (2022)
Weisser, T., Legat, B., Coey, C., Kapelevich, L., Vielma, J.P.: Polynomial and moment optimization in Julia and JuMP. In: JuliaCon (2019)
Yurtsever, A., Tropp, J.A., Fercoq, O., Udell, M., Cevher, V.: Scalable semidefinite programming. SIAM J. Math. Data Sci. 3(1), 171–200 (2021)
Acknowledgements
The authors would like to thank the associate editor and anonymous reviewers, whose insightful comments and careful proof-checks helped to improve the paper.
Funding
The first author was supported by the MESRI funding from EDMITT. The second author was supported by the FMJH Program PGMO (EPICS project) and EDF, Thales, Orange, and Criteo, as well as by the Tremplin ERC Stg Grant ANR-18-ERC2-0004-01 (T-COPS project). This work has benefited from the European Union’s Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie Actions, grant agreement 813211 (POEMA), as well as from the AI Interdisciplinary Institute ANITI funding, through the French “Investing for the Future PIA3” program under the Grant agreement n\(^{\circ }\)ANR-19-PI3A-0004. The third author was supported by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (Grant Agreement 666981 TAMING).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Code availability
The full code was made available for review. The study relied on a set of packages that are either open source or available for academic use; specific references are included in this published article.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
Appendix
1.1 Spectral minimizations of SDP
In this section, we provide the proofs of lemmas stated in Sects. 2.6.1 and 2.6.3. First we recall the following useful properties of \({\mathcal {S}}\) and \({\mathcal {S}}^+\):
- If \({\textbf{X}}={{\,\textrm{diag}\,}}({\textbf{X}}_1,\dots ,{\textbf{X}}_l)\in {\mathcal {S}}\), then
$$\begin{aligned} {\textbf{X}}\succeq 0\Longleftrightarrow {\textbf{X}}_j\succeq 0,\,j\in [l]\qquad \text { and }\qquad {{\,\textrm{trace}\,}}({\textbf{X}})=\sum _{j=1}^l {{\,\textrm{trace}\,}}({\textbf{X}}_j). \end{aligned}$$(49)
- If \({\textbf{A}}={{\,\textrm{diag}\,}}({\textbf{A}}_1,\dots ,{\textbf{A}}_l)\in {\mathcal {S}}\) and \({\textbf{B}}={{\,\textrm{diag}\,}}({\textbf{B}}_1,\dots ,{\textbf{B}}_l)\in {\mathcal {S}}\), then
$$\begin{aligned} \langle {\textbf{A}}, {\textbf{B}}\rangle = \sum _{j=1}^l\langle {\textbf{A}}_j, {\textbf{B}}_j\rangle . \end{aligned}$$(50)
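Properties (49)–(50) can be checked numerically. The sketch below is purely illustrative (the block sizes and data are arbitrary choices, not taken from the paper): it builds a block-diagonal matrix from two PSD blocks and verifies that positive semidefiniteness, the trace, and the Frobenius inner product all decompose blockwise.

```python
import numpy as np

# Toy check of (49)-(50); block sizes and data are arbitrary choices.
rng = np.random.default_rng(0)
G1 = rng.standard_normal((3, 3)); X1 = G1 @ G1.T          # PSD block
G2 = rng.standard_normal((2, 2)); X2 = G2 @ G2.T          # PSD block
Z = np.zeros((3, 2))
X = np.block([[X1, Z], [Z.T, X2]])                        # X = diag(X1, X2)

# (49): X >= 0 since each block is >= 0, and traces add up
assert np.linalg.eigvalsh(X).min() >= -1e-10
assert abs(np.trace(X) - (np.trace(X1) + np.trace(X2))) < 1e-10

# (50): the Frobenius inner product splits across blocks
A1, A2 = np.eye(3), 2.0 * np.eye(2)
A = np.block([[A1, Z], [Z.T, A2]])
lhs = np.sum(A * X)                                       # <A, X>
rhs = np.sum(A1 * X1) + np.sum(A2 * X2)
assert abs(lhs - rhs) < 1e-10
```

Exploiting this blockwise split is what makes the spectral approach cheap: the largest eigenvalue of a block-diagonal matrix is the maximum of the blockwise largest eigenvalues.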
1.1.1 SDP with constant trace property
Proof of Lemma 3
The proof of (20) is similar in spirit to that of Helmberg and Rendl in [17, Sect. 2]. Here, we extend the argument to SDP (13), which involves a block-diagonal positive semidefinite matrix. From (13),
The dual of this SDP reads:
where \({\textbf{I}}\) is the identity matrix of size s. From this,
Since \(\rho =\tau \), (20) follows. For the second statement, let \({\textbf{z}}^\star \) be an optimal solution of SDP (14). Then \({\textbf{b}}^\top {\textbf{z}}^\star =-\rho =-\tau \). In addition, \({\textbf{C}}-{\mathcal {A}}^\top {\textbf{z}}^\star \preceq 0\) implies that
so that \(\varphi ({\textbf{z}}^\star )\le -\tau \). Note that (20) indicates that \(\varphi ({\textbf{z}}^\star )\ge -\tau \). Thus, \(\varphi ({\textbf{z}}^\star )=-\tau \), yielding the second statement. \(\square \)
The following proposition recalls the differentiability properties of \(\varphi \).
Proposition 8
The function \(\varphi \) in (19) has the following properties:
1. \(\varphi \) is convex and continuous but not differentiable.
2. The subdifferential of \(\varphi \) at \({\textbf{z}}\) reads:
$$\begin{aligned} \partial \varphi ({\textbf{z}})=\{{\textbf{b}}-a{\mathcal {A}}{\textbf{W}}:{\textbf{W}}\in {{\,\textrm{conv}\,}}({\varGamma }({\textbf{C}}-{\mathcal {A}}^\top {\textbf{z}}))\}, \end{aligned}$$(51)
where for each \({\textbf{A}}\in {\mathcal {S}}\),
$$\begin{aligned} {\varGamma }({\textbf{A}}):=\{{\textbf{u}}{\textbf{u}}^\top :{\textbf{A}}{\textbf{u}}=\lambda _1({\textbf{A}}){\textbf{u}},\ \Vert {\textbf{u}}\Vert _2=1\}. \end{aligned}$$(52)
Proof
Properties 1–2 are from Helmberg–Rendl [17, Sect. 2] (see also [44, (4)]). \(\square \)
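The evaluation of \(\varphi \) and of one subgradient from (51) reduces to a single extremal eigenpair computation. The following minimal sketch (not the paper's implementation; the data \({\textbf{C}}\), \({\textbf{A}}_1\), \({\textbf{b}}\), \(a\) below are toy choices) uses \({\mathcal {A}}^\top {\textbf{z}}=\sum _i z_i{\textbf{A}}_i\) and \(({\mathcal {A}}{\textbf{X}})_i=\langle {\textbf{A}}_i,{\textbf{X}}\rangle \), and picks \({\textbf{W}}={\textbf{u}}{\textbf{u}}^\top \in {\varGamma }({\textbf{C}}-{\mathcal {A}}^\top {\textbf{z}})\).

```python
import numpy as np

# Sketch: phi(z) = a*lambda_1(C - A^T z) + b^T z and one subgradient from (51).
# All data below are toy choices for illustration.
def phi_and_subgradient(z, C, A_list, b, a):
    S = C - sum(zi * Ai for zi, Ai in zip(z, A_list))
    w, V = np.linalg.eigh(S)               # eigenvalues in ascending order
    lam1, u = w[-1], V[:, -1]              # largest eigenvalue and eigenvector
    W = np.outer(u, u)                     # one element of Gamma(C - A^T z)
    g = b - a * np.array([np.sum(Ai * W) for Ai in A_list])
    return a * lam1 + b @ z, g

C = np.diag([1.0, -1.0])
A_list = [np.eye(2)]                       # single constraint: trace(X) = 1
b = np.array([1.0])
val, g = phi_and_subgradient(np.array([0.0]), C, A_list, b, a=1.0)
```

With the single trace constraint, \(\varphi ({\textbf{z}})=a\lambda _1({\textbf{C}})\) is constant in \({\textbf{z}}\), so the computed subgradient vanishes, consistent with (51).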
The following result is useful to recover an optimal solution of SDP (13) from an optimal solution of NSOP (20).
Lemma 8
Let \({\bar{\textbf{z}}}\) be an optimal solution of NSOP (20). Then:
1. There exists \({\textbf{X}}^\star \in a{{\,\textrm{conv}\,}}({\varGamma }({\textbf{C}}-{\mathcal {A}}^\top {\bar{\textbf{z}}}))\) such that \({\mathcal {A}}{\textbf{X}}^\star ={\textbf{b}}\).
2. Let \({\textbf{u}}\) be a normalized eigenvector corresponding to \(\lambda _1({\textbf{C}}-{\mathcal {A}}^\top {\bar{\textbf{z}}})\). If \(\lambda _1({\textbf{C}}-{\mathcal {A}}^\top {\bar{\textbf{z}}})\) has multiplicity 1, then \({\textbf{X}}^\star =a{\textbf{u}}{\textbf{u}}^\top \), and thus \({\textbf{X}}^\star \) is an optimal solution of SDP (13).
Proof
By [1, Theorem 4.2], \({\textbf{0}}\in \partial \varphi ({\bar{\textbf{z}}})\). Combining this with Proposition 8.2, the first statement follows.
We next prove the second statement. Assume that \(\lambda _1({\textbf{C}}-{\mathcal {A}}^\top {\bar{\textbf{z}}})\) has multiplicity 1. Then \({\varGamma }({\textbf{C}}-{\mathcal {A}}^\top {\bar{\textbf{z}}})=\{{\textbf{u}} {\textbf{u}}^\top \}\). The first statement implies that \({\textbf{X}}^\star =a{\textbf{u}}{\textbf{u}}^\top \). From this, \({\textbf{X}}^\star \succeq 0\) and \({\mathcal {A}}{\textbf{X}}^\star ={\textbf{b}}\), so that \({\textbf{X}}^\star \) is a feasible solution of SDP (13). Moreover,
Thus, \(\langle {\textbf{C}}, {\textbf{X}}^\star \rangle =-\tau \), yielding the second statement. \(\square \)
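The recovery step in Lemma 8 can be illustrated on a toy instance (the data below are made up, not from the paper): given an optimal \({\bar{\textbf{z}}}\), take the top eigenvector \({\textbf{u}}\) of \({\textbf{C}}-{\mathcal {A}}^\top {\bar{\textbf{z}}}\); when \(\lambda _1\) is simple, \({\textbf{X}}^\star =a{\textbf{u}}{\textbf{u}}^\top \) is feasible for SDP (13) and matches the value \(\varphi ({\bar{\textbf{z}}})\).

```python
import numpy as np

# Toy illustration of Lemma 8 (data are arbitrary choices):
C = np.array([[0.0, 1.0], [1.0, 0.0]])
A_list = [np.eye(2)]                      # single constraint: trace(X) = b_1
b = np.array([1.0])
a = 1.0
zbar = np.array([0.0])                    # optimal here: phi is constant in z

S = C - zbar[0] * A_list[0]
w, V = np.linalg.eigh(S)                  # ascending eigenvalues
u = V[:, -1]                              # eigenvector for lambda_1 (simple here)
X_star = a * np.outer(u, u)

assert np.linalg.eigvalsh(X_star).min() >= -1e-10        # X* >= 0
assert abs(np.sum(A_list[0] * X_star) - b[0]) < 1e-10    # A X* = b
phi = a * w[-1] + b @ zbar
assert abs(np.sum(C * X_star) - phi) < 1e-10             # <C, X*> = phi(zbar)
```

The multiplicity-1 assumption matters: if \(\lambda _1\) were degenerate, an optimal \({\textbf{X}}^\star \) would generally be a convex combination of several rank-one terms rather than a single \(a{\textbf{u}}{\textbf{u}}^\top \).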
To obtain a convergence guarantee when solving NSOP (20) by LMBM [15, Algorithm 1], we need the following technical lemma:
Lemma 9
When applied to problem NSOP (20), the LMBM algorithm is globally convergent.
Proof
The convexity of \(\varphi \) yields that \(\varphi \) is weakly upper semismooth on \({{\mathbb {R}}}^m\) according to [37, Proposition 5]. From this, \(\varphi \) is upper semidifferentiable on \({{\mathbb {R}}}^m\) by using [3, Theorem 3.1]. Combining this with the fact that \(\varphi \) is bounded from below on \({{\mathbb {R}}}^m\), the result follows thanks to [3, Sect. 5] (see also the final statement of [1, Sect. 14.2]). \(\square \)
1.1.2 SDP with bounded trace property
Proof of Lemma 4
Let \({\textbf{X}}^\star \) be an optimal solution of SDP (13) and set . By Condition 4 of Assumption 1, one has
Similarly to the proof of Lemma 3, one obtains:
Let us prove that
Let \({\textbf{z}}\in {{\mathbb {R}}}^m\) be fixed and consider the following two cases:
- Case 1: \(\lambda _1({\textbf{C}}-{\mathcal {A}}^\top {\textbf{z}})>0\). By (53) and (54),
Thus, \(\psi ({\textbf{z}})\ge -\tau \).
- Case 2: \(\lambda _1({\textbf{C}}-{\mathcal {A}}^\top {\textbf{z}})\le 0\). Then \({\mathcal {A}}^\top {\textbf{z}}-{\textbf{C}}\succeq 0\) and \(\psi ({\textbf{z}})={\textbf{b}}^\top {\textbf{z}}\ge -\rho =-\tau \) by (14).
Let \(({\textbf{z}}^{(j)})_{j\in {{\mathbb {N}}}}\) be a minimizing sequence of SDP (14). Then \(\lambda _1({\textbf{C}}-{\mathcal {A}}^\top {\textbf{z}}^{(j)})\le 0\), \(j\in {{\mathbb {N}}}\), since \({\mathcal {A}}^\top {\textbf{z}}^{(j)}-{\textbf{C}}\succeq 0\) and \({\textbf{b}}^\top {\textbf{z}}^{(j)}\rightarrow -\tau \) as \(j\rightarrow \infty \) since \(\tau =\rho \). It implies that \(\psi ({\textbf{z}}^{(j)})={\textbf{b}}^\top {\textbf{z}}^{(j)}\rightarrow -\tau \) as \(j\rightarrow \infty \). From this and by (55), the first statement follows.
For the second statement, let \({\textbf{z}}^\star \) be an optimal solution of SDP (14). Since \({\mathcal {A}}^\top {\textbf{z}}^\star -{\textbf{C}}\succeq 0\), \(\lambda _1({\textbf{C}}-{\mathcal {A}}^\top {\textbf{z}}^\star )\le 0\) and thus \(\psi ({\textbf{z}}^\star )={\textbf{b}}^\top {\textbf{z}}^\star = -\rho =-\tau \). Thus, \({\textbf{z}}^\star \) is an optimal solution of (26), yielding the second statement. \(\square \)
We consider the differentiability properties of \(\psi \) in the following proposition:
Proposition 9
The function \(\psi \) has the following properties:
1. \(\psi \) is convex and continuous but not differentiable.
2. The subdifferential of \(\psi \) at \({\textbf{z}}\) reads:
$$\begin{aligned} \partial \psi ({\textbf{z}})={\left\{ \begin{array}{ll} \{{\textbf{b}}\}\text { if }\lambda _1({\textbf{C}}-{\mathcal {A}}^\top {\textbf{z}})< 0 ,\\ \{{\textbf{b}}-a{\mathcal {A}}{\textbf{W}}:{\textbf{W}}\in {{\,\textrm{conv}\,}}({\varGamma }({\textbf{C}}-{\mathcal {A}}^\top {\textbf{z}}))\}\text { if }\lambda _1({\textbf{C}}-{\mathcal {A}}^\top {\textbf{z}})> 0,\\ \{{\textbf{b}}-\zeta a{\mathcal {A}}{\textbf{W}}:\zeta \in [0,1],\,{\textbf{W}}\in {{\,\textrm{conv}\,}}({\varGamma }({\textbf{C}}-{\mathcal {A}}^\top {\textbf{z}}))\}\text { otherwise},\\ \end{array}\right. } \end{aligned}$$
where \({\varGamma }(.)\) is defined as in (52).
Proof
Note that \(\psi \) is the maximum of two convex functions, i.e.,
with \(\varphi _1({\textbf{z}})=a\lambda _1({\textbf{C}}-{\mathcal {A}}^\top {\textbf{z}})+{\textbf{b}}^\top {\textbf{z}}\) and \(\varphi _2({\textbf{z}})={\textbf{b}}^\top {\textbf{z}}\). Thus, \(\psi \) is convex and
Note that \(\partial \varphi _2({\textbf{z}})=\{{\textbf{b}}\}\) and \(\partial \varphi _1({\textbf{z}})\) is computed as in formula (51). Thus, the result follows. \(\square \)
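The three cases of Proposition 9.2 translate directly into code. The hedged sketch below (toy data; not the paper's implementation) evaluates \(\psi ({\textbf{z}})=\max \{a\lambda _1({\textbf{C}}-{\mathcal {A}}^\top {\textbf{z}})+{\textbf{b}}^\top {\textbf{z}},\,{\textbf{b}}^\top {\textbf{z}}\}\) and returns one subgradient per case, choosing \(\zeta =1\) in the boundary case \(\lambda _1=0\), which is a valid choice in the third branch.

```python
import numpy as np

# Sketch of psi(z) = max(a*lambda_1(C - A^T z) + b^T z, b^T z) and one
# subgradient per case of Proposition 9.2; all data are toy choices.
def psi_and_subgradient(z, C, A_list, b, a):
    S = C - sum(zi * Ai for zi, Ai in zip(z, A_list))
    w, V = np.linalg.eigh(S)
    lam1, u = w[-1], V[:, -1]
    if lam1 < 0:
        return b @ z, b.copy()             # smooth case: gradient is b
    W = np.outer(u, u)                     # W in Gamma(C - A^T z)
    # at lam1 == 0, any zeta in [0, 1] gives a subgradient; zeta = 1 used here
    g = b - a * np.array([np.sum(Ai * W) for Ai in A_list])
    return b @ z + a * lam1, g

C = np.diag([-1.0, -2.0])
A_list = [np.eye(2)]
b = np.array([1.0])
v0, g0 = psi_and_subgradient(np.array([0.0]), C, A_list, b, 1.0)   # lam1 < 0
v1, g1 = psi_and_subgradient(np.array([-2.0]), C, A_list, b, 1.0)  # lam1 > 0
```

In the first call \(\lambda _1=-1<0\), so \(\psi \) is locally linear with gradient \({\textbf{b}}\); in the second call \(\lambda _1=1>0\) and the eigenvector term enters the subgradient.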
The following lemma is useful to recover an optimal solution of SDP (13) from an optimal solution of NSOP (26).
Lemma 10
Assume that \({\bar{\textbf{z}}}\) is an optimal solution of NSOP (26). The following statements are true:
1. There exists
$$\begin{aligned} {\textbf{X}}^\star {\left\{ \begin{array}{ll} = {\textbf{0}}&{} \text { if } \lambda _1({\textbf{C}}-{\mathcal {A}}^\top {\bar{\textbf{z}}})< 0,\\ \in \zeta a{{\,\textrm{conv}\,}}({\varGamma }({\textbf{C}}-{\mathcal {A}}^\top {\bar{\textbf{z}}})) &{}\text { if } \lambda _1({\textbf{C}}-{\mathcal {A}}^\top {\bar{\textbf{z}}})=0,\\ \in a{{\,\textrm{conv}\,}}({\varGamma }({\textbf{C}}-{\mathcal {A}}^\top {\bar{\textbf{z}}}))&{} \text { otherwise}, \end{array}\right. } \end{aligned}$$
for some \(\zeta \in [0,1]\), such that \({\mathcal {A}}{\textbf{X}}^\star ={\textbf{b}}\).
2. Let \({\textbf{u}}\) be a normalized eigenvector corresponding to \(\lambda _1({\textbf{C}}-{\mathcal {A}}^\top {\bar{\textbf{z}}})\). If \(\lambda _1({\textbf{C}}-{\mathcal {A}}^\top {\bar{\textbf{z}}})\) has multiplicity 1, then \({\textbf{X}}^\star = {{\bar{\xi }}}{\textbf{u}}{\textbf{u}}^\top \) with \({{\bar{\xi }}}\) defined as in (27), and \({\textbf{X}}^\star \) is an optimal solution of SDP (13).
Proof
Due to [1, Theorem 4.2], \({\textbf{0}}\in \partial \psi ({\bar{\textbf{z}}})\). From this and by Proposition 9.2, the first statement follows. Let us prove the second statement. Assume that \(\lambda _1({\textbf{C}}-{\mathcal {A}}^\top {\bar{\textbf{z}}})\) has multiplicity 1. Then one has \({\varGamma }({\textbf{C}}-{\mathcal {A}}^\top {\bar{\textbf{z}}})=\{{\textbf{u}}{\textbf{u}}^\top \}\), yielding \({\textbf{X}}^\star ={{\bar{\xi }}}{\textbf{u}}{\textbf{u}}^\top \) with \({{\bar{\xi }}}\) defined as in (27), so that \({\textbf{X}}^\star \succeq 0\). From this and since \({\mathcal {A}}{\textbf{X}}^\star ={\textbf{b}}\), \({\textbf{X}}^\star \) is a feasible solution of SDP (13). Moreover,
Thus, \(\langle {\textbf{C}}, {\textbf{X}}^\star \rangle =-\tau \), yielding the second statement. \(\square \)
The next result proves that when applied to NSOP (26), the LMBM algorithm [15, Algorithm 1] converges.
Lemma 11
LMBM applied to NSOP (26) is globally convergent.
The proof of Lemma 11 is similar to that of Lemma 9.
1.2 Converting moment relaxations to standard SDP
We will present a way to transform SDP (31) to the form (33) recalled as follows:
Let \(k\in {{\mathbb {N}}}\) be fixed. We will prove that there exists \({\textbf{A}}_j\in {\mathcal {S}}_k\), \(j\in [r]\), such that \({\textbf{X}}={\textbf{P}}_k{\textbf{M}}_k({\textbf{y}}){\textbf{P}}_k\) for some if and only if \(\langle {\textbf{A}}_j, {\textbf{X}}\rangle =0\), \(j\in [r]\). Let . Then \({\mathcal {V}}\) is a linear subspace of \({\mathcal {S}}_k\) and . We take a basis \({\textbf{A}}_1,\dots ,{\textbf{A}}_r\) of the orthogonal complement \({\mathcal {V}}^\bot \) of \({\mathcal {V}}\). Notice that
with \({\textbf{X}}\in {\mathcal {S}}_k\), it implies that \({\textbf{X}}\in {\mathcal {V}}\) if and only if \(\langle {\textbf{A}}_j, {\textbf{X}}\rangle =0\), \(j\in [r]\).
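The membership test "\({\textbf{X}}\in {\mathcal {V}}\) if and only if \(\langle {\textbf{A}}_j,{\textbf{X}}\rangle =0\) for a basis \(({\textbf{A}}_j)\) of \({\mathcal {V}}^\bot \)" can be sketched numerically. The example below is a toy construction (the subspace and test matrices are arbitrary choices, unrelated to the moment-matrix subspace of the paper): it vectorizes symmetric \(2\times 2\) matrices so that the Frobenius inner product becomes the Euclidean one, obtains a basis of the orthogonal complement from an SVD null space, and checks membership.

```python
import numpy as np

# Toy membership test: X in V  <=>  <A_j, X> = 0 for a basis of V^perp.
# Vectorize symmetric 2x2 matrices as (X11, X22, sqrt(2)*X12) so that the
# Frobenius inner product of matrices equals the dot product of vectors.
def vec(X):
    return np.array([X[0, 0], X[1, 1], np.sqrt(2) * X[0, 1]])

V_basis = [np.eye(2)]                       # toy subspace V = span{I}
M = np.stack([vec(B) for B in V_basis])     # rows span vec(V)
_, _, Vt = np.linalg.svd(M)                 # remaining rows of Vt span the
perp = Vt[len(V_basis):]                    # null space, i.e. vec(V^perp)

X_in = 3.0 * np.eye(2)                      # lies in V
X_out = np.array([[1.0, 2.0], [2.0, 0.0]])  # does not lie in V
assert np.allclose(perp @ vec(X_in), 0)
assert not np.allclose(perp @ vec(X_out), 0)
```

The same idea scales to \({\mathcal {S}}_k\): any orthonormal basis of \({\mathcal {V}}^\bot \) yields the equality constraints \(\langle {\textbf{A}}_j,{\textbf{X}}\rangle =0\), \(j\in [r]\).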
Let us find such a basis \({\textbf{A}}_1,\dots ,{\textbf{A}}_r\). Let \({\textbf{A}}=(A_{\alpha ,\beta })_{\alpha ,\beta \in {{\mathbb {N}}}^n_k}\in {\mathcal {V}}^\bot \). Then for all \({\textbf{X}}=(X_{\alpha ,\beta })_{\alpha ,\beta \in {{\mathbb {N}}}^n_k}\in {\mathcal {V}}\), \(\langle {\textbf{A}}, {\textbf{X}}\rangle =0\). Note that if \({\textbf{X}}={\textbf{P}}_k{\textbf{M}}_k({\textbf{y}}){\textbf{P}}_k\), then one has
with \(w_{\alpha ,\beta }:=\theta _{k,\alpha }^{1/2}\theta _{k,\beta }^{1/2}\), for all \(\alpha ,\beta \in {{\mathbb {N}}}^n_k\). It implies that
Let \(\gamma \in {{\mathbb {N}}}^n_{2k}\) be fixed and let be such that for \(\xi \in {{\mathbb {N}}}^n_{2k}\),
Then
If \(\gamma \not \in 2{{\mathbb {N}}}^n\), the first term in the latter equality is absent. Let us define
Note that \({\varLambda }_\gamma \) consists of the entries of \({\textbf{A}}\) indexed by pairs of vectors \((\alpha , \beta )\) satisfying the lexicographic order relation \(\alpha \le \beta \). Moreover, it can be rewritten as \({\varLambda }_\gamma =\{A_{\alpha _j,\beta _j}:j\in [t]\}\), where \((\alpha _1,\beta _1)<\dots <(\alpha _t,\beta _t)\) and \(t=|{\varLambda }_\gamma |\). Thus, if \(t\ge 2\), we can choose \({\textbf{A}}\) such that for all \(\alpha ,\beta \in {{\mathbb {N}}}^n_k\),
for some \(\mu \in [t]\backslash \{1\}\). We denote by \({\mathcal {B}}_\gamma \) the set of all matrices \({\textbf{A}}\) obtained in this way when \(t=|{\varLambda }_\gamma |\ge 2\); otherwise we set \({\mathcal {B}}_\gamma =\emptyset \). Then \(|{\mathcal {B}}_\gamma |=|{\varLambda }_\gamma |-1 \). From this and since \(( {\mathcal {B}}_\gamma )_{\gamma \in {{\mathbb {N}}}^{n}_{2k}}\) is a sequence of pairwise disjoint subsets of \({\mathcal {S}}_k\),
this count must be equal to \(r\) as in (56). We have thus proved that \(\bigcup _{\gamma \in {{\mathbb {N}}}^{n}_{2k}} {\mathcal {B}}_\gamma \) is a basis of \({\mathcal {V}}^\bot \). From now on we assume that \(\bigcup _{\gamma \in {{\mathbb {N}}}^{n}_{2k}} {\mathcal {B}}_\gamma =\{{\textbf{A}}_1,\dots ,{\textbf{A}}_r\}\).
Let us rewrite the constraints
as \(\langle {\textbf{A}}_j,{\textbf{X}}\rangle =0\), \(j=r+1,\dots ,m-1\) with \({\textbf{X}}={\textbf{P}}_k{\textbf{M}}_k({\textbf{y}}){\textbf{P}}_k\). From (57),
Let \(j\in [l_h]\) and \(\alpha \in {{\mathbb {N}}}^n_{2(k-\lceil h_j \rceil )}\) be fixed. We define \(\tilde{{\textbf{A}}}=({\tilde{A}}_{\mu ,\nu })_{\mu ,\nu \in {{\mathbb {N}}}^n_k}\) as follows:
Then (58) implies that \(\langle \tilde{{\textbf{A}}}, {\textbf{M}}_k({\textbf{y}})\rangle =0\). Since \({\textbf{M}}_k({\textbf{y}})={\textbf{P}}_k^{-1}{\textbf{X}}{\textbf{P}}_k^{-1}\),
where \({\textbf{A}}:={\textbf{P}}_k^{-1}\tilde{{\textbf{A}}}{\textbf{P}}_k^{-1}\), yielding the statement. Thus, we obtain the constraints \(\langle {\textbf{A}}_j,{\textbf{X}}\rangle =0\), \(j\in [m-1]\).
The final constraint \(y_0=1\) can be rewritten as \(\langle {\textbf{A}}_m, {\textbf{X}}\rangle =1\) with \({\textbf{A}}_m\in {\mathcal {S}}_k\) having zero entries except the top left one \([{\textbf{A}}_{m}]_{0,0}=1\). Thus, we select \({\textbf{b}}\) such that all entries of \({\textbf{b}}\) are zeros except \(b_m=1\).
The number m (or \(m_k\) when plugging the relaxation order k) of equality trace constraints \(\langle {\textbf{A}}_j,{\textbf{X}}\rangle =b_j\) is:
The function \(-L_{{\textbf{y}}}(f)=-\sum _\gamma f_\gamma y_{\gamma }\) is equal to \(\langle {\textbf{C}}, {\textbf{X}}\rangle \) with \({\textbf{C}}:={\textbf{P}}_k^{-1}\tilde{{\textbf{C}}}{\textbf{P}}_k^{-1}\), where \(\tilde{{\textbf{C}}}=({\tilde{C}}_{\mu ,\nu })_{\mu ,\nu \in {{\mathbb {N}}}^n_k}\) is defined by:
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Mai, N.H.A., Lasserre, JB. & Magron, V. A hierarchy of spectral relaxations for polynomial optimization. Math. Prog. Comp. 15, 651–701 (2023). https://doi.org/10.1007/s12532-023-00243-7
Keywords
- Polynomial optimization
- Moment-SOS hierarchy
- Maximal eigenvalue minimization
- Limited-memory bundle method
- Nonsmooth optimization
- Semidefinite programming