
A hierarchy of spectral relaxations for polynomial optimization

  • Full Length Paper

Mathematical Programming Computation

Abstract

We show that (1) any constrained polynomial optimization problem (POP) has an equivalent formulation on a variety contained in a Euclidean sphere and (2) the resulting semidefinite relaxations in the moment-SOS hierarchy have the constant trace property (CTP) for the involved matrices. We then exploit the CTP to avoid solving the semidefinite relaxations via interior-point methods, using instead ad hoc spectral methods that minimize the largest eigenvalue of a matrix pencil. Convergence to the optimal value of the semidefinite relaxation is guaranteed. As a result we obtain a hierarchy of nonsmooth “spectral relaxations” of the initial POP. The efficiency and robustness of this spectral hierarchy are tested on several equality-constrained POPs on a sphere as well as on a sample of randomly generated quadratically constrained quadratic problems.


Data availability

All data analyzed during this study are publicly available.

Notes

  1. The vector \({\textbf{v}}_i\) is updated in Step 6 of [56, Algorithm 4.2], within the loop from Step 5 of [56, Algorithm 4.2].

References

  1. Bagirov, A., Karmitsa, N., Mäkelä, M.M.: Introduction to Nonsmooth Optimization: Theory, Practice and Software. Springer, New York (2014)

  2. Ben-Tal, A., Nemirovski, A.: Lectures on Modern Convex Optimization: Analysis, Algorithms, and Engineering Applications, vol. 2. SIAM, Philadelphia (2001)

  3. Bihain, A.: Optimization of upper semidifferentiable functions. J. Optim. Theory Appl. 44(4), 545–568 (1984)

  4. Burer, S., Monteiro, R.D.: A nonlinear programming algorithm for solving semidefinite programs via low-rank factorization. Math. Program. 95(2), 329–357 (2003)

  5. Burke, J.V., Lewis, A.S., Overton, M.L.: A robust gradient sampling algorithm for nonsmooth, nonconvex optimization. SIAM J. Optim. 15(3), 751–779 (2005)

  6. Chandrasekaran, V., Shah, P.: Relative entropy relaxations for signomial optimization. SIAM J. Optim. 26(2), 1147–1173 (2016)

  7. Curtis, F.E., Que, X.: A quasi-Newton algorithm for nonconvex, nonsmooth optimization with global convergence guarantees. Math. Program. Comput. 7(4), 399–428 (2015)

  8. Curto, R.E., Fialkow, L.A.: Truncated K-moment problems in several variables. J. Oper. Theory 189–226 (2005)

  9. Dahl, J.: Semidefinite Optimization using MOSEK. ISMP, Berlin (2012)

  10. d’Aspremont, A., El Karoui, N.: A stochastic smoothing algorithm for semidefinite programming. SIAM J. Optim. 24(3), 1138–1177 (2014)

  11. Ding, L., Yurtsever, A., Cevher, V., Tropp, J.A., Udell, M.: An optimal-storage approach to semidefinite programming using approximate complementarity. SIAM J. Optim. 31(4), 2695–2725 (2021)

  12. Doherty, A.C., Wehner, S.: Convergence of SDP hierarchies for polynomial optimization on the hypersphere. arXiv preprint arXiv:1210.5048, (2012)

  13. Dressler, M., Iliman, S., de Wolff, T.: A Positivstellensatz for sums of nonnegative circuit polynomials. SIAM J. Appl. Algebra Geom. 1(1), 536–555 (2017)

  14. Haarala, M., Miettinen, K., Mäkelä, M.M.: New limited memory bundle method for large-scale nonsmooth optimization. Optim. Methods Softw. 19(6), 673–692 (2004)

  15. Haarala, N., Miettinen, K., Mäkelä, M.M.: Globally convergent limited memory bundle method for large-scale nonsmooth optimization. Math. Program. 109(1), 181–205 (2007)

  16. Helmberg, C., Overton, M.L., Rendl, F.: The spectral bundle method with second-order information. Optim. Methods Softw. 29(4), 855–876 (2014)

  17. Helmberg, C., Rendl, F.: A spectral bundle method for semidefinite programming. SIAM J. Optim. 10(3), 673–696 (2000)

  18. Henrion, D., Lasserre, J.-B.: Detecting global optimality and extracting solutions in GloptiPoly. Posit. Polynomials Control 312, 293–310 (2005)

  19. Henrion, D., Malick, J.: Projection methods in conic optimization. Handb. Semidefinite Conic Polynomial Optim. 565–600 (2012)

  20. Josz, C., Henrion, D.: Strong duality in Lasserre’s hierarchy for polynomial optimization. Optim. Lett. 10(1), 3–10 (2016)

  21. Journée, M., Bach, F., Absil, P.-A., Sepulchre, R.: Low-rank optimization for semidefinite convex problems. arXiv preprint arXiv:0807.4423, (2008)

  22. Karmitsa, N.: LMBM-FORTRAN subroutines for large-scale nonsmooth minimization: user’s manual. TUCS Techn. Rep. 77(856) (2007)

  23. Kiwiel, K.C.: Proximity control in bundle methods for convex nondifferentiable minimization. Math. Program. 46(1–3), 105–122 (1990)

  24. Kiwiel, K.C.: Convergence of the gradient sampling algorithm for nonsmooth nonconvex optimization. SIAM J. Optim. 18(2), 379–388 (2007)

  25. Lasserre, J.B.: Global optimization with polynomials and the problem of moments. SIAM J. Optim. 11(3), 796–817 (2001)

  26. Lasserre, J.-B.: Convergent SDP-relaxations in polynomial optimization with sparsity. SIAM J. Optim. 17(3), 822–843 (2006)

  27. Lasserre, J.-B.: Moments, Positive Polynomials and Their Applications, vol. 1. World Scientific, Singapore (2010)

  28. Lasserre, J.B.: An Introduction to Polynomial and Semi-algebraic Optimization, vol. 52. Cambridge University Press, Cambridge (2015)

  29. Lasserre, J.B.: Homogeneous polynomials and spurious local minima on the unit sphere. Optim. Lett. 1–14 (2021)

  30. Lasserre, J.B., Laurent, M., Rostalski, P.: Semidefinite characterization and computation of zero-dimensional real radical ideals. Found. Comput. Math. 8(5), 607–647 (2008)

  31. Lasserre, J.B., Toh, K.-C., Yang, S.: A bounded degree SOS hierarchy for polynomial optimization. EURO J. Comput. Optim. 5(1–2), 87–117 (2017)

  32. Laurent, M.: Revisiting two theorems of Curto and Fialkow on moment matrices. Proc. Am. Math. Soc. 133(10), 2965–2976 (2005)

  33. Lehoucq, R.B., Sorensen, D.C., Yang, C.: ARPACK Users’ Guide: Solution of Large-Scale Eigenvalue Problems with Implicitly Restarted Arnoldi Methods. SIAM, Philadelphia (1998)

  34. Lewis, A.S., Overton, M.L.: Nonsmooth optimization via quasi-Newton methods. Math. Program. 141(1–2), 135–163 (2013)

  35. Mai, N.H.A., Bhardwaj, A., Magron, V.: The constant trace property in noncommutative optimization. In: Proceedings of the 2021 International Symposium on Symbolic and Algebraic Computation, ISSAC ’21, pp. 297–304. Association for Computing Machinery, New York, NY, USA (2021)

  36. Mai, N.H.A., Lasserre, J.-B., Magron, V., Wang, J.: Exploiting constant trace property in large-scale polynomial optimization. ACM Trans. Math. Softw. 48(4), 1–39 (2022)

  37. Mifflin, R.: An algorithm for constrained optimization with semismooth functions. Math. Oper. Res. 2(2), 191–207 (1977)

  38. Navascués, M., Pironio, S., Acín, A.: A convergent hierarchy of semidefinite programs characterizing the set of quantum correlations. New J. Phys. 10(7), 073013 (2008)

  39. Nemirovsky, A., Yudin, D.: Problem complexity and method efficiency in optimization. Nauka (1983)

  40. Nie, J.: Optimality conditions and finite convergence of Lasserre’s hierarchy. Math. Program. 146(1–2), 97–121 (2014)

  41. Nie, J., Schweighofer, M.: On the complexity of Putinar’s Positivstellensatz. J. Complex. 23(1), 135–150 (2007)

  42. Nocedal, J.: Updating quasi-Newton matrices with limited storage. Math. Comput. 35(151), 773–782 (1980)

  43. Nocedal, J., Wright, S.: Numerical Optimization. Springer, New York (2006)

  44. Overton, M.L.: Large-scale optimization of eigenvalues. SIAM J. Optim. 2(1), 88–120 (1992)

  45. Overton, M.L., Womersley, R.S.: Second derivatives for optimizing eigenvalues of symmetric matrices. SIAM J. Matrix Anal. Appl. 16(3), 697–718 (1995)

  46. Saad, Y.: Numerical Methods for Large Eigenvalue Problems, Revised edn. SIAM, Philadelphia (2011)

  47. Schweighofer, M.: On the complexity of Schmüdgen’s Positivstellensatz. J. Complex. 20(4), 529–543 (2004)

  48. Shor, N.Z.: Quadratic optimization problems. Sov. J. Comput. Syst. Sci. 25, 1–11 (1987)

  49. Trnovska, M.: Strong duality conditions in semidefinite programming. J. Electr. Eng. 56(12), 1–5 (2005)

  50. Waki, H., Kim, S., Kojima, M., Muramatsu, M.: Sums of squares and semidefinite programming relaxations for polynomial optimization problems with structured sparsity. SIAM J. Optim. 17(1), 218–242 (2006)

  51. Wang, J., Magron, V.: A second order cone characterization for sums of nonnegative circuits. In: Proceedings of the 45th International Symposium on Symbolic and Algebraic Computation, pp. 450–457 (2020)

  52. Wang, J., Magron, V., Lasserre, J.-B.: Chordal-TSSOS: a moment-SOS hierarchy that exploits term sparsity with chordal extension. SIAM J. Optim. 31(1), 114–141 (2021)

  53. Wang, J., Magron, V., Lasserre, J.-B.: TSSOS: a moment-SOS hierarchy that exploits term sparsity. SIAM J. Optim. 31(1), 30–58 (2021)

  54. Wang, J., Magron, V., Lasserre, J.B., Mai, N.H.A.: CS-TSSOS: correlative and term sparsity for large-scale polynomial optimization. ACM Trans. Math. Softw. 48(4), 1–26 (2022)

  55. Weisser, T., Legat, B., Coey, C., Kapelevich, L., Vielma, J.P.: Polynomial and moment optimization in Julia and JuMP. In: JuliaCon (2019)

  56. Yurtsever, A., Tropp, J.A., Fercoq, O., Udell, M., Cevher, V.: Scalable semidefinite programming. SIAM J. Math. Data Sci. 3(1), 171–200 (2021)

Acknowledgements

The authors would like to thank the associate editor and anonymous reviewers, whose insightful comments and careful proof-checks helped to improve the paper.

Funding

The first author was supported by the MESRI funding from EDMITT. The second author was supported by the FMJH Program PGMO (EPICS project) and by EDF, Thales, Orange and Criteo, as well as by the Tremplin ERC Stg Grant ANR-18-ERC2-0004-01 (T-COPS project). This work has benefited from the European Union’s Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie Actions, grant agreement 813211 (POEMA), as well as from the AI Interdisciplinary Institute ANITI funding, through the French “Investing for the Future PIA3” program under the Grant agreement n\(^{\circ }\)ANR-19-PI3A-0004. The third author was supported by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (Grant Agreement 666981 TAMING).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Ngoc Hoang Anh Mai.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Code availability

The full code was made available for review. The packages used in this study are either open source or available for academic use; specific references are included in this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Appendix

1.1 Spectral minimizations of SDP

In this section, we provide the proofs of lemmas stated in Sects. 2.6.1 and 2.6.3. First we recall the following useful properties of \({\mathcal {S}}\) and \({\mathcal {S}}^+\):

  • If \({\textbf{X}}={{\,\textrm{diag}\,}}({\textbf{X}}_1,\dots ,{\textbf{X}}_l)\in {\mathcal {S}}\),

    $$\begin{aligned} {\textbf{X}}\succeq 0\Longleftrightarrow {\textbf{X}}_j\succeq 0,\,j\in [l]\qquad \text { and }\qquad {{\,\textrm{trace}\,}}({\textbf{X}})=\sum _{j=1}^l {{\,\textrm{trace}\,}}({\textbf{X}}_j). \end{aligned}$$
    (49)
  • If \({\textbf{A}}={{\,\textrm{diag}\,}}({\textbf{A}}_1,\dots ,{\textbf{A}}_l)\in {\mathcal {S}}\) and \({\textbf{B}}={{\,\textrm{diag}\,}}({\textbf{B}}_1,\dots ,{\textbf{B}}_l)\in {\mathcal {S}}\),

    $$\begin{aligned} \langle {\textbf{A}}, {\textbf{B}}\rangle = \sum _{j=1}^l\langle {\textbf{A}}_j, {\textbf{B}}_j\rangle . \end{aligned}$$
    (50)

1.1.1 SDP with constant trace property

Proof of Lemma 3

The proof of (20) is similar in spirit to the one of Helmberg and Rendl in [17, Sect. 2]. Here, we extend this proof to SDP (13), which involves a block-diagonal positive semidefinite matrix. From (13),

$$\begin{aligned} -\tau = \sup _{{\textbf{X}}\in {\mathcal {S}}}\{ \langle {\textbf{C}},{\textbf{X}}\rangle :\,{\mathcal {A}} {\textbf{X}}={\textbf{b}},\,{{\,\textrm{trace}\,}}({\textbf{X}})=a,\, {\textbf{X}} \succeq 0\}. \end{aligned}$$

The dual of this SDP reads:

$$\begin{aligned} -\rho = \inf _{({\textbf{z}},\zeta )} \{ {\textbf{b}}^\top {\textbf{z}}+a\zeta :\, {\mathcal {A}}^\top {\textbf{z}}+\zeta {\textbf{I}}-{\textbf{C}}\succeq 0 \}, \end{aligned}$$

where \({\textbf{I}}\) is the identity matrix of size s. From this,

$$\begin{aligned} \begin{array}{rl} -\rho &{}= \inf _{({\textbf{z}},\zeta )} \{ {\textbf{b}}^\top {\textbf{z}}+a\zeta :\, \zeta \ge \lambda _1({\textbf{C}}-{\mathcal {A}}^\top {\textbf{z}})\} \\ &{} = \inf \{a\lambda _1({\textbf{C}}-{\mathcal {A}}^\top {\textbf{z}})+ {\textbf{b}}^\top {\textbf{z}} :\,{\textbf{z}}\in {{\mathbb {R}}}^m \}. \end{array} \end{aligned}$$

Since \(\rho =\tau \), (20) follows. For the second statement, let \({\textbf{z}}^\star \) be an optimal solution of SDP (14). Then \({\textbf{b}}^\top {\textbf{z}}^\star =-\rho =-\tau \). In addition, \({\textbf{C}}-{\mathcal {A}}^\top {\textbf{z}}^\star \preceq 0\) implies that

$$\begin{aligned} \lambda _1({\textbf{C}}-{\mathcal {A}}^\top {{\textbf{z}}^\star })\le 0, \end{aligned}$$

so that \(\varphi ({\textbf{z}}^\star )\le -\tau \). Note that (20) indicates that \(\varphi ({\textbf{z}}^\star )\ge -\tau \). Thus, \(\varphi ({\textbf{z}}^\star )=-\tau \), yielding the second statement. \(\square \)
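To make the spectral formulation (20) concrete, here is a minimal numerical sketch of an evaluation oracle for \(\varphi \) together with one subgradient obtained from (51)–(52), followed by a plain subgradient-descent loop on a random instance. The function name `phi_and_subgrad` and the representation of \({\mathcal {A}}\) as a Python list of dense symmetric matrices `A_list` are our own illustrative choices; the paper itself relies on the LMBM bundle method and sparse eigensolvers (cf. [15, 33]) rather than this naive scheme.

```python
import numpy as np

def phi_and_subgrad(z, C, A_list, b, a):
    """Evaluate phi(z) = a*lambda_1(C - A^T z) + b^T z and return one subgradient.

    A_list holds the symmetric matrices A_j defining the linear map
    (A X)_j = <A_j, X>, so that A^T z = sum_j z_j A_j.  By (51)-(52), a
    subgradient is b - a * (<A_j, u u^T>)_j for any unit top eigenvector u."""
    M = C - sum(zj * Aj for zj, Aj in zip(z, A_list))
    w, V = np.linalg.eigh(M)              # eigenvalues in ascending order
    lam1, u = w[-1], V[:, -1]             # largest eigenvalue and an eigenvector
    val = a * lam1 + b @ z
    g = b - a * np.array([u @ (Aj @ u) for Aj in A_list])
    return val, g

# Tiny random instance, minimized by plain subgradient descent for illustration only.
rng = np.random.default_rng(0)
n, m, a = 4, 2, 1.0
C = rng.standard_normal((n, n)); C = (C + C.T) / 2
A_list = [(lambda A: (A + A.T) / 2)(rng.standard_normal((n, n))) for _ in range(m)]
b = rng.standard_normal(m)
z = np.zeros(m)
for t in range(200):
    _, g = phi_and_subgrad(z, C, A_list, b, a)
    z -= 0.1 / np.sqrt(t + 1) * g         # diminishing step sizes
```

The only expensive operation per iteration is one extreme-eigenpair computation, which is the point of the spectral reformulation.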

The following proposition recalls the differentiability properties of \(\varphi \).

Proposition 8

The function \(\varphi \) in (19) has the following properties:

  1. 1.

    \(\varphi \) is convex and continuous but not differentiable.

  2. 2.

    The subdifferential of \(\varphi \) at \({\textbf{z}}\) reads:

    $$\begin{aligned} \partial \varphi ({\textbf{z}})=\{{\textbf{b}}-a{\mathcal {A}}{\textbf{W}} \,\ {\textbf{W}}\in {{\,\textrm{conv}\,}}({\varGamma }({\textbf{C}}-{\mathcal {A}}^\top {\textbf{z}}))\}, \end{aligned}$$
    (51)

    where for each \({\textbf{A}}\in {\mathcal {S}}\),

    $$\begin{aligned} {\varGamma }({\textbf{A}}):=\{{\textbf{u}}{\textbf{u}}^\top \,\ {\textbf{A}}{\textbf{u}}=\lambda _1({\textbf{A}}){\textbf{u}}\,\ \Vert {\textbf{u}}\Vert _2=1\}. \end{aligned}$$
    (52)

Proof

Properties 1–2 are from Helmberg–Rendl [17, Sect. 2] (see also [44, (4)]). \(\square \)

The following result is useful to recover an optimal solution of SDP (13) from an optimal solution of NSOP (20).

Lemma 8

Let \({\bar{\textbf{z}}}\) be an optimal solution of NSOP (20). Then:

  1. 1.

    There exists \({\textbf{X}}^\star \in a{{\,\textrm{conv}\,}}({\varGamma }({\textbf{C}}-{\mathcal {A}}^\top {\bar{\textbf{z}}}))\) such that \({\mathcal {A}}{\textbf{X}}^\star ={\textbf{b}}\).

  2. 2.

    Let \({\textbf{u}}\) be a normalized eigenvector corresponding to \(\lambda _1({\textbf{C}}-{\mathcal {A}}^\top {\bar{\textbf{z}}})\). If \(\lambda _1({\textbf{C}}-{\mathcal {A}}^\top {\bar{\textbf{z}}})\) has multiplicity 1 then \({\textbf{X}}^\star =a{\textbf{u}}{\textbf{u}}^\top \), thus \({\textbf{X}}^\star \) is an optimal solution of SDP (13).

Proof

By [1, Theorem 4.2], \({\textbf{0}}\in \partial \varphi ({\bar{\textbf{z}}})\). Combining this with Proposition 8.2, the first statement follows.

We next prove the second statement. Assume that \(\lambda _1({\textbf{C}}-{\mathcal {A}}^\top {\bar{\textbf{z}}})\) has multiplicity 1. Then \({\varGamma }({\textbf{C}}-{\mathcal {A}}^\top {\bar{\textbf{z}}})=\{{\textbf{u}} {\textbf{u}}^\top \}\). The first statement implies that \({\textbf{X}}^\star =a{\textbf{u}}{\textbf{u}}^\top \). From this, \({\textbf{X}}^\star \succeq 0\) and \({\mathcal {A}}{\textbf{X}}^\star ={\textbf{b}}\), so that \({\textbf{X}}^\star \) is a feasible solution of SDP (13). Moreover,

$$\begin{aligned} \begin{array}{rl} \langle {\textbf{C}}, {\textbf{X}}^\star \rangle &{}=\langle {\textbf{C}}-{\mathcal {A}}^\top {\bar{\textbf{z}}}, {\textbf{X}}^\star \rangle +\langle {\mathcal {A}}^\top {\bar{\textbf{z}}}, {\textbf{X}}^\star \rangle \\ &{}=a \langle {\textbf{C}}-{\mathcal {A}}^\top {\bar{\textbf{z}}},{\textbf{u}}{\textbf{u}}^\top \rangle +{\bar{\textbf{z}}}^\top ( {\mathcal {A}} {\textbf{X}}^\star )\\ &{}=a{\textbf{u}}^\top ( {\textbf{C}}-{\mathcal {A}}^\top {\bar{\textbf{z}}}){\textbf{u}}+{\bar{\textbf{z}}}^\top {\textbf{b}}\\ &{}=a\lambda _1({\textbf{C}}-{\mathcal {A}}^\top {\bar{\textbf{z}}})\Vert {\textbf{u}}\Vert _2^2+{\bar{\textbf{z}}}^\top {\textbf{b}}\\ &{}=a\lambda _1({\textbf{C}}-{\mathcal {A}}^\top {\bar{\textbf{z}}})+{\bar{\textbf{z}}}^\top {\textbf{b}}=\varphi ({\bar{\textbf{z}}})=-\tau . \end{array} \end{aligned}$$

Thus, \(\langle {\textbf{C}}, {\textbf{X}}^\star \rangle =-\tau \), yielding the second statement. \(\square \)
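Lemma 8.2 yields a simple recovery procedure: compute a unit top eigenvector \({\textbf{u}}\) of \({\textbf{C}}-{\mathcal {A}}^\top {\bar{\textbf{z}}}\), check that the top eigenvalue is (numerically) simple, and form \(a{\textbf{u}}{\textbf{u}}^\top \). A sketch under these assumptions, with the hypothetical helper name `recover_X` and dense matrices:

```python
import numpy as np

def recover_X(z_bar, C, A_list, a, tol=1e-8):
    """Candidate primal recovery X* = a * u u^T from an optimal z_bar (Lemma 8.2).

    Only valid when lambda_1(C - A^T z_bar) is simple; we check the gap to the
    second-largest eigenvalue numerically and return None otherwise."""
    M = C - sum(zj * Aj for zj, Aj in zip(z_bar, A_list))
    w, V = np.linalg.eigh(M)
    if len(w) > 1 and w[-1] - w[-2] < tol:    # top eigenvalue not (numerically) simple
        return None
    u = V[:, -1]
    return a * np.outer(u, u)
```

When the multiplicity is larger than one, Lemma 8.1 only guarantees that some \({\textbf{X}}^\star \) in the convex hull \(a{{\,\textrm{conv}\,}}({\varGamma }({\textbf{C}}-{\mathcal {A}}^\top {\bar{\textbf{z}}}))\) is feasible, so the sketch conservatively returns `None` in that case.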

To obtain a convergence guarantee when solving NSOP (20) by LMBM [15, Algorithm 1], we need the following technical lemma:

Lemma 9

When applied to problem NSOP (20), the LMBM algorithm is globally convergent.

Proof

The convexity of \(\varphi \) yields that \(\varphi \) is weakly upper semismooth on \({{\mathbb {R}}}^m\) according to [37, Proposition 5]. From this, \(\varphi \) is upper semidifferentiable on \({{\mathbb {R}}}^m\) by using [3, Theorem 3.1]. Combining this with the fact that \(\varphi \) is bounded from below on \({{\mathbb {R}}}^m\), the result follows thanks to [3, Sect. 5] (see also the final statement of [1, Sect. 14.2]). \(\square \)

1.1.2 SDP with bounded trace property

Proof of Lemma 4

Let \({\textbf{X}}^\star \) be an optimal solution of SDP (13) and set . By Condition 4 of Assumption 1, one has

(53)

Similarly to the proof of Lemma 3, one obtains:

(54)

Let us prove that

$$\begin{aligned} \psi ({\textbf{z}})\ge -\tau ,\,\forall {\textbf{z}}\in {{\mathbb {R}}}^m. \end{aligned}$$
(55)

Let \({\textbf{z}}\in {{\mathbb {R}}}^m\) be fixed and consider the following two cases:

  • Case 1 \(\lambda _1({\textbf{C}}-{\mathcal {A}}^\top {\textbf{z}})>0\). By (53) and (54),

    Thus, \(\psi ({\textbf{z}})\ge -\tau \).

  • Case 2 \(\lambda _1({\textbf{C}}-{\mathcal {A}}^\top {\textbf{z}})\le 0\). Then \({\mathcal {A}}^\top {\textbf{z}}-{\textbf{C}}\succeq 0\) and \(\psi ({\textbf{z}})={\textbf{b}}^\top {\textbf{z}}\ge -\rho =-\tau \) by (14).

Let \(({\textbf{z}}^{(j)})_{j\in {{\mathbb {N}}}}\) be a minimizing sequence of SDP (14). Then \(\lambda _1({\textbf{C}}-{\mathcal {A}}^\top {\textbf{z}}^{(j)})\le 0\) for all \(j\in {{\mathbb {N}}}\), since \({\mathcal {A}}^\top {\textbf{z}}^{(j)}-{\textbf{C}}\succeq 0\), and \({\textbf{b}}^\top {\textbf{z}}^{(j)}\rightarrow -\tau \) as \(j\rightarrow \infty \) since \(\tau =\rho \). This implies that \(\psi ({\textbf{z}}^{(j)})={\textbf{b}}^\top {\textbf{z}}^{(j)}\rightarrow -\tau \) as \(j\rightarrow \infty \). Combining this with (55), the first statement follows.

For the second statement, let \({\textbf{z}}^\star \) be an optimal solution of SDP (14). Since \({\mathcal {A}}^\top {\textbf{z}}^\star -{\textbf{C}}\succeq 0\), \(\lambda _1({\textbf{C}}-{\mathcal {A}}^\top {\textbf{z}}^\star )\le 0\) and thus \(\psi ({\textbf{z}}^\star )={\textbf{b}}^\top {\textbf{z}}^\star = -\rho =-\tau \). Thus, \({\textbf{z}}^\star \) is an optimal solution of (26), yielding the second statement. \(\square \)

We consider the differentiability properties of \(\psi \) in the following proposition:

Proposition 9

The function \(\psi \) has the following properties:

  1. 1.

    \(\psi \) is convex and continuous but not differentiable.

  2. 2.

    The subdifferential of \(\psi \) at \({\textbf{z}}\) reads:

    $$\begin{aligned} \partial \psi ({\textbf{z}})={\left\{ \begin{array}{ll} \{{\textbf{b}}\}\text { if }\lambda _1({\textbf{C}}-{\mathcal {A}}^\top {\textbf{z}})< 0 ,\\ \{{\textbf{b}}-a{\mathcal {A}}{\textbf{W}} \,\ {\textbf{W}}\in {{\,\textrm{conv}\,}}({\varGamma }({\textbf{C}}-{\mathcal {A}}^\top {\textbf{z}}))\}\text { if }\lambda _1({\textbf{C}}-{\mathcal {A}}^\top {\textbf{z}})> 0,\\ \{{\textbf{b}}-\zeta a{\mathcal {A}}{\textbf{W}} \,\ \zeta \in [0,1],\,{\textbf{W}}\in {{\,\textrm{conv}\,}}({\varGamma }({\textbf{C}}-{\mathcal {A}}^\top {\textbf{z}}))\}\text { otherwise},\\ \end{array}\right. } \end{aligned}$$

    where \({\varGamma }(.)\) is defined as in (52).

Proof

Note that \(\psi \) is the maximum of two convex functions, i.e.,

$$\begin{aligned} \psi ({\textbf{z}})=\max \{\varphi _1({\textbf{z}}),\varphi _2({\textbf{z}})\} , \end{aligned}$$

with \(\varphi _1({\textbf{z}})=a\lambda _1({\textbf{C}}-{\mathcal {A}}^\top {\textbf{z}})+{\textbf{b}}^\top {\textbf{z}}\) and \(\varphi _2({\textbf{z}})={\textbf{b}}^\top {\textbf{z}}\). Thus, \(\psi \) is convex and

$$\begin{aligned} \partial \psi ({\textbf{z}})= {\left\{ \begin{array}{ll} \partial \varphi _1({\textbf{z}}) &{}\text { if }\varphi _1({\textbf{z}})>\varphi _2({\textbf{z}}),\\ {{\,\textrm{conv}\,}}(\partial \varphi _1({\textbf{z}})\cup \partial \varphi _2({\textbf{z}}))&{}\text { if }\varphi _1({\textbf{z}})=\varphi _2({\textbf{z}}),\\ \partial \varphi _2({\textbf{z}}) &{}\text { otherwise}. \end{array}\right. } \end{aligned}$$

Note that \(\partial \varphi _2({\textbf{z}})=\{{\textbf{b}}\}\) and \(\partial \varphi _1({\textbf{z}})\) is computed as in formula (51). Thus, the result follows. \(\square \)
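The case analysis of Proposition 9 translates directly into an evaluation oracle for \(\psi \). The sketch below (hypothetical name `psi_and_subgrad`, dense matrices; in the tie case \(\lambda _1=0\) we simply pick \(\zeta =1\), which is one valid element of the convex hull) mirrors the subdifferential formula of Proposition 9:

```python
import numpy as np

def psi_and_subgrad(z, C, A_list, b, a):
    """Evaluate psi(z) = a*max{lambda_1(C - A^T z), 0} + b^T z and one subgradient.

    Cases from Proposition 9: b when lambda_1 < 0; b - a*A(u u^T) when
    lambda_1 > 0; when lambda_1 = 0 we pick zeta = 1, which also yields
    b - a*A(u u^T), an element of the subdifferential."""
    M = C - sum(zj * Aj for zj, Aj in zip(z, A_list))
    w, V = np.linalg.eigh(M)
    lam1, u = w[-1], V[:, -1]
    val = a * max(lam1, 0.0) + b @ z
    if lam1 < 0:
        g = b.astype(float).copy()
    else:
        g = b - a * np.array([u @ (Aj @ u) for Aj in A_list])
    return val, g
```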

The following lemma is useful to recover an optimal solution of SDP (13) from an optimal solution of NSOP (26).

Lemma 10

Assume that \({\bar{\textbf{z}}}\) is an optimal solution of NSOP (26). The following statements are true:

  1. 1.

    There exists

    $$\begin{aligned} {\textbf{X}}^\star {\left\{ \begin{array}{ll} = {\textbf{0}}&{} \text { if } \lambda _1({\textbf{C}}-{\mathcal {A}}^\top {\bar{\textbf{z}}})< 0,\\ \in \zeta a{{\,\textrm{conv}\,}}({\varGamma }({\textbf{C}}-{\mathcal {A}}^\top {\bar{\textbf{z}}})) &{}\text { if } \lambda _1({\textbf{C}}-{\mathcal {A}}^\top {\bar{\textbf{z}}})=0,\\ \in a{{\,\textrm{conv}\,}}({\varGamma }({\textbf{C}}-{\mathcal {A}}^\top {\bar{\textbf{z}}}))&{} \text { otherwise}, \end{array}\right. } \end{aligned}$$

    for some \(\zeta \in [0,1]\) such that \({\mathcal {A}}{\textbf{X}}^\star ={\textbf{b}}\).

  2. 2.

    Let \({\textbf{u}}\) be a normalized eigenvector corresponding to \(\lambda _1({\textbf{C}}-{\mathcal {A}}^\top {\bar{\textbf{z}}})\). If \(\lambda _1({\textbf{C}}-{\mathcal {A}}^\top {\bar{\textbf{z}}})\) has multiplicity 1, then \({\textbf{X}}^\star = {{\bar{\xi }}}{\textbf{u}}{\textbf{u}}^\top \) with \({{\bar{\xi }}}\) defined as in (27) and \({\textbf{X}}^\star \) is an optimal solution of SDP (13).

Proof

Due to [1, Theorem 4.2], \({\textbf{0}}\in \partial \psi ({\bar{\textbf{z}}})\). From this and by Proposition 9.2, the first statement follows. Let us prove the second statement. Assume that \(\lambda _1({\textbf{C}}-{\mathcal {A}}^\top {\bar{\textbf{z}}})\) has multiplicity 1. Then one has \({\varGamma }({\textbf{C}}-{\mathcal {A}}^\top {\bar{\textbf{z}}})=\{{\textbf{u}}{\textbf{u}}^\top \}\), yielding \({\textbf{X}}^\star ={{\bar{\xi }}}{\textbf{u}}{\textbf{u}}^\top \) with \({{\bar{\xi }}}\) defined as in (27), so that \({\textbf{X}}^\star \succeq 0\). From this and since \({\mathcal {A}}{\textbf{X}}^\star ={\textbf{b}}\), \({\textbf{X}}^\star \) is a feasible solution of SDP (13). Moreover,

$$\begin{aligned} \begin{array}{rl} \langle {\textbf{C}}, {\textbf{X}}^\star \rangle &{}=\langle {\textbf{C}}-{\mathcal {A}}^\top {\bar{\textbf{z}}}, {\textbf{X}}^\star \rangle +\langle {\mathcal {A}}^\top {\bar{\textbf{z}}}, {\textbf{X}}^\star \rangle \\ &{}={{\bar{\xi }}}\langle {\textbf{C}}-{\mathcal {A}}^\top {\bar{\textbf{z}}},{\textbf{u}}{\textbf{u}}^\top \rangle +{\bar{\textbf{z}}}^\top ( {\mathcal {A}} {\textbf{X}}^\star )\\ &{}= {{\bar{\xi }}}{\textbf{u}}^\top ( {\textbf{C}}-{\mathcal {A}}^\top {\bar{\textbf{z}}}){\textbf{u}}+{\bar{\textbf{z}}}^\top {\textbf{b}}\\ &{}=\lambda _1({\textbf{C}}-{\mathcal {A}}^\top {\bar{\textbf{z}}}) {{\bar{\xi }}}\Vert {\textbf{u}}\Vert _2^2+{\bar{\textbf{z}}}^\top {\textbf{b}}\\ &{}=\lambda _1({\textbf{C}}-{\mathcal {A}}^\top {\bar{\textbf{z}}}){{\bar{\xi }}}+{\bar{\textbf{z}}}^\top {\textbf{b}}\\ &{}=a\max \{\lambda _1({\textbf{C}}-{\mathcal {A}}^\top {\bar{\textbf{z}}}),0\}+{\bar{\textbf{z}}}^\top {\textbf{b}}=\psi ({\bar{\textbf{z}}})=-\tau . \end{array} \end{aligned}$$

Thus, \(\langle {\textbf{C}}, {\textbf{X}}^\star \rangle =-\tau \), yielding the second statement. \(\square \)

The next result proves that when applied to NSOP (26), the LMBM algorithm [15, Algorithm 1] converges.

Lemma 11

LMBM applied to NSOP (26) is globally convergent.

The proof of Lemma 11 is similar to that of Lemma 9.

1.2 Converting moment relaxations to standard SDP

We will present a way to transform SDP (31) into the form (33), recalled as follows:

$$\begin{aligned} -\tau _k = \sup _{{\textbf{X}}\in {\mathcal {S}}_k} \{ \langle {\textbf{C}},{\textbf{X}}\rangle :\,\langle {\textbf{A}}_j, {\textbf{X}}\rangle ={\textbf{b}}_j,\,j\in [m],\, {\textbf{X}} \succeq 0\}. \end{aligned}$$

Let \(k\in {{\mathbb {N}}}\) be fixed. We will prove that there exists \({\textbf{A}}_j\in {\mathcal {S}}_k\), \(j\in [r]\), such that \({\textbf{X}}={\textbf{P}}_k{\textbf{M}}_k({\textbf{y}}){\textbf{P}}_k\) for some if and only if \(\langle {\textbf{A}}_j, {\textbf{X}}\rangle =0\), \(j\in [r]\). Let . Then \({\mathcal {V}}\) is a linear subspace of \({\mathcal {S}}_k\) and . We take a basis \({\textbf{A}}_1,\dots ,{\textbf{A}}_r\) of the orthogonal complement \({\mathcal {V}}^\bot \) of \({\mathcal {V}}\). Notice that

(56)

with \({\textbf{X}}\in {\mathcal {S}}_k\), it implies that \({\textbf{X}}\in {\mathcal {V}}\) if and only if \(\langle {\textbf{A}}_j, {\textbf{X}}\rangle =0\), \(j\in [r]\).

Let us find such a basis \({\textbf{A}}_1,\dots ,{\textbf{A}}_r\). Let \({\textbf{A}}=(A_{\alpha ,\beta })_{\alpha ,\beta \in {{\mathbb {N}}}^n_k}\in {\mathcal {V}}^\bot \). Then for all \({\textbf{X}}=(X_{\alpha ,\beta })_{\alpha ,\beta \in {{\mathbb {N}}}^n_k}\in {\mathcal {V}}\), \(\langle {\textbf{A}}, {\textbf{X}}\rangle =0\). Note that if \({\textbf{X}}={\textbf{P}}_k{\textbf{M}}_k({\textbf{y}}){\textbf{P}}_k\), then one has

$$\begin{aligned} X_{\alpha ,\beta }=w_{\alpha ,\beta }y_{\alpha +\beta },\,\forall \alpha ,\beta \in {{\mathbb {N}}}^n_k, \end{aligned}$$

with \(w_{\alpha ,\beta }:=\theta _{k,\alpha }^{1/2}\theta _{k,\beta }^{1/2}\), for all \(\alpha ,\beta \in {{\mathbb {N}}}^n_k\). It implies that

Let \(\gamma \in {{\mathbb {N}}}^n_{2k}\) be fixed and let be such that for \(\xi \in {{\mathbb {N}}}^n_{2k}\),

$$\begin{aligned} y_\xi ={\left\{ \begin{array}{ll} 0&{}\text { if }\xi \ne \gamma ,\\ 1&{}\text { otherwise.} \end{array}\right. } \end{aligned}$$

Then

If \(\gamma \not \in 2{{\mathbb {N}}}^n\), we do not have the first term in the latter equality. Let us define

$$\begin{aligned} {\varLambda }_\gamma :=\{A_{\alpha ,\beta }:\,\alpha ,\beta \in {{\mathbb {N}}}^n_k,\, \alpha +\beta =\gamma ,\, \alpha \le \beta \}. \end{aligned}$$

Note that \({\varLambda }_\gamma \) consists of the entries of \({\textbf{A}}\) indexed by pairs of vectors \((\alpha , \beta )\) satisfying the lexicographic order relation \(\alpha \le \beta \). Moreover, it can be rewritten as \({\varLambda }_\gamma =\{A_{\alpha _j,\beta _j},\,j\in [t]\}\) where \((\alpha _1,\beta _1)<\dots <(\alpha _t,\beta _t)\) and \(t=|{\varLambda }_\gamma |\). Thus, if \(t\ge 2\), we can choose \({\textbf{A}}\) such that for all \(\alpha ,\beta \in {{\mathbb {N}}}^n_k\),

$$\begin{aligned} A_{\alpha ,\beta }={\left\{ \begin{array}{ll} w_{\alpha _\mu ,\beta _\mu }&{} \text { if } \alpha _1=\beta _1 \text { and }(\alpha ,\beta ) =(\alpha _1,\beta _1),\\ \frac{1}{2} w_{\alpha _\mu ,\beta _\mu }&{} \text { if } \alpha _1< \beta _1 \text { and }(\alpha ,\beta ) \in \{(\alpha _1,\beta _1),(\beta _1,\alpha _1)\},\\ -w_{\alpha _1,\beta _1}&{} \text { if } \alpha _\mu =\beta _\mu \text { and }(\alpha ,\beta ) =(\alpha _\mu ,\beta _\mu ),\\ -\frac{1}{2} w_{\alpha _1,\beta _1}&{} \text { if } \alpha _\mu < \beta _\mu \text { and }(\alpha ,\beta ) \in \{(\alpha _\mu ,\beta _\mu ),(\beta _\mu ,\alpha _\mu )\},\\ 0 &{} \text { otherwise}, \end{array}\right. } \end{aligned}$$

for some \(\mu \in [t]\backslash \{1\}\). We denote by \({\mathcal {B}}_\gamma \) the set of all such matrices \({\textbf{A}}\) when \(t=|{\varLambda }_\gamma |\ge 2\); otherwise we let \({\mathcal {B}}_\gamma =\emptyset \). Then \(|{\mathcal {B}}_\gamma |=|{\varLambda }_\gamma |-1 \). From this and since \(( {\mathcal {B}}_\gamma )_{\gamma \in {{\mathbb {N}}}^{n}_{2k}}\) is a sequence of pairwise disjoint subsets of \({\mathcal {S}}_k\),

This number must be equal to \(r\) as in (56). We have just proved that \(\bigcup _{\gamma \in {{\mathbb {N}}}^{n}_{2k}} {\mathcal {B}}_\gamma \) is a basis of \({\mathcal {V}}^\bot \). We now assume that \(\bigcup _{\gamma \in {{\mathbb {N}}}^{n}_{2k}} {\mathcal {B}}_\gamma =\{{\textbf{A}}_1,\dots ,{\textbf{A}}_r\}\).

Let us rewrite the constraints

$$\begin{aligned} {\textbf{M}}_{k - \lceil h_j \rceil }(h_j\;{\textbf{y}}) = 0,\, j\in [l_h], \end{aligned}$$
(57)

as \(\langle {\textbf{A}}_j,{\textbf{X}}\rangle =0\), \(j=r+1,\dots ,m-1\) with \({\textbf{X}}={\textbf{P}}_k{\textbf{M}}_k({\textbf{y}}){\textbf{P}}_k\). From (57),

$$\begin{aligned} \sum _{\gamma \in {{\mathbb {N}}}^n_{2\lceil h_j \rceil }}h_{j,\gamma }y_{\alpha +\gamma }=0,\,\alpha \in {{\mathbb {N}}}^n_{2(k-\lceil h_j \rceil )},\,j\in [l_h]. \end{aligned}$$
(58)

Let \(j\in [l_h]\) and \(\alpha \in {{\mathbb {N}}}^n_{2(k-\lceil h_j \rceil )}\) be fixed. We define \(\tilde{{\textbf{A}}}=({\tilde{A}}_{\mu ,\nu })_{\mu ,\nu \in {{\mathbb {N}}}^n_k}\) as follows:

$$\begin{aligned} {\tilde{A}}_{\mu ,\nu }={\left\{ \begin{array}{ll} h_{j,\gamma }&{}\text { if }\mu =\nu ,\,\mu +\nu =\alpha +\gamma ,\\ \frac{1}{2} h_{j,\gamma }&{}\text { if }\mu \ne \nu ,\,\mu +\nu =\alpha +\gamma ,\\ &{}\qquad \text { and } (\mu ,\nu )\le ({{\bar{\mu }}},{{\bar{\nu }}}),\,\forall {{\bar{\mu }}},{{\bar{\nu }}}\in {{\mathbb {N}}}^n_k\, \text { such that} \,{{\bar{\mu }}}+{{\bar{\nu }}}=\alpha +\gamma ,\\ 0&{}\text { otherwise.} \end{array}\right. } \end{aligned}$$
(59)

Then (58) implies that \(\langle \tilde{{\textbf{A}}}, {\textbf{M}}_k({\textbf{y}})\rangle =0\). Since \({\textbf{M}}_k({\textbf{y}})={\textbf{P}}_k^{-1}{\textbf{X}}{\textbf{P}}_k^{-1}\),

$$\begin{aligned} 0=\langle \tilde{{\textbf{A}}}, {\textbf{P}}_k^{-1}{\textbf{X}}{\textbf{P}}_k^{-1}\rangle =\langle {\textbf{P}}_k^{-1}\tilde{{\textbf{A}}}{\textbf{P}}_k^{-1}, {\textbf{X}}\rangle =\langle {\textbf{A}}, {\textbf{X}}\rangle , \end{aligned}$$

where \({\textbf{A}}:={\textbf{P}}_k^{-1}\tilde{{\textbf{A}}}{\textbf{P}}_k^{-1}\), yielding the statement. Thus, we obtain the constraints \(\langle {\textbf{A}}_j,{\textbf{X}}\rangle =0\), \(j\in [m-1]\).

The final constraint \(y_0=1\) can be rewritten as \(\langle {\textbf{A}}_m, {\textbf{X}}\rangle =1\) with \({\textbf{A}}_m\in {\mathcal {S}}_k\) having zero entries except the top left one \([{\textbf{A}}_{m}]_{0,0}=1\). Thus, we select \({\textbf{b}}\) such that all entries of \({\textbf{b}}\) are zeros except \(b_m=1\).
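In code, the data for this final constraint take a particularly simple form. The sketch below uses illustrative sizes `s` (the side of the matrices in \({\mathcal {S}}_k\)) and `m`, which are not taken from the paper:

```python
import numpy as np

s, m = 10, 7                # illustrative sizes, not from the paper
A_m = np.zeros((s, s))
A_m[0, 0] = 1.0             # <A_m, X> = X_{0,0} encodes the constraint y_0 = 1
b = np.zeros(m)
b[-1] = 1.0                 # every right-hand side is zero except b_m = 1
```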

The number \(m\) (or \(m_k\), to make the dependence on the relaxation order \(k\) explicit) of equality constraints \(\langle {\textbf{A}}_j,{\textbf{X}}\rangle =b_j\) is:

(60)

The function \(-L_{{\textbf{y}}}(f)=-\sum _\gamma f_\gamma y_{\gamma }\) is equal to \(\langle {\textbf{C}}, {\textbf{X}}\rangle \) with \({\textbf{C}}:={\textbf{P}}_k^{-1}\tilde{{\textbf{C}}}{\textbf{P}}_k^{-1}\), where \(\tilde{{\textbf{C}}}=({\tilde{C}}_{\mu ,\nu })_{\mu ,\nu \in {{\mathbb {N}}}^n_k}\) is defined by:

$$\begin{aligned} {\tilde{C}}_{\mu ,\nu }={\left\{ \begin{array}{ll} -f_{\gamma }&{}\text { if }\mu =\nu ,\,\mu +\nu =\gamma ,\\ -\frac{1}{2} f_{\gamma }&{}\text { if }\mu \ne \nu ,\,\mu +\nu =\gamma ,\\ &{}\qquad \text { and } (\mu ,\nu )\le ({{\bar{\mu }}},{{\bar{\nu }}}),\,\forall {{\bar{\mu }}},{{\bar{\nu }}}\in {{\mathbb {N}}}^n_k\, \text { such that}\,{{\bar{\mu }}}+{{\bar{\nu }}}=\gamma ,\\ 0&{}\text { otherwise.} \end{array}\right. } \end{aligned}$$
(61)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Mai, N.H.A., Lasserre, JB. & Magron, V. A hierarchy of spectral relaxations for polynomial optimization. Math. Prog. Comp. 15, 651–701 (2023). https://doi.org/10.1007/s12532-023-00243-7
