Abstract
We study the problem of approximating the cone of positive semidefinite (PSD) matrices with a cone that can be described by smaller-sized PSD constraints. Specifically, we ask: how closely can we approximate the set of unit-trace \(n \times n\) PSD matrices, denoted by D, using at most N constraints, each of size \(k \times k\)? In this paper, we prove lower bounds on N by considering two constructions of an approximating set. First, we consider the unit-trace \(n \times n\) symmetric matrices that are PSD when restricted to a fixed set of k-dimensional subspaces in \({\mathbb {R}}^n\). We prove that if this set is a good approximation of D, then the number of subspaces must be exponentially large in n for any \(k = o(n)\). Second, we show that any set S that approximates D within a constant approximation ratio must have superpolynomial \({\varvec{S}}_+^k\)-extension complexity. More precisely, if S is a constant-factor approximation of D, then S must have \({\varvec{S}}_+^k\)-extension complexity at least \(\exp ( C \cdot \min \{ \sqrt{n}, n/k \})\), where C is an absolute constant. In addition, we show that any set S containing D whose Gaussian width is at most a constant times larger than the Gaussian width of D must have \({\varvec{S}}_+^k\)-extension complexity at least \(\exp ( C \cdot \min \{ n^{1/3}, \sqrt{n/k} \})\). These results imply that the cone of \(n \times n\) PSD matrices cannot be approximated by a polynomial number of \(k \times k\) PSD constraints for any \(k = o(n / \log ^2 n)\). They generalize the recent work of Fawzi (Math Oper Res 46(4):1479–1489, 2021) on the hardness of polyhedral approximations of \({\varvec{S}}_+^n\), which corresponds to the special case \(k=1\).
Notes
The smallest integer m such that \(\text {COR}(n)\) admits a \({\varvec{S}}_+^m\)-lift; see Sect. 2.2. This is related to the \({\varvec{S}}_+^k\)-extension complexity, but they are different notions of complexity.
Note that \(f_k\) is square-free (because \(f_k\) is multilinear), and thus, \(\nabla ^2 f_k = 0\).
More precisely, \(A_{ij} = \frac{1}{2} {\mathbb {E}}_{X \sim \mu (H_n)} [ X_i X_j f(X) ]\) for \(i, j \in [n]\) such that \(i \ne j\).
That is, \(Q_{\chi ^2}(s) := \inf \{ x \in {\mathbb {R}}: F_{\chi ^2}(x) \ge s \}\) for \(0 < s \le 1\), where \(F_{\chi ^2}\) denotes the cumulative distribution function of the \(\chi ^2\)-distribution with one degree of freedom.
In the language of matrix operations, given a slack matrix s(x, y) whose rows are indexed by x and columns are indexed by y, we left-multiply s(x, y) by the degree-2 projection matrix and then take the (scaled) trace of the resulting matrix.
See (8) for the definition of p-norm in Step 3 of the proof. Note that \(\Vert \text {proj}_2 f(x)\Vert _2^2\) for \(f: H_n \rightarrow {\mathbb {R}}\) can be interpreted as the variance of the random variable \(\text {proj}_2 f(X)\) where \(X \sim \mu (H_n)\).
Notice that we are computing the second and the fourth moments of the sum of the entries of a random vector uniformly distributed over the n-dimensional hypercube. Thus, we already expect to obtain by the central limit theorem that \(\lim _{n \rightarrow \infty } \frac{1}{n} {\mathbb {E}}_{x \sim \mu }\big [ (x^T y)^2 \big ] = 1\) and \(\lim _{n\rightarrow \infty }\frac{1}{n^2} {\mathbb {E}}_{x \sim \mu }\big [ (x^T y)^4 \big ] = 3\), which correspond to the second and the fourth moments of the standard Gaussian distribution.
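These limits can be verified exactly by enumeration for small n (a quick sanity check, not part of the original argument). With y the all-ones vector, \(x^T y = \sum_i x_i\), and the exact moments are \({\mathbb {E}}[(\sum_i x_i)^2] = n\) and \({\mathbb {E}}[(\sum_i x_i)^4] = 3n^2 - 2n\):

```python
from itertools import product

# Exhaustively compute the normalized second and fourth moments of
# s = sum_i x_i for x uniform on the hypercube H_n = {-1, +1}^n.
# Exact values: E[s^2]/n = 1 and E[s^4]/n^2 = 3 - 2/n -> 3.
def normalized_moments(n):
    pts = list(product([-1, 1], repeat=n))
    m2 = sum(sum(x) ** 2 for x in pts) / len(pts)
    m4 = sum(sum(x) ** 4 for x in pts) / len(pts)
    return m2 / n, m4 / n ** 2

print(normalized_moments(10))  # (1.0, 2.8), i.e. (1, 3 - 2/10)
```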
See [21, Lemma 4.4.1] for example.
The lemma in the reference is stated with an unspecified constant C, but one can verify the inequality stated here by carefully following the proof of [21, Lemma 6.2.3].
References
Ahmadi, A.A., Dash, S., Hall, G.: Optimization over structured subsets of positive semidefinite matrices via column generation. Discrete Optim. 24, 129–151 (2017)
Ahmadi, A.A., Hall, G.: Sum of squares basis pursuit with linear and second order cone programming. Algebraic Geom. Methods Discrete Math. 685, 27–53 (2017)
Ahmadi, A.A., Majumdar, A.: DSOS and SDSOS optimization: more tractable alternatives to sum of squares and semidefinite optimization. SIAM J. Appl. Algebra Geom. 3(2), 193–230 (2019)
Aubrun, G., Szarek, S.: Dvoretzky’s theorem and the complexity of entanglement detection. Discrete Analysis 2017:1 (2017)
Aubrun, G., Szarek, S.J.: Alice and Bob Meet Banach, vol. 223. American Mathematical Society, Providence (2017)
Beckner, W.: Inequalities in Fourier analysis. Ann. Math. 102, 159–182 (1975)
Blekherman, G., Dey, S.S., Molinaro, M., Sun, S.: Sparse PSD approximation of the PSD cone. Math. Program. 191, 981–1004 (2022)
Boman, E.G., Chen, D., Parekh, O., Toledo, S.: On factor width and symmetric H-matrices. Linear Algebra Appl. 405, 239–248 (2005)
Bonami, A.: Étude des coefficients de Fourier des fonctions de \( L^{p} (G)\). In: Annales de l’Institut Fourier, vol. 20, pp. 335–402. (1970)
Boucheron, S., Lugosi, G., Massart, P.: Concentration Inequalities: A Nonasymptotic Theory of Independence. Oxford University Press, Oxford (2013)
Fawzi, H.: On polyhedral approximations of the positive semidefinite cone. Math. Oper. Res. 46(4), 1479–1489 (2021)
Fawzi, H., Gouveia, J., Parrilo, P.A., Saunderson, J., Thomas, R.R.: Lifting for simplicity: concise descriptions of convex sets. arXiv preprint arXiv:2002.09788 (2020)
Fawzi, H., Parrilo, P.A.: Exponential lower bounds on fixed-size psd rank and semidefinite extension complexity. arXiv preprint arXiv:1311.2571 (2013)
Gouveia, J., Parrilo, P.A., Thomas, R.R.: Lifts of convex sets and cone factorizations. Math. Oper. Res. 38(2), 248–264 (2013)
Laurent, B., Massart, P.: Adaptive estimation of a quadratic functional by model selection. Ann. Stat. 28, 1302–1338 (2000)
Lee, J.R., Raghavendra, P., Steurer, D.: Lower bounds on the size of semidefinite programming relaxations. In: Proceedings of the forty-seventh annual ACM symposium on theory of computing, pp. 567–576 (2015)
O’Donnell, R.: Analysis of Boolean Functions. Cambridge University Press, Cambridge (2014)
O’Rourke, S., Vu, V., Wang, K.: Eigenvectors of random matrices: a survey. J. Comb. Theory Ser. A 144, 361–442 (2016)
Klartag, B., Regev, O.: Quantum one-way communication can be exponentially stronger than classical communication. In: Proceedings of the forty-third annual ACM symposium on theory of computing, pp. 31–40 (2011)
Rockafellar, R.T.: Convex Analysis. Princeton University Press, Princeton (1970)
Vershynin, R.: High-Dimensional Probability: An Introduction with Applications in Data Science, vol. 47. Cambridge University Press, Cambridge (2018)
Yannakakis, M.: Expressing combinatorial optimization problems by linear programs. J. Comput. Syst. Sci. 43(3), 441–466 (1991)
Acknowledgements
This research was partially supported by NSF grant AF CCF-1565235.
Appendices
Proofs of some lemmas from Section 2
1.1 Proof of Lemma 6
Proof
Let \(f = f_0 + f_1 + f_2 + \dots + f_n\) be the Fourier expansion of f. Then for \(0 \le \rho \le 1\),
With \(\rho = \sqrt{p-1}\) for \(1 \le p \le 2\), we have \(\Vert T_{\rho }f \Vert _2 \le \Vert f \Vert _p\) by hypercontractivity. Then it follows that
because \(\Vert f \Vert _p = {\mathbb {E}}[ f^p ]^{\frac{1}{p}} \le \Lambda ^{\frac{p-1}{p}} {\mathbb {E}}[ f ]^{\frac{1}{p}} \le \Lambda ^{\frac{p-1}{p}} \le \Lambda ^{p-1}\). If \(\Lambda < e\), we choose \(p = 2\) to get \(\Vert \text {proj}_2 f \Vert _2 \le \Lambda \). Otherwise, we choose \(p = 1 + \frac{1}{\log \Lambda }\) to obtain \(\Vert \text {proj}_2 f \Vert _2 \le e \log ( \Lambda )\). \(\square \)
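The two choices of p can be checked numerically (a sanity check, not part of the proof). Note that the bound of the form \(\Lambda^{p-1}/(p-1)\) used below is our reconstruction from the two choices made in the text (the displayed inequalities are omitted in this version):

```python
import math

# Assumed form of the bound from the proof: ||proj_2 f||_2 <= Lambda^(p-1)/(p-1)
# for 1 < p <= 2 (reconstructed; the original displays are elided).
# Check that p = 1 + 1/log(Lambda) attains the stated value e*log(Lambda),
# that p = 2 gives Lambda, and that no p in (1, 2] does better.
def bound(p, lam):
    return lam ** (p - 1) / (p - 1)

lam = 100.0
p_star = 1 + 1 / math.log(lam)
stated = math.e * math.log(lam)
assert abs(bound(p_star, lam) - stated) < 1e-9
assert abs(bound(2.0, lam) - lam) < 1e-9

grid_min = min(bound(1 + k / 10**5, lam) for k in range(1, 10**5 + 1))
assert stated - 1e-6 <= grid_min <= stated + 1e-3  # p_star is optimal
```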
1.2 Proof of Lemma 8
Proof
We consider a Gaussian process \((X_v)_{v \in {\mathbb {S}}^{n-1}}\) defined over \( {\mathbb {S}}^{n-1}\) such that \(X_v = v^T G v + \gamma \) with G being standard Gaussian in \({\varvec{S}}^n\) and \(\gamma \sim N(0,1)\) independent of G. It is easy to verify that \({\mathbb {E}}\big [ \sup _{v \in {\mathbb {S}}^{n-1}} \langle v, Gv \rangle \big ] = {\mathbb {E}}_{G, \gamma }\big [ \sup _{v \in {\mathbb {S}}^{n-1}} X_v \big ]\). Now we introduce an auxiliary Gaussian process \((Y_v)_{v \in {\mathbb {S}}^{n-1}}\) such that \(Y_v = g^Tv\) with \(g \sim N(0, 2 I_n)\). Observe that for all \(u, v \in {\mathbb {S}}^{n-1}\), (1) \({\mathbb {E}}X_v = {\mathbb {E}}Y_v = 0\); and (2) \({\mathbb {E}}(X_u - X_v)^2 \le {\mathbb {E}}(Y_u - Y_v)^2\) because \({\mathbb {E}}X_v^2 = {\mathbb {E}}Y_v^2 = 2\) and \({\mathbb {E}}X_u X_v - {\mathbb {E}}Y_u Y_v = (1 - u^T v )^2 \ge 0\). Thus, we can apply Sudakov-Fernique inequality (Lemma 7) to obtain \({\mathbb {E}}_{G, \gamma } \big [ \sup _{v \in {\mathbb {S}}^{n-1}} X_v \big ] \le {\mathbb {E}}_{g \sim N(0, 2I_n)} \big [ \sup _{v \in {\mathbb {S}}^{n-1}} Y_v \big ] = {\mathbb {E}}_{g \sim N(0, 2I_n)} \Vert g \Vert _2 \le \big ( {\mathbb {E}}_{g \sim N(0, 2I_n)} \Vert g \Vert _2^2 \big )^{1/2} = \sqrt{2n}\). \(\square \)
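The conclusion \({\mathbb {E}}\big [ \sup _{v \in {\mathbb {S}}^{n-1}} \langle v, Gv \rangle \big ] \le \sqrt{2n}\) can be sanity-checked by Monte Carlo (an illustration, not part of the proof). We realize the standard Gaussian on \({\varvec{S}}^n\) as \((B + B^T)/2\) for B with i.i.d. N(0, 1) entries, which has diagonal entries N(0, 1) and off-diagonal entries N(0, 1/2); this realization is our assumption, consistent with \({\mathbb {E}} X_v^2 = 2\) in the proof:

```python
import numpy as np

# Monte Carlo check of Lemma 8: E[ sup_{v in S^{n-1}} v^T G v ]
# = E[ lambda_max(G) ] <= sqrt(2n), with G = (B + B^T)/2 for B i.i.d. N(0,1)
# (so diag(G) ~ N(0,1) and off-diagonal entries ~ N(0,1/2)).
rng = np.random.default_rng(0)
n, trials = 100, 50
vals = []
for _ in range(trials):
    B = rng.standard_normal((n, n))
    G = (B + B.T) / 2
    vals.append(np.linalg.eigvalsh(G)[-1])  # eigenvalues in ascending order
est = float(np.mean(vals))
print(est <= np.sqrt(2 * n))  # prints True: estimate sits below sqrt(200) ~ 14.14
```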
1.3 Proof of Lemma 10
Proof
Let \(h_K(u) := \max _{x \in K} \left\langle u,~ x \right\rangle = \Vert u \Vert _{K^{\circ }}\) denote the support function of K. The function \(h_K\) is L-Lipschitz with \(L = \sup _{x \in K} \Vert x \Vert _2\), the maximal Euclidean norm over K, because for any \(u, v \in {\mathbb {R}}^d\),
Moreover, we can show that \(\sup _{x \in K} \Vert x \Vert _2 \le \sqrt{2\pi } w_G(K)\). To see this, let B(0, R) denote the Euclidean ball centered at 0 with radius R. It follows from [21, Proposition 7.5.2-(e)] that \(\sup _{x, y \in K} \Vert x - y \Vert _2 \le \sqrt{2\pi } w_G(K)\). Since \(0 \in K\), this implies \(K \subseteq B(0, \sqrt{2\pi }w_G(K) )\). Applying Lemma 9 with \(f=h_K\) and \(\tau = \alpha w_G(K)\) completes the proof. \(\square \)
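The Lipschitz property of the support function is easy to verify numerically (an illustration; the polytope K below is our own example):

```python
import numpy as np

# Check that the support function h_K(u) = max_{x in K} <u, x> of a random
# polytope K = conv{x_1, ..., x_m} is L-Lipschitz with L = max_i ||x_i||_2.
rng = np.random.default_rng(4)
pts = rng.standard_normal((30, 5))          # vertices of K in R^5
h = lambda u: (pts @ u).max()               # support function of K
L = np.linalg.norm(pts, axis=1).max()       # sup_{x in K} ||x||_2
for _ in range(1000):
    u, v = rng.standard_normal(5), rng.standard_normal(5)
    assert abs(h(u) - h(v)) <= L * np.linalg.norm(u - v) + 1e-9
```

The inequality follows pointwise: if \(x^*\) attains \(h_K(u)\), then \(h_K(u) - h_K(v) \le \langle u - v, x^* \rangle \le L \Vert u - v \Vert_2\).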
1.4 Proof sketch of Lemma 11
An Auxiliary Lemma. The moment generating function (MGF) of a decoupled Gaussian chaos is bounded as in Lemma 15.
Lemma 15
(MGF of Gaussian chaos) Let \(X, X' \sim N(0, I_n)\) be independent Gaussian random vectors and let \(A \in {\mathbb {R}}^{n \times n}\). Then
for all \(\lambda \) satisfying \(|\lambda | \le \frac{1}{\sqrt{2}\Vert A\Vert _{op}}\).
Proof
Let \(A = U \Sigma V^T\) be a singular value decomposition of A, and let \(g = U^T X\), \(g' = V^T X'\). Observe that \(g, g'\) are independent standard Gaussian random vectors in \({\mathbb {R}}^n\), and that \(X^T A X' = \sum _i s_i g_i g_i'\) where \(\{ s_i \}_{i=1}^n\) are the singular values of A (i.e., the diagonal elements of the nonnegative diagonal matrix \(\Sigma \)). As this is a sum of n independent random variables, we have
Now, for each \(i \in [n]\), we use the MGF formulas for the Gaussian and the chi-squared random variables to get
Since \((1-t)^{-1/2} \le e^t\) for all t satisfying \(0 \le t \le 0.7968\), we have
for all \(\lambda \) such that \(\lambda ^2 \Vert A\Vert _{op}^2 \le 1/2 < 0.7968\). \(\square \)
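The elementary inequality \((1-t)^{-1/2} \le e^t\) on \([0, 0.7968]\) used in the last step can be confirmed numerically (a sanity check only):

```python
import math

# Check that (1 - t)^(-1/2) <= e^t holds on [0, 0.7968] and fails just beyond
# (the crossover point is approximately t = 0.79681).
def ok(t):
    return (1 - t) ** -0.5 <= math.exp(t)

assert all(ok(k * 0.7968 / 10**5) for k in range(10**5 + 1))
assert not ok(0.80)  # the inequality flips a little past 0.7968
```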
Proof Sketch of Lemma 11
Proof (Sketch)
Let \(X'\) be an independent copy of X. Then \({\mathbb {E}}\exp ( \lambda X^T A X ) \le {\mathbb {E}}\exp ( 4 \lambda X^T A X' )\) for all \(\lambda \in {\mathbb {R}}\) by the decoupling lemma [21, Theorem 6.1.1]. Next, let \(g, g' \sim N(0, I_n)\) be independent Gaussian random vectors. Then \({\mathbb {E}}\exp ( \lambda X^T A X' ) \le {\mathbb {E}}\exp ( \lambda v g^T A g' )\) for all \(\lambda \in {\mathbb {R}}\) by a comparison lemma [21, Lemma 6.2.3]. Lastly, we apply Lemma 15, the MGF upper bound for Gaussian chaos, to conclude the proof. \(\square \)
1.5 Proof of Lemma 12
Proof
Let A be a symmetric \(n \times n\) matrix such that \(A_{ii} = 0, ~\forall i\) and \(A_{ij} = \frac{1}{2} {\mathbb {E}}_{Y \sim \mu (H_n)}[ Y_i Y_j f(Y) ]\) for \(i \ne j\). Then we observe that for all \(X \in H_n\),
Note that \(X_i\) is sub-Gaussian with sub-Gaussian parameter 1 for all i because \({\mathbb {E}}[ e^{\lambda X_i} ] = \frac{1}{2}( e^{\lambda } + e^{-\lambda } ) \le e^{\lambda ^2/2}\). To conclude the proof, it suffices to observe that \(\Vert A \Vert _F^2 = \sum _{\begin{array}{c} {i = 1} \\ {j \ne i} \end{array}}^n \big ( \frac{1}{2} {\mathbb {E}}_{X \sim \mu (H_n)}[ X_i X_j f(X) ] \big )^2 = \frac{1}{2} \Vert \text {proj}_2 f \Vert _2^2\) and \(\Vert A \Vert _{op} \le \Vert A \Vert _F\), and then apply Lemma 11. \(\square \)
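The identity \(\Vert A \Vert _F^2 = \frac{1}{2} \Vert \text {proj}_2 f \Vert _2^2\) can be verified by exhaustive enumeration for small n (a sanity check; the random f below is our own example):

```python
import itertools
import numpy as np

# Verify ||A||_F^2 = (1/2) ||proj_2 f||_2^2 by enumeration over the
# hypercube H_n = {-1, +1}^n for an arbitrary f: H_n -> R.
rng = np.random.default_rng(1)
n = 4
pts = np.array(list(itertools.product([-1, 1], repeat=n)))  # all 2^n points
fvals = rng.standard_normal(len(pts))                       # arbitrary f on H_n

# Degree-2 Fourier coefficients: fhat({i,j}) = E[X_i X_j f(X)]
fhat2 = {(i, j): np.mean(pts[:, i] * pts[:, j] * fvals)
         for i, j in itertools.combinations(range(n), 2)}
proj2_sq = sum(c ** 2 for c in fhat2.values())  # ||proj_2 f||_2^2

# A_ij = (1/2) E[X_i X_j f(X)] off the diagonal, A_ii = 0
A = np.zeros((n, n))
for (i, j), c in fhat2.items():
    A[i, j] = A[j, i] = c / 2
assert abs(np.sum(A ** 2) - proj2_sq / 2) < 1e-12
```

Indeed, \(\Vert A \Vert_F^2 = 2 \sum_{i<j} (\hat f(\{i,j\})/2)^2 = \frac12 \sum_{i<j} \hat f(\{i,j\})^2\).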
1.6 Proof of Lemma 13
Proof
For any \(\lambda \in (0, 1/c]\),
It remains to choose \(\lambda \) in the interval \((0, 1/c]\) to optimize the upper bound. If \(\sqrt{2 \log N / v } \le 1/c\), then we choose \(\lambda = \sqrt{ 2 \log N / v }\) to get \({\mathbb {E}}\big [ \max _{i \in [N] } X_i \big ] \le \sqrt{2 v \log N}\). On the other hand, if \(\sqrt{2 \log N / v } > 1/c\), then we choose \(\lambda = 1/c\) to get \({\mathbb {E}}\big [ \max _{i \in [N] } X_i \big ] \le 2c \log N \), since \(\sqrt{2 \log N / v} > 1/c\) implies \(v/(2c) < c \log N\). \(\square \)
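A Monte Carlo illustration of Lemma 13 (not part of the proof), assuming the standard sub-exponential MGF condition \({\mathbb {E}} e^{\lambda X_i} \le e^{\lambda^2 v/2}\) for \(|\lambda| \le 1/c\), which matches the form of the bound. Centered standard Laplace variables satisfy it with \(v = 4\), \(c = \sqrt{2}\) (our own instantiation), since \(1/(1-\lambda^2) \le e^{2\lambda^2}\) for \(|\lambda| \le 1/\sqrt{2}\):

```python
import numpy as np

# Monte Carlo illustration of Lemma 13 with i.i.d. centered Laplace
# variables: MGF 1/(1 - lam^2) <= e^{2 lam^2} for |lam| <= 1/sqrt(2),
# i.e. sub-exponential with v = 4 and c = sqrt(2) (our instantiation).
rng = np.random.default_rng(2)
N, trials = 1000, 200
emax = float(np.mean(rng.laplace(size=(trials, N)).max(axis=1)))
bound = max(np.sqrt(2 * 4 * np.log(N)), 2 * np.sqrt(2) * np.log(N))
print(emax <= bound)  # prints True: the bound ~19.5 covers E[max] ~ log N
```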
More on Example 3 (ball, needle, and pancake)
Let \(B_2^d := \{ x \in {\mathbb {R}}^d: \Vert x \Vert _2 \le 1 \}\) denote the d-dimensional unit \(\ell _2\)-ball, and let \(B = B_2^d\). Fix \(0< \delta < 1\), and let \(N = \mathrm {conv}\big \{ B_2^d(0, 1) \cup \{ \pm \frac{1}{\delta } e_1 \} \big \}\) be the ‘needle’ where \(e_1 = (1, 0, \dots , 0) \in {\mathbb {R}}^d\). Lastly, we define the ‘pancake’ \(P= \{ x \in B: -\delta \le x_1 \le \delta \}\) where \(x_1\) is the first coordinate of \(x \in {\mathbb {R}}^d\). Observe that N and P are the polars of each other, and B is the polar of itself.
First of all, \(w_G(B) = {\mathbb {E}}_g \Vert g \Vert _2 = \kappa _d\) and it is known that \(\sqrt{d - 1/2} \le \kappa _d \le \sqrt{d - d/(2d+1)}\), cf. the paragraph below Definition 2. Next, we can see that \(w_G(N) \ge \frac{1}{\delta }\sqrt{2/\pi }\) because \( \{ \pm \frac{1}{\delta } e_1 \} \subseteq N\) and thus, \(w_G(N) \ge w_G \big ( \{ \pm \frac{1}{\delta } e_1 \} \big ) = \frac{1}{\delta } {\mathbb {E}}_{g \sim {\mathcal {N}}(0,1)} |g| = \frac{1}{\delta }\sqrt{2/\pi }\). Lastly, observe that \(w_G(P) \ge \kappa _{d-1} \ge \sqrt{d - 3/2}\) because \(\{0\} \times B_2^{d-1}(0,1) \subseteq P\) and \(w_G(P) \ge w_G \big ( \{0\} \times B_2^{d-1}(0,1) \big ) = w_G \big ( B_2^{d-1}(0,1) \big ) = \kappa _{d-1}\).
It follows that B is an \(\epsilon \)-approximation of P in the average sense for \(\epsilon = \kappa _d / \kappa _{d-1} - 1 \le 3 / (2d - 3)\). Nevertheless, B is not an \(\epsilon '\)-approximation of P in the dual-average sense unless \(\epsilon ' \ge \frac{1}{\delta } \sqrt{2/\pi } / \kappa _d - 1\ge \frac{2}{ \delta \sqrt{ \pi (2d-1) } } - 1\), which can be made arbitrarily large by choosing \(\delta \) small. For example, if we choose \(\delta \le 1/\sqrt{\pi (2d-1)}\), then \(\epsilon ^*_{\text {dual-avg}}(P,B) \ge 1\) whereas \(\epsilon ^*_{\text {avg}}(P, B) \le 3/(2d-3)\) regardless of \(\delta \).
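The three Gaussian widths can be estimated by Monte Carlo (an illustration; the closed-form support functions used below are straightforward to derive for these three bodies and are our own computation):

```python
import numpy as np

# Monte Carlo estimates of the Gaussian widths of the ball B, needle N,
# and pancake P from Example 3, via their support functions:
#   h_B(g) = ||g||,  h_N(g) = max(||g||, |g_1|/delta),
#   h_P(g) = ||g|| if |g_1|/||g|| <= delta,
#            else |g_1|*delta + ||g_{2:}||*sqrt(1 - delta^2).
rng = np.random.default_rng(3)
d, delta, samples = 50, 0.05, 20000
g = rng.standard_normal((samples, d))
norms = np.linalg.norm(g, axis=1)
rest = np.linalg.norm(g[:, 1:], axis=1)

w_B = norms.mean()                                        # kappa_d
w_N = np.maximum(norms, np.abs(g[:, 0]) / delta).mean()
w_P = np.where(np.abs(g[:, 0]) / norms <= delta, norms,
               np.abs(g[:, 0]) * delta + rest * np.sqrt(1 - delta**2)).mean()

assert abs(w_B - np.sqrt(d - 0.5)) < 0.05        # ~ kappa_d
assert w_N >= (1 / delta) * np.sqrt(2 / np.pi)   # needle width blows up
assert w_P >= np.sqrt(d - 1.5) - 0.1             # ~ kappa_{d-1}
```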
Solving the cubic inequality \(z^3 + \alpha z \ge \beta \) with \(\beta > 0\)
Consider a cubic equation of the form \(z^3 + \alpha z - \beta = 0\), which is commonly referred to as a depressed cubic. Note that when \(\beta > 0\), this cubic equation always has a positive real root. The other two roots can be either negative real roots (when \(D \le 0\)), or a pair of complex conjugate roots (when \(D > 0\)), depending on the sign of its discriminant, \(D = (\alpha /3)^3 + (\beta /2)^2\).
Indeed, we can find the roots with a generic cubic formula, known as Cardano’s formula. Let \(i = \sqrt{-1}\) denote the imaginary unit, let \(\omega = \frac{-1 + \sqrt{3} i}{2}\) be a primitive 3rd root of unity, and
Case 1: \(D > 0\). When \(D > 0\), the cubic equation \(z^3 + \alpha z - \beta = 0\) with \(\beta > 0\) has only one real root, \(z^*= T_+ + T_-\), which turns out to be positive. Thus, the set of real solutions for the cubic inequality \(z^3 + \alpha z \ge \beta \) is \(\{ z \in {\mathbb {R}}: z \ge T_+ + T_- \}\).
Case 2: \(D \le 0\). There are three real roots for the cubic equation \(z^3 + \alpha z - \beta = 0\), which can be written as
One of these three real roots is positive, and the other two are negative.
Note that (42) now involves complex roots, and the choice of branches might affect the ordering of the roots \(z_1, z_2, z_3\); however, it will not change their values. To avoid any ambiguity in our description, we choose the principal branch so that \(\text {Arg}\left( \root m \of {z} \right) \in (-\frac{\pi }{m}, \frac{\pi }{m}]\) for any complex number z and any positive integer m.
Observe that \(T_+ = \root 3 \of { \beta /2 + \sqrt{|D|}i }\) and \(\text {Arg}\left( T_+ \right) \in [0, \pi /3)\). Similarly, we can see that \(\text {Arg}\left( T_- \right) \in (-\pi /3, 0]\). It follows that \(T_+ + T_-\) is a positive real number, and thus, the largest real root. Thus, the set of real solutions for the cubic inequality \(z^3 + \alpha z \ge \beta \) is \(\{ z \in {\mathbb {R}}: z \ge T_+ + T_- \}\).
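The case analysis above can be exercised numerically (an illustration; the function below is our own sketch of the recipe, using real cube roots when \(D > 0\) and the principal complex branch when \(D \le 0\)):

```python
import cmath
import math
import numpy as np

# Compute the positive real root z* = T_+ + T_- of z^3 + alpha*z - beta = 0
# (beta > 0) via Cardano's formula, following the case split above:
# real cube roots when D >= 0, principal complex cube roots when D < 0.
def cardano_positive_root(alpha, beta):
    D = (alpha / 3) ** 3 + (beta / 2) ** 2
    if D >= 0:
        cbrt = lambda x: math.copysign(abs(x) ** (1 / 3), x)  # real cube root
        return cbrt(beta / 2 + math.sqrt(D)) + cbrt(beta / 2 - math.sqrt(D))
    # D < 0: T_+ and T_- are complex conjugates; Python's principal powers
    # have Arg in (-pi/3, pi/3], so T_+ + T_- is the largest (positive) root.
    T_plus = (beta / 2 + cmath.sqrt(complex(D))) ** (1 / 3)
    T_minus = (beta / 2 - cmath.sqrt(complex(D))) ** (1 / 3)
    return (T_plus + T_minus).real

for alpha, beta in [(1.0, 2.0), (-3.0, 1.0)]:  # D > 0 and D < 0
    z = cardano_positive_root(alpha, beta)
    roots = np.roots([1, 0, alpha, -beta])
    real_roots = roots[np.abs(roots.imag) < 1e-8].real
    assert abs(z ** 3 + alpha * z - beta) < 1e-8   # z solves the cubic
    assert z > 0 and abs(z - real_roots.max()) < 1e-8  # and is the largest root
```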
Song, D., Parrilo, P.A. On approximations of the PSD cone by a polynomial number of smaller-sized PSD cones. Math. Program. 198, 733–785 (2023). https://doi.org/10.1007/s10107-022-01795-7