Sum-of-squares chordal decomposition of polynomial matrix inequalities

We prove decomposition theorems for sparse positive (semi)definite polynomial matrices that can be viewed as sparsity-exploiting versions of the Hilbert--Artin, Reznick, Putinar, and Putinar--Vasilescu Positivstellens\"atze. First, we establish that a polynomial matrix $P(x)$ with chordal sparsity is positive semidefinite for all $x\in \mathbb{R}^n$ if and only if there exists a sum-of-squares (SOS) polynomial $\sigma(x)$ such that $\sigma P$ is a sum of sparse SOS matrices. Second, we show that setting $\sigma(x)=(x_1^2 + \cdots + x_n^2)^\nu$ for some integer $\nu$ suffices if $P$ is homogeneous and positive definite globally. Third, we prove that if $P$ is positive definite on a compact semialgebraic set $\mathcal{K}=\{x:g_1(x)\geq 0,\ldots,g_m(x)\geq 0\}$ satisfying the Archimedean condition, then $P(x) = S_0(x) + g_1(x)S_1(x) + \cdots + g_m(x)S_m(x)$ for matrices $S_i(x)$ that are sums of sparse SOS matrices. Finally, if $\mathcal{K}$ is not compact or does not satisfy the Archimedean condition, we obtain a similar decomposition for $(x_1^2 + \ldots + x_n^2)^\nu P(x)$ with some integer $\nu\geq 0$ when $P$ and $g_1,\ldots,g_m$ are homogeneous of even degree. Using these results, we find sparse SOS representation theorems for polynomials that are quadratic and correlatively sparse in a subset of variables, and we construct new convergent hierarchies of sparsity-exploiting SOS reformulations for convex optimization problems with large and sparse polynomial matrix inequalities. Numerical examples demonstrate that these hierarchies can have a significantly lower computational complexity than traditional ones.


Introduction
Many control problems for systems of ordinary differential equations can be posed as convex optimization problems with matrix inequality constraints that must hold on a prescribed portion of the state space [1][2][3][4]. For differential equations with polynomial right-hand side, these problems often take the generic form
$$B^* := \min_{\lambda \in \mathbb{R}^\ell} \; b(\lambda) \quad \text{subject to} \quad P(x;\lambda) := P_0(x) + \lambda_1 P_1(x) + \cdots + \lambda_\ell P_\ell(x) \succeq 0 \quad \text{for all } x \in \mathcal{K}, \tag{1.1}$$
where $b : \mathbb{R}^\ell \to \mathbb{R}$ is a convex cost function, $P_0, \ldots, P_\ell$ are $m \times m$ symmetric polynomial matrices depending on the system state $x \in \mathbb{R}^n$, and
$$\mathcal{K} = \{x \in \mathbb{R}^n : g_1(x) \geq 0, \ldots, g_q(x) \geq 0\} \tag{1.2}$$
is a basic semialgebraic set defined by inequalities on fixed polynomials $g_1, \ldots, g_q$. There is no loss of generality in considering only inequality constraints because any equality $g(x) = 0$ can be replaced by the two inequalities $g(x) \geq 0$ and $-g(x) \geq 0$. Verifying polynomial matrix inequalities is generally an NP-hard problem [5], which makes (1.1) intractable. Nevertheless, feasible vectors $\lambda$ can be found via semidefinite programming if one imposes the stronger condition that
$$P(x;\lambda) = S_0(x) + g_1(x) S_1(x) + \cdots + g_q(x) S_q(x) \tag{1.3}$$
for some $m \times m$ sum-of-squares (SOS) polynomial matrices $S_0, \ldots, S_q$. A polynomial matrix $S(x)$ is SOS if $S(x) = H(x)^T H(x)$ for some polynomial matrix $H(x)$, and it is well known [6][7][8][9] that linear optimization problems with SOS matrix variables can be reformulated as semidefinite programs (SDPs). However, the size of these SDPs grows very rapidly with the size of $P$, its polynomial degree, and the number of independent variables $x$. Thus, even though in theory SDPs can be solved using algorithms with polynomial-time complexity [10][11][12][13], in practice reformulations of (1.1) based on (1.3) remain intractable because they require prohibitively large computational resources.
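For completeness, the standard Gram-matrix characterization underlying this SDP reformulation (the formulation below is the usual one from the SOS literature, see e.g. [6][7][8][9], not a verbatim quotation) reads as follows, where $v_d(x)$ denotes the vector of all $n$-variate monomials of degree at most $d$:

```latex
S(x) \text{ is SOS with } \deg S \le 2d
\quad\Longleftrightarrow\quad
S(x) = \bigl(v_d(x) \otimes I_m\bigr)^{T} Q \,\bigl(v_d(x) \otimes I_m\bigr)
\quad \text{for some } Q \succeq 0.
```

The Gram matrix $Q$ has side length $m\binom{n+d}{n}$, which makes explicit the rapid growth of the resulting SDPs with the matrix size $m$, the degree $2d$, and the number of variables $n$.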
This work introduces new sparsity-exploiting SOS decompositions that can be used to efficiently certify the nonnegativity of large but sparse polynomial matrices, where "sparse" means that many of their off-diagonal entries are identically zero. Specifically, let $P(x)$ be an $m \times m$ polynomial matrix and describe its sparsity using an undirected graph $\mathcal{G}$ with vertices $\mathcal{V} = \{1, \ldots, m\}$ and edges $\mathcal{E} \subseteq \mathcal{V} \times \mathcal{V}$ such that $P_{ij}(x) = P_{ji}(x) \equiv 0$ whenever $i \neq j$ and $(i,j) \notin \mathcal{E}$. Motivated by chordal decomposition techniques for semidefinite programming [14][15][16][17][18], we ask whether the computational complexity of (1.3) can be lowered by decomposing the matrices $S_0, \ldots, S_q$ into sums of sparse SOS matrices, each with nonzero entries only on the principal submatrix indexed by one of the maximal cliques of the sparsity graph $\mathcal{G}$ of $P$. We prove that this clique-based decomposition exists if $\mathcal{G}$ is a chordal graph (meaning that, for every cycle of length larger than three, there is at least one edge in $\mathcal{E}$ connecting nonconsecutive vertices of the cycle), $\mathcal{K}$ is a compact set satisfying the so-called Archimedean condition, and $P(x)$ is strictly positive definite on $\mathcal{K}$ (cf. Theorem 2.4). This result is a sparsity-exploiting version of Putinar's Positivstellensatz [19] for polynomial matrices. We also give a sparse-matrix version of the Putinar--Vasilescu Positivstellensatz [20], stating that $(x_1^2 + \cdots + x_n^2)^\nu P$ admits a clique-based SOS decomposition for some integer $\nu \geq 0$ if $P$ is homogeneous, has even degree, and is positive definite on a semialgebraic set $\mathcal{K}$ defined by homogeneous polynomials $g_1, \ldots, g_q$ of even degree (cf. Theorem 2.5). This result applies even if $\mathcal{K}$ is noncompact. For the particular case of global nonnegativity, $\mathcal{K} \equiv \mathbb{R}^n$, we immediately recover a sparse-matrix version of Reznick's Positivstellensatz [21] (cf. Theorem 2.3), and we further prove a version of the Hilbert--Artin theorem [22] in which the strict positivity of $P$ is weakened to positive semidefiniteness upon replacing the factor $(x_1^2 + \cdots + x_n^2)^\nu$ with a generic SOS polynomial (cf. Theorem 2.2). Table 1 summarizes our results and gives references to their counterparts for polynomials and for general (dense) polynomial matrices.
These chordal SOS decomposition theorems for polynomial matrices extend a classical chordal decomposition result for constant (i.e., independent of $x$) positive semidefinite (PSD) sparse matrices [25]. The latter allows for significant computational gains when applied to large-scale sparse SDPs [16,18], to the analysis and control of structured systems [26,27], and to optimal power flow problems for large grids [28,29]. Similarly, our decomposition results can be used to construct convergent hierarchies of sparsity-exploiting SOS reformulations of problem (1.1) (cf. Theorems 3.1 to 3.3), which produce a minimizing sequence of feasible vectors $\lambda$ and often have a significantly lower computational complexity than traditional approaches based on the "dense" weighted SOS representation (1.3). Finally, when the polynomial matrix $P$ in (1.1) is not only sparse, but also depends only on a small set of $n$-variate monomials, our chordal SOS decompositions can be combined with known methods to exploit term sparsity. These methods include facial reduction [30][31][32], symmetry reduction [6,33], the exploitation of so-called correlative sparsity in the couplings between the independent variables [34][35][36][37][38], and the recent TSSOS, chordal-TSSOS and CS-TSSOS approaches to polynomial optimization [39][40][41][42]. Even though all of these methods have been developed for polynomial inequalities, rather than polynomial matrix inequalities, they can be applied directly upon reformulating the matrix inequality $P(x;\lambda) \succeq 0$ on $\mathcal{K}$ as the polynomial inequality $p(x,y) = y^T P(x;\lambda)\, y \geq 0$ for all $x \in \mathcal{K}$ and all $y \in \mathbb{R}^m$ with $\|y\|_\infty \leq 1$. In particular, if $P$ is structurally sparse, then $p(x,y)$ is correlatively sparse with respect to $y$, and the techniques of [34-36, 38, 40, 43] can be used to check whether it is nonnegative for all $x$ and $y$ of interest.
This connection does not make our matrix decomposition theorems redundant: on the contrary, they reveal that correlatively sparse SOS decompositions for p(x, y) depend only quadratically on y (Corollaries 4.1, 4.2 and 4.3), which cannot be concluded from the available SOS decomposition theorems for scalar polynomials.
The rest of this work is structured as follows. Section 2 states our main chordal SOS decomposition results, while Section 3 explains how they can be used to formulate convergent hierarchies of sparsity-exploiting SOS reformulations of problem (1.1). Section 4 relates our decomposition results for polynomial matrices to the classical SOS techniques for correlatively sparse polynomials [34][35][36]. Computational examples are presented in Section 5. Our matrix decomposition results are proven in Section 6, and conclusions are offered in Section 7. Appendices contain details of calculations and proofs of auxiliary results.

Chordal decomposition of polynomial matrices
The main contributions of this work are chordal decomposition theorems for n-variate PSD polynomial matrices P (x) whose sparsity is described by a chordal graph G. After reviewing the connection between sparse matrices and graphs, as well as the standard chordal decomposition theorem for constant matrices, we present decomposition theorems that apply globally (Section 2.2) and on basic semialgebraic sets (Section 2.3).

Sparse matrices and chordal graphs
A graph $\mathcal{G}$ consists of a set of vertices $\mathcal{V} = \{1, \ldots, m\}$ connected by a set of edges $\mathcal{E} \subseteq \mathcal{V} \times \mathcal{V}$. We call $\mathcal{G}$ undirected if edge $(j,i)$ is identified with edge $(i,j)$, so edges are unordered pairs, and connected if there exists a path of edges $(i, v_1), (v_1, v_2), \ldots, (v_k, j)$ between any two distinct vertices $i$ and $j$. We consider only undirected graphs, and focus mainly on the connected but not complete case. A vertex $i \in \mathcal{V}$ of an undirected graph is called simplicial if the subgraph induced by its neighbours is complete. A subset of vertices $\mathcal{C} \subseteq \mathcal{V}$ that are fully connected, meaning that $(i,j) \in \mathcal{E}$ for all pairs of distinct vertices $i, j \in \mathcal{C}$, is called a clique. A clique is maximal if it is not contained in any other clique. Finally, an edge connecting two nonconsecutive vertices of a cycle is known as a chord, and a graph is said to be chordal if all cycles of length $k \geq 4$ have at least one chord. Complete graphs, chain graphs, and trees are all chordal; other particular examples are illustrated in Figure 1. Any non-chordal graph can be made chordal by adding appropriate edges to it; this process is known as chordal extension [17].
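As an illustration of these definitions (this sketch is ours, not part of the original text), chordality can be tested efficiently via maximum cardinality search (MCS): a graph is chordal exactly when the reverse of an MCS ordering is a perfect elimination ordering. A minimal Python implementation, with vertices labelled $0, \ldots, n-1$:

```python
def is_chordal(n, edges):
    """Chordality test: maximum cardinality search + perfect elimination check."""
    adj = {v: set() for v in range(n)}
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    # MCS: repeatedly pick the unvisited vertex with most visited neighbours.
    weight = {v: 0 for v in range(n)}
    visited = set()
    order = []
    for _ in range(n):
        v = max((u for u in range(n) if u not in visited), key=lambda u: weight[u])
        order.append(v)
        visited.add(v)
        for u in adj[v]:
            if u not in visited:
                weight[u] += 1
    # The reversed MCS order is a perfect elimination ordering iff G is chordal:
    # for each vertex, its later neighbours minus the earliest one must all be
    # adjacent to that earliest later neighbour.
    order.reverse()
    pos = {v: k for k, v in enumerate(order)}
    for v in order:
        later = [u for u in adj[v] if pos[u] > pos[v]]
        if later:
            w = min(later, key=lambda u: pos[u])
            if not (set(later) - {w}) <= adj[w]:
                return False
    return True

print(is_chordal(4, [(0, 1), (1, 2), (2, 3), (3, 0)]))          # → False (4-cycle, no chord)
print(is_chordal(4, [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]))  # → True (chord added)
```

Consistent with the text, trees and chain graphs pass this test, while a chordless 4-cycle fails it.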
The sparsity pattern of any $m \times m$ symmetric matrix $P$ can be described using an undirected graph $\mathcal{G}$ with vertices $\mathcal{V} = \{1, \ldots, m\}$ and an edge set $\mathcal{E}$ such that $(i,j) \notin \mathcal{E}$ if and only if $i \neq j$ and $P_{ij} = 0$; see Figure 1 for two examples. We call $\mathcal{G}$ the sparsity graph of $P$. Dense principal submatrices of $P$ are indexed by cliques of $\mathcal{G}$, and maximal dense principal submatrices are indexed by maximal cliques.
For each maximal clique $\mathcal{C}_k$ of $\mathcal{G}$, define a matrix $E_{\mathcal{C}_k} \in \mathbb{R}^{|\mathcal{C}_k| \times m}$ entrywise as
$$(E_{\mathcal{C}_k})_{ij} = \begin{cases} 1 & \text{if } \mathcal{C}_k(i) = j, \\ 0 & \text{otherwise}, \end{cases}$$
where $|\mathcal{C}_k|$ is the cardinality of $\mathcal{C}_k$ and $\mathcal{C}_k(i)$ is the $i$-th vertex in $\mathcal{C}_k$. This definition ensures that the operation $E_{\mathcal{C}_k}^T X_k E_{\mathcal{C}_k}$ "inflates" a $|\mathcal{C}_k| \times |\mathcal{C}_k|$ matrix $X_k$ into a sparse $m \times m$ matrix with nonzero entries only in the submatrix indexed by $\mathcal{C}_k$; for example, if $m = 3$, $\mathcal{C}_k = \{1,3\}$, and $S = \begin{pmatrix} \alpha & \beta \\ \beta & \gamma \end{pmatrix}$, we have
$$E_{\mathcal{C}_k}^T S E_{\mathcal{C}_k} = \begin{pmatrix} \alpha & 0 & \beta \\ 0 & 0 & 0 \\ \beta & 0 & \gamma \end{pmatrix}.$$
The following classical result states that PSD matrices with a chordal sparsity graph admit a clique-based PSD decomposition.

Theorem 2.1 (Agler et al. [25]). A matrix $P$ whose sparsity graph is chordal and has maximal cliques $\mathcal{C}_1, \ldots, \mathcal{C}_t$ is positive semidefinite if and only if there exist positive semidefinite matrices $S_k$ of size $|\mathcal{C}_k| \times |\mathcal{C}_k|$ such that
$$P = \sum_{k=1}^t E_{\mathcal{C}_k}^T S_k E_{\mathcal{C}_k}. \tag{2.1}$$

Example 2.1. Any $3 \times 3$ symmetric matrix $P$ with $P_{13} = P_{31} = 0$ has the sparsity graph illustrated in Figure 2, which is chordal because it has no cycles. This graph has maximal cliques $\mathcal{C}_1 = \{1,2\}$ and $\mathcal{C}_2 = \{2,3\}$, so the decomposition guaranteed by Theorem 2.1 reads
$$P = E_{\mathcal{C}_1}^T S_1 E_{\mathcal{C}_1} + E_{\mathcal{C}_2}^T S_2 E_{\mathcal{C}_2}$$
for some $2 \times 2$ positive semidefinite matrices $S_1$ and $S_2$.

Our goal is to derive versions of Theorem 2.1 for sparse polynomial matrices that are positive semidefinite, either globally or on a basic semialgebraic set, where the matrices $S_k$ are polynomial and SOS. This allows us to build convergent hierarchies of sparsity-exploiting SOS reformulations for the optimization problem (1.1), which have a considerably lower computational complexity than standard (dense) ones. Throughout the paper, we assume without loss of generality that the sparsity graph $\mathcal{G}$ of $P(x)$ is connected and not complete. Complete sparsity graphs correspond to dense matrices, while disconnected ones correspond to matrices that can be permuted into block-diagonal form. Each irreducible diagonal block can then be analyzed individually and, by construction, has a connected (but possibly complete) sparsity graph.
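The inflation operation $E_{\mathcal{C}_k}^T X_k E_{\mathcal{C}_k}$ and the clique-based sum in Theorem 2.1 are easy to reproduce numerically. The following sketch (the PSD blocks are hypothetical, chosen for illustration) assembles a $3 \times 3$ tridiagonal matrix from two overlapping $2 \times 2$ blocks on the cliques $\{1,2\}$ and $\{2,3\}$ (0-indexed in the code):

```python
def inflate(S, clique, m):
    """Compute E_C^T S E_C: place the |C| x |C| block S into an m x m zero matrix."""
    M = [[0.0] * m for _ in range(m)]
    for a, i in enumerate(clique):
        for b, j in enumerate(clique):
            M[i][j] = S[a][b]
    return M

def mat_add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

# Sparsity graph 1-2-3 with cliques {1,2} and {2,3} (here {0,1} and {1,2}).
# Hypothetical PSD blocks: both have positive trace and determinant.
S1 = [[2.0, 1.0], [1.0, 1.0]]
S2 = [[1.0, 1.0], [1.0, 2.0]]
P = mat_add(inflate(S1, [0, 1], 3), inflate(S2, [1, 2], 3))
print(P)  # → [[2.0, 1.0, 0.0], [1.0, 2.0, 1.0], [0.0, 1.0, 2.0]]
```

The resulting $P$ is PSD by construction and has zeros exactly in the positions excluded by the sparsity graph; Theorem 2.1 states the converse for chordal patterns.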

Polynomial matrix decomposition on R n
Let the polynomial matrix $P(x)$ be positive semidefinite for all $x \in \mathbb{R}^n$ and have a chordal sparsity graph with maximal cliques $\mathcal{C}_1, \ldots, \mathcal{C}_t$. Applying Theorem 2.1 for each $x \in \mathbb{R}^n$ yields PSD matrices $S_1(x), \ldots, S_t(x)$ such that
$$P(x) = \sum_{k=1}^t E_{\mathcal{C}_k}^T S_k(x) E_{\mathcal{C}_k}. \tag{2.2}$$
Are these matrices always polynomial in $x$? Our first result gives a negative answer to this question for all matrix sizes $m \geq 3$, irrespective of the number $n$ of independent variables and of the sparsity graph of $P$.
Proposition 2.1. Let G be a connected and not complete chordal graph with m ≥ 3 vertices and maximal cliques C 1 , . . . , C t . Fix any positive integer n. There exists an n-variate m × m polynomial matrix P (x) with sparsity graph G that is strictly positive definite for all x ∈ R n , but cannot be written in the form (2.2) with positive semidefinite polynomial matrices S k (x).
The proof of this proposition, given in Section 6.1, relies on the following example.
Example 2.2. Consider a $3 \times 3$ univariate polynomial matrix $P(x)$, depending on a parameter $k$, with the tridiagonal sparsity pattern of Example 2.1; it is globally positive semidefinite and SOS for all $k \geq 0$, and strictly positive definite if $k > 0$. Let us search for a basic decomposition of the form (2.2): we need to find two $2 \times 2$ positive semidefinite polynomial matrices $S_1$ and $S_2$ such that $P(x) = E_{\mathcal{C}_1}^T S_1(x) E_{\mathcal{C}_1} + E_{\mathcal{C}_2}^T S_2(x) E_{\mathcal{C}_2}$ with $\mathcal{C}_1 = \{1,2\}$ and $\mathcal{C}_2 = \{2,3\}$. Equivalently, we need to find polynomials $a$, $b$, $c$, $d$, $e$, and $f$, forming the entries of $S_1$ and $S_2$, that satisfy the equality constraints (2.4) and make the two matrices on the right-hand side positive semidefinite. Fixing the entries determined uniquely by the equality, positive semidefiniteness requires the traces and determinants of the $2 \times 2$ nonzero blocks to be nonnegative, i.e., conditions (2.5a)-(2.5d) must hold.
If $c(x)$ is to be nonnegative, it must be quadratic; otherwise, (2.5b) cannot hold for all $x$. In particular, we must have $c(x) = \alpha + 2x + x^2$ for some scalar $\alpha$ to ensure that the coefficients of $x^4$ and $x^3$ in (2.5c) and (2.5d) vanish; otherwise at least one of these conditions fails for some $x \in \mathbb{R}$. Then, (2.5a) and (2.5b) become $x^2 + 2x + \alpha \geq 0$ and $x^2 - 2x - \alpha + k \geq 0$, which hold for all $x$ if and only if $1 \leq \alpha \leq k - 1$. A suitable $\alpha$ therefore exists when $k \geq 2$, while the decomposition (2.4) fails to exist if $0 \leq k < 2$, even though $P(x)$ is PSD for all such values of $k$ (and, in fact, positive definite if $k > 0$).
Clique-based decompositions similar to (2.2) with polynomial matrices $S_k(x)$ do exist, however, after multiplying $P(x)$ by a suitable SOS polynomial $\sigma(x)$. The next result generalizes the Hilbert--Artin theorem on the representation of nonnegative polynomials as sums of squares of rational functions [22]. Importantly, it establishes that each $S_k(x)$ is not just positive semidefinite, but SOS.

Theorem 2.2. Let $P(x)$ be an $m \times m$ positive semidefinite polynomial matrix whose sparsity graph is chordal and has maximal cliques $\mathcal{C}_1, \ldots, \mathcal{C}_t$. There exist an SOS polynomial $\sigma(x)$ and SOS matrices $S_k(x)$ of size $|\mathcal{C}_k| \times |\mathcal{C}_k|$ such that
$$\sigma(x) P(x) = \sum_{k=1}^t E_{\mathcal{C}_k}^T S_k(x) E_{\mathcal{C}_k}. \tag{2.6}$$

The proof, given in Section 6.2, extends a constructive proof of Theorem 2.1 for standard PSD matrices with chordal sparsity [44] using Schmüdgen's diagonalization procedure for polynomial matrices [45] and the Hilbert--Artin theorem [22].

Example 2.3. Consider once again the polynomial matrix $P(x)$ from Example 2.2. Inequalities (2.5a)-(2.5d) hold for the rational function $c(x) = (1+x)^2 x^2 (k + 1 + x^2)^{-1}$. We can therefore decompose $P$ as in (2.4) with rational entries whose common denominator is $k + 1 + x^2$; clearing this denominator yields identity (2.7), where, by construction, the polynomial matrices on the right-hand side are PSD for all $k \geq 0$. They are also SOS because the two concepts are equivalent for univariate polynomial matrices [46]. Rearranging (2.7) yields the decomposition of $P$ guaranteed by Theorem 2.2 with $\sigma(x) = k + 1 + x^2$.
If $P(x)$ and its highest-degree homogeneous part are strictly positive definite on $\mathbb{R}^n$ and $\mathbb{R}^n \setminus \{0\}$, respectively, one can fix either $\sigma(x) = \|x\|^{2\nu}$ or $\sigma(x) = (1 + \|x\|^2)^\nu$ for a sufficiently large integer $\nu \geq 0$, where $\|x\|^2 := x_1^2 + \cdots + x_n^2$. Precisely, we have the following versions of Reznick's Positivstellensatz [21] for sparse polynomial matrices, which follow from the more general SOS chordal decomposition results on semialgebraic sets stated in the next section (cf. Theorem 2.5 and Corollary 2.2).

Theorem 2.3. Let $P(x)$ be an $m \times m$ homogeneous polynomial matrix whose sparsity graph is chordal and has maximal cliques $\mathcal{C}_1, \ldots, \mathcal{C}_t$. If $P$ is strictly positive definite on $\mathbb{R}^n \setminus \{0\}$, there exist an integer $\nu \geq 0$ and homogeneous SOS matrices $S_k(x)$ of size $|\mathcal{C}_k| \times |\mathcal{C}_k|$ such that
$$\|x\|^{2\nu} P(x) = \sum_{k=1}^t E_{\mathcal{C}_k}^T S_k(x) E_{\mathcal{C}_k}. \tag{2.8}$$

Corollary 2.1. Let $P(x) = \sum_{|\alpha| \leq 2d} P_\alpha x^\alpha$ be an inhomogeneous $m \times m$ polynomial matrix of even degree $2d$ whose sparsity graph is chordal and has maximal cliques $\mathcal{C}_1, \ldots, \mathcal{C}_t$. If $P$ is strictly positive definite on $\mathbb{R}^n$ and its highest-degree homogeneous part $\sum_{|\alpha| = 2d} P_\alpha x^\alpha$ is strictly positive definite on $\mathbb{R}^n \setminus \{0\}$, there exist an integer $\nu \geq 0$ and SOS matrices $S_k(x)$ of size $|\mathcal{C}_k| \times |\mathcal{C}_k|$ such that
$$(1 + \|x\|^2)^\nu P(x) = \sum_{k=1}^t E_{\mathcal{C}_k}^T S_k(x) E_{\mathcal{C}_k}. \tag{2.9}$$

Example 2.4. Consider the polynomial $q(x)$ from [47], which is nonnegative but not SOS [48, Example 3.7]. The associated polynomial matrix $P(x)$ is strictly positive definite on $\mathbb{R}^2$ (see Appendix A), but is not SOS since $\varepsilon(1 + x_1^6 + x_2^6) + q(x)$ is not SOS unless $\varepsilon \gtrsim 0.01006$ [48, Example 6.25]. Nevertheless, since the highest-degree homogeneous part of $P$ is also positive definite on $\mathbb{R}^2 \setminus \{0\}$, Corollary 2.1 guarantees that $P$ can be decomposed as in (2.9) for a large enough exponent $\nu$. Here $\nu = 1$ suffices, and the resulting identity splits as in (2.11a) and (2.11b). To see that these matrices are SOS, observe that the first addend on the right-hand side of (2.11a) is SOS, that the second addend on the right-hand side of (2.11a) is SOS, and that the matrix on the right-hand side of (2.11b) is the sum of two univariate PSD (hence SOS) matrices obtained upon setting $k = 2/(3\sqrt{3})$.

Polynomial matrix decomposition on semialgebraic sets
We now turn our attention to SOS chordal decompositions on basic semialgebraic sets $\mathcal{K}$ defined as in (1.2). We say that $\mathcal{K}$ satisfies the Archimedean condition if there exist SOS polynomials $\sigma_0(x), \ldots, \sigma_q(x)$ and a scalar $r$ such that
$$r^2 - \|x\|^2 = \sigma_0(x) + g_1(x)\sigma_1(x) + \cdots + g_q(x)\sigma_q(x). \tag{2.12}$$
This condition implies that $\mathcal{K}$ is compact because $r^2 - \|x\|^2$ is nonnegative on $\mathcal{K}$. The converse is not always true [2], but can be ensured by adding the redundant inequality $r^2 - \|x\|^2 \geq 0$ to the definition (1.2) of $\mathcal{K}$ for a sufficiently large $r$. Theorem 2.4 below guarantees that if a polynomial matrix is strictly positive definite on a compact $\mathcal{K}$ satisfying the Archimedean condition, then it admits a chordal decomposition in terms of weighted sums of SOS matrices supported on the cliques of its sparsity graph, where the weights are exactly the polynomials $g_1, \ldots, g_q$ used in the semialgebraic definition (1.2) of $\mathcal{K}$. This result extends Putinar's Positivstellensatz [19] to sparse polynomial matrices, and can be considered a sparsity-exploiting version of a Positivstellensatz for general (dense) polynomial matrices (see [49, Theorem 2.19] and [9, Theorem 2]).
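As a simple illustration of (2.12) (this example is ours, not drawn from the cited references), the unit ball $\mathcal{K} = \{x : 1 - \|x\|^2 \geq 0\}$ satisfies the Archimedean condition with $r = 1$, $\sigma_0 \equiv 0$, and $\sigma_1 \equiv 1$:

```latex
r^2 - \|x\|^2 \;=\; 1 - \|x\|^2
\;=\; \underbrace{0}_{\sigma_0(x)} \;+\; \underbrace{1}_{\sigma_1(x)} \cdot \bigl(1 - \|x\|^2\bigr).
```

Both multipliers are trivially SOS, and compactness follows because $\|x\|^2 \le r^2 = 1$ on $\mathcal{K}$.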
Theorem 2.4. Let $\mathcal{K}$ be a compact semialgebraic set defined as in (1.2) that satisfies the Archimedean condition (2.12), and let $P(x)$ be a polynomial matrix whose sparsity graph is chordal and has maximal cliques $\mathcal{C}_1, \ldots, \mathcal{C}_t$. If $P$ is strictly positive definite on $\mathcal{K}$, there exist SOS matrices $S_{j,k}(x)$ of size $|\mathcal{C}_k| \times |\mathcal{C}_k|$ such that
$$P(x) = \sum_{k=1}^t E_{\mathcal{C}_k}^T S_{0,k}(x) E_{\mathcal{C}_k} + \sum_{j=1}^q g_j(x) \sum_{k=1}^t E_{\mathcal{C}_k}^T S_{j,k}(x) E_{\mathcal{C}_k}. \tag{2.13}$$

The proof, given in Section 6.3, exploits the Cholesky algorithm for matrices with chordal sparsity, the Weierstrass polynomial approximation theorem, and the aforementioned Positivstellensatz for general polynomial matrices [9, Theorem 2].

Figure 3: The semialgebraic set $\mathcal{K}$ considered in Example 2.5 (red shading, solid boundary), compared to the region of $\mathbb{R}^2$ where the matrix $P(x)$ in (2.14) is positive definite (grey shading, dashed boundary). On the boundary of this region, $P(x)$ is PSD but not definite.
Example 2.5. The bivariate polynomial matrix $P(x)$ in (2.14) is not positive semidefinite globally (its first diagonal element is negative when $x_1$ is sufficiently large), but it is strictly positive definite on the compact semialgebraic set $\mathcal{K}$ shown in Figure 3. This can be verified numerically by approximating the region of $\mathbb{R}^2$ where $P$ is positive definite (see Figure 3), and an analytical certificate will be given below.
If $\mathcal{K}$ is not compact or does not satisfy the Archimedean condition, Theorem 2.4 can be used to prove a similar decomposition result that applies to $(1 + \|x\|^2)^\nu P$ for a large enough exponent $\nu$, as long as $P$ has even degree and the behaviour of its leading term can be controlled. We start with the case in which $P$ is homogeneous and $\mathcal{K}$ is defined by homogeneous polynomial inequalities of even degree.
Theorem 2.5. Let $\mathcal{K}$ be a semialgebraic set defined as in (1.2) with homogeneous polynomials $g_1, \ldots, g_q$ of even degree, and such that $\mathcal{K} \setminus \{0\}$ is nonempty. Let $P(x)$ be a homogeneous polynomial matrix of even degree whose sparsity graph is chordal and has maximal cliques $\mathcal{C}_1, \ldots, \mathcal{C}_t$. If $P$ is strictly positive definite on $\mathcal{K} \setminus \{0\}$, there exist an integer $\nu \geq 0$ and homogeneous SOS matrices $S_{j,k}(x)$ of size $|\mathcal{C}_k| \times |\mathcal{C}_k|$ such that
$$\|x\|^{2\nu} P(x) = \sum_{k=1}^t E_{\mathcal{C}_k}^T S_{0,k}(x) E_{\mathcal{C}_k} + \sum_{j=1}^q g_j(x) \sum_{k=1}^t E_{\mathcal{C}_k}^T S_{j,k}(x) E_{\mathcal{C}_k}. \tag{2.16}$$

This result, proven in Section 6.4, recovers Theorem 4 in [24] when $P$ is dense. If $P$ is not homogeneous, we obtain the following version of the Putinar--Vasilescu Positivstellensatz [20] for sparse polynomial matrices, which is a sparsity-exploiting formulation of a recent result for general (dense) matrices [24, Corollary 3].
Corollary 2.2. Let $\mathcal{K}$ be a semialgebraic set defined as in (1.2), and let $P(x) = \sum_{|\alpha| \leq 2d_0} P_\alpha x^\alpha$ be an inhomogeneous polynomial matrix of even degree $2d_0$ whose sparsity graph is chordal and has maximal cliques $\mathcal{C}_1, \ldots, \mathcal{C}_t$. If $P$ is strictly positive definite on $\mathcal{K}$ and its highest-degree homogeneous part $\sum_{|\alpha| = 2d_0} P_\alpha x^\alpha$ is strictly positive definite on $\mathbb{R}^n \setminus \{0\}$, there exist an integer $\nu \geq 0$ and SOS matrices $S_{j,k}(x)$ of size $|\mathcal{C}_k| \times |\mathcal{C}_k|$ such that
$$(1 + \|x\|^2)^\nu P(x) = \sum_{k=1}^t E_{\mathcal{C}_k}^T S_{0,k}(x) E_{\mathcal{C}_k} + \sum_{j=1}^q g_j(x) \sum_{k=1}^t E_{\mathcal{C}_k}^T S_{j,k}(x) E_{\mathcal{C}_k}. \tag{2.17}$$

Proof. Let $y$ be a new scalar variable and define the homogenizations $Q(x,y) := y^{2d_0} P(x/y)$ and $h_j(x,y) := y^{2\lceil \deg(g_j)/2 \rceil} g_j(x/y)$, together with the set $\mathcal{K}' := \{(x,y) \in \mathbb{R}^{n+1} : h_1(x,y) \geq 0, \ldots, h_q(x,y) \geq 0\}$. The polynomial matrix $Q$ and the polynomials $h_j$ are homogeneous of even degree, and satisfy $Q(x,1) = P(x)$ and $h_j(x,1) = g_j(x)$. If $(x,y) \in \mathcal{K}'$ with $y \neq 0$, then $x/y \in \mathcal{K}$ and $Q(x,y)$ is positive definite because so is $P(x/y)$; if instead $y = 0$ and $x \neq 0$, then $Q(x,0)$ coincides with the highest-degree homogeneous part of $P$ evaluated at $x$, which is positive definite by assumption. Applying Theorem 2.5 to $Q$ and $\mathcal{K}'$, noting that $\|(x,y)\|^2 = y^2 + \|x\|^2$, setting $y = 1$, and recalling that $Q(x,1) = P(x)$ yields (2.17).
Remark 2.1. Setting $g_1 = \cdots = g_q \equiv 0$ in Theorem 2.5 and Corollary 2.2 immediately yields Theorem 2.3 and Corollary 2.1 for the global case $\mathcal{K} = \mathbb{R}^n$ (observe that a globally PSD homogeneous polynomial matrix must have even degree).

Convex optimization with sparse polynomial matrix inequalities
The decomposition results in Sections 2.2 and 2.3 can be used to construct hierarchies of sparsity-exploiting SOS reformulations for the optimization problem (1.1) that produce feasible vectors $\lambda$ and upper bounds on its optimal value $B^*$. Specifically, fix any two integers $\nu$ and $d$ satisfying $\nu \geq 0$ and $2d \geq \max\{\deg(P), \deg(g_1), \ldots, \deg(g_q)\} + 2\nu$, and consider the SOS optimization problem
$$B^*_{d,\nu} := \min_{\lambda} \; b(\lambda) \quad \text{subject to} \quad \sigma(x)^\nu P(x;\lambda) = \sum_{k=1}^t E_{\mathcal{C}_k}^T S_{0,k}(x) E_{\mathcal{C}_k} + \sum_{j=1}^q g_j(x) \sum_{k=1}^t E_{\mathcal{C}_k}^T S_{j,k}(x) E_{\mathcal{C}_k}, \quad S_{j,k} \in \Sigma^{|\mathcal{C}_k|}_{2d_j}, \tag{3.1}$$
where $\Sigma^m_{2\omega}$ denotes the cone of $n$-variate $m \times m$ SOS matrices of degree $2\omega$, $d_0 := d$, $d_j := d - \tfrac{1}{2}\deg(g_j)$ for each $j = 1, \ldots, q$, and either $\sigma(x) = \|x\|^2$ or $\sigma(x) = 1 + \|x\|^2$, depending on whether $P$ is homogeneous in $x$ or not. For each choice of $\nu$ and $d$, problem (3.1) can be recast as an SDP [6][7][8][9] and solved using a wide range of algorithms. The optimal $\lambda$ is clearly feasible for (1.1), so $B^*_{d,\nu} \geq B^*$. The nontrivial and far-reaching implication of the decomposition theorems presented in Sections 2.2 and 2.3 is that the SOS problem (3.1) is asymptotically exact as $d$ or $\nu$ is increased, provided that the original problem (1.1) satisfies suitable technical conditions and is strictly feasible. For instance, the sparsity-exploiting version of Putinar's Positivstellensatz in Theorem 2.4 leads to the following result.
Theorem 3.1. Let $\mathcal{K}$ be a compact basic semialgebraic set defined as in (1.2) that satisfies the Archimedean condition (2.12), and let $B^*$ and $B^*_{d,\nu}$ be the optimal values of problems (1.1) and (3.1), respectively. If (1.1) is strictly feasible, then $\lim_{d \to \infty} B^*_{d,0} = B^*$.

Proof. It suffices to show that, for any $\varepsilon > 0$, there exists $d$ such that $B^*_{d,0} \leq B^* + 2\varepsilon$. Let $\lambda^*$ be feasible for (1.1) with $b(\lambda^*) \leq B^* + \varepsilon$, and let $\lambda_0$ be strictly feasible, so that $P(x;\lambda_0)$ is strictly positive definite on $\mathcal{K}$. For every $\gamma \in (0,1)$, the matrix $P(x;\lambda)$ with $\lambda := (1-\gamma)\lambda^* + \gamma\lambda_0$ is strictly positive definite on $\mathcal{K}$, and Theorem 2.4 guarantees that $\lambda$ is feasible for (3.1) when $d$ is sufficiently large. Given such $d$, we can use the inequality $B^* \leq B^*_{d,0} \leq b(\lambda)$ and the convexity of the cost function $b$ to estimate
$$B^*_{d,0} \leq b(\lambda) \leq (1-\gamma)\, b(\lambda^*) + \gamma\, b(\lambda_0) \leq B^* + \varepsilon + \gamma\,[\, b(\lambda_0) - B^* - \varepsilon \,].$$
The term in square brackets may be assumed strictly positive (otherwise $\lambda_0$ itself certifies the claim), so we can fix $\gamma = \varepsilon/[b(\lambda_0) - B^* - \varepsilon]$ and conclude that $B^* \leq B^*_{d,0} \leq B^* + 2\varepsilon$, as required.
If $\mathcal{K}$ is not compact or does not satisfy the Archimedean condition, similar arguments that use Theorem 2.5 and Corollary 2.2 instead of Theorem 2.4 (omitted for brevity) give asymptotic convergence results, provided that $P$ satisfies additional conditions. For homogeneous problems of even degree, strict feasibility suffices.
Remark 3.1. Theorems 3.2 and 3.3 apply also when K ≡ R n , in which case they can be deduced from Theorem 2.3 and Corollary 2.1. Thus, when K ≡ R n the SOS multipliers S j,k (x) for j = 1, . . . , q and k = 1, . . . , t in (3.1) can be set to zero.

Relation to correlatively sparse SOS decompositions of polynomials
The SOS chordal decomposition theorems stated in Section 2 can be used to derive new existence results for sparsity-exploiting SOS decompositions of certain families of correlatively sparse polynomials [34][35][36]. A polynomial
$$p(x,y) = \sum_{\alpha,\beta} c_{\alpha,\beta}\, x^\alpha y^\beta$$
with independent variables $x = (x_1, \ldots, x_n)$ and $y = (y_1, \ldots, y_m)$ and coefficients $c_{\alpha,\beta} \in \mathbb{R}$ is correlatively sparse with respect to $y$ if the variables $y_1, \ldots, y_m$ are sparsely coupled, meaning that the $m \times m$ coupling matrix $\mathrm{CSP}_y(p)$, whose off-diagonal entry $(i,j)$ is nonzero if and only if some monomial of $p$ contains the product $y_i y_j$, is sparse. For example, the polynomial $p(x,y) = x_1^2 x_2 y_1^2 + y_1 y_2 - x_2 y_2 y_3 + y_4^4$ with $n = 2$ and $m = 4$ is correlatively sparse with respect to $y$, and
$$\mathrm{CSP}_y(p) = \begin{pmatrix} 1 & 1 & 0 & 0 \\ 1 & 1 & 1 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$
The sparsity graph of the coupling matrix $\mathrm{CSP}_y(p)$ is known as the correlative sparsity graph of $p$, and we say that $p(x,y)$ has chordal correlative sparsity with respect to $y$ if its correlative sparsity graph is chordal.
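The coupling matrix $\mathrm{CSP}_y(p)$ is straightforward to compute from a monomial representation of $p$. In the following Python sketch (the function name and input encoding are ours), each monomial with a nonzero coefficient is represented only by its tuple of $y$-exponents, and diagonal entries are set to 1 by convention:

```python
from itertools import combinations

def csp_matrix(monomials, m):
    """0/1 coupling matrix CSP_y(p) from a list of y-exponent tuples,
    one tuple per monomial of p with a nonzero coefficient."""
    # Diagonal entries are 1 by convention.
    C = [[1 if i == j else 0 for j in range(m)] for i in range(m)]
    for beta in monomials:
        support = [i for i, e in enumerate(beta) if e > 0]
        # Every pair of y-variables appearing in the same monomial is coupled.
        for i, j in combinations(support, 2):
            C[i][j] = C[j][i] = 1
    return C

# y-exponents of p(x,y) = x1^2 x2 y1^2 + y1 y2 - x2 y2 y3 + y4^4 (m = 4)
mons = [(2, 0, 0, 0), (1, 1, 0, 0), (0, 1, 1, 0), (0, 0, 0, 4)]
print(csp_matrix(mons, 4))
# → [[1, 1, 0, 0], [1, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 1]]
```

This reproduces the coupling matrix of the example above: $y_1 y_2$ and $y_2 y_3$ create the two off-diagonal couplings, while $y_4$ remains isolated.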
To exploit correlative sparsity when attempting to verify the nonnegativity of $p(x,y)$, one looks for an SOS decomposition of the form [34,35]
$$p(x,y) = \sigma_1(x, y_{\mathcal{C}_1}) + \cdots + \sigma_t(x, y_{\mathcal{C}_t}), \tag{4.2}$$
where $\mathcal{C}_1, \ldots, \mathcal{C}_t$ are the maximal cliques of the correlative sparsity graph and each $\sigma_k$ is an SOS polynomial that depends on $x$ and on the subset $y_{\mathcal{C}_k} = E_{\mathcal{C}_k} y$ of $y$ indexed by $\mathcal{C}_k$. For instance, with $m = 3$ and two cliques $\mathcal{C}_1 = \{1,2\}$ and $\mathcal{C}_2 = \{2,3\}$, we have $y_{\mathcal{C}_1} = (y_1, y_2)$ and $y_{\mathcal{C}_2} = (y_2, y_3)$.
In general, the existence of the sparse SOS representation (4.2) is only sufficient to conclude that p(x, y) is nonnegative: Example 3.8 in [50] gives a nonnegative (in fact, SOS) correlatively sparse polynomial that cannot be decomposed as in (4.2). Nevertheless, our SOS chordal decomposition theorems from Section 2 imply that sparsity-exploiting SOS decompositions do exist for polynomials p(x, y) that are quadratic and correlatively sparse with respect to y. This is because any polynomial p(x, y) that is correlatively sparse, quadratic, and (without loss of generality) homogeneous with respect to y can be expressed as p(x, y) = y T P (x)y for some polynomial matrix P (x) whose sparsity graph coincides with the correlative sparsity graph of p(x, y). Using this observation, we can "scalarize" Theorems 2.2, 2.3, 2.4 and 2.5 to obtain the following statements.
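For example (our illustration), with $m = 3$ and maximal cliques $\mathcal{C}_1 = \{1,2\}$ and $\mathcal{C}_2 = \{2,3\}$, any polynomial that is quadratic, homogeneous, and correlatively sparse in $y$ takes the form

```latex
p(x,y) = y^T P(x)\, y,
\qquad
P(x) = \begin{pmatrix}
p_{11}(x) & p_{12}(x) & 0 \\
p_{12}(x) & p_{22}(x) & p_{23}(x) \\
0 & p_{23}(x) & p_{33}(x)
\end{pmatrix},
```

so the tridiagonal pattern of $P$ coincides with the correlative sparsity pattern of $p$, and the clique-based decompositions of Section 2 apply directly.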
Corollary 4.1. Let $p(x,y)$ be quadratic and correlatively sparse in $y$, with a chordal correlative sparsity graph whose maximal cliques are $\mathcal{C}_1, \ldots, \mathcal{C}_t$. If $p(x,y) \geq 0$ for all $x \in \mathbb{R}^n$ and $y \in \mathbb{R}^m$, there exist an SOS polynomial $\sigma_0(x)$ and SOS polynomials $\sigma_k(x, y_{\mathcal{C}_k})$, quadratic in the second argument, such that
$$\sigma_0(x)\, p(x,y) = \sigma_1(x, y_{\mathcal{C}_1}) + \cdots + \sigma_t(x, y_{\mathcal{C}_t}).$$

Proof. Assume first that $p$ is homogeneous in $y$ and write $p(x,y) = y^T P(x)\, y$, where $P(x)$ is positive semidefinite globally and has the same sparsity pattern as the correlative sparsity matrix $\mathrm{CSP}_y(p)$. Theorem 2.2 guarantees that
$$\sigma_0(x) P(x) = \sum_{k=1}^t E_{\mathcal{C}_k}^T S_k(x) E_{\mathcal{C}_k}$$
for some SOS polynomial $\sigma_0(x)$ and SOS polynomial matrices $S_k(x)$. Setting $\sigma_k(x, y_{\mathcal{C}_k}) := y_{\mathcal{C}_k}^T S_k(x)\, y_{\mathcal{C}_k}$ gives the desired decomposition. When $p$ is not homogeneous, the result follows from a relatively straightforward homogenization argument described in Appendix B.
Corollary 4.2. Let $p(x,y)$ be quadratic and correlatively sparse in $y$, homogeneous in $x$, with a chordal correlative sparsity graph whose maximal cliques are $\mathcal{C}_1, \ldots, \mathcal{C}_t$. If $p(x,y) > 0$ for all $x \in \mathbb{R}^n \setminus \{0\}$ and $y \in \mathbb{R}^m \setminus \{0\}$ (and, when $p$ is not homogeneous in $y$, for all $y \in \mathbb{R}^m$), there exist an integer $\nu \geq 0$ and SOS polynomials $\sigma_k(x, y_{\mathcal{C}_k})$, quadratic in the second argument, such that
$$\|x\|^{2\nu}\, p(x,y) = \sigma_1(x, y_{\mathcal{C}_1}) + \cdots + \sigma_t(x, y_{\mathcal{C}_t}).$$

Proof. If $p$ is homogeneous in $y$, write $p(x,y) = y^T P(x)\, y$, observe that $P$ is strictly positive definite for all $x \in \mathbb{R}^n \setminus \{0\}$, apply Theorem 2.3 to $P$, and proceed as in the proof of Corollary 4.1. If $p$ is not homogeneous, use a homogenization argument similar to that in Appendix B.
Corollary 4.3. Let $p(x,y) = \sum_{|\alpha| \leq d, |\beta| \leq 2} c_{\alpha,\beta}\, x^\alpha y^\beta$ be quadratic and correlatively sparse in $y$. Further, let $\mathcal{K}$ be a semialgebraic set defined as in (1.2). Suppose that:
1) the correlative sparsity graph is chordal with maximal cliques $\mathcal{C}_1, \ldots, \mathcal{C}_t$;
2) $\sum_{|\alpha| \leq d, |\beta| = 2} c_{\alpha,\beta}\, x^\alpha y^\beta > 0$ for all $x \in \mathcal{K}$ and $y \in \mathbb{R}^m \setminus \{0\}$;
3) if $p$ is not homogeneous in $y$, then $p(x,y) > 0$ for all $x \in \mathcal{K}$ and $y \in \mathbb{R}^m$.
Then:
i) If $\mathcal{K}$ is compact and satisfies the Archimedean condition (2.12), there exist SOS polynomials $\sigma_{j,k}(x, y_{\mathcal{C}_k})$, quadratic in the second argument, such that
$$p(x,y) = \sum_{k=1}^t \sigma_{0,k}(x, y_{\mathcal{C}_k}) + \sum_{j=1}^q g_j(x) \sum_{k=1}^t \sigma_{j,k}(x, y_{\mathcal{C}_k}).$$
ii) If $p$ and the polynomials $g_1, \ldots, g_q$ defining $\mathcal{K}$ are homogeneous of even degree in $x$, the set $\mathcal{K} \setminus \{0\}$ is nonempty, and conditions 2) and 3) above hold for $x \in \mathcal{K} \setminus \{0\}$, there exist an integer $\nu \geq 0$ and SOS polynomials $\sigma_{j,k}(x, y_{\mathcal{C}_k})$, quadratic in the second argument, such that
$$\|x\|^{2\nu}\, p(x,y) = \sum_{k=1}^t \sigma_{0,k}(x, y_{\mathcal{C}_k}) + \sum_{j=1}^q g_j(x) \sum_{k=1}^t \sigma_{j,k}(x, y_{\mathcal{C}_k}).$$

Proof. If $p$ is homogeneous in $y$, write $p(x,y) = y^T P(x)\, y$ for a polynomial matrix $P(x)$ with a chordal sparsity graph. The strict positivity of $p$ for all nonzero $y$ implies that $P$ is strictly positive definite on $\mathcal{K}$. We can therefore apply Theorem 2.4 for statement i) and Theorem 2.5 for statement ii), and proceed as in the proof of Corollary 4.1 to conclude. If $p$ is not homogeneous in $y$, one can use a homogenization argument similar to that in Appendix B.
Corollary 4.3 specializes, but appears not to be a particular case of, an SOS representation result for correlatively sparse polynomials proved by Lasserre [35, Theorem 3.1]. Similarly, Corollaries 4.1 and 4.2 specialize recent results in [51]. In particular, although our statements apply only to polynomials $p(x,y)$ that are quadratic and correlatively sparse with respect to $y$, rather than to general ones, they provide explicit and tight degree bounds on the quadratic variables that cannot be deduced directly from the (more general) results in the references. For example, let $\mathcal{K}$ be as in (1.2), suppose that the Archimedean condition (2.12) holds, and suppose that $p(x,y)$ is quadratic, homogeneous, and correlatively sparse in $y$ with a chordal correlative sparsity graph. If $p$ is strictly positive for all $x \in \mathcal{K}$ and all $y \in \mathbb{R}^m \setminus \{0\}$, then in particular it is so on the basic semialgebraic set $\mathcal{K}' := \{(x,y) \in \mathcal{K} \times \mathbb{R}^m : \pm(1 - y_1^2) \geq 0, \ldots, \pm(1 - y_m^2) \geq 0\}$. This set also satisfies the Archimedean condition, so one can use Theorem 3.1 in [35] to represent $p$ as
$$p(x,y) = \sum_{k=1}^t \sigma_{0k}(x, y_{\mathcal{C}_k}) + \sum_{j=1}^q g_j(x) \sum_{k=1}^t \sigma_{jk}(x, y_{\mathcal{C}_k}) + \sum_{k=1}^m \rho_k(x,y)\,(1 - y_k^2) \tag{4.3}$$
for some SOS polynomials $\sigma_{jk}$ and some polynomials $\rho_k$, not necessarily SOS. Corollary 4.3 enables one to go further and conclude that one may take $\rho_k \equiv 0$ and $\sigma_{jk}(x, y_{\mathcal{C}_k}) = y_{\mathcal{C}_k}^T S_{jk}(x)\, y_{\mathcal{C}_k}$ for some SOS matrices $S_{jk}$. These restrictions could probably be deduced starting from (4.3), but our approach based on the SOS chordal decomposition of sparse polynomial matrices makes them almost immediate.

Numerical experiments
We now give numerical examples demonstrating the practical performance of the sparsity-exploiting SOS reformulations of the optimization problem (1.1) introduced in Section 3. All examples were implemented on a PC with a 2.2 GHz Intel Core i5 CPU and 12 GB of RAM, using the SDP solver MOSEK [52] and a customized version of the MATLAB optimization toolbox YALMIP [32,53]. The toolbox and all scripts used to generate the results presented below are available from https://github.com/aeroimperialoptimization/aeroimperial-yalmip and https://github.com/aeroimperial-optimization/soschordal-decomposition-pmi.

Approximation of global polynomial matrix inequalities
Our first numerical experiment illustrates the computational advantage of our sparsity-exploiting SOS reformulation for a problem with a global polynomial matrix inequality. Fix an integer $\omega \geq 1$ and consider a $3\omega \times 3\omega$ tridiagonal polynomial matrix $P_\omega = P_\omega(x, \lambda)$ parameterized by $\lambda \in \mathbb{R}^2$. First, we illustrate how Theorem 2.3 enables one to approximate the set
$$\mathcal{F}_\omega := \{\lambda \in \mathbb{R}^2 : P_\omega(x, \lambda) \succeq 0 \text{ for all } x\}$$
of vectors $\lambda$ for which $P_\omega$ is PSD globally. Define two hierarchies of subsets of $\mathcal{F}_\omega$, indexed by a nonnegative integer $\nu$, as in (5.1a) and (5.1b): the sets $\mathcal{D}_{\omega,\nu}$ are defined using the standard (dense) SOS constraint (1.3), while the sets $\mathcal{S}_{\omega,\nu}$ use the sparsity-exploiting nonnegativity certificate in Theorem 2.3. For each $\nu$ we have $\mathcal{S}_{\omega,\nu} \subseteq \mathcal{D}_{\omega,\nu} \subseteq \mathcal{F}_\omega$, and the inclusions are generally strict. This is confirmed by the (approximations to the) first few sets $\mathcal{D}_{2,\nu}$ and $\mathcal{S}_{2,\nu}$ shown in Figure 4, which were obtained by maximizing the linear cost function $\lambda_1 \cos\theta + \lambda_2 \sin\theta$ for 1000 equispaced values of $\theta$ in the interval $[0, \pi/2]$ and exploiting the $\lambda_1 \to -\lambda_1$ symmetry of $\mathcal{D}_{2,\nu}$ and $\mathcal{S}_{2,\nu}$. (Computations for $\mathcal{S}_{2,1}$ were ill-conditioned, so the results are not reported.) On the other hand, for any choice of $\omega$, Theorem 2.3 guarantees that any $\lambda$ for which $P_\omega$ is positive definite belongs to $\mathcal{S}_{\omega,\nu}$ for sufficiently large $\nu$. Thus, the sets $\mathcal{S}_{\omega,\nu}$ can approximate $\mathcal{F}_\omega$ arbitrarily accurately, in the sense that any compact subset of the interior of $\mathcal{F}_\omega$ is included in $\mathcal{S}_{\omega,\nu}$ for some sufficiently large integer $\nu$. The same is true for the sets $\mathcal{D}_{\omega,\nu}$ since $\mathcal{S}_{\omega,\nu} \subseteq \mathcal{D}_{\omega,\nu}$. Once again, this is confirmed by our numerical results for $\omega = 2$ in Figure 4, which suggest that $\mathcal{S}_{2,3} = \mathcal{D}_{2,2} = \mathcal{F}_2$. Next, to illustrate the computational advantages of our sparsity-exploiting SOS methods compared to the standard ones, we use both approaches to bound from above the optimal value of problem (5.2), obtained by replacing $\mathcal{F}_\omega$ with its inner approximations $\mathcal{D}_{\omega,\nu}$ and $\mathcal{S}_{\omega,\nu}$ from (5.1a) and (5.1b).
Optimizing over $D_{\omega,\nu}$ requires one SOS constraint on a $3\omega \times 3\omega$ polynomial matrix of degree $d = 2\nu + 4$, while optimizing over $S_{\omega,\nu}$ requires $3\omega - 1$ SOS constraints on $2 \times 2$ polynomial matrices of the same degree. Theorem 3.2 and the inclusion $S_{\omega,\nu} \subseteq D_{\omega,\nu}$ guarantee that the upper bounds $B_{d,\nu}$ on $B^*$ obtained with either SOS formulation converge to the latter as $\nu \to \infty$. (Here, as in Section 3, $B_{d,\nu}$ denotes the upper bound on $B^*$ obtained from SOS reformulations of (5.2) with SOS matrices of degree $d$ and exponent $\nu$.) Table 2 lists upper bounds $B_{d,\nu}$ computed with MOSEK using both SOS formulations, degree $d = 4 + 2\nu$, and different values of $\omega$ and $\nu$; the CPU times are also listed. Bounds for our sparse SOS formulation with $\nu = 1$ are not reported because MOSEK encountered severe numerical problems irrespective of the matrix size $\omega$. It is evident that our sparsity-exploiting SOS method scales significantly better than the standard approach as $\omega$ and $\nu$ increase. For $\omega = 10$, for example, the bound obtained with our sparsity-exploiting approach and $\nu = 3$ agrees to two decimal places with the bounds calculated using traditional methods with $\nu = 2$ and $3$, but the computation is three orders of magnitude faster. More generally, our sparsity-exploiting computations took less than 10 seconds for all tested values of $\omega$ and $\nu$, while traditional ones required more RAM than available for all but the smallest values. We expect similarly large efficiency gains for any optimization problem with sparse polynomial matrix inequalities whenever the largest maximal clique of the sparsity graph is much smaller than the matrix itself.
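The mechanism behind these gains can be sanity-checked numerically in the degree-zero special case (the constant-matrix chordal decomposition of Theorem 2.1): a PSD matrix with tridiagonal, hence chordal, sparsity splits into PSD terms supported on the $2 \times 2$ cliques $\{j, j+1\}$, read off from its no-fill-in Cholesky factor. The matrix below is a hypothetical stand-in, not the $P_\omega$ of this section.

```python
import numpy as np

def chordal_decompose_tridiagonal(M):
    """Split a tridiagonal PSD matrix M into a sum of PSD terms, each
    supported on a 2x2 clique {j, j+1} of the (chordal) sparsity graph.
    Cholesky factorization of a tridiagonal matrix has no fill-in, so
    L is lower bidiagonal and each outer product L[:, j] L[:, j]^T
    touches only rows/columns {j, j+1}."""
    L = np.linalg.cholesky(M)
    return [np.outer(L[:, j], L[:, j]) for j in range(M.shape[0])]

# Example: a 5x5 diagonally dominant (hence PD) tridiagonal matrix.
n = 5
M = 3.0 * np.eye(n) - np.diag(np.ones(n - 1), 1) - np.diag(np.ones(n - 1), -1)
terms = chordal_decompose_tridiagonal(M)

assert np.allclose(sum(terms), M)                # the terms sum back to M
for j, T in enumerate(terms):
    assert np.min(np.linalg.eigvalsh(T)) >= -1e-10   # each term is PSD ...
    support = {i for i in range(n) if abs(T[i, i]) > 1e-12}
    assert support <= {j, min(j + 1, n - 1)}     # ... and lives on clique {j, j+1}
```

The polynomial-matrix results of Theorems 2.2 and 2.3 play the role of this factorization argument when the entries of $M$ are polynomials rather than constants.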

Approximation of polynomial matrix inequalities on compact sets
As our second example, we consider the problem of constructing inner approximations for compact sets where a polynomial matrix is positive semidefinite. This problem arises, for instance, when approximating the robust stability region of linear dynamical systems [4], and was studied in [3] using standard SOS methods. Here, we show that our sparse-matrix version of Putinar's Positivstellensatz in Theorem 2.4 allows for significant reductions in computational complexity without sacrificing the rigorous convergence guarantees established in [3].
Let $K \subset \mathbb{R}^n$ be a compact semialgebraic set defined as in (1.2) that satisfies the Archimedean condition, and let $P(x)$ be an $m \times m$ symmetric polynomial matrix. We seek to construct a sequence $\{S_{2d}\}_{d \in \mathbb{N}}$ of subsets of the (compact) set $\mathcal{P} = \{x \in K : P(x) \succeq 0\}$ such that $S_{2d}$ converges to $\mathcal{P}$ in volume. Following [3], this can be done by letting $S_{2d} = \{x \in K : s_{2d}(x) \geq 0\}$ be the superlevel set of the degree-$2d$ polynomial $s_{2d}(x)$ that solves a convex optimization problem of the form (1.1), in which the optimization variable $\lambda$ is the vector of $\binom{n+2d}{n}$ coefficients of $s_{2d}$ (with respect to any chosen basis). The polynomial $s_{2d}$ is a pointwise lower bound on the minimum eigenvalue function of $P(x)$ on $K$. Using this observation, the compactness of $K$, the continuity of eigenvalues, and the Weierstrass polynomial approximation theorem, one can show that, as $d \to \infty$, $S_{2d}$ converges to $\mathcal{P}$ in volume, $s_{2d}$ converges pointwise almost everywhere to the minimum eigenvalue function, and the optimal value $B^*_{m,d}$ tends to the integral of the latter over $K$.

Theorem 1 in [3] shows that convergence is maintained if the intractable matrix inequality constraint is replaced with a weighted SOS representation of $P(x) - s_{2d}(x)I$ in the form (1.3), where the SOS matrices $S_k$ are chosen such that the degree of $S_0 + g_1 S_1 + \cdots + g_q S_q$ does not exceed $2d$. By Theorem 2.4, the same is true for the sparsity-exploiting reformulation (3.1) with $\nu = 0$, SOS matrices $S_{0,k}$ of degree $d_0 = d$, and SOS matrices $S_{j,k}$ of degree $d_j = d - \lceil \tfrac{1}{2}\deg(g_j) \rceil$.

To illustrate the computational advantages gained by exploiting sparsity, we consider a relatively simple (but still nontrivial) bivariate problem in which $K = \{x \in \mathbb{R}^2 : 1 - x_1^2 - x_2^2 \geq 0\}$ is the unit disk and $P$ is built from two $m \times m$ symmetric matrices $A$ and $B$ with chordal sparsity graphs, zero diagonal elements, and remaining entries drawn randomly from the uniform distribution on $(0, 1)$.
The sparsity graphs of $A$ and $B$ were generated randomly whilst ensuring that their maximal cliques contain no more than five vertices [54], and the corresponding structure of $P$ for $m = 15, 20, 25, 30, 35$ and $40$ is shown in Figure 5. The exact data matrices used in our calculations are available at https://github.com/aeroimperial-optimization/sos-chordal-decomposition-pmi. Figure 6 illustrates the inner approximations $S_{2d}$ of $\mathcal{P}$ computed using both the standard SOS constraint (1.3) and our sparsity-exploiting formulation (3.1). We also report the optimal values obtained by solving the corresponding SOS programs with MOSEK, alongside the limit $B^*_{m,\infty}$ obtained from numerical integration of the minimum eigenvalue function of $P$ on the unit disk $K$. Similar to what was observed in Section 5.1, for fixed $d$ the dense SOS constraints give better bounds than the sparse ones. As expected, however, the sparsity-exploiting formulation requires significantly less time for large $m$, and all problem instances were solved within 10 seconds. In addition, the approximating sets $S_{2d}$ in Figure 6 provided by the two SOS formulations are almost indistinguishable for every combination of $d$ and $m$. For a given matrix size $m$, therefore, our sparse SOS formulation enables the construction of much better approximations to $\mathcal{P}$ by considering large values of $d$ that are beyond the reach of standard SOS formulations. This is important because, as shown in Figure 7 and Table 4 for $m = 15$, convergence to the set $\mathcal{P}$ and to the limit $B^*_{m,\infty}$ is slow as $d$ is raised.
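The limit $B^*_{m,\infty}$ used above is simply the integral of the minimum eigenvalue function of $P$ over $K$. A minimal sketch of its numerical evaluation, using a hypothetical $2 \times 2$ polynomial matrix rather than the randomly generated $P$ of this section:

```python
import numpy as np

# Hypothetical polynomial matrix, PSD on the unit disk:
# P(x) = [[1 - x1^2, x1*x2], [x1*x2, 1 - x2^2]]
def P(x1, x2):
    return np.array([[1.0 - x1**2, x1 * x2], [x1 * x2, 1.0 - x2**2]])

def min_eig(x1, x2):
    return np.linalg.eigvalsh(P(x1, x2))[0]  # eigvalsh sorts ascending

# Midpoint-rule integration of lambda_min(P) over the unit disk K.
h = 0.01
xs = np.arange(-1.0, 1.0, h) + h / 2
X1, X2 = np.meshgrid(xs, xs)
mask = X1**2 + X2**2 <= 1.0
vals = np.array([min_eig(a, b) for a, b in zip(X1[mask], X2[mask])])
B_star = vals.sum() * h * h
# For this particular P, lambda_min(P(x)) = 1 - x1^2 - x2^2 on the disk,
# so the exact integral is pi/2.
```

This brute-force evaluation is what the SOS bounds $B^*_{m,d}$ approach from below as $d$ increases; for the 15-to-40-dimensional matrices of this section, only the sparsity-exploiting SOS hierarchy reaches the large $d$ needed for a tight approximation.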

Proof of Proposition 2.1
We construct polynomial matrices that cannot be decomposed according to (2.2) with polynomial $S_k$. We may assume that $n = 1$ without loss of generality because univariate polynomial matrices are particular cases of multivariate ones. First, fix $m = 3$ and let $G$ be the sparsity graph of the $3 \times 3$ positive definite polynomial matrix $P$ considered in Example 2.2 for $k = 1$. Observe that $G$ is essentially the only connected but not complete graph with $m = 3$: any other such graph can be reduced to $G$ by reordering its vertices, which corresponds to a symmetric permutation of the polynomial matrix it describes. We have already shown in Example 2.2 that $P$ has no decomposition of the form (2.2) with polynomial $S_k$, so Proposition 2.1 holds for $m = 3$.

The same $3 \times 3$ matrix can be used to generate counterexamples for a general connected but not complete chordal sparsity graph $G$ with $m > 3$. Non-completeness implies that $G$ must have at least two maximal cliques, while connectedness implies that every maximal clique $C_i$ must contain at least two elements and intersect at least one other clique $C_j$. Moreover, since $G$ is chordal, Theorem 3.3 in [17] guarantees that it contains at least one simplicial vertex (cf. Section 2.1 for a definition), which must belong to one and only one maximal clique. Upon reordering the vertices and the maximal cliques if necessary, we may therefore assume without loss of generality that: (i) $C_1 = \{1, \ldots, r\}$ for some $r$; (ii) vertex 1 is simplicial, so it belongs only to clique $C_1$; (iii) vertex 2 is in $C_1 \cap C_2$ and vertex $r+1$ is in $C_2 \setminus C_1$. Now, consider the positive definite $m \times m$ matrix $P$ whose nonzero entries lie on the diagonal or in the principal submatrix with rows and columns indexed by $\{1, 2, r+1\}$. The sparsity pattern of $P$ is compatible with the sparsity graph $G$. We claim that no decomposition of the form (2.2) exists where each $S_k$ is a PSD polynomial matrix.
For the sake of contradiction, assume that such a decomposition exists, and write it as the sum of the contribution of clique $C_1$ and a remainder $Q$, where $S_1(x)$ and $Q(x)$ are $r \times r$ and $m \times m$ PSD polynomial matrices, respectively. Since vertex 1 is contained only in clique $C_1$, the matrix $S_1$ must have a block form that is fixed except for an $(r-1) \times (r-1)$ polynomial matrix $T$ to be determined. For the same reason, the matrix $Q(x)$ can be partitioned into blocks, where $0_{p \times q}$ denotes a $p \times q$ matrix of zeros, $A$ is an $(r-1) \times (r-1)$ polynomial matrix to be determined, and the $(r-1) \times (m-r)$ block $B$ and the $(m-r) \times (m-r)$ block $C$ are fully determined by $P$. The block $T$ of $S_1$ and the block $A$ of $Q$ correspond to elements of clique $C_1$ that may also belong to other cliques. These blocks cannot be determined uniquely, but their sum must equal the principal submatrix of $P$ with rows and columns indexed by $\{2, \ldots, r\}$. In particular, we must have $A_{11}(x) = 2x^2 + 1 - T_{11}(x)$. Moreover, since $S_1$ and $Q$ are PSD by assumption, taking appropriate Schur complements yields necessary polynomial inequalities on $T_{11}$ and $A_{11}$.
Using the identity $A_{11}(x) = 2x^2 + 1 - T_{11}(x)$, these conditions require inequalities that, just as in Example 2.2, no polynomial $T_{11}(x)$ can satisfy. We conclude that $P$ cannot admit a decomposition of the form (2.2) with PSD polynomial matrices $S_k$, which proves Proposition 2.1 in the general case.
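The simplicial-vertex property used in this proof (and again in the proofs below) is easy to check computationally. A minimal sketch, assuming the `networkx` package is available, for a hypothetical "path of triangles" chordal graph:

```python
import networkx as nx

# A chordal graph with maximal cliques {1,2,3}, {2,3,4}, {3,4,5}.
G = nx.Graph([(1, 2), (1, 3), (2, 3), (2, 4), (3, 4), (3, 5), (4, 5)])
assert nx.is_chordal(G)

def is_simplicial(G, v):
    """A vertex is simplicial if its neighbourhood induces a clique."""
    nbrs = list(G[v])
    return all(G.has_edge(a, b) for i, a in enumerate(nbrs) for b in nbrs[i + 1:])

simplicial = [v for v in G if is_simplicial(G, v)]
assert 1 in simplicial and 5 in simplicial   # the two "end" vertices are simplicial
assert not is_simplicial(G, 3)               # the shared vertex 3 is not
```

Vertices 1 and 5 each belong to exactly one maximal clique, which is precisely the property exploited when constructing the counterexample above and when peeling off a clique in the induction proofs of Theorems 2.2 and 2.4.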

Proof of Theorem 2.2
To establish Theorem 2.2 we adapt ideas of Kakimura [44], who proved the chordal decomposition theorem for constant PSD matrices (cf. Theorem 2.1) using the fact that symmetric matrices with chordal sparsity patterns admit an $LDL^T$ factorization with no fill-in [55]. In Appendix C, we use Schmüdgen's diagonalization procedure [45] to prove an analogous statement, Proposition 6.1, which factorizes a suitably scaled and permuted polynomial matrix $TPT^T$ through a lower-triangular polynomial matrix $L$ and a diagonal polynomial matrix; moreover, $L$ has no fill-in in the sense that $L + L^T$ has the same sparsity as $TPT^T$. Now, let $P(x)$ be a PSD polynomial matrix with chordal sparsity graph, and apply Proposition 6.1 to diagonalize it. We first assume that the permutation matrix $T$ is the identity, and remove this assumption at the end.
Since $P$ is PSD, the polynomials $d_1(x), \ldots, d_m(x)$ in (6.1) must be nonnegative globally and, by the Hilbert-Artin theorem [22], can be written as sums of squares of rational functions. In particular, there exist SOS polynomials $f_1, \ldots, f_m$ and $g_1, \ldots, g_m$ such that $f_i(x) d_i(x) = g_i(x)$ for all $i = 1, \ldots, m$, and we can rewrite (6.1) accordingly (omitting the argument $x$ for notational simplicity). Next, define the polynomial $\sigma := b^4 \prod_j f_j$ and observe that it is SOS because it is a product of SOS polynomials. For the same reason, the products $g_i \prod_{j \neq i} f_j$ appearing on the right-hand side of the last equation are SOS polynomials. Thus, we can find an integer $s$ and polynomials $q_{11}, \ldots, q_{m1}, \ldots, q_{1s}, \ldots, q_{ms}$ such that
$$\sigma P = \sum_{i=1}^s L \,\mathrm{Diag}\!\left(q_{1i}^2, \ldots, q_{mi}^2\right) L^T =: \sum_{i=1}^s H_i H_i^T, \qquad (6.2)$$
where, for notational simplicity, we have introduced the lower-triangular matrices $H_i := L \,\mathrm{Diag}(q_{1i}, \ldots, q_{mi})$. Under our additional assumption that Proposition 6.1 can be applied with $T = I$, Theorem 2.2 follows if we can show that
$$H_i H_i^T = \sum_{k=1}^t E_{C_k}^T S_{ik} E_{C_k} \qquad (6.3)$$
for some SOS matrices $S_{ik}$ and each $i = 1, \ldots, s$. Indeed, combining (6.3) with (6.2) and setting $S_k = \sum_{i=1}^s S_{ik}$ yields the desired decomposition (2.6) for $P$.

To establish (6.3), denote the columns of $H_i$ by $h_{i1}, \ldots, h_{im}$ and write
$$H_i H_i^T = \sum_{j=1}^m h_{ij} h_{ij}^T. \qquad (6.4)$$
Since $H_i$ has the same sparsity pattern as $L$, the nonzero elements of each column vector $h_{ij}$ must be indexed by some maximal clique $C_k$ with $k \in \{1, \ldots, t\}$. Thus, the nonzero elements of $h_{ij}$ can be extracted through multiplication by the matrix $E_{C_k}$, so that
$$h_{ij} h_{ij}^T = E_{C_k}^T \big(E_{C_k} h_{ij}\big)\big(E_{C_k} h_{ij}\big)^T E_{C_k} =: E_{C_k}^T Q_{ij} E_{C_k}, \qquad (6.5)$$
where $Q_{ij}$ is an SOS matrix by construction. Now, let $J_{ik}$ be the set of column indices $j$ such that column $h_{ij}$ is indexed by clique $C_k$. These index sets are disjoint and satisfy $\cup_k J_{ik} = \{1, \ldots, m\}$, so substituting (6.5) into (6.4) yields exactly (6.3) with matrices $S_{ik} = \sum_{j \in J_{ik}} Q_{ij}$, which are SOS because they are sums of SOS matrices. Thus, we have proved Theorem 2.2 for polynomial matrices $P$ to which Proposition 6.1 can be applied with $T = I$.
The general case follows from a straightforward permutation argument. First, apply the argument above to decompose the permuted matrix $TPT^T$, whose sparsity graph $G'$ is obtained by reordering the vertices of the sparsity graph $G$ of $P$ according to the permutation $T$. Second, observe that the maximal cliques $C_1', \ldots, C_t'$ of $G'$ are related to the maximal cliques $C_1, \ldots, C_t$ of $G$ by the permutation $T$, so the corresponding matrices satisfy $E_{C_k} = E_{C_k'} T$. As required, therefore,
$$\sigma P = T^T \Big( \sum_{k=1}^t E_{C_k'}^T S_k E_{C_k'} \Big) T = \sum_{k=1}^t E_{C_k}^T S_k E_{C_k}.$$

Proof of Theorem 2.4
Our proof of Theorem 2.4 follows the same steps used by Kakimura [44] to prove the chordal decomposition theorem for constant PSD matrices (Theorem 2.1). Borrowing ideas from [36], this can be done with the help of the Weierstrass polynomial approximation theorem and the following version of Putinar's Positivstellensatz for polynomial matrices due to Scherer and Hol [9, Theorem 2].
Theorem 6.1 (Scherer and Hol [9]). Let $K$ be a compact semialgebraic set defined as in (1.2) that satisfies the Archimedean condition (2.12). If an $m \times m$ symmetric polynomial matrix $P(x)$ is strictly positive definite on $K$, then there exist $m \times m$ SOS matrices $S_0, \ldots, S_q$ such that $P(x) = S_0(x) + \sum_{i=1}^q g_i(x) S_i(x)$.
Remark 6.1. It is also possible to establish Theorem 2.4 by modifying the proof of Theorem 6.1 with the help of Theorem 2.1. This alternative approach is technically more involved, but might be extended more easily to obtain sparsity-exploiting versions of the general result in [9, Corollary 1], rather than of its particular version in Theorem 6.1. We leave this generalization to future research.
Let P (x) be an m × m polynomial matrix with chordal sparsity graph G. If m = 1 or 2, Theorem 2.4 is a direct consequence of Theorem 6.1. For m ≥ 3, we proceed by induction assuming that Theorem 2.4 holds for matrices of size m − 1 or less. Without loss of generality, we assume that the sparsity graph G is not complete (otherwise, P is dense and Theorem 2.4 reduces to Theorem 6.1) and connected (otherwise, P and G can be replaced by their connected components).
Since $G$ is chordal, it has at least one simplicial vertex [17, Theorem 3.3]. Relabelling vertices if necessary, which is equivalent to permuting $P$, we may assume that vertex 1 is simplicial and that the first maximal clique of $G$ is $C_1 = \{1, \ldots, r\}$ with $1 < r < m$. Thus, $P(x)$ has the block structure
$$P(x) = \begin{bmatrix} a & b^T & 0 \\ b & U & V \\ 0 & V^T & W \end{bmatrix}$$
for some polynomial $a$, polynomial vector $b = (b_1, \ldots, b_{r-1})$, and polynomial matrices $U$ of dimension $(r-1) \times (r-1)$, $V$ of dimension $(r-1) \times (m-r)$, and $W$ of dimension $(m-r) \times (m-r)$.
The polynomial $a$ must be strictly positive on $K$ because $P$ is positive definite on that set, so we can apply one step of the Cholesky factorization algorithm to write $P$ in the factored form (6.6). The matrix on the right-hand side of (6.6) is positive definite on the compact set $K$ because $P$ is and $L$ is invertible. Therefore, there exists $\varepsilon > 0$ such that the strict bound (6.7) holds on $K$. Moreover, the rational entries of the matrix $a^{-1} b b^T$ are continuous on $K$ because $a$ is strictly positive on that set, so we may apply the Weierstrass approximation theorem to choose a polynomial matrix $H(x)$ that approximates $a^{-1} b b^T$ uniformly on $K$ as in (6.8). Next, consider the decomposition (6.9). Combining (6.8) with the strict positivity of $a(x)$ on $K$, we find that the matrix $Q$ in (6.9) is positive definite on $K$, where the last strict matrix inequality follows from the strict positivity of $a$ and Schur complement conditions. Since $Q$ is positive definite on $K$, we may apply Theorem 6.1 to find SOS matrices $T_0, \ldots, T_q$ such that $Q$ admits the weighted SOS representation (6.10). Moreover, for all $x \in K$, inequalities (6.7) and (6.8) imply that the matrix $R(x)$ in (6.9) is positive definite on $K$. The sparsity of $R(x)$ is described by the subgraph $\tilde G$ of $G$ obtained by removing the simplicial vertex 1 and its edges. This subgraph is chordal [17, Section 4.2] and has either $t$ maximal cliques $\tilde C_1 = C_1 \setminus \{1\}, \tilde C_2 = C_2, \ldots, \tilde C_t = C_t$, or $t-1$ maximal cliques $\tilde C_2 = C_2, \ldots, \tilde C_t = C_t$ (in the latter case, we set $\tilde C_1 = \emptyset$ for notational convenience). In either case, by the induction hypothesis, we can find SOS matrices $Y_i$ and $\tilde S_{ik}$ such that (6.11) holds (we omit the argument $x$ from all polynomials and polynomial matrices for notational simplicity).

The SOS decompositions (6.10) and (6.11) can now be combined with (6.9) to derive the desired SOS decomposition for $P(x)$. The process is straightforward but notationally cumbersome because we need to handle matrices of different dimensions. (We also slightly abuse notation: the matrices $E_{\tilde C_k}$ have size $|\tilde C_k| \times (m-1)$ because they are defined using the graph $\tilde G$, which has $m-1$ vertices, while the matrices $E_{C_k}$ have size $|C_k| \times m$ because they are defined using the graph $G$, which has $m$ vertices.) For each $i \in \{0, \ldots, q\}$ and $k \in \{1, \ldots, t\}$, define suitably padded versions of the matrices above and rewrite the decomposition (6.9) accordingly. Substituting the decomposition of $Q$ from (6.10), letting $S_{i1}(x) := T_i(x) + Z_i(x)$, and reintroducing the $x$-dependence of the various terms, we arrive at the desired SOS decomposition of $P(x)$.
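The single Cholesky step and Schur complement at the heart of this induction are easy to verify numerically for a constant positive definite matrix. A minimal sketch with a hypothetical random instance:

```python
import numpy as np

rng = np.random.default_rng(0)
G = rng.standard_normal((4, 4))
P = G @ G.T + 4.0 * np.eye(4)          # a generic positive definite matrix

# One step of Cholesky: P = L * Diag(a, W - b b^T / a) * L^T,
# where a = P[0,0], b = P[1:,0], W = P[1:,1:].
a, b, W = P[0, 0], P[1:, 0], P[1:, 1:]
L = np.eye(4)
L[1:, 0] = b / a
Q = np.zeros((4, 4))
Q[0, 0] = a
Q[1:, 1:] = W - np.outer(b, b) / a     # the Schur complement of a in P

assert np.allclose(L @ Q @ L.T, P)
assert np.min(np.linalg.eigvalsh(Q[1:, 1:])) > 0   # Schur complement stays PD
```

In the polynomial setting the rational block $a^{-1} b b^T$ is no longer polynomial, which is exactly why the proof replaces it with a Weierstrass polynomial approximation $H(x)$ before invoking Theorem 6.1.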

Proof of Theorem 2.5
We combine the argument given in [24] for general (dense) polynomial matrices with Theorem 2.4 and the following auxiliary result, proven in Appendix D.
Lemma 6.1. Let $S(x)$ be an SOS polynomial matrix satisfying $S(x) = S(-x)$. For any real number $r \geq 0$ and any integer $\omega$ such that $2\omega \geq \deg(S)$, the matrix $\|x\|^{2\omega} S\big(r \|x\|^{-1} x\big)$ is a homogeneous SOS polynomial matrix of degree $2\omega$.
Choose any nonzero $x_0 \in K$, let $r = \|x_0\| \neq 0$, and observe that the (nonempty) semialgebraic set $K' := K \cap \{x \in \mathbb{R}^n : \pm(r^2 - \|x\|^2) \geq 0\}$ satisfies the Archimedean condition (2.12). Set $g_{q+1}(x) = r^2 - \|x\|^2$ and $g_{q+2}(x) = \|x\|^2 - r^2$ for notational convenience. Since the homogeneous polynomial matrix $P(x')$ is strictly positive definite for all $x' \in K' \subseteq K \setminus \{0\}$, we can apply Theorem 2.4 to find SOS matrices $\hat S_{j,k}$ (not necessarily homogeneous) such that the weighted decomposition (6.12) holds. Moreover, standard symmetry arguments (see, e.g., [32,33]) reveal that we may take $\hat S_{j,k}(-x') = \hat S_{j,k}(x')$ for all $j$ and $k$ because the matrix $P$ and the polynomials $g_1, \ldots, g_{q+2}$ are invariant under the transformation $x' \to -x'$. The latter assertion is true because $P$ and $g_1, \ldots, g_q$ are homogeneous of even degree by assumption, while $g_{q+1}(-x') = g_{q+1}(x')$ and $g_{q+2}(-x') = g_{q+2}(x')$ by construction.

Next, set $2d_0 = \deg(P)$ and $2d_j = \deg(g_j)$ for all $j = 1, \ldots, q$. Given any nonzero $x \in \mathbb{R}^n$, evaluating (6.12) at the point $x' = r x \|x\|^{-1}$ yields the identity (6.13), where we have used the fact that $g_{q+1}(r x \|x\|^{-1}) = g_{q+2}(r x \|x\|^{-1}) = 0$. Let $\omega$ be the smallest integer such that $2\omega \geq \deg(\hat S_{0,k})$ and $2(\omega - d_j) \geq \deg(\hat S_{j,k})$ for all $j \geq 1$ and all $k$, and set $\nu := \omega - d_0$. Multiplying (6.13) by $\|x\|^{2\omega}$ and rearranging, we obtain (6.14), where Lemma 6.1 guarantees that the matrices $\|x\|^{2(\omega - d_j)} \hat S_{j,k}(r x \|x\|^{-1})$ are homogeneous and SOS. Since (6.14) clearly holds also for $x = 0$, it is the desired chordal SOS decomposition of $P$.

Conclusion
We have proven SOS decomposition theorems for positive semidefinite polynomial matrices with chordal sparsity. In addition to being interesting in their own right, our SOS chordal decompositions have two important consequences. First, they can be combined with a straightforward scalarization argument to deduce new SOS representation results for nonnegative polynomials that are quadratic and correlatively sparse with respect to a subset of independent variables (Corollaries 4.1, 4.2 and 4.3). These statements specialize a sparse version of Putinar's Positivstellensatz proven in [35], as well as recent sparsity-exploiting extensions of Reznick's Positivstellensatz [51]. Second, Theorems 2.3, 2.4 and 2.5 and Corollaries 2.1 and 2.2 enable us to build new sparsity-exploiting hierarchies of SOS reformulations for convex optimization problems subject to large-scale but sparse polynomial matrix inequalities. These hierarchies are asymptotically exact for problems that have strictly feasible points and whose matrix inequalities are either imposed on a compact set satisfying the Archimedean condition (Theorem 3.1), or satisfy additional homogeneity and strict positivity conditions (Theorems 3.2 and 3.3). Moreover, and perhaps most importantly, our SOS hierarchies have significantly lower computational complexity than traditional ones when the maximal cliques of the sparsity graph associated with the polynomial matrix inequality are much smaller than the matrix itself. As demonstrated by the numerical examples in Section 5, this makes it possible to solve optimization problems with polynomial matrix inequalities that are well beyond the reach of standard SOS methods, without sacrificing their asymptotic convergence.
It would be interesting to explore whether the results presented in this work can be extended in various directions. For example, it may be possible to adapt the analysis in [9] to derive a more general version of Theorem 2.4. It should also be possible to deduce explicit degree bounds for the SOS matrices that appear in our decomposition results. Stronger decomposition results for inhomogeneous polynomial matrix inequalities imposed on semialgebraic sets that are noncompact or do not satisfy the Archimedean condition would also be of interest. For instance, Corollaries 2.1 and 2.2 place restrictive assumptions on the behaviour of the leading homogeneous part of a polynomial matrix. These assumptions are often not met and, in such cases, SOS reformulations of convex optimization problems with polynomial matrix inequalities cannot be guaranteed to converge using Corollaries 2.1 and 2.2. Finally, the chordal decomposition problem for semidefinite matrices has a dual formulation in terms of positive semidefinite completion of partially specified matrices; see, e.g., [17, Chapter 10]. Building on the notion of SOS matrix completion introduced in [56], it may be possible to establish SOS completion results for polynomial matrices. All of these extensions would contribute to a comprehensive theory of SOS decomposition and completion for polynomial matrices, enabling the application of SOS programming to large-scale optimization problems with semidefinite constraints on sparse polynomial matrices.
Acknowledgements. We would like to thank Antonis Papachristodoulou, Pablo Parrilo, J. William Helton, Igor Klep and Licio Romao for insightful conversations that have led to this work. We also thank the reviewers and Associate Editor, who motivated us to prove stronger theorems than those included in our original manuscript. Their suggestions considerably improved the quality of this work.
Since the second matrix on the right-hand side is positive definite, it suffices to show that the first one is PSD. This is true because its diagonal entries, its determinant, and its 2 × 2 principal minors are nonnegative (confirmation of this is left to the reader).

B Homogenization for Corollary 4.1
If $p(x, y)$ is quadratic but not homogeneous with respect to $y$, introduce a new variable $z$ and define $q(x, y, z) := z^2\, p(x, z^{-1} y)$.
This polynomial is well defined when $z \neq 0$, can be extended by continuity to $z = 0$, is both homogeneous and quadratic with respect to $(y, z)$, and satisfies $q(x, y, 1) = p(x, y)$.
Since $z$ multiplies all entries of $y$, the correlative sparsity graph of $q$ with respect to $(y, z)$ is chordal and has maximal cliques $\hat C_1 = C_1 \cup \{m+1\}, \hat C_2 = C_2 \cup \{m+1\}, \ldots, \hat C_t = C_t \cup \{m+1\}$, where $C_1, \ldots, C_t$ are the maximal cliques of the correlative sparsity graph of $p$ with respect to $y$. Moreover, since both $p(x, y)$ and $\sum_{\alpha, |\beta| = 2} c_{\alpha,\beta}\, x^\alpha y^\beta$ are nonnegative globally by assumption, $q(x, y, z)$ is nonnegative for all $x$, $y$ and $z$. Applying the result of Corollary 4.1 for the homogeneous case to $q$, we find SOS polynomials $\hat\sigma_k(x, y_{C_k}, z)$, each homogeneous and quadratic in $y_{C_k}$ and $z$, such that $\sigma_0(x)\, q(x, y, z) = \sum_{k=1}^t \hat\sigma_k(x, y_{C_k}, z)$.
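The homogenization step can be sanity-checked symbolically. A minimal sketch for a hypothetical $p$ (one $x$ variable, one $y$ variable), not taken from the paper:

```python
import sympy as sp

x, y, z, t = sp.symbols('x y z t')
# hypothetical p(x, y): quadratic but not homogeneous in y
p = (1 + x**2) * y**2 + 2 * x * y + 3
# homogenize in y with the new variable z: q(x, y, z) = z^2 * p(x, y/z)
q = sp.expand(z**2 * p.subs(y, y / z))

assert sp.expand(q - ((1 + x**2) * y**2 + 2 * x * y * z + 3 * z**2)) == 0
assert sp.expand(q.subs(z, 1) - p) == 0     # q(x, y, 1) = p(x, y)
# homogeneous of degree 2 in (y, z):
assert sp.expand(q.subs([(y, t * y), (z, t * z)], simultaneous=True) - t**2 * q) == 0
```

Each monomial of $p$ of degree $k \leq 2$ in $y$ picks up the factor $z^{2-k}$, so $q$ is a genuine polynomial that is quadratic and homogeneous in $(y, z)$.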
Setting $z = 1$ yields $\sigma_0(x)\, p(x, y) = \sum_{k=1}^t \hat\sigma_k(x, y_{C_k}, 1)$, which is the decomposition stated in Corollary 4.1 with polynomials $\sigma_k(x, y_{C_k}) := \hat\sigma_k(x, y_{C_k}, 1)$ that are quadratic (but not necessarily homogeneous) in $y_{C_k}$.

C Proof of Proposition 6.1

Proposition 6.1 is obvious if $m = 1$, and follows directly from the next lemma if $m = 2$.
Lemma C.1 (Schmüdgen [45]). Let $P(x)$ be an $m \times m$ polynomial matrix with block form
$$P(x) = \begin{bmatrix} u & v^T \\ v & W \end{bmatrix},$$
where $u$ is a polynomial, $v = [v_1, \ldots, v_{m-1}]^T$ is a polynomial vector, and $W$ is a symmetric $(m-1) \times (m-1)$ polynomial matrix. Then, $u^4(x) P(x) = Z(x) Q(x) Z(x)^T$ with explicitly constructed polynomial matrices $Z(x)$ and $Q(x)$.

For $m \geq 3$, we use an induction procedure that combines Schmüdgen's lemma with the zero fill-in property of the Cholesky algorithm for matrices with chordal sparsity.
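The denominator-clearing idea behind Lemma C.1 can be verified symbolically in the $2 \times 2$ case. The instance below is hypothetical and uses the multiplier $u^3$, which already suffices for $m = 2$ (the $u^4$ of Lemma C.1 works equally well; the exponent just needs to be large enough to clear the denominators of the $LDL^T$ factorization):

```python
import sympy as sp

x = sp.symbols('x')
u, v, w = 1 + x**2, x**3, 2 + x**4           # hypothetical entries; u > 0
P = sp.Matrix([[u, v], [v, w]])

# Clearing denominators in P = L * Diag(u, w - v^2/u) * L^T:
Z = sp.Matrix([[u, 0], [v, u]])              # polynomial lower-triangular factor
Q = sp.diag(u**2, sp.expand(u * w - v**2))   # diagonal; entries >= 0 wherever P is PSD

assert (Z * Q * Z.T - u**3 * P).expand() == sp.zeros(2, 2)
```

The diagonal entries of $Q$ play the role of the polynomials $d_i$ in Proposition 6.1: they are globally nonnegative whenever $P$ is PSD, which is what allows the Hilbert-Artin theorem to be applied in the proof of Theorem 2.2.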
Assume that Proposition 6.1 holds for all polynomial matrices of size $m-1$ with chordal sparsity; we claim that it then holds for polynomial matrices of size $m$. Let $P(x)$ be any $m \times m$ matrix whose sparsity graph $G$ is chordal. By Theorem 3.3 in [17], the graph $G$ has at least one simplicial vertex. Let $\Pi$ be a permutation matrix and denote by $G_\Pi$ the sparsity graph of the permuted matrix $\Pi P \Pi^T$, which is obtained simply by reordering the vertices of $G$ as specified by the permutation $\Pi$. We choose $\Pi$ such that vertex 1 is simplicial for $G_\Pi$ and such that the first maximal clique of $G_\Pi$ is $C_1 = \{1, \ldots, r\}$ for some $r \geq 1$. This means that the matrix $\Pi P \Pi^T$ can be partitioned into the block form (C.1). Moreover, $R + R^T$ has the same sparsity pattern as $\Lambda P \Lambda^T$, meaning that $\Lambda^T (R + R^T) \Lambda$ has the same sparsity as $P$. Combining this factorization with (C.2), and using that $\Lambda^T (R + R^T) \Lambda$ has the same sparsity pattern as $P$ and that $v = [\,q^T\ 0\,]^T$, we see that the $2 \times 2$ block matrix on the right-hand side has the same sparsity pattern as the right-hand side of (C.1), hence as $\Pi P \Pi^T$. We conclude that $T^T (L + L^T) T$ has the same sparsity pattern as $P$, as required.

D Proof of Lemma 6.1
It suffices to consider $2\omega = \deg(S)$. Since $S$ is SOS and $S(-x) = S(x)$, symmetry arguments [32] imply that $S(x) = S_1(x) + S_2(x)$ with $S_1(x) = H_e(x)^T H_e(x)$ and $S_2(x) = H_o(x)^T H_o(x)$, where $H_e(x) = \sum_{|\alpha|\ \mathrm{even}} A_\alpha x^\alpha$ and $H_o(x) = \sum_{|\alpha|\ \mathrm{odd}} B_\alpha x^\alpha$ for $m \times m$ coefficient matrices $A_\alpha$ and $B_\alpha$. Therefore, we only need to show that $\|x\|^{2\omega} S_1(r x \|x\|^{-1})$ and $\|x\|^{2\omega} S_2(r x \|x\|^{-1})$ are SOS and homogeneous of degree $2\omega$. If $r = 0$, this is trivial. If $r > 0$, set $\tilde A_\alpha := r^{|\alpha|} A_\alpha$ and write
$$\|x\|^{2\omega} S_1\big(r x \|x\|^{-1}\big) = \Big( \sum_\alpha \tilde A_\alpha\, x^\alpha\, \|x\|^{\omega - |\alpha|} \Big)^{\!T} \Big( \sum_\alpha \tilde A_\alpha\, x^\alpha\, \|x\|^{\omega - |\alpha|} \Big). \qquad (D.1)$$
To show that this matrix is SOS, we distinguish two cases. If $\omega$ is even, then so is $\omega - |\alpha|$ for every even $|\alpha|$, so each $\|x\|^{\omega - |\alpha|}$ is a polynomial in $x$. In this case, each factor in brackets on the right-hand side of (D.1) is a polynomial matrix, so $\|x\|^{2\omega} S_1(r x \|x\|^{-1})$ is SOS. If $\omega$ is odd, instead, one power of $\|x\|$ can be factored out of each bracket in (D.1); since $\omega - |\alpha| - 1$ is then even, the right-hand side becomes $\|x\|^2$ times an SOS polynomial matrix, which is again SOS. In both cases, the matrix on the right-hand side of (D.1) is clearly homogeneous of degree $2\omega$, and so is the left-hand side. Analogous reasoning proves that $\|x\|^{2\omega} S_2(r x \|x\|^{-1})$ is SOS and homogeneous of degree $2\omega$, concluding the proof.