Abstract
We first provide an inner-approximation hierarchy described by a sum-of-squares (SOS) constraint for the copositive (COP) cone over a general symmetric cone. The hierarchy is a generalization of that proposed by Parrilo (Structured semidefinite programs and semialgebraic geometry methods in Robustness and optimization, Ph.D. Thesis, California Institute of Technology, Pasadena, CA, 2000) for the usual COP cone (over a nonnegative orthant). We also discuss its dual. Second, we characterize the COP cone over a symmetric cone using the usual COP cone. By replacing the usual COP cone appearing in this characterization with the inner- or outer-approximation hierarchy provided by de Klerk and Pasechnik (SIAM J Optim 12(4):875–892, https://doi.org/10.1137/S1052623401383248, 2002) or Yıldırım (Optim Methods Softw 27(1):155–173, https://doi.org/10.1080/10556788.2010.540014, 2012), we obtain an inner- or outer-approximation hierarchy described by semidefinite but not by SOS constraints for the COP matrix cone over the direct product of a nonnegative orthant and a second-order cone. We then compare them with the existing hierarchies provided by Zuluaga et al. (SIAM J Optim 16(4):1076–1091, https://doi.org/10.1137/03060151X, 2006) and Lasserre (Math Program 144:265–276, https://doi.org/10.1007/s10107-013-0632-5, 2014). Theoretical and numerical examinations imply that we can numerically increase a depth parameter, which determines an approximation accuracy, in the approximation hierarchies derived from de Klerk and Pasechnik (SIAM J Optim 12(4):875–892, https://doi.org/10.1137/S1052623401383248, 2002) and Yıldırım (Optim Methods Softw 27(1):155–173, https://doi.org/10.1080/10556788.2010.540014, 2012), particularly when the nonnegative orthant is small. In such a case, the approximation hierarchy derived from Yıldırım (Optim Methods Softw 27(1):155–173, https://doi.org/10.1080/10556788.2010.540014, 2012) can yield nearly optimal values numerically. 
Combining the proposed approximation hierarchies with existing ones, we can evaluate the optimal value of COP programming problems more accurately and efficiently.
1 Introduction
In this study, we focus on the copositivity and its dual, the complete positivity of tensors, which include matrices and, more generally, linear transformations, over a symmetric cone. Typically, the nonnegative orthant, second-order cone, and semidefinite cone and their direct product are symmetric cones. A symmetric cone plays a significant role in optimization [17] and often appears in the modeling of realistic problems [36, 38]. The copositivity and complete positivity over a nonnegative orthant, i.e., those in the typical sense, are of particular importance. They have been deeply studied [46, 50] and exploited in the convex conic reformulation of many NP-hard problems [5, 8, 42, 43]. In addition, the complete positivity over other symmetric cones has been used in the convex conic reformulation of rank-constrained semidefinite programming (SDP) [3] and polynomial optimization problems over a symmetric cone [31], whose applications include polynomial SDP [32], which often appears in system and control theory [26,27,28], and polynomial second-order cone programming [30, 34]. Moreover, Gowda [21] discussed (weighted) linear complementarity problems over a symmetric cone, in which the copositivity over a symmetric cone of linear transformations was exploited. For convenience, the cones of copositive (COP) and completely positive (CP) tensors over a closed cone \({\mathbb {K}}\) are hereafter called the COP and CP cones (over \({\mathbb {K}}\)), respectively.
As the copositivity and complete positivity appear in the reformulation of such formidable problems, the COP and CP cones are difficult to handle [12]. Thus, to guarantee copositivity or complete positivity, we must consider sufficient or necessary conditions that can be handled efficiently.
To achieve this objective, many types of approximation hierarchies have been proposed [1, 7, 10, 13, 14, 20, 24, 25, 29, 33, 40, 41, 51, 53,54,55]. An approximation hierarchy, e.g., \(\{{\mathcal {K}}_r\}_r\), gradually approaches the COP or CP cone from the inside or outside as the depth parameter r, which determines the approximation accuracy, increases and, in a sense, agrees with the cone in the limit. By defining each \({\mathcal {K}}_r\) as a set represented by nonnegative, second-order cone, or semidefinite constraints, we can tentatively handle the copositivity or complete positivity on a computer using methods such as primal–dual interior point methods [2, Chap. 11].
Most of the above mentioned works provided approximation hierarchies for the usual COP and CP cones, and few studies [13, 24, 33, 55] have considered those for the COP or CP cones over a closed cone \({\mathbb {K}}\) other than a nonnegative orthant. Zuluaga et al. [55] provided an inner-approximation hierarchy described by the sum-of-squares (SOS) constraints, which reduce to semidefinite constraints, for the COP cone over a pointed semialgebraic cone. The term “semialgebraic” means that the set is defined by finitely many non-negativity constraints of homogeneous polynomials. The class of pointed semialgebraic cones includes the nonnegative orthant, second-order cone, and semidefinite cone or their direct product, which are also symmetric cones. The semidefiniteness of a matrix of size n is characterized by the nonnegativity of all the \(2^n - 1\) principal minors. That is, the semidefinite cone is a pointed semialgebraic cone; however, the semialgebraic representation requires an exponential number of non-negativity constraints in its size. In such a case, their approximation hierarchy is no longer tractable even if semidefinite constraints can describe it. Lasserre [33] provided an outer-approximation hierarchy described by a semidefinite constraint for the COP cone over a general closed convex cone \({\mathbb {K}}\). The hierarchy is implementable only for the case in which the moment of a finite Borel measure dependent on \({\mathbb {K}}\) is obtainable. The following case of \({\mathbb {K}}\) being the direct product of a nonnegative orthant and a second-order cone is an example in which the moment can be theoretically obtained. The study by Dickinson and Povh [13] is a variant of that of Lasserre [33], which considered the special case in which \({\mathbb {K}}\) is included in a nonnegative orthant to provide a tighter approximation than Lasserre [33].
This study aims to provide approximation hierarchies for the COP cone over a symmetric cone and compare them with existing ones. First, we provide an inner-approximation hierarchy described by an SOS constraint. It is a generalization of the approximation hierarchy proposed by Parrilo [40] for the usual COP cone. We call the proposed approximation hierarchy the NN-type inner-approximation hierarchy. Moreover, we discuss its dual to provide an outer-approximation hierarchy for the CP cone over a symmetric cone and provide its more explicit expression for the case in which the symmetric cone is a nonnegative orthant.
Second, we characterize the COP cone over a symmetric cone using the usual COP cone. The basic idea for providing an approximation hierarchy is to replace the usual COP cone appearing in this characterization with its approximation hierarchy. In general, the induced sequence is defined by the intersection of infinitely many sets and is not even guaranteed to converge to the COP cone over a symmetric cone. However, by exploiting the inner-approximation hierarchy given by de Klerk and Pasechnik [10] or outer-approximation hierarchy given by Yıldırım [53], we obtain an inner- or outer-approximation hierarchy described by finitely many semidefinite but not by SOS constraints for the cone of COP matrices (COP matrix cone) over the direct product of a nonnegative orthant and one second-order cone. Hereafter, we call the proposed inner- and outer-approximation hierarchies the dP- and Yıldırım-type approximation hierarchies, respectively.
As mentioned, Zuluaga et al.’s (ZVP-type) inner-approximation hierarchy [55] and Lasserre’s (Lasserre-type) outer-approximation hierarchy [33] are applicable to the COP matrix cone over the direct product of a nonnegative orthant and second-order cone. Then, we theoretically and numerically compare the proposed approximation hierarchies with existing ones. We determined that we can numerically increase a depth parameter in the dP- and Yıldırım-type approximation hierarchies, particularly when the nonnegative orthant is small. In particular, the Yıldırım-type outer-approximation hierarchy has a higher numerical stability than the Lasserre-type one and can approach nearly optimal values of COP programming (COPP) problems numerically.
The remainder of this paper is organized as follows. In Sect. 2, we introduce the notation and concepts used in this study. In Sect. 3, we provide an SOS-based NN-type inner-approximation hierarchy for the COP cone over a general symmetric cone and discuss its dual. In Sect. 4, as generalizations of the approximation hierarchies given by de Klerk and Pasechnik [10] and Yıldırım [53], we provide dP- and Yıldırım-type approximation hierarchies described by finitely many semidefinite constraints for the COP matrix cone over the direct product of a nonnegative orthant and second-order cone. We also discuss their concise expressions. In Sect. 5, we introduce the existing ZVP- and Lasserre-type approximation hierarchies that are applicable to the COP matrix cone over the direct product of a nonnegative orthant and second-order cone and compare them with the proposed approximation hierarchies. In Sect. 6, we compare the approximation hierarchies numerically by solving optimization problems obtained by approximating the COP cone and investigate the effect of the concise expressions mentioned in Sect. 4. Finally, Sect. 7 provides concluding remarks.
2 Preliminaries
2.1 Notation
We use \({\mathbb {N}}\), \({\mathbb {R}}\), \({\mathbb {R}}^{n\times m}\), \({\mathbb {S}}^n\), and \({\mathbb {S}}_+^n\) to denote the set of nonnegative integers, set of real numbers, set of real \(n\times m\) matrices, space of \(n\times n\) symmetric matrices, and set of positive semidefinite matrices in \({\mathbb {S}}^n\), respectively. For a finite set I, we use \({\mathbb {R}}^{I}\) and \({\mathbb {S}}^{I}\) to denote the |I|-dimensional Euclidean space with elements indexed by I and space of \(|I|\times |I|\) symmetric matrices with columns and rows indexed by I, respectively. Similarly, let \({\mathbb {S}}_+^{I}\) denote the set of positive semidefinite matrices in \({\mathbb {S}}^I\). We use \(\varvec{e}_i\) to denote the vector with an ith element of 1 and the remaining elements of 0, whose size is determined from the context. In addition, we use \(\varvec{0}\), \(\varvec{1}\), \(\varvec{O}\), \(\varvec{I}\), and \(\varvec{E}\) to denote the zero vector, vector with all elements 1, zero matrix, identity matrix, and matrix with all elements 1, respectively. We sometimes use a subscript, such as \(\varvec{1}_n\) and \(\varvec{I}_n\), to specify the size. Although all vectors that appear in this paper are column vectors, for notational convenience, the difference between a column and row may not be stated if it is clear from the context. The Euclidean space \({\mathbb {R}}^n\) is endowed with the usual transpose inner product and \(\Vert \cdot \Vert _2\) denotes the induced norm. We use \(S^n\) and \(\Delta _=^n\) to denote the n-dimensional unit sphere and standard simplex in \({\mathbb {R}}^{n+1}\), i.e.,
\[ S^n :=\{\varvec{x}\in {\mathbb {R}}^{n+1}\mid \Vert \varvec{x}\Vert _2 = 1\},\quad \Delta _=^n :=\{\varvec{x}\in {\mathbb {R}}^{n+1}\mid x_i\ge 0\ \text {for all}\ i,\ \varvec{1}^\top \varvec{x} = 1\}, \]
respectively. For a set \({\mathcal {X}}\), we use \(|{\mathcal {X}}|\), \({{\,\textrm{conv}\,}}({\mathcal {X}})\), \({{\,\textrm{cone}\,}}({\mathcal {X}})\), \({{\,\textrm{cl}\,}}({\mathcal {X}})\), \({{\,\textrm{int}\,}}({\mathcal {X}})\), and \(\partial ({\mathcal {X}})\) to denote the cardinality, convex hull, conical hull, closure, interior, and boundary of \({\mathcal {X}}\), respectively. For two finite-dimensional real vector spaces \({\mathbb {V}}\) and \({\mathbb {W}}\), we use \({{\,\textrm{Hom}\,}}({\mathbb {V}},{\mathbb {W}})\) to denote the set of linear mappings from \({\mathbb {V}}\) to \({\mathbb {W}}\). We use \(\lfloor \cdot \rfloor \) and \(\lceil \cdot \rceil \) to denote the floor and ceiling functions, respectively.
We call a nonempty set \({\mathcal {K}}\) in a finite-dimensional real vector space a cone if \(\alpha x\in {\mathcal {K}}\) for all \(\alpha >0\) and \(x\in {\mathcal {K}}\). For a cone \({\mathcal {K}}\) in a finite-dimensional real inner product space, \({\mathcal {K}}^*\) denotes its dual cone, i.e., the set of x such that the inner product between x and y is greater than or equal to 0 for all \(y\in {\mathcal {K}}\). A cone \({\mathcal {K}}\) is said to be pointed if it contains no lines. The following properties of a cone and its dual are well known:
Theorem 2.1
[6, Sect. 2.6.1] Let \({\mathcal {K}}\) be a cone.
-
(i)
If \({\mathcal {K}}\) is pointed, closed, and convex, \({\mathcal {K}}^*\) has a nonempty interior.
-
(ii)
If \({\mathcal {K}}\) is convex, \(({\mathcal {K}}^*)^* ={{\,\textrm{cl}\,}}({\mathcal {K}})\) holds. If \({\mathcal {K}}\) is also closed, \(({\mathcal {K}}^*)^* = {\mathcal {K}}\) holds.
For a polynomial f, we use \(\deg (f)\) to denote the degree of f. Let \(H^{n,m}\) be the set of homogeneous polynomials in n variables of degree m with real coefficients. We then define \(\Sigma ^{n,2m} :={{\,\textrm{conv}\,}}\{\theta ^2\mid \theta \in H^{n,m}\}\). \(\Sigma ^{n,2m}\) is known to be a closed convex cone [48, Proposition 3.6]. For \(\varvec{\alpha }\in {\mathbb {N}}^n\) and \(\varvec{x}\in {\mathbb {R}}^n\), we define \(\varvec{\alpha }! :=\prod _{i=1}^n\alpha _i!\), \(|\varvec{\alpha }|:=\sum _{i=1}^n\alpha _i\), and \(\varvec{x}^{\varvec{\alpha }}:=\prod _{i=1}^n x_i^{\alpha _i}\). In addition, we define
\[ {\mathbb {N}}_n^m :=\{1,\ldots ,n\}^m,\quad {\mathbb {I}}^n_{=m} :=\{\varvec{\alpha }\in {\mathbb {N}}^n\mid |\varvec{\alpha }| = m\}. \]
Under this notation, \({\mathbb {R}}^{{\mathbb {I}}^n_{=m}}\) is linearly isomorphic to \(H^{n,m}\) by the mapping \((\theta _{\varvec{\alpha }})_{\varvec{\alpha }\in {\mathbb {I}}^n_{=m}}\mapsto \sum _{\varvec{\alpha }\in {\mathbb {I}}^n_{=m}}\theta _{\varvec{\alpha }}\varvec{x}^{\varvec{\alpha }}\). Let \({\mathfrak {S}}_m\) be the symmetric group of degree m. Then, the group \({\mathfrak {S}}_m\) acts on the set \({\mathbb {N}}_n^m\) by \(\sigma \cdot (i_1,\ldots ,i_m) = (i_{\sigma (1)},\ldots ,i_{\sigma (m)})\). As mentioned in [14, Sect. 4], a bijection exists between the set \({\mathbb {I}}^n_{=m}\) and a complete set of the representatives of \({\mathfrak {S}}_m\)-orbits in \({\mathbb {N}}_n^m\). As the set
\[ \{(i_1,\ldots ,i_m)\in {\mathbb {N}}_n^m\mid i_1\le \cdots \le i_m\} \tag{1} \]
is a complete set of the representatives of \({\mathfrak {S}}_m\)-orbits in \({\mathbb {N}}_n^m\), we define \([\varvec{\alpha }]\in {\mathbb {N}}^m_n\) as the element of (1) corresponding to \(\varvec{\alpha }\in {\mathbb {I}}^n_{=m}\), i.e.,
\[ [\varvec{\alpha }] :=(\underbrace{1,\ldots ,1}_{\alpha _1},\ldots ,\underbrace{n,\ldots ,n}_{\alpha _n}). \]
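The correspondence between multi-indices and orbit representatives can be sketched in a few lines of Python. This is only an illustration under the assumptions that \({\mathbb {N}}_n^m\) is the set of index tuples \(\{1,\ldots ,n\}^m\) and that \([\varvec{\alpha }]\) is the nondecreasing tuple in which the value i is repeated \(\alpha _i\) times; the function names are hypothetical and not taken from the paper.

```python
from itertools import product


def multi_indices(n, m):
    """All alpha in N^n with |alpha| = m (assumed meaning of I^n_{=m})."""
    if n == 1:
        return [(m,)]
    return [(k,) + rest
            for k in range(m + 1)
            for rest in multi_indices(n - 1, m - k)]


def representative(alpha):
    """Nondecreasing tuple [alpha]: the value i (1-based) repeated alpha_i times."""
    return tuple(i + 1 for i, a in enumerate(alpha) for _ in range(a))


def orbit_representatives(n, m):
    """One sorted tuple per S_m-orbit of {1,...,n}^m, found by brute force."""
    return sorted({tuple(sorted(t)) for t in product(range(1, n + 1), repeat=m)})


# The map alpha -> [alpha] hits every orbit representative exactly once.
reps = sorted(representative(a) for a in multi_indices(3, 2))
```

For n = 3 and m = 2, the multi-index (2, 0, 1) is sent to the tuple (1, 1, 3), and the six multi-indices are matched with the six nondecreasing tuples.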
2.2 Euclidean Jordan algebra and symmetric cone
A finite-dimensional real vector space \({\mathbb {E}}\) equipped with a bilinear mapping (product) \(\circ :{\mathbb {E}}\times {\mathbb {E}}\rightarrow {\mathbb {E}}\) is said to be a Jordan algebra if the following two conditions hold for all \(x,y\in {\mathbb {E}}\):
-
(J1)
\(x \circ y = y \circ x\)
-
(J2)
\(x\circ ((x\circ x)\circ y) = (x\circ x)\circ (x\circ y)\)
In this study, we assume that a Jordan algebra has an identity element e for the product. A Jordan algebra \(({\mathbb {E}},\circ )\) is said to be Euclidean if there exists an associative inner product \(\bullet \) on \({\mathbb {E}}\) such that
-
(J3)
\((x\circ y)\bullet z = x\bullet (y\circ z)\)
for all \(x,y,z\in {\mathbb {E}}\). Throughout this study, we fix an associative inner product \(\bullet \) on a Euclidean Jordan algebra \(({\mathbb {E}},\circ )\) and regard \(({\mathbb {E}},\circ ,\bullet )\) as a finite-dimensional real inner product space.
Let \(({\mathbb {E}},\circ ,\bullet )\) be a Euclidean Jordan algebra. An element \(c\in {\mathbb {E}}\) is called an idempotent if \(c\circ c = c\). In addition, an idempotent c is called primitive if it is nonzero and cannot be written as the sum of two nonzero idempotents. Two elements \(c,d\in {\mathbb {E}}\) are called orthogonal if \(c\circ d = 0\). The system \(c_1,\ldots ,c_k\) is called a complete system of orthogonal idempotents if each \(c_i\) is an idempotent, \(c_i\circ c_j = 0\) if \(i\ne j\), and \(\sum _{i=1}^kc_i = e\). In addition, if each \(c_i\) is also primitive, the system is called a Jordan frame. Each Jordan frame is known to consist of exactly \({{\,\textrm{rk}\,}}\) elements, where \({{\,\textrm{rk}\,}}\) is the rank of the Euclidean Jordan algebra \(({\mathbb {E}},\circ ,\bullet )\) and the rank depends only on the algebra [16, Sect. III.1]. Here, for the convenience of the proofs, we consider an ordered Jordan frame and let \({\mathfrak {F}}({\mathbb {E}})\) be the set of ordered Jordan frames, i.e.,
\[ {\mathfrak {F}}({\mathbb {E}}) :=\{(c_1,\ldots ,c_{{{\,\textrm{rk}\,}}})\in {\mathbb {E}}^{{{\,\textrm{rk}\,}}}\mid c_1,\ldots ,c_{{{\,\textrm{rk}\,}}}\ \text {form a Jordan frame}\}. \]
Note that \({\mathfrak {F}}({\mathbb {E}})\) is a compact subset in \({\mathbb {E}}^{{{\,\textrm{rk}\,}}}\) [16, Exercise IV.5]. Each element of \({\mathbb {E}}\) can be decomposed into a linear combination of a Jordan frame [16, Theorem III.1.2]. In particular, for each \(x\in {\mathbb {E}}\), there exist \(x_1,\ldots ,x_{{{\,\textrm{rk}\,}}}\in {\mathbb {R}}\) and \((c_1,\ldots ,c_{{{\,\textrm{rk}\,}}})\in {\mathfrak {F}}({\mathbb {E}})\) such that
\[ x = \sum _{i=1}^{{{\,\textrm{rk}\,}}}x_ic_i. \tag{2} \]
The symmetric cone \({\mathbb {E}}_+\) associated with the Euclidean Jordan algebra \(({\mathbb {E}},\circ ,\bullet )\) is defined as \({\mathbb {E}}_+ :=\{x\circ x\mid x\in {\mathbb {E}}\}\). Note that when each \(x\in {\mathbb {E}}_+\) is decomposed into the form (2), all coefficients \(x_i\) are nonnegative. Conversely, for any nonnegative scalars \(x_1,\ldots ,x_{{{\,\textrm{rk}\,}}}\) and \((c_1,\ldots ,c_{{{\,\textrm{rk}\,}}})\in {\mathfrak {F}}({\mathbb {E}})\), it follows that \(\sum _{i=1}^{{{\,\textrm{rk}\,}}}x_ic_i\in {\mathbb {E}}_+\).
We now show some examples of the symmetric cones frequently used in this paper.
Example 2.2
(Nonnegative orthant) Let \({\mathbb {E}}\) be an n-dimensional Euclidean space \({\mathbb {R}}^n\). If we set \(\varvec{x}\circ \varvec{y} :=(x_1y_1,\ldots ,x_ny_n)\) and \(\varvec{x}\bullet \varvec{y} :=\varvec{x}^\top \varvec{y}\) for \(\varvec{x},\varvec{y}\in {\mathbb {E}}\), then \(({\mathbb {E}},\circ ,\bullet )\) is a Euclidean Jordan algebra, and the induced symmetric cone \({\mathbb {E}}_+\) is the nonnegative orthant \({\mathbb {R}}_+^n = \{\varvec{x}\in {\mathbb {R}}^n \mid x_i\ge 0\text { for all}\, i = 1,\ldots ,n\}\). The set \({\mathfrak {F}}({\mathbb {E}})\) of ordered Jordan frames is \(\{(\varvec{e}_{\sigma (1)},\ldots ,\varvec{e}_{\sigma (n)}) \mid \sigma \in {\mathfrak {S}}_n\}\).
Example 2.3
(Second-order cone) Let \({\mathbb {E}}\) be an n-dimensional Euclidean space \({\mathbb {R}}^n\) with \(n\ge 2\). If we set \(\varvec{x}\circ \varvec{y} :=(\varvec{x}^\top \varvec{y},x_1\varvec{y}_{2:n} + y_1\varvec{x}_{2:n})\) and \(\varvec{x}\bullet \varvec{y} :=\varvec{x}^\top \varvec{y}\) for \(\varvec{x}=(x_1,\varvec{x}_{2:n})\), \(\varvec{y}=(y_1,\varvec{y}_{2:n})\in {\mathbb {E}}\), then \(({\mathbb {E}},\circ ,\bullet )\) is a Euclidean Jordan algebra, and the induced symmetric cone \({\mathbb {E}}_+\) is the second-order cone \({\mathbb {L}}^n = \{(x_1,\varvec{x}_{2:n})\in {\mathbb {R}}^n \mid x_1\ge \Vert \varvec{x}_{2:n}\Vert _2\}\). The set \({\mathfrak {F}}({\mathbb {E}})\) of ordered Jordan frames is
\[ \left\{ \left( \frac{1}{2}(1,\varvec{v}),\frac{1}{2}(1,-\varvec{v})\right) \,\Big |\, \varvec{v}\in S^{n-2}\right\} . \]
Example 2.4
(Semidefinite cone) Let \({\mathbb {E}}\) be the space \({\mathbb {S}}^n\) of \(n\times n\) symmetric matrices. If we set \(\varvec{X}\circ \varvec{Y} :=(\varvec{X}\varvec{Y} + \varvec{Y}\varvec{X})/2\) and \(\varvec{X}\bullet \varvec{Y} :=\sum _{i,j=1}^nX_{ij}Y_{ij}\) for \(\varvec{X},\varvec{Y}\in {\mathbb {E}}\), then \(({\mathbb {E}},\circ ,\bullet )\) is a Euclidean Jordan algebra, and the induced symmetric cone \({\mathbb {E}}_+\) is the semidefinite cone \({\mathbb {S}}_+^n\).
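As a numerical illustration of Example 2.3, the following sketch implements the Jordan product of the second-order cone and the decomposition (2) of a vector into a Jordan frame. It assumes the standard closed form for the frame, \(\frac{1}{2}(1,\pm \varvec{v})\) with a unit vector \(\varvec{v}\); the function names are hypothetical.

```python
import numpy as np


def soc_product(x, y):
    """Jordan product of Example 2.3: (x^T y, x_1 * y_{2:n} + y_1 * x_{2:n})."""
    return np.concatenate(([x @ y], x[0] * y[1:] + y[0] * x[1:]))


def soc_spectral(x):
    """Return coefficients (l1, l2) and a Jordan frame (c1, c2) with x = l1*c1 + l2*c2."""
    x1, xbar = x[0], x[1:]
    nrm = np.linalg.norm(xbar)
    # Any unit vector works when xbar = 0; pick the first coordinate direction.
    v = xbar / nrm if nrm > 0 else np.eye(len(xbar))[0]
    c1 = 0.5 * np.concatenate(([1.0], v))
    c2 = 0.5 * np.concatenate(([1.0], -v))
    return (x1 + nrm, x1 - nrm), (c1, c2)


x = np.array([3.0, 1.0, 2.0])
(l1, l2), (c1, c2) = soc_spectral(x)
# x lies in the second-order cone exactly when both coefficients are nonnegative.
in_cone = l1 >= 0 and l2 >= 0
```

Here both coefficients are positive, which agrees with the fact that \(x_1 = 3\) exceeds \(\Vert \varvec{x}_{2:n}\Vert _2 = \sqrt{5}\).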
Consider the case in which a Euclidean Jordan algebra \(({\mathbb {E}},\circ ,\bullet )\) can be written as the direct product (sum) of two Euclidean Jordan algebras \(({\mathbb {E}}_i,\circ _i,\bullet _i)\) with rank \({{\,\textrm{rk}\,}}_i\) and identity element \(e_{{\mathbb {E}}_i}\) \((i = 1,2)\). Note that the following discussion can be directly extended to the case of finitely many Euclidean Jordan algebras. The product \(\circ \) and associative inner product \(\bullet \) are defined as follows:
\[ (x_1,x_2)\circ (y_1,y_2) :=(x_1\circ _1 y_1,x_2\circ _2 y_2),\quad (x_1,x_2)\bullet (y_1,y_2) :=x_1\bullet _1 y_1 + x_2\bullet _2 y_2 \]
for \((x_1,x_2),(y_1,y_2) \in {\mathbb {E}} = {\mathbb {E}}_1\times {\mathbb {E}}_2\). The rank of \({\mathbb {E}}\) is \({{\,\textrm{rk}\,}}_1 + {{\,\textrm{rk}\,}}_2\), and the identity element e of \({\mathbb {E}}\) is \((e_{{\mathbb {E}}_1},e_{{\mathbb {E}}_2})\). In the following, we derive the set of ordered Jordan frames of \({\mathbb {E}}\). This result will be exploited in Sect. 4 to obtain approximation hierarchies described by finitely many semidefinite constraints.
Lemma 2.5
For a primitive idempotent \(f = (f_1,f_2)\in {\mathbb {E}}\), exactly one of the following two statements holds:
-
(a)
\(f_{1}\) is a primitive idempotent of \({\mathbb {E}}_1\) and \(f_{2} = 0\).
-
(b)
\(f_{1} = 0\) and \(f_{2}\) is a primitive idempotent of \({\mathbb {E}}_2\).
Proof
We assume that \(f_1 \ne 0\). If \(f_2 \ne 0\), then f can be decomposed into the sum of two nonzero elements \((f_1,0)\) and \((0,f_2)\). The two elements are actually idempotents of \({\mathbb {E}}\), which contradicts f being primitive. Thus, we have \(f_2 = 0\). Because \(f = (f_1,0)\) is a primitive idempotent, \(f_1\) is also a primitive idempotent. This case falls under the case (a).
Next, we assume that \(f_1 = 0\). In this case, as in the above discussion, we note that \(f_2\) is a primitive idempotent. This case falls under the case (b). \(\square \)
Proposition 2.6
(The set of ordered Jordan frames of \({\mathbb {E}}\)) \((f_1,\ldots ,f_{{{\,\textrm{rk}\,}}_1 + {{\,\textrm{rk}\,}}_2}) \in {\mathfrak {F}}({\mathbb {E}})\) if and only if there exists a partition \((I_1,I_2)\) of \(\{1,\ldots ,{{\,\textrm{rk}\,}}_1+{{\,\textrm{rk}\,}}_2\}\) such that the following two conditions hold:
-
(i)
\(f_i = (f_{1i},0)\in {\mathbb {E}}\) for all \(i\in I_1\) and \((f_{1i})_{i\in I_1} \in {\mathfrak {F}}({\mathbb {E}}_1)\).
-
(ii)
\(f_i = (0,f_{2i})\in {\mathbb {E}}\) for all \(i\in I_2\) and \((f_{2i})_{i\in I_2} \in {\mathfrak {F}}({\mathbb {E}}_2)\).
Proof
We first prove the “if” part. Because \((f_1,\ldots ,f_{{{\,\textrm{rk}\,}}_1+{{\,\textrm{rk}\,}}_2})\) satisfying the two conditions is evidently a complete system of orthogonal idempotents, we prove only that each \(f_i\) is primitive. To prove this, showing that \((f,0)\in {\mathbb {E}}\) is primitive if \(f\in {\mathbb {E}}_1\) is primitive is sufficient. Assume that (f, 0) can be written as the sum of two idempotents \((f_1,f_2)\) and \((g_1,g_2)\), i.e.,
\[ (f,0) = (f_1,f_2) + (g_1,g_2). \tag{3} \]
First, (3) implies that \(f_2 = -g_2\). In addition, we note that \(f_1\), \(g_1\), \(f_2\), and \(g_2\) are idempotents because \((f_1,f_2)\) and \((g_1,g_2)\) are idempotents. Therefore, \(g_2 = g_2\circ _2 g_2 = (-f_2)\circ _2(-f_2) = f_2\circ _2 f_2 = f_2 = -g_2\), which yields \(f_2 = g_2 = 0\). Second, (3) implies again that \(f = f_1 + g_1\). As \(f_1\) and \(g_1\) are idempotents and f is primitive, either \(f_1\) or \(g_1\) must be 0. We can assume that \(f_1 = 0\) without loss of generality. We then obtain \((f_1,f_2) = 0\), which implies that (f, 0) is primitive.
We now prove the “only if” part. Let \((f_1,\ldots ,f_{{{\,\textrm{rk}\,}}_1 + {{\,\textrm{rk}\,}}_2}) \in {\mathfrak {F}}({\mathbb {E}})\) and set \(f_i = (f_{1i},f_{2i})\) for each i. Then, each \(f_i = (f_{1i},f_{2i})\) falls into exactly one of the two cases in Lemma 2.5; thus, we define
\[ I_1 :=\{i\mid f_i\ \text {satisfies (a) of Lemma 2.5}\},\quad I_2 :=\{i\mid f_i\ \text {satisfies (b) of Lemma 2.5}\}. \]
Evidently, \((I_1,I_2)\) is a partition of \(\{1,\ldots ,{{\,\textrm{rk}\,}}_1 + {{\,\textrm{rk}\,}}_2\}\). In the following, we show that \((f_{1i})_{i\in I_1}\in {\mathfrak {F}}({\mathbb {E}}_1)\) and \((f_{2i})_{i\in I_2}\in {\mathfrak {F}}({\mathbb {E}}_2)\). From the assumption on \((f_1,\ldots ,f_{{{\,\textrm{rk}\,}}_1 + {{\,\textrm{rk}\,}}_2})\), it follows that
\[ (e_{{\mathbb {E}}_1},e_{{\mathbb {E}}_2}) = e = \sum _{i=1}^{{{\,\textrm{rk}\,}}_1+{{\,\textrm{rk}\,}}_2}f_i = \left( \sum _{i\in I_1}f_{1i},\sum _{i\in I_2}f_{2i}\right) , \]
from which we obtain \(\sum _{i\in I_1}f_{1i} = e_{{\mathbb {E}}_1}\) and \(\sum _{i\in I_2}f_{2i} = e_{{\mathbb {E}}_2}\). In addition, it follows that \(0 = f_i \circ f_j = (f_{1i}\circ _1 f_{1j},f_{2i}\circ _2 f_{2j})\) for any \(i\ne j\). In particular, we have \(f_{1i}\circ _1 f_{1j} = 0\) for all \(i\ne j\in I_1\) and \(f_{2i}\circ _2 f_{2j} = 0\) for all \(i\ne j\in I_2\). \(\square \)
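Proposition 2.6 can be checked numerically on the direct product of a nonnegative orthant and a second-order cone, which is the setting used in Sect. 4. The sketch below is illustrative only: it assumes that a vector of length n + k stores the orthant block first and the second-order cone block second, and the helper names are hypothetical.

```python
import numpy as np


def soc_product(x, y):
    """Jordan product of the second-order cone block (Example 2.3)."""
    return np.concatenate(([x @ y], x[0] * y[1:] + y[0] * x[1:]))


def jordan_product(x, y, n):
    """Blockwise product on R^n x R^k: orthant block first, second-order cone block second."""
    return np.concatenate([x[:n] * y[:n], soc_product(x[n:], y[n:])])


def product_frame(n, v):
    """Ordered Jordan frame built as in Proposition 2.6 from a direction v in R^{k-1}."""
    v = np.asarray(v, dtype=float)
    v = v / np.linalg.norm(v)
    k = len(v) + 1
    frame = []
    for i in range(n):  # elements (e_i, 0) coming from the orthant
        f = np.zeros(n + k)
        f[i] = 1.0
        frame.append(f)
    for s in (1.0, -1.0):  # elements (0, (1, +-v) / 2) coming from the second-order cone
        f = np.zeros(n + k)
        f[n] = 0.5
        f[n + 1:] = 0.5 * s * v
        frame.append(f)
    return frame


frame = product_frame(2, [3.0, 4.0])
identity = np.array([1.0, 1.0, 1.0, 0.0, 0.0])  # (e_{E_1}, e_{E_2})
```

The frame has \({{\,\textrm{rk}\,}}_1 + {{\,\textrm{rk}\,}}_2 = 2 + 2 = 4\) elements, and its elements sum to the identity element.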
2.3 Symmetric tensor space
Let \(({\mathbb {V}},(\cdot ,\cdot ))\) be an n-dimensional real inner product space. Note that \({\mathbb {V}}\) can be identified with the dual space \({{\,\textrm{Hom}\,}}({\mathbb {V}},{\mathbb {R}})\) by the natural isomorphism \(x\mapsto (x,\cdot )\). We use
\[ {\mathbb {V}}^{\otimes m} :=\underbrace{{\mathbb {V}}\otimes \cdots \otimes {\mathbb {V}}}_{m} \]
to denote the tensor space of order m over \({\mathbb {V}}\).
Let \(v_1,\ldots ,v_n\) be a basis for \({\mathbb {V}}\). Then, \({\tilde{v}}_{i_1\cdots i_m} :=v_{i_1}\otimes \cdots \otimes v_{i_m}\ ((i_1,\ldots ,i_m)\in {\mathbb {N}}_n^m)\) form a basis for \({\mathbb {V}}^{\otimes m}\). That is, each \({\mathcal {A}}\in {\mathbb {V}}^{\otimes m}\) can be written in the following form:
\[ {\mathcal {A}} = \sum _{(i_1,\ldots ,i_m)\in {\mathbb {N}}_n^m}{\mathcal {A}}_{i_1\cdots i_m}{\tilde{v}}_{i_1\cdots i_m} \tag{4} \]
with coefficients \({\mathcal {A}}_{i_1\cdots i_m}\in {\mathbb {R}}\). For \(\sigma \in {\mathfrak {S}}_m\), the linear transformation \(\pi _{\sigma }\) on \({\mathbb {V}}^{\otimes m}\) is defined by \(\pi _{\sigma }({\tilde{v}}_{i_1\cdots i_m}) :={\tilde{v}}_{i_{\sigma (1)}\cdots i_{\sigma (m)}}\). The definition of \(\pi _{\sigma }\) does not depend on the choice of the basis for \({\mathbb {V}}\). Then,
\[ {\mathcal {S}}^{n,m}({\mathbb {V}}) :=\{{\mathcal {A}}\in {\mathbb {V}}^{\otimes m}\mid \pi _{\sigma }({\mathcal {A}}) = {\mathcal {A}}\ \text {for all}\ \sigma \in {\mathfrak {S}}_m\} \]
denotes the symmetric tensor space of order m over \({\mathbb {V}}\), which is a subspace of \({\mathbb {V}}^{\otimes m}\). Note that the symmetric tensor \({\mathcal {A}}\in {\mathcal {S}}^{n,m}({\mathbb {V}})\) with the form (4) depends only on the coefficients in the form of \({\mathcal {A}}_{[\varvec{\alpha }]}\) \((\varvec{\alpha }\in {\mathbb {I}}_{=m}^n)\). Let \({\mathscr {S}}\) be the linear transformation on \({\mathbb {V}}^{\otimes m}\) defined as
\[ {\mathscr {S}}{\mathcal {A}} :=\frac{1}{m!}\sum _{\sigma \in {\mathfrak {S}}_m}\pi _{\sigma }({\mathcal {A}}). \]
Then, \({\mathscr {S}}{\mathcal {A}} \in {\mathcal {S}}^{n,m}({\mathbb {V}})\) for each \({\mathcal {A}}\in {\mathbb {V}}^{\otimes m}\) and \({\mathscr {S}}{\tilde{v}}_{[\varvec{\alpha }]}\) \((\varvec{\alpha }\in {\mathbb {I}}_{=m}^n)\) form a basis for \({\mathcal {S}}^{n,m}({\mathbb {V}})\). For \(x\in {\mathbb {V}}\), let
\[ x^{\otimes m} :=\underbrace{x\otimes \cdots \otimes x}_{m}\in {\mathcal {S}}^{n,m}({\mathbb {V}}). \]
In particular, we consider the case of \({\mathbb {V}} = {\mathbb {R}}^n\) with the canonical basis \(\varvec{e}_1,\ldots ,\varvec{e}_n\) and write \({\mathcal {S}}^{n,m}({\mathbb {R}}^n)\) as \({\mathcal {S}}^{n,m}\). Then, each element \({\mathcal {A}}\in {\mathcal {S}}^{n,m}\) can be considered a multi-dimensional array; thus, we write \({\mathcal {A}}_{i_1\cdots i_m}\) for the \((i_1,\ldots ,i_m)\)th element of \({\mathcal {A}}\). Note that the symmetric tensor space \({\mathcal {S}}^{n,2}\) of order two equals the space \({\mathbb {S}}^n\) of the symmetric matrices.
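For \({\mathbb {V}} = {\mathbb {R}}^n\), the symmetrization operator and the rank-one tensor \(\varvec{x}^{\otimes m}\) can be sketched with NumPy arrays. The code assumes that \({\mathscr {S}}\) averages a tensor over all permutations of its m axes and that the inner product is the entrywise one; the function names are hypothetical. It also checks numerically that \(\langle {\mathscr {S}}{\mathcal {A}},{\mathcal {B}}\rangle = \langle {\mathcal {A}},{\mathcal {B}}\rangle \) for a symmetric \({\mathcal {B}}\).

```python
import math
from itertools import permutations

import numpy as np


def symmetrize(A):
    """Average of the array A over all permutations of its m axes."""
    m = A.ndim
    return sum(np.transpose(A, p) for p in permutations(range(m))) / math.factorial(m)


def rank_one(x, m):
    """m-fold outer product of the vector x with itself."""
    T = np.asarray(x, dtype=float)
    for _ in range(m - 1):
        T = np.multiply.outer(T, x)
    return T


rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3, 3))        # a generic tensor of order 3
B = rank_one(rng.normal(size=3), 3)   # a symmetric tensor of order 3
lhs = np.sum(symmetrize(A) * B)       # inner product of the symmetrized A with B
rhs = np.sum(A * B)                   # inner product of A with B
```

Because the number of axis permutations is m!, this brute-force average is only practical for small orders m.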
Let \(\langle \cdot ,\cdot \rangle \) be the inner product on \({\mathbb {V}}^{\otimes m}\) induced by that on \({\mathbb {V}}\). That is, it satisfies \(\langle x_1\otimes \cdots \otimes x_m,y_1\otimes \cdots \otimes y_m\rangle = \prod _{i=1}^m(x_i,y_i)\) for \(x_1\otimes \cdots \otimes x_m\), \(y_1\otimes \cdots \otimes y_m\in {\mathbb {V}}^{\otimes m}\). We write \(\Vert \cdot \Vert _{\textrm{F}}\) for the norm on \({\mathbb {V}}^{\otimes m}\) induced by the inner product \(\langle \cdot ,\cdot \rangle \).
Using the inner product, we note that \({\mathcal {S}}^{n,m}({\mathbb {V}})\) and \(H^{n,m}\) are linearly isomorphic. Indeed, let \(\phi \in {{\,\textrm{Hom}\,}}({\mathbb {V}},{\mathbb {R}}^n)\) be the linear isomorphism induced by the basis \(v_1,\ldots ,v_n\). Then, the mapping \(\psi :{\mathcal {S}}^{n,m}({\mathbb {V}})\rightarrow H^{n,m}\) defined by
\[ \psi ({\mathcal {A}})(\varvec{x}) :=\langle {\mathcal {A}},(\phi ^{-1}(\varvec{x}))^{\otimes m}\rangle \quad (\varvec{x}\in {\mathbb {R}}^n) \tag{5} \]
is a linear isomorphism.
In the following proofs, for convenience, we fix an orthonormal basis \(v_1,\ldots ,v_n\) for \({\mathbb {V}}\) arbitrarily. The following lemma describes a property of the inner product on the (symmetric) tensor space.
Lemma 2.7
For \({\mathcal {A}}\in {\mathbb {V}}^{\otimes m}\) and \({\mathcal {B}}\in {\mathcal {S}}^{n,m}({\mathbb {V}})\), it follows that \(\langle {\mathscr {S}}{\mathcal {A}},{\mathcal {B}}\rangle = \langle {\mathcal {A}},{\mathcal {B}}\rangle \).
Proof
Using the orthonormal basis \(v_1,\ldots ,v_n\) for \({\mathbb {V}}\), we write \({\mathcal {A}}\) and \({\mathcal {B}}\) in the form (4). It then follows from the symmetry of \({\mathcal {B}}\) that
\[ \langle {\mathscr {S}}{\mathcal {A}},{\mathcal {B}}\rangle = \frac{1}{m!}\sum _{\sigma \in {\mathfrak {S}}_m}\sum _{(i_1,\ldots ,i_m)\in {\mathbb {N}}_n^m}{\mathcal {A}}_{i_1\cdots i_m}{\mathcal {B}}_{i_{\sigma (1)}\cdots i_{\sigma (m)}} = \frac{1}{m!}\sum _{\sigma \in {\mathfrak {S}}_m}\sum _{(i_1,\ldots ,i_m)\in {\mathbb {N}}_n^m}{\mathcal {A}}_{i_1\cdots i_m}{\mathcal {B}}_{i_1\cdots i_m} = \langle {\mathcal {A}},{\mathcal {B}}\rangle . \]
\(\square \)
Consider the case of \({\mathbb {V}} = {\mathbb {R}}^n\) with the canonical basis \(\varvec{e}_1,\ldots ,\varvec{e}_n\). The following lemma provides an orthonormal basis for the symmetric tensor space \({\mathcal {S}}^{n,m}\) and the representation of the elements of \({\mathcal {S}}^{n,m}\) with the orthonormal basis.
Lemma 2.8
We now define \({\mathcal {F}}_{\varvec{\alpha }} :=\sqrt{m!/\varvec{\alpha }!}{\mathscr {S}}\tilde{\varvec{e}}_{[\varvec{\alpha }]}\) for each \(\varvec{\alpha }\in {\mathbb {I}}_{=m}^n\). Then, \(({\mathcal {F}}_{\varvec{\alpha }})_{\varvec{\alpha }\in {\mathbb {I}}_{=m}^n}\) is an orthonormal basis for \({\mathcal {S}}^{n,m}\). In addition, using the orthonormal basis, we can represent each \({\mathcal {A}}\in {\mathcal {S}}^{n,m}\) as \({\mathcal {A}} = \sum _{\varvec{\alpha }\in {\mathbb {I}}_{=m}^n}\sqrt{m !/\varvec{\alpha }!}{\mathcal {A}}_{[\varvec{\alpha }]}{\mathcal {F}}_{\varvec{\alpha }}\).
Proof
Note that \(\dim {\mathcal {S}}^{n,m} = |{\mathbb {I}}_{=m}^n|\). Because the linear independence of \(({\mathcal {F}}_{\varvec{\alpha }})_{\varvec{\alpha }\in {\mathbb {I}}_{=m}^n}\) is clear, showing that it is orthonormal is sufficient. Let \(\varvec{\alpha },\varvec{\beta }\in {\mathbb {I}}_{=m}^n\). We first consider the case of \(\varvec{\alpha } \ne \varvec{\beta }\). Let \((i_1,\ldots ,i_m) :=[\varvec{\alpha }]\) and \((j_1,\ldots ,j_m):=[\varvec{\beta }]\). Then, as \((i_1,\ldots ,i_m) \ne (j_1,\ldots ,j_m)\), there exists \(k_0 \in \{1,\ldots ,m\}\) such that \(i_{k_0} \ne j_{k_0}\), which implies that the number of values \(i_{k_0}\) in the vector \((i_1,\ldots ,i_m)\) is not equal to that in the vector \((j_1,\ldots ,j_m)\). Using Lemma 2.7, we then have
\[ \langle {\mathcal {F}}_{\varvec{\alpha }},{\mathcal {F}}_{\varvec{\beta }}\rangle = \sqrt{\frac{m!}{\varvec{\alpha }!}}\sqrt{\frac{m!}{\varvec{\beta }!}}\langle \tilde{\varvec{e}}_{[\varvec{\alpha }]},{\mathscr {S}}\tilde{\varvec{e}}_{[\varvec{\beta }]}\rangle = \sqrt{\frac{m!}{\varvec{\alpha }!}}\sqrt{\frac{m!}{\varvec{\beta }!}}\frac{1}{m!}\sum _{\sigma \in {\mathfrak {S}}_m}\langle \tilde{\varvec{e}}_{i_1\cdots i_m},\tilde{\varvec{e}}_{j_{\sigma (1)}\cdots j_{\sigma (m)}}\rangle = 0. \]
Second, we consider the case of \(\varvec{\alpha } = \varvec{\beta }\). Let \((i_1,\ldots ,i_m) :=[\varvec{\alpha }]\). Then,
\[ \langle {\mathcal {F}}_{\varvec{\alpha }},{\mathcal {F}}_{\varvec{\alpha }}\rangle = \frac{m!}{\varvec{\alpha }!}\langle \tilde{\varvec{e}}_{[\varvec{\alpha }]},{\mathscr {S}}\tilde{\varvec{e}}_{[\varvec{\alpha }]}\rangle = \frac{1}{\varvec{\alpha }!}\left| \{\sigma \in {\mathfrak {S}}_m\mid (i_{\sigma (1)},\ldots ,i_{\sigma (m)}) = (i_1,\ldots ,i_m)\}\right| = 1. \]
Therefore, \(({\mathcal {F}}_{\varvec{\alpha }})_{\varvec{\alpha }\in {\mathbb {I}}_{=m}^n}\) is an orthonormal basis for \({\mathcal {S}}^{n,m}\).
In addition, for \(\varvec{\alpha }\in {\mathbb {I}}^n_{=m}\), let
From the symmetry of \({\mathcal {A}}\), it then follows that
\(\square \)
For convenience, we write the coefficients of \({\mathcal {A}}\in {\mathcal {S}}^{n,m}\) with respect to the orthonormal basis \(({\mathcal {F}}_{\varvec{\alpha }})_{\varvec{\alpha }\in {\mathbb {I}}_{=m}^n}\) taken in Lemma 2.8 as \(({\mathcal {A}}_{\varvec{\alpha }}^{{\mathcal {F}}})_{\varvec{\alpha }\in {\mathbb {I}}_{=m}^n}\). That is, each \({\mathcal {A}}\in {\mathcal {S}}^{n,m}\) is written as
\[ {\mathcal {A}} = \sum _{\varvec{\alpha }\in {\mathbb {I}}_{=m}^n}{\mathcal {A}}_{\varvec{\alpha }}^{{\mathcal {F}}}{\mathcal {F}}_{\varvec{\alpha }}. \tag{6} \]
Because \({\mathcal {S}}^{n,m}\) and \(H^{n,m}\) are linearly isomorphic by the mapping (5), for each \({\mathcal {A}}\in {\mathcal {S}}^{n,m}\), there exists \(\theta \in H^{n,m}\) such that \(\langle {\mathcal {A}},\varvec{x}^{\otimes m}\rangle = \theta (\varvec{x})\). The following lemma links the coefficients \(({\mathcal {A}}_{\varvec{\alpha }}^{{\mathcal {F}}})_{\varvec{\alpha }\in {\mathbb {I}}_{=m}^n}\) with the coefficients of \(\theta \).
Lemma 2.9
Suppose that \({\mathcal {A}}\in {\mathcal {S}}^{n,m}\) and \((\theta _{\varvec{\alpha }})_{\varvec{\alpha }\in {\mathbb {I}}_{=m}^n} \in {\mathbb {R}}^{{\mathbb {I}}_{=m}^n}\) satisfy
\[ \langle {\mathcal {A}},\varvec{x}^{\otimes m}\rangle = \sum _{\varvec{\alpha }\in {\mathbb {I}}_{=m}^n}\theta _{\varvec{\alpha }}\varvec{x}^{\varvec{\alpha }}\quad \text {for all}\ \varvec{x}\in {\mathbb {R}}^n. \]
Then, \({\mathcal {A}}_{\varvec{\alpha }}^{{\mathcal {F}}} = \sqrt{\varvec{\alpha }!/m!}\theta _{\varvec{\alpha }}\) for all \(\varvec{\alpha }\in {\mathbb {I}}_{=m}^n\).
Proof
Let \({\mathcal {A}}\) be in the form (6). It then follows from Lemma 2.7 that
\[ \langle {\mathcal {A}},\varvec{x}^{\otimes m}\rangle = \sum _{\varvec{\alpha }\in {\mathbb {I}}_{=m}^n}\sqrt{\frac{m!}{\varvec{\alpha }!}}{\mathcal {A}}_{\varvec{\alpha }}^{{\mathcal {F}}}\langle {\mathscr {S}}\tilde{\varvec{e}}_{[\varvec{\alpha }]},\varvec{x}^{\otimes m}\rangle = \sum _{\varvec{\alpha }\in {\mathbb {I}}_{=m}^n}\sqrt{\frac{m!}{\varvec{\alpha }!}}{\mathcal {A}}_{\varvec{\alpha }}^{{\mathcal {F}}}\langle \tilde{\varvec{e}}_{[\varvec{\alpha }]},\varvec{x}^{\otimes m}\rangle = \sum _{\varvec{\alpha }\in {\mathbb {I}}_{=m}^n}\sqrt{\frac{m!}{\varvec{\alpha }!}}{\mathcal {A}}_{\varvec{\alpha }}^{{\mathcal {F}}}\varvec{x}^{\varvec{\alpha }}. \]
As this equals \(\sum _{\varvec{\alpha }\in {\mathbb {I}}_{=m}^n}\theta _{\varvec{\alpha }} \varvec{x}^{\varvec{\alpha }}\) and by comparing the coefficients, we obtain the desired result. \(\square \)
Lemmas 2.8 and 2.9 will be used to prove the technical result (Lemma 3.8) in Sect. 3.3.
2.4 Copositive and completely positive cones
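Let \({\mathbb {K}}\) be a closed cone in an n-dimensional real inner product space \(({\mathbb {V}},(\cdot ,\cdot ))\). Then, we define
\[\mathcal {COP}^{n,m}({\mathbb {K}}) :=\{{\mathcal {A}}\in {\mathcal {S}}^{n,m}({\mathbb {V}}) \mid \langle {\mathcal {A}},x^{\otimes m}\rangle \ge 0 \text{ for all } x\in {\mathbb {K}}\},\qquad \mathcal{C}\mathcal{P}^{n,m}({\mathbb {K}}) :={{\,\textrm{conv}\,}}\{x^{\otimes m}\mid x\in {\mathbb {K}}\}\]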
and call them the COP and CP cones (over \({\mathbb {K}}\)), respectively. In the case in which \({\mathbb {V}} = {\mathbb {R}}^n\) and \({\mathbb {K}} = {\mathbb {R}}_+^n\), the COP and CP cones reduce to the COP tensor cone, written as \(\mathcal {COP}^{n,m}\) and the CP tensor cone given by, for example, [45, 47]. In the case of \(m = 2\) and \({\mathbb {V}} = {\mathbb {R}}^n\), under the identification between \({\mathcal {S}}^{n,2}\) and \({\mathbb {S}}^n\), we have
which are the COP and CP matrix cones [22], respectively. Because we consider both the tensor and matrix cases, we generally omit the terms “tensor” and “matrix”.
We now discuss the duality between \(\mathcal {COP}^{n,m}({\mathbb {K}})\) and \(\mathcal{C}\mathcal{P}^{n,m}({\mathbb {K}})\).
Proposition 2.10
Let \({\mathbb {K}}\) be a closed cone in \({\mathbb {V}}\).
-
(i)
\(\mathcal {COP}^{n,m}({\mathbb {K}}) = \mathcal{C}\mathcal{P}^{n,m}({\mathbb {K}})^*\).
-
(ii)
If \({\mathbb {K}}\) is also pointed and convex, then \(\mathcal{C}\mathcal{P}^{n,m}({\mathbb {K}})\) is a closed convex cone.
Proof
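We first prove (i). Let \({\mathcal {A}} \in \mathcal {COP}^{n,m}({\mathbb {K}})\). Then, for any \(x_i\in {\mathbb {K}}\) and \(\lambda _i\ge 0\) such that \(\sum _{i}\lambda _i = 1\), we have
\[\Bigl\langle {\mathcal {A}},\sum _{i}\lambda _ix_i^{\otimes m}\Bigr\rangle = \sum _{i}\lambda _i\langle {\mathcal {A}},x_i^{\otimes m}\rangle . \tag{8}\]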
As \(\langle {\mathcal {A}},x_i^{\otimes m}\rangle \) and \(\lambda _i\) are nonnegative for all i, (8) is also nonnegative, which implies that \({\mathcal {A}} \in \mathcal{C}\mathcal{P}^{n,m}({\mathbb {K}})^*\). Conversely, let \({\mathcal {A}} \in \mathcal{C}\mathcal{P}^{n,m}({\mathbb {K}})^*\). For any \(x\in {\mathbb {K}}\), because \(x^{\otimes m} \in \mathcal{C}\mathcal{P}^{n,m}({\mathbb {K}})\), \(\langle {\mathcal {A}},x^{\otimes m}\rangle \ge 0\) follows from the definition of dual cones. Therefore, we obtain \({\mathcal {A}}\in \mathcal {COP}^{n,m}({\mathbb {K}})\).
We now prove (ii). The convexity of \(\mathcal{C}\mathcal{P}^{n,m}({\mathbb {K}})\) follows from its definition. In addition, \(\mathcal{C}\mathcal{P}^{n,m}({\mathbb {K}})\) is a cone because \({\mathbb {K}}\) is a cone. To prove the closedness, let \(\{{\mathcal {A}}_k\}_k \subseteq \mathcal{C}\mathcal{P}^{n,m}({\mathbb {K}})\) and suppose it converges to some \({\mathcal {A}}_{\infty }\in {\mathcal {S}}^{n,m}({\mathbb {V}})\). Note that \(\mathcal{C}\mathcal{P}^{n,m}({\mathbb {K}})\) can be represented as \(\mathcal{C}\mathcal{P}^{n,m}({\mathbb {K}}) = {{\,\textrm{cone}\,}}\{x^{\otimes m}\mid x\in {\mathbb {K}}\}\) as it is a convex cone containing zero (the origin). Therefore, by Carathéodory’s theorem for cones [4, Exercise B.1.7], every element of \(\mathcal{C}\mathcal{P}^{n,m}({\mathbb {K}})\) can be written as the sum of at most \(d :=\dim {\mathcal {S}}^{n,m}({\mathbb {V}})\) elements in the form of \(x^{\otimes m}\) with \(x\in {\mathbb {K}}\); for every k, there exist \(x_{ki}\) with \(x_{ki}\in {\mathbb {K}}\) \((i = 1,\ldots ,d)\) such that \({\mathcal {A}}_k = \sum _{i=1}^dx_{ki}^{\otimes m}\). As \({{\,\textrm{int}\,}}({\mathbb {K}}^*)\) is nonempty under the assumption on \({\mathbb {K}}\), we take \(a\in {{\,\textrm{int}\,}}({\mathbb {K}}^*)\) arbitrarily. Then, we obtain
Given that \(x_{ki}\in {\mathbb {K}}\), \((x_{ki},a) \ge 0\) for any k and i. Therefore, \(\{(x_{ki},a)\}_k\) is bounded for each i.
Then, \(\{x_{ki}\}_k\) is bounded for each i. To observe this, let \(\{k(l) \mid l\in {\mathbb {N}}\}\) denote the set of indices k such that \(x_{ki} \ne 0\). Showing the boundedness of \(\{x_{k(l)i}\}_l\) is sufficient. Let \(B :={\mathbb {K}} \cap \{y\in {\mathbb {V}}\mid (y,a) = 1\}\). Note that B is compact. Then, for each l, there exist \(\alpha _{li}> 0\) and \(y_{li}\in B\) such that \(x_{k(l)i} = \alpha _{li}y_{li}\). Given that \((x_{k(l)i},a) = \alpha _{li}\), the sequence \(\{\alpha _{li}\}_l\) is bounded. Combining it with the boundedness of \(\{y_{li}\}_l \subseteq B\) leads to the boundedness of \(\{x_{k(l)i}\}_l = \{\alpha _{li}y_{li}\}_l\).
Thus, by taking a subsequence, if necessary, we assume that \(\{x_{ki}\}_k\) converges to some \(x_{\infty i}\) for each i. The closedness of \({\mathbb {K}}\) implies that \(x_{\infty i}\in {\mathbb {K}}\). Therefore,
which means that \(\mathcal{C}\mathcal{P}^{n,m}({\mathbb {K}})\) is closed. \(\square \)
Corollary 2.11
Let \({\mathbb {K}}\) be a pointed closed convex cone. Then, \(\mathcal {COP}^{n,m}({\mathbb {K}})\) and \(\mathcal{C}\mathcal{P}^{n,m}({\mathbb {K}})\) are dual to each other.
Proof
It follows from (ii) in Proposition 2.10 that \(\mathcal{C}\mathcal{P}^{n,m}({\mathbb {K}})\) is a closed convex cone. Taking the dual of (i) in Proposition 2.10 and using Theorem 2.1, we obtain \(\mathcal {COP}^{n,m}({\mathbb {K}})^* = \mathcal{C}\mathcal{P}^{n,m}({\mathbb {K}})\). \(\square \)
In this study, we focus only on the case in which \({\mathbb {K}}\) is a symmetric cone, which is a pointed closed convex cone. In this case, Corollary 2.11 is applicable to \(\mathcal {COP}^{n,m}({\mathbb {K}})\) and \(\mathcal{C}\mathcal{P}^{n,m}({\mathbb {K}})\).
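The inequality behind Proposition 2.10 (i) can be checked numerically in a small case. The following is a minimal sketch, assuming \(m = 2\) and \({\mathbb {K}} = {\mathbb {R}}_+^3\); the matrix `M`, the sample points, and the helper `form_value` are illustrative choices and do not appear in the text.

```python
import itertools
import math
import random


def form_value(A, x, m):
    """Evaluate <A, x^{(x) m}> = sum over (i1..im) of A[i1..im] * x[i1] * ... * x[im]."""
    n = len(x)
    return sum(A[idx] * math.prod(x[i] for i in idx)
               for idx in itertools.product(range(n), repeat=m))


n, m = 3, 2
# Symmetric matrix with nonnegative entries: copositive over R^3_+, yet not PSD.
M = [[0.0, 1.0, 2.0],
     [1.0, 0.0, 3.0],
     [2.0, 3.0, 0.0]]
A = {idx: M[idx[0]][idx[1]] for idx in itertools.product(range(n), repeat=m)}

random.seed(0)
points = [[random.random() for _ in range(n)] for _ in range(5)]  # x_i in R^3_+
raw = [random.random() for _ in points]
weights = [w / sum(raw) for w in raw]                             # lambda_i >= 0, sum = 1

# By linearity, <A, sum_i lambda_i x_i^{(x) m}> = sum_i lambda_i <A, x_i^{(x) m}>.
pairing = sum(w * form_value(A, x, m) for w, x in zip(weights, points))
print("pairing with a CP element is nonnegative:", pairing >= 0.0)
print("value at a point outside the orthant:", form_value(A, [1.0, -1.0, 0.0], m))
```

The last line evaluates the form at a point outside \({\mathbb {R}}_+^3\), where it is negative; this shows that the chosen matrix is copositive without being positive semidefinite.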
2.5 Homogeneous polynomial function on inner product space
Let \(({\mathbb {V}},(\cdot ,\cdot ))\) be an n-dimensional real inner product space. A homogeneous polynomial function of degree m on \({\mathbb {V}}\) is the mapping \({\mathbb {V}}\ni x \mapsto \langle {\mathcal {A}},x^{\otimes m}\rangle \) for some \({\mathcal {A}}\in {\mathcal {S}}^{n,m}({\mathbb {V}})\). \(H^{n,m}({\mathbb {V}})\) denotes the set of homogeneous polynomial functions of degree m on \({\mathbb {V}}\), i.e., \(H^{n,m}({\mathbb {V}}) = \{\langle {\mathcal {A}},x^{\otimes m}\rangle \mid {\mathcal {A}}\in {\mathcal {S}}^{n,m}({\mathbb {V}})\}\). As \(H^{n,m} = \{\langle {\mathcal {A}},\varvec{x}^{\otimes m}\rangle \mid {\mathcal {A}}\in {\mathcal {S}}^{n,m}\}\), \(H^{n,m}({\mathbb {R}}^n)\) agrees with \(H^{n,m}\).
Analogously to the definition of \(\Sigma ^{n,2m}\), we let \(\Sigma ^{n,2m}({\mathbb {V}})\) denote the set of sums of squares of homogeneous polynomial functions of degree m on \({\mathbb {V}}\). To represent the set \(\Sigma ^{n,2m}({\mathbb {V}})\) more explicitly, we prove the following lemma.
Lemma 2.12
\(\langle {\mathcal {A}},x^{\otimes m}\rangle ^2 = \langle {\mathscr {S}}({\mathcal {A}}\otimes {\mathcal {A}}),x^{\otimes 2m}\rangle \) for any \({\mathcal {A}}\in {\mathcal {S}}^{n,m}({\mathbb {V}})\).
Proof
Fix an orthonormal basis \(v_1,\ldots ,v_n\) in \({\mathbb {V}}\) arbitrarily. In addition, using the basis, we write \({\mathcal {A}}\in {\mathcal {S}}^{n,m}({\mathbb {V}})\) in the form (4). Given that
we have
Moreover, as \({\mathcal {A}}\otimes {\mathcal {A}} = \sum _{i_1,\ldots ,i_{2m}=1}^n{\mathcal {A}}_{i_1\cdots i_m}{\mathcal {A}}_{i_{m+1}\cdots i_{2m}}{\tilde{v}}_{i_1\cdots i_{2m}}\), \(\langle {\mathcal {A}}\otimes {\mathcal {A}},x^{\otimes 2m}\rangle \) agrees with (9). Therefore, by applying Lemma 2.7, we obtain the desired result. \(\square \)
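Lemma 2.12 can be verified numerically in a small case. The sketch below assumes \({\mathbb {V}} = {\mathbb {R}}^2\) with the standard basis and \(m = 2\), stores tensors as dictionaries indexed by index tuples, and takes \({\mathscr {S}}\) to be the average over all permutations of the indices; the helper names are hypothetical.

```python
import itertools
import math


def pairing_with_power(T, x, order):
    """Evaluate <T, x^{(x) order}> for a tensor T stored as {index tuple: value}."""
    n = len(x)
    return sum(T[idx] * math.prod(x[i] for i in idx)
               for idx in itertools.product(range(n), repeat=order))


def symmetrize(T, n, order):
    """Average T over all permutations of its indices."""
    perms = list(itertools.permutations(range(order)))
    return {idx: sum(T[tuple(idx[p] for p in perm)] for perm in perms) / len(perms)
            for idx in itertools.product(range(n), repeat=order)}


n, m = 2, 2
A = {(0, 0): 1.0, (0, 1): -2.0, (1, 0): -2.0, (1, 1): 3.0}

# (A (x) A)[i1..i2m] = A[i1..im] * A[i(m+1)..i2m]; this tensor is not symmetric in general.
AA = {idx: A[idx[:m]] * A[idx[m:]]
      for idx in itertools.product(range(n), repeat=2 * m)}
SAA = symmetrize(AA, n, 2 * m)

x = [0.5, -1.5]
lhs = pairing_with_power(A, x, m) ** 2      # <A, x^{(x) m}>^2
rhs = pairing_with_power(SAA, x, 2 * m)     # <S(A (x) A), x^{(x) 2m}>
print(abs(lhs - rhs) < 1e-9)
```

In this instance, \({\mathcal {A}}\otimes {\mathcal {A}}\) itself is not symmetric (its entries at \((1,1,2,2)\) and \((1,2,1,2)\) differ), which is why the symmetrization appears in the statement.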
Using Lemma 2.12, we can express \(\Sigma ^{n,2m}({\mathbb {V}})\) as \({{\,\textrm{conv}\,}}\{\langle {\mathscr {S}}({\mathcal {A}}\otimes {\mathcal {A}}),x^{\otimes 2m}\rangle \mid {\mathcal {A}}\in {\mathcal {S}}^{n,m}({\mathbb {V}})\}\). Through the isomorphism \(H^{n,2m}({\mathbb {V}}) \ni \langle {\mathcal {A}},x^{\otimes 2m}\rangle \mapsto {\mathcal {A}}\in {\mathcal {S}}^{n,2m}({\mathbb {V}})\), the set \(\Sigma ^{n,2m}({\mathbb {V}})\) is mapped onto the set \({{\,\textrm{conv}\,}}\{{\mathscr {S}}({\mathcal {A}}\otimes {\mathcal {A}})\mid {\mathcal {A}}\in {\mathcal {S}}^{n,m}({\mathbb {V}})\}\), which we denote by \({{\textrm{SOS}}}^{n,2m}({\mathbb {V}})\). We define \({{\textrm{MOM}}}^{n,2m}({\mathbb {V}}) :={{\textrm{SOS}}}^{n,2m}({\mathbb {V}})^*\) and call it the moment cone. Because \({{\textrm{SOS}}}^{n,2m}({\mathbb {V}})\) is a closed convex cone [9, Lemma 2.2], \({{\textrm{SOS}}}^{n,2m}({\mathbb {V}})\) and \({{\textrm{MOM}}}^{n,2m}({\mathbb {V}})\) are dual to each other.
3 Sum-of-squares-based inner-approximation hierarchy
Let \(({\mathbb {E}},\circ ,\bullet )\) be a Euclidean Jordan algebra of dimension n. In this section, we aim to provide an inner-approximation hierarchy described by an SOS constraint for the COP cone \(\mathcal {COP}^{n,m}({\mathbb {E}}_+)\). In Sect. 3.1, we first provide an inner-approximation hierarchy for the cone of homogeneous polynomials that are nonnegative over a symmetric cone in \({\mathbb {R}}^n\). Using the results in Sect. 3.1, we provide the desired approximation hierarchy in Sect. 3.2. In Sect. 3.3, we discuss its dual. In what follows, we fix an orthonormal basis \(v_1,\ldots ,v_n\) for the given Euclidean Jordan algebra \(({\mathbb {E}},\circ ,\bullet )\) and let \(\phi :{\mathbb {E}}\rightarrow {\mathbb {R}}^n\) be the associated isometry.
3.1 Approximation hierarchy for homogeneous polynomials
If we define \(\varvec{x}\lozenge \varvec{y} :=\phi (\phi ^{-1}(\varvec{x})\circ \phi ^{-1}(\varvec{y}))\) and \(\varvec{x}\blacklozenge \varvec{y} :=\varvec{x}^\top \varvec{y}\) for \(\varvec{x},\varvec{y}\in {\mathbb {R}}^n\), then \(({\mathbb {R}}^n,\lozenge ,\blacklozenge )\) is also a Euclidean Jordan algebra. Hereafter, to emphasize that \({\mathbb {R}}^n\) is a Euclidean Jordan algebra and \(({\mathbb {R}}^n,\lozenge ,\blacklozenge )\) depends on the choice of \(\phi \), we write the Euclidean Jordan algebra \({\mathbb {R}}^n\) as \(\phi ({\mathbb {E}})\). We note that the symmetric cone \(\phi ({\mathbb {E}})_+\) associated with the Euclidean Jordan algebra \((\phi ({\mathbb {E}}),\lozenge ,\blacklozenge )\) satisfies
We illustrate (10), the isometry \(\phi \), and the product \(\lozenge \) using the Euclidean Jordan algebra introduced in Example 2.4.
Example 3.1
Let \(({\mathbb {S}}^n,\circ ,\bullet )\) be the Euclidean Jordan algebra introduced in Example 2.4. We define the linear mapping \({{\,\textrm{svec}\,}}:{\mathbb {S}}^n\rightarrow {\mathbb {R}}^{\frac{n(n+1)}{2}}\) as
for each \(\varvec{X}\in {\mathbb {S}}^n\). Then, the mapping \({{\,\textrm{svec}\,}}(\cdot )\) is an isometry between \({\mathbb {S}}^n\) and \({\mathbb {R}}^{\frac{n(n+1)}{2}}\); i.e., \(\varvec{X}\bullet \varvec{Y} = {{\,\textrm{svec}\,}}(\varvec{X})^\top {{\,\textrm{svec}\,}}(\varvec{Y})\) holds for all \(\varvec{X},\varvec{Y}\in {\mathbb {S}}^n\). Let \({{\,\textrm{smat}\,}}(\cdot )\) denote the inverse mapping of \({{\,\textrm{svec}\,}}(\cdot )\). If we define \(\varvec{x}\lozenge \varvec{y} :={{\,\textrm{svec}\,}}({{\,\textrm{smat}\,}}(\varvec{x})\circ {{\,\textrm{smat}\,}}(\varvec{y}))\), and \(\varvec{x}\blacklozenge \varvec{y} :=\varvec{x}^\top \varvec{y}\) for \(\varvec{x},\varvec{y}\in {\mathbb {R}}^{\frac{n(n+1)}{2}}\), then \(({\mathbb {R}}^{\frac{n(n+1)}{2}},\lozenge ,\blacklozenge )\) is also a Euclidean Jordan algebra. The symmetric cone associated with the Euclidean Jordan algebra \(({\mathbb {R}}^{\frac{n(n+1)}{2}},\lozenge ,\blacklozenge )\) is \({{\,\textrm{svec}\,}}({\mathbb {S}}_+^n)\).
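The construction in Example 3.1 can be sketched in a few lines. The code below is only an illustration under stated assumptions: it takes \(\varvec{X}\circ \varvec{Y} = (\varvec{X}\varvec{Y}+\varvec{Y}\varvec{X})/2\) and \(\varvec{X}\bullet \varvec{Y} = {{\,\textrm{tr}\,}}(\varvec{X}\varvec{Y})\) for the algebra of Example 2.4, and it stacks the lower triangle column by column with off-diagonal entries scaled by \(\sqrt{2}\), which is one common ordering for \({{\,\textrm{svec}\,}}\) and may differ from the one used here.

```python
import math

SQRT2 = math.sqrt(2.0)


def svec(X):
    """Stack the lower triangle column by column; off-diagonal entries are scaled by sqrt(2)."""
    n = len(X)
    return [X[i][j] if i == j else SQRT2 * X[i][j]
            for j in range(n) for i in range(j, n)]


def smat(v, n):
    """Inverse of svec for n x n symmetric matrices."""
    X = [[0.0] * n for _ in range(n)]
    k = 0
    for j in range(n):
        for i in range(j, n):
            X[i][j] = X[j][i] = v[k] if i == j else v[k] / SQRT2
            k += 1
    return X


def jordan(X, Y):
    """Jordan product (XY + YX) / 2 of two symmetric matrices (assumed form)."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] + Y[i][k] * X[k][j] for k in range(n)) / 2.0
             for j in range(n)] for i in range(n)]


def lozenge(x, y, n):
    """x <> y := svec(smat(x) o smat(y))."""
    return svec(jordan(smat(x, n), smat(y, n)))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def trace_inner(X, Y):
    """trace(XY) = sum_ij X_ij * Y_ij for symmetric X and Y."""
    n = len(X)
    return sum(X[i][j] * Y[i][j] for i in range(n) for j in range(n))


X = [[1.0, 2.0], [2.0, -1.0]]
Y = [[0.0, 1.0], [1.0, 3.0]]
x, y = svec(X), svec(Y)

isometry_gap = abs(trace_inner(X, Y) - dot(x, y))   # svec preserves the inner product
square = smat(lozenge(x, x, 2), 2)                  # matrix of x <> x, i.e. X times X
print(isometry_gap < 1e-12, square)
```

Because \(\varvec{x}\lozenge \varvec{x}\) corresponds to the matrix square of \({{\,\textrm{smat}\,}}(\varvec{x})\), it always lies in \({{\,\textrm{svec}\,}}({\mathbb {S}}_+^n)\), in line with the last sentence of the example.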
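Here, we derive an inner-approximation hierarchy for the cone of homogeneous polynomials that are nonnegative over the symmetric cone \(\phi ({\mathbb {E}}_+)\). Let
\[{\widetilde{\mathcal {COP}}}^{n,m}(\phi ({\mathbb {E}}_+)) :=\{\theta \in H^{n,m} \mid \theta (\varvec{x}) \ge 0 \text{ for all } \varvec{x}\in \phi ({\mathbb {E}}_+)\}\]
be the cone of homogeneous polynomials in n variables of degree m that are nonnegative over the symmetric cone \(\phi ({\mathbb {E}}_+)\). For each \(r\in {\mathbb {N}}\), we define
\[\widetilde{{\mathcal {K}}}_{\textrm{NN},r}^{n,m}(\phi ({\mathbb {E}}_+)) :=\{\theta \in H^{n,m} \mid (\varvec{x}^\top \varvec{x})^r\theta (\varvec{x}\lozenge \varvec{x}) \in \Sigma ^{n,2(r+m)}\}.\]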
Proposition 3.2
Each \(\widetilde{{\mathcal {K}}}_{\textrm{NN},r}^{n,m}(\phi ({\mathbb {E}}_+))\) is a closed convex cone, and the sequence \(\{\widetilde{{\mathcal {K}}}_{\textrm{NN},r}^{n,m}(\phi ({\mathbb {E}}_+))\}_r\) satisfies the following two conditions:
-
(i)
\(\widetilde{{\mathcal {K}}}_{\textrm{NN},r}^{n,m}(\phi ({\mathbb {E}}_+)) \subseteq \widetilde{{\mathcal {K}}}_{\textrm{NN},r+1}^{n,m}(\phi ({\mathbb {E}}_+)) \subseteq {\widetilde{\mathcal {COP}}}^{n,m}(\phi ({\mathbb {E}}_+))\) for all \(r\in {\mathbb {N}}\).
-
(ii)
\({{\,\textrm{int}\,}}{\widetilde{\mathcal {COP}}}^{n,m}(\phi ({\mathbb {E}}_+)) \subseteq \bigcup _{r=0}^{\infty }\widetilde{{\mathcal {K}}}_{\textrm{NN},r}^{n,m}(\phi ({\mathbb {E}}_+))\).
In the following, the notation “\(\widetilde{{\mathcal {K}}}_{\textrm{NN},r}^{n,m}(\phi ({\mathbb {E}}_+)) \uparrow {\widetilde{\mathcal {COP}}}^{n,m}(\phi ({\mathbb {E}}_+))\)” is used to represent the two conditions mentioned in Proposition 3.2. The notation is not limited to the sequence \(\{\widetilde{{\mathcal {K}}}_{\textrm{NN},r}^{n,m}(\phi ({\mathbb {E}}_+))\}_r\). To prove Proposition 3.2, we exploit Reznick’s Positivstellensatz, which is described as follows:
Theorem 3.3
[49, Theorem 3.12] Let \(\theta \in H^{n,2m}\). If \(\theta (\varvec{x}) > 0\) for all \(\varvec{x}\in {\mathbb {R}}^n\setminus \{\varvec{0}\}\), then there exists \(r_0\in {\mathbb {N}}\) such that \((\varvec{x}^\top \varvec{x})^{r_0}\theta (\varvec{x}) \in \Sigma ^{n,2(r_0+m)}\).
Proof of Proposition 3.2
Note that \(\theta (\varvec{x}\lozenge \varvec{x})\in H^{n,2m}\) for each \(\theta \in H^{n,m}\) because the product \(\lozenge \) is bilinear; thus, the set \(\widetilde{{\mathcal {K}}}_{\textrm{NN},r}^{n,m}(\phi ({\mathbb {E}}_+))\) is well-defined for each \(r\in {\mathbb {N}}\). \(\widetilde{{\mathcal {K}}}_{\textrm{NN},r}^{n,m}(\phi ({\mathbb {E}}_+))\) can be shown to be a closed convex cone from the counterpart properties of \(\Sigma ^{n,2(r+m)}\). In the following, we prove \(\widetilde{{\mathcal {K}}}_{\textrm{NN},r}^{n,m}(\phi ({\mathbb {E}}_+)) \uparrow {\widetilde{\mathcal {COP}}}^{n,m}(\phi ({\mathbb {E}}_+))\).
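To prove (i), let \(\theta \in \widetilde{{\mathcal {K}}}_{\textrm{NN},r}^{n,m}(\phi ({\mathbb {E}}_+))\). Then, there exist \(p_1,\ldots ,p_N\in H^{n,r+m}\) such that
\[(\varvec{x}^\top \varvec{x})^r\theta (\varvec{x}\lozenge \varvec{x}) = \sum _{i=1}^Np_i(\varvec{x})^2. \tag{11}\]
Using this, we obtain
\[(\varvec{x}^\top \varvec{x})^{r+1}\theta (\varvec{x}\lozenge \varvec{x}) = \sum _{j=1}^nx_j^2\sum _{i=1}^Np_i(\varvec{x})^2 = \sum _{i=1}^N\sum _{j=1}^n(x_jp_i(\varvec{x}))^2 \in \Sigma ^{n,2(r+1+m)},\]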
which means that \(\theta \in \widetilde{{\mathcal {K}}}_{\textrm{NN},r+1}^{n,m}(\phi ({\mathbb {E}}_+))\). Now, we assume that \(\theta \not \in {\widetilde{\mathcal {COP}}}^{n,m}(\phi ({\mathbb {E}}_+))\). Then, from (10), it follows that there exists \(\widetilde{\varvec{x}}\in \phi ({\mathbb {E}})\) such that \(\theta (\widetilde{\varvec{x}}\lozenge \widetilde{\varvec{x}}) < 0\). As \(\widetilde{\varvec{x}} \ne \varvec{0}\) and \(\widetilde{\varvec{x}}^\top \widetilde{\varvec{x}} > 0\), we have \((\widetilde{\varvec{x}}^\top \widetilde{\varvec{x}})^r\theta (\widetilde{\varvec{x}}\lozenge \widetilde{\varvec{x}}) < 0\). However, (11) implies that \((\widetilde{\varvec{x}}^\top \widetilde{\varvec{x}})^r\theta (\widetilde{\varvec{x}}\lozenge \widetilde{\varvec{x}})\) must take a nonnegative value, which is a contradiction. Therefore, we obtain \(\theta \in {\widetilde{\mathcal {COP}}}^{n,m}(\phi ({\mathbb {E}}_+))\).
To prove (ii), let \(\theta \in {{\,\textrm{int}\,}}{\widetilde{\mathcal {COP}}}^{n,m}(\phi ({\mathbb {E}}_+))\). Then, as \(\phi ({\mathbb {E}}_+)\) is a closed cone, it follows that \(\theta (\varvec{y}) > 0\) for all \(\varvec{y}\in \phi ({\mathbb {E}}_+){\setminus }\{\varvec{0}\}\), i.e., \(\theta (\varvec{x}\lozenge \varvec{x}) > 0\) for all \(\varvec{x}\in \phi ({\mathbb {E}}){\setminus }\{\varvec{0}\} = {\mathbb {R}}^n{\setminus }\{\varvec{0}\}\) (see [55, Observation 1], for example). Thus, by Theorem 3.3, there exists \(r_0\in {\mathbb {N}}\) such that \((\varvec{x}^\top \varvec{x})^{r_0}\theta (\varvec{x}\lozenge \varvec{x}) \in \Sigma ^{n,2(r_0+m)}\), which means that \(\theta \in \widetilde{{\mathcal {K}}}_{\textrm{NN},r_0}^{n,m}(\phi ({\mathbb {E}}_+)) \subseteq \bigcup _{r=0}^{\infty }\widetilde{{\mathcal {K}}}_{\textrm{NN},r}^{n,m}(\phi ({\mathbb {E}}_+))\). \(\square \)
Note that the set \(\widetilde{{\mathcal {K}}}_{\textrm{NN},r}^{n,m}(\phi ({\mathbb {E}}_+))\) is defined by an SOS constraint. This constraint can be written as a semidefinite constraint of size \(|{\mathbb {I}}_{=r+m}^n| = \left( \begin{array}{c}n+r+m-1\\ n-1\end{array}\right) \), which is polynomial in n and m for each fixed \(r\in {\mathbb {N}}\).
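To give a feel for how the size of this semidefinite constraint grows with the depth parameter, the following sketch evaluates the binomial formula and cross-checks it by enumerating \({\mathbb {I}}_{=r+m}^n\) directly. The function names and the choice \(n = 3\), \(m = 2\) are illustrative only.

```python
from math import comb


def multi_indices(n, degree):
    """All alpha in N^n with alpha_1 + ... + alpha_n = degree."""
    if n == 1:
        return [(degree,)]
    return [(k,) + rest
            for k in range(degree, -1, -1)
            for rest in multi_indices(n - 1, degree - k)]


def sos_block_size(n, m, r):
    """Size |I^n_{=r+m}| of the semidefinite block at depth r."""
    return comb(n + r + m - 1, n - 1)


sizes = []
for r in range(4):
    size = sos_block_size(3, 2, r)
    # The closed form must agree with a direct enumeration of the index set.
    assert size == len(multi_indices(3, r + 2))
    sizes.append(size)
print(sizes)
```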
3.2 Approximation hierarchy for symmetric tensors
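In this subsection, we translate the result of Proposition 3.2 to the case of symmetric tensors. Given that \((x\bullet x)^r\langle {\mathcal {A}},(x\circ x)^{\otimes m}\rangle \in H^{n,2(r+m)}({\mathbb {E}})\) for each \(r\in {\mathbb {N}}\) and \({\mathcal {A}}\in {\mathcal {S}}^{n,m}({\mathbb {E}})\), there exists a unique \({\mathcal {A}}^{(r)}\in {\mathcal {S}}^{n,2(r+m)}({\mathbb {E}})\) such that
\[(x\bullet x)^r\langle {\mathcal {A}},(x\circ x)^{\otimes m}\rangle = \langle {\mathcal {A}}^{(r)},x^{\otimes 2(r+m)}\rangle \quad \text{for all } x\in {\mathbb {E}}.\]
Using the symmetric tensor \({\mathcal {A}}^{(r)}\), we define
\[{\mathcal {K}}_{\textrm{NN},r}^{n,m}({\mathbb {E}}_+) :=\{{\mathcal {A}}\in {\mathcal {S}}^{n,m}({\mathbb {E}}) \mid {\mathcal {A}}^{(r)}\in {{\textrm{SOS}}}^{n,2(r+m)}({\mathbb {E}})\}.\]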
Theorem 3.4
Using the orthonormal basis \(v_1,\ldots ,v_n\) for \({\mathbb {E}}\), we define \(\psi \in {{\,\textrm{Hom}\,}}({\mathcal {S}}^{n,m}({\mathbb {E}}),H^{n,m})\) in the same manner as for (5). Then,
-
(i)
\(\psi (\mathcal {COP}^{n,m}({\mathbb {E}}_+)) = {\widetilde{\mathcal {COP}}}^{n,m}(\phi ({\mathbb {E}}_+))\).
-
(ii)
\(\psi ({\mathcal {K}}_{\textrm{NN},r}^{n,m}({\mathbb {E}}_+)) = \widetilde{{\mathcal {K}}}_{\textrm{NN},r}^{n,m}(\phi ({\mathbb {E}}_+))\).
-
(iii)
Each \({\mathcal {K}}_{\textrm{NN},r}^{n,m}({\mathbb {E}}_+)\) is a closed convex cone, and the sequence \(\{{\mathcal {K}}_{\textrm{NN},r}^{n,m}({\mathbb {E}}_+)\}_r\) satisfies \({\mathcal {K}}_{\textrm{NN},r}^{n,m}({\mathbb {E}}_+) \uparrow \mathcal {COP}^{n,m}({\mathbb {E}}_+)\).
Proof
We first prove (i). Let \(\theta = \psi ({\mathcal {A}})\) with \({\mathcal {A}} \in \mathcal {COP}^{n,m}({\mathbb {E}}_+)\). Then, for any \(x\in {\mathbb {E}}\), we have
using \(\phi ^{-1}(\phi (x)\lozenge \phi (x)) = x\circ x\) and \({\mathcal {A}}\in \mathcal {COP}^{n,m}({\mathbb {E}}_+)\). Therefore, we have \(\theta \in {\widetilde{\mathcal {COP}}}^{n,m}(\phi ({\mathbb {E}}_+))\). Conversely, let \(\theta \in {\widetilde{\mathcal {COP}}}^{n,m}(\phi ({\mathbb {E}}_+))\) and \({\mathcal {A}} :=\psi ^{-1}(\theta )\in {\mathcal {S}}^{n,m}({\mathbb {E}})\). Then, for any \(x\in {\mathbb {E}}\), in the same manner as the discussion above, we obtain \(\langle {\mathcal {A}},(x\circ x)^{\otimes m}\rangle = \theta (\phi (x)\lozenge \phi (x)) \ge 0\), using \(\phi (x)\lozenge \phi (x)\in \phi ({\mathbb {E}}_+)\). Therefore, \({\mathcal {A}}\in \mathcal {COP}^{n,m}({\mathbb {E}}_+)\), and thus, \(\theta = \psi ({\mathcal {A}}) \in \psi (\mathcal {COP}^{n,m}({\mathbb {E}}_+))\).
Second, we prove (ii). Let \(\theta = \psi ({\mathcal {A}})\) with \({\mathcal {A}} \in {\mathcal {K}}_{\textrm{NN},r}^{n,m}({\mathbb {E}}_+)\). As \({\mathcal {A}}\in {\mathcal {K}}_{\textrm{NN},r}^{n,m}({\mathbb {E}}_+)\), there exist \({\mathcal {A}}_1,\ldots ,{\mathcal {A}}_N\in {\mathcal {S}}^{n,r+m}({\mathbb {E}})\) such that \((x\bullet x)^r\langle {\mathcal {A}},(x\circ x)^{\otimes m}\rangle = \sum _{i=1}^N\langle {\mathcal {A}}_i,x^{\otimes (r+m)}\rangle ^2\). As \(\langle {\mathcal {A}}_i,\phi ^{-1}(\varvec{x})^{\otimes (r+m)}\rangle \in H^{n,r+m}\) for each i, it follows that
Therefore, we have \(\theta \in \widetilde{{\mathcal {K}}}_{\textrm{NN},r}^{n,m}(\phi ({\mathbb {E}}_+))\). Conversely, let \(\theta \in \widetilde{{\mathcal {K}}}_{\textrm{NN},r}^{n,m}(\phi ({\mathbb {E}}_+))\). Then, there exist \({\mathcal {A}}_1,\ldots ,{\mathcal {A}}_N\in {\mathcal {S}}^{n,r+m}({\mathbb {E}})\) such that \((\varvec{x}^\top \varvec{x})^r\theta (\varvec{x}\lozenge \varvec{x}) = \sum _{i=1}^N\langle {\mathcal {A}}_i,\phi ^{-1}(\varvec{x})^{\otimes (r+m)}\rangle ^2\). Let \({\mathcal {A}} :=\psi ^{-1}(\theta )\in {\mathcal {S}}^{n,m}({\mathbb {E}})\). Then, in the same manner as in (i), we have
Finally, (iii) follows from (i), (ii), the fact that \(\psi \) is a linear isomorphism, and \(\widetilde{{\mathcal {K}}}_{\textrm{NN},r}^{n,m}(\phi ({\mathbb {E}}_+)) \uparrow {\widetilde{\mathcal {COP}}}^{n,m}(\phi ({\mathbb {E}}_+))\) (Proposition 3.2). \(\square \)
Note that \({\mathcal {K}}_{\textrm{NN},r}^{n,m}({\mathbb {E}}_+)\) does not depend on the choice of the orthonormal basis \(v_1,\ldots ,v_n\) or isometry \(\phi \). We call the sequence \(\{{\mathcal {K}}_{\textrm{NN},r}^{n,m}({\mathbb {E}}_+)\}_r\) the NN-type inner-approximation hierarchy. In the case where the symmetric cone \({\mathbb {E}}_+\) is the nonnegative orthant \({\mathbb {R}}_+^n\), this hierarchy is essentially identical to the SOS-based one provided by Iqbal and Ahmed [29, Eq. (19)]. Both hierarchies generalize the one provided by Parrilo [40] to tensors.
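For the nonnegative orthant, membership in the first levels of the hierarchy can be illustrated by hand. The sketch below assumes \(m = 2\), \(n = 2\), and that the Jordan product is the componentwise product, so that \(\theta (\varvec{x}\lozenge \varvec{x})\) is the quartic obtained by substituting \(x_i^2\) for \(x_i\) in \(\varvec{x}^\top \varvec{A}\varvec{x}\). The matrix and the explicit certificates are illustrative choices, not taken from the text.

```python
import random


def quartic(A, x):
    """theta(x <> x) = (x o x)^T A (x o x), with <> the componentwise product."""
    s = [xi * xi for xi in x]
    return sum(A[i][j] * s[i] * s[j] for i in range(len(s)) for j in range(len(s)))


A = [[1.0, -1.0], [-1.0, 1.0]]


def certificate_r0(x):
    # (x1^2 - x2^2)^2 is a single square, so A lies in the depth-0 cone.
    return (x[0] ** 2 - x[1] ** 2) ** 2


def certificate_r1(x):
    # Multiplying by x^T x keeps an SOS form: sum_j (x_j * p(x))^2.
    p = x[0] ** 2 - x[1] ** 2
    return sum((xj * p) ** 2 for xj in x)


random.seed(1)
samples = [[random.uniform(-2.0, 2.0) for _ in range(2)] for _ in range(100)]

max_gap_r0 = max(abs(quartic(A, x) - certificate_r0(x)) for x in samples)
max_gap_r1 = max(abs(sum(t * t for t in x) * quartic(A, x) - certificate_r1(x))
                 for x in samples)
print(max_gap_r0 < 1e-9, max_gap_r1 < 1e-9)
```

The second certificate mirrors the argument used for the inclusion in Proposition 3.2 (i): an SOS representation at depth r yields one at depth r + 1 after multiplication by \(\varvec{x}^\top \varvec{x}\).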
Remark 3.5
The NN-type inner-approximation hierarchy can be extended to the SOS cones proposed by Papp and Alizadeh [39]. Let \({\mathbb {A}}\) and \( {\mathbb {B}}\) be real inner product spaces of dimensions l and n, respectively, and let \(\diamond :{\mathbb {A}}\times {\mathbb {A}}\rightarrow {\mathbb {B}}\) be a bilinear mapping. The SOS cone is then defined as \(\Sigma _{\diamond } :={{\,\textrm{conv}\,}}\{x_i \diamond x_i \mid x_i\in {\mathbb {A}}\}\). Note that each element of \(\Sigma _{\diamond }\) can be written as the sum of at most n elements \(x_1\diamond x_1,\ldots ,x_n\diamond x_n\) such that \(x_1,\ldots ,x_n\in {\mathbb {A}}\), by Carathéodory’s theorem for cones. If \(({\mathbb {A}},{\mathbb {B}},\diamond )\) is formally real, or equivalently, if \(\Sigma _{\diamond }\) is proper [39, Theorem 3.3], then
is a closed convex cone for each \(r\in {\mathbb {N}}\) and satisfies \({\mathcal {K}}_r^{n,m}(\Sigma _{\diamond }) \uparrow \mathcal {COP}^{n,m}(\Sigma _{\diamond })\), where \(\bullet _{{\mathbb {A}}}\) denotes the inner product on \({\mathbb {A}}\).
3.3 Dual of \({\mathcal {K}}_{\textrm{NN},r}^{n,m}({\mathbb {E}}_+)\)
Next, we discuss the dual cone of \({\mathcal {K}}_{\textrm{NN},r}^{n,m}({\mathbb {E}}_+)\). By considering its dual, we can provide an outer-approximation hierarchy for the CP cone \(\mathcal{C}\mathcal{P}^{n,m}({\mathbb {E}}_+)\). Although the closure operator is generally required to describe the dual cone, it can be removed in the case in which the symmetric cone \({\mathbb {E}}_+\) is the nonnegative orthant \({\mathbb {R}}_+^n\).
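Let \({\mathcal {C}}^{(r)}:{\mathcal {S}}^{n,2(r+m)}({\mathbb {E}})\rightarrow {\mathcal {S}}^{n,m}({\mathbb {E}})\) be the adjoint of the linear mapping \({\mathcal {A}} \mapsto {\mathcal {A}}^{(r)}\), so that
\[\langle {\mathcal {A}},{\mathcal {C}}^{(r)}({\mathcal {X}})\rangle = \langle {\mathcal {A}}^{(r)},{\mathcal {X}}\rangle \quad \text{for all } {\mathcal {A}}\in {\mathcal {S}}^{n,m}({\mathbb {E}}) \text{ and } {\mathcal {X}}\in {\mathcal {S}}^{n,2(r+m)}({\mathbb {E}}). \tag{12}\]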
Using the linear mapping \({\mathcal {C}}^{(r)}\), we define \({\mathcal {C}}_r^{n,m}({\mathbb {E}}_+) :=\{{\mathcal {C}}^{(r)}({\mathcal {X}}) \mid {\mathcal {X}} \in {{\textrm{MOM}}}^{n,2(r+m)}({\mathbb {E}})\}\).
Proposition 3.6
It follows that \({\mathcal {K}}_{\textrm{NN},r}^{n,m}({\mathbb {E}}_+) = {\mathcal {C}}_r^{n,m}({\mathbb {E}}_+)^*\), and thus, \({\mathcal {K}}_{\textrm{NN},r}^{n,m}({\mathbb {E}}_+)^* = {{\,\textrm{cl}\,}}{\mathcal {C}}_r^{n,m}({\mathbb {E}}_+)\).
Proof
Let \({\mathcal {A}}\in {\mathcal {K}}_{\textrm{NN},r}^{n,m}({\mathbb {E}}_+)\). Then, \({\mathcal {A}}^{(r)}\in {{\textrm{SOS}}}^{n,2(r+m)}({\mathbb {E}})\) by definition. For any \({\mathcal {X}}\in {{\textrm{MOM}}}^{n,2(r+m)}({\mathbb {E}})\), it follows from (12) and the duality between \({{\textrm{MOM}}}^{n,2(r+m)}({\mathbb {E}})\) and \({{\textrm{SOS}}}^{n,2(r+m)}({\mathbb {E}})\) that \(\langle {\mathcal {A}},{\mathcal {C}}^{(r)}({\mathcal {X}})\rangle = \langle {\mathcal {A}}^{(r)},{\mathcal {X}}\rangle \ge 0\), which means that \({\mathcal {A}}\in {\mathcal {C}}_r^{n,m}({\mathbb {E}}_+)^*\).
Conversely, let \({\mathcal {A}}\in {\mathcal {C}}_r^{n,m}({\mathbb {E}}_+)^*\). Then, for any \({\mathcal {X}}\in {{\textrm{MOM}}}^{n,2(r+m)}({\mathbb {E}})\), because \({\mathcal {C}}^{(r)}({\mathcal {X}}) \in {\mathcal {C}}_r^{n,m}({\mathbb {E}}_+)\), it follows that \(\langle {\mathcal {A}}^{(r)},{\mathcal {X}}\rangle = \langle {\mathcal {A}},{\mathcal {C}}^{(r)}({\mathcal {X}})\rangle \ge 0\). Therefore, \({\mathcal {A}}^{(r)} \in {{\textrm{MOM}}}^{n,2(r+m)}({\mathbb {E}})^* = {{\textrm{SOS}}}^{n,2(r+m)}({\mathbb {E}})\), which means that \({\mathcal {A}}\in {\mathcal {K}}_{\textrm{NN},r}^{n,m}({\mathbb {E}}_+)\).
Given that \({\mathcal {C}}_r^{n,m}({\mathbb {E}}_+)\) is a convex cone, by taking the dual, we have \({\mathcal {K}}_{\textrm{NN},r}^{n,m}({\mathbb {E}}_+)^* = {{\,\textrm{cl}\,}}{\mathcal {C}}_r^{n,m}({\mathbb {E}}_+)\). \(\square \)
We could not prove that \({\mathcal {C}}_r^{n,m}({\mathbb {E}}_+)\) itself is closed; hence, the closure operator is required to describe the dual of \({\mathcal {K}}_{\textrm{NN},r}^{n,m}({\mathbb {E}}_+)\). Whether \({\mathcal {K}}_{\textrm{NN},r}^{n,m}({\mathbb {E}}_+)\) and \({\mathcal {C}}_r^{n,m}({\mathbb {E}}_+)\) themselves are dual to each other is therefore an open problem for a general symmetric cone \({\mathbb {E}}_+\). The difficulty originates from a lack of understanding of the moment cone \({{\textrm{MOM}}}^{n,2m}({\mathbb {E}})\).
As a special case, we consider the symmetric cone \({\mathbb {E}}_+\) to be the nonnegative orthant \({\mathbb {R}}_+^n\). In this case, we demonstrate in Proposition 3.9 that the dual of \({{\textrm{SOS}}}^{n,2m}({\mathbb {R}}^n)\) can be explicitly described by a semidefinite constraint; this answers the question posed by Chen et al. [9] of whether the membership problem for the dual of \({{\textrm{SOS}}}^{n,2m}({\mathbb {R}}^n)\) can be solved in polynomial time. Furthermore, the closedness of \({\mathcal {C}}_r^{n,m}({\mathbb {R}}_+^n)\) (written as \({\mathcal {C}}_r^{n,m}\) hereafter) is shown by exploiting this result.
Definition 3.7
For \({\mathcal {X}} \in {\mathcal {S}}^{n,2m}\), let \(\varvec{M}^{n,m}({\mathcal {X}})\) be a matrix in \({\mathbb {S}}^{{\mathbb {I}}_{=m}^n}\) with the \((\varvec{\alpha },\varvec{\beta })\)th element \({\mathcal {X}}_{[\varvec{\alpha }+\varvec{\beta }]}\) for \(\varvec{\alpha },\varvec{\beta }\in {\mathbb {I}}_{=m}^n\). Then, we define \({\mathcal {M}}^{n,2m} :=\{{\mathcal {X}}\in {\mathcal {S}}^{n,2m} \mid \varvec{M}^{n,m}({\mathcal {X}})\in {\mathbb {S}}_+^{{\mathbb {I}}_{=m}^n}\}\).
We aim to demonstrate that the dual of \({{\textrm{SOS}}}^{n,2m}({\mathbb {R}}^n)\) agrees with \({\mathcal {M}}^{n,2m}\).
Lemma 3.8
For \({\mathcal {A}}\in {\mathcal {S}}^{n,m}\), let \(\varvec{\theta }\) be the element of \({\mathbb {R}}^{{\mathbb {I}}_{=m}^n}\) satisfying (7). Then, \(\langle {\mathscr {S}}({\mathcal {A}}\otimes {\mathcal {A}}),{\mathcal {X}}\rangle = \varvec{\theta }^\top \varvec{M}^{n,m}({\mathcal {X}})\varvec{\theta }\) for all \({\mathcal {X}}\in {\mathcal {S}}^{n,2m}\).
Proof
Let \(({\mathcal {F}}_{\varvec{\alpha }})_{\varvec{\alpha }\in {\mathbb {I}}_{=m}^n}\) be the orthonormal basis for \({\mathcal {S}}^{n,m}\) defined in Lemma 2.8. From Lemma 2.12 and (7), we have
Therefore, from Lemma 2.9, when we represent \({\mathscr {S}}({\mathcal {A}}\otimes {\mathcal {A}})\) with the orthonormal basis \(({\mathcal {F}}_{\varvec{\gamma }})_{\varvec{\gamma }\in {\mathbb {I}}_{=2m}^n}\) for \({\mathcal {S}}^{n,2m}\), the coefficient \({\mathscr {S}}({\mathcal {A}}\otimes {\mathcal {A}})_{\varvec{\gamma }}\) can be written as
for each \(\varvec{\gamma }\in {\mathbb {I}}_{=2m}^n\). In addition, from Lemma 2.8, each \({\mathcal {X}}\in {\mathcal {S}}^{n,2m}\) can be written as
Then, we have
\(\square \)
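Lemma 3.8 can be sanity-checked for a rank-one tensor \({\mathcal {X}} = \varvec{y}^{\otimes 2m}\), for which \({\mathcal {X}}_{[\varvec{\gamma }]} = \varvec{y}^{\varvec{\gamma }}\) and the left-hand side reduces, by Lemma 2.12, to the square of \(\sum _{\varvec{\alpha }}\theta _{\varvec{\alpha }}\varvec{y}^{\varvec{\alpha }}\). The sketch below assumes that \([\varvec{\gamma }]\) is the index tuple containing the value i exactly \(\gamma _i\) times and that (7) expresses \(\langle {\mathcal {A}},\varvec{x}^{\otimes m}\rangle \) in the monomial basis; all function names are hypothetical.

```python
import math
import random


def multi_indices(n, degree):
    """All alpha in N^n with alpha_1 + ... + alpha_n = degree."""
    if n == 1:
        return [(degree,)]
    return [(k,) + rest
            for k in range(degree, -1, -1)
            for rest in multi_indices(n - 1, degree - k)]


def index_tuple(gamma):
    """[gamma]: the sorted index tuple containing i exactly gamma_i times."""
    return tuple(i for i, g in enumerate(gamma) for _ in range(g))


def moment_matrix(entry, n, m):
    """M^{n,m}(X): rows and columns indexed by I^n_{=m}; (alpha, beta) entry is X_[alpha+beta]."""
    basis = multi_indices(n, m)
    M = [[entry(index_tuple(tuple(a + b for a, b in zip(alpha, beta))))
          for beta in basis] for alpha in basis]
    return M, basis


n, m = 3, 2
random.seed(2)
y = [random.uniform(-1.0, 1.0) for _ in range(n)]


def rank_one_entry(idx):
    # Entry of X = y^{(x) 2m} at the index tuple idx.
    return math.prod(y[i] for i in idx)


def monomial(alpha):
    return math.prod(yi ** a for yi, a in zip(y, alpha))


M, basis = moment_matrix(rank_one_entry, n, m)
theta = [random.uniform(-1.0, 1.0) for _ in basis]

quadratic_form = sum(theta[a] * M[a][b] * theta[b]
                     for a in range(len(basis)) for b in range(len(basis)))
poly_value = sum(t * monomial(alpha) for t, alpha in zip(theta, basis))
print(abs(quadratic_form - poly_value ** 2) < 1e-12)
```

In this rank-one case, the moment matrix is the outer product of the monomial vector with itself, which makes its positive semidefiniteness evident.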
Proposition 3.9
The dual of \({{\textrm{SOS}}}^{n,2m}({\mathbb {R}}^n)\) is \({\mathcal {M}}^{n,2m}\). In particular, it follows that \({\mathcal {M}}^{n,2m} = {{\textrm{MOM}}}^{n,2m}({\mathbb {R}}^n)\).
Proof
Let \({\mathcal {X}}\in {\mathcal {M}}^{n,2m}\). For each \({\mathcal {A}}\in {{\textrm{SOS}}}^{n,2m}({\mathbb {R}}^n)\), there exist \({\mathcal {A}}^{(1)},\ldots ,{\mathcal {A}}^{(k)}\in {\mathcal {S}}^{n,m}\) such that \({\mathcal {A}} = \sum _{i=1}^k{\mathscr {S}}({\mathcal {A}}^{(i)}\otimes {\mathcal {A}}^{(i)})\). For each \({\mathcal {A}}^{(i)}\), let \(\varvec{\theta }^{(i)} = (\theta _{\varvec{\alpha }}^{(i)})_{\varvec{\alpha }\in {\mathbb {I}}_{=m}^n} \in {\mathbb {R}}^{{\mathbb {I}}_{=m}^n}\) be such that it satisfies (7). Then, it follows from Lemma 3.8 and \(\varvec{M}^{n,m}({\mathcal {X}}) \in {\mathbb {S}}_+^{{\mathbb {I}}_{=m}^n}\) that
which means that \({\mathcal {X}}\in {{\textrm{SOS}}}^{n,2m}({\mathbb {R}}^n)^*\).
Conversely, suppose that \({\mathcal {X}}\in {{\textrm{SOS}}}^{n,2m}({\mathbb {R}}^n)^*\). We take \(\varvec{\theta } = (\theta _{\varvec{\alpha }})_{\varvec{\alpha }\in {\mathbb {I}}_{=m}^n} \in {\mathbb {R}}^{{\mathbb {I}}_{=m}^n}\) arbitrarily and let \({\mathcal {A}}\) be the element of \({\mathcal {S}}^{n,m}\) satisfying (7). Then, it follows from Lemma 3.8 and \({\mathscr {S}}({\mathcal {A}}\otimes {\mathcal {A}})\in {{\textrm{SOS}}}^{n,2m}({\mathbb {R}}^n)\) that \(\varvec{\theta }^\top \varvec{M}^{n,m}({\mathcal {X}})\varvec{\theta } = \langle {\mathscr {S}}({\mathcal {A}}\otimes {\mathcal {A}}),{\mathcal {X}}\rangle \ge 0.\) Therefore, \({\mathcal {X}}\in {\mathcal {M}}^{n,2m}\).
By the definition of \({{\textrm{MOM}}}^{n,2m}({\mathbb {R}}^n)\), it follows that \({{\textrm{MOM}}}^{n,2m}({\mathbb {R}}^n) = {{\textrm{SOS}}}^{n,2m}({\mathbb {R}}^n)^*\). Combining this with \({{\textrm{SOS}}}^{n,2m}({\mathbb {R}}^n)^* = {\mathcal {M}}^{n,2m}\) shown above, we have \({\mathcal {M}}^{n,2m} = {{\textrm{MOM}}}^{n,2m}({\mathbb {R}}^n)\). \(\square \)
Using Proposition 3.9, we show the closedness of \({\mathcal {C}}_r^{n,m}\). When the Euclidean Jordan algebra \(({\mathbb {E}},\circ ,\bullet )\) is that given by Example 2.2, the \((i_1,\ldots ,i_m)\)th element of \({\mathcal {C}}^{(r)}({\mathcal {X}})\in {\mathcal {S}}^{n,m}\) is
See [29, Eq. (29)] for this derivation.
Lemma 3.10
Suppose that \({\mathcal {X}}\in {\mathcal {M}}^{n,2m}\), i.e., \(\varvec{M}^{n,m}({\mathcal {X}}) \in {\mathbb {S}}_+^{{\mathbb {I}}_{=m}^n}\). Using \({\mathcal {X}}\), we define \({\mathcal {X}}'\in {\mathcal {S}}^{n,2m}\) as
for each \(\varvec{\gamma }\in {\mathbb {I}}_{=2m}^n\). It then follows that \(\varvec{M}^{n,m}({\mathcal {X}}') \in {\mathbb {S}}_+^{{\mathbb {I}}_{=m}^n}\).
Proof
The set \({\mathbb {I}}_{=m}^n\) is partitioned into two disjoint sets \({\mathbb {I}}_{=m,\textrm{even}}^n\) and \({\mathbb {I}}_{=m,\textrm{odd}}^n\), where
In addition, the elements of \(\{0,1\}^n\setminus \{\varvec{0}\}\) are ordered as \(\varvec{\delta }_1,\ldots ,\varvec{\delta }_{2^n-1}\) (e.g., \(\varvec{\delta }_1 = (1,0,\ldots ,0)\)). Then, let
for \(i = 1,\ldots ,2^n-1\). Note that the componentwise parities of \(\varvec{\alpha }\) and \(\varvec{\delta }_i\) agree for every \(\varvec{\alpha }\in {\mathbb {I}}_{=m,\textrm{odd},i}^n\) and that the sets \({\mathbb {I}}_{=m,\textrm{odd},i}^n\) \((i = 1,\ldots ,2^n-1)\) are disjoint. Then, for \(\varvec{\alpha },\varvec{\beta }\in {\mathbb {I}}_{=m}^n\), all elements of \(\varvec{\alpha } + \varvec{\beta }\) are even if and only if \(\varvec{\alpha }\) and \(\varvec{\beta }\) belong to the same set of \({\mathbb {I}}_{=m,\textrm{even}}^n,{\mathbb {I}}_{=m,\textrm{odd},1}^n,\ldots ,{\mathbb {I}}_{=m,\textrm{odd},2^n-1}^n\). Therefore, from the definition of \({\mathcal {X}}'\), we note that
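for I, \(J = {\mathbb {I}}_{=m,\textrm{even}}^n, {\mathbb {I}}_{=m,\textrm{odd},1}^n,\ldots ,{\mathbb {I}}_{=m,\textrm{odd},2^n-1}^n\), where \(\varvec{M}^{n,m}({\mathcal {X}}')_{IJ}\) is the submatrix obtained by extracting the rows of \(\varvec{M}^{n,m}({\mathcal {X}}')\) indexed by I and the columns indexed by J. Hence, after a simultaneous permutation of rows and columns, \(\varvec{M}^{n,m}({\mathcal {X}}')\) is block diagonal, and its diagonal blocks are principal submatrices of \(\varvec{M}^{n,m}({\mathcal {X}})\). Because \(\varvec{M}^{n,m}({\mathcal {X}})\) is positive semidefinite, \(\varvec{M}^{n,m}({\mathcal {X}}')\) is as well. \(\square \)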
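The block structure used in this proof can be observed numerically. The sketch below assumes that \({\mathcal {X}}'\) keeps the entry \({\mathcal {X}}_{[\varvec{\gamma }]}\) when every element of \(\varvec{\gamma }\) is even and sets it to zero otherwise, and it uses the rank-one tensor \({\mathcal {X}} = \varvec{y}^{\otimes 2m}\) as a convenient element of \({\mathcal {M}}^{n,2m}\); the names and the sizes \(n = 3\), \(m = 2\) are illustrative.

```python
import math
import random


def multi_indices(n, degree):
    """All alpha in N^n with alpha_1 + ... + alpha_n = degree."""
    if n == 1:
        return [(degree,)]
    return [(k,) + rest
            for k in range(degree, -1, -1)
            for rest in multi_indices(n - 1, degree - k)]


def parity_class(alpha):
    """Componentwise parity pattern of alpha; equal patterns mean the same block."""
    return tuple(a % 2 for a in alpha)


def add(alpha, beta):
    return tuple(a + b for a, b in zip(alpha, beta))


n, m = 3, 2
basis = multi_indices(n, m)
random.seed(3)
y = [random.uniform(-1.0, 1.0) for _ in range(n)]


def entry(gamma):
    # X_[gamma] for the rank-one tensor X = y^{(x) 2m}.
    return math.prod(yi ** g for yi, g in zip(y, gamma))


def entry_prime(gamma):
    # Assumed definition of X': keep X_[gamma] if every gamma_i is even, else 0.
    return entry(gamma) if all(g % 2 == 0 for g in gamma) else 0.0


M = [[entry(add(a, b)) for b in basis] for a in basis]
M_prime = [[entry_prime(add(a, b)) for b in basis] for a in basis]

# M(X') coincides with M(X) inside each parity block and vanishes across blocks.
block_structure_ok = all(
    M_prime[i][j] == (M[i][j] if parity_class(a) == parity_class(b) else 0.0)
    for i, a in enumerate(basis) for j, b in enumerate(basis))

theta = [random.uniform(-1.0, 1.0) for _ in basis]
quadratic_form = sum(theta[i] * M_prime[i][j] * theta[j]
                     for i in range(len(basis)) for j in range(len(basis)))
print(block_structure_ok, quadratic_form >= -1e-12)
```

For \(n = 3\) and \(m = 2\), the even block contains the three multi-indices \((2,0,0)\), \((0,2,0)\), and \((0,0,2)\), and each remaining multi-index forms a block of its own.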
Proposition 3.11
\({\mathcal {C}}_r^{n,m}\) is a closed convex cone, and thus, \({\mathcal {K}}_{\textrm{NN},r}^{n,m}({\mathbb {R}}_+^n)\) and \({\mathcal {C}}_r^{n,m}\) are dual to each other.
Proof
Showing the closedness of \({\mathcal {C}}_r^{n,m}\) is sufficient. Let \(\{{\mathcal {A}}_k\}_k \subseteq {\mathcal {C}}_r^{n,m}\) and suppose that the sequence converges to some \({\mathcal {A}}_{\infty } \in {\mathcal {S}}^{n,m}\). For each k, there exists \({\mathcal {X}}_k\in {\mathcal {M}}^{n,2(r+m)}\) such that \({\mathcal {A}}_k = {\mathcal {C}}^{(r)}({\mathcal {X}}_k)\). We note from (13) that \({\mathcal {C}}^{(r)}({\mathcal {X}}_k)\) is independent of \(({\mathcal {X}}_k)_{[\varvec{\gamma }]}\) such that some elements of \(\varvec{\gamma } \in {\mathbb {I}}_{=2(r+m)}^n\) are odd. Thus, by Lemma 3.10, we can assume that such \(({\mathcal {X}}_k)_{[\varvec{\gamma }]}\) is 0 without loss of generality. As \({\mathcal {C}}^{(r)}({\mathcal {X}}_k) \rightarrow {\mathcal {A}}_{\infty }\) \((k\rightarrow \infty )\), we observe that
Note that each \(({\mathcal {X}}_k)_{[2\varvec{\alpha } + 2\sum _{l=1}^m\varvec{e}_{i_l}]}\) is nonnegative because it is a diagonal element of the semidefinite matrix \(\varvec{M}^{n,r+m}({\mathcal {X}}_k)\). Therefore, \(\{({\mathcal {X}}_k)_{[2\varvec{\alpha } + 2\sum _{l=1}^m\varvec{e}_{i_l}]}\}_k\) is bounded for each \((i_1,\ldots ,i_m)\in {\mathbb {N}}_n^m\) and \(\varvec{\alpha }\in {\mathbb {I}}_{=r}^n\), and \(\{{\mathcal {X}}_k\}_k\) is as well. Thus, by taking a subsequence if necessary, we assume that the sequence \(\{{\mathcal {X}}_k\}_k\) converges to some \({\mathcal {X}}_{\infty }\). Given that \(\{{\mathcal {X}}_k\}_k \subseteq {\mathcal {M}}^{n,2(r+m)}\) and \({\mathcal {M}}^{n,2(r+m)}\) is closed, we have \({\mathcal {X}}_{\infty } \in {\mathcal {M}}^{n,2(r+m)}\). Therefore,
which implies that \({\mathcal {C}}_r^{n,m}\) is closed. \(\square \)
4 Approximation hierarchies exploiting those for the usual copositive cone
In this section, we provide other approximation hierarchies for the COP cone over a symmetric cone by exploiting those for the usual COP cone. Let \(({\mathbb {E}},\circ ,\bullet )\) be a Euclidean Jordan algebra of dimension n. In addition, for \({\mathcal {A}}\in {\mathcal {S}}^{n,m}({\mathbb {E}})\) and an ordered Jordan frame \((c_1,\ldots ,c_{{{\,\textrm{rk}\,}}})\in {\mathfrak {F}}({\mathbb {E}})\), let \({\mathcal {A}}(c_1,\ldots ,c_{{{\,\textrm{rk}\,}}})\in {\mathcal {S}}^{{{\,\textrm{rk}\,}},m}\) be the tensor with the \((i_1,\ldots ,i_m)\)th element \(\langle {\mathcal {A}},c_{i_1}\otimes \cdots \otimes c_{i_m}\rangle \). (The tensor is guaranteed to be symmetric by the symmetry of \({\mathcal {A}}\).)
The following lemma is key to providing the desired approximation hierarchies.
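Lemma 4.1
\[\mathcal {COP}^{n,m}({\mathbb {E}}_+) = \{{\mathcal {A}}\in {\mathcal {S}}^{n,m}({\mathbb {E}}) \mid {\mathcal {A}}(c_1,\ldots ,c_{{{\,\textrm{rk}\,}}})\in \mathcal {COP}^{{{\,\textrm{rk}\,}},m} \text { for all } (c_1,\ldots ,c_{{{\,\textrm{rk}\,}}})\in {\mathfrak {F}}({\mathbb {E}})\}. \tag{14}\]
Proof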
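Let \({\mathcal {A}}\in \mathcal {COP}^{n,m}({\mathbb {E}}_+)\). For any \((c_1,\ldots ,c_{{{\,\textrm{rk}\,}}})\in {\mathfrak {F}}({\mathbb {E}})\) and \(\varvec{x}\in {\mathbb {R}}_+^{{{\,\textrm{rk}\,}}}\), we have
\[\langle {\mathcal {A}}(c_1,\ldots ,c_{{{\,\textrm{rk}\,}}}),\varvec{x}^{\otimes m}\rangle = \sum _{i_1,\ldots ,i_m=1}^{{{\,\textrm{rk}\,}}}\langle {\mathcal {A}},c_{i_1}\otimes \cdots \otimes c_{i_m}\rangle x_{i_1}\cdots x_{i_m} = \left\langle {\mathcal {A}},\left( \sum _{i=1}^{{{\,\textrm{rk}\,}}}x_ic_i\right) ^{\otimes m}\right\rangle ,\]
which is nonnegative given that \(\sum _{i=1}^{{{\,\textrm{rk}\,}}}x_ic_i \in {\mathbb {E}}_+\).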
Conversely, suppose that \({\mathcal {A}}\) belongs to the right-hand side set of (14). For any \(x\in {\mathbb {E}}_+\), there exist \((x_1,\ldots ,x_{{{\,\textrm{rk}\,}}})\in {\mathbb {R}}_+^{{{\,\textrm{rk}\,}}}\) and \((c_1,\ldots ,c_{{{\,\textrm{rk}\,}}})\in {\mathfrak {F}}({\mathbb {E}})\) such that \(x = \sum _{i=1}^{{{\,\textrm{rk}\,}}}x_ic_i\). As \({\mathcal {A}}(c_1,\ldots ,c_{{{\,\textrm{rk}\,}}})\in \mathcal {COP}^{{{\,\textrm{rk}\,}},m}\), we can show \(\langle {\mathcal {A}},x^{\otimes m}\rangle \ge 0\), i.e., \({\mathcal {A}}\in \mathcal {COP}^{n,m}({\mathbb {E}}_+)\) in the same manner as the above discussion. \(\square \)
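The reduction in Lemma 4.1 can be illustrated numerically for \(m = 2\) and \({\mathbb {E}}_+ = {\mathbb {R}}_+^{n_1}\times {\mathbb {L}}^{n_2}\). The following is a minimal sketch, not the authors' implementation. It assumes the standard Jordan frames of this cone, namely the unit vectors of the orthant part together with \(\tfrac{1}{2}(1,\varvec{v})\) and \(\tfrac{1}{2}(1,-\varvec{v})\) for a unit vector \(\varvec{v}\); the function names `jordan_frame`, `reduce_matrix`, and `combine` are hypothetical. It checks that \(\varvec{z}^\top \varvec{A}\varvec{z}\) with \(\varvec{z} = \sum _i x_ic_i\) coincides with \(\varvec{x}^\top \varvec{B}\varvec{x}\), where \(B_{ij} = c_i^\top \varvec{A}c_j\).

```python
def jordan_frame(n1, v):
    """Ordered Jordan frame for the cone R_+^{n1} x L^{n2}, with n2 = len(v) + 1.

    Assumption: v is a unit vector (the caller is responsible for normalising it).
    """
    n = n1 + len(v) + 1
    frame = []
    for i in range(n1):                      # orthant part: standard unit vectors
        c = [0.0] * n
        c[i] = 1.0
        frame.append(c)
    for sign in (1.0, -1.0):                 # second-order cone part: (1, +-v) / 2
        c = [0.0] * n
        c[n1] = 0.5
        for j, vj in enumerate(v):
            c[n1 + 1 + j] = sign * 0.5 * vj
        frame.append(c)
    return frame


def bilinear(A, u, w):
    """Return u^T A w for a square matrix A given as a list of rows."""
    return sum(u[p] * A[p][q] * w[q] for p in range(len(u)) for q in range(len(w)))


def reduce_matrix(A, frame):
    """Return B with B[i][j] = c_i^T A c_j, i.e. A(c_1, ..., c_rk) for m = 2."""
    return [[bilinear(A, ci, cj) for cj in frame] for ci in frame]


def combine(frame, coeffs):
    """Return sum_i coeffs[i] * frame[i]."""
    n = len(frame[0])
    return [sum(x * c[p] for x, c in zip(coeffs, frame)) for p in range(n)]


# Illustrative data (hypothetical): n1 = 1, n2 = 3, so rk = 3 and n = 4.
A = [[2, 1, 0, 0], [1, 3, 1, 0], [0, 1, 1, 2], [0, 0, 2, 4]]
frame = jordan_frame(1, [0.6, 0.8])
B = reduce_matrix(A, frame)
x = [1.0, 2.0, 3.0]                          # nonnegative coefficients
z = combine(frame, x)                        # a point of the cone
print(bilinear(A, z, z), bilinear(B, x, x))  # the two values agree up to rounding
```

Because the identity holds for every frame, copositivity of \(\varvec{A}\) over the cone amounts to usual copositivity of the small \({{\,\textrm{rk}\,}}\times {{\,\textrm{rk}\,}}\) matrix \(\varvec{B}\) for every choice of \(\varvec{v}\).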
The characterization (14) of the COP cone over a symmetric cone is somewhat redundant because the Jordan frames we considered are ordered, and thus, the same set appears multiple times on the right-hand side of (14). To remove this redundancy, we provide a more concise characterization of \(\mathcal {COP}^{n,m}({\mathbb {E}}_+)\).
Definition 4.2
For each \(\sigma \in {\mathfrak {S}}_n\), let \({\mathscr {P}}_{\sigma }\) be the linear transformation on \({\mathcal {S}}^{n,m}\) defined by \(({\mathscr {P}}_{\sigma }{\mathcal {A}})_{i_1\cdots i_m} :={\mathcal {A}}_{\sigma (i_1)\cdots \sigma (i_m)}\) for each \({\mathcal {A}}\in {\mathcal {S}}^{n,m}\) and \((i_1,\ldots ,i_m)\in {\mathbb {N}}_n^m\). A set \({\mathcal {K}} \subseteq {\mathcal {S}}^{n,m}\) is said to be permutation-invariant if \({\mathscr {P}}_{\sigma }{\mathcal {K}} = {\mathcal {K}}\) for all \(\sigma \in {\mathfrak {S}}_n\).
Note that because \({\mathscr {P}}_{\sigma }\) is invertible and \({\mathscr {P}}_{\sigma }^{-1} = {\mathscr {P}}_{\sigma ^{-1}}\) for all \(\sigma \in {\mathfrak {S}}_n\), Definition 4.2 can be equivalently stated such that \({\mathscr {P}}_{\sigma }{\mathcal {A}} \in {\mathcal {K}}\) holds for all \({\mathcal {A}}\in {\mathcal {K}}\) and \(\sigma \in {\mathfrak {S}}_n\).
Lemma 4.3
\(\mathcal {COP}^{n,m}\) is permutation-invariant.
Proof
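Let \({\mathcal {A}}\in \mathcal {COP}^{n,m}\). Then, for any \(\sigma \in {\mathfrak {S}}_n\) and \(\varvec{x}\in {\mathbb {R}}_+^n\), we have
\[\langle {\mathscr {P}}_{\sigma }{\mathcal {A}},\varvec{x}^{\otimes m}\rangle = \sum _{i_1,\ldots ,i_m=1}^n{\mathcal {A}}_{\sigma (i_1)\cdots \sigma (i_m)}x_{i_1}\cdots x_{i_m} = \sum _{j_1,\ldots ,j_m=1}^n{\mathcal {A}}_{j_1\cdots j_m}x_{\sigma ^{-1}(j_1)}\cdots x_{\sigma ^{-1}(j_m)} \ge 0,\]
for which the last inequality follows from the fact that \((x_{\sigma ^{-1}(1)},\ldots ,x_{{\sigma }^{-1}(n)}) \in {\mathbb {R}}_+^n\) if \(\varvec{x}\in {\mathbb {R}}_+^n\). \(\square \)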
Lemma 4.4
Let \({\mathcal {K}} \subseteq {\mathcal {S}}^{{{\,\textrm{rk}\,}},m}\) be a permutation-invariant set and let \({\mathcal {A}}\in {\mathcal {S}}^{n,m}({\mathbb {E}})\). If \({\mathcal {A}}(c_1,\ldots ,c_{{{\,\textrm{rk}\,}}})\in {\mathcal {K}}\) for a given \((c_1,\ldots ,c_{{{\,\textrm{rk}\,}}})\in {\mathfrak {F}}({\mathbb {E}})\), then \({\mathcal {A}}(c_{\sigma (1)},\ldots ,c_{\sigma ({{\,\textrm{rk}\,}})})\in {\mathcal {K}}\) for all \(\sigma \in {\mathfrak {S}}_{{{\,\textrm{rk}\,}}}\).
Proof
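Because \({\mathcal {K}}\) is permutation-invariant, we have \({\mathscr {P}}_{\sigma }{\mathcal {A}}(c_1,\ldots ,c_{{{\,\textrm{rk}\,}}}) \in {\mathcal {K}}\). Then, the \((i_1,\ldots ,i_m)\)th element of \({\mathscr {P}}_{\sigma }{\mathcal {A}}(c_1,\ldots ,c_{{{\,\textrm{rk}\,}}})\) is
\[{\mathcal {A}}(c_1,\ldots ,c_{{{\,\textrm{rk}\,}}})_{\sigma (i_1)\cdots \sigma (i_m)} = \langle {\mathcal {A}},c_{\sigma (i_1)}\otimes \cdots \otimes c_{\sigma (i_m)}\rangle ,\]
which equals the \((i_1,\ldots ,i_m)\)th element of \({\mathcal {A}}(c_{\sigma (1)},\ldots ,c_{\sigma ({{\,\textrm{rk}\,}})})\). Therefore, we have
\[{\mathcal {A}}(c_{\sigma (1)},\ldots ,c_{\sigma ({{\,\textrm{rk}\,}})}) = {\mathscr {P}}_{\sigma }{\mathcal {A}}(c_1,\ldots ,c_{{{\,\textrm{rk}\,}}}) \in {\mathcal {K}}.\]
\(\square \)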
The symmetric group \({\mathfrak {S}}_{{{\,\textrm{rk}\,}}}\) acts on the set \({\mathfrak {F}}({\mathbb {E}})\) by \(\sigma \cdot (c_1,\ldots ,c_{{{\,\textrm{rk}\,}}}) = (c_{\sigma (1)},\ldots ,c_{\sigma ({{\,\textrm{rk}\,}})})\). Let \({\mathfrak {F}}_{\textrm{c}}({\mathbb {E}})\) be a complete set of representatives of \({\mathfrak {S}}_{{{\,\textrm{rk}\,}}}\)-orbits in \({\mathfrak {F}}({\mathbb {E}})\). Then, using Lemmas 4.3 and 4.4, we obtain the following, more concise, characterization of \(\mathcal {COP}^{n,m}({\mathbb {E}}_+)\), compared with that using Lemma 4.1.
Theorem 4.5
Let \({\mathcal {K}} \subseteq {\mathcal {S}}^{{{\,\textrm{rk}\,}},m}\) be a permutation-invariant set and consider \({\mathfrak {F}}_{\textrm{c}}({\mathbb {E}}) \subseteq {\mathfrak {F}} \subseteq {\mathfrak {F}}({\mathbb {E}})\). Then, the set
is the same regardless of the choice of \({\mathfrak {F}}\). In particular, the claim holds when \({\mathcal {K}} = \mathcal {COP}^{{{\,\textrm{rk}\,}},m}\), and it follows that
The idea for providing inner- and outer-approximation hierarchies for \(\mathcal {COP}^{n,m}({\mathbb {E}}_+)\) is to approximate \(\mathcal {COP}^{{{\,\textrm{rk}\,}},m}\) on the right-hand side set in (15) from the inside and outside.
4.1 Inner-approximation hierarchy
Throughout this subsection, we only consider the case of \(m = 2\). Generally, even if \(\{{\mathcal {I}}_r^{{{\,\textrm{rk}\,}}}\}_r\) is an inner-approximation hierarchy for \(\mathcal {COP}^{{{\,\textrm{rk}\,}},2}\), i.e., \({\mathcal {I}}_r^{{{\,\textrm{rk}\,}}}\uparrow \mathcal {COP}^{{{\,\textrm{rk}\,}},2}\), the sequence obtained by replacing \(\mathcal {COP}^{{{\,\textrm{rk}\,}},2}\) on the right-hand side set in (15) with \({\mathcal {I}}_r^{{{\,\textrm{rk}\,}}}\) is not guaranteed to converge to \(\mathcal {COP}^{n,2}({\mathbb {E}}_+)\). However, if we choose the polyhedral inner-approximation hierarchy provided by de Klerk and Pasechnik [10] as \(\{{\mathcal {I}}_r^{{{\,\textrm{rk}\,}}}\}_r\), the induced sequence is indeed an inner-approximation hierarchy for \(\mathcal {COP}^{n,2}({\mathbb {E}}_+)\). The hierarchy \(\{{\mathcal {I}}_{\textrm{dP},r}^{{{\,\textrm{rk}\,}}}\}_r\) provided by de Klerk and Pasechnik [10] is defined as
and satisfies \({\mathcal {I}}_{\textrm{dP},r}^{{{\,\textrm{rk}\,}}} \uparrow \mathcal {COP}^{{{\,\textrm{rk}\,}},2}\).
The following theorem plays an important role in proving that the sequence induced by \(\{{\mathcal {I}}_{\textrm{dP},r}^{{{\,\textrm{rk}\,}}}\}_r\) converges to \(\mathcal {COP}^{n,2}({\mathbb {E}}_+)\).
Theorem 4.6
([10, Corollary 3.5]; also see [44, Theorem 1]) Let \(\varvec{A} \in {{\,\textrm{int}\,}}(\mathcal {COP}^{{{\,\textrm{rk}\,}},2})\), and set \(L :=\max _{1\le i,j\le {{\,\textrm{rk}\,}}}|A_{ij}|\) and \(\lambda :=\min _{\varvec{x}\in \Delta _=^{{{\,\textrm{rk}\,}}-1}}\varvec{x}^\top \varvec{A}\varvec{x} > 0\). If \(r\in {\mathbb {N}}\) satisfies \(r > L/\lambda -2\), then \(\varvec{A} \in {\mathcal {I}}_{\textrm{dP},r}^{{{\,\textrm{rk}\,}}}\).
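To make the depth parameter in Theorem 4.6 concrete, the following minimal sketch tests membership in a level of the de Klerk–Pasechnik hierarchy. It relies on an assumption that is not visible in this text: the commonly cited equivalent description in which \(\varvec{A}\) belongs to level \(r\) if and only if \(\varvec{\alpha }^\top \varvec{A}\varvec{\alpha } - {{\,\textrm{diag}\,}}(\varvec{A})^\top \varvec{\alpha } \ge 0\) for all \(\varvec{\alpha }\in {\mathbb {I}}_{=r+2}^{{{\,\textrm{rk}\,}}}\). The function names are hypothetical, and integer data are used to avoid rounding issues.

```python
from itertools import combinations_with_replacement


def multi_indices(n, total):
    """Yield every alpha in N^n whose entries sum to `total`."""
    for combo in combinations_with_replacement(range(n), total):
        alpha = [0] * n
        for i in combo:
            alpha[i] += 1
        yield tuple(alpha)


def in_dp_level(A, r):
    """Check alpha^T A alpha - diag(A)^T alpha >= 0 for all |alpha| = r + 2.

    Assumption: this is the explicit form of the level-r polyhedral cone.
    """
    n = len(A)
    for a in multi_indices(n, r + 2):
        quad = sum(a[i] * A[i][j] * a[j] for i in range(n) for j in range(n))
        lin = sum(A[i][i] * a[i] for i in range(n))
        if quad - lin < 0:
            return False
    return True


# Illustrative matrix (hypothetical): strictly copositive, L = 2, lambda = 1/2.
A = [[2, -1], [-1, 2]]
for r in range(4):
    print(r, in_dp_level(A, r))
```

For this matrix, \(L/\lambda - 2 = 2\), so Theorem 4.6 guarantees membership for \(r \ge 3\); the check above already succeeds at \(r = 1\), which illustrates that the bound is sufficient but not tight.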
Proposition 4.7
We fix a set \({\mathfrak {F}}\) with \({\mathfrak {F}}_{\textrm{c}}({\mathbb {E}}) \subseteq {\mathfrak {F}} \subseteq {\mathfrak {F}}({\mathbb {E}})\). Let
Then, each \({\mathcal {I}}_{\textrm{dP},r}^{n}({\mathbb {E}}_+)\) is a closed convex cone and \({\mathcal {I}}_{\textrm{dP},r}^{n}({\mathbb {E}}_+) \uparrow \mathcal {COP}^{n,2}({\mathbb {E}}_+)\).
Proof
Given that each \({\mathcal {I}}_{\textrm{dP},r}^{{{\,\textrm{rk}\,}}}\) is a closed convex cone and that the mapping \({\mathcal {A}}\mapsto \langle {\mathcal {A}},c_i\otimes c_j\rangle \) is linear for each \((c_1,\ldots ,c_{{{\,\textrm{rk}\,}}})\in {\mathfrak {F}}\) and \(i,j=1,\ldots ,{{\,\textrm{rk}\,}}\), \({\mathcal {I}}_{\textrm{dP},r}^{n}({\mathbb {E}}_+)\) is a closed convex cone. In the following, we prove \({\mathcal {I}}_{\textrm{dP},r}^{n}({\mathbb {E}}_+) \uparrow \mathcal {COP}^{n,2}({\mathbb {E}}_+)\). As the sequence \(\{{\mathcal {I}}_{\textrm{dP},r}^{{{\,\textrm{rk}\,}}}\}_r\) satisfies \({\mathcal {I}}_{\textrm{dP},r}^{{{\,\textrm{rk}\,}}} \subseteq {\mathcal {I}}_{\textrm{dP},r+1}^{{{\,\textrm{rk}\,}}} \subseteq \mathcal {COP}^{{{\,\textrm{rk}\,}},2}\) for all \(r\in {\mathbb {N}}\), it follows from Theorem 4.5 that the sequence \(\{{\mathcal {I}}_{\textrm{dP},r}^{n}({\mathbb {E}}_+)\}_r\) satisfies \({\mathcal {I}}_{\textrm{dP},r}^{n}({\mathbb {E}}_+) \subseteq {\mathcal {I}}_{\textrm{dP},r+1}^{n}({\mathbb {E}}_+) \subseteq \mathcal {COP}^{n,2}({\mathbb {E}}_+)\) for all \(r\in {\mathbb {N}}\). Next, let \({\mathcal {A}}\in {{\,\textrm{int}\,}}\mathcal {COP}^{n,2}({\mathbb {E}}_+)\). Using \({\mathcal {A}}\), we define
for each \((c_1,\ldots ,c_{{{\,\textrm{rk}\,}}})\in {\mathfrak {F}}({\mathbb {E}})\), and also define
for \(\widetilde{{\mathfrak {F}}} \in \{{\mathfrak {F}},{\mathfrak {F}}({\mathbb {E}})\}\). Given that \({\mathcal {A}}\in {{\,\textrm{int}\,}}\mathcal {COP}^{n,2}({\mathbb {E}}_+)\) and that the sets \(\Delta _=^{{{\,\textrm{rk}\,}}-1}\) and \({\mathfrak {F}}({\mathbb {E}})\) are compact, we have \(L({\mathcal {A}};{\mathfrak {F}}({\mathbb {E}})) < +\infty \) and \(\lambda ({\mathcal {A}};{\mathfrak {F}}({\mathbb {E}})) > 0\). Because \(L({\mathcal {A}};{\mathfrak {F}}) \le L({\mathcal {A}};{\mathfrak {F}}({\mathbb {E}}))\) and \(\lambda ({\mathcal {A}};{\mathfrak {F}}) \ge \lambda ({\mathcal {A}};{\mathfrak {F}}({\mathbb {E}}))\), we obtain \(L({\mathcal {A}};{\mathfrak {F}}) < +\infty \) and \(\lambda ({\mathcal {A}};{\mathfrak {F}}) > 0\). Now, let \(r_0 :=\lceil L({\mathcal {A}};{\mathfrak {F}})/\lambda ({\mathcal {A}};{\mathfrak {F}})\rceil \in {\mathbb {N}}\) and fix \((c_1,\ldots ,c_{{{\,\textrm{rk}\,}}})\in {\mathfrak {F}}\) arbitrarily. Then, because
Theorem 4.6 implies that \({\mathcal {A}}(c_1,\ldots ,c_{{{\,\textrm{rk}\,}}}) \in {\mathcal {I}}_{\textrm{dP},r_0}^{{{\,\textrm{rk}\,}}}\). Because \(r_0\) is independent of the choice of \((c_1,\ldots ,c_{{{\,\textrm{rk}\,}}})\in {\mathfrak {F}}\), we obtain \({\mathcal {A}} \in {\mathcal {I}}_{\textrm{dP},r_0}^n({\mathbb {E}}_+) \subseteq \bigcup _{r=0}^{\infty }{\mathcal {I}}_{\textrm{dP},r}^n({\mathbb {E}}_+)\). \(\square \)
Remark 4.8
As can be seen from the proof of Proposition 4.7, if a non-decreasing sequence \(\{{\mathcal {I}}_r^{{{\,\textrm{rk}\,}}}\}_r\) satisfies \({\mathcal {I}}_{\textrm{dP},r}^{{{\,\textrm{rk}\,}}} \subseteq {\mathcal {I}}_r^{{{\,\textrm{rk}\,}}} \subseteq \mathcal {COP}^{{{\,\textrm{rk}\,}},2}\) for all \(r \in {\mathbb {N}}\), the sequence obtained by replacing \(\mathcal {COP}^{{{\,\textrm{rk}\,}},2}\) on the right-hand side set in (15) with \({\mathcal {I}}_r^{{{\,\textrm{rk}\,}}}\) is also an inner-approximation hierarchy for \(\mathcal {COP}^{n,2}({\mathbb {E}}_+)\). In addition, Proposition 4.7 can be extended to the case of general m by using the polyhedral inner-approximation hierarchy for \(\mathcal {COP}^{{{\,\textrm{rk}\,}},m}\) provided by Iqbal and Ahmed [29], which generalizes that of de Klerk and Pasechnik [10].
Note that \({\mathcal {I}}_{\textrm{dP},r}^{n}({\mathbb {E}}_+)\) is, in general, defined as the intersection of infinitely many sets even if m, the order of tensors, is limited to 2. This means that the inner-approximation hierarchy induces a semi-infinite conic constraint. At first glance, the approximation hierarchy seems no more tractable than the COP cone itself because \(\mathcal {COP}^{n,2}({\mathbb {E}}_+)\) can also be described by a semi-infinite constraint. However, when the symmetric cone \({\mathbb {E}}_+\) is the direct product of a nonnegative orthant and one second-order cone, each \({\mathcal {I}}_{\textrm{dP},r}^{n}({\mathbb {E}}_+)\) can be represented by finitely many semidefinite constraints.
4.1.1 Full expression of \({\mathcal {I}}_{\textrm{dP},r}^{n}({\mathbb {E}}_+)\)
Let \(({\mathbb {E}}_1,\circ _1,\bullet _1)\) and \(({\mathbb {E}}_2,\circ _2,\bullet _2)\) be the Euclidean Jordan algebras associated with the nonnegative orthant \({\mathbb {R}}_+^{n_1}\) and second-order cone \({\mathbb {L}}^{n_2}\) shown in Examples 2.2 and 2.3, respectively. Then, \({\mathbb {E}} :={\mathbb {E}}_1\times {\mathbb {E}}_2 = {\mathbb {R}}^{n_1 + n_2}\) is the Euclidean Jordan algebra with the induced symmetric cone \({\mathbb {R}}_+^{n_1} \times {\mathbb {L}}^{n_2}\) and rank \({{\,\textrm{rk}\,}}= n_1 + 2\). We set \(n :=n_1 + n_2\) and reindex \((1,\ldots ,{{\,\textrm{rk}\,}})\) as \((11,\ldots ,1n_1,21,22)\), i.e., \(1i :=i\) for \(i = 1,\ldots ,n_1\) and \(2i :=n_1 + i\) for \(i = 1,2\). Section 4.2 uses this notation. In addition,
is a set satisfying \({\mathfrak {F}}_{\textrm{c}}({\mathbb {E}}) \subseteq {\mathfrak {F}} \subseteq {\mathfrak {F}}({\mathbb {E}})\) (see Example 2.2, Example 2.3, and Proposition 2.6).
Under the identification between \({\mathcal {S}}^{n,2}\) and \({\mathbb {S}}^n\), the inner-approximation hierarchy (16) with the set (17) is
where
Note that \(f(\varvec{x};\varvec{A},\varvec{v})\) is doubled for the convenience of the following calculation; however, the set \({\mathcal {I}}_{\textrm{dP},r}^{n}({\mathbb {E}}_+)\) is unchanged. Let \(\varvec{A}\in {\mathbb {S}}^n\) be partitioned as follows:
with \(\varvec{A}^{(11)}\in {\mathbb {S}}^{n_1}\), \(\varvec{A}^{(121)}\in {\mathbb {R}}^{n_1}\), \(\varvec{A}^{(122)}\in {\mathbb {R}}^{(n_2-1)\times n_1}\), \(A^{(2121)}\in {\mathbb {R}}\), \(\varvec{A}^{(2122)}\in {\mathbb {R}}^{n_2-1}\), and \(\varvec{A}^{(2222)}\in {\mathbb {S}}^{n_2-1}\). In addition, for such \(\varvec{A}\in {\mathbb {S}}^n\) and
we let
and define
where \({{\,\textrm{diag}\,}}(\varvec{A}^{(11)})\in {\mathbb {R}}^{n_1}\) is the vector of the diagonal elements of \(\varvec{A}^{(11)}\). Then, after some calculations, we have
Therefore,
where the last equation follows from [52, Corollary 6]. In summary, \({\mathcal {I}}_{\textrm{dP},r}^{n}({\mathbb {E}}_+)\) can be described by \(|{\mathbb {I}}_{=r+2}^{{{\,\textrm{rk}\,}}}|\) semidefinite constraints of size \(n_2\), and the number of constraints is bounded by \({{\,\textrm{rk}\,}}^{r+2}\). We call the sequence \(\{{\mathcal {I}}_{\textrm{dP},r}^{n}({\mathbb {E}}_+)\}_r\) the dP-type inner-approximation hierarchy.
4.1.2 Concise expression of \({\mathcal {I}}_{\textrm{dP},r}^{n}({\mathbb {E}}_+)\)
In this subsubsection, we explain that the expression (19) can be made more concise. First, we show that the number of constraints in (19) can be halved. For \(\varvec{\alpha } = (\varvec{\alpha }_1,\alpha _{21},\alpha _{22})\in {\mathbb {I}}_{=r+2}^{{{\,\textrm{rk}\,}}}\), let \(\widetilde{\varvec{\alpha }} :=(\varvec{\alpha }_1,\alpha _{22},\alpha _{21}) \in {\mathbb {I}}_{=r+2}^{{{\,\textrm{rk}\,}}}\). As \(M^{(11)}(\varvec{A},\widetilde{\varvec{\alpha }}) = M^{(11)}(\varvec{A},\varvec{\alpha })\), \(\varvec{M}^{(21)}(\varvec{A},\widetilde{\varvec{\alpha }}) = -\varvec{M}^{(21)}(\varvec{A},\varvec{\alpha })\), and \(\varvec{M}^{(22)}(\varvec{A},\widetilde{\varvec{\alpha }}) = \varvec{M}^{(22)}(\varvec{A},\varvec{\alpha })\) for each \(\varvec{A}\in {\mathbb {S}}^n\), we have
When \(\varvec{v}\) takes every element of \(S^{n_2-2}\), \(-\varvec{v}\) also takes every element of \(S^{n_2-2}\). Thus, (20) is equivalent to
That is, taking the intersection with respect to \(\varvec{\alpha }\in {\mathbb {I}}_{=r+2}^{{{\,\textrm{rk}\,}}}\) with \(\alpha _{21}\le \alpha _{22}\) in (19) is sufficient. This can be explained by the fact that the ordering of a Jordan frame can be ignored because \({\mathcal {I}}_{\textrm{dP},r}^{{{\,\textrm{rk}\,}}}\) is permutation-invariant; thus, we can apply Theorem 4.5.
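The effect of this reduction on the number of constraints can be checked by direct enumeration. The sketch below is an illustration with hypothetical function names; it assumes only that \({\mathbb {I}}_{=r+2}^{{{\,\textrm{rk}\,}}}\) is the set of multi-indices in \({\mathbb {N}}^{{{\,\textrm{rk}\,}}}\) summing to \(r+2\) and that the last two entries of \(\varvec{\alpha }\) are \(\alpha _{21}\) and \(\alpha _{22}\).

```python
from itertools import combinations_with_replacement


def multi_indices(n, total):
    """Yield every alpha in N^n whose entries sum to `total`."""
    for combo in combinations_with_replacement(range(n), total):
        alpha = [0] * n
        for i in combo:
            alpha[i] += 1
        yield tuple(alpha)


def constraint_counts(n1, r):
    """Return (all indices, indices with alpha_21 <= alpha_22, bound rk^(r+2))."""
    rk = n1 + 2
    all_alpha = list(multi_indices(rk, r + 2))
    kept = [a for a in all_alpha if a[-2] <= a[-1]]   # a[-2] = alpha_21, a[-1] = alpha_22
    return len(all_alpha), len(kept), rk ** (r + 2)


for n1, r in [(1, 1), (2, 2), (3, 3)]:
    print(n1, r, constraint_counts(n1, r))
```

The kept count is slightly more than half of the total because indices with \(\alpha _{21} = \alpha _{22}\) are their own mirror images and are retained once.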
Next, we show that some semidefinite constraints in (19) can be written as second-order cone or non-negativity constraints. If \(\varvec{\alpha } = (\varvec{\alpha }_1,\alpha _{21},\alpha _{22})\in {\mathbb {I}}^{{{\,\textrm{rk}\,}}}_{=r+2}\) satisfies
then \(\varvec{M}^{(22)}(\varvec{A},\varvec{\alpha }) = \varvec{O}\). In this case,
In particular, when \(k=0\), i.e., \(\alpha _{21} = \alpha _{22} = 0\), as \(\varvec{M}^{(21)}(\varvec{A},\varvec{\alpha })\) is also zero, the above second-order cone constraint reduces to the non-negativity constraint \(M^{(11)}(\varvec{A},\varvec{\alpha }) \ge 0\).
Finally, we show that the size of some semidefinite constraints can be reduced by 1. If \(\varvec{\alpha } = (\varvec{\alpha }_1,\alpha _{21},\alpha _{22})\in {\mathbb {I}}^{{{\,\textrm{rk}\,}}}_{=r+2}\) satisfies \(\alpha _{21} = \alpha _{22} \ne 0\), we have \(\varvec{M}^{(21)}(\varvec{A},\varvec{\alpha }) = \varvec{0}\). Then,
where \(\lambda _{\min }(\cdot )\) denotes the minimum eigenvalue of the input matrix.
4.2 Outer-approximation hierarchy
Unlike the case of inner-approximation hierarchies, an outer-approximation hierarchy for \(\mathcal {COP}^{{{\,\textrm{rk}\,}},m}\) always induces that for \(\mathcal {COP}^{n,m}({\mathbb {E}}_+)\).
Proposition 4.9
Let \(\{{\mathcal {O}}_r^{{{\,\textrm{rk}\,}},m}\}_r\) be a sequence such that each \({\mathcal {O}}_r^{{{\,\textrm{rk}\,}},m}\) is a closed convex cone, and the sequence satisfies the following two conditions:
(i) \({\mathcal {O}}_{r+1}^{{{\,\textrm{rk}\,}},m} \subseteq {\mathcal {O}}_r^{{{\,\textrm{rk}\,}},m}\) for all \(r\in {\mathbb {N}}\).
(ii) \(\mathcal {COP}^{{{\,\textrm{rk}\,}},m} = \bigcap _{r=0}^{\infty }{\mathcal {O}}_r^{{{\,\textrm{rk}\,}},m}\).
In the following, the notation “\({\mathcal {O}}_r^{{{\,\textrm{rk}\,}},m} \downarrow \mathcal {COP}^{{{\,\textrm{rk}\,}},m}\)” is used to represent the two conditions as in the case of inner-approximation hierarchies. We fix a set \({\mathfrak {F}}\) with \({\mathfrak {F}}_{\textrm{c}}({\mathbb {E}}) \subseteq {\mathfrak {F}} \subseteq {\mathfrak {F}}({\mathbb {E}})\). Let
Then, each \({\mathcal {O}}_r^{n,m}({\mathbb {E}}_+)\) is a closed convex cone and \({\mathcal {O}}_r^{n,m}({\mathbb {E}}_+) \downarrow \mathcal {COP}^{n,m}({\mathbb {E}}_+)\).
Proof
Because \({\mathcal {O}}_r^{n,m}({\mathbb {E}}_+)\) can easily be shown to be a closed convex cone and \({\mathcal {O}}_{r+1}^{n,m}({\mathbb {E}}_+) \subseteq {\mathcal {O}}_r^{n,m}({\mathbb {E}}_+)\) for all \(r\in {\mathbb {N}}\) in the same manner as Proposition 4.7, we prove only \(\mathcal {COP}^{n,m}({\mathbb {E}}_+) = \bigcap _{r=0}^{\infty }{\mathcal {O}}_r^{n,m}({\mathbb {E}}_+)\). The “\(\subseteq \)” part follows from Theorem 4.5. To prove the “\(\supseteq \)” part, let \({\mathcal {A}}\in \bigcap _{r=0}^{\infty }{\mathcal {O}}_r^{n,m}({\mathbb {E}}_+)\). Then, \({\mathcal {A}}(c_1,\ldots ,c_{{{\,\textrm{rk}\,}}})\in {\mathcal {O}}_r^{{{\,\textrm{rk}\,}},m}\) for all \(r\in {\mathbb {N}}\) and \((c_1,\ldots ,c_{{{\,\textrm{rk}\,}}})\in {\mathfrak {F}}\). By the convergence assumption (ii) on \(\{{\mathcal {O}}_r^{{{\,\textrm{rk}\,}},m}\}_r\), we have \({\mathcal {A}}(c_1,\ldots ,c_{{{\,\textrm{rk}\,}}})\in \mathcal {COP}^{{{\,\textrm{rk}\,}},m}\) for all \((c_1,\ldots ,c_{{{\,\textrm{rk}\,}}})\in {\mathfrak {F}}\). Thus, from Theorem 4.5, we obtain \({\mathcal {A}}\in \mathcal {COP}^{n,m}({\mathbb {E}}_+)\). \(\square \)
Remark 4.10
The set \(\bigcup _{r=0}^{\infty }{\mathcal {I}}_{\textrm{dP},r}^n({\mathbb {E}}_+)\) involves the intersection with respect to Jordan frames and the union with respect to the depth parameter r. Therefore, to establish the convergence \({{\,\textrm{int}\,}}\mathcal {COP}^{n,2}({\mathbb {E}}_+) \subseteq \bigcup _{r=0}^{\infty }{\mathcal {I}}_{\textrm{dP},r}^n({\mathbb {E}}_+)\) in the proof of Proposition 4.7, it is necessary to take \(r_0\) such that it is independent of Jordan frames.
On the other hand, the set \(\bigcap _{r=0}^{\infty }{\mathcal {O}}_r^{n,m}({\mathbb {E}}_+)\) involves the intersection with respect to the depth parameter as well as Jordan frames. Thus, as seen in the proof of Proposition 4.9, the order of the depth parameter and Jordan frames is interchangeable in showing the convergence \(\mathcal {COP}^{n,m}({\mathbb {E}}_+) = \bigcap _{r=0}^{\infty }{\mathcal {O}}_r^{n,m}({\mathbb {E}}_+)\), which makes the proof easier than that for the inner-approximation hierarchy.
As with the inner-approximation hierarchy in Proposition 4.7, the outer-approximation hierarchy induces a semi-infinite conic constraint. However, when the symmetric cone \({\mathbb {E}}_+\) is the direct product of a nonnegative orthant and a second-order cone and \(m = 2\), if we choose an appropriate polyhedral outer-approximation hierarchy as \(\{{\mathcal {O}}_r^{{{\,\textrm{rk}\,}},2}\}_r\) (written as \(\{{\mathcal {O}}_r^{{{\,\textrm{rk}\,}}}\}_r\) hereafter), each \({\mathcal {O}}_r^{n,2}({\mathbb {E}}_+)\) (written as \({\mathcal {O}}_r^n({\mathbb {E}}_+)\) hereafter) can be represented by finitely many semidefinite constraints.
4.2.1 Full expression of \({\mathcal {O}}_r^n({\mathbb {E}}_+)\)
Let \({\mathbb {E}}\) be the same Euclidean Jordan algebra with the induced symmetric cone \({\mathbb {R}}_+^{n_1} \times {\mathbb {L}}^{n_2}\) as defined in Sect. 4.1.1. The polyhedral outer-approximation hierarchy \(\{{\mathcal {O}}_r^{{{\,\textrm{rk}\,}}}\}_r\) we use is based on a discretization of the standard simplex and written as
where \(\delta _r^{{{\,\textrm{rk}\,}}-1}\) is a finite subset of \(\Delta _=^{{{\,\textrm{rk}\,}}-1}\) for each \(r\in {\mathbb {N}}\). This type of outer-approximation hierarchy includes, for example, that given by Yıldırım [53]. The outer-approximation hierarchy (21) induced by the set (17) and polyhedral outer-approximation hierarchy (22) is
where \(f(\varvec{x};\varvec{A},\varvec{v})\) is defined as (18). Let
Then,
In summary, \({\mathcal {O}}_r^n({\mathbb {E}}_+)\) can be described by \(|\delta _r^{{{\,\textrm{rk}\,}}-1}|\) semidefinite constraints of size \(n_2\). In particular, if we use the outer-approximation hierarchy \(\{{\mathcal {O}}_r^{{{\,\textrm{rk}\,}}}\}_r\) proposed by Yıldırım [53], then \(\delta _r^{{{\,\textrm{rk}\,}}-1}\) is given by
and \(|\delta _r^{{{\,\textrm{rk}\,}}-1}|\) is bounded by \({{\,\textrm{rk}\,}}^2(\frac{{{\,\textrm{rk}\,}}^{r+1}-1}{{{\,\textrm{rk}\,}}-1})\), which is polynomial in \({{\,\textrm{rk}\,}}\) for every fixed \(r\in {\mathbb {N}}\) (see [53, Eq. (10)]). We call the sequence \(\{{\mathcal {O}}_r^n({\mathbb {E}}_+)\}_r\) obtained by exploiting the hierarchy given by Yıldırım [53] the Yıldırım-type outer-approximation hierarchy.
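The discretization-based outer approximation can be sketched as follows. This is an illustration under a stated assumption: the grid is taken to be \(\bigcup _{k=0}^{r}\{\varvec{x}\in \Delta _=^{{{\,\textrm{rk}\,}}-1}\mid (k+2)\varvec{x}\in {\mathbb {N}}^{{{\,\textrm{rk}\,}}}\}\), which is the form usually attributed to Yıldırım and is consistent with the cardinality bound above. The function names are hypothetical, and exact rational arithmetic is used so that sign tests are reliable.

```python
from fractions import Fraction
from itertools import combinations_with_replacement


def simplex_grid(n, r):
    """Return the assumed grid: points x of the simplex with (k+2)x integral, k = 0..r."""
    points = set()
    for k in range(r + 1):
        denom = k + 2
        for combo in combinations_with_replacement(range(n), denom):
            counts = [0] * n
            for i in combo:
                counts[i] += 1
            points.add(tuple(Fraction(c, denom) for c in counts))
    return points


def in_outer_level(A, r):
    """Check x^T A x >= 0 at every grid point (a necessary condition for copositivity)."""
    n = len(A)
    return all(
        sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n)) >= 0
        for x in simplex_grid(n, r)
    )


def size_bound(rk, r):
    """Bound rk^2 (rk^(r+1) - 1) / (rk - 1), written as a sum to avoid division."""
    return sum(rk ** (k + 2) for k in range(r + 1))


# Illustrative matrix (hypothetical): passes level 0 but is cut off at level 1.
A = [[3, -1], [-1, 0]]
print(in_outer_level(A, 0), in_outer_level(A, 1))
print(len(simplex_grid(3, 2)), size_bound(3, 2))
```

In this example, the matrix is nonnegative on the level-0 grid but takes the value \(-1/9\) at \((1/3,2/3)\), so refining the grid excludes it, as expected of an outer approximation that shrinks with \(r\).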
4.2.2 Concise expression of \({\mathcal {O}}_r^n({\mathbb {E}}_+)\)
As with the inner-approximation hierarchy, we can make the expression (23) more concise. Let \(\varvec{N}(\varvec{x},\varvec{A})\) be partitioned as follows:
where \(N^{(11)}(\varvec{x},\varvec{A})\in {\mathbb {R}}\), \(\varvec{N}^{(21)}(\varvec{x},\varvec{A})\in {\mathbb {R}}^{n_2-1}\), and \(\varvec{N}^{(22)}(\varvec{x},\varvec{A})\in {\mathbb {S}}^{n_2-1}\) and \(\varvec{x} = (\varvec{x}_1,x_{21},x_{22})\in {\mathbb {R}}^{{{\,\textrm{rk}\,}}}\). Then, we have
First, the number of constraints in (23) may be reduced. Suppose that \(\delta _r^{{{\,\textrm{rk}\,}}-1}\) is permutation-invariant in the sense of Definition 4.2. (Note that \({\mathcal {S}}^{{{\,\textrm{rk}\,}},1} = {\mathbb {R}}^{{{\,\textrm{rk}\,}}}\).) For example, (24) is permutation-invariant. For \(\varvec{x} = (\varvec{x}_1,x_{21},x_{22})\in \delta _r^{{{\,\textrm{rk}\,}}-1}\), let \(\widetilde{\varvec{x}} :=(\varvec{x}_1,x_{22},x_{21})\); then, \(\widetilde{\varvec{x}}\in \delta _r^{{{\,\textrm{rk}\,}}-1}\) because \(\delta _r^{{{\,\textrm{rk}\,}}-1}\) is permutation-invariant. Given that \(N^{(11)}(\widetilde{\varvec{x}},\varvec{A}) = N^{(11)}(\varvec{x},\varvec{A})\), \(\varvec{N}^{(21)}(\widetilde{\varvec{x}},\varvec{A}) = -\varvec{N}^{(21)}(\varvec{x},\varvec{A})\), and \(\varvec{N}^{(22)}(\widetilde{\varvec{x}},\varvec{A}) = \varvec{N}^{(22)}(\varvec{x},\varvec{A})\) for each \(\varvec{A}\in {\mathbb {S}}^n\), taking the intersection with respect to \(\varvec{x}\in \delta _r^{{{\,\textrm{rk}\,}}-1}\) with \(x_{21}\le x_{22}\) in (23) is sufficient for the same reason as in Sect. 4.1.2.
Second, some semidefinite constraints in (23) can be written as non-negativity constraints. If \(\varvec{x} = (\varvec{x}_1,x_{21},x_{22})\in \delta _r^{{{\,\textrm{rk}\,}}-1}\) satisfies \(x_{21} = x_{22}\), then we have \(\varvec{N}^{(21)}(\varvec{x},\varvec{A}) = \varvec{0}\) and \(\varvec{N}^{(22)}(\varvec{x},\varvec{A}) = \varvec{O}\). Therefore, the constraint
reduces to \(N^{(11)}(\varvec{x},\varvec{A}) \ge 0\).
5 Comparison with other approximation hierarchies
The previous sections provide new approximation hierarchies applicable to the COP cone \(\mathcal {COP}({\mathbb {K}})\) over the cone \({\mathbb {K}} = {\mathbb {R}}_+^{n_1}\times {\mathbb {L}}^{n_2}\). For such \({\mathbb {K}}\), those given by Zuluaga et al. [55] and Lasserre [33] are also applicable to \(\mathcal {COP}({\mathbb {K}})\). In this section, we analytically compare some properties of the proposed approximation hierarchies for \(\mathcal {COP}({\mathbb {K}})\) with other existing hierarchies. In the following, we set \(n :=n_1 + n_2\) as in Sect. 4 and reindex \((1,\ldots ,n)\) as \((11,\ldots ,1n_1,21,\ldots ,2n_2)\), i.e., \(1i :=i\) for \(i = 1,\ldots ,n_1\) and \(2i :=n_1 + i\) for \(i = 1,\ldots ,n_2\). We use this notation for numerical experiments in Sect. 6 as well.
First, the cone \({\mathbb {K}} = {\mathbb {R}}_+^{n_1}\times {\mathbb {L}}^{n_2}\) can be represented as a semialgebraic set
where \(\varvec{e} :=(\varvec{1}_{n_1+1},\varvec{0}_{n_2-1})\in {{\,\textrm{int}\,}}({\mathbb {K}})\); thus, the inner-approximation hierarchy given by Zuluaga et al. [55] is applicable to \(\mathcal {COP}({\mathbb {K}})\). Note that the inequality \(\varvec{e}^\top \varvec{x} \ge 0\) in (25) is redundant but necessary for deriving the hierarchy (see [55, Assumption 1]). The inner-approximation hierarchy for \(\mathcal {COP}({\mathbb {K}})\) is summarized as follows:
Theorem 5.1
[55, Proposition 17] Let
and \({\mathcal {K}}_{\textrm{ZVP},r}({\mathbb {K}}) :=\{\varvec{A}\in {\mathbb {S}}^n\mid (\varvec{e}^\top \varvec{x})^r\varvec{x}^\top \varvec{A}\varvec{x}\in E^{n,r+2}({\mathbb {K}})\}\) for each \(r\in {\mathbb {N}}\). Then, the sequence \(\{{\mathcal {K}}_{\textrm{ZVP},r}({\mathbb {K}})\}_r\) satisfies \({\mathcal {K}}_{\textrm{ZVP},r}({\mathbb {K}}) \uparrow \mathcal {COP}({\mathbb {K}})\).
In the following, we call the sequence \(\{{\mathcal {K}}_{\textrm{ZVP},r}({\mathbb {K}})\}_r\) the ZVP-type inner-approximation hierarchy. Although the representation (26) of the set \(E^{n,m}({\mathbb {K}})\) is somewhat abstract, we can represent it recursively.
Lemma 5.2
If \(m = 2k\) for some \(k\in {\mathbb {N}}\), then
If \(m = 2k + 1\) for some \(k\in {\mathbb {N}}\), then
From Lemma 5.2, we note that each \({\mathcal {K}}_{\textrm{ZVP},r}({\mathbb {K}})\) can be described by semidefinite constraints. More precisely, the size and number of the semidefinite constraints that define the set \(E^{n,m}({\mathbb {K}})\) can be calculated.
Proposition 5.3
Let
for each \(m\in {\mathbb {N}}\). Note that \(a_m\) is the mth term of the recurrence \(a_{m+2} = {{\,\textrm{rk}\,}}a_{m+1} + a_m\) with initial conditions \(a_0 = 1\) and \(a_1 = {{\,\textrm{rk}\,}}\) and is of order \(O(n_1^m)\). If \(m = 2k\) for some \(k\in {\mathbb {N}}\), then \(E^{n,m}({\mathbb {K}})\) is described by \(a_{2i}\) semidefinite constraints of size \(|{\mathbb {I}}_{=k-i}^n|\) \((i = 0,\ldots ,k)\). If \(m = 2k+1\) for some \(k\in {\mathbb {N}}\), then \(E^{n,m}({\mathbb {K}})\) is described by \(a_{2i+1}\) semidefinite constraints of size \(|{\mathbb {I}}_{=k-i}^n|\) \((i = 0,\ldots ,k)\).
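The counts in Proposition 5.3 are easy to tabulate from the recurrence. The following minimal sketch uses hypothetical function names and assumes that \({\mathbb {I}}_{=d}^n\) is the set of multi-indices in \({\mathbb {N}}^n\) summing to \(d\), so that \(|{\mathbb {I}}_{=d}^n| = \binom{n+d-1}{d}\).

```python
from math import comb


def a_sequence(rk, m):
    """Return [a_0, ..., a_m] for a_{j+2} = rk * a_{j+1} + a_j, a_0 = 1, a_1 = rk."""
    a = [1, rk]
    while len(a) <= m:
        a.append(rk * a[-1] + a[-2])
    return a[: m + 1]


def zvp_constraint_profile(n, rk, m):
    """List (number of constraints, constraint size) pairs for E^{n,m}(K).

    For m = 2k the counts are a_{2i}; for m = 2k + 1 they are a_{2i+1};
    in both cases the size is |I_{=k-i}^n| for i = 0, ..., k.
    """
    k, parity = divmod(m, 2)
    a = a_sequence(rk, m)
    return [(a[2 * i + parity], comb(n + (k - i) - 1, k - i)) for i in range(k + 1)]


# Illustrative sizes (hypothetical): n1 = 1, n2 = 3, hence n = 4 and rk = 3.
for m in range(2, 6):
    print(m, zvp_constraint_profile(4, 3, m))
```

Read together with \(m = r + 2\), such a table shows how quickly both the number and the size of the semidefinite constraints grow with the depth parameter for the ZVP-type hierarchy.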
Second, we introduce the outer-approximation hierarchy given by Lasserre [33]. Let \(\Delta ({\mathbb {K}}) :=\{\varvec{x}\in {\mathbb {K}}\mid \varvec{e}^\top \varvec{x}\le 1\}\), which is a compact set of \({\mathbb {R}}^n\). Note that \(\varvec{A}\in \mathcal {COP}({\mathbb {K}})\) if and only if \(\varvec{A}\in \mathcal {COP}(\Delta ({\mathbb {K}}))\) for each \(\varvec{A}\in {\mathbb {S}}^n\) with a slight abuse of notation. Let \(\nu \) be the finite Borel measure uniformly supported on \(\Delta ({\mathbb {K}})\), i.e., \(\nu (B) :=\int _B 1_{\Delta ({\mathbb {K}})}d\varvec{x}\) for each B in the Borel \(\sigma \)-algebra of \({\mathbb {R}}^n\), where \(1_{\Delta ({\mathbb {K}})}\) is the indicator function of \(\Delta ({\mathbb {K}})\), and the notation \(d\varvec{x}\) represents the Lebesgue measure. Then, the moment \(\varvec{y} = (y_{\varvec{\alpha }})_{\varvec{\alpha }\in {\mathbb {N}}^n}\) of the measure \(\nu \) satisfies
for each \(\varvec{\alpha } = (\varvec{\alpha }_1,\alpha _{21},\ldots ,\alpha _{2n_2})\in {\mathbb {N}}^n\), where \(\Gamma (\cdot )\) denotes the gamma function and \(\beta _{2i} :=(\alpha _{2i}+1)/2\) for \(i = 2,\ldots ,n_2\). See “Appendix A” for the calculation of (27). Using the moment, the outer-approximation hierarchy for \(\mathcal {COP}({\mathbb {K}})\) given by Lasserre [33] can be constructed, which we call the Lasserre-type outer-approximation hierarchy.
Theorem 5.4
[33, Sect. 2.4] For each \(r\in {\mathbb {N}}\), we define
where \(\varvec{M}_r(f_{\varvec{A}}\varvec{y})\) is the symmetric matrix with the \((\varvec{\alpha },\varvec{\beta })\)th element \(\sum _{i,j=1}^nA_{ij}y_{\varvec{\alpha }+\varvec{\beta }+\varvec{e}_i+\varvec{e}_j}\) for each \(\varvec{\alpha },\varvec{\beta }\in {\mathbb {I}}_{\le r}^n\). Then, the sequence \(\{{\mathcal {K}}_{\textrm{L},r}({\mathbb {K}})\}_r\) satisfies \({\mathcal {K}}_{\textrm{L},r}({\mathbb {K}}) \downarrow \mathcal {COP}({\mathbb {K}})\).
The above discussion indicates that the three (dP-, ZVP-, and NN-type) inner-approximation hierarchies and two (Yıldırım- and Lasserre-type) outer-approximation hierarchies for \(\mathcal {COP}({\mathbb {K}})\) are basically described by semidefinite constraints. Table 1 summarizes the approximation hierarchies, from which we can observe the characteristics of each approximation hierarchy. In particular, the dP- and Yıldırım-type approximation hierarchies have features that differ from those of the ZVP-, NN-, and Lasserre-type approximation hierarchies.
First, the number of semidefinite constraints defining the dP- and Yıldırım-type approximation hierarchies is exponential in r but depends only on \(n_1\) and not on \(n_2\) because \({{\,\textrm{rk}\,}}= n_1 + 2\). In addition, their size is linear in \(n_2\). Thus, they would not be affected much by the increase in \(n_2\), and \(n_1\) determines the extent to which the depth parameter r can be computationally increased. Conversely, the other approximation hierarchies include semidefinite constraints whose maximum size is exponential in r and dependent on \(n = n_1 + n_2\). Thus, they would be considerably affected by the increase in \(n_2\) as well as in \(n_1\). The numerical experiment conducted in Sect. 6 supports this theoretical comparison.
Second, the dP- and Yıldırım-type approximation hierarchies are defined by multiple but small semidefinite constraints, which means that linear conic programming over these hierarchies can be reformulated as SDP with a block diagonal matrix structure. In this case, we can conduct some of the operations in the primal–dual interior-point methods independently for each block [19], thereby reducing the computational and spatial complexity.
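The benefit of the block diagonal structure rests on a simple fact: a block diagonal matrix is positive semidefinite if and only if every block is. The sketch below illustrates this with NumPy on made-up blocks; it is not the interior-point procedure of [19], only a check that the blockwise test and the assembled test agree.

```python
import numpy as np


def is_psd(M, tol=1e-9):
    """Positive semidefiniteness test via the smallest eigenvalue."""
    return bool(np.linalg.eigvalsh(M).min() >= -tol)


def block_diag(blocks):
    """Assemble square blocks into one block diagonal matrix."""
    size = sum(b.shape[0] for b in blocks)
    out = np.zeros((size, size))
    k = 0
    for b in blocks:
        m = b.shape[0]
        out[k:k + m, k:k + m] = b
        k += m
    return out


# Made-up blocks: the first two are PSD, the third has eigenvalues 3 and -1.
blocks = [
    np.array([[2.0, 1.0], [1.0, 2.0]]),
    np.array([[1.0]]),
    np.array([[1.0, 2.0], [2.0, 1.0]]),
]

blockwise = all(is_psd(b) for b in blocks)   # several small eigenvalue problems
assembled = is_psd(block_diag(blocks))       # one 5 x 5 eigenvalue problem
```

Working block by block replaces one factorization of the full matrix with several factorizations of small matrices, which is the source of the savings mentioned above.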
6 Numerical experiments
In this section, we consider the following COPP problem with the COP cone \(\mathcal {COP}({\mathbb {K}})\) over \({\mathbb {K}} = {\mathbb {R}}_+^{n_1}\times {\mathbb {L}}^{n_2}\):
where \(\varvec{C}\) is a symmetric positive definite matrix. Note that the dual problem of (28) is
Both (28) and its dual problem (29) satisfy Slater’s condition, as shown in Lemma 6.1 below; thus, (28) is well behaved in the sense that strong duality holds between (28) and (29).
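The displays of (28) and (29) are not reproduced in this text. One pair of formulations consistent with the proof of Lemma 6.1 (feasibility of \((y_0,\varvec{S}_0) = (0,\varvec{C})\) in (28) and the normalization \(\langle \varvec{E}_n,\varvec{X}_0\rangle = 1\) in (29)) is sketched below. It is an assumption made for orientation, not a quotation of the paper, and \(\varvec{E}_n\) is taken to be the matrix denoted by that symbol earlier in the paper.

```latex
% Hedged reconstruction; not copied from the paper's displays (28)-(29).
\begin{align*}
\text{(28)}\quad & \max_{y\in\mathbb{R},\,\varvec{S}} \; y
  \quad \text{s.t.}\quad y\varvec{E}_n + \varvec{S} = \varvec{C},\;\;
  \varvec{S}\in\mathcal{COP}(\mathbb{K}),\\
\text{(29)}\quad & \min_{\varvec{X}} \; \langle \varvec{C},\varvec{X}\rangle
  \quad \text{s.t.}\quad \langle \varvec{E}_n,\varvec{X}\rangle = 1,\;\;
  \varvec{X}\in\mathcal{CP}(\mathbb{K}).
\end{align*}
```

Under this form, weak duality is immediate: for feasible \((y,\varvec{S})\) and \(\varvec{X}\), \(\langle \varvec{C},\varvec{X}\rangle - y = \langle \varvec{S},\varvec{X}\rangle \ge 0\). It also matches the later remark that inner approximations of \(\mathcal {COP}({\mathbb {K}})\) yield lower bounds on the optimal value of (28).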
Lemma 6.1
Both problems (28) and (29) satisfy Slater’s condition, i.e., they have a feasible interior solution if \(\varvec{C}\) is a symmetric positive definite matrix.
Proof
Let \(y_0 :=0\) and \(\varvec{S}_0 :=\varvec{C}\). Then, \((y_0,\varvec{S}_0)\) is a feasible interior solution of (28). Next, let
and we prove that \(\varvec{X}_0\) is a feasible interior solution of problem (29). Let
Then, they span \({\mathbb {R}}^n\) as each \(\varvec{x}\in {\mathbb {R}}^n\) can be written as
Therefore, it follows from [22, Theorem 3.3] that
Given that \(\langle \varvec{E}_n,\varvec{X}_0'\rangle = n_1^2+3n_1+4n_2-1 > 0\), we obtain \(\langle \varvec{E}_n,\varvec{X}_0\rangle = 1\) and \(\varvec{X}_0 \in {{\,\textrm{int}\,}}\mathcal{C}\mathcal{P}({\mathbb {K}})\). \(\square \)
All experiments in this section were conducted on a computer with an Intel Core i5-8279U 2.40 GHz CPU and 16 GB of memory. The modeling language YALMIP [35] (version 20210331), the MOSEK solver [37] (version 9.3.3), and MATLAB (R2022a) were used to solve the optimization problems. Based on Lemma 6.1, a coefficient matrix \(\varvec{C}\) in problem (28) was randomly generated such that it was symmetric positive definite. We measured three types of time when solving the optimization problems:
- preparetime: Time taken before calling the YALMIP commands optimize or solvesos.
- yalmiptime: Time between calling the above commands and beginning to solve an optimization problem in MOSEK.
- solvertime: Time required to solve an optimization problem in MOSEK.
We defined the total time as the sum of the three types of time, and the calculation was considered invalid when the total time exceeded 7200 s.
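The bookkeeping described above can be sketched as follows. The class name and the sample timings are hypothetical; only the three field names and the 7200 s cutoff come from the text.

```python
from dataclasses import dataclass

TIME_LIMIT_S = 7200.0  # cutoff stated in the text


@dataclass
class RunTiming:
    preparetime: float  # before calling optimize or solvesos
    yalmiptime: float   # between the call and the start of the MOSEK solve
    solvertime: float   # inside MOSEK

    @property
    def total(self) -> float:
        """Total time: the sum of the three measured times."""
        return self.preparetime + self.yalmiptime + self.solvertime

    @property
    def is_valid(self) -> bool:
        """A run is invalid when its total time exceeds the limit."""
        return self.total <= TIME_LIMIT_S


# Made-up numbers for illustration only; not taken from Tables 2-5.
run = RunTiming(preparetime=5000.0, yalmiptime=1500.0, solvertime=800.0)
```

Here the solver time alone is small, yet the run is discarded because the total exceeds the limit, which is why both solver and total times matter in the comparison.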
6.1 Comparison of approximation hierarchies
Table 1 lists five approximation hierarchies for \(\mathcal {COP}({\mathbb {K}})\) we have introduced. Here, we solve optimization problems obtained by replacing the COP cone \(\mathcal {COP}({\mathbb {K}})\) in (28) with the approximation hierarchies. For convenience, we hereafter call such problems dP-type approximation problems (of depth r), for example. The YALMIP command optimize was used when solving dP-, Yıldırım-, and Lasserre-type approximation problems, and solvesos was used when solving ZVP- and NN-type approximation problems. For each approximation hierarchy, we continuously increased the parameter r that decides the depth of the hierarchy until the total time exceeded 7200 s. When solving dP- and Yıldırım-type approximation problems, the concise expressions mentioned in Sects. 4.1.2 and 4.2.2 were adopted. (Sect. 6.2 investigates their numerical effect.) The pair \((n_1,n_2)\) was set to (20, 5), (5, 20), and (5, 25).
Tables 2, 3 and 4 show the results of \((n_1,n_2) = (20,5)\), (5, 20), and (5, 25), respectively. These tables report the solver and total time because some of the approximation hierarchies spent most of the total time before beginning to solve optimization problems in MOSEK. Although we used YALMIP for convenience, the total time would be substantially reduced if we implemented these approximation hierarchies directly.
As shown in Tables 3 and 4, the optimal values of the ZVP- and NN-type inner-approximation problems agree with those of the Yıldırım-type outer-approximation problems, which implies that the ZVP-, NN-, and Yıldırım-type approximation hierarchies almost approach the optimal value of the original COPP problem (28) for \((n_1,n_2) = (5,20)\) and (5, 25).
Although all Lasserre-type outer-approximation problems were considered unbounded, this would result from the moments (27) taking infinitesimal values. Indeed, the MOSEK solver provided a warning about treating nearly zero elements, and we determined that \(y_{\varvec{\alpha }}\) is approximately \(3.52\times 10^{-37}\) for \((n_1,n_2) = (5,25)\) and \(\varvec{\alpha } = 2\varvec{e}_1\in {\mathbb {R}}^{30}\), for instance. Hence, the Lasserre-type outer-approximation hierarchy is numerically unstable, whereas the Yıldırım-type outer-approximation hierarchy is numerically stable.
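One way to see why moments of this magnitude are troublesome is that a semidefinite constraint whose entries are all of order \(10^{-37}\) can be violated by an amount far below any practical feasibility tolerance. The sketch below is an illustrative interpretation, not the authors' diagnosis; the tolerance `tol` is a hypothetical value and not a documented MOSEK setting.

```python
import numpy as np

y_scale = 3.52e-37  # magnitude of the moment reported in the text
tol = 1e-8          # hypothetical feasibility tolerance (assumption)

# An indefinite matrix (eigenvalues +1 and -1) scaled down to the moment magnitude.
indefinite = np.array([[1.0, 0.0], [0.0, -1.0]])
scaled = y_scale * indefinite

min_eig_scaled = np.linalg.eigvalsh(scaled).min()
min_eig_unit = np.linalg.eigvalsh(indefinite).min()

# The scaled matrix passes a tolerance-based PSD test although it is not PSD.
passes_scaled = bool(min_eig_scaled >= -tol)
passes_unit = bool(min_eig_unit >= -tol)
```

A constraint that is numerically satisfied by indefinite matrices no longer restricts the feasible set, which is consistent with the solver reporting unboundedness.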
The results of this numerical experiment support the theoretical comparison mentioned in Sect. 5. As shown in Tables 3 and 4, the dP- and Yıldırım-type approximation hierarchies are not affected much by the increase in \(n_2\). The solver and total times required to solve the dP- and Yıldırım-type approximation problems with \((n_1,n_2) = (5,25)\) are less than twice those with \((n_1,n_2) = (5,20)\) regardless of r, except for \(r = 0\). Conversely, the others are considerably affected by the increase in \(n_2\). For example, the solver and total times required to solve the NN-type inner-approximation problem with \((n_1,n_2) = (5,25)\) of depth 0 are more than ten times those with \((n_1,n_2) = (5,20)\). In addition, although the increase for \(r = 0,1\) is mild, the times required to solve the ZVP-type inner-approximation problem with \((n_1,n_2) = (5,25)\) of depth 2 are also approximately ten times those with \((n_1,n_2) = (5,20)\) (see Note 1).
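The jump in constraint size behind the ZVP-type timings can be checked with a short count. Note 1 states that \(|{\mathbb {I}}_{=1}^n| = n\) and \(|{\mathbb {I}}_{=2}^n| = n(n+1)/2\); the sketch assumes that \({\mathbb {I}}_{=r}^n\) is the set of multi-indices in \({\mathbb {N}}^n\) whose entries sum to r, whose cardinality is the standard count \(\binom{n+r-1}{r}\). The function name is illustrative.

```python
from math import comb


def num_multi_indices(n, r):
    """Number of alpha in N^n with alpha_1 + ... + alpha_n = r."""
    return comb(n + r - 1, r)


# n = n_1 + n_2 for the pairs (5, 20) and (5, 25) used in the experiment.
sizes = {n: [num_multi_indices(n, r) for r in (1, 2)] for n in (25, 30)}
```

For \(n = 25\), the size grows from 25 to 325 when r increases from 1 to 2, and for \(n = 30\), from 30 to 465, so the depth-2 constraints are more than ten times larger than the depth-1 constraints in both settings.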
As shown in Tables 2 and 3, the dP- and Yıldırım-type approximation hierarchies are considerably affected by the increase in \(n_1\). The solver and total times required to solve the dP- and Yıldırım-type approximation problems with \((n_1,n_2) = (20,5)\) are longer than those with \((n_1,n_2) = (5,20)\) for each r, and the difference rapidly increases with r. Because \(n = n_1 + n_2\) is the same for the two pairs and because \(n_2\) in the pair \((n_1,n_2) = (20,5)\) is smaller than that in the pair \((n_1,n_2) = (5,20)\), we can conclude that the increase in the required time results from the increase in \(n_1\). Conversely, if \(n_1\) is small, we can increase the depth parameter r and, in this case, the Yıldırım-type outer-approximation hierarchy may approach a nearly optimal value of the COPP problems.
Finally, the ZVP- and NN-type inner-approximation hierarchies provided much tighter bounds than that of the dP-type in all cases, and, as mentioned above, the two hierarchies are guaranteed to approach nearly optimal values even at a depth of 0 for \((n_1,n_2) = (5,20)\) and (5, 25). Moreover, the time required to solve the ZVP-type inner-approximation problems of depth 0 is much shorter than that for the NN-type ones. Therefore, from a practical perspective, using the zeroth level of the ZVP-type inner-approximation hierarchy may be preferable if aiming to obtain a reasonable lower bound of the optimal value of the original problem (28).
6.2 Effect of concise expressions of dP- and Yıldırım-type approximation hierarchies
In this subsection, we investigate the numerical effect of the concise expressions of the dP- and Yıldırım-type approximation hierarchies mentioned in Sects. 4.1.2 and 4.2.2. The dP- and Yıldırım-type approximation hierarchies with the full expressions are provided by (19) and (23), respectively. Except for the differences in the expressions of these approximation hierarchies, the experimental settings were the same as those in Sect. 6.1.
Table 5 provides the results of \((n_1,n_2)=(5,25)\). The optimal values are omitted because those of the dP- and Yıldırım-type approximation problems with the full expressions are theoretically (and numerically) the same as those with the concise expressions. The effect of the concise expressions is significant, and the solver and total times required to solve the dP- and Yıldırım-type approximation problems with the concise expressions are shorter than those with the full expressions, except for the total time of solving the dP-type inner-approximation problem of depth 0. Note that the effect was also confirmed for \((n_1,n_2) = (20,5)\).
7 Conclusion and future works
In this study, we provided approximation hierarchies for the COP cone over a symmetric cone and compared them with existing approximation hierarchies. We first provided the NN-type inner-approximation hierarchy. Its strength is that the hierarchy permits an SOS representation for a general symmetric cone. We then provided the dP- and Yıldırım-type approximation hierarchies for the COP matrix cone over the direct product of a nonnegative orthant and second-order cone by exploiting those for the usual COP cone provided by de Klerk and Pasechnik [10] and Yıldırım [53]. Remarkably, they are not affected much by the increase in the size of the second-order cone, unlike the NN-type and existing approximation hierarchies. Combining the proposed approximation hierarchies with those existing, we obtained nearly optimal values of COPP problems when the size of the nonnegative orthant is small.
Unfortunately, the difficulty that a set of Jordan frames is infinite is resolved in Sects. 4.1 and 4.2 only when the symmetric cone is the direct product of a nonnegative orthant and a second-order cone and \(m = 2\). However, as shown by the following proposition, we obtain an outer-approximation hierarchy implementable on a computer for the COP cone over a general symmetric cone by replacing a set \({\mathfrak {F}}\) of Jordan frames appearing in (21) with its finite subset \({\mathfrak {F}}_r\) such that the union \(\bigcup _r{\mathfrak {F}}_r\) is dense in \({\mathfrak {F}}\). The proof is omitted for brevity.
Proposition 7.1
Let \(\{{\mathcal {O}}_r^{{{\,\textrm{rk}\,}},m}\}_r\) be a sequence such that each \({\mathcal {O}}_r^{{{\,\textrm{rk}\,}},m}\) is a closed convex cone and \({\mathcal {O}}_r^{{{\,\textrm{rk}\,}},m} \downarrow \mathcal {COP}^{{{\,\textrm{rk}\,}},m}\). We fix a set \({\mathfrak {F}}\) with \({\mathfrak {F}}_{\textrm{c}}({\mathbb {E}}) \subseteq {\mathfrak {F}} \subseteq {\mathfrak {F}}({\mathbb {E}})\) and let \(\{{\mathfrak {F}}_r\}_r\) be a non-decreasing sequence of finite subsets of \({\mathfrak {F}}\) such that \(\bigcup _{r=0}^{\infty }{\mathfrak {F}}_r\) is dense in \({\mathfrak {F}}\). Let
Then, each \(\widehat{{\mathcal {O}}}_r^{n,m}({\mathbb {E}}_+)\) is a closed convex cone and \(\widehat{{\mathcal {O}}}_r^{n,m}({\mathbb {E}}_+) \downarrow \mathcal {COP}^{n,m}({\mathbb {E}}_+)\).
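To illustrate what a non-decreasing sequence of finite subsets with a dense union can look like, the sketch below treats only the three-dimensional second-order cone, which is a special case and not the general setting of Proposition 7.1. It assumes the standard Jordan product \(x\circ y = (x^\top y,\, x_0{\bar{y}} + y_0{\bar{x}})\) with identity \((1,0,0)\), under which the Jordan frames are the pairs \(\tfrac{1}{2}(1,u)\), \(\tfrac{1}{2}(1,-u)\) with u on the unit circle. The dyadic sampling of u and all function names are illustrative choices.

```python
import math
from fractions import Fraction


def angle_fractions(r):
    """Dyadic fractions k / 2^r of a full turn; nested in r with dense union."""
    return {Fraction(k, 2 ** r) for k in range(2 ** r)}


def jordan_frame(t):
    """Jordan frame (c1, c2) of the 3-dimensional second-order cone at turn fraction t."""
    theta = 2.0 * math.pi * float(t)
    u = (math.cos(theta), math.sin(theta))
    c1 = (0.5, 0.5 * u[0], 0.5 * u[1])
    c2 = (0.5, -0.5 * u[0], -0.5 * u[1])
    return c1, c2


def jordan_product(x, y):
    """x o y = (x^T y, x0 * ybar + y0 * xbar) (assumed convention)."""
    head = sum(a * b for a, b in zip(x, y))
    tail = tuple(x[0] * b + y[0] * a for a, b in zip(x[1:], y[1:]))
    return (head,) + tail


# Finite frame sets F_0, F_1, F_2, F_3 indexed by exact dyadic fractions.
frame_sets = [angle_fractions(r) for r in range(4)]
c1, c2 = jordan_frame(Fraction(3, 8))
```

Indexing the frames by exact fractions makes the nesting \({\mathfrak {F}}_r \subseteq {\mathfrak {F}}_{r+1}\) checkable without floating-point comparison; for the cone \({\mathbb {R}}_+^{n_1}\times {\mathbb {L}}^{n_2}\) or a general symmetric cone, a different parametrization of the frames would be needed.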
The characterization of \(\mathcal {COP}^{n,m}({\mathbb {E}}_+)\) provided by Theorem 4.5 might also be useful for investigating its geometric properties. For example, faces (see Note 2) are geometric objects of a closed convex cone, and the facial structure of the usual COP cone has been investigated [11, 15]. The following proposition enables us to study the facial structure of \(\mathcal {COP}^{n,m}({\mathbb {E}}_+)\) through that of the usual COP cone appearing in the characterization of \(\mathcal {COP}^{n,m}({\mathbb {E}}_+)\).
Proposition 7.2
Let \({\mathcal {F}}\) be a face of \(\mathcal {COP}^{{{\,\textrm{rk}\,}},m}\). Then,
is a face of \(\mathcal {COP}^{n,m}({\mathbb {E}}_+)\).
Proof
Since \({\mathcal {A}}(c_1,\ldots ,c_{{{\,\textrm{rk}\,}}})\) is linear with respect to \({\mathcal {A}} \in {\mathcal {S}}^{n,m}({\mathbb {E}})\) for each \((c_1,\ldots ,c_{{{\,\textrm{rk}\,}}})\in {\mathfrak {F}}\), we see that \(\mathcal {COP}^{n,m}({\mathbb {E}}_+;{\mathcal {F}})\) is a convex subcone of \(\mathcal {COP}^{n,m}({\mathbb {E}}_+)\). For \({\mathcal {A}},{\mathcal {B}}\in \mathcal {COP}^{n,m}({\mathbb {E}}_+)\), we assume that
Let \((c_1,\ldots ,c_{{{\,\textrm{rk}\,}}})\in {\mathfrak {F}}\) be arbitrary. Then, it follows from (30) that
The two tensors \({\mathcal {A}}(c_1,\ldots ,c_{{{\,\textrm{rk}\,}}})\) and \({\mathcal {B}}(c_1,\ldots ,c_{{{\,\textrm{rk}\,}}})\) belong to \(\mathcal {COP}^{{{\,\textrm{rk}\,}},m}\) since \({\mathcal {A}},{\mathcal {B}}\in \mathcal {COP}^{n,m}({\mathbb {E}}_+)\). Given that \({\mathcal {F}}\) is a face of \(\mathcal {COP}^{{{\,\textrm{rk}\,}},m}\), we see that \({\mathcal {A}}(c_1,\ldots ,c_{{{\,\textrm{rk}\,}}})\) and \({\mathcal {B}}(c_1,\ldots ,c_{{{\,\textrm{rk}\,}}})\) belong to \({\mathcal {F}}\). As \((c_1,\ldots ,c_{{{\,\textrm{rk}\,}}})\in {\mathfrak {F}}\) is arbitrary, we obtain \({\mathcal {A}},{\mathcal {B}}\in \mathcal {COP}^{n,m}({\mathbb {E}}_+;{\mathcal {F}})\). \(\square \)
Finally, questions arise concerning the inclusion among the dP-, ZVP-, and NN-type inner-approximation hierarchies. In the numerical experiment conducted in Sect. 6.1, the ZVP- and NN-type inner-approximation hierarchies provided considerably tighter bounds than the dP-type. In the case in which the symmetric cone is a nonnegative orthant, the dP-type inner-approximation hierarchy is well known [10] to be included in the ZVP- and NN-type ones, which agree with that provided by Parrilo [40]. Investigating whether the inclusion also holds where the symmetric cone is the direct product of a nonnegative orthant and second-order cone would be interesting.
Data availability
The datasets generated during the current study are available from the corresponding author on reasonable request.
Notes
1. This is because, as shown in Table 1, the maximum size of the semidefinite constraints defining the ZVP-type inner-approximation hierarchy increases from \(|{\mathbb {I}}_{=1}^n| = n\) to \(|{\mathbb {I}}_{=2}^n| = n(n+1)/2\) when r increases from 1 to 2.
2. Recall that a nonempty convex subcone \({\mathcal {F}}\) of a closed convex cone \({\mathcal {K}}\) is a face of \({\mathcal {K}}\) if the following condition holds: for any \(x,y\in {\mathcal {K}}\), if \(x+y\in {\mathcal {F}}\), then \(x,y\in {\mathcal {F}}\).
References
Ahmadi, A.A., Majumdar, A.: DSOS and SDSOS optimization: more tractable alternatives to sum of squares and semidefinite optimization. SIAM J. Appl. Algebra Geom. 3(2), 193–230 (2019). https://doi.org/10.1137/18M118935X
Alizadeh, F.: An introduction to formally real Jordan algebras and their applications in optimization. In: Anjos, M.F., Lasserre, J.B. (eds.) Handbook on Semidefinite, Conic and Polynomial Optimization, pp. 297–337. Springer, Boston, MA (2012). https://doi.org/10.1007/978-1-4614-0769-0_11
Bai, L., Mitchell, J.E., Pang, J.-S.: On conic QPCCs, conic QCQPs and completely positive programs. Math. Program. 159, 109–136 (2016). https://doi.org/10.1007/s10107-015-0951-9
Bertsekas, D.P.: Nonlinear Programming, 2nd edn. Athena Scientific, Belmont, MA (1999)
Bomze, I.M., Dür, M., de Klerk, E., Roos, C., Quist, A.J., Terlaky, T.: On copositive programming and standard quadratic optimization problems. J. Glob. Optim. 18, 301–320 (2000). https://doi.org/10.1023/A:1026583532263
Boyd, S., Vandenberghe, L.: Convex Optimization. Cambridge University Press, Cambridge (2004)
Bundfuss, S., Dür, M.: An adaptive linear approximation algorithm for copositive programs. SIAM J. Optim. 20(1), 30–53 (2009). https://doi.org/10.1137/070711815
Burer, S.: On the copositive representation of binary and continuous nonconvex quadratic programs. Math. Program. 120, 479–495 (2009). https://doi.org/10.1007/s10107-008-0223-z
Chen, H., Li, G., Qi, L.: SOS tensor decomposition: theory and applications. Commun. Math. Sci. 14(8), 2073–2100 (2016). https://doi.org/10.4310/CMS.2016.v14.n8.a1
de Klerk, E., Pasechnik, D.V.: Approximation of the stability number of a graph via copositive programming. SIAM J. Optim. 12(4), 875–892 (2002). https://doi.org/10.1137/S1052623401383248
Dickinson, P.J.C.: Geometry of the copositive and completely positive cones. J. Math. Anal. Appl. 380(1), 377–395 (2011). https://doi.org/10.1016/j.jmaa.2011.03.005
Dickinson, P.J.C., Gijben, L.: On the computational complexity of membership problems for the completely positive cone and its dual. Comput. Optim. Appl. 57(2), 403–415 (2014). https://doi.org/10.1007/s10589-013-9594-z
Dickinson, P.J.C., Povh, J.: Moment approximations for set-semidefinite polynomials. J. Optim. Theory Appl. 159, 57–68 (2013). https://doi.org/10.1007/s10957-013-0279-7
Dong, H.: Symmetric tensor approximation hierarchies for the completely positive cone. SIAM J. Optim. 23(3), 1850–1866 (2013). https://doi.org/10.1137/100813816
Dong, M., Chen, H.: Geometry of the copositive tensor cone and its dual. Asia-Pac. J. Oper. Res. 37(4), 2040008 (2020). https://doi.org/10.1142/S0217595920400084
Faraut, J., Korányi, A.: Analysis on Symmetric Cones. Clarendon Press, Oxford (1994)
Faybusovich, L.: Linear systems in Jordan algebras and primal-dual interior-point algorithms. J. Comput. Appl. Math. 86(1), 149–175 (1997). https://doi.org/10.1016/S0377-0427(97)00153-2
Folland, G.B.: How to integrate a polynomial over a sphere. Am. Math. Mon. 108(5), 446–448 (2001). https://doi.org/10.1080/00029890.2001.11919774
Fujisawa, K., Kojima, M., Nakata, K.: Exploiting sparsity in primal-dual interior-point methods for semidefinite programming. Math. Program. 79, 235–253 (1997). https://doi.org/10.1007/BF02614319
Gouveia, J., Pong, T.K., Saee, M.: Inner approximating the completely positive cone via the cone of scaled diagonally dominant matrices. J. Glob. Optim. 76, 383–405 (2020). https://doi.org/10.1007/s10898-019-00861-3
Gowda, M.S.: Weighted LCPs and interior point systems for copositive linear transformations on Euclidean Jordan algebras. J. Glob. Optim. 74, 285–295 (2019). https://doi.org/10.1007/s10898-019-00760-7
Gowda, M.S., Sznajder, R.: On the irreducibility, self-duality, and non-homogeneity of completely positive cones. Electron. J. Linear Algebra 26, 177–191 (2013). https://doi.org/10.13001/1081-3810.1648
Grundmann, A., Möller, H.M.: Invariant integration formulas for the \(n\)-simplex by combinatorial methods. SIAM J. Numer. Anal. 15(2), 282–290 (1978). https://doi.org/10.1137/0715019
Guo, X., Deng, Z., Fang, S.-C., Xing, W.: Quadratic optimization over one first-order cone. J. Ind. Manag. Optim. 10(3), 945–963 (2014). https://doi.org/10.3934/jimo.2014.10.945
Gvozdenović, N., Laurent, M.: Semidefinite bounds for the stability number of a graph via sums of squares of polynomials. Math. Program. 110, 145–173 (2007). https://doi.org/10.1007/s10107-006-0062-8
Henrion, D., Lasserre, J.B.: Convergent relaxations of polynomial matrix inequalities and static output feedback. IEEE Trans. Autom. Control 51(2), 192–202 (2006). https://doi.org/10.1109/TAC.2005.863494
Hol, C.W.J., Scherer, C.W.: Sum of squares relaxations for polynomial semi-definite programming. In: Proceedings of the 16th International Symposium on Mathematical Theory of Networks and Systems, pp. 1–10 (2004)
Hol, C.W.J., Scherer, C.W.: A sum-of-squares approach to fixed-order \(H_{\infty }\)-synthesis. In: Henrion, D., Garulli, A. (eds.) Positive Polynomials in Control, pp. 45–71. Springer, Berlin, Heidelberg (2005). https://doi.org/10.1007/10997703_3
Iqbal, M.F., Ahmed, F.: Approximation hierarchies for the copositive tensor cone and their application to the polynomial optimization over the simplex. Mathematics 10(10), 1683 (2022). https://doi.org/10.3390/math10101683
Kim, S., Kojima, M.: Solving polynomial least squares problems via semidefinite programming relaxations. J. Glob. Optim. 46, 1–23 (2010). https://doi.org/10.1007/s10898-009-9405-3
Kim, S., Kojima, M., Toh, K.-C.: A geometrical analysis on convex conic reformulations of quadratic and polynomial optimization problems. SIAM J. Optim. 30(2), 1251–1273 (2020). https://doi.org/10.1137/19M1237715
Kojima, M.: Sums of squares relaxations of polynomial semidefinite programs. Research report B-397, Department of Mathematical and Computing Sciences, Tokyo Institute of Technology, Tokyo (2003)
Lasserre, J.B.: New approximations for the cone of copositive matrices and its dual. Math. Program. 144, 265–276 (2014). https://doi.org/10.1007/s10107-013-0632-5
Li, G., Mordukhovich, B.S., Nghia, T.T.A., Phạm, T.S.: Error bounds for parametric polynomial systems with applications to higher-order stability analysis and convergence rates. Math. Program. 168, 313–346 (2018). https://doi.org/10.1007/s10107-016-1014-6
Löfberg, J.: YALMIP: a toolbox for modeling and optimization in MATLAB. In: Proceedings of the 2004 IEEE International Symposium on Computer Aided Control Systems Design, pp. 284–289 (2004). https://doi.org/10.1109/CACSD.2004.1393890
Miyashiro, R., Takano, Y.: Mixed integer second-order cone programming formulations for variable selection in linear regression. Eur. J. Oper. Res. 247(3), 721–731 (2015). https://doi.org/10.1016/j.ejor.2015.06.081
Mosek: MOSEK Optimization Toolbox for MATLAB. https://www.mosek.com/ (2023). Accessed 18 April 2023
Nishijima, M., Nakata, K.: A block coordinate descent method for sensor network localization. Optim. Lett. 16, 1051–1071 (2022). https://doi.org/10.1007/s11590-021-01762-9
Papp, D., Alizadeh, F.: Semidefinite characterization of sum-of-squares cones in algebras. SIAM J. Optim. 23(3), 1398–1423 (2013). https://doi.org/10.1137/110843265
Parrilo, P.A.: Structured Semidefinite Programs and Semialgebraic Geometry Methods in Robustness and Optimization. Ph.D. thesis, California Institute of Technology, Pasadena, CA (2000)
Peña, J., Vera, J., Zuluaga, L.F.: Computing the stability number of a graph via linear and semidefinite programming. SIAM J. Optim. 18(1), 87–105 (2007). https://doi.org/10.1137/05064401X
Peña, J., Vera, J.C., Zuluaga, L.F.: Completely positive reformulations for polynomial optimization. Math. Program. 151, 405–431 (2015). https://doi.org/10.1007/s10107-014-0822-9
Povh, J., Rendl, F.: Copositive and semidefinite relaxations of the quadratic assignment problem. Discrete Optim. 6(3), 231–241 (2009). https://doi.org/10.1016/j.disopt.2009.01.002
Powers, V., Reznick, B.: A new bound for Pólya’s theorem with applications to polynomials positive on polyhedra. J. Pure Appl. Algebra 164(1–2), 221–229 (2001). https://doi.org/10.1016/S0022-4049(00)00155-9
Qi, L.: Symmetric nonnegative tensors and copositive tensors. Linear Algebra Appl. 439(1), 228–238 (2013). https://doi.org/10.1016/j.laa.2013.03.015
Qi, L., Luo, Z.: Tensor Analysis: Spectral Theory and Special Tensors. SIAM, Philadelphia, PA (2017). https://doi.org/10.1137/1.9781611974751
Qi, L., Xu, C., Xu, Y.: Nonnegative tensor factorization, completely positive tensors, and a hierarchical elimination algorithm. SIAM J. Matrix Anal. Appl. 35(4), 1227–1241 (2014). https://doi.org/10.1137/13092232X
Reznick, B.: Sums of Even Powers of Real Linear Forms. American Mathematical Society, Providence, RI (1992)
Reznick, B.: Uniform denominators in Hilbert’s seventeenth problem. Math Z 220, 75–97 (1995). https://doi.org/10.1007/BF02572604
Shaked-Monderer, N., Berman, A.: Copositive and Completely Positive Matrices. World Scientific, Singapore (2021). https://doi.org/10.1142/11386
Sponsel, J., Bundfuss, S., Dür, M.: An improved algorithm to test copositivity. J. Glob. Optim. 52, 537–551 (2012). https://doi.org/10.1007/s10898-011-9766-2
Sturm, J.F., Zhang, S.: On cones of nonnegative quadratic functions. Math. Oper. Res. 28(2), 246–267 (2003). https://doi.org/10.1287/moor.28.2.246.14485
Yıldırım, E.A.: On the accuracy of uniform polyhedral approximations of the copositive cone. Optim. Methods Softw. 27(1), 155–173 (2012). https://doi.org/10.1080/10556788.2010.540014
Zhou, A., Fan, J.: A hierarchy of semidefinite relaxations for completely positive tensor optimization problems. J. Glob. Optim. 75, 417–437 (2019). https://doi.org/10.1007/s10898-019-00751-8
Zuluaga, L.F., Vera, J., Peña, J.: LMI approximations for cones of positive semidefinite forms. SIAM J. Optim. 16(4), 1076–1091 (2006). https://doi.org/10.1137/03060151X
Acknowledgements
The authors thank anonymous reviewers for their useful comments. This work was supported by Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research (Grant Numbers JP20H02385 and JP22J20008).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Appendix A: Calculation of (27)
Note that we can represent the set \(\Delta ({\mathbb {K}})\) as
where
Then, for \(\varvec{\alpha } = (\varvec{\alpha }_1,\alpha _{21},\ldots ,\alpha _{2n_2})\in {\mathbb {N}}^n\), it follows that
From [18], we note that
As (A1) implies that \(y_{\varvec{\alpha }} = 0\) when some of \(\alpha _{22},\ldots ,\alpha _{2n_2}\) are odd, we consider the case in which all \(\alpha _{22},\ldots ,\alpha _{2n_2}\) are even. In this case, we have
See [23], for example, to obtain the last equation.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Nishijima, M., Nakata, K. Approximation hierarchies for copositive cone over symmetric cone and their comparison. J Glob Optim 88, 831–870 (2024). https://doi.org/10.1007/s10898-023-01319-3
DOI: https://doi.org/10.1007/s10898-023-01319-3