1 Introduction

A semidefinite optimization problem (SDP) is an optimization problem over variables in the space of symmetric matrices, with a linear objective function, linear equality constraints, and a conic constraint over the semidefinite cone. We denote the space of symmetric matrices by \({\mathbb S}^n:=\{X\in {\mathbb {R}}^{n\times n}\mid X_{i,j}=X_{j,i} \ (1 \le i < j \le n) \}\) and the semidefinite cone by \(\mathcal{S}^n_+:=\{X\in {\mathbb {S}}^n\mid d^TXd\ge 0 \ \text{ for } \text{ any } \ d\in {\mathbb {R}}^n \}\). With this notation, an SDP in standard form is written as

$$\begin{aligned} \min&\ \langle C,X\rangle \nonumber \\ \mathrm{s.t.}&\ \langle A_j,X\rangle =b_j,j=1,2,\ldots ,m,\nonumber \\&\ X\in \mathcal{S}^n_+, \end{aligned}$$
(1)

where \(C\in {\mathbb {S}}^n\), \(A_j\in {\mathbb {S}}^n\), \(b_j\in {\mathbb {R}}\) (\(j=1,2,\ldots ,m\)), and \(\langle A,B\rangle :={\mathrm{Trace}}(A^TB)=\sum _{i,j=1}^nA_{i,j}B_{i,j}\) is the inner product over \({\mathbb {S}}^n\).
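To fix ideas, the standard form (1) can be written down directly in a modeling language. The following is a minimal sketch in Python with CVXPY; the data \(C\), \(A_j\), \(b_j\) are randomly generated placeholders (not taken from this paper), and an SDP-capable solver such as SCS or MOSEK is assumed to be installed.

```python
import cvxpy as cp
import numpy as np

# Illustrative problem data (placeholders): C is made PSD so the problem is bounded,
# and b is built from a feasible X0 so the constraints are consistent.
n, m = 4, 3
rng = np.random.default_rng(0)
C = rng.standard_normal((n, n)); C = C @ C.T
A = []
for _ in range(m):
    M = rng.standard_normal((n, n))
    A.append((M + M.T) / 2)                              # A_j in S^n
X0 = rng.standard_normal((n, n)); X0 = X0 @ X0.T
b = [float(np.trace(Aj @ X0)) for Aj in A]

X = cp.Variable((n, n), PSD=True)                        # X in S^n_+
constraints = [cp.trace(A[j] @ X) == b[j] for j in range(m)]
prob = cp.Problem(cp.Minimize(cp.trace(C @ X)), constraints)
prob.solve()                                             # dispatches to an installed SDP solver
print(prob.status, prob.value)
```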

SDPs are powerful tools that provide convex relaxations for combinatorial and nonconvex optimizations, such as the max-cut problem (e.g., [12, 19]) and the k-equipartition problem (e.g., [23, 46]). Some of these relaxations can even attain the optimum, as shown in [24, 31]. Interested readers may find details about SDPs and their relaxations in [32, 42, 46].

A cone \({{\mathcal {K}}} \subset {\mathbb {S}}^n\) is called proper if it is closed, pointed (i.e., \({{\mathcal {K}}} \cap -{{\mathcal {K}}}=\{O\}\)), convex, and has a non-empty interior. It is known that the semidefinite cone is a proper cone [9]. By replacing the semidefinite constraint \(X\in {{\mathcal {S}}}^n_+\) in (1) with a general conic constraint \(X\in {{\mathcal {K}}}\) for a proper cone \(\mathcal{K} \subset {\mathbb {S}}^n\), one obtains a general class of problems, namely, conic optimization problems. This class has been an active field of study because it contains many popular problem classes, including linear optimization problems (LPs), second-order cone programs (SOCPs), SDPs, and copositive programs. Copositive programs have been shown to provide tight lower bounds for combinatorial and quadratic optimization problems, as described in the survey paper by Dür [17] and the recent work of Arima et al. [3, 4, 25], among others. It has also been shown that a copositive relaxation sometimes gives a highly accurate approximate solution for some combinatorial problems under certain conditions [5, 11]. However, the copositive program and its dual problem are both NP-hard (see, e.g., [16, 36]).

SDPs are also attractive because they can be solved in polynomial time to any desired precision. There are state-of-the-art solvers, such as SDPA [47], SeDuMi [40], SDPT3 [43], and Mosek [35], but their computations become difficult when the size of the SDP becomes large. To overcome this deficiency, for example, one may use preprocessing to reduce the size of the SDPs, which leads to facial reduction methods [37, 38, 44]. As another idea, one may generate relaxations of SDPs and solve them as easily handled optimization problems, e.g., LPs and SOCPs, which leads to cutting plane methods. We will focus on these latter methods.

The cutting plane method solves an SDP by transforming it into a more tractable optimization problem (e.g., an LP or an SOCP) and adding cutting planes at each iteration; these planes cut the current approximate solution off from the feasible regions of subsequent iterations and bring the objective value closer to the optimal value. The cutting plane method was first used on the traveling-salesman problem by Dantzig et al. [13, 14] in 1954, and in 1958 Gomory [20] used it to solve integer linear programming problems. As SDPs became popular, it came to be used on them as well; see, for instance, Krishnan and Mitchell [28,29,30] and Konno et al. [27]. Kobayashi and Takano [26] applied it to a class of mixed-integer SDPs. In [2], Ahmadi et al. applied it to nonconvex polynomial optimization problems and copositive programs.

In the above-mentioned cutting plane methods for SDPs, the semidefinite constraint \(X\in {{\mathcal {S}}}^n_+\) in (1) is first relaxed to \(X\in {{\mathcal {K}}}_{\mathrm{out}}\), where \(\mathcal{S}^n_+\subseteq {{\mathcal {K}}}_{\mathrm{out}}\subseteq {{\mathbb {S}}}^n\), and an initial relaxation of the SDP is obtained. If \({{\mathcal {K}}}_{\mathrm{out}}\) is polyhedral, the initial relaxation may give an LP; if \(\mathcal{K}_{\mathrm{out}}\) is given by second-order constraints, the initial relaxation becomes an SOCP. To improve the performance of these cutting plane methods, we consider generating initial relaxations for SDPs that are both tight and computationally efficient and focus on approximations of \({{\mathcal {S}}}^n_+\).

Many approximations of \({{\mathcal {S}}}^n_+\) have been proposed on the basis of its well-known properties. Kobayashi and Takano [26] used the fact that the diagonal elements of semidefinite matrices are nonnegative. Konno et al. [27] imposed an assumption that all diagonal elements of the variable X in the SDPs appearing in their iterative algorithm are bounded by a constant. The sets of diagonally dominant matrices and scaled diagonally dominant matrices are known to be cones contained in \({{\mathcal {S}}}^n_+\) (see e.g., [2, 22] for details). The inclusive relation among them has been studied in, e.g., [7, 8]. Ahmadi et al. [1, 2] used these sets as initial approximations of their cutting plane method. Boman et al. [10] defined the factor width of a semidefinite matrix, and Permenter and Parrilo used it to generate approximations of \(\mathcal{S}^n_+\), which they applied to facial reduction methods in [38].

Tanaka and Yoshise [41] defined various bases of \({\mathbb {S}}^n\), each consisting of \(\frac{n(n+1)}{2}\) semidefinite matrices and called semidefinite (SD) bases, and used them to devise approximations of \({{\mathcal {S}}}^n_+\). They showed that the conical hull of SD bases and its dual cone give inner and outer polyhedral approximations of \({{\mathcal {S}}}^n_+\), respectively. On the basis of the SD bases, they also developed techniques to determine whether a given matrix is in the semidefinite plus nonnegative cone \({{\mathcal {S}}}^n_++{{\mathcal {N}}}^n\), i.e., the Minkowski sum of \({{\mathcal {S}}}^n_+\) and the cone of nonnegative matrices \(\mathcal{N}^n\). In this paper, we focus on the fact that SD bases are sometimes sparse, i.e., the number of nonzero elements in each matrix is relatively small, and hence it is not computationally expensive to solve problems approximated polyhedrally with such SD bases. We call such an approximation a sparse polyhedral approximation and propose efficient sparse approximations of \({{\mathcal {S}}}^n_+\).

The goal of this paper is to construct tight and sparse polyhedral approximations of \({{\mathcal {S}}}^n_+\) by using SD bases in order to solve hard conic optimization problems, e.g., doubly nonnegative (DNN, or \({{\mathcal {S}}}^n_+ \cap \mathcal {N}^n\)) and semidefinite plus nonnegative (\(\mathcal {S}^n_+ + \mathcal {N}^n\)) optimization problems. The contributions of this paper are summarized as follows.

  • This paper gives the relation between the conical hull of sparse SD bases and the set of diagonally dominant matrices. We propose a simple expansion of SD bases without losing the sparsity of the matrices and prove that one can generate a sparse polyhedral approximation of \({{\mathcal {S}}}^n_+\) that contains the set of diagonally dominant matrices and is contained in the set of scaled diagonally dominant matrices.

  • The expanded SD bases are used by cutting plane methods for a semidefinite relaxation of the maximum stable set problem. It is found that the proposed methods with expanded SD bases are significantly more efficient than methods using other approximations or solving semidefinite relaxation problems directly.

The organization of this paper is as follows. Various approximations of \({{\mathcal {S}}}^n_+\) are introduced in Sect. 2, including those based on the factor width by Boman et al. [10], diagonal dominance by Ahmadi et al. [2], and SD bases by Tanaka and Yoshise [41]. The main results of this paper, i.e., an expansion of SD bases and an analysis of its theoretical properties, are provided in Sect. 3. In Sect. 4, we introduce the cutting plane method using different approximations of \({{\mathcal {S}}}^n_+\) for calculating upper bounds of the maximum stable set problem. We also describe the results of numerical experiments and evaluate the efficiency of the proposed method with expanded SD bases.

2 Some approximations of the semidefinite cone

2.1 Factor width approximation

In [10], Boman et al. defined a concept called factor width.

Definition 1

(Definition 1 in [10]) The factor width of a real symmetric matrix \(A\in {\mathbb {S}}^n\) is the smallest integer k such that there exists a real matrix \(V\in {\mathbb {R}}^{n\times m}\) (for some m) with \(A=VV^T\) and each column of V containing at most k nonzero elements.

For \(k\in \{1,2,\ldots ,n\}\), we can also define

$$\begin{aligned} \mathcal{FW}(k):=\{X\in {\mathbb {S}}^n\mid \text {X has a factor width of at most }k\}. \end{aligned}$$

It is obvious that the factor width is only defined for semidefinite matrices, because for every matrix A in Definition 1, the decomposition \(A=VV^T\) implies that \(A\in {{\mathcal {S}}}^n_+\). Therefore, for every \(k\in \{1,2,\ldots ,n\}\), the set of matrices with a factor width of at most k gives an inner approximation of \({{\mathcal {S}}}^n_+\): \(\mathcal{FW}(k)\subseteq {{\mathcal {S}}}^n_+.\)
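As a small numerical illustration (not from [10]), the following NumPy sketch builds a matrix of factor width at most 2 by choosing each column of V with at most two nonzero entries; the particular matrix V is a hypothetical example.

```python
import numpy as np

n = 4
# Each column of V has at most two nonzero entries, so A = V V^T has factor width <= 2.
V = np.zeros((n, 3))
V[[0, 1], 0] = [1.0, 2.0]
V[[1, 3], 1] = [-1.0, 1.0]
V[2, 2] = 3.0
A = V @ V.T
# The decomposition A = V V^T implies A is positive semidefinite.
print(np.all(np.linalg.eigvalsh(A) >= -1e-12))   # True
```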

2.2 Diagonal dominance approximation

In [1, 2], the authors approximated the cone \({{\mathcal {S}}}^n_+\) with the set of diagonally dominant matrices and the set of scaled diagonally dominant matrices.

Definition 2

The set of diagonally dominant matrices \(\mathcal{DD}_n\) and the set of scaled diagonally dominant matrices \(\mathcal{SDD}_n\) are defined as follows:

$$\begin{aligned} \mathcal{DD}_n&:=\{A\in {\mathbb {S}}^n\mid \ A_{i,i}\ge \sum _{j\ne i}|A_{i,j}|\quad (i=1,2,\ldots ,n)\},\\ \mathcal{SDD}_n&:=\{A\in {\mathbb {S}}^n\mid DAD\in \mathcal{DD}_n \ \text{ for } \text{ some } \text{ positive } \text{ diagonal } \text{ matrix } \text{ D } \}. \end{aligned}$$
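Membership in \(\mathcal{DD}_n\) can be checked directly from these inequalities; the sketch below, assuming NumPy, is a straightforward implementation (checking membership in \(\mathcal{SDD}_n\) is harder, since a suitable scaling matrix D must be found, and is omitted here).

```python
import numpy as np

def is_diagonally_dominant(A, tol=1e-12):
    """Check A_{i,i} >= sum_{j != i} |A_{i,j}| for every row of a symmetric matrix A."""
    off_diag = np.sum(np.abs(A), axis=1) - np.abs(np.diag(A))
    return bool(np.all(np.diag(A) >= off_diag - tol))

A = np.array([[3.0, -1.0, 1.0],
              [-1.0, 2.0, 0.5],
              [1.0, 0.5, 2.0]])
print(is_diagonally_dominant(A))                        # True
print(np.all(np.linalg.eigvalsh(A) >= -1e-12))          # True: A is also positive semidefinite
```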

It is easy to see that \(\mathcal{DD}_n\) is a convex cone and \(\mathcal{SDD}_n\) is a cone in \({\mathbb {S}}^n\). As a consequence of the Gershgorin circle theorem [18], we have the relation \(\mathcal{DD}_n\subseteq \mathcal{SDD}_n\subseteq {{\mathcal {S}}}^n_+\). Ahmadi et al. [2] defined \({{\mathcal {U}}}_{n,k}\) as the set of vectors in \({\mathbb {R}}^n\) with at most k nonzeros, each equal to 1 or \(-1\). They also defined a set of matrices \(U_{n,k}:=\{uu^T\mid u\in {{\mathcal {U}}}_{n,k}\}\). Barker and Carlson [6] proved the following theorem.

Theorem 1

(Barker and Carlson [6]) \(\mathcal{DD}_n={\mathrm{cone}}(U_{n,2}).\)

The conical hull of a given set \({{\mathcal {K}}}\subseteq {\mathbb {S}}^n\) is defined as \({\mathrm{cone}}({{\mathcal {K}}}):=\{\sum _{i=1}^k \alpha _iX_i\mid X_i\in {{\mathcal {K}}},\alpha _i\ge 0,k\in {{\mathbb {Z}}}_{\ge 0}\}\), where \({\mathbb {Z}}_{\ge 0}\) is the set of nonnegative integers. A cone generated in this way by a finite number of elements is called finitely generated. Theorem 1 implies that \(\mathcal{DD}_n\) has \(n^2\) extreme rays; thus, it is a finitely generated cone.

A cone \({{\mathcal {K}}}\subseteq {\mathbb {S}}^n\) is polyhedral if \(\mathcal{K}=\{X\in {\mathbb {S}}^n\mid \langle A_i,X\rangle \le 0 \ (i=1,\ldots ,k)\}\) for finitely many matrices \(A_i\in {\mathbb {S}}^n\). The following theorem follows from the results of Minkowski [34] and Weyl [45].

Theorem 2

(Minkowski–Weyl theorem, see Corollary 7.1a in [39]) A convex cone is polyhedral if and only if it is finitely generated.

The above theorem ensures that \(\mathcal{DD}_n\) is a polyhedral cone. Using the expression in Theorem 1, Ahmadi et al. proved that optimization problems over \(\mathcal{DD}_n\) can be solved as LPs. They also proved that optimization problems over \(\mathcal{SDD}_n\) can be solved as SOCPs, and they designed a column generation method using \(\mathcal{DD}_n\) and \(\mathcal{SDD}_n\) to obtain a series of inner approximations of \({{\mathcal {S}}}^n_+\). As for the relation between the factor width and diagonal dominance, useful results were presented in [1, 10], which give a relation between \(\mathcal{SDD}_n\) and the set of matrices with a factor width of at most 2.

Lemma 1

(See [10] and Theorem 8 in [1]) \(\mathcal{FW}(2)=\mathcal{SDD}_n\)

Note that Definition 1 implies that the set \(\mathcal{FW}(k)\) is convex for any \(k \in \{1,2,\ldots ,n\}\), and we obtain the following corollary of Lemma 1:

Corollary 1

The set \(\mathcal{SDD}_n\) is a convex cone.

2.3 SD basis approximation

Tanaka and Yoshise defined semidefinite (SD) bases [41].

Definition 3

(Definitions 1 and 2 in [41]) Let \(e_i\in {\mathbb {R}}^n\) denote the vector with a 1 at the ith coordinate and 0 elsewhere, and let \(I=(e_1,\ldots ,e_n)\in {\mathbb {S}}^n\) be the identity matrix. Then

$$\begin{aligned} {{\mathcal {B}}_+}:=\{(e_i+e_j)(e_i+e_j)^T\mid 1\le i\le j\le n\} \end{aligned}$$

is called an SD basis of Type I, and

$$\begin{aligned} {{\mathcal {B}}_-}:=\{(e_i+e_i)(e_i+e_i)^T\mid 1\le i\le n\}\cup \{(e_i-e_j)(e_i-e_j)^T\mid 1\le i<j\le n\} \end{aligned}$$

is called an SD basis of Type II. The matrices in the SD bases of Type I and II are denoted by

$$\begin{aligned} {B}^+_{i,j}:=(e_i+e_j)(e_i+e_j)^T,\ {B}^-_{i,j}:=(e_i-e_j)(e_i-e_j)^T. \end{aligned}$$

As shown in [41], \({{\mathcal {B}}}_+\) and \({{\mathcal {B}}}_-\) are subsets of \(\mathcal {S}^n_+\) and bases of \({\mathbb {S}}^n\). Given a set \({{\mathcal {K}}}\subseteq {\mathbb {S}}^n\), we define the dual cone of \({{\mathcal {K}}}\) as \(({{\mathcal {K}}})^*:=\{A\in {\mathbb {S}}^n\mid \langle A,B\rangle \ge 0 \ \text{ for } \text{ any } \ B\in {{\mathcal {K}}}\}\). The conical hull of \({{\mathcal {B}}}_+ \cup {{\mathcal {B}}}_-\) and its dual give an inner and an outer polyhedral approximation of \(\mathcal {S}^n_+\), as follows.

Definition 4

Let \(I=(e_1,\ldots ,e_n)\in {\mathbb {S}}^n\) be the identity matrix. The inner and outer approximations of \({{\mathcal {S}}}^n_+\) by using SD bases are defined as

$$\begin{aligned} {{\mathcal {S}}}_{\mathrm{in}}:={\mathrm{cone}}({{\mathcal {B}}}_+\cup {{\mathcal {B}}}_-),\ \ \mathcal{S}_{\mathrm{out}}:=({{\mathcal {S}}}_{\mathrm{in}})^*. \end{aligned}$$

By Definition 3, we know that \({{\mathcal {B}}}_+,\mathcal{B}_-\subseteq {{\mathcal {S}}}^n_+\). Since \({{\mathcal {S}}}^n_+\) is a convex cone, we have \({{\mathcal {S}}}_{\mathrm{in}}\subseteq {\mathrm{cone}}({{\mathcal {S}}}^n_+)=\mathcal{S}^n_+\). By Lemma 1.7.3 in [32], we know that \({{\mathcal {S}}}^n_+\) is self-dual; that is, \({{\mathcal {S}}}^n_+=(\mathcal{S}^n_+)^*\). Accordingly, we can conclude that \({{\mathcal {S}}}_\mathrm{in}\subseteq {{\mathcal {S}}}^n_+\subseteq {{\mathcal {S}}}_{\mathrm{out}}\).
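These bases are easy to enumerate explicitly. The following NumPy sketch (the function name sd_basis is ours, purely for illustration) generates \({{\mathcal {B}}}_+\) and \({{\mathcal {B}}}_-\) and checks numerically that the Type I basis consists of \(n(n+1)/2\) linearly independent semidefinite matrices.

```python
import numpy as np
from itertools import combinations_with_replacement

def sd_basis(n, sign=+1):
    """SD basis of Type I (sign=+1) or Type II (sign=-1) from Definition 3."""
    e = np.eye(n)
    basis = []
    for i, j in combinations_with_replacement(range(n), 2):
        v = e[i] + sign * e[j] if i < j else 2 * e[i]    # diagonal members are (e_i + e_i)
        basis.append(np.outer(v, v))
    return basis

n = 4
B_plus = sd_basis(n, +1)
# Vectorizing the upper triangles preserves linear (in)dependence of symmetric matrices.
vecs = np.array([B[np.triu_indices(n)] for B in B_plus])
print(len(B_plus), np.linalg.matrix_rank(vecs))          # both equal n(n+1)/2 = 10
print(all(np.all(np.linalg.eigvalsh(B) >= -1e-12) for B in B_plus))   # every member is PSD
```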

Remark 1

In [41], \({{\mathcal {B}}}_+\) and \({{\mathcal {B}}}_-\) are defined as \({{\mathcal {B}}}_+(P)\) and \({{\mathcal {B}}}_-(P)\) using an orthogonal matrix P instead of the identity matrix I. In fact, for any orthogonal matrix P,

$$\begin{aligned} P{{\mathcal {B}}_+}P^T :=\{ PB_{i,j}^+P^T \mid B_{i,j}^+ \in {{\mathcal {B}}_+} \} \ \text{ and } \ P{{\mathcal {B}}_-}P^T :=\{ PB_{i,j}^-P^T \mid B_{i,j}^- \in {{\mathcal {B}}_-} \} \end{aligned}$$

also give bases of \({\mathbb {S}}^n\) and are generalizations of \({{\mathcal {B}}_+}\) and \({{\mathcal {B}}_-}\). However, as we will see in Sect. 4, we use the matrices in the bases in optimization problems of the form

$$\begin{aligned} \min \ \langle C,X\rangle \ \ \mathrm{s.t.} \ \langle A,X\rangle =b, \ \langle Y ,X\rangle \ge 0 \ (Y \in {{\mathcal {B}}_+}), \end{aligned}$$

which is equivalent to

$$\begin{aligned} \min \ \langle PCP^T,{\bar{X}}\rangle \ \ \mathrm{s.t.} \ \langle PAP^T, {\bar{X}}\rangle =b, \ \langle Y ,{\bar{X}}\rangle \ge 0 \ (Y \in P\mathcal{B}_+ P^T). \end{aligned}$$
(2)

Therefore, the generalizations \(P{{\mathcal {B}}_+}P^T\) and \(P{{\mathcal {B}}_-}P^T\) are not essential for the purposes of this paper, and we omit them from subsequent sections to simplify the presentation.

3 Expansion of SD bases

When we use the SD bases for approximating \(\mathcal {S}^n_+\), the sparsity of the matrices in those bases is quite important in terms of computational efficiency. As we mentioned in Remark 1, for any orthogonal matrix P, \(P{{\mathcal {B}}_+}P^T\) and \(P{{\mathcal {B}}_-}P^T\) give generalizations of the SD bases. However, it is hard to choose an appropriate orthogonal matrix P (except for the identity matrix I) to keep the sparsity of the matrices \(PCP^T\) and \(PAP^T\) in (2). In this section, we try to extend the definition of the SD bases in order to obtain various sparse SD bases which will lead us to sparse polyhedral approximations of \(\mathcal {S}^n_+\).

3.1 SD bases and their relations with \({{\mathcal {S}}}^{n}_+\) and \(\mathcal{DD}_n\)

First, we give a lemma that provides an expression of \(\mathcal{S}^{n}_+\) by using SD bases. The lemma is a direct corollary of the fact that any \(X \in {{\mathcal {S}}}^{n}_+\) has nonnegative eigenvalues and a corresponding orthogonal basis of eigenvectors.

Lemma 2

$$\begin{aligned} {{\mathcal {S}}}^n_+={\mathrm{cone}}\left( \displaystyle \bigcup _{P\in \mathcal{O}^n}\{P^TXP\mid X\in {{\mathcal {B}}}_+\}\right) ={\mathrm{cone}}\left( \displaystyle \bigcup _{P\in {{\mathcal {O}}}^n}\{P^TXP\mid X\in {{\mathcal {B}}}_-\} \right) , \end{aligned}$$

where \({{\mathcal {O}}}^n\) is the set of orthogonal matrices in \({\mathbb {R}}^{n\times n}\).

Lemma 2 gives a way to approximate \({{\mathcal {S}}}^n_+\) by varying the matrix \(P=(p_1,\ldots ,p_n)\in {{\mathcal {O}}}^n\) used to create SD bases. However, a dense matrix \(P\in {{\mathcal {O}}}^n\) may lead to a dense formulation of the approximation using the SD bases, which is unattractive from the standpoint of computational efficiency.

Note that the set \({\mathrm{cone}}(\mathcal{B}_+\cup {{\mathcal {B}}}_-)\), the conical hull of the sparse SD bases \(\mathcal {B}_+\) and \(\mathcal {B}_-\), clearly coincides with \({\mathrm{cone}}(U_{n,2})\). Thus, we obtain the following proposition as a corollary of Theorem 1.

Proposition 1

$$\begin{aligned} {\mathrm{cone}}({{\mathcal {B}}}_+\cup {{\mathcal {B}}}_-)=\mathcal{DD}_n. \end{aligned}$$
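A quick numerical illustration of Proposition 1, assuming NumPy and SciPy (the matrix A below is a hypothetical diagonally dominant example): a nonnegative least-squares fit reproduces A exactly from the generators in \({{\mathcal {B}}}_+\cup {{\mathcal {B}}}_-\), i.e., A lies in their conical hull.

```python
import numpy as np
from scipy.optimize import nnls
from itertools import combinations_with_replacement

def sd_generators(n):
    """All matrices of B_+ together with the off-diagonal matrices of B_-."""
    e, gens = np.eye(n), []
    for i, j in combinations_with_replacement(range(n), 2):
        gens.append(np.outer(e[i] + e[j], e[i] + e[j]))        # B^+_{i,j}
        if i < j:
            gens.append(np.outer(e[i] - e[j], e[i] - e[j]))    # B^-_{i,j}
    return gens

A = np.array([[4.0, 1.0, -1.0, 0.0],
              [1.0, 3.0, 1.0, 0.0],
              [-1.0, 1.0, 3.0, 1.0],
              [0.0, 0.0, 1.0, 2.0]])                           # diagonally dominant
M = np.column_stack([G.ravel() for G in sd_generators(4)])
coef, residual = nnls(M, A.ravel())                            # coef >= 0 by construction
print(residual)                                                # ~0: A is in cone(B_+ U B_-) = DD_n
```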

3.2 Expansion of SD bases without losing sparsity

The previous section shows that we can obtain a sparse polyhedral approximation of \(\mathcal {S}^n_+\) by using the SD bases. In this section, we try to extend the definition of the SD bases in order to obtain various sparse polyhedral approximations of \(\mathcal {S}^n_+\).

Definition 5

Let \(I=(e_1,\ldots ,e_n)\in {\mathbb {S}}^n\) be the identity matrix. Define the expansion of the SD basis with one parameter \(\alpha \in {\mathbb {R}}\) as

$$\begin{aligned} {\bar{B}}_{i,j}(\alpha )&:=( e_i+\alpha e_j)( e_i+\alpha e_j)^T,\\ \bar{{\mathcal {B}}}(\alpha )&:=\{{\bar{B}}_{i,j}(\alpha )\mid 1\le i\le j\le n\}. \end{aligned}$$

The proposition below ensures that, for suitable values of \(\alpha \), the expanded SD basis also gives a basis of \({\mathbb {S}}^n\).

Proposition 2

Let \(I=(e_1,\ldots ,e_n)\in {\mathbb {S}}^n\) be the identity matrix. For any \(\alpha \in {\mathbb {R}}\setminus \{0,-1\}\), \(\bar{{\mathcal {B}}}(\alpha )\) is a set of \(n(n+1)/2\) linearly independent matrices and thus a basis of \({\mathbb {S}}^n\).

Proof

Let \(\alpha \in {\mathbb {R}}\setminus \{0,-1\}\). Accordingly, for \(1\le i<j\le n\), we have

$$\begin{aligned} {\bar{B}}_{i,j}(\alpha ):&=( e_i+\alpha e_j)( e_i+\alpha e_j)^T\nonumber \\&=e_ie_i^T+\alpha (e_ie_j^T+e_je_i^T)+\alpha ^2e_je_j^T\nonumber \\&=\alpha (e_ie_i^T+e_ie_j^T+e_je_i^T+e_je_j^T)+(1-\alpha )e_ie_i^T+(\alpha ^2-\alpha )e_je_j^T\nonumber \\&=\alpha {B}_{i,j}^++\frac{1-\alpha }{4}{B}_{i,i}^++\frac{\alpha (\alpha -1)}{4}{B}_{j,j}^+, \end{aligned}$$
(3)

and for every \(1\le i\le n\), we also have

$$\begin{aligned} {\bar{B}}_{i,i}(\alpha )&:=( e_i+\alpha e_i)( e_i+\alpha e_i)^T\nonumber \\&=(1+\alpha )^2 e_i e_i^T=\frac{(1+\alpha )^2}{4}{B}_{i,i}^+. \end{aligned}$$
(4)

Suppose that there exist \(\gamma _{i,j}\in {\mathbb {R}}\ (1\le i\le j\le n)\) such that

$$\begin{aligned} \sum _{1\le i\le j\le n}\gamma _{i,j}{\bar{B}}_{i,j}(\alpha )=O. \end{aligned}$$

Then, by (3) and (4), we see that

$$\begin{aligned} O&=\sum _{i=1}^n \frac{\gamma _{i,i}(1+\alpha )^2}{4}{B}_{i,i}^++\sum _{1\le i<j\le n}\gamma _{i,j}\left[ \alpha {B}_{i,j}^++\frac{1-\alpha }{4}{B}_{i,i}^++\frac{\alpha (\alpha -1)}{4}{B}_{j,j}^+\right] \nonumber \\&=\sum _{i=1}^n \frac{(1+\alpha )^2}{4}\gamma _{i,i}{B}_{i,i}^++\sum _{1\le i<j\le n}\alpha \gamma _{i,j}{B}_{i,j}^++\sum _{i=1}^{n-1} \frac{1-\alpha }{4}\left( \sum _{j=i+1}^n\gamma _{i,j}\right) {B}_{i,i}^+\nonumber \\&\quad +\sum _{j=2}^n\frac{\alpha (\alpha -1)}{4}(\sum _{i=1}^{j-1}\gamma _{i,j}){B}_{j,j}^+\nonumber \\&=\left[ \frac{\gamma _{1,1}(1+\alpha )^2}{4}+\frac{1-\alpha }{4}\left( \sum _{j=2}^n\gamma _{1,j}\right) \right] {B}_{1,1}^+\nonumber \\&\quad +\sum _{i=2}^{n-1}\left[ \frac{(1+\alpha )^2}{4}\gamma _{i,i}+\frac{1-\alpha }{4}\left( \sum _{j=i+1}^n\gamma _{i,j}\right) +\frac{\alpha (\alpha -1)}{4}\left( \sum _{j=1}^{i-1}\gamma _{j,i}\right) \right] {B}_{i,i}^+ \nonumber \\&\quad +\left[ \frac{\gamma _{n,n}(1+\alpha )^2}{4}+\frac{\alpha (\alpha -1)}{4}\left( \sum _{j=1}^{n-1}\gamma _{j,n}\right) \right] {B}_{n,n}^+\nonumber \\&\quad +\sum _{1\le i<j\le n}\alpha \gamma _{i,j}{B}_{i,j}^+ . \end{aligned}$$
(5)

Since \(\{B_{i,j}^+\}={{\mathcal {B}}}_+\) is a set of linearly independent matrices, all the coefficients of \({B}_{i,j}^+\) in (5) must be 0. Thus, we have

$$\begin{aligned}&0=\frac{\gamma _{1,1}(1+\alpha )^2}{4}+\frac{1-\alpha }{4}\left( \sum _{j=2}^n\gamma _{1,j}\right) , \end{aligned}$$
(6)
$$\begin{aligned}&0= \frac{(1+\alpha )^2}{4}\gamma _{i,i}+\frac{1-\alpha }{4}\left( \sum _{j=i+1}^n\gamma _{i,j}\right) +\frac{\alpha (\alpha -1)}{4}\left( \sum _{j=1}^{i-1}\gamma _{j,i}\right) \ (2\le i\le n-1), \end{aligned}$$
(7)
$$\begin{aligned}&0=\frac{\gamma _{n,n}(1+\alpha )^2}{4}+\frac{\alpha (\alpha -1)}{4}\left( \sum _{j=1}^{n-1}\gamma _{j,n}\right) , \end{aligned}$$
(8)
$$\begin{aligned}&0=\alpha \gamma _{i,j} \ (1\le i<j\le n). \end{aligned}$$
(9)

Since \(\alpha \ne 0\), by (9) we have

$$\begin{aligned} \gamma _{i,j}=0 \ (1\le i<j\le n). \end{aligned}$$
(10)

Since \(\alpha \ne -1\), (6)-(10) imply that

$$\begin{aligned} \gamma _{i,i}=0 \ (i=1,2,\ldots ,n). \end{aligned}$$
(11)

The above leads us to conclude that \(\{\bar{B}_{i,j}(\alpha )\}=\bar{{\mathcal {B}}}(\alpha )\) is a set of \(n(n+1)/2\) linearly independent matrices. \(\square \)
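A small numerical sanity check of Proposition 2, assuming NumPy (the function name expanded_sd_basis is ours, for illustration only): for a value of \(\alpha \) outside \(\{0,-1\}\), the \(n(n+1)/2\) matrices of \(\bar{{\mathcal {B}}}(\alpha )\) are indeed linearly independent.

```python
import numpy as np
from itertools import combinations_with_replacement

def expanded_sd_basis(n, alpha):
    """The expanded SD basis of Definition 5: (e_i + alpha e_j)(e_i + alpha e_j)^T, i <= j."""
    e = np.eye(n)
    return [np.outer(e[i] + alpha * e[j], e[i] + alpha * e[j])
            for i, j in combinations_with_replacement(range(n), 2)]

n, alpha = 5, 1 + np.sqrt(2)
basis = expanded_sd_basis(n, alpha)
# Vectorizing the upper triangles preserves linear (in)dependence of symmetric matrices.
vecs = np.array([B[np.triu_indices(n)] for B in basis])
print(len(basis), np.linalg.matrix_rank(vecs))   # both equal n(n+1)/2 = 15
```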

If we let \(\alpha =1\), then it is straightforward to see that \(\bar{\mathcal{B}}(1)={{\mathcal {B}}}_+\). Other values of \(\alpha \) may yield different SD bases. The following proposition gives a condition under which the expanded SD bases are genuinely different.

Proposition 3

Let \(I=(e_1,\ldots ,e_n)\in {\mathbb {S}}^n\) be the identity matrix. Suppose that \(\alpha _1\in {\mathbb {R}}\setminus \{0,-1\}\) and \(\alpha _2\in {\mathbb {R}}\setminus \{0,\alpha _1\}\). Then, for every \(1\le i<j\le n\),

$$\begin{aligned} ( e_i+\alpha _2 e_j)( e_i+\alpha _2 e_j)^T\notin {\mathrm{cone}}(\bar{\mathcal{B}}(\alpha _1)). \end{aligned}$$

Proof

For \(1\le i\le j\le n\), let us define

$$\begin{aligned} {\bar{B}}_{i,j}^1:=( e_i+\alpha _1 e_j)( e_i+\alpha _1 e_j)^T,\ \ {\bar{B}}_{i,j}^2:=( e_i+\alpha _2 e_j)( e_i+\alpha _2 e_j)^T. \end{aligned}$$

Note that if \(i=j\), then

$$\begin{aligned} {\bar{B}}_{i,i}^1:=(1+\alpha _1)^2e_ie_i^T,\ \ {\bar{B}}_{i,i}^2:=(1+\alpha _2)^2e_ie_i^T. \end{aligned}$$
(12)

For every \(i< j\), we can write \({\bar{B}}_{i,j}^2\) as a linear combination of matrices in \(\bar{{\mathcal {B}}}(\alpha _1)\):

$$\begin{aligned} {\bar{B}}_{i,j}^2&=e_ie_i^T+\alpha _2^2e_je_j^T+\alpha _2(e_ie_j^T+e_je_i^T)\nonumber \\ &=e_ie_i^T+\alpha _2^2e_je_j^T+\frac{\alpha _2}{\alpha _1}\alpha _1(e_ie_j^T+e_je_i^T)\ \ \ (\mathrm{because\ }\alpha _1\ne 0)\nonumber \\ &=e_ie_i^T+\alpha _2^2e_je_j^T-\frac{\alpha _2}{\alpha _1}e_ie_i^T-\frac{\alpha _2\alpha _1^2}{\alpha _1}e_je_j^T\nonumber \\&+\frac{\alpha _2}{\alpha _1}\left[ e_ie_i^T+\alpha _1(e_ie_j^T+e_je_i^T)+\alpha _1^2e_je_j^T\right] \nonumber \\ &=\frac{\alpha _1-\alpha _2}{\alpha _1}e_ie_i^T+\alpha _2(\alpha _2-\alpha _1)e_je_j^T+\frac{\alpha _2}{\alpha _1}{\bar{B}}_{i,j}^1\nonumber \\ &=\frac{\alpha _1-\alpha _2}{\alpha _1(1+\alpha _1)^2}(1+\alpha _1)^2e_ie_i^T+\frac{\alpha _2(\alpha _2-\alpha _1)}{(1+\alpha _1)^2}(1+\alpha _1)^2e_je_j^T+\frac{\alpha _2}{\alpha _1}{\bar{B}}_{i,j}^1\nonumber \\&(\mathrm{because\ }\alpha _1\ne -1)\nonumber \\ &=\frac{\alpha _1-\alpha _2}{\alpha _1(1+\alpha _1)^2}{\bar{B}}_{i,i}^1+\frac{\alpha _2(\alpha _2-\alpha _1)}{(1+\alpha _1)^2}{\bar{B}}_{j,j}^1+\frac{\alpha _2}{\alpha _1}{\bar{B}}_{i,j}^1 \ (\mathrm{by\ }(12)). \end{aligned}$$
(13)

Since \(\alpha _1 \not \in \{0, -1\}\), Proposition 2 ensures that \({\bar{\mathcal{B}}}(\alpha _1)\) is linearly independent, and hence, the expression (13) for \({{\bar{B}}}_{i,j}^2\) is unique.

Suppose that \({\bar{B}}_{i,j}^2\in {\mathrm{cone}}\left( \bar{\mathcal{B}}(\alpha _1)\right) \). In this case, all the coefficients in (13) should be nonnegative, which implies that

$$\begin{aligned} \frac{\alpha _1-\alpha _2}{\alpha _1(1+\alpha _1)^2}\ge 0,\ \frac{\alpha _2(\alpha _2-\alpha _1)}{(1+\alpha _1)^2}\ge 0,\ \frac{\alpha _2}{\alpha _1}>0. \end{aligned}$$
(14)

From the last inequality in (14), we have either

$$\begin{aligned} {\mathrm{(i)}}\ \alpha _1,\alpha _2>0 \quad {\mathrm{or}}\quad {\mathrm{(ii)}}\ \alpha _1,\alpha _2<0. \end{aligned}$$

For case (i), from the first and second inequalities of (14), we have \(\alpha _2-\alpha _1\ge 0\) and \(\alpha _1-\alpha _2\ge 0\), which implies \(\alpha _2= \alpha _1\) and contradicts the assumption \(\alpha _2 \ne \alpha _1\). A similar contradiction is obtained for case (ii). Thus, we have \({\bar{B}}_{i,j}^2\notin {\mathrm{cone}}(\bar{{\mathcal {B}}}(\alpha _1))\). \(\square \)

3.3 Expression of \(\mathcal{SDD}_n\) with expanded SD bases

As we have seen in Corollary 1, the set \(\mathcal{SDD}_n=\mathcal{FW}(2)\) is a convex cone. Every matrix \({\bar{B}}_{i,j}(\alpha )\) has a factor width of at most 2, and conversely every rank-one matrix \(vv^T\) whose vector v has at most two nonzero entries is a nonnegative multiple of some \({\bar{B}}_{i,j}(\alpha )\). Together with Lemma 1, this ensures that the conical hull of the union of the expanded SD bases \(\bar{{\mathcal {B}}}(\alpha )\) over \(\alpha \in {\mathbb {R}}\) coincides with \(\mathcal{FW}(2)\), and hence with the set of scaled diagonally dominant matrices \(\mathcal{SDD}_n\):

Corollary 2

$$\begin{aligned} {\mathrm{cone}}\left( \displaystyle \bigcup _{{\alpha }\in {\mathbb {R}}}\bar{\mathcal{B}}(\alpha )\right) =\mathcal{SDD}_n. \end{aligned}$$

3.4 Notes on the parameter \(\alpha \)

Here, we discuss the choice of the parameter \(\alpha \) to increase the “volume” of the polyhedral approximation \({\mathrm{cone}}(\bar{\mathcal{B}}(\alpha ))\) of the semidefinite cone \({{\mathcal {S}}}^n_+\). For any \(\alpha \in {\mathbb {R}}\) and \(1\le i< j\le n\), by Definition 5, we can calculate the Frobenius norm of \({\bar{B}}_{i,j}(\alpha )\):

$$\begin{aligned} \Vert {\bar{B}}_{i,j}(\alpha )\Vert &=\Vert ( e_i+\alpha e_j)( e_i+\alpha e_j)^T\Vert \nonumber \\ &=\sqrt{\mathrm{Trace}\left( ( e_i+\alpha e_j)( e_i+\alpha e_j)^T( e_i+\alpha e_j)( e_i+\alpha e_j)^T\right) }\nonumber \\ &=\Vert e_i+\alpha e_j\Vert ^2\nonumber \\ &=1+\alpha ^2. \end{aligned}$$
(15)

According to Proposition 3, by changing \(\alpha \), one can obtain different polyhedral approximations. However, we can see that

$$\begin{aligned} \lim _{|\alpha |\rightarrow \infty }\frac{{\bar{B}}_{i,j}(\alpha )}{\Vert {\bar{B}}_{i,j}(\alpha )\Vert }&=\lim _{|\alpha |\rightarrow \infty }\frac{1}{1+\alpha ^2}(e_i+\alpha e_j)(e_i+\alpha e_j)^T\ (\text {by} (15)),\\&=\lim _{|\alpha |\rightarrow \infty }\left[ \frac{1}{1+\alpha ^2}e_ie_i^T+\frac{\alpha }{1+\alpha ^2}(e_ie_j^T+e_je_i^T)+\frac{\alpha ^2}{1+\alpha ^2}e_je_j^T\right] \\&=e_je_j^T= \frac{1}{4}{B}^+_{j,j}, \end{aligned}$$

and by Definitions 3 and 5, we have

$$\begin{aligned} {\bar{B}}_{i,j}(0)=\frac{1}{4}{B}^+_{i,i},\ \bar{ B}_{i,j}(1)={B}^+_{i,j},\ {\bar{B}}_{i,j}(-1)={B}^-_{i,j}. \end{aligned}$$

This shows that, if \(|\alpha |\rightarrow \infty \) or \(\alpha \in \{0,1,-1\}\), the new matrix \({\bar{B}}_{i,j}(\alpha )\) will become close to the existing matrices, e.g. \({B}^+_{i,i}\), \({ B}^+_{j,j}\), \({ B}^+_{i,j}\) and \({ B}^-_{i,j}\), and the “volume” of the polyhedral approximation \({\mathrm{cone}}(\bar{\mathcal{B}}(\alpha )\cup {{\mathcal {B}}}_+\cup {{\mathcal {B}}}_-)\) of the semidefinite cone \({{\mathcal {S}}}^n_+\) will also be close to the “volume” of the existing inner approximation \({\mathrm{cone}}({{\mathcal {B}}}_+\cup {{\mathcal {B}}}_-)\) of \(\mathcal{S}^n_+\).

To give an illustrative explanation of the above discussion, here we consider the specific case

$$\begin{aligned} {{\mathcal {S}}}^2_+ =\left\{ \left( \begin{array}{ccc} a &{} c \\ c &{} b \end{array} \right) \mid a,b,c\in {\mathbb {R}}, a,b\ge 0,\ ab-c^2\ge 0 \right\} \end{aligned}$$

and draw some figures in \({\mathbb {R}}^3\) with coordinates a, b, and c. Figure 1a shows the set \({{\mathcal {S}}}^2_+\) in \({\mathbb {R}}^3\). The red arrows in Fig. 1b show the extreme rays \(\{\gamma {\bar{B}}_{i,j}(\alpha )\mid \gamma \ge 0\}\) for \(|\alpha |\rightarrow \infty \) and \(\alpha \in \{0,1,-1\}\). The conical hull of these extreme rays is \({\mathrm{cone}}({{\mathcal {B}}}_+\cup {{\mathcal {B}}}_-)\), and its cross section with \(\{X\in {\mathbb {S}}^2\mid \langle X,I\rangle =1\}\) is illustrated as the blue area. To avoid generating a new matrix \({\bar{B}}_{i,j}(\alpha )\) that is close to the existing matrices, we should choose an \(\alpha \) such that the angles between \({\bar{B}}_{i,j}(\alpha )\) and the existing matrices are equal, as illustrated in Fig. 1c.

Fig. 1 Choice of \(\alpha \) to generate \({\bar{B}}_{i,j}(\alpha )\in {\mathbb {S}}^2\) in \({\mathbb {R}}^3\)

We expand this idea to the case of generating a matrix \({\bar{B}}_{i,j}(\alpha )\in {\mathbb {S}}^n\). Given an \(\alpha \in {\mathbb {R}}\), we can define the angles between matrices in the expanded SD bases and SD bases Type I and II for every \(1\le i < j\le n\), as follows:

$$\begin{aligned} \theta _1(\alpha ):=\mathrm{arccos}\frac{\langle {\bar{B}}_{i,j}(\alpha ) , {B}^+_{i,i}\rangle }{\Vert {\bar{B}}_{i,j}(\alpha )\Vert \Vert {B}^+_{i,i}\Vert },\ \theta _2(\alpha ):=\mathrm{arccos}\frac{\langle {\bar{B}}_{i,j}(\alpha ) , {B}^+_{j,j}\rangle }{\Vert {\bar{B}}_{i,j}(\alpha )\Vert \Vert {B}^+_{j,j}\Vert },\\ \theta _3(\alpha ):=\mathrm{arccos}\frac{\langle {\bar{B}}_{i,j}(\alpha ) , {B}^+_{i,j}\rangle }{\Vert {\bar{B}}_{i,j}(\alpha )\Vert \Vert {B}^+_{i,j}\Vert },\ \theta _4(\alpha ):=\mathrm{arccos}\frac{\langle {\bar{B}}_{i,j}(\alpha ) ,{B}^-_{i,j}\rangle }{\Vert {\bar{B}}_{i,j}(\alpha )\Vert \Vert {B}^-_{i,j}\Vert }. \end{aligned}$$

Thus, we have

$$\begin{aligned} \mathrm{cos}\theta _1(\alpha )&=\frac{\langle {\bar{B}}_{i,j}(\alpha ) , {B}^+_{i,i}\rangle }{\Vert {\bar{B}}_{i,j}(\alpha )\Vert \Vert {B}^+_{i,i}\Vert }\\&=\frac{\langle (e_i+\alpha e_j)(e_i+\alpha e_j)^T , (e_i+e_i)(e_i+e_i)^T\rangle }{(1+\alpha ^2)\Vert (e_i+e_i)(e_i+e_i)^T\Vert }\ (\text {by }(15) )\\&=\frac{4\Vert e_i\Vert ^4}{(1+\alpha ^2)4\Vert e_i\Vert ^2}\ (\text {because }e_i^Te_j=0)\\&=\frac{1}{1+\alpha ^2}\ (\text {because }\Vert e_i\Vert =1). \end{aligned}$$

Similarly, we have

$$\begin{aligned} \mathrm{cos}\theta _2(\alpha )=\frac{\alpha ^2}{1+\alpha ^2},\ \mathrm{cos}\theta _3(\alpha )=\frac{(1+\alpha )^2}{2(1+\alpha ^2)},\ \mathrm{cos}\theta _4(\alpha )=\frac{(1-\alpha )^2}{2(1+\alpha ^2)}. \end{aligned}$$

In general, to obtain a sufficiently large inner approximation with a limited number of parameters, we prefer an \(\alpha \) that makes \(\theta _1(\alpha )=\theta _3(\alpha )\), so that the new matrix \({\bar{B}}_{i,j}(\alpha )\) lies midway between \({B}^+_{i,i}\) and \({B}^+_{i,j}\) on the boundary of \({{\mathcal {S}}}^n_+\). Similarly, we can obtain values of \(\alpha \) by solving \(\theta _2(\alpha )=\theta _3(\alpha )\), \(\theta _1(\alpha )=\theta _4(\alpha )\), and \(\theta _2(\alpha )=\theta _4(\alpha )\). Solving these equations, we find that

$$\begin{aligned} \alpha =\pm 1\pm \sqrt{2}. \end{aligned}$$

The expansions with these parameters are expected to provide generally large inner approximations for \({{\mathcal {S}}}^n_+\).
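The cosine formulas above can be checked numerically. The following NumPy sketch (our own illustration) confirms that each value \(\alpha =\pm 1\pm \sqrt{2}\) equalizes one of the four pairs of angles.

```python
import numpy as np

def cos_angles(alpha):
    """cos(theta_1), ..., cos(theta_4) as functions of alpha (Sect. 3.4)."""
    d = 1 + alpha**2
    return (1 / d, alpha**2 / d, (1 + alpha)**2 / (2 * d), (1 - alpha)**2 / (2 * d))

pairs = [(-1 + np.sqrt(2), 0, 2),   # theta_1 = theta_3
         ( 1 + np.sqrt(2), 1, 2),   # theta_2 = theta_3
         ( 1 - np.sqrt(2), 0, 3),   # theta_1 = theta_4
         (-1 - np.sqrt(2), 1, 3)]   # theta_2 = theta_4
for alpha, k, l in pairs:
    c = cos_angles(alpha)
    print(f"alpha = {alpha:+.4f}: equal angles -> {np.isclose(c[k], c[l])}")
```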

4 Cutting plane methods for the maximum stable set problem

Conic optimization problems, including SDPs and copositive programs, have been shown to provide tight bounds for NP-hard combinatorial and nonconvex optimization problems. Here, we consider applying approximations of \({{\mathcal {S}}}^n_+\) to one of those NP-hard problems, the maximum stable set problem. A stable set of a graph G(V, E) is a set of vertices in V such that no edge connects any pair of vertices in the set. The maximum stable set problem aims to find the stability number \(\alpha (G)\), i.e., the number of vertices of a largest stable set of G.

De Klerk and Pasechnik [15] proposed a copositive programming formulation to obtain the exact stability number of a graph G with n vertices:

$$\begin{aligned} \alpha (G)=\max&\ \langle ee^T,X\rangle \nonumber \\ \mathrm{s.t.}&\ \langle A+I,X\rangle =1,\nonumber \\&X\in {{\mathcal {C}}}^*_n, \end{aligned}$$
(16)

where e is the all-ones vector, A is the adjacency matrix of graph G, and \({{\mathcal {C}}}^*_n\) is the dual cone of the copositive cone \({{\mathcal {C}}}_n:=\{X\in {\mathbb {S}}^n\mid d^TXd\ge 0 \ \forall d\in {\mathbb {R}}^n,\ d\ge 0\}\).

Although problem (16) is a conic optimization problem, it is still difficult since determining whether \(X \in {{\mathcal {C}}}^*_n\) or not is NP-hard [16]. A natural approach is to relax this problem to a more tractable optimization problem. From the definition of each cone, we can see the validity of the following inclusions:

$$\begin{aligned} {{\mathcal {C}}}^*_n\subseteq {{\mathcal {S}}}^n_+\cap {{\mathcal {N}}}^n\subseteq \mathcal{S}^n_+\subseteq {{\mathcal {S}}}^n_++{{\mathcal {N}}}^n\subseteq {{\mathcal {C}}}_n. \end{aligned}$$

By replacing \({{\mathcal {C}}}^*_n\) with \({{\mathcal {S}}}^n_+\cap {{\mathcal {N}}}^n\), one can obtain an SDP relaxation of (16):

$$\begin{aligned} \max&\ \langle ee^T,X\rangle \nonumber \\ \mathrm{s.t.}&\ \langle A+I,X\rangle =1,\nonumber \\&\ X\in {{\mathcal {S}}}^n_+\cap {{\mathcal {N}}}^n. \end{aligned}$$
(17)

Solving this SDP is not as easy as it may seem; in fact, we could not obtain a useful result for (17) after 6 hours of calculation with the state-of-the-art SDP solver Mosek on a randomly generated problem with \(n=300\). Combining the expanded SD bases with the cutting plane method, we apply the approximations of \(\mathcal{S}^n_+\) to (17) and solve it by calculating a series of more tractable problems.
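For reference, the relaxation (17) can be stated directly in CVXPY; the sketch below assumes an SDP-capable solver (e.g., SCS or MOSEK) is installed and uses a small randomly generated instance, so it is only illustrative of the formulation, not of the instances used in our experiments.

```python
import cvxpy as cp
import numpy as np

# A small random Erdos-Renyi graph ER(n, p) as an illustrative instance.
n, p = 30, 0.5
rng = np.random.default_rng(0)
A = np.triu(rng.random((n, n)) < p, k=1).astype(float)
A = A + A.T                                              # adjacency matrix

X = cp.Variable((n, n), PSD=True)                        # X in S^n_+
prob = cp.Problem(cp.Maximize(cp.sum(X)),                # <ee^T, X>
                  [cp.trace((A + np.eye(n)) @ X) == 1,
                   X >= 0])                              # X in N^n
prob.solve()
print(prob.value)                                        # SDP upper bound on alpha(G)
```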

Let \({{\mathcal {P}}}^n\) satisfy \({{\mathcal {S}}}^n_+\subseteq \mathcal{P}^n\subseteq {\mathbb {S}}^n\) and replace \(X\in {{\mathcal {S}}}^n_+\) by \(X\in {{\mathcal {P}}}^n\) in (17). Then, we obtain a relaxation of (17):

$$\begin{aligned} \max&\ \langle ee^T,X\rangle \nonumber \\ \mathrm{s.t.}&\ \langle A+I,X\rangle =1,\nonumber \\&\ X\in {{\mathcal {P}}}^n\cap {{\mathcal {N}}}^n. \end{aligned}$$
(18)

Usually, the relaxed problem (18) is expected to be easier to solve, and its optimal value gives an upper bound of the optimal value of problem (17). To obtain a tighter bound, we select some eigenvectors of an optimal solution \(X^*\) of problem (18) corresponding to negative eigenvalues, say \(d_1,\ldots ,d_k\), add the cutting planes

$$\begin{aligned} \langle d_id_i^T,X\rangle \ge 0\ \ \ ( i=1,\ldots ,k) \end{aligned}$$

to (18), and obtain a new optimization problem

$$\begin{aligned} \max&\ \langle ee^T,X\rangle \nonumber \\ \mathrm{s.t.}&\ \langle A+I,X\rangle =1,\nonumber \\&\ \langle d_id_i^T,X\rangle \ge 0\ ( i=1,\ldots ,k)\nonumber \\&\ X\in {{\mathcal {P}}}^n\cap {{\mathcal {N}}}^n. \end{aligned}$$
(19)

Notice that the optimal solution \(X^*\) of problem (18) is cut from the feasible region of problem (19) since \(\langle d_id_i^T,X^*\rangle <0\ ( i=1,\ldots ,k)\). On the other hand, since \({{\mathcal {S}}}^n_+=\{X\in {\mathbb {S}}^n\mid \forall d\in {\mathbb {R}}^n, \ \langle dd^T,X\rangle \ge 0 \} \subseteq {{\mathcal {P}}}^n\), every feasible solution of (17) is feasible for (19), and hence problem (19) is a relaxation of problem (17). These facts ensure that problem (19) is a tighter relaxation of problem (17) than problem (18). By repeating this procedure, we obtain a series of nonincreasing upper bounds of (17). Since the eigenvectors are usually dense, at every iteration we add to \(\{d_i\}\) only the eigenvectors corresponding to the smallest and, at most, the second smallest eigenvalues, which increases computational efficiency. A minimal sketch of this cut-generation step is given below.
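The cut-generation step just described can be written in a few lines of NumPy; this is only a sketch under our simplifying assumptions (the function name and tolerance are ours), not the exact implementation used in the experiments.

```python
import numpy as np

def new_cut_directions(X_star, max_cuts=2, tol=1e-8):
    """Eigenvectors of X_star with negative eigenvalues; each d yields the cut <d d^T, X> >= 0."""
    eigvals, eigvecs = np.linalg.eigh(X_star)        # eigenvalues in ascending order
    cuts = [eigvecs[:, k] for k in range(min(max_cuts, X_star.shape[0]))
            if eigvals[k] < -tol]
    return cuts                                       # empty list: X_star is (numerically) PSD, so stop
```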

As for the selection of the initial relaxation \({{\mathcal {P}}}^n\), we are ready to use the approximations of \({{\mathcal {S}}}^n_+\) based on the expanded SD bases. Let \({{\mathcal {H}}}:=\{\pm 1,\pm 1\pm \sqrt{2}\}\) be the set of parameters calculated in Sect. 3.4, and let \(\mathcal{SDB}_n\) denote the conical hull of expanded SD bases using \(\mathcal{H}\):

$$\begin{aligned} \mathcal{SDB}_n:={\mathrm{cone}}\left( \displaystyle \bigcup _{{\alpha }\in \mathcal{H}}\bar{{\mathcal {B}}}(\alpha )\right) . \end{aligned}$$

Then, as has been described in the previous sections, we have

$$\begin{aligned} {{\mathcal {S}}}^n_+\subseteq \mathcal{SDD}^*_n\subseteq \mathcal{SDB}^*_n\subseteq \mathcal{DD}^*_n. \end{aligned}$$
(20)

If \(\mathcal{SDB}^*_n\) or \(\mathcal{DD}^*_n\) is selected as \({{\mathcal {P}}}^n\), the corresponding relaxed problem in the cutting plane procedure becomes an LP, which allows us to use powerful state-of-the-art LP solvers, such as Gurobi [21]. Ahmadi et al. [2] showed that when \(\mathcal{SDD}^*_n\) is selected, the relaxations turn out to be SOCPs. Although \(\mathcal{SDD}^*_n\) provides a tighter relaxation than either \(\mathcal{DD}^*_n\) or \(\mathcal{SDB}^*_n\), the latter two relaxations are expected to have a lower computational cost. In addition, in [2], Ahmadi et al. also proposed an SOCP-based cutting plane approach, named SDSOS, which adds SOCP cuts at every iteration. We conducted experiments to compare the efficiencies of the cutting plane methods using these different approximations and SDSOS. A sketch of the initial relaxation used when \({{\mathcal {P}}}^n=\mathcal{SDB}^*_n\) is given below; the specifications of the experimental methods are summarized in Table 1.
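To make the construction concrete, the following CVXPY/NumPy sketch builds the initial LP relaxation corresponding to \({{\mathcal {P}}}^n=\mathcal{SDB}^*_n\): membership in \(\mathcal{SDB}^*_n\) is imposed by the finitely many linear inequalities \(\langle {\bar{B}},X\rangle \ge 0\) over the generators \({\bar{B}}=vv^T\) with \(v=e_i+\alpha e_j\), \(\alpha \in {{\mathcal {H}}}\). Function names are ours, the solver choice is left to CVXPY, and duplicated generators are simply kept as redundant constraints; this is an illustrative sketch rather than the exact experimental code.

```python
import cvxpy as cp
import numpy as np
from itertools import combinations_with_replacement

H = (1.0, -1.0, 1 + 2**0.5, 1 - 2**0.5, -1 + 2**0.5, -1 - 2**0.5)   # the parameter set H

def sdb_generator_vectors(n, params=H):
    """Vectors v = e_i + alpha e_j (i <= j, alpha in H) generating SDB_n."""
    e = np.eye(n)
    return [e[i] + alpha * e[j]
            for alpha in params
            for i, j in combinations_with_replacement(range(n), 2)]

def cpsdb_initial_relaxation(A):
    """Initial LP relaxation (18) with P^n = SDB*_n for the adjacency matrix A."""
    n = A.shape[0]
    X = cp.Variable((n, n), symmetric=True)
    cons = [cp.trace((A + np.eye(n)) @ X) == 1, X >= 0]              # X in N^n
    cons += [cp.sum(cp.multiply(np.outer(v, v), X)) >= 0             # <v v^T, X> >= 0
             for v in sdb_generator_vectors(n)]
    prob = cp.Problem(cp.Maximize(cp.sum(X)), cons)
    prob.solve()                                                      # an LP; any LP solver will do
    return prob.value, X.value
```

In the subsequent iterations, the cuts returned by the new_cut_directions sketch above are added as further linear constraints \(\langle d_id_i^T,X\rangle \ge 0\), and the LP is re-solved.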

Table 1 Specifications of the experimental methods

We tested these methods on the Erdős–Rényi graphs ER(n, p) randomly generated by Ahmadi et al. in [2], where n is the number of vertices and every pair of vertices has an edge with probability p. All experiments were performed with MATLAB 2018b on a Windows PC with an Intel(R) Core(TM) i7-6700 CPU running at 3.4 GHz and 16 GB of RAM. The LPs were solved using Gurobi Optimizer 8.0.0 [21], and the SOCPs and SDPs were solved using Mosek Optimizer 9.0 [35].

Figure 2 shows the result for an instance with \(n=250\) and \(p=0.8\). The x-axis is the number of iterations, and the y-axis is the gap between the upper bounds of each method and the SDP bound obtained by (17); the gap is computed as \(\left| \frac{f^*-f_k}{f^*}\right| \times 100\%\), where \(f_k\) is the upper bound obtained at the k-th iteration and \(f^*\) is the SDP bound obtained by solving (17) directly.

As can be seen in this figure, the accuracy of CPDD is the worst among the four methods at each iteration. CPSDB achieves almost the same upper bounds as CPSDD and SDSOS, which shows that the proposed polyhedral approximation \(\mathcal{SDB}_n\) is promising for obtaining a solution close to the non-polyhedral approximation \(\mathcal{SDD}_n\) of \({{\mathcal {S}}}^n_+\). Although SDSOS adds an extra SOCP cut at every iteration and takes longer to solve, the accuracy of SDSOS does not seem to be affected and is not so different from the accuracy of CPSDD at each iteration.

Fig. 2 Relation between the number of iterations and the gap

Figure 3 shows the relation between the computation time and the gap of each method for the same instance. Although its accuracy is not necessarily the best at every iteration, CPSDB appears to be the most efficient method. CPSDB attains an upper bound whose gap is 2 within 30 s, while CPSDD and SDSOS attain upper bounds whose gap is 4 after the same amount of time. The difference might come from the fact that the subproblems of CPSDB are sparse LPs in the earlier iterations, whose computations are relatively cheaper than those of CPSDD and SDSOS, whose subproblems are SOCPs.

Fig. 3 Relation between the computational time (s) and the gap

Tables 2 and 3 give the bounds of iterative methods and the SDP bound for all the instances. In Table 2, the CPSDD\(_0\)/SDSOS\(_0\) column shows the first upper bound obtained by CPSDD and SDSOS, i.e., the upper bound obtained by solving the same SOCP before adding any cutting plane. The (5 min) and (10 min) columns of CPSDD (SDSOS) show the upper bounds obtained after 5 min and after 10 min of the CPSDD (SDSOS) computation, respectively. The SDP column shows the SDP bound obtained by solving (17).

Similarly, in Table 3, the CPDD\(_0\) and CPSDB\(_0\) columns show the first upper bounds obtained by CPDD and CPSDB, respectively, before adding any cutting plane. The (5 min) and (10 min) columns of CPDD (CPSDB) show the upper bounds obtained after 5 min and after 10 min of the CPDD (CPSDB) computation, respectively.

Note that we failed to solve the SDPs (17) for the instances with \(n=300\) nodes within our time limit of 20,000 s. In Table 2, the Value and Time (s) columns of SDP with \(n=300\) show the results obtained in [2] for these two instances, as a reference.

As can be seen in Tables 2 and 3, for all instances, the values of CPSDD\(_0\)/SDSOS\(_0\) are better than the values of CPSDB\(_0\) and CPDD\(_0\). These results correspond to the inclusion relationship (20) of the initial approximations. We can also see that the values of CPSDB\(_0\) are almost the same as those of CPSDD\(_0\)/SDSOS\(_0\) for all instances, while the values of CPDD\(_0\) are much worse than the others. For all instances, CPSDB appears to be significantly more efficient than all other methods. For example, for the instance with \(n=250\) and \(p=0.3\), after 10 min of calculation, CPSDB obtained an upper bound of 73.24, while CPSDD and SDSOS got upper bounds greater than 90 and CPDD got a bound of more than 146.

At present, solving a large SDP, e.g., one with more than \(n=300\) nodes requires a significant amount of computational time. The cutting plane method CPSDB with our polyhedral approximation \(\mathcal{SDB}_n\) is a promising way of obtaining efficient upper bounds of such large SDPs in a moderate time.

Table 2 Upper bounds obtained by SDP and SOCP methods on ER(np) graphs
Table 3 Upper bounds obtained by LP methods on the same ER(np) graphs

5 Concluding remarks

We developed techniques to construct a series of sparse polyhedral approximations of the semidefinite cone. We provided a way to approximate the semidefinite cone by using SD bases and proved that the set of diagonally dominant matrices can be expressed with sparse SD bases. We proposed a simple expansion of SD bases that keeps the sparsity of the matrices that compose it. We gave the conditions for generating linearly independent matrices in expanded SD bases as well as for generating an expansion different from the existing one. We showed that the polyhedral approximation using our expanded SD bases contains the set of diagonally dominant matrices and is contained in the set of scaled diagonally dominant matrices. We also proved that the set of scaled diagonally dominant matrices can be expressed using an infinite number of expanded SD bases.

The polyhedral approximations were applied to the cutting plane method for solving a semidefinite relaxation of the maximum stable set problem. The results of the numerical experiments showed that the method with our expanded SD bases is more efficient than other methods (see Fig. 3); improving the efficiency of our method still remains an important study issue.

One future direction of study is to increase the number of vectors in the definition of the SD bases. The current SD bases are defined as a set of matrices \((e_i+e_j)(e_i+e_j)^T\). If we use three vectors, as in \((e_i+e_j+e_k)(e_i+e_j+e_k)^T\), we might obtain another inner approximation that remains relatively sparse when the dimension n is large.

Another future direction is to focus on the factor width k of a matrix. The cone of matrices with factor width at most \(k=2\) was introduced in order to give another expression of the set \(\mathcal {SDD}_n\) of scaled diagonally dominant matrices. By considering a larger width \(k > 2\), we may obtain a larger inner approximation of the semidefinite cone \(\mathcal {S}^n_+\), although it would be neither polyhedral nor representable by SOCP constraints. Finding efficient ways to solve approximation problems over such cones might be an interesting challenge.

Also, our expanded SD bases can be applied to some other difficult problems. Mixed integer nonlinear programming has recently become popular in many practical applications. In [33], Lubin et al. proposed a cutting plane framework for mixed integer convex optimization problems. In [26], Kobayashi and Takano proposed a branch and bound cutting plane method for mixed integer SDPs. It would be interesting to see whether the approximations of \({{\mathcal {S}}}^n_+\) proposed in this paper could be used to improve the efficiency of those methods.