An SDP-based approach for computing the stability number of a graph

Finding the stability number of a graph, i.e., the maximum number of vertices of which no two are adjacent, is a well known NP-hard combinatorial optimization problem. Since this problem has several applications in real life, there is a need for efficient algorithms to solve it. Recently, Gaar and Rendl enhanced semidefinite programming approaches to tighten the upper bound given by the Lovász theta function. This is done by carefully selecting some so-called exact subgraph constraints (ESCs) and adding them to the semidefinite program for computing the Lovász theta function. First, we provide two new relaxations that allow us to compute the bounds faster without substantial loss of the quality of the bounds. One of these two relaxations is based on including violated facets of the polytope representing the ESCs, the other one adds separating hyperplanes for that polytope. Furthermore, we implement a branch and bound (B&B) algorithm using these tightened relaxations in our bounding routine. We compare the efficiency of our B&B algorithm for the different upper bounds. It turns out that already the bounds of Gaar and Rendl drastically reduce the number of nodes to be explored in the B&B tree as compared to the Lovász theta bound. However, this comes at a high computational cost. Our new relaxations improve the run time of the overall B&B algorithm, while keeping the number of nodes in the B&B tree small.


Introduction
The stable set problem is a fundamental combinatorial optimization problem. It is capable of modeling other combinatorial problems as well as real-world applications and is therefore widely applied in areas like operations research or computer science. We refer to the survey [27] for more information and a review of exact and heuristic algorithms. Most of the exact algorithms are based on branch and bound (B&B) and differ mainly by different upper and lower bound computations. A recent paper using a MIP solver is e.g. [20]. The models used in that paper yield computation times from less than a second up to half an hour on a selection of DIMACS instances.
Outstanding results are obtained by an algorithm of Depolli et al. [9]. They introduced an algorithm using parallel computing for finding maximum cliques in the context of protein design. The algorithm consists of carefully implemented algorithmic building blocks such as an approximate coloring algorithm, an initial vertex ordering algorithm and the use of bit-strings for encoding the adjacency matrix.
In the 2015 survey [27], no exact algorithms using semidefinite programming (SDP) are mentioned. One reason for the scarcity of SDP based B&B algorithms in the literature is the high computational cost of computing these bounds. In this work we introduce an SDP based B&B algorithm. We formulate new SDP relaxations and develop solution algorithms to compute these bounds with moderate computational expense, making them applicable within a B&B scheme.
Before introducing the stable set problem, sometimes also referred to as the vertex packing problem, we give the definition of a stable set. Let G = (V, E) be a simple undirected graph with |V| = n vertices and |E| = m edges. A set S ⊆ V is called stable if no two vertices in S are adjacent. S is called a maximal stable set if it is not possible to add a vertex to S without losing the stability property. The stability number α(G) denotes the maximum size of a stable set in G, where size means the cardinality of the set. A stable set S is called a maximum stable set if it has size α(G).
For convenience, from now on we always label the vertices of a graph with n vertices from 1 to n. Computing α(G) can be done by solving the following optimization problem:

α(G) = max { Σ_{i∈V} x_i : x_i x_j = 0 ∀{i,j} ∈ E, x ∈ {0,1}^n }.    (1)
For a graph G = (V, E), the set of all stable set vectors S(G) and the stable set polytope STAB(G) are defined as S(G) = { s ∈ {0,1}^n : s_i s_j = 0 ∀{i,j} ∈ E } and STAB(G) = conv { s : s ∈ S(G) }.
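As a small illustration of these definitions, the following Python sketch (illustrative code, not from the paper) enumerates S(G) and computes α(G) by brute force for the 5-cycle.

```python
from itertools import combinations

def stable_sets(n, edges):
    """All stable sets of a graph on vertices 0..n-1: subsets with no edge inside."""
    edge_set = {frozenset(e) for e in edges}
    sets = []
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            if all(frozenset(p) not in edge_set for p in combinations(subset, 2)):
                sets.append(subset)
    return sets

def alpha(n, edges):
    """Stability number: maximum cardinality over all stable sets."""
    return max(len(s) for s in stable_sets(n, edges))

# 5-cycle C5: the largest stable sets have 2 vertices, e.g. {0, 2}.
c5_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
```

For C5 this enumeration yields 11 stable sets and α = 2; the exponential growth of |S(G)| is exactly why such enumeration is only viable for very small graphs.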
Determining α(G) is NP-hard and the corresponding decision problem is among Karp's 21 NP-complete problems [18]. Furthermore, Håstad [15] proved that α(G) is not approximable within n^(1−ε) for any ε > 0 unless P = NP. A well known upper bound on α(G) is the Lovász theta function ϑ(G). Grötschel, Lovász and Schrijver [14] proved that α(G) ≤ ϑ(G) and hence provided a semidefinite program (SDP) to compute ϑ(G), namely

ϑ(G) = max { Σ_{i∈V} x_i : X_{ij} = 0 ∀{i,j} ∈ E, diag(X) = x, (1, x^T; x, X) psd }.    (2)

We define the feasible region of (2) as TH^2(G) = { (x, X) ∈ R^n × S^n : X_{ij} = 0 ∀{i,j} ∈ E, diag(X) = x, (1, x^T; x, X) psd }. Clearly, for each element (x, X) of TH^2(G) the projection of X onto its main diagonal is x.
The set of all projections TH(G) = { x ∈ R^n : ∃X ∈ S^n : (x, X) ∈ TH^2(G) } is called the theta body. More information on TH(G) can be found for example in Conforti, Cornuéjols and Zambelli [7]. It is easy to see that STAB(G) ⊆ TH(G) holds for every graph G, see [14]. Thus ϑ(G) is a relaxation of α(G).
This paper is structured as follows. In Section 2 we introduce two new relaxations using the concept of exact subgraph constraints. A branch and bound algorithm that uses these relaxations is described in Section 3, followed by the discussion of numerical results in Section 4. Section 5 concludes this paper.

New Relaxations of the Exact Subgraph Constraints
In this section we present two new approaches to find upper bounds on the stability number α(G) of a graph G starting from the Lovász theta function ϑ(G) with so-called exact subgraph constraints, one based on violated facets and one based on separating hyperplanes.After introducing these approaches, we compare them both theoretically and practically.

Basic Setup for Exact Subgraph Constraints
Our approach is based on the idea of exact subgraph constraints, which goes back to Adams, Anjos, Rendl and Wiegele [1] for combinatorial optimization problems that admit an SDP relaxation and was recently investigated computationally by Gaar and Rendl [12,13] for the stable set, Max-Cut and coloring problems. Starting from this, we present two relaxations of including exact subgraph constraints into the SDP for calculating ϑ(G) that are computationally more efficient.
We first recapitulate the basic concepts of exact subgraph constraints with the notation from [11]. An upper bound on α(G) is given by the Lovász theta function ϑ(G). Due to the SDP formulation (2) it can be computed in polynomial time. Adams, Anjos, Rendl and Wiegele [1] proposed to improve the upper bound ϑ(G) by adding so-called exact subgraph constraints. These constraints can be used to strengthen SDP relaxations of combinatorial optimization problems with a certain property by including subgraph information. For the stable set problem we need the following definitions in order to introduce the exact subgraph constraints. For a graph G the squared stable set polytope STAB^2(G) is defined as

STAB^2(G) = conv { ss^T : s ∈ S(G) },

and matrices of the form ss^T for s ∈ S(G) are called stable set matrices. Let G_I denote the subgraph induced by the vertex set I ⊆ V(G) with |I| = k_I. With X_I we denote the submatrix of X that results when we delete each row and column corresponding to a vertex that is not in I; in other words, X_I is the submatrix of X consisting of the rows and columns corresponding to the vertices in I. Then the constraint that requires the submatrix X_I of (2) for an induced subgraph G_I to be in the squared stable set polytope STAB^2(G_I) is called an exact subgraph constraint (ESC).
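To make these objects concrete, the following Python sketch (illustrative, not the paper's code) builds all stable set matrices ss^T of a small graph and extracts the submatrix X_I.

```python
from itertools import combinations

def stable_set_matrices(k, edges):
    """All stable set matrices s s^T of the graph on vertices 0..k-1."""
    edge_set = {frozenset(e) for e in edges}
    mats = []
    for size in range(k + 1):
        for subset in combinations(range(k), size):
            if all(frozenset(p) not in edge_set for p in combinations(subset, 2)):
                s = [1 if v in subset else 0 for v in range(k)]
                mats.append([[s[i] * s[j] for j in range(k)] for i in range(k)])
    return mats

def submatrix(X, I):
    """X_I: the rows and columns of X indexed by the (sorted) vertex set I."""
    return [[X[i][j] for j in I] for i in I]
```

For the path on 3 vertices with edges {0,1} and {1,2} there are t_I = 5 stable sets (the empty set, three singletons, and {0,2}), hence 5 stable set matrices.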
The k-th level of the exact subgraph hierarchy introduced in [1] is the Lovász theta function (2) with an additional ESC for each subgraph of order k. In [12,13] this hierarchy is exploited computationally by including the ESCs only for a set J of subgraphs and then considering

z_J(G) = max { Σ_{i∈V} x_i : (x, X) ∈ TH^2(G), X_I ∈ STAB^2(G_I) ∀I ∈ J }.    (3)

Clearly, α(G) ≤ z_J(G) holds for every set J of subsets of V(G), so z_J(G) is an upper bound on α(G). One of the key remaining questions is how to solve (3). We will compare different implementations and relaxations of this problem in the rest of the paper and start by considering existing methods.
The most straightforward way to solve (3) is to include the ESCs in a convex hull formulation as presented in [12,13,11]. We now recall the basic features and follow the presentation from [11]. As the ESC for a subgraph G_I makes sure that X_I ∈ STAB^2(G_I) holds and the polytope STAB^2(G_I) is defined as the convex hull of the stable set matrices, the most intuitive way to formulate the ESC is as a convex combination. Towards that end, for a subgraph G_I of G induced by the subset I ⊆ V, let |S(G_I)| = t_I and let S(G_I) = { s^I_1, ..., s^I_{t_I} }. Then the i-th stable set matrix S^I_i of the subgraph G_I is defined as S^I_i = s^I_i (s^I_i)^T. As a result, the ESC X_I ∈ STAB^2(G_I) can be rewritten as a convex combination of stable set matrices, and it is natural to implement the ESC for the subgraph G_I as

X_I = Σ_{i=1}^{t_I} λ^I_i S^I_i,  λ^I ∈ Δ_{t_I},

where Δ_{t_I} is the t_I-dimensional simplex. This means that when including the ESC for the subgraph G_I into (2) we have t_I additional non-negative variables, one additional linear equality constraint for λ^I and the matrix equality constraint which couples X_I and λ^I. We denote the number of equality constraints that are induced by the matrix equality constraint by b_I and note that b_I ≤ k_I(k_I + 1)/2 holds. With this formulation (3) can equivalently be written as

z^C_J(G) = max { Σ_{i∈V} x_i : (x, X) ∈ TH^2(G), X_I = Σ_{i=1}^{t_I} λ^I_i S^I_i, λ^I ∈ Δ_{t_I} ∀I ∈ J },    (4)

so z_J(G) = z^C_J(G) holds. In practice, this SDP can be solved by interior point methods as long as the number of ESCs is of moderate size.
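The core identity of the convex hull formulation can be sketched in a few lines: any λ^I in the simplex yields a matrix Σ_i λ^I_i S^I_i in STAB^2(G_I). The stable set matrices below are those of a single edge (toy data chosen by us for illustration).

```python
def convex_combination(mats, lam):
    """X_I = sum_i lam_i * S_i for weights lam in the simplex (the coupling
    constraint of the convex hull formulation)."""
    assert all(l >= 0 for l in lam) and abs(sum(lam) - 1.0) < 1e-9
    k = len(mats[0])
    return [[sum(l * M[i][j] for l, M in zip(lam, mats)) for j in range(k)]
            for i in range(k)]

# Stable set matrices of a single edge {0, 1}: empty set, {0}, {1}.
S = [[[0, 0], [0, 0]],
     [[1, 0], [0, 0]],
     [[0, 0], [0, 1]]]
```

With uniform weights λ = (1/3, 1/3, 1/3) this produces the matrix diag(1/3, 1/3); note that every matrix in this hull has zero off-diagonal entries, reflecting that both endpoints of an edge can never be in a stable set together.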
Due to the fact that (4) becomes a huge SDP as soon as the number of ESCs |J| becomes large, Gaar and Rendl [12,13] proposed to use the bundle method to solve this SDP. The bundle method is an iterative procedure to find a global minimum of a non-smooth convex function and has been adapted for SDPs by Helmberg and Rendl [16]. As we use the bundle method only as a tool and do not enhance it any further, we refrain from presenting details here.

Relaxation Based on Inequalities that Represent Violated Facets
We will see later on that the computational costs of a B&B algorithm are enormous in the original version with the convex hull formulation (4), and they are still substantial with the bundle approach from [12,13]. Therefore, we suggest two alternatives.
First, we present a relaxation of calculating the Lovász theta function with ESCs (3) that has already been mentioned in [11], but has never been computationally exploited so far. The key ingredient for this relaxation is the following observation. The polytope STAB^2(G_I) is given by its extreme points, which are the stable set matrices of G_I. Due to Weyl's theorem (see for example [23]) it can also be represented by its facets. This means that there are finitely many inequalities such that the constraint X_I ∈ STAB^2(G_I) can be represented by these inequalities.
However, the facets and hence the inequalities depend on the stable set matrices and therefore on the subgraph G_I. Thus different subgraphs require different calculations that lead to different inequalities. Gaar [11, Lemma 3] showed that adding the ESC X_I ∈ STAB^2(G_I) to the SDP (2) calculating the Lovász theta function is equivalent to adding the constraint

X_I ∈ STAB^2(G^0_{k_I}),

where G^0_{k_I} denotes the graph on k_I vertices without edges. This implies that it is enough to calculate the facets of STAB^2(G^0_{k_I}) and include these facets for each subgraph G_I on k_I vertices, instead of calculating the facets of STAB^2(G_I) for each subgraph G_I separately. Let r_{k_I} be the number of facets of STAB^2(G^0_{k_I}) and let (F^{k_I}_i, f^{k_I}_i) for i ∈ {1, ..., r_{k_I}} be such that ⟨F^{k_I}_i, X_I⟩ ≤ f^{k_I}_i is an inequality representing the i-th facet of STAB^2(G^0_{k_I}). We obtained r_{k_I}, F^{k_I}_i and f^{k_I}_i for k_I ≤ 6 in the way suggested in [11]. For k_I ≥ 7 this computation is beyond reach, as r_7 is conjectured to be 217093472 [6].
If we were to include all facets of STAB^2(G^0_{k_I}) for each subgraph G_I to replace the ESCs in (3), then we would include a huge number of inequalities (r_5 = 368 and r_6 = 116764) and reach the limits of computing power rather soon. In order to reduce the number of inequalities, for each subgraph we include only those inequalities that represent facets violated by the current solution X*. To be more precise, let X* be the optimal solution of (3) for J = ∅, i.e., the optimal solution of calculating the Lovász theta function. Then we define the indices of significantly violated facets of G_I, i.e., facets where the corresponding inequalities are violated by at least ε_F, as

V_I = { i ∈ {1, ..., r_{k_I}} : ⟨F^{k_I}_i, X*_I⟩ − f^{k_I}_i ≥ ε_F },

where ε_F is a small constant to take care of numerical inaccuracies in the calculation of X*. Now we can further reduce the number of included inequalities in the following way. Although all (F^{k_I}_i, f^{k_I}_i) are pairwise different, two indices i, i' ∈ V_I may induce the same inequality. This is possible because F^{k_I}_i and F^{k_I}_{i'} might differ only in positions (j, j') with j, j' ∈ I and {j, j'} ∈ E; these differing entries are multiplied with zero due to [X*_I]_{j,j'} = 0. Hence, let V̄_I ⊆ V_I be a set that contains only one index among all indices in V_I which induce the same inequality. Then we obtain the following relaxation of (3), in which we include only inequalities that induce significantly violated facets:

z^F_J(G) = max { Σ_{i∈V} x_i : (x, X) ∈ TH^2(G), ⟨F^{k_I}_i, X_I⟩ ≤ f^{k_I}_i ∀i ∈ V̄_I ∀I ∈ J }.    (5)

Unfortunately, for k_I ≥ 7 it is not possible to store and check the facets of STAB^2(G^0_{k_I}) for violation in reasonable memory and time due to the huge number of facets. Hence, we can perform this relaxation only for subgraphs G_I of order k_I ≤ 6.
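The selection of significantly violated facets can be sketched as follows; the facet data (F_i, f_i) below is a hypothetical placeholder of ours, not actual facets of STAB^2(G^0_k).

```python
def frob(A, B):
    """Frobenius inner product <A, B>."""
    return sum(a * b for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def significantly_violated(facets, X_I, eps_f=5e-5):
    """Indices i with <F_i, X_I> - f_i >= eps_f, i.e. the index set V_I of the text."""
    return [i for i, (F, f) in enumerate(facets) if frob(F, X_I) - f >= eps_f]

# Toy data: two inequalities <F, X> <= f and a 2x2 candidate solution.
facets = [([[1, 0], [0, 1]], 1.0),   # satisfied: the trace is exactly 1
          ([[0, 1], [1, 0]], 0.5)]   # violated: the off-diagonal mass is 0.6
X_I = [[0.5, 0.3], [0.3, 0.5]]
```

Only the second inequality is reported as violated, so only that one would be added to the relaxation.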

Relaxation Based on Separating Hyperplanes
Next we consider another approach to implement a relaxation of (3), which can also be used for subgraphs G_I of order k_I ≥ 7 and which is based on including separating hyperplanes.
It uses the following fact. Let X̄ be any matrix in S^n and let P_I be the projection of X̄_I onto STAB^2(G_I). Then we can calculate the squared projection distance of X̄_I to STAB^2(G_I) as

min { ‖X̄_I − Σ_{i=1}^{t_I} λ^I_i S^I_i‖² : λ^I ∈ Δ_{t_I} },    (6)

where ‖·‖ denotes the Frobenius norm. The matrix of the quadratic form of (6), with entries ⟨S^I_i, S^I_j⟩, is symmetric and positive semidefinite because it is a Gram matrix, so (6) is a convex-quadratic program with t_I variables, a convex-quadratic objective function and one linear equality constraint. With the optimal solution λ̄^I of (6) the projection of X̄_I onto STAB^2(G_I) can be obtained as P_I = Σ_{i=1}^{t_I} λ̄^I_i S^I_i. Due to the separating hyperplane theorem (see for example Boyd and Vandenberghe [2]),

⟨X̄_I − P_I, X_I⟩ ≤ ⟨X̄_I − P_I, P_I⟩    (7)

is a hyperplane that separates X̄_I from STAB^2(G_I) such that X_I = P_I fulfills the inequality with equality. Obviously (7) is a relaxation of the ESC X_I ∈ STAB^2(G_I), so

z^H_J(G) = max { Σ_{i∈V} x_i : (x, X) ∈ TH^2(G), ⟨X̄_I − P_I, X_I⟩ ≤ ⟨X̄_I − P_I, P_I⟩ ∀I ∈ J }    (8)

is another relaxation of (3) that depends on the chosen X̄.
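The projection is solved in the paper as a convex QP with MOSEK; as a solver-free illustration, a simple Frank-Wolfe iteration over the simplex approximates the projection and yields the data of the separating hyperplane. The toy instance (stable set matrices of a single edge) is our own assumption for illustration.

```python
def fw_project(mats, Xbar, iters=2000):
    """Approximate projection of Xbar onto conv{S_1, ..., S_t} by Frank-Wolfe:
    minimize ||Xbar - sum_i lam_i S_i||_F^2 over the simplex (a sketch, not the
    exact QP solve used in the paper)."""
    t, k = len(mats), len(Xbar)
    lam = [1.0 / t] * t
    mix = lambda: [[sum(l * M[i][j] for l, M in zip(lam, mats)) for j in range(k)]
                   for i in range(k)]
    for it in range(iters):
        P = mix()
        # d/d lam_i of the objective is -2 <Xbar - P, S_i>; move toward the best vertex
        grad = [-2 * sum((Xbar[a][b] - P[a][b]) * M[a][b]
                         for a in range(k) for b in range(k)) for M in mats]
        j = min(range(t), key=grad.__getitem__)
        gamma = 2.0 / (it + 2)
        lam = [(1 - gamma) * l for l in lam]
        lam[j] += gamma
    return mix()

def hyperplane(Xbar, P):
    """Normal G = Xbar - P and right-hand side <G, P> of the separating cut."""
    k = len(Xbar)
    G = [[Xbar[i][j] - P[i][j] for j in range(k)] for i in range(k)]
    rhs = sum(G[i][j] * P[i][j] for i in range(k) for j in range(k))
    return G, rhs

# Stable set matrices of a single edge and a point outside their convex hull.
S = [[[0, 0], [0, 0]], [[1, 0], [0, 0]], [[0, 0], [0, 1]]]
Xbar = [[0.6, 0.2], [0.2, 0.6]]
```

For this instance the exact projection is diag(0.5, 0.5), and the cut is violated by X̄ by exactly the squared projection distance, since ⟨X̄ − P, X̄⟩ − ⟨X̄ − P, P⟩ = ‖X̄ − P‖².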

Theoretical Comparison of the Relaxations
We briefly comment on some theoretical properties of z^C_J(G), z^F_J(G) and z^H_J(G). We start by analyzing the upper bounds we obtain. Due to the fact that z^F_J(G) and z^H_J(G) replace each ESC by a relaxation of it,

α(G) ≤ z^C_J(G) ≤ z^F_J(G) ≤ ϑ(G)  and  α(G) ≤ z^C_J(G) ≤ z^H_J(G) ≤ ϑ(G)

holds for every graph G and every set J.
Another important observation is the following. Whenever we include the ESC of the subgraph G_I into the SDP computing z^C_J(G), the stable set problem is solved exactly on this subgraph G_I. However, when computing z^F_J(G) and z^H_J(G) we do not include the ESC but only a relaxed version of it. Hence, in the optimal solutions of these two relaxations it could still be the case that the ESC is not fulfilled, i.e., for the subgraph G_I we do not have an exact solution. Hence, it is possible that we still find violated inequalities (representing facets or hyperplanes) in these cases. As a consequence, for z^C_J(G) it does not make sense to include the ESC for the same subgraph twice, but for z^F_J(G) and z^H_J(G) it is possible that we want to include a relaxation of the very same ESC twice, with different facets or a different separating hyperplane.
Finally, let us consider the sizes of the SDPs to solve. In all three versions z^C_J(G), z^F_J(G) and z^H_J(G) we solve the SDP of the Lovász theta function (2) with additional constraints, so in all three SDPs we have a matrix variable of dimension n + 1 which has to be positive semidefinite (psd) and n + m + 1 linear equality constraints. In addition to that, for z^C_J(G) we have Σ_{I∈J} t_I non-negative variables and Σ_{I∈J} (b_I + 1) further equality constraints, whereas we have Σ_{I∈J} |V̄_I| inequalities for z^F_J(G), and |J| inequalities for z^H_J(G). Table 1 gives an overview of the different sizes of the SDPs.
Table 1: Sizes of the SDPs to compute z C J (G), z F J (G) and z H J (G)

Computational Comparison of the Relaxations
Before we perform a large scale comparison of z^C_J(G), z^F_J(G) and z^H_J(G) within a B&B algorithm, we investigate a small example. Recall that α(G) ≤ z^C_J(G) ≤ z^F_J(G) and α(G) ≤ z^C_J(G) ≤ z^H_J(G) holds for every set J.
All the computations were performed on an Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz with 32 GB RAM with MATLAB version R2016b and MOSEK version 8. In the computations, we use ε_F = 0.00005 and we include a separating hyperplane for a subgraph whenever the projection distance is greater than or equal to 0.00005.
We compute z^C_J(G), z^F_J(G) and z^H_J(G) for different sets J, which all consist of subsets of vertices of size k_I = 5 and only differ in the number q = |J| of included ESCs. To be more precise, we consider five different sets J = J_q with q = |J_q| ∈ {221, 443, 664, 886, 1107}. These values of q are chosen in such a way that the number of linear equality constraints induced by the matrix equalities from the ESCs in the convex hull formulation, i.e., Σ_{I∈J_q} b_I, is in {3000, 6000, 9000, 12000, 15000}. To choose the subsets in J_q, we first determine X* as the optimal solution of (3) for J = ∅, i.e., the optimal solution of calculating the Lovász theta function (2). Then we generate 3q subgraphs G_I of order k_I randomly and include those q subsets I into J_q for which the corresponding X*_I have the largest projection distances to STAB^2(G_I). For computing z^H_J(G) we choose X̄ = X*.
Table 4: The average projection distances of X_I to STAB^2(G_I) over all I ∈ J_q, before (i.e., X = X* is the optimal solution of (2)) and after (i.e., X ∈ {X^{C*}, X^{F*}, X^{H*}} is the optimal solution of z^C_J(G), z^F_J(G) and z^H_J(G), respectively) including the ESCs

q                           221      443      664      886      1107
before including ESCs       0.03095  0.03014  0.03057  0.02982  0.03032
after computing z^C_J(G)    0.00005  0.00004  0.00004  0.00004  0.00004
after computing z^F_J(G)    0.00151  0.00087  0.00115  0.00080  0.00051
after computing z^H_J(G)    0.00290  0.00256  0.00252  0.00196  0.00183

As a next step we investigate the projection distances. Recall that X* is the optimal solution of calculating the Lovász theta function (2). Let X^{C*}, X^{F*} and X^{H*} be the optimal solutions of calculating z^C_J(G), z^F_J(G) and z^H_J(G), respectively. In Table 4 we see that the average projection distance of X* is significantly larger than 0 before including the ESCs, so there are several violated ESCs. As soon as the ESCs are included, the average projection distance for X^{C*} is almost zero, so the ESCs are almost satisfied. In theory they should all be zero, but as MOSEK is not an exact solver, the optimal solution is subject to numerical inaccuracies. If we turn to z^F_J(G), the projection distances of X^{F*}_I are not as close to zero as those for X^{C*}, because z^F_J(G) is only a relaxation of z^C_J(G). Also the average projection distance of X^{H*}_I after solving z^H_J(G) is greater than the one obtained with z^F_J(G). This is in tune with the fact that the upper bounds obtained in the latter case are better for this instance. Furthermore, note that the average projection distances for X^{F*}_I and X^{H*}_I decrease as q increases. This is not surprising, as more ESCs mean that a bigger portion of the graph is forced into the stable set structure. Finally, we present in Table 5 the average number of violated facets. As one can see, the average number of violated facets before including the ESCs is already very low. This means that we do not include too many inequalities that represent facets in the computation of z^F_J(G). Furthermore, the average number of facets that are violated by X^{F*} decreases significantly compared to the average number of violated facets before including the relaxations of the ESCs. This is very encouraging, because one could imagine a scenario where we iteratively add violated facets of one subgraph and then the optimal solution violates different facets. However, the computations suggest that this does not happen too often. As in Table 4, we see that the more ESCs are included, the more stable set structure is captured and therefore the fewer facets are violated after including the relaxations of the ESCs.
Let us briefly summarize the key points of Example 2.1. Usually the upper bounds obtained by z^F_J(G) are only slightly worse than those of z^C_J(G), but the running times are only a fraction thereof. Unfortunately, this approach works only for subgraphs of order at most 6. Also z^H_J(G) yields good upper bounds in slightly better running time than z^F_J(G), but the bounds are a little bit worse. A major benefit of this approach is that it can be used for subgraphs of any order.
In a nutshell, both z F J (G) and z H J (G) are relaxations of z C J (G) that reduce the running times drastically by worsening the bounds only a little bit.As a result these bounds are very promising for including them into a B&B algorithm for stable set.

Branch and Bound for the Stable Set Problem
The aim of this section is to present our implementation of an exact branch and bound (B&B) algorithm for the stable set problem (1).

Our Branch and Bound Algorithm
We start by detailing the general setup of our B&B algorithm. Towards this end, keep in mind that in a solution of (1) the binary variable x_i is equal to 1 if vertex i is in the stable set, and x_i = 0 otherwise.
For our B&B algorithm for the stable set problem we choose a vertex i ∈ V(G) and divide the optimization problem in a node of the B&B tree into the subproblem where vertex i is in the stable set (i.e., we set the branching variable x_i = 1) and a second subproblem where i is not in the stable set (i.e., we set the branching variable x_i = 0).
It turns out that in each node of the B&B tree the optimization problem is of the form

P(G̃, c):  c + max { Σ_{i∈V(G̃)} x_i : x ∈ S(G̃) }    (9)

for some graph G̃ and some constant c, so in each node we have to solve a stable set problem and add a constant term c to the objective function value. Indeed, by fixing a branching variable x_i to 0 or 1, we shrink the graph and create subproblems that are again stable set problems of the form (9), but with a smaller graph and some offset c. To be more precise, for the subproblem with x_i = 1 the objective function value of (1) increases by 1 because there is one more vertex in the stable set. Furthermore, no neighbor of i can be in the maximum stable set because i is already in the stable set. So we can set x_j = 0 for all j ∈ N(i), where N(i) = { j ∈ V : {i, j} ∈ E } denotes the set of neighbors of the vertex i in G. Furthermore, we can delete i and all neighbors of i in the current graph G and search for a maximum stable set in the new graph G' = G[U] of smaller order, where U = V(G) \ ({i} ∪ N(i)). Hence, the subproblem to solve in the new node has the form P(G', c + 1). In the subproblem for x_i = 0 the vertex i is not in the stable set. We can remove the vertex i from the graph and search for a maximum stable set in the induced subgraph G' = G[U] with U = V(G) \ {i}. This boils down to solving P(G', c) in the new node of the B&B tree.
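The creation of the two subproblems can be sketched in a few lines (Python, illustrative; the paper's implementation is in C++).

```python
def branch_on(vertices, edges, i):
    """Split the stable set problem on vertex i. Returns the data
    (vertices, edges, offset_increase) of the child with x_i = 1
    (delete i and its neighbors, objective offset +1) and of the
    child with x_i = 0 (delete only i)."""
    nbrs = {u for e in edges for u in e if i in e} - {i}
    keep1 = [v for v in vertices if v != i and v not in nbrs]
    keep0 = [v for v in vertices if v != i]
    induced = lambda keep: [e for e in edges if e[0] in keep and e[1] in keep]
    return (keep1, induced(keep1), 1), (keep0, induced(keep0), 0)

c5 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
```

For the 5-cycle and branching vertex 0, the x_0 = 1 child is the single edge {2,3} with offset 1, and the x_0 = 0 child is the path 1-2-3-4 with offset 0.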
Note that whenever we delete a vertex i from the graph in the branching process, we set the corresponding variable x_i to a fixed value. Consequently, in every node, all vertices of the original graph G are either still present, or the value of the variable corresponding to them is implicitly fixed. Furthermore, we exclude all non-feasible solutions by deleting all neighbors in case of setting the branching variable to 1. Hence, every time we set the variable of a vertex to 1, the set of all vertices whose variables are set to 1 remains stable, and we only obtain feasible solutions of (1). Therefore, from a feasible solution of (9) in any node we can determine a feasible solution of (1) with the same objective function value.
The order of the graph considered in a node shrinks whenever we branch. As a consequence the B&B tree is of finite size. Whenever we reach a node with a subproblem on a graph with at most 23 vertices, we solve the problem by a fast enumeration procedure that can be employed whenever the subproblems become sufficiently small. To do so, we iterate easily, and especially fast, over all subsets of V in descending order with an implementation of Hinnant [17] in C++.
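A bitmask enumeration over all subsets, in the spirit of the fast procedure used for small subproblems, can be sketched as follows (Python; the paper uses a C++ implementation of Hinnant [17], so this is only an illustration of the idea).

```python
def alpha_bitmask(n, edges):
    """Stability number via iteration over all 2^n subsets encoded as bitmasks."""
    nbr = [0] * n
    for u, v in edges:
        nbr[u] |= 1 << v
        nbr[v] |= 1 << u
    popcount = lambda m: bin(m).count("1")
    best = 0
    for mask in range(1 << n):
        if popcount(mask) <= best:
            continue  # cannot improve the incumbent
        stable = True
        m = mask
        while m:
            v = (m & -m).bit_length() - 1  # lowest set bit = a vertex in the subset
            if nbr[v] & mask:
                stable = False  # v has a neighbor inside the subset
                break
            m &= m - 1
        if stable:
            best = popcount(mask)
    return best
```

Encoding neighborhoods as bit-strings makes the stability check a handful of word operations, which is what makes full enumeration viable up to roughly 23 vertices.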

Bounds
We do not want to solve the subproblems (9) in each node to optimality, but only compute an upper and a lower bound on the optimal objective function value. This boils down to obtaining bounds on the stability number of the graph considered in (9). We present details on lower bounds obtained by heuristics in Section 3.2.
As upper bounds we use the relaxations based on ESCs in four different versions, namely the convex hull formulation and the bundle method as detailed in Section 2.1, the violated facets version as described in Section 2.2, and the separating hyperplanes version as presented in Section 2.3. For choosing the subset J of ESCs, we follow the approach of [13] in our computations and perform several cycles, i.e., iterations of the repeat-until loop of solving (a relaxation of) (3) and then adjusting the set J, as illustrated in Algorithm 1.
In particular, we start with J = ∅, as preliminary computations have shown that carrying over ESCs to subproblems does not pay off, and in each cycle we update J depending on the current optimal solution X* of the SDP solved. We remove all previously added ESCs whose associated dual variables in the optimal solution have absolute value less than 0.01. For finding violated subgraphs (i.e., subgraphs for which the ESC does not hold in X*) we use the methods presented in [13]: a local search heuristic that finds submatrices of X* minimizing the inner product with some matrices. We let the local search heuristic run 9n times and add random subgraphs to obtain 9n subgraphs without duplicates. From these subgraphs we add the 3n most violated ones (subsets I with largest projection distance of X*_I to STAB^2(G_I)) to J. To reduce the computational effort, we stop cycling as soon as we do not expect to be able to prune within the next 5 cycles, assuming that the decrease of the upper bound z* in each future cycle is 0.75 times the average decrease of the previous cycles.
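The early-stopping rule for the cycles can be sketched as follows (Python; the parameter names and the exact extrapolation formula are our reading of the description above).

```python
def stop_cycling(z_values, prune_threshold, lookahead=5, damping=0.75):
    """Stop adding ESC cycles if, extrapolating future decreases with 'damping'
    times the average decrease per cycle so far, the upper bound is not expected
    to reach prune_threshold within 'lookahead' further cycles."""
    if len(z_values) < 2:
        return False  # no decrease observed yet, keep cycling
    avg_decrease = (z_values[0] - z_values[-1]) / (len(z_values) - 1)
    predicted = z_values[-1] - lookahead * damping * avg_decrease
    return predicted > prune_threshold
```

For example, with bounds 10.0, 9.5, 9.2 over three cycles the optimistic prediction after 5 more cycles is 9.2 - 5 * 0.75 * 0.4 = 7.7, so cycling stops if the pruning threshold is 7.0 but continues if it is 8.0.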
For our computations we use the implementation of the bundle method as detailed in [13], in particular with all specifications given in Section 6.3 therein, and we let the bundle method run for at most 15 iterations in each cycle.
Branching Rule. An important question in the B&B algorithm is how to choose the branching variable. In our implementation we follow the approach to first deal with the vertices for which we know least whether they will be in a maximum stable set or not ("difficult first"), in order to find an optimal solution soon.
All our upper bounds are based on the Lovász theta function (2), so we can use the intuition that the closer an entry x_i is to 1 in an optimal solution of (2), the more likely it is that vertex i is in a maximum stable set. Hence, we choose the variable x_i with value closest to 0.5 as branching variable.
More on branching rules for the stable set problem can for example be found in [10,28].

Algorithm 1: Upper bound computation at a B&B node

Input: Graph G at current node; method ∈ {convex-hull, bundle-method, violated-facets, separating-hyperplanes}
Output: Upper bound z*
1  J = ∅
2  repeat
3      if convex-hull then let X* be an optimal solution of (4) with objective function value z*
4      if bundle-method then let X* be the solution of approximately solving (4) using the bundle method, with objective function value z*
5      if violated-facets then let X* be an optimal solution of (5) with objective function value z*
6      if separating-hyperplanes then let X* be an optimal solution of (8) with objective function value z*, where we choose X̄ as the X* of the last cycle

Diving Strategy
We implemented a best-first search strategy, where we always consider the open subproblem with the largest upper bound next. We expect to find a large stable set in this branch of the B&B tree, because the difference between the global lower bound and the upper bound for this branch is the largest of all.
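Best-first selection of open subproblems can be realized with a heap keyed by the negated upper bound (a Python sketch; the paper's implementation is in C++).

```python
import heapq

class OpenNodes:
    """Best-first node pool: pop always returns the subproblem with the largest
    upper bound (heapq is a min-heap, so we push the negated bound; a counter
    breaks ties so nodes never need to be comparable)."""
    def __init__(self):
        self._heap = []
        self._tie = 0
    def push(self, upper_bound, node):
        heapq.heappush(self._heap, (-upper_bound, self._tie, node))
        self._tie += 1
    def pop(self):
        return heapq.heappop(self._heap)[2]
    def __len__(self):
        return len(self._heap)
```

Pushing nodes with bounds 5.2, 7.9 and 6.1 and popping three times returns them in the order 7.9, 6.1, 5.2.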

Heuristics to Find Large Stable Sets
It is crucial to find a good lower bound on α(G) early in the B&B algorithm in order to prevent the growth of the B&B tree and therefore solve the stable set problem more efficiently. In [26] and [27], for example, one can find references to some heuristics for finding a large stable set. In our implementation we use several different heuristics.
The first heuristic makes use of the vector x from the SDP formulation (2) of ϑ(G), which is available from the upper bound computation. This vector consists of n entries between 0 and 1. The value x_i gives us some intuition about the i-th vertex of the graph: the closer it is to 1, the more likely it is that the vertex is in a maximum stable set. Hence, we sort the vertices in descending order according to their value in x and then add the vertices in this order to a set S, such that the vertices of S always remain a stable set. In the following we refer to this heuristic as (HT).
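A sketch of (HT) in Python; the vector x below is hypothetical data of the kind an SDP solution would provide, not output from the paper's experiments.

```python
def heuristic_ht(x, edges):
    """(HT): sort vertices by x_i in descending order and greedily add each
    vertex that keeps the current set stable."""
    n = len(x)
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    S = set()
    for v in sorted(range(n), key=lambda i: -x[i]):
        if not (adj[v] & S):  # no neighbor of v is already in S
            S.add(v)
    return S

c5 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
```

On the 5-cycle with x = (0.9, 0.1, 0.8, 0.2, 0.3) the greedy pass picks vertices 0 and 2, which is a maximum stable set here.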
Furthermore, we use a heuristic introduced by Kahn I., Ahmad and Kahn M. in [19] based on vertex covers and vertex supports. A subset C of the vertices of a graph is called a vertex cover if for each edge at least one of the two incident vertices is in C. The vertex support of a vertex is defined as the sum of the degrees of all vertices in the neighborhood of this vertex. If C is a vertex cover, then clearly V(G) \ C is a stable set, so instead of searching for a maximum stable set we can search for a vertex cover of minimum cardinality. In a nutshell, the heuristic of [19] searches for a vertex with maximum vertex support in the neighborhood of the vertices with minimum vertex support. If there is more than one vertex with maximum support, one with maximum degree is chosen. This vertex is added to the vertex cover C and all incident edges are removed. These steps are repeated until there are no edges left in the graph. In the end we obtain a hopefully large stable set V(G) \ C. We denote this heuristic by (HVC).
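Our reading of the (HVC) steps as code (a Python sketch under the description above; tie-breaking details of [19] may differ):

```python
def heuristic_hvc(n, edges):
    """(HVC): repeatedly pick a max-support vertex in the neighborhood of the
    min-support vertices, add it to the cover C, and delete its edges.
    Returns the stable set V \\ C."""
    E = {frozenset(e) for e in edges}
    C = set()
    while E:
        deg = [0] * n
        adj = [set() for _ in range(n)]
        for e in E:
            u, v = tuple(e)
            adj[u].add(v)
            adj[v].add(u)
            deg[u] += 1
            deg[v] += 1
        # support of v = sum of degrees over the neighborhood of v
        support = [sum(deg[w] for w in adj[v]) for v in range(n)]
        active = [v for v in range(n) if deg[v] > 0]
        min_sup = min(support[v] for v in active)
        candidates = set()
        for v in active:
            if support[v] == min_sup:
                candidates |= adj[v]
        # maximum support, ties broken by maximum degree
        pick = max(candidates, key=lambda v: (support[v], deg[v]))
        C.add(pick)
        E = {e for e in E if pick not in e}
    return set(range(n)) - C
```

By construction the returned set is the complement of a vertex cover and hence stable; its size depends on how small the cover turns out.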
Finally, we use a heuristic proposed by Burer, Monteiro and Zhang in [4]. Their heuristic is based on the SDP formulation of the Lovász theta function with the additional restriction that the matrix variable is of low rank. With rank one, a local maximizer of the problem yields a maximal stable set, whereas with rank two the stable set corresponding to the local maximizer does not necessarily have to be maximal, but one can escape to a higher local maximizer which corresponds to a maximal stable set. The C source code of this heuristic is available online at [3]. We use this code with the parameter rank set to 2 and the number of so-called escapes set to 1 in a first version and set to 5 in a second version. Both parameter settings are among the choices that were tested in [4]. We refer to these versions as (H21) and (H25).
In the B&B algorithm we perform the heuristic (H25) in the root node with a time limit of 20 seconds. Afterwards we only perform heuristics in every third node of the B&B tree. (HT) and (HVC) are very fast, so we let them run in every node in which we run heuristics. Furthermore, in the first 10 nodes of the B&B tree we perform (H25) with the hope of finding a stable set of cardinality α(G) as soon as possible, but we do not allow the heuristic to iterate longer than 7 seconds. For graphs with less than 200 vertices we additionally perform (H21) with the running time limited to 1 second. On graphs with more vertices we perform (H25) with probability 0.05 and a time limit of 7 seconds, and otherwise (H21) with a time limit of 3 seconds. In a computational comparison in the master's thesis of Siebenhofer [25], this turned out to be the best combination of heuristics.

Computational Experiments
In this section we finally compare the B&B algorithms using the different upper bounds presented so far. In Table 6 and Figure 1 we compare the number of nodes generated in the B&B algorithm as well as the CPU time and the final gap. The abbreviations refer to the following bounds used: (CH) the convex hull formulation (4), (BD) the bundle method applied to (4), (VF) the violated facets relaxation (5), and (SH) the separating hyperplanes relaxation (8). For better comparability we also consider our B&B algorithm with only the Lovász theta function (2) as an upper bound and denote this version by (TH). Note that this boils down to solving (3) with J = ∅.
If we are not able to solve an instance within the time limit, we indicate this in Table 6 by a colored cell. A cell with a different color means that the run did not finish correctly, for example because MOSEK produced an error or ran out of memory. Before discussing the results, we give the details on the instances as well as on the soft- and hardware.

Benchmark Set and Experimental Setup
We consider several different instances. First, we consider the instances used in [13], i.e., torus graphs, random near-r-regular graphs and random graphs from the Erdős-Rényi model, and also several instances from the literature. Additionally, we consider all instances from the DIMACS challenge for which the gap between ϑ(G) and α(G) is larger than one (i.e., all instances which are not solved in the root node of our B&B algorithm) and that have at most 500 vertices. Moreover, we consider spin graphs, which are produced with the command ./rudy -spinglass3pm x x x 50 xxx1 (for x ∈ {5, 7, 9, 11}) by the graph generator rudy, which was written by Giovanni Rinaldi.

We implemented our B&B algorithm with different upper bounds in C++. All computations were performed on an Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz with 32 GB RAM. All programs were compiled with gcc version 5.4.0 with the optimization level -O3, and the CPU time was measured with std::clock. We set the random seed to zero. We use MOSEK [22] 8.1 in the methods (CH), (VF), and (SH) to solve the SDPs (4), (5), and (8), respectively. Furthermore, we use it within the method (BD) for solving the subproblems within the bundle method for computing an approximate solution of (4). Moreover, we use it to solve the QP (6) to compute the projection distance in (SH) and when updating J, i.e., while searching for subgraphs with violated ESCs and adding these subgraphs to J. The execution time of our B&B algorithm is limited to 4 hours, i.e., after this computation time we allow the B&B algorithm to finish solving the SDP of the already started node in the B&B tree and then stop.
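The Erdős-Rényi instances mentioned above can be generated with a few lines of code. A minimal sketch (function name and seed handling are ours, not part of the paper's setup):

```cpp
#include <cassert>
#include <random>
#include <vector>

// Sample an adjacency matrix from the Erdős-Rényi model G(n, p):
// each of the n(n-1)/2 possible edges is included independently
// with probability p. Illustrative sketch only.
std::vector<std::vector<bool>> erdos_renyi(int n, double p, unsigned seed) {
    std::mt19937 gen(seed);
    std::bernoulli_distribution edge(p);
    std::vector<std::vector<bool>> adj(n, std::vector<bool>(n, false));
    for (int i = 0; i < n; ++i)
        for (int j = i + 1; j < n; ++j)
            if (edge(gen))
                adj[i][j] = adj[j][i] = true;
    return adj;
}
```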

First Computational Experiments
We first want to compare the two versions of the B&B algorithm that use (CH) and (BD) to compute upper bounds, i.e., those versions whose upper bounds have already been established in the literature, but are now for the first time used within a B&B algorithm. First, by looking at Table 6 we observe that 20 instances were not solved correctly with (CH), which is due to the fact that the SDPs to be solved are huge and therefore MOSEK runs out of memory very often. Indeed, by using (BD), and hence avoiding such large SDPs, only 10 instances are not solved correctly, most of them due to other MOSEK errors. When we compare the number of B&B nodes for the instances where both (CH) and (BD) finished, we see that typically the number is the same, or there are slightly more nodes for (BD). This is plausible, as we only have an approximate solution of (3) when using (BD), but an exact solution of (3) with (CH).
We next take a closer look at the lines labeled (CH) and (BD) in the performance profiles in Figure 1. The B&B code using (BD) as bounding routine can solve many more instances within a given time than when (CH) is used. Once the time limit is reached, the gap is typically much lower for (BD) than it is for (CH).
In a nutshell, even though (BD) solves only a relaxation of (3), using it is much faster than using (CH), while it increases the number of B&B nodes only slightly. This justifies considering only a relaxation of (3).

Computational Experiments with New Relaxations
Up to now we have used the exact subgraph approach of [1] with the implementation proposed by Gaar and Rendl [12,13] in order to obtain tight upper bounds on the stability number within a B&B algorithm for solving the stable set problem. So far we have demonstrated the strength of the bounds by showing that the number of nodes in the B&B tree reduces drastically when using these bounds; however, the computational costs are enormous in the original version with the convex hull formulation (CH), and they are still substantial with the bundle approach (BD) from [12,13]. Therefore, we now discuss the numerical results of the B&B algorithm using the new relaxations (VF) and (SH).
Looking at Table 6, the first thing we observe is that both (VF) and (SH) never lead to a MOSEK error, hence they are more robust than the other versions, presumably due to the smaller size of the SDPs to be solved in the B&B nodes. For 9 instances both (VF) and (SH), and for 13 instances at least one of the two, are able to finish within the time limit on instances where neither (CH) nor (BD) was able to finish.
If we compare the number of B&B nodes in Table 6 for the finished instances, then we see that the number of B&B nodes for (VF) is typically smaller than that of (BD), which makes sense because in (BD) we only approximately solve (3), whereas in (VF) we solve a possibly very tight relaxation of (3) exactly. The number of B&B nodes needed by (SH) is typically a little larger than that of (VF), which is in line with the empirical finding that (5) gives better bounds than (8) in the small example considered in Section 2.5. In a nutshell, for finished runs (CH) and (VF) typically need roughly the same number of nodes, (SH) needs slightly more, and (BD) needs many more nodes in the B&B tree.
As for the running times, in Figure 1 we see that both (VF) and (SH) are faster than (BD) and considerably faster than (CH); (VF) is a little slower than (SH). For those instances that cannot be solved to optimality, the gap when the time limit is reached is roughly the same for (VF) and (SH), and it is considerably tighter than for (BD).
We have demonstrated that within a B&B algorithm both our relaxations (VF) and (SH) work better than the already existing SDP-based methods. In particular, using (SH) retains the majority of the strength of the upper bound (3) (i.e., it keeps the number of nodes in the B&B tree low) while reducing the running time, so that within the time limit almost 60% of the instances are solved, compared to (CH), which manages to solve only a bit more than 20%.

Conclusions
We introduced an algorithm for computing the stability number of a graph using semidefinite programming. While there exist several exact solution methods for finding the stability number, those based on semidefinite programming are rather rare.
We implemented a B&B algorithm using the SDP relaxations introduced in [12,13] together with heuristics from the literature. Moreover, we further relaxed the SDPs, obtaining more tractable SDPs that still produce high-quality upper bounds. This is confirmed by the numerical experiments, in which we compare the number of nodes to be explored in the B&B tree as well as the CPU times.
While SDPs produce strong bounds, the computational expense of solving them is sometimes huge. In particular, there is potential for improving the running time for solving SDPs with many constraints. We use MOSEK as a solver, which applies an interior point method to solve an SDP. For large instances it would be beneficial to use a solver based on the boundary point method [24,21] or DADAL [8]. Moreover, the solver computing ϑ+ as an upper bound [5], combined with the relaxations above, may push the performance of the B&B solver even further. Also, these other solvers are capable of doing warm starts, which can have big advantages within a B&B framework. Since all these implementations are available as MATLAB source code only, they would need to be translated to C or C++ first.

Example 2.1.
We consider a random graph G = G_{100,15} from the Erdős-Rényi model G(n, p) with n = 100 and p = 0.15. A random graph from this model has n vertices, and every edge is present with probability p independently of all other edges. For the chosen graph, ϑ(G_{100,15}) = 27.2003 and α(G_{100,15}) = 24 holds, so

(CH) We consider the upper bound obtained by the ESCs in the convex hull formulation (CH) described in Section 2.1,
(BD) solving this formulation with the bundle method (BD) as presented in Section 2.1,
(VF) relaxing this formulation by considering only violated facets (VF) as described in Section 2.2,
(SH) and using only separating hyperplanes (SH) as presented in Section 2.3.
Figure 1 (b): the gap (ub-lb)/lb after the time limit

Table 2:
The values of z_J^C(G), z_J^F(G) and z_J^H(G) for G = G_{100,15} for different sets J_q

Table 3:
The running times in seconds for computing z_J^C(G), z_J^F(G) and z_J^H(G) of Table 2

Table 5:
The average number of violated facets |V_I| over all subgraphs G_I with I ∈ J_q before and after including the ESCs and computing z_J^F(G) (columns: J_221, J_443, J_664, J_886, J_1107)

Table 6:
The number of nodes in our B&B algorithm