The min-cut and vertex separator problem

We consider graph three-partitions with the objective of minimizing the number of edges between the first two partition sets while keeping the size of the third block small. We review most of the existing relaxations for this min-cut problem and focus on a new class of semidefinite relaxations, based on matrices of order $2n+1$, which provide a good compromise between the quality of the bound and the computational effort needed to compute it. Here, $n$ is the order of the graph. Our numerical results indicate that the new bounds are quite strong and can be computed for graphs of medium size ($n \approx 300$) within a few minutes of computation time. Further, we exploit those bounds to obtain bounds on the size of the vertex separators. A vertex separator is a subset of the vertex set of a graph whose removal splits the graph into two disconnected subsets. We also present an elegant way of convexifying non-convex quadratic problems by using semidefinite programming. This approach results in bounds that can be computed with any standard convex quadratic programming solver.


Introduction
The vertex separator problem (VSP) for a graph is to find a subset of vertices (called a vertex separator) whose removal disconnects the graph into two components of roughly equal size. The VSP is NP-hard. Some families of graphs are known to have small vertex separators. Lipton and Tarjan [1] provide a polynomial-time algorithm which determines a vertex separator of size $O(\sqrt{n})$ in $n$-vertex planar graphs. Their result was extended to some other families of graphs such as graphs of fixed genus [2]. It is also known that trees, 3D-grids and meshes have small separators. However, there are graphs that do not have small separators.
The VSP arises in many different fields such as VLSI design [3] and bioinformatics [4]. Finding vertex separators of minimal size is an important issue in communication networks [5] and finite element methods [6]. The VSP also plays a role in divide-and-conquer algorithms for minimizing the work involved in solving systems of equations, see e.g., [2,7].
The vertex separator problem is related to the following graph partition problem. Let $A = (a_{ij})$ be the adjacency matrix of a graph $G$ with vertex set $V(G) = \{1, \ldots, n\}$ and edge set $E(G)$. Thus $A$ is a symmetric zero-one matrix of order $n$ with zero diagonal. We are interested in 3-partitions $(S_1, S_2, S_3)$ of $V(G)$ with prescribed cardinalities $|S_i| = m_i$, collected in the vector $m = (m_1, m_2, m_3)^T$. The associated min-cut problem is
$$\text{(MC)} \qquad \mathrm{OPT}_{MC} := \min\Big\{ \sum_{i \in S_1,\, j \in S_2} a_{ij} \;:\; (S_1, S_2, S_3) \text{ is a partition of } V(G),\ |S_i| = m_i,\ i = 1, 2, 3 \Big\}.$$
It asks to find a vertex partition $(S_1, S_2, S_3)$ with specified cardinalities, such that the number of edges joining vertices in $S_1$ and $S_2$ is minimized. We remark that this min-cut problem is known to be NP-hard [8]. It is clear that if $\mathrm{OPT}_{MC} = 0$ for some $m = (m_1, m_2, m_3)^T$ then $S_3$ separates $S_1$ and $S_2$. On the other hand, $\mathrm{OPT}_{MC} > 0$ shows that no separator $S_3$ for the cardinalities specified in $m$ exists.
A natural way to model this problem in 0-1 variables consists in representing the partition $(S_1, S_2, S_3)$ by the characteristic vectors $x_i$ corresponding to $S_i$. Thus $x_i \in \{0,1\}^n$ and $(x_i)_j = 1$ exactly if $j \in S_i$. Hence partitions with prescribed cardinalities are in one-to-one correspondence with $n \times 3$ zero-one matrices $X = (x_1, x_2, x_3)$ such that $X^T e = m$ and $Xe = e$ (throughout, $e$ denotes the vector of all ones of appropriate size). The first condition takes care of the cardinalities and the second one ensures that each vertex is in exactly one partition block. The number of edges between $S_1$ and $S_2$ is now given by $x_1^T A x_2$. Thus (MC) is equivalent to
$$\min\{x_1^T A x_2 : X = (x_1, x_2, x_3) \in \{0,1\}^{n \times 3},\ X^T e = m,\ Xe = e\}.$$
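The 0-1 model can be made concrete on a toy instance. The following is a minimal sketch, not the method used in the paper: it enumerates all partitions with the prescribed cardinalities and evaluates $x_1^T A x_2$ directly. The function names and the 5-vertex path graph are assumptions chosen for illustration, and the enumeration is only viable for very small $n$.

```python
from itertools import combinations

def cut_value(adj, s1, s2):
    """Number of edges between S1 and S2, i.e. x1^T A x2."""
    return sum(adj[i][j] for i in s1 for j in s2)

def brute_force_min_cut(adj, m):
    """Enumerate all partitions with |S1| = m[0], |S2| = m[1], |S3| = m[2]."""
    n = len(adj)
    assert sum(m) == n
    best_val, best_part = None, None
    for s1 in combinations(range(n), m[0]):
        rest = [v for v in range(n) if v not in s1]
        for s2 in combinations(rest, m[1]):
            s3 = tuple(v for v in rest if v not in s2)
            val = cut_value(adj, s1, s2)
            if best_val is None or val < best_val:
                best_val, best_part = val, (s1, s2, s3)
    return best_val, best_part

# Hypothetical toy instance: the path 0-1-2-3-4.
path5 = [[1 if abs(i - j) == 1 else 0 for j in range(5)] for i in range(5)]
opt, parts = brute_force_min_cut(path5, (2, 2, 1))
print(opt, parts)
```

For $m = (2, 2, 1)^T$ the optimum is zero, so the single vertex in $S_3$ is a separator; for $m = (2, 3, 0)^T$ the optimum is positive, since a connected graph has no empty separator.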
It is the main purpose of this paper to explore computationally efficient ways to get tight approximations of $\mathrm{OPT}_{MC}$. These will be used to find vertex separators of small size.
In Sect. 2 we provide an overview of various techniques from the literature to get lower bounds for the min-cut problem. Section 3 contains a series of new relaxations based on semidefinite programming (SDP). We also consider convexification techniques suitable for Branch-and-Bound methods based on convex quadratic optimization, see Sect. 4. In Sect. 5.1 we investigate reformulations of our SDP relaxations where strictly feasible points exist. This is crucial for algorithms based on interior-point methods. We also show equivalence of some of the SDP relaxations introduced here with SDP relaxations from the literature, see Sect. 5.2. In particular, we prove that the SDP relaxations with matrices of order $2n + 1$ introduced here are equivalent to the SDP relaxations with matrices of order $3n + 1$ from the literature. This reduction in the size of the matrix variable enables us to further improve and compute SDP bounds by adding the facet defining inequalities of the boolean quadric polytope. Symmetry ($m_1 = m_2$) is investigated in Sect. 6. We also address the problem of getting feasible 0-1 solutions by standard rounding heuristics, see Sect. 7. Section 8 provides computational results on various classes of graphs taken from the literature, and Sect. 9 contains final conclusions.

Example 1
The following graph will be used to illustrate the various bounding techniques discussed in this paper. The vertices are selected from a $17 \times 17$ grid using the following MATLAB commands to make them reproducible: rand('seed',27072015), Q = rand(17) < 0.33. This results in $n = 93$ vertices which correspond to the nonzero entries in $Q$. These are located at the grid points $(i, j)$ in case $Q_{ij} \neq 0$. Two vertices are joined by an edge whenever their distance is at most $\sqrt{10}$. The resulting graph with $|E| = 470$ is displayed in Fig. 1. For $m = (44, 43, 6)^T$ we find a partition which leaves 7 edges between $S_1$ and $S_2$. We will in fact see later on that this partition is optimal for the specific choice of $m$. Vertices in $S_3$ are marked by '*', the edges in the cut between $S_1$ and $S_2$ are plotted with the thickest lines, and those with one endpoint in $S_3$ are dashed.
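The edge rule of Example 1 can be sketched as follows. This is an illustration only: the helper name `grid_graph` and the small masks are assumptions, and the sketch does not reproduce the matrix $Q$ of the example, because MATLAB's legacy seeded generator is not available in Python.

```python
def grid_graph(mask, max_sq_dist=10):
    """Vertices are the nonzero cells of `mask`; two vertices are joined
    when their Euclidean distance is at most sqrt(max_sq_dist)."""
    pts = [(i, j) for i, row in enumerate(mask) for j, q in enumerate(row) if q]
    edges = [
        (a, b)
        for a in range(len(pts))
        for b in range(a + 1, len(pts))
        if (pts[a][0] - pts[b][0]) ** 2 + (pts[a][1] - pts[b][1]) ** 2 <= max_sq_dist
    ]
    return pts, edges

# Hypothetical mask: five selected cells in one grid row.
pts, edges = grid_graph([[1, 1, 1, 1, 1]])
print(len(pts), len(edges))
```

Comparing squared distances with 10 avoids floating-point square roots, so cells at offset $(1, 3)$ are joined while cells at offset $(0, 4)$ are not.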

Notation
The space of $k \times k$ symmetric matrices is denoted by $\mathcal{S}_k$, and the cone of $k \times k$ symmetric positive semidefinite matrices by $\mathcal{S}_k^+$. The space of symmetric matrices is considered with the trace inner product $\langle A, B \rangle = \mathrm{tr}(AB)$. We will sometimes also use the notation $X \succeq 0$ instead of $X \in \mathcal{S}_k^+$, if the order of the matrix is clear from the context.
We will use matrices having a block structure. We denote a sub-block of a matrix $Y$ such as in Eq. (6) by $Y_{ij}$ or $Y_i$. In contrast we indicate the $(i, j)$ entry of a matrix $Y$ by $Y_{i,j}$. For two matrices $X, Y \in \mathbb{R}^{p \times q}$, $X \ge Y$ means $X_{i,j} \ge Y_{i,j}$ for all $i, j$.
To denote the $i$th column of the matrix $X$ we write $X_{:,i}$. $J$ and $e$ denote the all-ones matrix and all-ones vector respectively. The size of these objects will be clear from the context. We set $E_{ij} = e_i e_j^T$ where $e_i$ denotes column $i$ of the identity matrix $I$. The 'vec' operator stacks the columns of a matrix, while the 'diag' operator maps an $n \times n$ matrix to the $n$-vector given by its diagonal. The adjoint operator of 'diag' is denoted by 'Diag'.
The Kronecker product $A \otimes B$ of matrices $A \in \mathbb{R}^{p \times q}$ and $B \in \mathbb{R}^{r \times s}$ is defined as the $pr \times qs$ matrix composed of $pq$ blocks of size $r \times s$, with block $ij$ given by $A_{i,j} B$, $i = 1, \ldots, p$, $j = 1, \ldots, q$, see e.g., [9]. The Hadamard product of two matrices $A$ and $B$ of the same size is denoted by $A \bullet B$ and defined as $(A \bullet B)_{i,j} = A_{i,j} \cdot B_{i,j}$ for all $i, j$.
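As a small illustration of the two products just defined, the following sketch implements them for matrices stored as nested lists. The function names and the $2 \times 2$ inputs are illustrative assumptions.

```python
def kron(a, b):
    """Kronecker product: block (i, j) of the result equals a[i][j] * b."""
    p, q = len(a), len(a[0])
    r, s = len(b), len(b[0])
    return [
        [a[i // r][j // s] * b[i % r][j % s] for j in range(q * s)]
        for i in range(p * r)
    ]

def hadamard(a, b):
    """Entrywise (Hadamard) product of two matrices of the same size."""
    return [[x * y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
K = kron(A, B)      # 4 x 4 matrix made of four 2 x 2 blocks
H = hadamard(A, B)  # 2 x 2 matrix
print(K)
print(H)
```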

Overview of relaxations for (MC)
Before we present our new relaxations for (MC) we find it useful to give a short overview of existing relaxation techniques. This allows us to set the stage for our own results and also to describe the rich structure of the problem which gives rise to a variety of relaxations.

Orthogonal relaxations based on the Hoffman-Wielandt inequality
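The problem (MC) can be viewed as optimizing a quadratic objective function over the set $P_m$ of partition matrices, where
$$P_m := \{X \in \{0,1\}^{n \times 3} : X^T e = m,\ Xe = e\}.$$
Historically, the first relaxations exploit the fact that the columns of $X \in P_m$ are pairwise orthogonal, more precisely $X^T X = \mathrm{Diag}(m)$. The objective function $x_1^T A x_2$ can be expressed as $\frac{1}{2} \langle A, X B X^T \rangle$ with
$$B := \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$
We recall the Hoffman-Wielandt theorem which provides a closed form solution to the following type of problem.

Theorem 1 (Hoffman and Wielandt) If $C, D \in \mathcal{S}_n$ have eigenvalues $\lambda_i(C)$ and $\lambda_j(D)$, then
$$\min\{\langle C, X D X^T \rangle : X^T X = I\} = \min\Big\{\sum_i \lambda_i(C)\, \lambda_{\phi(i)}(D) : \phi \text{ permutation}\Big\}.$$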
The minimum on the right hand side above is attained for the permutation which recursively maps the largest eigenvalue of C to the smallest eigenvalue of D.
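Donath and Hoffman [10] use this result to bound (MC) from below,
$$\mathrm{OPT}_{MC} \ge \tfrac{1}{2} \min\{\langle A, X B X^T \rangle : X^T X = \mathrm{Diag}(m)\}.$$
The fact that in this case $A$ and $B$ do not have the same size can easily be overcome, see for instance [11].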
To get a further tightening, we introduce the Laplacian $L$ associated to the adjacency matrix $A$, which is defined as
$$L := \mathrm{Diag}(Ae) - A.$$
By definition, we have $-L = A$ outside the main diagonal, and moreover $\mathrm{diag}(X B X^T) = 0$ for partition matrices $X$. Therefore the objective function of our problem satisfies $\langle -L, X B X^T \rangle = \langle A, X B X^T \rangle$. The vector $e$ is an eigenvector of $L$, in fact $Le = 0$, which is used in [11] to investigate the following relaxation
$$\mathrm{OPT}_{HW} := \min\{\tfrac{1}{2} \langle -L, X B X^T \rangle : X^T X = \mathrm{Diag}(m),\ Xe = e,\ X^T e = m\}. \qquad (4)$$
This relaxation also has a closed form solution based on the Hoffman-Wielandt theorem. To describe it, we need some more notation. Let $\lambda_2$ and $\lambda_n$ denote the second smallest and the largest eigenvalue of $L$, with normalized eigenvectors $v_2$ and $v_n$. Further, set $\tilde{m} = ($ [...]

Theorem 2 ([12]) In the notation above we have $\mathrm{OPT}_{HW} = -\frac{1}{2}(\lambda_2 \tau_1 + \lambda_n \tau_2)$ and the optimum is attained at [...]

This approach has been investigated for general graph partition with specified sizes $m_1, \ldots, m_k$. We refer to [11,13] for further details. More recently, Pong et al. [14] explore and extend this approach for the generalized min-cut problem. The solution given in closed form through the eigenvalues of the input matrices makes it attractive for large-scale instances, see [14]. The drawback lies in the fact that it is difficult to introduce additional constraints into the model while maintaining the applicability of the Hoffman-Wielandt theorem. This can be overcome by moving to relaxations based on semidefinite optimization.
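The identity $\langle -L, X B X^T \rangle = \langle A, X B X^T \rangle = 2\, x_1^T A x_2$ for partition matrices can be checked numerically. The sketch below does this for one hand-picked partition of a 5-vertex path; the helper names and the instance are assumptions made for illustration.

```python
def matmul(a, b):
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def transpose(a):
    return [list(col) for col in zip(*a)]

def trace_inner(a, b):
    """Trace inner product <A, B> = tr(AB) for square matrices."""
    n = len(a)
    return sum(a[i][j] * b[j][i] for i in range(n) for j in range(n))

def laplacian(adj):
    """L = Diag(Ae) - A."""
    n = len(adj)
    return [[(sum(adj[i]) if i == j else 0) - adj[i][j] for j in range(n)]
            for i in range(n)]

path = [[1 if abs(i - j) == 1 else 0 for j in range(5)] for i in range(5)]
# Partition S1 = {0, 1}, S2 = {2, 3}, S3 = {4} as an n x 3 zero-one matrix.
X = [[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 1, 0], [0, 0, 1]]
B = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]

XBXt = matmul(matmul(X, B), transpose(X))
cut = sum(path[i][j] * X[i][0] * X[j][1] for i in range(5) for j in range(5))
inner_A = trace_inner(path, XBXt)
neg_L = [[-v for v in row] for row in laplacian(path)]
inner_L = trace_inner(neg_L, XBXt)
print(cut, inner_A, inner_L)
```

Both inner products agree because $X B X^T$ has zero diagonal, which is exactly where $A$ and $-L$ differ.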

Relaxations using semidefinite optimization
The relaxations underlying the Hoffman-Wielandt theorem can equivalently be expressed using semidefinite optimization. We briefly describe this connection and then we consider more general models based on semidefinite optimization. The key tool here is the following theorem of Anstreicher and Wolkowicz [15], which can be viewed as an extension of the Hoffman-Wielandt theorem.

[...]

Based on this theorem, Povh and Rendl [16] show that the optimal value of (4) can equivalently be expressed as the optimal value of the following semidefinite program with matrix $Y$ of order $3n$.

Theorem 4 ([16]) We have [...]

A proof of this result is given in [16]. The proof implicitly shows that also the following holds.

Theorem 5
The following problem also has the optimal value $\mathrm{OPT}_{HW}$: [...]

We provide an independent proof of this theorem, which simplifies the arguments from [16]. To maintain readability, we postpone the proof to Sect. 10. The significance of these two results lies in the fact that we can compute optimal solutions for the respective semidefinite programs by simple eigenvalue computations.
The SDP relaxation from Theorem 4 can be viewed as moving from $X \in P_m$ to $Y = x x^T \in \mathcal{S}_{3n}$ with $x = \mathrm{vec}(X) \in \mathbb{R}^{3n}$ and replacing the quadratic terms in $x$ by the corresponding entries in $Y$. The constraint $\mathrm{tr}(Y_1) = m_1$ follows from $\mathrm{tr}(Y_1) = \mathrm{tr}(x_1 x_1^T) = x_1^T x_1 = m_1$. Similarly, $\mathrm{tr}(Y_{12}) = x_1^T x_2 = 0$ and $\mathrm{tr}(J Y_1) = (e^T x_1)^2 = m_1^2$. Thus these constraints simply translate orthogonality of $X$ into linear constraints on $Y$.
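The three lifted identities above can be verified on a concrete partition matrix. The following sketch builds $Y = x x^T$ with $x = \mathrm{vec}(X)$ and extracts the blocks $Y_1$ and $Y_{12}$; the helper names and the instance with $m = (2, 2, 1)^T$ are illustrative assumptions.

```python
def vec(X):
    """Stack the columns of X into one long vector."""
    return [X[i][j] for j in range(len(X[0])) for i in range(len(X))]

def block(Y, n, i, j):
    """Return the n x n sub-block (i, j) of Y (0-based block indices)."""
    return [row[j * n:(j + 1) * n] for row in Y[i * n:(i + 1) * n]]

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

def tr_J(M):
    """tr(J M) = e^T M e, the sum of all entries of M."""
    return sum(sum(row) for row in M)

n, m = 5, (2, 2, 1)
X = [[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 1, 0], [0, 0, 1]]
x = vec(X)
Y = [[a * b for b in x] for a in x]  # rank-one lifting Y = x x^T
Y1, Y12 = block(Y, n, 0, 0), block(Y, n, 0, 1)
print(trace(Y1), trace(Y12), tr_J(Y1), tr_J(Y12))
```

The last printed value illustrates the analogous identity $\langle J, Y_{12} \rangle = m_1 m_2$ for the off-diagonal block.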
In order to derive stronger SDP relaxations than the one from Theorem 4, one can exploit the fact that for $X \in P_m$ and $x = \mathrm{vec}(X)$ it follows that $\mathrm{diag}(Y) = \mathrm{diag}(x x^T) = x$. This constraint can be combined with the relaxation of $Y - x x^T = 0$ to $Y - x x^T \succeq 0$, which is well known to be equivalent to the following convex constraint
$$\begin{pmatrix} Y & x \\ x^T & 1 \end{pmatrix} \succeq 0.$$
The general case of $k$-partition leads to SDP relaxations with matrices of order $nk + 1$, see for instance Zhao et al. [17] and Wolkowicz and Zhao [18]. In our notation, the model (4.1) from [18] has the following form:

[...] (7)

Strictly speaking, the model (4.1) from [18] does not include the equations involving the $m_i$ above, but uses information from the barycenter of the feasible region to eliminate these constraints by reducing the dimension of the matrix variable $Y$. We make this more precise in Sect. 5 below.
Further strengthening is suggested by asking Y ≥ 0 leading to the strongest bound contained in [18].
The min-cut problem can also be seen as a special case of the quadratic assignment problem (QAP), as noted already by Helmberg et al. [12]. This idea is further exploited by van Dam and Sotirov [19], where the authors use the well known SDP relaxation for the QAP [17] as the SDP relaxation for the min-cut problem. The resulting QAP-based SDP relaxation for the min-cut problem is proven to be equivalent to (7), see [19].

Linear and quadratic programming relaxations
The model (MC) starts with specified sizes $m = (m_1, m_2, m_3)^T$ and tries to separate $V(G)$ into $S_1$, $S_2$ and $S_3$ so that the number of edges between $S_1$ and $S_2$ is minimized. This by itself does not yield a vertex separator, but it can be used to experiment with different choices of $m$ to eventually produce a separator.
Several papers consider the separator problem directly as a linear integer problem of the following form

(VS) [...]

The constraint $Xe = e$ makes sure that $X$ represents a vertex partition, the inequalities on the edges enforce that there are no edges joining $S_1$ and $S_2$, and the last constraints are cardinality conditions on $S_1$ and $S_2$. The objective function looks for a separator of smallest size. We refer to Balas and de Souza [20,21] who exploit the above integer formulation within Branch and Bound settings with additional cutting planes to find vertex separators in small graphs. Biha and Meurs [22] introduced new classes of valid inequalities for the vertex separator polyhedron and solved instances from [21] to optimality. Hager et al. [23,24] investigate continuous bilinear versions and show that

[...]

is equivalent to (VS). Even though this problem is intractable, as the objective function is indefinite, it is shown in [24] that this model can be used to produce heuristic solutions of good quality even for very large graphs.

A quadratic programming (QP) relaxation for the min-cut problem is derived in [14]. That convex QP relaxation is based on the QP relaxation for the QAP, see [15,25,26]. Numerical results in [14] show that QP bounds are weaker, but cheaper to compute than the strongest SDP bounds, see also Sect. 4.
Armbruster et al. [27] compared branch-and-cut frameworks for linear and semidefinite relaxations of the minimum graph bisection problem on large and sparse instances. Extensive numerical experiments show that the semidefinite branch-and-cut approach is superior to the simplex approach. In the sequel we mostly consider SDP bounds for the min-cut problem.

The new SDP relaxations
In this section we derive several SDP relaxations with matrix variables of order 2n + 1 and increasing complexity. We also show that our strongest SDP relaxation provides tight min-cut bounds on a graph with 93 vertices.
Motivated by Theorem 5 and also in view of the objective function $x_1^T A x_2$ of (MC), which makes explicit use only of the first two columns of $X \in P_m$, we propose to investigate SDP relaxations of (MC) with matrices of order $2n$, obtained by moving from $\binom{x_1}{x_2}$ to $\binom{x_1}{x_2}\binom{x_1}{x_2}^T$.
An integer programming formulation of (MC) using only $x_1$ and $x_2$ amounts to the following
$$\mathrm{OPT}_{MC} = \min\{x_1^T A x_2 : x_1, x_2 \in \{0,1\}^n,\ e^T x_1 = m_1,\ e^T x_2 = m_2,\ x_1 + x_2 \le e\}. \qquad (8)$$
This formulation has the disadvantage that its continuous relaxation ($0 \le x_i \le 1$) is intractable, as the objective function $x_1^T A x_2$ is indefinite. An integer linear version is obtained by linearizing the terms $(x_1)_i (x_2)_j$ in the objective function. We get

[...]

This is a binary LP with $2n$ binary and $2|E|$ continuous variables. Unfortunately, its linear relaxation gives a value of 0 (by setting an appropriate number of the nonzero entries in $u$ and $v$ to $\frac{1}{2}$). Even the use of advanced ILP technology, as for instance provided by GUROBI or CPLEX or similar packages, is only moderately successful on this formulation. We will argue below that some SDP models in contrast yield tight approximations to the optimal value of the integer problem.
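Why the linear relaxation is worthless can be seen with a few lines of arithmetic. The sketch below assumes the standard linearization $z_{ij} \ge u_i + v_j - 1$, $z_{ij} \ge 0$ of the products $u_i v_j$ (the displayed formulation is not reproduced here, so this form is an assumption), and it uses a uniform fractional point rather than the half-integral point mentioned in the text. All names are illustrative.

```python
from fractions import Fraction

def uniform_fractional_point(n, m1, m2):
    """Fractional u, v with e^T u = m1, e^T v = m2 and u + v <= e."""
    return [Fraction(m1, n)] * n, [Fraction(m2, n)] * n

def linearized_objective(edges, u, v):
    """Smallest feasible value of the sum of z_ij + z_ji over all edges,
    given u and v, when z >= 0 and z_ij >= u_i + v_j - 1."""
    total = 0
    for i, j in edges:
        total += max(0, u[i] + v[j] - 1) + max(0, u[j] + v[i] - 1)
    return total

edges = [(0, 1), (1, 2), (2, 3), (3, 4)]  # hypothetical path on 5 vertices
u, v = uniform_fractional_point(5, 2, 2)
lp_value = linearized_objective(edges, u, v)

# For comparison: the 0-1 point with S1 = {0, 1} and S2 = {2, 3}.
u_int, v_int = [1, 1, 0, 0, 0], [0, 0, 1, 1, 0]
int_value = linearized_objective(edges, u_int, v_int)
print(lp_value, int_value)
```

Since $u_i + v_j - 1 = (m_1 + m_2)/n - 1 \le 0$ at the uniform point, every linearization variable can be set to zero, whereas the integral point pays for the edge it cuts.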
Moving to the matrix space, we consider

[...]

where we set

[...]

Looking at Theorem 5 we consider the following simple SDP as our starting relaxation.

($SDP_0$) [...] $\begin{pmatrix} Y & y \\ y^T & 1 \end{pmatrix} \succeq 0$. (13)

This SDP captures the constraints from Theorem 5 and has $(2n + 1) + 6$ linear equality constraints. We have replaced $Y \succeq 0$ by the stronger condition $Y - y y^T \succeq 0$, and we also replaced $-L$ by $A$ in the objective function.
There is an immediate improvement by exploiting the fact that $x_1 \bullet x_2 = 0$, which translates into
$$\mathrm{diag}(Y_{12}) = 0. \qquad (14)$$
This adds another $n$ equations and makes $\mathrm{tr}(Y_{12} + Y_{12}^T) = 0$ redundant. We call $SDP_1$ the relaxation obtained from $SDP_0$ by replacing $\mathrm{tr}(Y_{12} + Y_{12}^T) = 0$ with $\mathrm{diag}(Y_{12}) = 0$. The equations in (14) are captured by the 'gangster operator' in [18]. Moreover, once these constraints are added, it will make no difference whether the adjacency matrix $A$ or $-L$ is used in the objective function.
Up to now we have not yet considered the inequality
$$x_1 + x_2 \le e, \qquad (15)$$
where $x_1$ (resp. $x_2$) represents the first $n$ (resp. last $n$) coordinates of $y$.
Proof The submatrix in (13) indexed by $(i, n+i, 2n+1)$, $i \in \{1, \ldots, n\}$, is positive semidefinite, i.e.,
$$\begin{pmatrix} y_i & 0 & y_i \\ 0 & y_{i+n} & y_{i+n} \\ y_i & y_{i+n} & 1 \end{pmatrix} \succeq 0.$$
The proof of the lemma follows from the following inequality
$$(1, 1, -1) \begin{pmatrix} y_i & 0 & y_i \\ 0 & y_{i+n} & y_{i+n} \\ y_i & y_{i+n} & 1 \end{pmatrix} (1, 1, -1)^T = 1 - y_i - y_{i+n} \ge 0.$$

In order to obtain additional linear constraints for our SDP model, we consider (15) and

[...]

which we multiply pairwise and apply linearization. A pairwise multiplication of individual inequalities from (15) yields

[...]

We also get

[...]

by multiplying individual constraints from (15) and $y \ge 0$. Finally we get in a similar way by multiplying with $e - y \ge 0$

[...]

The inequalities (16)-(19) are based on a technique known as the reformulation-linearization technique (RLT) that was introduced by Sherali and Adams [28].
In order to strengthen our SDP relaxation further, one can add the following facet defining inequalities of the boolean quadric polytope (BQP), see e.g., [29],

[...] (20)

In our numerical experiments we will consider the following relaxations which we order according to their computational effort.

[Table: Name, Constraints, Complexity]

The first two relaxations can potentially produce negative lower bounds, which would make them useless. The remaining relaxations yield nonnegative bounds due to the nonnegativity condition on $Y$ corresponding to the nonzero entries in $M$.
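To illustrate what BQP inequalities look like, the sketch below checks the standard triangle-type inequalities of the boolean quadric polytope on all rank-one 0-1 matrices of order 3, and on one fractional matrix that violates them. Whether this list coincides exactly with (20) is an assumption, since (20) is not reproduced in the text; the function name and the test matrices are illustrative.

```python
from itertools import permutations, product

def triangle_violations(Y, i, j, k):
    """Left-hand side minus right-hand side of each inequality;
    a positive entry means the inequality is violated."""
    return [
        -Y[i][j],                                  # Y_ij >= 0
        Y[i][j] - Y[i][i],                         # Y_ij <= Y_ii
        Y[i][i] + Y[j][j] - 1 - Y[i][j],           # Y_ii + Y_jj <= 1 + Y_ij
        Y[i][k] + Y[j][k] - Y[k][k] - Y[i][j],     # Y_ik + Y_jk <= Y_kk + Y_ij
        Y[i][i] + Y[j][j] + Y[k][k]
        - Y[i][j] - Y[i][k] - Y[j][k] - 1,         # sum of diag <= sum of off-diag + 1
    ]

# Every matrix y y^T with y in {0,1}^3 satisfies all inequalities.
all_valid = all(
    max(triangle_violations([[a * b for b in y] for a in y], i, j, k)) <= 0
    for y in product((0, 1), repeat=3)
    for i, j, k in permutations(range(3))
)

# A fractional matrix with diag(Y) = (1/2, 1/2, 1/2) and zero off-diagonal.
frac = [[0.5, 0, 0], [0, 0.5, 0], [0, 0, 0.5]]
worst = max(triangle_violations(frac, 0, 1, 2))
print(all_valid, worst)
```

The fractional matrix is positive semidefinite, so it is inequalities of this kind, not the semidefinite constraint, that cut it off.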
Example 2 We continue with the example from the introduction and provide the various lower bounds introduced so far, see Table 1. We also include the value of the best known feasible solution, which we found using a rounding heuristic described in Sect. 7 below. In most of these cases the $\mathrm{OPT}_{eig}$, $SDP_0$ and $SDP_1$ bounds are negative, but we know that $\mathrm{OPT}_{MC} \ge 0$. In contrast, the strongest bound $SDP_4$ proves optimality of all the solutions found by our heuristic. Here we do not solve $SDP_3$ and $SDP_4$ exactly. The $SDP_3$ (resp. $SDP_4$) bounds are obtained by adding the most violated inequalities of type (16)-(19) (resp. (16)-(19) and (20)) to $SDP_2$. The cutting plane scheme adds at most $2n$ violated valid constraints in each iteration and performs at most 15 iterations. It takes about 6 minutes to compute the $SDP_4$ bound for fixed $m$. We compute the $SDP_2$ (resp. $SDP_3$) bound in about 5 s (resp. 2 minutes) for fixed $m$.
For comparison purposes, we also computed linear programming (LP) bounds. The LP bound $RLT_3$ incorporates all constraints from $SDP_3$ except the SDP constraint, including of course standard linearization constraints. The $RLT_3$ bound for $m = (45, 44, 4)^T$ is zero. Similarly, we derive the linear programming bound $RLT_4$ that includes all constraints from $SDP_4$ except the SDP constraint. We solve $RLT_4$ approximately by a cutting plane scheme that first adds all violated (16)-(19) constraints and then at most $4n$ violated constraints of type (20), in each iteration of the algorithm. After 100 such iterations the bound $RLT_4$ was still zero.
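The control flow of such a cutting plane scheme can be sketched as follows. This is a generic skeleton under stated assumptions, not the authors' implementation: `solve` and `separate` are hypothetical callables standing in for an SDP solver and a separation routine, and the toy problem at the end only exercises the loop.

```python
def select_cuts(violations, n, tol=1e-8):
    """Keep the at most 2n most violated cuts; `violations` is a list of
    (violation amount, cut) pairs."""
    violated = [(amount, cut) for amount, cut in violations if amount > tol]
    violated.sort(key=lambda t: t[0], reverse=True)
    return [cut for _, cut in violated[:2 * n]]

def cutting_plane(solve, separate, n, max_rounds=15):
    """Alternate between solving the relaxation and adding violated cuts."""
    cuts = []
    bound, sol = solve(cuts)
    for _ in range(max_rounds):
        new_cuts = select_cuts(separate(sol), n)
        if not new_cuts:
            break  # no violated inequality left
        cuts.extend(new_cuts)
        bound, sol = solve(cuts)
    return bound, cuts

# Toy stand-in: minimize x >= 0 subject to cuts of the form x >= c.
candidates = [1, 3, 2]

def toy_solve(cuts):
    x = max(cuts, default=0)
    return x, x

def toy_separate(x):
    return [(c - x, c) for c in candidates]

bound, used = cutting_plane(toy_solve, toy_separate, n=1)
print(bound, used)
```

Capping the number of cuts per round keeps each re-solve cheap, at the price of possibly needing more rounds.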
We find it remarkable that even the rather expensive model $SDP_3$ is not able to approximate the optimal solution value within 'reasonable' limits. On the other hand, the 'full' model $SDP_4$ is strong enough to actually solve these problems. We will see in the computational section that only the full model is strong enough to actually get good approximations also on instances from the literature.

In our case the objective function $f(x)$ is not convex. It can however be reformulated as a convex quadratic $L(x)$ such that $f(x) = L(x)$ for all feasible 0-1 solutions by exploiting the fact that $x \bullet x = x$ for 0-1 valued $x$. This convexification is based on Lagrangian duality and has a long history in nonlinear optimization, see for instance Hammer and Rubin [30] and Shor [31,32]. Lemarechal and Oustry [33], Faye and Roupin [34] and Billionnet et al. [35] consider convexification of quadratic problems and the connection to semidefinite optimization. We briefly summarize the theoretical background behind this approach.

Convexification of 0-1 problems
We first recall the following well-known facts from convex analysis. Let $f(x) := x^T Q x + 2 q^T x + q_0$ for $q_0 \in \mathbb{R}$, $q \in \mathbb{R}^n$ and $Q \in \mathcal{S}_n$. Then
$$\inf_x f(x) > -\infty \iff Q \succeq 0 \text{ and } \exists\, \xi \in \mathbb{R}^n \text{ such that } q = Q\xi,$$
due to the first order ($Qx + q = 0$) and second order ($Q \succeq 0$) necessary optimality conditions. The following proposition summarizes what we need later for convexification.

Proposition 7
Let $Q \succeq 0$ and $q = Q\xi$ for some $\xi \in \mathbb{R}^n$ and $q_0 \in \mathbb{R}$, and $f(x) = x^T Q x + 2 q^T x + q_0$. [...]

Proof For completeness we include the following short arguments. The first statement follows from $\nabla f(x) = 0$. To see the last statement we use the factorization

[...]

Using a Schur-complement argument, this implies

[...]

which shows that the supremum is attained at $-\xi^T Q \xi$ with optimal value $q_0 - \xi^T Q \xi$. Finally, the second problem is the dual of the third, with strictly feasible solution $X = I$, $x = 0$.
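The value $q_0 - \xi^T Q \xi$ appearing in the proof can also be checked by completing the square. The short derivation below uses only the assumptions $Q \succeq 0$ and $q = Q\xi$ stated in the proposition.

```latex
\begin{align*}
f(x) &= x^T Q x + 2 q^T x + q_0
      = x^T Q x + 2 \xi^T Q x + q_0 \\
     &= (x + \xi)^T Q (x + \xi) + q_0 - \xi^T Q \xi
      \;\ge\; q_0 - \xi^T Q \xi ,
\end{align*}
% equality holds for x = -\xi, since (x + \xi)^T Q (x + \xi) >= 0 when Q is psd
```

Hence the infimum of $f$ is attained at $x = -\xi$, which is also a solution of the first order condition $Qx + q = 0$.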
We are going to describe the convexification procedure for a general problem of the form

[...] (21)

for suitable data $D$, $d$, $C$, $c$. In case of (MC) we have [...]

The key idea is to consider a relaxation of the problem where integrality of $x$ is expressed by the quadratic equation $x \bullet x = x$. Let us consider the following simple relaxation

[...] (22)

to explain the details. Its Lagrangian is given by [...] The associated Lagrangian dual reads [...] Ignoring values $(u, \alpha)$ where the infimum is $-\infty$, this is by Proposition 7 equivalent to

$\sup_{u, \alpha}$ [...] (23)

This is a semidefinite program with strictly feasible points (by selecting $u$ and $-\sigma$ large enough and $\alpha = 0$). Hence its optimal value is equal to the value of the dual problem, which reads

[...] (24)

Let $(u^*, \alpha^*, \sigma^*)$ be an optimal solution of (23). Then we have [...] Using Proposition 7 again we get the following equality [...]

The proposition also shows that $L(x; u^*, \alpha^*)$ is convex (in $x$) and moreover $L(x; u^*, \alpha^*) = f(x)$ for all integer feasible solutions $x$. The convex quadratic programming relaxation of problem (21), obtained from (22), consists in minimizing $L(x; u^*, \alpha^*)$ over the polyhedron [...]

We close the general description of convexification with the following observation which will be used later on.
Feasibility of $(X^*, x^*)$ for (24) shows that the last term is equal to [...] Finally, using (26), this term is lower bounded by $L(x^*; u^*, \alpha^*)$.
The relaxation (22) is rather simple. In [35] it is suggested to include all equations obtained by multiplying the constraints $e^T x_1 = m_1$, $e^T x_2 = m_2$ with $x_{(i)}$ and $1 - x_{(i)}$, where $x_{(i)}$ denotes the $i$-th component of $x$. The inclusion of quadratic constraints is particularly useful, as their multipliers provide additional degrees of freedom for the Hessian of the Lagrangian function. The main insight from the analysis so far can be summarized as follows. Given a relaxation of the original problem, such as (22) above, form the associated semidefinite relaxation obtained from relaxing $X - x x^T = 0$ to $X - x x^T \succeq 0$. Then the optimal solution of the dual problem yields the desired convexification.
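The mechanism behind the convexification can be illustrated without solving an SDP. The sketch below uses only the multipliers $u$ of $x \bullet x = x$ and picks them by diagonal dominance, which is a simple stand-in and not the optimal dual solution $u^*$ of the text; the matrix $M = \frac{1}{2}\begin{pmatrix} 0 & A \\ A & 0 \end{pmatrix}$ is chosen so that $x^T M x = x_1^T A x_2$, and all names are illustrative.

```python
from itertools import product

def lifted_matrix(adj):
    """M = 1/2 [[0, A], [A, 0]], so that x^T M x = x1^T A x2 for x = (x1, x2)."""
    n = len(adj)
    M = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            M[i][n + j] = 0.5 * adj[i][j]
            M[n + i][j] = 0.5 * adj[i][j]
    return M

def quad(M, x):
    return sum(M[i][j] * x[i] * x[j] for i in range(len(x)) for j in range(len(x)))

def diag_shift(M):
    """u_i = sum_j |M_ij| makes M + Diag(u) diagonally dominant, hence psd."""
    return [sum(abs(v) for v in row) for row in M]

def lagrangian(M, u, x):
    """L(x; u) = x^T M x + sum_i u_i (x_i^2 - x_i)."""
    return quad(M, x) + sum(ui * (xi * xi - xi) for ui, xi in zip(u, x))

triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]  # hypothetical 3-vertex graph
M = lifted_matrix(triangle)
u = diag_shift(M)

# L agrees with f on every 0-1 point because x_i^2 = x_i there.
agree = all(lagrangian(M, u, x) == quad(M, x) for x in product((0, 1), repeat=6))
half = [0.5] * 6
print(agree, quad(M, half), lagrangian(M, u, half))
```

At fractional points the convex function lies below $f$; choosing the multipliers from the semidefinite dual, as described above, is what makes the resulting bound as tight as possible.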

Convexifying (8)
Let us now return to (MC) given in (8). We have $x = \binom{x_1}{x_2} \in \mathbb{R}^{2n}$ and $f(x) = x^T M x$ where $M$ is given in (9). The following models are natural candidates for convexification. We list them in increasing order of computational effort to determine the convexification.
Let $\sigma^*, u^*$ be an optimal solution to the last problem above. The Lagrangian [...] The convex QP bound based on this convexification is therefore given by [...] The unconstrained minimization of $L$ would result in $\inf\{L(x; u^*) : x \in \mathbb{R}^{2n}\}$.
• In the previous model, no information of the $m_i$ is used in the convexification. We next include also the equality constraints [...] The Lagrangian relaxation corresponds to $SDP_0$. The dual solution of $SDP_0$ yields again the desired convexification.
• Finally, we replace $x_1^T x_2 = 0$ by $x_1 \bullet x_2 = 0$ above and get $SDP_1$ as Lagrangian relaxation. In this case we know from Lemma 6 and Proposition 8 that the convex QP bound is equal to the value of $SDP_1$, which in turn is equal to the unconstrained minimum of the Lagrangian $L$.

Example 3 We apply the convexification as explained above to the example graph from the introduction, see Table 2. In the first two cases we provide the unconstrained minimum along with the resulting convex quadratic programming bound. In case of $SDP_1$ we know from the previous proposition that the unconstrained minimum agrees with the optimal value of $SDP_1$. These bounds are not very useful, as we know a trivial lower bound of zero in all cases. On the other hand, the convex QP bound is computationally cheap compared to solving SDP, and may be useful in a Branch-and-Bound process.

Convex quadratic programming bounds may play a crucial role in solving non-convex quadratic 0-1 problems to optimality, see for instance the work of Anstreicher et al. [25] on the quadratic assignment problem. Here we presented a general framework for obtaining convexifications, partly reiterating the approach from [35]. Compared to Table 1, we note that the bounds based on convexification and using convex QP are not competitive with the strongest SDP bounds. On the other hand, these bounds are much cheaper to compute, so their use in a Branch and Bound code may still be valuable. Implementing such bounds within a Branch and Bound framework is out of the scope of the present paper, so we will leave this for future research.

The Slater feasible versions of the SDP relaxations
In this section we present the Slater feasible versions of the SDP relaxations introduced here. In particular, in Sect. 5.1 we derive the Slater feasible version of the relaxation $SDP_1$, and in Sect. 5.2 we present the Slater feasible version of the SDP relaxation (7) from [18]. In Sect. 5.2 we prove that the SDP relaxations $SDP_1$ and (7) are equivalent, and that $SDP_3$ with additional nonnegativity constraints on all matrix elements is equivalent to the strongest SDP relaxation from [18]. We actually show here that our strongest SDP relaxation with matrix variable of order $2n + 1$, i.e., $SDP_4$, dominates the currently strongest SDP relaxation with matrix variable of order $3n + 1$.

The projected new relaxations
In this section we take a closer look at the feasible region of our basic new relaxation $SDP_0$. The following lemma will be useful.
[...], and there exists $a \neq 0$ such that [...] From this the claim follows.
We introduce some notation to describe the feasible region of $SDP_0$. Let $Z$ be a symmetric $(2n + 1) \times (2n + 1)$ matrix with the block form as in the definition of $SDP_0$. We define [...] (29), (10), (12), (13) and tr(J [...] The set $F_1$ differs from the feasible region of $SDP_0$ only in the constraint $\mathrm{tr}(Y_{12} + Y_{12}^T) = 0$, which is not included in $F_1$.

Lemma 10
[...] forms a basis of the orthogonal complement to $T$, i.e., $W^T T = 0$. Using the previous lemma, we conclude that $Z \in F_1$ implies that $Z = W U W^T$ for some $U \in \mathcal{S}_{2n-1}^+$. Let us also introduce the set
$$F_2 := \{W U W^T : U \in \mathcal{S}_{2n-1}^+,\ U_{2n-1,2n-1} = 1,\ \mathrm{diag}(W U W^T) = W U W^T e_{2n+1}\}.$$
Here $e_{2n+1}$ is the last column of the identity matrix of order $2n + 1$. In the following theorem we prove that the sets $F_1$ and $F_2$ are equal. Similar results are also shown in connection with the quadratic assignment problem, see e.g., [17].
Proof We first show that $F_1 \subseteq F_2$ and take $Z \in F_1$. The previous lemma implies that $Z$ is of the form $Z = W U W^T$ with $U \succeq 0$. $Z_{2n+1,2n+1} = 1$ implies that $U_{2n-1,2n-1} = 1$ due to the way $W$ is defined in (32). The main diagonal of $Z$ is equal to its last column, which translates into $\mathrm{diag}(Z) = Z e_{2n+1}$, so $Z \in F_2$. Conversely, consider $Z = W U W^T \in F_2$ and let it be partitioned as in (29). We have $W^T T = 0$, so $ZT = 0$. Multiplying out columnwise and using the block form of $Z$ we get
$$Y_i e = m_i y_i, \quad y_i^T e = m_i \quad \text{and} \quad Y_{12}^T e = m_1 y_2, \quad Y_{12} e = m_2 y_1.$$
From this we conclude that [...]

We conclude by arguing that $F_2$ contains matrices where $U \succ 0$. To see this we note that the barycenter of the feasible set is

$\hat{Z} =$ [...]

Since $\hat{Z} \in F_2$ it has the form $\hat{Z} = W \hat{U} W^T$. It can be derived from the results in Wolkowicz and Zhao [18, Theorem 3.1] that it has a two-dimensional nullspace, so $\hat{U} \succ 0$.

This puts us in a position to rewrite our relaxations as SDPs having Slater points. In case of $SDP_1$, we only need to add the condition $\mathrm{diag}(Y_{12}) = 0$ to the constraints defining $F_2$. It can be expressed in terms of $Z$ as $e_i^T Z e_{n+i} = 0$, $i = 1, \ldots, n$. Here $e_i$ and $e_{n+i}$ are the appropriate columns of the identity matrix of order $2n + 1$. We extend the matrix in the objective function by a row and column of zeros, and get the following Slater feasible version of $SDP_1$ in matrices $U \in \mathcal{S}_{2n-1}$:

($SDP_{1project}$) [...] $U_{2n-1,2n-1} = 1$, $U \succeq 0$.

The projected Wolkowicz-Zhao relaxation and equivalent relaxations
The Slater feasible version of the SDP relaxation (7) is derived in [18] and further exploited in [14]. The matrix variable $\bar{Z}$ in (7) is of order $3n + 1$ and has the following block structure

[...] (33)

As before we can identify a nullspace common to all feasible matrices. In this case it is given by the columns of $\bar{T}$, see [18], where

[...]

Note that this is a $(3n + 1) \times (n + 3)$ matrix. It has rank $n + 2$, as the sum of the first three columns is equal to the sum of the last $n$ columns, see also [18]. A basis of the orthogonal complement to $\bar{T}$ is given by

$\bar{W} =$ [...]

As before, we argue that feasible $\bar{Z}$ are of the form $\bar{Z} = \bar{W} U \bar{W}^T$ with additional suitable constraints on $U \in \mathcal{S}_{2n-1}$. It is instructive to look at the last $n$ columns of $\bar{Z} \bar{T}$ which, due to the block structure of $\bar{Z}$, translate into the following equations:
$$Y_1 + Y_{12} + Y_{13} = y_1 e^T, \quad Y_{21} + Y_2 + Y_{23} = y_2 e^T, \quad Y_{31} + Y_{32} + Y_3 = y_3 e^T, \quad y_1 + y_2 + y_3 = e. \qquad (35)$$
Given $Y_1$, $Y_2$, $y_1$, $y_2$ and $Y_{12}$ these equations uniquely determine $y_3$, $Y_{13}$, $Y_{23}$, and $Y_3$ and produce the $n$-dimensional part of the nullspace of $\bar{Z}$ given by the last $n$ columns of $\bar{T}$. We can therefore drop this linearly dependent part of $\bar{Z}$ without losing any information. Mathematically, this is achieved as follows. Let us introduce the $(2n + 1) \times (3n + 1)$ matrix
$$P := \begin{pmatrix} I_{2n} & 0_{2n \times n} & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$
It satisfies $W = P \bar{W}$ and gives us a handle to relate the relaxations in matrices of order $3n + 1$ to our models.
We recall the Slater feasible version of (7) from [18]:

[...] (36)

where $\bar{M}$ [...]

In [19] it is proven that the SDP relaxation (36) is equivalent to the QAP-based SDP relaxation with matrices of order $n^2 \times n^2$. Below we prove that (36) is equivalent to the SDP relaxation introduced here.
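The first equation in (35) can be checked on the rank-one matrices that come from an actual partition, which are a special case of feasible points. The sketch below uses $Y_{ij} = x_i x_j^T$ and $y_i = x_i$; the helper names and the instance are illustrative assumptions.

```python
def outer(u, v):
    return [[a * b for b in v] for a in u]

def add(*mats):
    """Entrywise sum of matrices of equal size."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*mats)]

def diag(M):
    return [M[i][i] for i in range(len(M))]

# Characteristic vectors of S1 = {0, 1}, S2 = {2, 3}, S3 = {4}.
x1, x2, x3 = [1, 1, 0, 0, 0], [0, 0, 1, 1, 0], [0, 0, 0, 0, 1]
e = [1] * 5

Y1, Y12, Y13 = outer(x1, x1), outer(x1, x2), outer(x1, x3)
lhs = add(Y1, Y12, Y13)   # Y_1 + Y_12 + Y_13
rhs = outer(x1, e)        # y_1 e^T
print(lhs == rhs, diag(Y12), diag(Y13), diag(Y1))
```

Taking diagonals of this equation shows how, once $\mathrm{diag}(Y_{12}) = 0$ holds, the conditions $\mathrm{diag}(Y_{13}) = 0$ and $\mathrm{diag}(Y_1) = y_1$ imply each other.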

Theorem 12
The SDP relaxation (36) is equivalent to $SDP_{1project}$.
Proof The feasible sets of the two problems are related by pre- and postmultiplication of the input matrices by $P$, for instance $P \bar{Z} P^T$ yields the $(2n + 1) \times (2n + 1)$ matrix variable of our relaxations. This operation basically removes the block of rows and columns corresponding to $x_3$. Both models contain the constraint $\mathrm{diag}(Y_{12}) = 0$. From (35) we conclude that
$$\mathrm{diag}(Y_1) + \mathrm{diag}(Y_{12}) + \mathrm{diag}(Y_{13}) = y_1.$$
Thus $\mathrm{diag}(Y_{13}) = 0$ in (36) is equivalent to $\mathrm{diag}(Y_1) = y_1$ in $SDP_{1project}$. Similarly, $\mathrm{diag}(Y_{23}) = 0$ is equivalent to $\mathrm{diag}(Y_2) = y_2$. The objective function is nonzero only on the part of $\bar{Z}$ corresponding to $Y_{12}$. Thus the two problems are equivalent.
The SDP relaxation (36) with additional nonnegativity constraints $\bar{W} R \bar{W}^T \ge 0$ is investigated by Pong et al. [14] on small graphs. The results show that the resulting bound outperforms other bounding approaches described in [14] for graphs with up to 41 vertices. It is instructive to look at the nonnegativity condition $\bar{Z}_{ij} \ge 0$, where $\bar{Z}$ has the block form (33), in connection with (35).

Theorem 13
The SDP relaxation (36) with additional nonnegativity constraints is equivalent to $SDP_{1project}$ with additional constraints $W U W^T \ge 0$ and (16)-(19).
Note that $SDP_{1project}$ with additional constraints $W U W^T \ge 0$ and (16)-(19) is actually $SDP_3$ with additional nonnegativity constraints.

Symmetry reduction
It is possible to reduce the number of variables in $SDP_2$ when the subsets $S_1$ and $S_2$ have the same cardinality. Therefore, let us suppose in this section that $m_1 = m_2$. We apply the general theory of symmetry reduction to $SDP_2$ (see e.g., [36,37]) and obtain a reduced SDP relaxation, referred to as (37). To obtain (37), one exploits the fact that for $m_1 = m_2$ the matrix variable has a special structure; for details see [37], Sect. 5.1. In particular, this structure follows from (20), page 264 in [37], and the fact that the basis elements $A_t$ ($t = 1, 2$) in our case are $I$ and $J - I$. Now, (37) follows by direct verification. If the graph under consideration is highly symmetric, the size of the above SDP can be further reduced by block diagonalizing the data matrices, see e.g., [36,37]. In order to break symmetry we may assume without loss of generality that a given vertex of the graph is not in the first partition set. This can be achieved by adding a constraint that assigns the value zero to the variable corresponding to that vertex in the first set. In general, we can perform $n$ such fixings and obtain $n$ different valid bounds. A similar approach is exploited in e.g., [19,37]. If the graph under consideration has a nontrivial automorphism group, then there might be fewer than $n$ different lower bounds. It is not difficult to show that each of the bounds obtained in this way dominates $SDP_2$. For the numerical results on the bounds after breaking symmetry see Sect. 8.

Feasible solutions
Up to now our focus was on finding lower bounds on $OPT_{MC}$. A byproduct of all our relaxations is the vector $y = \binom{y_1}{y_2} \in \mathbb{R}^{2n}$ such that $y_1, y_2 \ge 0$, $e^T y_1 = m_1$, $e^T y_2 = m_2$ and possibly $y_1 + y_2 \le e$. We now try to generate 0-1 solutions $x_1$ and $x_2$ with $x_1 + x_2 \le e$, $e^T x_1 = m_1$, $e^T x_2 = m_2$ such that $x_1^T A x_2$ is small. The hyperplane rounding idea can be applied in our setting. Feige and Langberg [38] propose random projections followed by randomized rounding ($RPR^2$) to obtain 0-1 vectors $x_1$ and $x_2$. In our case, we need to modify this approach to ensure that $x_1$ and $x_2$ represent partition blocks of the requested cardinalities. Another option is to take $y_1$ and $y_2$ and find the closest feasible 0-1 solution $x_1$ and $x_2$, which amounts to solving a simple transportation problem, see for instance [11].
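The "closest feasible 0-1 solution" step can be sketched as follows, assuming that closeness is measured in the Euclidean norm. For 0-1 vectors with fixed cardinalities, $\|x_1 - y_1\|^2 + \|x_2 - y_2\|^2$ differs from $-2(y_1^T x_1 + y_2^T x_2)$ only by a constant, so it suffices to maximize $y_1^T x_1 + y_2^T x_2$. Instead of a transportation solver, this sketch uses a small exact dynamic program over the vertices, which is a substitute technique chosen here for self-containedness; the function name `closest_partition` is hypothetical.

```python
def closest_partition(y1, y2, m1, m2):
    """Return 0-1 lists x1, x2 maximizing y1.x1 + y2.x2 subject to
    sum(x1) = m1, sum(x2) = m2 and x1 + x2 <= e (exact dynamic program)."""
    n = len(y1)
    if m1 + m2 > n:
        raise ValueError("need m1 + m2 <= n")
    NEG = float("-inf")
    # best[i][a][b]: best value using vertices 0..i-1 with a of them in S1, b in S2.
    best = [[[NEG] * (m2 + 1) for _ in range(m1 + 1)] for _ in range(n + 1)]
    # move[i][a][b]: block (1, 2 or 3) chosen for vertex i-1 on the best path.
    move = [[[0] * (m2 + 1) for _ in range(m1 + 1)] for _ in range(n + 1)]
    best[0][0][0] = 0.0
    for i in range(n):
        for a in range(m1 + 1):
            for b in range(m2 + 1):
                cur = best[i][a][b]
                if cur == NEG:
                    continue
                # Candidate placements of vertex i: (block, new a, new b, gain).
                options = [(3, a, b, 0.0)]
                if a < m1:
                    options.append((1, a + 1, b, y1[i]))
                if b < m2:
                    options.append((2, a, b + 1, y2[i]))
                for blk, na, nb, gain in options:
                    if cur + gain > best[i + 1][na][nb]:
                        best[i + 1][na][nb] = cur + gain
                        move[i + 1][na][nb] = blk
    # Backtrack from the state with exactly m1 vertices in S1 and m2 in S2.
    x1, x2 = [0] * n, [0] * n
    a, b = m1, m2
    for i in range(n, 0, -1):
        blk = move[i][a][b]
        if blk == 1:
            x1[i - 1] = 1
            a -= 1
        elif blk == 2:
            x2[i - 1] = 1
            b -= 1
    return x1, x2
```

The running time is $O(n\, m_1 m_2)$, which is adequate for small instances; for larger graphs a dedicated transportation or assignment solver would be the more natural choice.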
It is also common practice to improve a given feasible solution by local exchange operations. In our situation we have the following obvious options. Fixing the set $S_3$ given by $x_3 := e - x_1 - x_2$, we apply the Kernighan-Lin local improvement [39] to $S_1$ and $S_2$ in order to (possibly) reduce the number of edges between $S_1$ and $S_2$. After that we fix $S_1$ and try swapping single vertices between $S_2$ and $S_3$ to reduce our objective function.
It turns out that carrying out these local improvement steps by cyclically fixing $S_i$ until no more improvement is found leads to satisfactory feasible solutions. In fact, all the feasible solutions reported in the computational section were found by this simple heuristic.
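A minimal sketch of such a cyclic exchange scheme is given below. It is a simplified best-improvement pair swap, not the full Kernighan-Lin procedure of [39], and all names (`cut_value`, `improve`, `local_search`) as well as the adjacency-dictionary input format are assumptions made for illustration. Each accepted swap strictly decreases the number of edges between $S_1$ and $S_2$ and keeps all block sizes unchanged, so the loop terminates.

```python
def cut_value(adj, S1, S2):
    """Number of edges between S1 and S2; adj maps each vertex to its neighbour set."""
    return sum(1 for u in S1 for v in adj[u] if v in S2)


def improve(adj, blocks, i, j):
    """Repeat the best improving swap of one vertex of blocks[i] with one of blocks[j].

    The objective is always the cut between blocks[0] and blocks[1].
    Returns True if at least one swap was accepted.
    """
    improved = False
    while True:
        best_val = cut_value(adj, blocks[0], blocks[1])
        best_trial = None
        for a in sorted(blocks[i]):
            for b in sorted(blocks[j]):
                trial = [set(s) for s in blocks]
                trial[i].remove(a)
                trial[i].add(b)
                trial[j].remove(b)
                trial[j].add(a)
                val = cut_value(adj, trial[0], trial[1])
                if val < best_val:
                    best_val, best_trial = val, trial
        if best_trial is None:
            return improved
        blocks[:] = best_trial
        improved = True


def local_search(adj, S1, S2, S3):
    """Cycle through the three block pairs (fixing S3, S1, S2 in turn) until no swap helps."""
    blocks = [set(S1), set(S2), set(S3)]
    improved = True
    while improved:
        improved = False
        for i, j in ((0, 1), (1, 2), (0, 2)):
            if improve(adj, blocks, i, j):
                improved = True
    return blocks
```

Recomputing the cut from scratch for every trial swap keeps the sketch short; an efficient implementation would update gains incrementally.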

Computational results
In this section we compare several SDP bounds on graphs from the literature. All bounds were computed on an Intel Xeon, E5-1620, 3.70 GHz with 32 GB memory. All relaxations were solved with SDPT3 [40].
We select the partition vector $m$ such that $|m_1 - m_2| \le 1$. For a given graph with $n$ vertices, the optimal value of the min-cut decreases monotonically as $m_3$ increases. We select $m_3$ small enough so that $OPT_{MC} > 0$ and we can also provide nontrivial (i.e., positive) lower bounds on $OPT_{MC}$. Thus for given $m$ we provide lower bounds (based on our relaxations) and also upper bounds (using the rounding heuristic from the previous section) on $OPT_{MC}$. In case of a positive lower bound we also get a lower bound (of $m_3 + 1$) on the size of a strongly balanced ($|m_1 - m_2| \le 1$) vertex separator. Finally, we also use our rounding heuristic and vary $m_3$ to actually find vertex separators, yielding also upper bounds for their cardinality. The results are summarized in Table 3. The matrices can-xx and bcspwr03 are from the library Matrix Market [41], grid3dt matrices are 3D cubical meshes, gridt matrices are 2D triangular meshes, and Smallmesh is a 2D finite-element mesh. Lower bounds for the min-cut problem presented in the table are obtained by approximately solving $SDP_4$, i.e., by iteratively adding the most violated inequalities of type (16)-(19) and (20) to $SDP_2$. In particular, we perform at most 25 iterations, and each iteration adds at most $2n$ of the most violated valid constraints. It takes 59 minutes to compute the bound for grid3dt(5) and 170 minutes to compute the bound for grid3dt(6).
One can download our test instances from the following link: https://sites.google.com/site/sotirovr/the-vertex-separator.
All the instances in Table 3 have a lower bound of 0 for $SDP_2$ and $SDP_3$ while $SDP_4 > 0$. This is a clear indication of the superiority of $SDP_4$. Table 4 provides a further comparison of our bounds. In particular, we list $SDP_1$, $SDP_2$, $SDP_3$, $SDP_4$, $SDP_{fix}$, and upper bounds for several graphs. The $SDP_{fix}$ bound is obtained after breaking symmetry as described in Sect. 6. We choose $m$ such that $SDP_2 > 0$ and $m_1 = m_2$. Thus, we evaluate all bounds obtained by fixing a single vertex and report the best among them. All bounds in Table 4 are rounded up to the closest integer. The results further verify the quality of $SDP_4$, and also show that breaking symmetry improves $SDP_2$ but not significantly.
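The iteration limits described above (at most 25 rounds, at most $2n$ cuts per round) can be sketched as a generic cutting-plane loop. The callables `solve` and `violations` are hypothetical placeholders: `solve(cuts)` is assumed to solve the current relaxation with the given cuts and return a pair (bound, solution), and `violations(solution)` is assumed to return pairs (violation amount, cut) for the candidate inequalities. No particular SDP solver interface is implied.

```python
import heapq


def cutting_plane(solve, violations, n, max_rounds=25, tol=1e-6):
    """Generic cutting-plane loop: add up to 2n most violated cuts per round.

    solve(cuts) -> (bound, solution) and violations(solution) -> iterable of
    (amount, cut) are user-supplied, hypothetical callables.
    """
    cuts = []
    bound, sol = solve(cuts)
    for _ in range(max_rounds):
        violated = [(amt, cut) for amt, cut in violations(sol) if amt > tol]
        if not violated:
            break  # the current solution satisfies all candidate inequalities
        worst = heapq.nlargest(2 * n, violated, key=lambda t: t[0])
        cuts.extend(cut for _, cut in worst)
        bound, sol = solve(cuts)
    return bound, cuts
```

Since cuts are only ever added, the bound returned by a minimization relaxation cannot decrease from one round to the next.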

Conclusion
In this paper we derive several SDP relaxations for the min-cut problem and compare them with relaxations from the literature. Our SDP relaxations have matrix variables of order 2n, while other SDP relaxations have matrix variables of order 3n.
We prove that the eigenvalue bound from [12] equals the optimal value of the SDP relaxation from Theorem 5, with matrix variable of order $2n$. In [16] it is proven that the same eigenvalue bound is equal to the optimal value of an SDP relaxation with matrix variable of order $3n$. Further, we prove that the SDP relaxation $SDP_1$ is equivalent to the SDP relaxation (36) from [18], see Theorem 12. We also prove that the SDP relaxation obtained after adding all remaining nonnegativity constraints to $SDP_3$ is equivalent to the strongest SDP relaxation from [18], see Theorem 13. Thus, we have shown that for the min-cut problem one should consider SDP relaxations with matrix variables of order $2n+1$ instead of the traditionally considered SDP relaxations with matrices of order $3n+1$. Consequently, our strongest SDP relaxation $SDP_4$ also has a matrix variable of order $2n+1$ and $O(n^3)$ constraints. The $SDP_4$ relaxation can be solved approximately by the cutting plane scheme for graphs of medium size. The numerical results verify the superiority of $SDP_4$. We further exploit the resulting strong SDP bounds for the min-cut to obtain strong bounds on the size of the vertex separators.
Finally, our general framework for convexifying non-convex quadratic problems (see Sect. 4) results in convex quadratic programming bounds that are cheap to compute. Since convex quadratic programming bounds have played a crucial role in the past in solving several non-convex problems to optimality, we plan to exploit their potential in our future research.

Proof of Theorem 5
We prove this theorem by providing feasible solutions to the primal and the dual SDP which have the same objective function value.
For the primal solution we take the optimizer $X = (x_1 \; x_2 \; x_3)$ from (5) with objective value $OPT_{HW}$ and define $Y = \binom{x_1}{x_2} \binom{x_1}{x_2}^T$. Feasibility of $X$ with respect to (4) shows that $Y$ is feasible for the SDP (for instance $\mathrm{tr}\, Y_1 = x_1^T x_1 = m_1$ and $\mathrm{tr}\, J Y_1 = (e^T x_1)^2 = m_1^2$). Next we construct a dual solution with objective value $OPT_{HW}$. We recall that $Le = 0$, hence we can select an eigenvalue decomposition of $L$ as $L = P \Lambda P^T$ with $\Lambda := \mathrm{Diag}(\lambda)$, where $P = (\tfrac{1}{\sqrt{n}} e \;\; V)$ with $V^T V = I_{n-1}$, $V^T e = 0$, and $\lambda = (0, \lambda_2, \ldots, \lambda_n)^T$ contains the eigenvalues $\lambda_i$ of $L$ in nondecreasing order. The matrix $P$ also diagonalizes $J$: $J = P\, \mathrm{Diag}((n, 0, \ldots, 0))\, P^T$. We use this to rewrite $E_{12} \otimes L$ as $E_{12} \otimes L = (I_2 E_{12} I_2) \otimes (P \Lambda P^T) = (I_2 \otimes P)(E_{12} \otimes \Lambda)(I_2 \otimes P)^T$.
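Both identities used in this step can be verified directly. The short derivation below is a sketch that assumes the notation $\Lambda := \mathrm{Diag}(\lambda)$ and relies only on the first column of $P$ being $\tfrac{1}{\sqrt{n}} e$ and on the mixed-product rule $(A \otimes B)(C \otimes D) = (AC) \otimes (BD)$ for Kronecker products.

```latex
\begin{align*}
P\,\mathrm{Diag}((n,0,\ldots,0))\,P^T
  &= n \left(\tfrac{1}{\sqrt{n}}\, e\right)\left(\tfrac{1}{\sqrt{n}}\, e\right)^T
   = e e^T = J, \\
(I_2 \otimes P)(E_{12} \otimes \Lambda)(I_2 \otimes P)^T
  &= (I_2 \otimes P)(E_{12} \otimes \Lambda)(I_2 \otimes P^T) \\
  &= (I_2 E_{12} I_2) \otimes (P \Lambda P^T)
   = E_{12} \otimes L .
\end{align*}
```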
Finally, a suitable setting of the remaining parameters ensures that the matrix in (38) is 0 and hence (38) also holds. We now have a dual feasible solution, and we conclude the proof by selecting $t < 0$ in such a way that the objective function has value $OPT_{HW}$.
Let $D := m_1 m_2 (n - m_1)(n - m_2)$. We recall the closed form of $OPT_{HW}$ and evaluate the dual objective at the solution defined above. Comparing the coefficients of $\lambda_n - \lambda_2$ and $\lambda_n + \lambda_2$ we note that the values agree if