Abstract
A Clique Partitioning Problem (CPP) finds an optimal partition of a given edge-weighted undirected graph, such that the sum of the edge weights within clusters is maximized. This general graph problem has a wide range of real-world applications, including correlation clustering, group technology, community detection, and coalition structure generation. Although a CPP is NP-hard, thanks to recent advances in Integer Linear Programming (ILP) solvers, we can solve reasonably large problem instances by formulating a CPP as an ILP instance. The first ILP formulation was introduced by Grötschel and Wakabayashi (Mathematical Programming, 45(1–3), 59–96, 1989). Recently, Miyauchi et al. (2018) proposed a more concise ILP formulation that can significantly reduce transitivity constraints as compared to previously introduced models. In this paper, we introduce a series of concise ILP formulations that can reduce even more transitivity constraints. We theoretically evaluate the amount of reduction based on a simple model in which edge signs (positive/negative) are chosen independently. We show that the reduction can be up to 50% (depending on the ratio of negative edges) and experimentally evaluate the amount of reduction and the performance of our proposed formulation using a variety of graph data sets. Experimental evaluations show that the reduction can exceed 50% (where edge signs can be correlated), and our formulation outperforms the existing state-of-the-art formulations both in terms of memory usage and computational time for most problem instances.
1 Introduction
A Clique Partitioning Problem (CPP) finds an optimal partition of a given edge-weighted undirected graph that maximizes the sum of the weights within clusters [9]. An edge weight can be positive or negative. As the popularity of analyzing complex networks has increased (e.g., the Internet and social networks), CPPs can be found in a variety of important AI applications:
Correlation clustering [1]: For given objects (e.g., documents) with similarity scores, we are interested in finding clusters of similar objects (e.g., clusters of documents with the same topic). This problem can be formalized as a CPP where an object/document is a vertex and similarity scores among objects are edge weights.
Group technology [16, 19]: Assume we are developing a manufacturing system that requires several parts and machines for processing. The goal is to find a suitable partition of the parts and machines for efficient production. This problem can be formalized as a CPP where each part/machine is a vertex. If part i is processed by machine j, edge (i, j) has weight 1 (otherwise, its weight is − 1). An edge between parts (or machines) has weight 0.
Community detection [2, 4, 8]: For a given network, we are interested in partitioning the vertices into communities. A community is a subset of vertices densely/positively connected within the same community and sparsely/negatively connected with other communities. Community detection in a network has a long research history (e.g., see the survey by Fortunato [8]). Community detection in bipartite networks, which can be formalized as a special case of a CPP called a modularity maximization problem, is proposed by Brandes et al. [4] and has gathered a significant amount of interest [2, 8].
Coalition structure generation [17]: A pair of agents can have positive/negative synergy, e.g., friends who collaborate well or rivals who do not. The goal is to partition agents into groups (coalitions) to maximize the sum of their synergies. This problem can be formalized as a CPP where an agent is a vertex and the synergies among the agents are edge weights.
Although a CPP is NP-hard, thanks to recent advances in Integer Linear Programming (ILP) solvers (e.g., CPLEX and Gurobi), we can solve reasonably large problem instances by formulating a CPP as an ILP instance. Grötschel and Wakabayashi [9] proposed the first ILP formulation of a CPP, where the goal is to select a subset of edges that maximizes the sum of their weights while satisfying transitivity constraints. The number of transitivity constraints is Θ(n^{3}), where n is the number of vertices. Thus, the growth of the number of transitivity constraints creates a bottleneck for solving larger CPP instances.
To address this scalability issue, several researchers have proposed more concise ILP formulations [3, 5, 6, 11, 14, 15]. Dinh and Thai [7] proposed a concise ILP formulation for the modularity maximization problem. Inspired by this work, Miyauchi and Sukegawa [12] proposed an ILP formulation for a general CPP by identifying redundant transitivity constraints by focusing on the signs of edge weights.
In this paper, we propose further reductions of the transitivity constraints that utilize the characterization of transitivity based on reachability among vertices (see the footnote in Notes). Our method can be used in conjunction with the state-of-the-art method proposed by Miyauchi et al. [13], whose formulation removed significantly more constraints, including non-redundant ones. As a result, the obtained ILP solution can be suboptimal. Thus, they proposed a method for obtaining an optimal solution by postprocessing.
We theoretically evaluate the amount of reduction based on a model in which edge signs are chosen independently and show that the ratio of the number of constraints between our proposed and the existing formulations approaches 50% when the ratio of the nonnegative edges approaches 0. Furthermore, we experimentally evaluate the amount of reduction and the performance of our proposed formulation using well-known, real-world graph datasets related to CPP applications. We show that the reduction can exceed 50% when the edge signs can be correlated, and our formulations outperform the existing state-of-the-art formulations in terms of both memory usage and computational time for most problem instances.
2 Model
A Clique Partitioning Problem (CPP) instance is a complete undirected edge-weighted graph G = (V, E, w) consisting of vertices V = {1,2,…,n}, |E| = n(n − 1)/2 edges, and weight function \(w: E\rightarrow \mathbb {R}\). For simplicity, we denote w_{i, j} = w({i, j}) for each {i, j}∈ E. Clearly, w_{i, j} = w_{j, i}. A set A ⊆ E is called a clique partitioning if a partition \(\{V_{1}, V_{2}, \dots , V_{p}\}\) of V exists such that \(A = \bigcup _{\ell = 1}^{p}\{\{i,j\}\in E \mid i,j\in V_{\ell }\}\). We call each V_{ℓ} a cluster (\(\ell =1,\dots ,p\)). The goal of a CPP is to find a clique partitioning A that maximizes the sum of weights \({\sum }_{\{i,j\}\in A} w_{i,j}\).
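As a minimal sketch of this objective, the following Python snippet sums the weights of all pairs that share a cluster. The function name `clique_partition_value` and the dictionary representation of w are assumptions made for illustration; the toy weights are those of Example 2 in Section 4.3.

```python
from itertools import combinations

def clique_partition_value(weights, clusters):
    """Sum w_{i,j} over all pairs {i, j} that lie in the same cluster."""
    total = 0.0
    for cluster in clusters:
        for i, j in combinations(sorted(cluster), 2):
            total += weights[(i, j)]
    return total

# Keys are pairs (i, j) with i < j; the instance is a complete graph on 4 vertices.
weights = {(1, 2): 1, (1, 3): -1, (1, 4): 0, (2, 3): -1, (2, 4): 0, (3, 4): 0}

print(clique_partition_value(weights, [{1, 2}, {3}, {4}]))  # 1.0
print(clique_partition_value(weights, [{1, 2, 3, 4}]))      # -1.0
```

Putting all vertices into one cluster collects both negative edges, which is why the finer partition scores higher here.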
3 Existing concise ILP formulations
Let us introduce the ILP formulation based on Grötschel and Wakabayashi [9]. Decision variable x_{i, j} is introduced for each edge {i, j}∈ E where i < j. x_{i, j} equals 1 if i and j are in the same cluster, and 0 otherwise.
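Transitivity constraints are introduced for any i, j, k ∈ V. If i and j are in the same cluster and j and k are in the same cluster, then i and k must also be in the same cluster. This ILP formulation, called P(G), is defined:
\[\max \sum_{\{i,j\}\in E} w_{i,j}\,x_{i,j}\]
s.t.
\[\begin{array}{ll} x_{i,j} + x_{j,k} - x_{i,k} \leq 1 & \forall (i,j,k)\in T,\\ x_{i,j} - x_{j,k} + x_{i,k} \leq 1 & \forall (i,j,k)\in T, \qquad (1)\\ -x_{i,j} + x_{j,k} + x_{i,k} \leq 1 & \forall (i,j,k)\in T,\\ x_{i,j}\in \{0,1\} & \forall \{i,j\}\in E, \end{array}\]
where
\[T = \{(i,j,k)\mid 1\leq i< j< k\leq n\}.\]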
This formulation has \(\left (\begin {array}{c} n \\ 2 \end {array} \right ) = {\varTheta }(n^{2})\) variables and \(3\left (\begin {array}{c} n \\ 3 \end {array} \right ) = {\varTheta }(n^{3})\) transitivity constraints.
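The count of transitivity constraints, and the fact that they characterize partitions, can be checked by brute force on a tiny instance. The sketch below is only an illustration: it indexes each constraint by a triple (i, j, k) with i < k and j different from both, which is the merged form x_{{i, j}} + x_{{j, k}} − x_{i, k} ≤ 1 used later in Section 4.2, and the helper names are hypothetical.

```python
from itertools import combinations, product

def transitivity_triples(n):
    """All (i, j, k) with i < k and j not in {i, k}."""
    return [(i, j, k) for i, k in combinations(range(1, n + 1), 2)
            for j in range(1, n + 1) if j != i and j != k]

def is_feasible(x, triples):
    """x maps frozenset({i, j}) to 0 or 1; check x_ij + x_jk - x_ik <= 1."""
    return all(x[frozenset((i, j))] + x[frozenset((j, k))] - x[frozenset((i, k))] <= 1
               for i, j, k in triples)

n = 4
edges = [frozenset(e) for e in combinations(range(1, n + 1), 2)]
triples = transitivity_triples(n)
feasible = [bits for bits in product((0, 1), repeat=len(edges))
            if is_feasible(dict(zip(edges, bits)), triples)]

print(len(triples))   # 12, i.e., 3 * C(4, 3)
print(len(feasible))  # 15, the number of partitions of a 4-element set
```

Exhaustive enumeration is only viable for very small n; it is used here purely to illustrate what the constraints encode.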
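Miyauchi and Sukegawa [12] proposed a more concise ILP formulation called RP(G), which reduces the number of transitivity constraints. More specifically, we replace (1) in the above formulation with:
\[\begin{array}{ll} x_{i,j} + x_{j,k} - x_{i,k} \leq 1 & \forall (i,j,k)\in T_{\geq 0}^{1},\\ x_{i,j} - x_{j,k} + x_{i,k} \leq 1 & \forall (i,j,k)\in T_{\geq 0}^{2},\\ -x_{i,j} + x_{j,k} + x_{i,k} \leq 1 & \forall (i,j,k)\in T_{\geq 0}^{3}, \end{array}\]
where
\[\begin{array}{l} T_{\geq 0}^{1} = \{(i,j,k)\mid 1\leq i<j<k\leq n,\ w_{i,j}\geq 0 \vee w_{j,k}\geq 0\},\\ T_{\geq 0}^{2} = \{(i,j,k)\mid 1\leq i<j<k\leq n,\ w_{i,j}\geq 0 \vee w_{i,k}\geq 0\},\\ T_{\geq 0}^{3} = \{(i,j,k)\mid 1\leq i<j<k\leq n,\ w_{j,k}\geq 0 \vee w_{i,k}\geq 0\}. \end{array}\]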
A transitivity constraint, such that w_{i, j} < 0 ∧ w_{j, k} < 0 holds, is redundant and safely removed. RP(G) has O(nm_{≥ 0}) constraints, where m_{≥ 0} indicates the number of nonnegative weighted edges in G.
4 New concise ILP formulations
4.1 Overview of our main idea and contributions
We introduce several terms to explain our main idea. Let \(\boldsymbol {x}^{\ast } = (x_{i,j}^{\ast })_{1\leq i< j\leq n}\) denote an arbitrary optimal solution of an ILP formulation, E_{≥ 0} denote {{i, j}∈ E∣w_{i, j} ≥ 0}, and E^{∗} denote \(\{\{i,j\}\in E \mid x_{{{i,j}}}^{\ast } = 1\}\). Furthermore, let \(\{(V_{1},E_{1}^{\ast }),(V_{2},E_{2}^{\ast }),\dots ,(V_{p},E_{p}^{\ast })\}\) denote a set of connected components of (V, E^{∗}). For each ℓ ∈{1,…,p}, for a pair of vertices i, j ∈ V_{ℓ} (where i < j), they are directly reachable if \(\{i, j\} \in E_{\ell }^{\ast }\), and nonnegatively, directly reachable if \(\{i, j\} \in E_{\ell }^{\ast } \cap E_{\geq 0}\). They also are reachable if a path exists from i to j that consists of edges in \(E_{\ell }^{\ast }\), and nonnegatively reachable if a path exists from i to j that consists of edges in \(E_{\ell }^{\ast }\cap E_{\geq 0}\).
Example 1
Consider one connected component \((V_{\ell },E^{\ast }_{\ell })\) where V_{ℓ} = {1,2,3,4,5} and \(E^{\ast }_{\ell } = \{\{1,2\},\{2,3\},\{2,4\},\{3,4\},\{4,5\}\}\). Weight function w on \(E^{\ast }_{\ell }\) is as follows: w_{1,2} = − 1, w_{2,3} = 0, w_{2,4} = 1, w_{3,4} = 0 and w_{4,5} = 2. Figure 1 illustrates the component. Here 1 and 2 are directly reachable but not nonnegatively, directly reachable, and 3 and 4 are nonnegatively, directly reachable. 3 and 5 are nonnegatively reachable, and 1 and 5 are not nonnegatively reachable.
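The reachability claims of Example 1 can be verified mechanically. The following sketch runs a breadth-first search over the edges of the component and over its nonnegative edges only; the helper `reachable` is a hypothetical name introduced for this illustration.

```python
from collections import deque

def reachable(edges, source, target):
    """Breadth-first search over an undirected edge list."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    seen, queue = {source}, deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            return True
        for v in adjacency.get(u, ()):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return False

# Edges of the component in Example 1 with their weights.
w = {(1, 2): -1, (2, 3): 0, (2, 4): 1, (3, 4): 0, (4, 5): 2}
all_edges = list(w)
nonneg_edges = [e for e in w if w[e] >= 0]

print(reachable(all_edges, 1, 5))     # True: 1 and 5 are reachable
print(reachable(nonneg_edges, 3, 5))  # True: 3 and 5 are nonnegatively reachable
print(reachable(nonneg_edges, 1, 5))  # False: the only edge at vertex 1 is negative
```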
The goal of transitivity constraints is to ensure that the following property holds.
If i and j are reachable, they are directly reachable, i.e., x_{i, j} = 1 must hold.
As shown by Lemma 1, if i and j are reachable, then they are nonnegatively reachable. The intuition of this lemma is simple. If i, j ∈ V_{ℓ} are not nonnegatively connected, by removing only negative weight edges, we can divide V_{ℓ} into two parts, contradicting the fact that x^{∗} is an optimal solution. For example, in Fig. 1, since vertices 1 and 5 are not nonnegatively reachable, the component is not optimal; we can divide V_{ℓ} into two parts, i.e., {1} and {2,3,4,5}, by removing only negative weight edge {1,2}.
Thus, the above goal can be modified to guarantee the following property, which corresponds to Lemma 2:
If i and j are nonnegatively reachable, they are directly reachable.
Transitivity constraints must be defined to ensure this property. With these terms, the transitivity constraints in [12] can be translated:
For any i < k and for any j s.t. j≠i, j≠k, if at least one of the following conditions holds, then i and k must be directly reachable (i.e., x_{i, k} = 1 must hold):
(i) i and j are directly reachable, and j and k are nonnegatively, directly reachable, or
(ii) i and j are nonnegatively, directly reachable, and j and k are directly reachable.
We argue that having only (i) instead of (i) and (ii) together is sufficient, i.e., (ii) is redundant if we have (i) to derive Lemma 2. Similarly, if we have (ii), (i) is redundant. This fact is verified easily by mathematical induction on the length of the nonnegative path from i to k. Assume the length of the path is q, and divide it into two parts: a nonnegative path from i to j with length q − 1, and a nonnegative edge between j and k (for (i)); or a nonnegative edge between i and j, and a nonnegative path from j to k with length q − 1 (for (ii)).
In the rest of this section, we show a series of ILP formulations based on this idea, called CP(G), CP^{∗}(G), ACP(G), and ACP^{∗}(G). Intuitively, we can remove about half of the transitivity constraints. However, since (i) and (ii) can overlap, this intuition is not exact. In the rest of this section, we scrutinize the amount of reduction.
4.2 CP(G)
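We first present our new concise ILP formulation called CP(G), which is based on RP(G). It is identical to RP(G) (or P(G)) except that we replace (1) with:
\[\begin{array}{ll} x_{i,j} + x_{j,k} - x_{i,k} \leq 1 & \forall (i,j,k)\in C_{\geq 0}^{1},\\ x_{i,j} - x_{j,k} + x_{i,k} \leq 1 & \forall (i,j,k)\in C_{\geq 0}^{2},\\ -x_{i,j} + x_{j,k} + x_{i,k} \leq 1 & \forall (i,j,k)\in C_{\geq 0}^{3}, \end{array}\]
where
\[\begin{array}{l} C_{\geq 0}^{1} = \{(i,j,k)\mid 1\leq i<j<k\leq n,\ w_{j,k}\geq 0\},\\ C_{\geq 0}^{2} = \{(i,j,k)\mid 1\leq i<j<k\leq n,\ w_{i,k}\geq 0\},\\ C_{\geq 0}^{3} = \{(i,j,k)\mid 1\leq i<j<k\leq n,\ w_{j,k}\geq 0\}. \end{array}\]
In each case, the condition requires a nonnegative weight on the edge between the middle vertex of the constraint and the larger of its two end vertices.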
Here the requirement of (i) is given as \(C_{\geq 0}^{1}\cup C_{\geq 0}^{2} \cup C_{\geq 0}^{3}\).
Let \(\boldsymbol {x}^{\ast } = (x_{i,j}^{\ast })_{1\leq i< j\leq n}\) be an arbitrary optimal solution of CP(G). We first present the following lemma where \(\{(V_{1},E_{1}^{\ast }),(V_{2},E_{2}^{\ast }),\dots ,(V_{p},E_{p}^{\ast })\}\) denotes the connected components of (V, E^{∗}) and \(E^{\ast } = \{\{i,j\}\in E \mid x_{{i,j}}^{\ast } = 1\}\).
Lemma 1
\((V_{\ell }, E_{\ell }^{\ast }\cap E_{\geq 0})\) is connected for each ℓ ∈{1,…,p}.
Proof
We prove this lemma in the same manner as the proof of Lemma 1 in [12]. It suffices to show that for any partition {S_{i}, S_{j}} of V_{ℓ}, where i ∈ S_{i} and j ∈ S_{j}, there exists a nonnegative edge in \(E^{\ast }_{\ell }\cap E_{\geq 0}\) with one endpoint in S_{i} and the other in S_{j}.
From the definition of V_{ℓ}, at least one edge exists in \(E^{\ast }_{\ell }\) between S_{i} and S_{j}. Suppose that none of these connecting edges belong to E_{≥ 0}, i.e., their weights are strictly negative. We focus on a CP solution obtained by changing the values of the variables that correspond to these edges from 1 to 0 on x^{∗}. This operation amounts to partitioning V_{ℓ} into S_{i} and S_{j}. This solution is feasible for CP(G): every transitivity constraint that includes a removed edge involves a triple with vertices on both sides of the partition, so at least two of its three variables equal 0 and the constraint is satisfied. Moreover, since every removed edge has a strictly negative weight, the objective value of this solution is strictly greater than that of x^{∗}. This contradicts the optimality of x^{∗}. □
Lemma 1 argues that if i and j in V_{ℓ} are reachable, then they are nonnegatively reachable. Next we prove that if i and j are nonnegatively reachable, then they are directly reachable. In the following, for notation simplicity, let x_{{i, j}} denote x_{i, j} when i < j, and otherwise x_{j, i}. With this notation, the three inequalities in (1) are merged into one inequality x_{{i, j}} + x_{{j, k}}− x_{i, k} ≤ 1, where i < k.
Lemma 2
\((V_{\ell }, E_{\ell }^{\ast })\) is complete for each ℓ ∈{1,…,p}.
Proof
Let i and j be two vertices in V_{ℓ} (i < j). According to Lemma 1, there exists a path i = u_{0}, u_{1}, \(\dots ,u_{q} = j\) on \(E_{\ell }^{\ast }\cap E_{\geq 0}\). We prove \(x_{i,j}^{\ast } = 1\) by induction on the path’s length q. (Base case): When q = 1, \(x_{i,j}^{\ast } = x_{u_{0},u_{1}}^{\ast } = 1\) is clearly true. (Inductive step): Assume as the induction hypothesis that the claim holds for every path of length q − 1. We must show that \(x_{i,j}^{\ast } = 1\), that is, \(x^{\ast }_{u_{0},u_{q}} = 1\). According to the induction hypothesis, \(x_{\{u_{0},u_{q-1}\}}^{\ast } = 1\). Since u_{0} = i < j = u_{q} and the edge between u_{q− 1} and u_{q} is in E_{≥ 0}, constraint
\[x_{\{u_{0},u_{q-1}\}} + x_{\{u_{q-1},u_{q}\}} - x_{u_{0},u_{q}} \leq 1\]
is contained in CP(G). Thus, using \(x_{\{u_{0},u_{q-1}\}}^{\ast } = 1\) and \(x_{\{u_{q-1},u_{q}\}}^{\ast } = 1\), we have \(x_{u_{0},u_{q}}^{\ast } = 1\). □
Now we are ready to present our main theorem.
Theorem 1
CP(G) and P(G) have the same set of optimal solutions.
Proof
It is sufficient to show that any optimal solution x^{∗} of CP(G) is feasible for P(G). This is equivalent to the statement that \((V_{\ell },E_{\ell }^{\ast })\) is complete for each \(\ell \in \{1,2,\dots ,p\}\). This is what Lemma 2 argues. □
It is straightforward to show that Theorem 1 holds if we choose (ii) instead of (i) for transitivity constraints.
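Theorem 1 can be sanity-checked by exhaustive enumeration on a small instance. The sketch below is an illustration under stated assumptions, not the experimental code: constraints are written in the merged form x_{{i, j}} + x_{{j, k}} − x_{i, k} ≤ 1 with i < k, CP(G) is taken to keep exactly those triples whose edge {j, k} has nonnegative weight (condition (i)), and the weights are hypothetical.

```python
from itertools import combinations, product

def triples(n, keep):
    """Triples (i, j, k) with i < k and j not in {i, k} that pass the filter."""
    return [(i, j, k) for i, k in combinations(range(1, n + 1), 2)
            for j in range(1, n + 1) if j not in (i, k) and keep(i, j, k)]

def optimal_solutions(edges, w, kept):
    """Enumerate all 0-1 vectors; return the best value and the set of maximizers."""
    best, maximizers = None, set()
    for bits in product((0, 1), repeat=len(edges)):
        x = dict(zip(edges, bits))
        if any(x[frozenset((i, j))] + x[frozenset((j, k))] - x[frozenset((i, k))] > 1
               for i, j, k in kept):
            continue
        value = sum(w[e] * x[e] for e in edges)
        if best is None or value > best:
            best, maximizers = value, {bits}
        elif value == best:
            maximizers.add(bits)
    return best, maximizers

n = 5
edges = [frozenset(e) for e in combinations(range(1, n + 1), 2)]
# Hypothetical integer weights, listed in the order (1,2), (1,3), ..., (4,5).
w = dict(zip(edges, [2, -1, -3, 0, 1, -1, -2, 2, -1, 3]))

p_triples = triples(n, lambda i, j, k: True)
cp_triples = triples(n, lambda i, j, k: w[frozenset((j, k))] >= 0)

print(len(p_triples), len(cp_triples))  # 30 15
print(optimal_solutions(edges, w, p_triples) == optimal_solutions(edges, w, cp_triples))
```

On this instance half of the constraints are dropped, and the comparison checks that the optimal value and the set of optimal 0-1 vectors coincide, as the theorem states.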
According to Theorem 1 [12], RP(G) and P(G) have the same set of optimal solutions. Thus, we obtain the following corollary.
Corollary 1
CP(G) and RP(G) have the same set of optimal solutions.
We examine the numbers of constraints in RP(G) and CP(G). Let a denote the probability that an edge has a nonnegative weight and formally define it as follows:
\[a = \frac{|E_{\geq 0}|}{|E|}.\]
Let |RP(G)| and |CP(G)| denote \(|T^{1}_{\geq 0}| + |T^{2}_{\geq 0}| + |T^{3}_{\geq 0}|\) and \(|C^{1}_{\geq 0}| + |C^{2}_{\geq 0}| + |C^{3}_{\geq 0}|\), i.e., the numbers of transitivity constraints in RP(G) and CP(G), respectively.
Theorem 2
When the weight of each edge is nonnegative with probability a, independently for each edge, the ratio of the number of constraints in CP(G) to that in RP(G) can be estimated:
\[\frac{|CP(G)|}{|RP(G)|} \approx \frac{1}{2-a}.\]
Proof
We first estimate the number of transitivity constraints in RP(G). The size of T = {(i, j, k)∣1 ≤ i < j < k ≤ n} equals \(\left (\begin {array}{l} n \\ 3 \end {array} \right )\). Tuple (i, j, k) is included in \(T_{\geq 0}^{1}\) except when both w_{i, j} and w_{j, k} are negative. Thus, this probability is given as 1 − (1 − a)^{2} = 2a − a^{2}. The same argument is applicable to \(T_{\geq 0}^{2}\) and \(T_{\geq 0}^{3}\). Then |RP(G)| can be estimated as \(3(2a-a^{2})\left (\begin {array}{l} n \\ 3 \end {array} \right )\).
Then we estimate the number of transitivity constraints in CP(G). Tuple (i, j, k) is included in \(C^{1}_{\geq 0}\) only when w_{j, k} ≥ 0, which occurs with probability a. The same argument is applicable to \(C^{2}_{\geq 0}\) and \(C^{3}_{\geq 0}\). Thus, |CP(G)| is estimated as \(3a\left (\begin {array}{l} n \\ 3\end {array} \right )\).
Therefore, we obtain
\[\frac{|CP(G)|}{|RP(G)|} \approx \frac{3a}{3(2a-a^{2})} = \frac{1}{2-a}.\]
Since 0 ≤ a ≤ 1,
\[\frac{1}{2} \leq \frac{1}{2-a} \leq 1.\]
Furthermore, this ratio approaches 1/2 when a approaches 0, and it approaches 1 when a approaches 1. □
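The estimate can be compared against a random instance drawn from the independent-sign model. The following sketch is an illustration only: it counts constraints in the merged form (triples (i, j, k) with i < k), keeps a triple for the RP-style count when condition (i) or (ii) holds and for the CP-style count when condition (i) holds, and uses hypothetical values n = 40 and a = 0.3 with a fixed seed.

```python
import random
from itertools import combinations

def count_constraints(n, nonneg):
    """Count RP-style and CP-style constraints; nonneg is the set of nonnegative edges."""
    rp = cp = 0
    for i, k in combinations(range(n), 2):
        for j in range(n):
            if j == i or j == k:
                continue
            ij = frozenset((i, j)) in nonneg
            jk = frozenset((j, k)) in nonneg
            rp += ij or jk   # condition (i) or (ii)
            cp += jk         # condition (i) only
    return rp, cp

random.seed(0)
n, a = 40, 0.3
# Each edge is nonnegative independently with probability a.
nonneg = {frozenset(e) for e in combinations(range(n), 2) if random.random() < a}
rp, cp = count_constraints(n, nonneg)

print(cp / rp)       # empirical ratio on this instance
print(1 / (2 - a))   # estimate from Theorem 2
```

The two printed values should be close for independently drawn signs; for real-world data with correlated signs they may differ, as discussed in Section 5.2.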
4.3 CP^{∗}(G)
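We present another concise ILP formulation called CP^{∗}(G), based on RP^{∗}(G) [13]. In a real-life network, most edges are commonly zero-weighted. By taking this fact into account, Miyauchi et al. [13] introduced RP^{∗}(G), which is identical to RP(G) (or P(G)) except that we replace (1) with:
\[\begin{array}{ll} x_{i,j} + x_{j,k} - x_{i,k} \leq 1 & \forall (i,j,k)\in T_{> 0}^{1},\\ x_{i,j} - x_{j,k} + x_{i,k} \leq 1 & \forall (i,j,k)\in T_{> 0}^{2},\\ -x_{i,j} + x_{j,k} + x_{i,k} \leq 1 & \forall (i,j,k)\in T_{> 0}^{3}, \end{array}\]
where
\[\begin{array}{l} T_{> 0}^{1} = \{(i,j,k)\mid 1\leq i<j<k\leq n,\ w_{i,j}> 0 \vee w_{j,k}> 0\},\\ T_{> 0}^{2} = \{(i,j,k)\mid 1\leq i<j<k\leq n,\ w_{i,j}> 0 \vee w_{i,k}> 0\},\\ T_{> 0}^{3} = \{(i,j,k)\mid 1\leq i<j<k\leq n,\ w_{j,k}> 0 \vee w_{i,k}> 0\}. \end{array}\]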
Since RP^{∗}(G) reduces nonredundant transitivity constraints, a connected component can be incomplete and the obtained partition can be infeasible for P(G). The following example presented by [13] shows the infeasibility.
Example 2
Let G = (V, E, w) with V = {1,2,3,4}, w_{1,2} = 1, w_{1,3} = w_{2,3} = − 1, and w_{1,4} = w_{2,4} = w_{3,4} = 0. A 0-1 vector \({\overline {\boldsymbol {x}}}^{\ast } = (\overline {x}_{i,j}^{\ast })\) such that \({\overline {x}}_{1,2}^{\ast } ={\overline {x}}_{1,4}^{\ast } ={\overline {x}}_{2,4}^{\ast } ={\overline {x}}_{3,4}^{\ast } = 1\) and \({\overline {x}}_{1,3}^{\ast } ={\overline {x}}_{2,3}^{\ast } = 0\) is one of the optimal solutions to RP^{∗}(G); however, transitivity constraint − x_{1,3} + x_{3,4} + x_{1,4} ≤ 1 in P(G) is violated.
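Example 2 can be checked directly. The sketch below assumes that RP^{∗}(G) keeps the merged constraint x_{{i, j}} + x_{{j, k}} − x_{i, k} ≤ 1 (with i < k) exactly when w_{i, j} > 0 or w_{j, k} > 0, i.e., the strict-inequality analogue of RP(G); this reading and the helper names are assumptions made for illustration.

```python
from itertools import combinations

n = 4
w = {(1, 2): 1, (1, 3): -1, (2, 3): -1, (1, 4): 0, (2, 4): 0, (3, 4): 0}
x = {(1, 2): 1, (1, 4): 1, (2, 4): 1, (3, 4): 1, (1, 3): 0, (2, 3): 0}

def get(table, u, v):
    """Look up a symmetric table by an unordered pair."""
    return table[(min(u, v), max(u, v))]

def violated(keep):
    """Triples (i, j, k), i < k, whose kept constraint x_ij + x_jk - x_ik <= 1 fails."""
    return [(i, j, k) for i, k in combinations(range(1, n + 1), 2)
            for j in range(1, n + 1)
            if j not in (i, k) and keep(i, j, k)
            and get(x, i, j) + get(x, j, k) - get(x, i, k) > 1]

# Assumed RP*(G) rule: keep the constraint when w_ij > 0 or w_jk > 0.
rp_star = violated(lambda i, j, k: get(w, i, j) > 0 or get(w, j, k) > 0)
p_all = violated(lambda i, j, k: True)
objective = sum(w[e] * x[e] for e in w)

print(rp_star)    # []
print(p_all)      # [(1, 4, 3), (2, 4, 3)]
print(objective)  # 1
```

The triple (1, 4, 3) corresponds to the constraint − x_{1,3} + x_{3,4} + x_{1,4} ≤ 1 named in the example; since no 0-1 vector can score more than w_{1,2} = 1, the vector is optimal for the relaxed formulation while violating P(G).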
Thus, Miyauchi et al. [13] proposed RP^{∗}(G)+pp, which first runs RP^{∗}(G) and then performs postprocessing (pp) for the solution obtained by RP^{∗}(G) to recover the feasibility. We can modify RP^{∗}(G) using the same idea as CP(G) and obtain CP^{∗}(G). Furthermore, by applying the same postprocessing procedure pp, we obtain CP^{∗}(G)+pp.
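In CP^{∗}(G), we replace (1) with:
\[\begin{array}{ll} x_{i,j} + x_{j,k} - x_{i,k} \leq 1 & \forall (i,j,k)\in C_{> 0}^{1},\\ x_{i,j} - x_{j,k} + x_{i,k} \leq 1 & \forall (i,j,k)\in C_{> 0}^{2},\\ -x_{i,j} + x_{j,k} + x_{i,k} \leq 1 & \forall (i,j,k)\in C_{> 0}^{3}, \end{array}\]
where
\[\begin{array}{l} C_{> 0}^{1} = \{(i,j,k)\mid 1\leq i<j<k\leq n,\ w_{j,k}> 0\},\\ C_{> 0}^{2} = \{(i,j,k)\mid 1\leq i<j<k\leq n,\ w_{i,k}> 0\},\\ C_{> 0}^{3} = \{(i,j,k)\mid 1\leq i<j<k\leq n,\ w_{j,k}> 0\}. \end{array}\]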
The postprocessing procedure pp is described as follows. Let \(\boldsymbol {\check {x}}^{\ast } = (\check {x}_{i,j}^{\ast })_{1\leq i<j\leq n}\) denote an optimal solution of CP^{∗}(G). Let \(\check {E}^{\ast }\) denote \(\{\{i,j\}\in E \mid \check {x}_{{i,j}}^{\ast } = 1\}\), E_{> 0} denote {{i, j}∈ E∣w_{i, j} > 0}, and \(\check {E}_{>0}^{\ast }\) denote \(\check {E}^{\ast }\cap E_{>0}\). Now let us consider a new graph \(G_{new} = (V, \check {E}_{>0}^{\ast })\). We partition V into a set of weakly connected components \(\{(V_{1},\check {E}_{1}^{\ast }),(V_{2},\check {E}_{2}^{\ast }),\dots ,(V_{p},\check {E}_{p}^{\ast })\}\) of G_{new} by depth-first search and output the 0-1 vector x^{∗} that corresponds to partition \(\{V_{1},V_{2},\dots ,V_{p}\}\), i.e., x^{∗} such that \(x_{{i,j}}^{\ast } = 1\) if and only if i, j ∈ V_{ℓ} for some \(\ell \in \{1,2,\dots ,p\}\). Note that this procedure runs in time linear in the size of G.
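A minimal sketch of pp is given below, assuming the solution is stored as a dictionary over all pairs (i, j) with i < j; the function name `postprocess` and this data layout are hypothetical. It is applied to the vector of Example 2.

```python
def postprocess(n, w, x_check):
    """Group vertices by connected components of the selected positive-weight edges."""
    adjacency = {v: [] for v in range(1, n + 1)}
    for (i, j), value in x_check.items():
        if value == 1 and w[(i, j)] > 0:
            adjacency[i].append(j)
            adjacency[j].append(i)

    component = {}
    for start in range(1, n + 1):
        if start in component:
            continue
        stack = [start]              # iterative depth-first search
        component[start] = start
        while stack:
            u = stack.pop()
            for v in adjacency[u]:
                if v not in component:
                    component[v] = start
                    stack.append(v)

    # x*_{i,j} = 1 exactly when i and j fall in the same component.
    return {(i, j): int(component[i] == component[j]) for (i, j) in x_check}

w = {(1, 2): 1, (1, 3): -1, (2, 3): -1, (1, 4): 0, (2, 4): 0, (3, 4): 0}
x_check = {(1, 2): 1, (1, 4): 1, (2, 4): 1, (3, 4): 1, (1, 3): 0, (2, 3): 0}
x_star = postprocess(4, w, x_check)

print(x_star)
print(sum(w[e] * x_star[e] for e in w))  # 1
```

Only edge {1, 2} is both selected and positive, so the output corresponds to the partition {1, 2}, {3}, {4}, which is feasible for P(G) and keeps the objective value 1.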
The following lemma, which corresponds to Lemma 2 for CP(G), holds for CP^{∗}(G). Here vertices i and j are positively reachable if a path exists from i to j that consists of edges in \(\check {E}_{>0}^{\ast }\).
Lemma 3
\((V_{\ell }, \check {E}^{\ast }_{{\ell }})\) is complete for each ℓ ∈{1,…,p}.
The proof is obtained from that of Lemma 2 by replacing ‘nonnegatively’ with ‘positively’, E_{≥ 0} with E_{> 0}, and CP(G) with CP^{∗}(G).
Here we show that CP^{∗}(G)+pp returns an optimal solution to P(G). This follows the proof in Subsection 4.1 in Miyauchi et al. [13], which shows that RP^{∗}(G)+pp returns an optimal solution to P(G). For the proof, it suffices to show that the objective value of x^{∗} in P(G) is identical to that of \(\boldsymbol {\check {x}}^{\ast }\) in CP^{∗}(G) because x^{∗} is feasible for P(G) and \(\boldsymbol {\check {x}}^{\ast }\) is optimal for CP^{∗}(G), which is a relaxation of P(G) (i.e., having fewer constraints). For convenience, we define \(E_{in}^{\ast } = \{\{i,j\}\in E \mid x_{{i,j}}^{\ast }= 1\}\) and \(E_{out}^{\ast } = E\setminus E_{in}^{\ast }\). Note that graph \((V,E_{in}^{\ast })\) consists of p cliques \(V_{1},V_{2},\dots , V_{p}\).
We have the following lemmas.
Lemma 4
It holds that \(\sum \limits _{\{i,j\}\in {E^{\ast }}_{in}} w_{i,j}x_{{i,j}}^{\ast }= \sum \limits _{\{i,j\}\in {E^{\ast }}_{in}} w_{i,j}\check {x}_{{i,j}}^{\ast }\).
Proof
It suffices to show that for any \(\ell \in \{1,2,\dots ,p\}\), it holds that \(\check {x}^{\ast }_{i,j} = 1\) for each i, j ∈ V_{ℓ} where i < j. Since V_{ℓ} is weakly connected by \(\check {E}_{>0}^{\ast }\), a path exists on \(\check {E}_{>0}^{\ast }\) that connects i and j. This implies that every edge on it has a positive weight; i and j are positively reachable. According to Lemma 3, we conclude that i and j are directly reachable; \(\check {x}_{i,j}^{\ast } = 1\). □
Lemma 5
It holds that \(\sum \limits _{\{i,j\}\in {E^{\ast }}_{out}} w_{i,j} \check {x}_{{i,j}}^{\ast } \leq 0\).
Proof
For each \(\{i,j\}\in E^{\ast }_{out}\), we have \(\{i,j\}\not \in \check {E}^{\ast }_{>0}\). Otherwise, i and j would belong to the same component of G_{new}, so \(x^{\ast }_{{i,j}} = 1\) and thus \(\{i,j\}\in E^{\ast }_{in}\). Therefore, for each \(\{i,j\}\in E^{\ast }_{out}\), we have \(\check {x}^{\ast }_{{i,j}} = 0\) or w_{i, j} ≤ 0, which proves the lemma. □
By Lemmas 4 and 5, we have \({\sum }_{\{i,j\}\in E} w_{i,j}x^{\ast }_{{i,j}} = {\sum }_{\{i,j\}\in E^{\ast }_{in}} w_{i,j}x^{\ast }_{{i,j}} \geq {\sum }_{\{i,j\}\in E^{\ast }_{in}} w_{i,j}\check {x}^{\ast }_{{i,j}} + {\sum }_{\{i,j\}\in E^{\ast }_{out}} w_{i,j}\check {x}^{\ast }_{{i,j}} = {\sum }_{\{i,j\}\in E} w_{i,j}\check {x}^{\ast }_{{i,j}}\). Thus, the following theorem holds.
Theorem 3
Any 0-1 vector returned by CP^{∗}(G)+pp is optimal for P(G).
To estimate the ratio of CP^{∗}(G) to RP^{∗}(G), we define b as the probability that an edge has a positive weight:
\[b = \frac{|E_{> 0}|}{|E|}.\]
Theorem 4
When the weight of each edge is positive with probability b, independently for each edge, the ratio of the number of constraints in CP^{∗}(G) to that in RP^{∗}(G) can be estimated:
\[\frac{|CP^{\ast }(G)|}{|RP^{\ast }(G)|} \approx \frac{1}{2-b}.\]
The proof is almost identical to Theorem 2. The ratio approaches 1/2 when b approaches 0, and it approaches 1 when b approaches 1.
4.4 ACP(G)/ACP^{∗}(G)
As discussed in Section 4.1, either (i) or (ii) is required for transitivity constraints. We can adaptively switch between (i) and (ii) to minimize the number of transitivity constraints. The formulation that introduces this idea to CP(G) is called Adaptive CP(G) (ACP(G)), and that for CP^{∗}(G) is called Adaptive CP^{∗}(G) (ACP^{∗}(G)).
Let nn(i) denote the number of vertices j(≠i) that satisfy w_{i, j} ≥ 0. In ACP(G), we replace (1) with:
where
ACP^{∗}(G) is defined in the same manner based on CP^{∗}(G).
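The adaptive choice can be sketched as follows. This is an illustration under assumptions rather than the exact definition: for each pair i < k the sketch keeps the constraints of condition (i) when nn(k) ≤ nn(i) and those of condition (ii) otherwise, so that the smaller of the two sets is used; breaking ties toward (i) is an arbitrary choice, and the star-like instance is hypothetical.

```python
from itertools import combinations

def nn(n, w, i):
    """Number of vertices j != i with w_{i,j} >= 0."""
    return sum(1 for j in range(1, n + 1) if j != i and w[frozenset((i, j))] >= 0)

def cp_triples(n, w):
    """Condition (i) for every pair i < k: keep (i, j, k) when w_{j,k} >= 0."""
    return [(i, j, k) for i, k in combinations(range(1, n + 1), 2)
            for j in range(1, n + 1)
            if j not in (i, k) and w[frozenset((j, k))] >= 0]

def acp_triples(n, w):
    """Per pair i < k, use condition (i) or (ii), whichever yields fewer constraints."""
    kept = []
    for i, k in combinations(range(1, n + 1), 2):
        use_first = nn(n, w, k) <= nn(n, w, i)  # tie-break toward (i): an assumption
        for j in range(1, n + 1):
            if j in (i, k):
                continue
            anchor = (j, k) if use_first else (i, j)
            if w[frozenset(anchor)] >= 0:
                kept.append((i, j, k))
    return kept

n = 5
# Hypothetical instance: edges at vertex 5 are nonnegative, all others negative.
w = {frozenset(e): (1 if 5 in e else -1) for e in combinations(range(1, n + 1), 2)}

print(len(cp_triples(n, w)), len(acp_triples(n, w)))  # 18 6
```

The gap is large here because vertex 5 has many nonnegative edges while every other vertex has one, so switching to condition (ii) for the pairs (i, 5) removes all of their constraints.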
Theorem 5
ACP(G) and P(G) have the same set of optimal solutions.
The proof resembles that of Theorem 1. By Theorem 1 in [12] and this theorem, we obtain the following corollary.
Corollary 2
ACP(G) and RP(G) have the same set of optimal solutions.
We compare the numbers of transitivity constraints in ACP(G) and CP(G). For given i, k, s.t. i < k, the number of transitivity constraints related to i, k in CP(G), i.e., the number of edges with nonnegative weight between k and j, s.t. j≠i, j≠k, follows binomial distribution B(n − 2,a). The probability that the number of such edges equals y is given:
\[p(y) = \left (\begin {array}{c} n-2 \\ y \end {array} \right ) a^{y}(1-a)^{n-2-y}.\]
The expected number of transitivity constraints related to i, k for CP(G) is given as \({\sum }_{y=0}^{n-2} y\cdot p(y) = a(n-2)\). For ACP(G), we choose the smaller one of two values generated independently by binomial distribution B(n − 2,a). The probability that the smaller one equals y is given:
\[q(y) = p(y)^{2} + 2p(y)\sum _{z=y+1}^{n-2} p(z).\]
Then the expected number of transitivity constraints related to i, k for ACP(G) is given as \({\sum }_{y=0}^{n-2} y\cdot q(y)\).
When n becomes large and the mean (which equals (n − 2)a) and the variance (which equals (n − 2)a(1 − a)) are sufficiently large, we can approximate binomial distribution B(n − 2,a) by normal distribution N((n − 2)a,(n − 2)a(1 − a)). Then for each i, k, the difference between the expected numbers of transitivity constraints for CP(G) and ACP(G) is given as \(\frac {1}{\sqrt {\pi }}\) times the standard deviation \(\sqrt {(n-2)a(1 - a)}\).
Thus, summing over all n(n − 1)/2 pairs i, k, |CP(G)| − |ACP(G)| is approximately:
\[\frac {n(n-1)}{2}\sqrt {\frac {(n-2)a(1-a)}{\pi }}. \qquad (2)\]
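The quality of the normal approximation for a single pair i, k can be checked numerically. The sketch below computes the exact expected minimum of two independent B(n − 2, a) variables from the binomial probabilities and compares the per-pair saving with σ/√π; the function name and the values n = 102, a = 0.5 are hypothetical choices for illustration.

```python
from math import comb, pi, sqrt

def expected_counts(n, a):
    """Return (E[Y], E[min(Y, Y')]) for independent Y, Y' ~ B(n - 2, a)."""
    m = n - 2
    p = [comb(m, y) * a**y * (1 - a)**(m - y) for y in range(m + 1)]
    tail = [0.0] * (m + 2)              # tail[y] = P(Y >= y)
    for y in range(m, -1, -1):
        tail[y] = tail[y + 1] + p[y]
    # P(min = y) = p(y)^2 + 2 p(y) P(Y > y)
    q = [p[y] ** 2 + 2 * p[y] * tail[y + 1] for y in range(m + 1)]
    mean = sum(y * p[y] for y in range(m + 1))
    mean_min = sum(y * q[y] for y in range(m + 1))
    return mean, mean_min

n, a = 102, 0.5
mean, mean_min = expected_counts(n, a)
sigma = sqrt((n - 2) * a * (1 - a))

print(mean - mean_min)   # exact expected saving per pair
print(sigma / sqrt(pi))  # normal approximation of the same quantity
```

Multiplying the per-pair saving by the number of pairs n(n − 1)/2 gives the overall difference between the two formulations under the independence model.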
For ACP^{∗}(G), the following theorem holds.
Theorem 6
Any 0-1 vector returned by ACP^{∗}(G) + pp is optimal for P(G).
We can prove this theorem based on the same reason as for Theorem 3. We can also estimate the reduction of the transitivity constraints in ACP^{∗}(G) against CP^{∗}(G) by replacing a with b in (2) for ACP(G).
5 Experiments
We experimentally evaluated the performance of our new concise ILP formulations. All the ILP formulations were solved with Gurobi version 9.0.3 using its default parameters. All problem instances were solved on a workstation equipped with an Intel Xeon Gold 6130 CPU @ 2.10GHz, 192GB RAM, and Ubuntu 18.04.2 LTS. For each instance and formulation, we set the time limit to four hours.
5.1 Problem settings
We used four types of problem settings in the experiments, based on popular CPP application domains. The first three settings are based on real-world datasets from Miyauchi et al. [13]. The last is based on synthetic (randomly generated) graphs, which were used as a benchmark for coalition structure generation algorithms [20].
(P) Correlation clustering:
For instance G = (V, E, w), E = E_{>} ∪ E_{<} ∪ E_{0} where E_{>}, E_{<} and E_{0} are mutually disjoint. For each {i, j}∈ E, we set w_{i, j} = 1 if {i, j}∈ E_{>}, w_{i, j} = − 1 if {i, j}∈ E_{<}, and w_{i, j} = 0 if {i, j}∈ E_{0}. These instances were generated from a protein sequence dataset, http://www.paccanarolab.org/scps. The number of variables always equals \(\left (\begin {array}{c} n \\ 2 \end {array} \right ) = n(n-1)/2\).
(G) Group technology:
Instance G = (V, E, w) is an undirected bipartite graph. V is divided into two sets: p parts and q machines. For edge {i, j}∈ E between part i and machine j, w_{i, j} = 1 if i is processed by machine j and w_{i, j} = 0 otherwise. These instances are generated based on a manufacturing cell formation dataset, http://mauricio.resende.info/data.
(C) Community detection:
Instance G = (V, E, w) is based on a bipartite modularity network. V is divided into two sets, V_{R} and V_{L}, so that each edge has one end point in V_{L} and another in V_{R}. Edge {i, j}∈ E between i ∈ V_{L} and j ∈ V_{R} has weight A_{ij}/|E| − d_{i}d_{j}/|E|^{2}, where A_{ij} is the (i, j) component of the adjacency matrix of G and d_{i} is the degree of vertex i.
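The weight computation can be sketched as follows, reading the weight as A_{ij}/m − d_{i}d_{j}/m^{2}, where m is the number of links in the underlying bipartite network. This reading, the function name `modularity_weights`, and the toy network are assumptions made for illustration; only pairs with one vertex on each side are computed.

```python
def modularity_weights(left, right, links):
    """Weights A_ij / m - d_i * d_j / m^2 for pairs (i in left, j in right)."""
    m = len(links)
    degree = {v: 0 for v in list(left) + list(right)}
    for i, j in links:
        degree[i] += 1
        degree[j] += 1
    link_set = set(links)
    return {(i, j): ((i, j) in link_set) / m - degree[i] * degree[j] / m**2
            for i in left for j in right}

# Hypothetical toy network: two left vertices, two right vertices, three links.
weights = modularity_weights(["a", "b"], ["u", "v"],
                             [("a", "u"), ("a", "v"), ("b", "v")])

print(weights[("a", "u")])  # 1/3 - (2 * 1)/9, a linked pair
print(weights[("b", "u")])  # 0 - (1 * 1)/9, an unlinked pair
```

Linked pairs of low-degree vertices receive positive weights, while unlinked pairs always receive negative weights, which is what produces the mix of edge signs that the formulations exploit.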
(S) Coalition structure generation:
We adopted the setting of the benchmark problem used in an existing work on coalition structure generation algorithms [20]. For each pair of nodes, an edge exists with probability 95%. With probability 5%, the edge weight is positive. Otherwise (i.e., with probability 95%), it is negative. A positive weight is chosen randomly from a range [1,100]; a negative weight is chosen randomly from a range [− 100,− 1]. We set the number of nodes n to 30, 60, 90, and 120 and computed the average over 100 instances for each setting.
Table 1 shows the ratio of the nonnegative or positive edges for each instance.
5.2 Experimental results
We compared our four new concise ILP formulations against two existing ILP formulations: RP(G) and RP^{∗}(G). Table 2 shows the number of transitivity constraints contained in the ILP formulations for P(G), RP(G), CP(G), and ACP(G), and Table 3 shows the number for RP^{∗}(G), CP^{∗}(G), and ACP^{∗}(G).
In the CP(G), ACP(G), CP^{∗}(G), and ACP^{∗}(G) columns, the pair of numbers in parentheses shows the ratio of the number of transitivity constraints. The first shows the actual ratio: CP(G)/RP(G) for the CP(G) column, ACP(G)/RP(G) for the ACP(G) column, CP^{∗}(G)/RP^{∗}(G) for the CP^{∗}(G) column, and ACP^{∗}(G)/RP^{∗}(G) for the ACP^{∗}(G) column. The second shows the theoretical ratio estimated by Theorems 2 and 4 or the approximation method presented in Section 4.4. These estimations are based on the ratios in Table 1. When the difference between the actual ratio and the theoretical one exceeds 10%, the larger one is shown in bold.
CP(G) contains slightly more transitivity constraints than the theoretical estimation for (G) as well as C4, C5 and C6. The ratio of CP(G) almost meets the theoretical estimation for (P) and (S). On the other hand, the ratio of ACP(G) almost meets the theoretical estimation for all the instances. The maximum difference between the actual and theoretical ratios is five percent. Note that ACP(G) eliminates more transitivity constraints than CP(G) for all the instances.
CP^{∗}(G) has almost the same or fewer constraints than the theoretical estimation for all the instances except for C2 and C3. ACP^{∗}(G) significantly reduces the redundant transitivity constraints compared to RP^{∗}(G) (by more than 60% in most instances). Similar to CP(G), the ratio of CP^{∗}(G) is close to the theoretical estimation for synthetic graphs (S). Note that ACP^{∗}(G) eliminates more transitivity constraints than CP^{∗}(G) for all the instances.
We obtained our theoretical estimation under the assumption that the sign of each edge (positive or negative) is determined independently. For real-world datasets, some correlations among edge signs likely exist. Thus, the actual reduction rate and our estimation do not exactly coincide except for synthetic graphs.
Table 4 shows the computation times in seconds. For the RP^{∗}(G), CP^{∗}(G), and ACP^{∗}(G) results, we included the computational times for running the postprocessing pp. The pp time was negligible in our experiments. In most cases it was less than 0.05 seconds. The longest pp time was 0.15 seconds for C6, which has the most nodes (i.e., 2,549 nodes) in our experiments.
OM indicates that we failed to solve the instance due to insufficient memory, which occurs when the problem instance contains more than tens of millions of constraints. We show the relative gaps for problem instances that fit in memory but for which an optimal solution was not obtained within the time limit. More specifically, Gurobi returns an upper bound (UB) and a lower bound (LB) of the optimal value when terminated before reaching an optimal solution. We show (UB − LB)/LB. For each instance, the best computation time (or the relative gap) among the formulations is highlighted in bold.
ACP^{∗}(G) successfully solved the most instances, i.e., 24 out of 27 in the (P), (G), and (C) categories, among the six formulations; RP(G), CP(G) and ACP(G) solved 18 instances, and RP^{∗}(G) and CP^{∗}(G) solved 22. There were no instances that ACP^{∗}(G) could not solve that the other formulations could. Thus, ACP^{∗}(G) is the best formulation with respect to the number of solved instances.
For the computation time, RP^{∗}(G) works well on the (G) instances, except G33, even though CP^{∗}(G) and ACP^{∗}(G) have fewer transitivity constraints. On the other instance categories of (P),(C), and (S), ACP^{∗}(G) shows the best performance.
As shown above, although reducing redundant constraints generally shortened the computation times, in some cases (mostly in the (G) instances) our new formulations required more computation time than the existing methods, even though our formulations have fewer constraints. Our current conjecture is that even though some constraints are redundant, they can still be useful for constraint propagation and can thereby reduce the computation times.
6 Conclusions
We investigated the ILP formulations for CPP. Based on the existing ILP formulations, we developed a series of new ILP formulations that can significantly reduce transitivity constraints. We characterized the transitivity constraints based on reachability among vertices and identified the redundant parts in the existing ILP formulations. All our ILP formulations are guaranteed to obtain optimal clique partitions. Experimental results showed that our new ILP formulations outperformed the existing ILP formulations for most instances. Transitivity constraints are ubiquitous in a variety of problems and their possible formulations. For CPP, we can use MaxSAT encoding instead of an ILP formulation. We would like to apply our idea to reduce transitivity constraints for MaxSAT encoding of CPP, as well as other discrete optimization problems, e.g., [10, 18, 21].
Notes
Liao et al. [10] also proposed a method to reduce the transitivity constraints for coalition structure generation problems based on reachability among vertices. Unlike our method, their work assumes that a problem instance is represented as a set of rules rather than a graph and deals with MaxSAT encoding.
References
Bansal, N., Blum, A., & Chawla, S. (2004). Correlation clustering. Machine Learning, 56(1–3), 89–113.
Barber, M. J. (2007). Modularity and community detection in bipartite networks. Physical Review E, 76, 066102.
Benati, S., Puerto, J., & Rodríguez-Chía, A. M. (2017). Clustering data that are graph connected. European Journal of Operational Research, 261(1), 43–53.
Brandes, U., Delling, D., Gaertler, M., Görke, R., Hoefer, M., Nikoloski, Z., & Wagner, D. (2008). On modularity clustering. IEEE Transactions on Knowledge and Data Engineering, 20(2), 172–188.
Bruckner, S., Hüffner, F., Komusiewicz, C., & Niedermeier, R. (2013). Evaluation of ILP-based approaches for partitioning into colorful components. In Proceedings of 12th International Symposium on Experimental Algorithms (SEA 2013) (pp. 176–187).
Costa, A. (2015). MILP Formulations for the modularity density maximization problem. European Journal of Operational Research, 245(1), 14–21.
Dinh, T. N., & Thai, M. T. (2015). Toward optimal community detection: From trees to general weighted networks. Internet Mathematics, 11(3), 181–200.
Fortunato, S. (2010). Community detection in graphs. Physics Reports, 486(3), 75–174.
Grötschel, M., & Wakabayashi, Y. (1989). A cutting plane algorithm for a clustering problem. Mathematical Programming, 45(1–3), 59–96.
Liao, X., Koshimura, M., Nomoto, K., Ueda, S., Sakurai, Y., & Yokoo, M. (2019). Improved WPM encoding for coalition structure generation under MC-nets. Constraints, 24(1), 25–55.
Miyauchi, A., & Miyamoto, Y. (2013). Computing an upper bound of modularity. European Physical Journal B, 86, 302.
Miyauchi, A., & Sukegawa, N. (2015). Redundant constraints in the standard formulation for the clique partitioning problem. Optimization Letters, 9(1), 199–207.
Miyauchi, A., Sonobe, T., & Sukegawa, N. (2018). Exact clustering via integer programming and maximum satisfiability. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence (AAAI-18) (pp. 1387–1394).
Nogueira, L. L. H., Quiles, M. G., & Lorena, L. A. N. (2019a). Improving the performance of an integer linear programming community detection algorithm through clique filtering. In Proceedings of 19th International Conference on Computational Science and Its Applications (ICCSA 2019) (pp. 757–769).
Nogueira, L. L. H., Quiles, M. G., Lorena, L. A. N., de Carvalho, A., & Cespedes, J. G. (2019b). Qualitative data clustering: a new integer linear programming model. In Proceedings of 2019 International Joint Conference on Neural Networks (IJCNN 2019) (pp. 1–8).
Oosten, M., Rutten, J. H. G. C., & Spieksma, F. C. R. (2001). The clique partitioning problem. Networks, 38(4), 209–226.
Rahwan, T., Michalak, T. P., Wooldridge, M., & Jennings, N. R. (2015). Coalition structure generation: a survey. Artificial Intelligence, 229, 139–174.
Soh, T., Berre, D. L., Roussel, S., Banbara, M., & Tamura, N. (2014). Incremental SAT-based method with native Boolean cardinality handling for the Hamiltonian cycle problem. In Proceedings of the 14th European Conference on Logics in Artificial Intelligence (JELIA 2014) (pp. 684–693).
Wang, H., Alidaee, B., Glover, F., & Kochenberger, G. (2006). Solving group technology problems via clique partitioning. International Journal of Flexible Manufacturing Systems, 18, 77–97.
Watanabe, E., Koshimura, M., Sakurai, Y., & Yokoo, M. (2019). Solving coalition structure generation problems over weighted graph. In Proceedings of the 22nd International Conference on Principles and Practice of Multi-Agent Systems (PRIMA 2019) (pp. 338–353).
Zha, A., Nomoto, K., Ueda, S., Koshimura, M., Sakurai, Y., & Yokoo, M. (2017). Coalition structure generation for partition function games utilizing a concise graphical representation. In Proceedings of 20th International Conference on Principles and Practice of Multi-Agent Systems (PRIMA 2017) (pp. 143–159).
Ethics declarations
Conflict of Interests
The authors declare that they have no conflict of interests.
Additional information
This work was supported by JSPS KAKENHI Grant Numbers JP19H04175, JP20H00609, JP18H03299.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Koshimura, M., Watanabe, E., Sakurai, Y. et al. Concise integer linear programming formulation for clique partitioning problems. Constraints 27, 99–115 (2022). https://doi.org/10.1007/s10601-022-09326-z
Keywords
 Clique partitioning problem
 Integer linear programming
 Transitivity constraints