Fast branching algorithm for Cluster Vertex Deletion

In the family of clustering problems, we are given a set of objects (vertices of the graph), together with some observed pairwise similarities (edges). The goal is to identify clusters of similar objects by slightly modifying the graph to obtain a cluster graph (disjoint union of cliques). Hueffner et al. [Theory Comput. Syst. 2010] initiated the parameterized study of Cluster Vertex Deletion, where the allowed modification is vertex deletion, and presented an elegant O(2^k * k^9 + n * m)-time fixed-parameter algorithm, parameterized by the solution size. In our work, we pick up this line of research and present an O(1.9102^k * (n + m))-time branching algorithm.


Introduction
The problem of clustering objects based on their pairwise similarities has arisen from applications both in computational biology [6] and machine learning [5]. In the language of graph theory, we are given as input a graph where vertices correspond to objects, and two objects are connected by an edge if they are observed to be similar. The goal is to transform the graph into a cluster graph (a disjoint union of cliques) using a minimum number of modifications.
The set of allowed modifications depends on the particular problem and application considered. Probably the most studied variant is the Cluster Editing problem, also known as Correlation Clustering, where we seek a minimum number of edge edits to obtain a cluster graph. The study of Cluster Editing includes [3,4,13,18,28] and, from the parameterized perspective, [7,8,9,10,11,14,15,17,20,21,22,24,25,26].
The main principle of parameterized complexity is that we seek algorithms that are efficient if the considered parameter is small. However, the distance measure in Cluster Editing, the number of edge edits, may be quite large in practical instances, and, in the light of recent lower bounds refuting the existence of subexponential FPT algorithms for Cluster Editing [17,24], it seems reasonable to look for other distance measures (see e.g. Komusiewicz's PhD thesis [24]) and/or different problem formulations.
In 2008, Hüffner et al. [23] initiated the parameterized study of the Cluster Vertex Deletion problem (ClusterVD for short). Here, the allowed modification is a vertex deletion.

Cluster Vertex Deletion (ClusterVD)
Parameter: k
Input: An undirected graph G and an integer k.
Question: Does there exist a set S of at most k vertices of G such that G \ S is a cluster graph, i.e., a disjoint union of cliques?
In terms of motivation, we want to refute as few objects as possible to make the set of observations completely consistent. As a vertex deletion also removes all edges incident to the deleted vertex, we may expect this new editing measure to be significantly smaller in practical applications than the edge-editing distance.
As ClusterVD can be equivalently stated as the problem of hitting, with a minimum number of vertices, all induced $P_3$s (paths on 3 vertices) in the input graph, ClusterVD can be solved in $O(3^k (n + m))$ time by a straightforward branching algorithm [12], where $n$ and $m$ denote the number of vertices and edges of $G$, respectively. The dependency on $k$ can be improved by considering a more elaborate case distinction in the branching algorithm, either directly [19], or via a general algorithm for 3-Hitting Set [29]. Hüffner et al. [23] provided an elegant $O(2^k k^9 + nm)$-time algorithm, using the iterative compression principle [27] and a reduction to the weighted maximum matching problem.
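The straightforward branching mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the graph is given as a dict mapping each vertex to the set of its neighbours, the function names are hypothetical, and the $P_3$ search below is a simple quadratic scan rather than the linear-time routine needed for the $O(3^k (n + m))$ bound.

```python
from itertools import combinations


def find_induced_p3(adj):
    """Return (u, v, w) with uv, vw edges and uw a non-edge, or None."""
    for v, nbrs in adj.items():
        for u, w in combinations(sorted(nbrs), 2):
            if w not in adj[u]:
                return (u, v, w)
    return None


def delete_vertex(adj, x):
    """Return a copy of the graph with vertex x removed."""
    return {a: nbrs - {x} for a, nbrs in adj.items() if a != x}


def cluster_vd_naive(adj, k):
    """Return a modulator of size at most k, or None if none exists."""
    if k < 0:
        return None
    p3 = find_induced_p3(adj)
    if p3 is None:
        return set()  # already a cluster graph
    # Every modulator must hit this P3, so branch on its three vertices.
    for x in p3:
        rest = cluster_vd_naive(delete_vertex(adj, x), k - 1)
        if rest is not None:
            return rest | {x}
    return None


# A path a-b-c-d needs exactly one deletion.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
print(cluster_vd_naive(path, 0), cluster_vd_naive(path, 1))
```

Each recursive call branches three ways and decreases $k$ by one, which is where the $3^k$ factor comes from.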
In our work we pick up this line of research and obtain the fastest algorithm for (unweighted) ClusterVD.
Theorem 1. Cluster Vertex Deletion can be solved in $O(1.9102^k (n + m))$ time and polynomial space on an input $(G, k)$ with $|V(G)| = n$ and $|E(G)| = m$.
Contrary to the algorithm of [23], our algorithm is a typical branching algorithm, where a number of branches and reductions is presented, and the complexity is analysed through (sometimes long) case analysis and branching vectors. The advantage of this approach is that we obtain a linear dependency on the graph size in the running time.
The main observation in the proof of Theorem 1 is that if, for some vertex $v$, we know that there exists a solution $S$ not containing $v$, then in the neighbourhood of $v$ the ClusterVD problem reduces to Vertex Cover. More precisely, define $N_1$ and $N_2$ to be the sets of vertices at distance 1 and 2 from $v$, respectively, and define the auxiliary graph $H_v$ to be a graph on $N_1 \cup N_2$ having an edge for each edge of $G$ between $N_1$ and $N_2$ and for each non-edge inside $N_1$ in $G$. In other words, two vertices are connected by an edge in $H_v$ iff, together with $v$, they form a $P_3$ in $G$. We observe that a solution $S$ not containing $v$ needs to contain a vertex cover of $H_v$. Moreover, one can show that we may greedily take as many vertices from $N_2$ as possible (inclusion-wise) into the aforementioned vertex cover, as these vertices would help us resolve the remaining part of the graph.
We note that a similar observation has been already used in [23] to cope with a variant of ClusterVD where we restrict the number of clusters in the resulting graph.
Branching to find the 'correct' vertex cover of H v is a very efficient branching, with worst-case (1, 2) (i.e., golden-ratio) branching vector. However, we do not have the vertex v beforehand, and branching to obtain such a vertex may be quite costly. Thus, our approach is to get as much gain as possible from the vertex cover-style branching on the auxiliary graph H v , to be able to balance the loss from some inefficient branches used to obtain the vertex v to start with. Consequently, we employ quite involved analysis of properties and branching algorithms for the auxiliary graph H v .
The paper is organised as follows. We give some preliminary definitions and notation in Section 2. In Section 3 we analyse the auxiliary graph H v and show a branching algorithm finding all relevant vertex covers of H v . Then, in Section 4 we prove Theorem 1. Section 5 concludes the paper.

Preliminaries
We use standard graph notation. All our graphs are undirected and simple. For a graph $G$, by $V(G)$ and $E(G)$ we denote its vertex set and edge set, respectively. For $v \in V(G)$, the set $N_G(v) = \{u \mid uv \in E(G)\}$ is the neighbourhood of $v$ in $G$ and $N_G[v] = N_G(v) \cup \{v\}$ is the closed neighbourhood. We extend these notions to sets of vertices $X \subseteq V(G)$ by $N_G[X] = \bigcup_{v \in X} N_G[v]$ and $N_G(X) = N_G[X] \setminus X$. We omit the subscript if it is clear from the context. For a set $X \subseteq V(G)$ we also define $G[X]$ to be the subgraph induced by $X$, and $G \setminus X$ is a shorthand for $G[V(G) \setminus X]$. By MinVC($G$) we denote the size of the minimum vertex cover of $G$.
In all further sections, we assume we are given an instance $(G, k)$ of Cluster Vertex Deletion, where $G = (V, E)$. That is, we use $V$ and $E$ to denote the vertex set and edge set of the input instance $G$.
A $P_3$ is an ordered set of 3 vertices $(u, v, w)$ such that $uv, vw \in E$ and $uw \notin E$. A graph is a cluster graph iff it does not contain any $P_3$; hence, in ClusterVD we seek a set of at most $k$ vertices that hits all $P_3$s.
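If at some point a vertex $v$ is fixed in the graph $G$, we define sets $N_1 = N_1(v)$ and $N_2 = N_2(v)$ as follows: $N_1 = N_G(v)$ and $N_2 = N_G(N_G[v])$. That is, $N_1$ and $N_2$ are the sets of vertices at distance exactly 1 and 2 from $v$, respectively. For a fixed $v \in V$, we define an auxiliary graph $H_v$ with $V(H_v) = N_1 \cup N_2$ and $E(H_v) = \{uw \mid u, w \in N_1, uw \notin E\} \cup \{uw \mid u \in N_1, w \in N_2, uw \in E\}$. Thus, $H_v$ consists of the vertices in $N_1$ and $N_2$ along with non-edges among vertices of $N_1$ and edges between $N_1$ and $N_2$. Observe the following.
Lemma 2. For $u, w \in N_1 \cup N_2$, we have $uw \in E(H_v)$ if and only if $u$, $w$ and $v$ form a $P_3$ in $G$.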
Proof. For every $uw \in E(H_v)$ with $u, w \in N_1$, $(u, v, w)$ is a $P_3$ in $G$. For $uw \in E(H_v)$ with $u \in N_1$ and $w \in N_2$, $(v, u, w)$ forms a $P_3$ in $G$. In the other direction, for any $P_3$ in $G$ of the form $(u, v, w)$ we have $u, w \in N_1$ and $uw \notin E$, thus $uw \in E(H_v)$. Finally, for any $P_3$ in $G$ of the form $(v, u, w)$ we have $u \in N_1$, $w \in N_2$ and $uw \in E$, hence $uw \in E(H_v)$.
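To make the definition of the auxiliary graph concrete, the following Python sketch builds $N_1$, $N_2$ and the edge set of $H_v$, and compares the result with a brute-force enumeration of the pairs that form a $P_3$ together with $v$. The adjacency-set representation and the function names are illustrative assumptions; the construction is quadratic in $|N_1|$ and is not meant to reflect the linear-time handling of $H_v$ discussed later.

```python
from itertools import combinations


def auxiliary_graph(adj, v):
    """Return (N1, N2, edges of H_v); each edge is a frozenset {u, w}."""
    n1 = set(adj[v])
    n2 = set()
    for u in n1:
        n2 |= adj[u]
    n2 -= n1 | {v}  # keep only vertices at distance exactly 2 from v
    edges = set()
    # Non-edges of G inside N1 become edges of H_v.
    for u, w in combinations(sorted(n1), 2):
        if w not in adj[u]:
            edges.add(frozenset((u, w)))
    # Edges of G between N1 and N2 stay edges of H_v.
    for u in n1:
        for w in adj[u] & n2:
            edges.add(frozenset((u, w)))
    return n1, n2, edges


def p3_pairs_with(adj, v):
    """Brute force: pairs {u, w} that induce a P3 together with v."""
    pairs = set()
    for u, w in combinations(sorted(set(adj) - {v}), 2):
        # Three vertices induce a P3 iff they span exactly two edges.
        if (u in adj[v]) + (w in adj[v]) + (w in adj[u]) == 2:
            pairs.add(frozenset((u, w)))
    return pairs


# v is adjacent to a and b; a and b are non-adjacent; c hangs off a.
g = {"v": {"a", "b"}, "a": {"v", "c"}, "b": {"v"}, "c": {"a"}}
n1, n2, h_edges = auxiliary_graph(g, "v")
print(sorted(n1), sorted(n2), h_edges == p3_pairs_with(g, "v"))
```

On this small instance the two edge sets coincide, as the lemma predicts.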
We call a subset $S \subseteq V$ a modulator when $G \setminus S$ is a cluster graph, that is, a collection of cliques. A modulator of minimum cardinality is called a solution.
Our algorithm is a typical branching algorithm, that is, it consists of a number of branching steps. In a step $(A_1, A_2, \ldots, A_r)$, where $A_1, A_2, \ldots, A_r \subseteq V$, we independently consider $r$ subcases. In the $i$-th subcase we look for a solution $S$ containing $A_i$: we delete $A_i$ from the graph and decrease the parameter $k$ by $|A_i|$. If $k$ becomes negative, we terminate the current branch and return a negative answer from the current subcase. For brevity, we sometimes write $w$ instead of $\{w\}$ in a branching step. The branching vector for a step $(A_1, A_2, \ldots, A_r)$ is the vector $(|A_1|, |A_2|, \ldots, |A_r|)$. It is well known (see e.g. [16]) that the number of final subcases of a branching algorithm is bounded by $O(c^k)$, where $c$ is the largest positive root of the equation $1 = \sum_{i=1}^{r} x^{-|A_i|}$ among all branching steps $(A_1, A_2, \ldots, A_r)$ in the algorithm.
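As a numerical illustration of this bound, the root $c$ for a given branching vector can be approximated by bisection. The sketch below is a minimal one and is not the script referred to later in the paper; the function name is hypothetical. It relies on the fact that, for a vector with $r \geq 2$ positive entries, the function $\sum_i x^{-a_i}$ is strictly decreasing for $x > 0$, is at least 1 at $x = 1$ and at most 1 at $x = r$.

```python
def branching_number(vector, iterations=200):
    """Approximate the positive root of 1 = sum(x ** -a for a in vector)."""
    r = len(vector)
    if r < 2:
        return 1.0  # a single branch does not multiply the number of subcases
    lo, hi = 1.0, float(r)
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if sum(mid ** -a for a in vector) > 1.0:
            lo = mid  # the sum is still too large, so the root lies to the right
        else:
            hi = mid
    return hi


for vec in [(1, 2), (1, 3), (2, 2)]:
    print(vec, round(branching_number(vec), 4))
```

For instance, the vector $(1, 2)$ gives the golden ratio and $(2, 2)$ gives $\sqrt{2}$.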

The auxiliary graph H v
In this section we investigate properties of the auxiliary graph $H_v$. Hence, we assume that a ClusterVD input $(G, k)$ is given with $G = (V, E)$, and a vertex $v \in V$ is fixed. We first establish a few basic properties and then build on them an efficient branching algorithm for ClusterVD for the case where we know there exists a solution not containing $v$.
Proof. Observe that if $S$ is a modulator, then $G \setminus S$ does not contain a $P_3$. By Lemma 2, if $v \notin S$, no edge may remain in $H_v \setminus S$ and the lemma follows.

Basic properties
Proof. Suppose the connected component of $v$ in $G \setminus X$ is not a clique. Then by Lemma 3, there is a $P_3$ involving $v$. Such a $P_3$ is also present in $G$. However, by Lemma 2, as $X$ is a vertex cover of $H_v$, $X$ intersects such a $P_3$, a contradiction.
For vertex covers $X$ and $Y$ of $H_v$, we say $X$ dominates $Y$ if $|X| \leq |Y|$, $X \cap N_2 \supseteq Y \cap N_2$ and at least one of these inequalities is sharp. Two vertex covers $X$ and $Y$ are said to be equivalent if $X \cap N_2 = Y \cap N_2$ and $|X \cap N_1| = |Y \cap N_1|$. We note that the first aforementioned relation is transitive and strongly anti-symmetric, whereas the second is an equivalence relation.
As a corollary of Lemma 6, we have:

Branching algorithm
We are now ready to develop a branching algorithm that guesses the 'correct' vertex cover of H v . Recall that we are working in the setting where we look for a solution to ClusterVD on (G, k) not containing v, thus, by Lemma 4, containing a vertex cover of H v . Our goal is to branch into a number of subcases, in each subcase picking a vertex cover of H v . By Corollary 7, our branching algorithm, to be correct, needs only to generate at least one element from each equivalence class of the 'equivalent' relation, among maximal elements in the 'dominate' relation.
The algorithm consists of a number of branching steps; in each subcase of each step we take a number of vertices into the constructed vertex cover of H v and, consequently, into the constructed solution to ClusterVD on G. At any point, the first applicable rule is applied.
First, we disregard isolated vertices in $H_v$. Second, we take care of large-degree vertices.
Rule 1. If there is a vertex $u \in V(H_v)$ of degree at least 3 in $H_v$, include either $u$ or $N_{H_v}(u)$ into the vertex cover. That is, use the branching step $(u, N_{H_v}(u))$.
Note that Rule 1 yields a branching vector $(1, d)$, where $d \geq 3$ is the degree of $u$ in $H_v$. Henceforth, we can assume that vertices have degree 1 or 2 in $H_v$. Assume there exists $u \in N_1$ of degree 1, with $uw \in E(H_v)$. Moreover, assume there exists a solution $S$ containing $u$. If $w \in S$, then, by Lemma 6, $S \setminus \{u\}$ is also a modulator, a contradiction.
Hence, we infer the following greedy rule.
Rule 2. If there is a vertex $u \in N_1$ of degree 1 in $H_v$, include $N_{H_v}(u)$ into the vertex cover. That is, use the branching step $(N_{H_v}(u))$.
Now we assume vertices in $N_1$ are of degree exactly 2 in $H_v$. Suppose we have vertices $u, w \in N_1$ with $uw \in E(H_v)$. We would like to branch on $u$ as in Rule 1, including either $u$ or $N_{H_v}(u)$ into the vertex cover. However, note that in the case where $u$ is deleted, Rule 2 is triggered on $w$ and consequently the other neighbour of $w$ is deleted. Hence, we infer the following rule.

Rule 3.
If there are vertices $u, w \in N_1$ with $uw \in E(H_v)$, then include either $N_{H_v}(w)$ or $N_{H_v}(u)$ into the vertex cover. That is, use the branching step $(N_{H_v}(w), N_{H_v}(u))$.
Note that Rule 3 yields the branching vector $(2, 2)$. We are left with the case where the maximum degree of $H_v$ is 2, there are no edges with both endpoints in $N_1$, and no vertices of degree one in $N_1$. Hence $H_v$ must be a collection of even cycles and paths (recall that $N_2$ is an independent set in $H_v$). On each such cycle $C$ of $2l$ vertices, the vertices of $N_1$ and $N_2$ alternate. Note that we must use at least $l$ vertices for the vertex cover of $C$. By Lemma 6 it is optimal to greedily select the $l$ vertices in $C \cap N_2$.

Rule 4.
If there is an even cycle $C$ in $H_v$ with every second vertex in $N_2$, include $C \cap N_2$ into the vertex cover. That is, use the branching step $(C \cap N_2)$.
For an even path $P$ of length $2l$, we have two choices. If we are allowed to use $l + 1$ vertices in the vertex cover of $P$, then, by Lemma 6, we may greedily take $P \cap N_2$. If we may use only $l$ vertices, the minimum possible number, we need to choose $P \cap N_1$, as it is the unique vertex cover of size $l$ of such a path. Hence, we have an $(l, l + 1)$ branch with our last rule.
Rule 5. Take the longest possible even path $P$ in $H_v$ and either include $P \cap N_1$ or $P \cap N_2$ into the vertex cover. That is, use the branching step $(P \cap N_1, P \cap N_2)$.
In Rule 5, we pick the longest possible path to avoid the branching vector (1, 2) as long as possible; this is the worst branching vector in the algorithm of this section.
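The uniqueness claim behind Rule 5 can be checked by brute force on small paths. In the sketch below (illustrative only, with a hypothetical function name) a path of length $2l$ is modelled on the vertices $0, 1, \ldots, 2l$, so that the even positions play the role of $P \cap N_2$ and the odd positions play the role of $P \cap N_1$.

```python
from itertools import combinations


def minimum_vertex_covers_of_path(num_vertices):
    """Return all minimum vertex covers of the path 0 - 1 - ... - (num_vertices - 1)."""
    edges = [(i, i + 1) for i in range(num_vertices - 1)]
    for size in range(num_vertices + 1):
        covers = [
            set(c)
            for c in combinations(range(num_vertices), size)
            if all(a in c or b in c for a, b in edges)
        ]
        if covers:
            return covers  # the first non-empty level holds the minimum covers
    return []


# A path of length 6 (l = 3): the only cover of size 3 is the set of odd positions.
print(minimum_vertex_covers_of_path(7))
```

The enumeration is exponential and serves only to confirm the $(l, l + 1)$ shape of the branch on toy instances.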
When we are forced to use the $(1, 2)$ branch, we exploit a very specific structure of $H_v$. A seagull is a connected component of $H_v$ that is isomorphic to a $P_3$ with middle vertex in $N_1$ and endpoints in $N_2$. The graph $H_v$ is called an $s$-skein if it is a disjoint union of $s$ seagulls and some isolated vertices. The following observation is straightforward from the above analysis.
Lemma 8. If the algorithm of Section 3.2 may only use a branch with the branching vector $(1, 2)$, then $H_v$ is an $s$-skein for some $s \geq 1$.
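The definition of an $s$-skein translates directly into a local check on $H_v$. The following sketch assumes that $H_v$ is available explicitly as an adjacency dict together with the sets $N_1$ and $N_2$; the function name and the convention of returning `None` for a non-skein are illustrative choices. Note that an edgeless $H_v$ is reported as $s = 0$.

```python
def skein_size(h_adj, n1, n2):
    """Return s if H_v is an s-skein, otherwise None."""
    seagulls = 0
    for u, nbrs in h_adj.items():
        if not nbrs:
            continue  # isolated vertices are allowed
        if u in n1:
            # A non-isolated N1 vertex must be the middle of a seagull:
            # exactly two neighbours, both in N2, each adjacent only to u.
            if len(nbrs) != 2 or not nbrs <= n2:
                return None
            if any(h_adj[w] != {u} for w in nbrs):
                return None
            seagulls += 1
        else:
            # A non-isolated N2 vertex must be an endpoint hanging off N1.
            if len(nbrs) != 1 or not nbrs <= n1:
                return None
    return seagulls


h = {
    "y1": {"x1", "z1"}, "x1": {"y1"}, "z1": {"y1"},
    "y2": {"x2", "z2"}, "x2": {"y2"}, "z2": {"y2"},
    "q": set(),
}
print(skein_size(h, {"y1", "y2", "q"}, {"x1", "z1", "x2", "z2"}))
```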
We conclude this section with a note on how fast a single branching step may be executed. Note that, as H v contains parts of the complement of G, it may have size superlinear in the size of G. However, it is easy to see that the following oracle procedure suffices to find and execute the lowest-numbered available branching step in the graph H v .

Algorithm
In this section we show our algorithm for ClusterVD, proving Theorem 1. The algorithm is a typical branching algorithm, where at each step we choose one branching rule and apply it. In each subcase, a number of vertices is deleted, and the parameter k drops by this number. If k becomes negative, the current subcase is terminated with a negative answer. On the other hand, if k is nonnegative and G is a cluster graph, the vertices deleted in this subcase form a modulator of size at most k.

Preprocessing
At each step, we first preprocess simple connected components of G.
Lemma 10. In linear time, we can for each connected component $C$ of $G$:
1. conclude that $C$ is a clique; or
2. conclude that $C$ is not a clique, but identify a vertex $w$ such that $C \setminus \{w\}$ is a cluster graph; or
3. conclude that none of the above holds.
Proof. On each connected component C, we perform a depth-first search. At every stage, we ensure that the set of already marked vertices induces a clique.
When we enter a new vertex $w$, adjacent to a marked vertex $v$, we attempt to maintain this invariant. We check if the number of marked vertices is equal to the number of neighbours of $w$ which are marked; if so, then the new vertex $w$ is marked. Since $w$ is adjacent to every marked vertex, the set of marked vertices remains a clique. Otherwise, there is a marked vertex $u$ such that $uw \notin E(G)$, and we may discover it by iterating once again over edges incident to $w$. In this case, we have discovered a $P_3$ $(u, v, w)$ and $C$ is not a clique. At least one of $u, v, w$ must be deleted to make $C$ into a cluster graph. We delete each one of them in turn, and repeat the algorithm (without further recursion) to check if the remaining graph is a cluster graph. If one of the three possibilities returns a cluster graph, then (2) holds. Otherwise, (3) holds.
If we have marked all vertices in a component C while maintaining the invariant that marked vertices form a clique, then the current component C is a clique.
For each connected component C that is a clique, we disregard C. For each connected component C that is not a clique, but C \ {w} is a cluster graph for some w, we may greedily delete w from G: we need to delete at least one vertex from C, and w hits all P 3 s in C. Thus, henceforth we assume that for each connected component C of G and for each v ∈ V (C), C \ {v} is not a cluster graph. In other words, we assume that we need to delete at least two vertices to solve each connected component of G.
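The three outcomes of the preprocessing can be illustrated with a deliberately simplified classifier. The sketch below is not the linear-time DFS procedure of Lemma 10: it swaps in a degree-counting clique test and simply tries every vertex of the component as the candidate $w$, which is quadratic. The function names and the returned labels are hypothetical.

```python
def components(adj):
    """Connected components of a graph given as vertex -> set of neighbours."""
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        comp, stack = set(), [start]
        seen.add(start)
        while stack:
            x = stack.pop()
            comp.add(x)
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        comps.append(comp)
    return comps


def is_cluster_graph(adj):
    """Every vertex must be adjacent to all other vertices of its component."""
    return all(len(adj[x]) == len(c) - 1 for c in components(adj) for x in c)


def classify_component(adj, comp):
    """Return ('clique', None), ('one_deletion', w) or ('hard', None)."""
    sub = {x: adj[x] & comp for x in comp}
    if is_cluster_graph(sub):
        return ("clique", None)
    for w in sorted(comp):  # simplification: try every vertex, not just a found P3
        rest = {x: nbrs - {w} for x, nbrs in sub.items() if x != w}
        if is_cluster_graph(rest):
            return ("one_deletion", w)
    return ("hard", None)


p4 = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
print(classify_component(p4, set(p4)))
```

Components labelled `hard` are exactly those that survive preprocessing, that is, those requiring at least two deletions.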

Studying H v
Once preprocessing is no longer possible, we fix an arbitrary vertex v in G, and let C be its connected component. Our goal is to 'resolve' the neighbourhood of v: either decide to delete v, or guess the 'correct' vertex cover of H v . However, if we implement this in a straightforward manner, we do not get the time bound promised by Theorem 1. To achieve this bound, we carefully study the cases where H v has small vertex cover or has special structure, and discover some possible greedy decisions that can be made.
We would like to make a decision depending on the size of the minimum vertex cover of $H_v$. As $C$ is not a clique, by Lemma 3 $H_v$ contains at least one edge, thus MinVC($H_v$) $\geq 1$. We first note that we can distinguish small vertex covers of $H_v$ in linear time.
Lemma 11. In linear time, we can determine whether H v has minimum vertex cover of size 1, of size 2, or of size at least 3. Moreover, in the first two cases we can find the vertex cover in the same time bound.
Proof. We use Lemma 9 to find, in linear time, a vertex $w$ of degree at least 3 in $H_v$, or to generate $H_v$ explicitly.
In the latter case, $H_v$ has vertices of degree at most 2. Then $H_v$ consists of paths and cycles, and we can find the size of the minimum vertex cover in linear time. We use the fact that a path with $l$ vertices requires $\lfloor l/2 \rfloor$ vertices, and a cycle with $l$ vertices requires $\lceil l/2 \rceil$ vertices in the vertex cover. If we find a vertex $w$ of degree at least 3 in $H_v$, then $w$ must be in any vertex cover of size at most 2: otherwise, $N_{H_v}(w)$ must be in the vertex cover, but $|N_{H_v}(w)| \geq 3$. We proceed to delete $w$ and restart the algorithm of Lemma 9 on the remaining graph to check if it has a vertex cover of size 0 or 1. We perform at most 2 such restarts. Finally, if we do not find a vertex cover of size at most 2, it must be the case that the minimum vertex cover contains at least 3 vertices.
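The counting step for the degree-at-most-2 case can be sketched as follows. The code assumes the graph is given explicitly as an adjacency dict and is simple, so every cycle has at least 3 vertices; the function name is hypothetical.

```python
def min_vertex_cover_size_paths_and_cycles(adj):
    """Minimum vertex cover size of a graph whose maximum degree is at most 2."""
    if any(len(nbrs) > 2 for nbrs in adj.values()):
        raise ValueError("expected maximum degree at most 2")
    seen, total = set(), 0
    for start in adj:
        if start in seen:
            continue
        component, stack = [], [start]
        seen.add(start)
        while stack:
            x = stack.pop()
            component.append(x)
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        l = len(component)
        # A component is a cycle iff all of its vertices have degree 2.
        is_cycle = all(len(adj[x]) == 2 for x in component)
        total += (l + 1) // 2 if is_cycle else l // 2
    return total


seagull = {"x": {"y"}, "y": {"x", "z"}, "z": {"y"}}
print(min_vertex_cover_size_paths_and_cycles(seagull))
```

Each vertex and edge is visited a constant number of times, so the count is obtained in time linear in the size of the explicit graph.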
We now make a few important observations about H v that will enable us to do some greedy choices in the future.
Lemma 12. Suppose $X$ is a vertex cover of $H_v$. Then there is a solution $S$ such that either $v \notin S$ or $|X \setminus S| \geq 2$.
Proof. Suppose $S$ is a solution such that $v \in S$ and $|X \setminus S| \leq 1$. Consider $T = (S \setminus \{v\}) \cup X$. Clearly, $|T| \leq |S|$. Since $T$ contains $X$, a vertex cover, by Lemma 5 the connected component of $v$ in $G \setminus T$ is a clique. Thus, there is no $P_3$ containing $v$ in $G \setminus T$. Moreover, any $P_3$ in $G \setminus T$ which does not include $v$ would also be contained in $G \setminus S$, contradicting the fact that $S$ is a modulator. We obtain that $T$ is also a modulator. Hence, $T$ is a solution, and $v \notin T$.
Corollary 13. If MinVC($H_v$) $= 1$ then there is a solution $S$ not containing $v$.
Proof. Let $X$ be a minimum vertex cover of $H_v$, and let $S$ be a solution promised by Lemma 12 for the vertex cover $X$. Then $v \notin S$, as $|X \setminus S| \leq |X| = 1$.
Proof. Assume the contrary. Consider a component $C'$ of $C \setminus \{v\}$ which is not a clique. Since $v$ must be adjacent to each connected component of $C \setminus \{v\}$, $C' \cap N_1$ must be non-empty. For any $w \in C' \cap N_1$, we have that $w_1, w_2 \neq w$ and $ww_1, ww_2 \notin E$, since otherwise the result follows. If $uw \in E$ with $u \in N_2$, then, as $\{w_1, w_2\}$ is a vertex cover, we must have $u = w_1$ or $u = w_2$. We would then have $w_1$ or $w_2$ contained in a non-clique component $C'$, contradicting our assumption. Hence $uw \in E \Rightarrow u \in N_1$. Thus $C' \subseteq N_1$. As $w_1$ and $w_2$ are not contained in $C'$ and they cover all edges in $H_v$, $C'$ must be an independent set in $H_v$. In $G \setminus \{v\}$, therefore, $C'$ must be a clique, a contradiction.

Lemma 15.
Let v ∈ V . Suppose that H v is an s-skein. Then there is a solution S such that v / ∈ S.
Proof. Let $H_v$ consist of seagulls $(x_1, y_1, z_1), (x_2, y_2, z_2), \ldots, (x_s, y_s, z_s)$. That is, the middle vertices $y_i$ are in $N_1$, while the endpoints $x_i$ and $z_i$ are in $N_2$. If $s = 1$, $\{y_1\}$ is a vertex cover of $H_v$ and Corollary 13 yields the result. Henceforth, we assume $s \geq 2$. As $X$ consider the set $N_1$ with all the vertices isolated in $H_v$ removed, that is, $X = \{y_1, \ldots, y_s\}$. Clearly $X$ is a vertex cover of $H_v$, thus we may use $X$ as in Lemma 12 and obtain a solution $S$. If $v \notin S$ we are done, so let us assume $|X \setminus S| \geq 2$. Take an arbitrary $i$ such that $y_i \in X \setminus S$. As $|X \setminus S| \geq 2$, we may pick another $j \neq i$ with $y_j \in X \setminus S$. The crucial observation from the definition of $H_v$ is that $(y_j, y_i, x_i)$ and $(y_j, y_i, z_i)$ are $P_3$s in $G$. As $y_i, y_j \notin S$, we have $x_i, z_i \in S$. Hence, since the choice of $i$ was arbitrary, we infer that for each $1 \leq i \leq s$ either $y_i \in S$ or $x_i, z_i \in S$, and, consequently, $S$ contains a vertex cover of $H_v$. By Lemma 5, $S \setminus \{v\}$ is also a modulator in $G$, a contradiction.

Branching steps
We are now ready to present the branching steps of our algorithm. We assume the preprocessing (Lemma 10) is done and a vertex $v$ is picked. We first run the algorithm of Lemma 11 to determine if $H_v$ has a small minimum vertex cover. Second, we run the algorithm of Lemma 9 to check whether $H_v$ is an $s$-skein for some $s$.
We consider the following cases. In the first case, we first delete $v$ from the graph and decrease $k$ by one. Then we check whether the connected component containing $w_1$ or $w_2$ is not a clique; by Lemma 14, for some $w \in \{w_1, w_2\}$, the connected component of $G \setminus \{v\}$ containing $w$ is not a clique, and finding such $w$ clearly takes linear time. We invoke the algorithm of Section 3.2 on $H_w$.
In the second case, we invoke the algorithm of Section 3.2 on H v .

3. MinVC($H_v$) $\geq 3$ and $H_v$ is not an $s$-skein for any $s \geq 3$. We branch into two cases: we look for a solution containing $v$ or not containing $v$. In the first branch, we simply delete $v$ and decrease $k$ by one. In the second branch, we invoke the algorithm of Section 3.2 on $H_v$.

Complexity analysis
In the previous discussion we have argued that invoking each branching step takes linear time. As in each branch we decrease the parameter $k$ by at least one, the depth of the recursion is at most $k$. In this section we analyse branching vectors occurring in our algorithm. To finish the proof of Theorem 1 we need to show that the largest positive root of the equation $1 = \sum_{i=1}^{r} x^{-a_i}$ among all possible branching vectors $(a_1, a_2, \ldots, a_r)$ is strictly less than 1.9102.
As the number of resulting branching vectors in the analysis is rather large, we use a Python script for automated analysis (attached in the appendix). The main reason for a large number of branching vectors is that we need to analyse branchings on the graph H v in case when we consider v not to be included in the vertex cover. Let us now proceed with formal arguments.
In a few places, the algorithm of Section 3.2 is invoked on the graph $H_v$ and we know that MinVC($H_v$) $\geq h$ for some integer $h$. Consider the branching tree $T$ of this algorithm. For a node $x \in V(T)$, the depth of $x$ is the number of vertices of $H_v$ deleted on the path from $x$ to the root. We mark some nodes of $T$. Each node of depth less than $h$ is marked. Moreover, if a node $x$ is of depth $d < h$ and the branching step at node $x$ has branching vector $(1, 2)$, we infer that the graph $H_v$ at this node is an $s$-skein for some $s \geq h - d$, and all descendants of $x$ in $V(T)$ are also nodes with branching steps with vectors $(1, 2)$. In this case, we mark all descendants of $x$ that are within distance (in $T$) less than $h - d$. Note that in this way we may mark some descendants of $x$ of depth equal to or larger than $h$.
We split the analysis of an application of the algorithm of Section 3.2 into two phases: the first one contains all branching steps performed on marked nodes, and the second on the remaining nodes. In the second phase, we simply observe that each branching step has branching vector not worse than (1,2). In the first phase, we aim to write a single branching vector summarizing the phase, so that with its help we can balance the loss from other branches when v is deleted from the graph.
The main property of the marked nodes in T is that their existence is granted by the assumption MinVC(H v ) ≥ h. That is, each leaf of T has depth at least h, and, if at some node x of depth d < h the graph H v is an s-skein, we infer that s ≥ h − d (as the size of minimum vertex cover of an s-skein is s) and the algorithm performs s independent branching steps with branching vectors (1,2) in this case. Overall, no leaf of T is marked.
To analyse such branchings for $h = 2$ and $h = 3$ we employ the Python script supplied in the appendix. The procedure branch_Hv generates all possible branching vectors for the first branch, assuming the algorithm of Section 3.2 is allowed to pick branching vectors $(1)$, $(1, 3)$, $(2, 2)$ or $(1, 2)$ (the option allow_skein enables/disables the use of the $(1, 2)$ vector in the first branch). Note that all other vectors described in Section 3.2 may be simulated by applying a number of vectors $(1)$ after one of the aforementioned branching vectors.
Let us now move to the analysis of the algorithm of Section 4.3.
In Case 1 the algorithm of Section 3.2 performs branchings with vectors not worse than (1,2). Consider now Case 2. If v is deleted, we apply the algorithm of Section 3.2 to H w , yielding at least one branching step (as the connected component with w is not a clique). Hence, after this first branching step, we have either one subcase with parameter drop at least 2, or two subcases with parameter drops at least 2 and at least 3. Clearly, the second case yields worse branching vector.
If $v$ is not deleted, the algorithm of Section 3.2 is applied to $H_v$. The script invokes the procedure branch_Hv on $h = 2$ and allow_skein=False to obtain a list of possible branching vectors. For each such vector, we append entries $(2, 3)$ from the subcase when $v$ is deleted.
Case 3 is analysed analogously. The script invokes the procedure branch_Hv on $h = 3$ and allow_skein=False to obtain a list of possible branching vectors. For each such vector, we append the entry $(1)$ from the subcase when $v$ is deleted.
We infer that the largest root of the equation $1 = \sum_{i=1}^{r} x^{-a_i}$ occurs for the branching vector $(1, 3, 3, 4, 4, 5)$ and is less than 1.9102. This branching vector corresponds to Case 3 where the algorithm of Section 3.2, invoked on $H_v$, first performs a branching step with the vector $(1, 3)$ and, in the branch with 1 deleted vertex, finds $H_v$ to be a 2-skein and performs two independent branching steps with vectors $(1, 2)$. This analysis concludes the proof of Theorem 1.
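The way this worst-case vector is assembled can be retraced with a short Python sketch. It is independent of the script in the appendix, and the helper names are hypothetical. Since $\sum_i x^{-a_i}$ is strictly decreasing for $x > 0$, it suffices to evaluate it at two points to bracket the root.

```python
from itertools import product


def continue_branch(drop, steps):
    """Parameter drops when independent steps follow a branch that already dropped `drop`."""
    return [drop + sum(choice) for choice in product(*steps)]


def excess(vector, x):
    """Value of sum(x ** -a) - 1; positive to the left of the root, negative to the right."""
    return sum(x ** -a for a in vector) - 1.0


worst = sorted(
    [1]                                     # subcase where v is deleted
    + [3]                                   # the 3-branch of the (1, 3) step on H_v
    + continue_branch(1, [(1, 2), (1, 2)])  # the 1-branch, followed by a 2-skein
)
print(worst)
print(excess(worst, 1.91) > 0 > excess(worst, 1.9102))
```

The first line printed is the vector $(1, 3, 3, 4, 4, 5)$, and the sign change confirms that its root lies between 1.91 and 1.9102.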

Conclusions and open problems
We have presented a new branching algorithm for Cluster Vertex Deletion. We hope our work will trigger a race for faster FPT algorithms for ClusterVD, as was the case for the famous Vertex Cover problem. Following Hüffner et al. [23], we would like to re-pose here the question of a linear vertex kernel for ClusterVD. As ClusterVD is a special case of the 3-Hitting Set problem, it admits an $O(k^2)$-vertex kernel in the unweighted case and an $O(k^3)$-vertex kernel in the weighted one [1,2]. However, Cluster Editing is known to admit a much smaller $2k$-vertex kernel, so there is hope for a similar result for ClusterVD.