1 Introduction

Motivated by various practical network applications, many different vulnerability measures of undirected graphs have been introduced and studied in the literature. The two most studied of such measures are vertex and edge connectivity of an undirected graph. However, these two measures often do not capture the more subtle vulnerability properties of networks that one might wish to consider, such as the number of vertices in the largest remaining connected component.

While both undirected and directed graphs are of great interest in graph theory, algorithms and applications, undirected graphs have been studied much more than their directed counterparts, arguably due to the simpler structure of undirected graphs. In this paper, we study a number of parameterizations of a problem of interest in both theory and applications, which so far has mainly been studied for undirected graphs.

In many networks, the underlying graph is directed rather than undirected and the aim of this paper is to study an extension to directed graphs of the \(\ell \)-component order connectivity of an undirected graph G, which is the size of a minimum set \(X\subseteq V(G)\) such that \(\mathrm{mco}(G-X)\le \ell ,\) where \(\mathrm{mco}(G-X)\) is the number of vertices in the largest connected component of \(G-X\) (mco stands for maximum component order). By Component Order Connectivity we will denote the following decision problem:

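Component Order Connectivity

Input: An undirected graph G and two integers k and \(\ell \).

Question: Is there a set \(X\subseteq V(G)\) with \(|X|\le k\) such that \(\mathrm{mco}(G-X)\le \ell \)?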

For a survey on Component Order Connectivity, see Gross et al. [14]; for more recent research on the problem, see e.g. [11, 16, 17].

For a directed graph D,  we define the \(\ell \) -component order connectivity as the size of a minimum set \(X\subseteq V(D)\) such that \(\mathrm{mco}(D-X)\le \ell ,\) where \(\mathrm{mco}(D-X)\) is the number of vertices in the largest strongly connected component of \(D-X.\) Using this definition of \(\mathrm{mco}(D-X),\) we can state the following directed version of Component Order Connectivity.

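Directed Component Order Connectivity

Input: A directed graph \(D=(V,A)\) and two integers k and \(\ell \).

Question: Is there a set \(X\subseteq V(D)\) with \(|X|\le k\) such that \(\mathrm{mco}(D-X)\le \ell \)?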

In what follows, we will assume without loss of generality that \(k+\ell < n=|V|\) (or, equivalently, \(k< n-\ell \)). Indeed, if \(k+\ell \ge n\) then our instance is a YES-instance, since deleting any set X of k vertices leaves at most \(n-k\le \ell \) vertices and hence \(\mathrm{mco}(D-X)\le \ell .\)
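To make the definition of \(\mathrm{mco}(D-X)\) concrete, the following Python sketch computes strong components with Kosaraju's two-pass search and then decides a small instance by exhaustive search. The function names (`strong_components`, `mco`, `dcoc_brute_force`) and the representation of a digraph as a vertex list plus a list of ordered pairs are our own assumptions for illustration; the exhaustive search is exponential and is not one of the algorithms studied in this paper.

```python
from itertools import combinations

def strong_components(vertices, arcs):
    """Strong components of a digraph as a list of sets (Kosaraju's algorithm)."""
    vertices = list(vertices)
    vset = set(vertices)
    out = {v: [] for v in vertices}
    rin = {v: [] for v in vertices}
    for u, v in arcs:
        if u in vset and v in vset:      # ignore arcs touching deleted vertices
            out[u].append(v)
            rin[v].append(u)
    # First pass: record vertices in order of DFS completion.
    order, seen = [], set()
    for s in vertices:
        if s in seen:
            continue
        seen.add(s)
        stack = [(s, iter(out[s]))]
        while stack:
            v, it = stack[-1]
            for w in it:
                if w not in seen:
                    seen.add(w)
                    stack.append((w, iter(out[w])))
                    break
            else:                        # all out-neighbours of v explored
                order.append(v)
                stack.pop()
    # Second pass: sweep the reversed digraph in decreasing completion time.
    comps, assigned = [], set()
    for s in reversed(order):
        if s in assigned:
            continue
        comp, todo = {s}, [s]
        assigned.add(s)
        while todo:
            v = todo.pop()
            for w in rin[v]:
                if w not in assigned:
                    assigned.add(w)
                    comp.add(w)
                    todo.append(w)
        comps.append(comp)
    return comps

def mco(vertices, arcs, removed=()):
    """Number of vertices in a largest strong component of D - removed."""
    gone = set(removed)
    rest = [v for v in vertices if v not in gone]
    return max((len(c) for c in strong_components(rest, arcs)), default=0)

def dcoc_brute_force(vertices, arcs, ell, k):
    """Decide (D, ell, k) by trying every deletion set of size at most k."""
    vertices = list(vertices)
    return any(mco(vertices, arcs, X) <= ell
               for size in range(min(k, len(vertices)) + 1)
               for X in combinations(vertices, size))
```

For a directed triangle with one pendant out-arc, `mco` is 3, and deleting any triangle vertex brings it down to 1.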

Clearly, Directed Component Order Connectivity is a generalization of Component Order Connectivity (each instance \((G,\ell ,k)\) of Component Order Connectivity corresponds to an equivalent instance \((D,\ell ,k)\) of Directed Component Order Connectivity, where D is obtained from G by replacing every edge of G by a directed 2-cycle). For \(\ell =1,\) while Component Order Connectivity is equivalent to the Vertex Cover problem, Directed Component Order Connectivity is equivalent to the Directed Feedback Vertex Set problem. Unlike Vertex Cover whose fixed-parameter tractability is very easy to show, a fact that was known very early on in parameterized algorithmics [9], fixed-parameter tractability of Directed Feedback Vertex Set was a long-standing open problem until Chen et al. [7] in 2008 proved its fixed-parameter tractability by designing a \(4^kk!n^{{\mathcal {O}}(1)}\)-time algorithm. (We provide basics on parameterized algorithms and complexity in the next section.)
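The reduction from the undirected to the directed problem is simple enough to state as code. The sketch below, with the hypothetical helper name `directed_instance`, assumes edges are given as pairs and returns the arc set of the digraph D in which every edge is replaced by a directed 2-cycle; the integers \(\ell \) and k are carried over unchanged.

```python
def directed_instance(edges):
    """Replace each undirected edge {u, v} by the directed 2-cycle u -> v -> u."""
    arcs = set()
    for u, v in edges:
        arcs.add((u, v))
        arcs.add((v, u))
    return arcs
```

Since two vertices are joined by a path in G exactly when each can reach the other in D, the connected components of \(G-X\) and the strong components of \(D-X\) have the same vertex sets.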

Since Component Order Connectivity is NP-complete (it remains NP-complete even for split, co-bipartite and chordal undirected graphs [11]), a number of researchers studied Component Order Connectivity using the framework of parameterized algorithmics, see e.g. [11, 16, 17]. Göke, Marx and Mnich [13] were the first to study the Directed Component Order Connectivity problem from the viewpoint of parameterized algorithms and complexity. They obtained an algorithm of running time \(4^k(k\ell +k+\ell )!n^{{\mathcal {O}}(1)},\) which is close to the complexity of the algorithm of Chen et al. [7] when \(\ell =1\). Thus, Directed Component Order Connectivity parameterized by \(k+\ell \) is fixed-parameter tractable (FPT).

We will continue the study of Directed Component Order Connectivity using parameterized algorithms and complexity. In particular, as in papers [11, 16, 17] which studied Component Order Connectivity, we study Directed Component Order Connectivity parameterized by three parameters: \(\ell \), k and \(\ell +k.\) We will denote the corresponding parameterized problems by Directed Component Order Connectivity[p], where p is the parameter.

Moreover, we introduce and study a new parameterization of Directed Component Order Connectivity: parameter \(n-\ell ,\) where n is the number of vertices in D. One reason to introduce Directed Component Order Connectivity[\(n-\ell \)] is that normally one requires the parameters to be relatively small compared to the size of the problem under consideration. However, if k is small it is possible that for every \(X\subseteq V(D)\) of size k, \(\mathrm{mco}(D-X)\) is not much smaller than \(n-k.\) Then \(n-\ell \) can be much smaller than \(\ell .\)

Since Component Order Connectivity is equivalent to the Vertex Cover problem for \(\ell =1\), Component Order Connectivity[\(\ell \)] is para-NP-complete. Drange et al. [11, Theorem 8] proved that Component Order Connectivity[\(k\)] is W[1]-hard even on split graphs. In their construction, \(n-\ell ={\mathcal {O}}(k^2).\) Hence, Component Order Connectivity[\(n-\ell \)] is also W[1]-hard. They also showed that Component Order Connectivity[\(\ell +k\)] is FPT by obtaining an algorithm of running time \(2^{{\mathcal {O}}(k\log \ell )}n.\) The above-mentioned results are summarized in the undirected graphs row of Table 1.

A directed graph D is semicomplete if for every pair xy of distinct vertices of D, there is an arc between x and y. When we require that there is only one arc between x and y then we obtain a definition of a tournament. Clearly, the hardness results for the directed graphs row of Table 1 follow from the corresponding results in the undirected graphs row for columns \(n-\ell \) and k. Directed Component Order Connectivity[\(\ell \)] is para-NP-complete for semicomplete digraphs as Directed Component Order Connectivity on semicomplete digraphs is NP-complete for \(\ell =1\). This follows from the fact that Directed Feedback Vertex Set is NP-complete even for tournaments, as proved by Bang-Jensen and Thomassen [3] and Speckenmeyer [19].
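The two definitions differ only in whether both arcs between a pair of vertices are allowed. A minimal sketch of the corresponding checks follows; the function names and the arc-list representation are assumptions made for illustration.

```python
from itertools import combinations

def is_semicomplete(vertices, arcs):
    """Every pair of distinct vertices is joined by at least one arc."""
    A = set(arcs)
    return all((x, y) in A or (y, x) in A
               for x, y in combinations(list(vertices), 2))

def is_tournament(vertices, arcs):
    """Every pair of distinct vertices is joined by exactly one arc."""
    A = set(arcs)
    return all(((x, y) in A) != ((y, x) in A)
               for x, y in combinations(list(vertices), 2))
```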

The FPT result in the directed graphs row of Table 1 was first obtained by Göke et al. [13], as discussed above. The running time of their algorithm is \( 4^k(k\ell +k+\ell )!n^{{\mathcal {O}}(1)} = 2^{{\mathcal {O}}(k \ell \log (k \ell ))} n^{{\mathcal {O}}(1)}. \) By modifying their algorithm, we obtain an algorithm of complexity \(2^{{\mathcal {O}}(k)} \ell ^k k! n^{{\mathcal {O}}(1)}=2^{{\mathcal {O}}(k \log (k \ell ))} n^{{\mathcal {O}}(1)},\) which decreases the asymptotic dependence of the running time on \(\ell .\) Our modification consists of replacing a branching algorithm in [13] with a randomized algorithm which can be derandomized without increasing the complexity upper bound. Note that Drange et al. [11, Theorem 14] proved that even for Component Order Connectivity on split graphs there is no algorithm of running time \(O^*(2^{o(k\log \ell )})\) (here we restrict ourselves to \(\ell =k^{O(1)}\)) unless the Exponential Time Hypothesis (ETH) [15] fails. Moreover, it is a long-standing open problem to decide whether Directed Feedback Vertex Set admits an algorithm of time complexity \(O^*(2^{o(k\log k)}).\)
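As a rough check of the two simplifications of running times, one can use only the bound \(m!\le m^m\). The following is a sketch under the assumptions \(k,\ell \ge 1\) and \(k\ell \ge 2\), with logarithms taken to base 2; note that \(k\ell +k+\ell \le 3k\ell \) in this range.

$$\begin{aligned} 4^k(k\ell +k+\ell )! &\le 2^{2k}\,(3k\ell )^{3k\ell } = 2^{2k+3k\ell \log (3k\ell )} = 2^{{\mathcal {O}}(k\ell \log (k\ell ))},\\ 2^{{\mathcal {O}}(k)}\,\ell ^k\,k! &\le 2^{{\mathcal {O}}(k)}\,\ell ^k k^k = 2^{{\mathcal {O}}(k)+k\log (k\ell )} = 2^{{\mathcal {O}}(k\log (k\ell ))}. \end{aligned}$$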

Table 1 Parameterized Complexity of (Directed) Component Order Connectivity

The most interesting entry in the semicomplete digraphs row is the non-trivial result that Directed Component Order Connectivity[\(k\)] on semicomplete digraphs is FPT. This FPT algorithm boils down to finding a shortest path in a suitably defined auxiliary weighted acyclic digraph. The running time of the algorithm is \({\mathcal {O}}(2^{16k}kn^2).\) The other two FPT entries in this row follow from this result (for the parameter \(n-\ell \) this is due to our assumption that \(k< n-\ell \)). We also prove the following lower bounds: no algorithm for Directed Component Order Connectivity[\(k\)] on semicomplete digraphs can have time complexity \(2^{o(k)}n^{{\mathcal {O}}(1)}\) unless ETH fails, and no such deterministic algorithm can run in time \(o(n^2)\) for \(k=0\) (the last bound is information-theoretic and does not depend on any computational complexity hypothesis).

Our paper is organised as follows. The next section is devoted to terminology and notation on directed and undirected graphs, and basics on parameterized algorithms and complexity. In Sect. 3, we describe our improvement on the algorithm of Göke et al. [13]. In Sect. 4, we prove that Directed Component Order Connectivity[\(k\)] on semicomplete digraphs admits an algorithm of running time \({\mathcal {O}}^*(2^{16k})\) and show the lower bounds on the running time with parameters k and \(n-\ell \). We conclude the paper in Sect. 5.

2 Preliminaries

2.1 Directed and Undirected Graph Terminology and Notation

In this paper, all directed and undirected graphs are finite, without loops or parallel edges. As is often the case in directed graph theory, an edge of a digraph will be called an arc, and the vertex and arc sets of a digraph D will be denoted by V(D) and A(D), respectively. The out-neighbourhood and in-neighbourhood of a vertex x of a digraph D are denoted by \(N^+_D(x)=\{y\in V(D):\ xy\in A(D)\}\) and \(N^-_D(x)=\{y\in V(D):\ yx\in A(D)\}\), respectively, and the subscript D will be omitted if D is clear from the context. The out-degree and in-degree of a vertex x of D are \(d^+_D(x)=|N^+_D(x)|\) and \(d^-_D(x)=|N^-_D(x)|,\) respectively.

In this paper all paths and cycles in digraphs are directed, so we will omit the adjective ‘directed’ when referring to paths and cycles in digraphs. If \(D=(V,A)\) is a digraph and \(S\subseteq V\), then we denote by D[S] the subdigraph induced by the vertices in S. A digraph D is strongly connected (or, just strong) if there is a path from x to y for every ordered pair xy of distinct vertices. A strong component of a digraph D is a maximal strong induced subgraph of D. Strong components of D do not share vertices and can be ordered \(D_1,D_2,\dots ,D_p\) such that there is no arc in D from \(V(D_j)\) to \(V(D_i)\) when \(j>i.\) Such an ordering is called an acyclic ordering. Note that if D is a semicomplete digraph, then the strong components of D have a unique acyclic ordering \(D_1,D_2,\dots ,D_p\) and we have \(xy\in A(D)\) for every \(x\in V(D_i),\ y\in V(D_j),\ i<j.\)

Basic digraph terminology not introduced in this section can be found in [1, 2].

2.2 Parameterized Complexity

An instance of a parameterized problem \(\Pi \) is a pair (I, k) where I is the main part and k is the parameter; the latter is usually a non-negative integer. A parameterized problem is fixed-parameter tractable (FPT) if there exists a computable function f such that instances (I, k) can be solved in time \({\mathcal {O}}(f(k)|{I}|^c)\) where |I| denotes the size of I and c is an absolute constant. The class of all fixed-parameter tractable decision problems is called FPT and algorithms which run in the time specified above are called FPT algorithms. As in other literature on FPT algorithms, we will sometimes omit the polynomial factor in \({\mathcal {O}}(f(k)|{I}|^c)\) and write \({\mathcal {O}}^*(f(k))\) instead.

While FPT is a parameterized complexity analog of P in classic complexity, there are many hardness classes in parameterized complexity and they form a nested sequence starting from W[1]. It is well known [8, Chapter 14] that if the Exponential Time Hypothesis holds then FPT\(\ne \)W[1]. Due to this and other complexity results, it is widely believed that FPT\(\ne \)W[1] and hence W[1] is viewed as a parameterized analog of NP in classical complexity.

para-NP is the class of parameterized problems which can be solved by a nondeterministic algorithm in time \({\mathcal {O}}(f(k)|{I}|^c),\) where f is a computable function and c is an absolute constant. It is well known that if a problem \(\Pi \) with parameter \(\kappa \) is NP-hard when \(\kappa \) equals some constant, then \(\Pi \) is para-NP-hard [12, Corollary 2.16]. It is also well known that FPT=para-NP if and only if P=NP [12, Corollary 2.13].

For more information on parameterized algorithms and complexity, see recent books [8, 10, 12].

3 Directed Component Order Connectivity[\(\ell +k\)] on General Digraphs

Göke, Marx and Mnich [13] showed that Directed Component Order Connectivity[\(\ell +k\)] is FPT with a running time given by

$$\begin{aligned} 4^k(k\ell +k+\ell )!n^{{\mathcal {O}}(1)} = 2^{{\mathcal {O}}(k \ell \log (k \ell ))} n^{{\mathcal {O}}(1)}. \end{aligned}$$

The core of their algorithm is as follows. Begin with the iterative compression version of the problem, where in addition to \((D, \ell , k)\) the input also contains a solution \(X_0\) with \(|X_0|=k+1\), which can be used to guide the search for a smaller solution. This is a standard ingredient in FPT algorithms; see, e.g., [8]. At the cost of a simple branching step, we may also assume that we are looking for a solution X with \(X \cap X_0 = \emptyset \). Next, they observe that if we knew the strongly connected components of \(D-X\) that the vertices of \(X_0\) are contained in, then the problem reduces to a previously studied, simpler problem known as Skew Separator [7], which occurs in the design of the FPT algorithm for Directed Feedback Vertex Set (DFVS) of Chen et al. [7]. Indeed, if the precise strong components containing the vertices of \(X_0\) are known, then the problem can be solved in time \(O^*(4^kk!)\) using a strategy much like that for DFVS [7, 13]. Hence the bottleneck of the current best algorithm for Directed Component Order Connectivity[\(\ell +k\)] is the guessing of the strong components of \(X_0\) in \(D-X\).

Göke et al. [13] solve this via a branching algorithm that they analyse as taking time at most \((k\ell + k + \ell )!\). We show a simpler randomized method solving this problem with an improved time bound of

$$\begin{aligned} 2^{{\mathcal {O}}(k)}\left( {\begin{array}{c}\ell (k+1) + k\\ k\end{array}}\right) \le 2^{{\mathcal {O}}(k)} (e (\ell + 1+\ell /k))^k \le 2^{{\mathcal {O}}(k)}(3e\ell )^k=2^{{\mathcal {O}}(k)}\cdot \ell ^k. \end{aligned}\tag{1}$$

The method can be derandomized by standard methods.

Lemma 3.1

Let \((D, \ell , k)\) be an instance of Directed Component Order Connectivity[\(\ell +k\)], and let \(X_0\) be a solution with \(|X_0| = k+1\). Let X be an unknown solution with \(|X| \le k\) such that \(X \cap X_0 = \emptyset \). There is a randomized procedure that with success probability at least

$$\begin{aligned} \left( 2^{{\mathcal {O}}(k)} \left( {\begin{array}{c}\ell k + \ell + k\\ k\end{array}}\right) \right) ^{-1} \end{aligned}$$

computes a set \(S \subset V(D)\) such that for every \(x \in X_0\), the strong components containing x in \(D-X\) and in D[S] are identical.

Proof

Initialize \(S=X_0\), then for every vertex \(v \in V(D) \setminus X_0,\) place v in S independently at random with probability \(p=1-1/(\ell +1)\). We declare a guess a success if the following conditions apply:

  1. For every \(x \in X_0\) we have \(V_x \subseteq S\), where \(V_x \subseteq V\) is the strong component of \(D-X\) containing x.

  2. \(X \cap S = \emptyset \).

Let \(Y=\bigcup _{x \in X_0} V_x\). Note that \(|Y|\le \ell (k+1)\). Our guess is successful if and only if \(v \in S\) for every \(v \in Y\), and \(v \notin S\) for every \(v \in X\). Since these are independent events, this clearly happens with probability

$$\begin{aligned} p^{|Y|-|X_0|}(1-p)^{|X|}\ge & {} p^{\ell (k+1)}(1-p)^k\\= & {} (1/(1+1/\ell )^\ell )^{k+1} (1/(\ell +1))^k\\\ge & {} (1/e)^{k+1} (1/(\ell +1))^k. \end{aligned}$$

Above we used the bound \(1+a\le e^a\) (\(a\ge 0\)), where we set \(a=1/\ell \). By the inequality \( (1/e)^{k+1} (1/(\ell +1))^k\ge 1/(2^{{\mathcal {O}}(k)} \left( {\begin{array}{c}\ell k + \ell + k\\ k\end{array}}\right) )\), we conclude that the success probability matches the bound in the lemma.

Now assume that the guess was successful for some set S and consider the strong component of x in D[S] for some \(x \in X_0\). Let \(V_x'\) be this strong component. Since \(D[V_x]\) is strongly connected and \(V_x \subseteq S\), we have \(V_x \subseteq V_x'\). On the other hand, by assumption D[S] is an induced subgraph of \(D-X\), and since \(V_x\) is a strongly connected component in \(D-X\) we must have \(V_x' \subseteq V_x\). We conclude \(V_x=V_x'\) for each \(x \in X_0\), as required. \(\square \)
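The sampling step of the proof can be sketched in a few lines of Python. The names `sample_candidate_set`, `guess_succeeds` and `success_lower_bound` are hypothetical, and in an actual algorithm the sets Y and X are of course unknown; `guess_succeeds` only spells out the event whose probability is bounded in the proof.

```python
import math
import random

def sample_candidate_set(vertices, X0, ell, rng):
    """Start from S = X0 and keep every other vertex with probability 1 - 1/(ell + 1)."""
    p = 1.0 - 1.0 / (ell + 1)
    base = set(X0)
    S = set(base)
    for v in vertices:
        if v not in base and rng.random() < p:
            S.add(v)
    return S

def guess_succeeds(S, Y, X):
    """Success event of the proof: Y lies inside S and S avoids X."""
    S = set(S)
    return set(Y) <= S and not (set(X) & S)

def success_lower_bound(ell, k):
    """The quantity (1/e)^(k+1) * (1/(ell+1))^k from the end of the calculation."""
    return math.exp(-(k + 1)) * (ell + 1) ** (-k)
```

Repeating the sampling on the order of `1 / success_lower_bound(ell, k)` times makes at least one success likely; this is the usual amplification argument and is stated here as a remark rather than as part of the lemma.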

For the derandomization, we employ a cover-free family construction of Bshouty and Gabizon [4]. We get the following lemma.

Lemma 3.2

Let \((D, \ell , k)\) be an instance of Directed Component Order Connectivity[\(\ell +k\)], and let \(X_0\) be a solution with \(|X_0| = k+1\). Let X be an unknown solution with \(|X| \le k\) such that \(X \cap X_0 = \emptyset \). There is a deterministic procedure that produces a set \({\mathcal {F}}\subseteq 2^V\) with

$$\begin{aligned} |{\mathcal {F}}| = \left( {\begin{array}{c}\ell k + \ell + k\\ k\end{array}}\right) ^{1+o(1)} \log |V| \end{aligned}$$

in time \({\mathcal {O}}(|{\mathcal {F}}| n)\), such that there is a set \(S \in {\mathcal {F}}\) such that for every \(x \in X_0\), the strong components containing x in \(D-X\) and in D[S] are identical.

Proof

Let \(r \le s < n\) be integers. Bshouty and Gabizon (in a slightly non-standard definition) define an (n, (r, s))-cover free family as a set \({\mathcal {F}}\subseteq \{0,1\}^n\), where each vector is identified with the subset of [n] it indicates, such that for every disjoint pair of sets \(A, B \subseteq [n]\) with \(|A|=r\) and \(|B|=s\) there is a set \(S \in {\mathcal {F}}\) such that \(A \subseteq S\) and \(B \cap S = \emptyset \). Bshouty and Gabizon [4] show how to compute an (n, (r, s))-cover free family \({\mathcal {F}}\) of size

$$\begin{aligned} |{\mathcal {F}}| = \left( {\begin{array}{c}r+s\\ r\end{array}}\right) ^{1+o(1)} \log n \end{aligned}$$

in time \({\mathcal {O}}(|{\mathcal {F}}| n)\).

For \(x \in X_0\), let \(V_x \subseteq V\) be the strong component containing x in \(D-X\), and let \(Y = \bigcup _{x \in X_0} V_x\). As in Lemma 3.1, it suffices to guarantee that there is a set \(S \in {\mathcal {F}}\) such that \(Y \subseteq S\) and \(X \cap S = \emptyset \). This guarantee is achieved by constructing a cover-free family with parameters \(n=|V(D)|\), \(r=\ell (k+1)\) and \(s=k\). Here \(r>s\), but we can simply compute an (n, (s, r))-cover free family and take the complement of every member. Hence we get a family of size

$$\begin{aligned} \left( {\begin{array}{c}\ell k+\ell +k\\ k\end{array}}\right) ^{1+o(1)}\log n \end{aligned}$$

computed in output-linear time. \(\square \)
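The complement trick used at the end of the proof can be checked by brute force on tiny ground sets. The sketch below is only a verifier for the cover-free property, not the construction of Bshouty and Gabizon [4]; the function names are hypothetical and, as an assumption for convenience, the ground set is taken to be \(\{0,\dots ,n-1\}\) rather than [n].

```python
from itertools import combinations

def is_cover_free(n, family, r, s):
    """Brute-force check of the (n, (r, s))-cover-free property (tiny n only)."""
    ground = range(n)
    for A in combinations(ground, r):
        rest = [v for v in ground if v not in A]
        for B in combinations(rest, s):
            # Some member must contain all of A and avoid all of B.
            if not any(set(A) <= S and not (set(B) & S) for S in family):
                return False
    return True

def complement_family(n, family):
    """Complement every member with respect to the ground set {0, ..., n-1}."""
    full = frozenset(range(n))
    return [full - S for S in family]
```

For example, the four singletons of a 4-element ground set form a (4, (1, 2))-cover free family, and their complements form a (4, (2, 1))-cover free family.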

The two lemmas of this section and (1) imply the following:

Theorem 3.1

There is a randomized FPT algorithm that solves Directed Component Order Connectivity[\(\ell +k\)] in time \(2^{{\mathcal {O}}(k)} \ell ^k k! n^{{\mathcal {O}}(1)}\) with probability at least \(\Omega (1)\). The algorithm can be derandomized in the same time, up to a lower-order overhead factor.

4 Directed Component Order Connectivity on Semicomplete Digraphs

Let us first summarize the main ideas behind our FPT algorithm, before providing more technical details. Let \(D=(V,A)\) be a semicomplete digraph, let \(k,\ell \in \mathbb {N}\), and let \(X\subseteq V\) be a set of size k such that \(\mathrm{mco}(D-X)\le \ell \). The vertices of \(D-X\) can be partitioned into \(C_1,\ldots , C_q\) such that each \(C_i\) is the vertex set of a strong component of \(D-X\) and

  1. for every \(i\in [q]\) we have \(|C_i|\le \ell \), and

  2. for every \(i,j\in [q]\) with \(i<j\) and every \(x\in C_i\), \(y\in C_j\) we have \(xy\in A\) and \(yx\notin A\).

Fig. 1 An example of a valid triple \((Y_i, Z_i, S_i)\). A semicomplete digraph D, the set \(X=\bigcup _{i\in [q]}X_i\) is such that \(\mathrm{mco}(D-X)=3\) and \(C_1,\ldots , C_q\) are strong components of \(D-X\). \(Y_i=C'_1\cup C'_2\cup \cdots \cup C'_i\) and \(Z_i=C'_{i+1}\cup C'_{i+2}\cup \cdots \cup C'_q\), where \(C'_i=C_i\cup X_i\), \(i\in [q]\). The arcs uv, \(u\in C'_i\), \(v\in C'_j\) for \(i<j\) are omitted as well as the arcs within X between \(X_t\) and \(C_t\), \(t\in [q]\). The set \(S_i\) is the set of the three square vertices, one in each of \(X_i\), \(X_{i+1}\), and \(X_q\). The set \(S_i\) is a minimal vertex cover of the dashed arcs from \(Z_i\) to \(Y_i\). Note that the vertex in \(X_1\) is not in \(S_i\) as the arc incident to it with the tail in \(Z_i\) is already covered by \(S_i\). Note also the hollow circle vertex in \(X_i\): the only reason it is in X is to reduce the size of \(C_i\), and as such it will not appear in any \(S_j\), \(j\in [q]\), in the set of q valid triples defining these components.

In our algorithm, we would like to discover the strong components one by one in ascending order from \(C_1\) to \(C_q\). Now let \(X_1,\ldots , X_q\) be a partition of X into q (possibly empty) parts and let, for each \(i\in [q]\), \(Y_i=C'_1\cup C'_2\cup \cdots \cup C'_i\) and \(Z_i=C'_{i+1}\cup C'_{i+2}\cup \cdots \cup C'_q\), where \(C'_i=C_i\cup X_i\), \(i\in [q]\). Moreover, let \(S_i\) be a subset of X such that for each \(y\in Y_i\setminus S_i\) and \(z\in Z_i\setminus S_i\) we have \(yz\in A\) and \(zy\notin A\). See also Fig. 1. Note that, given \(S_i\), it suffices to solve our problem in the subdigraphs \(D[Y_i\setminus S_i]\) and \(D[Z_i\setminus S_i]\) separately. Moreover, the set \((Y_{i+1}\setminus Y_{i}) \setminus (S_{i+1}\cup S_i)\) is basically the strong component \(C_{i+1}\) up to a few vertices in \(X_{i+1}\) that are not incident to any arc with tail in \(Z_{i+1}\setminus S_{i+1}\) or head in \(Y_i\setminus S_i\). Such vertices can actually be replaced in X by any vertex in \(C_{i+1}\). It follows that if we are given \((Y_1,Z_1,S_1),\ldots , (Y_q,Z_q,S_q)\), then we can easily reconstruct a solution of size |X| as \(\bigcup _{i\in [q]}S_i\) plus some arbitrary vertices of \((Y_{i+1}\setminus Y_{i}) \setminus (S_{i+1}\cup S_i)\) to have at most \(\ell \) vertices in each strong component of \(D-X\).

Therefore, our goal will be to search for triples \((Y_i, Z_i, S_i)\), \(i\in [q]\), where \(\{Y_i,Z_i\}\) is a partition of V and \(S_i\) is a minimal subset such that there is no arc zy in A with \(z\in Z_i\setminus S_i\) and \(y\in Y_i\setminus S_i\).

The first step of our proof is to show that there are at most \(2^{8k+2}n\) triples we need to consider (Lemma 4.4). We will call these important triples valid and we postpone the precise definition until later. The main reason for the bound is that we only need to consider triples \((Y_i, Z_i, S_i)\) for which \(|S_i|\le k\) and that if we fix \(|Y_i|\) (and hence also \(|Z_i|\)), then vertices with out-degree at least \(|Z_i|+|S_i|+1\) (resp. in-degree at least \(|Y_i|+|S_i|+1\)) have to be in \(Y_i\) (resp. in \(Z_i\)) or in \(S_i\) and we can fix these vertices in \(Y_i\) (resp. in \(Z_i\)). Once we bound the number of the triples we need to consider, we can define compatible pairs of triples \(\left( (Y^1, Z^1, S^1), (Y^2, Z^2, S^2)\right) \), for which \(Y^1\subset Y^2\). Loosely speaking, such a pair defines a strong component of \(D-X\) with at most \(\ell \) vertices as \((Y^2\setminus Y^1) \setminus (S^1\cup S^2)\), and the arcs from \(Z^2\) to \(Y^1\) are all hit by a vertex in \(S^1\cap S^2\). This allows us to create an auxiliary acyclic “state” digraph whose vertices are valid triples and arcs are the compatible pairs of triples. The paths from \((\emptyset , V, \emptyset )\) to \((V, \emptyset , \emptyset )\) in this graph then define a solution for \((D,\ell , k)\). Note that our algorithm can be equivalently seen as a dynamic program which computes for each valid triple (Y, Z, S) a minimum size set X such that \(\mathrm{mco}(D[Y]-(X\cup S))\le \ell \).

The following lemma allows us to show that if we fix |Y| in a triple (Y, Z, S), then only \({\mathcal {O}}(k)\) vertices of D could potentially be in both Y and Z and all other vertices are fixed. The lemma is an easy consequence of the fact that every semicomplete digraph on at least \(2p+2\), \(p\in \mathbb {N}\), vertices has a vertex of out-degree at least \(p+1\). We give the proof here for the convenience of the reader.

Lemma 4.1

Let \(D=(V,A)\) be a semicomplete digraph and let (Y, Z) be a partition of V such that for every \(y\in Y\) and every \(z\in Z,\) we have \(yz\in A\). Then for every \(p\in \mathbb {N}\) (1) there are at most \(2p+1\) vertices in Y with \(d^+_D(y)\le |Z|+p\) and (2) there are at most \(2p+1\) vertices in Z with \(d^-_D(z)\le |Y|+p\).

Proof

We will first prove Part (1). Let \(Y_\le \) be the set of vertices in Y with out-degree at most \(|Z|+p\) in D. Since for every \(y\in Y\) and every \(z\in Z,\) we have \(yz\in A\), it follows that all vertices in \(Y_\le \) have out-degree at most p in \(D[Y_\le ]\). Hence \(\sum _{y\in Y_\le } d^+_{D[Y_\le ]}(y)\), i.e., the sum of out-degrees of vertices in \(Y_\le \) in \(D[Y_\le ]\), is at most \(p|Y_\le |\). Hence,

$$\begin{aligned} \sum _{y\in Y_\le } d^+_{D[Y_\le ]}(y)= |A(D[Y_\le ])|\le p|Y_\le |. \end{aligned}$$

Since D is a semicomplete digraph,

$$\begin{aligned} \frac{|Y_\le |\cdot (|Y_\le |-1)}{2}\le |A(D[Y_\le ])|\le p|Y_\le |. \end{aligned}$$

It follows that \(|Y_\le |\le 2p+1.\)

Part (2) follows directly from Part (1) applied to a digraph \(D'=(V,A')\) obtained from D by reversing all the arcs i.e. \(A'=\{yx\;\mid \; xy\in A \}\). \(\square \)
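Part (1) of the lemma is easy to test numerically. The helper below, whose name is an assumption made for illustration, returns the vertices counted in Part (1); on the rotational tournament with \(2p+1\) vertices (each vertex i dominating \(i+1,\dots ,i+p\) modulo \(2p+1\)) and \(Z=\emptyset \), the bound \(2p+1\) is attained.

```python
def low_out_degree_vertices(vertices, arcs, Y, Z, p):
    """Vertices y in Y with d^+_D(y) <= |Z| + p, as counted in Part (1)."""
    out_deg = {v: 0 for v in vertices}
    for u, _ in set(arcs):               # set() guards against repeated arcs
        out_deg[u] += 1
    return {y for y in Y if out_deg[y] <= len(Z) + p}
```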

Let \(D=(V,A)\) be a semicomplete digraph and \(t\in [n]\). We will call a triple (Y, Z, S) t-valid if

  1. (Y, Z) is a partition of V(D) with \(|Y|=t\),

  2. \(S\subseteq V(D)\) is a minimal (w.r.t. inclusion) set such that for all \(y\in Y\) and \(z\in Z\), if \(zy\in A(D)\), then \(|\{y,z\}\cap S|\ge 1\),

  3. \(|S|\le k\),

  4. for all \(x\in S\), if \(d^+_D(x) > n-t+k\), then \(x\in Y\),

  5. for all \(x\in S\), if \(d^+_D(x) \le n-t+k\) and \(d^-_D(x) > t+k\), then \(x\in Z\).
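The five conditions translate directly into a checker. In the sketch below, `is_t_valid` is a hypothetical name, and minimality in condition 2 is tested through the standard characterization that a cover is minimal exactly when each of its vertices is the only endpoint in S of some arc from Z to Y.

```python
def is_t_valid(vertices, arcs, Y, Z, S, t, k):
    """Check conditions 1-5 of the definition of a t-valid triple (Y, Z, S)."""
    V, A = set(vertices), set(arcs)
    Y, Z, S = set(Y), set(Z), set(S)
    n = len(V)
    out_deg = {v: 0 for v in V}
    in_deg = {v: 0 for v in V}
    for u, v in A:
        out_deg[u] += 1
        in_deg[v] += 1
    # Condition 1: (Y, Z) partitions V and |Y| = t.
    if (Y | Z) != V or (Y & Z) or len(Y) != t:
        return False
    # Condition 2: S hits every arc from Z to Y and is inclusion-minimal.
    backward = [(z, y) for (z, y) in A if z in Z and y in Y]
    if not S <= V or any(z not in S and y not in S for z, y in backward):
        return False
    for x in S:
        private = any((z == x and y not in S) or (y == x and z not in S)
                      for z, y in backward)
        if not private:
            return False
    # Condition 3: size bound.
    if len(S) > k:
        return False
    # Conditions 4 and 5: degree constraints on the vertices of S.
    for x in S:
        if out_deg[x] > n - t + k and x not in Y:
            return False
        if out_deg[x] <= n - t + k and in_deg[x] > t + k and x not in Z:
            return False
    return True
```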

We will say a triple (Y, Z, S) is valid if it is t-valid for some \(t\in \mathbb {N}\). The following simple observation will help us bound the number of partitions (Y, Z) that could lead to a t-valid triple (Y, Z, S).

Lemma 4.2

For any t-valid triple (Y, Z, S), all vertices v with \(d^+_{D}(v) > n-t+k\) are in Y and all vertices v with \(d^+_D(v) \le n-t+k\) and \(d^-_{D}(v) > t+k\) are in Z.

Proof

If \(v\in S\), the lemma follows directly from the definition of a t-valid triple. If \(v\in V(D)\setminus S\) and \(d^+_{D}(v) > n-t+k\), then v has an out-neighbour in \(Y\setminus S\), because \(|Z\cup S|\le n-t+k\), and \(v\in Y\) follows by property 2. Similarly, if \(v\in V(D)\setminus S\) and \(d^-_{D}(v) > t+k\), then v has an in-neighbour in \(Z\setminus S\) and \(v\in Z\) by property 2. \(\square \)

Lemma 4.3

Let \(D=(V,A)\) be a semicomplete digraph, \(n=|V|\), and let \(t\in [n]\). If there exists a t-valid triple, then there are at most \(7k+2\) vertices v in V(D) with \(d^+_{D}(v) \le n-t+k\) and \(d^-_{D}(v) \le t+k\).

Proof

Let us assume that there is at least one t-valid triple and let us denote it by (Y, Z, S). Note that for all \(y\in Y\setminus S\) and \(z\in Z\setminus S\) it holds that \(zy\notin A(D)\). Since D is a semicomplete digraph, it follows that \(yz\in A(D)\). Due to Lemma 4.1 applied to \(D-S,\) there are at most \(2(k+|Z\cap S|)+1\) vertices in \(Y\setminus S\) with \(d^+_{D-S}(y)\le |Z\setminus S|+k+|Z\cap S|=n-t+k\) and there are at most \(2(k+|Y\cap S|)+1\) vertices in \(Z\setminus S\) with \(d^-_{D-S}(z)\le |Y\setminus S|+k+|Y\cap S|=t+k\). Let \(F=\{v\in V(D):\ d^+_{D}(v)\le n-t+k \text{ and } d^-_{D}(v)\le t+k\}.\) By the above,

$$\begin{aligned} |F\setminus S|\le & {} 2(k+|Z\cap S|)+1 + 2(k+|Y\cap S|)+1\\\le & {} 4k+2 +{2|S|}\le 6k+2. \end{aligned}$$

Thus, \(|F|\le 7k+2.\) \(\square \)
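Lemmas 4.2 and 4.3 together say that, for fixed t, the degrees alone place all but at most \(7k+2\) vertices. A minimal sketch of this classification follows; the function name and return convention are assumptions for illustration.

```python
def classify_by_degree(vertices, arcs, t, k):
    """Split V into vertices forced into Y, forced into Z, and the free set F."""
    V = list(vertices)
    n = len(V)
    out_deg = {v: 0 for v in V}
    in_deg = {v: 0 for v in V}
    for u, v in set(arcs):
        out_deg[u] += 1
        in_deg[v] += 1
    forced_Y = {v for v in V if out_deg[v] > n - t + k}
    forced_Z = {v for v in V
                if out_deg[v] <= n - t + k and in_deg[v] > t + k}
    # F: out-degree at most n - t + k and in-degree at most t + k.
    F = set(V) - forced_Y - forced_Z
    return forced_Y, forced_Z, F
```

On the transitive tournament with 10 vertices, \(t=5\) and \(k=1\), the free set consists of the four middle vertices, well within the bound \(7k+2=9\).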

Lemma 4.4

Let \(D=(V,A)\) be a semicomplete digraph, \(n=|V|\), and let \(t\in [n]\). There are at most \(2^{8k+2}\) t-valid triples (Y, Z, S). Moreover, if we are given the in- and out-degrees of all vertices in D as part of the input, then we can enumerate all such triples in time \({\mathcal {O}}(2^{8k} k n)\).

Proof

Let \(F=\{v\in V(D):\ d^+_{D}(v)\le n-t+k \text{ and } d^-_{D}(v)\le t+k\}.\) By Lemma 4.3, \(|F|\le 7k+2\). If the out- and in-degrees of all vertices in D are given as part of the input, we can construct the set F in time \({\mathcal {O}}(n).\)

By Lemma 4.2, there are at most \(2^{7k+2}\) possible partitions \((Y',Z')\) that could lead to a t-valid triple \((Y',Z',S')\) for some \(S'\); each such partition is uniquely determined by fixing \(Y'\cap F\).

For the rest of the proof, we assume that we have computed the set F of vertices v in V(D) with \(d^+_{D}(v) \le n-t+k\) and \(d^-_{D}(v) \le t+k\), \(|F|\le 7k+2\). Let \((Y',Z')\) be one of the at most \(2^{7k+2}\) partitions that could lead to a t-valid triple.

We show that we can enumerate all minimal sets \(S'\), \(|S'|\le k\), such that for all \(y\in Y'\) and \(z\in Z'\), if \(zy\in A(D)\), then \(|\{y,z\}\cap S'|\ge 1\). Let G be an undirected bipartite graph such that \(V(G)=V(D)\), the partite sets of G are \(Y'\) and \(Z',\) and for every \(y\in Y'\), \(z\in Z',\) it holds that \(yz\in E(G)\) if and only if \(zy\in A(D)\). Then \(S'\) is a minimal vertex cover of size at most k in G. Moreover, every minimal vertex cover \(S'\) in G leads to a t-valid triple \((Y',Z',S')\). It is well known and easy to show that we can enumerate all minimal vertex covers of size at most k in G in time \({\mathcal {O}}(2^kk^2+kn)\). This is done by including all vertices with degree at least \(k+1\) in every vertex cover and removing every vertex they cover. If the resulting graph has more than \(k^2\) edges, then there is no vertex cover of size at most k [5]. Then we can enumerate all vertex covers of size at most k by using a simple search-tree algorithm that picks an edge, say uv, and recursively enumerates all vertex covers of size at most \(k-1\) that include u or v, respectively. By the algorithm, it is also easy to see that there are at most \(2^k\) distinct vertex covers of size at most k. For each of these vertex covers, we can easily determine whether it is minimal in \({\mathcal {O}}(k^2)\) time by going over all of the at most \(k^2\) edges: if exactly one endpoint of an edge is in the vertex cover, then we mark this vertex as important. If all vertices of the vertex cover are marked important, then the vertex cover is minimal. Otherwise, any vertex that is not marked important at the end can be removed from the vertex cover, since all its neighbours are already in the vertex cover, and so the vertex cover is not minimal.

It follows that there are at most \(2^{7k+2}\cdot 2^k=2^{8k+2}\) t-valid triples and we can enumerate all of them in time \({\mathcal {O}}(n+2^{8k}k^2+kn)={\mathcal {O}}(2^{8k}kn)\). \(\square \)
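The search-tree enumeration used in the proof can be sketched as follows. This is a simplified illustration under our own naming (`minimal_vertex_covers`): it omits the preprocessing of high-degree vertices, so it shows the branching and the minimality test but not the stated running time.

```python
def minimal_vertex_covers(edges, k):
    """All inclusion-minimal vertex covers of size at most k, as frozensets."""
    edges = [tuple(e) for e in edges]
    found = set()

    def branch(cover):
        # Pick any edge not yet covered; if none is left, `cover` is a vertex cover.
        uncovered = next(((u, v) for u, v in edges
                          if u not in cover and v not in cover), None)
        if uncovered is None:
            found.add(cover)
            return
        if len(cover) == k:              # budget exhausted
            return
        u, v = uncovered
        branch(cover | {u})              # every cover contains u or v
        branch(cover | {v})

    def is_minimal(cover):
        # Each vertex must be the only endpoint in the cover of some edge.
        return all(any((u == x and v not in cover) or (v == x and u not in cover)
                       for u, v in edges)
                   for x in cover)

    branch(frozenset())
    return {c for c in found if is_minimal(c)}
```

On the path a, b, c the minimal vertex covers are {b} and {a, c}; with budget 1 only {b} is reported.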

We are now ready to present our algorithm.

Theorem 4.1

There is an FPT algorithm that solves Directed Component Order Connectivity[\(k\)] on semicomplete digraphs in time \({\mathcal {O}}(2^{16k}kn^2)\).

Proof

Let \(D= (V,A)\) be a semicomplete digraph and let \((D,\ell , k)\) be an instance of Directed Component Order Connectivity[\(k\)].

Algorithm. Our algorithm boils down to finding a shortest path in an auxiliary weighted acyclic digraph whose vertex set consists of all the valid triples. The main idea is to find a sequence of valid triples \((Y_1,Z_1,S_1),\ldots , (Y_q,Z_q, S_q)\) such that \(S=\bigcup _{i\in [q]}S_i\) is a solution for \((D,\ell , k)\) and the strongly connected components of \(D-S\) are subsets of \(C_i=Y_{i+1}\setminus (Y_i\cup S)\), where \(|C_i|\le \ell \) and for all \(i<j\), \(x_i\in C_i\), \(x_j\in C_j\) it holds that \(x_jx_i\notin A\).

We define the weighted directed acyclic state graph \(\mathcal {D}=(\mathcal {V}, \mathcal {A})\) as follows. The set of vertices \(\mathcal {V}\) is the set of all t-valid triples for all \(t\in \{0,1,\dots ,n\}\). The set of arcs \(\mathcal {A}\) contains an arc from a \(t_1\)-valid triple \((Y_1,Z_1, S_1)\) to a \(t_2\)-valid triple \((Y_2,Z_2, S_2)\) if and only if the following conditions hold:

  • \(Y_1\subset Y_2\) (and \(Z_2\subseteq Z_1\)),

  • if \(x\in S_1\cap Z_1\) and \(x\in Z_2\), then \(x\in S_2\),

  • if \(x\in Y_1\setminus S_1\), then \(x\in Y_2\setminus S_2\), and

  • \(|S_1\setminus S_2|+ \max (0, |Z_1\cap Y_2\setminus (S_1\cup S_2)| -\ell ) \le k\).

We let the weight of an arc from \((Y_1,Z_1, S_1)\) to \((Y_2,Z_2, S_2)\) be

$$\begin{aligned} |S_1\setminus S_2|+ \max (0, |Z_1\cap Y_2\setminus (S_1\cup S_2)| -\ell ). \end{aligned}$$
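As a rough illustration, the four arc conditions and the weight can be written down directly with set operations. The sketch below assumes that triples are given as tuples of Python frozensets and that both triples are already known to be valid; the function name `arc_weight` is hypothetical, and the code ignores the amortized \({\mathcal {O}}(k)\) bound discussed later in the proof.

```python
def arc_weight(triple1, triple2, ell, k):
    """Return the weight of the arc triple1 -> triple2, or None if there is no arc."""
    Y1, Z1, S1 = triple1
    Y2, Z2, S2 = triple2
    # Condition 1: Y1 is a proper subset of Y2 and Z2 is a subset of Z1.
    if not (Y1 < Y2 and Z2 <= Z1):
        return None
    # Condition 2: a vertex of S1 on the Z-side that stays on the Z-side stays in S.
    if any(x in Z2 and x not in S2 for x in S1 & Z1):
        return None
    # Condition 3: a vertex of Y1 outside S1 stays in Y2 outside S2.
    if any(x not in (Y2 - S2) for x in Y1 - S1):
        return None
    # Condition 4: the weight must not exceed the budget k.
    middle = (Z1 & Y2) - (S1 | S2)
    weight = len(S1 - S2) + max(0, len(middle) - ell)
    return weight if weight <= k else None
```

For example, with \(Y_1=\{1\}\), \(S_1=\{2\}\), \(Y_2=\{1,2,3,4\}\), \(S_2=\emptyset \) on the vertex set \(\{1,\dots ,5\}\) and \(\ell =1\), the middle block is \(\{3,4\}\) and the weight is \(1+1=2\).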

This finishes the description of the auxiliary weighted acyclic digraph. In the remainder of the proof we first show that \((D,\ell ,k)\) is a YES-instance if and only if the cost of the shortest path in \(\mathcal {D}\) from \((\emptyset ,V(D),\emptyset )\) to \((V(D),\emptyset ,\emptyset )\) is at most k. Afterwards, we bound \(|\mathcal {V}|+|\mathcal {A}|\) by \({\mathcal {O}}(2^{16k}n^2)\) and prove that we can construct the auxiliary digraph in \({\mathcal {O}}(2^{16k}kn^2)\) time. We can then find a shortest path from \((\emptyset ,V(D),\emptyset )\) to \((V(D),\emptyset ,\emptyset )\) in linear time, that is, in time \({\mathcal {O}}(2^{16k}n^2)\) since \(\mathcal {D}\) is acyclic (by dynamic programming using an acyclic ordering of the vertices), which finishes the proof.
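The linear-time shortest-path step is standard dynamic programming over an acyclic ordering. A minimal sketch, assuming the state graph is handed over as a dictionary from arcs to non-negative weights (this representation and the function name are illustrative choices, not taken from the paper), could look as follows; it uses the standard-library `graphlib.TopologicalSorter` (Python 3.9 or later) to obtain the ordering.

```python
from graphlib import TopologicalSorter


def dag_shortest_path(arcs, source, target):
    """Cost of a cheapest source-target path in an acyclic digraph.

    arcs maps (u, v) to the weight of the arc u -> v.
    Returns None if target is not reachable from source.
    """
    preds = {source: [], target: []}
    for (u, v), w in arcs.items():
        preds.setdefault(u, [])
        preds.setdefault(v, []).append((u, w))
    # static_order() lists every vertex after all of its predecessors.
    order = TopologicalSorter(
        {v: [u for u, _ in ps] for v, ps in preds.items()}
    ).static_order()
    dist = {source: 0}
    for v in order:
        for u, w in preds[v]:
            if u in dist and dist[u] + w < dist.get(v, float("inf")):
                dist[v] = dist[u] + w
    return dist.get(target)
```

Each arc is inspected once, so the running time is linear in \(|\mathcal {V}|+|\mathcal {A}|\) once the ordering is available.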

Correctness of the Algorithm. Suppose first that \((D,\ell , k)\) is a YES-instance of Directed Component Order Connectivity[\(k\)] such that D is a semicomplete digraph. Let X be a minimum size solution for \((D,\ell , k)\), that is, a minimum size set such that \(\mathrm{mco}(D-X)\le \ell \). Since \((D,\ell , k)\) is a YES-instance and \(|X|\le k\), the vertices of \(D-X\) can be partitioned into sets \(C_1,\ldots , C_q\) such that

  1. for every \(i\in [q]\) we have \(|C_i|\le \ell \), and

  2. for every \(i,j\in [q]\) with \(i<j\) and every \(x\in C_i\), \(y\in C_j\) we have \(xy\in A\) and \(yx\notin A\).

Our goal is to define a sequence of valid triples \((Y_i,Z_i,S_i)\), \(i\in [q]\), such that the arc \(((Y_i,Z_i,S_i),(Y_{i+1},Z_{i+1},S_{i+1}))\) is in \(\mathcal {A}\) and the cost of the path in \(\mathcal {D}\) defined by this sequence is |X|. We will construct these triples from X and \(C_1,\ldots , C_q\) with some additional restrictions that make it easier to show that they indeed define a path in \(\mathcal {D}\) of cost at most |X|. Namely, we will define them such that for all \(i,j\in [q]\), \(i<j\) the triples satisfy the following properties:

  1. \((Y_i,Z_i,S_i)\) is \(t_i\)-valid for some \(t_i\in [n]\),

  2. \(C_1\cup \cdots \cup C_i\subseteq Y_i\),

  3. \(C_{i+1}\cup \cdots \cup C_q\subseteq Z_i\),

  4. \(S_i\subseteq X\),

  5. \(Y_i\subset Y_j\) and \(Z_j\subseteq Z_i\),

  6. if \(x\in S_i\cap Z_i\) and \(x\in Z_j\), then \(x\in S_j\),

  7. if \(x\in Y_i\setminus S_i\), then \(x\in Y_j\setminus S_j\).

We first show that a sequence with the above properties indeed exists and defer the computation of the cost of the path defined by this sequence to later. Note that given the above properties, the arc \(((Y_i,Z_i,S_i),(Y_{i+1},Z_{i+1},S_{i+1}))\) exists in \(\mathcal {D}\) whenever the weight of the arc is at most k. This follows from the argument that the cost of the path defined by this sequence is at most k, which is also deferred to later.

To obtain this sequence, we need to discuss how to distribute the vertices of X in the sets \(Y_i\) and \(Z_i\) and how to compute \(S_i\), \(S_j\) (note that the partition of the vertices in \(V\setminus X\) is fixed by properties 2 and 3).

We distribute the vertices of X between \(Y_i\) and \(Z_i\) as follows. We start with \(t_i=|C_1\cup \cdots \cup C_i|\) and while there are more than \(t_i - |C_1\cup \cdots \cup C_i|\) vertices \(x\in X\) with \(d^+_D(x) > n-t_i+k\) we increase \(t_i\) by one. Since \(n-t_i+k > n-(t_i+1)+k\), once \(d^+_D(x) > n-t_i+k\) holds for a vertex \(x\in X\), it will be true for this vertex even after increasing \(t_i\). Moreover, since \(|X|\le k\), there is a value of \(t_i\) between \(|C_1\cup \cdots \cup C_i|\) and \(|C_1\cup \cdots \cup C_i|+k\) such that there are precisely \(t_i - |C_1\cup \cdots \cup C_i|\) vertices in X with \(d^+_D(x) > n-t_i+k\). We put all of these vertices in \(Y_i\) and the remaining vertices of X in \(Z_i\). Note that for \(j\in \mathbb {N}\) such that \(i < j\), we will start with \(t_j = |C_1\cup \cdots \cup C_j| > |C_1\cup \cdots \cup C_i|\) and observe that if we include \(x\in X\) in \(Y_i\), then we include it in \(Y_j\) as well.
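The threshold procedure above is short enough to spell out. The following sketch only illustrates the argument: the function name, the dictionary `out_deg` of out-degrees in D, and the parameter `prefix_size` (standing for \(|C_1\cup \cdots \cup C_i|\)) are all hypothetical names introduced here.

```python
def split_solution_vertices(X, out_deg, n, k, prefix_size):
    """Choose t_i and split X into its part in Y_i and its part in Z_i.

    Starting from t = prefix_size, t is increased while more than
    t - prefix_size vertices of X have out-degree greater than n - t + k.
    """
    def high(t):
        return {x for x in X if out_deg[x] > n - t + k}

    t = prefix_size
    while len(high(t)) > t - prefix_size:
        t += 1
    in_Y = high(t)  # exactly t - prefix_size vertices when the loop stops
    return t, in_Y, set(X) - in_Y
```

The loop stops after at most \(|X|\le k\) increments, because the set of high out-degree vertices only grows with t while its size is bounded by \(|X|\).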

Now \(|X|\le k\) and for all \(y\in Y_i\setminus X = C_1\cup \cdots \cup C_i\) and all \(z\in Z_i\setminus X= C_{i+1}\cup \cdots \cup C_q\) we have \(zy\notin A(D)\). The set \(S_i\) is defined to be those vertices \(x\in X\) such that one of the following holds:

  1. \(x\in Y_i\) and there exists \(z\in Z_i\setminus X\) such that \(zx\in A(D)\),

  2. \(x\in Z_i\) and there is an arc \(xy\in A(D)\), \(y\in Y_i\) such that \(y\notin S_i\).

Note that all arcs from \(Z_i\) to \(Y_i\) are covered by \(S_i\) and for each \(x\in S_i\) there is an arc zy from \(Z_i\) to \(Y_i\) with \(\{y,z\}\cap X= \{x\}\). Note that if \(x\in Y_i\setminus S_i\), then \(x\in Y_j\setminus S_j\) for all \(j>i\). On the other hand, if \(x\in Z_i\cap S_i\), then there is a vertex \(y\in Y_i\setminus S_i\) such that \(xy\in A(D)\). Moreover, for all \(j>i\), \(y\in Y_j\setminus S_j\). Therefore, if \(x\in Z_j\), then \(x\in S_j\). From the above two properties it follows that if \(x\in S_i\setminus S_j\), then \(x\notin S_{j+1}\cup \cdots \cup S_q\). This finishes the proof of the existence of a sequence of valid triples \((Y_1, Z_1, S_1),\ldots , (Y_q, Z_q, S_q)\) with properties 1-7.

We claim that the cost of the path following this sequence is at most k. First note that if \(x\in S_i\setminus S_{i+1}\), then \(x\in Y_{i+1}\) and for all \(j\ge i+1\) it holds that \(x\notin S_j\); hence every vertex in X is counted in at most one of the sets \(S_{i}\setminus S_{i+1}\). Now the set \(C_i\) is precisely \((Z_{i-1}\cap Y_i)\setminus X\). If \(x\in Z_{i-1}\cap Y_i\cap X\) is in some set \(S_j\), then from properties 5, 6 and 7 of the sequence of triples it follows that x is in \(S_{i-1}\cup S_i\). Hence \(|(Z_{i-1}\cap Y_i)\setminus (S_{i-1}\cup S_i)| -|C_i|\) is precisely the number of vertices in X that are in \(Z_{i-1}\cap Y_i\) and in none of the sets \(S_j\), \(j\in [q]\). Note that for such a vertex \(x\in (Z_{i-1}\cap Y_i)\setminus \bigcup _{j\in [q]} S_j\) and a vertex \(y\in Y_{j}\setminus S_j\), for some \(j\in [q]\) with \(j<i\), it holds that \(xy\notin A(D)\) (else by the definition of a valid triple \(|\{x,y\}\cap S_j|\ge 1\)). Similarly, for \(z\in Z_j\setminus S_j\), \(j>i\), we have \(zx\notin A(D)\). Hence, if \(|C_i|<\ell \), then \(X\setminus \{x\}\) would be a smaller solution for the instance \((D,\ell , k)\), and because of the minimality of X, \((|Z_{i-1}\cap Y_i\setminus (S_{i-1}\cup S_i)| -\ell )\) is precisely the number of vertices in X that are in \(Z_{i-1}\cap Y_i\) and in none of the sets \(S_j\). It follows that each vertex in X is counted on precisely one arc on the path and the shortest path from \((\emptyset {},V(D),\emptyset )\) to \((V(D),\emptyset {},\emptyset {})\) in \(\mathcal {D}=(\mathcal {V}, \mathcal {A})\) has length precisely |X|.

For the other direction, let some shortest path in \(\mathcal {D}\) from \((\emptyset ,V(D),\emptyset )\) to \((V(D),\emptyset ,\emptyset )\) be defined by the sequence \((Y_i,Z_i,S_i)\), \(i\in \{0,\ldots ,q\}\), and assume that the cost of the path is at most k. For every \(i\in [q]\), let \(T_i\) be an arbitrary set consisting of \(\max (0,|(Z_{i-1}\cap Y_i)\setminus (S_{i-1}\cup S_i)| -\ell )\) vertices from \((Z_{i-1}\cap Y_i)\setminus (S_{i-1}\cup S_i)\) and let \(X=\bigcup _{i\in [q]}(T_i\cup S_i)\). Because the pair \(((Y_{i-1},Z_{i-1},S_{i-1}),(Y_i,Z_i,S_i))\) is an arc in \(\mathcal {D}\) for every \(i\in [q]\), we have \(Y_{i-1}\subseteq Y_i\) and \(Z_i\subseteq Z_{i-1}\). Moreover, \((Y_{i-1},Z_{i-1},S_{i-1})\) and \((Y_i,Z_i,S_i)\) are \(t_{i-1}\)-valid and \(t_i\)-valid triples, for some \(t_{i-1},t_i\in [n]\), respectively. Therefore, there is no arc from \(Z_j\setminus X\) to \(Y_i\setminus X\) for any \(i\le j\in [q]\). It follows that each strongly connected component of \(D-X\) is a subset of \((Z_{i-1}\cap Y_i)\setminus X\) for some \(i\in [q]\). In particular, note that \((Z_{i-1}\cap Y_i)\cap X = (Z_{i-1}\cap Y_i)\cap (S_{i-1}\cup S_i\cup T_i)\), \((S_{i-1}\cup S_i)\cap T_i=\emptyset \) and \(T_i\subseteq (Z_{i-1}\cap Y_i)\). Hence the size of each strongly connected component is at most \(\max _{i\in [q]}|(Z_{i-1}\cap Y_i)\setminus (S_{i-1}\cup S_i\cup T_i)| = \max _{i\in [q]}|\left( (Z_{i-1}\cap Y_i)\setminus (S_{i-1}\cup S_i)\right) \setminus T_i| = \max _{i\in [q]}(|\left( (Z_{i-1}\cap Y_i)\setminus (S_{i-1}\cup S_i)\right) | - |T_i|)\le \ell \). Since \(S_0=S_q=\emptyset \), every vertex that appears in \(S_i\) for some \(i\in [q]\) is counted in some \(|S_j\setminus S_{j+1}|\), where \(j\ge i\), and every vertex that appears in \(T_i\) for some \(i\in [q]\) is counted in \(\max (0, |Z_{i-1}\cap Y_{i}\setminus (S_{i-1}\cup S_{i})| -\ell )\). Hence the final set X has at most k vertices.
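The construction of X from a path can be illustrated by the short sketch below. It assumes the path is given as a list of triples of frozensets with comparable (for instance integer) vertex labels, and it resolves the "arbitrary" choice of \(T_i\) by taking the smallest labels; both the function name and this tie-breaking rule are assumptions made only for the example.

```python
def solution_from_path(path, ell):
    """Collect X as the union of all S_i and T_i along a path of triples."""
    X = set()
    for (Y0, Z0, S0), (Y1, Z1, S1) in zip(path, path[1:]):
        # Vertices that move from the Z-side to the Y-side and are in neither S-set.
        middle = sorted((Z0 & Y1) - (S0 | S1))
        surplus = max(0, len(middle) - ell)
        T = middle[:surplus]  # any `surplus` vertices of the block would do
        X |= set(S0) | set(S1) | set(T)
    return X
```

After removing X, every block \((Z_{i-1}\cap Y_i)\setminus X\) has at most \(\ell \) vertices, which is the bound used in the proof.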

Construction of the Auxiliary Weighted Digraph. Note that by Lemma 4.4, \(|\mathcal {V}|\le 2^{8k+2}n\) and, since we can compute the out- and in-degrees of all vertices in D in time \({\mathcal {O}}(n^2)\), we can enumerate all vertices in \(\mathcal {D}\) in time \({\mathcal {O}}(2^{8k}kn^2)\). It follows that \(|\mathcal {A}|\le |\mathcal {V}|^2\le 2^{16k+4}n^2\) and \(|\mathcal {V}|+|\mathcal {A}|={\mathcal {O}}(2^{16k}n^2)\). It remains to show that for a pair of triples \((Y_1,Z_1, S_1)\) and \((Y_2,Z_2, S_2)\), we can check whether \(((Y_1,Z_1, S_1),(Y_2,Z_2, S_2))\) is an arc and compute its weight in \({\mathcal {O}}(k)\) amortized time. First note that if \(|Y_1|\ge |Y_2|\), then the arc is not there. We will only check if \(((Y_1,Z_1, S_1),(Y_2,Z_2, S_2))\) is an arc if \(|Y_1|<|Y_2|\). This can be done without computing the sizes of \(Y_1\) and \(Y_2\), respectively, if we enumerate the t-valid triples in \(\mathcal {D}\) in levels in the order of increasing t (i.e., we invoke Lemma 4.4 for t only after we added all \(t'\)-valid triples, for all \(t'<t\), to \(\mathcal {V}\)) and compute all in-neighbours of a vertex when it is added to \(\mathcal {V}\). Moreover, when adding the triple \((Y,Z,S)\) to \(\mathcal {V}\), we will in \({\mathcal {O}}(n)\) time compute maps \(\alpha _{(Y,Z,S)}: V(D)\rightarrow \{0,1\}\) such that \(\alpha _{(Y,Z,S)}(x)=0\) if and only if \(x\in Y\) and \(\beta _{(Y,Z,S)}: V(D)\rightarrow \{0,1\}\) such that \(\beta _{(Y,Z,S)}(x)=0\) if and only if \(x\in S\). We also compute the set \({\Delta _{Y,Z}} = \{x\mid x\in V(D), d^+_D(x)\le |Z|+k, d^-_D(x)\le |Y|+k\}\). By Lemma 4.3, \(|\Delta _{Y,Z}|\le 7k+2\). Now we can describe the \({\mathcal {O}}(k)\) algorithm that determines whether \(((Y_1,Z_1, S_1),(Y_2,Z_2, S_2))\) is an arc.

First, for every \(x\in S_1\) we can in constant time check that \(x\in S_1\cap Z_1\) (i.e., \(\alpha _{(Y_1,Z_1,S_1)}(x)=1\) and \(\beta _{(Y_1,Z_1,S_1)}(x)=0\)) and \(x\in Z_2\) (\(\alpha _{(Y_2,Z_2,S_2)}(x)=1\)) implies \(x\in S_2\) (\(\beta _{(Y_2,Z_2,S_2)}(x)=0\)). Similarly we can check in constant time that if \(x\in Y_1\), then \(x\in Y_2\setminus S_2\).

Second, by Lemma 4.2 and since \(|Y_1|<|Y_2|\) and \(|Z_1|>|Z_2|\), we get that to check that \(Y_1\subset Y_2\) and \(Z_2\subseteq Z_1\), we only need to check for every \(x\in {\Delta _{Y_1,Z_1}\cup \Delta _{Y_2,Z_2}}\) that \(\alpha _{(Y_1,Z_1,S_1)}(x)=0\) implies \(\alpha _{(Y_2,Z_2,S_2)}{(x)}=0\). This check can be done in \({\mathcal {O}}({|\Delta _{Y_1,Z_1}\cup \Delta _{Y_2,Z_2}|}) = {\mathcal {O}}(k)\) time.

Finally, to compute the weight of the arc, we note that \(|Z_1\cap Y_2|\) is precisely \(|Y_2|-|Y_1|\), because \(Y_1\subset Y_2\) and \(Z_1=V(D)\setminus Y_1\), so we only need to check how many of the vertices in \(S_1\cup S_2\) are in \(Z_1\cap Y_2\) and how many of the vertices in \(S_1\) are also in \(S_2\). Moreover, we only need to compute \(|(Z_1\cap Y_2)\setminus (S_1\cup S_2)| -\ell \) if \(\ell < |Y_2|-|Y_1| \le \ell + 2k\). Else either the weight of the arc is precisely \(|S_1\setminus S_2|\) or it would be more than k and hence it is not an arc. Hence, we end up spending \({\mathcal {O}}(k+\log n)\) time on the computation of the weight of each of at most \({\mathcal {O}}(2^{16k}kn)\) many arcs (for which \(\ell < |Y_2|-|Y_1| \le \ell + 2k\) ) and \({\mathcal {O}}(k)\) on all of at most \({\mathcal {O}}(2^{16k}n^2)\) remaining arcs. Since \(k\le n\), we can construct \(\mathcal {D}\) in \({\mathcal {O}}(2^{16k}kn^2)\) time. \(\square \)

In the rest of the section, we will show that the dependency on both k and n cannot be significantly improved. More precisely, we will show an unconditional lower bound of \(\Omega (n^2)\) even if \(k=0\), as we show that we need to read at least \(\Omega (n^2)\) arcs of the input instance in the worst case to distinguish between \(k=0\) and \(k=1\). Furthermore, we show that any \(2^{o(k)}n^{{\mathcal {O}}(1)}\) algorithm would imply that the Exponential Time Hypothesis fails.

Theorem 4.2

There is no deterministic sequential algorithm that outputs the correct answer for every instance \((D,\ell ,0)\) of Directed Component Order Connectivity when D is a tournament in \(o(n^2)\) time.

Proof

For \(i\in \mathbb {N}\), let \(H_i\) be an arbitrary but fixed strongly connected tournament on i vertices. Consider two cases.

Case 1: \(\frac{n}{2} \le \ell < n.\) Let us consider the graph D obtained by taking the disjoint union of \(H_{\lfloor \frac{n}{2}\rfloor }\) and \(H_{\lceil \frac{n}{2}\rceil }\) and orienting arcs between \(H_{\lfloor \frac{n}{2}\rfloor }\) and \(H_{\lceil \frac{n}{2}\rceil }\) from \(H_{\lfloor \frac{n}{2}\rfloor }\) to \(H_{\lceil \frac{n}{2}\rceil }\). Clearly, \(\mathrm{mco}(D)={\lceil \frac{n}{2}\rceil }\le \ell \) and \((D,\ell , 0)\) is a YES-instance of Directed Component Order Connectivity. Note there are \({\lfloor \frac{n}{2}\rfloor }\cdot {\lceil \frac{n}{2}\rceil } = \Theta (n^2)\) arcs between \(H_{\lfloor \frac{n}{2}\rfloor }\) and \(H_{\lceil \frac{n}{2}\rceil }\). Now let \(\mathbb {A}\) be a deterministic sequential algorithm that solves Directed Component Order Connectivity[\(k\)] in \(o(n^2)\) time if \(k=0\). If we run \(\mathbb {A}\) on D, then there is an arc from \(H_{\lfloor \frac{n}{2}\rfloor }\) to \(H_{\lceil \frac{n}{2}\rceil }\) that \(\mathbb {A}\) did not read. Let this arc be xy and let \(D_{xy}\) be the graph obtained from D by replacing the arc xy by the arc yx. It follows that \(D_{xy}\) is strongly connected and hence \((D_{xy},\ell , 0)\) is a NO-instance of Directed Component Order Connectivity. However, because the algorithm \(\mathbb {A}\) decided that \((D,\ell ,0)\) is a YES-instance without considering the orientation of the arc between x and y on the instance \((D,\ell , 0)\) and the only difference between \((D,\ell , 0)\) and \((D_{xy},\ell , 0)\) is the orientation of the arc between x and y, it follows that \(\mathbb {A}\) outputs that \((D_{xy},\ell , 0)\) is a YES-instance, which contradicts the assumption that \(\mathbb {A}\) outputs the correct answer for every instance \((D,\ell ,0)\) of Directed Component Order Connectivity such that D is a tournament.

Case 2: \(\ell < \frac{n}{2}\). The proof is very similar to Case 1; the only difference is the construction of the digraph D. To construct D we first take the disjoint union of \(q= \lfloor \frac{n}{\ell }\rfloor \) copies of \(H_\ell \), denoted \(H^1_\ell , \ldots , H^{q}_\ell \), and one copy of \(H_{n-q\ell }\). We add the arc xy to D if \(x\in H^i_\ell \) and \(y\in H^j_\ell \) such that \(1\le i<j \le q\) or if \(x\in H^i_\ell \), \(i\in [q]\), and \(y\in H_{n-q\ell }\). It follows that D is a tournament and \(\mathrm{mco}(D)=\ell \), that is, \((D,\ell , 0)\) is a YES-instance. Now let \(Y=\bigcup _{i\in [\lfloor \frac{q}{2}\rfloor ]} V( H^{i}_\ell )\) and \(Z=V(D)\setminus Y\). It is easy to see that \(\frac{n}{4}\le |Y|\le \frac{n}{2}\) and there are \(\Theta (n^2)\) arcs from Y to Z in D. Moreover, if \(yz\in A(D)\) is an arc such that \(y\in Y\) and \(z\in Z\), then \(D_{yz}=(V(D), (A(D)\setminus \{yz\})\cup \{zy\})\) contains a strongly connected component of size at least \(\ell +1\). The proof follows by arguments analogous to Case 1: for any algorithm \(\mathbb {A}\) that solves \((D,\ell , 0)\) in \(o(n^2)\) time, there is an arc yz that \(\mathbb {A}\) does not read, and hence \(\mathbb {A}\) incorrectly outputs that \((D_{yz},\ell , 0)\) is a YES-instance. \(\square \)
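The Case 1 construction can be checked on small instances with the following sketch. It makes one concrete choice that the proof leaves open: \(H_i\) is taken to be the tournament on \(i\ge 3\) vertices consisting of a Hamiltonian cycle with all remaining arcs oriented from the lower to the higher index. The helper names are hypothetical, and the brute-force `mco` routine is meant only for tiny examples.

```python
def strong_tournament(vertices):
    """Arc set of a strongly connected tournament on >= 3 vertices."""
    vs = list(vertices)
    m = len(vs)
    arcs = set()
    for a in range(m):
        for b in range(a + 1, m):
            if a == 0 and b == m - 1:
                arcs.add((vs[b], vs[a]))  # closes the Hamiltonian cycle
            else:
                arcs.add((vs[a], vs[b]))
    return arcs


def mco(vertices, arcs):
    """Order of a largest strongly connected component (brute force)."""
    vs = list(vertices)
    out = {v: [] for v in vs}
    inn = {v: [] for v in vs}
    for u, w in arcs:
        out[u].append(w)
        inn[w].append(u)

    def reach(start, adj):
        seen, stack = {start}, [start]
        while stack:
            u = stack.pop()
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return seen

    # The component of v is the set of vertices reachable from v and reaching v.
    return max(len(reach(v, out) & reach(v, inn)) for v in vs)


def case1_instance(n):
    """Two strong tournaments with all cross arcs going from the first to the second."""
    left, right = range(n // 2), range(n // 2, n)
    arcs = strong_tournament(left) | strong_tournament(right)
    return arcs | {(x, y) for x in left for y in right}


def flip(arcs, x, y):
    """Replace the arc xy by the arc yx."""
    return (arcs - {(x, y)}) | {(y, x)}
```

For \(n=8\), `mco(range(8), case1_instance(8))` evaluates to 4, while flipping any single cross arc yields a strongly connected tournament, so the value becomes 8. This is the behaviour the adversary argument relies on.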

Finally, we will present our \({\mathcal {O}}^*(2^{o(k)})\) lower bound result, based on the well-established Exponential Time Hypothesis (ETH). Our result uses the fact that the classical Vertex Cover problem cannot be solved in subexponential time under ETH.

Theorem 4.3

(Cai and Juedes [6]) There is no \(2^{o(k)}\cdot |V(G)|^{\mathcal {O}(1)}\) algorithm for Vertex Cover, unless ETH fails.

Given the above result by Cai and Juedes, the lower bound then directly follows from the proof of NP-hardness of Directed Feedback Vertex Set by Speckenmeyer [19]. In fact, given a graph G, Speckenmeyer constructs in \(O(|V(G)|^2)\) time a tournament T with \(3|V(G)|-2\) vertices such that for every k the graph G has a vertex cover of size at most k if and only if T has a directed feedback vertex set of size at most k (see Theorem 6 in [19]). Hence, we obtain the following:

Theorem 4.4

There is no algorithm solving Directed Component Order Connectivity[\(k\)] on tournaments in time \(2^{o(k)}n^{{\mathcal {O}}(1)}\), unless ETH fails.

In Theorem 4.1 we saw that there is an FPT algorithm for Directed Component Order Connectivity[\(n-\ell \)] that runs in \({\mathcal {O}}^*(2^{16(n-\ell )})\) time, as we may assume that \(k \le n - \ell \). By the construction explained before Theorem 4.4 we can replace k by \(n-\ell \) in \(2^{o(k)}\) in Theorem 4.4 and thus obtain a matching lower bound for the upper bound \({\mathcal {O}}^*(2^{16(n-\ell )}).\)

Theorem 4.5

There is no \(2^{o(n-\ell )} n^{{\mathcal {O}}(1)}\)-time algorithm for solving Directed Component Order Connectivity[\(n-\ell \)] on semicomplete digraphs, unless ETH fails.

5 Conclusions

Since Directed Component Order Connectivity generalizes Directed Feedback Vertex Set, it would likely be hard to improve our upper bound and obtain a tight lower bound for the time complexity of Directed Component Order Connectivity[\(\ell +k\)] on general digraphs. It seems easier to improve our upper and lower bounds on the time complexity of Directed Component Order Connectivity[\(k\)] on semicomplete digraphs.

It would be interesting to consider the time complexity of the problem on well-studied generalizations of semicomplete digraphs: (i) semicomplete multipartite digraphs, which are digraphs that can be obtained from complete multipartite graphs by replacing every edge by an arc with the same end-vertices or a pair of opposite arcs with the same end-vertices, (ii) quasi-transitive digraphs, which are digraphs in which if xy and yz are arcs such that x, y, z are distinct vertices then either xz or zx or both are arcs, too (in particular, a transitive digraph is quasi-transitive), (iii) locally semicomplete digraphs, which are digraphs in which for every vertex x, both \(N^+(x)\) and \(N^-(x)\) induce semicomplete digraphs (a directed cycle is an example of a locally semicomplete digraph). Chapters 7, 8, and 5, respectively, in the textbook on classes of directed graphs [2], provide extensive surveys on these classes of digraphs.