Linear kernels for outbranching problems in sparse digraphs

In the $k$-Leaf Out-Branching and $k$-Internal Out-Branching problems we are given a directed graph $D$ with a designated root $r$ and a nonnegative integer $k$. The question is to determine the existence of an outbranching rooted at $r$ that has at least $k$ leaves, or at least $k$ internal vertices, respectively. Both these problems were intensively studied from the points of view of parameterized complexity and kernelization, and in particular for both of them kernels with $O(k^2)$ vertices are known on general graphs. In this work we show that $k$-Leaf Out-Branching admits a kernel with $O(k)$ vertices on $\mathcal{H}$-minor-free graphs, for any fixed family of graphs $\mathcal{H}$, whereas $k$-Internal Out-Branching admits a kernel with $O(k)$ vertices on any graph class of bounded expansion.


Introduction
Kernelization is a thriving research direction within parameterized complexity that aims at understanding the computational power of polynomial-time preprocessing procedures via a rigorous mathematical framework. Its central notion is the definition of a kernelization algorithm, or simply a kernel: Given an instance $(I, k)$ of some parameterized problem $L$, a kernelization algorithm reduces $(I, k)$ in polynomial time to an equivalent instance $(I', k')$ of $L$ so that $|I'|, k' \le f(k)$ for some computable function $f$ of the parameter $k$ only; the function $f$ is called the size of the kernel. While for a decidable problem $L$ the existence of any kernelization algorithm is equivalent to fixed-parameter tractability of the problem, we are most interested in finding small kernels, possibly of polynomial or even linear size. For concreteness, in this paper we concentrate on parameterized graph problems, so we always assume that the input instance is a graph.
One of the most influential ideas in the search for small kernels was to restrict the input graph to belong to some sparse graph class, e.g. to be planar, bounded-genus, or $H$-minor-free for some fixed $H$. Starting with the groundbreaking work of Alber et al. [1], who showed a kernel of size $335k$ for Dominating Set on planar graphs, numerous strong kernelization results were shown on planar, bounded-genus, and $H$-minor-free graphs; these results often concern problems that on general graphs are intractable in the parameterized sense. A milestone in this theory is the development of the technique of meta-kernelization by Bodlaender et al. [4], further refined by Fomin et al. [15]. Informally speaking, using this methodology one [...] graphs is a stronger property than admitting a subexponential parameterized algorithm.
While the techniques of bidimensionality and meta-kernelization are elegant and have many important applications, they have certain limitations that make them inapplicable to several important families of problems, for instance problems on directed graphs or problems with prescribed sets of distinguished vertices like Steiner Tree. Therefore, significant effort has been put into investigating the existence of subexponential parameterized algorithms and small kernels outside the framework of bidimensionality [9,14,20,22,26,27].
In this work we are interested in two problems investigated by Dorn et al. [9], namely $k$-Leaf Out-Branching (LOB) and $k$-Internal Out-Branching (IOB). In both problems, we are given a directed graph $D$ with a specified root $r$ and a nonnegative integer $k$. By an outbranching rooted at $r$ we mean a spanning tree of $D$ with all the edges oriented away from $r$. A vertex of $D$ is a leaf in an outbranching $T$ if it has outdegree 0 in $T$, and is internal otherwise. In LOB the question is to verify the existence of an outbranching rooted at $r$ that has at least $k$ leaves, whereas in IOB we instead ask for an outbranching rooted at $r$ with at least $k$ internal vertices. Both problems admit kernels with $O(k^2)$ vertices on general graphs [6,19]; however, up to this work no better kernels were known even in the case of planar graphs. Indeed, the directed nature of both problems prevents them from satisfying even the most basic properties needed for the bidimensionality tools to be applicable.
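To make the definitions concrete, the following minimal Python sketch checks whether a set of edges is an outbranching rooted at $r$ and splits the vertices into leaves and internal vertices. The function names and the edge-list representation are illustrative assumptions, not part of the paper.

```python
from collections import deque

def is_outbranching(vertices, edges, tree_edges, root):
    """Return True if tree_edges is a spanning tree of the digraph (vertices, edges)
    with all edges oriented away from root."""
    vertices = set(vertices)
    if not set(tree_edges) <= set(edges):
        return False  # the tree must be a subgraph of D
    children = {v: [] for v in vertices}
    indeg = {v: 0 for v in vertices}
    for u, v in tree_edges:
        children[u].append(v)
        indeg[v] += 1
    # The root has no parent; every other vertex has exactly one parent.
    if indeg[root] != 0 or any(indeg[v] != 1 for v in vertices if v != root):
        return False
    # Every vertex must be reachable from the root along tree edges.
    seen, queue = {root}, deque([root])
    while queue:
        u = queue.popleft()
        for v in children[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen == vertices

def leaves_and_internal(vertices, tree_edges):
    """Leaves have outdegree 0 in the tree; all other vertices are internal."""
    has_child = {u for u, _ in tree_edges}
    leaves = {v for v in vertices if v not in has_child}
    return leaves, set(vertices) - leaves
```

With such a checker, LOB asks whether some outbranching has at least $k$ leaves, and IOB whether some outbranching has at least $k$ internal vertices.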
Dorn et al. [9] designed subexponential parameterized algorithms with running time $2^{\tilde{O}(\sqrt{k})} \cdot n^{O(1)}$ for both problems on $H$-minor-free graphs. They did it, however, by circumventing in both cases the need to obtain a linear kernel. In the case of LOB they show how to apply preprocessing rules to obtain an instance that can still be large in terms of $k$, but has treewidth $O(\sqrt{k})$, so that dynamic programming on a tree decomposition can be applied. In the case of IOB they apply a variant of Baker's layering technique.
Our results and techniques. In this work we fill the gap left by Dorn et al. [9] and prove that both LOB and IOB admit linear kernels on $H$-minor-free graphs. In fact, for IOB our approach works even in the more general setting of graph classes of bounded expansion (see Section 2 for a definition). By slightly abusing notation, in what follows we say that a directed graph $D$ belongs to some class of undirected graphs (e.g. is $H$-minor-free) if the underlying undirected graph of $D$ has this property.
Theorem 1. Let $H$ be a fixed graph. There is an algorithm that, given an instance $(D, k)$ of LOB where $D$ is $H$-minor-free, in polynomial time either resolves the instance $(D, k)$, or outputs an equivalent instance $(D', k')$ of LOB where $|V(D')| = O(k)$, $k' \le k$, and $D'$ is $H$-minor-free. The algorithm does not need to know $H$.
Note that Theorem 1 also implies a kernel of linear size for any minor-closed family of graphs $\mathcal{G}$. Indeed, by Robertson and Seymour's graph minor theorem there exists a fixed finite family $\mathcal{H}$ such that $\mathcal{G}$ contains exactly the graphs that are $H$-minor-free for every $H \in \mathcal{H}$. By Theorem 1, for any input graph $D \in \mathcal{G}$, the output graph $D'$ is $H$-minor-free for every $H \in \mathcal{H}$. Hence, $D'$ is in $\mathcal{G}$. In particular, Theorem 1 implies linear kernels for planar graphs and for graphs embeddable on a surface of bounded genus.
Theorem 2. Let $\mathcal{G}$ be a hereditary graph class of bounded expansion. There is an algorithm that, given an instance $(D, k)$ of IOB where $D \in \mathcal{G}$, in polynomial time either resolves the instance $(D, k)$, or outputs an equivalent instance $(D', k)$ of IOB where $|V(D')| = O(k)$ and $D'$ is an induced subgraph of $D$.
By applying these kernelization algorithms and then running dynamic programming on a tree decomposition of the obtained graph, we easily obtain the following corollary. Algorithms with a similar running time, but with an additional $\log k$ factor in the exponent, were obtained by Dorn et al. [9]. If one follows their approach, then for LOB it is possible to shave off this factor in the exponent just by replacing the dynamic programming on a tree decomposition with a more modern one. However, for IOB the logarithmic factor is also caused by an application of the layering technique, and hence such a replacement and a manipulation of parameters in the layering would only improve $\log k$ to $\sqrt{\log k}$. By constructing a truly linear kernel we are able to shave this factor off completely. We remark that the running time given by Theorem 3 is optimal under the Exponential Time Hypothesis even on planar graphs; see Section 5 for further details.
To prove Theorems 1 and 2, we revisit the quadratic kernels on general graphs given by Daligault and Thomassé [6] (for LOB) and by Gutin et al. [19] (for IOB). For LOB we need to modify the approach substantially, as the core reduction rule used by Daligault and Thomassé is the following: whenever there is a cutvertex in the graph (a vertex whose removal makes some other vertex not reachable from $r$), it is safe to shortcut it, that is, to remove it and add an arc from each of its in-neighbors to each of its out-neighbors. Observe that an application of this rule does not preserve $H$-minor-freeness, so the kernel of Daligault and Thomassé [6] may start with an $H$-minor-free graph and leave this class.
To circumvent this problem, we exploit the structural approach proposed by Dorn et al. [9]. While not achieving a linear kernel in the precise sense, Dorn et al. are able to simplify the structure of the instance so that it fits their purposes. The main idea is to contract cutedges instead of shortcutting cutvertices, which is a weaker operation that, however, preserves $H$-minor-freeness. Dorn et al. are able to expose a set of so-called special vertices $S$ of size linear in $k$ such that $G \setminus S$ has constant pathwidth; this is already enough to employ the bidimensionality technique. To obtain a linear kernel, we need to perform a much more refined analysis of the instance. More precisely, we construct a set $S$ with $|S| = O(k)$ such that $G \setminus S$ consists of fat bipaths: chains as depicted in Figure 1, possibly with some vertical (cut)edges contracted, and with outgoing edges with heads in $S$. After contracting the vertical edges, such a fat bipath becomes a weak bipath: a bidirectional path possibly with outgoing edges with heads in $S$. Weak bipaths are crucial in the structural approach of Daligault and Thomassé [6], and our fat bipaths can be thought of as more fuzzy variants of weak bipaths that cannot be reduced due to the inability to shortcut cutvertices.
Figure 1: Different types of bipaths. (a) A weak bipath. (b) A fat bipath (the black vertices may have outneighbors other than those depicted and some of the vertical edges may be contracted).
To obtain a linear kernel, we need to reduce the total length of the fat bipaths. For this, we use concepts borrowed from the analysis of graph classes of bounded expansion, of which H-minor-free classes are special cases. Very recently Drange et al. [11] announced a linear kernel for Dominating Set on graph classes of bounded expansion, and the main tool used there is the analysis of the number of different neighborhoods that can arise in a graph G from a bounded expansion graph class G. Essentially, there is a constant c such that for every X ⊆ V (G) there are only O(|X|) vertices in V (G) \ X that neighbor more than c vertices in X, while the vertices of V (G) \ X that neighbor at most c vertices in X can be grouped into O(|X|) classes with exactly the same neighborhoods. We apply this idea to the instance at hand with the interior of every fat bipath contracted to one vertex. Thus, we infer that there are only O(k) fat bipaths that neighbor more than c special vertices, and their total length can be bounded by O(k) using reduction rules. On the other hand, fat bipaths with neighborhoods of size at most c are reduced within their neighborhood classes, whose number is also O(k).
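The grouping by neighborhoods described above can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, under the assumption that the graph is given as a dictionary `adj` mapping each vertex to the set of its neighbors; the function name and the threshold parameter `c` are hypothetical, and the sketch says nothing about how large the resulting groups are (that is exactly what the bounded expansion property controls).

```python
from collections import defaultdict

def neighborhood_classes(adj, X, c):
    """Split the vertices outside X into
    - heavy vertices: more than c neighbors in X, and
    - classes of vertices sharing exactly the same neighborhood in X (size <= c)."""
    X = set(X)
    heavy = []
    classes = defaultdict(list)
    for v in adj:
        if v in X:
            continue
        nb_in_X = frozenset(adj[v] & X)
        if len(nb_in_X) > c:
            heavy.append(v)
        else:
            classes[nb_in_X].append(v)
    return heavy, dict(classes)
```

In the kernel, the role of `X` is played by the set of special vertices, and each vertex outside `X` stands for a fat bipath whose interior has been contracted to a single vertex.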
The same neighborhood diversity argument also plays the key role in our kernel for IOB (Theorem 2). The idea of Gutin et al. [19] is that if a solution to the instance cannot be found immediately by a simple local search, then one can expose a vertex cover $U$ of size at most $2k$ in the graph. The vertices of $V(D) \setminus U$ are reduced using an argument involving crown decompositions in an auxiliary graph where vertices of $V(D) \setminus U$ are matched to pairs of adjacent vertices of $U$; this gives a quadratic dependence on $k$ of the size of the kernel. We observe that if $D$ belongs to a class of bounded expansion, then there are only $O(|U|) = O(k)$ vertices of $V(D) \setminus U$ that have super-constant neighborhood size in $U$, while the others are grouped into $O(|U|) = O(k)$ neighborhood classes, each of which can be reduced to constant size using the same approach via crown decompositions.
For IOB we did not need any edge contractions in the reduction rules, so the kernelization procedure works on any graph class of bounded expansion. However, for LOB it seems necessary to apply contractions of subgraphs of unbounded diameter, e.g. to reduce long paths that contribute with at most one leaf to the solution. While the last phase relies mostly on the bounded expansion properties of the graph class, we need to allow contractions in the reduction rules and hence we do not achieve the same level of generality as for IOB.
We see the additional advantage of our approach in its simplicity. Instead of relying on complicated decomposition theorems for H-minor free graphs, which is a standard technique in such a setting, we use the methodology proposed by Drange et al. [11]: To exploit purely combinatorial, abstract notions of sparsity, like the concept of bounded expansion, and in this manner obtain a much cleaner treatment of the considered graph classes. Of particular interest is the usefulness of the approach of grouping vertices according to their neighborhoods in some fixed modulator X, which is the key idea in [11].
Organization of the paper. In Section 2 we give preliminaries on tools borrowed from the analysis of graph classes of bounded expansion. Sections 3 and 4 are devoted to the proofs of Theorems 1 and 2, respectively. In Section 5 we derive Theorem 3 as a corollary, and discuss the optimality of the obtained algorithms. We conclude with some closing remarks in Section 6.
Notation. In this paper we deal with digraphs. Let $D = (V, E)$ be a digraph. Consider an edge $(u, v) \in E$. We say that $v$ is an out-neighbor of $u$ and $u$ is an in-neighbor of $v$. We also say that $v$ is the head and $u$ is the tail of $(u, v)$. Also, $v$ and $u$ are neighbors of each other. For any vertex $v$ we denote the sets of all its neighbors, out-neighbors and in-neighbors by $N_D(v)$, $N^+_D(v)$ and $N^-_D(v)$, respectively, and the corresponding degrees by $\deg_D(v)$, $\deg^+_D(v)$ and $\deg^-_D(v)$. We omit the subscripts and write simply $N(v)$ or $\deg(v)$ whenever it does not lead to ambiguity. For any

Preliminaries on Sparse Graphs
In this section we recall some definitions and basic properties of sparse graphs, in particular d-degenerate graphs, bounded expansion graphs and H-minor-free graphs. Although in this section we refer to undirected graphs, all the notions and claims apply also to digraphs, by looking at the underlying undirected graph. We say that graph G is k-degenerate when every subgraph of G has a vertex of degree at most k. This implies (and in fact is equivalent to) that we can remove all the edges of G by repeatedly removing vertices of degree at most k. It follows that G has at most k|V (G)| edges. The degeneracy of a graph is the smallest value of k for which it is k-degenerate. Degeneracy is closely linked to arboricity, i.e., minimum number arb(G) of forests that cover the edges of G: it is well known that degeneracy is between arb(G) and 2 arb(G).
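The equivalent characterization of degeneracy by repeated removal of low-degree vertices translates directly into a procedure. The following Python sketch is a simple quadratic-time illustration, assuming an undirected graph given as a dictionary `adj` from vertices to neighbor sets (this representation and the function name are assumptions made for the example).

```python
def degeneracy(adj):
    """Degeneracy of an undirected graph: repeatedly remove a vertex of minimum
    degree and record the largest degree seen at the moment of removal."""
    remaining = {v: set(nb) for v, nb in adj.items()}
    k = 0
    while remaining:
        v = min(remaining, key=lambda u: len(remaining[u]))
        k = max(k, len(remaining[v]))
        for u in remaining[v]:
            remaining[u].discard(v)  # v disappears from its neighbors' lists
        del remaining[v]
    return k
```

Each removed vertex accounts for at most $k$ edges, which is the reason a $k$-degenerate graph $G$ has at most $k|V(G)|$ edges.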
Recall that a graph $H$ is a minor of a graph $G$ if there exists a minor model $(I_u)_{u \in V(H)}$ of $H$ in $G$ that satisfies the following properties:
• the sets $I_u$ for $u \in V(H)$ are pairwise disjoint subsets of $V(G)$ that moreover induce connected subgraphs;
• for each $uv \in E(H)$, there exist $x_u \in I_u$ and $x_v \in I_v$ such that $x_u x_v \in E(G)$.
Let $r$ be a nonnegative integer. If a minor model $(I_u)_{u \in V(H)}$ satisfies in addition that $G[I_u]$ has radius at most $r$ for each $u \in V(H)$, then $(I_u)_{u \in V(H)}$ is an $r$-shallow minor model of $H$, and we say that $H$ is an $r$-shallow minor of $G$. If $\mathcal{G}$ is a class of graphs, then by $\mathcal{G} \triangledown r$ we denote the class of all $r$-shallow minors of graphs from $\mathcal{G}$; note that $\mathcal{G} \triangledown 0$ consists of all subgraphs of graphs from $\mathcal{G}$. We now define the greatest reduced average degree (grad) of a class $\mathcal{G}$ at depth $r$ as
$$\nabla_r(\mathcal{G}) = \sup_{H \in \mathcal{G} \triangledown r} \frac{|E(H)|}{|V(H)|}.$$
That is, we take the greatest edge density among the $r$-shallow minors of graphs from $\mathcal{G}$. Class $\mathcal{G}$ is said to be of bounded expansion if $\nabla_r(\mathcal{G})$ is a finite constant for every $r$. Observe that then the graphs from $\mathcal{G}$ are in particular $d$-degenerate for $d = \lfloor 2\nabla_0(\mathcal{G}) \rfloor$. For a single graph $G$, we denote $\nabla_r(G) = \nabla_r(\{G\})$.
Consider the class $\mathcal{G}_H$ of $H$-minor-free graphs. By Lemma 4, every graph $G \in \mathcal{G}_H$ has at most $d_H \cdot |V(G)|$ edges. Since $\mathcal{G}_H$ is closed under taking minors, it follows that $\mathcal{G}_H \triangledown r = \mathcal{G}_H$ for every nonnegative $r$, so also $\nabla_r(\mathcal{G}_H) \le d_H$. Thus, $H$-minor-free graphs form a class of bounded expansion with all the grads bounded independently of $r$.
In this paper we do not use the original definition of bounded expansion graphs, but we rather rely on the point of view of diversity of neighborhoods, which was found to be very useful in [11]. More precisely, we now use the following result from [16, Lemma 6.6]; the statement with adjusted notation is taken verbatim from [11].
Consequently, the following bound holds:
We need a strengthening of the first claim of Proposition 5.
Proof. Let $Z = \{y \in Y : \deg_G(y) > 2d\}$. Consider $G' = G[X \cup Z]$, and observe that $|E(G')| = \sum_{y \in Z} \deg_G(y)$. Since $G'$ is a subgraph of $G$, we obtain that $|E(G')| \le d \cdot |V(G')| = d(|X| + |Z|)$, that is, $\sum_{y \in Z} (\deg_G(y) - d) \le d|X|$. Observe that since $\deg_G(y) > 2d$ for each $y \in Z$, we have that $\deg_G(y) - d > \deg_G(y)/2$. Hence it follows that $\sum_{y \in Z} \deg_G(y) \le 2d|X|$.
Note that Proposition 5 has the following corollary when applied to H-minor-free graphs.

k-Leaf Out-Branching in H-minor-free graphs
In this section we deal with rooted digraphs, i.e., digraphs with a vertex $r$, called the root, of in-degree 0. In such digraphs we redefine some standard connectivity notions as follows. Let $(D, r)$ be a rooted digraph. We say that $D$ is connected when every vertex of $D$ is reachable from $r$. A cut-vertex is any vertex $v \in V(D) \setminus \{r\}$ such that $D - v$ is not connected. The set of all cut-vertices of $D$ is denoted by $\mathrm{cv}(D)$. We say that $D$ is 2-connected if $D$ has no cut-vertex (equivalently, for every vertex $v \in V(D) \setminus \{r\}$ there are at least two paths from $r$ to $v$ that do not share internal vertices). Similarly, a cut-edge is any edge $(u, v) \in E(D)$ such that $D - (u, v)$ is not connected. We say that $D$ is 2-edge-connected if $D$ has no cut-edge (equivalently, for every vertex $v \in V(D) \setminus \{r\}$ there are at least two edge-disjoint paths from $r$ to $v$). Note that if $(u, v)$ is a cut-edge then $u$ is a cut-vertex or $u = r$.
Given a cut-vertex u, or u = r, we define P (u) as the set of private neighbors of u, that is, the set of out-neighbors of u that are not reachable from the root in D − u. In particular, all the outneighbors of r are its private neighbors.
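The rooted connectivity notions above can be computed naively by one reachability search per vertex or edge. The sketch below is meant only to pin down the definitions, not to be efficient; it assumes the digraph is given as a dictionary `succ` from vertices to lists of out-neighbors, and all function names are hypothetical.

```python
def reachable(succ, root, skip_vertex=None, skip_edge=None):
    """Vertices reachable from root, optionally ignoring one vertex or one edge."""
    seen, stack = {root}, [root]
    while stack:
        u = stack.pop()
        for v in succ.get(u, ()):
            if v == skip_vertex or (u, v) == skip_edge or v in seen:
                continue
            seen.add(v)
            stack.append(v)
    return seen

def cut_vertices(succ, vertices, root):
    """Vertices v != root such that D - v is not connected (from root)."""
    vertices = set(vertices)
    return {v for v in vertices
            if v != root and reachable(succ, root, skip_vertex=v) != vertices - {v}}

def cut_edges(succ, vertices, root):
    """Edges (u, v) such that D - (u, v) is not connected (from root)."""
    vertices = set(vertices)
    return {(u, v) for u in succ for v in succ[u]
            if reachable(succ, root, skip_edge=(u, v)) != vertices}

def private_neighbors(succ, root, u):
    """Out-neighbors of u not reachable from root in D - u (all of them if u is the root)."""
    if u == root:
        return set(succ.get(u, ()))
    still_reachable = reachable(succ, root, skip_vertex=u)
    return {v for v in succ.get(u, ()) if v not in still_reachable}
```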
By a contraction of edge $(a, b)$ in $D$ we mean the following operation: identify $a$ and $b$ into a newly introduced vertex $v_{(a,b)}$, replace $a$ and $b$ with $v_{(a,b)}$ in every edge of $D$, and remove all the loops and parallel edges created in this manner. Note that if $D$ is $H$-minor-free, then it remains $H$-minor-free after contractions as well.
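As an illustration, contraction on a digraph stored as a set of ordered pairs can be written as follows. The representation and the default name of the new vertex are assumptions of this sketch; isolated vertices are not tracked because only the edge set is stored.

```python
def contract_edge(edges, a, b, new_vertex=None):
    """Contract the edge (a, b): identify a and b into one new vertex,
    then drop loops; storing edges in a set removes parallel edges."""
    if new_vertex is None:
        new_vertex = ("v", a, b)  # stands for the new vertex v_(a,b)
    def rename(x):
        return new_vertex if x in (a, b) else x
    contracted = set()
    for u, v in edges:
        u2, v2 = rename(u), rename(v)
        if u2 != v2:  # skip loops created by the identification
            contracted.add((u2, v2))
    return contracted
```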
Following [6], we say that a vertex $v$ of $D$ is special if $v$ is of in-degree at least 3 or there is an incoming simple edge, i.e., an edge $(u, v)$ such that $(v, u) \notin E(D)$. The set of all special vertices of $D$ is denoted by $\mathrm{sp}(D)$.
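A small Python sketch of this definition follows. It assumes that a simple edge is an edge $(u, v)$ whose reverse $(v, u)$ is absent from the digraph, so that a non-special vertex has in-degree at most 2 and all its in-neighbors are also its out-neighbors; the edge-set representation and the function name are illustrative.

```python
def special_vertices(edges):
    """Vertices with in-degree at least 3, or with an incoming simple edge
    (an edge (u, v) whose reverse (v, u) is not present)."""
    edges = set(edges)
    in_neighbors = {}
    for u, v in edges:
        in_neighbors.setdefault(v, set()).add(u)
    special = set()
    for v, preds in in_neighbors.items():
        has_simple_in_edge = any((v, u) not in edges for u in preds)
        if len(preds) >= 3 or has_simple_in_edge:
            special.add(v)
    return special
```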
A weak bipath $P$ is a sequence of vertices $u_1, \ldots, u_p$ for some $p \ge 3$, such that for each . . . , $p - 1$, we say that $P$ is a proper bipath (or shortly a bipath). The vertices $u_1$ and $u_p$ are called the extremities of $P$.
We say that a cut-edge $(u, v)$ is lonely when there is no other cut-edge with tail $u$. We call a cut-edge branching if there is another cut-edge with the same tail. The graph obtained from $D$ by contracting all lonely cut-edges is denoted by $D_c$ and called the contracted graph. Consider a vertex $v$ of $D_c$. Then either $v$ was created by contracting some set of cut-edges $Z$ in $D$, or $v \in V(D)$. In the former case we define the bag $B$ of $v$ as the set of vertices incident to edges in $Z$. Also, for any edge $(x, y) \in Z$ the vertex $x$ is called a tail of $B$ and $y$ is a head of $B$. In the latter case, i.e., when $v \in V(D)$, we define the bag as $B = \{v\}$, and $v$ is both a head and a tail of $B$. If there is exactly one head and exactly one tail of $B$, then they are denoted by $h_B$ and $t_B$, respectively.
We say that bags A and B are linked if in D there is an edge from A to B and an edge from B to A.

Our kernelization algorithm
In this section we describe our algorithm which outputs a kernel for $k$-Leaf Out-Branching. The algorithm exhaustively applies reduction rules. Each reduction rule is a subroutine which finds in polynomial time a certain structure in the graph and replaces it by another structure, so that the resulting instance is equivalent to the original one. More precisely, we say that a reduction rule for a parameterized graph problem $P$ is correct when for every instance $(D, k)$ of $P$ it returns an instance $(D', k')$ such that:
Below we state the rules we use. The rules are applied in the given order, i.e., in each rule we assume that the earlier rules do not apply. We begin with some rules used in the previous works [6].
The correctness of the above reduction rules was proven in [6]. (In [6], Rule 2 is formulated in a more general way, but we restrict it so that if the input digraph was $H$-minor-free, then so is the resulting reduced graph.) Let us remark that Rule 4 remains true if $r \in N^-(x) \setminus \{y\}$, and in this case it triggers removal of all the incoming edges apart from the one coming from the root. Below we introduce two simple rules which will make our argument a bit easier.

Rule 5
If there are two cut-edges $(x_1, y_1)$ and $(x_2, y_2)$ such that (
Proof. Let $D$ and $D'$ denote the graph before and after applying the reduction. Let $x$ be the vertex obtained by contracting $(x_1, x_2)$. Let $T'$ be an outbranching in $D'$. Then an outbranching of $D$ can be obtained by the following procedure:
• remove $x$ and add $x_1$ and $x_2$;
• replace the edge from the parent $p$ of $x$ by (p,
Clearly, the number of leaves does not change. For the second direction, assume that $T$ is an outbranching of $D$. Then $T$ contains both $(x_1, y_1)$ and $(x_2, y_2)$, because they are cut-edges. In particular, $x_1$ and $x_2$ are not leaves in $T$. At least one of $x_1$, $x_2$ is not a descendant of the other in $T$; by symmetry assume that $x_1$ is not a descendant of $x_2$. Then remove the edge from the parent of $x_2$ to $x_2$ and add the edge $(x_1, x_2)$. Thus we obtain an outbranching $T'$ of $D$ that contains the edge $(x_1, x_2)$ and has at least as many leaves as $T$. By contracting the edge $(x_1, x_2)$ in $T'$ we get an outbranching of $D'$ with the same number of leaves.
Proof. Let D and D ′ denote the graph before and after applying the reduction. Since D ′ ⊆ D, any outbranching of D ′ is also an outbranching of D. Pick any outbranching T of D.
To complete the algorithm we need a final accepting rule which is applied when the resulting graph is too big. In Section 3.5 we prove that Rule 7 is correct for $H$-minor-free graphs for some constant $c = 2^{O(|H| \sqrt{\log |H|})}$.

Rule 7
If the graph has more than c · k vertices, return a trivial yes-instance (conclude that there is a rooted outbranching with at least k leaves in D).
We conclude with the following lemma.

Lemma 10.
Let $H$ be a graph. If the input is an $H$-minor-free graph $D$, then the output of each of the Rules 1-7 is a minor of $D$, and hence an $H$-minor-free graph. Moreover, each rule can be recognized and applied in polynomial time, and the degree of the polynomial does not depend on $H$.
Proof. The first claim follows from the fact that the rules modify the graph by means of deletions and contractions only. The second claim is straightforward to check.

A few simple properties of the reduced graph
In this section we state simple auxiliary lemmas, which will be used in the remainder of the paper.
Lemma 11. Assume reduction rules 1-4 do not apply to D. Let u be a cut-vertex in D, or u = r. Then every private neighbor v ∈ P (u) has indegree 1 and (u, v) is a cut-edge. In particular, the head of any cut-edge has indegree 1.
Proof. If v has indegree at least 2 then either Rule 1 applies, or Rule 4 applies to x = v and y being the other inneighbor of v. Any edge incoming to a vertex of indegree 1 is a cut-edge, so (u, v) is a cut-edge. The head of any cut-edge is a private neighbor of its tail, so the last claim also follows.
Lemma 12. If reduction rules 1-4 do not apply to D then the tail of any cut-edge is not a head of another cut-edge.
Lemma 13. If reduction rules do not apply to D then every bag is of size at most two and contains at most one edge. In particular, every bag has exactly one head and one tail.
Proof. Assume that there is a bag $B$ of size at least three. Since the cut-edges that get contracted to $v_B$ are lonely, and their heads have indegree 1 due to Lemma 11, these edges form a directed path, a contradiction with Lemma 12. The fact that a bag of size 2 cannot contain two edges follows from Rule 6.
Proof. If $|B| = 1$, then the claim is trivial, so assume $|B| \ge 2$. By Lemma 13, $|B| = 2$, i.e.,
Lemma 15. Assume reduction rules 1-4 do not apply to $D$. If bags $A$ and $B$ are linked then there is exactly one edge from $A$ to $B$ and exactly one edge from $B$ to $A$.
Proof. It suffices to show that there is exactly one edge from $A$ to $B$, since the other claim is symmetric. Assume for the contradiction that there are two edges $(a_1, b_1), (a_2, b_2) \in A \times B$. Note that $b_1 = b_2$, for otherwise we get a contradiction with Lemma 14. It follows that $a_1 \neq a_2$, since there are no two identical edges in $D$. Assume w.l.o.g. that $(a_1, a_2)$ is a cut-edge. Then Rule 4 applies (with $x = b_1$ and $y = a_2$), a contradiction.
Proof. We have that $\deg^+_D(r) \ge 2$ because otherwise Rule 2 would apply. Therefore, it suffices to show that every edge $(r, u)$ is a cut-edge, because the head of a branching cut-edge is always special in $D_c$ by Rule 6. This, however, follows from inapplicability of Rule 4 to $u$.

Decomposition into weak bipaths
The following lemma gives a structural connection between weak bipaths and special vertices.
Lemma 17. Assume no reduction rule applies to $D$. Let $S \subseteq V(D_c)$ be any set of vertices that contains the root $r$ and every special vertex of $D_c$. Then one can find weak bipaths $P_1, P_2, \ldots, P_q$, such that:
(i) The sets of internal vertices of $P_1, P_2, \ldots, P_q$ form a partition of $V(D_c) \setminus S$.
(ii) The extremities of each $P_i$ belong to $S$ and are distinct.
(iii) The out-neighbors of the internal vertices of each $P_i$ belong to $S$.
would be a cut-edge in $D$ and Rule 6 would apply, a contradiction. Otherwise, the bags of $u$ and $v$ are linked and by Lemmas 14 and 15, there is one edge from the bag of $u$ to the tail of the bag of $v$; clearly, this edge is a cut-edge in $D$. If $v$ was obtained from the contraction of a lonely cut-edge $(v_1, v_2)$, then this would be a contradiction with Lemma 12. Hence assume $v \in V(D)$. From Lemma 14 we infer that in $D$ there is an edge from $v$ to $t_{B_u}$. However, the edge from $B_u$ to $v$ has its tail in $t_{B_u}$ by Lemma 12. Then again Rule 6 would apply, a contradiction.
Since $v$ is not special, we get that $\deg^-_{D_c}(v) = 2$, and both of its in-neighbors are also its out-neighbors. Since $r \in S$ and $D_c$ is connected, we have that $D_c - S$ is a set of bidirectional paths, with each endpoint connected by two edges with opposite directions with a vertex of $S$. Thus we immediately obtain weak bipaths $P_1, P_2, \ldots, P_q$ that satisfy (i), (iii), as well as (ii) apart from the claim that the extremities are distinct. Suppose there is a weak bipath $P_i = u, v_2, v_3, \ldots, v_{p-1}, u$ such that both its extremities are in fact one vertex $u \in S$. By Lemma 16, $u \neq r$. Regardless of whether $u \in V(D)$ or $u$ is obtained by contracting some lonely cut-edge in $D$, we have that $x$, the tail of the bag of $u$, is a cut-vertex in $D$ whose removal disconnects all the bags of the internal vertices of $P_i$ from $r$. However, by Lemma 14 and the definition of a bipath we have that $x$ has an inneighbor in the bag of $v_2$. Then Rule 4 would apply to $x$, a contradiction.
Weak bipaths $P_1, \ldots, P_q$ given by Lemma 17 are called maximal bipaths. Note that for every such maximal bipath $P = v_1, v_2, \ldots, v_p$ and every $j = 2, \ldots, p - 1$, the bag $B_{v_j}$ is linked to $B_{v_{j-1}}$ and $B_{v_{j+1}}$, and to no other bag.

New lower bounds on the number of leaves
In this section our goal is to establish a number of lower bounds on the number of leaves. Each of the lower bounds is a linear function of a number of some type of vertices or structures in D. These bounds will help us prove that Rule 7 is correct. Indeed, to this end it suffices to focus on a no-instance and prove that it has at most ck vertices. Hence, if we know that maxleaf(D) is large when there are many vertices of some kind A, then we know that in our no-instance there are few vertices of kind A. In other words vertices of type A are "easy". In the next section we will show that because of sparsity arguments the number of the remaining vertices (not corresponding to an "easy type") is linear in the number of "easy" vertices.
In fact, instead of looking for "easy" vertices in $D$, we focus on $D_c$. This is justified by the fact that by Lemma
Proof. Let $(u, v)$ be the contracted cut-edge and let $x$ be the resulting vertex in $D'$. Consider any outbranching $T'$ of $D'$. Then let $T$ be obtained by the following procedure: 1) remove $x$ from $T'$, 2) add vertices $u$ and $v$, 3) add edge $(u, v)$, 4) if $p$ is the parent of $x$ in $T'$, add edge $(p, u)$, 5) for every edge $(x, y) \in E(T')$, add $(u, y)$ to $T$ if $(u, y) \in E(D)$ and add $(v, y)$ otherwise. Then clearly $T$ is an outbranching with at least the same number of leaves as $T'$. Hence it suffices to show that $T$ is a subgraph of $D$. Otherwise, $(p, u) \notin E(D)$. However, then $(p, v) \in E(D)$. Also, in $T'$ there is a path from the root to $p$ that avoids $x$. It follows that this path, extended by the edge $(p, v)$, is also contained in $D$, so $(u, v)$ is not a cut-edge, a contradiction.
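The five-step lifting procedure from this proof can be sketched in Python as follows. The sketch assumes that edges are stored as sets of ordered pairs and that $(u, v)$ is a cut-edge of $D$ contracted into the vertex `x`; the function and parameter names are hypothetical. It performs the construction only and does not itself verify that $(p, u) \in E(D)$, which is what the proof establishes.

```python
def lift_outbranching(d_edges, u, v, x, t_prime):
    """Turn an outbranching t_prime of D' (where the cut-edge (u, v) of D was
    contracted into x) into an edge set T over the vertices of D."""
    d_edges = set(d_edges)
    lifted = {(u, v)}                       # steps 1-3: x is replaced by u, v and the edge (u, v)
    for a, b in t_prime:
        if b == x:
            lifted.add((a, u))              # step 4: the parent of x now points to u
        elif a == x:
            # step 5: a child of x hangs from u if that edge exists in D, else from v
            lifted.add((u, b) if (u, b) in d_edges else (v, b))
        else:
            lifted.add((a, b))              # edges not touching x are kept
    return lifted
```

In the lifted tree the vertex $u$ is always internal, and a child $y$ of $x$ is attached to $v$ only when $(u, y)$ is missing from $D$, which is why the number of leaves does not decrease.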
Since all heads of cut-edges have indegree 1, and contraction of lonely cut-edges cannot spoil this property for other cut-edges, we infer that every cut-edge of $D$ remains a cut-edge in the process of obtaining $D_c$ from $D$ by contracting lonely cut-edges one by one. This yields the following.
Proof. Every special vertex in $D$ is still a special vertex in $D'$: duplicating can neither decrease the in-degree of a vertex, nor remove a simple in-edge. Hence $|\mathrm{sp}(D')| \ge |\mathrm{sp}(D)|$. Take a rooted maximum leaf outbranching $T'$ in $D'$. By symmetry, suppose that $v$ is not a descendant of $v'$.
Note that T is an outbranching in D. If both v and v ′ were leaves in T ′ then v is a leaf in T , so T has one leaf less than T ′ . Otherwise even if v is not a leaf in T the number of leaves drops by at most one. This finishes the proof.
Lemma 23. In any digraph $D$ such that rules 1-4 do not apply and every cut-edge is branching we have $\mathrm{maxleaf}(D) \ge |\mathrm{cv}(D)| + 1$.
Proof. Let $T$ be the spanning tree of $D$ obtained through a Breadth-First-Search started in $r$. Consider any cut-vertex $u$. Since $u$ is a cut-vertex, we have $|P(u)| \ge 1$. If $|P(u)| = 1$, let $v$ be the only private neighbor of $u$. The edge $(u, v)$ is then a lonely cut-edge, a contradiction. Therefore $|P(u)| \ge 2$. By Lemma 11 all the edges from $u$ to $P(u)$ are cut-edges, and hence they all belong to $T$. It follows that every cut-vertex in $D$ has at least two out-neighbors in $T$. Hence $T$ has at least $|\mathrm{cv}(D)| + 1$ leaves.
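The outbranching used in this proof is an ordinary BFS tree, so the bound can be checked on small examples with a few lines of Python. The sketch assumes the digraph is given as a dictionary `succ` of out-neighbor lists; the function names are illustrative, and the code does not check that the reduction rules are inapplicable.

```python
from collections import deque

def bfs_outbranching(succ, root):
    """Parent map of the BFS tree rooted at root (the root's parent is None)."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in succ.get(u, ()):
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return parent

def count_leaves(parent):
    """Vertices of the tree that are nobody's parent."""
    internal = {p for p in parent.values() if p is not None}
    return len(parent) - len(internal)
```

The final counting step of the proof is the standard fact that a rooted tree in which at least $t$ vertices have two or more children has at least $t + 1$ leaves.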
We are now ready to prove Theorem 21.
Proof of Theorem 21. Let $S$ be the set of cut-vertices in $D$. Take the digraph $D'$ obtained from $D$ by duplicating each vertex in $S$ (in any order). We claim that $D'$ is 2-connected. Indeed, assume that $D'$ contains a cut-vertex $u$; since a vertex and its duplicate are twins, we can assume that $u \in V(D)$. Since $D$ is a subgraph of $D'$, it follows that $u$ is a cut-vertex in $D$. Now $D'$ contains a duplicate $u'$ of $u$, so every vertex reachable from $r$ in $D'$ is still reachable in $D' - u$, a contradiction. Therefore
From Lemma 23 we have $\mathrm{maxleaf}(D) \ge |\mathrm{cv}(D)| + 1$. (2)
The claim now follows from adding (1) and (2).
Now it suffices to show that Theorem 21 can be applied to graph D c .
Lemma 24. Suppose $D$ is a rooted digraph that is connected. Then for any vertex $u \neq r$ that is not the head of a cut-edge, one can find two simple paths $P_1$, $P_2$ from $r$ to $u$ that end with different edges.
Proof. Let $R$ be the set of inneighbors of $u$ that are reachable from $r$ in $D - u$. Since $D$ is connected we have $R \neq \emptyset$, and if $|R| \ge 2$ then we would be done. Suppose therefore that $R = \{v\}$ for some vertex $v$ such that $(v, u) \in E(D)$. Then $(v, u)$ would be a cut-edge, a contradiction.
Lemma 24 will be most often used in the following setting. Suppose that we know that in $D$ the head of every cut-edge has indegree 1. Then if we know that some edge $(v, u)$ is not a cut-edge, then $u$ is not the head of any cut-edge, and hence we can apply Lemma 24 to it.
Proof. Induction on $|S'|$. The claim is trivially true for $|S'| = 0$. Assume $|S'| > 0$. Pick any cut-edge $(x, y) \in S'$ and let $D_0$ be the graph obtained from $D$ by contracting all edges of $S' \setminus \{(x, y)\}$; obviously $D_0$ is connected. From the induction hypothesis we have that the set of cut-edges of $D_0$ is a subset of the set of cut-edges of $D$, and hence from the fact that in $D$ all the heads of cut-edges have indegrees equal to 1, the same conclusion follows for $D_0$ as well. Hence, whenever in $D_0$ we conclude that an edge $(u, v)$ is not a cut-edge, then all the edges incoming to $v$ are also not cut-edges. We will show that contracting $(x, y)$ in $D_0$ does not create a new cut-edge in $D_1$.
Assume for a contradiction that (u, v) is a new cut-edge in D 1 , i.e., either (u, v) ∉ E(D 0 ) or (u, v) ∈ E(D 0 ) and is not a cut-edge in D 0 . In the former case we have two subcases: contracting (x, y) creates vertex u or v.
CASE 1 v is obtained by contracting (x, y). Then there is an edge (u, x) or (u, y) in D 0 . However, the latter situation is impossible because then an edge enters y in D, a contradiction with Lemma 11. Hence (u, x) ∈ D 0 , and in particular x ≠ r. Edges entering x in D are not cut-edges by Lemma 12, and hence by induction hypothesis no cut-edge enters x in D 0 . By Lemma 24 it follows that in D 0 there are two paths P 1 , P 2 from r to x, each entering x via a different edge, say P 1 by (a 1 , x) and P 2 by (a 2 , x), with a 1 ≠ a 2 . Note that a 1 ≠ y and a 2 ≠ y because by Rule 6 we have that (y, x) ∉ E(D). By replacing (a 1 , x) with (a 1 , v) in P 1 and (a 2 , x) with (a 2 , v) in P 2 we get two paths P ′ 1 and P ′ 2 from r to v in D 1 that end with different edges. It follows that (u, v) is not a cut-edge in D 1 , a contradiction.
CASE 2 u is obtained by contracting (x, y). Then D 0 contains (x, v) or (y, v). No other edge leaving x is a cut-edge in D because (x, y) is lonely in D. Also no edge leaving y is a cut-edge in D by Lemma 12. Hence by induction hypothesis neither (x, v) nor (y, v) can be a cut-edge in D 0 . Since v has an incoming edge that is not a cut-edge in D 0 , as explained before we infer that no edge incoming to v in D 0 is a cut-edge.
From Lemma 24 it follows that in D 0 there are two paths P 1 , P 2 from r to v, each entering v via a different edge. If (u, v) is a cut-edge in D 1 , then it means that P 1 ends with (x, v) and P 2 ends with edges (x, y), (y, v), because (x, y) is a cut-edge. If v ∈ D then Rule 4 would apply to D (with v as x), a contradiction. Otherwise v is obtained by contracting a cut-edge (v 1 , v 2 ). However, by Lemma 14, no other edge enters v 2 in D, so D contains both edges (x, v 1 ) and (y, v 1 ). Again, we see that Rule 4 applies to D (with v 1 as x), a contradiction.
CASE 3 Neither u nor v is obtained by contracting (x, y). Since (u, v) is not a cut-edge in D 0 , as in the previous case we infer that in fact no edge incoming to v is a cut-edge in D 0 . By Lemma 24, in D 0 there are two paths P 1 and P 2 from r to v, each ending with a different edge. Let us assume that P 1 ends by (a, v) and P 2 ends by (b, v), for some a ≠ b. Let P ′ 1 and P ′ 2 be the paths in D 1 obtained from P 1 and P 2 by contracting edge (x, y), and possibly omitting a loop in case both x and y were traversed by P 1 or P 2 . Then P ′ 1 and P ′ 2 end with different edges unless {x, y} = {a, b}. By symmetry suppose that (x, y) = (a, b). However, (x, y) is a cut-edge. Hence if v ∈ D then Rule 4 would apply to D (with v as x), a contradiction. Otherwise v is obtained by contracting a cut-edge (v 1 , v 2 ), and the same reasoning as in the previous case also gives a contradiction.

A bound on isolated vertices
We say that a bag B is special when v B is special in D c . We say that a bag B is isolated when B is a non-special bag of size 2 and there is no edge from t B to a special bag.
The set of all isolated vertices in D c is denoted by iso(D c ).
By shortcutting a vertex v ≠ r in a digraph D we mean creating a new digraph D ′ obtained from D by removing v and adding an edge (x, y) for every directed path (x, v, y) in D.
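The shortcutting operation can be sketched in a few lines. This is only an illustration, assuming the digraph is stored as a dictionary in which every vertex is a key mapped to the set of its out-neighbors; the function name `shortcut` is hypothetical.

```python
def shortcut(adj, v):
    """Return a new digraph with v removed and an edge (x, y) added
    for every directed path (x, v, y) of the input digraph.

    Assumption: every vertex of the digraph appears as a key of `adj`.
    """
    preds = [x for x, outs in adj.items() if v in outs and x != v]
    succs = [y for y in adj.get(v, ()) if y != v]
    # Copy the digraph without v and without edges pointing to v.
    new_adj = {x: {y for y in outs if y != v}
               for x, outs in adj.items() if x != v}
    for x in preds:
        for y in succs:
            if x != y:  # a path (x, v, y) has three distinct vertices
                new_adj[x].add(y)
    return new_adj


D = {"r": {"a"}, "a": {"v"}, "v": {"b", "c"}, "b": set(), "c": set()}
D_short = shortcut(D, "v")
```

In this example the only in-neighbor of v is a, so after shortcutting a points directly to b and c.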
Let D s be the graph obtained from D by (i) contracting all lonely cut-edges that form a non-isolated bag, and then (ii) shortcutting every tail of an isolated bag. Note that D s is not necessarily H-minor-free, but we will use it only as an auxiliary construction when establishing a lower bound on maxleaf(D) in terms of iso(D c ).
The proof of the following lemma can be found in [6]: Lemma 28. Let D be a digraph, and let D ′ be the digraph obtained from D by shortcutting a cut-vertex v. Then maxleaf(D) = maxleaf(D ′ ).
Lemmas 28 and 18 imply the following We also observe the following property.
Lemma 29. Suppose D is a connected rooted digraph where every head of a cut-edge has indegree 1. Let u be a vertex and suppose that r ∉ N⁻(u) and there is no vertex v ∈ N⁻(u) such that u becomes disconnected from r after removing v. Then after shortcutting u no new cut-edges appear in D.
Proof. Note that the assumption of the lemma implies that in D there is no cut-edge that enters u, so we can apply Lemma 24 to u. Let D and D ′ denote the graph before and after shortcutting u. Assume that a new cut-edge (x, y) appears in D ′ .
CASE 1 (x, y) ∈ E(D) and (x, y) is not a cut-edge in D. Since every head of a cut-edge has indegree 1, we infer that no cut-edge enters y. By Lemma 24, in D there are two paths P 1 and P 2 from r to y, ending with different edges e 1 and e 2 . Let P ′ 1 and P ′ 2 be the paths obtained from P 1 and P 2 by shortcutting u. If P ′ 1 and P ′ 2 end with the same edges as P 1 and P 2 , then (x, y) is not a cut-edge in D ′ , a contradiction. Otherwise observe that exactly one of P ′ 1 and P ′ 2 has changed the last edge, because otherwise e 1 = e 2 = (u, y). By symmetry assume e 1 = (u, y) and e 2 = (w, y), for some w ≠ u. Then P ′ 1 ends with (w, y) and (w, u) is the second last edge of P 1 , or otherwise we are done. By the assumption of the lemma, removal of w does not disconnect u from r, so there is a path Q from r to u that avoids w. If this path traverses y, then its prefix is a path from r to y in D ′ that enters y from a different vertex than w. Otherwise after prolonging Q with (u, y) and shortcutting u we obtain a path from r to y in D ′ that enters y from a different vertex than w. In both cases we obtained two paths from r to y in D ′ that end with different edges, which means that no edge incoming to y can be a cut-edge. This is a contradiction with (x, y) being a cut-edge.
CASE 2 (x, y) ∉ E(D), i.e., (x, y) is obtained by shortcutting u and (x, y) was not present in D. By Lemma 24, in D there are two simple paths P 1 and P 2 from r to u, ending with different edges (a, u) and (b, u), for some a ≠ b. If any of these paths traverses y, then some prefix of it is a path in D ′ from r to y that avoids the new edge (x, y), because (x, y) is not present in D. This is a contradiction with (x, y) being a cut-edge.
Suppose then that neither P 1 nor P 2 traverses y; in particular a ≠ y and b ≠ y. Then by replacing (a, u) by (a, y) and (b, u) by (b, y) we get two paths in D ′ from r to y ending by different edges, so (x, y) is not a cut-edge, a contradiction.
Let S be the set comprising r and all the special vertices of D c . Let us invoke Lemma 17 on the set S, and thus obtain a family of maximal weak bipaths P 1 , P 2 , . . . , P q with properties as in this lemma.
Consider the process of creating D s . After contracting all lonely cut-edges corresponding to non-isolated bags, by Lemma 25, no new cut-edges appear. We would like to derive the same conclusion for D s as well, however we must be careful due to the non-trivial prerequisites of Lemma 29.
Lemma 30. If reduction rules do not apply to D then every cut-edge in graph D s is branching.
Proof. Let D ′ be the graph after contracting the lonely cut-edges corresponding to non-isolated bags. As argued above, from Lemma 25 it follows that D ′ has no new cut-edge, i.e., all cut-edges of D ′ are either original branching cut-edges of D, or original lonely cut-edges of D that correspond to isolated bags.
In D c , every isolated vertex is some internal vertex on one of the bipaths P i . We can view the construction of D s from D ′ as follows: We iterate through the bipaths P 1 , P 2 , . . . , P q one by one. For each of them, we iterate through the internal vertices w of the bipath from left to right, and in D ′ we shortcut the tail of the bag corresponding to w provided this bag is isolated. We prove now that during this process we maintain the following invariant: (1) No new cut-edge has been created, and in particular all the heads of cut-edges in the current digraph have indegree 1.
(2) For every v ∈ D c that is an isolated vertex on some weak bipath, and t v is the tail of its bag, the following holds: as long as t v is not yet shortcutted, in D ′ there is no inneighbor of t v which is a cut-vertex whose removal disconnects t v from the root.
We now show that invariant (2) holds for every such v throughout the process, up to the point when t v is shortcutted. Let us fix v, and suppose v = v i lies on a maximal weak bipath P α = v 1 , v 2 , . . . , v p in D c , for some α ∈ {1, 2, . . . , q}. For simplicity, denote B j = B v j . By Lemmas 16 and 17, v 1 , v p ∈ S, r ∉ {v 1 , v p }, and v 1 ≠ v p . Note that vertices v 1 , v p are already present in D ′ . Let W be the set of (a) all vertices of D ′ that are contained in bags B i , for i = 2, . . . , p − 1, and (b) all vertices v i , for i = 1, 2, . . . , p, for which v i ∈ D ′ .
We now claim that in D ′ there is a path Q 1 from r to v 1 that avoids the vertices of W . Indeed, if v 1 was disconnected from r in D ′ − W , then any path from r to v 1 would need to use the unique edge from v p to B p−1 (or v p−1 ), so this edge would be a cut-edge in D ′ . This is a contradiction, because this edge was not a cut-edge in D, since cut-edges of D not residing in one bag must be branching and the head of each branching cut-edge in D is special in D c . Similarly, there is a path Q 2 from r to v p that avoids W .
From Lemmas 14 and 15 it follows that in D ′ there is a path R 1 from v 1 to t v that traverses consecutive bags B 1 , B 2 , . . . , B i−1 , B i (possibly contracted when constructing D ′ ), and in each it visits either only the tail, or first the tail and then the head. Similarly, there is a path R 2 from v p to t v that traverses consecutive bags B p , B p−1 , . . . , B i+1 , B i (possibly contracted when constructing D ′ ), and in each it visits either only the tail, or first the tail and then the head. In particular, since v 1 ≠ v p , we have that R 1 and R 2 are vertex-disjoint apart from the last vertex t v .
Let M 1 be the concatenation of Q 1 and R 1 , and similarly define M 2 . We now examine what happens with paths M 1 and M 2 during the process of obtaining D s from D ′ . Every shortcutting of a vertex gives rise to a natural transformation of simple paths in D ′ , where the traversal of the shortcutted vertex is replaced by the usage of a newly introduced edge. Observe that the prefix Q 1 can only get shortcutted during the process, and similarly holds for the prefix Q 2 . However, v 1 and v p are not being shortcutted. Finally, the internal vertices of both R 1 and R 2 also can get shortcutted, but we maintain the invariant that these suffixes remain vertex-disjoint.
Concluding, during the process of obtaining D s from D ′ , M 1 and M 2 are always two paths from r to t v , and their suffixes beginning from v 1 and v p are always vertex-disjoint apart from the last vertex. Moreover, the vertices appearing before v 1 on M 1 cannot become the inneighbors of t v during the shortcutting process due to not belonging to W ∪ {v 1 , v p }, and the symmetrical claim holds for M 2 as well. We conclude that at any moment of the process, the removal of any inneighbor of t v cannot affect both paths M 1 and M 2 at the same time.
Hence invariant (2) holds throughout the process. Invariants (1) and (2) are exactly the prerequisites of Lemma 29 when applied to shortcutting t v . Hence, by iteratively applying Lemma 29 we conclude that no new cut-edge appears in D s , and in particular every application shows that invariant (1) is maintained in the next step. Therefore, the cut-edges of D s are simply the branching cut-edges of the original digraph D.
Motivated by Lemma 30 and Equation (3), we are going to show that if there are many isolated vertices in D c , then there are many special vertices in D s , which, together with Theorem 21, implies the desired lower bound. Note that every non-special (in particular, every isolated) vertex in D c is an internal vertex of some weak bipath P i , and hence a non-special bag is linked to exactly two other bags, namely its neighbors on the bipath.
Lemma 31. Assume reduction rules do not apply to D. Suppose bag A is isolated. Then h A is special in D s or there is a non-special bag B linked to A such that h B , or v B if B gets contracted, is special in D s .
Proof. Since A is isolated, there is some bipath P i = v 1 , v 2 , . . . , v p such that v A = v a for some 2 ≤ a ≤ p − 1. Denote B i = B v i . Since Rule 2 does not apply, we infer that d + (t A ) ≥ 2. One of these edges goes to h A , whereas the second needs to go to one of the two neighboring bags on P i , because A is isolated. By symmetry, suppose that there is an edge from t A to B = B a+1 . Of course, B is linked to A and B is not special, because there is an edge from t A to B and A is isolated. We consider two cases regarding the size of B.
CASE 1 |B| = 2. We will show that at least one of h A , h B is special in D s . By Lemma 14, the edge from A to B is (t A , t B ). Then by Rule 5, (t B , t A ) ∉ E(D). By Lemma 14 it follows that (h B , t A ) ∈ E(D). By Lemma 15, (t A , t B ) and (h B , t A ) are the only edges between A and B.
Let t be the minimum index i < a such that B i+1 , . . . , B a are all isolated and there is an edge from t B j to t B j+1 for each i < j < a. By the minimality of t and Lemma 15, it follows that either B t is not isolated or there is an edge from h Bt to t B t+1 . In either case, D s has an edge e incoming to h B (or v B , if B gets contracted) from a vertex corresponding to bag B t (i.e., either from v Bt or h Bt ). If in D s there is no edge from h B (or v B ) to a vertex that corresponds to B t , then h B (v B ) is special in D s , and we are done. So assume that there is such an edge. It means that in D there must be an edge from t A to t B a−1 . Then t = a − 1, because otherwise Rule 5 would apply. By Lemma 15 in D there is no edge from h A to B a−1 .
We argued earlier that there is also no edge in D from h A to B a+1 = B. It follows that (h A , h B ) ∉ E(D s ). However, after shortcutting A we get (h B , h A ) ∈ E(D s ). Hence, h A is special in D s .
CASE 2 |B| = 1. By Lemmas 14 and 15, the only edges between A and B are (t A , t B ) and ∈ D s , then h A is special in D s and we are done. Otherwise, denoting C = B a−1 , it must hold that C is isolated (so in particular non-special) and there must be edges (h A , t C ), (t C , t A ) in D. Then by the same argument as in Case 1 (with C playing the role of A and A playing the role of B), h A or h C is special in D s . This ends the proof.

60
. By Lemma 31, to every isolated bag A we can assign a non-special bag B, such that h B is special in D s and either B = A or B is linked to A. By the definition, there are at most two bags linked to a non-special bag (corresponding to the neighbors of v B on a weak bipath in D c ). It follows that |sp(D s )| ≥ |iso(Dc)|
We will say that a vertex v of D c is easy when v = r, or v is special, or v is isolated in D c . A vertex that is not easy is called hard. We now invoke once more Lemma 17, but this time instead of S we take the set of all the easy vertices. Every maximal bipath obtained in this decomposition will be called a maximal hard bipath. In other words, a weak bipath in D c is hard if all its internal vertices are hard. The sets of all easy and hard vertices in  Proof. By Lemma 18 it suffices to show that maxleaf(D c ) ≥ sl(D c ). We will show that in fact every outbranching T of D c has at least sl(D c ) leaves.
Fix an arbitrary outbranching T of D c . It is easy to see that in any outbranching T , the number of leaves is equal to 1 + Σ_{u ∈ V(T)} max(deg⁺_T(u) − 1, 0). Consider a slave Z = v 1 , . . . , v ℓ with O(Z) = S and extremities v 1 , v ℓ ∈ S, and let M 1 , M 2 be its masters. Then either (v 1 , v 2 ) ∈ E(T ), or (v ℓ , v ℓ−1 ) ∈ E(T ). Let slave Z charge vertex v 1 in the former case, and charge vertex v ℓ in the latter case. Also on M 1 and M 2 at least one edge outgoing from v 1 and one edge outgoing from v ℓ is present in T . We conclude that the total contribution to the outdegrees in T of v 1 and v ℓ from M 1 , M 2 and their slaves is at least the number of times v 1 and v ℓ are charged by the slaves of M 1 , M 2 , plus 2 for M 1 and M 2 .
Let X ⊆ ea(D c ) be the set of easy vertices that are the extremities of some slave. Then On the other hand, from what we argued in the previous paragraph it follows that Σ_{u ∈ X} deg⁺_T(u) ≥ sl(D c ) + 2|F |, where F is the set of equivalence classes of slaves partitioned according to their masters. However, since every bipath has two extremities, it follows that |X| ≤ 2|F |. Hence Σ_{u ∈ X} deg⁺_T(u) − |X| ≥ sl(D c ) and T has at least sl(D c ) leaves.
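The leaf-counting identity used in this proof, namely that an outbranching T has exactly 1 + Σ max(deg⁺_T(u) − 1, 0) leaves, can be checked numerically on small trees. The sketch below assumes an out-tree given as a dictionary from each vertex to the list of its children; the helper names are illustrative only.

```python
def tree_vertices(children, root):
    """All vertices of the out-tree, collected by a simple traversal."""
    order, stack = [], [root]
    while stack:
        u = stack.pop()
        order.append(u)
        stack.extend(children.get(u, []))
    return order


def leaves_direct(children, root):
    """Count vertices with no children."""
    return sum(1 for u in tree_vertices(children, root)
               if not children.get(u))


def leaves_by_outdegree(children, root):
    """Count leaves as 1 plus the sum of max(outdegree - 1, 0)."""
    return 1 + sum(max(len(children.get(u, [])) - 1, 0)
                   for u in tree_vertices(children, root))


# r has 2 children, a has 3, b has 1; the leaves are c, d, e, f.
T = {"r": ["a", "b"], "a": ["c", "d", "e"], "b": ["f"]}
```

The identity follows from counting edges: a tree on n vertices has n − 1 edges, and each of the n − L internal vertices contributes its outdegree minus one to the sum, which therefore equals (n − 1) − (n − L) = L − 1.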

The size bound
In this section we prove the following theorem, which implies the correctness of Rule 7.
Theorem 34. Let H be a graph. Let D be an H-minor-free digraph such that rules 1-6 do not apply. If maxleaf(D) < k, then |V (D) Throughout the section we assume that rules 1-6 do not apply to D. The results from the previous section give a bound of O(k) on the number of easy vertices. Our plan in this section is to show a linear bound on the number of hard vertices in terms of |ea(D c )| + sl(D c ) and next get a bound on |V (D)| as a corollary. It follows that our task is to show that the total length of hard weak bipaths in D c is not too large. Let us state a few useful properties of such bipaths.
Lemma 35. Let ℓ ≥ 9 and let P ′ = v 1 , . . . , v ℓ be a hard bipath in D c such that v 1 and v ℓ are easy. For every i = 3, . . . , ℓ − 6 there is at least one edge in D from t Bv j , for some . . , ℓ − 6} and consider the length 4 bipath v i , . . . , v i+4 . For convenience denote B j = B v j . If for some j = i + 1, i + 2, i + 3 there is an edge from B j with head not in B j−1 ∪ B j+1 , then by Lemma 17(iii) this head is outside ∪ ℓ−1 j ′ =2 B v j ′ and we are done. Hence the edges leaving B i+1 , B i+2 , and B i+3 go only to the neighboring bags. Since Rule 3 does not apply, for some j = i, . . . , i + 4 the bag B j is of size 2. Since v j is hard, B j is not isolated. Hence, there is an edge e in D from t B j to a special bag B. Since v 2 , . . . , v ℓ−1 are hard, B is none of B 2 , . . . , B ℓ−1 Lemma 36. For any maximal hard weak bipath P ′ in D c , we have |hd(D c ) ∩ V (P ′ )| ≤ 10|O(P ′ )| + 6.
Proof. Let P ′ = v 1 , . . . , v ℓ . We can assume that ℓ ≥ 9, for otherwise |hd(D c ) ∩ V (P ′ )| ≤ 6 and the claim holds trivially. For convenience denote B i = B v i . By Lemma 35 there are at least ⌊ ℓ−4 5 ⌋ edges from tails of bags B 3 , . . . , B ℓ−2 to vertices outside ∪ ℓ−1 i=2 B v j . Let Z denote the set of these edges. We claim that for every vertex u ∈ V (D) there are at most two edges from Z with heads in u. Indeed, assume that u has got three in-neighbors t In what follows we are going to bound the size of D c using its sparsity properties. To this end we use an auxiliary bipartite graph G, called the bipath minor of D c , constructed as follows. We put V (G) = A ∪ B, where A = ea(D c ), and B is the set of all maximal hard bipaths in D c . For every maximal hard bipath P ′ in D c with extremities u, v ∈ ea(D c ), the neighborhood of the corresponding vertex in B is exactly O(P ′ ).
Lemma 37. Let H be a graph.
Proof. Consider an arbitrary hard vertex v of D c . Consider the maximal hard weak bipath P ′ in D c that contains v. Then P ′ corresponds to a vertex in B and by Lemma 36, it has at most 10|O(P ′ )| + 6 internal vertices. It follows that Note that G is a minor of (the undirected version of) D since it can be obtained from D c by edge contractions and deletions, and D c in turn is obtained from D by contractions. Hence, G is H-minor-free. Moreover, G is simple. By Lemma 4, we know that G is d H -degenerate, for d H = O(|H| log |H|). Let B m and B s denote the vertices in B for which the corresponding maximal hard bipath is master and slave, respectively. By (4) we get Let us bound each of the terms separately. By Lemma 6, we have By Corollary 7, there is a constant c H = 2 O(|H| √ log |H|) such that there are at most c H |A| distinct neighborhoods of vertices in B. For each such neighborhood S ⊆ A and for every pair of vertices u, v ∈ S there are at most two master bipaths P ′ with endpoints u and v and such that O(P ′ ) = S. Therefore for a fixed neighborhood S of size at most 2d H we have The claim follows.

k-Internal Out-Branching in graphs of bounded expansion
In this section we give a linear kernel for IOB on any graph class G of bounded expansion. To this end, we modify the approach of Gutin, Razgon and Kim [19]. Before we proceed to the argumentation, let us remark that Gutin et al. work with a slightly more general problem, where the root of the outbranching is not prescribed; of course, the outbranching is still required to span the whole vertex set. Note that the variant with a prescribed root r can be reduced to this variant simply by removing all in-arcs of r, which forces r to be the root of any outbranching of the given digraph. Since our kernel will be an induced subgraph of D and r will not be removed by any reduction, it will still be true that r is the only candidate for the root of an outbranching. Hence, the resulting instance will be equivalent in both variants. Therefore, from now on we work with the variant without a prescribed root in order to be able to use the observations of Gutin et al. as black boxes.
First, Gutin et al. observe that in an instance that cannot be easily resolved, one can find a small vertex cover (of the underlying undirected graph).
Lemma 38 ([19]). Given a digraph D, we can either build an out-branching with at least k internal vertices or obtain a vertex cover of size at most 2k − 2 in O(n^2 m) time.
For a given directed graph D and a vertex cover U in D we build an undirected bipartite graph B D,U as follows (see Fig. 3). Let W = V (D) \ U . Then,
A crown decomposition of an undirected graph G is a partitioning of V (G) into three parts C, H and R, such that
• C is an independent set.
• There are no edges between vertices of C and R. That is, H separates C and R.
• C can be partitioned into C m ∪ C u with |C m | = |H|, such that G[C m ∪ H] contains a perfect matching that matches each vertex of C m with a vertex of H.
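The three conditions above can be verified mechanically for a candidate partition. The following is a minimal sketch, assuming an undirected graph given as a vertex set and a list of edges, with the matching supplied explicitly as a dictionary from C m to H; the function name `is_crown_decomposition` and its signature are assumptions made for illustration.

```python
def is_crown_decomposition(vertices, edges, C_m, C_u, H, R, matching):
    """Check whether (C = C_m + C_u, H, R) is a crown decomposition.

    `matching` maps every vertex of C_m to its partner in H.
    """
    C = C_m | C_u
    E = {frozenset(e) for e in edges}
    # The parts must be pairwise disjoint and cover all vertices.
    if C_m & C_u or C & H or C & R or H & R:
        return False
    if (C | H | R) != set(vertices):
        return False
    # C is an independent set: no edge has both endpoints in C.
    if any(len(e) == 2 and e <= C for e in E):
        return False
    # H separates C from R: no edge joins C and R.
    if any(e & C and e & R for e in E):
        return False
    # The matching pairs C_m with H bijectively along edges of the graph.
    if len(C_m) != len(H):
        return False
    if set(matching) != C_m or set(matching.values()) != H:
        return False
    return all(frozenset((c, matching[c])) in E for c in C_m)


V = {"h", "c1", "c2", "x"}
edges = [("h", "c1"), ("h", "c2"), ("h", "x")]
ok = is_crown_decomposition(V, edges, {"c1"}, {"c2"}, {"h"}, {"x"},
                            {"c1": "h"})
```

In this toy example h separates the independent set {c1, c2} from x, c1 is matched to h, and c2 is the unmatched part C u.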
Crown decompositions are used in multiple kernelization algorithms. In particular, the following lemma, which Gutin et al. attribute to Fellows et al. [12], shows that in certain situations a crown decomposition can be found efficiently.
Lemma 39 (see [19]). Suppose G is an undirected graph on n vertices, and suppose I is an independent set in G such that |I| ≥ 2n/3. Then G admits a crown decomposition (C = C u ⊎ C m , H, R) with C ⊆ I, H ⊆ V (G) \ I and C u ≠ ∅. Moreover, given I, the decomposition (C = C u ⊎ C m , H, R) can be found in O(nm) time.
The main idea of Gutin et al. is to search for crowns in B D,U with C ⊆ W and C u ≠ ∅. Such crowns can be conveniently reduced using the following reduction rule, whose correctness is proved in Lemma 4.4 of [19].
Rule 1 Let U be a vertex cover in D and let W = V (D) \ U . Assume there is a crown decomposition (C = C m ∪ C u , H, R) in B D,U with C ⊆ W and C u ≠ ∅. Then remove C u from D.
Our idea is to combine Rule 1 with the knowledge that D belongs to a graph class of bounded expansion G, and hence Proposition 5 can be used to reason about the sparseness of the adjacency structure between U and W . Let us introduce some notation. Consider a vertex cover U and an independent set Our kernelization algorithm is as follows.
) is a crown decomposition of B. Apply Rule 1 to this crown decomposition in order to remove C u from D, and restart the algorithm in the reduced graph.

Otherwise, return D.
In case we have a prescribed root r of the outbranching that we would like to preserve in the kernelization process, we can add it to the constructed vertex cover U , thus increasing its size up to at most 2k − 1. The reduction rules never remove any vertex of U .
Given this algorithm, we can restate and prove our main result for IOB. Proof. The correctness of our kernelization algorithm and a polynomial bound on its running time follows from Lemmas 38 and 39. Note that the kernelization algorithm never decrements the budget k, so it suffices to show that it outputs an instance (D, k) such that |V (D)| = O(k). We can assume that the algorithm constructed a vertex cover U of D of size at most 2k − 2 (2k − 1 if we want to preserve a prescribed root), because otherwise the algorithm would terminate and provide a positive answer.
By the first claim of Proposition 5 we get |W b | ≤ 2∇ 0 (G)|U | ≤ 4∇ 0 (G)k. Hence it suffices to bound the size of W s . Note that W s = ⋃_{N ∈ N(U)} W N . By the second claim of Proposition 5 we get |N(U )| ≤ (4^{∇ 1 (G)} + 2∇ 1 (G))|U | = O(4^{∇ 1 (G)} k). Moreover, since Step 2 of the kernelization algorithm cannot be applied, for every N ∈ N(U ) we have |W N | ≤ 2|N B D,U (W N )|. However, by the construction of B D,U it is clear that . This finishes the proof.
Let us remark that in the proof of Theorem 2 we used only the boundedness of ∇ 0 (G) and ∇ 1 (G), so our algorithm works as well in any graph class where only these two grads are finite constants. Also, the kernelization algorithm has polynomial running time, where the degree of the polynomial is a constant independent of G.

Subexponential algorithms
Theorems 1 and 2 enable us to design subexponential parameterized algorithms for LOB and IOB on H-minor-free graphs using the standard approach via treewidth. To this end, we compose two facts: First, for a fixed forbidden minor H, every H-minor-free graph on n vertices has treewidth at most O(√n) [18]. Second, both k-Leaf Out-Branching and k-Internal Out-Branching can be solved in time 2^{O(t)} · n^{O(1)} on n-vertex graphs given together with their tree decompositions of width at most t, as explained next. For the latter ingredient, a standard approach to dynamic programming on tree decompositions would yield algorithms with running time 2^{O(t log t)} · n^{O(1)}, since we consider all possible partitions of a bag in the states of the dynamic programming table. However, both problems are amenable to recently developed techniques for constructing dynamic programming algorithms with running time 2^{O(t)} · n^{O(1)}. An application of the Cut&Count technique [5] immediately yields randomized algorithms with such a running time for both these problems. Actually, the existence of such algorithms also follows from expressibility in the logical formalism ECML+C proposed by the third author [24,25], which provides a meta-result on applicability of Cut&Count; the full paper [25], Appendix D, contains a formula for the problem of finding an outbranching with exactly k leaves, which can be trivially adjusted to express both k-Leaf Out-Branching and k-Internal Out-Branching. The Cut&Count technique has been recently derandomized by Bodlaender et al. [2], who proposed the so-called rank-based approach that yields deterministic 2^{O(t)} · n^{O(1)}-time algorithms for many problems amenable to Cut&Count. It is a simple exercise to see that using this technique one can also design such algorithms for k-Leaf Out-Branching and k-Internal Out-Branching. Thus, we have the following proposition.
Proposition 40. k-Leaf Out-Branching and k-Internal Out-Branching can be solved in deterministic time 2^{O(t)} · n^{O(1)} on an n-vertex graph given together with its tree decomposition of width t.
Gathering all the tools, we obtain the subexponential algorithms promised in Section 1. Proof. Let (D, k) be the input instance of LOB or IOB, where D is H-minor free. First, we apply the kernelization algorithm of Theorem 1 or 2 (depending on the problem) to reduce the size of the instance to O(k); note that the application of neither of these algorithms can increase the parameter. Having the reduced instance (D ′ , k ′ ) in hand, where k ′ ≤ k, D ′ is H-minor-free, and |V (D ′ )| = O(k), we infer that the treewidth of D ′ is in O( √ k). Hence we apply any constant-factor approximation algorithm for treewidth, e.g. [3], to compute a tree decomposition of D ′ of width O( We remark that the running time of the algorithms given by Theorem 3 is essentially optimal under the assumption of the Exponential Time Hypothesis (ETH), even already on planar directed graphs. More precisely, from the known NP-hardness reductions it follows that the existence of an algorithm for LOB or IOB working in time 2 o( √ N ) on a planar directed graph with N vertices would contradict ETH. For completeness, we sketch now how this conclusion can be derived.
Theorem 41. Unless ETH fails, there is no algorithm solving LOB or IOB that achieves running time 2^{o(√N)} on planar directed graphs with N vertices.
Proof. For IOB the statement follows easily from the known fact that the existence of such an algorithm for Planar Hamiltonian Cycle would contradict ETH, see e.g. [21]. First, Planar Hamiltonian Cycle can be Turing-reduced to Planar Hamiltonian Path by guessing an edge used in the solution and replacing it with two pendant vertices attached to its endpoints. Planar Hamiltonian Path can be now reduced to the variant of IOB on planar digraphs, where the root is not specified: simply replace every undirected edge by two directed edges with opposite orientations, and ask for an outbranching with at least N − 1 internal vertices. The variant with unspecified root is easily Turing-reducible to the one with specified root by simply guessing the root. Note that all the aforementioned reductions increase the instance size by at most a constant factor, and hence the statement for IOB follows.
We turn our attention to LOB. First, it is known that the Planar Vertex Cover problem does not admit an algorithm with running time 2^{o(√N)} on planar graphs with N vertices, see e.g. [13,21]. Garey and Johnson [17] proposed a reduction from Planar Vertex Cover to Planar Connected Vertex Cover that increases the number of vertices of the graph only by a constant multiplicative factor. This proves that also for Planar Connected Vertex Cover an algorithm with running time 2^{o(√N)} can be excluded under ETH. We further reduce Planar Connected Vertex Cover to Planar Connected Dominating Set using the following transformation: subdivide every edge of the graph, and add a pendant to every introduced subdividing vertex. It can be easily shown that a planar graph G has a connected vertex cover of size k if and only if the planar graph G ′ obtained in this transformation has a connected dominating set of size k + |E(G)|. Hence, under ETH there is no 2^{o(√N)}-time algorithm for Planar Connected Dominating Set. Now, we use the known fact that Connected Dominating Set is dual to the Max Leaf problem, that is, the problem of finding a spanning tree of an undirected graph with the maximum possible number of leaves. More precisely, a graph G has a connected dominating set of size at most k if and only if it admits a spanning tree with at least |V (G)| − k leaves; cf. [10]. Hence, under ETH there is no 2^{o(√N)}-time algorithm for Planar Max Leaf.
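The transformation from Connected Vertex Cover to Connected Dominating Set described above (subdivide every edge and attach a pendant to each subdividing vertex) can be sanity-checked by brute force on tiny graphs. The sketch below is illustrative only: the function names and the tuple labels `("sub", u, v)` and `("pend", u, v)` are assumptions, and the exhaustive search is exponential, so it is meant for graphs with a handful of vertices.

```python
from itertools import combinations


def subdivide_with_pendants(vertices, edges):
    """Subdivide every edge and hang a pendant on each subdividing vertex."""
    new_vertices, new_edges = list(vertices), []
    for u, v in edges:
        s, p = ("sub", u, v), ("pend", u, v)
        new_vertices += [s, p]
        new_edges += [(u, s), (s, v), (s, p)]
    return new_vertices, new_edges


def neighbours(vertices, edges):
    nb = {v: set() for v in vertices}
    for u, v in edges:
        nb[u].add(v)
        nb[v].add(u)
    return nb


def is_connected(subset, nb):
    """Is the subgraph induced by the nonempty set `subset` connected?"""
    start = next(iter(subset))
    seen, stack = {start}, [start]
    while stack:
        x = stack.pop()
        for y in nb[x] & subset:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return seen == subset


def min_connected_vertex_cover(vertices, edges):
    nb = neighbours(vertices, edges)
    for size in range(1, len(vertices) + 1):
        for chosen in combinations(vertices, size):
            s = set(chosen)
            if all(u in s or v in s for u, v in edges) and is_connected(s, nb):
                return size
    return None


def min_connected_dominating_set(vertices, edges):
    nb = neighbours(vertices, edges)
    for size in range(1, len(vertices) + 1):
        for chosen in combinations(vertices, size):
            s = set(chosen)
            if all(v in s or nb[v] & s for v in vertices) and is_connected(s, nb):
                return size
    return None


tri_v = ["a", "b", "c"]
tri_e = [("a", "b"), ("b", "c"), ("c", "a")]
g_v, g_e = subdivide_with_pendants(tri_v, tri_e)
```

For the triangle, the minimum connected vertex cover has size 2, while the transformed graph has 9 vertices and its minimum connected dominating set has size 2 + 3 = 5, in line with the claimed shift by |E(G)|.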
Finally, Planar Max Leaf can be reduced to the variant of LOB on planar graphs where the root is not specified by just replacing every undirected edge by two directed edges with opposite orientations. Again, the variant with unspecified root is easily Turing-reducible to the one with specified root by simply guessing the root. This proves the statement for LOB.

Concluding remarks
In this paper we have shown linear kernels for both k-Leaf Out-Branching and k-Internal Out-Branching on sparse graph classes: H-minor-free and of bounded expansion, respectively. We believe that our work is another good example of how abstract properties derived from the sparsity of the considered graph class, in particular the ones expressed in Proposition 5, can be used in the kernelization setting for a clean treatment of graph classes with excluded minors, without the need of invoking the decomposition theorem of Robertson and Seymour. Other examples of this approach include [11,16], and we hope that even more will appear in future.
In the light of our results, the question about the existence of linear kernels for k-Leaf Out-Branching and k-Internal Out-Branching on general graphs becomes even more tantalizing. We do not intend to take a stance about the actual answer, but after investigating both problems for some time we believe that in both cases a conceptual breakthrough is needed to make an improvement.