On Cycle Transversals and Their Connected Variants in the Absence of a Small Linear Forest

A graph is $H$-free if it contains no induced subgraph isomorphic to $H$. We prove new complexity results for the two classical cycle transversal problems Feedback Vertex Set and Odd Cycle Transversal by showing that they can be solved in polynomial time on $(sP_1+P_3)$-free graphs for every integer $s\geq 1$. We show the same result for the variants Connected Feedback Vertex Set and Connected Odd Cycle Transversal. We also prove that the latter two problems are polynomial-time solvable on cographs; this was already known for Feedback Vertex Set and Odd Cycle Transversal. We complement these results by proving that Odd Cycle Transversal and Connected Odd Cycle Transversal are NP-complete on $(P_2+P_5,P_6)$-free graphs.


Introduction
Graph transversal problems play a central role in Theoretical Computer Science. To define the notion of a graph transversal, let $\mathcal{H}$ be a family of graphs, G = (V, E) be a graph and S ⊆ V be a subset of vertices of G. The graph G − S is obtained from G by removing all vertices of S and all edges incident to vertices in S. We say that S is an $\mathcal{H}$-transversal of G if G − S is $\mathcal{H}$-free, that is, if G − S contains no induced subgraph isomorphic to a graph of $\mathcal{H}$. In other words, S intersects every induced copy of every graph of $\mathcal{H}$ in G. Let $C_r$ and $P_r$ denote the cycle and path on r vertices, respectively. Then S is a vertex cover, feedback vertex set, or odd cycle transversal if S is an $\mathcal{H}$-transversal for, respectively, $\mathcal{H} = \{P_2\}$ (that is, G − S is edgeless), $\mathcal{H} = \{C_3, C_4, \ldots\}$ (that is, G − S is a forest), or $\mathcal{H} = \{C_3, C_5, C_7, \ldots\}$ (that is, G − S is bipartite).
Usually the goal is to find a transversal of minimum size in some given graph. In this paper we focus on the decision problems corresponding to the three transversals defined above. These are the Vertex Cover, Feedback Vertex Set and Odd Cycle Transversal problems, which are to decide whether a given graph has a vertex cover, feedback vertex set or odd cycle transversal, respectively, of size at most k for some given positive integer k. Each of these three problems is well studied and is well known to be NP-complete.
We may add further constraints to a transversal. In particular, we may require a transversal of a graph G to be connected, that is, to induce a connected subgraph of G. The corresponding decision problems for the three above transversals are then called Connected Vertex Cover, Connected Feedback Vertex Set and Connected Odd Cycle Transversal, respectively. Garey and Johnson [15] proved that Connected Vertex Cover is NP-complete even on planar graphs of maximum degree 4 (see, for example, [14,31,36] for NP-completeness results for other graph classes). Grigoriev and Sitters [18] proved that Connected Feedback Vertex Set is NP-complete even on planar graphs with maximum degree 9. More recently, Chiarelli et al. [10] proved that Connected Odd Cycle Transversal is NP-complete even on graphs of arbitrarily large girth and on line graphs.
As all three decision problems and their connected variants are NP-complete, we can consider how to restrict the input to some special graph class in order to achieve tractability. Note that this approach is in line with the aforementioned results in the literature, where NP-completeness was proven on special graph classes. It is also in line with, for instance, polynomial-time results for Connected Vertex Cover by Escoffier, Gourvès and Monnot [12] (for chordal graphs) and Ueno, Kajitani and Gotoh [35] (for graphs of maximum degree at most 3 and trees).
Just as in most of these papers, we consider hereditary graph classes, that is, graph classes closed under vertex deletion. Hereditary graph classes form a rich framework that captures many well-studied graph classes. It is not difficult to see that every hereditary graph class $\mathcal{G}$ can be characterized by a (possibly infinite) set $\mathcal{F}_{\mathcal{G}}$ of forbidden induced subgraphs. If $|\mathcal{F}_{\mathcal{G}}| = 1$, say $\mathcal{F}_{\mathcal{G}} = \{H\}$, then $\mathcal{G}$ is said to be monogenic, and every graph $G \in \mathcal{G}$ is said to be H-free. Considering monogenic graph classes can be seen as a natural first step for increasing our knowledge of the complexity of an NP-complete problem in a systematic way. Hence, we consider the following research question:

How does the structure of a graph H influence the computational complexity of a graph transversal problem for input graphs that are H-free?
Note that different graph transversal problems may behave differently on some class of H-free graphs. However, the general strategy for obtaining complexity results is to first try to prove that the restriction to H-free graphs is NP-complete whenever H contains a cycle or the claw (the 4-vertex star). This is usually done by showing, respectively, that the problem is NP-complete on graphs of arbitrarily large girth (length of a shortest cycle) and on line graphs, which form a subclass of claw-free graphs. If this is the case, then we are left to consider the case when H does not contain a cycle, implying that H is a forest, and does not contain a claw either, implying that H is a linear forest, that is, the disjoint union of one or more paths.

The Graph H Contains a Cycle or Claw
It follows from Poljak's construction [30] that Vertex Cover is NP-complete on graphs of arbitrarily large girth. Hence, Vertex Cover is NP-complete on H-free graphs if H contains a cycle. However, Vertex Cover becomes polynomial-time solvable when restricted to claw-free graphs [25,32]. In contrast, the other five problems Connected Vertex Cover, (Connected) Feedback Vertex Set and (Connected) Odd Cycle Transversal are all NP-complete on graphs of arbitrarily large girth and on line graphs; see Table 1. Hence, for these five problems, it remains to consider only the case when H is a linear forest.

Table 1 The complexities of the three connected transversal problems together with the original transversal problems on graphs of girth at least p for every (fixed) constant p ≥ 3, on line graphs, and on H-free graphs for various linear forests H. In particular, Feedback Vertex Set can be shown to be NP-complete on graphs of arbitrarily large girth by using Poljak's construction (see [3,26]). We also note that Munro [28] showed that Feedback Vertex Set is NP-complete even on line graphs of planar cubic bipartite graphs. Unreferenced results directly follow from other results in the table, and results marked with * are new results proven in this paper. Our two other new results, namely that Odd Cycle Transversal and Connected Odd Cycle Transversal are NP-complete on $(P_2 + P_5, P_6)$-free graphs, are not included in the table.

The Graph H Is a Linear Forest
In this paper, we focus on proving new complexity results for Feedback Vertex Set, Connected Feedback Vertex Set, Odd Cycle Transversal and Connected Odd Cycle Transversal on H-free graphs. It follows from Sect. 1.1 that we may assume that H is a linear forest. Below we first discuss the known polynomial-time solvable cases. As we will use algorithms for Vertex Cover and Connected Vertex Cover as subroutines for our new algorithms, we include these two problems in our discussion.
For every s ≥ 1, Vertex Cover (by combining the results of [1,34]) and Connected Vertex Cover [10] are polynomial-time solvable on $sP_2$-free graphs.¹ Moreover, Vertex Cover is also polynomial-time solvable on $(sP_1 + P_6)$-free graphs, for every s ≥ 0 [20], as is the case for Connected Vertex Cover on $(sP_1 + P_5)$-free graphs [24]. Their complexity on $P_r$-free graphs is unknown for r ≥ 7 and r ≥ 6, respectively.
Both Feedback Vertex Set and Odd Cycle Transversal are polynomial-time solvable on permutation graphs [4], and thus on $P_4$-free graphs. Recently, Okrasa and Rzążewski [29] proved that Odd Cycle Transversal is NP-complete on $P_{13}$-free graphs. A small modification of their construction yields the same result for Connected Odd Cycle Transversal. The complexity of Feedback Vertex Set and Connected Feedback Vertex Set is unknown when restricted to $P_r$-free graphs for r ≥ 5. For every s ≥ 1, both problems and their connected variants are polynomial-time solvable on $sP_2$-free graphs [10], using the price of connectivity for feedback vertex set [2,21].²

Our Results
In Sect. 3 we prove that Connected Feedback Vertex Set and Connected Odd Cycle Transversal are polynomial-time solvable on $P_4$-free graphs, just as is the case for Feedback Vertex Set and Odd Cycle Transversal. In Sect. 4 we prove that for every s ≥ 1, these four problems are all polynomial-time solvable on $(sP_1 + P_3)$-free graphs; see also Table 1. Finally, in Sect. 5, we show that Odd Cycle Transversal and Connected Odd Cycle Transversal are NP-complete on $(P_2 + P_5, P_6)$-free graphs, that is, graphs that are both $(P_2 + P_5)$-free and $P_6$-free.
To prove our polynomial-time results, we rely on two proof ingredients. The first one is that we use known algorithms for Vertex Cover and Connected Vertex Cover restricted to H-free graphs as subroutines in our new algorithms. The second is that we consider the connected variant of the transversal problems in a more general form. For Connected Vertex Cover this variant is defined as follows: given a graph G = (V, E), a subset W ⊆ V and a positive integer k, Connected Vertex Cover Extension asks whether G has a connected vertex cover S with W ⊆ S and |S| ≤ k.

¹ The graph G + H is the disjoint union of graphs G and H and sG is the disjoint union of s copies of G; see Sect. 2.
² The price of connectivity concept was introduced by Cardinal and Levy [9] for vertex cover; see also, for example, [6][7][8].
Note that Connected Vertex Cover Extension becomes the original problem if W = ∅. We define the problems Connected Feedback Vertex Set Extension and Connected Odd Cycle Transversal Extension analogously. We will prove all our results for connected feedback vertex sets and connected odd cycle transversals for the extension versions. These extension versions will serve as auxiliary problems for some of our inductive arguments, but this approach also leads to slightly stronger results.

Remark 1
For any connected extension variant of these problems on $\mathcal{H}$-transversals, we may assume that the input graph G is connected. If it is not, then either all but at most one of the connected components of G are $\mathcal{H}$-free and do not intersect W, in which case they need not be considered, or the answer is immediately no. It is easy to check $\mathcal{H}$-freeness for the three problems we consider.

Remark 2
Note that one could also define extension versions for any original transversal problem (that is, where there is no requirement for the transversal to be connected). However, such extension versions will be polynomially equivalent. Indeed, we can solve the extension version on the input (G, W, k) by considering the original problem on the input (G − W, max{0, k − |W|}) and adding W to the solution. However, due to the connectivity condition, we cannot use this approach for the connected variants.
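The reduction in Remark 2 can be illustrated with a minimal sketch for Vertex Cover. The function names and the brute-force solver `min_vertex_cover` are assumptions made for this example only; any exact solver for the original problem could be substituted. The sketch also treats the case |W| > k explicitly as a no-instance.

```python
from itertools import combinations


def is_vertex_cover(adj, cover):
    """Return True if every edge of the graph has an endpoint in `cover`."""
    return all(u in cover or v in cover for u in adj for v in adj[u])


def min_vertex_cover(adj):
    """Brute-force stand-in for any exact Vertex Cover solver (exponential time)."""
    vertices = sorted(adj)
    for size in range(len(vertices) + 1):
        for cand in combinations(vertices, size):
            if is_vertex_cover(adj, set(cand)):
                return set(cand)


def remove_vertices(adj, removed):
    """Return the graph G - removed as a new adjacency dictionary."""
    return {u: adj[u] - removed for u in adj if u not in removed}


def vertex_cover_extension(adj, forced, k):
    """Return a vertex cover S with forced <= S and |S| <= k, or None."""
    forced = set(forced)
    if len(forced) > k:
        return None  # the forced set W alone already exceeds the budget
    # Solve the original problem on G - W, then add W back to the solution.
    rest = min_vertex_cover(remove_vertices(adj, forced))
    solution = forced | rest
    return solution if len(solution) <= k else None


# Hypothetical input: the path 0 - 1 - 2 - 3 with W = {0}.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
print(vertex_cover_extension(path, {0}, 2))
print(vertex_cover_extension(path, {0}, 1))
```

The same wrapper does not carry over to the connected variants, because the union of W with a solution for G − W need not induce a connected subgraph.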

Remark 3
It is known that Vertex Cover is polynomial-time solvable on $(P_1 + H)$-free graphs whenever this is the case on H-free graphs. This follows from a well-known observation, see, for example, [27]: one can solve the complementary problem of finding a maximum independent set in a $(P_1 + H)$-free graph by solving this problem on each H-free graph obtained by removing a vertex and all of its neighbours. However, this trick does not work for Connected Vertex Cover. Moreover, it does not work for Feedback Vertex Set and Odd Cycle Transversal and their connected variants either.
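The observation in Remark 3 can be sketched as follows. The names `mis_via_vertex_guessing` and `brute_force_mis` are hypothetical; the brute-force routine only stands in for whatever polynomial-time maximum independent set algorithm is available on H-free graphs.

```python
from itertools import combinations


def brute_force_mis(adj):
    """Stand-in for a maximum independent set solver on H-free graphs."""
    vertices = sorted(adj)
    for size in range(len(vertices), -1, -1):
        for cand in combinations(vertices, size):
            if all(adj[u].isdisjoint(cand) for u in cand):
                return set(cand)
    return set()


def mis_via_vertex_guessing(adj, solve_h_free=brute_force_mis):
    """Maximum independent set of a (P1 + H)-free graph via the folklore trick."""
    best = set()
    for v in adj:
        closed = adj[v] | {v}  # v together with all of its neighbours
        # G - N[v] is H-free: an induced H there, plus v, would induce P1 + H.
        rest = {u: adj[u] - closed for u in adj if u not in closed}
        cand = {v} | solve_h_free(rest)
        if len(cand) > len(best):
            best = cand
    return best


# Hypothetical input: the 5-vertex cycle, which is (P1 + P3)-free.
c5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
independent = mis_via_vertex_guessing(c5)
print(independent, set(c5) - independent)  # independent set and its complement
```

The complement of the returned independent set is a minimum vertex cover, which is how the trick transfers to Vertex Cover.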

Preliminaries
Let G = (V, E) be a graph. For a set S ⊆ V, we write G[S] to denote the subgraph of G induced by S. We say that S is connected if G[S] is connected. We write G − S to denote the graph G[V ⧵ S]. A subset D ⊆ V is a dominating set of G if every vertex of V ⧵ D is adjacent to at least one vertex of D. An edge uv of a graph G = (V, E) is dominating if {u, v} is a dominating set. The complement of G is the graph $\overline{G} = (V, \{uv \mid uv \notin E \text{ and } u \neq v\})$. The neighbourhood of a vertex u ∈ V is the set $N_G(u) = \{v \mid uv \in E\}$ and for U ⊆ V, we let $N_G(U) = \bigcup_{u \in U} N(u) \setminus U$. We omit the subscript when there is no ambiguity. We denote the degree of a vertex u ∈ V by $\deg(u) = |N_G(u)|$.
Let G = (V, E) be a graph and let S ⊆ V. Then S is a clique if the vertices of S are pairwise adjacent and an independent set if the vertices of S are pairwise nonadjacent. A graph is complete if its vertex set is a clique. We let $K_r$ denote the complete graph on r vertices. Let T ⊆ V with S ∩ T = ∅. Then S is complete to T if every vertex of S is adjacent to every vertex of T, and S is anti-complete to T if there are no edges between S and T. In the first case, we also say that S is complete to G[T] and in the second case anti-complete to G[T].
A graph is bipartite if its vertex set can be partitioned into at most two independent sets. A bipartite graph is complete bipartite if its vertex set can be partitioned into two independent sets X and Y such that X is complete to Y. If X or Y has size 1, the complete bipartite graph is said to be a star. Note that every edge of a complete bipartite graph is dominating.
Let $G_1$ and $G_2$ be two vertex-disjoint graphs. The union operation creates the disjoint union $G_1 + G_2$ of $G_1$ and $G_2$, that is, the graph with vertex set $V(G_1) \cup V(G_2)$ and edge set $E(G_1) \cup E(G_2)$. We denote the disjoint union of r copies of $G_1$ by $rG_1$. The join operation adds an edge between every vertex of $G_1$ and every vertex of $G_2$. A graph G is a cograph if G can be generated from $K_1$ by a sequence of join and union operations. A graph is a cograph if and only if it is $P_4$-free (see, for example, [5]).
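The two operations and the $P_4$-free characterization can be illustrated by a small sketch. The adjacency-dictionary representation and all function names are assumptions made for this example; the induced-$P_4$ test is a naive check over all 4-vertex subsets, not an efficient cograph recognition algorithm.

```python
from itertools import combinations, permutations


def single_vertex(name):
    """The graph K_1 on the vertex `name`."""
    return {name: set()}


def disjoint_union(g1, g2):
    """Union operation: G_1 + G_2 for vertex-disjoint graphs."""
    assert not (g1.keys() & g2.keys()), "graphs must be vertex-disjoint"
    g = {u: set(nb) for u, nb in g1.items()}
    g.update({u: set(nb) for u, nb in g2.items()})
    return g


def join(g1, g2):
    """Join operation: add every edge between G_1 and G_2."""
    g = disjoint_union(g1, g2)
    for u in g1:
        for v in g2:
            g[u].add(v)
            g[v].add(u)
    return g


def has_induced_p4(adj):
    """Naive test for an induced path p - q - r - s on four vertices."""
    for quad in combinations(adj, 4):
        for p, q, r, s in permutations(quad):
            path_edges = q in adj[p] and r in adj[q] and s in adj[r]
            no_chords = r not in adj[p] and s not in adj[p] and s not in adj[q]
            if path_edges and no_chords:
                return True
    return False


# A 4-cycle built from K_1 by unions and a join, hence a cograph.
a, b, c, d = (single_vertex(x) for x in "abcd")
c4 = join(disjoint_union(a, b), disjoint_union(c, d))
print(has_induced_p4(c4))
```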
The following lemma is well known, but we include a short proof for completeness.

Lemma 1
Every connected $P_4$-free graph on at least two vertices has a spanning complete bipartite subgraph which can be found in polynomial time.
Proof Let G be a connected $P_4$-free graph on at least two vertices. Then G is the join of two graphs G[X] and G[Y]. Hence, G has a spanning complete bipartite subgraph with partition classes X and Y. Note that this implies that $\overline{G}$ is disconnected. In order to find a (not necessarily unique) spanning complete bipartite subgraph of G with partition classes X and Y in polynomial time, we put the vertices of one connected component of $\overline{G}$ in X and all the other vertices of G in Y. ◻

Grzesik et al. [20] gave a polynomial-time algorithm for finding a maximum independent set of a $P_6$-free graph. As the complement V(G) ⧵ I of every independent set I of a graph G is a vertex cover, their result implies that Vertex Cover is polynomial-time solvable on $P_6$-free graphs. Using the folklore trick mentioned in Remark 3 (see also, for example, [24,27]) their result can also be formulated as follows.
Theorem 1 [20] For every s ≥ 0, Vertex Cover can be solved in polynomial time on $(sP_1 + P_6)$-free graphs.
We recall also that Connected Vertex Cover is polynomial-time solvable on $(sP_1 + P_5)$-free graphs [24]. We will need the extension version of this result. Its proof is based on a straightforward adaptation of the proof for Connected Vertex Cover on $(sP_1 + P_5)$-free graphs [24] (see "Appendix" for a proof).

Theorem 2 [24]
For every s ≥ 0, Connected Vertex Cover Extension can be solved in polynomial time on $(sP_1 + P_5)$-free graphs.

The Case $H = P_4$
Recall that Brandstädt and Kratsch [4] proved that Feedback Vertex Set and Odd Cycle Transversal can be solved in polynomial time on permutation graphs, which form a superclass of the class of $P_4$-free graphs. Hence, we obtain the following proposition.

Proposition 1 [4]
Feedback Vertex Set and Odd Cycle Transversal can be solved in polynomial time on $P_4$-free graphs.
In this section, we prove that the (extension versions of the) connected variants of Feedback Vertex Set and Odd Cycle Transversal are also polynomial-time solvable on $P_4$-free graphs. We make use of Proposition 1 in the proofs.

Theorem 3 Connected Feedback Vertex Set Extension can be solved in polynomial time on $P_4$-free graphs.
Proof Let G = (V, E) be a $P_4$-free graph on n vertices and let W be a subset of V. By Remark 1, we may assume that G is connected. By Lemma 1, in polynomial time we can find a spanning complete bipartite subgraph G′ = (X, Y, E′), and we note that, by definition, every edge in G′ is dominating. Below, in Step 1, in polynomial time we compute a smallest connected feedback vertex set of G that contains W and intersects both X and Y. In Step 2, in polynomial time we compute a smallest connected feedback vertex set of G that contains W and that is a subset of either X or Y (if such a set exists). Then the smallest set found is a smallest connected feedback vertex set of G that contains W.
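The spanning complete bipartite subgraph used in this proof comes from the construction in the proof of Lemma 1: take one connected component of the complement as X and everything else as Y. A minimal sketch, with hypothetical function names and an adjacency-dictionary representation assumed for illustration:

```python
def complement(adj):
    """Return the complement graph on the same vertex set."""
    vertices = set(adj)
    return {u: vertices - adj[u] - {u} for u in adj}


def component_of(adj, start):
    """Return the vertex set of the connected component containing `start`."""
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def spanning_complete_bipartite(adj):
    """Return classes (X, Y) with X complete to Y, or None if none is found.

    For a connected P4-free graph on at least two vertices the complement
    is disconnected, so Y is non-empty and the pair is always returned.
    """
    co = complement(adj)
    X = component_of(co, next(iter(adj)))
    Y = set(adj) - X
    if not Y:
        return None  # complement is connected: the input is outside Lemma 1
    return X, Y


# Hypothetical input: a 4-cycle with sides {0, 1} and {2, 3}.
c4 = {0: {2, 3}, 1: {2, 3}, 2: {0, 1}, 3: {0, 1}}
print(spanning_complete_bipartite(c4))
```

Every edge between X and Y is present in G because no edge of the complement joins two different components of the complement.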

Step 1 Compute a smallest connected feedback vertex set S of G such that W ⊆ S, S ∩ X ≠ ∅ and S ∩ Y ≠ ∅.
We perform Step 1 as follows. Consider two vertices u ∈ X and v ∈ Y. We shall describe how to find a smallest connected feedback vertex set of G that contains W ∪ {u, v}. We find a smallest feedback vertex set S′ in G − (W ∪ {u, v}). As G − (W ∪ {u, v}) is $P_4$-free, this takes polynomial time by Proposition 1. Then S′ ∪ W ∪ {u, v} is a smallest feedback vertex set of G that contains W ∪ {u, v} and is connected, since uv is a dominating edge. By repeating this polynomial-time procedure for all $O(n^2)$ possible choices of u and v, we will find S in polynomial time.
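Step 1 can be sketched as a loop over all pairs (u, v) with u ∈ X and v ∈ Y. The brute-force routine `min_fvs_brute` is only a stand-in for the polynomial-time algorithm of Proposition 1, and all function names are assumptions made for this illustration.

```python
from itertools import combinations


def remove_vertices(adj, removed):
    """Return the graph G - removed."""
    return {u: adj[u] - removed for u in adj if u not in removed}


def count_components(adj):
    """Count connected components with an iterative depth-first search."""
    seen, components = set(), 0
    for start in adj:
        if start in seen:
            continue
        components += 1
        seen.add(start)
        stack = [start]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
    return components


def is_forest(adj):
    """A graph is a forest iff |E| = |V| - (number of components)."""
    edges = sum(len(nb) for nb in adj.values()) // 2
    return edges == len(adj) - count_components(adj)


def min_fvs_brute(adj):
    """Brute-force stand-in for a Feedback Vertex Set solver."""
    vertices = sorted(adj)
    for size in range(len(vertices) + 1):
        for cand in combinations(vertices, size):
            if is_forest(remove_vertices(adj, set(cand))):
                return set(cand)


def step1_connected_fvs(adj, X, Y, W, min_fvs=min_fvs_brute):
    """Smallest connected FVS containing W that meets both X and Y."""
    best = None
    for u in X:
        for v in Y:
            forced = set(W) | {u, v}
            # uv is a dominating edge, so forced | extra induces a connected graph.
            extra = min_fvs(remove_vertices(adj, forced))
            cand = forced | extra
            if best is None or len(cand) < len(best):
                best = cand
    return best


# Hypothetical input: K_4 with X = {0}, Y = {1, 2, 3} and W = {3}.
k4 = {i: {0, 1, 2, 3} - {i} for i in range(4)}
print(step1_connected_fvs(k4, {0}, {1, 2, 3}, {3}))
```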
Step 2 Compute a smallest connected feedback vertex set S of G such that S ⊆ X or S ⊆ Y.
For Step 2 we describe only the S ⊆ X case, as the S ⊆ Y case is symmetric. Thus we may assume that W ⊆ X, otherwise no such set exists. Clearly, we may also assume that G[Y] contains no cycles. If G[Y] contains an edge it follows that S = X, otherwise G − S would contain a triangle. Suppose instead that Y is an independent set. If |Y| = 1, then X ⧵ S must be an independent set, otherwise G − S contains a triangle. So S is a smallest connected vertex cover of G[X] that contains W. As G[X] is $P_4$-free, we can find such an S in polynomial time by Theorem 2. If |Y| ≥ 2, then X ⧵ S contains at most one vertex, otherwise two vertices of X ⧵ S together with two vertices of Y would form a cycle in G − S, so it suffices to check the O(n) sets S with |X ⧵ S| ≤ 1.

Theorem 4 Connected Odd Cycle Transversal Extension can be solved in polynomial time on $P_4$-free graphs.
Proof We only provide an outline, as the proof follows that of Theorem 3. We perform the same two steps. In Step 1, we need to find a smallest odd cycle transver-

In this section, we will prove that Feedback Vertex Set and Odd Cycle Transversal and their connected variants can be solved in polynomial time on $(sP_1 + P_3)$-free graphs. We need three structural results. First, let us define a function c on the nonnegative integers by $c(s) := \max\{3, 2s - 1\}$. We will use this function c throughout the remainder of this section, starting with the following lemma.

Proof First note that the s = 0 case of the lemma is trivially true, as every connected component of a bipartite $P_3$-free graph has at most two vertices.
Suppose, for contradiction, that G has a connected component $C_1$ on at least c(s) vertices and a connected component $C_2$ on at least three vertices. As $C_1$ is bipartite and contains at least 2s − 1 vertices, $C_1$ contains an independent set of s vertices that induce $sP_1$. As $C_2$ is bipartite and contains at least three vertices, $C_2$ has a vertex v of degree at least 2, and so v and two of its neighbours induce a $P_3$. Thus G is not $(sP_1 + P_3)$-free, a contradiction.

Similarly, if G contains a connected component $C_1$ on at least c(s) ≥ 3 vertices, then this component contains an induced $P_3$. Since G is $(sP_1 + P_3)$-free, G can contain at most s − 1 connected components other than $C_1$. ◻

The internal vertices and leaves of a tree are the vertices of degree at least 2 and degree 1, respectively.

Lemma 3 Let s ≥ 0 be an integer. Let T be an $(sP_1 + P_3)$-free tree. Then T has at most 4s internal vertices.
Proof Let U be the set of internal vertices of T. Suppose that |U| ≥ 4s + 1 ≥ 1. We will show that this leads to a contradiction. As a path with at least 4s + 1 internal vertices contains an induced $sP_1 + P_3$, we may assume that T is not a path and so has at least three leaves. Hence |V(T)| ≥ 4s + 4.
Let X and Y be the two bipartition sets of T, and assume without loss of generality that |X| ≥ 2s + 2. For Z ∈ {X, Y}, let $L_Z$ and $U_Z$ be the leaves and internal vertices of T that belong to Z. If there is a vertex in Y of degree at least 2 that is anti-complete to a set of s vertices of X, then T contains an induced $sP_1 + P_3$, a contradiction. Therefore we may assume that every vertex of Y either has degree at least |X| − s + 1 or is in $L_Y$. Then and $|L_X| = 0$, or $|U_X| = 0$ and $|L_X| \leq 1$. Both cases contradict the assumption that X has at least 2s + 2 vertices. Now suppose $|U_Y| = 1$. Then, by our assumption that . Now it is easy to find an induced $sP_1 + P_3$ (see Fig. 1), and this contradiction completes the proof. ◻

The bound of 4s in Lemma 3 is not tight but, as we shall see later, it suffices for our purposes.
Lemma 4 Let s ≥ 0 be an integer. Let G be a connected $(sP_1 + P_3)$-free graph, and let U be a set of vertices in G. Then there is a set of vertices R in G such that G[R ∪ U] is connected and $|R| \leq 2s^2 - 2s + 3$.

Proof If G[U] is connected, then let R = ∅. Otherwise, since G cannot now be a complete graph, it contains an induced path P on three vertices. The number of connected components of G[U] that do not contain a vertex that is either in P or adjacent to a vertex of P in G is at most s − 1, otherwise G contains an induced $sP_1 + P_3$. Let R contain the vertices of P and the internal vertices of shortest paths in G from P to each set of vertices that induces a connected component of G[U]. As at most s − 1 of these shortest paths have more than zero internal vertices, and as each contains at most 2s internal vertices (any longer path contains an induced $sP_1 + P_3$), it follows that $|R| \leq 3 + 2s(s - 1) = 2s^2 - 2s + 3$. As G[R ∪ U] is connected, the lemma is proved. ◻

We now prove our four results. For the connected variants, we consider the more general extension versions.
Theorem 5 For every s ≥ 0, Feedback Vertex Set can be solved in polynomial time on $(sP_1 + P_3)$-free graphs.

Fig. 1 The structure of the tree T in the proof of Lemma 3 in the case when $|U_Y| = 1$. The set $L_X$ is an independent set of vertices that each are adjacent to the unique vertex $y \in U_Y$. The set $L_Y$ is partitioned into independent sets of vertices that have the same neighbour in $U_X$. The vertices y, x, z, together with s vertices of $L_Y$ not adjacent to x, induce an $sP_1 + P_3$ in T (which leads to the desired contradiction in the proof)

Proof Let s ≥ 0 be an integer, and let G = (V, E) be an $(sP_1 + P_3)$-free graph. We must show how to find a smallest feedback vertex set of G. We will in fact show how to find a largest induced forest of G, the complement of a smallest feedback vertex set. The proof is by induction on s. If s = 0, then we can use Proposition 1. We now assume that s ≥ 1 and that we have a polynomial-time algorithm for finding a largest induced forest in $((s - 1)P_1 + P_3)$-free graphs. Our algorithm performs the following two steps in polynomial time. Together, these two steps cover all possibilities.
Step 1 Compute a largest induced forest F such that every connected component of F has at least c(s) vertices.
By Lemma 2 we know that F will be connected, and so by Lemma 3 F will be a tree with at most 4s internal vertices. We consider every possible choice U of a non-empty set of at most 4s vertices. There are $O(n^{4s})$ choices. If U induces a tree, we will find a largest induced tree whose internal vertices all belong to U. This can be found by adding to U the largest possible set of vertices that are independent and belong to the set R of vertices in G − U that each have exactly one neighbour in U. That is, we need a largest independent set in G[R] and, by Theorem 1, such a set can be found in polynomial time.
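The enumeration in Step 1 can be sketched as below. The parameter `max_core` plays the role of the bound 4s, and `brute_force_mis` is a stand-in for the algorithm of Theorem 1; both names, and the adjacency-dictionary representation, are assumptions made for this illustration. The sketch omits the requirement that the tree has at least c(s) vertices.

```python
from itertools import combinations


def induced(adj, keep):
    """Return the subgraph induced by the vertex set `keep`."""
    return {u: adj[u] & keep for u in keep}


def is_tree(adj):
    """A non-empty graph is a tree iff it is connected with |V| - 1 edges."""
    if not adj:
        return False
    edges = sum(len(nb) for nb in adj.values()) // 2
    start = next(iter(adj))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == len(adj) and edges == len(adj) - 1


def brute_force_mis(adj):
    """Stand-in for the maximum independent set algorithm of Theorem 1."""
    vertices = sorted(adj)
    for size in range(len(vertices), -1, -1):
        for cand in combinations(vertices, size):
            if all(adj[u].isdisjoint(cand) for u in cand):
                return set(cand)
    return set()


def largest_tree_with_small_core(adj, max_core, mis_solver=brute_force_mis):
    """Largest induced tree whose internal vertices lie in a set U, |U| <= max_core."""
    best = set()
    for size in range(1, max_core + 1):
        for core in combinations(sorted(adj), size):
            U = set(core)
            if not is_tree(induced(adj, U)):
                continue
            # R: vertices outside U with exactly one neighbour in U.
            R = {v for v in adj if v not in U and len(adj[v] & U) == 1}
            leaves = mis_solver(induced(adj, R))
            if len(U) + len(leaves) > len(best):
                best = U | leaves
    return best


# Hypothetical input: a triangle 0-1-2 with a pendant vertex 3 attached to 0.
g = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
print(largest_tree_with_small_core(g, max_core=2))
```

Each vertex added from R has exactly one neighbour in U and no neighbour among the other added vertices, so the result always induces a tree.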
Step 2 Compute a largest induced forest F such that F has a connected component with at most c(s) − 1 vertices.
We

Proof There are similarities to the proof of Theorem 5, but more arguments are needed. Let s ≥ 0 be an integer, let G = (V, E) be a connected $(sP_1 + P_3)$-free graph and let W be a subset of V. We must show how to find a smallest connected feedback vertex set of G that contains W in polynomial time. We show how to solve the complementary problem in polynomial time: how to find a largest induced forest F of G that does not include any vertex of W and for which V ⧵ V(F) is connected. We will say that an induced forest F is good if it has these two properties. Our algorithm performs the following three steps in polynomial time. Together, these three steps cover all possibilities.
Step 1 Compute a largest good induced forest F such that there is a connected component of F that has at least c(s) vertices.
By Lemma 2 we know that F has exactly one connected component on at least c(s) vertices and there are at most s − 1 other connected components of F, each on at most two vertices. By Lemma 3, the connected component on at least c(s) vertices has at most 4s internal vertices. We consider $O(n^{4s+2(s-1)})$ choices of a non-empty set U of at most 4s vertices that induces a tree and a set U′ of at most 2(s − 1) vertices that induces a disjoint union of vertices and edges such that U ∪ U′ does not intersect W, U is disjoint from U′ and no vertex of U has a neighbour in U′. Let R be the set of vertices that each have exactly one neighbour in U and no neighbour in U′, but do not belong to W. We then add to U ∪ U′ the largest possible set L of vertices that are independent and belong to the set R such that G − (L ∪ U ∪ U′) is connected. This is achieved by taking the complement of the smallest connected vertex cover of G − (U ∪ U′) that contains V ⧵ (R ∪ U ∪ U′). By Theorem 2, this can be done in polynomial time.
Step 2 Compute a largest good induced forest F such that F has at most s − 1 connected components and each connected component has at most c(s) − 1 vertices.
Since the number of vertices in F is bounded by the constant (s − 1)(c(s) − 1) , we can simply check all sets containing at most that many vertices to see if they induce such a good forest.

Step 3 Compute a largest good induced forest F such that F has at least s connected components and each connected component has at most c(s) − 1 vertices.
We

Let S = R ∪ U ∪ W. If G − S is a forest, then we are done. Otherwise note that G − (L ∪ S) is the disjoint union of one or more complete graphs: G − (L ∪ S) cannot contain an induced $P_3$, as it is anti-complete to L which contains an induced $sP_1$.
As G is connected, each of the complete graphs in G − (L ∪ S) contains at least one vertex that is adjacent to some vertex of S. Hence in polynomial time we can find a set S′ of vertices containing all but min{2, |X|} vertices from each of the complete graphs X in such a way that G[S ∪ S′] is connected. Then G − (S ∪ S′) is a largest good induced forest that contains L and no vertex of R ∪ U.
After considering each of the $O(n^{2s^2-2s+3})$ choices for R, in polynomial time we find a largest good induced forest that contains L and no vertex of U.

Proof Let s ≥ 0 be an integer, and let G = (V, E) be an $(sP_1 + P_3)$-free graph. We must describe how to find a smallest odd cycle transversal of G. If s = 0, then we can use Proposition 1. We now assume that s ≥ 1 and use induction. We will in fact describe how to solve the complementary problem and find a largest induced bipartite subgraph of G. The proof is by induction on s and our algorithm performs two steps in polynomial time, which together cover all possibilities.

Step 1 Compute a largest induced bipartite subgraph B such that every connected component of B has at least c(s) vertices.
Fig. 2 The decomposition of the $(sP_1 + P_3)$-free graph G, as given in Step 3 of the algorithm from the proof of Theorem 6

By Lemma 2, we know that B will be connected. Hence, B has a unique bipartition, which we denote {X, Y}. We first find a largest induced bipartite subgraph B that is a star: we consider each vertex x and find a largest induced star centred at x by finding a largest independent set in N(x). This can be done in polynomial time by Theorem 1.
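The star case can be sketched in a few lines: for each candidate centre x, a largest induced star centred at x is x together with a largest independent set in G[N(x)]. The function names are hypothetical and `brute_force_mis` stands in for the algorithm of Theorem 1.

```python
from itertools import combinations


def brute_force_mis(adj):
    """Stand-in for the maximum independent set algorithm of Theorem 1."""
    vertices = sorted(adj)
    for size in range(len(vertices), -1, -1):
        for cand in combinations(vertices, size):
            if all(adj[u].isdisjoint(cand) for u in cand):
                return set(cand)
    return set()


def largest_induced_star(adj, mis_solver=brute_force_mis):
    """Return the vertex set of a largest induced star of the graph."""
    best = set()
    for x in adj:
        neighbourhood = adj[x]
        # Leaves of a star centred at x form an independent set in G[N(x)].
        sub = {u: adj[u] & neighbourhood for u in neighbourhood}
        leaves = mis_solver(sub)
        if 1 + len(leaves) > len(best):
            best = {x} | leaves
    return best


# Hypothetical input: a triangle 0-1-2 with a pendant vertex 3 attached to 0.
g = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
print(largest_induced_star(g))
```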
Next, we find a largest induced bipartite subgraph B that is not a star. We consider each of the $O(n^2)$ choices of edges xy of G and find a largest induced connected bipartite subgraph B such that x ∈ X and y ∈ Y and neither x nor y has degree 1 in B (since B is not a star, it must contain such a pair of vertices). Note that the number of vertices in X non-adjacent to y is at most s − 1, otherwise B contains an induced $sP_1 + P_3$. Similarly there are at most s − 1 vertices in Y non-adjacent to x. We consider each of the $O(n^{2s-2})$ possible pairs of disjoint sets X′ and Y′, which are each independent sets of size at most s − 1 such that X′ ∪ Y′ is anti-complete to {x, y}. We will find a largest induced bipartite subgraph with partition classes X and Y such that {x} ∪ X′ ⊆ X and {y} ∪ Y′ ⊆ Y and every vertex in X ⧵ X′ is adjacent to y and every vertex in Y ⧵ Y′ is adjacent to x. That is, we must find a largest independent set in both N(x) ⧵ N({y} ∪ Y′) and N(y) ⧵ N({x} ∪ X′); see Fig. 3 for an illustration. This can be done in polynomial time, again by applying Theorem 1.
Step 2 Compute a largest induced bipartite subgraph B such that B has a connected component with at most c(s) − 1 vertices.

Proof Let s ≥ 0 be an integer, let G = (V, E) be a connected $(sP_1 + P_3)$-free graph and let W be a subset of V. We must describe how to find a smallest connected odd cycle transversal of G that contains W. We will solve the complementary problem: how to find a largest induced bipartite graph of G that does not include any vertex of W and whose complement is connected. We will say that an induced bipartite graph B is good if it has these two properties. Our algorithm consists of three steps, which can each be performed in polynomial time and which together cover all the possible cases.

Step 1 Compute a largest good induced bipartite subgraph B such that B has a bipartition {X, Y} in which one set, say X, has size |X| ≤ s . (Note that this includes the case when every connected component of B has at most two vertices and B has at most s connected components.)
We consider $O(n^s)$ choices of an independent set X of at most s vertices of G that does not intersect W. We wish to find Y, the largest possible independent set in G − (W ∪ X) such that G − (X ∪ Y) is connected. By Theorem 2, we can do this in polynomial time by computing a minimum connected vertex cover of G − X that contains W and taking its complement (in G − X).

Step 2 Compute a largest good induced bipartite subgraph B such that B has at least s connected components and each connected component has at most two vertices.
Note that 2 ≤ c(s) − 1 . The algorithm mimics Step 3 of the algorithm in the proof of Theorem 6, but checks for a good bipartite graph instead of a good forest.

Step 3 Compute a largest good induced bipartite subgraph B such that there is a connected component of B that has at least three vertices and B has a bipartition {X, Y} with |X| ≥ s + 1 and |Y| ≥ s + 1.
It is in this case that we must do most of the work in proving the theorem, and here we will need ideas beyond those already met in this section.
As B contains a connected component on at least three vertices, it will contain an induced $P_3$ and so |X| ≥ 1 and |Y| ≥ 1. We consider $O(n^{2s+2})$ choices of disjoint independent sets X′ and Y′ that each contain s + 1 vertices of G and do not intersect W. If G[X′ ∪ Y′] contains an induced $P_3$, our aim is to compute a largest good induced bipartite graph B with bipartition {X, Y} such that X′ ⊆ X and Y′ ⊆ Y; otherwise we discard the choice of X′, Y′.
We define (see also Fig. 4)

There are a number of steps where our procedure branches as we consider all possible ways of choosing whether or not to add certain vertices to B. Note that assuming our choice of X′ and Y′ is correct, no vertex of U can be in B. If we decide that a vertex will not be in B, we will then add it to U.
Step 3.1. Reduce Z to the empty set. Notice that Z does not contain an independent set on more than s − 1 vertices, as otherwise G[X′ ∪ Y′ ∪ Z] would contain an induced sP_1 + P_3. We consider O(n^{2s−2}) choices of disjoint independent sets Z_X and Z_Y that are each subsets of Z and each contain at most s − 1 vertices. We move the vertices of Z_X and Z_Y by adding them to X′ and Y′, respectively. We move the vertices of Z ⧵ (Z_X ∪ Z_Y) by adding them to U. If after this process is complete there are vertices in V_X ∪ V_Y with neighbours in both X′ and Y′, we move these vertices by adding them to U. We note that now:
• Z is the empty set,
• V_X still contains vertices with neighbours in X′ but not in Y′,
• V_Y still contains vertices with neighbours in Y′ but not in X′, and
• U contains vertices that will not be in B.
Fig. 4 The decomposition of G in Step 3. Full and dotted lines indicate when two sets are complete or anti-complete to each other, respectively. The absence of a full or dotted line indicates that edges may or may not exist between two sets. The circles in V_X and V_Y represent disjoint unions of complete graphs.
So our task is to decide how best to add vertices of V_X to Y′ and vertices of V_Y to X′, but first there is another step: as G − B must be connected, and G[U] is a subgraph of G − B, we choose some vertices that will not be in B, but will connect together the connected components of G[U]. This will not be possible if the vertices of U belong to more than one connected component of G − (X′ ∪ Y′). Hence, in that case we discard this choice of Z_X, Z_Y.

Step 3.2. Make G[U] connected.
We consider O(n^{2s^2−2s+3}) choices of sets R of vertices of G − (X′ ∪ Y′) such that each contains at most 2s^2 − 2s + 3 vertices. If G[R ∪ U] is connected, we move the vertices of R by adding them to U, and so G[U] becomes connected. Note that since all vertices of U are in the same connected component of G − (X′ ∪ Y′), Lemma 4 implies that at least one such set R can be found.
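The search for a connecting set R can be illustrated by a brute-force sketch. The names `induces_connected` and `find_connector` are hypothetical, and the sketch returns the first suitable set it finds, whereas the procedure above branches over all choices of R. The empty vertex set is treated as connected, which is a convention adopted here only for convenience.

```python
from collections import deque
from itertools import combinations


def induces_connected(adj, vertices):
    """Return True if the subgraph induced by `vertices` is connected."""
    vs = set(vertices)
    if not vs:
        return True  # convention of this sketch
    start = next(iter(vs))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u] & vs:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen == vs


def find_connector(adj, U, pool, max_size):
    """Return a smallest R within `pool`, |R| <= max_size, with G[R ∪ U] connected.

    Returns None if no such set exists. `pool` plays the role of the vertices
    of G − (X′ ∪ Y′) that are still available.
    """
    U = set(U)
    candidates = sorted(set(pool) - U)
    for size in range(max_size + 1):
        for R in combinations(candidates, size):
            if induces_connected(adj, U | set(R)):
                return set(R)
    return None
```

In the setting above, `max_size` would be 2s^2 − 2s + 3, the bound supplied by Lemma 4.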

Step 3.3. Add vertices from V_X and V_Y to B.
Both G[V_X] and G[V_Y] are disjoint unions of complete graphs (see Fig. 4). Note that B can contain at most one vertex from each of these complete graphs. We consider two subcases.
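The structure used in Step 3.3, a disjoint union of complete graphs, can be recognised and decomposed by comparing closed neighbourhoods. The following is a minimal sketch under the assumption of an adjacency-dictionary representation; the name `clique_components` is hypothetical.

```python
def clique_components(adj, vertices):
    """Return the connected components of G[vertices] if each one is a clique.

    Returns None if G[vertices] is not a disjoint union of complete graphs.
    adj: dict mapping each vertex to the set of its neighbours.
    """
    vs = set(vertices)
    components = []
    seen = set()
    for v in sorted(vs):
        if v in seen:
            continue
        # Closed neighbourhood of v inside the induced subgraph.
        comp = (adj[v] & vs) | {v}
        # In a disjoint union of cliques, all vertices of a component
        # share the same closed neighbourhood.
        if any((adj[u] & vs) | {u} != comp for u in comp):
            return None
        seen |= comp
        components.append(comp)
    return components
```

Since the chosen vertices from V_X (respectively V_Y) must form an independent set, at most one vertex per returned component can be selected.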

Step 3.3.a. Compute a largest good induced bipartite subgraph B with bipartition {X, Y} such that X′ ⊆ X, Y′ ⊆ Y and G − B contains no edges between V_X and V_Y.
As G − B must be connected, each clique of V_X and V_Y that contains at least two vertices must contain a vertex adjacent to U (otherwise such a set B cannot exist). Thus we can form X from X′ by adding to X′ one vertex from each clique of V_Y and form Y by adding to Y′ one vertex from each clique of V_X in such a way that G − B is connected. (If we do this, it is possible that G − B will contain an edge from V_X to V_Y, but then this solution is at least as large as one where such edges are avoided.)

Step 3.3.b. Compute a largest good induced bipartite subgraph B with bipartition {X, Y} such that X′ ⊆ X, Y′ ⊆ Y and G − B contains an edge between V_X and V_Y.
We consider O(n^2) choices of an edge xy, x ∈ V_X, y ∈ V_Y. Let v_X ∈ X′ be a neighbour of x and note that v_X, x and y induce a P_3 in G. Therefore x must be complete to all but at most s − 1 cliques of V_Y. By symmetry, y must be complete to all but at most s − 1 cliques of V_X. A clique in V_X or V_Y is bad if it is not complete to y or x, respectively. Note that the cliques containing x and y may be bad. We move x and y to U.
We consider O(n^{2s−2}) choices of a set S of at most 2s − 2 vertices that each belong to a distinct bad clique and move each to X′ or Y′ if they are in V_Y or V_X, respectively. We move the other vertices of the bad cliques to U. If the vertices of U are not in the same connected component of G − (X′ ∪ Y′), we discard this choice of S. We consider O(n^{2s^2−2s+3}) choices of sets R′ of vertices of G − (X′ ∪ Y′) such that each contains at most 2s^2 − 2s + 3 vertices. If G[R′ ∪ U] is connected, we move the vertices of R′ to U, so G[U] becomes connected. Since the vertices of U are in the same connected component of G − (X′ ∪ Y′), Lemma 4 implies that at least one such set R′ can be found.
Note that some cliques might have been completely removed from V_X and V_Y by the choice of R′. It only remains to pick one vertex from each remaining clique of V_X and V_Y, and add these vertices to Y′ or X′, respectively, to finally obtain B. As all vertices in these cliques are adjacent to x or y, we know that G − B will be connected. ◻

The Case H = P 6
In this section we prove that Odd Cycle Transversal and Connected Odd Cycle Transversal are NP-hard on (P_2 + P_5, P_6)-free graphs. We do this by modifying the construction used in [29] for proving that these two problems are NP-complete on P_13-free segment graphs.
Theorem 9 Odd Cycle Transversal and Connected Odd Cycle Transversal are NP-complete on (P_2 + P_5, P_6)-free graphs.
Proof Both problems are readily seen to belong to NP. To prove NP-hardness we reduce from Vertex Cover, which is known to be NP-complete [16]. Let (G, k) be an instance of Vertex Cover. Let n and m be the number of vertices and edges, respectively, in G. Let v_1, …, v_n be the vertices of G. We construct a graph G* from G as follows.
We first claim that the following statements are equivalent:
(i) G has a vertex cover of size at most k;
(ii) G* has an odd cycle transversal of size at most n + k;
(iii) G* has a connected odd cycle transversal of size at most n + k.
(i) ⇒ (iii). Suppose that G has a vertex cover Q of size at most k. We define the set S = {x_i, y_i : v_i ∈ Q} ∪ {b_i : v_i ∉ Q} and observe that |S| = 2|Q| + (n − |Q|) = n + |Q| ≤ n + k and that S is connected. We claim that S is an odd cycle transversal of G*. This can be seen as follows. The only induced odd cycles in G* are the three triangles in each vertex gadget and the triangle in each edge gadget. By construction of S, for every i ∈ {1, …, n}, either S contains both x_i and y_i or S contains b_i, thus every triangle in every vertex gadget intersects S. Furthermore, since Q is a vertex cover of G, for every edge gadget {x_i, y_j, d_{i,j}}, either x_i ∈ S or y_j ∈ S. Therefore S intersects every odd cycle in G*.
(iii) ⇒ (ii). This is immediate, as every connected odd cycle transversal is an odd cycle transversal.
(ii) ⇒ (i). Suppose that G * has an odd cycle transversal S of size at most n + k . Consider an edge gadget on We claim that Q is a vertex cover of G. This can be seen as follows. Consider an edge v i v j of G (without loss of generality assume i < j ). Then |{x i , y j , d i,j } ∩ S| ≥ 1 , as S is an odd cycle transversal of G * . By assumption on S, d i,j ∉ S and if y j ∈ S then x j ∈ S . It follows that x i ∈ S or x j ∈ S and so v i ∈ Q or v j ∈ Q . We conclude that Q is a vertex cover of G of size at most k.
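The two properties used in the equivalence above, namely that removing S leaves a bipartite graph and that S induces a connected subgraph, can be checked mechanically on any graph. The sketch below is generic and does not build the graph G*; the function names and the adjacency-dictionary representation are assumptions made for illustration, and the empty set is treated as connected by convention.

```python
from collections import deque


def is_bipartite(adj, vertices):
    """Return True if the subgraph induced by `vertices` has no odd cycle."""
    vs = set(vertices)
    colour = {}
    for root in vs:
        if root in colour:
            continue
        colour[root] = 0
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for v in adj[u] & vs:
                if v not in colour:
                    colour[v] = 1 - colour[u]
                    queue.append(v)
                elif colour[v] == colour[u]:
                    return False  # an edge inside one colour class closes an odd cycle
    return True


def is_connected(adj, vertices):
    """Return True if the subgraph induced by `vertices` is connected."""
    vs = set(vertices)
    if not vs:
        return True  # convention of this sketch
    start = next(iter(vs))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u] & vs:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen == vs


def is_odd_cycle_transversal(adj, S, connected=False):
    """Check that G − S is bipartite and, if requested, that G[S] is connected."""
    S = set(S)
    if not is_bipartite(adj, set(adj) - S):
        return False
    return is_connected(adj, S) if connected else True
```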
It only remains to show that G* is (P_2 + P_5, P_6)-free. Suppose, for contradiction, that H ∈ {P_2 + P_5, P_6} is an induced subgraph of G*. Every vertex in A ∪ C ∪ D has degree 2 and its two neighbours are adjacent. Therefore no vertex in V(H) ∩ (A ∪ C ∪ D) is an internal vertex of a path of H. That is, if x ∈ V(H) ∩ (A ∪ C ∪ D) then x has degree 1 in H. Furthermore, A ∪ C ∪ D is an independent set in G*. Hence, if H = P_2 + P_5, then at most one contains an induced subgraph H′ on four vertices that is isomorphic to P_1 + P_3 if H = P_2 + P_5 or P_4 if H = P_6. Since Y is an independent set and B ∪ X is a perfect matching, H′ must contain at least one vertex of B ∪ X and at least one vertex of Y. As Y is complete to B ∪ X, we find that H′ contains either C_4 or K_{1,3} as a (not necessarily induced) subgraph, a contradiction. This completes the proof. ◻
The proof of Theorem 9 gives a slightly stronger result if we assume the Exponential Time Hypothesis (ETH). The ETH is one of the standard assumptions in complexity theory which, along with the sparsification lemma, implies that 3-Sat with n variables and m clauses cannot be solved in 2^{o(n+m)} time [22,23]. The number of vertices in the graph G* constructed in the proof of Theorem 9 is 5n + m. Thus an algorithm solving (Connected) Odd Cycle Transversal on (P_2 + P_5, P_6)-free graphs with n vertices in time 2^{o(n)} could be used to solve Vertex Cover on graphs with n vertices and m edges in 2^{o(n+m)} time. However, such a fast algorithm for Vertex Cover does not exist unless the ETH fails [11]. Thus we get the following statement.

Corollary 1
Odd Cycle Transversal and Connected Odd Cycle Transversal cannot be solved in 2^{o(n)} time on (P_2 + P_5, P_6)-free graphs with n vertices, unless the ETH fails.
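The size calculation behind Corollary 1 can be written out explicitly. Here N is a symbol introduced only for this illustration; it denotes the number of vertices of G*, which is 5n + m as stated in the text.

```latex
\[
N = |V(G^*)| = 5n + m \le 5(n + m)
\quad\Longrightarrow\quad
2^{o(N)} = 2^{o(5n + m)} = 2^{o(n + m)} .
\]
```

So a 2^{o(N)}-time algorithm on the constructed instance would translate into a 2^{o(n+m)}-time algorithm for Vertex Cover on the original graph.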

Conclusions
We proved polynomial-time solvability of Feedback Vertex Set and Odd Cycle Transversal on H-free graphs when H = sP_1 + P_3 and polynomial-time solvability of their connected variants on H-free graphs, when H = P_4 or H = sP_1 + P_3; see also Table 1, where we place these results in the context of known results for these problems on H-free graphs. We also showed that Odd Cycle Transversal and Connected Odd Cycle Transversal are NP-complete on (P_2 + P_5, P_6)-free graphs.
Natural cases for future work are the cases when H = sP_1 + P_4 for s ≥ 1 and H = P_5 for all four problems (in particular the case when H = P_5 is the only open case for Odd Cycle Transversal and Connected Odd Cycle Transversal restricted to P_r-free graphs). Note that Lemma 2 does not hold on (sP_1 + P_4)-free graphs: the disjoint union of any number of arbitrarily large stars is even P_4-free.
Recall that Vertex Cover and Connected Vertex Cover are polynomial-time solvable even on (sP_1 + P_6)-free graphs [20] and (sP_1 + P_5)-free graphs [24], respectively, for every s ≥ 0. In contrast to the case for Odd Cycle Transversal and Connected Odd Cycle Transversal, it is not known whether there is an integer r for which any of the problems Vertex Cover, Feedback Vertex Set or their connected variants is NP-complete on P_r-free graphs. Determining whether such an r exists is an interesting open problem.
We note that a similar complexity study has also been undertaken for the independent variants of the problems Feedback Vertex Set and Odd Cycle Transversal. In particular, Independent Feedback Vertex Set and Independent Odd Cycle Transversal are polynomial-time solvable on P_5-free graphs [3], but their complexity status is unknown on P_6-free graphs. It is not known whether there is an integer r such that Independent Feedback Vertex Set or Independent Odd Cycle Transversal is NP-complete on P_r-free graphs.
We conclude that in order to make any further progress, we must better understand the structure of P_r-free graphs. This topic has been well studied in recent years, see for example [17,19]. However, more research and new approaches will be needed.

Lemma 6 [24] Let s ≥ 0 and let G be a connected (sP_1 + P_5)-free graph. Then G has a connected dominating set D that is either a clique or has size at most 2s^2 + s + 3. Moreover, D can be found in O(n^{2s^2+s+3}) time.

Lemma 7 [24] Let J be an independent set in a connected graph G such that J has a vertex y that is adjacent to every vertex of G − J. Let J′ consist of those vertices of J ⧵ {y} that have two adjacent neighbours in G − J (or equivalently, in G). Then a subset S of the vertex set of G is a connected vertex cover of G that contains J if and only if S ⧵ J′ is a connected vertex cover of G − J′ that contains J ⧵ J′.
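The set J′ of Lemma 7 is straightforward to compute from its definition. A minimal sketch, assuming an adjacency-dictionary representation; the function name `vertices_with_adjacent_neighbours` is hypothetical.

```python
from itertools import combinations


def vertices_with_adjacent_neighbours(adj, J, y):
    """Return J′: the vertices of J minus {y} that have two adjacent neighbours in G − J.

    adj: dict mapping each vertex to the set of its neighbours.
    J is assumed to be an independent set containing y, as in Lemma 7.
    """
    J = set(J)
    result = set()
    for v in J - {y}:
        outside = sorted(adj[v] - J)
        # v qualifies if some pair of its neighbours outside J forms an edge.
        if any(b in adj[a] for a, b in combinations(outside, 2)):
            result.add(v)
    return result
```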
We also need an auxiliary problem defined in [24]. Let G be a connected graph, let J ⊆ V G be a subset of the vertex set of G and let y be a vertex of J. We say that a triple (G, J, y) is cover-complete if it has the following three properties: This leads to the following optimization problem: We also need the following two lemmas.
Lemma 8 [24] Let (G, {y}, y) be a cover-complete triple, where G is an (sP_1 + P_5)-free graph for some s ≥ 0. Then it is possible to compute a smallest connected vertex cover of G that contains y in O(n^{s+14}) time.

Lemma 9 [24]
For every s ≥ 0, Connected Vertex Cover Completion can be solved in O(n^{2s+19}) time for cover-complete triples (G, J, y), where G is an (sP_1 + P_5)-free graph.
We are now ready to prove Theorem 2, which we restate below. The proof mimics the proof of [24].

Theorem 2 (restated)
For every s ≥ 0, Connected Vertex Cover Extension can be solved in polynomial time on (sP_1 + P_5)-free graphs.
Proof Let G be an (sP_1 + P_5)-free graph on n vertices for some s ≥ 0 and let W ⊆ V(G) be a subset of vertices of G. We may assume without loss of generality that G is connected. By Lemma 6 we can first compute in O(n^{2s^2+s+3}) time a connected dominating set D that either has size at most 2s^2 + s + 3 or is a clique. We note that, if D is a clique, any vertex cover of G contains all but at most one vertex of D. This leads to a case analysis where we guess the subset D* ⊆ D ⧵ W of vertices not in a smallest connected vertex cover of G that contains W. That is, we choose a set of at most one vertex if D is a clique and a set of at most |D ⧵ W| vertices otherwise, and eventually look at all such sets. As |D ⧵ W| ≤ |D| ≤ 2s^2 + s + 3 if D is not a clique, the number of guesses is O(n^{2s^2+s+3}). For each guess of D*, we compute a smallest connected vertex cover S_{D*} that contains all vertices of (D ⧵ D*) ∪ W and no vertex of D*. Then, at the end, we return one that has minimum size overall. In particular we note that, since D is a connected dominating set of G, D ∪ W is also a connected dominating set of G.
Let D * be a guess. Before we start our case analysis we first prove the following claim.

Claim 1
We may assume, at the expense of an O(n^{16s^3+4}) factor in the running time, that D ⧵ D* is connected.
We prove Claim 1 as follows. Suppose D ⧵ D* is not connected. Recall that G[D] is either a complete graph or has size at most 2s^2 + s + 3. In the first case, G[D ⧵ D*] is connected. Hence, the second case applies so D has size at most 2s^2 + s + 3.
Let v ∈ D ⧵ D*. As G is (sP_1 + P_5)-free, G is also P_{5+2s}-free. Hence, for each u ∈ D ⧵ (D* ∪ {v}), every connected vertex cover of G contains a path of at most 5 + 2s − 1 vertices that connects u to v. We will guess all these u−v-paths (using only vertices from G − D*) and add their vertices to D. As the number of paths is at most 2s^2 + s + 2, this branching adds an O(n^{(5+2s−3)(2s^2+s+2)}) = O(n^{16s^3+4}) factor to our running time and increases our set D by at most 24s^3 extra vertices. We have proven Claim 1.
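The bound on the exponent in Claim 1 can be checked by direct expansion. One reading of the factor 5 + 2s − 3 is that each path has at most 5 + 2s − 1 vertices, two of which are the fixed endpoints u and v; this interpretation is an assumption made here and is not spelled out in the text.

```latex
\[
(5 + 2s - 3)(2s^2 + s + 2) = (2s + 2)(2s^2 + s + 2) = 4s^3 + 6s^2 + 6s + 4 .
\]
\[
\text{For } s \ge 1:\quad 4s^3 + 6s^2 + 6s + 4 \le 4s^3 + 6s^3 + 6s^3 + 4 = 16s^3 + 4 ,
\qquad \text{and for } s = 0 \text{ both sides equal } 4 .
\]
```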
We distinguish two cases.

Case 1 D* = ∅.
We compute a minimum vertex cover S′ of G − (D ∪ W) in polynomial time by Theorem 1. To be more precise, this takes O(n^{s+14}) time by using the same arguments as in the proof of Lemma 8 (see [24]). Clearly S′ ∪ D ∪ W is a vertex cover of G. As D is a connected dominating set, S′ ∪ D ∪ W is even a connected vertex cover of G. Let S_∅ = S′ ∪ D ∪ W. As S′ is a minimum vertex cover of G − (D ∪ W), S_∅ is a smallest connected vertex cover of G that contains all vertices of D ∪ W. We remember S_∅. Note that S_∅ is found in O(n^{s+14}) time.
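Case 1 can be illustrated on small graphs as follows. This sketch replaces the polynomial-time algorithm of Theorem 1 with an exponential brute-force search, purely for illustration; the function names are hypothetical and the graph is given as an adjacency dictionary whose vertices are mutually comparable (for example strings).

```python
from itertools import combinations


def min_vertex_cover_bruteforce(adj, vertices):
    """Return a minimum vertex cover of G[vertices] by exhaustive search.

    Stand-in for Theorem 1; exponential time, suitable for tiny examples only.
    """
    vs = set(vertices)
    ordered = sorted(vs)
    edges = [(u, v) for u in ordered for v in adj[u] if v in vs and u < v]
    for size in range(len(ordered) + 1):
        for candidate in combinations(ordered, size):
            chosen = set(candidate)
            if all(u in chosen or v in chosen for u, v in edges):
                return chosen


def case1_cover(adj, D, W):
    """Return S′ ∪ D ∪ W, where S′ is a minimum vertex cover of G − (D ∪ W).

    D is assumed to be a connected dominating set of G, as in the proof.
    """
    D, W = set(D), set(W)
    rest = set(adj) - D - W
    return min_vertex_cover_bruteforce(adj, rest) | D | W
```

For instance, if d is adjacent to a, b and c, and ab is the only other edge, then with D = {d} and W = ∅ the sketch returns {a, d}.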
Case 2 D* ≠ ∅.
Recall that we are looking for a smallest connected vertex cover of G that contains every vertex of (D ⧵ D*) ∪ W, but does not contain any vertex of D*. Hence D* must be an independent set, disjoint from W, and G − D* must be connected (if one of these conditions is false, then we stop considering the guess D*). Moreover, a vertex
As mentioned, at the end we pick a smallest set of the sets S_{D*}. This set is then a smallest connected vertex cover of G that contains W. As there are O(n^{2s^2+s+3} ⋅ n^{16s^3+4}) such sets, each of which is found in O(n^{2s+19}) time, the total running time is O(n^{21s^3+26}). The correctness of our algorithm follows immediately from the above case analysis and the description of the cases. ◻
Note that the algorithm given in Theorem 2 not only solves the decision problem, but also finds a minimum connected vertex cover of a given (sP_1 + P_5)-free graph in polynomial time.
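The total running time stated in the proof can be verified by adding the three exponents: the number of guesses of D*, the factor from Claim 1, and the time per guess.

```latex
\[
(2s^2 + s + 3) + (16s^3 + 4) + (2s + 19) = 16s^3 + 2s^2 + 3s + 26 .
\]
\[
\text{For } s \ge 1:\quad 2s^2 + 3s \le 5s^3
\;\Longrightarrow\;
16s^3 + 2s^2 + 3s + 26 \le 21s^3 + 26 ,
\qquad \text{and for } s = 0 \text{ both sides equal } 26 .
\]
```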