1 Introduction

Many optimization problems in graphs can be expressed as follows: given a graph G, find a largest vertex set A such that G[A], the subgraph of G induced by A, satisfies some property. Examples include Independent Set (the property of being edgeless), Feedback Vertex Set (the property of being acyclic), and Planarization (the property of being planar). Here, Feedback Vertex Set and Planarization are customarily phrased in the complementary form that asks for minimizing the complement of A: given G, find a smallest vertex set X such that \(G-X\) has the desired property. While all problems considered in this paper can be viewed in these two ways, for the sake of clarity we focus on the maximization formulation.

Formally, we shall consider the following Max Induced \({\mathcal {C}}\)-Subgraph problem. Fix a graph class \({\mathcal {C}}\) that is hereditary, that is, closed under taking induced subgraphs. Then, given a graph G, the goal is to find a largest vertex subset A such that \(G[A]\in {\mathcal {C}}\). Our focus is on exact algorithms for this problem with running time expressed in terms of n, the number of vertices of G. Clearly, as long as the graphs from \({\mathcal {C}}\) can be recognized in polynomial time, the problem can be solved in \(2^n\cdot n^{{\mathcal {O}}(1)}\) time by brute force; we are interested in non-trivial improvements over this approach.
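To make the brute-force baseline concrete, the following is a minimal Python sketch that is not taken from the paper. The function names `max_induced_subgraph`, `is_edgeless`, and `is_forest` are illustrative assumptions, and the graph is assumed to be simple and given as a vertex list and an edge list. The sketch tries vertex subsets from largest to smallest and returns the first one whose induced subgraph passes the membership test for \({\mathcal {C}}\), so it performs up to \(2^n\) membership tests.

```python
from itertools import combinations


def max_induced_subgraph(vertices, edges, has_property):
    """Return a largest vertex set A such that G[A] passes has_property.

    has_property(chosen, induced_edges) is an assumed callback that decides
    membership of the induced subgraph in the hereditary class C.
    """
    vertices = list(vertices)
    edge_list = [frozenset(e) for e in edges]
    # Try candidate sets from largest to smallest; the first hit is optimal.
    for size in range(len(vertices), -1, -1):
        for subset in combinations(vertices, size):
            chosen = set(subset)
            induced = [e for e in edge_list if e <= chosen]
            if has_property(chosen, induced):
                return chosen
    return set()


def is_edgeless(chosen, induced_edges):
    """Membership test for the class of edgeless graphs (Independent Set)."""
    return not induced_edges


def is_forest(chosen, induced_edges):
    """Membership test for forests, via union-find cycle detection."""
    parent = {v: v for v in chosen}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for e in induced_edges:
        u, v = tuple(e)
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            return False  # this edge closes a cycle
        parent[root_u] = root_v
    return True
```

For example, on the 5-cycle this sketch returns an independent set with two vertices and an induced forest with four vertices.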

The complexity of Max Induced \({\mathcal {C}}\)-Subgraph was studied as early as 1980 by Lewis and Yannakakis [26], who proved that when the graph class \({\mathcal {C}}\) does not contain all graphs, the problem is NP-hard. Recently, Komusiewicz [21] inspected the reduction of Lewis and Yannakakis and concluded that under the Exponential Time Hypothesis (ETH) one can even exclude the existence of subexponential-time algorithms for the problem, that is, ones with running time \(2^{o(n)}\). While the result of Komusiewicz [21] excludes significant improvements in the running time, there is still room for improvement in the base of the exponent. Indeed, for various classes of graphs \({\mathcal {C}}\), algorithms with running time \({\mathcal {O}}((2-\varepsilon )^n)\) for some \(\varepsilon >0\) are known; see e.g. [5, 15, 16, 17, 31] and the references therein.

Another direction, which is of main interest to us, is to impose more conditions on the input graphs G in the hope of obtaining faster algorithms for restricted cases. Formally, we fix another hereditary graph class \({\mathcal {D}}\) and consider Max Induced \({\mathcal {C}}\)-Subgraph where the input graph G is additionally required to belong to \({\mathcal {D}}\).

In this line of research, the class \({\mathcal {C}}\) of edgeless graphs, which corresponds to the classical Max Independent Set (mis) problem, has been extensively studied. Suppose \({\mathcal {D}}\) is the class of H-free graphs, that is, graphs that exclude some fixed graph H as an induced subgraph. As observed by Alekseev [2], the problem is NP-hard on H-free graphs unless H is a path or a subdivision of the claw (\(K_{1,3}\)); the reduction of [2] actually excludes the existence of a subexponential-time algorithm under ETH in these cases. On the positive side, the maximal classes for which polynomial-time algorithms are known are the \(P_6\)-free graphs [18] and the fork-free graphs [3, 27]. It would be consistent with our knowledge if mis was polynomial-time solvable on H-free graphs whenever H is a path or a subdivision of the claw. Very recently, Abrishami et al. [1] reported a polynomial-time algorithm on long-hole-free graphs, which are graphs that exclude every cycle of length at least 5 as an induced subgraph.

It turns out that if we only aim at subexponential-time instead of polynomial-time algorithms, many more tractability results can be obtained for mis, and usually they are also conceptually much simpler. Bacsó et al. [4] showed that mis can be solved in \(2^{{\mathcal {O}}(\sqrt{tn\log n})}\) time on \(P_t\)-free graphs, for every \(t\in \mathbb {N}\) (see also an alternative subexponential-time algorithm by Brause [9], with running time \(2^{{\mathcal {O}}(n^{1-\varepsilon })}\) for any \(\varepsilon \in (0,1/2t)\)).

In the light of the results above, it is natural to ask whether structural assumptions on the class \({\mathcal {D}}\) from which the input is drawn, like e.g. \(P_t\)-freeness, can help in the design of subexponential-time algorithms for other maximum induced subgraph problems, beyond \({\mathcal {C}}\) being the class of edgeless graphs. This is precisely the question we investigate in this work.

Our contribution We identify three properties that together provide a way to solve the Max Induced \({\mathcal {C}}\)-Subgraph problem on graphs from \({\mathcal {D}}\) in subexponential time, where \({\mathcal {C}}\) and \({\mathcal {D}}\) are hereditary graph classes. They are as follows:

  • The class \({\mathcal {C}}\) should consist of sparse graphs. To be specific, let us assume that every n-vertex graph from \({\mathcal {C}}\) has \({\mathcal {O}}(n)\) edges.

  • The class \({\mathcal {D}}\) may contain dense graphs, but they should admit balanced separators whose size is somehow governed by the density. To be specific, let us assume that every graph from \({\mathcal {D}}\) with maximum degree \(\varDelta\) has a balanced separator of size \({\mathcal {O}}(\varDelta )\), or that every graph from \({\mathcal {D}}\) with m edges has a balanced separator of size \({\mathcal {O}}(\sqrt{m})\).

  • The Max Induced \({\mathcal {C}}\)-Subgraph problem on graphs from \({\mathcal {D}}\) can be solved in \(2^{\tilde{{\mathcal {O}}}(w)}\cdot n^{{\mathcal {O}}(1)}\) time, where w is the treewidth of the input graph. Here, notation \(\tilde{{\mathcal {O}}}(\cdot )\) hides polylogarithmic factors.

We show that if these conditions are simultaneously satisfied, then the Max Induced \({\mathcal {C}}\)-Subgraph problem on graphs from \({\mathcal {D}}\) can be solved in \(2^{\tilde{{\mathcal {O}}}(n^{2/3})}\) time in the presence of balanced separators of size \({\mathcal {O}}(\varDelta )\) and in \(2^{\tilde{{\mathcal {O}}}(n^{3/4})}\) time for balanced separators of size \({\mathcal {O}}(\sqrt{m})\). The precise statement and proof of this result can be found in Sect. 2.

The conditions on \({\mathcal {C}}\) look natural and are satisfied by various specific classes of interest, like forests (corresponding to Feedback Vertex Set) and planar graphs (corresponding to Planarization). On the other hand, the condition on \({\mathcal {D}}\) looks more puzzling. However, there are certain non-sparse classes of graphs where the existence of such balanced separators has been established. For instance, balanced separators of size \({\mathcal {O}}(\varDelta )\) are known to exist in \(P_t\)-free graphs for any fixed \(t\in \mathbb {N}\) [4], and in long-hole-free graphs [11]. The existence of balanced separators of size \({\mathcal {O}}(\sqrt{m})\) is known for string graphs, which are intersection graphs of arc-connected subsets of the plane, and more generally for intersection graphs of connected subgraphs in any proper minor-closed class (see Lee [25]). All these observations yield a number of concrete corollaries to our main result, which are gathered in Sect. 3.

In Sect. 4, we investigate more closely the case that \({\mathcal {D}}\) is the class of string graphs. We show that in this class we can obtain significantly faster subexponential-time algorithms for the above-mentioned problems. The key idea is to use another result by Lee [25], which asserts that string graphs not containing \(K_{t,t}\) as a subgraph are sparse—they have \({\mathcal {O}}(n \cdot t \log t)\) edges. Observe that for all discussed classes \({\mathcal {C}}\), including edgeless graphs, forests, and planar graphs, their members do not contain a subgraph isomorphic to \(K_{t,t}\), for some constant t.

In Sect. 5, we discuss some lower bounds: we show that if \({\mathcal {C}}\) is the class of forests (corresponding to the Feedback Vertex Set problem) and \({\mathcal {D}}\) is characterized by a single excluded induced subgraph, then under the Exponential Time Hypothesis one cannot hope for subexponential-time algorithms in greater generality than provided by our main result.

2 Main Result

We use standard graph notation. We assume that the reader is familiar with treewidth. We recall some notation for tree decompositions in Sect. 6, where it is actually needed.

For a graph G, a set \(S\subseteq V(G)\) is a balanced separator if every connected component of \(G-S\) has at most \(\frac{2}{3}|V(G)|\) vertices. It is known that small balanced separators can be used to construct tree decompositions of small width, as made explicit in the following lemma.

Lemma 1

[14] If every subgraph of a graph G has a balanced separator of size at most k, then the treewidth of G is \({\mathcal {O}}(k).\)
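The definition of a balanced separator used here can be checked directly. The following Python sketch is an illustration and not part of the paper; the names `component_sizes` and `is_balanced_separator` and the adjacency-dictionary input format are assumptions. It removes the candidate set S, computes the sizes of the connected components of \(G-S\) by depth-first search, and compares each size against \(\frac{2}{3}|V(G)|\).

```python
def component_sizes(adj, removed):
    """Sizes of the connected components of G - removed.

    adj is assumed to map every vertex to the set of its neighbours.
    """
    seen = set(removed)
    sizes = []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            v = stack.pop()
            size += 1
            for u in adj[v]:
                if u not in seen:
                    seen.add(u)
                    stack.append(u)
        sizes.append(size)
    return sizes


def is_balanced_separator(adj, separator):
    """True iff every component of G - separator has at most 2|V(G)|/3 vertices."""
    n = len(adj)
    # Compare 3 * size with 2 * n to avoid floating-point rounding.
    return all(3 * size <= 2 * n for size in component_sizes(adj, separator))
```

On a path with seven vertices, the middle vertex alone is a balanced separator in this sense, while an endpoint is not.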

Now, we are ready to state and prove our main result.

Theorem 1

Let \({\mathcal {C}}\) and \({\mathcal {D}}\) be classes of graphs that satisfy the following conditions:

  1. (P1)

    Every n-vertex graph from \({\mathcal {C}}\) has \({\mathcal {O}}(n)\) edges.

  2. (P2)

    The class \({\mathcal {D}}\) is closed under taking induced subgraphs.

  3. (P3)

    Given a graph \(G\in {\mathcal {D}}\) with n vertices and treewidth w, one can find a largest set \(A \subseteq V(G)\) such that \(G[A]\in {\mathcal {C}}\) in \(2^{\tilde{{\mathcal {O}}}(w)}\cdot n^{{\mathcal {O}}(1)}\) time.

Furthermore, let the class \({\mathcal {D}}\) satisfy one of the following conditions:

  1. (P4a)

    Every graph in \({\mathcal {D}}\) with maximum degree \(\varDelta\) has a balanced separator of size \({\mathcal {O}}(\varDelta )\), or

  2. (P4b)

    Every graph in \({\mathcal {D}}\) with n vertices and maximum degree \(\varDelta\) has a balanced separator of size \({\mathcal {O}}(\sqrt{n\varDelta }).\)

Then, given an n-vertex graph \(G\in {\mathcal {D}},\) one can find a largest set \(A\subseteq V(G)\) such that \(G[A]\in {\mathcal {C}}\) in time

  1. (a)

    \(2^{\tilde{{\mathcal {O}}}(n^{2/3})},\) if \({\mathcal {D}}\) satisfies (P4a), or

  2. (b)

    \(2^{\tilde{{\mathcal {O}}}(n^{3/4})},\) if \({\mathcal {D}}\) satisfies (P4b).

Proof

Let a constant \(\tau\) be defined as follows, depending on which of the two conditions is satisfied by \({\mathcal {D}}\):

$$\begin{aligned} \tau = {\left\{ \begin{array}{ll} {1/3} &{} \text {if}\; {\mathcal {D}}\;\text {satisfies (P4a),}\\ {1/4} &{} \text {if}\; {\mathcal {D}}\;\text {satisfies (P4b)}. \end{array}\right. } \end{aligned}$$

Let \(G\in {\mathcal {D}}\) be the input graph and n be the number of its vertices. We devise a branching algorithm that finds a largest set \(A\subseteq V(G)\) such that \(G[A]\in {\mathcal {C}}\) in \(2^{\tilde{{\mathcal {O}}}(n^{1-\tau })}\) time. This matches the complexity bounds from the statement of the theorem.

Consider a fixed solution A, that is, a largest set \(A\subseteq V(G)\) such that \(G[A]\in {\mathcal {C}}\). Let \(A' \subseteq A\) be the set of vertices of degree greater than \(n^\tau\) in G[A]. By property (P1), we have \(|A'| = {\mathcal {O}}(n/n^\tau ) = {\mathcal {O}}(n^{1-\tau })\).
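Spelled out, the bound on \(|A'|\) is a double-counting argument. In the sketch below, c is an assumed name for the constant hidden in property (P1), so that every k-vertex graph from \({\mathcal {C}}\) has at most ck edges:

$$\begin{aligned} |A'|\cdot n^\tau \leqslant \sum _{v\in A'}\deg _{G[A]}(v) \leqslant \sum _{v\in A}\deg _{G[A]}(v) = 2\,|E(G[A])| \leqslant 2c\,|A| \leqslant 2cn, \end{aligned}$$

which gives \(|A'|\leqslant 2c\cdot n^{1-\tau }\).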

The algorithm guesses the set \(A'\) exhaustively, by trying all subsets of V(G) of the appropriate sizes \({\mathcal {O}}(n^{1-\tau })\), which results in \(n^{{\mathcal {O}}(n^{1-\tau })} = 2^{\tilde{{\mathcal {O}}}(n^{1-\tau })}\) branches. Fix one such branch and assume, for the purpose of further description of the algorithm, that it corresponds to the correct set \(A'\) (i.e., the one obtained from the fixed solution A). Let \(G'=G-A'\).

Suppose that \(G'\) contains a vertex v of degree at least \(n^{2\tau }\). If \(v \in A\), then v has degree at most \(n^\tau\) in G[A] (since \(v \notin A'\)). The algorithm further guesses that \(v\notin A\) and discards v (one branch), or it guesses that \(v\in A\) and discards all but at most \(n^\tau\) neighbors of v in \(G'\) (at most \(n^{n^\tau }\) branches). In the latter case, we do not fix the assumption that v or any particular neighbor of v belongs to A, so that the vertices that have survived this step can still be discarded in subsequent branching steps.

The step described above is repeated exhaustively. The overall number of branches generated in this way can be bounded as follows, where \(k=|V(G')|\):

$$\begin{aligned} F(k)&\leqslant F(k-1) + n^{n^\tau }\cdot F(k-(n^{2\tau } - n^\tau ))\\&\leqslant F(k-2) + n^{n^\tau }\cdot F(k-(n^{2\tau } - n^\tau )) + n^{n^\tau }\cdot F(k-(n^{2\tau } - n^\tau ))\\&\leqslant \cdots \leqslant F(k-(n^{2\tau } - n^\tau ))+(n^{2\tau } - n^\tau ) \cdot n^{n^\tau }\cdot F(k-(n^{2\tau } - n^\tau ))\\&= (n^{2\tau } - n^\tau +1) \cdot n^{n^\tau } \cdot F(k-(n^{2\tau } - n^\tau )) \\&\leqslant \left( (n^{2\tau } - n^\tau +1) \cdot n^{n^\tau } \right) ^{k/(n^{2\tau } - n^\tau )}\\&\leqslant \left( (n^{2\tau } - n^\tau +1) \cdot n^{n^\tau } \right) ^{n/(n^{2\tau } - n^\tau )} = n^{{\mathcal {O}}(n^{1+\tau - {2\tau }})}=2^{\tilde{{\mathcal {O}}}(n^{1-\tau })}. \end{aligned}$$

Once the branching step can no longer be applied, we obtain an induced subgraph \(G''\) of \(G'\) of maximum degree less than \(n^{2\tau }\). In the branch where all the choices have been made correctly (i.e., according to the fixed solution A), \(G''\) still contains all vertices from \(A \setminus A'\).

By property (P2), we have \(G'' \in {\mathcal {D}}\). Thus \(G''\) satisfies either (P4a) or (P4b), which means that \(G''\) has a balanced separator of size \({\mathcal {O}}(n^{2/3})\) in the former case or \({\mathcal {O}}(\sqrt{n \cdot n^{1/2}}) = {\mathcal {O}}(n^{3/4})\) in the latter case. In both cases, the size of the separator is \({\mathcal {O}}(n^{1-\tau })\). Moreover, again by property (P2), balanced separators of that size also exist in every subgraph of \(G''\). Therefore, by Lemma 1, we conclude that \(G''\) has treewidth \({\mathcal {O}}(n^{1-\tau })\). Since \(|A'|\leqslant {\mathcal {O}}(n^{1-\tau })\), it follows that the graph \(G[V(G'') \cup A']\) also has treewidth \({\mathcal {O}}(n^{1-\tau })\).

We know that \(G[V(G'') \cup A'] \in {\mathcal {D}}\) and, in the branch where all choices have been made correctly, this graph contains the entire maximum-size solution A. Now, we apply the procedure assumed in (P3) to the graph \(G[V(G'') \cup A']\) and observe that in the correct branch it finds some maximum-size solution (possibly different from A). Let us point out that in this step it is not sufficient to consider only the graph \(G''\), as the vertices from \(A'\) introduce some additional constraints on the solution we are looking for.

For the time complexity, the algorithm considers \(2^{\tilde{{\mathcal {O}}}(n^{1-\tau })}\) branches and in each of them it executes the procedure assumed in (P3) in \(2^{\tilde{{\mathcal {O}}}(n^{1-\tau })}\) time, which gives the total running time of \(2^{\tilde{{\mathcal {O}}}(n^{1-\tau })}\). \(\square\)

Remark 1

The condition (P1) in the statement of Theorem 1 can be relaxed to “every n-vertex graph from \({\mathcal {C}}\) has \({\mathcal {O}}(n^{2-\varepsilon })\) edges, for some constant \(\varepsilon > 0\)”. Then, we can follow the same approach with the following modification: we choose \(\tau =1-\frac{2}{3}\varepsilon\) in case of (P4a) and \(\tau =1-\frac{3}{4}\varepsilon\) in case of (P4b), and replace the threshold for branching on high-degree vertices from \(n^{2\tau }\) to \(n^{2\tau +\varepsilon -1}\). This way, we obtain algorithms with running time \(2^{\tilde{{\mathcal {O}}}(n^{1-\varepsilon /3})}\) for property (P4a) and \(2^{\tilde{{\mathcal {O}}}(n^{1-\varepsilon /4})}\) for property (P4b). This running time is subexponential for every \(\varepsilon >0\).
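The exponents in Remark 1 can be checked by the same bookkeeping as in the proof of Theorem 1. The following is a sanity check under the relaxed assumption, ignoring constants and polylogarithmic factors. It uses the fact that the new branching threshold exceeds \(n^\tau\), which holds because \(\tau >1-\varepsilon\) in both cases:

$$\begin{aligned} |A'|&= {\mathcal {O}}(n^{2-\varepsilon }/n^{\tau }) = {\mathcal {O}}(n^{2-\varepsilon -\tau }),\\ \log (\#\text {branches})&= \tilde{{\mathcal {O}}}\left( n^{\tau }\cdot \frac{n}{n^{2\tau +\varepsilon -1}}\right) = \tilde{{\mathcal {O}}}(n^{2-\varepsilon -\tau }),\\ \text {(P4a), } \tau =1-\tfrac{2}{3}\varepsilon :&\quad 2-\varepsilon -\tau = 1-\tfrac{\varepsilon }{3}, \quad \text {separator size } {\mathcal {O}}(n^{2\tau +\varepsilon -1}) = {\mathcal {O}}(n^{1-\varepsilon /3}),\\ \text {(P4b), } \tau =1-\tfrac{3}{4}\varepsilon :&\quad 2-\varepsilon -\tau = 1-\tfrac{\varepsilon }{4}, \quad \text {separator size } {\mathcal {O}}\left( \sqrt{n\cdot n^{2\tau +\varepsilon -1}}\right) = {\mathcal {O}}(n^{1-\varepsilon /4}). \end{aligned}$$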

Remark 2

Let us point out that the conjunction of properties (P2) and (P4a) implies (P4b). We state them separately, as there are some natural graph classes with each type of behavior. One can also imagine unifying properties (P4a) and (P4b) into the existence of a balanced separator of size \({\mathcal {O}}(n^{\alpha }\varDelta ^\beta )\), for some constants \(\alpha ,\beta\). However, then, one needs to be careful when choosing \(\tau\) so that it belongs to the interval [0, 1]. As we did not find concrete examples of interesting graph classes \({\mathcal {D}}\) for which this approach would yield non-trivial results and which would not satisfy either (P4a) or (P4b), we refrain from discussing further details here.

3 Corollaries

In this section, we discuss possible classes \({\mathcal {C}}\) and \({\mathcal {D}}\) which satisfy the conditions of Theorem 1. For some choices of \({\mathcal {C}}\), we obtain well-studied computational problems:

  1. 1.

    for matchings, we obtain Max Induced Matching,

  2. 2.

    for forests, we obtain Max Induced Forest, the complement problem of Feedback Vertex Set (note that from the point of view of exact algorithms these problems are equivalent),

  3. 3.

    for graphs of maximum degree d, where d is fixed, we obtain Max Induced Degree-d Subgraph,

  4. 4.

    for planar graphs, we obtain Max Induced Planar Subgraph, also known as Planarization,

  5. 5.

    for graphs embeddable in \(\varSigma\), where the surface \(\varSigma\) is fixed, we obtain Max Induced \(\varSigma\)-Embeddable Subgraph,

  6. 6.

    for graphs of degeneracy at most d, where d is fixed, we obtain Max Induced d-Degenerate Subgraph.

We note that all these classes satisfy property (P1) of Theorem 1. We point out that the Euler formula implies that every n-vertex graph embeddable on a surface \(\varSigma\) with Euler genus g has at most \(3n+6g-6\) edges [34].

Given a graph of treewidth w, its tree decomposition of width at most \(5w+4\) can be computed in \(2^{{\mathcal {O}}(w)}\cdot n\) time [6]. Therefore, for the purpose of verifying property (P3), we can assume that a tree decomposition of width \({\mathcal {O}}(w)\) is additionally provided on input. While \(2^{\tilde{{\mathcal {O}}}(w)}\cdot n^{{\mathcal {O}}(1)}\)-time algorithms are quite straightforward and well known for the first two problems on the list, this is not necessarily the case for the others. As the Max Induced Degree-d Subgraph problem can be expressed in the so-called Existential Counting Modal Logic, an algorithm with running time \(2^{{\mathcal {O}}(w)}\cdot n^{{\mathcal {O}}(1)}\) can be easily derived from the meta-theorem of Pilipczuk [30]. Algorithms for Max Induced Planar Subgraph and, more generally, Max Induced \(\varSigma\)-Embeddable Subgraph, were provided by Kociumaka and Pilipczuk [20]. Finally, we give a suitable algorithm for Max Induced d-Degenerate Subgraph in Lemma 4 in Sect. 6.

It may be tempting to consider, as \({\mathcal {C}}\), the graphs with no even cycle \(C_{2k}\) (not necessarily induced), for some fixed integer \(k\geqslant 2\). This is because such graphs have \({\mathcal {O}}(n^{2-\Omega (1/k)})\) edges [7], and thus they satisfy the generalization of property (P1) mentioned in Remark 1 for \(\varepsilon =\Omega (1/k)\). However, for these classes, property (P3) turns out to be problematic: for any fixed \(\ell \geqslant 5\), unless the ETH fails, there is no algorithm with running time \(2^{o(w^2)}\cdot n^{{\mathcal {O}}(1)}\) that finds a minimum set of vertices hitting all (non-induced) copies of \(C_\ell\) in a graph with treewidth w [30] (this bound appears to be essentially tight, as the problem can be solved in \(2^{\tilde{{\mathcal {O}}}(w^2)}\cdot n^{{\mathcal {O}}(1)}\) time [13]). It is unclear whether the additional assumption that the input graph belongs to some class \({\mathcal {D}}\), considered here, can help.

Now, let us consider classes \({\mathcal {D}}\). Examples of classes satisfying property (P4a) in Theorem 1 come from forbidding some induced subgraphs. Bacsó et al. [4] proved that \(P_t\)-free graphs with maximum degree \(\varDelta\) have treewidth \({\mathcal {O}}(\varDelta \cdot t)\). Very recently, Chudnovsky et al. [11] observed that long-hole-free graphs, that is, graphs with no induced cycles of length at least 5, also have balanced separators of size \({\mathcal {O}}(\varDelta )\).

An example of a class satisfying property (P4b) is the class of string graphs—intersection graphs of arc-connected subsets of the plane [22,23,24]. The importance of this class stems from the fact that they serve as a common generalization of classes of intersection graphs of geometric objects in the plane. Lee [25] showed that they admit balanced separators of size \({\mathcal {O}}(\sqrt{m})\), where m is the number of edges. In fact, he proved a more general result that if \({{{\mathcal {M}}}}\) is a class of graphs excluding a fixed graph as a minor, then intersection graphs of connected subgraphs of graphs from \({{{\mathcal {M}}}}\) admit balanced separators of size \({\mathcal {O}}(\sqrt{m})\). String graphs are precisely the intersection graphs of connected subgraphs of planar graphs [19].

Summing up, we obtain the following.

Corollary 1

Each of the following problems can be solved in \(2^{\tilde{{\mathcal {O}}}(n^{2/3})}\) time on \(P_t\)-free graphs (for every fixed t) and on long-hole-free graphs, and in \(2^{\tilde{{\mathcal {O}}}(n^{3/4})}\) time on string graphs:

  1. 1.

    Max Induced Matching,

  2. 2.

    Max Induced Forest,

  3. 3.

    Max Induced Degree-d Subgraph, for every fixed \(d\in \mathbb {N},\)

  4. 4.

    Max Induced Planar Subgraph,

  5. 5.

    Max Induced \(\varSigma\)-Embeddable Subgraph, for every fixed surface \(\varSigma ,\)

  6. 6.

    Max Induced d-Degenerate Subgraph, for every fixed \(d\in \mathbb {N}.\)

As we have argued, in Corollary 1, we can replace string graphs with intersection graphs of connected subgraphs of graphs from \({{\mathcal {M}}}\), where \({{\mathcal {M}}}\) is any class of graphs excluding a fixed graph as a minor; this is because the result of Lee [25] holds in that generality.

Finally, let us point out that we can easily extend the approach of Theorem 1 to enforce some constraints on the set of vertices that are removed, i.e., \(V(G) \setminus A\). For example, we might require that this set is independent. To obtain this, whenever we decide to discard a vertex in the branching phase, we need to mark all its neighbors, so that we do not discard them later. Note that this might result in having a marked vertex of degree at least \(n^{2\tau }\), which is adjacent to at least \(n^{\tau }\) marked vertices. In this case we cannot perform any branching, but we can immediately terminate this call, as the existence of such a vertex certifies that \(A'\) was not chosen properly. Furthermore, in standard dynamic programming algorithms, based on tree decompositions, the constraints coming from marking can also be handled easily. Thus, in particular, we obtain the following corollary, answering a question by Paulusma [29].

Corollary 2

For every fixed t, the Independent Feedback Vertex Set problem can be solved in \(2^{\tilde{{\mathcal {O}}}(n^{2/3})}\) time on \(P_t\)-free graphs with n vertices.

4 Refined Algorithm for String Graphs

Let us point out that subexponential-time algorithms for Max Induced Matching and Max Induced Forest on string graphs, even with a better running time \(2^{\tilde{{\mathcal {O}}}(n^{2/3})}\), were already known [8]. They are based on another result by Lee [25].

Theorem 2

(Lee [25]) There is a constant \(c >0\) such that for every \(t \geqslant 1\) the following holds: every n-vertex string graph that does not contain \(K_{t,t}\) as a (not necessarily induced) subgraph has at most \(c \cdot n \cdot t \log t\) edges.

The better running time comes from a similar win-win approach: either we have few edges (and thus a small balanced separator), or we have a large biclique, which can be exploited for branching in a very effective way. It turns out that a similar idea can be used to improve the running times of algorithms for other problems mentioned in Corollary 1, if the input is a string graph.

We prove the following general result.

Theorem 3

Let t be a constant, and let \({\mathcal {C}}\) be a class of graphs with the following properties:

  1. (SP1)

    No graph from \({\mathcal {C}}\) contains \(K_{t+1,t+1}\) as a subgraph.

  2. (SP2)

    Given a string graph G with n vertices and treewidth w, one can find a largest set \(A \subseteq V(G)\) such that \(G[A]\in {\mathcal {C}}\) in \(2^{\tilde{{\mathcal {O}}}(w)}\cdot n^{{\mathcal {O}}(1)}\) time.

Then, given an n-vertex string graph G, one can find a largest set \(A\subseteq V(G)\) such that \(G[A]\in {\mathcal {C}}\) in \(2^{\tilde{{\mathcal {O}}}(n^{2/3})}\) time. The algorithm does not require the geometric representation.

Proof

The proof is similar to the proof of Theorem 1. Let G be the input string graph with n vertices. We assume that n is sufficiently large compared to t, as otherwise the input has constant size and we can solve the problem using brute force. Let A be the (unknown) solution that we are trying to find. First, we check whether G contains a subgraph isomorphic to \(K_{n^{1/3},n^{1/3}}\). We can do it in total time \(2^{\tilde{{\mathcal {O}}}(n^{1/3})}\) by exhaustive enumeration of all pairs of disjoint sets, each of size \(n^{1/3}\).

First, consider the case that such a biclique exists, and let X and Y be its bipartition classes. Note that \(|A \cap X| \leqslant t\) or \(|A \cap Y| \leqslant t\), as otherwise G[A] contains \(K_{t+1,t+1}\) as a subgraph, which contradicts property (SP1). In other words, we can guess a set of at most t vertices from one of the classes and immediately discard all other vertices from this class. We perform such a branching, and the number of branches is bounded by

$$\begin{aligned} F(n) \leqslant 2 \cdot \left( n^{1/3}\right) ^t F\left( n - (n^{1/3}-t) \right) \leqslant 2^{\tilde{{\mathcal {O}}}(n^{2/3})}. \end{aligned}$$

So let us assume that the search for a biclique fails. By Theorem 2, this means that G has \(\tilde{{\mathcal {O}}}(n^{4/3})\) edges, so, by a result of Lee [25, Theorem 1] and Lemma 1, G has treewidth \(\tilde{{\mathcal {O}}}(n^{2/3})\). Then we call the algorithm from property (SP2) to compute the solution. The total running time is \(2^{\tilde{{\mathcal {O}}}(n^{2/3})}\). \(\square\)
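The recurrence for the number of branches can be unrolled as follows. This is a sketch under two assumptions that the proof leaves implicit: the biclique size stays fixed at \(N^{1/3}\) throughout the recursion, where N denotes the number of vertices of the original input, and N is large enough that \(N^{1/3}\geqslant 2t\). Each branching step then multiplies the number of branches by at most \(2N^{t/3}\) and removes at least \(N^{1/3}-t\geqslant N^{1/3}/2\) vertices, so there are at most \(2N^{2/3}\) nested steps:

$$\begin{aligned} F(N) \leqslant \left( 2N^{t/3}\right) ^{2N^{2/3}} = 2^{2N^{2/3}\left( 1+\frac{t}{3}\log _2 N\right) } = 2^{\tilde{{\mathcal {O}}}(N^{2/3})}. \end{aligned}$$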

Now let us apply the theorem above to problems mentioned in Corollary 1. Clearly, if \({\mathcal {C}}\) is the class of forests, then it satisfies property (SP1) in Theorem 3 for \(t = 1\). If \({\mathcal {C}}\) is a class of graphs with degeneracy at most d (this already contains the case of graphs with maximum degree at most d), then property (SP1) is satisfied for \(t=d\). If \({\mathcal {C}}\) is the class of planar graphs, then property (SP1) is satisfied for \(t=2\). Finally, if \({\mathcal {C}}\) is the class of graphs embeddable in a fixed surface \(\varSigma\), then property (SP1) is satisfied for \(t=2g+2\), where g is the Euler genus of \(\varSigma\). Indeed, it was observed by Ringel [32] that \(K_{3,2g+3}\) cannot be embedded in a surface with Euler genus g.
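The bound for surfaces can be sanity-checked with the Euler formula. The following derivation is a standard argument added for illustration, not quoted from [32]. It assumes a connected bipartite graph with n vertices, m edges, and at least two edges, cellularly embedded with f faces in a surface of Euler genus \(g'\leqslant g\); every face is then bounded by a closed walk of length at least 4, so \(4f\leqslant 2m\):

$$\begin{aligned} n-m+f = 2-g' \quad \text {and}\quad f\leqslant \tfrac{m}{2} \quad&\Longrightarrow \quad m\leqslant 2n-4+2g'\leqslant 2n-4+2g,\\ K_{3,k}:\; n=k+3,\; m=3k \quad&\Longrightarrow \quad 3k\leqslant 2k+2+2g \quad \Longleftrightarrow \quad k\leqslant 2g+2. \end{aligned}$$

Hence \(K_{3,2g+3}\) violates this edge bound for Euler genus g, and so does every graph containing it as a subgraph, in particular \(K_{2g+3,2g+3}\). For \(g=0\) this recovers the non-planarity of \(K_{3,3}\).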

Summing up, we obtain the following corollary from Theorem 3.

Corollary 3

Each of the following problems can be solved in \(2^{\tilde{{\mathcal {O}}}(n^{2/3})}\) time on string graphs, even if no geometric representation is given:

  1. 1.

    Max Induced Matching,

  2. 2.

    Max Induced Forest,

  3. 3.

    Max Induced Degree-d Subgraph, for every fixed \(d\in \mathbb {N},\)

  4. 4.

    Max Induced Planar Subgraph,

  5. 5.

    Max Induced \(\varSigma\)-Embeddable Subgraph, for every fixed surface \(\varSigma ,\)

  6. 6.

    Max Induced d-Degenerate Subgraph, for every fixed \(d\in \mathbb {N}.\)

5 Max Induced Forest in H-Free Graphs

Our original motivation was the Max Induced Forest problem. In Sect. 3, we discussed a subexponential-time algorithm solving it on \(P_t\)-free graphs. We now show that as long as the considered class of inputs \({\mathcal {D}}\) is characterized by a single excluded induced subgraph, that is, we investigate Max Induced Forest on H-free graphs for a fixed graph H, we cannot hope for more positive results. Namely, it turns out that if H is not a linear forest (i.e., a collection of vertex-disjoint paths), the problem is unlikely to admit a polynomial-time or even a subexponential-time algorithm on H-free graphs. Specifically, we obtain the following dichotomy.

Theorem 4

Let H be a fixed graph.

  1. 1.

    If H is a linear forest, then the Max Induced Forest problem can be solved in \(2^{\tilde{{\mathcal {O}}}(n^{2/3})}\) time on H-free graphs with n vertices.

  2. 2.

    Otherwise, on H-free graphs, the Max Induced Forest problem is NP-complete and cannot be solved in \(2^{o(n)}\) time unless the ETH fails.

Statement 1 of Theorem 4 follows from Corollary 1, because every linear forest is an induced subgraph of some path. Statement 2 follows from a combination of arguments already existing in the literature. However, since the proof is simple, we include it for the sake of completeness.

We prove statement 2 of Theorem 4 in two steps. First, we consider graphs H that contain a cycle or two branch vertices, that is, vertices of degree at least 3. In this case, we can apply the standard argument of subdividing every edge a suitable number of times, cf. [10, Theorem 3].

Lemma 2

Let H be a fixed graph that either contains a cycle or has a connected component with at least two branch vertices. Then Max Induced Forest is NP-complete on H-free graphs. Moreover, there is no algorithm solving Max Induced Forest in \(2^{o(n)}\) time for n-vertex H-free graphs unless the ETH fails.

Proof

We reduce from Max Induced Forest in graphs with maximum degree 6; it is known that this problem is NP-complete and has no subexponential-time algorithm assuming the ETH [12]. Let G be a graph with n vertices and maximum degree 6. Let \(G^*\) be the graph obtained from G by subdividing every edge \(|V(H)|+1\) times. It is straightforward to observe that G has an induced forest on \(n-k\) vertices if and only if \(G^*\) has an induced forest on \(|V(G^*)|-k\) vertices. Moreover, since G has bounded maximum degree and H is fixed, the number of vertices in \(G^*\) is linear in n.

Finally, we show that \(G^*\) is H-free. First, observe that if H contains a cycle, then H cannot be a subgraph of \(G^*\), as the girth of \(G^*\) is greater than \(|V(H)|+1\). On the other hand, the distance between any two branch vertices in \(G^*\) is at least \(|V(H)|+1\), so \(G^*\) does not contain H as a subgraph in case H has two branch vertices in the same connected component. \(\square\)
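The construction of \(G^*\) is simple to state in code. The following Python sketch is illustrative only; the function name `subdivide_edges`, the edge-list input format, and the tuple labels used for the new vertices are assumptions. Each edge uv is replaced by a path from u to v with `times` fresh internal vertices, so in the setting of Lemma 2 one would call it with `times` equal to \(|V(H)|+1\).

```python
def subdivide_edges(vertices, edges, times):
    """Return (vertices, edges) of the graph with every edge subdivided `times` times.

    Fresh vertices get hypothetical labels ("sub", edge_index, position); the
    original vertex labels are assumed not to collide with these tuples.
    """
    new_vertices = list(vertices)
    new_edges = []
    for index, (u, v) in enumerate(edges):
        # The edge uv becomes the path u, s_0, ..., s_{times-1}, v.
        path = [u] + [("sub", index, i) for i in range(times)] + [v]
        new_vertices.extend(path[1:-1])
        new_edges.extend(zip(path, path[1:]))
    return new_vertices, new_edges
```

The resulting graph has \(|V(G)|+\texttt{times}\cdot |E(G)|\) vertices, which is linear in n when the maximum degree of G is bounded and `times` is a constant.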

By Lemma 2, the only graphs H for which we might hope for a polynomial-time or even a subexponential-time algorithm for Max Induced Forest on H-free graphs are collections of disjoint subdivided stars. To resolve this case, we will show that the problem remains hard for line graphs. Recall that the line graph L(G) of a graph G is the graph whose vertices are the edges of G and where the adjacency relation corresponds to the relation of having a common endpoint in G.

Actually, Chiarelli et al. [10] reported that the hardness of Max Induced Forest on line graphs was observed by Speckenmeyer in his PhD thesis [33]. However, we were unable to find this result there. Therefore, we provide the easy proof, which boils down to essentially the same argument as in [10, Theorem 5].

Lemma 3

Max Induced Forest is NP-complete on line graphs. Moreover, there is no algorithm solving Max Induced Forest in \(2^{o(n)}\) time for n-vertex line graphs unless the ETH fails.

Proof

We reduce from the Hamiltonian Path problem, which is NP-complete and has no subexponential-time algorithm under the ETH, even if the input graph has linearly many edges [12]. Let G be a graph with n vertices, which is the input instance of Hamiltonian Path.

First, note that any induced forest in L(G) corresponds to a collection of vertex-disjoint paths in G. More formally, consider a set \(E' \subseteq E(G)\), such that \(L(G)[E']\) is a forest. We claim that the subgraph \(G'=(V(G),E')\) of G is a collection of vertex-disjoint paths. Suppose not. This means that \(G'\) contains a vertex v of degree at least 3 or a cycle C. In the former case, the edges incident to v in \(G'\) form a clique in \(L(G)[E']\). In the latter case, the edges of the cycle C form a cycle in \(L(G)[E']\). In either case, we get a contradiction to the assumption that \(L(G)[E']\) is a forest.

We claim that G has a Hamiltonian path if and only if L(G) has an induced forest on \(n-1\) vertices, where n denotes the number of vertices of G. Indeed, the \(n-1\) edges of a Hamiltonian path in G induce a path (in particular, a forest) in L(G). For the converse, suppose that L(G) has an induced forest on at least \(n-1\) vertices. By the observation above, this induced forest corresponds to a collection of vertex-disjoint paths in G with at least \(n-1\) edges in total. This is only possible if this collection consists of a single path of length \(n-1\), that is, a Hamiltonian path in G.

Finally, observe that the number of vertices of L(G) is equal to the number of edges of G, which is linear in the number of vertices of G. \(\square\)
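The correspondence used in the proof can be checked on small instances. The following Python sketch is only an illustration under our own assumptions: the helper names `line_graph`, `induces_forest` and `max_induced_forest_size` are hypothetical, and the search is plain brute force rather than an algorithm from the paper. It builds L(G) and computes the size of a largest induced forest, so that the value \(n-1\) can be compared against the existence of a Hamiltonian path.

```python
from itertools import combinations


def line_graph(edges):
    """Adjacency of L(G): vertices are the edges of G, stored as frozensets."""
    verts = [frozenset(e) for e in edges]
    adj = {e: set() for e in verts}
    for e, f in combinations(verts, 2):
        if e & f:  # e and f share an endpoint in G
            adj[e].add(f)
            adj[f].add(e)
    return adj


def induces_forest(adj, subset):
    """Return True if the subgraph induced by subset is acyclic (union-find)."""
    subset = list(subset)
    parent = {v: v for v in subset}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in combinations(subset, 2):
        if v in adj[u]:
            ru, rv = find(u), find(v)
            if ru == rv:  # this edge closes a cycle
                return False
            parent[ru] = rv
    return True


def max_induced_forest_size(adj):
    """Brute force over all vertex subsets, largest first (exponential time)."""
    verts = list(adj)
    for size in range(len(verts), -1, -1):
        if any(induces_forest(adj, s) for s in combinations(verts, size)):
            return size


# The 4-cycle has a Hamiltonian path, the star K_{1,3} does not (n = 4 in both).
cycle4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
star3 = [(0, 1), (0, 2), (0, 3)]
```

For the 4-cycle the line graph is again a 4-cycle, so the largest induced forest has \(3=n-1\) vertices. For the star the line graph is a triangle, so the largest induced forest has only 2 vertices.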

Recall that line graphs are claw-free, that is, they contain no induced copy of \(K_{1,3}\). Thus Lemma 3 implies that if H contains any star with at least 3 leaves, then Max Induced Forest remains NP-complete and has no subexponential-time algorithm on H-free graphs unless the ETH fails. Statement 2 of Theorem 4 follows from combining Lemmas 2 and 3.

6 Largest Induced Degenerate Subgraph in Low-Treewidth Graphs

This section is devoted to the proof of the following result, which we used in Sect. 3.

Lemma 4

For every fixed \(d\in \mathbb {N},\) there is an algorithm for Max Induced d-Degenerate Subgraph with running time \(2^{{\mathcal {O}}(w\log w)}\cdot n,\) where w is the treewidth of the input graph and n is the number of its vertices.

Preliminaries on tree decompositions. First, we introduce the notation and terminology required in this section. A tree decomposition of a graph G is a tree T together with a mapping \(\beta (\cdot )\) that assigns a bag \(\beta (x)\) to each node x of T in such a way that the following conditions hold:

  (T1) for each \(u\in V(G)\), the set of nodes x with \(u\in \beta (x)\) induces a connected non-empty subtree of T; and

  (T2) for each \(uv\in E(G)\), there exists a node x such that \(\{u,v\}\subseteq \beta (x)\).

The width of a tree decomposition \((T,\beta )\) is \(\max _{x\in V(T)} |\beta (x)|-1\), and the treewidth of a graph G is the minimum width of a tree decomposition of G.
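To make conditions (T1) and (T2) concrete, here is a minimal Python sketch that verifies them and computes the width. The function names and the input encoding (a dictionary `tree_adj` for T and a dictionary `bags` for \(\beta\)) are our own assumptions for illustration. The sketch also assumes that `tree_adj` really describes a tree and does not check this.

```python
def is_tree_decomposition(vertices, edges, tree_adj, bags):
    """Check (T1) and (T2); tree_adj is assumed to be the adjacency of a tree."""
    # (T2): every edge of G is contained in some bag.
    for u, v in edges:
        if not any(u in bag and v in bag for bag in bags.values()):
            return False
    # (T1): the nodes whose bags contain u form a non-empty connected subtree.
    for u in vertices:
        nodes = {x for x, bag in bags.items() if u in bag}
        if not nodes:
            return False
        start = next(iter(nodes))
        seen, stack = {start}, [start]
        while stack:
            x = stack.pop()
            for y in tree_adj[x]:
                if y in nodes and y not in seen:
                    seen.add(y)
                    stack.append(y)
        if seen != nodes:
            return False
    return True


def width(bags):
    """Width of the decomposition: largest bag size minus one."""
    return max(len(bag) for bag in bags.values()) - 1


# Path a-b-c-d with a decomposition along the path 0-1-2.
V = ["a", "b", "c", "d"]
E = [("a", "b"), ("b", "c"), ("c", "d")]
T = {0: [1], 1: [0, 2], 2: [1]}
good = {0: {"a", "b"}, 1: {"b", "c"}, 2: {"c", "d"}}
bad = {0: {"a", "b"}, 1: {"c"}, 2: {"b", "c", "d"}}  # b occurs in nodes 0 and 2 only
```

Here `good` is a valid decomposition of width 1, while `bad` covers every edge but violates (T1) for the vertex b.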

Henceforth, all tree decompositions will be rooted: the underlying tree T has a prescribed root vertex r. This gives rise to a natural ancestor-descendant relation: we write \(x\preceq y\) if x is an ancestor of y (where possibly \(x=y\)). Then, for a node x of T, we define the component at x as

$$\begin{aligned} \alpha (x)=\biggl (\bigcup _{y\succeq x} \beta (y)\biggr )\setminus \beta (x). \end{aligned}$$

It easily follows from (T1) and (T2) that \(N(\alpha (x))\subseteq \beta (x)\) for every node x.
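As an illustration of the definition of \(\alpha (x)\), the following Python sketch computes the components of a rooted decomposition and lets one test the inclusion \(N(\alpha (x))\subseteq \beta (x)\) on an example. The encoding (a `children` dictionary for the rooted tree, a `bags` dictionary for \(\beta\)) and the function names are hypothetical choices made for this sketch.

```python
def components(children, bags, root):
    """Return alpha with alpha[x] = (union of bags at descendants of x) minus bags[x]."""
    alpha = {}

    def subtree_union(x):
        union = set(bags[x])
        for y in children.get(x, ()):
            union |= subtree_union(y)
        alpha[x] = union - set(bags[x])
        return union

    subtree_union(root)
    return alpha


def open_neighborhood(adj, S):
    """Vertices outside S that have a neighbor in S."""
    return set().union(*(adj[v] for v in S)) - set(S)


# Path a-b-c-d, decomposition rooted at node 0.
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
bags = {0: {"a", "b"}, 1: {"b", "c"}, 2: {"c", "d"}}
children = {0: [1], 1: [2]}
alpha = components(children, bags, 0)
```

In this example \(\alpha\) of the root is \(\{c,d\}\), whose only outside neighbor is b, which indeed lies in the root bag.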

A nice tree decomposition is a normalized form of a rooted tree decomposition in which every node is of one of the following four kinds.

  • Leaf node: a node x with no children and with \(\beta (x)=\emptyset\).

  • Introduce node: a node x with one child y such that \(\beta (x)=\beta (y)\cup \{u\}\) for some vertex \(u\notin \beta (y)\).

  • Forget node: a node x with one child y such that \(\beta (x)=\beta (y)\setminus \{u\}\) for some vertex \(u\in \beta (y)\).

  • Join node: a node x with two children y and z such that \(\beta (x)=\beta (y)=\beta (z)\).

Moreover, we require that the root r of the nice tree decomposition satisfies \(\beta (r)=\emptyset\).

It is known that any given tree decomposition \((T,\beta )\) of width k of an n-vertex graph G can be transformed in \(k^{{\mathcal {O}}(1)}\cdot \max (n,|V(T)|)\) time into a nice tree decomposition of G of width at most as large, see [12, Lemma 7.4]. Moreover, given an n-vertex graph G of treewidth w, a tree decomposition of G of width at most \(5w+4\) can be computed in \(2^{{\mathcal {O}}(w)}\cdot n\) time [6], and this tree decomposition has at most n nodes. By combining these two results, for the proof of Lemma 4, we can assume that the input graph G is supplied with a nice tree decomposition \((T,\beta )\) of width \(k\leqslant 5w+4\), where \(w={\mathrm {tw}}(G)\). From now on, our goal is to design a suitable dynamic programming algorithm working on this decomposition with running time \(2^{{\mathcal {O}}(k\log k)}\cdot n=2^{{\mathcal {O}}(w\log w)}\cdot n\).

Dynamic programming states. The main idea behind our dynamic programming algorithm is to view the notion of degeneracy via vertex orderings, as expressed in the following fact.

Lemma 5

(folklore) A graph H is d-degenerate if and only if there is a linear ordering \(\sigma\) of vertices of H such that every vertex of H has at most d neighbors that are smaller in \(\sigma .\)

Let us point out that sometimes degeneracy is expressed in terms of vertex ordering, where we count neighbors that are larger. This characterization is clearly equivalent, as it is sufficient to reverse the ordering given in Lemma 5.
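The ordering from Lemma 5 can be produced by the standard greedy procedure that repeatedly removes a vertex of minimum degree. The Python sketch below is a simple quadratic-time illustration with hypothetical function names, not an optimized implementation. It returns the degeneracy together with an ordering in which the vertices removed first are the largest ones.

```python
def degeneracy_ordering(adj):
    """Return (d, sigma); sigma lists vertices smallest first, and every
    vertex has at most d neighbors that are smaller in sigma."""
    remaining = {v: set(ns) for v, ns in adj.items()}
    removal, d = [], 0
    while remaining:
        # Remove a vertex of minimum degree in the remaining graph.
        v = min(remaining, key=lambda u: len(remaining[u]))
        d = max(d, len(remaining[v]))
        for u in remaining[v]:
            remaining[u].discard(v)
        del remaining[v]
        removal.append(v)
    # A removed vertex sees only not-yet-removed neighbors, so reversing
    # the removal order turns those into its smaller neighbors.
    return d, removal[::-1]


def max_smaller_neighbors(adj, sigma):
    """Largest number of smaller neighbors over all vertices, w.r.t. sigma."""
    pos = {v: i for i, v in enumerate(sigma)}
    return max((sum(pos[u] < pos[v] for u in adj[v]) for v in adj), default=0)


cycle4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
path3 = {0: {1}, 1: {0, 2}, 2: {1}}
```

On the 4-cycle the procedure reports degeneracy 2, and on the 3-vertex path it reports 1. In both cases the returned ordering respects the bound stated in Lemma 5.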

Due to Lemma 5, the problem considered in Lemma 4 can be restated as follows: find a largest set \(A\subseteq V(G)\) that admits a linear ordering \(\sigma\) in which every vertex of A has at most d neighbors in G[A] that are smaller in \(\sigma\). Intuitively, our dynamic programming will therefore keep track of the intersection of the bag with A, the restriction of \(\sigma\) to this intersection, and the number of smaller neighbors of each vertex from this intersection that have already been forgotten.

We now proceed with formal details. For a node x of T, a set \(X\subseteq \beta (x)\), a linear ordering \(\sigma\) of X, and a function \(f:X\rightarrow \{0,\ldots ,d\}\), we define \(\varPhi _x[X,\sigma ,f]\in \mathbb {N}\) as follows. The value \(\varPhi _x[X,\sigma ,f]\) is the maximum size of a set \(Y\subseteq \alpha (x)\) such that \(X\cup Y\) admits a linear ordering \(\tau\) with the following properties: \(\tau\) restricted to X is equal to \(\sigma\); every vertex of Y has at most d neighbors in \(X\cup Y\) that are smaller in \(\tau\); and for every \(a\in X\), there are at most f(a) vertices \(b\in Y\) that are adjacent to a and smaller than a in \(\tau\). Note that other neighbors of a that belong to X are not taken into consideration when verifying the quota imposed by f(a). Note also that such a set Y always exists, as \(Y=\emptyset\) satisfies the criteria.

For a fixed node x, the total number of triples \((X,\sigma ,f)\) as above is at most

$$\begin{aligned} 2^{k+1}\cdot (k+1)!\cdot (d+1)^{k+1}\leqslant 2^{{\mathcal {O}}(k\log k)}. \end{aligned}$$

We now show how to compute the values \(\varPhi _x[X,\sigma ,f]\) in a bottom-up manner, so that the values for a node x are computed based on the values for the children of x in \(2^{{\mathcal {O}}(k\log k)}\) time. The answer to the problem corresponds to the value \(\varPhi _r[\emptyset ,\emptyset ,\emptyset ]\), where r is the root of T. While \(\varPhi _r[\emptyset ,\emptyset ,\emptyset ]\) is just the size of a largest feasible solution, an actual solution can be recovered from the dynamic programming tables using standard methods within the same complexity: for every computed value \(\varPhi _x[X,\sigma ,f]\), we store the way this value was obtained, and then we trace back the solution from \(\varPhi _r[\emptyset ,\emptyset ,\emptyset ]\) in a top-down manner.

Transitions. It remains to provide recursive formulas for the values of \(\varPhi _x[\cdot ,\cdot ,\cdot ]\). We only present the formulas, while the verification of their correctness, which follows easily from the definition of \(\varPhi _x[\cdot ,\cdot ,\cdot ]\), is left to the reader. As usual, we distinguish cases depending on the type of x.

  • Leaf node x. Then we have only one value:

    $$\begin{aligned} \varPhi _x[\emptyset ,\emptyset ,\emptyset ]=0. \end{aligned}$$
  • Introduce node x with child y such that \(\beta (x)=\beta (y)\cup \{u\}\). Then

    $$\begin{aligned} \varPhi _x[X,\sigma ,f]={\left\{ \begin{array}{ll}\varPhi _y[X,\sigma ,f] &{} \text {if }u\notin X;\\ \varPhi _y[X\setminus \{u\},\sigma |_{X\setminus \{u\}},f|_{X\setminus \{u\}}] &{} \text {if }u\in X.\end{array}\right. } \end{aligned}$$
  • Forget node x with child y such that \(\beta (x)=\beta (y)\setminus \{u\}\). Then we have

    $$\begin{aligned} \varPhi _x[X,\sigma ,f] = \max \,\left( \, \varPhi _y[X,\sigma ,f],\ 1 + \max _{(\sigma ',f')\in S(X,\sigma ,f)} \varPhi _y[X\cup \{u\},\sigma ',f']\,\right) , \end{aligned}$$

    where \(S(X,\sigma ,f)\) is the set comprising the pairs \((\sigma ',f')\) satisfying the following:

    • \(\sigma '\) is a vertex ordering of \(X\cup \{u\}\) whose restriction to X is equal to \(\sigma\); and

    • \(f':X\cup \{u\}\rightarrow \{0,\ldots ,d\}\) is such that for all \(a\in X\) that are adjacent to u and larger than u in \(\sigma '\), we have \(f'(a)\leqslant f(a)-1\), and for all other \(a\in X\), we have \(f'(a)\leqslant f(a)\). Moreover, we require that \(f'(u)\leqslant d-\ell\), where \(\ell\) is the number of vertices \(a\in X\) that are adjacent to u and smaller than u in \(\sigma '\).

  • Join node x with children y and z. Then

    $$\begin{aligned} \varPhi _x[X,\sigma ,f] = \max _{f_y+f_z\leqslant f} \varPhi _y[X,\sigma ,f_y]+\varPhi _z[X,\sigma ,f_z], \end{aligned}$$

    where \(f_y+f_z\leqslant f\) means that \(f_y(a)+f_z(a)\leqslant f(a)\) for each \(a\in X\).

It is straightforward to see that using the formulas above, each value \(\varPhi _x[X,\sigma ,f]\) can be computed in \(2^{{\mathcal {O}}(k\log k)}\) time based on the values computed for the children of x. This completes the proof of Lemma 4.
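To illustrate the forget transition, the following Python sketch enumerates candidate pairs \((\sigma ',f')\) from \(S(X,\sigma ,f)\). It is a simplification made under our own assumptions: for each position at which u can be inserted into \(\sigma\), it produces only the pointwise largest admissible \(f'\), relying on the observation that raising a quota can only relax the constraint in the definition of \(\varPhi _y\). The function name and the encoding (orderings as tuples listed smallest first, quota functions as dictionaries with non-negative values) are hypothetical.

```python
def forget_candidates(sigma, f, u, adj_u, d):
    """Yield pairs (sigma', f') for the forget node, one per feasible
    insertion position of u, with f' pointwise maximal.

    sigma: tuple of the vertices of X, smallest first
    f:     dict mapping each vertex of X to its quota in {0, ..., d}
    adj_u: set of neighbors of u in G
    """
    for i in range(len(sigma) + 1):
        smaller, larger = sigma[:i], sigma[i:]
        # ell counts neighbors of u in X that precede u in sigma'.
        ell = sum(1 for a in smaller if a in adj_u)
        if ell > d:
            continue  # u would have no room left: f'(u) <= d - ell < 0
        f_new = dict(f)
        for a in larger:
            if a in adj_u:
                f_new[a] -= 1  # u becomes a forgotten smaller neighbor of a
        if any(q < 0 for q in f_new.values()):
            continue  # some larger neighbor of u had quota 0 already
        f_new[u] = d - ell
        yield smaller + (u,) + larger, f_new


# d = 1, X = {a, b} ordered a < b, and u adjacent to both a and b.
cands = list(forget_candidates(("a", "b"), {"a": 1, "b": 1}, "u", {"a", "b"}, 1))
```

In this example, placing u first or between a and b is feasible, whereas placing u last is not, since u would then have two smaller neighbors while \(d=1\).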