In this final section, we give our algorithmic results: an FPT randomised approximation scheme (FPTRAS) for #Size-BIS, and exact FPT algorithms for all three problems in bounded-degree graphs. We define an FPTRAS for #Size-BIS as in Arvind and Raman [1].
Definition 10
An FPTRAS for #Size-BIS is a randomised algorithm that takes as input a bipartite graph G, a non-negative integer k, and a real number \(\varepsilon \in (0,1)\) and outputs a real number z. With probability at least 2/3, the output z must satisfy \((1-\varepsilon )\text {IS}_k(G) \le z \le (1+\varepsilon )\text {IS}_k(G)\). Furthermore, there is a function \(f:{\mathbb {R}}\rightarrow {\mathbb {R}}\) and a polynomial p such that the running time of the algorithm is at most \(f(k)\,p(|V(G)|,1/\varepsilon )\).
Theorem 11
There is an FPTRAS for #Size-BIS with time complexity \(O\left( 2^k\cdot k^2/\varepsilon ^2\right) \) for input graphs with n vertices and m edges.
Proof
Let (G, k) be an instance of #Size-BIS with \(G = (U,V,E)\) and \(n = |V(G)|\). Let \(\varepsilon > 0\) be the other input of the FPTRAS. Let \(t = 10\lceil 2^k/\varepsilon ^2 \rceil \). The FPTRAS independently samples t uniformly-random size-k subsets of \(U\cup V\). Let X be the number of independent sets among the samples. The output z of the FPTRAS is \(z=X\cdot \left( {\begin{array}{c}n\\ k\end{array}}\right) /t\).
Note that \({\mathbb {E}}(X) = t\cdot \text {IS}_k(G)/\left( {\begin{array}{c}n\\ k\end{array}}\right) \). We now show that with probability at least 2/3,
$$\begin{aligned} (1-\varepsilon )\text {IS}_k(G) \le z \le (1+\varepsilon )\text {IS}_k(G). \end{aligned}$$
Since each sample lies entirely within U or entirely within V with probability at least \(2^{-k}\), we have \({\mathbb {E}}(X) \ge t2^{-k} \ge 10/\varepsilon ^2\). By Lemma 8, we have
$$\begin{aligned} {\mathbb {P}}\Big (|X - {\mathbb {E}}(X)| \ge \varepsilon {\mathbb {E}}(X)\Big ) \le 2e^{-10/3} < 1/3. \end{aligned}$$
Thus, with probability at least 2/3, we have \(|X - {\mathbb {E}}(X)| \le \varepsilon {\mathbb {E}}(X)\), and so \(|z - \text {IS}_k(G)| \le \varepsilon \text {IS}_k(G)\) holds as required.
Recall that we use the word-RAM model, in which operations on \(O(\log n)\)-sized words take O(1) time. Thus for each of the t samples, the algorithm generates the sample in O(k) time and makes \(\left( {\begin{array}{c}k\\ 2\end{array}}\right) \) queries to the graph to check that the selected set is an independent set. The running time is therefore as claimed. \(\square \)
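As an illustration only, the sampling procedure from the proof of Theorem 11 might be sketched in Python as follows. The function name `estimate_is_k`, the argument layout (two disjoint vertex lists and a list of edges), and the optional `rng` parameter are assumptions made for this sketch; the sample count t and the output z follow the proof.

```python
import math
import random

def estimate_is_k(left, right, edges, k, eps, rng=None):
    """Estimate IS_k(G) for a bipartite graph G = (left, right, edges).

    Hypothetical helper: `left` and `right` are disjoint lists of hashable
    vertices, `edges` is an iterable of vertex pairs.
    """
    rng = rng or random.Random()
    vertices = list(left) + list(right)
    n = len(vertices)
    if k > n:
        return 0.0                      # no size-k subsets exist
    adjacent = {frozenset(e) for e in edges}
    t = 10 * math.ceil(2 ** k / eps ** 2)   # number of samples, as in the proof
    hits = 0                                # plays the role of X
    for _ in range(t):
        sample = rng.sample(vertices, k)    # uniformly random size-k subset
        # at most C(k, 2) adjacency queries per sample
        if all(frozenset((u, v)) not in adjacent
               for i, u in enumerate(sample) for v in sample[i + 1:]):
            hits += 1
    return hits * math.comb(n, k) / t       # z = X * C(n, k) / t
```

On an edgeless graph every sample is independent, so the estimate equals \(\left( {\begin{array}{c}n\\ k\end{array}}\right) \) exactly; on other inputs the guarantee is only the probabilistic one of Definition 10.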
We now turn to our algorithms for bounded-degree graphs. We require the following definitions. For any positive integer s, an s-coloured graph is a tuple (G, c) where G is a graph and \(c:V(G) \rightarrow [s]\) is a map. Suppose \(\mathcal {G} = (G,c)\) and \(\mathcal {G}' = (G',c')\) are coloured graphs with \(G = (V,E)\) and \(G' = (V',E')\).
We say a map \(\phi :V\rightarrow V'\) is a homomorphism from \(\mathcal {G}\) to \(\mathcal {G}'\) if \(\phi \) is a homomorphism from G to \(G'\) and, for all \(v \in V\), \(c(v) = c'(\phi (v))\). If \(\phi \) is also bijective and \(\phi ^{-1}\) is a homomorphism from \(\mathcal {G}'\) to \(\mathcal {G}\), we say \(\phi \) is an isomorphism from \(\mathcal {G}\) to \(\mathcal {G}'\), that \(\mathcal {G}\) and \(\mathcal {G}'\) are isomorphic, and write \(\mathcal {G} \simeq \mathcal {G}'\). For all \(X \subseteq V\), we define \(\mathcal {G}[X] = (G[X], c|_X)\), and say \(\mathcal {G}[X]\) is an induced subgraph of \(\mathcal {G}\). Given coloured graphs \(\mathcal {H}\) and \(\mathcal {G}\), we denote the number of sets \(X \subseteq V(\mathcal {G})\) with \(\mathcal {G}[X] \simeq \mathcal {H}\) by \(\#{\text {Ind}}(\mathcal {H}\rightarrow \mathcal {G})\). Finally, we define \(V(\mathcal {G}) = V\) and \(E(\mathcal {G}) = E\) and we define \(\Delta (\mathcal {G})\) to be the maximum degree of G.
For each positive integer \(\Delta \), we consider a counting version of the induced subgraph isomorphism problem for coloured graphs of degree at most \(\Delta \).
We will later reduce our bipartite independent set counting problems to the coloured induced subgraph problem. Note that #Induced-Coloured-Subgraph\([\Delta ]\) can be expressed as a first-order model-counting problem in bounded-degree structures. A well-known result of Frick [11, Theorem 6] would yield an algorithm for #Induced-Coloured-Subgraph\([\Delta ]\) with running time \(g(k)\cdot n\), where \(k=|V(\mathcal {H})|\) and \(n=|V(\mathcal {G})|\). (To our knowledge this fact has not appeared in the literature, but the proof is not hard.) However, the function g of Frick’s algorithm may grow faster than any constant-height tower of exponentials. In the following, we provide an algorithm for #Induced-Coloured-Subgraph\([\Delta ]\) that is substantially faster: It runs in time \(O(n k^{(2\Delta +3)k})\).
The algorithm follows the strategy of [4] to count small subgraphs: Instead of counting (coloured) induced subgraphs, we can count (coloured) homomorphisms and recover the number of induced subgraphs via a simple basis transformation. Transforming to homomorphisms is useful because homomorphisms from small patterns to bounded-degree host graphs can be counted by a simple branching procedure; this is, however, not true for small induced subgraphs. The following lemma encapsulates counting homomorphisms in graphs of bounded degree. Given coloured graphs \(\mathcal {H}\) and \(\mathcal {G}\), we denote the number of homomorphisms from \(\mathcal {H}\) to \(\mathcal {G}\) by \(\#{\text {Hom}}(\mathcal {H}\rightarrow \mathcal {G})\).
Lemma 12
There is an algorithm to compute \(\#{\text {Hom}}(\mathcal {H}\rightarrow \mathcal {G})\) in time \(O(n k^k (\Delta +1)^k)\), where \(\mathcal {G}\) is a coloured graph with n vertices, \(\mathcal {H}\) is a coloured graph with k vertices, and both graphs have maximum degree at most \(\Delta \).
Proof
The algorithm works as follows: If \(\mathcal {H}\) is not connected, let \(\mathcal {H}_1,\dots ,\mathcal {H}_\ell \) be its connected components. Then it is straightforward to verify that
$$\begin{aligned} \#\hbox {Hom}({\mathcal {H}}\rightarrow {\mathcal {G}}) = \prod _{i=1}^\ell \#\hbox {Hom}({\mathcal {H}_i}\rightarrow {\mathcal {G}})\,. \end{aligned}$$
Thus it remains to describe the algorithm for connected pattern graphs \(\mathcal {H}\).
Let \(\mathcal {H}\) be connected. A sequence of vertices \(v_1, \dots , v_k\) in a graph F is a traversal if, for all \(i \in \{1,\dots ,k-1\}\), the vertex \(v_{i+1}\) is contained in \(\{v_1,\dots ,v_i\} \cup \Gamma (\{v_1, \dots , v_i\})\). Let \(u_1, \dots , u_k\) be an arbitrary traversal of \(\mathcal {H}\) with \(\{u_1,\dots ,u_k\}=V(\mathcal {H})\); the latter property can be satisfied since \(\mathcal {H}\) is a connected graph with k vertices. Note that if \(f:V(\mathcal {H})\rightarrow V(\mathcal {G})\) is a homomorphism from \(\mathcal {H}\) to \(\mathcal {G}\), then \(f(u_1),\dots ,f(u_k)\) is a traversal in \(\mathcal {G}\), and this correspondence is injective. Thus the algorithm computes the number of traversals \(v_1,\dots ,v_k\) in \(\mathcal {G}\) for which the mapping f with \(f(u_i)=v_i\) for all i is a homomorphism from \(\mathcal {H}\) to \(\mathcal {G}\). This number is equal to \(\#{\text {Hom}}(\mathcal {H}\rightarrow \mathcal {G})\), which the algorithm seeks to compute.
Since the maximum degree of \(\mathcal {G}\) is at most \(\Delta \), any set \({S \subseteq V(\mathcal {G})}\) satisfies \(|\Gamma (S)| \le \Delta |S|\). Thus there are at most \(n\cdot (\Delta k + k)^{k-1}\) traversal sequences in \(\mathcal {G}\), which can be generated in linear time in the number of such sequences. For each traversal sequence, verifying whether the sequence corresponds to a homomorphism takes time \(O(k\Delta )\) (in the word-RAM model with incidence lists for \(\mathcal {H}\) already prepared). Overall, we obtain a running time of \(O(n\cdot k^k \cdot (\Delta + 1)^k)\). \(\square \)
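The branching procedure behind Lemma 12 can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the exact procedure of the proof: coloured graphs are given as two dictionaries (adjacency sets and colours), all function names are hypothetical, and candidates for each pattern vertex are taken from the host neighbourhood of the image of one earlier pattern neighbour, which is a slightly pruned variant of enumerating all traversal sequences.

```python
def components(adj):
    """Split a graph into components; each is returned as a vertex order in
    which every vertex after the first has a neighbour earlier in the order."""
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        order, stack = [], [start]
        seen.add(start)
        while stack:
            u = stack.pop()
            order.append(u)
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        comps.append(order)
    return comps

def count_hom_connected(order, h_adj, h_col, g_adj, g_col):
    pos = {u: i for i, u in enumerate(order)}
    # earlier[i]: pattern neighbours of order[i] that precede it in the order
    earlier = [[w for w in h_adj[u] if pos[w] < pos[u]] for u in order]

    def extend(i, image):
        if i == len(order):
            return 1
        u = order[i]
        # first vertex: any host vertex; later: neighbours of an earlier image
        candidates = g_adj.keys() if i == 0 else g_adj[image[earlier[i][0]]]
        total = 0
        for v in candidates:
            if g_col[v] == h_col[u] and all(v in g_adj[image[w]] for w in earlier[i]):
                image[u] = v
                total += extend(i + 1, image)
                del image[u]
        return total

    return extend(0, {})

def count_hom(h_adj, h_col, g_adj, g_col):
    """#Hom(H -> G) as the product over the connected components of H."""
    result = 1
    for order in components(h_adj):
        result *= count_hom_connected(order, h_adj, h_col, g_adj, g_col)
    return result
```

For instance, with a monochromatic triangle as host, a single edge has 6 homomorphisms and a path on three vertices has 12.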
Using the above algorithm, we now construct an algorithm that performs a kind of basis transformation to obtain the number of induced coloured subgraphs.
Theorem 13
For all positive integers \(\Delta \), there is a fixed-parameter tractable algorithm for #Induced-Coloured-Subgraph\([\Delta ]\) with time complexity \(O(n \cdot k^{(2\Delta +3)\cdot k})\) for n-vertex coloured graphs \(\mathcal {G}\) and k-vertex coloured graphs \(\mathcal {H}\).
Proof
Let \((\mathcal {H},\mathcal {G})\) be an instance of #Induced-Coloured-Subgraph\([\Delta ]\), write \(\mathcal {G} = (G,c)\) and \(\mathcal {H} = (H,c')\), and let k be the number of vertices of \(\mathcal {H}\). Without loss of generality, suppose that the ranges of c and \(c'\) are [q] for some positive integer \(q \le k\). Namely, if any vertices of G receive colours not in the range of \(c'\), then our algorithm may remove them without affecting \(\#{\text {Ind}}(\mathcal {H}\rightarrow \mathcal {G})\); if any vertices of H receive colours not in the range of c, then \(\#{\text {Ind}}(\mathcal {H}\rightarrow \mathcal {G}) = 0\).
For coloured graphs \(\mathcal {K}\) and \(\mathcal {B}\), let \(\#{\text {Surj}}(\mathcal {K}\rightarrow \mathcal {B})\) be the number of vertex-surjective homomorphisms from \(\mathcal {K}\) to \(\mathcal {B}\), i.e., the number of those homomorphisms from \(\mathcal {K}\) to \(\mathcal {B}\) that contain all vertices of \(\mathcal {B}\) in their image.
Let S be the set of all q-coloured graphs \(\mathcal {K}\) such that \(\Delta (\mathcal {K}) \le \Delta \) and, for some \(t \in [k]\), \(V(\mathcal {K}) = [t]\). Let \(S'\) be a set of representatives of (coloured) isomorphism classes of S.
Let \({\varvec{x}}\) be the vector indexed by \(S'\) such that \({\varvec{x}}_{\mathcal {K}} = \#{\text {Ind}}(\mathcal {K}\rightarrow \mathcal {G})\) for all \(\mathcal {K} \in S'\). This vector contains the number of induced subgraph copies of \(\mathcal {H}\) in \(\mathcal {G}\), but it also contains the number of induced subgraph copies of all other graphs in \(S'\) in \(\mathcal {G}\). Let \({\varvec{b}}\) be the vector indexed by \(S'\) such that \({\varvec{b}}_{\mathcal {K}} = \#{\text {Hom}}(\mathcal {K}\rightarrow \mathcal {G})\) for all \(\mathcal {K} \in S'\); each entry of this vector can be computed via the algorithm of Lemma 12. Then we will show that \({\varvec{x}}\) and \({\varvec{b}}\) can be related to each other via an invertible matrix A such that \(A{\varvec{x}} = {\varvec{b}}\). By calculating A and \({\varvec{b}}\), we can then output \(\#{\text {Ind}}(\mathcal {H}\rightarrow \mathcal {G}) = (A^{-1}{\varvec{b}})_{\mathcal {H}}\).
To elaborate on this linear relationship between induced subgraph and homomorphism numbers, let us first consider some arbitrary graph \(\mathcal {K} \in S'\). By partitioning the homomorphisms from \(\mathcal {K}\) to \(\mathcal {G}\) according to their image, we have
$$\begin{aligned} \#\hbox {Hom}({\mathcal {K}}\rightarrow {\mathcal {G}}) = \sum _{\begin{array}{c} X \subseteq V(\mathcal {G})\\ |X| \le k \end{array}}\#\hbox {Surj}({\mathcal {K}}\rightarrow {\mathcal {G}[X]}). \end{aligned}$$
In the right-hand side sum, we can collect terms with isomorphic induced subgraphs \(\mathcal {G}[X]\), since we clearly have \(\#{\text {Surj}}(\mathcal {K}\rightarrow \mathcal {B}) = \#{\text {Surj}}(\mathcal {K}\rightarrow \mathcal {B}')\) if \({\mathcal {B}} \simeq {\mathcal {B}'}\). Hence, we obtain
$$\begin{aligned} \#\hbox {Hom}({\mathcal {K}}\rightarrow {\mathcal {G}}) = \sum _{\mathcal {K}'\in S'}\#\hbox {Surj}({\mathcal {K}}\rightarrow {\mathcal {K}'}) \cdot \#\hbox {Ind}({\mathcal {K}'}\rightarrow {\mathcal {G}}). \end{aligned}$$
(8)
Let A be the matrix indexed by \(S'\) with \(A_{\mathcal {K},\mathcal {K}'} = \#{\text {Surj}}(\mathcal {K}\rightarrow \mathcal {K}')\) for all \(\mathcal {K},\mathcal {K}' \in S'\). Then (8) implies that \(A{\varvec{x}} = {\varvec{b}}\). (An uncoloured version of this linear system is originally due to Lovász [18].)
We next prove that A is invertible. Indeed, given \(\mathcal {K},\mathcal {K}' \in S'\), write \(\mathcal {K} \lesssim \mathcal {K}'\) if \(\mathcal {K}\) admits a vertex-surjective homomorphism to \(\mathcal {K}'\). Since \(\lesssim \) is a partial order, as is readily verified, it admits a topological ordering \(\pi \). Permuting the rows and columns of A to agree with \(\pi \) does not affect the rank of A, and it yields an upper triangular matrix with non-zero diagonal entries, so it follows that A is invertible.
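For illustration, consider the smallest non-trivial case: a single colour (\(q = 1\)) and \(k = 2\). The names \(I_2\), \(K_1\) and \(K_2\) are introduced only for this example and denote the independent set on two vertices, the single vertex, and the single edge, so that \(S' = \{I_2, K_1, K_2\}\). The order \(I_2, K_1, K_2\) is compatible with \(\lesssim \), and for a host graph G with n vertices and m edges the system \(A{\varvec{x}} = {\varvec{b}}\) reads

$$\begin{aligned} \begin{pmatrix} 2 & 1 & 2\\ 0 & 1 & 0\\ 0 & 0 & 2 \end{pmatrix} \begin{pmatrix} \#\hbox {Ind}(I_2\rightarrow G)\\ \#\hbox {Ind}(K_1\rightarrow G)\\ \#\hbox {Ind}(K_2\rightarrow G) \end{pmatrix} = \begin{pmatrix} n^2\\ n\\ 2m \end{pmatrix}. \end{aligned}$$

Back-substitution gives \(\#{\text {Ind}}(K_2\rightarrow G) = m\), \(\#{\text {Ind}}(K_1\rightarrow G) = n\), and \(\#{\text {Ind}}(I_2\rightarrow G) = (n^2 - n - 2m)/2 = \left( {\begin{array}{c}n\\ 2\end{array}}\right) - m\), which is indeed the number of non-adjacent vertex pairs.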
The algorithm is now immediate. It first determines S by listing all q-coloured graphs on at most k vertices with at most \(\lfloor \Delta k/2 \rfloor \) edges, then checking each one to see whether it satisfies the degree condition. It then determines \(S'\) from S by testing every pair of coloured graphs in S for isomorphism (by brute force). It then determines each entry \(A_{\mathcal {K},\mathcal {K}'}\) of A (by brute force) by listing the vertex-surjective maps \(\mathcal {K}\rightarrow \mathcal {K}'\). It then determines \({\varvec{b}}\) by invoking Lemma 12 to compute each entry \({\varvec{b}}_{\mathcal {K}} = \#{\text {Hom}}(\mathcal {K}\rightarrow \mathcal {G})\) for \({\mathcal {K}} \in S'\). Finally, it outputs \(\#{\text {Ind}}(\mathcal {H}\rightarrow \mathcal {G}) = (A^{-1}{\varvec{b}})_{\mathcal {H}}\).
Running time. All arithmetic operations are applied to integers bounded by \(n^k\), so they each fit into O(k) words, and we bound the complexity of each operation crudely by \(O(k^2)\). The number of q-coloured graphs on at most k vertices with at most \(\lfloor \Delta k/2 \rfloor \) edges is at most
$$\begin{aligned} k\cdot q^k \cdot \sum _{m=0}^{\lfloor \Delta k/2 \rfloor } \left( {\begin{array}{c}\left( {\begin{array}{c}k\\ 2\end{array}}\right) \\ m\end{array}}\right) \le k \cdot k^k \cdot \frac{\Delta k}{2} \cdot k^{2\lfloor \Delta k/2 \rfloor } = O(k^{2+(\Delta +1)k}) \text{ as } \text{ a } \text{ function } \text{ of } k, \end{aligned}$$
so our algorithm determines S in time \(O(k^{(\Delta +2)k})\) and \(|S| = O(k^{2+(\Delta +1)k})\). Moreover, checking whether two graphs in S are isomorphic by brute force requires \(O(k^2\cdot k!)\) time, so our algorithm determines \(S'\) in time \(O(|S|^2k^2\cdot k!) = O(k^{(2\Delta +3)k})\). In determining A, the algorithm checks at most k! possible homomorphisms for each of \(|S'|^2\) pairs of graphs, so it again takes time \(O(k^{(2\Delta +3)k})\). In determining \({\varvec{b}}\), the algorithm computes \(|S'| = O(k^{2+(\Delta +1)k})\) entries in total, each of which takes time \(O(n k^k (\Delta +1)^k)\), so in total it takes time \(O(nk^{(\Delta +3)k})\). Finally, it takes \(O(k^2|S'|^2) = O(k^{(2\Delta +3)k})\) time to invert A and determine \({\varvec{x}}\) (since A can be put into upper triangular form by permuting rows and columns). Overall, the running time of the algorithm is \(O(nk^{(2\Delta +3)k})\), as claimed. \(\square \)
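The steps of the algorithm in the proof of Theorem 13 can be sketched in Python for very small patterns. This is a hedged illustration rather than the algorithm as analysed: every step is done by brute force, no degree bound is imposed when listing the pattern graphs, homomorphisms are counted by exhaustive enumeration instead of Lemma 12, and the system is solved by Gauss-Jordan elimination over the rationals rather than by exploiting the triangular form. The graph encoding and all function names are assumptions of this sketch.

```python
from fractions import Fraction
from itertools import combinations, permutations, product

# A coloured graph is (t, colours, edges): vertex set range(t), colours a
# length-t tuple of ints >= 0, edges a frozenset of pairs (u, v) with u < v.

def all_graphs(k, q):
    for t in range(1, k + 1):
        pairs = list(combinations(range(t), 2))
        for colours in product(range(q), repeat=t):
            for r in range(len(pairs) + 1):
                for edges in combinations(pairs, r):
                    yield (t, colours, frozenset(edges))

def canonical(graph):
    """Smallest relabelling; two graphs are isomorphic iff these agree."""
    t, colours, edges = graph
    best = None
    for perm in permutations(range(t)):          # new vertex i is old perm[i]
        index = {old: new for new, old in enumerate(perm)}
        key = (tuple(colours[old] for old in perm),
               tuple(sorted(tuple(sorted((index[u], index[v]))) for u, v in edges)))
        if best is None or key < best:
            best = key
    return (t, best[0], frozenset(best[1]))

def count_maps(K, B, surjective=False):
    """Colour-preserving homomorphisms K -> B, optionally vertex-surjective."""
    (tk, ck, ek), (tb, cb, eb) = K, B
    total = 0
    for f in product(range(tb), repeat=tk):
        if surjective and len(set(f)) < tb:
            continue
        if all(ck[v] == cb[f[v]] for v in range(tk)) and \
           all((min(f[u], f[v]), max(f[u], f[v])) in eb for u, v in ek):
            total += 1
    return total

def solve(A, b):
    """Gauss-Jordan elimination for an invertible rational matrix."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [x / M[col][col] for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

def count_induced(H, G):
    k, q = H[0], max(max(H[1]), max(G[1])) + 1
    reps = sorted({canonical(g) for g in all_graphs(k, q)},
                  key=lambda g: (g[0], g[1], sorted(g[2])))       # the set S'
    A = [[Fraction(count_maps(K, Kp, surjective=True)) for Kp in reps] for K in reps]
    b = [Fraction(count_maps(K, G)) for K in reps]                # #Hom entries
    x = solve(A, b)                                               # #Ind entries
    return int(x[reps.index(canonical(H))])
```

Because of the exhaustive enumeration, the sketch is only practical for patterns on three or four vertices; it is meant to make the roles of \(S'\), A, \({\varvec{b}}\) and \({\varvec{x}}\) concrete.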
We note that the above algorithm is not limited to host graphs of bounded degree. That is, the same approach can be taken for any host graph class for which counting homomorphisms from (vertex-coloured) patterns with k vertices has an \(f(k)\cdot n^{O(1)}\) time algorithm. To this end, simply use this algorithm as a subroutine instead of Lemma 12 in the algorithm constructed in the proof of Theorem 13. Examples of such classes of host graphs are planar graphs or, more generally, any graph class of bounded local treewidth [11].
Theorem 14
For all positive integers \(\Delta \):
(i) #Size-BIS\([\Delta ]\) has an algorithm with time complexity \(O(|V(G)| \cdot k^{(2\Delta +3)k})\);
(ii) #Size-Left-BIS\([\Delta ]\) has an algorithm with time complexity \(O(|V(G)|\cdot \ell ^{\ell (2\Delta ^2+8\Delta +4)})\);
(iii) #Size-Left-Max-BIS\([\Delta ]\) has an algorithm with time complexity \(O(|V(G)|\cdot \ell ^{\ell (2\Delta ^2+8\Delta +4)})\).
Recent independent work by Patel and Regts [20] implicitly contains an algorithm for counting independent sets of size k in graphs of maximum degree \(\Delta \) in time \(O(c^k n)\), where c is a constant depending on \(\Delta \). This implies Theorem 14(i). Since our own proof is very short, we provide it for the benefit of the reader. Subsequent work [21], published after our original paper [3], includes a version of Theorem 13 with running time \({\tilde{O}}((4\Delta )^{2k}n)\) (which is essentially best-possible under ETH). Note that using this result in the proof of Theorem 14(ii) and (iii) in place of Theorem 13 would not yield algorithms with running times \(n\cdot \ell ^{o(\ell )}\), as the quantity \(|{\mathcal {S}}_{\ell ,r}'|\) defined in the proof is \(\ell ^{\Omega (\ell )}\) when \(\Delta =3\) (for suitable values of r).
Proof
Proof of part (i): This is immediate from Theorem 13, since #Size-BIS\([\Delta ]\) is a special case of #Induced-Coloured-Subgraph\([\Delta ]\) (taking \(\mathcal {G}\) to be monochromatic and \(\mathcal {H}\) to be a monochromatic independent set of size k).
Proof of part (ii): For any bipartite graph \(G = (U,V,E)\) with degree at most \(\Delta \) and any non-negative integers \(\ell \) and r, let \(N_{\ell ,r}(G)\) be the number of sets \(X \subseteq U\) with \(|X|=\ell \) and \(|\Gamma (X)| = r\). Let \(N_{\ell ,r}'(G)\) be the number of pairs of sets \(X \subseteq U\), \(Y \subseteq V\) such that \(|X| = \ell \), \(|Y| = r\) and \(Y \subseteq \Gamma (X)\). Then we have
$$\begin{aligned} N_{\ell ,r}(G) = N_{\ell ,r}'(G) - \sum _{i=r+1}^{\Delta \ell } \left( {\begin{array}{c}i\\ r\end{array}}\right) N_{\ell ,i}(G). \end{aligned}$$
(9)
For any bipartite graph \(J = (U_J, V_J, E_J)\), we define the corresponding 2-colouring by \(c_J(v) = 1\) for all \(v \in U_J\) and \(c_J(v) = 2\) for all \(v \in V_J\). We define the corresponding coloured graph by \(\phi (J) = ((U_J \cup V_J, \{\{u,v\} \mid (u,v) \in E_J\}), c_J)\). Let \(S_{\ell ,r}\) be the set of all bipartite graphs \(J = (U_J,V_J,E_J)\) with \(U_J = [\ell ]\), \(V_J = \{\ell +1, \dots , \ell +r\}\), degree at most \(\Delta \) and no isolated vertices in \(V_J\). Let \(\mathcal {S}_{\ell ,r}\) be the corresponding set of coloured graphs, and let \(\mathcal {S}'_{\ell ,r}\) be a set of representatives of (coloured) isomorphism classes in \(\mathcal {S}_{\ell ,r}\). Then \(N_{\ell ,r}'(G) = \sum _{\mathcal {K} \in \mathcal {S}'_{\ell ,r}} \#{\text {Ind}}(\mathcal {K}\rightarrow \phi (G))\), and hence by (9) we have
$$\begin{aligned} N_{\ell ,r}(G) = \sum _{\mathcal {K} \in \mathcal {S}'_{\ell ,r}} \#\hbox {Ind}({\mathcal {K}}\rightarrow {\phi (G)}) - \sum _{i=r+1}^{\Delta \ell } \left( {\begin{array}{c}i\\ r\end{array}}\right) N_{\ell ,i}(G). \end{aligned}$$
(10)
Now suppose that \((G,\ell )\) is an instance of #Size-Left-BIS\([\Delta ]\). Then we have
$$\begin{aligned} \text {IS}_{\ell \text{-left }}(G) = \sum _{\begin{array}{c} X \subseteq U\\ |X|=\ell \end{array}}2^{|V|-|\Gamma (X)|} = \sum _{r=0}^{\Delta \ell } N_{\ell ,r}(G)2^{|V|-r}. \end{aligned}$$
(11)
To compute \(N_{\ell ,\Delta \ell }(G), \dots , N_{\ell ,0}(G)\), our algorithm applies (10). For each \(r \in \{\Delta \ell , \dots , 0\}\), it determines the \(\#{\text {Ind}}(\mathcal {K}\rightarrow \phi (G))\) terms of (10) using the #Induced-Coloured-Subgraph\([\Delta ]\) algorithm of Theorem 13, and the remaining terms of (10) recursively with dynamic programming. Finally, it computes \(\text {IS}_{\ell \text{-left }}(G)\) using (11).
To determine the time complexity, first note that \(|S_{\ell ,r}| \le \left( {\begin{array}{c}\Delta \ell ^2\\ \Delta \ell \end{array}}\right) = O(\ell ^{3\Delta \ell })\) holds for all \(r \in \{\Delta \ell , \dots , 0\}\). The algorithm therefore determines \(\mathcal {S}_{\ell ,r}'\) by brute force in time \(O(|S_{\ell ,r}|^2(\ell +\Delta \ell )^{\ell +\Delta \ell }) = O(\ell ^{\ell (8\Delta +2)})\). The algorithm then calculates each \(N_{\ell ,r}(G)\) in time
$$\begin{aligned} O(|S_{\ell ,r}'|\cdot |V(G)| \cdot (\ell +\Delta \ell )^{(2\Delta +3)(\ell +\Delta \ell )}) = O(|V(G)| \cdot \ell ^{\ell (2\Delta ^2+8\Delta +4)}). \end{aligned}$$
The overall running time is therefore \(O(|V(G)|\cdot \ell ^{\ell (2\Delta ^2+8\Delta +4)})\), so part (ii) of the result follows.
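The dynamic programme over (9) and the final summation (11) can be sketched in Python as follows. In this sketch the quantity \(N_{\ell ,r}'(G)\) is computed by brute force over all size-\(\ell \) subsets of U, standing in for the sum of \(\#{\text {Ind}}\) terms that the proof obtains from Theorem 13; the function names and the edge encoding (pairs (u, v) with u in U and v in V) are assumptions of the sketch.

```python
from itertools import combinations
from math import comb

def neighbourhood(X, edges):
    """Gamma(X) for X inside U; edges are pairs (u, v) with u in U, v in V."""
    return {v for (u, v) in edges if u in X}

def n_prime(U, edges, ell, r):
    """N'_{ell,r}: pairs (X, Y) with |X| = ell, |Y| = r and Y inside Gamma(X).
    Brute-force stand-in for the sum of #Ind terms in (10)."""
    return sum(comb(len(neighbourhood(set(X), edges)), r)
               for X in combinations(U, ell))

def left_counts(U, edges, ell, delta):
    """Return {r: N_{ell,r}} for r = delta*ell, ..., 0 using recurrence (9)."""
    top = delta * ell
    N = {}
    for r in range(top, -1, -1):
        correction = sum(comb(i, r) * N[i] for i in range(r + 1, top + 1))
        N[r] = n_prime(U, edges, ell, r) - correction
    return N

def is_ell_left(U, V, edges, ell, delta):
    """Equation (11): sum over r of N_{ell,r} * 2^(|V| - r)."""
    N = left_counts(U, edges, ell, delta)
    # skip zero terms so that r > |V| never produces a fractional power
    return sum(N[r] * 2 ** (len(V) - r) for r in N if N[r])
```

For example, with U = {a, b, c}, V = {1, 2, 3}, edges a1, a2, b2, c3 and \(\Delta = 2\), the recurrence for \(\ell = 2\) yields \(N_{2,3} = 1\) and \(N_{2,2} = 2\), with all other values zero, so that (11) evaluates to \(1\cdot 2^0 + 2\cdot 2^1 = 5\).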
Proof of part (iii): Finally, suppose that \((G,\ell )\) is an instance of #Size-Left-Max-BIS\([\Delta ]\). Let \(\mu = \min \{r \mid N_{\ell ,r}(G) \ne 0\}\), and note that \(\text {IS}_{\ell \text{-left-max }}(G) = N_{\ell ,\mu }(G)\). As above, our algorithm determines \(N_{\ell ,\Delta \ell }(G), \dots , N_{\ell ,0}(G)\) using (10), and thereby determines and outputs \(N_{\ell ,\mu }(G)\). The overall running time is again \(O(|V(G)|\cdot \ell ^{\ell (2\Delta ^2+8\Delta +4)})\), so part (iii) of the result follows. \(\square \)
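The identity \(\text {IS}_{\ell \text{-left-max }}(G) = N_{\ell ,\mu }(G)\) used in part (iii) can be checked on small instances with the Python sketch below. It assumes that \(\text {IS}_{\ell \text{-left-max }}(G)\) counts the independent sets with exactly \(\ell \) vertices in U that have maximum size among all such sets; this reading, the function names, and the brute-force computation of \(N_{\ell ,r}(G)\) (in place of (10)) are assumptions of the sketch.

```python
from collections import Counter
from itertools import combinations

def left_max_via_mu(U, edges, ell):
    """Return N_{ell,mu}, where mu is the least r with N_{ell,r} != 0."""
    # sizes[r] = N_{ell,r}, computed directly from the definition
    sizes = Counter(len({v for (u, v) in edges if u in X})
                    for X in map(set, combinations(U, ell)))
    if not sizes:
        return 0                      # no subset of U has size ell
    mu = min(sizes)
    return sizes[mu]

def left_max_brute_force(U, V, edges, ell):
    """Count maximum-size independent sets with exactly ell vertices in U."""
    best, count = -1, 0
    for X in combinations(U, ell):
        for r in range(len(V) + 1):
            for Y in combinations(V, r):
                if any(u in X and v in Y for (u, v) in edges):
                    continue          # X and Y span an edge
                if ell + r > best:
                    best, count = ell + r, 1
                elif ell + r == best:
                    count += 1
    return count
```

Both functions agree on the instance U = {a, b, c}, V = {1, 2, 3} with edges a1, a2, b2, c3: for \(\ell = 2\) we have \(\mu = 2\) and two maximum sets, namely {a, b, 3} and {b, c, 1}.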