The Parameterised Complexity of Computing the Maximum Modularity of a Graph

The maximum modularity of a graph is a parameter widely used to describe the level of clustering or community structure in a network. Determining the maximum modularity of a graph is known to be NP-complete in general, and in practice a range of heuristics are used to construct partitions of the vertex-set which give lower bounds on the maximum modularity but without any guarantee on how close these bounds are to the true maximum. In this paper we investigate the parameterised complexity of determining the maximum modularity with respect to various standard structural parameterisations of the input graph G. We show that the problem belongs to FPT when parameterised by the size of a minimum vertex cover for G, and is solvable in polynomial time whenever the treewidth or max leaf number of G is bounded by some fixed constant; we also obtain an FPT algorithm, parameterised by treewidth, to compute any constant-factor approximation to the maximum modularity. On the other hand we show that the problem is W[1]-hard (and hence unlikely to admit an FPT algorithm) when parameterised simultaneously by pathwidth and the size of a minimum feedback vertex set.


Introduction
The increasing availability of large network datasets has led to great interest in techniques to discover network structure. An important and frequently observed structure in networks is the existence of groups of vertices with many connections between them, often referred to as 'communities'.
Newman and Girvan introduced the modularity function in 2004 [24]. Modularity gives a measure of how well a graph can be partitioned into communities and is used in the most popular algorithms to cluster large networks. For example, the Louvain method, an iterative clustering technique, uses the modularity function to choose which parts from the previous step to fuse into larger parts at each step [16,2]. The widespread use of modularity and its empirical success in finding communities make modularity an important function to study from an algorithmic point of view.
In this paper we are concerned with the computational complexity of computing the maximum modularity of a given input graph, and specifically with the following decision problem.

Modularity
Input: A graph G and a constant q ∈ [0, 1]. Question: Is the maximum modularity of G at least q?
This problem was shown to be NP-complete in general by Brandes et al. [4], using a construction that relies on the fact that all vertices of a sufficiently large clique must be assigned to the same part of an optimal partition. They also showed that a variation of the problem in which we wish to find the optimal partition into exactly two sets is hard; their proof for this relied again on the use of large cliques, but DasGupta and Desai [6] later showed that this 2-clustering problem remains NP-complete on d-regular graphs for any fixed d ≥ 9. It has also been shown that it is NP-hard to approximate the maximum modularity within any constant factor [8], although there is a polynomial-time constant-factor approximation algorithm for certain families of scale-free networks [10]. The hardness of computing constant-factor multiplicative approximations in general has motivated research into approximation algorithms with an additive error [8,17]: the best known result is an approximation algorithm with additive error roughly 0.42084 [17].
In this paper we initiate the study of the parameterised complexity of Modularity, considering its complexity with respect to several standard structural parameterisations. On the positive side, we show that the problem is in FPT when parameterised by the cardinality of a minimum vertex cover for the input graph G, and that it belongs to XP when parameterised by either the treewidth or max leaf number of G. The XP algorithm parameterised by treewidth can easily be adapted to give an FPT algorithm, parameterised by treewidth, to compute any constant-factor approximation to the maximum modularity. On the other hand, we demonstrate that Modularity, parameterised by treewidth, is unlikely to belong to FPT: we prove that the problem is W[1]-hard even when parameterised simultaneously by the pathwidth of G and the size of a minimum feedback vertex set for G. For background on parameterised complexity, and the complexity classes discussed here, we refer the reader to [5,11].
These results follow the same pattern as those obtained for the problem Equitable Connected Partition [12], and indeed our hardness result involves a reduction from a specialisation of this problem. There are clear similarities between the two problems: in a partition that maximises the modularity, every part will induce a connected subgraph and, in certain circumstances, we achieve the maximum modularity with a partition into parts that are as equal as possible. However, the crucial difference between the two problems is that the input to Equitable Connected Partition includes the required number of parts, whereas Modularity requires us to maximise over all possible partition sizes; in fact, if we restrict to partitions with a specified number of parts, it is no longer necessarily true that a partition maximising the modularity must induce connected subgraphs. This difference makes reductions between the two problems non-trivial.

The modularity function
The definition of modularity was first introduced by Newman and Girvan in [24]. Many or indeed most popular algorithms used to search for clusterings on large datasets are based on finding partitions with high modularity [19,15], and the heuristics within them sometimes also use local modularity optimisation, for example in the Louvain method [2]. See [14,25] for surveys on community detection including modularity-based methods. Knowledge of the maximum modularity for classes of graphs helps to understand the behaviour of the modularity function. There is a growing literature on this, which began with cycles and complete graphs in [4]. Bagrow [1] and Montgolfier et al. [7] showed that some classes of trees have high maximum modularity, which was extended in [21] to all trees with maximum degree o(n), and furthermore to all graphs where the product of treewidth and maximum degree grows more slowly than the number of edges. Many random graph models also have high modularity; see [23,22] for a treatment of Erdős-Rényi random graphs, [21] for random regular graphs and also [26], which includes the preferential attachment model.
Given a set A of vertices, let e(A) denote the number of edges within A, and let vol(A) (sometimes called the volume of A) denote the sum of the degrees d_v (in the whole graph G) over the vertices v in A. For a graph G with m ≥ 1 edges and a vertex partition $\mathcal{A}$ of G, set the modularity score of $\mathcal{A}$ on G to be
$$q_{\mathcal{A}}(G) = \frac{1}{2m} \sum_{A \in \mathcal{A}} \sum_{u,v \in A} \left( \mathbf{1}_{uv \in E} - \frac{d_u d_v}{2m} \right) = \frac{1}{m} \sum_{A \in \mathcal{A}} e(A) - \frac{1}{4m^2} \sum_{A \in \mathcal{A}} \mathrm{vol}(A)^2,$$
and the maximum modularity of G to be $q^*(G) = \max_{\mathcal{A}} q_{\mathcal{A}}(G)$, where the maximum is over all partitions $\mathcal{A}$ of the vertices of G. Graphs with no edges are defined conventionally to have modularity 1. However, note that if the modularity of graphs with no edges were defined to be 0 it would not change any of the results.
The modularity function is designed to score partitions highly when most edges fall within the parts, and to penalise partitions with very few or very big parts. These two objectives are encoded as the edge contribution or coverage $q^E_{\mathcal{A}}(G) = \frac{1}{m} \sum_{A \in \mathcal{A}} e(A)$, and the degree tax $q^D_{\mathcal{A}}(G) = \frac{1}{4m^2} \sum_{A \in \mathcal{A}} \mathrm{vol}(A)^2$, in the modularity $q_{\mathcal{A}}(G) = q^E_{\mathcal{A}}(G) - q^D_{\mathcal{A}}(G)$ of a vertex partition $\mathcal{A}$ of G. Note that for any graph with m ≥ 1 edges 0 ≤ q*(G) ≤ 1. To see the lower bound, notice that the trivial partition which places all vertices in the same part has modularity zero. For example, complete graphs and stars have modularity 0 as noted in [4]. A graph consisting of c disjoint cliques of the same size has modularity 1 − 1/c, with the optimal partition taking each clique to be a part.
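As a concrete illustration of the definition, the following Python sketch evaluates the modularity score of a given partition as edge contribution minus degree tax. The function name `modularity` and the input format (an edge list of a simple graph plus a list of vertex sets) are assumptions made for this example only.

```python
from collections import defaultdict


def modularity(edges, partition):
    """Modularity score of `partition` on the simple graph with edge list `edges`.

    `partition` is a list of disjoint vertex sets covering every endpoint.
    """
    m = len(edges)
    if m == 0:
        return 1.0  # convention for edgeless graphs stated in the text
    part_of = {v: i for i, part in enumerate(partition) for v in part}
    internal = defaultdict(int)  # e(A) for each part A
    volume = defaultdict(int)    # vol(A) for each part A
    for u, v in edges:
        volume[part_of[u]] += 1
        volume[part_of[v]] += 1
        if part_of[u] == part_of[v]:
            internal[part_of[u]] += 1
    edge_contribution = sum(internal.values()) / m
    degree_tax = sum(vol ** 2 for vol in volume.values()) / (4 * m * m)
    return edge_contribution - degree_tax


# Two disjoint triangles: one part per clique gives 1 - 1/c with c = 2.
triangles = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]
q_cliques = modularity(triangles, [{0, 1, 2}, {3, 4, 5}])
q_trivial = modularity(triangles, [{0, 1, 2, 3, 4, 5}])
```

On this input the clique partition scores 1/2 and the trivial partition scores 0, matching the two observations made in the text.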
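As modularity is at most 1 it is sometimes useful to consider the modularity deficit $\tilde{q}_{\mathcal{A}}(G) = 1 - q_{\mathcal{A}}(G)$. Denote by ∂(A) the number of edges between vertex set A and the rest of the graph. Then, since $\sum_{A \in \mathcal{A}} e(A) = m - \frac{1}{2} \sum_{A \in \mathcal{A}} \partial(A)$,
$$\tilde{q}_{\mathcal{A}}(G) = \frac{1}{2m} \sum_{A \in \mathcal{A}} \partial(A) + \frac{1}{4m^2} \sum_{A \in \mathcal{A}} \mathrm{vol}(A)^2,$$
and we may equivalently minimise the modularity deficit to maximise the modularity. In particular $\tilde{q}(G) = \min_{\mathcal{A}} \tilde{q}_{\mathcal{A}}(G) = 1 - q^*(G)$. We will make use of several facts about the maximum modularity of a graph.
Fact 1.4 (Lemma 1.6.5 of [27]). If $\mathcal{A}$ is a partition of V(G) such that $q_{\mathcal{A}}(G) = q^*(G)$ then no part of $\mathcal{A}$ consists of a single non-isolated vertex.
Proof. Let u be a vertex with degree d_u > 0 and suppose (for a contradiction) that $\mathcal{A}$ = {{u}, A_1, ..., A_k} is an optimal partition of G. For each i = 1, ..., k define the vertex partition $\mathcal{B}_i$ = {A_1, ..., A_i ∪ {u}, ..., A_k}. We can derive a simple expression for $q_{\mathcal{B}_i}(G) - q_{\mathcal{A}}(G)$, as most terms cancel:
$$q_{\mathcal{B}_i}(G) - q_{\mathcal{A}}(G) = \frac{1}{m} e(\{u\}, A_i) - \frac{1}{2m^2} d_u \mathrm{vol}(A_i).$$
By assumption, $\mathcal{A}$ is an optimal partition, so $q_{\mathcal{B}_i}(G) \le q_{\mathcal{A}}(G)$ and thus for each i we have $2m \cdot e(\{u\}, A_i) \le d_u \mathrm{vol}(A_i)$. Hence the inequality must still hold when we sum both sides over i = 1, ..., k. However, the left-hand side is then $2m \sum_i e(\{u\}, A_i) = 2m d_u$, while the right-hand side is $d_u \sum_i \mathrm{vol}(A_i) = d_u (2m - d_u) < 2m d_u$, and so we have our contradiction.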
Observe that Facts 1.2, 1.3 and 1.4 together imply that the search for an optimal partition can be restricted to those in which all parts are connected subgraphs and no part consists of a single node.
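On very small graphs these facts can be checked by exhaustive search. The sketch below is an illustration only (it is exponential in the number of vertices and is not one of the algorithms of this paper); the helper names `set_partitions`, `modularity` and `max_modularity` are hypothetical, and the graph is assumed to be simple with at least one edge and no isolated vertices.

```python
def set_partitions(items):
    """Yield every partition of the list `items` into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for smaller in set_partitions(rest):
        # place `first` into each existing block in turn ...
        for i in range(len(smaller)):
            yield smaller[:i] + [[first] + smaller[i]] + smaller[i + 1:]
        # ... or into a new block of its own
        yield [[first]] + smaller


def modularity(edges, partition):
    """Edge contribution minus degree tax for a partition (list of blocks)."""
    m = len(edges)
    part_of = {v: i for i, block in enumerate(partition) for v in block}
    internal = sum(1 for u, v in edges if part_of[u] == part_of[v])
    vol = [0] * len(partition)
    for u, v in edges:
        vol[part_of[u]] += 1
        vol[part_of[v]] += 1
    return internal / m - sum(x * x for x in vol) / (4 * m * m)


def max_modularity(vertices, edges):
    """Return (q*, an optimal partition) by trying every partition."""
    best = max(set_partitions(list(vertices)),
               key=lambda p: modularity(edges, p))
    return modularity(edges, best), best


# Path on four vertices: the optimum splits it into two edges.
q_path, parts_path = max_modularity(range(4), [(0, 1), (1, 2), (2, 3)])
# Star with three leaves: the trivial partition is optimal.
q_star, parts_star = max_modularity(range(4), [(0, 1), (0, 2), (0, 3)])
```

In both examples the optimal partition found contains no part consisting of a single vertex, as Fact 1.4 requires.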

Notation and definitions
Given a graph G = (V, E), and a set U ⊆ V of vertices, we write G[U] for the subgraph of G induced by U and G \ U for G[V \ U]. Given two disjoint subsets of vertices A, B ⊆ V, we write e(A, B) for the number of edges with one endpoint in A and the other in B. We shall often want to denote the number of edges between a set of vertices and the remainder of the graph, so set ∂(A) = e(A, Ā), where Ā = V \ A. If P is a partition of a set X, and Y ⊂ X, we write P[Y] for the restriction of P to Y.
A vertex cover of a graph G = (V, E) is a set U ⊆ V such that every edge has at least one endpoint in U; equivalently, G \ U is an independent set (i.e. contains no edges). The vertex cover number of G is the smallest cardinality of any vertex cover of G. A feedback vertex set for G is a set U ⊆ V such that G \ U contains no cycles. Notice that the vertex cover number of G gives an upper bound on the size of the smallest feedback vertex set for G, written fvs(G). The max leaf number of G is the maximum number of leaves (degree one vertices) in any spanning tree of G.
A tree decomposition of a graph G is a pair (T, D) where T is a tree and D = {D(t) : t ∈ V(T)} is a collection of non-empty subsets of V(G) (or bags), indexed by the nodes of T, satisfying: 1. V(G) = ⋃_{t ∈ V(T)} D(t), 2. for every e = uv ∈ E(G), there exists t ∈ V(T) such that u, v ∈ D(t), 3. for every v ∈ V(G), if T(v) is defined to be the subgraph of T induced by nodes t with v ∈ D(t), then T(v) is connected.
We will assume throughout that the indexing tree T has a distinguished root node r; if not we may choose an arbitrary node to be the root. Given any node t ∈ V(T) we write V_t for the set of vertices of G that appear in bags indexed by t and the descendants of t.
If T is in fact a path, we say that (T, D) is a path decomposition of G. The width of the tree decomposition (T, D) is defined to be max_{t ∈ V(T)} |D(t)| − 1, and the treewidth of G, written tw(G), is the minimum width over all tree decompositions of G. The pathwidth of G, pw(G), is the minimum width over all path decompositions of G.
We note that there is an FPT algorithm to compute a minimum-width tree decomposition of any graph G, where the treewidth of G is taken as the parameter [3]. Moreover, any such tree decomposition can be transformed into a so-called nice tree decomposition (having certain algorithmically useful properties) in linear time, without increasing the number of nodes by more than a constant factor [18].
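The conditions in the definition of a tree decomposition, and its width, can be checked mechanically. The following sketch assumes the indexing tree is supplied as a list of tree edges that is already known to form a tree, and that the bags are given as a dictionary from tree nodes to vertex sets; the names `is_tree_decomposition` and `width` are hypothetical.

```python
def is_tree_decomposition(vertices, edges, tree_edges, bags):
    """Check the defining conditions for bags indexed by the nodes of a tree."""
    # Every vertex of G appears in at least one bag.
    if set().union(*bags.values()) != set(vertices):
        return False
    # Every edge of G is contained in some bag.
    for u, v in edges:
        if not any(u in bag and v in bag for bag in bags.values()):
            return False
    # For every vertex v, the nodes whose bags contain v induce a connected
    # subgraph of the tree.
    tree_adj = {t: set() for t in bags}
    for s, t in tree_edges:
        tree_adj[s].add(t)
        tree_adj[t].add(s)
    for v in vertices:
        nodes = {t for t, bag in bags.items() if v in bag}
        start = next(iter(nodes))
        seen, stack = {start}, [start]
        while stack:
            for s in tree_adj[stack.pop()] & nodes:
                if s not in seen:
                    seen.add(s)
                    stack.append(s)
        if seen != nodes:
            return False
    return True


def width(bags):
    """Width of a decomposition: largest bag size minus one."""
    return max(len(bag) for bag in bags.values()) - 1


# A path on four vertices has a path decomposition of width 1.
path_edges = [(0, 1), (1, 2), (2, 3)]
good_bags = {"a": {0, 1}, "b": {1, 2}, "c": {2, 3}}
bad_bags = {"a": {0, 1}, "b": {2, 3}, "c": {1, 2}}  # vertex 1 skips node b
tree = [("a", "b"), ("b", "c")]
ok = is_tree_decomposition(range(4), path_edges, tree, good_bags)
not_ok = is_tree_decomposition(range(4), path_edges, tree, bad_bags)
```

Because the indexing tree in this example is itself a path, `good_bags` is also a path decomposition.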

Positive results
In this section we identify a number of structural restrictions on the input graph that allow us to compute the maximum modularity of a graph, or a good approximation to this quantity, efficiently.

Parameterisation by vertex cover number
In this section we demonstrate that Modularity is in FPT when parameterised by the vertex cover number of the input graph.
Theorem 2.1. Modularity, parameterised by the cardinality of a minimum vertex cover for the input graph G, is in FPT.
To prove this result, we make use of recent work of Lokshtanov [20] which gives an FPT algorithm for the following problem.

Integer Quadratic Programming
Input: An n × n integer matrix Q, an m × n integer matrix A, and an m-dimensional vector b. Parameter: n + α, where α is the maximum absolute value of any entry in A or Q. Problem: Find a vector x ∈ Z^n which minimises x^T Q x, subject to Ax ≤ b.
Our strategy can be summarised as follows. We first observe that we may restrict our attention to partitions in which every part intersects the vertex cover. Moreover, the vertices outside the vertex cover can be classified into at most 2^k "types" according to their neighbourhood (which by definition must be a subset of the vertex cover). We then argue that the modularity of a partition depends only on (1) the inherited partition of the vertex cover and (2) the number of (non-vertex-cover) vertices of each type that belong to each of the parts. Using this characterisation, we can reduce the problem of maximising the modularity to that of solving a collection of instances of Integer Quadratic Programming.
Before embarking on the proof of Theorem 2.1, we introduce some notation. Suppose that the graph G = (V, E) has |E| = m, and that U = {u_1, ..., u_k} is a vertex cover for G. Let $\mathcal{P}$ = {P_1, ..., P_ℓ} be a partition of U, and set W = V \ U (so W is an independent set).
We can partition the vertices of W into 2^k sets based on their type: the type τ_U(w) ∈ {0, 1}^k of a vertex w ∈ W describes which of the vertices in U are neighbours of w. Formally τ_U(w)_j = 1 if u_j w ∈ E(G) and τ_U(w)_j = 0 otherwise. For each σ ∈ {0, 1}^k, we set S_σ to be the set of all vertices in W with type exactly σ, that is, S_σ = {w ∈ W : τ(w) = σ}. Now let $\mathcal{A}$ = {A_1, ..., A_r} be a partition of V. We write $x^{\mathcal{A}}_{\sigma,i}$ for the number of vertices of type σ which are assigned to A_i, that is, $x^{\mathcal{A}}_{\sigma,i} = |S_\sigma \cap A_i|$. Finally, we introduce 0-1 vectors to encode the sets P_i ∈ $\mathcal{P}$: for 1 ≤ i ≤ ℓ, we let π^i ∈ {0, 1}^k be given by π^i_j = 1 if u_j ∈ P_i, and π^i_j = 0 otherwise. An example is given in Figure 1.
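To make the notation concrete, the sketch below computes the type of every vertex outside a given vertex cover and the counts of vertices of each type in each part of a partition. The function names `vertex_types` and `type_counts` and the small example graph are assumptions for illustration; types are represented as tuples of 0s and 1s in the order of the list `cover`.

```python
from collections import Counter


def vertex_types(vertices, edges, cover):
    """Map each vertex w outside `cover` to its type, a 0-1 tuple of length k."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    in_cover = set(cover)
    return {w: tuple(int(u in adj[w]) for u in cover)
            for w in vertices if w not in in_cover}


def type_counts(types, partition):
    """counts[(sigma, i)] = number of vertices of type sigma placed in part i."""
    counts = Counter()
    for i, part in enumerate(partition):
        for w in part:
            if w in types:  # vertices of the cover have no type
                counts[(types[w], i)] += 1
    return counts


# Vertex cover U = [0, 1]; the vertices 2, 3, 4 form the independent set W.
V = [0, 1, 2, 3, 4]
E = [(0, 1), (0, 2), (0, 3), (1, 3), (1, 4)]
types = vertex_types(V, E, cover=[0, 1])
counts = type_counts(types, [{0, 2, 3}, {1, 4}])
```

Here vertex 3 is adjacent to both cover vertices and so has type (1, 1), while vertices 2 and 4 have types (1, 0) and (0, 1) respectively.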
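We now argue that, if the partition $\mathcal{A}$ extends $\mathcal{P}$, we can compute the modularity of $\mathcal{A}$ using only the values $x^{\mathcal{A}}_{\sigma,i}$, together with information about $\mathcal{P}$.
Lemma 2.2. Let U = {u_1, ..., u_k} be a vertex cover for G = (V, E), where |E| = m, and let $\mathcal{P}$ be a partition of U. If $\mathcal{A}$ is any partition of V which extends $\mathcal{P}$ and has the property that every A ∈ $\mathcal{A}$ has non-empty intersection with U, then
$$q_{\mathcal{A}}(G) = \frac{1}{m} \sum_{i=1}^{\ell} \Big( e(P_i) + \sum_{\sigma} x^{\mathcal{A}}_{\sigma,i} (\sigma \cdot \pi^i) \Big) - \frac{1}{4m^2} \sum_{i=1}^{\ell} \Big( 2e(P_i) + \sum_{\sigma} x^{\mathcal{A}}_{\sigma,i} \, \sigma \cdot (\mathbf{1} + \pi^i) \Big)^2.$$
Proof. Suppose that $\mathcal{P}$ = {P_1, ..., P_ℓ} and $\mathcal{A}$ = {A_1, ..., A_ℓ}, where P_i ⊆ A_i for each i; we set B_i = A_i \ P_i. For any vertex w ∈ B_i, we have that e(w, P_i) is given by the dot product τ(w) · π^i; thus the number of edges between P_i and B_i for each i is given by
$$e(P_i, B_i) = \sum_{\sigma} x^{\mathcal{A}}_{\sigma,i} (\sigma \cdot \pi^i). \qquad (1)$$
Since there are no edges inside any set B_i, it follows that $e(A_i) = e(P_i) + \sum_{\sigma} x^{\mathcal{A}}_{\sigma,i} (\sigma \cdot \pi^i)$, and hence we can write the edge contribution of $\mathcal{A}$ as
$$q^E_{\mathcal{A}}(G) = \frac{1}{m} \sum_{i=1}^{\ell} \Big( e(P_i) + \sum_{\sigma} x^{\mathcal{A}}_{\sigma,i} (\sigma \cdot \pi^i) \Big).$$
Similarly for the degree tax, observe that a vertex w ∈ W of type τ(w) has degree τ(w) · 1 ≤ k, and hence $\mathrm{vol}(B_i) = \sum_{\sigma} x^{\mathcal{A}}_{\sigma,i} (\sigma \cdot \mathbf{1})$. Notice that vol(P_i) = 2e(P_i) + e(P_i, B_i) and we already have an expression for e(P_i, B_i) in terms of the $x^{\mathcal{A}}_{\sigma,i}$ in (1). Hence, as vol(P_i ∪ B_i) = vol(P_i) + vol(B_i), we have
$$\mathrm{vol}(A_i) = 2e(P_i) + \sum_{\sigma} x^{\mathcal{A}}_{\sigma,i} \, \sigma \cdot (\mathbf{1} + \pi^i),$$
and thus, rearranging,
$$q^D_{\mathcal{A}}(G) = \frac{1}{4m^2} \sum_{i=1}^{\ell} \Big( 2e(P_i) + \sum_{\sigma} x^{\mathcal{A}}_{\sigma,i} \, \sigma \cdot (\mathbf{1} + \pi^i) \Big)^2,$$
as required.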
We are now ready to prove the main result of this section.
Proof of Theorem 2.1. We will assume that the input to our instance of Modularity is a graph G = (V, E), where |E| = m. We may assume without loss of generality that we are also given as input a vertex cover U = {u_1, ..., u_k} for G (as if not we can easily compute one in the allowed time). We may further assume that G does not contain any isolated vertices, as we can delete any such vertices (in polynomial time) without changing the value of the maximum modularity (by Fact 1.3). Note that the total number of possible partitions of U into non-empty parts is equal to the k-th Bell number, B_k (and hence is certainly at most k^k). It therefore suffices to describe an fpt-algorithm which determines, given some partition $\mathcal{P}$ of U, the value $q_{\mathcal{P}}(G) = \max\{ q_{\mathcal{A}}(G) : \mathcal{A} \text{ is a partition of } V \text{ with } \mathcal{A}[U] = \mathcal{P} \}$. The maximum modularity of G can then be calculated by taking $\max\{ q_{\mathcal{P}}(G) : \mathcal{P} \text{ is a partition of } U \}$.
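From now on, we consider a fixed partition $\mathcal{P}$ = {P_1, ..., P_ℓ} of U, and describe how to compute $q_{\mathcal{P}}(G)$. It follows from Facts 1.2 and 1.4, together with the fact that W is an independent set, that if $\mathcal{A}$ = {A_1, ..., A_j} is a partition of V which achieves the maximum modularity, then every part A_i has non-empty intersection with U. We will call a partition with this property a U-partition of G. It then suffices to maximise the modularity over all U-partitions in order to determine the value of $q_{\mathcal{P}}(G)$. Now, by Lemma 2.2, we know that we can express the modularity of a U-partition $\mathcal{A}$ as
$$q_{\mathcal{A}}(G) = \frac{1}{m} \sum_{i=1}^{\ell} \Big( e(P_i) + \sum_{\sigma} x^{\mathcal{A}}_{\sigma,i} (\sigma \cdot \pi^i) \Big) - \frac{1}{4m^2} \sum_{i=1}^{\ell} \Big( 2e(P_i) + \sum_{\sigma} x^{\mathcal{A}}_{\sigma,i} \, \sigma \cdot (\mathbf{1} + \pi^i) \Big)^2. \qquad (3)$$
As we have fixed the partition $\mathcal{P}$, all values e(P_i) can be regarded as fixed constants. In order to determine the maximum modularity we can obtain with a U-partition, we therefore need to find the values of $x^{\mathcal{A}}_{\sigma,i}$ which maximise this expression. We can rewrite (3) as the sum of a constant term, two linear functions θ and φ of the $x^{\mathcal{A}}_{\sigma,i}$ and a quadratic function ψ of the $x^{\mathcal{A}}_{\sigma,i}$ (up to scaling by constants):
$$q_{\mathcal{A}}(G) = \frac{1}{m} \sum_{i=1}^{\ell} e(P_i) - \frac{1}{m^2} \sum_{i=1}^{\ell} e(P_i)^2 + \frac{1}{m} \theta(\mathcal{A}) - \frac{1}{m^2} \varphi(\mathcal{A}) - \frac{1}{4m^2} \psi(\mathcal{A}),$$
where
$$\theta(\mathcal{A}) = \sum_{\sigma,i} x^{\mathcal{A}}_{\sigma,i} (\sigma \cdot \pi^i), \qquad \varphi(\mathcal{A}) = \sum_{\sigma,i} x^{\mathcal{A}}_{\sigma,i} \, e(P_i) \, (\sigma \cdot (\mathbf{1} + \pi^i)), \qquad \psi(\mathcal{A}) = \sum_{i=1}^{\ell} \Big( \sum_{\sigma} x^{\mathcal{A}}_{\sigma,i} \, \sigma \cdot (\mathbf{1} + \pi^i) \Big)^2.$$
To find the maximum value of $q_{\mathcal{A}}(G)$ over all U-partitions it therefore suffices to determine, for all possible values of θ($\mathcal{A}$) and φ($\mathcal{A}$), the minimum possible value of ψ($\mathcal{A}$). Before describing how to do this, we observe that the number of combinations of possible values for θ($\mathcal{A}$) and φ($\mathcal{A}$) is not too large. Note that $0 \le \sum_{\sigma,i} x^{\mathcal{A}}_{\sigma,i} (\sigma \cdot \pi^i) \le nk$ and $0 \le \sum_{\sigma,i} x^{\mathcal{A}}_{\sigma,i} \, e(P_i) (\sigma \cdot (\mathbf{1} + \pi^i)) \le nk^3$, so the number of possible pairs (θ($\mathcal{A}$), φ($\mathcal{A}$)) is at most n^2 k^4. Thus, if we know the minimum possible value of ψ($\mathcal{A}$) corresponding to each possible pair (θ($\mathcal{A}$), φ($\mathcal{A}$)), we can compute the maximum modularity achieved by any U-partition $\mathcal{A}$ such that (θ($\mathcal{A}$), φ($\mathcal{A}$)) = (y, z), and maximising over the polynomial number of possible pairs (y, z) will give $q_{\mathcal{P}}(G)$. Now, given a possible pair of values (y, z) for (θ($\mathcal{A}$), φ($\mathcal{A}$)), we describe how to compute min{ψ($\mathcal{A}$) : $\mathcal{A}$ is a U-partition with θ($\mathcal{A}$) = y and φ($\mathcal{A}$) = z}.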
Our strategy is to express this minimisation problem as an instance of Integer Quadratic Programming and then apply the FPT algorithm of [20].
In this instance, we have n = ℓ2^k ≤ k2^k, and our vector of variables x = (x_1, ..., x_n)^T is given by $x = (x^{\mathcal{A}}_{\sigma_1,1}, \dots, x^{\mathcal{A}}_{\sigma_{2^k},1}, x^{\mathcal{A}}_{\sigma_1,2}, \dots, x^{\mathcal{A}}_{\sigma_{2^k},\ell})^T$, where σ_1, ..., σ_{2^k} is a fixed enumeration of all vectors in {0, 1}^k. The matrix Q expresses the value of ψ($\mathcal{A}$) in terms of x: if we set Q = {q_{i,j}} where $q_{i,j} = (\sigma_a \cdot (\mathbf{1} + \pi^p)) (\sigma_b \cdot (\mathbf{1} + \pi^p))$ whenever $x_i = x^{\mathcal{A}}_{\sigma_a,p}$ and $x_j = x^{\mathcal{A}}_{\sigma_b,p}$ are variables for the same part p, and q_{i,j} = 0 otherwise, then it is easy to see that ψ($\mathcal{A}$) = x^T Q x. Note also that the maximum absolute value of any entry in Q is at most 4k^2. We now use the linear constraints to express the conditions that 1. θ($\mathcal{A}$) = y, 2. φ($\mathcal{A}$) = z, and 3. the values $x^{\mathcal{A}}_{\sigma,i}$ correspond to a valid U-partition $\mathcal{A}$.
The first of these conditions can be expressed as a single linear constraint: $\sum_{(\sigma,i)} x^{\mathcal{A}}_{\sigma,i} (\sigma \cdot \pi^i) = y$, or equivalently a_1 x = y, where a_1 is the 1 × n row vector whose entry corresponding to the variable $x^{\mathcal{A}}_{\sigma,i}$ is equal to σ · π^i. We can similarly express the second condition as a single linear constraint: $\sum_{(\sigma,i)} x^{\mathcal{A}}_{\sigma,i} \, e(P_i) (\sigma \cdot (\mathbf{1} + \pi^i)) = z$, or equivalently a_2 x = z, where a_2 is the 1 × n row vector whose entry corresponding to the variable $x^{\mathcal{A}}_{\sigma,i}$ is equal to e(P_i)(σ · (1 + π^i)). Note that every entry in the vectors a_1 and a_2 has absolute value no more than 2k^3. For the third condition, note that the values $x^{\mathcal{A}}_{\sigma,i}$ correspond to a valid U-partition if and only if every $x^{\mathcal{A}}_{\sigma,i}$ is non-negative, and for each σ we have $\sum_{i=1}^{\ell} x^{\mathcal{A}}_{\sigma,i} = |S_\sigma|$. We can therefore express all three conditions in the form Ax ≤ b, where A is a (4 + (ℓ + 1)2^k) × n matrix and b is a (4 + (ℓ + 1)2^k)-dimensional vector (notice that we use two inequalities to express each of the linear equality constraints).
Altogether, this means that the solution to this Integer Quadratic Programming instance will determine the values of $x^{\mathcal{A}}_{\sigma,i}$ which minimise (out of all values corresponding to some U-partition $\mathcal{A}$) the value of ψ($\mathcal{A}$), subject to the additional requirement that θ($\mathcal{A}$) = y and φ($\mathcal{A}$) = z. Note that the number of variables n is at most k2^k and the largest absolute value of any entry in A or Q is at most 2k^3, so the parameter in the instance of Integer Quadratic Programming is bounded by a function of k. This completes the proof.
We note the algorithm described can easily be modified to output an optimal partition.

Parameterisation by treewidth
In this section we demonstrate that Modularity, when parameterised by the treewidth of the input graph G, belongs to XP and so is solvable in polynomial time on graph classes whose treewidth is bounded by some fixed constant. We further show that for any fixed ε > 0 there is an FPT-algorithm, parameterised by treewidth, which computes a factor (1 − ε)-approximation, i.e. returns a value between (1 − ε)q* and q*, where q* is the maximum modularity of the graph.
Theorem 2.3. Modularity, parameterised by the treewidth of the input graph G, belongs to XP.
Proof. As the proof makes use of standard dynamic programming techniques on tree decompositions, we only give an outline proof here. Suppose that G has n vertices and m edges, and has treewidth k. We will assume that we are given a nice tree decomposition (T, D) (where T is a tree and D = {D(t) : t ∈ V(T)}) of G, of width k, as part of the input (if not we can compute one in FPT time).
The proof relies heavily on Fact 1.2. This means we can compute the optimum modularity without considering partitions that induce disconnected subgraphs; hence, for any node t ∈ V(T), we need only consider partitions $\mathcal{A}$ with the property that, if A ∈ $\mathcal{A}$ does not intersect D(t), then all vertices in A only appear in bags indexed by nodes in precisely one connected component of T \ t.
We compute the modularity by working upwards from the leaves in the standard way.As we do this, we need to keep track of relevant statistics for the parts that intersect the current bag (liquid parts) and also the total contribution to the modularity from the parts (frozen parts) which contain only vertices from bags indexed by descendants of the current node (and so by the reasoning above cannot accept more vertices from elsewhere in the graph).
For any node t ∈ V(T), a valid state of t consists of the following: 1. a partition $\mathcal{P}$ of D(t); 2. a function α : $\mathcal{P}$ → [m] such that α(P_i) ≥ e(P_i) for each P_i ∈ $\mathcal{P}$; 3. a function β : $\mathcal{P}$ → [2m] such that β(P_i) ≥ vol(P_i) for each P_i ∈ $\mathcal{P}$.
Here $\mathcal{P}$ records the restriction of a partition to D(t), α keeps track of the number of edges captured so far in each of the liquid parts, and β keeps track of the volume so far of each of the liquid parts. Notice that the total number of possible states for any node t is bounded by a polynomial in m whose degree depends only on k.
For each possible state of a node t, we need to keep track of the maximum contribution to modularity from frozen parts we can achieve consistent with the liquid parts having the specified state: this is done with a function σ_t, the signature of t. Given any state ($\mathcal{P}$, α, β) of t, we first define a (t, $\mathcal{P}$, α, β)-partition to be any partition $\mathcal{A}$ of V_t such that: 1. $\mathcal{A}$[D(t)] = $\mathcal{P}$; 2. for all A ∈ $\mathcal{A}$ with A ∩ D(t) ≠ ∅: • α(A ∩ D(t)) = e(A), and • β(A ∩ D(t)) = vol(A). We then set
$$\sigma_t(\mathcal{P}, \alpha, \beta) = \max \Big\{ \sum_{A \in \mathcal{A} :\, A \cap D(t) = \emptyset} \Big( \frac{e(A)}{m} - \frac{\mathrm{vol}(A)^2}{4m^2} \Big) : \mathcal{A} \text{ is a } (t, \mathcal{P}, \alpha, \beta)\text{-partition} \Big\}.$$
Throughout the proof we adopt the convention that the maximum value of an empty set is −∞.
It is clear that, with knowledge of σ_r for the root r of the tree decomposition, we can easily determine the maximum modularity of G. It therefore remains to outline how we compute σ_t for the four types of node in the nice tree decomposition, using only information about the values of σ_{t′} where t′ is a child of t. We begin by observing that if t is a leaf node then we can exhaustively consider all possibilities in time depending only on k.
Now suppose t is an introduce node with child t′, where D(t) = D(t′) ∪ {v}. Given any state ($\mathcal{P}$, α, β) of t, we say that a state ($\mathcal{P}$′, α′, β′) of t′ is introduce-compatible with ($\mathcal{P}$, α, β) if: • $\mathcal{P}$′ = $\mathcal{P}$ \ {v}; • for every P ∈ $\mathcal{P}$, if v ∉ P then α′(P) = α(P), and if v ∈ P (but P \ {v} ≠ ∅) then α′(P \ {v}) = α(P) − |{u ∈ P : uv ∈ E(G)}|; • for every P ∈ $\mathcal{P}$, if v ∉ P then β′(P) = β(P), and if v ∈ P (but P \ {v} ≠ ∅) then β′(P \ {v}) = β(P) − d_v. It then follows that σ_t($\mathcal{P}$, α, β) is equal to
Next, suppose that t is a forget node with child t′, where D(t) = D(t′) \ {v}. Given any state ($\mathcal{P}$, α, β) of t, we define two functions σ^1_t and σ^2_t; these functions correspond to the case where one of the parts that is liquid at t′ becomes frozen at t (if v was the last vertex in its part), and the case where all parts that are liquid at t′ remain liquid at t, respectively. We set We then see that σ_t($\mathcal{P}$, α, β) = max{σ^1_t($\mathcal{P}$, α, β), σ^2_t($\mathcal{P}$, α, β)}.
Finally, suppose that t is a join node with children t_1 and t_2, where D(t_1) = D(t_2) = D(t). In this case we see that
$$\sigma_t(\mathcal{P}, \alpha, \beta) = \max \big\{ \sigma_{t_1}(\mathcal{P}, \alpha_1, \beta_1) + \sigma_{t_2}(\mathcal{P}, \alpha_2, \beta_2) : \text{for all } P \in \mathcal{P},\ \alpha(P) = \alpha_1(P) + \alpha_2(P) - e(P) \text{ and } \beta(P) = \beta_1(P) + \beta_2(P) - \mathrm{vol}(P) \big\}.$$
To obtain our FPT approximation result, we use a very similar approach; the key is to restrict our attention to partitions with only a constant number of parts. For any constant c ∈ N, we write q_{≤c}(G) for the maximum modularity for G achievable with a partition into at most c parts, that is, $q_{\le c}(G) = \max\{ q_{\mathcal{A}}(G) : \mathcal{A} \text{ is a partition of } V(G) \text{ with } |\mathcal{A}| \le c \}$. We refer to the problem of deciding whether q_{≤c}(G) ≥ q for a given input graph G and constant q ∈ [0, 1] as c-Modularity. We now argue that c-Modularity is in FPT parameterised by the treewidth of the input graph. The crucial difference from our XP algorithm above is the fact that, when we fix the number of parts in the partition, we can no longer assume that every part is connected. However, if the maximum number of parts c is a constant, we can keep track of the necessary statistics for every possible part, not just those that intersect the bag under consideration.
Lemma 2.4. c-Modularity is in FPT when parameterised by the treewidth of the input graph.
Proof. The strategy is broadly the same as that used in the proof of Theorem 2.3; however, when the number of parts is fixed we can no longer assume that every part in the optimal partition is connected. Thus, instead of recording statistics relating to each part that intersects the bag currently under consideration, we keep track of the same statistics for each of the c (possibly empty) parts allowed in the partition. Formally, for any node t ∈ V(T), a valid state of t consists of: 1. a function π : D(t) → [c]; 2. a function α : [c] → [m]; 3. a function β : [c] → [2m]. Here π records the mapping of vertices of D(t) to the c possible parts, α keeps track of the number of edges captured so far in each of the c parts, and β the volume so far of each part. Notice that the number of possible states for any node t is at most c^{k+1} multiplied by a polynomial in m whose degree depends only on c. Given any state (π, α, β) of t, we define a (t, π, α, β)-partition to be any partition $\mathcal{A}$ = {A_1, ..., A_c} of V_t such that: 1. for each v ∈ D(t), v ∈ A_{π(v)}; 2. for each i ∈ [c]: • α(i) = e(A_i), and • β(i) = vol(A_i). We then set It is clear that, if r is the root of the tree decomposition, Thus it suffices to compute all values of θ_r. Note that if t is a leaf node we can consider all possibilities in time depending only on k and c; we now outline how to compute the values of θ_t for a node t, given the values for its children.
Suppose first that t is an introduce node with child t′, where D(t) = D(t′) ∪ {v}. Given any state (π, α, β) of t, we say that a state (π′, α′, β′) of t′ is introduce-compatible with (π, α, β) if: It then follows that θ_t(π, α, β) is equal to
Next, suppose that t is a forget node with child t′, where D(t) = D(t′) \ {v}. In this case we have
Finally, suppose that t is a join node with children t_1 and t_2, where D(t_1) = D(t_2) = D(t). In this case we see that and ; thus, for any constant ε > 0, we obtain a factor (1 − ε)-approximation by solving ⌈1/ε⌉-Modularity. This immediately gives the following result.
Corollary 2.5. Given any constant ε > 0, there is an FPT-algorithm, parameterised by the treewidth of the input graph G, that returns a partition $\mathcal{A}$ with $q_{\mathcal{A}}(G) > (1 - \varepsilon) q^*(G)$.
We conclude this section by noting that sparse graphs, in particular graphs G with low tree width, tw(G), and maximum degree, △(G), can have high maximum modularity.In particular Theorem 1.11 of [21] shows

Parameterisation by max leaf number
In this section we demonstrate that Modularity can be solved in time linear in the number of connected subgraphs of the input graph G; as a consequence of this result, we deduce that the problem belongs to XP when parameterised by the max leaf number of G.
Theorem 2.6. Let G be a graph on n vertices with m edges and at most h connected induced subgraphs. Then Modularity can be solved in time O(h^2 n).
Proof. We will assume without loss of generality (by Fact 1.3) that G contains no isolated vertices. For any induced subgraph H of G, and partition $\mathcal{A}_H$ of V(H), we write
$$q_{\mathcal{A}_H}(H, G) = \sum_{A \in \mathcal{A}_H} \Big( \frac{e(A)}{m} - \frac{\mathrm{vol}(A)^2}{4m^2} \Big),$$
where vol(A) denotes the volume of A in G. We then set $q^*(H, G) = \max_{\mathcal{A}_H} q_{\mathcal{A}_H}(H, G)$, where the maximum is taken over all partitions $\mathcal{A}_H$ of V(H). Thus, q*(H, G) can be seen as the maximum possible contribution of parts contained in H to the modularity of G, if we only consider partitions of V(G) such that every part is either completely contained in V(H) or does not intersect V(H).
Let H be a connected subgraph of G. Then, for any partition $\mathcal{A}_H$ of V(H) with |$\mathcal{A}_H$| > 1, such that each part induces a connected subgraph, it is clear that there exists a partition (X, Y) of V(H) into two non-empty sets such that H[X] and H[Y] are both connected, and every element of $\mathcal{A}_H$ is completely contained in either X or Y. Conversely, if (X, Y) is a partition with this property it is immediate that partitions of X and Y can be combined to give a partition of V(H). For any connected graph H, we write $\mathcal{P}(H)$ for the set of all partitions (X, Y) of V(H) into two non-empty sets such that G[X] and G[Y] are both connected. Since we need only consider partitions in which every part induces a connected subgraph (by Fact 1.2), it follows that
$$q^*(H, G) = \max \Big\{ \frac{1}{m} e(H) - \frac{1}{4m^2} \mathrm{vol}(H)^2,\ \max_{(X,Y) \in \mathcal{P}(H)} \big( q^*(H[X], G) + q^*(H[Y], G) \big) \Big\}, \qquad (4)$$
again adopting the convention that the maximum, taken over an empty set, is equal to −∞.
By assumption, G has only h connected induced subgraphs. We note that, with suitable data structures, we can compute a list of all such subgraphs in time O(nh). To enumerate all connected induced subgraphs containing the vertex v, we can explore a search tree as follows: we associate the pair ({v}, V(G) \ {v}) with the root and, on reaching a node associated with the pair (U, W), we select an arbitrary vertex x ∈ W such that N(x) ∩ U ≠ ∅ (if such a vertex exists), and create two child nodes associated with (U ∪ {x}, W \ {x}) and (U, W \ {x}) respectively. When this process terminates, the vertex-set of every connected induced subgraph containing v appears as the first element of the pair for exactly one leaf node. Repeating the process for each vertex in the graph (after deleting those starting vertices already considered) will produce a list of all connected induced subgraphs.
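The search tree just described can be sketched as a short recursive procedure. This is a plain illustration without the "suitable data structures" needed for the O(nh) bound; the function name `connected_vertex_sets` is hypothetical, and the graph is assumed to be simple and given as a vertex list and an edge list.

```python
def connected_vertex_sets(vertices, edges):
    """List the vertex set of every connected induced subgraph exactly once."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    found = []

    def explore(inside, candidates):
        # Select any remaining candidate with a neighbour in the current set.
        x = next((w for w in candidates if adj[w] & inside), None)
        if x is None:
            found.append(frozenset(inside))  # leaf of the search tree
            return
        explore(inside | {x}, candidates - {x})  # branch: x is included
        explore(inside, candidates - {x})        # branch: x is excluded

    remaining = set(vertices)
    for v in vertices:
        # Starting vertices already handled are deleted before each new search.
        remaining.discard(v)
        explore({v}, set(remaining))
    return found


# A path on three vertices has six connected induced subgraphs.
path_sets = connected_vertex_sets([0, 1, 2], [(0, 1), (1, 2)])
```

Each leaf corresponds to one sequence of include/exclude decisions, which is why no vertex set is reported twice.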
From now on we will assume that we have computed a list H_1, ..., H_h of all connected induced subgraphs of G; without loss of generality we may further assume that these subgraphs are listed in non-decreasing order of their number of vertices. In particular, this means that there is no connected induced subgraph that is strictly contained in H_1, so $\mathcal{P}(H_1) = \emptyset$ and $q^*(H_1, G) = \frac{1}{m} e(H_1) - \frac{1}{4m^2} \mathrm{vol}(H_1)^2$. We can reformulate (4) as follows:
$$q^*(H_j, G) = \max \Big\{ \frac{1}{m} e(H_j) - \frac{1}{4m^2} \mathrm{vol}(H_j)^2,\ \max_{\substack{i < j :\ V(H_i) \subset V(H_j), \\ H_j \setminus V(H_i) \text{ connected}}} \big( q^*(H_i, G) + q^*(H_j \setminus V(H_i), G) \big) \Big\}.$$
Note that, if H_j \ V(H_i) is connected, then H_j \ V(H_i) is H_ℓ for some ℓ < j. Thus, if we know the values q*(H_1, G), ..., q*(H_{j−1}, G), we can compute q*(H_j, G) in time O(j|H_j|). It follows that, by considering the connected subgraphs H_1, ..., H_h in order, we can compute q*(H, G) for every connected induced subgraph H in time O(h^2 n).
Now suppose that G has connected components C_1, ..., C_ℓ, where V(C_i) = V_i for each i. By Fact 1.2 (see also Lemma 1.6.2 of [27]), we can restrict our attention to partitions $\mathcal{A}$ of V(G) such that every part is completely contained in some V_i, so $q^*(G) = \sum_{i=1}^{\ell} q^*(C_i, G)$. Since each connected component C_i is a connected induced subgraph of G, it occurs in the list H_1, ..., H_h of connected induced subgraphs. Thus, once we have computed q*(H, G) for each connected induced subgraph H, we can immediately determine q*(G) by summing the appropriate values. Hence the overall time required to compute q*(G) is O(h^2 n).
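The dynamic programme of this proof can be sketched as follows: process the connected induced subgraphs in non-decreasing order of size, take for each the better of a single part and the best split into two connected pieces, and finally sum over the connected components. For brevity the sketch lists connected vertex sets by testing every subset, which is an assumption of this example and forfeits the stated running time; the function name is hypothetical and the graph is assumed to be simple with at least one edge.

```python
from itertools import combinations


def max_modularity_by_connected_sets(vertices, edges):
    """Maximum modularity via the recurrence over connected induced subgraphs."""
    m = len(edges)
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    def reachable(start, allowed):
        seen, stack = {start}, [start]
        while stack:
            for nb in adj[stack.pop()] & allowed:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        return seen

    def single_part_value(part):
        # Contribution e(A)/m - vol(A)^2/(4m^2) of one part A.
        internal = sum(1 for u, v in edges if u in part and v in part)
        volume = sum(len(adj[v]) for v in part)
        return internal / m - volume ** 2 / (4 * m * m)

    # Assumption: brute-force listing, already in non-decreasing order of size.
    connected = [frozenset(c)
                 for size in range(1, len(vertices) + 1)
                 for c in combinations(vertices, size)
                 if reachable(c[0], set(c)) == set(c)]

    best = {}
    for H in connected:
        value = single_part_value(H)
        for X in connected:
            if len(X) >= len(H):
                break  # only strictly smaller sets can be proper subsets
            Y = H - X
            # Y is in `best` exactly when it is connected (it is smaller than H).
            if X < H and Y in best:
                value = max(value, best[X] + best[Y])
        best[H] = value

    # Sum the values of the connected components of G.
    total, unvisited = 0.0, set(vertices)
    while unvisited:
        component = reachable(next(iter(unvisited)), set(vertices))
        total += best[frozenset(component)]
        unvisited -= component
    return total


two_triangles = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]
q_triangles = max_modularity_by_connected_sets(list(range(6)), two_triangles)
q_path = max_modularity_by_connected_sets([0, 1, 2, 3], [(0, 1), (1, 2), (2, 3)])
```

Replacing the brute-force listing by a search-tree enumeration of connected induced subgraphs would bring the sketch closer to the procedure analysed in the proof.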
It is known that, if the max leaf number of $G$ is $c$, then $G$ is a subdivision of some graph $H$ on at most $4c$ vertices [13]; a graph on $n$ vertices that is a subdivision of such a graph $H$ has at most $2^{4c} n^{(4c)^2}$ connected subgraphs (once we have decided which branch vertices belong to a subgraph, it remains only to decide where to cut each path from one of the chosen branch vertices to one we have not chosen). Thus, if the max leaf number of $G$ is bounded by a constant, it follows that $G$ has at most a polynomial number of connected subgraphs, and the following result is an immediate consequence of Theorem 2.6.
Corollary 2.7. Modularity is in XP when parameterised by the max leaf number of the input graph $G$.
We conjecture that this result is not optimal, and that Modularity is in fact in FPT with respect to this parameterisation.
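As a small sanity check on the counting argument behind Corollary 2.7, the sketch below counts connected induced subgraphs by brute force on paths, which have max leaf number 2. The helper names and the adjacency-dictionary representation are assumptions made for this illustration, and the exhaustive search over subsets is exponential, so it is only meant for very small graphs.

```python
from itertools import combinations


def is_connected(adj, vertices):
    """Depth-first search restricted to a non-empty vertex subset."""
    vertices = set(vertices)
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u] & vertices:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == vertices


def count_connected_induced_subgraphs(adj):
    """Count non-empty vertex subsets that induce a connected subgraph."""
    nodes = list(adj)
    return sum(
        1
        for k in range(1, len(nodes) + 1)
        for subset in combinations(nodes, k)
        if is_connected(adj, subset)
    )


def path_graph(n):
    """Path on vertices 0, ..., n-1 as an adjacency dictionary."""
    return {i: {j for j in (i - 1, i + 1) if 0 <= j < n} for i in range(n)}
```

On a path with $n$ vertices the connected induced subgraphs are exactly the sub-paths, so the count is $n(n+1)/2$, which is polynomial in $n$ as the bound above predicts.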

Hardness results
In this section we complement our positive result about the FPT approximability of the problem parameterised by treewidth by demonstrating that computing the exact value of the maximum modularity is hard even in a more restricted setting.
Theorem 3.1. Modularity, parameterised simultaneously by the pathwidth and the size of a minimum feedback vertex set for the input graph, is W[1]-hard.
Our proof of this result relies on the hardness of the following problem.

Equitable Connected Partition (ECP)
Input: A graph $G = (V, E)$ and $r \in \mathbb{N}$.
Question: Is there a partition of $V$ into $r$ classes $V_1, \ldots, V_r$ such that $||V_i| - |V_j|| \leq 1$ for all $1 \leq i < j \leq r$, and the induced subgraph $G[V_i]$ is connected for each $i \in \{1, \ldots, r\}$?
The parameterised complexity of ECP was investigated thoroughly in [12]. Among other results, the problem is shown to be W[1]-hard even when parameterised simultaneously by $r$, $\mathrm{pw}(G)$ and $\mathrm{fvs}(G)$. In proving this hardness result, the authors implicitly consider the following variation of ECP.

Anchored Equitable Connected Partition (AECP)
Input: A graph $H = (V_H, E_H)$, and a set of distinguished anchor vertices $a_1, \ldots, a_r \in V_H$.
Question: Is there a partition of $V_H$ into $r$ classes $V_1, \ldots, V_r$ such that $a_i \in V_i$ for all $i$, $||V_i| - |V_j|| \leq 1$ for all $1 \leq i < j \leq r$, and the induced subgraph $H[V_i]$ is connected for each $i \in \{1, \ldots, r\}$?
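To make the three conditions in the definition of AECP concrete, the following sketch checks whether a proposed partition is a valid solution. It is a verifier only, not a solver; the function names and the adjacency-dictionary representation of $H$ are assumptions made for this illustration.

```python
def _is_connected(adj, part):
    """Depth-first search restricted to `part` (a non-empty set)."""
    start = next(iter(part))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u] & part:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == part


def is_aecp_solution(adj, anchors, parts):
    """Check a candidate solution to an AECP instance.

    adj:     dict mapping each vertex of H to the set of its neighbours
    anchors: list [a_1, ..., a_r]
    parts:   list [V_1, ..., V_r] of vertex sets, with a_i expected in V_i
    """
    if not anchors or len(parts) != len(anchors):
        return False
    parts = [set(p) for p in parts]
    # The classes must be pairwise disjoint and cover V_H.
    if set().union(*parts) != set(adj) or sum(len(p) for p in parts) != len(adj):
        return False
    # Anchor condition: a_i belongs to V_i (so every class is non-empty).
    if any(a not in p for a, p in zip(anchors, parts)):
        return False
    # Equitability: class sizes differ by at most one.
    sizes = [len(p) for p in parts]
    if max(sizes) - min(sizes) > 1:
        return False
    # Connectivity: each class induces a connected subgraph of H.
    return all(_is_connected(adj, p) for p in parts)
```

On the path 0, 1, 2, 3 with anchors 0 and 3, the partition $\{0, 1\}, \{2, 3\}$ passes all checks, whereas $\{0, 2\}, \{1, 3\}$ fails connectivity and $\{0\}, \{1, 2, 3\}$ fails equitability.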
Observe $f_m(B) > 2\sqrt{2/m}$ if the terms in the product above are either both positive or both negative. Hence f ) and $2 - \sqrt{3} \approx 0.26794919 \geq 0.2679$. This establishes part 1 of the lemma. Parts 2 and 3 follow in the same fashion.
We are now ready to prove Theorem 3.1.
Proof of Theorem 3.1. We give a reduction from AECP. Suppose that $(H, \{a_1, \ldots, a_r\})$ is the input to an instance of AECP; we will describe how to construct a graph $G$, where $\mathrm{pw}(G)$ and $\mathrm{fvs}(G)$ are both bounded by a function of $r$, together with an explicit $q_0 \in (0, 1)$ such that $(G, q_0)$ is a yes-instance for Modularity if and only if $(H, \{a_1, \ldots, a_r\})$ is a yes-instance for AECP.
We may assume without loss of generality that our instance of AECP satisfies all of the conditions of Lemma 3.2.
We define a new graph $G$, obtained from $H$ by adding the following (see Figure 2):
• $\alpha$ new leaves adjacent to each anchor vertex $a_1, \ldots, a_r$,
• $\beta$ isolated edges disjoint from $H$, and
• an arbitrary perfect matching on the anchor vertices $a_1, \ldots, a_r$,
where the values of $\alpha$ and $\beta$ will be determined later. The idea of the construction is that the $\alpha$ edges help ensure that each anchor vertex is in a separate part of any modularity-optimal partition, and the $\beta$ edges allow us to get the numbers to work at the end of the proof. Notice that, even with these modifications, $G \setminus \{a_1, \ldots, a_r\}$ is still a disjoint union of isolated vertices and paths with pendant edges; hence $\mathrm{pw}(G) \leq r + 1$ and $\mathrm{fvs}(G) \leq r$. We set $m$ Define our instance of Modularity to be $(G, q_0)$, where We now argue that $(G, q_0)$ is a yes-instance if and only if $(H, \{a_1, \ldots, a_r\})$ is a yes-instance for AECP. Recall that $q^*(G) = 1 - \min$ and that the partition $\mathcal{A}$ which achieves the minimum in the expression above is exactly the modularity-maximal partition.
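The three additions that turn $H$ into $G$ can be sketched as follows. This is an illustration under stated assumptions: the function name, the tuple labels used for the new vertices, and the choice of matching (pairing consecutive anchors in the given order) are all hypothetical; the sketch also assumes that $r$ is even and that matched anchors are not already adjacent in $H$, since the set-based representation cannot store parallel edges.

```python
def build_reduction_graph(adj_H, anchors, alpha, beta):
    """Build the graph G of the reduction from H (adjacency dict of sets).

    New vertices get tuple labels ("leaf", i, k) and ("iso", k, side);
    these labels are assumptions of this sketch and must not clash
    with the vertex names of H.
    """
    if len(anchors) % 2 != 0:
        raise ValueError("a perfect matching needs an even number of anchors")

    # Copy H so that the input is left unchanged.
    G = {v: set(nbrs) for v, nbrs in adj_H.items()}

    def add_edge(u, v):
        G.setdefault(u, set()).add(v)
        G.setdefault(v, set()).add(u)

    # alpha new leaves adjacent to each anchor vertex.
    for i, a in enumerate(anchors):
        for k in range(alpha):
            add_edge(a, ("leaf", i, k))

    # beta isolated edges, disjoint from everything else.
    for k in range(beta):
        add_edge(("iso", k, 0), ("iso", k, 1))

    # One arbitrary perfect matching on the anchors: pair them consecutively.
    for i in range(0, len(anchors), 2):
        add_edge(anchors[i], anchors[i + 1])

    return G
```

Under the assumptions above, the resulting graph has $|E(H)| + r\alpha + \beta + r/2$ edges, which can be checked by summing the degrees of the returned dictionary and halving.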
In any modularity-optimal partition $\mathcal{A}$, each isolated edge will form its own part: this follows from Facts 1.4 and 1.2. Write $V'$ for the set of vertices of $G$ without the vertices supporting the $\beta$ isolated edges, and let the minimisation be over partitions $\mathcal{A}'$ of $V'$. We then have Rearranging, we see that The last inequality holds because $\sum_{A \in \mathcal{A}'} \mathrm{vol}(A) = 2(m - \beta)$ and so (7) is a weighted sum of the $f_m(A)$ with total weight one. This, together with the fact that no $A$ has zero volume, also implies that (7) $\geq$ (8), with equality if and only if $f_m(A) = \min_{B \subset V'} f_m(B)$ for every $A \in \mathcal{A}'$. Note that, since $\mathcal{A}'$ is the restriction of some modularity-optimal partition $\mathcal{A}$ to a connected component of $G$, we may assume that $G[A]$ is connected for all $A \in \mathcal{A}'$. Moreover, if $v$ is a pendant vertex adjacent to $u$ then $u$ and $v$ are in the same part in $\mathcal{A}'$; we call a partition with this last property (or, abusing notation, a set that would not violate this condition in a partition) 'pendant-consistent'.
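The pendant-consistency condition just defined is easy to test mechanically. The sketch below is an illustration only; the function name and the adjacency-dictionary representation are assumptions, and `parts` is assumed to be a partition of all the vertices appearing in `adj`.

```python
def is_pendant_consistent(adj, parts):
    """Return True if every pendant vertex shares a part with its neighbour.

    adj:   dict mapping each vertex to the set of its neighbours
    parts: list of disjoint vertex sets covering all vertices of adj
    """
    # Index of the part containing each vertex.
    part_of = {v: i for i, part in enumerate(parts) for v in part}
    for v, nbrs in adj.items():
        if len(nbrs) == 1:  # v is a pendant vertex
            (u,) = nbrs     # its unique neighbour
            if part_of[v] != part_of[u]:
                return False
    return True
```

For the graph with edges 0-1, 0-2 and 2-3, the pendant vertices are 1 and 3, so the partition $\{0, 1\}, \{2, 3\}$ is pendant-consistent while $\{0, 1, 2\}, \{3\}$ is not.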
We now make the following claim, writing $s = |H|/r$ for the desired part size in our instance of AECP.
b) if $(H, \{a_1, \ldots, a_r\})$ is a yes-instance, then there is a vertex partition $\mathcal{A}'$ of $V'$ so that $f_m(A) = 2\sqrt{2/m}$ for all $A \in \mathcal{A}'$;
c) if there is a vertex partition $\mathcal{A}' = \{A_1, \ldots, A_r\}$ of $V'$ so that $f_m(A_i) = 2\sqrt{2/m}$ for all $A_i \in \mathcal{A}'$, $\mathcal{A}'$ is pendant-consistent and $G[A]$ is connected for all $A \in \mathcal{A}'$, then $(H, \{a_1, \ldots, a_r\})$ is a yes-instance.
We defer the proof of Claim 3.4. For now we assume that Claim 3.4 holds and that we have $\alpha > 32|E(H)|^2$ and $\sqrt{2m} = s + \alpha + 1$, and prove that the theorem holds under these assumptions. By Claim 3.4(a) and line (7), we have

graphs whose treewidth or max leaf number is bounded by some fixed constant; we also showed that there is an FPT algorithm, parameterised by treewidth, which computes any constant-factor approximation to the maximum modularity. In contrast with the positive approximation result, we demonstrated that the problem is unlikely to admit an exact FPT algorithm when the treewidth is taken to be the parameter, as it is W[1]-hard even when parameterised simultaneously by the pathwidth and size of a minimum feedback vertex set for the input graph. We conjecture that our XP algorithm parameterised by max leaf number is not optimal, and that Modularity in fact belongs to FPT with respect to this parameterisation. Another open question arising from our work is whether the problem belongs to FPT with respect to other parameters for which this is not ruled out by our hardness result, including treedepth, modular width and neighbourhood diversity.
It is also natural to ask whether our approximation result can be extended to larger classes of graphs, for example those of bounded cliquewidth or bounded expansion. Moreover, when considering treewidth as the parameter, it would be interesting to investigate the existence or otherwise of an $\epsilon$-approximation in time $f(\mathrm{tw}, \epsilon) n^{O(1)}$.

Theorem 2.3. Modularity parameterised by the treewidth of the input graph $G$ is in XP.

Figure 2: Possible input graph $H$ with anchors $a_1, a_2, a_3, a_4$ and the graph $G$ constructed from it by adding $\alpha$ new leaves adjacent to each anchor, $\beta$ isolated edges and a perfect matching between the anchors.