Structural and Algorithmic Properties of 2-Community Structures
Abstract
We investigate the structural and algorithmic properties of 2-community structures in graphs introduced recently by Olsen (Math Soc Sci 66(3):331–336, 2013). A 2-community structure is a partition of a vertex set into two parts such that for each vertex the numbers of neighbours in/outside its own part and the sizes of the parts are correlated. We show that some well-studied graph classes, such as graphs of maximum degree 3, graphs of minimum degree at least \(|V|-3\), trees and others, always have a 2-community structure. Furthermore, a 2-community structure can be found in polynomial time in all these classes, even with the additional requirement of connectivity in both parts. We introduce the concept of a weak 2-community and prove that in general graphs it is NP-complete to find a balanced weak 2-community structure, with or without the requirement of connectivity in both parts. On the other hand, we present a polynomial-time algorithm to solve the problem (without the condition for connectivity of parts) in graphs of degree at most 3.
Keywords
Graph theory, Complexity, Graph partitioning, Community structure, Clustering, Social networks
1 Introduction
The research around community structures can be seen as a contribution to the well-established research on clustering and graph partitioning. Graph partitions have been intensively studied with various measures to evaluate their quality, see e.g. [2, 7, 14, 17, 19] for an overview.
A standard abstract model for any kind of social network such as Facebook or LinkedIn is a graph, in which vertices are members of the network and edges are relationships between members. In such a model ‘a community’ intuitively corresponds to a subgraph that has ‘more relationships’ inside the subgraph than outside of it. More generally, ‘a community structure’ corresponds to a partition of a graph into communities.
There have been several attempts to define the concept of communities formally; a good introduction including the motivation can be found in [1, 6, 11, 18, 20]. One of the first definitions of a community was motivated by searching links in web graphs and was introduced by Flake et al. [13]. It defines a community as a set of vertices C such that each vertex in C has at least as many neighbours inside C as outside. The same notion, called an ‘alliance in graphs’, was introduced by Kristiansen et al. [16] and investigated further in various papers. The concept of communities and community structures has received significant attention in further research, where some modified definitions of communities were also studied, e.g. the difference between the number of outside and inside neighbours should be larger than a given constant, or the community should also be a dominating set; see e.g. [4, 5, 15] for an overview and further references.
In this paper we study the structural and complexity problems of the recent definition of a community structure that reflects the sizes of communities too [10, 11, 18]. This new approach to communities is supported by the practical experiments showing the importance of capturing the sizes of communities for a better description of their properties [18].
The general concept of a community structure does not put any restriction on the number of communities. This paper focuses on a partition with two communities where the problems are already appealing. The presented techniques offer some possibilities for an extension to a larger number of communities. Informally, a 2-community structure is a partition of the vertex set into two parts A, B such that for each vertex, say from part A, the ratio ‘the number of neighbours in part A’ over the size of A (excluding the vertex itself) is at least as large as ‘the number of neighbours in part B’ over the size of B. To generalise, in a k-community structure, the ratio condition must hold for every two communities. We also introduce a weak community structure in which the vertex itself contributes to the ratio. The ratio condition in the latter definition is weaker, but it reflects the reasonable requirement that each member should be considered as a part of its own community (see Sect. 2 for the technical details). Even if there are only minor differences between the definitions, the structural and complexity results for the two problems are very different, as shown in this paper. Both definitions are relevant to describe community structures; the choice depends on the suitability of the model.
We also study the 2-community problems with additional constraints such as connectivity or equality of sizes for both parts (a balanced partition). The connectivity requirement corresponds to the essential condition that each member in the community should ‘indirectly know’ all members in its own community, where the ‘indirectly know’ relation corresponds to a path between two vertices in the graph. The study of balanced communities is motivated by the practical interest in communities of equal size. In general, balanced graph partitions are well studied, e.g. due to their applications in divide-and-conquer algorithms, see e.g. [8]. In the balanced partition problem, which can be seen as a generalisation of the bisection problem to any given number of parts, the goal is to minimise the number of edges between parts. It is known that the problem cannot be approximated within any finite factor in polynomial time in general graphs and it remains APX-hard even on trees of constant maximum degree [12]. This demonstrates that some graph partition problems related to e.g. balanced communities are hard to solve even for restricted graph classes, and indicates the hardness of various problems related to a community structure too. Hence all positive results on community structure problems are important for a better understanding of the differences between community and partition problems.
Furthermore, a community structure is in fact a graph partition with a restricted number of edges between parts, therefore the new results for communities may find applications in areas similar to those of graph partitioning, such as parallel computing, VLSI circuit design, route planning [9] and divide-and-conquer algorithms [21].
There are only a few results related to this new definition of a community. Olsen [18] proved that a community structure (without the condition on the exact number of communities) can be found in polynomial time in any graph with at least 4 vertices, except a star. Recently, Estivill-Castro et al. [11] claimed that the problem of finding a k-community structure in which all communities are connected and of equal size is NP-complete in general graphs, but polynomially solvable in trees. In [18] Olsen also proved that it is NP-complete to decide whether there is a community structure in a graph in which a given set of vertices is included in a community.
Our contribution
 (i)
 (ii) graphs of maximum degree 3:

a connected 2-community structure exists and can be found in polynomial time (Theorem 2),

a balanced weak 2-community structure exists and can be found in polynomial time (Theorem 6),

there are graphs without a balanced 2-community structure (Remark 1),

there are graphs with a balanced 2-community structure, but without a connected balanced weak 2-community structure (Remark 2)

 (iii) graphs of minimum degree \(|V|-3\), complements of bipartite graphs, graphs with minimum degree \(\lceil \frac{(c-1)\cdot |V|}{c} \rceil \) where c is the size of an inclusionwise maximal clique in the graph
 (iv)
The paper is structured as follows. In Sect. 2 we formally introduce the notation and the definitions of the studied problems. In Sect. 3 we show that in some well-studied graph classes a 2-community structure always exists and can be found in polynomial time, even with the additional requirement of connectivity in both parts. In Sect. 4 we focus on the balanced 2-community structure and present structural and algorithmic results in general graphs and some graph classes. Conclusions and open problems are provided in Sect. 5.
2 Preliminaries
In the paper, all considered graphs are simple, undirected and connected. Let \(G=(V, E)\) be a graph. For a vertex \(v\in V\), let d(v) be the degree of the vertex v and for any subgraph H of the graph G let \(N_H(v)\) be the set of the neighbours of v in H, \(N_H[v]=N_H(v)\cup \{v\}\) and let \(d_{H}(v)=|N_H(v)|\). For a given partition of V into two parts (a 2-partition), let an in-neighbour of v (resp. out-neighbour) be a neighbour in its own part (resp. out of its part) and let \(d_{in}(v)\) (resp. \(d_{out}(v)\)) denote the number of in-neighbours of v (resp. out-neighbours). For a graph G and a subset of vertices \(S\subseteq V\), let G[S] denote the subgraph of G induced by S. A partition \(\{C_1,C_2\}\) of V is connected if the subgraphs \(G[C_1]\) and \(G[C_2]\) are connected and it is balanced if the sizes of \(C_1\) and \(C_2\) differ by at most 1. The cut size of a 2-partition is the number of edges that have end vertices in different parts of the partition. A graph is said to be of minimum (resp. maximum) degree k if every vertex of the graph has degree at least (resp. at most) k. A pendant vertex of G is any vertex of degree 1. A star is a complete bipartite graph \(K_{1,\ell }\) for any \(\ell \ge 1\). The complement graph \(\overline{G}=(V,\overline{E})\) of a graph \(G=(V,E)\) is the graph in which \(\{u,v\}\in E\) iff \(\{u,v\}\notin \overline{E}\) for all distinct vertices \(u, v\in V\). A graph G is 2-colourable if there exists a partition \(\{C_1, C_2\}\) of V such that \(G[C_1]\), \(G[C_2]\) contain only isolated vertices.
Now we introduce Olsen’s definition of a k-community structure from [18].
Definition 1
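The statement of the definition is not reproduced at this point. As a reading aid only, the two conditions that the later proofs cite as (1) and (2) can be reconstructed, for a 2-partition \(\{C_1,C_2\}\) and a vertex \(v\in C_i\), from the informal description in Sect. 1 and from the way they are used in the proofs below. The formulas are an inference rather than Olsen's exact wording, and the requirement \(|C_i|\ge 2\) in the first one is an assumption suggested by Lemma 5:
$$\begin{aligned} \frac{d_{in}(v)}{|C_i|-1}\ge \frac{d_{out}(v)}{|C_{3-i}|} \qquad (1) \end{aligned}$$
for a 2-community structure, and
$$\begin{aligned} \frac{d_{in}(v)+1}{|C_i|}\ge \frac{d_{out}(v)}{|C_{3-i}|} \qquad (2) \end{aligned}$$
for a weak 2-community structure, where the vertex v itself is counted in its own part.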
In this paper we investigate a community structure for a fixed number of two communities and also study some variants of the 2-Community problem:
2-Community
Input: A graph \(G=(V, E)\).
Question: Does G have a 2-community structure?
The additional constraint which asks for the subgraphs induced by each part of the partition to be connected is a natural condition useful for problems related to connectedness. The Connected 2-Community problem is to decide if a graph has a connected 2-community structure, i.e. a 2-community structure \(\{C_1, C_2\}\) such that the subgraphs induced by \(C_1\), \(C_2\) are connected. We can define analogous problems for weak and balanced versions.
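To make the decision problems concrete, here is a minimal Python sketch of a checker for a given 2-partition. It assumes that condition (1) reads \(d_{in}(v)/(|C_i|-1)\ge d_{out}(v)/|C_j|\) and that the weak condition (2) reads \((d_{in}(v)+1)/|C_i|\ge d_{out}(v)/|C_j|\), as inferred from the proofs in this paper; the function name `is_2_community`, the adjacency-dictionary input and the requirement that both parts have at least two vertices in the non-weak case are illustrative assumptions.

```python
def is_2_community(adj, part1, weak=False):
    """Check a 2-partition against condition (1), or (2) if weak=True.

    adj   : dict mapping each vertex to the set of its neighbours
    part1 : iterable with the vertices of one part; the rest form the other part
    """
    c1 = set(part1)
    c2 = set(adj) - c1
    # Assumption: parts of a (non-weak) structure need at least 2 vertices.
    if min(len(c1), len(c2)) < (1 if weak else 2):
        return False
    for own, other in ((c1, c2), (c2, c1)):
        for v in own:
            d_in = len(adj[v] & own)
            d_out = len(adj[v] & other)
            if weak:
                # (d_in + 1) / |own| >= d_out / |other|, cross-multiplied
                ok = (d_in + 1) * len(other) >= d_out * len(own)
            else:
                # d_in / (|own| - 1) >= d_out / |other|, cross-multiplied
                ok = d_in * len(other) >= d_out * (len(own) - 1)
            if not ok:
                return False
    return True
```

Cross-multiplying keeps the comparison in integers, so no floating-point rounding is involved. On the star \(K_{1,3}\) the checker rejects every 2-partition for condition (1) but accepts a balanced one for condition (2).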
3 Connected 2-Community Structures in Some Graph Classes
In this section we show that if a graph has certain structural properties, then it has a connected 2-community structure which can be found in polynomial time. More precisely, we prove that such a statement is valid for trees and graphs of high minimum or low maximum degree.
Theorem 1
Every tree with at least 4 vertices (except a star) has a connected 2-community structure that can be found in linear time.
Proof
Let \(G=(V,E)\) be a tree not isomorphic to a star. We prove that there exists an edge \(e\in E\) such that the two connected components of \(G\setminus e\) form a 2-partition which is a connected 2-community structure.
Let \(e=\{u,v\}\) be an edge in E such that \(d(u), d(v) \ge 2\) (due to the assumption about G such an edge e must exist). Consider the partition \(\{X_u,X_v\}\) of V where \(X_u\) (resp. \(X_v\)) is the set of vertices of the connected component of \(G\setminus e\) containing u (resp. v).
First we notice that at most one of the vertices u and v may fail to satisfy the condition (1). Otherwise \(\frac{d(u)-1}{|X_u|-1}<\frac{1}{|X_v|}\) and \(\frac{d(v)-1}{|X_v|-1}<\frac{1}{|X_u|}\). Since \(d(u),d(v)\ge 2\), this implies \(|X_v|<\frac{|X_u|-1}{d(u)-1}\le |X_u|-1\) and \(|X_u|<\frac{|X_v|-1}{d(v)-1}\le |X_v|-1\), which is not possible.
If both vertices u and v satisfy the condition (1), then \(\{X_u,X_v\}\) is obviously a 2-community structure. If not, then without loss of generality, let the vertex u satisfy the condition (1) while v does not. Then the Update procedure is repeated and, once no update is possible, the modified partition \(\{X_u, X_v\}\) is a 2-community structure, as shown below.
The Update procedure:
Let \(v_1,v_2,\ldots , v_{d(v)-1}\) be the neighbours of v excluding u (there is at least one such vertex due to our assumption \(d(v)\ge 2\)). For each i, \(1\le i \le d(v)-1\), and \(e_i=\{v,v_i\}\in E\), let \(X_i\) be the set of vertices of the connected component of \(G\setminus e_i\) containing \(v_i\).
Notice that if \(d(v_j)=1\) for all j, \(1\le j\le d(v)-1\), then v must already satisfy the condition (1) in the partition \(\{X_u, X_v\}\) at the beginning of the Update procedure.
Hence, there exists i, \(1\le i\le d(v)-1\), such that \(d(v_i)> 1\) and the vertex v satisfies the condition (1) in the partition \(\{X_i,V\setminus X_i\}\). Then, relabel \(u:=v\) and \(v:=v_i\) and return to the beginning of the Update procedure.
Each time the labels of u and v are updated, the size of \(X_u\) increases by at least one, hence the whole process always terminates. The final partition at the end of the process is a connected 2-community structure because both parts correspond to the two connected components of a tree obtained by removing an edge.
Notice that finding such an edge can be done in \(O(|V|)\) operations. First, fix an edge \(e=\{u,v\}\) such that \(d(v), d(u)\ge 2\). Then, consider \(G\setminus e\) as a union of two trees \(T_u\) and \(T_v\), where \(T_u\) is a tree on the vertex set \(X_u\) rooted in u (and similarly for \(T_v\) on \(X_v\) rooted in v). For each vertex w of G calculate recursively the size of the subtree of \(T_u\) (or \(T_v\)) rooted in w, which can be done in time \(O(|V|)\). Finally, using the sizes of the subtrees, check if \(\{X_u, X_v\}\) corresponds to a 2-community structure and, if needed, update \(X_u\), \(X_v\) according to the algorithm. The number of such updates is clearly at most \(|E|\). Since G is a tree, the repetition of the Update procedure finishes with a connected 2-community structure in \(O(|V|)\) time. \(\square \)
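The existence statement of Theorem 1 can be turned into a short program. The sketch below is not the Update walk from the proof but a simplified variant: it computes all subtree sizes once and then scans every edge whose end vertices both have degree at least 2, testing condition (1) for the two end vertices only (every other vertex has all its neighbours in its own part). It assumes that condition (1) is \(d_{in}(v)/(|C_i|-1)\ge d_{out}(v)/|C_j|\); the function name and the input format (vertices \(0,\ldots ,n-1\) and an edge list) are illustrative choices.

```python
from collections import defaultdict

def tree_connected_2_community(n, edges):
    """Return two vertex sets (X_u, X_v) for a tree on 0..n-1, or None."""
    adj = defaultdict(list)
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)

    # Iterative DFS from vertex 0: parent pointers and a visiting order.
    parent = {0: None}
    order = []
    stack = [0]
    while stack:
        x = stack.pop()
        order.append(x)
        for y in adj[x]:
            if y != parent[x]:
                parent[y] = x
                stack.append(y)

    # Subtree sizes, accumulated from the leaves upwards.
    size = {x: 1 for x in order}
    for x in reversed(order):
        if parent[x] is not None:
            size[parent[x]] += size[x]

    # Each non-root vertex stands for the edge to its parent.
    for child in order[1:]:
        par = parent[child]
        du, dv = len(adj[par]), len(adj[child])
        if du < 2 or dv < 2:
            continue
        sv = size[child]      # size of the part containing `child`
        su = n - sv           # size of the part containing `par`
        # Condition (1) for an end vertex: (d - 1) / (|own| - 1) >= 1 / |other|
        if (du - 1) * sv >= su - 1 and (dv - 1) * su >= sv - 1:
            part_v = set()
            stack = [child]
            while stack:
                x = stack.pop()
                part_v.add(x)
                stack.extend(y for y in adj[x] if y != parent[x])
            return set(range(n)) - part_v, part_v
    return None
```

Each edge is tested in constant time once the subtree sizes are known, so the scan is linear in the number of vertices. For a star no edge has two end vertices of degree at least 2, and the function returns `None`.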
Very recently, Estivill-Castro et al. proved in [11] the same result using different methods. Our approach is more structural and the proof of the existence of an edge that connects two communities results directly in a linear-time algorithm.
Now we investigate graphs that may contain cycles but that still have low densities, namely the graphs of maximum degree 3. First, the restrictions on the sizes of the parts are discussed to ensure the vertices fulfil the condition (1) of a 2-community structure.
Lemma 1
Let \(G=(V,E)\) be a graph of maximum degree 3 of size n. Let \( \{C_1, C_2\}\) be a partition of V such that \(\lceil \frac{n-1}{3}\rceil \le |C_i|\le n-\lceil \frac{n-1}{3}\rceil \), \(i=1,2\). Then each vertex of degree 3 in G with at most one out-neighbour fulfils the condition (1) of a 2-community structure.
Furthermore, if for some \(i\in \{1, 2\}\), \(|C_i| =\lceil \frac{n-1}{3}\rceil \) (or also \(|C_i|=\lceil \frac{n-1}{3}\rceil +1\) in case \(n \equiv 1 \mod 3\)) then each vertex of degree 3 in \(C_i\) with two out-neighbours fulfils the condition (1) too.
Proof
Let \(\{C_1, C_2\}\) be a fixed partition of G such that \(\lceil \frac{n-1}{3}\rceil \le |C_i|\le n-\lceil \frac{n-1}{3}\rceil \), \(i=1,2\). It is clear that the condition (1) is true for each vertex which has only neighbours in its own part. Firstly, suppose the vertex v from \( C_i\), \(i\in \{1, 2\}\), has exactly one out-neighbour.
Since \(|C_i|\le n-\lceil \frac{n-1}{3}\rceil \), then obviously \(|C_i|\le n- \frac{n-1}{3} \) and \(\frac{2}{|C_i| -1}\ge \frac{1}{n-|C_i|}\). Therefore the condition (1) is fulfilled for the vertex v.
Now suppose that for \(i\in \{1, 2\}\) there is a vertex \(v\in C_i\) with exactly two out-neighbours and \(|C_i| =\lceil \frac{n-1}{3}\rceil \). Obviously, \(\lceil \frac{n-1}{3}\rceil \le \frac{n+2}{3}\) and hence \(2\lceil \frac{n-1}{3}\rceil -2\le n-\lceil \frac{n-1}{3}\rceil \), which implies \(\frac{1}{\lceil \frac{n-1}{3}\rceil -1}\ge \frac{2}{n-\lceil \frac{n-1}{3}\rceil }\). This corresponds to the condition (1) for the vertex v. Similarly, if \(|C_i|=\lceil \frac{n-1}{3}\rceil +1\) and \(n \equiv 1 \mod 3\), then \(n-1= 3\lceil \frac{n-1}{3}\rceil \), which implies \(\frac{1}{\lceil \frac{n-1}{3}\rceil }\ge \frac{2}{n-\lceil \frac{n-1}{3}\rceil -1}\). \(\square \)
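The inequalities used in this proof are easy to sanity-check numerically. The following Python sketch (the function name is hypothetical) verifies, for a given n, the cross-multiplied forms of the two claims of Lemma 1, again under the assumption that condition (1) is \(d_{in}(v)/(|C_i|-1)\ge d_{out}(v)/|C_j|\). It is a numerical check, not a replacement for the proof.

```python
def lemma1_inequalities_hold(n):
    """Check the arithmetic of Lemma 1 for a graph of order n."""
    k = (n + 1) // 3  # equals ceil((n - 1) / 3) for every integer n

    # Degree 3, one out-neighbour, own part of any admissible size s:
    # 2 / (s - 1) >= 1 / (n - s)  <=>  2 * (n - s) >= s - 1
    for s in range(k, n - k + 1):
        if 2 * (n - s) < s - 1:
            return False

    # Degree 3, two out-neighbours, own part of size k
    # (or also k + 1 when n is congruent to 1 modulo 3):
    # 1 / (s - 1) >= 2 / (n - s)  <=>  n - s >= 2 * (s - 1)
    sizes = [k] + ([k + 1] if n % 3 == 1 else [])
    for s in sizes:
        if n - s < 2 * (s - 1):
            return False
    return True
```

For example, `all(lemma1_inequalities_hold(n) for n in range(4, 500))` evaluates to `True`.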
Lemma 2
Let \(G=(V, E)\) be a graph of maximum degree 3 of size n. Let \(\{C_1, C_2\}\) be a partition of V such that \(\lceil \frac{n-1}{3}\rceil \le |C_1|\le \lfloor \frac{n}{2}\rfloor \). Then each vertex of degree 2 in \(C_1\) with at most one out-neighbour fulfils the condition (1) of a 2-community structure.
If the partition is balanced, then each vertex of degree 2 in G with at most one out-neighbour fulfils the condition (1).
Proof
Let \(\{C_1, C_2\}\) be a partition of V such that \(\lceil \frac{n-1}{3}\rceil \le |C_1|\le \lfloor \frac{n}{2}\rfloor \). Obviously, any vertex of degree 2 with no neighbours out of its own part fulfils the condition (1). Moreover, any vertex of degree 2 in \(C_1\) with only one out-neighbour satisfies \(\frac{1}{|C_1|-1}\ge \frac{1}{|C_2|}\) since \(|C_1|\le |C_2|\).
If the partition is balanced, then \(\frac{1}{|C_1|-1}\ge \frac{1}{|C_2|}\) and \(\frac{1}{|C_2|-1}\ge \frac{1}{|C_1|}\), and hence the vertices of degree 2 from both parts with exactly one out-neighbour satisfy the condition (1). \(\square \)
Lemma 3
Let \(G=(V, E)\) be a graph of maximum degree 3 of size n and \(\{C_1, C_2\}\) be a partition of V such that \(\lceil \frac{n-1}{3}\rceil \le |C_i|\le n-\lceil \frac{n-1}{3}\rceil \), \(i=1, 2\). Then \(\{C_1, C_2\}\) is a 2-community structure on G if one of the following conditions holds:
 (i)
The vertices of degree 2 from the smaller part and all the vertices of degree 3 have at most one out-neighbour.
 (ii)
The vertices of degree 2 and 3 have at most one out-neighbour and the partition is balanced.
 (iii)
The vertices of degree 2 from the smaller part have at most one out-neighbour, the vertices of degree 3 in \(C_i\), for some \(i\in \{1,2\}\), have at most two out-neighbours and \(|C_i|=\lceil \frac{n-1}{3}\rceil \) (or also \(|C_i|=\lceil \frac{n-1}{3}\rceil +1\) if \(n \equiv 1 \mod 3\)) and the vertices of degree 3 in \(C_{3-i}\) have at most one out-neighbour.
Proof
In each case (i), (ii), or (iii), all the vertices of the graph G satisfy the condition (1) due to Lemmas 1 and 2. Hence, \(\{C_1, C_2\}\) is a 2-community structure on G. \(\square \)
Lemma 4
Every connected graph of maximum degree 3 on n vertices, \(n\ge 4\), (except a star) has a connected partition \(\{C_1,C_2\}\) such that \(\lceil \frac{n-1}{3}\rceil \le |C_i|\le n-\lceil \frac{n-1}{3}\rceil \), \(i=1, 2\). Moreover, such a partition can be found in polynomial time.
Proof
Let \(G=(V, E)\) be a graph with the given properties. If G is a tree, take a pendant vertex \(u\in V\) and let \(v\in V\) be its neighbour. If G is not a tree, let \(\{u, v\}\) be an edge of a cycle in G. Since G is not isomorphic to a star such an edge must exist.
Initially, put into \(C_1\) the vertices u, v together with any pendant vertices adjacent to them. If there is a vertex z of degree 2 adjacent to both u and v, update \(C_1: =C_1\cup \{z\}\). Define \(C_2:= V \setminus C_1\).
The algorithm keeps the connectivity of \(G[C_{1}]\) and \(G[C_{2}]\) and extends \(C_{1}\) either by transferring vertices from \(C_{2}\) to \(C_{1}\) or by relabelling a suitable connected part of the graph until \(\lceil \frac{n-1}{3}\rceil \le |C_i|\le n-\lceil \frac{n-1}{3}\rceil \), \(i=1,2\).
The algorithm starts with the initial set \(C_{1}\) and repeats the Update procedure until \(|C_1|\ge \lceil \frac{n-1}{3}\rceil \). In each run of the procedure only one of the options 1 or 2 is executed.
The Update procedure:
Let w be a vertex in \(C_2\) which has a neighbour in \(C_1\) (such a vertex must exist since G is connected).
 If \(|A|\le n-2\lceil \frac{n-1}{3}\rceil \), put
$$\begin{aligned} C_1:=C_1\cup A\cup \{w\},\ C_2:=B. \end{aligned}$$
Notice that \(|C_1|\le n-\lceil \frac{n-1}{3}\rceil \), \(\{C_1,C_2\}\) is a connected partition and the size of \(C_1\) strictly increased.
 If \( n-2\lceil \frac{n-1}{3}\rceil +1\le |A|\le n-\lceil \frac{n-1}{3}\rceil \), then notice that \(|A|\ge \lceil \frac{n-1}{3}\rceil \) and put
$$\begin{aligned} C_1:=A,\ C_2:=V\setminus A. \end{aligned}$$
Obviously, \(\{C_1,C_2\}\) is a connected partition with \(\lceil \frac{n-1}{3}\rceil \le |C_i|\le n-\lceil \frac{n-1}{3}\rceil \), \(i=1,2\), hence the Update procedure halts.
 If \(|A|>n-\lceil \frac{n-1}{3}\rceil \), put
$$\begin{aligned} C_1:=C_1\cup B \cup \{w\},\ C_2:=A. \end{aligned}$$
Notice that \(|C_1|<\lceil \frac{n-1}{3}\rceil \), \(\{C_1,C_2\}\) is a connected partition and the size of \(C_1\) strictly increased.
By our construction, the partition \(\{C_1,C_2\}\) remains connected during each run of the Update procedure.
Each time the Update procedure is executed, the size of \(C_1\) strictly increases, hence the algorithm always terminates.
At the end of the algorithm \(\lceil \frac{n-1}{3}\rceil \le |C_i|\le n-\lceil \frac{n-1}{3}\rceil \), \(i=1,2\), and the algorithm clearly runs in polynomial time. \(\square \)
Theorem 2
Every connected graph of maximum degree 3 with at least 4 vertices (except a star) has a connected 2-community structure which can be found in polynomial time.
Proof
 (A)
if there exists \(i\in \{1, 2\}\) such that \(|C_i|>\lceil \frac{n-1}{3}\rceil \) in case \(n\not \equiv 1 \mod 3\) or \(|C_i|>\lceil \frac{n-1}{3}\rceil +1\) in case \(n\equiv 1 \mod 3\), then all the vertices of degree 3 in \(C_i\) with two out-neighbours,
 (B)
if the partition is not balanced, then all the vertices of degree 2 in the larger part with one out-neighbour.
The Improvement Procedure: Stage 1 (Category (A) vertices)
In this stage we handle vertices in \(C_2\) of degree 3 with two out-neighbours by transferring them into \(C_1\), keeping the size of \(C_1\) smaller than \(n-\lceil \frac{n-1}{3}\rceil \) and ensuring connectivity of the partition \(\{C_1,C_2\}\).
The Improvement Procedure: Stage 2 (Category (A) vertices)
Similarly to Stage 1, in Stage 2 we handle vertices in \(C_1\) of degree 3 with two out-neighbours by transferring them into \(C_2\), keeping the size of \(C_2\) smaller than \(n-\lceil \frac{n-1}{3}\rceil \) and ensuring connectivity of the partition \(\{C_1,C_2\}\).
The Improvement Procedure: Stage 3 (Category (B) vertices)
If the partition is not balanced, the vertices of degree 2 with one out-neighbour must be transferred from the larger part to the smaller part.
If \(|C_1|>|C_2|\), relabel \(C_1:=C_2\) and \(C_2:=V\setminus C_1\).
It is easy to see that the algorithm always terminates. Each iteration of the while loop in Stage 1 (resp. Stage 2) decreases the cut size by at least one. In Stage 3 each iteration of the while loop increases the size of the smaller part by at least one and halts before or when the partition is balanced. Following the construction, if the Improvement Procedure needs to be run again, it must first run through Stage 1 or 2, which decreases the cut size by at least one. Moreover, the algorithm clearly runs in polynomial time.
We now discuss the correctness of the algorithm. Suppose the algorithm terminates with the final partition \(\{C_1, C_2\}\). Due to the conditions inside the algorithm, \(\lceil \frac{n-1}{3}\rceil \le |C_i|\le n- \lceil \frac{n-1}{3}\rceil \), \(i=1 , 2\).
Initially, the partition is connected and remains so after each stage, hence the final partition is connected too.

If the final partition is balanced then all vertices of degree 2 and 3 have at most one out-neighbour (otherwise the Improvement Procedure could be applied again), hence the final partition \(\{C_1, C_2\}\) is a 2-community structure due to Lemma 3(ii).

If the final partition is not balanced, then the partition must have the properties described in Lemma 3(i) or (iii) (otherwise, one of Stages 1–2 could be applied again). Hence the final partition \(\{C_1, C_2\}\) is a 2-community structure. \(\square \)
Now we investigate the problem of the existence and finding of a connected 2-community structure in dense graphs. We prove that any graph \(G=(V, E)\) of minimum degree \(|V|-3\) has a connected 2-community structure which can be found in polynomial time.
Lemma 5
If the complement of the graph G is 2-colourable (using each colour for at least 2 vertices), then G has a connected 2-community structure which can be found in polynomial time.
Proof
Let \(G=(V,E)\) be a graph such that its complement \(\overline{G}\) is 2-colourable. Fix a 2-colouring of \(\overline{G}\) (with at least 2 vertices for each colour) and define \(\{C_1,C_2\}\) as a partition of V, where each part corresponds to one colour in \(\overline{G}\). Obviously, \(|C_1|,|C_2|\ge 2\). Notice that the induced subgraph on the vertex set \(C_1\) (resp. \(C_2\)) is a clique. Therefore, any vertex \(v\in V\) satisfies the condition (1) and the partition \(\{C_1,C_2\}\) is a 2-community structure. Since a 2-colouring can be found in polynomial time, so can the 2-community structure \(\{C_1, C_2\}\). Obviously, the partition is connected. \(\square \)
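The construction in this proof can be sketched in Python as follows. The sketch 2-colours each connected component of the complement by breadth-first search and then decides, component by component, which colour class goes to which part so that both parts receive at least two vertices. That last step (a small dynamic programme over vertex counts capped at 2) is an implementation choice made here to meet the "at least 2 vertices for each colour" requirement and is not taken from the paper; the function name and the adjacency-dictionary input are likewise illustrative.

```python
from collections import deque

def complement_two_colouring_partition(adj):
    """adj: dict vertex -> set of neighbours in G. Returns (C1, C2) or None."""
    vertices = list(adj)
    # Adjacency in the complement graph.
    co = {v: set(vertices) - adj[v] - {v} for v in vertices}

    colour = {}
    comps = []  # one pair (class_0, class_1) per component of the complement
    for s in vertices:
        if s in colour:
            continue
        colour[s] = 0
        classes = ([s], [])
        queue = deque([s])
        while queue:
            x = queue.popleft()
            for y in co[x]:
                if y not in colour:
                    colour[y] = 1 - colour[x]
                    classes[colour[y]].append(y)
                    queue.append(y)
                elif colour[y] == colour[x]:
                    return None  # odd cycle: the complement is not 2-colourable
        comps.append(classes)

    # Choose, per component, whether to swap its two classes.
    # State: (size of part 1, size of part 2), both capped at 2.
    states = {(0, 0): []}
    for a, b in comps:
        nxt = {}
        for (x, y), flips in states.items():
            for flip, (p, q) in ((False, (a, b)), (True, (b, a))):
                key = (min(2, x + len(p)), min(2, y + len(q)))
                nxt.setdefault(key, flips + [flip])
        states = nxt
    if (2, 2) not in states:
        return None  # no 2-colouring uses each colour at least twice

    c1, c2 = set(), set()
    for (a, b), flip in zip(comps, states[(2, 2)]):
        if flip:
            a, b = b, a
        c1.update(a)
        c2.update(b)
    return c1, c2
```

Both returned parts are independent sets in the complement, hence cliques in G, which is exactly the property the proof uses.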
This result directly implies the following theorem:
Theorem 3
The complement of any bipartite graph (with at least two vertices in each part) has a connected 2-community structure which can be found in polynomial time.
Theorem 4
Any graph (except a star) of minimum degree \((n-3)\), \(n\ge 4\), where n is the order of the graph, has a connected 2-community structure which can be found in polynomial time.
Proof
Let G be a graph of size n and of minimum degree \((n-3)\) (except a star), \(n\ge 4\), and let \(\overline{G}\) be the complement of G. Notice that \(\overline{G}\) is of degree at most 2. If \(\overline{G}\) does not contain an odd cycle, then there exists a 2-colouring of \(\overline{G}\) with at least 2 vertices for each colour. In such a case, a connected 2-community structure can be found in polynomial time due to Lemma 5.
All vertices of \(C_2\) satisfy the condition (1) in G since \(G[C_2]\) is a clique. For each i, \(1\le i\le p\), all neighbours of \(v_i\) in \(G[C_1]\) satisfy the condition (1) in G since they have all vertices of \(C_1\) as neighbours. Moreover, the non-neighbour of \(v_i\) in \(G[C_1]\) and \(v_i\) itself satisfy the condition (1) in G since \(|C_1|>|C_2|\) implies that \(\frac{|C_1|-2}{|C_1|-1}\ge \frac{|C_2|-1}{|C_2|}\).
Observe that the partition \(\{C_1,C_2\}\) is connected. Obviously, \(G[C_2]\) is connected since \(G[C_2]\) is a clique. Moreover, any two vertices in \(C_1\) are neighbours except \(v_i\) and its neighbour in \(\overline{G}[O_{i,1}]\) for all i, \(1\le i\le p\). If \(B_1\ne \emptyset \), such two vertices must have a common neighbour in \(B_1\). If \(B_1=\emptyset \), then either \(|O_{1,1}|\ge 3\) or \(p\ge 2\) (due to assumptions on G), and such two vertices have a common neighbour either in \(O_{1,1}\) or \(O_{j,1}\), \(j\ne i\). Hence, \(G[C_1]\) is also connected. \(\square \)
Theorem 5
Let \(G=(V,E)\) be a graph with minimum degree \(\lceil \frac{(c-1)\cdot |V|}{c} \rceil \) where c is the size of an inclusionwise maximal clique in G, i.e. such a clique is not a subgraph of another clique. Then, G has a connected 2-community structure which can be found in polynomial time.
Proof
If \(c\ge |V|-1\), then for any vertex \(u\in V\), \(d(u)\ge \lceil \frac{(|V|-2)\cdot |V|}{|V|-1} \rceil \ge |V|-3\) and the rest follows from Theorem 4.
Now we prove that the partition \(\{C,V\setminus C\}\) is connected, which is obviously true for G[C]. Suppose that \(G[V\setminus C]\) is disconnected and let A be the smallest connected component of \(G[V\setminus C]\). Notice that \(|A|\le \frac{|V|-c}{2}\) and let \(u\in A\). Then \(\frac{(c-1)\cdot |V|}{c} \le d(u)\le \frac{|V|-c}{2}+c-2\) and hence \(|V|\le \frac{c\cdot (c-4)}{c-2}<c\), which is impossible. Therefore, \(G[V\setminus C]\) is a connected subgraph. \(\square \)
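The final implication in this proof can be checked step by step. The following is a worked version of that arithmetic only, assuming \(c\ge 3\) (for \(c=2\) the third line already reads \(0\le -4\), which is impossible by itself):
$$\begin{aligned} \frac{(c-1)\cdot |V|}{c}&\le \frac{|V|-c}{2}+c-2 \\ 2(c-1)\cdot |V|&\le c\cdot |V|+c^2-4c \\ (c-2)\cdot |V|&\le c\cdot (c-4) \\ |V|&\le \frac{c\cdot (c-4)}{c-2}<c. \end{aligned}$$
The second line is the first one multiplied by 2c, and the final strict inequality holds because \(c-4<c-2\); it contradicts \(|V|\ge c\).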
4 Balanced 2-Community Structure
In this section we study the complexity of the problems related to a balanced 2-community structure. First we prove that every graph of maximum degree 3 has a balanced weak 2-community structure that can be found in polynomial time. The structural properties of low-degree graphs are crucial to obtain such a result. In general graphs, the Balanced Weak 2-community and Balanced 2-community problems are NP-complete, as shown later in this section. The latter result is contained as the main result in [10]; an alternative shorter proof is presented in this section. Both NP-completeness results are extended to a connected balanced 2-community structure.
Remark 1
Due to Theorem 2, every graph of maximum degree 3 has a 2-community structure, but this is not true for a balanced 2-community structure, see Fig. 2. The graph is obtained by linking three “cross gadgets”. First notice that if a balanced 2-community structure exists for the graph, then all vertices of each cross gadget must be in the same part. Indeed, each vertex of such a community structure must have two neighbours in its own part. On the other hand, it is impossible to split this graph into two balanced parts without splitting a cross gadget.
Theorem 6
Any graph of maximum degree 3 with at least 4 vertices has a balanced weak 2-community structure. Moreover, such a community structure can be found in polynomial time.
Proof

each vertex of degree 1 fulfils the condition (2), even if its neighbour is not in its own part,

each vertex of degree 2 or 3, which has at least one neighbour in its own part, satisfies the condition (2). Therefore, the only vertices which may not satisfy the condition (2) are vertices of degree 2 or 3 which have no neighbour in their own part.
 (S1)
If both parts contain a vertex of degree 2 or 3 that has no neighbour in its own part (say \(v_1\in C_1\), \(v_2\in C_2\)), then update: \(C_1:=C_1\cup \{v_2\} \backslash \{v_1\}, C_2:=C_2\cup \{v_1\}\backslash \{v_2\}\).
 (S2)
If there is only one part that contains a vertex v of degree 2 or 3 that has no neighbour in its own part (without loss of generality suppose \(v\in C_1\)), then choose a vertex \(w\in C_2\) such that w has at least one neighbour in \(C_1\) and update: \(C_1:=C_1\cup \{w\} \backslash \{v\}, C_2:=C_2\cup \{v\}\backslash \{w\}\).
Moreover, the partition remains balanced after each step (S1) or (S2). Besides, the cut size between the parts \(C_{1}\) and \(C_{2}\) always decreases (by at least 2 in case (S1), by at least 1 in case (S2)), so after a finite number of iterations (bounded trivially by \(O(|V|^2)\)), every vertex of degree 2 or 3 has at least one neighbour in its own part. Hence, the algorithm returns a balanced weak 2-community structure. \(\square \)
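The swap steps (S1) and (S2) can be sketched in Python as follows. Several points are assumptions of this sketch rather than statements of the paper: it starts from an arbitrary balanced partition, it is restricted to connected graphs with an even number of vertices (so both parts have the same size and condition (2) is taken to mean \(d_{in}(v)+1\ge d_{out}(v)\)), and in step (S2) it picks w with a neighbour in the other part different from v, which guarantees that the cut size drops by at least one. The function name is hypothetical.

```python
def balanced_weak_2_community(adj):
    """adj: dict vertex -> set of neighbours.

    Assumes a connected graph of maximum degree 3 on an even number
    (at least 4) of vertices. Returns the two parts as sets.
    """
    vs = sorted(adj)
    half = len(vs) // 2
    c1, c2 = set(vs[:half]), set(vs[half:])  # arbitrary balanced start

    def lonely(part):
        # Some vertex of degree 2 or 3 with no neighbour in its own part.
        return next((v for v in part
                     if len(adj[v]) >= 2 and not adj[v] & part), None)

    while True:
        v1, v2 = lonely(c1), lonely(c2)
        if v1 is None and v2 is None:
            return c1, c2
        if v1 is not None and v2 is not None:
            out1, out2 = v1, v2                      # step (S1)
        elif v1 is not None:                         # step (S2), v in C1
            out1 = v1
            out2 = next(w for w in c2 if adj[w] & (c1 - {v1}))
        else:                                        # step (S2), v in C2
            out2 = v2
            out1 = next(w for w in c1 if adj[w] & (c2 - {v2}))
        # Exchange the two chosen vertices; the partition stays balanced.
        c1 = (c1 - {out1}) | {out2}
        c2 = (c2 - {out2}) | {out1}
```

Every exchange strictly decreases the number of edges between the two parts, so the loop ends after at most \(|E|\) iterations.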
Remark 2
Notice that Theorem 6 cannot be extended to the connected case. There exist graphs of maximum degree 3 in which every balanced weak 2-community structure is disconnected, see Fig. 3 for an example.
Remark 3
It can be observed that the Balanced 2community problem (hence also Balanced Weak 2community) is polynomially solvable for graphs with bounded treewidth. Such result follows directly from [3] where the t Decomposition problem closely related to communities was studied. The input to the t Decomposition problem is a graph \(G = (V, E)\), an integervalued function \(t=t(n)\) such that \(0\le t(n)\le n\) for every \(n\in \mathrm{I\!N}\), and two functions \(a,b : V\rightarrow \mathrm{I\!N}\) such that \(a(v), b(v) \le d(v)\), for all \(v\in V\). The problem consists of deciding if there is a partition \(\{V_1,V_2\}\) of V with \(V_1=t(V)\) such that \(d_{G[V_1]}(v)\ge a(v)\) for every \(v\in V_1\) and \(d_{G[V_2]}(v)\ge b(v)\) for every \(v\in V_2\).
In order for \(\{V_1,V_2\}\) to be a balanced 2community structure with \(V_1 \ge V_2\), every \(v\in V_1\) must satisfy the condition \(\frac{d_{G[V_1]}(v)}{\lceil n/2\rceil 1}\ge \frac{d(v)d_{G[V_1]}(v)}{\lfloor n/2\rfloor }\) and analogously for every \(v\in V_2\) must hold \(\frac{d_{G[V_2]}(v)}{\lfloor n/2\rfloor 1}\ge \frac{d(v)d_{G[V_2]}(v)}{\lceil n/2\rceil }\). Thus, Balanced 2community can be condidered as the t Decomposition problem for selected values of the functions t, a, b. The conditions for Balanced 2community can be transformed to the conditions of the t Decomposition problem where \(t(n)=\lceil \frac{n}{2}\rceil \), \(a(v)=b(v)=\lceil \frac{ n/2 1}{n1} d(v)\rceil \) for n even and \(a(v)=\lceil d(v)/2 \rceil \), \(b(v)=\lceil \frac{(n1)/21}{n1}d(v)\rceil \) for n odd.
Since the t Decomposition problem was proved to be polynomial-time solvable for bounded treewidth in [3], we can conclude the same result for the Balanced 2-community problem. Notice that the result cannot be extended to the connected case for all graphs; see the tree in Fig. 3 for a counterexample.
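The translation in Remark 3 can be checked on small graphs. The following Python sketch is only an illustration under stated assumptions: the function names (`thresholds`, `satisfies_directly`, `satisfies_thresholds`, `formulations_agree`) and the adjacency-dictionary representation are ours, not taken from [3], and the check is a brute-force enumeration rather than the bounded-treewidth algorithm. It computes \(t(n)\), \(a(v)\), \(b(v)\) as stated above and compares the resulting degree conditions with the balanced 2-community inequalities on every candidate part \(V_1\) of size \(\lceil n/2\rceil \).

```python
from fractions import Fraction
from itertools import combinations
from math import ceil


def thresholds(n, degree):
    """Return (t, a, b) from Remark 3 for a vertex of the given degree."""
    t = (n + 1) // 2  # ceil(n / 2), the size required for V_1
    if n % 2 == 0:
        a = b = ceil(Fraction(n // 2 - 1, n - 1) * degree)
    else:
        a = ceil(Fraction(degree, 2))
        b = ceil(Fraction((n - 1) // 2 - 1, n - 1) * degree)
    return t, a, b


def satisfies_directly(adj, v1):
    """Check the balanced 2-community inequalities, cross-multiplied."""
    v1 = set(v1)
    v2 = set(adj) - v1
    for own, other in ((v1, v2), (v2, v1)):
        for v in own:
            d_in = len(adj[v] & own)
            d_out = len(adj[v]) - d_in
            # d_in / (|own| - 1) >= d_out / |other|
            if d_in * len(other) < d_out * (len(own) - 1):
                return False
    return True


def satisfies_thresholds(adj, v1):
    """Check d_{G[V_1]}(v) >= a(v) on V_1 and d_{G[V_2]}(v) >= b(v) on V_2."""
    n = len(adj)
    v1 = set(v1)
    v2 = set(adj) - v1
    for v in adj:
        _, a, b = thresholds(n, len(adj[v]))
        own, bound = (v1, a) if v in v1 else (v2, b)
        if len(adj[v] & own) < bound:
            return False
    return True


def formulations_agree(adj):
    """Compare both checks on every candidate V_1 of size t(n)."""
    t = (len(adj) + 1) // 2
    return all(
        satisfies_directly(adj, v1) == satisfies_thresholds(adj, v1)
        for v1 in combinations(adj, t)
    )


# Example graph (an assumption for illustration): the cycle on 6 vertices.
cycle6 = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
print(thresholds(6, 2))
print(satisfies_directly(cycle6, {0, 1, 2}))
print(formulations_agree(cycle6))
```

Exact rational arithmetic (`Fraction`) is used so that the ceilings are not affected by floating-point rounding.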
Now we focus on the problem of Balanced 2-community in general graphs. In [8] it has been proved that finding a connected balanced partition without any additional constraints is an NP-complete problem in general graphs. We prove similar results for Balanced Weak 2-community and Balanced 2-community and their connected variants. To show that Balanced Weak 2-community is NP-complete, we use a reduction from the Balanced Co-Satisfactory Partition problem, proved to be NP-complete in [5].
The problem is defined as follows:
Balanced Co-Satisfactory Partition
Input : A graph \(G=(V,E)\) on an even number of vertices.
Question : Is there a balanced partition \(\{C_1,C_2\}\) of V such that for every \(v\in V\), \(d_{in}(v)\le d_{out}(v)\)?
Theorem 7
Balanced Weak 2-community is NP-complete.
Proof
The problem is clearly in NP. In the following we define a polynomial-time reduction from Balanced Co-Satisfactory Partition to Balanced Weak 2-community. Let G be a graph on an even number n of vertices as an instance of Balanced Co-Satisfactory Partition, and let \(\overline{G}\), the complement of G, be the corresponding instance of Balanced Weak 2-community. We claim that if G admits a balanced co-satisfactory partition \(\{C_1,C_2\}\), then \(\{C_1,C_2\}\) is also a balanced weak 2-community structure in \(\overline{G}\). Indeed, suppose \(d_{in}(v)\le d_{out}(v)\) for every vertex \(v\in V\) (in the graph G). Let \(\bar{d}_{in}(v)\) (resp. \(\bar{d}_{out}(v)\)) be the number of in-neighbours (resp. out-neighbours) of v in \(\overline{G}\). Then \(\bar{d}_{in}(v)+1=\frac{n}{2}-d_{in}(v)\ge \frac{n}{2}-d_{out}(v)=\bar{d}_{out}(v)\), which is condition (2) for a balanced partition. Conversely, any balanced weak 2-community structure in \(\overline{G}\) is a balanced co-satisfactory partition in G. \(\square \)
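The reduction in the proof of Theorem 7 amounts to taking the complement of the input graph. The Python sketch below is a hedged illustration: the helper names and the adjacency-dictionary format are our assumptions, and the weak condition is used in the balanced form \(d_{in}(v)+1\ge d_{out}(v)\) that appears in the proof. It checks by brute force that a balanced partition is co-satisfactory in G exactly when it is a balanced weak 2-community structure in \(\overline{G}\).

```python
from itertools import combinations


def complement(adj):
    """Return the complement graph as an adjacency dictionary of sets."""
    vertices = set(adj)
    return {v: vertices - adj[v] - {v} for v in adj}


def in_out_degrees(adj, own, v):
    """Return (d_in, d_out) of v with respect to its own part."""
    d_in = len(adj[v] & own)
    return d_in, len(adj[v]) - d_in


def is_co_satisfactory(adj, c1):
    """Every vertex has at most as many neighbours inside as outside."""
    c1 = set(c1)
    c2 = set(adj) - c1
    for own in (c1, c2):
        for v in own:
            d_in, d_out = in_out_degrees(adj, own, v)
            if d_in > d_out:
                return False
    return True


def is_balanced_weak_2community(adj, c1):
    """Balanced form of the weak condition: d_in(v) + 1 >= d_out(v)."""
    c1 = set(c1)
    c2 = set(adj) - c1
    for own in (c1, c2):
        for v in own:
            d_in, d_out = in_out_degrees(adj, own, v)
            if d_in + 1 < d_out:
                return False
    return True


def reduction_is_consistent(adj):
    """Compare both properties on every balanced partition of an even graph."""
    co = complement(adj)
    half = len(adj) // 2
    return all(
        is_co_satisfactory(adj, c1) == is_balanced_weak_2community(co, c1)
        for c1 in combinations(adj, half)
    )


# Example graph (an assumption for illustration): the cycle on 4 vertices.
cycle4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
print(complement(cycle4))
print(is_co_satisfactory(cycle4, {0, 2}))
print(reduction_is_consistent(cycle4))
```

Since the complement can be built in time quadratic in n, the sketch also makes the polynomial running time of the reduction explicit.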
The proof of the NP-completeness of Balanced Co-Satisfactory Partition in [5] is based on graphs \(G=(V, E)\), where \(V= F\cup T\cup V_0\), with some additional properties: F and T are independent sets, there are no edges between T and \(V_0\), and there is a vertex \(f\in F\) that is not adjacent to any vertex of \(V_0\). Any balanced co-satisfactory partition \(\{C_1, C_2\}\) of V must have the following structure: \(C_1=F\cup S\) and \(C_2=T\cup (V_0\setminus S)\) where \(S\subseteq V_0\). If \(\overline{G}\) is an instance of Balanced Weak 2-community (constructed following the proof of Theorem 7), one can see that \(C_1\) is connected since f is adjacent to all vertices in \(F \cup S\), and \(C_2\) is connected since T is a clique and every vertex of T is adjacent to every vertex of \(V_0\setminus S\). Hence we can conclude that even the connected version of Balanced Weak 2-community is NP-complete.
Theorem 8
Connected Balanced Weak 2-community is NP-complete.
Estivill-Castro et al. [10] have shown that Balanced 2-community is NP-complete by constructing a reduction from a variant of the Clique problem. We propose a shorter alternative proof which is also valid for the Connected Balanced 2-community problem. The proof is based on the NP-complete problem Balanced Satisfactory Partition, which was introduced by Bazgan et al. [4] as follows:
Balanced Satisfactory Partition
Input : A graph \(G=(V,E)\) on an even number of vertices.
Question : Is there a balanced partition \(\{C_1,C_2\}\) of V such that for every \(v\in V\), \(d_{in}(v)\ge \frac{d(v)}{2}\)?
It can be proved that these two problems are in fact equivalent when the number of vertices is even.
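Lemma 6
Let \(\{C_1,C_2\}\) be a partition of the vertex set of a graph on n vertices and let \(v\in C_1\). Then the following three assertions are equivalent:
 1. \(\frac{d_{in}(v)}{|C_1|-1}\ge \frac{d(v)}{n-1}\)
 2. \(\frac{d_{out}(v)}{|C_2|}\le \frac{d(v)}{n-1}\)
 3. \(\frac{d_{in}(v)}{|C_1|-1}\ge \frac{d_{out}(v)}{|C_2|}\)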
Proof
\((1)\Leftrightarrow (2)\) : \(\frac{d_{in}(v)}{d(v)}\ge \frac{|C_1|-1}{n-1} \Leftrightarrow 1-\frac{d_{out}(v)}{d(v)}\ge \frac{n-|C_2|-1}{n-1} \Leftrightarrow 1-\frac{n-|C_2|-1}{n-1}\ge \frac{d_{out}(v)}{d(v)} \Leftrightarrow \frac{d_{out}(v)}{d(v)}\le \frac{|C_2|}{n-1}\)
\((3)\Leftrightarrow (1)\) : \(\frac{d_{in}(v)}{|C_1|-1}\ge \frac{d_{out}(v)}{|C_2|} \Leftrightarrow \frac{d_{in}(v)}{|C_1|-1}\ge \frac{d(v)-d_{in}(v)}{n-|C_1|} \Leftrightarrow d_{in}(v) \left[ \frac{1}{|C_1|-1} + \frac{1}{n-|C_1|}\right] \ge \frac{d(v)}{n-|C_1|} \Leftrightarrow \frac{d_{in}(v)}{d(v)}\ge \frac{|C_1|-1}{n-1}\) \(\square \)
Note. The third assertion in Lemma 6 is condition (1) of a 2-community structure.
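As a sanity check of Lemma 6, the three inequalities can be evaluated with exact rational arithmetic for all small parameter values. The Python sketch below is illustrative only; the function names are ours, and the enumeration assumes \(|C_1|\ge 2\) and \(|C_2|\ge 1\) so that no denominator vanishes.

```python
from fractions import Fraction


def lemma6_assertions(n, c1, d_in, d_out):
    """Evaluate assertions 1-3 for a vertex v in C_1 with |C_1| = c1."""
    c2 = n - c1
    d = d_in + d_out
    first = Fraction(d_in, c1 - 1) >= Fraction(d, n - 1)
    second = Fraction(d_out, c2) <= Fraction(d, n - 1)
    third = Fraction(d_in, c1 - 1) >= Fraction(d_out, c2)
    return first, second, third


def all_agree(max_n):
    """Check that the three assertions coincide for every n up to max_n."""
    for n in range(3, max_n + 1):
        for c1 in range(2, n):              # |C_1| >= 2 and |C_2| >= 1
            c2 = n - c1
            for d_in in range(c1):          # at most |C_1| - 1 neighbours inside
                for d_out in range(c2 + 1): # at most |C_2| neighbours outside
                    if len(set(lemma6_assertions(n, c1, d_in, d_out))) != 1:
                        return False
    return True


# Two sample vertices in a part of size 3 of a graph on 6 vertices.
print(lemma6_assertions(6, 3, 1, 2))
print(lemma6_assertions(6, 3, 2, 1))
print(all_agree(12))
```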
Lemma 7
Let \(G=(V,E)\) be a graph with an even number n of vertices and \(\{C_1, C_2\}\) be a balanced partition of V. Then for any vertex \( v\in V\), \(d_{in}(v)=\frac{n/2-1}{n-1} d(v)\) if and only if \(d(v)=n-1\).
Proof
If \(d(v)=n-1\), then clearly \(d_{in}(v)=\frac{n}{2}-1\). Suppose now that \(d_{in}(v)=\frac{n/2-1}{n-1} d(v)\). Notice that \((-2)(\frac{n}{2}-1) + 1\cdot (n-1) = 1\), from which it follows that \(\frac{n}{2}-1\) and \(n-1\) have no common divisor greater than 1. This implies that d(v) is a multiple of \(n-1\). Thus, \(d(v)=n-1\). \(\square \)
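The coprimality argument in the proof of Lemma 7 can be confirmed numerically. The following Python sketch is a small illustration; `equality_degrees` is a name introduced here, and vertices of degree 0 are left out of the enumeration, which is our assumption. It lists the degrees for which the equality \(d_{in}(v)=\frac{n/2-1}{n-1}d(v)\) admits an integer solution.

```python
from math import gcd


def equality_degrees(n):
    """Degrees d in 1..n-1 admitting an integer d_in with
    d_in * (n - 1) == (n/2 - 1) * d, for an even number n of vertices."""
    half = n // 2
    result = []
    for d in range(1, n):
        # In a balanced partition v has at most n/2 - 1 neighbours inside.
        for d_in in range(min(d, half - 1) + 1):
            if d_in * (n - 1) == (half - 1) * d:
                result.append(d)
                break
    return result


for n in (4, 6, 8, 10):
    print(n, gcd(n // 2 - 1, n - 1), equality_degrees(n))
```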
Note. Let \(\{C_1,C_2\}\) be a balanced partition of G and \(v\in C_1\) be a vertex of degree \(n-1\). Since v has \(\frac{n}{2}-1\) neighbours in its own part and \(\frac{n}{2}\) in the other part, v does not satisfy the condition of Balanced Satisfactory Partition. However, v satisfies the Balanced 2-Community condition since \(\frac{d_{in}(v)}{|C_1|-1}=1\).
Proposition 1
For any graph with n vertices and maximum degree \((n-2)\) the problems Balanced Satisfactory Partition and Balanced 2-Community are equivalent.
Proof
Suppose that \(G=(V, E)\) is a yes-instance of Balanced Satisfactory Partition. Hence there exists a balanced partition \(\{C_1,C_2\}\) of V such that any vertex \(v\in V\) satisfies the condition \(d_{in}(v)\ge \frac{1}{2} d(v)\), which implies that \(d_{in}(v)\ge \frac{|C_1|-1}{2|C_1|-1} d(v) = \frac{|C_1|-1}{n-1}d(v)\). Thus, G is a yes-instance of Balanced 2-Community.
Suppose now that G is a yes-instance of Balanced 2-Community. Hence there exists a balanced partition \(\{C_1, C_2\}\) of V such that any vertex \(v\in V\) satisfies the condition \(d_{in}(v)\ge \frac{|C_1|-1}{|C_2|} d_{out}(v)\), which is equivalent to \(d_{in}(v)\ge \frac{|C_1|-1}{n-1} d(v)\) by Lemma 6. According to Lemma 7, since no vertex has degree \(n-1\), there is no vertex v such that \(d_{in}(v)=\frac{|C_1|-1}{n-1} d(v)\).
Suppose, for contradiction, that some vertex v satisfies \(\frac{|C_1|-1}{n-1} d(v)<d_{in}(v)<\frac{1}{2}d(v)\). Then d(v) cannot be even, since otherwise \(\frac{d(v)}{2}\) would be an integer and, as \(d(v)<n-1\) gives \(\frac{d(v)}{2}-1<\frac{|C_1|-1}{n-1} d(v)\), \(d_{in}(v)\) could not be an integer. Hence d(v) is odd; let \(d(v)=2p+1\) for some integer p. We arrive at a contradiction by showing that \(p<d_{in}(v) < p+\frac{1}{2}\). Notice that \( d(v)<n-1\Rightarrow \frac{d(v)-1}{2}<\frac{|C_1|-1}{n-1} d(v)\), which implies \(p<\frac{|C_1|-1}{n-1} d(v)<d_{in}(v)\). Therefore necessarily \(d_{in}(v)\ge \frac{1}{2} d(v)\) for every vertex \(v\in V\), that is, G is a yes-instance of Balanced Satisfactory Partition. \(\square \)
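Proposition 1 can be tested empirically on small graphs. The Python sketch below is a brute-force illustration under our own assumptions (adjacency dictionaries, helper names, and a seeded random sample of graphs on 6 vertices); it is not part of the proof. It compares the two conditions on every balanced partition and also reproduces the exception noted after Lemma 7 for vertices of degree \(n-1\), using the complete graph on 4 vertices.

```python
import random
from itertools import combinations


def in_out_degrees(adj, own, v):
    """Return (d_in, d_out) of v with respect to its own part."""
    d_in = len(adj[v] & own)
    return d_in, len(adj[v]) - d_in


def is_satisfactory(adj, c1):
    """Every vertex has at least half of its neighbours in its own part."""
    c1 = set(c1)
    c2 = set(adj) - c1
    return all(
        2 * in_out_degrees(adj, own, v)[0] >= len(adj[v])
        for own in (c1, c2)
        for v in own
    )


def is_2community(adj, c1):
    """d_in / (|own| - 1) >= d_out / |other|, cross-multiplied."""
    c1 = set(c1)
    c2 = set(adj) - c1
    for own, other in ((c1, c2), (c2, c1)):
        for v in own:
            d_in, d_out = in_out_degrees(adj, own, v)
            if d_in * len(other) < d_out * (len(own) - 1):
                return False
    return True


def conditions_agree(adj):
    """Compare both conditions on every balanced partition."""
    half = len(adj) // 2
    return all(
        is_satisfactory(adj, c1) == is_2community(adj, c1)
        for c1 in combinations(adj, half)
    )


def random_graph(n, p, rng):
    """Sample a graph on n vertices with independent edge probability p."""
    adj = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj


rng = random.Random(0)
sample = [random_graph(6, 0.5, rng) for _ in range(200)]
low_degree = [g for g in sample if max(len(nb) for nb in g.values()) <= 4]
print(all(conditions_agree(g) for g in low_degree))

k4 = {v: {u for u in range(4) if u != v} for v in range(4)}
print(conditions_agree(k4))
```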
Balanced Satisfactory Partition has already been proved NP-complete in [4], even if both parts are required to be connected. Moreover, the reduction used in [4] does not construct a graph with vertices of degree \(n-1\).
Thus we obtain a result similar to that in [10] (the authors mention in their proof that the technique used also works in the connected case).
Theorem 9
Connected Balanced 2-Community is NP-complete.
5 Conclusion and Open Problems
An interesting open question is to determine whether every graph of size at least 4 (except stars) has a 2-community structure, even a connected one. In this paper we prove that the statement is true for trees, graphs of maximum degree 3, graphs of minimum degree at least \(|V|-3\) and some other graph classes. Furthermore, such a structure can be found in polynomial time. The question remains open even for a weak 2-community structure, where partial positive results are only known for the same graph classes.
In the case of Balanced 2-Community the situation is different. We show that any graph of maximum degree 3 has a balanced weak 2-community structure, while we present a graph without a balanced 2-community structure within the same class. Computationally speaking, finding a balanced weak 2-community structure can be done in polynomial time in graphs of maximum degree 3, while the Balanced 2-Community problem is NP-complete in general graphs, just as its weak version. The results are similar for connected communities.
To get a better understanding of community structures, some interesting problems are left open: to extend the 2-community results to other graph classes, to characterise graph classes where the existential/complexity results for 2-community/weak 2-community problems and their connected versions differ, or to generalise the results to k-communities for a fixed k, \(k\ge 3\).
References
 1. Aharoni, R., Milner, E.C., Prikry, K.: Unfriendly partitions of a graph. J. Comb. Theory B 50, 1–10 (1990)
 2. Andreev, K., Racke, H.: Balanced graph partitioning. Theory Comput. Syst. 39(6), 929–939 (2006)
 3. Bazgan, C., Tuza, Z., Vanderpooten, D.: Degree-constrained decompositions of graphs: bounded treewidth and planarity. Theor. Comput. Sci. 355(3), 389–395 (2006)
 4. Bazgan, C., Tuza, Z., Vanderpooten, D.: The satisfactory partition problem. Discret. Appl. Math. 154(8), 1236–1245 (2006)
 5. Bazgan, C., Tuza, Z., Vanderpooten, D.: Approximation of satisfactory bisection problems. J. Comput. Syst. Sci. 74(5), 875–883 (2008)
 6. Bazgan, C., Tuza, Z., Vanderpooten, D.: Satisfactory graph partition, variants, and generalizations. Eur. J. Oper. Res. 206(2), 271–280 (2010)
 7. Buluç, A., Meyerhenke, H., Safro, I., Sanders, P., Schulz, C.: Recent advances in graph partitioning. arXiv:1311.3144
 8. Chlebikova, J.: Approximating the maximally balanced connected partition problem in graphs. Inf. Process. Lett. 60(5), 223–230 (1996)
 9. Delling, D., Goldberg, A., Pajor, T., Werneck, R.: Customizable route planning. In: Proceedings of 10th International Symposium on Experimental Algorithms, LNCS 6630, pp. 376–387 (2011)
 10. Estivill-Castro, V., Parsa, M.: On connected two communities. In: Proceedings of the 36th Australasian Computer Science Conference (ACSC), pp. 23–30 (2013)
 11. Estivill-Castro, V., Parsa, M.: Hardness and tractability of detecting connected communities. In: Proceedings of the Australasian Computer Science Week Multiconference (ACSW), Article No. 25 (2016)
 12. Feldmann, A.E., Foschini, L.: Balanced partitions of trees and applications. Algorithmica 71(2), 354–376 (2015)
 13. Flake, G., Lawrence, S., Giles, C.L.: Efficient identification of web communities. In: Proceedings 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM Press, pp. 150–160 (2000)
 14. Fortunato, S.: Community detection in graphs. Phys. Rep. 486, 75–174 (2010)
 15. Yero, I.G., Rodríguez-Velázquez, J.A.: Defensive alliances in graphs: a survey (2013). arXiv:1308.2096
 16. Kristiansen, P., Hedetniemi, S.M., Hedetniemi, S.T.: Alliances in graphs. J. Comb. Math. Comb. Comput. 48, 157–177 (2004)
 17. Newman, M.E.J.: Detecting community structure in networks. Eur. Phys. J. B - Condens. Matter Complex Syst. 38(2), 321–330 (2004)
 18. Olsen, M.: A general view on computing communities. Math. Soc. Sci. 66(3), 331–336 (2013)
 19. Schaeffer, S.E.: Graph clustering. Comput. Sci. Rev. 1, 27–64 (2007)
 20. Shafique, K.H.: Partitioning a graph in alliances and its application to data clustering. Ph.D. Thesis, School of Computer Science, University of Central Florida, Orlando (2004)
 21. Shmoys, D.B.: Cut problems and their application to divide-and-conquer. In: Approximation Algorithms for NP-Hard Problems, PWS Publishing, pp. 192–235 (1996)
Copyright information
Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.