We shall prove our lower bound for undirected graphs, which makes the bound as general as possible. The cost of a refining operation (R, S) in a graph G is
$$\operatorname{cost}(R,S):=|\{(u,v) \mid uv\in E(G),u\in R, v\in S\}|. $$
This is essentially the number of edges between R and S, except that edges with both ends in R ∩ S are counted twice. For a partition π that admits a refining operation (R, S), denote by π(R, S) the partition that results from this operation.
Definition 14
Let G = (V, E) be a graph, and π be a partition of V.
-
If π is stable, then cost(π):=0.
-
Otherwise, \(\operatorname{cost}(\pi):=\min_{R,S} \operatorname{cost}(\pi(R,S))+\operatorname{cost}(R,S)\), where the minimum is taken over all effective refining operations (R, S) that can be applied to π.
Note that this is well-defined; if π is unstable, then there exists at least one effective elementary refining operation (R, S), and for any such operation, |π(R, S)|>|π|. We can now formulate the main result of this section.
Theorem 15
For every integer k ≥ 2, there is a graph \(G_k\) with \(n\in O(2^{k}k)\) vertices and \(m\in O(2^{k}k^{2})\) edges, such that cost(α) ∈ Ω((m+n)log n), where α is the unit partition of \(V(G_k)\).
Note that this theorem implies a complexity lower bound for all partition-refinement based algorithms for colour refinement, as discussed in the introduction. We will first prove some basic observations related to the above definitions, then give the construction of the graph, and finally prove Theorem 15.
Basic Observations
We start with two basic properties of stable partitions. The first proposition follows easily from the definitions.
Proposition 16
Let G=(V,E) be a graph, and π be a stable partition of V. For any π-closed subset S⊆V, π[S] is a stable partition for G[S].
Proposition 17
Let G=(V,E) be a graph, and π be a stable partition of V. For any π-closed set S and vertices u,v ∈ V: if the distance from u to S is different from the distance from v to S, then \(u\not\approx_{\pi} v\).
Proof 12
Denote the distance from a vertex x to S by dist(x, S). W.l.o.g. we may assume that dist(u, S) < dist(v, S), so in particular dist(u, S) is finite. We prove the statement by induction over dist(u, S). If dist(u, S) = 0, then u ∈ S but v ∉ S. Since S is π-closed, this implies \(u\not\approx_{\pi} v\). Otherwise, u is adjacent to a vertex w with dist(w, S) = dist(u, S)−1, but v is not. Let R ∈ π be the cell with w ∈ R. Then by induction, |N(v) ∩ R| = 0, so \(u\not\approx_{\pi} v\), since π is stable. □
For a partition π of V, denote by \(\pi_{\infty}\) the coarsest stable partition of V that refines π.
Proposition 18
Let π and ρ be partitions of V such that \(\pi\preceq\rho\preceq\pi_{\infty}\). Then cost(π) ≥ cost(ρ).
Proof 13
Let (R, S) be a refining operation that can be applied to π, which yields \(\pi^{\prime}\). Then it can be observed that the operation (R, S) can also be applied to ρ, and that for the resulting partition \(\rho^{\prime}\), it holds again that \(\pi^{\prime}\preceq\rho^{\prime}\preceq\pi_{\infty}\) (Proposition 2 shows that \(\rho^{\prime}\preceq\pi_{\infty}\)).
An induction proof based on this observation shows that a minimum-cost sequence of refining operations that refines π to \(\pi_{\infty}\) can also be applied to ρ, to yield the stable partition \(\pi_{\infty}\), at the same cost. Therefore, cost(π) ≥ cost(ρ). □
A refining operation (R, S) on π is elementary if both R ∈ π and S ∈ π. The next proposition shows that adding the word ‘elementary’ in Definition 14 yields an equivalent definition.
Proposition 19
Let π be an unstable partition of V(G). Then
$$\operatorname{cost}(\pi)=\min\limits_{R,S} \operatorname{cost}(\pi(R,S)) + \operatorname{cost}(R,S), $$
where the minimum is taken over all effective elementary refining operations (R,S) that can be applied to π.
Proof 14
Let (R, S) be a nonelementary refining operation for π, and let \(\rho_{1}\) be the result of applying (R, S) to π. We shall prove that there is a sequence of elementary refining operations of total cost at most cost(R, S) that, when applied to π, yields a partition \(\rho_{2}\) that refines \(\rho_{1}\). The claim then follows by Proposition 18.
Suppose that R consists of the cells \(R_{1},\ldots,R_{q}\) and S consists of the cells \(S_{1},\ldots,S_{p}\). We apply the elementary refining operations \((R_{i}, S_{j})\) for all i ∈ {1,…,q}, j ∈ {1,…,p} in an arbitrary order, and let \(\rho_{2}\) be the resulting partition. The cost of these elementary refinements is
$$\begin{array}{@{}rcl@{}} \sum\limits_{i,j}\operatorname{cost}(R_{i},S_{j})&={\sum}_{i,j}|\{(u,v)\mid uv\in E(G), u\in R_{i},v\in S_{j}\}| \\ &=|\{(u,v)\mid uv\in E(G), u\in R,v\in S\}|=\operatorname{cost}(R,S). \end{array} $$
It is easy to see that \(\rho_{2}\) refines \(\rho_{1}\). Indeed, if u, v ∈ S belong to the same class of \(\rho_{2}\), then they belong to the same class \(S_{j}\), and for all classes \(R_{i}\) they have the same number of neighbours in \(R_{i}\). Hence they have the same number of neighbours in \(R=\bigcup _{i}R_{i}\), and this means that they belong to the same class of \(\rho_{1}\). □
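Since Proposition 19 lets us restrict attention to elementary refining operations, Definition 14 can be evaluated directly on small instances by a brute-force recursion. The following sketch does this for a toy graph of our own choosing (a path on four vertices, not from the text), memoising over partitions represented as frozensets of frozensets.

```python
from functools import lru_cache

# Toy undirected graph, our own example (not from the text): the path 0-1-2-3.
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}

def split(pi, R, S):
    """Result of the elementary refining operation (R, S): the cell S is
    split according to each vertex's number of neighbours in R."""
    groups = {}
    for v in S:
        groups.setdefault(len(adj[v] & R), set()).add(v)
    return frozenset(c for c in pi if c != S) | frozenset(
        frozenset(g) for g in groups.values())

def op_cost(R, S):
    # cost(R, S) = |{(u, v) | uv in E(G), u in R, v in S}|
    return sum(len(adj[u] & S) for u in R)

@lru_cache(maxsize=None)
def cost(pi):
    """cost(pi) as in Definition 14, restricted to elementary operations
    (which suffices by Proposition 19)."""
    best = None
    for R in pi:
        for S in pi:
            rho = split(pi, R, S)
            if rho != pi:  # only effective operations count
                c = op_cost(R, S) + cost(rho)
                best = c if best is None else min(best, c)
    return 0 if best is None else best  # pi is stable: cost 0

alpha = frozenset([frozenset(adj)])  # the unit partition of V
```

On this path, the only effective elementary operation on α is (V, V), of cost 2|E| = 6, and the resulting partition {{0, 3}, {1, 2}} is already stable, so cost(α) = 6.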
Construction of the Graph
For \(k\in \mathbb {N}\), denote \(\mathcal {B}_{k}=\{0,\ldots ,2^{k}-1\}\). For ℓ ∈ {0,…,k} and \(q \in \{0,\ldots,2^{\ell}-1\}\), the subset \(\mathcal {B}^{\ell }_{q}=\{q2^{k-\ell },\ldots ,(q+1)2^{k-\ell }-1\}\) is called the q-th binary block of level ℓ. Analogously, for any set of vertices with indices in \(\mathcal {B}_{k}\), we also consider binary blocks. For instance, if \(X=\{x_{i} \mid i\in \mathcal {B}_{k}\}\), then \(X^{\ell }_{q}=\{x_{i} \mid i\in \mathcal {B}^{\ell }_{q}\}\) is called a binary block of X. For such a set X, a partition π of X into binary blocks is a partition where every S ∈ π is a binary block. A key fact about binary blocks that we will often use is that for any ℓ and q, \(\mathcal {B}^{\ell }_{q}=\mathcal {B}^{\ell +1}_{2q}\cup \mathcal {B}^{\ell +1}_{2q+1}\).
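The block arithmetic is simple enough to check mechanically. In the sketch below, the helper name `block` is our own; the final loop verifies the key fact that every block is the disjoint union of its two children.

```python
def block(q, level, k):
    """The q-th binary block of level `level` within B_k = {0, ..., 2^k - 1}."""
    width = 2 ** (k - level)
    return set(range(q * width, (q + 1) * width))

k = 4
# Level 0 has a single block: all of B_k.
assert block(0, 0, k) == set(range(2 ** k))
# Key fact: B^l_q is the disjoint union of B^{l+1}_{2q} and B^{l+1}_{2q+1}.
for level in range(k):
    for q in range(2 ** level):
        left = block(2 * q, level + 1, k)
        right = block(2 * q + 1, level + 1, k)
        assert block(q, level, k) == left | right and not (left & right)
```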
For every integer k ≥ 2, we will construct a graph \(G_k\). (An example for k = 3 is given in Fig. 1.) At its core, this graph consists of the vertex sets \(X=\{x_{i}\mid i\in \mathcal {B}_{k}\}\), \(\mathcal X=\{{x^{j}_{i}}\mid i\in \mathcal {B}_{k},j\in \{1,\ldots ,k\}\}\), \(\mathcal Y=\{{y^{j}_{i}}\mid i\in \mathcal {B}_{k},j\in \{1,\ldots ,k\}\}\) and \(Y=\{y_{i}\mid i\in \mathcal {B}_{k}\}\). Every vertex \(x_{i}\) is adjacent to \({x^{j}_{i}}\) for all j ∈ {1,…,k}, and every \(y_{i}\) is adjacent to all \({y_{i}^{j}}\). Furthermore, for all \(i, j_{1}, j_{2}\) there is an edge between \(x^{j_{1}}_{i}\) and \(y^{j_{2}}_{i}\). (For \(\mathcal X\), binary blocks are subsets of the form \(\mathcal X^{\ell }_{q}:=\{{x^{j}_{i}} \mid i\in \mathcal {B}^{\ell }_{q}, j\in \{1,\ldots ,k\}\}\), and for \(\mathcal Y\) the definition is analogous.)
We add gadgets to the graph to ensure that any sequence of refining operations behaves as follows. After the first step, which distinguishes vertices according to their degrees, X and Y are cells of the resulting partition. Next, X splits up into two binary blocks \({X^{1}_{0}}\) and \({X^{1}_{1}}\) of equal size. This causes \(\mathcal X\) to split up accordingly into \({\mathcal X^{1}_{0}}\) and \({\mathcal X^{1}_{1}}\). One of these cells will be used to halve \(\mathcal Y\) in the same way. This refining operation (R, S) is expensive because [R, S] contains half of the edges between \(\mathcal X\) and \(\mathcal Y\). Next, Y can be split up into \({Y^{1}_{0}}\) and \({Y^{1}_{1}}\). Once this happens, there is a gadget AND\(_{1}\) that causes the two cells \({X^{1}_{0}}\), \({X^{1}_{1}}\) to split up into the four cells \({X^{2}_{q}}\), for q = 0,…,3. Again, this causes cells in \(\mathcal X\), \(\mathcal Y\) and Y to split up in the same way, and to achieve this, half of the edges between \(\mathcal X\) and \(\mathcal Y\) have to be considered. The next gadget AND\(_{2}\) ensures that if both cells of Y are split, then the four cells of X can be halved again, etc. In general, we design a gadget AND\(_{\ell}\) of level ℓ that ensures that if Y is partitioned into \(2^{\ell+1}\) binary blocks of equal size, then X can be partitioned into \(2^{\ell+2}\) binary blocks of equal size. By halving all the cells of X and Y k = Θ(log n) times (with \(n = |V(G_k)|\)), this refinement process ends up with a discrete colouring of these vertices. Since every iteration uses half of the edges between \(\mathcal X\) and \(\mathcal Y\) (which are Θ(m) many, with \(m = |E(G_k)|\)), we get the cost lower bound of Ω(m log n).
We now define these gadgets in more detail. For every integer ℓ ≥ 1, we define a gadget AND\(_{\ell}\), which consists of a graph G together with two out-terminals \(a_{0}, a_{1}\), and an ordered sequence of \(p = 2^{\ell}\) in-terminals \(b_{0},\ldots,b_{p-1}\). For ℓ = 1, the graph G has \(V(G) = \{a_{0}, a_{1}, b_{0}, b_{1}\}\) and \(E(G) = \{a_{0}b_{0}, a_{1}b_{1}\}\). For ℓ = 2, the graph G is identical to the construction of Cai, Fürer and Immerman [8]. (See Fig. 2. The out-terminals \(a_{0}, a_{1}\) and in-terminals \(b_{0},\ldots,b_{3}\) are indicated.) For ℓ ≥ 3, AND\(_{\ell}\) is obtained by taking one copy \(G^{*}\) of an AND\(_{2}\)-gadget, and two copies \(G^{\prime}\) and \(G^{\prime\prime}\) of an AND\(_{\ell-1}\)-gadget, and adding four edges to connect the two pairs of in-terminals of \(G^{*}\) with the pairs of out-terminals of \(G^{\prime}\) and \(G^{\prime\prime}\), respectively. As out-terminals of the resulting gadget we choose the out-terminals of \(G^{*}\). The in-terminal sequence is obtained by concatenating the sequences of in-terminals of \(G^{\prime}\) and \(G^{\prime\prime}\). (See Fig. 3 for an example of AND\(_{3}\).) For any AND\(_{\ell}\)-gadget G with in-terminals \(b_{0},\ldots ,b_{2^{\ell }-1}\), the in-terminal pairs are the pairs \(b_{2p}\) and \(b_{2p+1}\), for all \(p \in \{0,\ldots,2^{\ell-1}-1\}\).
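The recursive construction can be made concrete in code. Since Fig. 2 is not reproduced here, the AND\(_{2}\) adjacency below is a standard CFI-style realisation chosen to exhibit exactly the behaviour described later in Lemma 21 (out-terminals distinguished if and only if all in-terminal pairs are); the vertex labelling may differ from the figure, and all function names are our own. A naive colour refinement routine then lets us check the gadget's behaviour.

```python
import itertools

_ids = itertools.count()

def _edge(adj, u, v):
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

def and_gadget(l, adj):
    """Add an AND_l gadget to adj; return its out-pair and in-terminal list."""
    if l == 1:
        a0, a1, b0, b1 = (next(_ids) for _ in range(4))
        _edge(adj, a0, b0)
        _edge(adj, a1, b1)
        return (a0, a1), [b0, b1]
    if l == 2:
        a = [next(_ids) for _ in range(2)]
        b = [next(_ids) for _ in range(4)]
        # CFI-style middle layer: one vertex c_{ij} per choice of one
        # in-terminal from each pair, wired to out-terminal a_{(i+j) mod 2}.
        for i in (0, 1):
            for j in (0, 1):
                c = next(_ids)
                _edge(adj, c, b[i])
                _edge(adj, c, b[2 + j])
                _edge(adj, c, a[(i + j) % 2])
        return (a[0], a[1]), b
    (p0, p1), bp = and_gadget(l - 1, adj)   # G'
    (q0, q1), bq = and_gadget(l - 1, adj)   # G''
    (o0, o1), bt = and_gadget(2, adj)       # G* on top
    # Four edges: in-terminal pairs of G* to the out-pairs of G' and G''.
    _edge(adj, bt[0], p0)
    _edge(adj, bt[1], p1)
    _edge(adj, bt[2], q0)
    _edge(adj, bt[3], q1)
    return (o0, o1), bp + bq

def refine(adj, col):
    """Naive colour refinement, splitting by multisets of neighbour colours."""
    while True:
        sig = {v: (col[v], tuple(sorted(col[w] for w in adj[v]))) for v in adj}
        ids, new = {}, {}
        for v in adj:
            new[v] = ids.setdefault(sig[v], len(ids))
        if len(set(new.values())) == len(set(col.values())):
            return new
        col = new

adj = {}
(o0, o1), bs = and_gadget(3, adj)
init = {v: 0 for v in adj}
for i, b in enumerate(bs):       # individualise all in-terminals
    init[b] = i + 1
col = refine(adj, init)          # out-terminals get distinguished
init[bs[1]] = init[bs[0]]        # now merge one in-terminal pair
col2 = refine(adj, init)         # out-terminals stay together
```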
The graph \(G_k\) is now constructed as follows. Start with vertex sets \(X,\mathcal X,\mathcal Y\) and Y, and edges between them, as defined above. For every ℓ ∈ {1,…,k−1}, we add a copy G of an AND\(_{\ell}\)-gadget to the graph. Denote the out- and in-terminals of G by \(a_{0}, a_{1}\) and \(b_{0},\ldots ,b_{2^{\ell }-1}\), respectively.
-
For i = 0,1 and all relevant q: we add edges from \(a_{i}\) to every vertex in \(X^{\ell +1}_{2q+i}\).
-
For every i, we add edges from \(b_{i}\) to every vertex in \(Y^{\ell }_{i}\).
Finally, we add a starting gadget to the graph, consisting of three vertices \(v_{0}, v_{1}, v_{2}\), the edge \(v_{1}v_{2}\), and edges \(\{v_{0}x_{i} \mid i{\in \mathcal {B}^{1}_{0}}\}\cup \{v_{1}x_{i} \mid i{\in \mathcal {B}^{1}_{1}}\}\). See Fig. 1 for an example of this construction. (In the figure, we have expanded the terminals of AND\(_{2}\) into edges, for readability. This does not affect the behaviour of the graph.)
Proposition 20
\(G_k\) has \(O(2^{k}k)\) vertices and \(O(2^{k}k^{2})\) edges.
Proof 15
An easy induction proof shows that the AND\(_{\ell}\)-gadget has \(O(2^{\ell})\) vertices and edges. So all AND\(_{\ell}\)-gadgets together, for ℓ ∈ {1,…,k−1}, have at most \(O(2^{k})\) vertices and edges. Therefore, the bounds on the total number of vertices and edges of \(G_k\) are dominated by the number of vertices and edges in \(G_{k}[\mathcal X\cup \mathcal Y]\), which is \(k2^{k+1}\) and \(k^{2}2^{k}\), respectively. □
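The exact counts for the core \(G_{k}[\mathcal X\cup \mathcal Y]\) are easy to verify by enumerating it directly; representing vertices as tuples is our own convention for this check.

```python
def core(k):
    """Vertices of the script sets X-script, Y-script, and the edges between them."""
    xs = {('x', i, j) for i in range(2 ** k) for j in range(1, k + 1)}
    ys = {('y', i, j) for i in range(2 ** k) for j in range(1, k + 1)}
    # For every index i: a complete bipartite graph between x_i^{j1} and y_i^{j2}.
    edges = {(('x', i, j1), ('y', i, j2))
             for i in range(2 ** k)
             for j1 in range(1, k + 1)
             for j2 in range(1, k + 1)}
    return xs | ys, edges

for k in range(2, 7):
    V, E = core(k)
    assert len(V) == k * 2 ** (k + 1)   # k * 2^(k+1) vertices
    assert len(E) == k ** 2 * 2 ** k    # k^2 * 2^k edges
```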
We now state and prove the key property of AND\(_{\ell}\)-gadgets. This requires the following definitions. For a graph G = (V, E), if ψ is a partition of a subset S ⊆ V, then for short we say that a partition ρ of V refines ψ if it refines ψ ∪ {V∖S}. We say that ρ agrees with ψ if ρ[S] = ψ. (So if V∖S ≠ ∅, one can choose ρ such that it agrees with ψ but does not refine ψ.) For two graphs G and H, by G⊎H we denote the graph obtained by taking the disjoint union of G and H. We say that a partition π of V distinguishes two sets \(V_{1} \subseteq V\) and \(V_{2} \subseteq V\) if there is a set R ∈ π with \(|R \cap V_{1}|\neq|R \cap V_{2}|\). This is often used for the case where \(V_{1} = N(u)\) and \(V_{2} = N(v)\) for two vertices u and v, to conclude that if π is stable, then \(u\not\approx_{\pi} v\). If \(V_{1} = \{x\}\) and \(V_{2} = \{y\}\), then we also say that π distinguishes x from y.
Lemma 21
Let G be an AND\(_{\ell}\)-gadget with in-terminals \(B=\{b_{0},\ldots ,b_{2^{\ell }-1}\}\) and out-terminals \(a_{0}, a_{1}\). Let ψ be a partition of B into binary blocks, and let ρ be the coarsest stable partition of V(G) that refines ψ. Then ρ agrees with ψ. Furthermore, ρ distinguishes \(a_{0}\) from \(a_{1}\) if and only if ψ distinguishes all in-terminal pairs.
Proof 16
We prove the statement by induction over ℓ. For ℓ = 1, the statement is trivial. Now suppose ℓ = 2. We only consider partitions of \(\{b_{0},\ldots,b_{3}\}\) into binary blocks. Because of the automorphisms of this gadget, it suffices to consider the following four partitions for ψ. For each of them, the corresponding partition ρ is given; it can be verified that ρ is the coarsest stable partition of \(V(\text{AND}_{2})\) that refines ψ. (The nonterminal vertices are labeled \(c_{0},\ldots,c_{3}\), as shown in Fig. 2.)
$$\begin{array}{@{}rcl@{}} \psi=\big\{ \{b_{0},b_{1},b_{2},b_{3}\} \big\} &\Longrightarrow \rho=\psi\cup \big\{ \{c_{0},c_{1},c_{2},c_{3}\},\{a_{0},a_{1}\} \big\}, \\ \psi=\big\{ \{b_{0},b_{1}\},\{b_{2},b_{3}\} \big\} &\Longrightarrow \rho=\psi\cup \big\{ \{c_{0},c_{1},c_{2},c_{3}\},\{a_{0},a_{1}\} \big\}, \\ \psi=\big\{ \{b_{0}\},\{b_{1}\},\{b_{2},b_{3}\} \big\} &\Longrightarrow \rho=\psi\cup \big\{ \{c_{0},c_{2}\},\{c_{1},c_{3}\},\{a_{0},a_{1}\} \big\},\\ \psi=\big\{ \{b_{0}\},\{b_{1}\},\{b_{2}\},\{b_{3}\} \big\} &\Longrightarrow \rho=\psi\cup \big\{ \{c_{0}\},\{c_{1}\},\{c_{2}\},\{c_{3}\},\{a_{0}\},\{a_{1}\} \big\}. \end{array} $$
We see that in all four cases, ρ agrees with ψ on B. Furthermore, ρ distinguishes the out-terminals if and only if ψ distinguishes all in-terminal pairs (which is only the case for the last ψ).
Now suppose ℓ ≥ 3. Recall that an AND\(_{\ell}\)-gadget H is obtained by taking two copies \(G^{\prime}\) and \(G^{\prime\prime}\) of an AND\(_{\ell-1}\)-gadget and, informally, putting a copy \(G^{*}\) of an AND\(_{2}\)-gadget on top of those. Any partition ψ of the in-terminal set B of H into binary blocks corresponds to partitions \(\psi^{\prime}\) and \(\psi^{\prime\prime}\) of the in-terminal sets \(B^{\prime}\) and \(B^{\prime\prime}\) of \(G^{\prime}\) and \(G^{\prime\prime}\) respectively, again into binary blocks. So by induction, we have coarsest stable partitions \(\rho^{\prime}\) and \(\rho^{\prime\prime}\) of \(V(G^{\prime})\) and \(V(G^{\prime\prime})\) that refine \(\psi^{\prime}\) and \(\psi^{\prime\prime}\) and agree with them on \(B^{\prime}\) and \(B^{\prime\prime}\), respectively. Together, this yields a partition π of \(V(G^{\prime})\cup V(G^{\prime\prime})\) which is stable for \(G^{\prime}\uplus G^{\prime\prime}\), refines ψ, and agrees with ψ on B. (To be precise: if ψ is not the unit partition, then we can simply take \(\pi = \rho^{\prime}\cup\rho^{\prime\prime}\), because ψ is a partition into binary blocks, and thus distinguishes every single in-terminal of \(G^{\prime}\) from every single in-terminal of \(G^{\prime\prime}\). Otherwise, every set in π should be the union of the two corresponding sets in \(\rho^{\prime}\) and \(\rho^{\prime\prime}\).) Then π gives a partition of the out-terminals of \(G^{\prime}\) and \(G^{\prime\prime}\), which yields a matching partition \(\psi^{*}\) of the in-terminals \(B^{*}\) of \(G^{*}\), again into binary blocks. Applying the induction hypothesis to \(G^{*}\), we obtain a coarsest stable partition \(\rho^{*}\) of \(V(G^{*})\) that refines and agrees with \(\psi^{*}\). Combining π and \(\rho^{*}\) yields a stable partition ρ of the vertices V(H) of the entire gadget.
Applying the induction hypothesis to \(G^{\prime}\) and \(G^{\prime\prime}\) shows that at least one in-terminal pair of \(G^{*}\) is not distinguished by \(\psi^{*}\) if and only if at least one in-terminal pair of \(G^{\prime}\) or \(G^{\prime\prime}\) is not distinguished by \(\psi^{\prime}\) or \(\psi^{\prime\prime}\), respectively. Applying the induction hypothesis to \(G^{*}\) then shows that ρ does not distinguish the out-terminals of H if ψ does not distinguish at least one in-terminal pair of H. This then also holds for the coarsest stable partition of V(H) that refines ψ.
Finally, let ψ be a partition into binary blocks of the in-terminals B of H that distinguishes every pair, and let ρ be a coarsest stable partition that refines ψ. We prove that ρ also distinguishes \(a_{0}\) from \(a_{1}\). By definition, ρ distinguishes any vertex in B from any vertex not in B. We conclude that for any two vertices u, v ∈ V(H), if they have different distance to B, then \(u\not\approx_{\rho} v\) (Proposition 17). So by Proposition 16, ρ induces stable partitions \(\rho^{*}\) and π for both \(G^{*}\) and \(G^{\prime}\uplus G^{\prime\prime}\), respectively. The graphs \(G^{\prime}\) and \(G^{\prime\prime}\) are components of \(G^{\prime}\uplus G^{\prime\prime}\), so we conclude that ρ induces stable partitions \(\rho^{\prime}\) and \(\rho^{\prime\prime}\) for both \(G^{\prime}\) and \(G^{\prime\prime}\), respectively. By induction, it follows that \(\rho^{\prime}\) and \(\rho^{\prime\prime}\) distinguish the out-terminals of \(G^{\prime}\) and \(G^{\prime\prime}\), respectively. (If this holds for the coarsest stable partition, then it holds for any stable partition.) Then \(\psi^{*} := \rho[B^{*}]\) (where \(B^{*}\) again denotes the in-terminal set of \(G^{*}\)) distinguishes all in-terminal pairs of \(G^{*}\). So by induction, ρ distinguishes \(a_{0}\) from \(a_{1}\). □
The following corollary is an immediate consequence of Lemma 21.
Corollary 22
Let π be a stable partition for an AND-gadget G such that ψ=π[B] is a partition of the in-terminals B into binary blocks, and such that B is π-closed. If π does not distinguish the out-terminals, then at least one in-terminal pair is not distinguished.
Proof 17
Since B is π-closed, π refines ψ = π[B]. Since π is stable, it refines the coarsest stable partition ρ of V(G) that refines ψ. Now apply Lemma 21. □
Cost Lower Bound Proof
Intuitively, at level ℓ of the refinement process, the current partition contains all blocks \(\mathcal X^{\ell +1}_{q}\) of level ℓ+1, and for all \(0 \le q < 2^{\ell}\), either \(\mathcal Y^{\ell }_{q}\) or the two blocks \(\mathcal Y^{\ell +1}_{2q}\) and \(\mathcal Y^{\ell +1}_{2q+1}\). In this situation one can split up the blocks \(\mathcal Y^{\ell }_{q}\) into blocks \(\mathcal Y^{\ell +1}_{2q}\) and \(\mathcal Y^{\ell +1}_{2q+1}\) using either refining operation \((\mathcal X^{\ell +1}_{2q},\mathcal Y^{\ell }_{q})\) or \((\mathcal X^{\ell +1}_{2q+1},\mathcal Y^{\ell }_{q})\). These operations both have cost \(2^{k-(\ell+1)}k^{2}\), and refining all the \(\mathcal Y^{\ell }_{q}\) cells in this way costs \(2^{k-1}k^{2}\). Once \(\mathcal Y\) is partitioned into binary blocks of level ℓ+1, we can partition \(\mathcal X\) into blocks of level ℓ+2 (using the AND\(_{\ell}\)-gadget), and proceed in the same way. Since there are k such refinement levels, we can lower bound the total cost of refining the graph by \(2^{k-1}k^{3}=\Omega(m\log n)\), and are done. What remains to show is that applying the refining operations in this specific way is the only way to obtain a stable partition. To formalise this, we introduce a number of partitions of \(V(G_k)\) that are stable with respect to the (spanning) subgraph \(G^{\prime }_{k}=G_{k}-[\mathcal X,\mathcal Y]\), and that partition \(\mathcal X\) and \(\mathcal Y\) into binary blocks. (For disjoint vertex sets S, T, we denote \([S, T]=\{uv \in E(G)\mid u \in S, v \in T\}\).) So on \(G_k\), these partitions can only be refined using operations (R, S), where R is a binary block of \(\mathcal X\) and S is a binary block of \(\mathcal Y\).
Definition 23
For any ℓ ∈ {0,…,k−1} and nonempty set \(Q\subseteq \mathcal {B}_{\ell }\), by \(\tau_{Q,\ell}\) we denote the partition of \(\mathcal X\cup \mathcal Y\) that contains the cells
-
\(\mathcal X^{\ell +1}_{q}\) for all \(q\in \mathcal {B}_{\ell +1}\),
-
\(\mathcal Y^{\ell }_{q}\) for all q ∈ Q, and both \(\mathcal Y^{\ell +1}_{2q}\) and \(\mathcal Y^{\ell +1}_{2q+1}\) for all \(q\in \mathcal {B}_{\ell }\setminus Q\).
By \(\pi_{Q,\ell}\) we denote the coarsest stable partition for \(G^{\prime }_{k}=G_{k}-[\mathcal X,\mathcal Y]\) that refines \(\tau_{Q,\ell}\).
We now show that for every ℓ and Q, there is also a stable partition of \(G^{\prime }_{k}\) that partitions \(\mathcal X\) and \(\mathcal Y\) as prescribed by the above definition. In particular, this holds for \(\pi_{Q,\ell}\).
Lemma 24
For every ℓ ∈ {0,…,k−1} and nonempty set \(Q\subseteq \mathcal {B}_{\ell }\), \(\pi_{Q,\ell}\) agrees with \(\tau_{Q,\ell}\).
Proof 18
We construct a partition ρ of \(V(G_{k})=V(G^{\prime }_{k})\) that is stable on \(G^{\prime }_{k}\) and agrees with \(\tau_{Q,\ell}\). We start with \(\rho = \tau_{Q,\ell}\). For every cell \(\mathcal X^{\ell +1}_{q}\) in \(\tau_{Q,\ell}\), we add the cell \(X^{\ell +1}_{q}\) to ρ. For every cell \({\mathcal Y^{m}_{q}}\) in \(\tau_{Q,\ell}\) (with ℓ ≤ m ≤ ℓ+1), we add the cell \({Y^{m}_{q}}\) to ρ. Then we add the cells \(\{v_{0}\}\), \(\{v_{1}\}\) and \(\{v_{2}\}\).
For every AND\(_{p}\)-gadget G of \(G_k\) (with in-terminals adjacent to Y and out-terminals adjacent to X), we define a partition ψ of the in-terminals B as follows: for u, v ∈ B, \(u \approx_{\psi} v\) if and only if N(u) ∩ Y is not distinguished from N(v) ∩ Y. Note that this yields a partition of B into binary blocks, and that it distinguishes an in-terminal pair \(b_{2q}, b_{2q+1}\) (which are adjacent to \(Y^{p}_{2q}\) and \(Y^{p}_{2q+1}\), respectively, with union \(Y^{p-1}_{q}\)) if and only if ℓ ≥ p holds, or both ℓ = p−1 and q ∉ Q hold. Now we extend ρ by adding all cells of the coarsest stable partition of the AND\(_{p}\)-gadget G that refines ψ. By Lemma 21, this partition distinguishes the out-terminals of G if and only if ℓ ≥ p (since Q is nonempty). Extending ρ this way for every AND-gadget yields the final partition ρ of \(V(G_k)\). By definition, ρ agrees with \(\tau_{Q,\ell}\). From the construction, the stability condition is easily verified for almost all cells of ρ. Only cells \(\{a_{0}, a_{1}\} \in \rho\) consisting of out-terminals of AND\(_{p}\)-gadgets need to be considered in more detail. As noted before, such cells only occur when p ≥ ℓ+1. Then we have for every integer q that \(X^{p+1}_{2q}\cup X^{p+1}_{2q+1}={X^{p}_{q}}\subseteq X^{\ell +1}_{q^{\prime }}\in \rho \) (for some value \(q^{\prime}\)). Since \(a_{0}\) is adjacent to every vertex of \(X^{p+1}_{2q}\) and \(a_{1}\) is adjacent to every vertex of \(X^{p+1}_{2q+1}\), it follows that \(N(a_{0})\) and \(N(a_{1})\) are not distinguished by ρ. Therefore, ρ is stable for \(G^{\prime }_{k}\). Then the coarsest stable partition \(\pi_{Q,\ell}\) that refines \(\tau_{Q,\ell}\) also agrees with \(\tau_{Q,\ell}\). □
Since \(\pi_{Q,\ell}\) is stable on \(G^{\prime }_{k}\), any effective refining operation (with respect to \(G_k\)) must involve the edges between \(\mathcal X\) and \(\mathcal Y\). Since \(\pi_{Q,\ell}\) partitions \(\mathcal X\) and \(\mathcal Y\) as prescribed by \(\tau_{Q,\ell}\), we conclude that any effective elementary refining operation has the form described in the following corollary. Recall that a refining operation (R, S) for a partition π is elementary if both R and S are classes of π, and that by Proposition 19 it suffices to consider elementary refining operations.
Corollary 25
Let (R,S) be an effective elementary refining operation on \(\pi_{Q,\ell}\). Then for some q ∈ Q, \(R=\mathcal X^{\ell +1}_{2q}\) or \(R=\mathcal X^{\ell +1}_{2q+1}\), and \(S=\mathcal Y^{\ell }_{q}\). The cost of this operation is \(k^{2}2^{k-(\ell+1)}\).
This motivates the following definition: for q ∈ Q, by \(r_{q}(\pi_{Q,\ell})\) we denote the partition of \(V(G_k)\) that results from the above refining operation. (Both choices of R yield the same result.)
Lemma 26
For every ℓ ∈ {0,…,k−1}, nonempty \(Q\subseteq \mathcal {B}_{\ell }\) and q ∈ Q:
-
\(r_{q}(\pi _{Q,\ell })\preceq \pi _{\mathcal {B}_{\ell +1},\ell +1}\), and
-
if \(Q^{\prime} = Q \setminus \{q\}\) is nonempty, then \(r_{q}(\pi _{Q,\ell })\preceq \pi _{Q^{\prime },\ell }\).
Proof 19
Choose \(Q^{\prime}\) and \(\ell^{\prime}\) satisfying one of the conditions (i.e. \(Q^{\prime }=\mathcal {B}_{\ell +1}\) and \(\ell^{\prime} = \ell + 1\), or \(Q^{\prime} = Q\setminus\{q\}\) and \(\ell^{\prime} = \ell\)). Then \(\tau _{Q,\ell }\preceq \tau _{Q^{\prime },\ell ^{\prime }}\), so also \(\pi _{Q,\ell }\preceq \pi _{Q^{\prime },\ell ^{\prime }}\) (since \(\pi _{Q^{\prime },\ell ^{\prime }}\) is also a stable partition that refines \(\tau_{Q,\ell}\)). If we now obtain a partition ρ from \(\pi_{Q,\ell}\) by splitting up one cell such that the only vertex pairs u, v with \(u\approx _{\pi _{Q,\ell }} v\) but \(u\not\approx_{\rho} v\) are vertex pairs with \(u\not \approx _{\pi _{Q^{\prime },\ell ^{\prime }}} v\), then clearly \(\rho \preceq \pi _{Q^{\prime },\ell ^{\prime }}\) still holds. This is exactly how \(r_{q}(\pi_{Q,\ell})\) is obtained. □
Lemma 27
Let ω be the coarsest stable partition for \(G_k\). For all ℓ ∈ {0,…,k−1} and nonempty \(Q\subseteq \mathcal {B}_{\ell }\): \(\pi_{Q,\ell} \preceq \omega\).
Proof 20
First, we note that by considering the various vertex degrees and using Proposition 17, one can verify that ω refines \(\{X,\mathcal X,\mathcal Y,Y,\{v_{0}\},\{v_{1}\},\{v_{2}\},V_{G}\}\), where \(V_{G}\) denotes the set of all vertices in AND-gadgets. In particular, \(V_{G}\) is ω-closed, so ω induces a stable partition on \(G[V_{G}]\) (Proposition 16), and therefore it does so on every AND-gadget of \(G_k\) (these are the components of \(G[V_{G}]\)). Note that for any two different AND\(_{\ell}\)-gadgets \(H_{1}\) and \(H_{2}\) of \(G_k\), there exists an integer d such that \(H_{1}\) contains a vertex at distance exactly d from the ω-closed set X ∪ Y, but \(H_{2}\) does not. This observation can be combined with Proposition 17 to show that if u and v are part of different AND-gadgets, then \(u\not\approx_{\omega} v\). Subsequently this yields that for any AND-gadget of \(G_k\) with out-terminals \(a_{0}, a_{1}\), the set \(\{a_{0}, a_{1}\}\) is ω-closed, and the set of in-terminals B of this gadget is ω-closed.
We now prove that ω[X] is discrete. Suppose that there is an AND-gadget in \(G_k\) for which the out-terminals are not distinguished by ω. Then let ℓ be the minimum value such that this holds for the AND\(_{\ell}\)-gadget G of \(G_k\). As observed above, we may apply Corollary 22 to G, which shows that there is at least one pair of in-terminals \(b_{2q}\) and \(b_{2q+1}\) that is not distinguished by ω. By stability, and since Y is ω-closed, this shows that there are vertices \(y_{i}\in Y^{\ell }_{2q}\) and \(y_{j}\in Y^{\ell }_{2q+1}\) in the adjacent binary blocks such that \(y_{i} \approx_{\omega} y_{j}\). Then, considering the ω-closed subgraph \(G_{k}[X\cup \mathcal X\cup \mathcal Y\cup Y]\), it easily follows that \(x_{i} \approx_{\omega} x_{j}\). If ℓ ≥ 2, then \(x_{i}\in X^{\ell }_{2q}\) is adjacent to the out-terminal \(a_{0}\) of the AND\(_{\ell-1}\)-gadget of \(G_k\), whereas \(x_{j}\in X^{\ell }_{2q+1}\) is adjacent to the other out-terminal \(a_{1}\) of this gadget. By the choice of ℓ, \(a_{0}\not\approx_{\omega} a_{1}\), so since \(\{a_{0}, a_{1}\}\) is ω-closed, this gives a contradiction with stability. If ℓ = 1, then we consider the starting gadget: \(x_{i}\in X^{\ell }_{0}\) is adjacent to \(v_{0}\), and \(x_{j}\in X^{\ell }_{1}\) is adjacent to \(v_{1}\), but \(\{v_{0}\}\) and \(\{v_{1}\}\) are distinct cells of ω, a contradiction with stability.
We conclude that for every AND-gadget, ω distinguishes the out-terminals. Since every vertex \(x_{i} \in X\) is adjacent to a unique set of such out-terminals, it follows that ω[X] is discrete. Therefore, for every q, \({\mathcal X^{k}_{q}}\) and \({\mathcal Y^{k}_{q}}\) are ω-closed. Hence ω refines \(\tau_{Q,\ell}\) for every Q and ℓ, and thus it refines \(\pi_{Q,\ell}\) for every Q and ℓ. □
Proof 21 (Proof of Theorem 15)
Let \(G_k\) be the graph described in Section 4.2, and let \(\pi_{Q,\ell}\) be the partitions of \(V(G_k)\) from Definition 23. By Lemma 27, the coarsest stable partition ω of \(G_k\) refines all partitions \(\pi_{Q,\ell}\). For ease of notation, we define \(\pi _{\emptyset ,\ell }:=\pi _{\mathcal {B}_{\ell +1},\ell +1}\) for all ℓ < k−1. By Corollary 25, any effective elementary refining operation on a partition \(\pi_{Q,\ell}\) has cost \(2^{k-(\ell+1)}k^{2}\), and results in \(r_{q}(\pi_{Q,\ell})\) for some q ∈ Q. Denote \(Q^{\prime} = Q\setminus\{q\}\). By Lemma 26, \(r_{q}(\pi _{Q,\ell })\preceq \pi _{Q^{\prime },\ell }\). By Proposition 19, to compute \(\operatorname{cost}(\pi_{Q,\ell})\), it suffices to consider only partitions that can be obtained by elementary refining operations. So we may now apply Proposition 18 to conclude that
$$\operatorname{cost}(\pi_{Q,\ell})\geq 2^{k-(\ell+1)}k^{2} + \min_{q\in Q}\operatorname{cost}(\pi_{Q\setminus\{q\},\ell}). $$
By induction on |Q| it then follows that \(\operatorname {cost}(\pi _{{\mathcal {B}}_{\ell },\ell })\geq 2^{k-1}k^{2}+ \operatorname {cost}(\pi _{\mathcal {B}_{\ell +1},\ell +1})\) for all 0 ≤ ℓ ≤ k−1. Hence, by induction on ℓ, \(\operatorname {cost}(\pi _{\mathcal {B}_{0},0})\geq 2^{k-1}k^{3}\), which lower bounds cost(α). By Proposition 20, \(n \in O(2^{k}k)\) and \(m \in O(2^{k}k^{2})\), so log n ∈ O(k). This shows that cost(α) ∈ Ω((m + n)log n). □
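The arithmetic behind the final bound is easy to replay: level ℓ performs \(2^{\ell}\) refining operations of cost \(2^{k-(\ell+1)}k^{2}\) each, so every level costs \(2^{k-1}k^{2}\), and the k levels sum to \(2^{k-1}k^{3}\). A quick mechanical check (function names are ours):

```python
def level_cost(k, l):
    # 2^l refining operations at level l, each of cost k^2 * 2^(k - (l + 1))
    return 2 ** l * k ** 2 * 2 ** (k - (l + 1))

def total_cost(k):
    # Sum over the k refinement levels l = 0, ..., k - 1.
    return sum(level_cost(k, l) for l in range(k))

for k in range(2, 12):
    # Each level costs 2^(k-1) * k^2, independently of l ...
    assert all(level_cost(k, l) == 2 ** (k - 1) * k ** 2 for l in range(k))
    # ... so the total over k levels is 2^(k-1) * k^3.
    assert total_cost(k) == 2 ** (k - 1) * k ** 3
```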
Related lower bounds
In this section, we sketch how our construction also yields lower bounds for two other partitioning problems.
Bisimilarity
Bisimilarity is a key concept in concurrency theory and automated verification. A bisimulation is a binary relation defined on the states of a transition system (or between two transition systems). Intuitively, two states are bisimilar if the processes starting in these states look the same. Formally, a transition system is a vertex-labelled directed graph. Let S = (V, E, λ), where (V, E) is a directed graph and λ a function that assigns a set of properties to each state v ∈ V. A bisimulation on S is a relation ∼ on V satisfying the following three properties for all v, w ∈ V such that v∼w:
-
(i)
λ(v) = λ(w);
-
(ii)
for all \(v^{\prime} \in N^{+}(v)\) there is a \(w^{\prime} \in N^{+}(w)\) such that \(v^{\prime}\sim w^{\prime}\);
-
(iii)
for all \(w^{\prime} \in N^{+}(w)\) there is a \(v^{\prime} \in N^{+}(v)\) such that \(v^{\prime}\sim w^{\prime}\).
Not every bisimulation is an equivalence relation, but the reflexive symmetric transitive closure of a bisimulation is still a bisimulation. For convenience, in the following we assume that all bisimulations are equivalence relations. This is justified by the fact that the partition refinement algorithms (see below) that are commonly used to compute bisimulations, and that we study here, represent the relations using partitions of the vertex set and hence implicitly assume that the relations they represent are equivalence relations.
It is not hard to see that on each transition system S there is a unique coarsest bisimulation, which we call the bisimilarity relation on S. The bisimilarity relation can be defined by letting v be bisimilar to w if there is a bisimulation ∼ such that v∼w; it is then straightforward to verify that bisimilarity is a bisimulation and that all other bisimulations refine it. We remark that the bisimilarity relation on a transition system is precisely what Paige and Tarjan [24] call the coarsest relational partition of the initial partition given by the labelling. Thus the problem of computing the bisimilarity relation of a given transition system is equivalent to the problem of computing the coarsest relational partition considered in [24].
Note the similarity between a bisimulation and a stable colouring of a vertex-coloured digraph, which we may view as a transition system with a labelling λ that maps each vertex to its colour. Condition (i) just says that a bisimulation refines the original colouring, as a stable colouring is supposed to do as well. Conditions (ii) and (iii), which are equivalent under the assumption that a bisimulation is an equivalence relation and hence symmetric, say that if two vertices v, w are in the same class C, then for every other class D, either both v and w have an out-neighbour in D or neither of them has. Thus instead of refining by the degree in D, we just refine by the Boolean value "degree at least 1". This immediately implies that the coarsest stable colouring of S refines the coarsest bisimulation, that is, the bisimilarity relation, on S.
It should be clear from these considerations that the bisimilarity relation on a transition system S with n vertices and m edges can be computed in time O((n + m)log n) by a slight modification of the partitioning algorithm for computing the coarsest stable colouring (assuming, of course, that the labels can be computed and compared in constant time) [24].
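A naive fixpoint version of this computation (a sketch of the specification, not the efficient Paige-Tarjan algorithm; function names are ours) differs from colour refinement only in using sets rather than multisets of successor colours:

```python
def bisimilarity(succ, label):
    """Coarsest bisimulation classes of a finite transition system, given
    successor sets `succ` and a labelling `label` (naive fixpoint)."""
    col = dict(label)
    while True:
        # Refine by the *set* of successor classes ("degree at least 1"),
        # not by the multiset of successor classes as in colour refinement.
        sig = {v: (col[v], frozenset(col[w] for w in succ[v])) for v in succ}
        ids, new = {}, {}
        for v in succ:
            new[v] = ids.setdefault(sig[v], len(ids))
        if len(set(new.values())) == len(set(col.values())):
            return new
        col = new

# States 0 and 1 have different out-degrees, so colour refinement would
# separate them; bisimilarity does not, since both can only step to a sink.
succ = {0: {2, 3}, 1: {3}, 2: set(), 3: set()}
classes = bisimilarity(succ, {v: 0 for v in succ})
```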
As for the coarsest stable colouring, we may ask if the bisimilarity relation can be computed in linear time. It turns out that our lower bound for colour refinement implies a lower bound for bisimilarity. Again, we consider the class of partition refinement algorithms. Like the partition refinement algorithms for colour refinement, partition refinement algorithms for bisimilarity maintain a partition of the set of vertices of the given transition system, and they iteratively refine it using refining operations until a bisimulation is reached. In each refining operation, such an algorithm chooses a union of current partition cells as refining set R, and chooses another (possibly overlapping) union of partition cells S. Cells in S are split up according to the out-neighbourhoods of the vertices in the cells in R. That is, two vertices v, w currently in the same cell of S remain in the same cell after the refinement step if and only if for all cells C of the partition with C ⊆ R, it holds that
$$N^{+}(v)\cap C\neq\emptyset\iff N^{+}(w)\cap C\neq\emptyset. $$
Recall that \(N^{+}(v)\) denotes the set of out-neighbours of a vertex v in a directed graph (or transition system). The cost bcost(R, S) of such a refining operation (R, S) is the number of edges from S to R. Again, the sum of the costs of all refining operations is a reasonable lower bound for the running time of a partition refinement algorithm. The cost bcost(α) of a partition α of the vertex set is then defined as the minimum cost of a sequence of refining operations that transforms α into the coarsest bisimulation refining it (cf. Definition 14).
Theorem 28
For every integer k ≥ 2, there is a transition system \(S_k\) with \(n\in O(2^{k}k)\) vertices and \(m\in O(2^{k}k^{2})\) edges and a constant labelling function, such that bcost(α) ∈ Ω((m+n)log n), where α is the unit partition of \(V(S_k)\).
Proof 22 (sketch)
The proof is essentially the same as the proof of Theorem 15. The transition system S_k is a directed version of the graph G_k. Fig. 4 illustrates the direction of the edges. All vertices get the same label.
It is not hard to show that the bisimilarity classes of S_k are exactly the colour classes of G_k in the coarsest stable colouring, and that the refinement steps behave essentially the same on G_k and S_k. Thus the lower-bound proof carries over. □
Equivalence in 2-Variable Logic
It is a well-known fact (due to Immerman and Lander [17]) that colour refinement assigns the same colour to two vertices of a graph if and only if the vertices satisfy the same formulas of the logic C2, two-variable first-order logic with counting.
Two-variable first-order logic L2 is the fragment of first-order logic consisting of all formulas built with just two variables. For example, the following L2-formula ϕ(x) in the language of directed graphs says that from vertex x one can reach a sink (a vertex of out-degree 0) in four steps:
$$\phi(x):=\exists y(E(x,y)\wedge\exists x(E(y,x)\wedge\exists y(E(x,y)\wedge\exists x(E(y,x)\wedge\forall y\,\neg E(x,y))))). $$
Two-variable first-order logic with counting C2 is the extension of L2 by counting quantifiers of the form ∃^{≥i} x, for all i ≥ 1. For example, the following C2-formula ψ(x) in the language of directed graphs says that from vertex x one can reach a vertex of out-degree at least 10 in four steps:
$$\psi(x):=\exists y(E(x,y)\wedge\exists x(E(y,x)\wedge\exists y(E(x,y)\wedge\exists x(E(y,x)\wedge\exists^{\ge10}y E(x,y))))). $$
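The properties expressed by ϕ(x) and ψ(x) can be sanity-checked on a concrete digraph. The helper below and the example graph are illustrative, not taken from the paper; each of the four frontier expansions mirrors one nested existential quantifier in the formulas above.

```python
def reach4(succ, x, prop):
    """True iff some walk of length 4 from x ends in a vertex satisfying prop.

    succ maps each vertex to its set of out-neighbours N+(v).
    """
    frontier = {x}
    for _ in range(4):
        frontier = {w for v in frontier for w in succ.get(v, set())}
    return any(prop(v) for v in frontier)

# Directed path 0 -> 1 -> 2 -> 3 -> 4; vertex 4 is a sink.
succ = {0: {1}, 1: {2}, 2: {3}, 3: {4}, 4: set()}

is_sink = lambda v: not succ.get(v)                # phi's  ∀y ¬E(x,y)
big_out = lambda v: len(succ.get(v, set())) >= 10  # psi's  ∃^{≥10}y E(x,y)
```

Here reach4(succ, 0, is_sink) holds, since the walk 0, 1, 2, 3, 4 ends in the sink 4, whereas reach4(succ, 1, is_sink) does not: after four steps from vertex 1 the walk has already run off the end of the path.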
This formula is not equivalent to any formula of L2. Two-variable logics, and more generally finite variable logics, have been studied extensively in finite model theory (see, for example, [12, 14, 16, 19]).
We call two vertices of a graph L2-equivalent (C2-equivalent) if they satisfy the same formulas of the logic L2 (C2, respectively). Now Immerman and Lander’s theorem states that for all graphs G (possibly coloured and/or directed) and all vertices v, w ∈ V(G), the vertices v and w have the same colour in the coarsest bi-stable colouring of G if and only if they are C2-equivalent. (Recall that bi-stable was defined in Section 3.4.) In particular, this implies that the C2-equivalence classes of a graph can be computed in time O((n + m) log n), but not faster by a partition-refinement algorithm.
On plain undirected graphs, the logic L2 is extremely weak. However, on coloured and/or directed graphs, the logic is quite interesting. The L2-equivalence relation refines the bisimilarity relation. It is well known that the L2-equivalence relation can be computed in time O((n + m)log n) by a variant of the colour refinement algorithm. Our lower bounds can be extended to show that it cannot be computed faster by a partition-refinement algorithm.
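For intuition, the coarsest stable colouring underlying C2-equivalence on plain undirected graphs can be computed by the following naive sketch. It is a quadratic-time illustration of the refinement semantics only; the O((n + m) log n) algorithms additionally rely on Hopcroft's smaller-half strategy. All names are illustrative.

```python
from collections import Counter

def colour_refinement(adj):
    """Naive colour refinement on an undirected graph.

    adj: dict mapping each vertex to its set of neighbours.
    Returns a dict mapping each vertex to its colour class in the
    coarsest stable colouring refining the unit partition.
    """
    colour = {v: 0 for v in adj}  # unit partition: all vertices alike
    while True:
        # New colour of v: its old colour together with the multiset of
        # colours of its neighbours.
        sig = {v: (colour[v],
                   tuple(sorted(Counter(colour[u] for u in adj[v]).items())))
               for v in adj}
        ids = {s: i for i, s in enumerate(sorted(set(sig.values())))}
        new_colour = {v: ids[sig[v]] for v in adj}
        if len(set(new_colour.values())) == len(set(colour.values())):
            return new_colour  # no cell was split: the colouring is stable
        colour = new_colour
```

On the path 0–1–2–3, for instance, the stable colouring has two classes: the endpoints {0, 3} and the inner vertices {1, 2}.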
An Open Problem
The key idea of the O((n + m)log n) partitioning algorithms is Hopcroft’s idea of processing the smaller half. Hopcroft originally proposed this idea for the minimisation of deterministic finite automata. The algorithm proceeds by identifying equivalent states and then collapsing each equivalence class to a single new state. The partitioning problem (computing classes of equivalent states) is actually just the bisimilarity problem for finite automata, which may be viewed as edge-labelled transition systems.
However, for DFA-minimisation we only need to compute the bisimilarity relation for deterministic finite automata, that is, transition systems where each state has exactly one outgoing edge of each edge label. The systems in our lower bound proof are highly nondeterministic. Thus our lower bounds do not apply.
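Hopcroft's smaller-half strategy in the deterministic setting can be sketched as follows. This is a compact, hedged rendering of the classical algorithm, assuming the DFA is total and has at least one accepting and one non-accepting state; all identifiers are illustrative.

```python
def hopcroft(states, alphabet, delta, accepting):
    """Hopcroft-style DFA minimisation (smaller-half strategy), a sketch.

    states:    set of states
    alphabet:  iterable of letters
    delta:     dict (state, letter) -> state (total transition function)
    accepting: set of accepting states (assumed proper and non-empty)
    Returns the partition of states into equivalence classes.
    """
    # Precompute inverse transitions: inv[(q, a)] = states reaching q on a.
    inv = {(q, a): set() for q in states for a in alphabet}
    for q in states:
        for a in alphabet:
            inv[(delta[(q, a)], a)].add(q)

    F = frozenset(accepting)
    NF = frozenset(states) - F
    partition = {F, NF}
    # Seed the worklist with the smaller of the two initial blocks.
    work = {(min(F, NF, key=len), a) for a in alphabet}
    while work:
        A, a = work.pop()
        X = {q for s in A for q in inv[(s, a)]}  # states entering A on a
        for Y in list(partition):
            inter, diff = Y & X, Y - X
            if inter and diff:
                partition.remove(Y)
                partition.add(inter)
                partition.add(diff)
                for b in alphabet:
                    if (Y, b) in work:
                        work.discard((Y, b))
                        work.add((inter, b))
                        work.add((diff, b))
                    else:
                        # Only the smaller half joins the worklist: each
                        # state is enqueued O(log n) times, giving the
                        # O(n log n) bound.
                        work.add((min(inter, diff, key=len), b))
    return partition
```

For example, on the single-letter DFA with transitions 0 → 1 → 0 and 2 → 3 → 2 and accepting states {0, 2}, the states 0 and 2 are equivalent, as are 1 and 3.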
It remains a very interesting open problem whether similar lower bounds can be proved for DFA-minimisation, or whether DFA-minimisation is possible in linear time. Paige, Tarjan, and Bonic [25] proved that this is possible for DFAs with a single-letter alphabet. To the best of our knowledge, the only known result in this direction is a family of examples due to Berstel and Carton [7] (also see [6, 10]) showing that the O(n log n) bound for Hopcroft’s original algorithm is tight.