Abstract
For a graph \(G=(V,E)\) with v(G) vertices the partition function of the random cluster model is defined by
\[ Z_G(q,w)=\sum _{A\subseteq E}q^{k(A)}w^{|A|}, \]
where k(A) denotes the number of connected components of the graph (V, A). Furthermore, let g(G) denote the girth of the graph G, that is, the length of the shortest cycle. In this paper we show that if \((G_n)_n\) is a sequence of d-regular graphs such that the girth \(g(G_n)\rightarrow \infty \), then the limit
exists if \(q\ge 2\) and \(w\ge 0\). The quantity \(\Phi _{d,q,w}\) can be computed as follows. Let
then
The same conclusion holds true for a sequence of random d-regular graphs with probability one. Our result extends the work of Dembo, Montanari, Sly and Sun for the Potts model (integer q), and we prove a conjecture of Helmuth, Jenssen and Perkins about the phase transition of the random cluster model with fixed q.
1 Introduction
For a graph \(G=(V,E)\) the partition function of the random cluster model is defined by
\[ Z_G(q,w)=\sum _{A\subseteq E}q^{k(A)}w^{|A|}, \]
where k(A) denotes the number of connected components of the graph (V, A). In many papers, one uses the parametrization \(w=e^{\beta }-1\).
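As a small illustration, the partition function can be evaluated by brute force on tiny graphs. The sketch below assumes the standard definition \(Z_G(q,w)=\sum _{A\subseteq E}q^{k(A)}w^{|A|}\); the function names `count_components` and `random_cluster_Z` are hypothetical helpers introduced only for this example.

```python
from itertools import combinations


def count_components(n, edges):
    """Number of connected components of the graph ({0..n-1}, edges), via union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    components = n
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1
    return components


def random_cluster_Z(n, edges, q, w):
    """Brute-force sum of q^{k(A)} * w^{|A|} over all edge subsets A (assumed definition)."""
    total = 0.0
    for size in range(len(edges) + 1):
        for A in combinations(edges, size):
            total += q ** count_components(n, A) * w ** size
    return total


triangle = [(0, 1), (1, 2), (0, 2)]
# For the triangle the sum is q^3 + 3 q^2 w + 3 q w^2 + q w^3.
print(random_cluster_Z(3, triangle, 2, 1))  # 28.0
```

The enumeration is exponential in the number of edges, so it is only meant for sanity checks on very small graphs.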
When q is a positive integer, then \(Z_G(q,w)\) is also the partition function of the Potts-model with q spins, moreover there is a natural coupling between the two models, see for example [16]. In this paper we call a model a spin model with r spins if there is an \(r\times r\) symmetric matrix N and a vector \(\underline{\mu }\in \mathbb {R}^r\) such that for a graph \(G=(V,E)\) the probability of a \(\sigma : V\rightarrow \{1,2,\dots ,r\}\) is
where with the notation \([r]=\{1,2,\dots ,r\}\) we have
In both expressions the second product is over the edge set E(G); the symmetry of N ensures that the expression is well-defined. The quantity \(Z_G(N,\underline{\mu })\) is the partition function of the model. In the case of the Potts model we have \(r=q\) and \(N=J_q+wI_q\), where \(J_q\) is the \(q\times q\) matrix consisting of 1’s and \(I_q\) is the \(q\times q\) identity matrix. The vector \(\underline{\mu }\) is the constant 1 vector in this case. In general, if \(\underline{\mu }\) is the constant 1 vector, then we will simply write \(Z_G(N)\) instead of \(Z_G(N,\underline{\mu })\).
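A brute-force sketch of the spin model partition function may help to fix the notation. It assumes that the weight of a configuration \(\sigma \) is the product of \(\mu _{\sigma (v)}\) over the vertices times the product of \(N_{\sigma (u),\sigma (v)}\) over the edges, as described above; the names `spin_model_Z` and `potts_matrix` are hypothetical.

```python
from itertools import product


def spin_model_Z(n, edges, N, mu):
    """Sum over all sigma: V -> [r] of prod_v mu[sigma(v)] * prod_{(u,v) in E} N[sigma(u)][sigma(v)]."""
    r = len(mu)
    total = 0.0
    for sigma in product(range(r), repeat=n):
        weight = 1.0
        for v in range(n):
            weight *= mu[sigma[v]]
        for u, v in edges:
            weight *= N[sigma[u]][sigma[v]]
        total += weight
    return total


def potts_matrix(q, w):
    """The matrix J_q + w I_q: 1 + w on the diagonal, 1 off the diagonal."""
    return [[1.0 + (w if i == j else 0.0) for j in range(q)] for i in range(q)]


triangle = [(0, 1), (1, 2), (0, 2)]
# Potts model with q = 2 spins and w = 1 on the triangle.
Z = spin_model_Z(3, triangle, potts_matrix(2, 1.0), [1.0, 1.0])
print(Z)  # 28.0
```

For the triangle this agrees with the random cluster value \(q^3+3q^2w+3qw^2+qw^3\) at \(q=2\), \(w=1\).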
Let v(G) denote the number of vertices of a graph G. In this paper we study the quantity
when \((G_n)_n\) is an essentially large girth sequence of d-regular graphs. A graph sequence \((G_n)_n\) is called essentially large girth if for all g we have \(\lim _{n\rightarrow \infty }\frac{L(G_n,g)}{v(G_n)}=0\), where L(G, g) denotes the number of cycles of length at most \(g-1\). It is known that a sequence of random d-regular graphs is an essentially large girth graph sequence with probability one (see for instance [19]). So the problems of determining \(\lim _{n\rightarrow \infty } \frac{1}{v(G_n)}\mathbb {E}\ln Z_{G_n}(q,w)\) or \(\lim _{n\rightarrow \infty } \frac{1}{v(G_n)}\ln \mathbb {E}Z_{G_n}(q,w)\) for a random d-regular graph sequence \((G_n)_n\) are very strongly related to this question. In fact, it will turn out that all these limits are the same. The main theorem of this paper is the following.
Theorem 1.1
If \((G_n)_n\) is an essentially large girth sequence of d-regular graphs, then the limit
exists for \(q\ge 2\) and \(w\ge 0\). The quantity \(\Phi _{d,q,w}\) can be computed as follows. Let
then
The same conclusion holds true with probability one for a sequence of random d-regular graphs.
The quantity \(\Phi _{d,q,w}\) has various alternative descriptions. As far as we know the description in the theorem is new even for the Potts model. There is a critical value \(w_c(d,q)\) such that if \(0\le w\le w_c(d,q)\), then \(\Phi _{d,q,w}(t)\) is maximized at \(t=0\), and so \(\Phi _{d,q,w}=q\left( 1+\frac{w}{q}\right) ^{d/2}\), and for \(w>w_c(d,q)\) we have \(\Phi _{d,q,w}>q\left( 1+\frac{w}{q}\right) ^{d/2}\). Moreover, if \(q>2\), then \(\frac{\partial }{\partial w }\Phi _{d,q,w}\) is discontinuous at \(w_c(d,q)\), that is, there is a first order phase transition at \(w_c(d,q)\). We will see that
For random d-regular graphs and the Potts model this was established by Galanis et al. [15]. For not necessarily integer \(q\ge 2\) this was conjectured by Helmuth et al. [17]. We will also prove that for any d-regular graph G we have
and we also show that if G contains \(\varepsilon v(G)\) cycles of length at most g for some fixed \(\varepsilon \) and g, then \(Z_{G}(q,w)\) is exponentially larger than that bound. Related results were obtained by Ruozzi [22].
1.1 Related works
There has been a lot of work on the Potts model and the random cluster model, both on random regular graphs and on essentially large girth graph sequences. The papers mentioned below treat various related problems, but we only mention the results that are directly related to Theorem 1.1.
The case \(q=2\) and \(w\ge 0\) is the so-called ferromagnetic Ising model. In this case, Dembo and Montanari [12] proved that \(Z_{G_n}(2,w)^{1/v(G_n)}\) converges if \((G_n)_n\) is an essentially large girth sequence of d-regular graphs. In fact, they proved a significantly more general theorem about essentially large girth graphs that are not necessarily regular. Note that they use the terminology locally tree-like for what we call essentially large girth. When q is a positive integer and \(w\ge 0\), that is, in the case of the ferromagnetic Potts model, Dembo et al. [14] proved the convergence of \(Z_{G_n}(q,w)^{1/v(G_n)}\) for essentially large girth sequences of d-regular graphs for every d except when w belongs to a certain interval \((w_0,w_1)\). Later Dembo et al. [13] proved the convergence of \(Z_{G_n}(q,w)^{1/v(G_n)}\) for essentially large girth sequences of d-regular graphs, when d is even, q is a positive integer and \(w\ge 0\), even if \(w\in (w_0,w_1)\). Very recently, Helmuth et al. [17] proved the convergence of \(Z_{G_n}(q,w)^{1/v(G_n)}\) for essentially large girth sequences of d-regular graphs for large (not necessarily integer) q and \(w\ge 0\) with the additional hypothesis that \((G_n)_n\) satisfies some expansion condition. This line of research using cluster expansion and an expansion property of \((G_n)_n\) was extended by Carlson et al. [7] and by Carlson et al. [8] for the Potts model. Ferromagnetic Potts models on random regular graphs were also studied by Galanis et al. [15].
Our theorem does not cover the case when \(w<0\). When q is a positive integer and \(w=-1\), then \(Z_G(q,-1)\) counts the number of proper colorings of the graph G. This case was treated by Bandyopadhyay and Gamarnik [2]. They showed that if \(q\ge d+1\), then for an essentially large girth graph sequence of d-regular graphs \((G_n)_n\) we have
Their result was extended for integer \(q\ge 2\Delta \) and \(w\ge -1\) by Borgs et al. [6] for general Benjamini–Schramm convergent graph sequences, where \(\Delta \) is a bound on the degrees of all \(G_n\) in the sequence. The result of Bandyopadhyay and Gamarnik [2] was extended for not necessarily integer \(q\ge 8\Delta \) and \(w=-1\) by Abért and Hubai [1] also for arbitrary Benjamini–Schramm convergent graph sequences. Csikvári and Frenkel [11] showed that the same conclusion holds true for every fixed \(w\ge 0\) and q sufficiently large in terms of w and \(\Delta \). The partition function of the random cluster model is strongly related to the so-called Tutte polynomial [25]. For a graph \(G=(V,E)\) the Tutte polynomial \(T_G(x,y)\) is defined by
The connection between the Tutte polynomial and the random cluster model is
Bencs and Csikvári [4] proved that for an essentially large girth sequence of d-regular graphs \((G_n)_n\) we have
for \(x\ge 1\) and \(0\le y\le 1\). The next theorem summarizes the known results for the Tutte polynomial for non-negative x, y.
Theorem 1.2
Let \((G_n)_n\) be an essentially large girth sequence of d-regular graphs. Then the limit
exists if x, y satisfy one of the following conditions:
(i) (Theorem 1.1) \((x-1)(y-1)\ge 2\) and \(y>1\),
(ii) (see Sect. 4) \(x\ge d-1\) and \(y\ge 0\),
(iii) (Bencs and Csikvári [4]) \(x\ge 1\) and \(0\le y\le 1\).
Figure 1 depicts the regions described in Theorem 1.2.
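The link between the Tutte polynomial and the random cluster model can be checked numerically on small graphs. The sketch below assumes the standard subset expansion \(T_G(x,y)=\sum _{A\subseteq E}(x-1)^{k(A)-k(E)}(y-1)^{k(A)+|A|-v(G)}\) and the resulting identity \(Z_G((x-1)(y-1),y-1)=(x-1)^{k(E)}(y-1)^{v(G)}T_G(x,y)\); both are assumptions of this example, since they are not spelled out in the text above, and all function names are hypothetical.

```python
from itertools import combinations


def count_components(n, edges):
    """Number of connected components of ({0..n-1}, edges), via union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = n
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1
    return components


def tutte(n, edges, x, y):
    """Subset expansion of the Tutte polynomial (assumed standard definition)."""
    k_E = count_components(n, edges)
    total = 0
    for size in range(len(edges) + 1):
        for A in combinations(edges, size):
            k_A = count_components(n, A)
            total += (x - 1) ** (k_A - k_E) * (y - 1) ** (k_A + size - n)
    return total


def random_cluster_Z(n, edges, q, w):
    """Sum of q^{k(A)} w^{|A|} over all edge subsets A."""
    total = 0
    for size in range(len(edges) + 1):
        for A in combinations(edges, size):
            total += q ** count_components(n, A) * w ** size
    return total


triangle = [(0, 1), (1, 2), (0, 2)]
x, y = 3, 2
lhs = random_cluster_Z(3, triangle, (x - 1) * (y - 1), y - 1)
rhs = (x - 1) ** count_components(3, triangle) * (y - 1) ** 3 * tutte(3, triangle, x, y)
print(lhs, rhs)  # 28 28
```

For the triangle the expansion gives \(T_{K_3}(x,y)=x^2+x+y\), so \(T_{K_3}(3,2)=14\).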
1.2 Plan of the paper
This paper has essentially two parts. In the first part we show that \(Z_G(q,w)\) can be approximated by the partition function of a 2-spin model for essentially large girth graphs, namely \(Z_G(q,w)\approx Z_G(M'_2,\underline{\nu }_2)\), where
The precise statement is Theorem 2.4. In this theorem we do not use that G is regular so we believe that this statement is very useful for studying random cluster model on other essentially large girth graphs like Erdős–Rényi random graphs \(G(n,\frac{c}{n})\). This statement implies that for an essentially large girth sequence of d-regular graphs \((G_n)_n\) we have
At that point one can simply cite a theorem of Sly and Sun [23, 24] (relying on a theorem of Dembo and Montanari [12]) that shows that the aforementioned limit exists since \(\det (M'_2)>0\). Indeed, Sly and Sun [23, 24] proved that for regular graphs any 2-spin model \((N,\underline{\mu })\) having a positive determinant is equivalent to a ferromagnetic Ising model. Dembo and Montanari [12] showed that for the ferromagnetic Ising model the limit indeed exists. In these papers the main technique is an abstract interpolation method. Nevertheless, in this paper we do not rely on these papers; instead we build a little theory for “ferromagnetic” 2-spin models that builds on Lee–Yang theory [18, 26, 27] and the gauge theory of Chertkov and Chernyak [9, 10]. This approach has some additional gains. First of all, it shows that the limit exists not only for essentially large girth sequences of d-regular graphs but for Benjamini–Schramm convergent sequences of d-regular graphs (for the definition of a Benjamini–Schramm convergent graph sequence see Definition 3.27, for the precise statement see Theorem 3.29). Secondly, we prove a stability theorem that shows that if a d-regular graph contains a linear number of short cycles, that is, it contains at least \(\varepsilon v(G)\) cycles of length at most g for some fixed \(\varepsilon \) and g, then both \(Z_G(q,w)\) and \(Z_G(N,\underline{\mu })\) are exponentially larger than the number obtained for an essentially large girth sequence of d-regular graphs (for the precise statement see Theorem 3.41 and the remark after the theorem).
So the second part of the paper is an elaborate analysis of the 2-spin model \((N,\underline{\mu })\), where N is a positive definite \(2\times 2\) matrix with positive entries and \(\underline{\mu }\) is a vector in \(\mathbb {R}^2\) with positive entries. We will show that there exists a quantity \(\Phi _d(N,\underline{\mu })\) such that for every essentially large girth sequence of d-regular graphs \((G_n)_n\) we have
To analyse these models we use a strategy that can be of independent interest. The idea is that we can associate many different polynomials to the same computational problem and the zeros of one of these polynomials satisfy Lee–Yang theorem, that is, they are on a circle. Indeed, for a d-regular graph G let us introduce the following polynomials [5, 26]
and a bit more generally,
We call \(F_G(x_0,\dots ,x_d)\) and \(F_G(x_0,\dots ,x_d|z)\) the subgraph counting polynomial.
We show that if N is a \(2\times 2\) positive definite matrix, then there are vectors \(\underline{v}(t)\in \mathbb {R}^{d+1}\) for each \(t\in [0,2\pi ]\) such that \(F_G(\underline{v}(t))=Z_G(N,\underline{\mu }).\) We will use these polynomials for two different things. First we show that there exists a \(t_1\) such that the zeros of \(F_G(\underline{v}(t_1)|z)\) lie on a circle for all d-regular graphs G. This will enable us to use a standard technique about limits of root measures. More details about this plan can be found in the introduction of Sect. 3. The second application is that there is a \(t_0\) such that the first coordinate of \(\underline{v}(t_0)\) is exactly \(\Phi _{d}(N,\underline{\mu })\) and all other coordinates have a nice sign structure. This will enable us to prove the aforementioned stability theorem for graphs containing a linear number of short cycles.
Notations. Given a graph \(G=(V,E)\) we use the notation v(G) and e(G) for the number of vertices and edges, respectively. Given a set \(S\subseteq V\) let E(S) denote the edges induced by S, that is, \(\{(u,v)\in E(G)\ |\ u,v\in S\}\) and let \(e(S)=|E(S)|\). Let G[S] denote the induced subgraph with vertex set S and edge set E(S). Similarly, \(e(V-S)\) is the number of edges induced by \(V\setminus S\), and \(G-S\) denotes the subgraph induced by \(V\setminus S\).
For an \(A\subseteq E(G)\) and a \(v\in V\) let \(d_A(v)\) denote the degree of the vertex v in the graph (V, A), that is, the number of edges of A incident to v.
For an \(A\subset E(G)\) and an \(S\subseteq V(G)\) let \(A\llbracket S\rrbracket =\{(u,v)\in A\ |\ u,v\in S\}\), so these are the edges of A that are induced by S.
Given graphs H and G let \(\hom (H,G)\) denote the number of homomorphisms from H to G, that is, the number of maps \(\varphi :V(H)\rightarrow V(G)\) such that \((\varphi (u),\varphi (v))\in E(G)\) whenever \((u,v)\in E(H)\).
The notation [q] stands for the set \(\{1,2,\dots ,q\}\). We denote the scalar products of vectors \(\underline{x}\) and \(\underline{y}\) by \(\langle \underline{x},\underline{y}\rangle \).
This paper is organized as follows. In the next section we introduce the rank 1 and rank 2 approximations of \(Z_G(q,w)\) and study their basic properties. In Sect. 3 we study the rank 2 approximation or more generally, \(Z_G(N,\underline{\mu })\). We end the paper with some remarks about the case \(1<q<2\).
2 Approximations
In this section we introduce various approximations of the partition function of the random cluster model. In the sequel the rank 2 approximation will be especially important for us.
2.1 Rank 1 approximation
For motivational purposes let us assume for a moment that q is a positive integer. Then it is known that
where M is the \(q\times q\) matrix with entries \(1+w\) in the diagonal and 1’s as off-diagonal elements. It is a natural idea to approximate M with the rank 1 matrix \(M_1\) such that the sum of all entries of M and \(M_1\) are equal. In other words, let \(M_1\) be the \(q\times q\) matrix with entries \(1+\frac{w}{q}\) everywhere. Note that by the definition of \(Z_G(M_1)\) we have
Let us call the quantity
\[ Z^{(1)}_G(q,w)=q^{v(G)}\left( 1+\frac{w}{q}\right) ^{e(G)} \]
the rank 1 approximation of \(Z_G(q,w)\). This quantity makes sense even if q is positive, but not necessarily integer and we will refer to it as the rank 1 approximation of \(Z_G(q,w)\) even in this case.
Lemma 2.1
If \(q\ge 1\), then
If \(0< q\le 1\), then
Proof
Using the fact that \(k(A)\ge v(G)-|A|\) for an \(A\subseteq E(G)\) we get that for \(q\ge 1\) we have
For \(q\le 1\) we have the opposite inequality in the above computation. \(\square \)
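Spelled out, the computation in the proof can be sketched as the following chain. It assumes the definition \(Z_G(q,w)=\sum _{A\subseteq E(G)}q^{k(A)}w^{|A|}\), the value \(q^{v(G)}\left( 1+\frac{w}{q}\right) ^{e(G)}\) for the rank 1 approximation, and \(w\ge 0\) so that every term is non-negative:

\[
Z_G(q,w)=\sum _{A\subseteq E(G)}q^{k(A)}w^{|A|}\ \ge \ \sum _{A\subseteq E(G)}q^{v(G)-|A|}w^{|A|}=q^{v(G)}\sum _{A\subseteq E(G)}\left( \frac{w}{q}\right) ^{|A|}=q^{v(G)}\left( 1+\frac{w}{q}\right) ^{e(G)}.
\]

The only inequality is \(q^{k(A)}\ge q^{v(G)-|A|}\), which holds for \(q\ge 1\) and is reversed for \(0<q\le 1\); the last step is the binomial theorem applied to the e(G) edges.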
Lemma 2.1 implies that for any d–regular graph G and \(q>1\) we have
For \(q<1\) the same quantity is an upper bound. Note that the very same quantity appears as \(\Phi _{d,q,w}(0)\) in Theorem 1.1.
2.2 Rank 2 approximation
What is better than a rank 1 approximation? Naturally, a rank 2 approximation.
Again for motivational purposes let us assume for a moment that \(q\ge 2\) is an integer. This time let us approximate the matrix M with the following rank 2 matrix \(M_2\).
Then
Indeed, let \(S=\varphi ^{-1}(1)\) in the definition of \(Z_G(M_2)\). Let us introduce the quantity
The definition of \(Z^{(2)}_G(q,w)\) makes perfect sense if \(q> 1\), but not necessarily integer and we will refer to it as the rank 2 approximation of \(Z_G(q,w)\). Recall that
and note that
even if q is not an integer.
This time it is less clear that it is a natural approximation, but as it will turn out this is an asymptotically precise approximation for essentially large girth graphs if \(q\ge 2\) and \(w\ge 0\). We can prove it through a series of lemmas.
Lemma 2.2
We have
Proof
This identity is trivially true for positive integer q using the interpretation of \(Z_G(q,w)\) as the partition function of the Potts-model. Since we have polynomials on both sides we get that it is true for all q and w. \(\square \)
Lemma 2.3
For \(q\ge 2\) we have
For \(1< q\le 2\) we have
Proof
By Lemma 2.2 we have
By the definitions of \(Z^{(2)}_G(q,w)\) and \(Z^{(1)}_G(q,w)\) we have
Now the claim follows by Lemma 2.1. \(\square \)
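The comparison between \(Z_G(q,w)\) and its rank 2 approximation can be explored numerically on small graphs. The sketch below assumes that the rank 2 approximation is \(\sum _{S\subseteq V}(1+w)^{e(S)}(q-1)^{|V\setminus S|}\left( 1+\frac{w}{q-1}\right) ^{e(V-S)}\); this form is reconstructed from the decomposition given later in Remark 3.5 and should be read as an assumption, as should the helper names.

```python
from itertools import combinations


def count_components(n, edges):
    """Number of connected components of ({0..n-1}, edges), via union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = n
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1
    return components


def random_cluster_Z(n, edges, q, w):
    """Sum of q^{k(A)} w^{|A|} over all edge subsets A."""
    total = 0.0
    for size in range(len(edges) + 1):
        for A in combinations(edges, size):
            total += q ** count_components(n, A) * w ** size
    return total


def rank2_Z(n, edges, q, w):
    """Assumed rank 2 approximation: sum over vertex subsets S (encoded as bitmasks)."""
    total = 0.0
    for mask in range(1 << n):
        in_S = [(mask >> v) & 1 for v in range(n)]
        e_S = sum(1 for u, v in edges if in_S[u] and in_S[v])
        e_rest = sum(1 for u, v in edges if not in_S[u] and not in_S[v])
        rest = n - sum(in_S)
        total += (1 + w) ** e_S * (q - 1) ** rest * (1 + w / (q - 1)) ** e_rest
    return total


triangle = [(0, 1), (1, 2), (0, 2)]
K4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
print(random_cluster_Z(3, triangle, 3, 1), rank2_Z(3, triangle, 3, 1))  # 66.0 65.0
```

On the triangle with \(q=3\), \(w=1\) the two values are 66 and 65, while for \(q=2\) they coincide, which matches the picture of the rank 2 approximation as a lower bound that is exact for the Ising case.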
Now we are ready to prove that the rank 2 approximation is asymptotically precise for essentially large girth graphs if \(q\ge 2\) and \(w\ge 0\).
Theorem 2.4
Let G be a graph on n vertices with \(L=L(G,g)\) cycles of length at most \(g-1\). Let \(q\ge 2\). Then
Proof
The lower bound was already proven in Lemma 2.3. So we only need to prove the upper bound.
Given \(A\subseteq E(G)\) we can decompose A as follows. Let \(V_1,\dots ,V_r\) be the vertex sets of the connected components of the graph \(H=(V,A)\), and let \(A_1,\dots ,A_r\) be the corresponding subsets of A. If \(V_i\) is an isolated vertex, then \(A_i=\emptyset \).
Let us say that \(V_i\) is small if the induced graph \(G[V_i]\) does not contain a cycle. In particular, \(A_i\) does not contain a cycle either. Note that it is possible that \(A_i\) does not contain a cycle, but the induced graph \(G[V_i]\) contains a cycle, and so \(V_i\) is not small. Let \(\mathcal {S}_A\) denote the set of small \(V_i\)’s. We say that \(V_i\) is large if it is not small, and we denote by \(\mathcal {L}_A\) the set of large \(V_i\)’s. Note that \(|\mathcal {L}_A|\le n/g+L\) since each large connected component has size at least g or it contains a cycle of length at most \(g-1\).
Finally, let us say that a vertex set R is compatible with A if R is the union of some small \(V_i\)’s. Note that R may be the empty set. We denote this relation by \(R\sim A\). Furthermore, let \(A\llbracket R\rrbracket \) be the edges of A induced by the vertex set R. Note that if \(R\sim A\), then \(A\llbracket R\rrbracket \) is a forest. On the other hand, there is no restriction on \(A\llbracket V\setminus R\rrbracket .\) Figure 2 depicts an example for the introduced concepts.
Let \(k(R,A\llbracket R \rrbracket )\) denote the number of connected components of the graph \((R,A\llbracket R \rrbracket )\). By the binomial identity we have
Then
where in the last sum, \(D=A\llbracket R\rrbracket \) is a subset of the edges induced by R such that none of the induced connected components contains a cycle. Then
Hence
that is
\(\square \)
The following theorem is an immediate consequence of Theorem 2.4.
Theorem 2.5
Let \(q\ge 2\) and \(w\ge 0\). Let \((G_n)_n\) be an essentially large girth sequence of d-regular graphs. If the limit
exists, then the limit
exists too, and they have the same value.
3 Ferromagnetic 2-Spin Models
In this section we analyze the rank 2 approximation of the random cluster model. Since \(Z^{(2)}_G(q,w)=Z_G(M'_2,\underline{\nu }_2)\) for a \(2\times 2\) matrix \(M'_2\) we will actually prove that if N is a \(2\times 2\) positive definite matrix with positive entries and \(\underline{\mu }=(\mu _1,\mu _2)\) is a positive vector, then
exists for every essentially large girth sequence of d-regular graphs \((G_n)_n\). In fact, we will prove a much stronger theorem about Benjamini–Schramm convergent graph sequences.
The plan is to connect the quantity \(Z_G(N,\underline{\mu })\) with Lee–Yang theory. This connection is built out through the so-called subgraph counting polynomial (see [26]).
3.1 Subgraph counting polynomial
From now on we always assume that G is a d-regular graph.
Let us introduce the so-called subgraph counting polynomial
and a bit more generally,
As an example we give the subgraph counting polynomial \(F_{K_5}(x_0,x_1,x_2,x_3,x_4)\) of the complete graph \(K_5\) on 5 vertices. The first term corresponds to the empty subgraph, the last term corresponds to the graph itself.
The general plan is the following. In the next section we show that there are vectors \(\underline{v}(t)\) for each \(t\in [0,2\pi ]\) such that
We will show that there exists a \(t_1\) such that all zeros of \(F_G(\underline{v}(t_1)|z)\) lie on a circle for all d-regular graphs G. This will imply the convergence of the sequence \(\frac{1}{v(G_n)}\ln Z_{G_n}(N,\underline{\mu })\) not only for essentially large girth d-regular graphs but for all Benjamini–Schramm convergent graph sequences.
We will also show that there exists a \(t_0\) such that the first coordinate of \(\underline{v}(t_0)\) is exactly \(\Phi _d(N,\underline{\mu })\), and all other coordinates have a nice sign structure. This will enable us to show that \(Z_G(N,\underline{\mu })\ge \Phi _d(N,\underline{\mu })^{v(G)}\) for all d-regular graphs G, and if G contains a linear number of short cycles, then \(Z_G(N,\underline{\mu })\ge ((1+\delta )\Phi _d(N,\underline{\mu }))^{v(G)}\) for some \(\delta >0\).
3.2 Rank 2 matrices
Suppose that we can write an \(r\times r\) matrix N into the form \(N=\underline{a}\underline{a}^T+\underline{b}\underline{b}^T\) and let \(\underline{\mu }\in \mathbb {R}^r\). Then
where \(r_j=\sum _{k=1}^r\mu _ka_k^{d-j}b_k^{j}\). On the other hand, \(\underline{a}\) and \(\underline{b}\) are not the only vectors satisfying \(N=\underline{a}\underline{a}^T+\underline{b}\underline{b}^T\). Indeed, let us define the vectors \(\underline{a}(t)\) and \(\underline{b}(t)\) as follows:
and
Then \(N=\underline{a}(t)\underline{a}(t)^T+\underline{b}(t)\underline{b}(t)^T\). So each pair \(\underline{a}(t),\underline{b}(t)\) gives rise to a vector \(\underline{v}(t)=(r_0(t),\dots ,r_d(t))\) such that
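The invariance under the parameter t can be tested numerically. The sketch below assumes the rotation \(\underline{a}(t)=\cos (t)\underline{a}+\sin (t)\underline{b}\), \(\underline{b}(t)=-\sin (t)\underline{a}+\cos (t)\underline{b}\) (a convention consistent with the sign structure in Lemma 3.6) and the subgraph counting polynomial \(F_G(x_0,\dots ,x_d)=\sum _{A\subseteq E}\prod _{v\in V}x_{d_A(v)}\); both are assumptions of this example, and the helper names and the numerical values of \(\underline{a},\underline{b},\underline{\mu }\) are hypothetical.

```python
import math
from itertools import combinations, product


def spin_model_Z(n, edges, N, mu):
    """Brute-force partition function of the spin model (N, mu)."""
    total = 0.0
    for sigma in product(range(len(mu)), repeat=n):
        weight = 1.0
        for v in range(n):
            weight *= mu[sigma[v]]
        for u, v in edges:
            weight *= N[sigma[u]][sigma[v]]
        total += weight
    return total


def subgraph_counting_F(n, edges, x):
    """Sum over edge subsets A of prod_v x[d_A(v)] (assumed definition of F_G)."""
    total = 0.0
    for size in range(len(edges) + 1):
        for A in combinations(edges, size):
            deg = [0] * n
            for u, v in A:
                deg[u] += 1
                deg[v] += 1
            term = 1.0
            for v in range(n):
                term *= x[deg[v]]
            total += term
    return total


def v_of_t(a, b, mu, d, t):
    """The vector (r_0(t), ..., r_d(t)) with r_j = sum_k mu_k a_k(t)^(d-j) b_k(t)^j."""
    c, s = math.cos(t), math.sin(t)
    a_t = [c * ak + s * bk for ak, bk in zip(a, b)]
    b_t = [-s * ak + c * bk for ak, bk in zip(a, b)]
    return [sum(m * ak ** (d - j) * bk ** j for m, ak, bk in zip(mu, a_t, b_t))
            for j in range(d + 1)]


# Illustrative data: K_4 is 3-regular; a, b, mu are arbitrary sample values.
K4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
a, b, mu, d = [1.0, 0.8], [0.5, -0.3], [1.0, 2.0], 3
N = [[a[i] * a[j] + b[i] * b[j] for j in range(2)] for i in range(2)]
Z = spin_model_Z(4, K4, N, mu)
values = [subgraph_counting_F(4, K4, v_of_t(a, b, mu, d, t)) for t in (0.0, 0.7, 2.0)]
print(Z, values)
```

All printed values should agree up to floating point error, since expanding each edge factor \(a_ia_j+b_ib_j\) and summing over the spins vertex by vertex turns \(Z_G(N,\underline{\mu })\) into the subgraph sum.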
Remark 3.1
We can apply our argument to \(N=M'_2\), \(\underline{\mu }=\underline{\nu }_2\) with the following vectors.
One can check that \(M'_2=\underline{a}\underline{a}^T+\underline{b}\underline{b}^T\) indeed holds true. We can again introduce the vectors \(\underline{a}(t),\underline{b}(t)\) giving rise to a vector \(\underline{v}(t)=(r_0(t),\dots ,r_d(t))\) such that
In this case
In particular,
In other words, \(r_0(t)=\Phi _{d,q,w}(t)\).
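The function \(\Phi _{d,q,w}(t)\) can be explored numerically. The sketch below assumes \(\Phi _{d,q,w}(t)=r_0(t)=\sum _k\mu _ka_k(t)^d\) with the values \(a_1=a_2=\sqrt{1+\frac{w}{q}}\), \(b_1=\sqrt{\frac{(q-1)w}{q}}\), \(b_2=-\sqrt{\frac{w}{q(q-1)}}\) listed in Remark 3.5, the weights \(\underline{\nu }_2=(1,q-1)\), and \(a_k(t)=a_k\cos (t)+b_k\sin (t)\). The weights, the rotation convention, and the choice to maximize over a full period on a grid are assumptions of this example; the function names are hypothetical.

```python
import math


def phi(d, q, w, t):
    """Assumed form of Phi_{d,q,w}(t) = (a cos t + b1 sin t)^d + (q-1)(a cos t + b2 sin t)^d."""
    a = math.sqrt(1 + w / q)
    b1 = math.sqrt((q - 1) * w / q)
    b2 = -math.sqrt(w / (q * (q - 1)))
    c, s = math.cos(t), math.sin(t)
    return (a * c + b1 * s) ** d + (q - 1) * (a * c + b2 * s) ** d


def phi_max(d, q, w, grid=20000):
    """Grid approximation of the maximum of phi over one full period [0, 2*pi)."""
    return max(phi(d, q, w, 2 * math.pi * i / grid) for i in range(grid))


d, q = 3, 3
for w in (0.1, 10.0):
    rank1 = q * (1 + w / q) ** (d / 2)  # value at t = 0
    print(w, phi(d, q, w, 0.0), rank1, phi_max(d, q, w))
```

At \(t=0\) the function equals \(q\left( 1+\frac{w}{q}\right) ^{d/2}\). For small w the grid maximum coincides with this value, while for large w it is strictly larger, in line with the existence of a critical value \(w_c(d,q)\).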
3.2.1 Decompositions of \(2\times 2\) positive definite matrices
Sometimes it will be convenient to require extra conditions about the vectors \(\underline{a}\) and \(\underline{b}\) in the decomposition of \(N=\underline{a}\underline{a}^T+\underline{b}\underline{b}^T\).
Lemma 3.2
Let N be a \(2\times 2\) positive definite matrix with positive entries.
(i) Then there exists a decomposition \(N=\underline{a}\underline{a}^T+\underline{b}\underline{b}^T\) such that \(a_1,a_2,b_1>0\) and \(b_2<0\).
(ii) There is also a decomposition \(N=\underline{a}'\underline{a}'^T+\underline{b}'\underline{b}'^T\) such that \(a'_1,a'_2,b'_1,b'_2>0\).
Proof
First we prove (i). Let \(\underline{v}_1,\underline{v}_2\) be an orthonormal set of eigenvectors of N corresponding to the eigenvalues \(\lambda _1, \lambda _2>0\). If \(\lambda _1\ge \lambda _2\), then by the Perron–Frobenius theory we can assume that \(\underline{v}_1\) has positive entries. Since \(\underline{v}_1,\underline{v}_2\) are orthogonal, one of the entries of \(\underline{v}_2\) is positive, the other is negative. By considering \(-\underline{v}_2\) if necessary we can assume that the first entry is positive, the second is negative. Hence \(\underline{a}=\sqrt{\lambda _1}\underline{v}_1\) and \(\underline{b}=\sqrt{\lambda _2}\underline{v}_2\) satisfy the conditions.
Next let us prove (ii). We can assume that we have already found an \(\underline{a}\) and \(\underline{b}\) such that \(N=\underline{a}\underline{a}^T+\underline{b}\underline{b}^T\) and \(a_1,a_2,b_1>0\) and \(b_2<0\). Let
and
If we choose \(\alpha \) in such a way that \(\alpha \in \left( -\frac{\pi }{2},\frac{\pi }{2}\right) \), that is, \(\cos (\alpha )>0\) and
then \(a'_1,a'_2,b'_1,b'_2>0\). Note that \(\frac{b_2}{a_2}>-\frac{a_1}{b_1}\) since \(N_{12}=a_1a_2+b_1b_2>0\). \(\square \)
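Part (i) of the proof is constructive, and for a \(2\times 2\) matrix the eigen-decomposition can be written in closed form. The following is a minimal sketch under that reading; the function name `sign_structured_decomposition` is hypothetical, and the test matrix is the one obtained from the values of Remark 3.5 at \(q=3\), \(w=1\).

```python
import math


def sign_structured_decomposition(N):
    """For a positive definite 2x2 matrix N with positive entries, return (a, b)
    with N = a a^T + b b^T, a_1, a_2, b_1 > 0 and b_2 < 0."""
    n11, n12, n22 = N[0][0], N[0][1], N[1][1]
    tr = n11 + n22
    det = n11 * n22 - n12 * n12
    disc = math.sqrt(tr * tr - 4 * det)
    lam1, lam2 = (tr + disc) / 2, (tr - disc) / 2  # lam1 >= lam2 > 0
    # Eigenvector for lam1: (n12, lam1 - n11); both entries are positive when n12 > 0.
    norm = math.hypot(n12, lam1 - n11)
    v1 = (n12 / norm, (lam1 - n11) / norm)
    # Orthogonal unit vector with positive first and negative second entry.
    v2 = (v1[1], -v1[0])
    a = tuple(math.sqrt(lam1) * x for x in v1)
    b = tuple(math.sqrt(lam2) * x for x in v2)
    return a, b


N = [[2.0, 1.0], [1.0, 1.5]]
a, b = sign_structured_decomposition(N)
rebuilt = [[a[i] * a[j] + b[i] * b[j] for j in range(2)] for i in range(2)]
print(a, b, rebuilt)
```

Part (ii) then follows by rotating this pair by a suitable angle \(\alpha \), which is not reproduced here.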
3.2.2 The functions \(a_1(t),a_2(t),b_1(t),b_2(t)\)
In this section we introduce some functions that will appear many times in this paper.
Definition 3.3
For \(a_1,a_2,b_1,b_2\in \mathbb {R}\) let
Lemma 3.4
Suppose that for the \(2\times 2\) positive definite matrix N we have \(N=\underline{a}\underline{a}^T+\underline{b}\underline{b}^T=\hat{\underline{a}}\hat{\underline{a}}^T+\hat{\underline{b}}\hat{\underline{b}}^T\). Then there exists a t such that \(\underline{a}(t)=\hat{\underline{a}}\) and \(\underline{b}(t)=\hat{\underline{b}}\) or there exists a t such that \(\underline{a}(t)=\hat{\underline{a}}\) and \(\underline{b}(t)=-\hat{\underline{b}}\), and all vectors of those forms are solutions.
Proof
Consider the vectors \(\underline{x}=(a_1,b_1)\) and \(\underline{y}=(a_2,b_2)\). Our goal is to prove that the orthogonal group O(2) acts transitively on the pairs \(\underline{x},\underline{y}\) that solve the equation. The equation \(N=\underline{a}\underline{a}^T+\underline{b}\underline{b}^T\) is equivalent to \(N_{11}=\langle \underline{x},\underline{x}\rangle \), \(N_{12}=\langle \underline{x},\underline{y}\rangle \), \(N_{22}=\langle \underline{y},\underline{y}\rangle \). Thus we know the lengths of \(\underline{x},\underline{y}\), and from these the angle between them. Thus with an orthogonal transformation we can transform any solution to any other solution, and by applying an orthogonal transformation to a solution we always get a solution. \(\square \)
Remark 3.5
For the specific choice \(a_1=a_2=\sqrt{1+\frac{w}{q}}\), \(b_1=\sqrt{\frac{(q-1)w}{q}}\) and \(b_2=-\sqrt{\frac{w}{q(q-1)}}\) we use the notation
We collect some claims about \(a_1(t),a_2(t),b_1(t),b_2(t)\) whose proofs are straightforward computations. First we describe the sign structure of the functions \(a_1(t),a_2(t),b_1(t),b_2(t)\) on the interval \(\left[ 0,\frac{\pi }{2}\right) \).
Lemma 3.6
Let \(a_1,a_2,b_1,b_2\in \mathbb {R}\) such that \(a_1,a_2,b_1>0\) and \(b_2<0\) and \(a_1a_2+b_1b_2>0\). Let \(t\in \left[ 0,\frac{\pi }{2}\right) \). Then
(a) if \(0\le \tan (t)\le \frac{b_1}{a_1}\) we have \(a_1(t),a_2(t),b_1(t)\ge 0\) and \(b_2(t)<0\),
(b) if \(\frac{b_1}{a_1}\le \tan (t)\le \frac{a_2}{-b_2}\) we have \(a_1(t),a_2(t)\ge 0\) and \(b_1(t),b_2(t)\le 0\),
(c) if \(\frac{a_2}{-b_2}\le \tan (t)\), then \(a_1(t)>0\) and \(a_2(t),b_1(t),b_2(t)\le 0\).
Lemma 3.7
Let \(a_1,a_2,b_1,b_2\in \mathbb {R}\). Then
Lemma 3.8
Let Q be a \(2\times 2\) real matrix with non-zero determinant. Then for every \(c\in \mathbb {R}\), there is a unique \(t\in [0,\pi )\) such that \(\frac{Q_{11}\cos (t)+Q_{12}\sin (t)}{Q_{21}\cos (t)+Q_{22}\sin (t)}=c\).
Proof
Let \(F_Q(t)=\frac{Q_{11}\cos (t)+Q_{12}\sin (t)}{Q_{21}\cos (t)+Q_{22}\sin (t)}.\) We have \(\frac{\partial }{\partial t}F_Q(t)=\frac{Q_{12}Q_{21}-Q_{11}Q_{22}}{(Q_{21}\cos (t)+Q_{22}\sin (t))^2}.\) Hence \(F_Q(t)\) is either strictly monotone decreasing or strictly monotone increasing on \([0,\pi )\) depending on the sign of \(\det (Q)\) with a discontinuity at \(t_0\), where \(\tan (t_0)=-\frac{Q_{21}}{Q_{22}}\). Since \(F_Q(0)=F_Q(\pi )=\frac{Q_{11}}{Q_{21}}\), and \(\lim _{t\searrow t_0}F_Q(t)=\pm \infty \) and \(\lim _{t\nearrow t_0}F_Q(t)=\mp \infty \), the claim follows. \(\square \)
We will also use the following identities.
Lemma 3.9
For arbitrary \(a_1,a_2,b_1,b_2\in \mathbb {R}\) we have
Remark 3.10
In case of \(Z^{(2)}_G(q,w)\) we get
It is also true that
3.3 The vector \(\underline{v}(t_1)\)
In this section we show that there exists a \(t_1\) such that for all d-regular graphs G the zeros of \(F_G(\underline{v}(t_1)|z)\) lie on a circle.
3.3.1 Wagner’s subgraph counting technique.
In this section we recall a theorem of Wagner (Theorem 3.2 of [26]) about the location of zeros of \(F_G(x_0,\dots ,x_d|z)\). For any fixed \(x_0,\dots ,x_d\) let us define the following key-polynomial
Theorem 3.11
(Wagner [26]). If \(K(x_0,\dots ,x_d|z)\) has no complex zero in the open disk of radius \(\kappa \) around 0, then \(F_G(x_0,\dots ,x_d |z)\) has no complex zero in the open disk of radius \(\kappa \) around 0 for any d-regular graph G.
If \(K(x_0,\dots ,x_d|z)\) has no complex zero in the complement of a closed disk of radius \(\kappa \) around 0, then \(F_G(x_0,\dots ,x_d |z)\) has no complex zero in the complement of a closed disk of radius \(\kappa \) around 0 for any d-regular graph G.
In particular, if \(K(x_0,\dots ,x_d|z)\) has only zeros on the circle of radius \(\kappa \) around 0, then \(F_G(x_0,\dots ,x_d|z)\) has complex zeros only on the circle of radius \(\kappa \) for any d-regular graph G.
3.3.2 Key polynomials for rank 2 matrices
Suppose that we have a rank 2 matrix N of the form \(N=\underline{a}\underline{a}^T+\underline{b}\underline{b}^T\in \mathbb {R}^{2\times 2}\) and a \(\underline{\mu }\in \mathbb {R}^{2}\). Then we know that \(F_G(\underline{v}(t))=Z_G(N,\underline{\mu })\), where \(\underline{v}(t)=(r_0(t),\dots ,r_d(t))\) for any \(t\in [0,2\pi )\).
Lemma 3.12
Let \(\underline{a},\underline{b},\underline{\mu }\in \mathbb {R}^{r}\). For \(k=1,\dots ,r\) let
and for \(j=0,\dots ,d\) let \(r_j(t)=\sum _{k=1}^r\mu _ka_k(t)^{d-j}b_k(t)^j\). Finally, let \(\underline{v}(t)=(r_0(t),\dots ,r_d(t))\) and
Then
Proof
By definition we have
\(\square \)
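For orientation, the computation behind Lemma 3.12 can be sketched as follows. It assumes that the key-polynomial has the form \(K(x_0,\dots ,x_d|z)=\sum _{j=0}^d\binom{d}{j}x_jz^j\); this form is an assumption of the sketch, chosen to be consistent with the factorization used in the proof of Lemma 3.13.

\[
K(\underline{v}(t)|z)=\sum _{j=0}^d\binom{d}{j}r_j(t)z^j=\sum _{k=1}^r\mu _k\sum _{j=0}^d\binom{d}{j}a_k(t)^{d-j}\left( b_k(t)z\right) ^j=\sum _{k=1}^r\mu _k\left( a_k(t)+b_k(t)z\right) ^d.
\]

The middle step substitutes \(r_j(t)=\sum _{k=1}^r\mu _ka_k(t)^{d-j}b_k(t)^j\) and exchanges the two sums; the last step is the binomial theorem.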
Lemma 3.13
Let \(\mu _1,\mu _2\in \mathbb {R}\) and \(a_1,a_2,b_1,b_2\in \mathbb {R}\) with \(\mu _1,\mu _2> 0\). Then all the complex zeros of \(K(\underline{v}(t)|z)\) are on a circle or on a line.
Moreover, if \(t_1\) satisfies
then the circle has center at 0. Furthermore, the radius of this circle is
Proof
From Lemma 3.12 we have that
Let us assume that \(K(\underline{v}(t)|\zeta )=0\).
If \(a_1(t)+b_1(t)\zeta =a_2(t)+b_2(t)\zeta =0\), then \(\zeta \) is the only zero of \(K(\underline{v}(t)|z)\) with multiplicity d, thus all the complex zeros are on a circle of radius \(|\zeta |=\left| \frac{a_1(t)a_2(t)}{b_1(t)b_2(t)}\right| ^{1/2}\) with center at 0.
If \(a_1(t)+b_1(t)\zeta \) or \(a_2(t)+b_2(t)\zeta \) is not 0, then by symmetry we can assume that \(a_2(t)+b_2(t)\zeta \ne 0\), and we get that
where \(M_t(z)=\frac{a_1(t)+b_1(t)z}{a_2(t)+b_2(t)z}\) is a Möbius transformation with real coefficients. Let us introduce the notation \(T=\left( \frac{\mu _2}{\mu _1}\right) ^{2/d}\). Thus we obtained that for any zero \(\zeta \) of \(K(\underline{v}(t)|z)\) we have
Since \(M_t(z)\) is a Möbius transformation, \(M_t^{(-1)}(z)\) maps circles into circles and lines, i.e. \(\zeta \in M_{t}^{(-1)}(S_{\sqrt{T}} )\), where \(S_c\) is the circle of radius c around 0.
In order to prove the second part of the statement we have to investigate when the circle \(M_{t_1}^{(-1)}(S_{\sqrt{T}})\) has its center at 0. Since \(M_t(z)\) is a Möbius transformation with real coefficients, \(M_t^{(-1)}(z)\) is also a Möbius transformation with real coefficients. This means that the image of a circle that is perpendicular to the real line is also perpendicular to the real line. We claim that \(M_{t_1}^{(-1)}(S_{\sqrt{T}})\) is not a line. To see this, it is enough to show that \(M^{(-1)}_{t_1}(\pm \sqrt{T})\) is not \(\infty \), or equivalently \(M_{t_1}(\infty )\ne \pm \sqrt{T}\). If equality held, then \(M_{t_1}(\infty )=\frac{b_1(t_1)}{b_2(t_1)}=\pm \sqrt{T}\) would imply that \(a_1(t_1)+b_1(t_1)z\) and \(a_2(t_1)+b_2(t_1)z\) have a common zero, which leads to a contradiction.
Thus the center of \(M^{(-1)}_{t_1}(S_{\sqrt{T}} )\) is at
This is 0 if and only if
This is equivalent to
To find the corresponding radius we have to calculate \(\left| M_{t_1}^{(-1)}(\sqrt{T})\right| \).
This implies that \(R_c=\sqrt{T}\left| \frac{a_2(t_1)}{b_1(t_1)}\right| \). Thus by equation \(T= \frac{a_1(t_1)b_1(t_1)}{a_2(t_1)b_2(t_1)}\) we also have \(R_c=T^{-1/2}\left| \frac{a_1(t_1)}{b_2(t_1)}\right| \), and by multiplying the two equations we get that \(R_c^2=\left| \frac{a_1(t_1)a_2(t_1)}{b_1(t_1)b_2(t_1)}\right| \). \(\square \)
Lemma 3.14
Let \(a_1,a_2,b_1,b_2,\mu _1,\mu _2\in \mathbb {R}\) such that \(a_1,a_2,b_1,\mu _1,\mu _2>0\) and \(b_2<0\) and \(a_1a_2+b_1b_2>0\), then there is a unique \(t_1\in \left[ 0,\frac{\pi }{2}\right] \) such that \(\frac{b_1}{a_1}<\tan (t_1)<\frac{a_2}{-b_2}\) and
For such a \(t_1\) we have \(a_1(t_1),a_2(t_1)>0\) and \(b_1(t_1),b_2(t_1)<0\) implying that \(\frac{a_1(t_1)a_2(t_1)}{b_1(t_1)b_2(t_1)}>0\).
Proof
Note that for \(t\in \left[ 0,\frac{\pi }{2}\right] \) the function \(\frac{a_1(t)b_1(t)}{a_2(t)b_2(t)}\) is positive only if \(a_1(t),a_2(t)>0\) and \(b_1(t),b_2(t)<0\), that is, \(\frac{b_1}{a_1}<\tan (t)<\frac{a_2}{-b_2}\). When \(t\rightarrow \arctan \left( \frac{b_1}{a_1}\right) \), then \(b_1(t)\rightarrow 0\), and so \(\frac{a_1(t)b_1(t)}{a_2(t)b_2(t)}\rightarrow 0\). If \(t\rightarrow \arctan \left( \frac{a_2}{-b_2}\right) \), then \(a_2(t)\rightarrow 0\), and so \(\frac{a_1(t)b_1(t)}{a_2(t)b_2(t)}\rightarrow \infty \). Since
the function is strictly monotone increasing, hence there is a unique \(t_1\) satisfying \(\frac{a_1(t_1)b_1(t_1)}{a_2(t_1)b_2(t_1)}=\left( \frac{\mu _2}{\mu _1}\right) ^{2/d}\). \(\square \)
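The monotonicity argument of Lemma 3.14 suggests a simple bisection for locating \(t_1\) numerically. The following Python sketch is only an illustration under stated assumptions: it assumes the rotation \(a_k(t)=a_k\cos t+b_k\sin t\), \(b_k(t)=-a_k\sin t+b_k\cos t\) (consistent with the limits used in the proof above, but not restated in this section), and the function names `rotate`, `ratio`, `find_t1` and the numerical parameters are hypothetical choices.

```python
import math

def rotate(a, b, t):
    """Return (a(t), b(t)) for the assumed rotation of the pair (a, b) by angle t."""
    return (a * math.cos(t) + b * math.sin(t),
            -a * math.sin(t) + b * math.cos(t))

def ratio(a1, a2, b1, b2, t):
    """The function a_1(t) b_1(t) / (a_2(t) b_2(t)) studied in Lemma 3.14."""
    a1t, b1t = rotate(a1, b1, t)
    a2t, b2t = rotate(a2, b2, t)
    return (a1t * b1t) / (a2t * b2t)

def find_t1(a1, a2, b1, b2, mu1, mu2, d, iterations=100):
    """Bisection for ratio(t) = (mu2/mu1)^(2/d) on (arctan(b1/a1), arctan(a2/(-b2)))."""
    target = (mu2 / mu1) ** (2.0 / d)
    lo = math.atan(b1 / a1)        # ratio tends to 0 here
    hi = math.atan(a2 / (-b2))     # ratio tends to infinity here
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if ratio(a1, a2, b1, b2, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Illustrative parameters satisfying a1, a2, b1 > 0, b2 < 0 and a1*a2 + b1*b2 > 0.
a1 = a2 = math.sqrt(1 + 1 / 3)
b1 = math.sqrt(2 / 3)
b2 = -math.sqrt(1 / 6)
mu1, mu2, d = 1.0, 2.0, 3

t1 = find_t1(a1, a2, b1, b2, mu1, mu2, d)
T = (mu2 / mu1) ** (2.0 / d)
a1t, b1t = rotate(a1, b1, t1)
a2t, b2t = rotate(a2, b2, t1)
# Two expressions for the radius from the proof of Lemma 3.13; they should agree.
radius_first = math.sqrt(T) * abs(a2t / b1t)
radius_second = abs(a1t / b2t) / math.sqrt(T)
```

With these inputs the two radius expressions agree up to rounding, which reflects the identity \(T=\frac{a_1(t_1)b_1(t_1)}{a_2(t_1)b_2(t_1)}\).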
Theorem 3.15
Let N be a \(2\times 2\) positive definite matrix with positive entries and let \(\underline{\mu }\in \mathbb {R}^2_{>0}\). Then there exists a \(\underline{v}_c\in \mathbb {R}^{d+1}\) and an \(R_c(N,\underline{\mu })\in \mathbb {R}_{>0}\) such that for any d-regular graph G we have \(Z_G(N,\underline{\mu })=F_G(\underline{v}_c)\) and all complex zeros of \(F_G(\underline{v}_c|z)\) lie on a circle around 0 of radius \(R_c(N,\underline{\mu })\). Moreover, \(R=R_c(N,\mu )\) is a positive real solution of
where \(T=\left( \frac{\mu _2}{\mu _1}\right) ^{2/d}\).
Proof
The first part of the claim follows from combining Lemma 3.2, 3.14, 3.13 and Theorem 3.11. Indeed, by Lemma 3.2 we know that there exists a decomposition \(N=\underline{a}\underline{a}^T+\underline{b}\underline{b}^T\) such that \(a_1,a_2,b_1>0\) and \(b_2<0\). Then Lemma 3.14 implies that there exists a \(t_1\) such that \(\frac{a_1(t_1)b_1(t_1)}{a_2(t_1)b_2(t_1)}=\left( \frac{\mu _2}{\mu _1}\right) ^{2/d}\). Then Lemma 3.13 shows that the zeros of \(K(\underline{v}(t_1)|z)\) lie on a circle that has center at 0. Then Theorem 3.11 implies that all complex zeros of \(F_G(\underline{v}(t_1)|z)\) lie on a circle around 0 for any d-regular graph G. Thus \(\underline{v}_c=\underline{v}(t_1)\) satisfies the conditions of the theorem.
To prove the statement concerning the radius of the circle note that by Lemma 3.9 and 3.13 we have \(N_{11}=a_1(t_1)^2+b_1(t_1)^2\), \(N_{22}=a_2(t_1)^2+b_2(t_1)^2\), \(N_{12}=a_1(t_1)a_2(t_1)+b_1(t_1)b_2(t_1)\), \(T=\frac{a_1(t_1)b_1(t_1)}{a_2(t_1)b_2(t_1)}\) and \(R^2=\frac{a_1(t_1)a_2(t_1)}{b_1(t_1)b_2(t_1)}\). Let us introduce the notations \(\overline{a}_1=a_1(t_1)\), \(\overline{a}_2=a_2(t_1)\), \(\overline{b}_1=b_1(t_1)\) and \(\overline{b}_2=b_2(t_1)\). Then we get that
Now one can see that everything cancels, and this is indeed 0. \(\square \)
3.4 Random regular graphs and Bethe approximation
In this section we recall some results of Dembo et al. [13] on Bethe approximation. We introduce a quantity \(\Phi _d(N,\underline{\mu })\) such that if G is a random d-regular graph on n vertices, then \(\mathbb {E}Z_G(N,\underline{\mu })=n^{O(1)}\Phi _{d}(N,\underline{\mu })^n\). As a consequence of a theorem of Ruozzi we will also get that \(Z_G(N,\underline{\mu })\ge \Phi _{d}(N,\underline{\mu })^{v(G)}\) for a d-regular graph G if N is a \(2\times 2\) positive definite matrix.
In general, let \(N\in \mathbb {R}^{r\times r}_{>0}\) be a symmetric matrix and \(\underline{\mu }\in \mathbb {R}^r_{>0}\). Let \(B_{N,\underline{\mu }}\) be a symmetric distribution on \([r]^2\). Let \(b_{N,\underline{\mu }}\) be the marginal of \(B_{N,\underline{\mu }}\) to its first coordinate. Let us define
where for a probability distribution \(P=(p_1,\dots ,p_n)\) and a vector \(f=(f_1,\dots ,f_n)\) we have
with the usual convention \(0\cdot \ln \frac{1}{0}=0\). The subscript \(\hom \) in \(\mathbb {F}_{\hom }\) simply stands for homomorphism. Let
The quantity \(\Phi _d(N,\underline{\mu })\) also has a description through belief propagation equation, see Proposition 14.6 of [20] or Section 1.2 of [13]. Let \(h\in \mathbb {R}^r\) be a probability distribution. The belief propagation equation or Bethe recursion is
for all \(\sigma \in [r]\), where \(z_h\) is the normalizing constant ensuring that \(\textrm{BP}(h)\) is a probability distribution too. Let \(\mathcal {H}^*\) be the set of probability distributions for which \(\textrm{BP}(h)=h\). The Bethe functional is defined as
Then
Dembo et al. [13] showed that the quantity \(\Phi _d(N,\underline{\mu })\) is directly related to the expected value of \(Z_G(N,\underline{\mu })\) for random d-regular graphs.
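The belief propagation recursion and the Bethe functional can be sketched in a few lines of Python. The displayed formulas are not reproduced above, so the snippet assumes the standard forms for d-regular graphs, namely \(\textrm{BP}(h)(\sigma )=\frac{1}{z_h}\mu _{\sigma }\big (\sum _{\tau }N_{\sigma \tau }h_{\tau }\big )^{d-1}\) and the value \(\sum _{\sigma }\mu _{\sigma }\big (\sum _{\tau }N_{\sigma \tau }h_{\tau }\big )^{d}\big /\big (\sum _{\sigma ,\tau }N_{\sigma \tau }h_{\sigma }h_{\tau }\big )^{d/2}\); the function names are hypothetical.

```python
def bp_step(N, mu, h, d):
    """One application of the (assumed) Bethe recursion on a d-regular tree."""
    r = len(mu)
    raw = [mu[s] * sum(N[s][t] * h[t] for t in range(r)) ** (d - 1) for s in range(r)]
    z = sum(raw)  # normalizing constant z_h
    return [x / z for x in raw]

def bethe_value(N, mu, h, d):
    """Exponential of the (assumed) Bethe functional evaluated at h."""
    r = len(mu)
    field = [sum(N[s][t] * h[t] for t in range(r)) for s in range(r)]
    numerator = sum(mu[s] * field[s] ** d for s in range(r))
    edge_term = sum(N[s][t] * h[s] * h[t] for s in range(r) for t in range(r))
    return numerator / edge_term ** (d / 2)

def iterate_bp(N, mu, d, h0, steps=500):
    """Iterate bp_step from h0; convergence is not guaranteed in general."""
    h = list(h0)
    for _ in range(steps):
        h = bp_step(N, mu, h, d)
    return h

# Illustrative symmetric two-spin example.
w, d = 0.5, 3
N = [[1 + w, 1.0], [1.0, 1 + w]]
mu = [1.0, 1.0]
h_fixed = iterate_bp(N, mu, d, [0.6, 0.4])
phi_uniform = bethe_value(N, mu, [0.5, 0.5], d)
```

In this symmetric example the iteration settles at the uniform distribution, where the value equals \(2\left( 1+\frac{w}{2}\right) ^{d/2}\).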
Theorem 3.16
(Dembo et al. [13]). Let G be a random d-regular graph on n vertices. Then
3.4.1 A theorem of Ruozzi.
In this section we show that if N is a \(2\times 2\) positive definite matrix with positive entries and \(\underline{\mu }\) is a positive vector, then for any d-regular graph G we have \(Z_G(N,\underline{\mu })\ge \Phi _{d}(N,\underline{\mu })^{v(G)}\). First we recall the setting of factor graphs.
Definition 3.17
A factor graph \(\mathcal {G}=(F,V,E,\mathcal {X},(g_a)_{a\in F})\) is a bipartite graph equipped with a set of functions. Its vertex set is \(F\cup V\), where F is the set of function nodes, and V is the set of variable nodes. The edge set of \(\mathcal {G}\) will be denoted by \(E(\mathcal {G})\). The neighbors of a function node a or variable node v will be denoted by \(\partial a\) or \(\partial v\), respectively. For each variable node v we associate a variable \(x_v\) taking its values from the alphabet \(\mathcal {X}\). For each a there is an associated function \(g_a: \mathcal {X}^{\partial a}\rightarrow \mathbb {R}_{\ge 0}\). The partition function of the factor graph \(\mathcal {G}\) is \(Z(\mathcal {G})=\sum _{\underline{x}\in \mathcal {X}^V}\prod _{a\in F}g_a(\underline{x}_{\partial a}),\)
where \(\underline{x}_{\partial a}\) is the restriction of \(\underline{x}\) to the set \(\partial a\).
When \(\mathcal {X}=\{0,1\}\) we speak about a binary factor graph.
Let us consider an example.
Example 3.18
Suppose that \(G=(V,E)\) is an (ordinary) graph. We can associate a factor graph \(\mathcal {G}\) as follows. For each \(v\in V\) we introduce a variable node v and a function node \(v'\), and for each edge \(e=(u,v)\) we introduce a function node e. In \(\mathcal {G}\) let us connect v with \(v'\) and \(e=(u,v)\) with u and v. Set \(\mathcal {X}=[r]\). Let N be an \(r\times r\) matrix and \(\underline{\mu }\in \mathbb {R}^r\). For each function node \(v'\) we introduce the function \(g_{v'}(x)=\mu _x\) and for each edge e we introduce the function \(g_e(x,y)=N_{x,y}\). Then \(Z(\mathcal {G})=Z_G(N,\underline{\mu })\).
The middle picture of Fig. 3 depicts the factor graph \(\mathcal {G}\) for the diamond graph G.
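For small graphs the partition function of Example 3.18 can be evaluated by brute force, which is a convenient sanity check. The Python sketch below sums \(\prod _{v}\mu _{\sigma (v)}\prod _{(u,v)\in E}N_{\sigma (u),\sigma (v)}\) over all spin configurations; the function name and the vertex labelling of the diamond graph are illustrative assumptions.

```python
from itertools import product

def spin_partition_function(n, edges, N, mu):
    """Brute-force Z_G(N, mu) for a graph on vertices 0..n-1 (exponential in n)."""
    r = len(mu)
    total = 0.0
    for sigma in product(range(r), repeat=n):
        weight = 1.0
        for v in range(n):
            weight *= mu[sigma[v]]              # vertex factors g_{v'}
        for u, v in edges:
            weight *= N[sigma[u]][sigma[v]]     # edge factors g_e
        total += weight
    return total

# Diamond graph: K_4 with one edge removed (labelling chosen for illustration).
DIAMOND = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]

ones = [[1.0, 1.0], [1.0, 1.0]]
z_trivial = spin_partition_function(4, DIAMOND, ones, [1.0, 1.0])

# Potts matrix N = J_q + w I_q with q = 2, w = 1 on a single edge.
potts = [[2.0, 1.0], [1.0, 2.0]]
z_edge = spin_partition_function(2, [(0, 1)], potts, [1.0, 1.0])
```

For the single edge the result agrees with the random cluster expression \(q^2+qw\) for \(q=2\), \(w=1\).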
Example 3.19
Let \(G=(V,E)\) be a graph. Recall that
For \((a_0,\dots ,a_d)\in \mathbb {R}^{d+1}\) let us consider the following factor graph \(\mathcal {G}'=(F',V',E',\mathcal {X}',(g'_a)_{a\in F'}).\) We subdivide each edge of E with one vertex. In the resulting bipartite graph one side corresponds to \(F'\), the other side corresponds to \(V'\). So with a slight abuse of notation we have \(F'=V\) and \(V'=E\). Let \(\mathcal {X}'=\{0,1\}\). For each \(v\in V\) let us introduce the function
where \(|x|=x_{e_1}+\dots +x_{e_{d_v}}\), and \(e_1,\dots ,e_{d_v}\) are the edges incident to v. Then
As we can see \(\mathcal {G}'\) is in some sense the dual of \(\mathcal {G}\). The picture on the right hand side of Fig. 3 depicts the factor graph \(\mathcal {G}'\) for the diamond graph G.
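The relation between the two factor graphs can be tested numerically on a small regular graph. The Python sketch below assumes that \(F_G(a_0,\dots ,a_d)=\sum _{A\subseteq E}\prod _{v\in V}a_{d_A(v)}\), that \(N=\underline{a}\underline{a}^T+\underline{b}\underline{b}^T\), and that \(\underline{v}(t)=(r_0(t),\dots ,r_d(t))\) with \(r_j(t)=\sum _k\mu _ka_k(t)^{d-j}b_k(t)^j\) for the rotation \(a_k(t)=a_k\cos t+b_k\sin t\), \(b_k(t)=-a_k\sin t+b_k\cos t\); these definitions are taken from the surrounding sections rather than restated here, and all function names are hypothetical.

```python
import math
from itertools import product

K4_EDGES = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]  # a 3-regular graph

def spin_z(n, edges, N, mu):
    """Brute-force spin partition function Z_G(N, mu)."""
    total = 0.0
    for sigma in product(range(len(mu)), repeat=n):
        weight = 1.0
        for v in range(n):
            weight *= mu[sigma[v]]
        for u, v in edges:
            weight *= N[sigma[u]][sigma[v]]
        total += weight
    return total

def subgraph_f(n, edges, vec):
    """Assumed form of F_G: sum over edge subsets A of prod_v vec[d_A(v)]."""
    total = 0.0
    for mask in range(1 << len(edges)):
        degree = [0] * n
        for i, (x, y) in enumerate(edges):
            if (mask >> i) & 1:
                degree[x] += 1
                degree[y] += 1
        term = 1.0
        for k in degree:
            term *= vec[k]
        total += term
    return total

def gauge_vector(a, b, mu, d, t):
    """The vector (r_0(t), ..., r_d(t)) for the assumed rotation by angle t."""
    at = [ak * math.cos(t) + bk * math.sin(t) for ak, bk in zip(a, b)]
    bt = [-ak * math.sin(t) + bk * math.cos(t) for ak, bk in zip(a, b)]
    return [sum(m * x ** (d - j) * y ** j for m, x, y in zip(mu, at, bt))
            for j in range(d + 1)]

# Illustrative data: N = a a^T + b b^T.
a, b, mu, d = [1.2, 0.9], [0.5, -0.4], [1.0, 0.7], 3
N = [[a[i] * a[j] + b[i] * b[j] for j in range(2)] for i in range(2)]
z_spin = spin_z(4, K4_EDGES, N, mu)
z_subgraph = [subgraph_f(4, K4_EDGES, gauge_vector(a, b, mu, d, t))
              for t in (0.0, 0.3, 1.1)]
```

Under these assumptions every entry of `z_subgraph` coincides with `z_spin`, since expanding each edge factor \(a_{\sigma (u)}a_{\sigma (v)}+b_{\sigma (u)}b_{\sigma (v)}\) produces exactly the sum over edge subsets, and the rotation leaves N unchanged.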
Next we need the concept of the Bethe approximation for factor graphs. First we need to introduce the pseudo-marginal polytope.
Definition 3.20
For each variable node v let us introduce a probability distribution \(b_v\) on \(\mathcal {X}\), and for each function node a let us also introduce a probability distribution \(b_a\) on \(\mathcal {X}^{\partial a}\):
and
Furthermore, \(b_v\) and \(b_a\) have to be consistent in the following sense: for all \(c\in \mathcal {X},\ a\in F, v\in \partial a\) we have
We will call a \(\underline{b}=((b_v)_{v\in V},(b_a)_{a\in F})\) a locally consistent set of marginals or simply pseudo-marginal. The set of such \(\underline{b}\) will be denoted by \(\textrm{Mar}(\mathcal {G})\).
Definition 3.21
The Bethe partition function \(Z_B(\mathcal {G})\) is defined as follows. Let \(\mathbb {F}\) be the following function evaluated on a \(\underline{b} \in \textrm{Mar}(\mathcal {G})\):
The notation \(\mathbb {F}\) is consistent with our previous notation \(\mathbb {F}_{\hom }\) as it will be explained later. Finally, let
and
Here \(H_B(\mathcal {G})\) is the Bethe free entropy, and \(Z_B(\mathcal {G})\) is the Bethe partition function. We note that if \(g_a(\underline{x})=0\), then we require \(b_a(\underline{x})=0\) and use the convention \(0\cdot \ln \frac{0}{0}=0\).
Example 3.22
By continuing Examples 3.18 and 3.19 we can consider the Bethe partition functions of \(\mathcal {G}\) and \(\mathcal {G}'\). For \(\mathcal {G}\) we will denote it by \(Z_G^B(N,\underline{\mu })\).
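Recall that a function \(g:\{0,1\}^k\rightarrow \mathbb {R}_{\ge 0}\) is log-supermodular if for all \(\underline{x},\underline{y}\in \{0,1\}^k\) we have \(g(\underline{x})g(\underline{y})\le g(\underline{x}\wedge \underline{y})g(\underline{x}\vee \underline{y}),\)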
where \(\underline{x}\wedge \underline{y},\underline{x}\vee \underline{y}\in \{0,1\}^k\) such that \((\underline{x}\wedge \underline{y})_i=\min (x_i,y_i)\) and \((\underline{x}\vee \underline{y})_i=\max (x_i,y_i)\) for \(i\in [k]\).
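Log-supermodularity on \(\{0,1\}^k\) can be checked exhaustively for small k. The following Python sketch tests the inequality \(g(\underline{x})g(\underline{y})\le g(\underline{x}\wedge \underline{y})g(\underline{x}\vee \underline{y})\) over all pairs; the function name and the two example matrices are illustrative assumptions.

```python
from itertools import product

def is_log_supermodular(g, k, tol=1e-12):
    """Check g(x) g(y) <= g(x ^ y) g(x v y) for all x, y in {0,1}^k."""
    points = list(product((0, 1), repeat=k))
    for x in points:
        for y in points:
            meet = tuple(min(xi, yi) for xi, yi in zip(x, y))
            join = tuple(max(xi, yi) for xi, yi in zip(x, y))
            if g(x) * g(y) > g(meet) * g(join) + tol:
                return False
    return True

# Edge factor g_e(x, y) = N[x][y] of a two-spin model (spins relabelled 0, 1).
ferro = [[2.0, 1.0], [1.0, 2.0]]      # N11 * N22 >= N12 * N21
antiferro = [[1.0, 2.0], [2.0, 1.0]]  # N11 * N22 <  N12 * N21

ferro_ok = is_log_supermodular(lambda x: ferro[x[0]][x[1]], 2)
antiferro_ok = is_log_supermodular(lambda x: antiferro[x[0]][x[1]], 2)
```

For a function of two binary variables the only non-trivial instance is \(\underline{x}=(0,1)\), \(\underline{y}=(1,0)\), which gives the determinant-type condition on the \(2\times 2\) matrix.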
Theorem 3.23
(Ruozzi [21]). Let \(\mathcal {G}=(F,V,E,\mathcal {X},(g_a)_{a\in F})\) be a factor graph with \(\mathcal {X}=\{0,1\}\) such that for all \(a\in F\) the functions \(g_a\) are log-supermodular. Then \(Z(\mathcal {G})\ge Z_B(\mathcal {G})\).
Lemma 3.24
For an \(r \times r\) matrix N and \(\underline{\mu }\in \mathbb {R}^r_{\ge 0}\), and for a d-regular graph G we have \(Z_G^B(N,\underline{\mu })\ge \Phi _d(N,\underline{\mu })^{v(G)}\).
Proof
By using the same probability distribution \(B_{N,\underline{\mu }}\) everywhere in the definition of \(Z_G^B(N,\underline{\mu })\) the consistency of marginals is immediately satisfied. Then the function \(\mathbb {F}\) on this pseudo-marginal simplifies to \(\mathbb {F}_{\hom }(B_{N,\underline{\mu }})\), and we get that \(Z_G^B(N,\underline{\mu })\ge \Phi _d(N,\underline{\mu })^{v(G)}\). \(\square \)
Theorem 3.25
For a \(2\times 2\) positive definite matrix N with positive entries and \(\underline{\mu }\in \mathbb {R}^2_{\ge 0}\), and a d-regular graph G we have \(Z_G(N,\underline{\mu })\ge \Phi _d(N,\underline{\mu })^{v(G)}\).
Proof
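Note that the log-supermodularity of \(g_a\) in the case of the factor graph \(\mathcal {G}\) in Example 3.18 simply means that \(N_{11}N_{22}\ge N_{12}N_{21}\), which is satisfied as N is positive definite. Hence by combining Ruozzi’s theorem with the previous lemma we get that \(Z_G(N,\underline{\mu })=Z(\mathcal {G})\ge Z_B(\mathcal {G})=Z_G^B(N,\underline{\mu })\ge \Phi _d(N,\underline{\mu })^{v(G)}\). \(\square \)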
Remark 3.26
For Theorem 3.25 we will give a new proof in Sect. 3.7 that implies a slightly stronger statement. Namely, if G is a d-regular graph such that for some g and \(\varepsilon >0\) the graph G contains at least \(\varepsilon v(G)\) cycles of length g, then \(Z_G(N,\underline{\mu })>((1+\delta )\Phi _d(N,\underline{\mu }))^{v(G)}\) for some \(\delta =\delta (d,N,\underline{\mu },g,\varepsilon )>0\).
3.5 Convergence of \(Z_{G_n}(N,\underline{\mu })\)
In this section we prove that if N is a \(2\times 2\) positive definite matrix with positive entries and \(\underline{\mu }\) is a positive vector, then \(\frac{1}{v(G_n)}\ln Z_{G_n}(N,\underline{\mu })\) is convergent for an essentially large girth sequence of d-regular graphs \((G_n)_n\). In fact, we will prove a stronger statement. Namely, we prove the convergence of the sequence \(\frac{1}{v(G_n)}\ln Z_{G_n}(N,\underline{\mu })\) for any Benjamini–Schramm convergent graph sequence \((G_n)_n\) of regular graphs.
Definition 3.27
For a finite graph G, a finite connected rooted graph \(\alpha \) and a positive integer r, let \(\mathbb {P}(G,\alpha ,r)\) be the probability that the r-ball centered at a uniform random vertex of G is isomorphic to \(\alpha \).
We say that a bounded-degree graph sequence \((G_n)_n\) is Benjamini–Schramm convergent if for all finite rooted graphs \(\alpha \) and \(r>0\), the probabilities \(\mathbb {P}(G_n,\alpha ,r)\) converge.
Benjamini–Schramm convergence is also called local convergence as it primarily grasps the local structure of the graphs \((G_n)_n\).
Given a vector \(\underline{a}\in \mathbb {R}^{d+1}\) and a d-regular graph G on n vertices let \(\lambda _1(G),\dots ,\lambda _{nd}(G)\) be the zeros of the polynomial \(F_G(\underline{a}|z)\). Let us define the probability measure \(\rho _{G,\underline{a}}\) on \(\mathbb {C}\) as follows:
where \(\delta _{\lambda }\) is the Dirac-measure on the number \(\lambda \).
Lemma 3.28
(a) For any integer \(k\ge 0\), a vector \(\underline{a}\in \mathbb {R}^{d+1}\) and a Benjamini–Schramm convergent sequence of d-regular graphs \((G_n)_n\) the sequence
is convergent.
(b) Let \(\underline{v}_c\in \mathbb {R}^{d+1}\) be such that the zeros of \(F_G(\underline{v}_c|z)\) lie on a circle of radius \(R_c\) for every d-regular graph G. If \((G_n)_n\) is a Benjamini–Schramm convergent sequence of d-regular graphs, then the sequence of measures \(\rho _{G_n,\underline{v}_c}\) converges weakly.
Proof
Part (a) is a special case of a much more general theorem claiming that
can be expressed as \(\frac{1}{v(G)}\sum _Hc_{H,k}\hom (H,G)\) for a fixed finite set of connected graphs H, and the fact that a sequence of bounded degree graphs \((G_n)_n\) is Benjamini–Schramm convergent if and only if for all connected graphs H the sequence \(\frac{\hom (H,G_n)}{v(G_n)}\) is convergent. For details see the paper of Csikvári and Frenkel [11].
Part (a) implies part (b) for the following reasons. The weak convergence of measures \(\rho _n\) on \(\mathbb {C}\) is equivalent to the convergence of \(\int z^k\overline{z}^{\ell }d\rho _n(z)\) for all integers \(k,\ell \ge 0\). But if the measures \(\rho _n\) are supported on a fixed circle and are symmetric with respect to the real line, then this is equivalent to the convergence of \(\int z^md\rho _n(z)\) for all positive integers m. \(\square \)
Theorem 3.29
For any Benjamini–Schramm convergent sequence of d-regular graphs \((G_n)_n\) the sequence
is convergent.
Proof
By Theorem 3.15 there exists a \(\underline{v}_c\in \mathbb {R}^{d+1}\) such that for any d-regular graph G we have \(Z_G(N,\underline{\mu })=F_G(\underline{v}_c)\) and all zeros of \(F_G(\underline{v}_c|z)\) lie on a circle of radius \(R_c(N,\underline{\mu })\). First suppose that \(R_c=R_c(N,\underline{\mu })\ne 1\). We have
The measures \(\rho _{G_n,\underline{v}_c}\) are supported on a circle of radius \(R_c\ne 1\), thus \(\ln |z-1|\) is a continuous function on a region that contains the circle but avoids an open neighborhood of \(z=1\). Since the measures \(\rho _{G_n,\underline{v}_c}\) are weakly convergent we get that the sequence \(\frac{1}{v(G_n)}\ln Z_{G_n}(N,\underline{\mu })\) is convergent.
Next we show that the limit exists even if \(R_c(N,\underline{\mu })=1\). Let \(\Phi _L(N,\underline{\mu })\) be defined by
if \(R_c(N,\underline{\mu })\ne 1\). Here L in \(\Phi _L(N,\underline{\mu })\) simply stands for the word limit. We show that \(\Phi _L(N,\underline{\mu })\) is a monotone increasing continuous function of \(\mu _1\). Indeed, if \(\mu '_1<\mu _1\), then
So if \(R_c(N,(\mu '_1,\mu _2))\ne 1\), then
Note that if \(R_c(N,(\mu _1,\mu _2))=1\), then
where \(T=\left( \frac{\mu _2}{\mu _1}\right) ^{2/d}\). For fixed N and \(\mu _2\) there are at most two \(\mu _1\) such that this equation is satisfied. Thus for such a \(\mu _1\) we can define
and we get that
\(\square \)
Next we need a result about the number of short cycles in random regular graphs. There are many such results in the literature; we choose one.
Lemma 3.30
(McKay et al. [19]). Let \(\{c_1,\dots ,c_t\}\) be a non-empty subset of \(\{3,\dots ,g\}\). For a random regular graph G of order n and degree d, define \(M_C(G)=(m_1,\dots ,m_t)\), where \(m_i\) is the number of cycles of length \(c_i\) in G for \(1\le i\le t\). For \(1\le i\le t\) let \(\mu _i=\frac{(d-1)^{c_i}}{2c_i}\). Let S be a set of non-negative integer t-tuples. Then as \(n\rightarrow \infty \) the probability that \(M_C(G)\in S\) is equal to
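In the usual formulation of this result the limiting probability is of product-Poisson type, that is, the cycle counts behave asymptotically like independent Poisson variables with means \(\mu _i\). The Python sketch below is written under that assumption (the displayed formula is not reproduced here), and the function names are hypothetical; it computes the Poisson means and the resulting limiting probability that a random d-regular graph has no cycle of length at most g.

```python
import math

def cycle_means(d, lengths):
    """Means mu_i = (d-1)^{c_i} / (2 c_i) for the given cycle lengths."""
    return [(d - 1) ** c / (2 * c) for c in lengths]

def poisson_pmf(mean, m):
    """Probability that a Poisson(mean) variable equals m."""
    return math.exp(-mean) * mean ** m / math.factorial(m)

def limiting_probability(d, lengths, counts):
    """Assumed product-Poisson probability of seeing exactly `counts` cycles."""
    prob = 1.0
    for mean, m in zip(cycle_means(d, lengths), counts):
        prob *= poisson_pmf(mean, m)
    return prob

def prob_girth_exceeds(d, g):
    """Limiting probability of having no cycle of length 3, ..., g."""
    lengths = list(range(3, g + 1))
    return limiting_probability(d, lengths, [0] * len(lengths))

p_no_triangle = prob_girth_exceeds(3, 3)   # equals exp(-4/3) under the assumption
p_girth_above_5 = prob_girth_exceeds(3, 5)
```

Since these probabilities are positive for every fixed g, graphs with few short cycles appear with positive probability, which is the property used later to extract an essentially large girth sequence.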
Now we are ready to give a new proof of the fact that \(\lim _{n\rightarrow \infty }\frac{1}{v(G_n)}\ln Z_{G_n}(N,\underline{\mu })\) is \(\ln \Phi _{d}(N,\underline{\mu })\) for random regular graphs and for essentially large girth sequences of regular graphs.
Theorem 3.31
(Sly and Sun [23, 24] building on Dembo and Montanari [12]). Let N be a \(2\times 2\) positive definite matrix with positive entries and let \(\underline{\mu }\in \mathbb {R}_{>0}^2\). If \((G_n)_n\) is an essentially large girth sequence of d-regular graphs, then
The same statement holds true for a sequence of random d-regular graphs with probability one.
Proof
We know from Theorem 3.29 that \(\lim _{n\rightarrow \infty }\frac{1}{v(G_n)}\ln Z_{G_n}(N,\underline{\mu })\) exists for an essentially large girth sequence of d-regular graphs. We only have to prove that this limit is \(\ln \Phi _{d}(N,\underline{\mu })\). To prove this it is enough to show one essentially large girth sequence of d-regular graphs \((G_n)_n\) for which \(\lim _{n\rightarrow \infty }\frac{1}{v(G_n)}\ln Z_{G_n}(N,\underline{\mu })=\ln \Phi _d(N,\underline{\mu })\). Let \(G_n\) be a random d-regular graph on n vertices, then by Markov’s inequality
Note that \(\mathbb {E}Z_{G_n}(N,\underline{\mu })=n^{O(1)}\Phi _{d}(N,\underline{\mu })^n\) by Theorem 3.16, and for all n we have \(Z_{G_n}(N,\underline{\mu })\ge \Phi _{d}(N,\underline{\mu })^{v(G_n)}\) by Theorem 3.25. By the Borel–Cantelli lemma we immediately get that \(\lim _{n\rightarrow \infty }\frac{1}{v(G_n)}\ln Z_{G_n}(N,\underline{\mu })=\ln \Phi _d(N,\underline{\mu })\) holds true with probability one. By Lemma 3.30 we can easily find an essentially large girth sequence of d-regular graphs \((G_n)_n\) such that \(\lim _{n\rightarrow \infty }\frac{1}{v(G_n)}\ln Z_{G_n}(N,\underline{\mu })=\ln \Phi _{d}(N,\underline{\mu })\). \(\square \)
3.6 Trigonometric Bethe approximation
In this subsection we define a trigonometric polynomial and prove that its maximum is exactly the Bethe approximation.
Definition 3.32
By using the notations of Definition 3.3 for \(\underline{a},\underline{b},\underline{\mu }\in \mathbb {R}^2\) let
The following lemma is an immediate consequence of Lemma 3.4.
Lemma 3.33
If \(N=\underline{a}\underline{a}^T+\underline{b}\underline{b}^T=\hat{\underline{a}}\hat{\underline{a}}^T+\hat{\underline{b}}\hat{\underline{b}}^T\), then there exist an \(s\in \{-1,1\}\) and an \(\alpha \in [0,2\pi ]\) such that
By Lemma 3.33 we can introduce the following concept.
Definition 3.34
Let N be a \(2\times 2\) positive definite matrix with positive entries, and let \(\underline{\mu }\in \mathbb {R}^2\) be a vector with positive entries. Let \(N=\underline{a}\underline{a}^T+\underline{b}\underline{b}^T\) be any representation of N, then let us define
The main theorem of this section is the following.
Theorem 3.35
Let N be a \(2\times 2\) positive definite matrix with positive entries, and let \(\underline{\mu }\in \mathbb {R}^2\) be a vector with positive entries. Then
As a preparation for the proof we introduce some notations. We will also use the notions and tools from Sect. 3.4. The equation \(\textrm{BP}(h)=h\) using the substitution \(R=\frac{h_1}{h_2}\) becomes
Call \(\mathcal {R}_{N,\underline{\mu }}\) the set of non-negative solutions of this equation. By a simple calculation we have
We know that
Let us also choose a representation \(N=\underline{a}\underline{a}^T+\underline{b}\underline{b}^T\), and let
Lemma 3.36
Let \(a_1,a_2,b_1,b_2,\mu _1,\mu _2\in \mathbb {R}\), then for every \(t\in [0,2\pi ]\) such that \(b_1(t),a_2(t)\ne 0\) we have
Furthermore,
and
and if \(t_0\) maximizes \(\Phi _{\underline{a},\underline{b},\underline{\mu }}(t)\), then
Proof
We have
Note that we have \((a_2b_1-a_1b_2)^2=(a_1^2+b_1^2)(a_2^2+b_2^2)-(a_1a_2+b_1b_2)^2=N_{11}N_{22}-N_{12}^2\ne 0\). Next let us prove that
Let us multiply both sides with the denominator of the right hand side, and take the square of both sides. Note that by Lemma 3.9 and the decomposition \(N=\underline{a}\underline{a}^T+\underline{b}\underline{b}^T\) we have \(N_{11}=a_1(t)^2+b_1(t)^2\), \(N_{12}=a_1(t)a_2(t)+b_1(t)b_2(t)\) and \(N_{22}=a_2(t)^2+b_2(t)^2\) are true for every t. For ease of notation let
Then
The proof of the third identity follows similarly and we omit it.
If \(t_0\) maximizes \(\Phi _{\underline{a},\underline{b},\underline{\mu }}(t)\), in fact, we only need \(\Phi _{\underline{a},\underline{b},\underline{\mu }}'(t_0)=0\), then we have
\(\square \)
Now we are ready to prove Theorem 3.35.
Proof of Theorem 3.35
Note that \(\widetilde{\Phi }_d(N,\underline{\mu })\) does not depend on which representation \(N=\underline{a}\underline{a}^T+\underline{b}\underline{b}^T\) we choose so we can assume by part (ii) of Lemma 3.2 that \(a_1,a_2,b_1,b_2>0\). Then
is maximized at some \(t_0\in \left[ 0,\frac{\pi }{2}\right] \). Then \(S(t_0)>0\), and by \(R(t_0)=\frac{\mu _1}{\mu _2}S(t_0)^{d-1}\) we get that \(R(t_0)>0\). Thus \(R(t_0)\in \mathcal {R}_{N,\mu }\) and we have
On the other hand, \(\Phi _d(N,\underline{\mu })=\max _{R\in \mathcal {R}_{N,\underline{\mu }}}\overline{\mathbb {F}}(R,N,\underline{\mu })=\overline{\mathbb {F}}(R_0,N,\underline{\mu })\) for some \(R_0\). Then by Lemma 3.8 there exists a \(t'\in [0,2\pi ]\) such that \(R(t')=R_0\). Note that from
we get that \(a_1(t'),a_2(t')\) have the same sign. By changing \(t'\) to \(t'+\pi \) if necessary we can ensure that they are both positive. Hence we have
Then
\(\square \)
We end this section with a lemma that we will use later.
Lemma 3.37
Let Q be a \(2\times 2\) matrix with positive entries and positive determinant, and let k be an integer. Then the equation
has at most 3 non-negative solutions.
Proof
Let
Then \(\frac{\partial ^2}{\partial R^2}f\) is given as
Let
If \(R^*\le 0\), then f is concave on \([0,\infty )\) and so it has at most two solutions.
If \(R^*> 0\), then \(\frac{\partial ^2}{\partial R^2}f\) is positive if \(0\le R<R^*\), and negative if \(R>R^*\). If \(f(R^*)>0\), then f has at most 2 solutions on \((0,R^*)\) and has at most 1 solution on \((R^*,\infty )\). If \(f(R^*)<0\), then f has at most 1 solution on \((0,R^*)\) and has at most 2 solutions on \((R^*,\infty )\). Finally, if \(f(R^*)=0\), then f has at most 1 solution on \((0,R^*)\) and has at most 1 solution on \((R^*,\infty )\). So in all cases it has at most 3 solutions. \(\square \)
3.7 The vector \(\underline{v}(t_0)\)
Let \(t_0\) be the maximizer of the function \(\Phi _{\underline{a},\underline{b},\underline{\mu }}\). In this section we study the vector \(\underline{v}(t_0)=(r_0(t_0),r_1(t_0),\dots ,r_d(t_0))\). First we need a simple lemma.
Lemma 3.38
Let \(\underline{a},\underline{b},\underline{\mu }\in \mathbb {R}^r\). For \(j=1,\dots ,r\) let
Let \(r_j(t)=\sum _{k=1}^r\mu _ka_k(t)^{d-j}b_k(t)^{j}\), then
Proof
We have
\(\square \)
Consider the vector \(\underline{v}(t_0)=(r_0(t_0),r_1(t_0),\dots ,r_d(t_0))\). We show that \(r_1(t_0)=0\) and \(r_j(t_0)\ge 0\) if j is even, and the numbers \(r_j(t_0)\) have the same sign for odd \(j\ge 3\). This follows from the following more general lemma.
Lemma 3.39
Let \(\mu _1,\mu _2>0\) and \(a_1,a_2,b_1,b_2\in \mathbb {R}\) such that \(a_1,a_2,b_1,b_2>0\). Let \(t_0\in \left[ 0,\frac{\pi }{2}\right] \) be the maximizer of \(\Phi _{\underline{a},\underline{b},\underline{\mu }}(t)=\mu _1a_1(t)^d+\mu _2a_2(t)^d\). Let
Then \(r_1(t_0)=0\) and either
-
(i)
\(r_j(t_0)\ge 0\) for \(j=0,\dots ,d\) or
-
(ii)
\(r_j(t_0)\ge 0\) for even j, and \(r_j(t_0)\le 0\) for odd j.
Proof
Observe that \(\frac{\partial }{\partial t}a_1(t)=b_1(t)\) and \(\frac{\partial }{\partial t}a_2(t)=b_2(t)\) and so \(\frac{\partial }{\partial t}r_0(t)=d\cdot r_1(t)\).
Hence if \(t_0\) maximizes \(r_0(t)\), then \(r_1(t_0)=0\).
To prove the inequalities we need to study several cases. First of all, \(r_1(t_0)=0\) implies that \(a_1(t_0)b_1(t_0)=0\) if and only if \(a_2(t_0)b_2(t_0)=0\). If \(a_1(t_0)b_1(t_0)=a_2(t_0)b_2(t_0)=0\), then \(r_1(t_0)=r_2(t_0)=\dots =r_{d-1}(t_0)=0\). We know that \(r_0(t_0)\ge r_0(0)=\mu _1a_1^d+\mu _2a_2^d>0\). Finally, \(r_d(t_0)=\mu _1b_1(t_0)^d+\mu _2b_2(t_0)^d\ge 0\) if d is even. If d is odd, then no matter what the sign of \(r_d(t_0)\) is, case (i) or (ii) is satisfied.
So we can assume that \(a_1(t_0)b_1(t_0)\ne 0\) and \(a_2(t_0)b_2(t_0)\ne 0\). By symmetry we can assume that \(|\mu _2a_2(t_0)^d|\ge |\mu _1a_1(t_0)^d|\). Note that
From \(r_1(t_0)=0\) we get that
Note that if \(\mu _2a_2(t_0)^d=\mu _1a_1(t_0)^d>0\), then \(\frac{a_1(t_0)b_2(t_0)}{a_2(t_0)b_1(t_0)}=-1\). Then for odd j we have \(r_j(t_0)=0\), and for even j all terms are positive in the above product, so \(r_j(t_0)>0\).
If \(\mu _2a_2(t_0)^d>\mu _1a_1(t_0)^d\), then \(\left| \frac{a_1(t_0)b_2(t_0)}{a_2(t_0)b_1(t_0)}\right| <1\), and so the last term is positive for all \(j\ge 2\). Then if \(\frac{b_1(t_0)}{a_1(t_0)}> 0\), then \(r_j(t_0)\ge 0\) for all j, and if \(\frac{b_1(t_0)}{a_1(t_0)}<0\), then \(r_j(t_0)\) is positive for even j and negative for odd \(j\ge 3\). We are done. \(\square \)
Remark 3.40
Note that besides \(r_1(t_0)=0\) we also have \(\frac{r_2(t_0)}{r_0(t_0)}\le \frac{1}{d-1}\) since \(\frac{\partial ^2}{\partial t^2}r_0(t)=d((d-1)r_2(t)-r_0(t))\) should be non-positive at \(t=t_0\).
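The derivative identities behind Lemma 3.39 and Remark 3.40 can be checked by finite differences. The Python sketch below assumes \(r_j(t)=\sum _k\mu _ka_k(t)^{d-j}b_k(t)^j\) with \(a_k(t)=a_k\cos t+b_k\sin t\) and \(b_k(t)=-a_k\sin t+b_k\cos t\), so that \(a_k'(t)=b_k(t)\) and \(b_k'(t)=-a_k(t)\); the function name and the test values are hypothetical.

```python
import math

def r(j, t, a, b, mu, d):
    """r_j(t) = sum_k mu_k a_k(t)^(d-j) b_k(t)^j for the assumed rotation."""
    total = 0.0
    for ak, bk, mk in zip(a, b, mu):
        at = ak * math.cos(t) + bk * math.sin(t)
        bt = -ak * math.sin(t) + bk * math.cos(t)
        total += mk * at ** (d - j) * bt ** j
    return total

a, b, mu, d = [1.2, 0.9], [0.5, 0.4], [1.0, 0.7], 4
t, h = 0.37, 1e-4

# Central differences for the first and second derivative of r_0.
first = (r(0, t + h, a, b, mu, d) - r(0, t - h, a, b, mu, d)) / (2 * h)
second = (r(0, t + h, a, b, mu, d) - 2 * r(0, t, a, b, mu, d)
          + r(0, t - h, a, b, mu, d)) / h ** 2

# Claimed closed forms: r_0' = d r_1 and r_0'' = d((d-1) r_2 - r_0).
first_exact = d * r(1, t, a, b, mu, d)
second_exact = d * ((d - 1) * r(2, t, a, b, mu, d) - r(0, t, a, b, mu, d))
```

At a maximizer \(t_0\) the first quantity vanishes and the second is non-positive, which is exactly the pair of conditions \(r_1(t_0)=0\) and \(\frac{r_2(t_0)}{r_0(t_0)}\le \frac{1}{d-1}\).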
The following theorem is a strengthening of Theorem 3.25 with a new proof.
Theorem 3.41
Let N be a \(2\times 2\) positive definite matrix with positive entries and let \(\underline{\mu }\in \mathbb {R}_{>0}^2\). For any d-regular graph G we have \(Z_G(N,\underline{\mu })\ge \Phi _{d}(N,\underline{\mu })^{v(G)}\). Furthermore, if G contains \(\varepsilon v(G)\) cycles of length at most g, then there exists a \(\delta =\delta (d,N,\underline{\mu },\varepsilon ,g)>0\) such that \(Z_G(N,\underline{\mu })\ge ((1+\delta )\Phi _{d}(N,\underline{\mu }))^{v(G)}\).
The proof presented below is strongly inspired by the work of Chertkov and Chernyak [10] on loop series and gauge transformation. The paper of Borbényi and Csikvári [5] contains a similar proof about the number of Eulerian orientations in regular graphs.
Proof
By part (ii) of Lemma 3.2 we can choose \(\underline{a},\underline{b}\in \mathbb {R}^2\) such that \(a_1,a_2,b_1,b_2>0\). Then \(r_j(0)=\mu _1a_1^{d-j}b_1^j+\mu _2a_2^{d-j}b_2^j>0\) for all \(j\in \{0,1,\dots ,d\}\). This implies that
has a maximizer \(t_0\) in the interval \(\left[ 0,\frac{\pi }{2}\right] \). Indeed, for any \(t\in [0,2\pi ]\) there is a \(t'\in \left[ 0,\frac{\pi }{2}\right] \) such that \(\cos (t')=|\cos (t)|\) and \(\sin (t')=|\sin (t)|\), so \(|\Phi _{\underline{a},\underline{b},\underline{\mu }}(t)|\le \Phi _{\underline{a},\underline{b},\underline{\mu }}(t')\). Thus the conditions of Lemma 3.39 are satisfied. We have
For each \(A\subseteq E(G)\) the number of vertices with odd \(d_A(v)\) is even, so by Lemma 3.39 each term in the sum is non-negative. Then taking \(A=\emptyset \) we get that
This completes the proof of the first part.
To prove the second part first observe that \(r_2(t_0)>0\). Indeed, since \(a_1,a_2,b_1,b_2>0\) and \(t_0\in \left[ 0,\frac{\pi }{2}\right] \) we get that \(a_1(t_0),a_2(t_0)>0\) and thus
would imply that \(b_1(t_0)=b_2(t_0)=0\) which then implies that
contradicting the positive definiteness of N. Also observe that if G contains \(\varepsilon v(G)\) cycles of length at most g, then it also contains \(\varepsilon ' v(G)\) vertex-disjoint cycles of length at most g for some \(\varepsilon '\) depending on d, g and \(\varepsilon \), but not depending on v(G). Then
Indeed, we can consider those sets \(A\subseteq E(G)\) that consist of the union of some of the vertex-disjoint cycles of length at most g. Here we also use the fact that \(0\le \frac{r_2(t_0)}{r_0(t_0)}\le \frac{1}{d-1}\le 1\) by Remark 3.40. Hence \(1+\delta =\left( 1+\left( \frac{r_2(t_0)}{r_0(t_0)}\right) ^g\right) ^{\varepsilon '}\) satisfies the claim of the theorem. \(\square \)
Remark 3.42
Theorem 3.41 implies that if \((G_n)_n\) is a sequence of d-regular graphs that is not an essentially large girth sequence, then
Since \(Z_G(q,w)\ge Z_{G}^{(2)}(q,w)\) for \(q\ge 2\) and \(w\ge 0\) this kind of stability statement is also true for \(Z_G(q,w)\).
3.8 Mixed state
In this section we introduce a concept that is strongly related to the phase transition of the random cluster model. We will see that there exists a \(w_c=w_c(q)\) such that if \(0\le w\le w_c\), then \(\Phi _{d,q,w}=q\left( 1+\frac{w}{q}\right) ^{d/2}\), and if \(w>w_c\), then \(\Phi _{d,q,w}>q\left( 1+\frac{w}{q}\right) ^{d/2}\). The problem with such a statement is that it depends on the parametrization (q, w); for a general model \((N,\underline{\mu })\) it does not make sense. On the other hand, there is a concept that makes sense even for general \((N,\underline{\mu })\), where N is a \(2\times 2\) positive definite matrix.
Definition 3.43
We say that \((N,\underline{\mu })\) exhibits a mixed state for a fixed positive integer d if \(R_c(N,\underline{\mu })=1\).
Note that \(R_c(N,\underline{\mu })=1\) does not depend on which representation \(N=\underline{a}\underline{a}^T+\underline{b}\underline{b}^T\) we choose. We also know that \(R=R_c(N,\underline{\mu })\) is a solution of
where \(T=\left( \frac{\mu _2}{\mu _1}\right) ^{2/d}\). This shows that \((N,\underline{\mu })\) exhibits a mixed state for d if
The main lemma of this section is the following.
Lemma 3.44
Let \(N=\underline{a}\underline{a}^T+\underline{b}\underline{b}^T\) for some \(a_1,a_2,b_1,b_2\in \mathbb {R}\) and let \(\mu _1,\mu _2>0\). Suppose that \((N,\underline{\mu })\) exhibits a mixed state for d, that is, for some \(t_1\) we have
Let \(t_2=2t_1-\frac{\pi }{2}\). Then for every \(t\in \mathbb {R}\) we have
In particular,
Proof
Note that for any \(u_1,u_2\in \mathbb {R}\) we have
To prove \(\mu _1a_1(t)^d=\mu _2a_2(t_2-t)^d\) it will be more convenient to prove the statement in the form
This is indeed true,
By symmetry the other claim is also true. \(\square \)
3.9 Specialization to \(N=M'_2\) and \(\underline{\mu }=\underline{\nu }_2\)
In this section we collect some results that are specialized to \(N=M'_2\) and \(\underline{\mu }=\underline{\nu }_2\). In particular, we will choose \(a_1=a_2=\sqrt{1+\frac{w}{q}}\), \(b_1=\sqrt{\frac{(q-1)w}{q}}\) and \(b_2=-\sqrt{\frac{w}{q(q-1)}}\).
Lemma 3.45
Let \(q\ge 2,w\ge 0\). Let \(a_1=a_2=\sqrt{1+\frac{w}{q}}\), \(b_1=\sqrt{\frac{(q-1)w}{q}}\), \(b_2=-\sqrt{\frac{w}{q(q-1)}}\), \(\nu _1=1\) and \(\nu _2=q-1\). Let \(r_j(0)=\nu _1a_1^{d-j}b_1^j+\nu _2a_2^{d-j}b_2^j\). Then we have \(r_j(0)\ge 0\) for \(j=0,1,\dots ,d\) and \(r_1(0)=0\).
Proof
We have
This is 0 if \(j=1\), and non-negative if \(j\ne 1\). \(\square \)
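Lemma 3.45 is easy to confirm numerically from the stated values of \(a_1,a_2,b_1,b_2,\nu _1,\nu _2\). The short Python sketch below evaluates \(r_j(0)=\nu _1a_1^{d-j}b_1^j+\nu _2a_2^{d-j}b_2^j\) directly; the function name and the sample parameters are illustrative.

```python
import math

def r_at_zero(j, q, w, d):
    """r_j(0) for the specialization of Sect. 3.9 (nu_1 = 1, nu_2 = q - 1)."""
    a = math.sqrt(1 + w / q)                 # a_1 = a_2
    b1 = math.sqrt((q - 1) * w / q)
    b2 = -math.sqrt(w / (q * (q - 1)))
    return a ** (d - j) * b1 ** j + (q - 1) * a ** (d - j) * b2 ** j

q, w, d = 3.0, 1.5, 5
values = [r_at_zero(j, q, w, d) for j in range(d + 1)]

# For q = 2 the two spins are symmetric, and every odd-indexed term cancels.
ising_values = [r_at_zero(j, 2.0, 1.5, d) for j in range(d + 1)]
```

The case \(q=2\) shows that the terms with odd \(j\ge 3\) can vanish as well, so in general only non-negativity holds for \(j\ne 1\).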
Recall that
The next lemma shows that it is enough to consider the interval \(\left[ 0,\frac{\pi }{2}\right] \) to find the maximum when \(q\ge 2\) and \(w\ge 0\).
Lemma 3.46
If \(q\ge 2\) and \(w\ge 0\), then there is a \(t_0\in \left[ 0,\frac{\pi }{2}\right] \) for which \(\Phi _{d,q,w}=\Phi _{d,q,w}(t_0)\).
Proof
By Lemma 3.38 we have
By Lemma 3.45 we have \(r_j(0)\ge 0\) for all \(j\in \{0,1,\dots ,d\}\) if \(q\ge 2\) and \(w\ge 0\). For any \(t\in [0,2\pi ]\) there is a \(t'\in \left[ 0,\frac{\pi }{2}\right] \) such that \(|\cos (t)|=\cos (t')\) and \(|\sin (t)|=\sin (t')\), thus
Hence \(\max _{t\in [0,2\pi ]}\Phi _{d,q,w}(t)=\max _{t\in \left[ 0,\frac{\pi }{2}\right] }\Phi _{d,q,w}(t).\) \(\square \)
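The reduction to the interval \(\left[ 0,\frac{\pi }{2}\right] \) can be observed on a grid. The Python sketch below uses \(\Phi _{d,q,w}(t)=a_{q,w,1}(t)^d+(q-1)a_{q,w,2}(t)^d\) with \(a_{q,w,k}(t)=a_k\cos t+b_k\sin t\), which is assumed to be the definition recalled above; the grid search is a crude stand-in for exact maximization, and the function names are hypothetical.

```python
import math

def phi(t, q, w, d):
    """Phi_{d,q,w}(t) = a_1(t)^d + (q-1) a_2(t)^d for the Sect. 3.9 parameters."""
    a = math.sqrt(1 + w / q)
    b1 = math.sqrt((q - 1) * w / q)
    b2 = -math.sqrt(w / (q * (q - 1)))
    a1t = a * math.cos(t) + b1 * math.sin(t)
    a2t = a * math.cos(t) + b2 * math.sin(t)
    return a1t ** d + (q - 1) * a2t ** d

def grid_max(q, w, d, quarter_turns, points_per_quarter=2000):
    """Maximum of phi over a uniform grid on [0, quarter_turns * pi/2]."""
    step = (math.pi / 2) / points_per_quarter
    count = quarter_turns * points_per_quarter
    return max(phi(k * step, q, w, d) for k in range(count + 1))

q, w, d = 3.0, 2.0, 3          # odd d, so phi(t + pi) = -phi(t)
max_quarter = grid_max(q, w, d, 1)   # grid on [0, pi/2]
max_full = grid_max(q, w, d, 4)      # grid on [0, 2*pi]
```

Up to discretization error the two grid maxima coincide, in line with Lemma 3.46.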
The next lemma will be useful to get even more precise bounds on \(\tan (t_0)\).
Lemma 3.47
Let \(q\ge 2\) and \(w\ge 0\). Let \(\overline{t}\in \left[ 0,\frac{\pi }{2}\right] \) such that
Then we have \(a_{q,w,1}(\overline{t}),a_{q,w,2}(\overline{t}),b_{q,w,1}(\overline{t})>0\) and \(b_{q,w,2}(\overline{t})<0\). In particular, this is true if \(\overline{t}=t_0\) maximizing the function \(\Phi _{d,q,w}(t)\) in the interval \(\left[ 0,\frac{\pi }{2}\right] \).
Proof
Note that \(\Phi _{d,q,w}(t)=a_{q,w,1}(t)^d+(q-1)a_{q,w,2}(t)^d\), and its derivative is
Note that \(a_{q,w,1}(t)>0\) and \(b_{q,w,2}(t)<0\) for all \(t\in \left[ 0,\frac{\pi }{2}\right] \). Suppose for contradiction that \(b_{q,w,1}(\overline{t})<0\). Then from \(b_{q,w,1}(\overline{t})a_{q,w,1}(\overline{t})^{d-1}+b_{q,w,2}(\overline{t})a_{q,w,2}(\overline{t})^{d-1}=0\) we also get that \(a_{q,w,2}(\overline{t})^{d-1}<0\), that is, \(a_{q,w,2}(\overline{t})<0\) and d is even. Then
Note that \(a_{q,w,1}(t)>-a_{q,w,2}(t)\) for all \(t\in \left[ 0,\frac{\pi }{2}\right] \), and so
By \(a_{q,w,1}(t)b_{q,w,1}(t)+(q-1)a_{q,w,2}(t)b_{q,w,2}(t)=-q\cos (t)\sin (t)\) we have
but then
leads to a contradiction. Hence \(b_{q,w,1}(\overline{t})>0\). But then \(a_{q,w,1}(\overline{t})a_{q,w,2}(\overline{t})+b_{q,w,1}(\overline{t})b_{q,w,2}(\overline{t})=1\) implies that \(a_{q,w,2}(\overline{t})>0\). \(\square \)
Finally, we collect some claims about the derivatives of \(\Phi _{d,q,w}(t)\) at \(t=0.\)
Lemma 3.48
We have
In particular, if \(w<\frac{q}{d-2}\), then the function \(\Phi _{d,q,w}(t)\) has a local maximum at \(t=0\), and if \(w>\frac{q}{d-2}\) the function \(\Phi _{d,q,w}(t)\) has a local minimum at \(t=0.\)
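The threshold \(\frac{q}{d-2}\) can be recovered by a short computation. The sketch below assumes that \(a_{q,w,i}(t)=a_i\cos (t)+b_i\sin (t)\) and \(b_{q,w,i}(t)=-a_i\sin (t)+b_i\cos (t)\), so that \(a_{q,w,i}'=b_{q,w,i}\) and \(b_{q,w,i}'=-a_{q,w,i}\). It is a derivation under this assumption and need not coincide in form with the display of Lemma 3.48.

```latex
\begin{align*}
\Phi_{d,q,w}'(t)  &= d\sum_{i=1}^{2}\nu_i\, a_{q,w,i}(t)^{d-1}\, b_{q,w,i}(t),\\
\Phi_{d,q,w}''(t) &= d(d-1)\sum_{i=1}^{2}\nu_i\, a_{q,w,i}(t)^{d-2}\, b_{q,w,i}(t)^{2}
                     - d\,\Phi_{d,q,w}(t),\\
\Phi_{d,q,w}'(0)  &= d\left(1+\frac{w}{q}\right)^{\frac{d-1}{2}}\bigl(b_1+(q-1)b_2\bigr)=0,\\
\Phi_{d,q,w}''(0) &= d\left(1+\frac{w}{q}\right)^{\frac{d-2}{2}}
                     \left[(d-1)w-q\left(1+\frac{w}{q}\right)\right]
                   = d\left(1+\frac{w}{q}\right)^{\frac{d-2}{2}}\bigl((d-2)w-q\bigr).
\end{align*}
```

Here we used \(\nu _1=1\), \(\nu _2=q-1\), \(\nu _1b_1^2+\nu _2b_2^2=w\) and \(\nu _1+\nu _2=q\). The sign of \(\Phi _{d,q,w}''(0)\) is therefore the sign of \((d-2)w-q\), in agreement with the local maximum and local minimum statements above.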
3.9.1 Mixed state and phase transition.
In this section we discuss the mixed state and phase transition of \(Z^{(2)}_G(q,w)\).
We know that \((N,\underline{\mu })\) exhibits mixed state for some d if
where \(T=\left( \frac{\mu _2}{\mu _1}\right) ^{1/d}\). Applying this equation to \((M'_2,\underline{\nu }_2)\) we get that
Note that if \(q=2\), then this is constant 0, so in the case of \(q=2\), the spin system \((M'_2,\underline{\nu }_2)\) always exhibits mixed state. If \(q\ne 2\), then the solution \(w_c=w_c(q)\) of this equation is
Note that by L’Hôpital’s rule we have
so we will define \(w_c(2)=\frac{2}{d-2}\) even though the spin system \((M'_2,\underline{\nu }_2)\) itself always exhibits mixed state for every w if \(q=2\).
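For numerical experiments it is convenient to have \(w_c(q)\) as a function. The sketch below assumes the closed form \(w_c(q)=\frac{q-2}{(q-1)^{1-2/d}-1}-1\) for \(q\ne 2\). This formula is an assumption of the sketch (the display above is the authoritative one), but it reproduces \(w_c=2\) for \(d=4\), \(q=5\) as in Example 3.58 and the limit \(\frac{2}{d-2}\) at \(q=2\). The function name `w_c` is illustrative.

```python
import math

def w_c(d, q):
    """Critical value w_c(q) for degree d >= 3 and q >= 2.

    ASSUMED closed form: (q - 2) / ((q - 1)^(1 - 2/d) - 1) - 1 for q != 2,
    and 2 / (d - 2) at q = 2 (the value defined by continuity in the text).
    """
    if q == 2:
        return 2 / (d - 2)
    # expm1 avoids cancellation in (q - 1)^(1 - 2/d) - 1 when q is close to 2.
    denom = math.expm1((1 - 2 / d) * math.log(q - 1))
    return (q - 2) / denom - 1

for d, q in [(4, 5), (8, 5), (4, 2)]:
    print(d, q, w_c(d, q), q / (d - 2))
```

With this form, \(w=1\) in Example 3.56 (\(d=8\), \(q=5\)) lies above \(w_c\), while Example 3.58 sits exactly at \(w_c\).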
The main theorem of this section is the following. It asserts that the mixed state also describes a phase transition in the value of \(\Phi _{d,q,w}\).
Theorem 3.49
Let \(q\ge 2\). If \(0\le w\le w_c\), then \(\Phi _{d,q,w}=q\left( 1+\frac{w}{q}\right) ^{d/2}\). If \(w>w_c\), then \(\Phi _{d,q,w}>q\left( 1+\frac{w}{q}\right) ^{d/2}\).
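The dichotomy of Theorem 3.49 can be observed numerically for \(d=4\), \(q=5\), where \(w_c=2\) (Example 3.58). The sketch below assumes \(a_{q,w,i}(t)=a_i\cos (t)+b_i\sin (t)\) and approximates \(\Phi _{d,q,w}\) by a grid maximum over \(\left[ 0,\frac{\pi }{2}\right] \), as permitted by Lemma 3.46. The helper names and the grid size are illustrative choices.

```python
import math

def phi(d, q, w, t):
    """Phi_{d,q,w}(t), ASSUMING a_i(t) = a cos t + b_i sin t."""
    a = math.sqrt(1 + w / q)
    b1 = math.sqrt((q - 1) * w / q)
    b2 = -math.sqrt(w / (q * (q - 1)))
    c, s = math.cos(t), math.sin(t)
    return (a * c + b1 * s) ** d + (q - 1) * (a * c + b2 * s) ** d

def phi_max(d, q, w, n=20000):
    """Grid approximation of max over [0, pi/2]; t = 0 is a grid point."""
    step = (math.pi / 2) / n
    return max(phi(d, q, w, k * step) for k in range(n + 1))

def trivial_value(d, q, w):
    """q (1 + w/q)^(d/2), the value of Phi_{d,q,w}(0)."""
    return q * (1 + w / q) ** (d / 2)

d, q = 4, 5
for w in (0.5, 1.0, 2.0, 2.5, 3.0):
    print(w, phi_max(d, q, w), trivial_value(d, q, w))
```

For \(w\le 2\) the two printed columns coincide, and for \(w>2\) the grid maximum is strictly larger; for \(w=3\) it is close to the value of \(\Phi _{4,5,3}\) in Example 3.57.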
Before we start to prove Theorem 3.49 we need a lemma about the curve \((q,w_c(q))\). For a visualization of this lemma see the dashed curve on Fig. 4.
Lemma 3.50
For every \(q\ge 2\) let \(x(q)=1+\frac{q}{w_c(q)}\) and \(y(q)=1+w_c(q)\). Then the curve (x(q), y(q)) on the (x, y) plane is the graph of a monotone increasing function. Furthermore, \(w_c(q)\le \frac{q}{d-2}\).
Proof
We have
and \(q=(x(q)-1)(y(q)-1)\). Thus
Clearly, \(y-1=w>0\) so we only need to show that the first term is also positive. We can rewrite the numerator as
This is 0 if \(q=2\) and its derivative is \(\frac{2}{d} (1-(q-1)^{-2/d})\ge 0\) for \(q\ge 2\). Hence \(\frac{dy }{dx}>0\).
The second part follows from the first part since if we follow any hyperbola \((x-1)(y-1)=q\) while decreasing x, and hence increasing y, it intersects the curve (x(q), y(q)) before hitting the line \(x=d-1\). The intersection of the hyperbola \((x-1)(y-1)=q\) and the line \(x=d-1\) is at \(y=1+\frac{q}{d-2}\), thus \(w_c(q)\le \frac{q}{d-2}\). \(\square \)
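The statement of Lemma 3.50 can be illustrated by sampling the curve. The sketch below assumes the closed form \(w_c(q)=\frac{q-2}{(q-1)^{1-2/d}-1}-1\) (an assumption of the sketch, with \(w_c(2)=\frac{2}{d-2}\)); the function names and the sampling range are illustrative.

```python
import math

def w_c(d, q):
    """ASSUMED closed form for w_c(q); w_c(2) = 2/(d-2) by continuity."""
    if q == 2:
        return 2 / (d - 2)
    return (q - 2) / math.expm1((1 - 2 / d) * math.log(q - 1)) - 1

def curve_point(d, q):
    """(x(q), y(q)) = (1 + q / w_c(q), 1 + w_c(q)) as in Lemma 3.50."""
    wc = w_c(d, q)
    return 1 + q / wc, 1 + wc

d = 4
qs = [2 + 0.05 * k for k in range(401)]      # q from 2 to 22
pts = [curve_point(d, q) for q in qs]
xs = [p[0] for p in pts]
ys = [p[1] for p in pts]

# Both coordinates increase with q, so y is an increasing function of x.
x_increasing = all(x1 < x2 for x1, x2 in zip(xs, xs[1:]))
y_increasing = all(y1 < y2 for y1, y2 in zip(ys, ys[1:]))
print(xs[0], x_increasing, y_increasing)
```

The curve starts at \(x=d-1\) for \(q=2\), which matches the interval \([d-1,\infty )\) used later in the proof of Proposition 3.52.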
We decompose the proof of Theorem 3.49 into three propositions dealing with \(w=w_c\), \(0\le w<w_c\) and \(w>w_c\).
Proposition 3.51
For \(q\ge 2\) and \(w=w_c\) we have \(\Phi _{d,q,w}=q\left( 1+\frac{w}{q}\right) ^{d/2}\).
Proof
Let us assume that \(q>2\); the statement for \(q=2\) follows by continuity. We show that for \(w=w_c\) the function \(\Phi _{d,q,w}(t)\) has a global maximizer in \((0,\pi /2)\) with value \(\Phi _{d,q,w}(0)=q\left( 1+\frac{w}{q}\right) ^{d/2}\). We know that for \(w=w_c\) there is a \(t_1\in \left[ 0,\frac{\pi }{2}\right] \) such that
This means that
Note that if \(q=2\), then \(t_1=\frac{\pi }{4}\) is a solution. If \(q>2\), then
showing that \(t_1\in \left( \frac{\pi }{4},\frac{\pi }{2}\right) \). Let \(t_2=2t_1-\frac{\pi }{2}\in \left( 0,\frac{\pi }{2}\right) \). By Lemma 3.44 we know that \(\Phi _{d,q,w_c}(t)=\Phi _{d,q,w_c}(t_2-t)\). (For an example of the graph of a function \(\Phi _{d,q,w_c}(t)\) see Fig. 5)
We know that
This immediately implies that
The equation \(\Phi _{d,q,w_c}(t)=\Phi _{d,q,w_c}(t_2-t)\) also implies that
This means that a computation similar to the one in Sect. 3.6 gives that if
then the values \(R=R(0),R\left( \frac{t_2}{2}\right) ,R(t_2)\) are all solutions of the equation
The values \(R(0),R\left( \frac{t_2}{2}\right) ,R(t_2)\) are at least 1, because by Lemma 3.47 they are non-negative, and \(S(t)>1\) whenever \(t\in \left[ 0,\frac{\pi }{2}\right] \) and both \(a_{q,w,1}(t),a_{q,w,2}(t)>0\). The equation
has at most 3 solutions satisfying \(R\ge 1\) by Lemma 3.37 which means that there is no other \(t'\in \left( 0,\frac{\pi }{2}\right) \) that is a local maximizer or minimizer of \(\Phi _{d,q,w}(t)\). Note that \(\frac{d^2}{dt^2}\Phi _{d,q,w_c}\bigg |_{t=0}<0\) since \(w_c(q)<\frac{q}{d-2}\) by Lemma 3.50. So at \(\frac{t_2}{2}\) we have a local minimum, and at \(t_2\) we have a local maximum. Hence
\(\square \)
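The reflection symmetry used in this proof can be tested on the data of Example 3.58 (\(d=4\), \(q=5\), \(w=w_c=2\)). The sketch below assumes \(a_{q,w,i}(t)=a_i\cos (t)+b_i\sin (t)\) and computes the reflection point as \(\arctan (b_1/a_1)+\arctan (b_2/a_2)\). This expression for \(t_2\) is an assumption of the sketch; it agrees with the \(t_0\) reported in Example 3.58 to about \(10^{-8}\).

```python
import math

def coeffs(q, w):
    """a, b_1, b_2 as chosen in Section 3.9."""
    a = math.sqrt(1 + w / q)
    b1 = math.sqrt((q - 1) * w / q)
    b2 = -math.sqrt(w / (q * (q - 1)))
    return a, b1, b2

def phi(d, q, w, t):
    """Phi_{d,q,w}(t), ASSUMING a_i(t) = a cos t + b_i sin t."""
    a, b1, b2 = coeffs(q, w)
    c, s = math.cos(t), math.sin(t)
    return (a * c + b1 * s) ** d + (q - 1) * (a * c + b2 * s) ** d

d, q, w = 4, 5, 2.0                          # Example 3.58: w = w_c = 2
a, b1, b2 = coeffs(q, w)
t2 = math.atan2(b1, a) + math.atan2(b2, a)   # assumed reflection point

grid = [k * (math.pi / 2) / 2000 for k in range(2001)]
sym_err = max(abs(phi(d, q, w, t) - phi(d, q, w, t2 - t)) for t in grid)
grid_max = max(phi(d, q, w, t) for t in grid)
print(t2, sym_err, phi(d, q, w, 0.0), phi(d, q, w, t2 / 2), phi(d, q, w, t2), grid_max)
```

In this instance \(\Phi _{d,q,w_c}(0)\) and \(\Phi _{d,q,w_c}(t_2)\) both equal \(5\cdot 1.4^2=9.8\), and the value at \(\frac{t_2}{2}\) is smaller, in line with the local minimum found in the proof.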
Proposition 3.52
For \(q\ge 2\) and \(0\le w\le w_c\) we have \(\Phi _{d,q,w}=q\left( 1+\frac{w}{q}\right) ^{d/2}\).
Proof
We will describe the pairs (q, w) for which \(\Phi _{d,q,w}=q\left( 1+\frac{w}{q}\right) ^{d/2}\). To do this it is better to use the Tutte polynomial \(T_G(x,y)\) instead of \(Z_G(q,w)\) with \(q=(x-1)(y-1)\) and \(w=y-1\). Recall that the connection between the Tutte polynomial and the partition function of the random cluster model is the following:
Then for \(q\ge 2\) and an essentially large girth sequence of d-regular graphs \((G_n)_n\) the statement
is equivalent to
This is independent of y. The Tutte polynomial has only non-negative coefficients [25], so if this limit value holds true for \((x,y_1)\) and \((x,y_2)\), then it also holds for every \(y\in [y_1,y_2]\). Note that for \(x\ge d-1\) and \(y=1\) this was indeed proved by Bencs and Csikvári [4]. In fact, we do not even need to use this result since for \(q=1\) this statement is trivial. By Lemma 3.50 the curve \((q,w_c(q))\) for \(q\ge 2\) reparametrized with x and y is the graph of a monotone increasing function on the interval \([d-1,\infty )\), see the dashed line on Fig. 1. In particular, for \(q\ge 2\) the part of the hyperbola \((x-1)(y-1)=q\) with \(0\le w=y-1\le w_c\) goes under this curve implying \(\Phi _{d,q,w}=q\left( 1+\frac{w}{q}\right) ^{d/2}\). \(\square \)
Remark 3.53
We remark that the same argument also gives that if \(1<q<2\) and \(0\le w\le \frac{q}{d-2}\), then for an essentially large girth sequence of d-regular graphs \((G_n)_n\) we have
Proposition 3.54
For \(q\ge 2\) and \(w>w_c\) we have \(\Phi _{d,q,w}>q\left( 1+\frac{w}{q}\right) ^{d/2}\). Furthermore, the function \(\frac{\partial }{\partial w}\Phi _{d,q,w}\) has a discontinuity at \(w=w_c\) if \(q>2\).
Proof
Consider the function
We show that it is a strictly monotone increasing function in w for every \(t\in \left( 0,\frac{\pi }{2}\right) \). By definition
Then \(\frac{\partial h}{\partial w}\) is given as
This is positive if \(t\in \left( 0,\frac{\pi }{2}\right) \) since
Note that for \(q>2\) there is a \(t_0(w_c)\in \left( 0,\frac{\pi }{2}\right) \) such that \(\Phi _{d,q,w_c}(t_0(w_c))=\Phi _{d,q,w_c}(0)\), that is, \(h(w_c,t_0(w_c))=q\). Then for \(w>w_c\) we have \(h(w,t_0(w_c))>q\) which gives that
For \(q=2\) we know that \(w_c=\frac{2}{d-2}\) and for \(w> \frac{q}{d-2}=\frac{2}{d-2}\) we have \(\frac{\partial }{\partial t}\Phi _{d,q,w}(t)\bigg |_{t=0}=0\) and \(\frac{\partial ^2}{\partial t^2}\Phi _{d,q,w}(t)\bigg |_{t=0}>0\) by Lemma 3.48, so at \(t=0\) we have a local minimum, thus \(\Phi _{d,q,w}>\Phi _{d,q,w}(0)\) for \(w>w_c\).
Next we prove the claim about \(\frac{\partial }{\partial w}\Phi _{d,q,w}\). Let \(w>w_c\) such that \(w-w_c\) is small enough, namely it satisfies
Then
From this it follows that
\(\square \)
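The monotonicity in w used in this proof can be checked on a grid. The sketch below assumes that \(h(w,t)=\Phi _{d,q,w}(t)\big /\left( 1+\frac{w}{q}\right) ^{d/2}\), which is inferred from the relation \(h(w_c,t_0(w_c))=q\) and is not quoted from the display defining h. It also assumes \(a_{q,w,i}(t)=a_i\cos (t)+b_i\sin (t)\). The grids are arbitrary.

```python
import math

def phi(d, q, w, t):
    """Phi_{d,q,w}(t), ASSUMING a_i(t) = a cos t + b_i sin t."""
    a = math.sqrt(1 + w / q)
    b1 = math.sqrt((q - 1) * w / q)
    b2 = -math.sqrt(w / (q * (q - 1)))
    c, s = math.cos(t), math.sin(t)
    return (a * c + b1 * s) ** d + (q - 1) * (a * c + b2 * s) ** d

def h(d, q, w, t):
    """ASSUMED normalisation: h(w, t) = Phi_{d,q,w}(t) / (1 + w/q)^(d/2)."""
    return phi(d, q, w, t) / (1 + w / q) ** (d / 2)

d, q = 4, 5
ws = [0.1 * k for k in range(1, 61)]     # w from 0.1 to 6.0
ts = [0.05 * k for k in range(1, 31)]    # t from 0.05 to 1.5, inside (0, pi/2)

# For each fixed t, h should increase strictly along the w grid.
increasing = all(h(d, q, w2, t) > h(d, q, w1, t)
                 for t in ts for w1, w2 in zip(ws, ws[1:]))

t0_wc = 0.5575988373258864               # t_0 at w = w_c = 2, from Example 3.58
print(increasing, h(d, q, 2.0, t0_wc), h(d, q, 3.0, t0_wc))
```

At \(w=w_c=2\) the value \(h(w_c,t_0(w_c))\) is \(q=5\) up to rounding, and it exceeds 5 at \(w=3\), which is the mechanism behind the strict inequality for \(w>w_c\).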
Remark 3.55
It is well-known that if \(q=2\), then there is a second order phase transition, that is, \(\frac{\partial }{\partial w}\Phi _{d,2,w}\) is continuous, but \(\frac{\partial ^2}{\partial w^2}\Phi _{d,2,w}\) is discontinuous at \(w=w_c=\frac{2}{d-2}\). For details see Chapter 4.8 of [3].
3.10 Examples
In this section we give some examples for the theorems we proved.
Example 3.56
Let \(d=8\), \(q=5\) and \(w=1\). Then the vector
where we kept only the first three digits everywhere. Note that \(10.368=5\cdot \left( 1+\frac{1}{5}\right) ^{8/2}.\) So for every 8-regular graph G we have
Using \(t_0=0.6619549492373429\) we get the vector
again only keeping the first 3 digits everywhere. A more precise value of the first coordinate is 16.277748757985485, and so this is \(\Phi _{8,5,1}\). Note that the sign structure of \(\underline{v}(t_0)\) shows that
for every 8-regular graph G.
Example 3.57
Let \(d=4\), \(q=5\) and \(w=3\). Then
where we again kept only the first three digits everywhere. This time \(t_0=0.8316331320342567\) and \(\Phi _{4,5,3}=16.315621073058985\) while
In this case \(t_1=1.06627054934707\) and the corresponding vector
For the complete graph \(K_5\) on 5 vertices the subgraph counting polynomial is as follows:
All zeros of this polynomial have absolute value approximately 1.0747696. Of course, we could have used any 4-regular graph instead of \(K_5\) (see Fig. 6).
Example 3.58
Let \(d=4\) and \(q=5\) again, but let \(w=w_c=2\). Then
\(t_0=0.5575988373258864\) and
We have \(t_1=1.06419757674722\) and
One can check that
and all of its zeros have absolute value 1.
4 Selected Remarks About the Interval \(1<q<2\)
In this section we collect several remarks about the interval \(1<q<2\).
4.1 Two different quantities
In this section we aim to explain a seemingly minor point that makes the ranges \(q\ge 2\) and \(1<q<2\) genuinely different.
Once again let \(N=M_2'\) and \(\underline{\mu }=\underline{\nu }_2\), and parametrize the distribution h in the Bethe recursion as follows:
Then
and
If \(\textrm{BP}(h)=h\), then by dividing the Bethe recursions for \(h_1\) and \(h_2\) we get that
We remark that if we study the Potts model \(N=M=wI_q+J_q\) and \(\mu \equiv 1\) with
then we would have arrived at the same equation. Let \(\mathcal {R}_{d,q,w}\) be the set of non-negative solutions of this equation. Let \(\mathcal {R}^*_{d,q,w}\) be the set of solutions that also satisfy \(R\ge 1\). Let us introduce the notation
Then we know that
For later use let us also introduce
Similarly, we can consider the pair
In case of \(q\ge 2\) we have
But when \(1<q<2\) we have
While it is still true that for an essentially large girth sequence of d-regular graphs we have
we actually believe that
This means that the rank 2 approximation is not good enough in the interval \(1<q<2\).
Nevertheless, by Remark 3.53 we know that for \(1<q<2\) and \(0\le w\le \frac{q}{d-2}\) we have
We remark that this result is compatible with the conjecture
since for the function \(\Phi _{d,q,w}(t)\) we have
which is negative if \(w<\frac{q}{d-2}\) and positive if \(w>\frac{q}{d-2}\). So in the first case we get that \(t=0\) is a local maximum, in the second case it is a local minimum.
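The sign change at \(w=\frac{q}{d-2}\) can also be observed for \(1<q<2\) with a finite difference. The sketch below assumes \(a_{q,w,i}(t)=a_i\cos (t)+b_i\sin (t)\) and compares a central second difference at \(t=0\) with the expression \(d\left( 1+\frac{w}{q}\right) ^{(d-2)/2}\left( (d-2)w-q\right) \). That expression is derived under the same assumption and is not quoted from the display above. The parameter values and the step size are arbitrary.

```python
import math

def phi(d, q, w, t):
    """Phi_{d,q,w}(t), ASSUMING a_i(t) = a cos t + b_i sin t (needs q > 1)."""
    a = math.sqrt(1 + w / q)
    b1 = math.sqrt((q - 1) * w / q)
    b2 = -math.sqrt(w / (q * (q - 1)))
    c, s = math.cos(t), math.sin(t)
    return (a * c + b1 * s) ** d + (q - 1) * (a * c + b2 * s) ** d

def second_derivative_fd(d, q, w, eps=1e-4):
    """Central second difference of t -> Phi_{d,q,w}(t) at t = 0."""
    return (phi(d, q, w, eps) - 2 * phi(d, q, w, 0.0) + phi(d, q, w, -eps)) / eps ** 2

def second_derivative_closed(d, q, w):
    """Derived expression for the second derivative at t = 0 (see lead-in)."""
    return d * (1 + w / q) ** ((d - 2) / 2) * ((d - 2) * w - q)

d, q = 5, 1.5                  # here q / (d - 2) = 0.5
for w in (0.3, 0.8):           # one value below and one above the threshold
    print(w, second_derivative_fd(d, q, w), second_derivative_closed(d, q, w))
```

The two columns agree up to discretization error, with a negative value for \(w=0.3\) and a positive value for \(w=0.8\).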
References
Abért, M., Hubai, T.: Benjamini–Schramm convergence and the distribution of chromatic roots for sparse graphs. Combinatorica 35(2), 127–151 (2015)
Bandyopadhyay, A., Gamarnik, D.: Counting without sampling: asymptotics of the log-partition function for certain statistical physics models. Random Struct. Algorithms 33(4), 452–479 (2008)
Baxter, R.J.: Exactly Solved Models in Statistical Mechanics. Elsevier, Amsterdam (2016)
Bencs, F., Csikvári, P.: Evaluations of Tutte polynomials of regular graphs. J. Comb. Theory Ser. B 157, 500–523 (2022)
Borbényi, M., Csikvári, P.: Counting degree-constrained subgraphs and orientations. Discrete Math. 343(6), 111842 (2020)
Borgs, C., Chayes, J., Kahn, J., Lovász, L.: Left and right convergence of graphs with bounded degree. Random Struct. Algorithms 42(1), 1–28 (2013)
Carlson, C., Davies, E., Kolla, A.: Efficient algorithms for the Potts model on small-set expanders. arXiv preprint arXiv:2003.01154 (2020)
Carlson, C., Davies, E., Fraiman, N., Kolla, A., Potukuchi, A., Yap, C.: Algorithms for the ferromagnetic Potts model on expanders. arXiv preprint arXiv:2204.01923 (2022)
Chertkov, M., Chernyak, V.Y.: Loop calculus in statistical physics and information science. Phys. Rev. E 73(6), 065102 (2006)
Chertkov, M., Chernyak, V.Y.: Loop series for discrete statistical models on graphs. J. Stat. Mech. Theory Exp. 2006(06), P06009 (2006)
Csikvári, P., Frenkel, P.E.: Benjamini–Schramm continuity of root moments of graph polynomials. Eur. J. Comb. 52, 302–320 (2016)
Dembo, A., Montanari, A.: Ising models on locally tree-like graphs. Ann. Appl. Probab. 20(2), 565–592 (2010)
Dembo, A., Montanari, A., Sly, A., Sun, N.: The replica symmetric solution for Potts models on d-regular graphs. Commun. Math. Phys. 327(2), 551–575 (2014)
Dembo, A., Montanari, A., Sun, N.: Factor models on locally tree-like graphs. Ann. Probab. 41(6), 4162–4213 (2013)
Galanis, A., Štefankovič, D., Vigoda, E., Yang, L.: Ferromagnetic Potts model: refined #BIS-hardness and related results. SIAM J. Comput. 45(6), 2004–2065 (2016)
Grimmett, G.: The random-cluster model. In: Probability on Discrete Structures, pp. 73–123. Springer, Berlin (2004)
Helmuth, T., Jenssen, M., Perkins, W.: Finite-size scaling, phase coexistence, and algorithms for the random cluster model on random graphs. arXiv preprint arXiv:2006.11580 (2020)
Lee, T.-D., Yang, C.-N.: Statistical theory of equations of state and phase transitions. II. Lattice gas and Ising model. Phys. Rev. 87(3), 410 (1952)
McKay, B.D., Wormald, N.C., Wysocka, B.: Short cycles in random regular graphs. Electron. J. Comb. 11, R66–R66 (2004)
Mezard, M., Montanari, A.: Information, Physics, and Computation. Oxford University Press, Oxford (2009)
Ruozzi, N.: The Bethe partition function of log-supermodular graphical models. In: Advances in Neural Information Processing Systems, 25 (2012)
Ruozzi, N.: Beyond log-supermodularity: lower bounds and the Bethe partition function. In: Proceedings of the Twenty-Ninth Conference on Uncertainty in Artificial Intelligence (2013)
Sly, A., Sun, N.: The computational hardness of counting in two-spin models on d-regular graphs. In: 2012 IEEE 53rd Annual Symposium on Foundations of Computer Science, pp. 361–369. IEEE (2012)
Sly, A., Sun, N.: Counting in two-spin models on d-regular graphs. Ann. Probab. 42(6), 2383–2416 (2014)
Tutte, W.T.: A contribution to the theory of chromatic polynomials. Can. J. Math. 6, 80–91 (1954)
Wagner, D.G.: Weighted enumeration of spanning subgraphs with degree constraints. J. Comb. Theory Ser. B 99(2), 347–357 (2009)
Yang, C.-N., Lee, T.-D.: Statistical theory of equations of state and phase transitions. I. Theory of condensation. Phys. Rev. 87(3), 404 (1952)
Acknowledgements
We are very grateful to the anonymous reviewers for the careful reading and the suggestions leading to a significant improvement in the presentation of this paper.
Funding
Open access funding provided by ELKH Alfréd Rényi Institute of Mathematics.
Ethics declarations
Conflict of interest
All authors declare that they have no conflicts of interest.
Additional information
Communicated by J. Ding.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
The first author is supported by the NKFIH (National Research, Development and Innovation Office, Hungary) grant KKP-133921. The second author is supported by the ÚNKP-21-2 New National Excellence Program of the Ministry for Innovation and Technology from the source of the National Research, Development and Innovation Fund. The third author is supported by the Counting in Sparse Graphs Lendület Research Group.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Bencs, F., Borbényi, M. & Csikvári, P. Random Cluster Model on Regular Graphs. Commun. Math. Phys. 399, 203–248 (2023). https://doi.org/10.1007/s00220-022-04552-1
DOI: https://doi.org/10.1007/s00220-022-04552-1