Abstract
We study the problem of deterministic approximate counting of matchings and independent sets in graphs of bounded connective constant. More generally, we consider the problem of evaluating the partition functions of the monomer-dimer model (which is defined as a weighted sum over all matchings where each matching \(M\) is given a weight \(\gamma ^{|V| - 2|M|}\) in terms of a fixed parameter \(\gamma \) called the monomer activity) and the hard core model (which is defined as a weighted sum over all independent sets where an independent set \(I\) is given a weight \(\lambda ^{|I|}\) in terms of a fixed parameter \(\lambda \) called the vertex activity). The connective constant is a natural measure of the average degree of a graph which has been studied extensively in combinatorics and mathematical physics, and can be bounded by a constant even for certain unbounded degree graphs such as those sampled from the sparse Erdős–Rényi model \(\mathcal {G}(n, d/n)\). Our main technical contribution is to prove the best possible rates of decay of correlations in the natural probability distributions induced by both the hard core model and the monomer-dimer model in graphs with a given bound on the connective constant. These results on decay of correlations are obtained using a new framework based on the so-called message approach that has been used extensively in recent work to prove such results for bounded degree graphs. We then use these optimal decay of correlations results to obtain fully polynomial time approximation schemes (FPTASs) for the two problems on graphs of bounded connective constant. In particular, for the monomer-dimer model, we give a deterministic FPTAS for the partition function on all graphs of bounded connective constant for any given value of the monomer activity. The best previously known deterministic algorithm was due to Bayati et al. (Proc. 39th ACM Symp. Theory Comput., pp. 122–127, 2007), and gave the same runtime guarantees as our results but only for the case of bounded degree graphs. For the hard core model, we give an FPTAS for graphs of connective constant \(\varDelta \) whenever the vertex activity \(\lambda < \lambda _c(\varDelta )\), where \(\lambda _c(\varDelta ) :=\frac{\varDelta ^\varDelta }{(\varDelta - 1)^{\varDelta + 1}}\); this result is optimal in the sense that an FPTAS for any \(\lambda > \lambda _c(\varDelta )\) would imply that NP = RP (Sly and Sun, Ann. Probab. 42(6):2383–2416, 2014). The previous best known result in this direction was in a recent manuscript by a subset of the current authors (Proc. 54th IEEE Symp. Found. Comput. Sci., pp. 300–309, 2013), where the result was established under the suboptimal condition \(\lambda < \lambda _c(\varDelta + 1)\). Our techniques also allow us to improve upon known bounds for decay of correlations for the hard core model on various regular lattices, including those obtained by Restrepo et al. (Probab. Theory Relat. Fields 156(1–2):75–99, 2013) for the special case of \(\mathbb {Z}^2\) using sophisticated numerically intensive methods tailored to that special case.
Notes
For the large class of self-reducible problems, it can be shown that approximating the partition function is polynomial-time equivalent to approximate sampling from the Gibbs distribution [24].
An FPTAS for a quantity \(\mathcal {A}\) is a deterministic algorithm which, given an accuracy parameter \(\epsilon > 0\), outputs in time polynomial in \(1/\epsilon \) and the rest of the input an estimate \(\hat{A}\) satisfying \((1-\epsilon )\mathcal {A}< \hat{A} < (1+\epsilon )\mathcal {A}\). A randomized version of such an algorithm is called a fully polynomial time randomized approximation scheme (FPRAS). Both of these are natural and standard notions of algorithmic approximation. See Sect. 2.1 for details.
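As a small illustration of this guarantee (a sketch of ours, not code from the paper; the helper name is hypothetical), the multiplicative \((1\pm \epsilon )\) condition on an estimate can be checked as follows:

```python
def within_fptas_error(value, estimate, eps):
    """Check the multiplicative (1 - eps, 1 + eps) guarantee of an FPTAS estimate.

    `value` is the true quantity A > 0; `estimate` is the algorithm's output.
    """
    return (1 - eps) * value < estimate < (1 + eps) * value

# An estimate within 1% of the truth satisfies the guarantee for eps = 0.05,
# while a 10% overshoot does not.
print(within_fptas_error(100.0, 101.0, 0.05))  # True
print(within_fptas_error(100.0, 110.0, 0.05))  # False
```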
For a description of how this boundary condition is computed, we refer the reader to the third paragraph of Appendix A.
Since the degree of every vertex v in the graph is n, every boundary condition \(\sigma \) satisfies \(1 \ge F_v(\sigma ) \ge \frac{1}{1+\gamma n}\). Substituting these bounds in the definition of M in Lemma 3 yields the claimed bound.
References
Alm, S.E.: Upper bounds for the connective constant of self-avoiding walks. Comb. Probab. Comput. 2(02), 115–136 (1993). doi:10.1017/S0963548300000547
Alm, S.E.: Upper and lower bounds for the connective constants of self-avoiding walks on the Archimedean and Laves lattices. J. Phys. A 38(10), 2055–2080 (2005). doi:10.1088/0305-4470/38/10/001
Andrews, G.E.: The hard-hexagon model and Rogers–Ramanujan type identities. Proc. Nat. Acad. Sci. 78(9), 5290–5292 (1981). http://www.pnas.org/content/78/9/5290. PMID: 16593082
Bandyopadhyay, A., Gamarnik, D.: Counting without sampling: asymptotics of the log-partition function for certain statistical physics models. Random Struct. Algorithms 33(4), 452–479 (2008)
Baxter, R.J.: Hard hexagons: exact solution. J. Phys. A: Math. Gen. 13(3), L61–L70 (1980). doi:10.1088/0305-4470/13/3/007. http://iopscience.iop.org/03054470/13/3/007
Baxter, R.J., Enting, I.G., Tsang, S.K.: Hard-square lattice gas. J. Stat. Phys. 22(4), 465–489 (1980). doi:10.1007/BF01012867. http://link.springer.com/article/10.1007/BF01012867
Bayati, M., Gamarnik, D., Katz, D., Nair, C., Tetali, P.: Simple deterministic approximation algorithms for counting matchings. In: Proc. 39th ACM Symp. Theory Comput., pp. 122–127. ACM (2007). doi:10.1145/1250790.1250809
Broadbent, S.R., Hammersley, J.M.: Percolation processes I. Crystals and mazes. Math. Proc. Camb. Philos. Soc. 53(03), 629–641 (1957). doi:10.1017/S0305004100032680
Duminil-Copin, H., Smirnov, S.: The connective constant of the honeycomb lattice equals \(\sqrt{2+\sqrt{2}}\). Ann. Math. 175(3), 1653–1665 (2012). doi:10.4007/annals.2012.175.3.14
Dyer, M., Greenhill, C.: On Markov chains for independent sets. J. Algorithms 35(1), 17–49 (2000)
Efthymiou, C.: MCMC sampling colourings and independent sets of \(\mathcal {G}(n,d/n)\) near uniqueness threshold. In: Proc. 25th ACM-SIAM Symp. Discret. Algorithms, pp. 305–316. SIAM (2014). Full version available at arXiv:1304.6666
Galanis, A., Ge, Q., Štefankovič, D., Vigoda, E., Yang, L.: Improved inapproximability results for counting independent sets in the hard-core model. Random Struct. Algorithms 45(1), 78–110 (2014). doi:10.1002/rsa.20479
Gamarnik, D., Katz, D.: Correlation decay and deterministic FPTAS for counting list-colorings of a graph. In: Proc. 18th ACM-SIAM Symp. Discret. Algorithms, pp. 1245–1254. SIAM (2007). http://dl.acm.org/citation.cfm?id=1283383.1283517
Gaunt, D.S., Fisher, M.E.: Hard-sphere lattice gases. I. Plane-square lattice. J. Chem. Phys. 43(8), 2840–2863 (1965). doi:10.1063/1.1697217. http://scitation.aip.org/content/aip/journal/jcp/43/8/10.1063/1.1697217
Georgii, H.O.: Gibbs Measures and Phase Transitions. De Gruyter Studies in Mathematics, Walter de Gruyter Inc. (1988)
Godsil, C.D.: Matchings and walks in graphs. J. Graph Th. 5(3), 285–297 (1981). http://onlinelibrary.wiley.com/doi/10.1002/jgt.3190050310/abstract
Goldberg, L.A., Jerrum, M., Paterson, M.: The computational complexity of two-state spin systems. Random Struct. Algorithms 23, 133–154 (2003)
Goldberg, L.A., Martin, R., Paterson, M.: Strong spatial mixing with fewer colors for lattice graphs. SIAM J. Comput. 35(2), 486–517 (2005). doi:10.1137/S0097539704445470. http://link.aip.org/link/?SMJ/35/486/1
Hammersley, J.M.: Percolation processes II. The connective constant. Math. Proc. Camb. Philos. Soc. 53(03), 642–645 (1957). doi:10.1017/S0305004100032692
Hammersley, J.M., Morton, K.W.: Poor man’s Monte Carlo. J. Royal Stat. Soc. B 16(1), 23–38 (1954). doi:10.2307/2984008
Hayes, T.P., Vigoda, E.: Coupling with the stationary distribution and improved sampling for colorings and independent sets. Ann. Appl. Probab. 16(3), 1297–1318 (2006)
Jensen, I.: Enumeration of self-avoiding walks on the square lattice. J. Phys. A 37(21), 5503 (2004). doi:10.1088/0305-4470/37/21/002
Jerrum, M., Sinclair, A.: Approximating the permanent. SIAM J. Comput. 18(6), 1149–1178 (1989). doi:10.1137/0218077. http://epubs.siam.org/doi/abs/10.1137/0218077
Jerrum, M., Valiant, L.G., Vazirani, V.V.: Random generation of combinatorial structures from a uniform distribution. Theor. Comput. Sci. 43, 169–188 (1986)
Kahn, J., Kim, J.H.: Random matchings in regular graphs. Combinatorica 18(2), 201–226 (1998). doi:10.1007/PL00009817. http://link.springer.com/article/10.1007/PL00009817
Kesten, H.: On the number of self-avoiding walks. II. J. Math. Phys. 5(8), 1128–1137 (1964). doi:10.1063/1.1704216
Li, L., Lu, P., Yin, Y.: Approximate counting via correlation decay in spin systems. In: Proc. 23rd ACM-SIAM Symp. Discret. Algorithms, pp. 922–940. SIAM (2012)
Li, L., Lu, P., Yin, Y.: Correlation decay up to uniqueness in spin systems. In: Proc. 24th ACM-SIAM Symp. Discret. Algorithms, pp. 67–84. SIAM (2013)
Luby, M., Vigoda, E.: Approximately counting up to four. In: Proc. 29th ACM Symp. Theory. Comput., pp. 682–687. ACM (1997). doi:10.1145/258533.258663
Lyons, R.: The Ising model and percolation on trees and tree-like graphs. Commun. Math. Phys. 125(2), 337–353 (1989)
Lyons, R.: Random walks and percolation on trees. Ann. Probab. 18(3), 931–958 (1990). doi:10.1214/aop/1176990730
Madras, N., Slade, G.: The Self-Avoiding Walk. Birkhäuser (1996)
Martinelli, F., Olivieri, E.: Approach to equilibrium of Glauber dynamics in the one phase region I. The attractive case. Comm. Math. Phys. 161(3), 447–486 (1994). doi:10.1007/BF02101929. http://link.springer.com/article/10.1007/BF02101929
Martinelli, F., Olivieri, E.: Approach to equilibrium of Glauber dynamics in the one phase region II. The general case. Comm. Math. Phys. 161(3), 487–514 (1994). doi:10.1007/BF02101930. http://link.springer.com/article/10.1007/BF02101930
Mossel, E.: Survey: Information flow on trees. In: Graphs, Morphisms and Statistical Physics, DIMACS Series in Discrete Mathematics and Theoretical Computer Science, vol. 63, pp. 155–170. American Mathematical Society (2004)
Mossel, E., Sly, A.: Rapid mixing of Gibbs sampling on graphs that are sparse on average. Random Struct. Algorithms 35(2), 250–270 (2009). doi:10.1002/rsa.20276
Mossel, E., Sly, A.: Gibbs rapidly samples colorings of \(\mathcal {G}(n, d/n)\). Probab. Theory Relat. Fields 148(1–2), 37–69 (2010). doi:10.1007/s00440-009-0222-x
Mossel, E., Sly, A.: Exact thresholds for Ising-Gibbs samplers on general graphs. Ann. Probab. 41(1), 294–328 (2013). doi:10.1214/11-AOP737
Nienhuis, B.: Exact critical point and critical exponents of \(O(n)\) models in two dimensions. Phys. Rev. Let. 49(15), 1062–1065 (1982). doi:10.1103/PhysRevLett.49.1062
Pemantle, R., Steif, J.E.: Robust phase transitions for Heisenberg and other models on general trees. Ann. Probab. 27(2), 876–912 (1999)
Pönitz, A., Tittmann, P.: Improved upper bounds for self-avoiding walks in \(\mathbb{Z}^{d}\). Electron. J. Comb., 7, Research Paper 21 (2000)
Restrepo, R., Shin, J., Tetali, P., Vigoda, E., Yang, L.: Improved mixing condition on the grid for counting and sampling independent sets. Probab. Theory Relat. Fields 156(1–2), 75–99 (2013). Extended abstract in Proc. IEEE Symp. Found. Comput. Sci., 2011
Sinclair, A., Srivastava, P., Thurley, M.: Approximation algorithms for two-state antiferromagnetic spin systems on bounded degree graphs. J. Stat. Phys. 155(4), 666–686 (2014)
Sinclair, A., Srivastava, P., Yin, Y.: Spatial mixing and approximation algorithms for graphs with bounded connective constant. In: Proc. 54th IEEE Symp. Found. Comput. Sci., pp. 300–309. IEEE Computer Society (2013). Full version available at arXiv:1308.1762v1
Sly, A.: Computational transition at the uniqueness threshold. In: Proc. 51st IEEE Symp. Found. Comput. Sci., pp. 287–296. IEEE Computer Society (2010)
Sly, A., Sun, N.: Counting in two-spin models on \(d\)-regular graphs. Ann. Probab. 42(6), 2383–2416 (2014). doi:10.1214/13-AOP888. http://projecteuclid.org/euclid.aop/1412083628
Vera, J.C., Vigoda, E., Yang, L.: Improved bounds on the phase transition for the hard-core model in 2-dimensions. In: Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, Lecture Notes in Computer Science, vol. 8096, pp. 699–713. Springer Berlin Heidelberg (2013). doi:10.1007/978-3-642-40328-6_48
Vigoda, E.: A note on the Glauber dynamics for sampling independent sets. Electron. J. Combin. 8(1), R8 (2001). http://www.combinatorics.org/ojs/index.php/eljc/article/view/v8i1r8
Weisstein, E.W.: Selfavoiding walk connective constant. From MathWorld–a Wolfram Web Resource. http://mathworld.wolfram.com/SelfAvoidingWalkConnectiveConstant.html
Weitz, D.: Counting independent sets up to the tree threshold. In: Proc. 38th ACM Symp. Theory Comput., pp. 140–149. ACM (2006). doi:10.1145/1132516.1132538
Acknowledgments
We thank Elchanan Mossel, Allan Sly, Eric Vigoda and Dror Weitz for helpful discussions. We also thank the anonymous referees for detailed helpful comments.
Author information
Authors and Affiliations
Corresponding author
Additional information
Slightly weaker versions of some of the results in this paper appeared in an extended abstract in Proceedings of the IEEE Symposium on the Foundations of Computer Science (FOCS), 2013, pp. 300–309 [44]. This version strengthens the main result (Theorem 1.3) of [44] to obtain an optimal setting of the parameters, and adds new results for the monomer-dimer model. An extended abstract of the current results appeared in the Proceedings of the ACM-SIAM Symposium on Discrete Algorithms, 2015, pp. 1549–1563.
AS was supported in part by NSF grant CCF-1016896 and by the Simons Institute for the Theory of Computing. PS was supported by NSF grant CCF-1319745 and NSF grant CCF-1016896. DŠ was supported by NSF grant CCF-1016896 and was visiting the Simons Institute for the Theory of Computing when this work was done. YY was supported by NSFC Grants 61272081 and 61321491 and did part of this work while he was visiting UC Berkeley.
Appendices
A Description of numerical results
In this section, we describe the derivation of the numerical bounds in Table 1. As in [44], all of the bounds are direct applications of Theorem 1 using published upper bounds on the connective constant for the appropriate graph (except for the starred bound of 2.538 for the case of \(\mathbb {Z}^2\), which we discuss in greater detail below). The exact connective constant is not known for the Cartesian lattices \(\mathbb {Z}^2, \mathbb {Z}^3, \mathbb {Z}^4, \mathbb {Z}^5\) and \(\mathbb {Z}^6\), and the triangular lattice \(\mathbb {T}\), and we use the rigorous upper and lower bounds available in the literature [32, 49]. In contrast, for the honeycomb lattice \(\mathbb {H}\), Duminil-Copin and Smirnov [9] rigorously established in a recent breakthrough that the connective constant is \(\sqrt{2 + \sqrt{2}}\), and this is the bound we use for that lattice. In order to apply Theorem 1 for a given lattice of connective constant at most \(\varDelta \), we simply need to compute \(\lambda _c(\varDelta ) = \frac{\varDelta ^\varDelta }{(\varDelta - 1)^{(\varDelta +1)}}\), and the monotonicity of \(\lambda _c\) guarantees that the lattice exhibits strong spatial mixing as long as \(\lambda < \lambda _c(\varDelta )\).
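As a quick numerical sketch (ours, not part of the paper), the critical activity \(\lambda _c(\varDelta )\) can be evaluated directly from its formula; since \(\lambda _c\) is decreasing in \(\varDelta \), an upper bound on the connective constant yields a valid mixing threshold:

```python
import math

def lambda_c(delta):
    """Critical activity lambda_c(delta) = delta^delta / (delta - 1)^(delta + 1).

    `delta` is a (not necessarily integer) bound on the connective constant;
    the formula requires delta > 1.
    """
    return delta ** delta / (delta - 1) ** (delta + 1)

# Integer sanity check: lambda_c(2) = 2^2 / 1^3 = 4
print(lambda_c(2.0))  # 4.0

# Honeycomb lattice: connective constant sqrt(2 + sqrt(2)) [9]
print(lambda_c(math.sqrt(2 + math.sqrt(2))))
```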
We now consider the special case of \(\mathbb {Z}^2\). As we pointed out in the introduction, any improvement in the connective constant of a lattice (or that of the Weitz SAW tree corresponding to the lattice) will immediately lead to an improvement in our bounds. In fact, as we discuss below, Weitz’s construction allows for significant freedom in the choice of the SAW tree. We show here that using a tighter combinatorial analysis of the connective constant of a suitably chosen Weitz SAW tree of \(\mathbb {Z}^2\), we can improve upon the bounds obtained by Restrepo et al. [42] and Vera et al. [47] using sophisticated methods tailored to the special case of \(\mathbb {Z}^2\). Our basic idea is to exploit the fact that the Weitz SAW tree adds additional boundary conditions to the canonical SAW tree of the lattice. Thus, it allows a strictly smaller number of self-avoiding walks than the canonical SAW tree, and therefore can have a smaller connective constant than that of the lattice itself. Further, as in [44], the proof of Theorem 1 only uses the Weitz SAW tree, and hence the bounds obtained there clearly hold if the connective constant of the Weitz SAW tree is used in place of the connective constant of the lattice.
The freedom in the choice of the Weitz SAW tree—briefly alluded to above—also offers the opportunity to incorporate another tweak which can potentially increase the effect of the boundary conditions on the connective constant. In Weitz’s construction, the boundary conditions on the SAW tree are obtained in the following way (see Theorem 3.1 in [50]). First, the neighbors of each vertex are ordered in a completely arbitrary fashion: this ordering need not even be consistent across vertices. Whenever a loop, say \(v_0, v_1, \ldots , v_l, v_0\), is encountered in the construction of the SAW tree, the occurrence of \(v_0\) which closes the loop is added to the tree along with a boundary condition which is determined by the ordering at \(v_0\): if the neighbor \(v_1\) (which “started” the loop) happens to be smaller than \(v_l\) (the last vertex before the loop is discovered) in the ordering, then the last copy of \(v_0\) appears in the tree fixed as “occupied”, while otherwise, it appears as “unoccupied”.
The orderings at the vertices need not even be fixed in advance, and different copies of the vertex v appearing in the SAW tree can have different orderings, as long as the ordering at a vertex v in the tree is a function only of the path from the root of the tree to v. We now specialize our discussion to \(\mathbb {Z}^2\). The simplest such ordering is the “uniform ordering”, where we put an ordering on the cardinal directions north, south, east and west, and order the neighbors at each vertex in accordance with this ordering on the directions. This was the approach used by Restrepo et al. [42].
However, it seems intuitively clear that it should be possible to eliminate more vertices in the tree by allowing the ordering at a vertex v in the tree to depend upon the path taken from the origin to v. We use a simple implementation of this idea by using a “relative ordering” which depends only upon the last step of such a path. In particular, there are only three possible options available at a vertex v in the tree (except the root), assuming the parent of v in the tree is u: the first is to go straight, i.e., to proceed to the neighbor of v which lies in the same direction as the vector \(v - u\) (where v and u are viewed as points in \(\mathbb {Z}^2\)). Analogously, we can also turn left or right with respect to this direction. Our ordering simply stipulates that straight > right > left.
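The relative ordering can be sketched as follows (a toy illustration of ours, not the paper's code; directions are encoded as complex units, and which rotation counts as "right" is a convention we fix arbitrarily):

```python
def ordered_moves(last_step):
    """Return the possible next steps from a vertex, ordered straight > right > left.

    `last_step` is the unit step (east=1, north=1j, west=-1, south=-1j) taken to
    reach the current vertex. The reverse direction would immediately close a
    2-cycle, so it is excluded. Multiplying by -1j rotates clockwise, which we
    take as a "right" turn (a convention, depending on plane orientation).
    """
    straight = last_step
    right = last_step * (-1j)
    left = last_step * 1j
    return [straight, right, left]

# After a step north (1j), the walk may continue north, east, or west,
# in that order of preference; south (the backtracking move) is excluded.
moves = ordered_moves(1j)
```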
To upper bound the connective constant of the Weitz SAW tree, we use the standard method of finite-memory self-avoiding walks [32]—these are walks which are constrained only to not have cycles of length up to some finite length L. Clearly, the number of such walks of any given length \(\ell \) upper bounds \(N(v, \ell )\). In order to bring the boundary conditions on the Weitz SAW tree into play, we further enforce the constraint that the walk is not allowed to make any moves which will land it in a vertex fixed to be “unoccupied” by Weitz’s boundary conditions (note that a vertex u can be fixed to be “unoccupied” also because one of its children is fixed to be “occupied”: the independent set constraint forces u itself to be “unoccupied” in this case, and hence leads to additional pruning of the tree by allowing the other children of u to be ignored). Such a walk can be in one of a finite number k (depending upon L) of states, such that the number of possible moves it can make from state i to state j while respecting the above constraints is some finite number \(M_{ij}\). The \(k\times k\) matrix \(M = (M_{ij})_{i,j\in [k]}\) is called the branching matrix [42]. We therefore get \(N(v,\ell ) \le \varvec{e_1}^TM^\ell \varvec{1}\), where \(\varvec{1}\) denotes the all 1’s vector, and \(\varvec{e_1}\) denotes the coordinate vector for the state of the zero-length walk.
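As a minimal illustration of the bound \(N(v,\ell ) \le \varvec{e_1}^TM^\ell \varvec{1}\) (our toy example, not the paper's \(L=26\) computation): for memory \(L=2\) on \(\mathbb {Z}^2\), i.e., walks that merely never backtrack, the states are the zero-length walk plus the four last-step directions, and the matrix-power bound counts such walks exactly:

```python
import numpy as np

# States: 0 = zero-length walk; 1..4 = last step was E, N, W, S respectively.
# A non-backtracking walk on Z^2 may continue in any direction except the
# reverse of its last step; opposite pairs are (E, W) = (1, 3), (N, S) = (2, 4).
M = np.zeros((5, 5), dtype=np.int64)
M[0, 1:] = 1                      # first step: any of the 4 directions
for i in range(1, 5):
    opposite = ((i - 1 + 2) % 4) + 1
    for j in range(1, 5):
        if j != opposite:
            M[i, j] = 1           # 3 allowed continuations per direction

e1 = np.zeros(5, dtype=np.int64); e1[0] = 1
ones = np.ones(5, dtype=np.int64)

for ell in range(1, 7):
    bound = e1 @ np.linalg.matrix_power(M, ell) @ ones
    # For memory 2 the bound counts non-backtracking walks exactly: 4 * 3^(ell-1)
    print(ell, bound)
```

The lower-right \(4\times 4\) block of M has three 1's in each row, so its largest eigenvalue is 3, recovering the classical non-backtracking upper bound of 3 on the connective constant of \(\mathbb {Z}^2\).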
Since the entries of M are nonnegative, the Perron–Frobenius theorem implies that one of the maximum magnitude eigenvalues of the matrix M is a positive real number \(\gamma \). Using Gelfand’s formula (which states that \(\gamma = \lim _{\ell \rightarrow \infty }\Vert M^\ell \Vert _{}^{1/\ell }\), for any fixed matrix norm) with the \(\ell _\infty \) norm to get the last equality, we see that
\[\limsup _{\ell \rightarrow \infty } N(v, \ell )^{1/\ell } \le \lim _{\ell \rightarrow \infty }\left( \varvec{e_1}^TM^\ell \varvec{1}\right) ^{1/\ell } \le \lim _{\ell \rightarrow \infty }\Vert M^\ell \Vert _{\infty }^{1/\ell } = \gamma .\]
Hence, the largest real eigenvalue \(\gamma \) of M gives a bound on the connective constant of the Weitz SAW tree.
Using the matrix M corresponding to walks in which cycles of length at most \(L=26\) are avoided, we get that the connective constant of the Weitz SAW tree is at most 2.433 (we explicitly construct the matrix M and then use Matlab to compute its largest eigenvalue). Using this bound for \(\varDelta \), and applying Theorem 1 as described above, we get the bound 2.529 for \(\lambda \) in the notation of the table, which is better than the bounds obtained by Restrepo et al. [42] and Vera et al. [47]. With additional computational optimizations, we can go further and analyze self-avoiding walks avoiding cycles of length at most \(L=30\). The first optimization is merging “isomorphic” states (this decreases the number of states, and hence the size of M, significantly, allowing computation of the largest eigenvalue): formally, the state of a SAW will be a suffix of length s such that the Manhattan distance between the final point and the point s steps in the past is less than \(L-s\) (note that the state of a vertex in the SAW tree can be determined from the state of its parent and the last step), and two states are isomorphic if they have the same neighbors at the next step of the walk. The second optimization is computing the largest eigenvalue using the power method. For \(L=30\), we obtain that the connective constant of the Weitz SAW tree is at most 2.429, which on applying Theorem 1 yields the bound 2.538 for \(\lambda \), as quoted in Table 1.
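The power method step can be sketched as follows (a generic routine under our own naming, not the authors' code; for a nonnegative primitive matrix the iteration converges to the Perron eigenvalue \(\gamma \)):

```python
import numpy as np

def perron_eigenvalue(M, iterations=500):
    """Estimate the largest (Perron) eigenvalue of a nonnegative matrix by power iteration.

    Repeatedly applies M to a positive vector and renormalizes in the infinity
    norm; the normalization factors converge to the eigenvalue when M is primitive.
    """
    x = np.ones(M.shape[0])
    gamma = 0.0
    for _ in range(iterations):
        y = M @ x
        gamma = np.max(np.abs(y))   # infinity-norm normalization factor
        x = y / gamma
    return gamma

# Adjacency matrix of K_4 (each state can move to the 3 others, as for
# non-backtracking walks on Z^2): Perron eigenvalue 3.
A = np.ones((4, 4)) - np.eye(4)
print(perron_eigenvalue(A))  # 3.0
```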
B Proofs omitted from Sect. 3
We include here a proof of Lemma 1 for the convenience of the reader.
Proof of Lemma 1
Define \(H(t) :=f_{d,\lambda }^\phi (t\varvec{x} + (1t)\varvec{y})\) for \(t \in [0,1]\). By the scalar mean value theorem applied to H, we have
Let \(\psi \) denote the inverse of the message \(\phi \): the derivative of \(\psi \) is given by \(\psi '(y) = \frac{1}{\varPhi (\psi (y))}\), where \(\varPhi \) is the derivative of \(\phi \). We now define the vector \(\varvec{z}\) by setting \(z_i = \psi (sx_i + (1s)y_i)\) for \(1 \le i \le d\). We then have
We recall that for simplicity, we are using here the somewhat nonstandard notation \(\frac{\partial f}{\partial z_i}\) for the value of the partial derivative \(\frac{\partial f}{\partial R_i}\) at the point \(\varvec{R} = \varvec{z}\).
We now give the proof of Lemma 3. The proof is syntactically identical to the proof of a similar lemma in [44], and the only difference (which is of course crucial for our purposes) is the use of the more specialized Lemma 2 in the inductive step.
Proof of Lemma 3
Recall that given a vertex v in \(T_{\le C}\), \(T_v\) is the subtree rooted at v and containing all the descendants of v, and \(F_v(\sigma )\) is the value computed by the recurrence at the root v of \(T_v\) under the initial condition \(\sigma \) restricted to \(T_v\). We will denote by \(C_v\) the restriction of the cutset C to \(T_v\).
By induction on the structure of \(T_\rho \), we will now show that for any vertex v in \(T_\rho \), at a given distance from \(\rho \) and with arity \(d_v\), one has
To see that this implies the claim of the lemma, we observe that since \(F_\rho (\sigma )\) and \(F_\rho (\tau )\) are in the interval [0, b], we have \(|F_\rho (\sigma ) - F_\rho (\tau )| \le \frac{1}{L}|\phi (F_\rho (\sigma )) - \phi (F_\rho (\tau ))|\). Hence, taking \(v = \rho \) in eq. (20), the claim of the lemma follows from the above observation.
We now proceed to prove eq. (20). The base case of the induction consists of vertices v which are either of arity 0 or which are in C. In the first case (which includes the case where v is fixed by both the initial conditions to the same value), we clearly have \(F_v(\sigma ) = F_v(\tau )\), and hence the claim is trivially true. In the second case, we have \(C_v = \left\{ v\right\} \), and all the children of v must lie in \(C'\). Thus, in this case, the claim is true by the definition of M.
We now proceed to the inductive case. Let \(v_1, v_2, \ldots , v_{d_v}\) be the children of v, which satisfy eq. (20) by induction. In the remainder of the proof, we suppress the dependence of \(\xi \) on \(\phi \) and q. Applying Lemma 2 followed by the induction hypothesis, we then have, for some positive integer \(k\le d_v\)
This completes the induction.
C Proofs omitted from Sect. 4
C.1 Maximum of \(\nu _\lambda \) and implications for strong spatial mixing
In this section, we prove Lemma 6. We begin by giving a brief overview of how the value of q given in eq. (12) is chosen. As discussed in the paragraph following the equation, the idea is to start with the fact that \(\xi _{\phi , q}(\varDelta _c) = \frac{1}{\varDelta _c}\) independent of the value of q, and then to choose q so that \(\xi _{\phi , q}(d)\) is maximized at \(d = \varDelta _c\). Differentiation of \(\xi _{\phi , q}(d)\) with respect to d for fixed \(\lambda \) and q leads to the expression in eq. (22), where \(\tilde{x} = \tilde{x}_{\lambda }(d)\) is defined as the unique positive solution of the equation \(d\tilde{x} = 1 + f_d(\tilde{x})\). We then require that, as a necessary condition for \(\xi _{\phi , q}(d)\) to be maximized at \(d = \varDelta _c\), this derivative should vanish at this value of d. Using the fact that \(\tilde{x}_\lambda (\varDelta _c) = \frac{1}{\varDelta _c - 1}\), we then see that q must be equal to the value given in eq. (12) for this requirement to be satisfied. The rest of the proof of the lemma then shows that when q is so chosen, the function \(\nu _\lambda (d) :=\xi _{\phi , q}(d)\) is indeed maximized at \(d = \varDelta _c\).
Proof of Lemma 6
We first prove that given \(\lambda \), \(\tilde{x}_\lambda (d)\) is a decreasing function of d. For ease of notation, we suppress the dependence of \(\tilde{x}_\lambda (d)\) on d and \(\lambda \). From Lemma 5, we know that \(\tilde{x}\) is the unique positive solution of \(d\tilde{x} = 1 + f_d(\tilde{x})\). Differentiating the equation with respect to d (and denoting \(\frac{\text {d}\tilde{x}}{\text {d}d}\) by \(\tilde{x}'\)), we have
\[\tilde{x} + d\tilde{x}' = -f_d(\tilde{x})\log (1+\tilde{x}) - \frac{d\, f_d(\tilde{x})}{1+\tilde{x}}\,\tilde{x}',\]
which in turn yields
\[\tilde{x}' = -\frac{\tilde{x} + f_d(\tilde{x})\log (1+\tilde{x})}{d\left( 1 + \frac{f_d(\tilde{x})}{1+\tilde{x}}\right) } = -\frac{\tilde{x} + (d\tilde{x} - 1)\log (1+\tilde{x})}{d\left( 1 + \frac{d\tilde{x} - 1}{1+\tilde{x}}\right) }.\]
Since \(\tilde{x} \ge \frac{1}{d} \ge 0\), this shows that \(\tilde{x}\) is a decreasing function of d.
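As a numerical sanity check (our sketch, assuming the standard hard core tree recurrence \(f_{d,\lambda }(x) = \lambda /(1+x)^d\), which is consistent with the fixed point identities used in this proof), one can solve \(d\tilde{x} = 1 + f_{d,\lambda }(\tilde{x})\) by bisection and observe the claimed monotonicity:

```python
def x_tilde(d, lam, tol=1e-12):
    """Solve d*x = 1 + lam/(1+x)^d for the unique positive root by bisection.

    g(x) = d*x - 1 - lam/(1+x)^d is increasing in x > 0 (the left side grows,
    the subtracted term shrinks), so it has exactly one positive zero.
    """
    g = lambda x: d * x - 1 - lam / (1 + x) ** d
    lo, hi = 0.0, 1.0
    while g(hi) < 0:          # grow the bracket until g changes sign
        hi *= 2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# For fixed activity lam, the root decreases as d grows.
vals = [x_tilde(d, 1.0) for d in (2, 3, 4, 5)]
print(vals == sorted(vals, reverse=True))  # True
```

At the critical activity \(\lambda = \lambda _c(d)\) the root is \(\frac{1}{d-1}\); for instance, \(d = 3\), \(\lambda = 27/16\) gives \(\tilde{x} = 1/2\).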
We now consider the derivative of \(\nu _\lambda (d)\) with respect to d. Recalling that \(\nu _\lambda (d) = \xi (d)= \varXi (d, \tilde{x}_\lambda (d))\) and then using the chain rule, we have
Here, we use \(1 + f_{d,\lambda }(\tilde{x}) = d\tilde{x}\) to get the last equality. We now note that the quantity inside the square brackets is a strictly increasing function of \(\tilde{x}\), and hence a strictly decreasing function of d. Since \(\varXi (d, \tilde{x})\) is positive, this implies that there can be at most one positive zero of \(\nu _\lambda '(d)\), and if such a zero exists, it is the unique maximum of \(\nu _\lambda (d)\).
We now complete the proof by showing that \(\nu _\lambda '(d) = 0\) for \(d = \varDelta _c(\lambda )\). At such a d, we have \(\lambda = \lambda _c(d) = \frac{d^d}{(d-1)^{d+1}}\). We then observe that \(\tilde{x}(d) = \frac{1}{d-1}\), since
\[f_d\!\left( \tfrac{1}{d-1}\right) = \frac{d^d}{(d-1)^{d+1}}\cdot \frac{(d-1)^d}{d^d} = \frac{1}{d-1}, \quad \text {so that}\quad d\cdot \frac{1}{d-1} = 1 + f_d\!\left( \tfrac{1}{d-1}\right) .\]
As an aside, we note that this is not a coincidence. Indeed, when \(\lambda = \lambda _c(d)\), \(\tilde{x}\) as defined above is well known to be the unique fixed point of \(f_d\), and the potential function \(\varPhi \) was chosen in [28] in part to make sure that at the critical activity, the fixed point is also the maximizer of (an analogue of) \(\varXi (d, \cdot )\).
We now substitute the value of \(\frac{1}{q}\) and \(\tilde{x}\) at \(d = \varDelta _c\) to verify that
as claimed. Substituting these values of d and \(\tilde{x}\), along with the earlier observation that \(f_{\varDelta _c}(\tilde{x}) = \tilde{x} = \frac{1}{\varDelta _c - 1}\), into the definition of \(\nu _\lambda \), we have
which completes the proof.
C.2 Symmetrizability of the message
In this section, we prove Lemma 4. We start with the following technical lemma.
Lemma 10
Let \(r \ge 1\), \(0< A < 1\), \(\gamma (x) :=(1-x)^r\) and \(g(x) :=\gamma (Ax) + \gamma (A/x)\). Note that \(g(x) = g(1/x)\), and g is well defined in the interval [A, 1 / A]. Then all the maxima of the function g in the interval [A, 1 / A] lie in the set \(\left\{ 1/A, 1, A\right\} \).
Before proving the lemma, we observe the following simple consequence. Consider \(0 \le s_1, s_2 \le 1\) such that \(s_1s_2\) is constrained to be some fixed constant \(C < 1\). Then, applying the lemma with \(A = \sqrt{C}\) we see that \(\gamma (s_1)\) + \(\gamma (s_2)\) is maximized either when \(s_1 = s_2\) or when one of them is 1 and the other is C.
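A quick numerical sanity check of the lemma (ours; a grid search, not a proof):

```python
def gamma_fn(x, r):
    """gamma(x) = (1 - x)^r from the statement of Lemma 10.

    The base is clamped at 0 to guard against floating-point grid points
    falling a hair outside [0, 1].
    """
    return max(0.0, 1.0 - x) ** r

def argmax_of_g(A, r, steps=50_000):
    """Locate the maximizer of g(x) = gamma(A*x) + gamma(A/x) on [A, 1/A] by grid search."""
    best_x, best_val = A, float("-inf")
    for k in range(steps + 1):
        x = A + (1 / A - A) * k / steps
        val = gamma_fn(A * x, r) + gamma_fn(A / x, r)
        if val > best_val:
            best_x, best_val = x, val
    return best_x

# The maximizer should (approximately) lie in {A, 1, 1/A} for each (A, r).
for A, r in [(0.2, 3.0), (0.9, 1.5), (0.5, 1.0), (0.7, 4.0)]:
    x_star = argmax_of_g(A, r)
    print(min(abs(x_star - c) for c in (A, 1.0, 1 / A)) < 1e-3)
```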
Proof of Lemma 10
Note that when \(r = 1\), \(g(x) = 2 - A(x + 1/x)\), which is maximized at \(x = 1\). We therefore assume \(r > 1\) in the following.
We consider the derivative \(g'(x) = Ar\left[ (1-A/x)^{r-1}\frac{1}{x^2} - (1-Ax)^{r-1}\right] \). Note that \(g(x) = g(1/x)\) and that \(g'(x)\) and \(g'(1/x)\) have opposite signs, so it is sufficient to study g in the range [1, 1 / A]. We now note that in the interior of the intervals of interest \(g'\) always has the same sign as
\[h(x) :=(x - A) - x^{t}(1 - Ax),\]
where \(t :=\frac{r+1}{r-1} > 1\) for \(r>1\). We therefore only need to study the sign of h in the interval \(I :=[1, 1/A]\). We note that \(h(1) = 0\), and consider the derivatives of h:
\[h'(x) = 1 - tx^{t-1} + (t+1)Ax^{t}, \qquad h''(x) = tx^{t-2}\left[ (t+1)Ax - (t-1)\right] .\]
Note that \(h'(1) = (t+1)\left[ A - \frac{1}{r}\right] \). We now break the analysis into two cases.
Case 1: \(A \ge 1/r\).
In this case, we have \(h''(x) > 0\) for x in the interior of the interval I, and \(h'(1) \ge 0\). This shows that \(h'(x) > 0\) for x in the interior of I, so that h is strictly increasing in this interval. Since \(h(1) = 0\), this shows that h (and hence \(g'\)) are positive in the interior of I. Thus, g is maximized in I at \(x = 1/A\).
Case 2: \(A < 1/r\).
We now have \(h'(1) < 0\) and \(h''(1) < 0\). Further, defining \(x_0 = \frac{1}{Ar}\), we see that \(h''\) is negative in \([1, x_0)\) and positive in \((x_0, 1/A]\) (and 0 at \(x_0\)). Since \(h'(1) < 0\), this shows that \(h'\) is negative in \([1, x_0]\), and hence can have no zeroes there. Further, we see that \(h'\) is strictly increasing in \([x_0, 1/A]\), and hence can have at most one zero \(x_1\) in \([x_0, 1/A]\).
If no such zero exists, then \(h'\) is negative in I. In this case, we see that h (and hence \(g'\)) is negative in the interior of I, and hence g is maximized at \(x = 1\). We now consider the case where there is a zero \(x_1\) of \(h'\) in \([x_0, 1/A]\). By the sign analysis of \(h''\), we know that \(h'\) is negative in \([1, x_1)\) and positive in \((x_1, 1/A]\). We thus see that h is strictly decreasing (and negative) in \((1, x_1)\) and strictly increasing in \((x_1, 1/A]\). It can therefore have at most one zero \(x_2\) in \((x_1, 1/A]\). If no such zero exists, then h (and hence \(g'\)) is negative in the interior of I, and hence g is maximized at \(x = 1\). If such a zero \(x_2\) exists in \((x_1, 1/A]\), then—because h is strictly increasing in \((x_1, 1/A)\) and negative in \((1, x_1]\)—h (and hence \(g'\)) is negative in \((1, x_2)\) and positive in \((x_2, 1/A)\), which shows that g is maximized at either \(x = 1\) or at \(x = 1/A\).
We now prove Lemma 4.
Proof of Lemma 4
We begin by verifying the first condition in the definition of symmetrizability. For each \(1 \le i \le d\), we have
where we use the fact that \(f_d(\varvec{x}) > 0\) for nonnegative \(\varvec{x}\). We now recall the program used in the definition of symmetrizability, with the definitions of \(\varPhi \) and \(f_d\) substituted, and with \(r = a/2\):
Note that eq. (23) implies that \(x_i \le \lambda /B - 1\), so that the feasible set is compact. Thus, if the feasible set is nonempty, there is at least one (finite) optimal solution to the program. Let \(\varvec{y}\) be such a solution. Suppose without loss of generality that the first k coordinates of \(\varvec{y}\) are nonzero while the rest are 0. We claim that \(y_i = y_j \ne 0\) for all \(1 \le i \le j \le k\) and \(y_i = 0\) for \(i > k\).
To show this, we first define another vector \(\varvec{s}\) by setting \(s_i = \frac{1}{1+ x_i}\). Note that \(s_i = s_j\) if and only if \(x_i = x_j\) and \(s_i = 1\) if and only if \(x_i = 0\). Note that the constraint in eq. (23) is equivalent to
Now suppose that there exist \(i\ne j\) such that \(y_iy_j \ne 0\) and \(y_i \ne y_j\). We then have \(s_i \ne s_j\) and \(0< s_i, s_j < 1\). Now, since \(r = a/2 \ge 1\) when \(a \ge 2\), Lemma 10 implies that at least one of the following two operations, performed while keeping the product \(s_is_j\) fixed (so that the constraints in Eqs. (23, 24) are satisfied), will increase the value of the sum \(\gamma (s_i) + \gamma (s_j) = \left( \frac{y_i}{1+y_i}\right) ^r + \left( \frac{y_j}{1+y_j}\right) ^r\):

(1) Making \(s_i = s_j\), or
(2) Making \(y_i = 0\) (so that \(s_i = 1\)).
Thus, if \(\varvec{y}\) does not have all its nonzero entries equal, we can increase the value of the objective function while maintaining all the constraints. This contradicts the fact that \(\varvec{y}\) is a maximum, and completes the proof.
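The exchange argument above admits a simple numerical sanity check. Writing \(s = 1/(1+y)\), the displayed identity gives \(\gamma (s) = (1-s)^r\); the following sketch (our own illustration, not part of the proof, and with all helper names ours) verifies on a grid that, with the product \(s_is_j\) fixed, the maximum of \(\gamma (s_i) + \gamma (s_j)\) is attained either at \(s_i = s_j\) or at a point where one of \(s_i, s_j\) equals 1.

```python
# Numerical sanity check of the exchange argument (not part of the proof).
# With s = 1/(1+y), the displayed identity gives gamma(s) = (1-s)^r.  For a
# fixed product s_i * s_j = P, the claim is that gamma(s_i) + gamma(s_j) is
# maximized either at s_i = s_j = sqrt(P) or when one of s_i, s_j equals 1
# (so that the other equals P).  All helper names below are ours.
import math

def objective(s, p, r):
    """gamma(s) + gamma(p/s) with gamma(t) = (1-t)^r and product fixed at p."""
    # max(..., 0.0) guards against tiny negative arguments from rounding.
    return max(1.0 - s, 0.0) ** r + max(1.0 - p / s, 0.0) ** r

def argmax_on_grid(p, r, n=20001):
    """Grid-search the maximizer of the objective over s in [p, 1]."""
    best_s, best_v = p, objective(p, p, r)
    for k in range(1, n + 1):
        s = p + (1.0 - p) * k / n
        v = objective(s, p, r)
        if v > best_v:
            best_s, best_v = s, v
    return best_s

def near_predicted(s, p, tol=1e-3):
    """Is s close to one of the predicted maximizers {p, sqrt(p), 1}?"""
    return min(abs(s - p), abs(s - math.sqrt(p)), abs(s - 1.0)) < tol

# r = a/2 >= 1, as required in the proof; try several (r, P) pairs.
checks = [near_predicted(argmax_on_grid(p, r), p)
          for r in (1.0, 1.5, 3.0) for p in (0.2, 0.5, 0.8)]
```

Depending on r and the fixed product, either the balanced point or the boundary point wins, but the grid maximizer always lands on one of the predicted candidates.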
D Proofs omitted from Sect. 5
D.1 Symmetrizability of the message
In this section, we prove Lemma 7. As in the case of the hard core model, we begin with an auxiliary technical lemma.
Lemma 11
Let r and a satisfy \(1 < r \le 2\) and \(0< a < 1\) respectively. Consider the functions \(\gamma (x) :=x^r(2-x)^r\) and \(g(x) :=\gamma (a-x) + \gamma (a+x)\). Note that g is even and is well defined in the interval \([-A, A]\), where \(A :=\min (a, 1-a)\). Then all the maxima of the function g in the interval \([-A, A]\) lie in the set \(\left\{ -a, 0, a\right\} \).
The lemma has the following simple consequence. Let \( 0 \le s_1,s_2\le 1\) be such that \((s_1 + s_2)/2\) is constrained to be some fixed constant \(a \le 1\). Then, applying the lemma with \(s_1 = a-x, s_2 = a + x\), we see that \(\gamma (s_1) + \gamma (s_2)\) is maximized either when \(s_1 = s_2 = a\) or when one of them is 0 and the other is 2a (the second case can occur only when \(a \le 1/2\)).
Proof of Lemma 11
Since g is even, we only need to analyze it in the interval [0, A], and show that restricted to this interval, its maxima lie in \(\left\{ 0, a\right\} \).
We begin with an analysis of the third derivative of \(\gamma \), which is given by
$$\begin{aligned} \gamma '''(x) = -2r(r-1)\,(1-x)\,\bigl (x(2-x)\bigr )^{r-2}\left[ (4r-2) + \frac{8-4r}{x(2-x)}\right] . \end{aligned}$$
(25)
Our first claim is that \(\gamma '''\) is strictly increasing in the interval [0, 1] when \(1 < r \le 2\). In the case when \(r = 2\), the last two factors in eq. (25) simplify to constants, so that \(\gamma '''(x) = -12r(r-1)(1-x)\), which is clearly strictly increasing. When \(1< r < 2\), the easiest way to prove the claim is to notice that each of the factors in the product on the right hand side of eq. (25) is a strictly increasing nonnegative function of \(y = 1 - x\) when \(x \in [0, 1]\) (the fact that the second and third factors are strictly increasing and nonnegative requires the condition that \(r < 2 \)). Thus, because of the negative sign, \(\gamma '''\) itself is a strictly decreasing function of y, and hence a strictly increasing function of x in that interval.
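The monotonicity claim for \(\gamma '''\) can also be checked numerically. The sketch below (our own illustration, not part of the proof; the finite-difference helper is our device) verifies the strict increase of \(\gamma '''\) along a grid in (0, 1) for several \(r \in (1, 2]\), as well as the closed form \(-12r(r-1)(1-x) = -24(1-x)\) at \(r = 2\).

```python
# Numerical sanity check (not part of the proof): gamma(x) = x^r (2-x)^r
# should have a strictly increasing third derivative on (0, 1) when
# 1 < r <= 2; at r = 2 the third derivative is -12 r (r-1) (1-x) = -24(1-x).
# The finite-difference approximation below is our own device.

def gamma(x, r):
    return (x * (2.0 - x)) ** r

def third_derivative(x, r, h=1e-3):
    """Central finite-difference approximation to gamma'''(x)."""
    return (gamma(x + 2 * h, r) - 2 * gamma(x + h, r)
            + 2 * gamma(x - h, r) - gamma(x - 2 * h, r)) / (2 * h ** 3)

# Strict increase of gamma''' along a grid in (0, 1) for several r in (1, 2].
grid = [0.05 + 0.01 * k for k in range(91)]
increasing = all(
    third_derivative(x1, r) < third_derivative(x2, r)
    for r in (1.2, 1.5, 1.8, 2.0)
    for x1, x2 in zip(grid, grid[1:])
)

# At r = 2, the approximation should match the closed form -24 (1 - x).
exact_match = all(abs(third_derivative(x, 2.0) + 24.0 * (1.0 - x)) < 1e-4
                  for x in (0.1, 0.5, 0.9))
```

At \(r = 2\) the function is a quartic polynomial, so the five-point stencil reproduces the third derivative up to rounding error.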
We can now analyze the behavior of g in the interval [0, A]. We first show that when \(a > 1/2\), so that \(A = 1 - a \ne a\), g does not have a maximum at \(x = A\) when restricted to [0, A]. We will achieve this by showing that when \(1> a > 1/2\), \(g'(1-a) < 0\). To see this, we first compute \(\gamma '(x) = 2rx^{r-1}(2-x)^{r-1}(1-x)\), and then observe that since \(g'(x) = \gamma '(a+x) - \gamma '(a-x)\), we have \(g'(1-a) = \gamma '(1) - \gamma '(2a-1) = -\gamma '(2a-1) < 0\); here we use the facts that \(\gamma '(1) = 0\) and that \(\gamma '(2a-1) > 0\) when \(0< 2a-1 < 1\).
We now start with the observation that \(g'''(x) = \gamma '''(a + x) - \gamma '''(a-x)\), so that because of the strict monotonicity of \(\gamma '''\) in [0, 1] (which contains the interval [0, A]), we have \(g'''(x) > 0\) for \(x \in (0, A]\). We note that this implies that \(g''(x)\) is strictly increasing in the interval [0, A]. We also note that \(g'(0) = 0\). We now consider two cases.
Case 1: \(g''(0) \ge 0\).

Using the fact that \(g''(x)\) is strictly increasing in the interval [0, A] we see that \(g''(x)\) is also positive in the interval (0, A] in this case. This, along with the fact that \(g'(0) = 0\), implies that \(g'(x) > 0\) for \(x \in (0, A]\), so that g is strictly increasing in [0, A] and hence is maximized only at \(x = A\). As proved above, this implies that the maximum of g must be attained at \(x = a\) (in other words, the case \(g''(0) \ge 0\) cannot arise when \(a > 1/2\) so that \(A = 1 a \ne a\)).
Case 2: \(g''(0) < 0\).

Again, using the fact that \(g''(x)\) is strictly increasing in [0, A], we see that there is at most one zero c of \(g''\) in [0, A]. If no such zero exists, then \(g''\) is negative in [0, A], so that \(g'\) is strictly decreasing in [0, A]. Since \(g'(0) = 0\), this implies that \(g'\) is also negative in (0, A) so that the unique maximum of g in [0, A] is attained at \(x = 0\).
Now suppose that \(g''\) has a zero c in (0, A]. As before, we can conclude that \(g'\) is strictly negative in (0, c], and strictly increasing in [c, A]. Thus, if \(g'(A) < 0\), \(g'\) must be negative in all of (0, A], so that g is again maximized at \(x = 0\), as in the preceding subcase. The only remaining case is when there exists a number \(c_1 \in (c, A]\) such that \(g'\) is negative in \((0, c_1)\) and positive in \((c_1, A]\). In this case, we note that \(g'(A) \ge 0\), so that, as observed above, we cannot have \(A \ne a\). Further, the maximum of g in this case is at \(x = 0\) if \(g(0) > g(A)\), and at \(x = A\) otherwise. Since we already argued that A must be equal to a in this case, this shows that the maxima of g in [0, A] again lie in the set \(\left\{ 0, a\right\} \).
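Lemma 11 itself can be sanity-checked numerically. The sketch below (our own illustration, not part of the proof; all helper names are ours) grid-searches the maximizer of g over \([-A, A]\) for several values of r and a and confirms that it always lands near one of the candidates \(\{-a, 0, a\}\).

```python
# Numerical sanity check of Lemma 11 (not part of the proof): for
# gamma(x) = x^r (2-x)^r and g(x) = gamma(a-x) + gamma(a+x), the maxima of g
# over [-A, A] with A = min(a, 1-a) should lie in {-a, 0, a}.  Names are ours.

def gamma(x, r):
    # The clamp guards against tiny negative arguments from float rounding.
    return max(x * (2.0 - x), 0.0) ** r

def g(x, a, r):
    return gamma(a - x, r) + gamma(a + x, r)

def argmax_g(a, r, n=40001):
    """Grid-search the maximizer of g over [-A, A]."""
    A = min(a, 1.0 - a)
    best_x, best_v = -A, g(-A, a, r)
    for k in range(1, n + 1):
        x = -A + 2.0 * A * k / n
        v = g(x, a, r)
        if v > best_v:
            best_x, best_v = x, v
    return best_x

def near_candidates(x, a, tol=1e-3):
    """Is x close to one of the predicted maximizers {-a, 0, a}?"""
    return min(abs(x + a), abs(x), abs(x - a)) < tol

checks = [near_candidates(argmax_g(a, r), a)
          for r in (1.2, 1.5, 2.0) for a in (0.25, 0.5, 0.75)]
```

For \(a > 1/2\) only the candidate \(x = 0\) lies in \([-A, A]\), matching the endpoint exclusion established at the start of the proof.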
We now prove Lemma 7.
Proof of Lemma 7
We begin by verifying the first condition in the definition of symmetrizability:
We now recall the program used in the definition of symmetrizability with respect to exponent r, with the definitions of \(\varPhi \) and \(f_{d,\gamma }\) substituted:
Since we are only interested in the values of \(\varvec{p}\) solving the program, we can simplify the program as follows:
We see that the feasible set is compact. Thus, if it is also nonempty, there is at least one (finite) optimal solution to the program. Let \(\varvec{y}\) be such a solution. Suppose without loss of generality that the first k coordinates of \(\varvec{y}\) are nonzero while the rest are 0. We claim that \(y_i = y_j \ne 0\) for all \(1 \le i \le j \le k\).
For if not, let \(i\ne j\) be such that \(y_iy_j \ne 0\) and \(y_i \ne y_j\). Let \(y_i + y_j = 2a\). The discussion following Lemma 11 implies that at least one of the following two operations, performed while keeping the sum \(y_i + y_j\) fixed and ensuring that \(y_i,y_j \in [0, 1]\) (so that all the constraints in the program are still satisfied), will increase the value of the sum \(\gamma (y_i) + \gamma (y_j) = y_i^r(2-y_i)^r + y_j^r(2-y_j)^r\):

(1) Making \(y_i = y_j\), or
(2) Making \(y_i = 0\) (so that \(y_j = 2a\)). This case is possible only when \(2a \le 1\).
Thus, if \(\varvec{y}\) does not have all its nonzero entries equal, we can increase the value of the objective function while maintaining all the constraints. This contradicts the fact that \(\varvec{y}\) is a maximum, and completes the proof.
Sinclair, A., Srivastava, P., Štefankovič, D. et al. Spatial mixing and the connective constant: optimal bounds. Probab. Theory Relat. Fields 168, 153–197 (2017). https://doi.org/10.1007/s00440-016-0708-2
Mathematics Subject Classification
 82B20
 60J10
 68W25
 68W40