Random-Cluster Dynamics on Random Regular Graphs in Tree Uniqueness

We establish rapid mixing of the random-cluster Glauber dynamics on random Δ-regular graphs for all q ≥ 1 and p < p_u(q, Δ), where the threshold p_u(q, Δ) corresponds to a uniqueness/non-uniqueness phase transition for the random-cluster model on the (infinite) Δ-regular tree. It is expected that this threshold is sharp, and for q > 2 the Glauber dynamics on random Δ-regular graphs undergoes an exponential slowdown at p_u(q, Δ). More precisely, we show that for every q ≥ 1, Δ ≥ 3, and p < p_u(q, Δ), with probability 1 − o(1) over the choice of a random Δ-regular graph on n vertices, the Glauber dynamics for the random-cluster model has Θ(n log n) mixing time. As a corollary, we deduce fast mixing of the Swendsen–Wang dynamics for the Potts model on random Δ-regular graphs for every q ≥ 2, in the tree uniqueness region. Our proof relies on a sharp bound on the “shattering time”, i.e., the number of steps required to break up any configuration into O(log n) sized clusters. This is established by analyzing a delicate and novel iterative scheme to simultaneously reveal the underlying random graph with clusters of the Glauber dynamics configuration on it, at a given time.


Introduction
The random-cluster model is a random graph model, unifying the study of electrical networks, independent bond percolation, and the ferromagnetic Ising/Potts model from statistical physics [21,31]. It is defined on a graph G = (V, E) and parametrized by an edge probability p ∈ (0, 1) and cluster weight q > 0. Each configuration consists of a subset of edges ω ⊆ E (equivalently ω ∈ {0, 1}^E) and is assigned probability
$$\pi_{G,p,q}(\omega) = \frac{1}{Z_{G,p,q}}\, p^{|\omega|} (1-p)^{|E|-|\omega|} q^{c(\omega)}, \tag{1.1}$$
where c(ω) is the number of connected components in (V, ω) and Z_{G,p,q} is a normalizing constant.
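To make (1.1) concrete, the following minimal sketch evaluates the unnormalized weight of a configuration and, for very small graphs only, the exact distribution by brute-force enumeration. The function names (`count_components`, `rc_weight`, `rc_distribution`) and the encoding of vertices as 0, ..., n−1 and edges as pairs are assumptions made here for illustration; they are not taken from the paper.

```python
from itertools import combinations


def count_components(n, open_edges):
    """Number of connected components of ({0..n-1}, open_edges), via union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    components = n
    for u, v in open_edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1
    return components


def rc_weight(n, edges, omega, p, q):
    """Unnormalized weight p^|omega| (1-p)^(|E|-|omega|) q^c(omega) from (1.1)."""
    k = len(omega)
    return p ** k * (1 - p) ** (len(edges) - k) * q ** count_components(n, omega)


def rc_distribution(n, edges, p, q):
    """Exact measure by enumerating all 2^|E| edge subsets (tiny graphs only)."""
    weights = {}
    for k in range(len(edges) + 1):
        for subset in combinations(edges, k):
            weights[frozenset(subset)] = rc_weight(n, edges, subset, p, q)
    z = sum(weights.values())  # the normalizing constant Z_{G,p,q}
    return {omega: w / z for omega, w in weights.items()}
```

For q = 1 the cluster factor is identically 1, so the enumeration should reduce to independent bond percolation with parameter p, which gives a quick sanity check on any small graph.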
Aside from its inherent interest as a model of random networks, the random-cluster model provides an elegant class of Markov Chain Monte Carlo (MCMC) algorithms for sampling from the Ising/Potts model. For integer q ≥ 2, a sample ω from (1.1) can be transformed into one for the q-state ferromagnetic Potts model by independently assigning a random spin from {1, ..., q} to each connected component of (V, ω); see, e.g., [18,31]. Such sampling algorithms, which include the popular Swendsen-Wang algorithm [50], are a widely used alternative to the standard Ising/Potts Markov chains since the former are often efficient at "low temperatures" (large p) where the latter suffer exponential slowdowns (see [8,33]).
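The translation from a random-cluster sample to a Potts configuration described above can be sketched as follows. This is an illustrative implementation under assumed conventions (vertices 0, ..., n−1, open edges as pairs, the hypothetical name `clusters_to_potts`); it takes the random-cluster sample as given and only performs the cluster coloring.

```python
import random


def clusters_to_potts(n, open_edges, q, rng=random):
    """Give every connected component of (V, omega) one uniform spin in {1..q}."""
    adj = [[] for _ in range(n)]
    for u, v in open_edges:
        adj[u].append(v)
        adj[v].append(u)

    spin = [0] * n  # 0 marks "not yet assigned"; real spins are 1..q
    for start in range(n):
        if spin[start]:
            continue
        color = rng.randint(1, q)  # one independent spin per cluster
        spin[start] = color
        stack = [start]
        while stack:  # depth-first search over the cluster of `start`
            u = stack.pop()
            for w in adj[u]:
                if not spin[w]:
                    spin[w] = color
                    stack.append(w)
    return spin
```

The pass is linear in the number of vertices plus open edges, which matches the O(n) translation cost on bounded-degree graphs.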
Our focus here is on the Glauber dynamics of the random-cluster model. Specifically, we consider the following discrete-time Glauber dynamics chain, which we refer to as the FK-dynamics. From a configuration ω_t ⊆ E, one step of the FK-dynamics transitions to a new configuration ω_{t+1} ⊆ E as follows:
1. Choose an edge e_t ∈ E uniformly at random;
2. Set ω_{t+1} = ω_t ∪ {e_t} with probability p̂ := p / (q(1 − p) + p) if e_t is a "cut-edge" in (V, ω_t), and with probability p otherwise;
3. Otherwise set ω_{t+1} = ω_t \ {e_t}.
We say e_t is a cut-edge in (V, ω_t) if changing the state of e_t changes the number of connected components c(ω_t) in (V, ω_t). This chain is, by design, reversible with respect to π_{G,p,q}.
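One step of the FK-dynamics, as defined above, can be sketched in a few lines. This is a naive illustration rather than an efficient implementation: the cut-edge test is a fresh graph search costing O(|E|) per step, and the names `connected_without` and `fk_step`, as well as the encoding of ω as a Python set of edge tuples on a simple graph, are assumptions made here.

```python
import random


def connected_without(n, omega, e):
    """True if the endpoints of e are joined by open edges of omega other than e."""
    u, v = e
    adj = [[] for _ in range(n)]
    for a, b in omega:
        if (a, b) != e:
            adj[a].append(b)
            adj[b].append(a)
    seen = {u}
    stack = [u]
    while stack:
        x = stack.pop()
        if x == v:
            return True
        for y in adj[x]:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return False


def fk_step(n, edges, omega, p, q, rng=random):
    """One FK-dynamics update of the edge set omega (modified in place)."""
    e = rng.choice(edges)  # step 1: uniformly random edge
    # e is a cut-edge iff toggling it changes the number of components,
    # i.e. iff its endpoints are not already joined by other open edges.
    is_cut = not connected_without(n, omega, e)
    p_hat = p / (q * (1 - p) + p)
    if rng.random() < (p_hat if is_cut else p):  # step 2: open e
        omega.add(e)
    else:  # step 3: close e
        omega.discard(e)
    return omega
```

For example, `fk_step(3, [(0, 1), (1, 2), (0, 2)], set(), 0.4, 2.0)` performs one update on a triangle started from the all-free configuration.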
A central question in the study of Markov chains is how the mixing time (defined as the number of steps until the Markov chain is close to stationarity starting from the worst possible initial configuration) grows as the size of the graph G increases. Of particular interest in the context of random-cluster and Ising/Potts dynamics is the relation of mixing times to the rich equilibrium phase transitions of the model.
We consider this question when G is a random Δ-regular graph on n vertices. The study of spin systems and their dynamics on random graphs is quite active [12,14-16,19,20,25,44,45]. Random Δ-regular graphs are a canonical example of graphs having exponential volume growth, with a non-trivial geometry, making them an attractive alternative to lattices or trees. More generally, the study of spin systems on random graphs yields insight into hard instances of the classical computational problems of sampling, counting, learning and testing [2,23,48,49] and features in the study of random constraint satisfaction problems [32,54].
The phase transition of the random-cluster model on random Δ-regular graphs is expected to involve three critical points [10,34,39,41]. Most relevant to us is the critical threshold p_u(q, Δ) corresponding to a uniqueness/non-uniqueness phase transition for the random-cluster model on the infinite Δ-regular wired tree (in which the leaves are externally wired to be in the same connected component). Roughly speaking, the uniqueness/non-uniqueness phase transition captures whether or not the wired boundary has an effect on the configuration near the root of the tree (in the limit as the height of the tree grows). It is believed that the mixing time slows down at p_u(q, Δ), either polynomially or exponentially depending on whether q ≤ 2 or q > 2.
In this paper, we establish optimal mixing for the FK-dynamics on random Δ-regular graphs throughout the uniqueness region p < p_u(q, Δ) for all real q ≥ 1 and all Δ ≥ 3.
Theorem 1. Fix any q ≥ 1, Δ ≥ 3, and p < p_u(q, Δ). Consider the FK-dynamics on a uniformly random Δ-regular graph on n vertices. With probability 1 − o(1) over the choice of the random graph G, the mixing time of the FK-dynamics on G is Θ(n log n).
The FK-dynamics are known to be resistant to sharp analysis with the known techniques for Markov chains for spin systems. This is due, in part, to the fact that the random-cluster model presents highly non-local interactions: an update on an edge e_t depends on the entire configuration ω_t(E \ {e_t}). Indeed, the only other setting where the speed of convergence of FK-dynamics is well-understood via direct analysis is in square subsets of Z^2 [6,8,26-29]. Other bounds to date have been obtained either indirectly, via comparison with global Markov chains using the results of [52,53] (and as a result, these bounds are off by polynomial factors), or by taking either p very small (e.g., under a Dobrushin-type condition) or very large, or q large. This is the state of affairs even on the (geometrically trivial) complete graph [8,30,38].
Our results are tight in the sense that the FK-dynamics is expected to undergo a slowdown at p_u(q, Δ), as we describe next. The equilibrium phase transition of the random-cluster model on random Δ-regular graphs should qualitatively resemble those on the Δ-regular tree and the complete graph. Based on this relation, and understandings of those phase diagrams [10,34,39,41], it is expected to involve three critical points p_u(q, Δ) ≤ p_c(q, Δ) ≤ p_u^*(q, Δ). The tree uniqueness/non-uniqueness phase transition at p_u(q, Δ) manifests on the finite Δ-regular tree in the form of existence/nonexistence of root-to-leaf paths under wired boundary conditions. The threshold p_u^*(q, Δ) corresponds to a (conjectured) second non-uniqueness/uniqueness transition; above this point even the Δ-regular tree under free boundary conditions has root-to-leaf connections (see [25,34,36,39] for more details). The threshold p_c(q, Δ), on the other hand, corresponds to an order-disorder transition captured by the emergence of a "giant component" of linear size on the random graph (which, roughly, imposes "typical" boundary conditions on its treelike balls).
When q ∈ (1, 2] the phase transition is of second-order and these three thresholds coincide; namely p_u(q, Δ) = p_c(q, Δ) = p_u^*(q, Δ). On the other hand, when q > 2, the phase transition on random Δ-regular graphs is conjectured to be of first-order and p_u(q, Δ) < p_c(q, Δ) < p_u^*(q, Δ). Here, the uniqueness threshold p_u(q, Δ) should mark the onset of the metastability phenomenon, which should persist up to p_u^*(q, Δ). Metastability has been linked to an exponential slowdown for both random-cluster and Potts Glauber dynamics on the complete graph [7,13,24,30], and the same slowdown is expected to occur on random Δ-regular graphs. Namely, in the window (p_u(q, Δ), p_u^*(q, Δ)), the ordered and disordered phases should each be "metastable", behaving locally (on treelike balls) like the configurations on wired and free trees, respectively. The coexistence of these metastable phases with exponentially small boundaries facilitates states from which reversible Markov chains cannot easily escape (i.e., these sets have bad conductance). It is thus expected that on random Δ-regular graphs, for every q > 2, the FK-dynamics mixes exponentially slowly throughout (p_u(q, Δ), p_u^*(q, Δ)). For q sufficiently large, such slowdown was established in [25] at p = p_c(q, Δ) ∈ (p_u(q, Δ), p_u^*(q, Δ)).
From Theorem 1 we obtain an efficient MCMC sampling algorithm for both the random-cluster model and the ferromagnetic Ising/Potts model on random Δ-regular graphs in the uniqueness regime.
Corollary 2. Fix any q ≥ 1, Δ ≥ 3, p < p_u(q, Δ) and any accuracy parameter δ ∈ (0, 1). Then, with probability 1 − o(1) over the choice of the random Δ-regular n-vertex graph G, there is a sampling algorithm which, given the graph G, outputs a random-cluster configuration ω whose distribution is within total variation distance δ of π_{G,p,q}. The running time of the algorithm is O(n (log n)^3 log(1/δ)).
The extra O((log n)^2) factor in the running time of the algorithm comes from the (amortized) computational cost of checking whether the chosen edge is a cut-edge in each step of the FK-dynamics. This is equivalent to the fully dynamic connectivity problem which has been thoroughly studied (see, e.g., [37,51]).
For integer q, the algorithm in Corollary 2 combined with the O(n) cost of translating between the random-cluster and Potts configurations mentioned earlier yields a sampling algorithm for the ferromagnetic q-state Potts model on random regular graphs up to the Potts uniqueness threshold (the uniqueness thresholds of both these models coincide). This improves on the best previously known sampling algorithm for both these models in [5], which runs in Õ(n^{6/5}) time and is a "weak sampler" in the sense that it outputs samples that are close in total variation distance to the target distribution but with a fixed accuracy. (See also the recent work of [36] for a poly(n) sampler for all p ∈ (0, 1) but provided q is sufficiently large.)
As another important corollary of Theorem 1, we deduce fast mixing of the standard Swendsen-Wang (SW) algorithm for the ferromagnetic q-state Potts model [50]. This is an extensively used global-update Markov chain. The dynamics starts from a Potts configuration σ_t ∈ {1, ..., q}^V, moves to a "joint" spin/random-cluster configuration (σ_t, ω_t) by including each monochromatic edge independently with probability p, and then assigns to each connected component of (V, ω_t) a spin chosen uniformly at random from {1, ..., q} to obtain a new Potts configuration σ_{t+1} (see [18,50]).
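The Swendsen-Wang update just described can be sketched as a single function. The code below is an illustration under assumed conventions (vertices 0, ..., n−1, spins in {1, ..., q}, the hypothetical name `sw_step`); it is not tuned for performance.

```python
import random


def sw_step(n, edges, sigma, p, q, rng=random):
    """One Swendsen-Wang update: sigma_t -> (sigma_t, omega_t) -> sigma_{t+1}."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Percolation step: keep each monochromatic edge independently with prob. p.
    # Kept edges are merged into the union-find structure directly.
    for u, v in edges:
        if sigma[u] == sigma[v] and rng.random() < p:
            parent[find(u)] = find(v)

    # Recoloring step: one fresh uniform spin per connected component of omega_t.
    component_spin = {}
    new_sigma = []
    for v in range(n):
        root = find(v)
        if root not in component_spin:
            component_spin[root] = rng.randint(1, q)
        new_sigma.append(component_spin[root])
    return new_sigma
```

Unlike a single-edge FK-dynamics update, each call touches every edge once, so one step costs time roughly linear in |E|.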

Corollary 1.
Fix any integer q ≥ 2 and Δ ≥ 3, and let p < p_u(q, Δ). Consider the Swendsen-Wang dynamics on a uniformly random Δ-regular graph on n vertices. With probability 1 − o(1) over the choice of the random graph G, the mixing time of the Swendsen-Wang dynamics on G is O(n^2 log n).
Corollary 1 follows immediately from Theorem 1 and the comparison results of Ullrich [52,53]. Previously, our understanding of the speed of convergence of the SW dynamics on random Δ-regular graphs was very limited. For the special case of q = 2, which corresponds to the Ising model, it was established in [4] that the spectral gap of the SW dynamics is Ω(1) for all p < p_u(2, Δ); this implies an O(n) mixing time bound. In addition, Guo and Jerrum [33] established an O(n^{10}) mixing time bound for the SW dynamics that applies to any graph and any p ∈ (0, 1). The methods in both of these works are specific to the Ising model (q = 2) and do not generalize to other values of q. Beyond the special case of q = 2, no sub-exponential bound was previously known for either the FK-dynamics or the SW dynamics throughout the uniqueness regime p < p_u(q, Δ).

Proof ideas.
We comment briefly on the techniques and main innovations in our analysis next; for more details and an extended proof sketch, we refer the reader to Section 3. The main ingredient in our proof is an O(n log n) bound on the "shattering time" of the FK-dynamics (Theorem 4); this is the number of steps the chain requires to break up any configuration into connected components of size at most O(log n). The bound on the shattering time uses a novel and delicate iterative scheme to simultaneously reveal the underlying random graph and the connected components of the FK-dynamics configuration on it at a given time: see Definition 11 and Figures 2-3. While revealing procedures are a standard tool in the study of both random graphs and of the random-cluster model, their combined analysis is highly non-trivial, as the law of the random-cluster configuration at an edge depends on the global geometry of the graph. To our knowledge, this is the first direct upper bound for the shattering time of the FK-dynamics in any setting. In fact, understanding the shattering time is usually the main obstacle for proving rapid mixing of the FK-dynamics on other graphs: e.g., on the complete graph, the shattering time is not known and only loose mixing time bounds (off by Θ(n^2) factors) can be derived [7].
Once the dynamics has shattered, we use standard methods (i.e., censoring [46]) to reduce the analysis of the FK-dynamics to localized dynamics in balls of radius o(√n) centered at each vertex, but with random boundary conditions induced by the current state outside the ball. In random Δ-regular graphs, these balls are "treelike" and, after shattering, their boundary conditions are "almost free", in that only O(1) vertices in their boundaries are connected through the external configuration. This implies that the FK-dynamics mix quickly and satisfy a log-Sobolev inequality akin to a product measure in each of these balls. The last ingredient in our proof is an exponential decay of correlation property (sometimes called spatial mixing) between the root and boundary of such balls. A delicate point is that since these balls have radius Θ(log n), we need exact control on the rate of this exponential decay to sustain the union bound over the n balls.
Remark 1. We expect our methods for the analysis of the shattering phase to have applications to other locally tree-like graphs, e.g., wired trees and Erdős-Rényi random graphs. In the latter case, however, the possibility of having a small number of vertices of large degree poses technical obstructions to direct extension of our methods. Whereas this should not affect the equilibrium phase diagram of the model, interestingly, in the case of the Glauber dynamics for the Ising model on an Erdős-Rényi random graph, the high maximum degree is known to slow down the high-temperature mixing time to n^{1+Ω(1/log log n)} [44].
1.2. Organization of paper. The rest of the paper is organized as follows. In Section 2, we provide a number of preliminary definitions and notations we will use. In Section 3, we give a detailed proof overview highlighting some of the key novelties in our arguments. Our revealing procedures to bound the shattering time are the focus of Section 4. In Section 5 we establish the sharp rate of spatial mixing on treelike graphs with sparse boundary conditions. We combine these to conclude the proof of the upper bound of Theorem 1 in Section 6. We prove the matching lower bound on the mixing time in Section 7.

Preliminaries
In this section, we collect some standard definitions and properties that are necessary to present our proofs, and to which the reader can refer throughout. See the standard texts [11], [31], and [40] for more details on random graphs, the random-cluster model, and Markov chain mixing times, respectively.

Random Δ-regular graphs.
We begin by considering the underlying geometry we work on. Fix Δ ≥ 3 and consider the uniform distribution P_rrg over Δ-regular graphs on n vertices. (Let us always assume n is such that Δn is even, so that such a graph exists.) We identify the vertices V(G) with the set {1, ..., n}, and the randomness of P_rrg will be over the edge-subset of {ij = ji : 1 ≤ i, j ≤ n}. Throughout this paper, we set d := Δ − 1 for convenience.
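For experimentation, a uniformly random simple Δ-regular graph can be drawn with the configuration model plus rejection: pair up the Δn half-edges uniformly at random and restart whenever a self-loop or a repeated edge appears. The sketch below is one way to do this; the name `random_regular_graph`, the 0-based vertex labels, and the `max_tries` cap are choices made here for illustration. Conditioned on acceptance the output is uniform over simple Δ-regular graphs, and for fixed Δ the acceptance probability is known to stay bounded away from zero as n grows, so a handful of restarts is typically enough.

```python
import random


def random_regular_graph(n, delta, rng=random, max_tries=10_000):
    """Sample a simple delta-regular graph on {0..n-1} as a sorted edge list."""
    if (n * delta) % 2 or delta >= n:
        raise ValueError("need n * delta even and delta < n")
    for _ in range(max_tries):
        # delta half-edges per vertex; a uniform shuffle followed by pairing
        # consecutive entries yields a uniform perfect matching of half-edges.
        half_edges = [v for v in range(n) for _ in range(delta)]
        rng.shuffle(half_edges)
        edges = set()
        simple = True
        for i in range(0, len(half_edges), 2):
            u, v = half_edges[i], half_edges[i + 1]
            edge = (min(u, v), max(u, v))
            if u == v or edge in edges:  # self-loop or multi-edge: reject
                simple = False
                break
            edges.add(edge)
        if simple:
            return sorted(edges)
    raise RuntimeError("no simple graph found within max_tries")
```

A call such as `random_regular_graph(1000, 3)` returns 1500 edges with every vertex of degree 3.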
2.1.1. Random graphs are treelike
A key ingredient in our proof is the fact that random Δ-regular graphs are locally treelike. While this can be formalized in various ways, we use a notion that is most relevant to this paper, and applies uniformly to all vertices (as opposed to a vertex chosen uniformly at random).
For a graph G = (V(G), E(G)) and a vertex v ∈ V(G), we define the ball of radius R around v as: We include a short proof of Fact 3 after introducing the configuration model in Section 4.1. It is known that when R > (1/2) log_d n, the number of cycles in every ball B_R(v) goes to ∞ with n.

2.2.
The random-cluster model. For a graph G = (V, E), recall the definition of the random-cluster model from (1.1). We say an edge e ∈ E is open or wired if ω(e) = 1 and closed or free if ω(e) = 0. We say two vertices are connected in ω if they are in the same connected component of the sub-graph (V, {e ∈ E : ω(e) = 1}). For a vertex set V' ⊂ V, denote by C_{V'}(ω) the union of connected components (clusters) containing some v ∈ V' in this sub-graph. For a configuration ω and edge set A ⊂ E, we use ω(A) for the restriction of ω to A.

Boundary conditions
To help study the random-cluster measure, we introduce boundary conditions.
Definition 3. A random-cluster boundary condition ξ on G = (V, E) is a partition of V, such that the vertices in each element of the partition are identified with one another. The random-cluster measure with boundary conditions ξ, denoted π^ξ_{G,p,q}, is the same as in (1.1) except that the number of connected components c(ω) = c(ω; ξ) is counted with this vertex identification, i.e., if v, w are in the same element of ξ, they are always counted as being in the same connected component of ω in (1.1). In this manner, the boundary condition can alternatively be seen as ghost "wirings" of the vertices in the same element of ξ.
The free boundary condition, ξ = 0, is the one whose partition consists only of singletons. For a subset ∂V ⊂ V, the wired boundary condition on ∂V, denoted ξ = 1, is the one whose partition has all vertices of ∂V in the same element and all vertices of V \ ∂V as singletons; i.e., ξ = {∂V} ∪ {{v} : v ∈ V \ ∂V}. For boundary conditions ξ, ξ' we say that ξ ≤ ξ' if ξ is a finer partition than ξ'. We have the following important monotonicity in boundary conditions: for any two boundary conditions ξ and ξ' with ξ' ≥ ξ, we have π^{ξ'}_{G,p,q} ⪰ π^ξ_{G,p,q}, where ⪰ denotes stochastic domination.

2.2.2.
Uniqueness/non-uniqueness transition on the Δ-regular tree
As the geometry of the random graph is locally treelike, its dynamical transition point should be inherited from a transition on the Δ-regular tree. Throughout this paper, we denote by T_h the rooted (at ρ) Δ-regular complete tree of depth h (the root has Δ children, and all other vertices have Δ − 1 children and one parent). Since the tree has depth h < ∞, evidently it is not actually Δ-regular: it has leaves ∂T_h, with |∂T_h| = Δ d^{h−1}. The wired boundary condition "1" is the one that wires all vertices of ∂T_h together. For every Δ ≥ 3 and q ≥ 1, the random-cluster measure π^1_{T_h,p,q} undergoes a transition at p_u(q, Δ): when p < p_u(q, Δ) the probability that ρ is connected to ∂T_h in ω goes to 0 as h → ∞, whereas when p > p_u(q, Δ) it stays bounded away from zero [34]. (While in general p_u(q, Δ) does not have a closed form, it can be expressed as the root of an explicit formula: see [5,34].) A key fact (see [34, Theorem 1.5]) we will use is that whenever p < p_u(q, Δ), the quantity p̂ (the probability of a cut-edge being open) satisfies
$$\hat{p} < \frac{1}{d}.$$
2.3. Markov chain mixing times.
Consider a (discrete-time) Markov chain with transition matrix P on a finite state space Ω, reversible with respect to an invariant distribution π; denote the chain initialized from x_0 by (X^{x_0}_t)_{t≥0}. Its mixing time is given by
$$t_{\mathrm{mix}} = \min\Bigl\{ t \ge 0 : \max_{x_0 \in \Omega} \bigl\| \mathbb{P}(X^{x_0}_t \in \cdot) - \pi \bigr\|_{\mathrm{tv}} \le 1/4 \Bigr\},$$
where the total-variation distance between μ and ν is given by
$$\|\mu - \nu\|_{\mathrm{tv}} = \frac{1}{2} \sum_{\omega \in \Omega} |\mu(\omega) - \nu(\omega)| = \inf\bigl\{ \mathbb{P}(X \ne Y) : X \sim \mu,\ Y \sim \nu \bigr\}.$$
Here the infimum runs over all couplings of μ, ν. By this definition, to bound the mixing time, it suffices to bound the coupling time of the dynamics; i.e., if we construct a coupling P of the steps of the chain such that for each x_0, y_0 ∈ Ω, we have P(X^{x_0}_T ≠ X^{y_0}_T) ≤ 1/4, then t_mix ≤ T. It is a standard fact that t_mix(δ) ≤ t_mix log(2δ^{−1}), where t_mix(δ) denotes the analogous quantity with 1/4 replaced by δ. See chapters 4-5 of [40] for more details.
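On a very small state space, the definition of the mixing time can be evaluated directly by powering the transition matrix. The following sketch does exactly that; it assumes the chain is given as a row-stochastic list of lists `P` with stationary distribution `pi`, and the names `total_variation` and `mixing_time` are illustrative. It is meant only to make the definition tangible, since the state space {0, 1}^E of the FK-dynamics is far too large for this approach on graphs of interest.

```python
def total_variation(mu, nu):
    """Total-variation distance between two distributions given as equal-length lists."""
    return 0.5 * sum(abs(a - b) for a, b in zip(mu, nu))


def mixing_time(P, pi, eps=0.25, t_max=10_000):
    """Smallest t with max over starting states x of ||P^t(x, .) - pi||_tv <= eps."""
    m = len(P)
    # rows[x] holds the law of the chain at the current time when started from x.
    rows = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
    for t in range(t_max + 1):
        worst = max(total_variation(rows[x], pi) for x in range(m))
        if worst <= eps:
            return t
        # Advance one step: rows <- rows * P (plain matrix product).
        rows = [
            [sum(rows[x][k] * P[k][y] for k in range(m)) for y in range(m)]
            for x in range(m)
        ]
    raise RuntimeError("chain did not mix within t_max steps")
```

As a check, a two-state chain that jumps to a uniformly random state at every step is exactly stationary after one step, so its mixing time under this definition is 1.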

A coupling for the FK-dynamics
Recall the definition of the FK-dynamics from the introduction. Note that in the presence of boundary conditions ξ, the only change is that in step (2) of the FK-dynamics transitions, the status of e_t being a cut-edge is dictated by whether its presence changes c(ω_t; ξ).
For the FK-dynamics, there is a canonical choice of coupling known as the identity coupling. This is the coupling that couples the evolution of two copies of the FK-dynamics, (X^{x_0}_t) and (X^{y_0}_t), by using the same random edge e_t and the same uniform random number U_{e_t,t} to decide whether to add or remove e_t. When q ≥ 1, the identity coupling is a monotone coupling, in the sense that if X^{x_0}_t ≤ X^{y_0}_t then X^{x_0}_{t+1} ≤ X^{y_0}_{t+1} with probability 1. The identity coupling can also be extended to a simultaneous coupling of all the Markov chains (X^{x_0}_t) indexed by their initial configuration x_0 ∈ {0, 1}^E (i.e., a grand coupling), so that if x_0 ≤ y_0 we have X^{x_0}_t ≤ X^{y_0}_t for all t ≥ 0. As a consequence, the coupling time starting from any pair of configurations is bounded by the coupling time starting from the free x_0 = 0 and wired y_0 = 1 configurations.
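The identity coupling is easy to simulate: every copy of the chain is updated with the same edge and the same uniform number. The sketch below (hypothetical names `joined_without` and `coupled_step`, configurations stored as sets of edge tuples on a simple graph) illustrates why the coupling is monotone for q ≥ 1: then p̂ ≤ p, and an edge whose endpoints are already joined in the smaller configuration is also joined in the larger one, so the larger configuration never uses a smaller threshold.

```python
import random


def joined_without(n, omega, e):
    """True if the endpoints of e are connected using open edges other than e."""
    u, v = e
    adj = [[] for _ in range(n)]
    for a, b in omega:
        if (a, b) != e:
            adj[a].append(b)
            adj[b].append(a)
    seen, stack = {u}, [u]
    while stack:
        x = stack.pop()
        if x == v:
            return True
        for y in adj[x]:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return False


def coupled_step(n, edges, chains, p, q, rng=random):
    """Advance all configurations in `chains` with one shared edge and one shared uniform."""
    e = rng.choice(edges)  # the same e_t for every copy
    u = rng.random()  # the same uniform number for every copy
    p_hat = p / (q * (1 - p) + p)
    for omega in chains:
        # Non-cut edges open with probability p, cut-edges with probability p_hat.
        threshold = p if joined_without(n, omega, e) else p_hat
        if u < threshold:
            omega.add(e)
        else:
            omega.discard(e)
```

Starting one copy from the all-free configuration and one from the all-wired configuration and iterating `coupled_step` until the two sets agree gives an empirical coupling time for a small graph.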

Extended Proof Sketch
In this section, we provide a detailed sketch of our proof of Theorem 1, outlining the structure of the argument and highlighting some of the key technical difficulties we encountered. Most of the paper is dedicated to upper bounding the mixing time of the FK-dynamics by O(n log n), so the sequel is dedicated to sketching that proof. The matching lower bound follows from coupling a certain projection of the FK-dynamics to a product chain and is derived in Section 7.

Proof outline.
Let G = (V(G), E(G)) be an n-vertex graph. Let (X^1_t)_{t≥0} and (X^0_t)_{t≥0} be two realizations of the FK-dynamics started from the all-wired and all-free configurations, respectively, and coupled via the identity coupling as defined in Section 2.
Our goal is to show that there exists T = O(n log n) such that for every vertex v ∈ V(G), with probability 1 − o(n^{−1}), the configurations X^1_T and X^0_T agree on the Δ edges incident to v, denoted N_v:
$$\mathbb{P}\bigl(X^1_T(N_v) \ne X^0_T(N_v)\bigr) = o(n^{-1}). \tag{3.1}$$
A union bound over the n vertices would then imply that under the identity coupling P(X^1_T ≠ X^0_T) = o(1). By the monotonicity of the FK-dynamics under the identity coupling, this would show that the mixing time of the FK-dynamics is at most T = O(n log n).
There are two key stages to establishing this coupling, each of which we describe next.
Stage I. In the first stage of the coupling, we show that after an initial burn-in period of T = O(n log n) steps, the configuration X^1_T is shattered. That is, its connected components have constant size in expectation, and every component is of size O(log n) with high probability; more precisely, we show that the sizes of the connected components have exponential tails. Since X^1_T ≥ X^0_T, the same holds for X^0_T.
The intuition behind our proof of shattering after T steps goes as follows. Consider the balls ). Consequently, we can even take the minimum (intersection) of the chains Since all these chains are coupled using the same randomness, we maintain the domination ω_t ≥ X^1_t for all t ≥ 0. We thus focus on showing the shattering property for ω_T. Notice that we can bound the connected component of a vertex v in ω_T via an iterative exploration process. We initialize a set A as the connected component The procedure ends when ∂A is empty and outputs an edge set A necessarily containing the component of v in ω_T. See the depiction in Figure 1. The nature of this exploration process lends itself naturally to comparison with a branching process in which the "children" of u are the vertices connected to u through Z^u_T. (It turns out that the revealing of these configurations can be done in such a way that although they are all coupled, the dependencies between configurations Z^u_T(B_r(u)) are negligible: we comment on this later.) We will show that with high probability over G, the resulting branching process is sub-critical.
To see this, first note that since the mixing time on B_r(v) is O(1) (r is constant), after T = Θ(n log n) steps, enough updates have occurred in each ball B_r(v) so that the chains (Z^v_t) have all mixed with high probability. Hence, up to a small error, we can consider instead the branching process where the number of children of v is given by the number of connections of v to the boundary of B_r(v) in a sample from π^1_{B_r(v)}. Now, most O(1)-sized balls in a random Δ-regular graph are trees. A key characteristic of the uniqueness regime p < p_u(q, Δ) is that in the wired Δ-regular tree T_r, the expected number of leaves connected to v under π^1_{T_r} is less than 1 as long as r is large. As long as the role played by non-tree balls in G is bounded, this would imply the desired sub-criticality of the dominating branching process. We in fact need concentration bounds on the number of explored vertices in this branching process; towards this we show that p̂ is the actual exponential decay rate of root-to-leaf connectivities on Δ-regular (wired) trees.

Lemma 1.
Let T_h denote the rooted Δ-regular complete tree of depth h and let p < p_u(q, Δ). Let (1, ) be the wired boundary condition on ∂T_h that additionally wires the root of T_h to ∂T_h. There exists a constant C = C(p, q, Δ) such that for every h and every leaf u ∈ ∂T_h, the probability under this boundary condition that the root is connected to u by open edges of T_h is at most C p̂^h.
Since there are O(d^r) leaves in T_r, the lemma implies that the expected number of connections from v to the boundary is O((p̂d)^r), which is less than one for r large (as p̂d < 1 when p < p_u(q, Δ)). The reason we establish this decay for the boundary condition (1, ), instead of simply the wired one, is to eliminate the potential dependencies between the chains (Z^u_t) through their roots.
To conclude our sketch of the ideas in Stage I, we mention two fundamental challenges to implementing the above approach. First, since all the chains (Z^v_t) are coupled via the identity coupling, revealing their configurations while maintaining some independence is delicate (see Lemma 5). We perform this revealing by additionally wiring the root to the boundary as hinted by the (1, ), and for each u, only revealing the new randomness needed to run the resulting chain on B_R(u) up to time T. Roughly speaking, the wired boundary conditions allow us to evolve the un-revealed configuration in B_r(v) in isolation.
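The sub-criticality claim amounts to a first-moment computation, sketched below. The sketch assumes that the lemma bounds the root-to-leaf connection probability by C p̂^r on a tree of depth r, as the surrounding text suggests; the symbol N_r (the number of leaves connected to the root), the generic name ξ for the lemma's boundary condition, and the notation ρ ↔ u are labels introduced here for illustration.

```latex
\[
\mathbb{E}[N_r]
  = \sum_{u \in \partial T_r} \pi^{\xi}_{T_r,p,q}(\rho \leftrightarrow u)
  \;\le\; |\partial T_r| \cdot C \hat{p}^{\,r}
  = \Delta d^{\,r-1} \cdot C \hat{p}^{\,r}
  = \frac{C\Delta}{d}\,(\hat{p} d)^{r}.
\]
% Since \hat{p} d < 1 in the uniqueness regime, the right-hand side is
% strictly below 1 as soon as
\[
r \;>\; \frac{\log (C\Delta/d)}{\log\bigl(1/(\hat{p} d)\bigr)} .
\]
```

In particular, a constant radius r depending only on (p, q, Δ) suffices, which is consistent with the balls B_r(v) having O(1) size.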
Secondly, not every ball B_r(v) in G will be a tree, and there are strong correlations between the short cycles of the underlying graph and the places where the random-cluster configuration is more wired. A key contribution of our work is to construct a simultaneous revealing procedure for the random graph G with the overlaid random-cluster configuration of ω_T in a manner that handles these dependencies and can be approximated by the above sub-critical branching process; see Definition 11. Putting all these ideas together, we establish the following exponential tail bound (shattering estimate) on cluster sizes of X^1_T after a burn-in period of O(n log n) time.
Theorem 4. Let p < p_u(q, Δ) and suppose that G is sampled from P_rrg, the uniform distribution over Δ-regular graphs on n vertices. Then, for every v ∈ V(G), k ≥ 1 and T ≥ Cn log n, where C > 0 is a sufficiently large constant, with probability )
By a union bound, Theorem 4 implies that all components of X^1_T are of size at most O(log n) with high probability. Theorem 4 is proved in §4.
Using the above arguments, we can further show that for each $v \in V(G)$ the boundary condition induced on the ball $B_R(v)$ of radius $R = (\tfrac12 - \delta) \log_d n$ by the configuration of $X^1_T$ on the edges outside of $B_R(v)$ is typically $K$-sparse, i.e., the boundary condition induces only $K = O(1)$ many connections on $\partial B_R(v)$. Theorem 5 establishes that this property holds for all $v \in V(G)$ simultaneously with high probability.
Stage II. After the initial $T = O(n \log n)$ steps of the burn-in phase, the configurations $X^1_T$ and $X^0_T$ shatter and induce sparse boundary conditions (with up to $O(1)$ vertices wired through the boundary) on every ball $B_R(v)$ of radius $R = (\tfrac12 - \delta) \log_d n$ with high probability. It remains to show that the copies of the FK-dynamics will couple on $N_v$ with probability $1 - o(n^{-1})$ in an additional $O(n \log n)$ steps.

Starting at time $T$, we consider localized copies of the FK-dynamics in each ball $B_R(v)$. This is done by ignoring (or censoring) the moves of the dynamics outside of $B_R(v)$, which has the effect of "freezing" the two distinct boundary conditions induced by $X^1_T$ and $X^0_T$. With the sparse boundary conditions frozen on $\partial B_R(v)$, the two coupled chains continue to run inside $B_R(v)$, and we can more easily analyze their configurations near $v$. The censoring technology of [46] implies that if these censored chains are coupled on $N_v$, then so are the original chains.
In Lemma 11, we show that if X 1 In fact, we can establish a tight bound on the log-Sobolev constant of the FK-dynamics This slightly stronger fact turns out to be crucial for deducing the tight O(n log n) bound on the mixing time of the FK-dynamics on G, i.e., without an additional polylog(n) factor.
With this optimal bound on the local mixing on treelike balls, we know that the localized chains have all mixed after $O(n \log n)$ steps of the FK-dynamics. Therefore, the probability that two instances of the FK-dynamics on $B_R(v)$ with distinct sparse boundary conditions $\xi$ and $\xi'$ are not coupled on $N_v$ is given by the total variation distance between π
(Recall that we say a boundary condition is $K$-sparse when there are only $K$ boundary wirings.) We stress the importance of obtaining the sharp $\hat p^{2}$ decay rate here for the spatial mixing to support a union bound over $n$ vertices. Since $\hat p < 1/d$ and $R = (\tfrac12 - \delta) \log_d n$, we have $\hat p^{2R} = o(n^{-1})$, but any weaker bound on the decay rate would force us to choose a larger $R$, which would cross the threshold at which balls of $G$ are no longer $(L, R)$-Treelike for $L = O(1)$, and we would lose control over the mixing time on $B_R(v)$.
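The arithmetic behind this choice of $R$ can be spelled out as follows. This is only a sketch: the shorthand $a = \log_d(1/\hat p)$ is introduced here for illustration, and the sketch assumes that $\delta$ is taken small enough depending on $\hat p$ and $d$.

```latex
\[
  d^{R} = n^{\frac12 - \delta}
  \quad\Longrightarrow\quad
  \hat p^{\,2R} = d^{-2R\,a} = n^{-(1-2\delta)\,a},
  \qquad a := \log_d(1/\hat p) > 1 \ \text{ since } \hat p < 1/d .
\]
\[
  (1-2\delta)\,a > 1 \iff \delta < \frac{a-1}{2a},
  \qquad\text{whereas}\qquad
  \hat p^{\,R} = n^{-(\frac12-\delta)\,a}
  \ \text{ is } o(n^{-1}) \text{ only if } a > \frac{2}{1-2\delta}.
\]
```

So for $\delta$ below $(a-1)/(2a)$ the rate $\hat p^{2R}$ beats $n^{-1}$ for every $\hat p < 1/d$, while a rate of $\hat p^{R}$ would do so only when $\hat p$ is roughly below $d^{-2}$.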
The proof of this spatial mixing property is based on the fact that in order for information to travel from the boundary of
We contrast this to the more traditional bound on influence by the existence of a single connection from the center of a ball to its boundary, which in our setting would only yield a bound of $\hat p^{R}$. (Such a bound by a single connectivity event is the one traditionally used on amenable graphs like $\mathbb{Z}^2$ to go from spatial mixing with any positive rate of exponential decay to fast mixing: see [1,8,42].)

The FK-Dynamics Shatters Quickly on Random Graphs
Our first goal in this section is to prove Theorem 4 establishing existence of T burn = O(n log n) such that for t ≥ T burn , the configuration X 1 G,t is shattered. We will then use this to conclude that the boundary conditions X 1 G,t induces on any ball of volume o( Let us now be more precise.

Definition 4.
A random-cluster boundary condition ξ on an edge-subset H ⊂ E(G) is said to be K -Sparse if the number of vertices in non-trivial (non-singleton) boundary components of ξ is at most K .
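To make Definition 4 concrete, here is a minimal Python sketch that checks $K$-sparsity when the boundary condition is given as a partition of the boundary vertices into blocks. Representing the boundary condition as a list of sets, and the function names, are illustrative assumptions rather than notation from the paper.

```python
def wired_vertex_count(partition):
    """Number of vertices lying in non-singleton blocks of the partition."""
    return sum(len(block) for block in partition if len(block) >= 2)


def is_k_sparse(partition, k):
    """True if at most k vertices lie in non-trivial boundary components."""
    return wired_vertex_count(partition) <= k


# Hypothetical boundary condition on six boundary vertices:
# vertices 0 and 3 are wired together, as are 1, 4 and 5; vertex 2 is free.
xi = [{0, 3}, {1, 4, 5}, {2}]
```

In this example five vertices sit in non-trivial components, so `xi` is 5-Sparse but not 4-Sparse.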

Definition 5. A random-cluster configuration
The following key result asserts that the boundary of every ball about a vertex is O(1)-Sparse with high probability after an O(n log n) burn-in time: this is proven in Section 4.5.
Theorem 5. Fix $p < p_u(q, \Delta)$. There exists $C(p, q, \Delta)$ such that for every $t \ge Cn \log n$, the following holds. For every
Remark 2. By monotonicity of the FK-dynamics, for every $G$, we have that $X^1_{G,t} \succeq \pi_G$, from which it follows that both Theorem 5, and the exponential tails of Theorem 4, hold under $\pi_G$, i.e., if one replaces $X^1_{G,T}$ by an equilibrium configuration $\omega \sim \pi_G$.
In Section 4.1, we construct the relevant revealing procedures for FK-dynamics clusters on random graphs, and define the branching process we dominate it by. In Sections 4.2-4.4, we analyze these processes, and in Section 4.5, we complete the proofs of Theorems 4 and 5.

Couplings and revealing schemes for the FK-dynamics on random graphs.
In this section, we summarize the key couplings and revealing schemes for the connected components of X 1 G,t . These are fundamental to the proof of shattering for X 1 G,t in the uniqueness region after an O(n log n) burn-in time.

The configuration model
The configuration model $P_{cm}$ is a distribution over multigraphs on $n$ vertices with a fixed degree sequence, which we take to be $\Delta$ for every vertex, defined as follows [9]. Give every vertex $v \in \{1, ..., n\}$ a set of $\Delta$ half-edges and select a matching on the $\Delta n$ many half-edges uniformly at random to form the $\Delta n/2$ edges of the graph. Let $M_n$ be the set of possible edges (the set of pairs of half-edges).
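The sampling step can be sketched in Python as follows. This is a minimal illustration, not the paper's construction: the encoding of half-edges as pairs `(v, i)` and the helper names are assumptions made here.

```python
import random
from collections import Counter


def sample_configuration_model(n, delta, rng):
    """Sample a multigraph from the configuration model with all degrees delta.

    Half-edge i of vertex v is encoded as the pair (v, i). Pairing up
    consecutive entries of a uniformly shuffled list of half-edges yields
    a uniformly random perfect matching of the half-edges.
    """
    if (n * delta) % 2 != 0:
        raise ValueError("n * delta must be even")
    half_edges = [(v, i) for v in range(n) for i in range(delta)]
    rng.shuffle(half_edges)
    return [(half_edges[j][0], half_edges[j + 1][0])
            for j in range(0, len(half_edges), 2)]


def is_simple(edges):
    """True if the multigraph has no self-loops and no repeated edges."""
    if any(u == v for u, v in edges):
        return False
    counts = Counter(frozenset(e) for e in edges)
    return all(c == 1 for c in counts.values())
```

Resampling until `is_simple` returns `True` is one way to realize the conditioning on simplicity; it is practical when the probability of a simple outcome is bounded away from zero.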
The configuration model is a useful tool for studying the random $\Delta$-regular graph, as the distribution $P_{rrg}$ is equal to the distribution $P_{cm}(\cdot \mid G \in \Gamma_{rrg})$ where $\Gamma_{rrg}$ is the event that the graph $G$ is simple (i.e., has no self-loops or multi-edges). In particular, it is standard (see e.g., [9]) that $P_{cm}(\Gamma_{rrg}) > c$ for some $c(\Delta) > 0$, and therefore for any event $\Gamma$,
\[ P_{rrg}(\Gamma) \le c^{-1} P_{cm}(\Gamma). \tag{4.2} \]
Refer to the book [22] for more on the configuration model. We will use (4.2), with an iterative revealing scheme of a matching of the $\Delta n$ half-edges, to analyze the random $\Delta$-regular graph.
The configuration model lends itself to revealing procedures. Towards introducing the joint revealing procedure for the random graph $G \sim P_{cm}$ and the configuration $X^1_{G,t}$, let us first recall a standard revealing procedure for random $\Delta$-regular graphs according to $P_{cm}$ on its own. This procedure is useful for proving random graph estimates for the configuration model and the random $\Delta$-regular graph. It also serves as a building block for the revealing procedure of the random graph together with the FK-dynamics configuration.
The following iterative algorithm is a way to sample from the configuration model for a given degree sequence. The fact that this gives a valid sample from $P_{cm}$ is straightforward after naturally identifying samples from $P_{cm}$ with samples from the uniform distribution over matchings on the $\Delta n$ half-edges.
Definition 6. (1) Initialize the set of exposed edges as $A_0 = \emptyset$.
(2) For each $m \ge 1$, while un-matched half-edges remain, match the half-edge $f(A_{m-1})$ to a half-edge selected uniformly at random from the remaining un-matched ones to form the edge $e_m$. Let $A_m = A_{m-1} \cup \{e_m\}$.
Observe, importantly, that the choice of next half-edge to match (given by the function $f$) can be adaptive, specifically, adapted to the filtration generated by $(A_0, ..., A_{m-1})$.
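The adaptive scheme can be sketched in Python as below, with a breadth-first choice rule standing in for $f$. The function names, the encoding of half-edges as pairs `(v, i)`, and the fallback rule once the component of the root is exhausted are all illustrative assumptions.

```python
import random


def reveal_configuration_model(n, delta, choose, rng):
    """Adaptively expose a uniform matching of the delta * n half-edges.

    `choose(unmatched, exposed)` plays the role of f: it returns the next
    half-edge to match and may depend on everything exposed so far.
    """
    if (n * delta) % 2 != 0:
        raise ValueError("n * delta must be even")
    unmatched = {(v, i) for v in range(n) for i in range(delta)}
    exposed = []                                 # the sets A_m, built up in order
    while unmatched:
        h = choose(unmatched, exposed)
        unmatched.remove(h)
        partner = rng.choice(sorted(unmatched))  # uniform over the remaining ones
        unmatched.remove(partner)
        exposed.append((h, partner))
    return exposed


def bfs_chooser(root, delta):
    """Choice rule that exhausts vertices in order of discovery from `root`."""
    def choose(unmatched, exposed):
        order, seen = [root], {root}
        for (u, _), (w, _) in exposed:           # discovery order so far
            for x in (u, w):
                if x not in seen:
                    seen.add(x)
                    order.append(x)
        for v in order:
            for i in range(delta):
                if (v, i) in unmatched:
                    return (v, i)
        return min(unmatched)                    # component of root is exhausted
    return choose
```

Because the partner is always uniform among the un-matched half-edges, the output has the same law whichever choice rule is used; only the order of exposure changes.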
Definition 6 provides an adaptive sampling method from the configuration model distribution P cm (see e.g., [43]) and can be used to prove myriad properties of random Δ-regular graphs. In particular, it yields a simple proof of Fact 3 that G ∼ P rrg is (L , R)-Treelike for R ≤ n

A coupling of localized FK-dynamics chains
Our goal is to simultaneously expose edges of G ∼ P cm while revealing the FK-dynamics configuration X 1 G,t at time t on G. We show that under their joint distribution the size of the connected components of X 1 G,t have exponential tails; this in turn implies that the boundary condition on Note that a ball of radius O(log n) about a vertex v may have many cycles-indeed it may encompass the entire graph G-but a typical FK cluster of size O(log n) does not use most of these cycles. Thus, we expose the edges of P cm guided by the revealing of the random-cluster component of a vertex v in X 1 G,t ; in this way, to expose the C v (X 1 G,t ) we will not have to reveal much of the random graph.
There are two key difficulties to consider when constructing a joint revealing process for (G, X 1 G,t ): 1. Under either of X 1 G,t or e.g., the random-cluster measure π G , the value ω(e) on an edge e shown to belong to E(G), affects the distribution of the remainder of the underlying random graph. 2. Unlike the random-cluster measure π G , the law of X 1 G,t does not satisfy any domain Markov property. Indeed, the distribution of X 1 G,t (e) conditionally on some X 1 G,t (A) is quite difficult to analyze.
The key to overcoming these obstructions will be to reveal the configurations of a family of FK-dynamics chains that are localized (in the sense that their distribution only depends on a small O(1) sized subset of edges of the graph) and whose concatenation stochastically dominates the distribution of X 1 G,t . Let us be more precise next and explicitly construct a coupling of a family of localized FK-dynamics chains.

Definition 7. For a graph G and edge subset
A be the random-cluster measure on A with wired boundary conditions on ∂ A. Let (X 1 A,t ) t≥0 be the FK-dynamics chain that starts from all wired on E(G), censors (ignores) all updates in E(G) \ A, and makes FK-dynamics updates w.r.t. π 1 A when it updates edges in A. Importantly, the wiring on ∂ A ensures that the law of X 1 , the all wired configuration on A.
, if e t ∈ A, we resample e t given the remainder of the configuration on A, together with the wired boundary condition on ∂ A, using the same uniform random variable U e t ,t for every X 1 A,t such that e t ∈ A.
As in the grand coupling for different initializations, this is a monotone coupling. In particular, we have X 1 A key observation for our revealing process is that for every A, the configuration X 1 A,t depends only on: (1) the number of updates amongst (e s ) s≤t that belong to A, which we denote by κ A,t ; (2) the choice of edges to be updated on A on those κ A,t updates; we denote such set by O A,κ A,t ; and (3) the family of uniform random variables on those edges, (U e,s ) e∈A,s≤t .
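The mechanism behind such monotone couplings can be illustrated with a short Python sketch. It is a simplification: it couples two copies of the heat-bath FK-dynamics on one fixed graph, started from all wired and all free, rather than the family of localized chains with wired boundary conditions, and all function names are assumptions made here. Both copies use the same edge choice and the same uniform variable at every step.

```python
import random


def endpoints_connected(n, edges, omega, skip):
    """Are the endpoints of edges[skip] joined by open edges other than it?"""
    u, v = edges[skip]
    adj = {x: [] for x in range(n)}
    for j, (a, b) in enumerate(edges):
        if j != skip and omega[j] == 1:
            adj[a].append(b)
            adj[b].append(a)
    stack, seen = [u], {u}
    while stack:
        x = stack.pop()
        if x == v:
            return True
        for y in adj[x]:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return False


def heat_bath_update(n, edges, omega, e, u_val, p, q):
    """Resample edge e given the rest of omega, using the uniform u_val."""
    if endpoints_connected(n, edges, omega, e):
        prob_open = p
    else:
        prob_open = p / (p + q * (1 - p))
    omega[e] = 1 if u_val <= prob_open else 0


def run_coupled(n, edges, p, q, steps, rng):
    """Run the all-wired and all-free chains with shared randomness."""
    top = [1] * len(edges)
    bottom = [0] * len(edges)
    ordered = True
    for _ in range(steps):
        e = rng.randrange(len(edges))    # shared edge choice
        u_val = rng.random()             # shared uniform variable
        heat_bath_update(n, edges, top, e, u_val, p, q)
        heat_bath_update(n, edges, bottom, e, u_val, p, q)
        ordered = ordered and all(a >= b for a, b in zip(top, bottom))
    return top, bottom, ordered
```

For $q \ge 1$ the opening probability is non-decreasing in the rest of the configuration, which is why sharing the uniform variable preserves the ordering between the two copies.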
With this observation in hand, we can extend this to a coupling of (X 1 A,t ) averaged over G ∼ P cm .
where ω t is a random-cluster configuration on G that results by first drawing G ∼ P cm , then drawing ω t ∼ P(X 1 G,t ∈ ·). Likewise, for every set A ⊂ M n , let P 1 A,t be the distribution over pairs Couple, under the distribution P, the family of distributions (P 1 A,t ) A⊂M n ,t≥1 by selecting the same random graph G ∼ P cm for all of them, then using the coupling of Definition 8 of the family (X 1 In this manner, we have constructed a monotone coupling of the family (G, (X 1 A,t ) t≥1 ) A⊂M n . Note that we use this coupling for sets A which we know have E(G) ∩ A = A, so that the averaging is only over the edges of E(G) \ A, which we earlier noted X 1 A,t is independent of; thus the role of this coupling is only to put the random graphs with their random-cluster configurations on the same probability space. We defer detailed discussion of the properties of the coupling to Section 4.2 (after constructing the revealing procedure in the sequel) but emphasize that by construction, if A ∩ B = ∅, the only dependency of X 1 A,t and X 1 B,t is through the distributions of the binomial random variables κ A,t and κ B,t .

The joint revealing procedure
We now construct a revealing procedure for $G$ and a configuration $\tilde\omega_t$ on $G$ that stochastically dominates $X^1_{G,t}$. Fix $r$ to be chosen as a large constant (depending on $p, q, \Delta$) later.
Definition 10. Given an exposed set of edges A of the random graph G ∼ P cm , we We drop the A from the notation when understood contextually.
Definition 11. Initialize: For an edge-set A 0 ⊂ M n revealed to be part of E(G) and a vertex set V 0 ⊂ V (G), we construct a joint iterative procedure to expose (a set containing) the connected components Through this process we will keep track of the following variables at each step: -A m : the set of edges of the random graph that have been revealed by step m; -V k : the set of vertices in the k-th generation we want to explore out of; -ω m : the random-cluster configuration revealed up to step m; -F m : elements of the filtration with respect to which the configurationω m on A m is measurable.
The process is defined as follows (see Figures 2 and 3 for a depiction of several steps of this process): be the random-cluster configuration revealed when the process terminates. The key observation about the above process is that we can control the cluster of V 0 in X 1 t (E(G)\A 0 ) by the set A m k ∅ ; the size of this set will then be approximately controlled by comparison to a sub-critical branching process in the following subsection.

Observation 6. The connected components of
In particular, the number of vertices in non-trivial (i.e., non-singleton) components of the boundary condition that $X^1_t(E(G) \setminus A_0)$ induces on $A_0$ is at most the number of vertices in non-trivial components of the boundary condition that $\tilde\omega_t(E(G) \setminus A_0)$ induces on $A_0$. The edges in both of these sets of connected components are subsets of the edge-set $A_{m_{k_\emptyset}} \setminus A_0$.
With Observation 6 in hand, we focus on obtaining the exponential tail bound of Theorem 4 for $C_v(\tilde\omega_t)$ (the component of $v$ in $\tilde\omega_t$) and likewise, the sparsity bound of

Constructing a dominating branching process
Towards proving Proposition 2, we construct a (non-Markovian, size-dependent) branching process which we will show stochastically dominates the sequence $(V_k)_{k \ge 0}$ of our joint revealing process. This process $(Z_k)_{k \ge 0}$ will then be shown to be sub-critical and satisfy exponential tail bounds on its total population, implying the same for the cluster of $V_0$ in $\tilde\omega_t$.

Definition 12.
Initialize $Z_0 = |V_0|$, and let $(Z_k)_{k \ge 1}$ be the (size-dependent) branching process, which for each $k$, has progeny $(\chi_{i,k})_{i \le Z_k}$ drawn i.i.d. from the following distribution:
1. With probability $n^{-1/2}$, let $\chi_{i,k} = |V(T_r)| \sum_{\ell \le k} Z_\ell$ and say the progeny number is Bad.
2. Otherwise, sample $\chi_{i,k}$ from the distribution of the number of leaves in the connected component of the root under $\pi^{(1, )}_{T_r}$ (the random-cluster measure on the $\Delta$-regular tree of depth $r$ with a wired boundary condition and with the root also wired to $\partial T_r$).
Fig. 3. Left: Proceeding from above, in the next generation, starting from $v_2 \in V_1$, reveal the edges of $B^{out}_r(v_2)$ in $G$; in this case, this is not a tree, but is disjoint from $A_1$, so that $A_2 = B^{out}_r(v_2)$. The configuration $X^1_{A_2,t}$ is generated and concatenated with $\tilde\omega_1$ to form $\tilde\omega_2$. Right: Running the FK-dynamics on $A_3$ with all-wired boundary conditions ensures that $X^1_{A_3,t}$ is nonetheless independent of the configuration we had revealed in $\tilde\omega_2$. The light purple vertices are connected to $V_0$ in $\tilde\omega_3$ and are added to $V_2$ to form the next generation.
Let $Z_{k+1} = \sum_{i \le Z_k} \chi_{i,k}$; that is, the $i$-th member of the $k$-th generation gets $\chi_{i,k}$ many children.
Note that this is not a branching process in the traditional sense, since the progeny distribution is not i.i.d. and depends on the population up to that generation. Nonetheless, we will show good tail bounds on (Z k ) k≥0 by dominating it by sub-critical branching processes between the Bad steps.
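The size-dependent process can be simulated with the short Python sketch below. It makes two labeled assumptions: the probability of a Bad progeny number is passed in as `bad_prob` (it equals $n^{-1/2}$ in Definition 12), and the non-Bad progeny law is supplied by the caller as a hypothetical stand-in `good_progeny`, since sampling the number of leaves in the root cluster on $T_r$ is not implemented here.

```python
import random


def simulate_population(z0, bad_prob, tree_size, good_progeny, generations, rng):
    """Simulate (Z_k): returns the list [Z_0, Z_1, ...] of generation sizes.

    tree_size stands for |V(T_r)|; good_progeny(rng) is a stand-in sampler
    for the non-Bad progeny distribution.
    """
    Z = [z0]
    for k in range(generations):
        if Z[-1] == 0:
            break                            # the process has died out
        total_so_far = sum(Z)                # sum of Z_l over l <= k
        children = 0
        for _ in range(Z[k]):
            if rng.random() < bad_prob:
                children += tree_size * total_so_far   # Bad progeny number
            else:
                children += good_progeny(rng)
        Z.append(children)
    return Z


def binomial_progeny(trials, prob):
    """Hypothetical stand-in progeny law: Binomial(trials, prob)."""
    def sample(rng):
        return sum(1 for _ in range(trials) if rng.random() < prob)
    return sample
```

For instance, `simulate_population(5, 1e-3, 22, binomial_progeny(12, 0.05), 50, random.Random(0))` runs a process whose non-Bad progeny mean is 0.6, so it dies out quickly unless a Bad step occurs.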
To justify the above construction, let us formalize the relation between (Z k ) k and the revealed vertices of the process in Definition 11, (V k ). Intuitively, we want to identify vertices v m ∈ V k with those of generation k in (Z k ); the progeny of v m will then be those vertices added to V k+1 in step (4) of Definition 11. Item (1) from the progeny distribution of Definition 12 corresponds to situations where: is not a tree; or (3) There are an insufficient number of updates on B out Examples of situations (1)-(2) were depicted in Figure 3. The n −1/2 probability assigned to these bad situations by the dominating branching process comes from the fact that Theorem 5 requires us to consider |A 0 | of size n −1/2+δ , and thus any edge has at least probability n − 1 2 −δ of intersecting A 0 . On the complement of situations (1)-(3) above, X 1 A m ,t is mixed, and is comparable to the (1, )-tree of depth r .

Comparing the revealing procedure and branching process
We conclude the section by stating the main two lemmas comparing the revealing procedure to the branching process defined above. Towards stating these, denote by t mix (T r , (1, ) the mixing time of FK-dynamics on T r with (1, ) boundary conditions, and define the burn-in time Recall, the definition of the update numbers (κ A m ,t ) m and define, for every t ≥ T burn , the event Standard tail estimates for binomial random variables will imply that E t holds with high probability. Let m 0 = 0 and for each k ≥ 0, let m k+1 = m k +|V k |, i.e., the total number of exposed vertices on the boundaries of explored balls, and in the same connected component as V 0 before the exploration for the (k + 1)-th generation begins. This will be the quantity which we compare to the population of the branching process of Definition 12. More precisely, on the event E t , by construction of (Z k ), and the choice of T burn , we are able to show the following stochastic domination.

Lemma 2.
There exists C 0 ( p, q, Δ) in the definition of (4.3) such that the following holds for every t ≥ T burn . For every A 0 , V 0 such that |A 0 |, |V 0 | ≤ n 1 2 −δ for δ > 0, every K > 0 fixed, and every ≥ 1, In this manner, we will have reduced the analysis of the set of exposed vertices through the revealing process of (G,ω t ), and thus, the clusters of X 1 G,t , to the analysis of the process (Z k ), which except on some rare Bad increments, is a simple branching process with subcritical progeny distribution dictated by connectivity probabilities in the wired measure π (1, ) T r . We will establish the following tail estimate for (Z k ). Lemma 3. Suppose p < p u (q, Δ) and fix any δ > 0, any M ≥ 1, and any 1 ≤ Z 0 ≤ n 1 2 −δ . There exist r 0 ( p, q, Δ), C( p, q, Δ, M), K 0 ( p, q, Δ, M) such that for every r ≥ r 0 fixed and every 0 < λ ≤ n

Outline of remainder of section
Having sketched the key revealing procedures and the way they fit together to provide the desired bounds on the clusters of X 1 G,t , let us prove the various relations and bounds claimed above. In Section 4.2, we prove various key properties of the configuration model revealing process of Definition 6 and the coupling of Definition 8 that will be central to the analysis of the revealing procedure of Definition 11. Then in Section 4.3, we show that the size-dependent branching process (Z k ) of Definition 12 stochastically dominates the FK process (V k ) of Definition 11 on a high-probability event, proving Lemma 2. In Section 4.4, we analyze the process (Z k ) by comparing its population to the sum of O(1) many sub-critical branching processes to deduce Lemma 3. In Section 4.5, we combine these ingredients to conclude Theorem 4 and Proposition 2, and from that Theorem 5. (G,ω t ). In this section, we describe some of the key properties of the coupling constructed in Definition 8, and the revealing procedure constructed for the clusters of V 0 inω t in Definition 11. The following preliminary lemmas describe the law of the random graph edges and overlaying FK configurations through the revealing process.

Properties of the configuration model revealing procedure
We begin with the following lemma on the law of the random graph G conditionally on a set A m which we have revealed to be a subset of E(G). Recall the configuration model's revealing procedure from Definition 6 and say a vertex is discovered if at least one of its halfedges has been matched, and exhausted if all of its half-edges have been matched.
Proof. Fix any edge-set A. We can sample from the conditional distribution P cm (· | A) by defining the adaptive scheme f in Definition 6 so that it first matches the half-edges belonging to A, yielding the set A |A|/2 = A after |A|/2 steps, then setting f to do a breadth-first search (BFS) of B out r (v): this latter part is done by choosing f so that it first exhausts v, then exhausts each of the neighbors of v, and so on.
Revealing the entire set B out r (v) takes at most |E(T r )| many steps beyond |A|/2. If for every m ∈ {|A|/2 + 1, ..., Since on each of these steps, the half-edge f (A m ) is being matched to a u.a.r. unmatched half-edge, uniformly over the at most |E(T r )| steps it takes to reveal B out r (v), the probability that the half-edge it is matched to belongs to A m−1 is at most (The first inequality here uses the fact that in the BFS of B out r (v), there are at most d r vertices of the ball that have been discovered but not exhausted.) Union bounding over the at most |E(T r )| ≤ 2Δd r such attempts yields the desired bound.
We can use a similar reasoning as the proof above to deduce a proof of Fact 3 as follows.
Proof of Fact 3. Fix any v and choose f so that the revealing scheme performs a BFS revealing of B R (v). In order for B R (v) to not be L-Treelike, it must be the case that for more than L different m's in the first |E(T R )| steps, the half-edge f (A m−1 ) is being matched to a half-edge belonging to A m−1 . (If there were at most L such steps, then the removal of the at-most L edges formed by those at-most L matchings in the revealing scheme, evidently leaves a tree, so that B R (v) would be L-Treelike.) Uniformly over A m−1 , the probability of this in the m'th step is at most d R+1 / (Δ(n − m)). Summing over the at most |E(T R )| many such attempts while revealing B R (v), we find that for every ≥ 1, Recall that the standard Chernoff bound applied to a Poisson binomial distribution with mean μ = N p says that for every s ≥ μ, With the choice R = ( 1 2 − δ) log d n, so that d R = n 1 2 −δ and |E(T R )| ≤ 2Δd R , (4.6) implies that the right-hand side of (4.5) is at most (Cn −δ ) for some C(Δ) and large enough n. As a consequence, choosing L > 2δ −1 , we would find sup v∈{1,...,n} (4.7) It remains to translate this to a bound under P rrg . This follows by the following standard comparison argument. Let Γ rrg be the event that the graph G ∼ P cm has no self-loops or double edges (i.e., it is a simple graph). Taking Γ = {G : G is not (L , R) − Treelike} in (4.2) and union bounding (4.7) over the n vertices yields the desired bound.

Properties of the coupling of localized Markov chains
The following lemma is the key fact about the construction of the grand coupling of FK dynamics, Definition 8, whereby after revealing some X 1 A,t , we can control the influence that revealing has on X 1 B,t for A ∩ B = ∅. In this manner, through the revealing procedure of Definition 11, which reveals different localized configurations X 1 A m ,t iteratively, as long as t ≥ T burn these are each close to their respective stationary distributions of π 1 A m , so that it is approximately a concatenation of localized FK models on treelike graphs, inducing an exponential decay of connectivities. (1) The configuration X 1 Proof. Let G be any graph having E(G) ∩ A = A. We claim that uniformly over G, items (1)-(2) above hold. Observe first that |E(G)| = Δn/2 necessarily, and therefore uniformly over such G, the number of updates on edges in A by time t in the update sequence (e s ) s≤t is distributed as Bin(t, 2|A|/(Δn)). Evidently, the distribution of O A,κ A,t only depends on κ A,t and not on the times these updates were; in particular, given that e j ∈ A for some j, the law of e j is clearly uniform at random on A. Finally, notice that for every e, the sequence (U e,s ) s≤t is independent of all other sources of randomness, implying the desired item (1).

A,t (A) is measurable with respect to κ A,t (the number of edgeupdates in A), the edges chosen to update O A,κ A,t , and the uniform random variables
Turning to item (2), we fix a κ A,t , O A,κ A,t and family (U e,s ) e∈A,s≤t . We can condition further on the exact times of the updates in A, i.e., (e s ) s≤t ∩ A. Conditionally on that set of updates, the distribution on the remaining updates is evidently t − κ A,t i.i.d. draws from E(G) \ A. It is then clear that κ B,t counts the number of times, amongst these remaining draws, that the update is in B. As in item (1), the induced distribution on O B,κ B,t is then the same as κ B,t i.i.d. draws from the edges of B. Finally, for every e ∈ B, the uniform random variables (U e,s ) s≤t are independent of all other sources of randomness.
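The binomial law of the update count used in this proof is easy to check numerically. The Python sketch below assumes edges are labeled $0, ..., \Delta n/2 - 1$ and that each step of the update sequence picks one of them uniformly at random; the function names are illustrative.

```python
import random


def count_updates_in(A, num_edges, t, rng):
    """Number of the first t uniformly chosen edge updates that land in A."""
    return sum(1 for _ in range(t) if rng.randrange(num_edges) in A)


def expected_updates(a_size, delta, n, t):
    """Mean of Bin(t, 2|A| / (delta * n)), using |E(G)| = delta * n / 2."""
    return t * 2 * a_size / (delta * n)


# Hypothetical sizes: n = 20 vertices, delta = 3, so 30 edges; |A| = 6.
n, delta, t = 20, 3, 2000
A = set(range(6))
kappa = count_updates_in(A, delta * n // 2, t, random.Random(7))
```

Here each update lands in `A` with probability $6/30 = 0.2$, so `kappa` should be close to `expected_updates(6, 3, 20, 2000)`, which is 400.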

Domination by the modified branching process (Z k ).
In this section, we establish the stochastic domination of the sequence (V k ) k≥0 from Definition 11 by the branching process (Z k ) of Definition 12.
Proof of Lemma 2. We prove the desired stochastic domination by induction over . The base case, Z 0 = |V 0 |, is by construction. Now fix ≥ 1 and suppose by way of induction that the following stochastic domination holds: Thus there exists a monotone coupling of the sequence on the left-hand side, such that it is below the sequence (Z j ) j≤ −1 in the natural element-wise ordering on the sequence. Working on that coupling, it suffices for us to then show that on the intersection E m t ∩ {m −1 ≤ n 1/2−δ/2 }, for every m ∈ {m −1 + 1, ..., m }, the distribution of the children of v m is stochastically below the progeny distribution of Definition 12.
Observe, first of all, that for every m ∈ {m −1 + 1, ..., m }, on E m t ∩ {m −1 ≤ n 1/2−δ/2 }, deterministically the number of children of v m is bounded by where the last inequality is by the inductive hypothesis, and the fact that Now, for every set of revealed edges (A l ) l≤m−1 , define the following events on F m−1 consisting of (κ A l ,t ) l≤m−1 , edge-values (O A l ,κ A l ,t ) l≤m−1 , uniform random variables ((U e,s ) e∈A l ,s≤t ) l≤m−1 : We first claim that these two events each happen with probability 1 − (1 + |V (T r )|)n 1/2−δ/2 . As such, by Lemma 4, for every A 0 , V 0 such that |A 0 | ≤ n 1/2−δ , Thus, for n large enough and r = o(log n), the above is at most 1 3 n −1/2 as desired. We next turn to the probability of Γ c upd,m ∩ Γ tree,m . Recall from item (2) of Lemma 5 that conditionally on F m−1 , the distribution of κ A m ,t is 2|A m | Δn .
Since we are on the event E m t and thus E m−1 t , we have that l≤m−1 κ A l ,t ≤ 4m|E(T r )|t/ (Δn), from which we deduce, using m ≤ m ≤ |V (T r )|m −1 ≤ |V (T r )|n 1/2−δ/2 , that the number of trials in the binomial is at least as long as r = o(log n). Since we are on the event Γ tree,m , we have d r ≤ |A m | ≤ |E(T r )| ≤ 2Δd r , and we see from lower tail estimates on binomial random variables that as long as C 0 in (4.3) is sufficiently large (depending on r, Δ). By item (2) of Lemma 5, conditionally on any (A l ) l≤m−1 and F m−1 , and any A m ∈ Γ tree,m and κ A m ,t ∈ Γ upd,m , the conditional distribution of X 1 A m ,t (A m ) is equivalent (up to relabeling of edges) to that of κ A m ,t updates of a heat-bath chain (Y 1 s ) s on a subtreeT r of the complete tree T r with (1, )-wired boundary conditions, initialized from Y 1 0 ≡ 1. Notice that the equivalent sub-treeT r consists of some k ≤ d of the children of the root, together with their complete sub-trees. In particular, the random-cluster model on A m with wired boundary conditions is stochastically below the FK model on the corresponding subset of T r with its (1, ) boundary conditions. In particular, the number of leaves in the FK cluster of the root under π (1, ) T r is stochastically below the same quantity under π (1, ) T r . It therefore suffices for us to show that as long as A m is a tree disjoint from A m−1 and κ A m ,t ≥ d r T burn /(2Δn), we have P Y (1, ) κ Am ,t ∈ · − π (1, ) T r tv ≤ This follows as long as C 0 is sufficiently large (depending on Δ), from the fact that and |E(T r )| ≤ 2Δd r , together with the sub-multiplicativity of total-variation distance.

Sub-criticality and tail bounds for the dominating branching process.
We now analyze the process $(Z_k)$ of Definition 12, and show that it is indeed sub-critical and satisfies good tail bounds on its total population. For ease of notation, let $P_k = \sum_{\ell \le k} Z_\ell$ be the total population after $k$ generations.
Proof of Lemma 3. Since (Z k ) is a size-dependent, non-Markov process, we cannot directly use results on branching processes to control its growth. Instead, to control the population of the process (Z k ), we compare it to a sum of branching processes in the following manner. Consider the stopping generation κ λ for exceeding population K 0 Z 0 + λ, i.e., Our aim is to control the probability that κ λ < ∞. Let Γ M,k be the event that no more than M of the progeny counts ((χ i, ) i≤Z ) ≤k−1 were Bad. By (4.6), we get where μ is the mean of the Binomial, i.e., μ = (K 0 Z 0 +λ)n − 1 2 . As long as n is sufficiently large and Z 0 , λ ≤ n 1 2 −δ for δ > 0, so that μ ≤ 2K 0 n −δ , this implies for some C > 0, Next consider the event that κ < ∞ on the event Γ M,k . On Γ M,k , we dominate the population P k by the following sum of sub-critical branching processes with bounded progeny distributions.
Define (Z (1) k ) k to be the branching process initialized atZ (1) 0 = Z 0 with progeny (χ (1) i,k ), distributed i.i.d. from the distribution of the number of leaves connected to the root, in a sample from π (1, ) T r , i.e., the distribution of (χ i,k ) conditionally on the progeny number not being Bad. LetP k be an independent branching process with the same progeny distribution, initialized fromZ The following stochastic domination is clear by construction if we decompose the process (Z k ) revealed in a breadth-first manner, into its excursions between the at most M times (on the event Γ M,k ) when the progeny number χ i,k was Bad.
Claim. Fix any k ≥ 1. We have the stochastic domination With this domination in hand, notice that in order for κ < ∞ while Γ M,κ holds, there must exist some k ≤ K 0 Z 0 + λ such that Γ M,k holds and P k ≥ K 0 Z 0 + λ. Therefore, by a union bound, Indeed, if no such j existed, as long as K 0 is sufficiently large, we could bound

Now fix any j ≤ M, anyZ
( j) 0 and consider the branching processZ k . This is a branching process with progeny distribution having mean m = A(pd) r for some A( p, q) per Lemma 7. Sincep < d −1 when p < p u (q, Δ), as long as r is greater than some r 0 ( p, q, Δ), for n sufficiently large we have m < 1, andZ ( j) k is sub-critical. Additionally, the progeny distribution ofZ ( j) k is almost surely bounded by |∂T r | ≤ Δd r −1 . As such, using the standard breadth-first exploration of the total population of the branching processZ i,k − 1)), we can bound is sufficiently large, the right-hand in the probability above exceeds the mean mN It follows from this, and the definition of N ( j) λ , that for some C( p, q, Δ, M, K 0 ) large enough, concluding the proof.

Proof of exponential tail on cluster sizes and shattering.
We are now in position to conclude the proof of the exponential tail bound on clusters of X 1 G,t , and use that to deduce that X 1 G,t is (K , R)-Sparse, except with probability o(n −2 ). We begin by using Lemmas 2-3 to prove the following tail bound on the sequence (V k ), which are the roots of the balls revealed through the revealing process of Definition 11.
Proof. Fix K 0 large to be chosen later, and define the following stopping generation Recall E t from (4.4). Since for every ≤ ς , we have from Lemma 2, that (|V |1{E t }) j≤ (Z j ) j≤ , we have that if C 0 in (4.3) is sufficiently large, the probability of {ς < ∞} is bounded by the probability of P ∞ = k≥0 Z k ≥ K 0 Z 0 + λ. By Lemma 2, we obtain Lemma 3 implies the existence of r ( p, q, Δ) such that the first-term above is at most Next, consider P(E c t ). By a union bound and item (1) of Lemma 5, with the trivial observations that m k ∅ ≤ n and |A m | ≤ |E(T r )| ≤ 2Δd r necessarily, we get for every t ≥ T burn , The above entails a deviation of at least 4td r n −1 from its mean; as such, by standard tail estimates for binomials, for every t ≥ T burn , P(E c t ) ≤ n exp(−td r n −1 ), (4.8) which is at most n −δ M for n large, as long as C 0 in (4.3) is sufficiently large (depending on δ M). The desired bound then follows up to a change of the constant C.
Before proceeding to prove Proposition 2, let us translate the tail bound of Lemma 6 on k |V k | to a tail bound on the FK cluster of a single vertex under X 1 G,t and π G . Notice that towards the proofs of Theorem 4 and Proposition 2, it suffices to show these for t ≥ T burn for some fixed choices of C 0 , r in (4.3) depending on p, q, Δ (as t mix (T r , (1, )) is of course independent of n).
Proof of Theorem 4. Fix any v ∈ {1, ..., n}, let A 0 = ∅ and let V 0 = {v} in Definition 11. By Observation 6, for each G ∼ P cm , the cluster of v in the configuration X 1 G,t , denoted By Lemma 6 and the above, we find that for each M, there exists C( p, q, Δ, M) such that Observing that P(X 1 G,t ∈ ·) = E cm [P(X 1 G,t ∈ ·)], we can use Markov's inequality to write We can obtain the same bound for P rrg by (4.2), up to a multiplicative c(Δ) −1 on the right-hand side. Taking M such that δ M > 2K and using the fact that √ a + b ≤ √ a+ √ b for all a, b ≥ 0, we deduce the desired tail bound on |C v (X 1 G,t )| up to the change of constant C to 2C. Using the monotonicity X 1 We now turn to proving that for typical random graphs, the configuration X 1 G,t is (K , R)-Sparse with high probability for all t ≥ T burn . This allows us to localize to treelike balls with sparse boundary conditions. Let us define the following subset of the boundary of a set H , which we will apply with the choice H = B R (v).

Definition 13. For a subgraph H = (V (H ), E(H )) of G and a configuration ω on E(G), let us define V H (ω) as the subset of vertices in V (H ) in non-trivial components in the boundary condition induced on H by ω(E(G) \ E(H )) (a connected component is non-trivial when it has at least two vertices).
We first prove the following proposition, giving a tail bound on the size of $V_{B_R(v)}(X^1_{G,t})$; after proving this proposition, we use it to conclude $(K, R)$-sparsity of $X^1_{G,t}$, i.e., Theorem 5.

Proof of Proposition 2.
Fix v ∈ {1, ..., n} and δ > 0, and let R = ( 1 2 − δ) log d n. We apply the revealing procedure of Definition 11 with the choices A 0 = E(B R (v)) and V 0 = ∂ B R (v). Recall from Observation 6 that the FK-clusters of V 0 induced bỹ ω t (E(G) \ A 0 ) (ω t was extended to be all wired off of A m k ∅ \ A 0 ) are confined to the set A m k ∅ \ A 0 , and the extended configurationω t satisfiesω t ≥ X 1 G,t . Thus, the sets ,t ), are below the number of vertices in V 0 that share a connected component of A m k ∅ \ A 0 with another vertex of V 0 .
Suppose that through the revealing process of Definition 11, for each m, the edges of B out r (v m ) are revealed one at a time per Definition 6. Notice then, that |V B R (v) (A m k ∅ \ A 0 )| is bounded by the number of times through the revealing of A m k ∅ , that a half-edge is matched up to a half-edge belonging to a vertex that has been discovered at that point. Throughout this process, conditionally on an exposed edge-set A (and the edge-update sequence, and uniform random variables given by the filtration up to that step of the revealing, but E(G) \ A is independent of these), the law of the next half-edge to be matched is uniform amongst un-matched half-edges. Thus on any such edge-revealing, uniformly on the history of the revealing, the probability that it matches with a half-edge belonging to a discovered vertex is at most )| . By a union bound, we obtain for Λ a sufficiently large constant (depending on p, q, Δ, r ), for all k ≥ 1, By Lemma 6 and the fact that Λ|V 0 | ≤ n 1 2 − δ 2 for n large, as long as Λ is large enough, the first term is at most n −5 . Using the fact that |V 0 | ≤ n 1 2 −δ , we see that the mean of the binomial is at most n −3δ/2 , so that by (4.6), for every fixed k ≥ 1, for n large enough. Choosing k = K sufficiently large (depending on δ), we can make the right-hand side at most n −4 . We deduce the proposition by using Markov's inequality to write and noticing that the expectation on the right equals the probability bounded in (4.9).

Sharp Rates of Correlation Decay in Trees and Treelike Graphs
In this section we establish the precise exponential decay rate of influence from an $O(1)$-Sparse boundary condition on the root of an $O(1)$-Treelike ball. Recall from Section 3 that getting the right decay rate (as opposed to, e.g., using the decay rate of connectivity from the root to the boundary) is central to pushing our argument through for all $p < p_u$. In particular, the decay rate of influence will be inherited from twice the exponential decay rate of the wired tree.
Recall that the uniqueness point $p_u(q, \Delta)$ is defined by a transition on the wired $\Delta$-regular tree, where the measure $\pi^1_{T_h}$ transitions from giving exponentially small (in $h$) probability to a root-to-leaf connection, to giving this event uniformly positive probability. A recursion for this connectivity probability was calculated in [5, Lemma 33]. A careful examination of this recursion will yield the following identification of the rate of the exponential decay with $\hat p$ of (2.2).

Lemma 7. Let p < p u (q, Δ). There exists C( p, q, Δ) such that for every h and every
In particular, the probability of the root being connected to $\partial T_h$ in $\omega$ is at most $C(\hat p d)^h$.
In Section 5.1, we establish Lemma 7. In Section 5.2, we show that influence in the random-cluster model travels through the existence of two distinct connections; thus on Treelike graphs, influence has twice the exponential decay rate of root-to-leaf connectivities on the wired tree. This will yield Proposition 1.

Exponential decay rate in the wired Δ-regular tree.
Because of its recursive structure, connectivity properties of the random-cluster measure on the wired tree can be analyzed sharply. In this section, we pursue this and show that in the uniqueness regime of $p < p_u$, the probability of a connection from the root to a leaf at depth $h$ is $O(\hat p^h)$, as one would have for the free tree (corresponding to i.i.d. Ber($\hat p$) percolation on $T_h$). We first show that the probability of a root-to-boundary connection decays exponentially in $h$.
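The comparison with the free tree can be checked by brute force on a small example. The sketch below assumes the standard random-cluster weight $p^{|\omega|}(1-p)^{|E|-|\omega|}q^{c(\omega)}$ with free boundary, and takes $\hat p = p/(p + q(1-p))$ (assumed here to agree with (2.2)). It enumerates all configurations on a depth-2 binary tree and confirms that edges behave as independent Ber($\hat p$) variables, so a fixed root-to-leaf path of length $h$ is open with probability $\hat p^h$. Function names are illustrative.

```python
from itertools import product


def p_hat(p, q):
    """Probability that an edge is open when its endpoints are otherwise disconnected."""
    return p / (p + q * (1.0 - p))


def count_components(n_vertices, edges, omega):
    """Number of connected components of the open subgraph (union-find)."""
    parent = list(range(n_vertices))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = n_vertices
    for (u, v), is_open in zip(edges, omega):
        if is_open:
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
                components -= 1
    return components


def random_cluster_probability(n_vertices, edges, p, q, event):
    """Exact probability of `event` under the free random-cluster measure."""
    total = 0.0
    hit = 0.0
    for omega in product((0, 1), repeat=len(edges)):
        k = sum(omega)
        weight = (p ** k) * ((1.0 - p) ** (len(edges) - k))
        weight *= q ** count_components(n_vertices, edges, omega)
        total += weight
        if event(omega):
            hit += weight
    return hit / total


# Depth-2 binary tree: root 0, children 1 and 2, leaves 3, 4, 5, 6.
TREE_EDGES = [(0, 1), (0, 2), (1, 3), (1, 4), (2, 5), (2, 6)]

if __name__ == "__main__":
    p, q = 0.4, 3.0
    # Path from the root to leaf 3 uses edges (0, 1) and (1, 3), i.e. indices 0 and 2.
    path = random_cluster_probability(7, TREE_EDGES, p, q, lambda w: w[0] == 1 and w[2] == 1)
    print(path, p_hat(p, q) ** 2)
```

On a tree every open edge merges two components, which is why the $q$-dependence factorizes over edges; with wired boundary conditions this is no longer the case, and that is the situation the recursion in this section handles.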
Let T h be the complete Δ-regular tree of height h rooted at ρ. The wired "1" boundary conditions on T h are those that wire all leaves of T h (all vertices in ∂T h ). Define the probability that the root is connected to a leaf of T h . Using the recursive structure of the tree, it was shown in [5,Lemma 33] that if we define μ := p q + 1 − p, for every h, we have and for every p < p u (q, Δ), this satisfies lim h→∞ ϕ h = 0. The following lemma establishes that this convergence is exponentially fast.
Proof. Consider the recursion of (5.1) for ϕ h . Since lim h→∞ ϕ h = 0, if lim x→0 Since both the numerator and denominator of (5.2) are differentiable and have limit 0 as x → 0, using L'Hôpital's rule we get = dp.
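Recall that for every $0 < p < p_u$, we have $0 < d\hat p < 1$. Thus, there exists a sequence $\{\varepsilon_h\}$ such that $\lim_{h\to\infty} \varepsilon_h = 0$ and for every $h$, $\phi_{h+1} \le (d\hat p + \varepsilon_h)\,\phi_h$. Expanding this out, we deduce the claimed exponential decay of $\phi_h$, as desired.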
Our aim is to now prove Lemma 7, bounding connectivities of the root to a single leaf.
Proof of Lemma 7. To prove Lemma 7, we write a recursion for the root-to-leaf connection probability. Let $\vartheta_h$ be the probability under $\pi^1_{T_h}$ that the root is connected to the left-most leaf of depth $h$. Let $\tilde\vartheta_h$ be the probability of the same event under $\pi^{(1, )}_{T_h}$, where we recall that the $(1, )$ boundary conditions additionally wire the leaves of $T_h$ to the root. By monotonicity we have $\vartheta_h \le \tilde\vartheta_h$, and by Lemma 10 we have $\tilde\vartheta_h \le q^2 \vartheta_h$.
Let (I i ) i≤Δ be the indicator function of the event that there is a root-to-boundary path going through the i-th child of the root; set I = Δ i=2 I i . Then, we can write where in the first inequality we used the fact that in order for the root to be connected to the left-most leaf, it is required that the root is connected to its left-most child w 1 , and that w 1 is connected to the left-most leaf of its sub-tree. The former event occurs with probability p orp, depending on whether or not the root is connected to ∂T h through any child besides w 1 . By monotonicity, for every i = 2, ..., Δ, the law of I i under π 1 T h is below its law under π (1, ) T h and the same holds for I . Since, by Lemma 10 a single external wiring may distort the distribution by at most a q 2 factor, we get π (1, ) T h pq 2 ϕ h ). A union bound and Lemma 8 then imply for all h; note that ε can be chosen as small as needed provided the constant C( p, q, Δ, ε) is large enough. Thus, setting a = C pq 2p−1 we obtain by continuing the recursion. Now, observe that sincepd < 1 when p < p u , Combining the above two bounds, there exists an absolute constant A = A( p, q, Δ) such that for every h we have ϑ h ≤ Ap h and thus ϑ h ≤ Aq 2ph . The first inequality in the lemma follows by noticing that all the leaves in T h are equivalent, and the second follows from a union bound over the Δd h−1 . Notice that Υ B,ξ is an increasing event. We claim that Υ B,ξ controls the propagation of influence from ∂ B.

Exponential decay rate in
Proof of Lemma 9. For ease of notation let B : We construct a monotone coupling P of ω ξ ∼ π ξ B and ω τ ∼ π τ B . The coupling P reveals the configurations ω ξ ∼ π ξ B and ω τ ∼ π τ B on B one edge at a time using i.i.d. uniform random variables U e ∈ [0, 1] for each e ∈ E(B). The same U e is used to reveal the values ω ξ (e) and ω τ (e) from the corresponding conditional measures. The order in which the uniform variables are revealed is irrelevant and can be adaptive; this will allow us to reveal the boundary components. (For more details on the process of revealing random-cluster components under the monotone coupling, see below, as well as e.g., [6,8]. ) We construct an adaptive revealing scheme that ensures that on the event Υ c ξ for the top sample ω ξ , the samples ω ξ and ω τ agree on N v . This implies the desired result as one would then have by the definition of total-variation distance, We construct P with the following iterative scheme which proceeds level-by-level from the leaves of B. Recall that for each ≥ 1, we let Q = {u ∈ B : d(u, v) ≥ } and E(Q ) is the set of edges with both endpoints in Q . At any time in the revealment process, we say that a vertex u ∈ Q is unsaturated in Q if there exists w ∈ Q such that the edge-values (ω ξ (uw), ω τ (uw)) have not been revealed. Let (U e ) e∈E(B) be a family of i.i.d. uniform random variables on [0, 1] and reveal the configuration ω ξ as follows: (2) Add the edge uw to the set E ξ ; (3) If ω ξ (uw) = 1, add the vertex w to V ξ ; . . Note that we can use the same family (U e ) e∈E (B) in this process to generate coupled samples of ω ξ and ω τ . Notice that this coupling is monotone, so that because ξ ≥ τ , ω ξ ≥ ω τ almost surely. Let C i V (ω ξ ) denote the set of open edges revealed up to the i-th iteration of the procedure; we observe that C i V (ω ξ ) is not necessarily equal to the intersection of C V (ω ξ ) with E(Q R−i ), but it is a subset of C V (ω ξ ) ∩ E(Q R−i ). 
Refer to Figure 4 for a depiction of the above revealing procedure.
Through this revealing process, we see that ω ξ is open on the edges in the random set C i V (ω ξ ) and free on the edges in its outer (edge) boundary in Q R−i . LetC i V (ω ξ ) be the union C i V (ω ξ ) with its outer (edge) boundary in Q R−i , and note that this corresponds to the state of E ξ after the i'th iteration. The random set C i V (ω ξ ) is measurable with respect to the uniform random variables assigned to edges ofC i V (ω ξ ).
To conclude the proof, it suffices to see that because |V 0 | ≤ 1, both ω ξ (C V,0 ) and ω τ (C V,0 ) induce the free boundary conditions onC c V,0 . In that case ω ξ and ω τ would agree onC c V,0 and in particular on N v . By monotonicity, it suffices for us to show that the boundary conditions induced by ξ and ω ξ (C V,0 ) onC c V,0 are free. Since the wirings of ξ are only on vertices of V ξ ⊂C V,0 , the only way for the boundary conditions on C c V,0 to be not free is if multiple vertices on its boundary are incident to open edges of ω ξ (C V,0 ). By construction, the only vertices inC V,0 which can be incident to an open edge of ω ξ (C V,0 ) must be at distance exactly R − i 0 from v. By the assumption that |V 0 | ≤ 1, there can be at most one such vertex, and therefore there are no non-trivial (i.e., non-singleton) boundary components induced onC c V,0 by the boundary condition (ξ, ω ξ (C V,0 )), implying the desired conclusion.
For each $0 \le i \le k$, the graph $F_i = (F_i, E(F_i))$ is a forest. For each $i$, let $T^i_j = (T^i_j, E(T^i_j))$ for $j = 0, 1, \ldots$ denote the distinct connected components (subtrees) of $F_i$, so that $F_i = \bigcup_{j \ge 0} T^i_j$. (For some $i$, this may be empty, and for other $i$, this may be a single vertex.) Now, in order for $\Upsilon_{B,\xi}$ to hold, it must be the case that in each $F_i$, every depth is intersected by at least two sites in the FK cluster of $V_{B,\xi}$ in $Q$. Specifically, for each $i$, at distance $d_i + 1$ from $v$ there must be at least two distinct vertices connected to $V_{B,\xi}$ with paths in $Q_{d_i+1}$. Thus, for each $i$ there must exist two open monotone paths (each intersecting each height in $F_i$ at exactly one vertex), $\gamma_i \subset E(T^i_j)$ and $\gamma'_i \subset E(T^i_{j'})$ with $j \ne j'$, such that $\gamma_i$ (resp., $\gamma'_i$) connects the root of $T^i_j$ (resp., $T^i_{j'}$) to one of its leaves. If there are multiple such paths, choose according to some predetermined ordering, and call the sequences of paths $\Gamma = (\gamma_0, \ldots, \gamma_k)$ and $\Gamma' = (\gamma'_0, \ldots, \gamma'_k)$. See Figure 6 for a depiction.
We enumerate over the choices of such sequences of paths and then show that for any two fixed sequences of paths, the probability that they are both open is bounded by $C\hat p^{2R}$ for some $C(p, q, \Delta, K, L)$. (We say that a sequence of paths is open if all of its paths are.) In order to enumerate over the choices of sequences of paths, for each monotone path $\gamma_i$, let $x_i$ be its bottom endpoint, and define $x'_i$ for $\gamma'_i$ similarly. Since $\xi$ is $K$-Sparse, there are evidently at most $K$ choices of $x_0$, and $K$ choices of $x'_0$. Now observe that since $\gamma_i$ is a monotone path on a tree, for each $i$, the bottom endpoint $x_i$ determines the entire path $\gamma_i$. Since these paths form parts of the connections to $V_{B,\xi}$, the sequence of paths can be required to have endpoints at depths $d_{i+1} - 1$ that are either an ancestor of $x_0$, or an ancestor of $V(H)$. Here, at each height $h \notin Z$, an ancestor of a vertex $u$ at height $h$ is a vertex along the geodesic from $v$ to $u$. We make the following observation.
Indeed, except along the edges in $H$, every vertex has a unique parent, which is an ancestor of that vertex at one smaller depth. Thus, the geodesics of $B$ are uniquely determined by their endpoints together, possibly, with a subset of edges of $H$ traversed along the geodesic, yielding the at most $2^L$ available choices.
Returning to the enumeration over $\Gamma, \Gamma'$, the heights of the endpoints $x_i, x'_i$ are predetermined by $i$. Therefore, having chosen $x_0, x'_0$, for each $i$ there are at most $2L$ choices of bottom endpoint $x_i$, and likewise of $x'_i$, and therefore at most $2L \cdot 2^L$ choices of $\gamma_i$ and of $\gamma'_i$.
Hence, a union bound implies Now fix any two such sequences of paths Γ, Γ , and consider the probability that ω(Γ ∪ Γ ) = 1. Observe that Γ and Γ are vertex-disjoint by construction. Our aim is to make the events that Γ and Γ are open in ω independent. For this, let ρ i be the set of roots of the trees in F i . We introduce auxiliary wirings (as shown in Figures 5-6) for all vertices at depths {d : min i=0,...,k+1 |d − d i | ≤ 1}. Call the resulting distributionπ B ; by monotonicity, The distributionπ B is a product measure over the T i j 's with boundary condition (1, ) in each T i j (recall that this boundary condition wires all leaves ∂T i j together with the root of T i j ). Hence, since Γ and Γ are such that, for each i ≥ 0, γ i and γ i belong to distinct subtrees T i j i , T i j i of the forest F i , and we havẽ Let h i = d i+1 − d i be the height of the trees in F i . We deduce from Lemma 7 that there exists a constant A( p, q, Δ) > 0 such that uniformly over Γ, Γ , Plugging this bound into (5.4)-(5.5), we obtain 2Lp2R , from which the required (5.3) follows.

Remark 4.
A matching lower bound of $\Omega(\hat p^{2R})$ for the decay rate in Proposition 1 is easy to construct by, e.g., taking the $K$-Sparse boundary conditions $\xi$ that wire two leaves $w_1, w_2$ on distinct sub-trees of $v$, and the free boundary conditions $\xi' = 0$ on $T_R$. The event that the root is connected to $w_1$ and its corresponding child is connected to $w_2$ has probability at least $C\hat p^{2R}$ by Lemma 7 and the FKG inequality (see e.g., [31]). On this event, the probability that the edge incident to $v$ down towards $w_2$ is open is $p$ under the boundary condition $\xi$ and $\hat p$ under $\xi' = 0$.

Proof of Fast Mixing
In this section, we combine the results of Sections 4-5 to conclude the proof of Theorem 1. As indicated in Section 3, the analysis of Sections 4-5 reduces the mixing time of the FK-dynamics on a random graph to understanding the convergence to equilibrium on $O(1)$-Treelike balls of volume $O(n^{1/2-\delta})$ with $O(1)$-Sparse boundary conditions. In Section 6.1, we recall the log-Sobolev inequality and comparison bounds for the log-Sobolev constant under different boundary conditions. In Section 6.2, we bound this log-Sobolev constant via straightforward comparison to a product chain. Then in Section 6.3, we combine all of the above ingredients to deduce the proof of Theorem 1 using the censoring inequalities of [46].

Mixing time preliminaries.
Let us recall some standard tools to help us bound the rate of convergence to equilibrium of the FK-dynamics on treelike balls with sparse boundary conditions.

Log-Sobolev inequalities
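Recall, for a Markov chain with transition matrix $P$ and stationary distribution $\pi$ on a finite state space $\Omega$, the Dirichlet form
$$\mathcal{E}(f, f) = \frac{1}{2} \sum_{\omega, \omega' \in \Omega} \pi(\omega) P(\omega, \omega') \big(f(\omega) - f(\omega')\big)^2 \qquad (6.1)$$
for $f : \Omega \to \mathbb{R}$. Then the log-Sobolev constant is given by
$$\alpha(P) = \min_{f \,:\, \mathrm{Ent}_\pi[f^2] \ne 0} \frac{\mathcal{E}(f, f)}{\mathrm{Ent}_\pi[f^2]}, \qquad \text{where } \mathrm{Ent}_\pi[f^2] = \mathbb{E}_\pi\Big[f^2 \log \frac{f^2}{\mathbb{E}_\pi[f^2]}\Big]. \qquad (6.2)$$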
As such, a log-Sobolev inequality takes the form E( f, f ) ≥ γ · Ent π [ f 2 ] for all f . A log-Sobolev inequality is stronger than a mixing time bound, in the sense that it implies exponential convergence with rate γ in total-variation distance from the stationary distribution. This is captured by the following standard fact (e.g., a proof in the discrete time setting we consider follows immediately from Lemma 2.8 and Eq. (2.10) of [3]).

Fact 7.
Consider an ergodic finite state Markov chain (X t ) t≥0 with transition matrix P reversible with respect to stationary measure π . If the chain has a log-Sobolev constant α = α(P), then for every γ < α,

Boundary condition comparisons for the FK-dynamics
The following formalizes the notion that sparse boundary conditions are "close to free", and allows us to compare the induced mixing time on balls with sparse boundary to those with free boundary.

Definition 15 (Definition 2.1 of [6]). For two boundary conditions (partitions
is the number of components in φ. For two partitions φ, φ that are not comparable, let φ be the smallest partition such that φ ≥ φ and φ ≥ φ and set D( Lemma 10 (Lemma 2.2 of [6]). Let G = (V, E) be an arbitrary graph, p ∈ (0, 1) and q > 0. Let φ and φ be two partitions of V encoding two distinct external wirings on the vertices of G. Let π φ G , π φ G be the resulting random-cluster measures. Then, for all FK configurations ω ∈ {0, 1} E , we have From Lemma 10, and the definition of the Dirichlet form, (6.1), we deduce the following.
Corollary 8. Let G = (V, E) be an arbitrary graph, p ∈ (0, 1) and q > 0. Consider the FK-dynamics on G with boundary conditions φ and φ , and let E G φ , E G φ denote their Dirichlet forms, respectively. Then Together with Corollary 8 and Lemma 10 again, this controls the change in both log-Sobolev constant (6.2), and mixing time, under two boundary conditions with distance D(φ, φ ).
6.2. Local mixing: fast mixing on treelike graphs with sparse boundary conditions. In this section we establish a bound for the speed of convergence of the FK-dynamics on L-Treelike balls with K -Sparse boundary conditions (see Definitions 1 and 4). Our goal is to prove the following lemma. For every p ∈ (0, 1) and q > 0, the log-Sobolev constant of the FK-dynamics Lemma 11 follows by comparing log-Sobolev on an L-Treelike ball with K -Sparse boundary to a tree with K -Sparse boundary conditions, whose log-Sobolev constant is bounded by comparison to a product chain. We first note the following bound on the log-Sobolev constant on trees with sparse boundaries. Proof. Consider the FK-dynamics onT h under the free boundary conditions. In this case, the random-cluster measure is a Ber(p) product measure and thus the log-Sobolev constant of the FK-dynamics is c|E(T h )| −1 for some c( p, q) > 0; see, e.g., [17]. The result then follows from Lemma 10 and Corollary 8.
To move from mixing on an L-Treelike ball to mixing on a tree, the following fact will be useful.

Fact 9.
Let G be a subgraph of G such that V (G) = V (G ) and E(G) ⊂ E(G ); let H = E(G ) \ E(G). Suppose φ is a boundary condition on G, G such that for every e ∈ H , the endpoints of e are wired in φ. For every p ∈ (0, 1) and q > 0, let P G and P G be the transition matrices of the FK-dynamics on G and G , respectively, with boundary conditions φ, and let α(P G ) and α(P G ) be their log-Sobolev constants. There exists a constant c( p) > 0 such that Proof. The FK-dynamics on G is a product Markov chain on {0, We can now combine the above ingredients to deduce the bound of Lemma 11. φ be the boundary condition that includes all the connections from ξ and adds wirings between w and w for every edge ww ∈ H . Corollary 2 implies that the log-Sobolev constant for the FK-dynamics onT R with boundary condition φ is at least cq 6(K +L) |E(T R )| −1 for some c( p, q) > 0. We then get from Fact 9 that the log-Sobolev constant for the FK-dynamics on B with boundary condition φ is at least cq 6(k+L) |E(B)| −1 . Lemma 10 and Corollary 8 then imply that the log-Sobolev constant on B with boundary conditions ξ is at least cq 6K +12L |E(B)| −1 . Theorem 1: upper bound. Fix p < p u (q, Δ), let ε = 1 −pd (positive whenp < p u ) and fix δ > 0 small enough (depending on ε, Δ) such that

Proof of
in which case the following is polynomially decaying in n: log d n and let K be a constant sufficiently large (depending on p, q, Δ) that both Fact 3 and Theorem 5 hold for (K , R). For each t, let Γ t be the set of Δ-regular graphs on n vertices having By Fact 3 and Theorem 5, there exists C 0 ( p, q, Δ) such that if T = C 0 n log n, then P rrg (Γ c T ) ≤ o (1). It suffices for us to prove that the mixing time of the FK-dynamics on any G ∈ Γ T is O(n log n).
Fix any G ∈ Γ T and for every configuration ω on E(G), let X ω t = X ω G,t be the FKdynamics chain on G initialized from X ω 0 = ω. Couple the family of chains ((X ω t ) t≥0 ) ω∈{0,1} E(G) using the grand coupling as in Definition 8: recall that this is the coupling that in each step picks the same random e ∈ E(G) to update, and the same uniform random variable U e,t on [0, 1] to decide the next state on the edge e. As mentioned earlier, this coupling is monotone when q > 1 so that for every t ≥ 0, if X ω t ≤ X ω t , then X ω t+1 ≤ X ω t+1 . It follows from the definition of t mix and monotonicity of the grand coupling (see e.g., [40]), that it suffices for us to show that there existsĈ( p, q, Δ) such that ifT = T +Ĉn log n, P X 1T = X 0T ≤ Now fix any such v and consider the probability above. For ease of notation, let B v = B R (v) and B c v = E(G) \ B v . Introduce two new Markov chains Y 1 t and Y 0 t that are coupled via the grand coupling to X 1 t , X 0 t except that they censor (ignore) all updates on edges of B c are made in E(B v ) between times T andT . By K -sparsity of φ 1 , Lemma 11, and Fact 7, the term in (6.6) is bounded by for some C 3 ( p, q, Δ); we thus have for C 2 sufficiently large (and thereforeĈ sufficiently large), that this is at most o(n −2 ). By the same reasoning, by K -sparsity of φ 0 , the same bound applies to (6.8).
Finally, since both φ 1 and φ 0 induce K -Sparse boundary conditions on B v , by Proposition 1 there exists a constant C( p, q, Δ, K ) > 0 such that (6.7) is at most which is o(n −1 ) by our choice of δ and (6.3). Putting these three bounds together we see that as long asĈ is sufficiently large (depending on p, q, Δ) the difference in (6.5) is o(n −1 ), from which the bound of (6.4) follows for n sufficiently large, concluding the proof.
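The monotonicity of the grand coupling used in the proof above can be seen in a few lines of code. The sketch below assumes the standard heat-bath rule for the FK-dynamics: the updated edge is opened with probability $p$ if its endpoints are connected through the other open edges, and with probability $\hat p = p/(p+q(1-p))$ otherwise. Both chains use the same edge and the same uniform variable in each step. The graph, parameters and function names are illustrative choices, not taken from the paper.

```python
import random


def connected_without_edge(n_vertices, edges, omega, skip):
    """Are the endpoints of edges[skip] joined by open edges other than edges[skip]?"""
    adjacency = {v: [] for v in range(n_vertices)}
    for i, (u, v) in enumerate(edges):
        if i != skip and omega[i]:
            adjacency[u].append(v)
            adjacency[v].append(u)
    source, target = edges[skip]
    seen = {source}
    stack = [source]
    while stack:
        x = stack.pop()
        if x == target:
            return True
        for y in adjacency[x]:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return False


def fk_update(n_vertices, edges, omega, e, u, p, q):
    """Heat-bath update of edge index e using the uniform variable u (in place)."""
    p_hat = p / (p + q * (1.0 - p))
    threshold = p if connected_without_edge(n_vertices, edges, omega, e) else p_hat
    omega[e] = 1 if u <= threshold else 0


def run_grand_coupling(n_vertices, edges, p, q, steps, seed=0):
    """Run the all-1 and all-0 chains with shared randomness; track the ordering."""
    rng = random.Random(seed)
    top = [1] * len(edges)
    bottom = [0] * len(edges)
    ordered = True
    for _ in range(steps):
        e = rng.randrange(len(edges))
        u = rng.random()
        fk_update(n_vertices, edges, top, e, u, p, q)
        fk_update(n_vertices, edges, bottom, e, u, p, q)
        ordered = ordered and all(b <= t for b, t in zip(bottom, top))
    return top, bottom, ordered


if __name__ == "__main__":
    k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
    print(run_grand_coupling(4, k4, p=0.3, q=2.0, steps=3000, seed=5))
```

For $q \ge 1$ one has $\hat p \le p$, and a connection present in the lower chain is also present in the upper chain, so the threshold of the lower chain never exceeds that of the upper chain; this is why the ordering is preserved at every step.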

Matching Lower Bound on the Mixing Time
In this section, we show a matching $\Omega(n \log n)$ lower bound on the mixing time of the FK-dynamics on a random $\Delta$-regular graph and thus complete the proof of Theorem 1 from the introduction. A general lower bound for the mixing time of the Glauber dynamics on spin systems was shown in [35]. However, the non-locality of the FK-dynamics complicates extending the ideas from [35] to the random-cluster setting. In [8], the argument from [35] was adapted to the random-cluster model on $\mathbb{Z}^2$ when $p \ne p_c(q)$, but the amenability of $\mathbb{Z}^2$ together with the exponential decay of connectivities at $p < p_c$ was key to this extension.
In our setting, the non-amenability of the random $\Delta$-regular graph prevents us from bounding the speed of disagreement percolation under couplings of the FK-dynamics and implementing the argument of [35] directly. Instead, we use the locally treelike structure of the random $\Delta$-regular graph to directly couple a projection of the model on a certain set of $n^\varepsilon$ edges to a product measure on $n^\varepsilon$ edges, for which the coupon collector problem gives an immediate lower bound.
Claim. With $\mathbb{P}_{\mathrm{rrg}}$-probability $1 - o(1)$, $G$ is such that there exist $n^{1/5}$ vertices whose balls of radius $\frac{1}{5} \log_d n$ are disjoint, and are trees.
Proof. Per (4.2), it suffices to prove the above under $\mathbb{P}_{\mathrm{cm}}$. We prove the claim by repeated application of Lemma 4. Namely, consider the procedure where we repeatedly take an arbitrary vertex $v$ that has not been discovered yet, and reveal its ball of radius $R$. Let $v_i$ be the $i$-th vertex to be selected in this procedure, and let $A_i = \bigcup_{j \le i} E(B_R(v_j))$. Then, for integer $m \le n$, the probability that $(B_R(v_1), \ldots, B_R(v_m))$ are disjoint trees is at least one minus a sum of $m$ failure probabilities, one for each revealed ball. By Lemma 4 (using the fact that each $v_i \notin V(A_{i-1})$, so that $B^{\mathrm{out}}_R(v_i) = B_R(v_i)$), each of the summands is at most $O(m d^{2R}/(n - O(m d^R)))$. Taking $R = \delta \log_d n$ and $m = n^\delta$, we see that the sum above is at most $O(n^{4\delta}/(n - O(n^{2\delta})))$, which is $o(1)$ as long as $\delta < \frac{1}{4}$.
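The arithmetic at the end of the proof can be made concrete. The sketch below evaluates the quantity $n^{4\delta}/(n - n^{2\delta})$ with every implicit constant set to $1$ (an assumption made only for illustration), and shows that it shrinks with $n$ for $\delta = 1/5$ but grows for a choice such as $\delta = 0.3 > 1/4$. The function name is hypothetical.

```python
def disjoint_tree_failure_bound(n, delta):
    """Union bound n^{4 delta} / (n - n^{2 delta}), with all O(.) constants set to 1.

    m = n^delta balls are revealed, each of volume about d^R = n^delta; each of
    the m summands is about m * d^{2R} / (n - m * d^R).
    """
    m = n ** delta
    ball_volume = n ** delta
    per_ball = m * ball_volume ** 2 / (n - m * ball_volume)
    return m * per_ball


if __name__ == "__main__":
    for n in (1e6, 1e9, 1e12):
        print(n, disjoint_tree_failure_bound(n, 0.2), disjoint_tree_failure_bound(n, 0.3))
```

The exponent of $n$ in the bound is $4\delta - 1$, which is negative exactly when $\delta < 1/4$; the claim uses $\delta = 1/5$.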
Fix $\varepsilon \in (0, 1/5)$ to be taken sufficiently small later. For every $G$ having $n^{1/5}$ vertices whose balls of radius $\frac{1}{5} \log_d n$ are disjoint trees, choose arbitrarily some $n^\varepsilon$ vertices amongst the $n^{1/5}$ of Claim 7, and for each vertex collect a representative edge incident to it to form the set $C = C_\varepsilon(G)$. Our proof will rely on a coupling of the restrictions of $X_{t,G}$ and $\pi_G$ to $C$ to Ber($\hat p$) product chains. For this, let:
1. $X_t = X_{t,G}$ be a realization of the FK-dynamics;
2. $Y_t = Y_{t,G}$ be a realization of the FK-dynamics that censors all updates in $E(G) \setminus C$;
3. $\nu$ be the product measure over $|C|$ many Ber($\hat p$) random variables.
As before, let Y 0 t be the chain Y t initialized from the all-0 configuration.
Proof. We start with part (1). Our aim is to show that under the grand coupling of $X^0_t$ and $Y^0_t$, for every $t \le T = O(n \log n)$, we have $P(X^0_t \ne Y^0_t) \le o(1)$. Under the grand coupling, let $\mathcal{T}_T = (t_1, t_2, \ldots, t_{s(T)})$ denote the sequence of times at which the updated edge is in $C$, so that $s(T)$ counts the number of updates in $C$ by time $T$. We can then bound $P(X^0_t \ne Y^0_t) \le P(s(T) > n^{2\varepsilon}) + P(X^0_t \ne Y^0_t, s(T) \le n^{2\varepsilon})$.
The first term on the right-hand side is at most the probability that Binom$(T, |C|/|E(G)|) \ge n^{2\varepsilon}$, which is $o(1)$ by the Chernoff bound (4.6). It thus suffices to work on the event $s(T) \le n^{2\varepsilon}$. Let $R := \frac{1}{6} \log_d n$ and let $Z_t$ be the FK-dynamics chain (coupled to $X_t, Y_t$ through the grand coupling) that freezes the configuration on $C \cup (E(G) \setminus \bigcup_{e \in C} E(B_R(e)))$ to be all-1. Let $Z^0_t$ be the chain $Z_t$ initialized from the configuration that is all-0 on $\bigcup_{e \in C} E(B_R(e))$ (but all-1 on the frozen edges). Observe, trivially, that $X^0_t \le Z^0_t$ for all $t \ge 0$. Also, observe that the updates of $Z^0_t$ are stochastically dominated by Glauber updates on the union of $2|C|$ many $d$-ary trees $(T_{e,1}, T_{e,2})_{e \in C}$ of depth $R$, rooted at the endpoints of the edges of $C$, and each having $(1, )$ boundary conditions. By the monotonicity of the FK-dynamics, for all $t \ge 0$, we have that
$$P\Big(Z^0_t\Big(\bigcup_{e \in C} E(B_R(e)) \setminus \{e\}\Big) \in \cdot\Big) \preceq \bigotimes_{e \in C} \bigotimes_{i \in \{1,2\}} \pi^{(1, )}_{T_{e,i}}. \qquad (7.1)$$
For each time $t_i \in \mathcal{T}_T$, when an edge $e_{t_i} \in C$ is updated, $Y^0_{t_i}(e_{t_i})$ is drawn from Ber($\hat p$). At the same time, $X^0_{t_i}(e_{t_i})$ is drawn from Ber($\hat p$) if the endpoints of $e_{t_i}$ are not connected in $X^0_{t_i}$, which in turn must occur if none of $(T_{e,1}, T_{e,2})_{e \in C}$ have an open root-to-leaf path in $Z^0_t$, as $X^0_t \le Z^0_t$. By the stochastic domination of (7.1) on $Z^0_t$, and Lemma 7, the probability that the endpoints of $e_{t_i}$ are connected in $Z^0_{t_i}$ is at most $2C(\hat p d)^R$; for $\varepsilon$ sufficiently small (depending on $p, q, d$), the above is $O(n^{-3\varepsilon})$. On the event $\{s(T) \le n^{2\varepsilon}\}$, we can union bound the above probability over the $s(T)$ times in $\mathcal{T}_T$, to find that $P(X^0_t \ne Y^0_t, s(T) \le n^{2\varepsilon})$ is at most $O(n^{-\varepsilon}) = o(1)$ as desired.
For part (2), consider the $2|C|$ many $d$-ary trees $(T_{e,1}, T_{e,2})_{e \in C}$ emanating from the endpoints of the edges of $C$. Notice that if none of $(T_{e,1}, T_{e,2})_{e \in C}$ have an open root-to-leaf path, then the values $\omega(C)$ are conditionally distributed as a product of Ber($\hat p$) random variables, i.e., $\omega(C)$ would conditionally be distributed as $\nu$.
By a union bound, the $\pi_G$-probability that one of $(T_{e,1}, T_{e,2})_{e \in C}$ has an open root-to-leaf path is at most $\sum_{e \in C} \sum_{i \in \{1,2\}} \pi^{(1, )}_{T_{e,i}}(e \leftrightarrow \partial T_{e,i})$, which by Lemma 7 is at most $2n^\varepsilon \cdot C(\hat p d)^R$. For $\varepsilon$ sufficiently small (depending on $p, q, d$) this is $o(1)$.
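How small $\varepsilon$ must be can be read off from the identity $(\hat p d)^R = n^{-\kappa}$ for $R = \frac{1}{6}\log_d n$, where $\kappa = \frac{1}{6}\log_d\frac{1}{\hat p d}$. The sketch below computes $\kappa$, assuming $\hat p = p/(p+q(1-p))$ and $d = \Delta - 1$; the union bound above is then $o(1)$ when $\varepsilon < \kappa$, and the $O(n^{-3\varepsilon})$ requirement in part (1) holds when $3\varepsilon \le \kappa$. Function names and the sample parameters are illustrative.

```python
import math


def p_hat(p, q):
    """Open-edge probability when the endpoints are not otherwise connected."""
    return p / (p + q * (1.0 - p))


def decay_exponent(p, q, Delta, radius_fraction=1.0 / 6.0):
    """Return kappa with (p_hat * d)^R = n^(-kappa) for R = radius_fraction * log_d n."""
    d = Delta - 1
    rate = p_hat(p, q) * d
    if rate >= 1.0:
        raise ValueError("requires p_hat * d < 1")
    return radius_fraction * math.log(1.0 / rate) / math.log(d)


if __name__ == "__main__":
    kappa = decay_exponent(0.3, 2.0, 3)
    print("union bound is o(1) for epsilon below", kappa)
    print("part (1) needs epsilon at most", kappa / 3.0)
```

The exponent degenerates to $0$ as $\hat p d \to 1$, so the admissible $\varepsilon$ shrinks as $p$ approaches the uniqueness threshold.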
Proof of Theorem 1: lower bound. Take any $n$-vertex graph $G$ with $n^{1/5}$ vertices whose balls of radius $\frac{1}{5} \log_d n$ are disjoint trees. Note that by Claim 7, such graphs have $\mathbb{P}_{\mathrm{rrg}}$-probability $1 - o(1)$. Take $\varepsilon$ sufficiently small per Lemma 12. Consider the event $A^+ \subset \{0, 1\}^C$ that at least $\hat p n^\varepsilon - n^{2\varepsilon/3}$ of the edges in $C$ are open. Let $(Y_s)$ be the standard product chain over $|C| = n^\varepsilon$ many i.i.d. Ber($\hat p$) random variables, coupled to $Y_t(C)$ via $Y_{s(t)} = Y_t(C)$ for all $t$, where $s(t)$ counts the number of updates in $C$ by time $t$. By item (1) of Lemma 12, for every $T = O(n \log n)$,
$$P(X^0_T(C) \in A^+) \le P(s(T) > c n^\varepsilon \log n^\varepsilon) + P(Y^0_T \in A^+, s(T) \le c n^\varepsilon \log n^\varepsilon) + o(1) \le P(s(T) > c n^\varepsilon \log n^\varepsilon) + \sup_{s \le c n^\varepsilon \log n^\varepsilon} P(Y^0_s \in A^+) + o(1).$$
Taking $T := \frac{c}{2} n \log n^\varepsilon = \Theta(n \log n)$ for $c > 0$ sufficiently small, the probability that $s(T)$ is more than $c n^\varepsilon \log n^\varepsilon$ is $o(1)$ by the Chernoff bound (4.6). Turning to the middle term above, by the standard coupon collector bound, for every $c > 0$ sufficiently small, $\sup_{s \le c n^\varepsilon \log n^\varepsilon} P(Y^0_s \in A^+) \le o(1)$.
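The coupon collector step can be illustrated at the level of expectations. The sketch below considers a product chain on $m$ coordinates started from all-0, in which each step resamples a uniformly chosen coordinate from Ber($\hat p$); a coordinate that has never been updated is still closed. After $s = c\,m \log m$ steps about $m^{1-c}$ coordinates are untouched, so the expected number of open coordinates falls short of $\hat p m$ by about $\hat p m^{1-c}$, which exceeds the window $m^{2/3}$ when $c$ is small. This only compares means and is not a substitute for the concentration argument; the names and parameters are illustrative.

```python
import math


def expected_open_edges(m, s, p_hat):
    """Expected number of open coordinates after s single-site updates from all-0."""
    touched_fraction = 1.0 - (1.0 - 1.0 / m) ** s
    return p_hat * m * touched_fraction


def expected_deficit(m, c, p_hat):
    """Gap between the stationary mean p_hat * m and the mean after c * m * log(m) updates."""
    s = int(c * m * math.log(m))
    return p_hat * m - expected_open_edges(m, s, p_hat)


if __name__ == "__main__":
    m = 10 ** 6
    print("window m^(2/3):", m ** (2.0 / 3.0))
    print("deficit for c = 0.1:", expected_deficit(m, 0.1, 0.2))
    print("deficit for c = 2.0:", expected_deficit(m, 2.0, 0.2))
```

For $c = 0.1$ the deficit is far larger than $m^{2/3}$, so the chain started from all-0 is unlikely to lie in $A^+$, whereas for large $c$ the deficit is negligible.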