Supercritical percolation on nonamenable graphs: Isoperimetry, analyticity, and exponential decay of the cluster size distribution

Let $G$ be a connected, locally finite, transitive graph, and consider Bernoulli bond percolation on $G$. We prove that if $G$ is nonamenable and $p>p_c(G)$ then there exists a positive constant $c_p$ such that \[\mathbf{P}_p(n \leq |K|<\infty) \leq e^{-c_p n}\] for every $n\geq 1$, where $K$ is the cluster of the origin. We deduce the following two corollaries: 1. Every infinite cluster in supercritical percolation on a transitive nonamenable graph has anchored expansion almost surely. This answers positively a question of Benjamini, Lyons, and Schramm (1997). 2. For transitive nonamenable graphs, various observables including the percolation probability, the truncated susceptibility, and the truncated two-point function are analytic functions of $p$ throughout the entire supercritical phase.


Introduction
Let $G = (V, E)$ be a connected, locally finite graph. In Bernoulli bond percolation, we choose to either delete or retain each edge of $G$ independently at random with retention probability $p \in [0, 1]$, to obtain a random subgraph $\omega$ of $G$ with law $\mathbf{P}_p = \mathbf{P}^G_p$. The connected components of $\omega$ are referred to as clusters, and we denote the cluster of $v$ in $\omega$ by $K_v = K_v(\omega)$. Percolation theorists are particularly interested in the geometry of the open clusters, and how this geometry changes as the parameter $p$ is varied. It is natural to break this study up into several cases according to the relationship between $p$ and the critical probability \[p_c = p_c(G) = \inf\bigl\{p \in [0, 1] : \omega \text{ has an infinite cluster } \mathbf{P}_p\text{-a.s.}\bigr\},\] which always satisfies $0 < p_c < 1$ when $G$ is transitive and has superlinear growth [21] (this result is easier and older if $G$ has polynomial growth [44, Corollary 7.19] or exponential growth [32, 43]).
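The definitions above can be made concrete on a finite graph. The following is a minimal sketch, assuming a finite square grid as the underlying graph purely for illustration (it is amenable, and is not one of the graphs studied in this paper); the helper names `grid_edges`, `sample_percolation`, and `cluster` are hypothetical.

```python
import random
from collections import deque

def grid_edges(n):
    """Edges of the n-by-n square grid (an illustrative finite graph)."""
    edges = []
    for x in range(n):
        for y in range(n):
            if x + 1 < n:
                edges.append(((x, y), (x + 1, y)))
            if y + 1 < n:
                edges.append(((x, y), (x, y + 1)))
    return edges

def sample_percolation(edges, p, rng):
    """Retain each edge independently with probability p."""
    return [e for e in edges if rng.random() < p]

def cluster(open_edges, v):
    """Vertex set of the cluster K_v, found by breadth-first search."""
    adj = {}
    for a, b in open_edges:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    seen = {v}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        for w in adj.get(u, []):
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return seen

if __name__ == "__main__":
    rng = random.Random(0)
    edges = grid_edges(20)
    for p in (0.3, 0.5, 0.7):
        omega = sample_percolation(edges, p, rng)
        print(p, len(cluster(omega, (10, 10))))
```

On a finite graph every cluster is finite, so a sketch of this kind only illustrates the objects $\omega$ and $K_v$ and says nothing about the infinite-volume phase transition.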
Among the different regimes this leads one to consider, the subcritical phase $0 < p < p_c$ is by far the easiest to understand. Indeed, the basic features of subcritical percolation have been well understood since the breakthrough 1986 works of Menshikov [46] and Aizenman and Barsky [1] which, together with the work of Aizenman and Newman [4], establish in particular that if $G$ is a connected, locally finite, transitive graph then \[\zeta(p) := -\limsup_{n\to\infty} \frac{1}{n} \log \mathbf{P}_p\bigl(n \leq |E(K_v)| < \infty\bigr) > 0\] for every $0 \leq p < p_c$, where we write $E(K_v)$ for the set of edges that have at least one endpoint in the cluster $K_v$. See also [22, 23, 38] for alternative proofs, and [16] for more refined results. Note that $\zeta(p) = \zeta(p, G)$ is well-defined for every $0 < p < 1$ and every connected, locally finite graph $G$, and in particular that the definition does not depend on the choice of the vertex $v$.
In contrast, the supercritical phase $p_c < p < 1$ is rather more difficult to understand, and no good theory yet exists for supercritical percolation on general transitive graphs. A central role in the theory of this phase is played by the distribution of finite clusters. For $\mathbb{Z}^d$ with $d \geq 3$, most results concerning this distribution rely crucially on the important and technically challenging work of Grimmett and Marstrand [27], which allowed in particular for a complete proof of the asymptotics \[\mathbf{P}_p\bigl(n \leq \operatorname{rad}(K_v) < \infty\bigr) = \exp\bigl(-\Theta_p(n)\bigr) \qquad (1.2)\] and \[\mathbf{P}_p\bigl(n \leq |K_v| < \infty\bigr) = \exp\bigl(-\Theta_p(n^{(d-1)/d})\bigr), \qquad (1.3)\] where $\operatorname{rad}(K_v)$ denotes the maximal distance from $v$ of a vertex of $K_v$. The upper bounds of (1.2) and (1.3) were proven by Chayes, Chayes, Grimmett, Kesten, and Schonmann [18] and by Kesten and Zhang [40] respectively, both conditional on the then-conjectural Grimmett-Marstrand theorem. All of these upper bounds rely essentially on renormalization, a technique that is unavailable outside the Euclidean setting. The lower bound of (1.2) is trivial, while the lower bound of (1.3) was proven by Aizenman, Delyon, and Souillard [2], see also [26,]. The more refined properties of finite clusters in supercritical percolation on $\mathbb{Z}^d$ remain an area of active research, see e.g. [15, 17] and references therein. We note moreover that percolation in the slightly supercritical ($p \downarrow p_c$) regime remains very poorly understood on $\mathbb{Z}^d$ even when $d$ is large, in which case critical ($p = p_c$) percolation is now well understood following in particular the work of Hara and Slade [30]; see [31] for an overview of progress and problems in this direction.
In this paper, we develop a theory of supercritical percolation on nonamenable transitive graphs, studying in particular the distribution of finite clusters, the geometry of the infinite clusters, and the regularity of the dependence of various observables on $p$. Our main result, from which various corollaries will be derived, is the following theorem. An extension of the theorem to the quasitransitive case is given in Section 4.
Theorem 1.1. Let $G$ be a connected, locally finite, nonamenable, transitive graph. Then $\zeta(p) > 0$ for every $p_c < p \leq 1$.
We stress that our methods are completely different to those used in the Euclidean context. We also note that we do not study the question of the (non)uniqueness of the infinite cluster, which remains completely open at this level of generality. We refer the interested reader to [33,36] and references therein for an up-to-date account of what is known regarding this question and the closely related problem of understanding critical percolation on nonamenable transitive graphs.
Here, we recall that a locally finite graph $G = (V, E)$ is said to be nonamenable if its Cheeger constant \[\Phi_E(G) := \inf\Bigl\{\frac{|\partial_E K|}{\sum_{u \in K}\deg(u)} : K \subseteq V \text{ finite and nonempty}\Bigr\}\] is positive, where for each set $K \subseteq V$ we write $\partial_E K$ for the set of edges with exactly one endpoint in $K$. For example, Euclidean lattices such as $\mathbb{Z}^d$ are amenable, while regular trees of degree at least three and transitive tessellations of hyperbolic spaces are nonamenable. Previously, Theorem 1.1 was known only for $p > (1 - \Phi_E(G))/(1 + \Phi_E(G))$, where it follows by a simple counting argument [12, Theorem 2] and does not require transitivity. We believe that the full conclusions of Theorem 1.1 were only previously known for trees. Note that the theorem fails without the assumption of transitivity: for example, the graph obtained by attaching a 3-regular tree and a 4-regular tree by a single edge between their respective origins has $p_c = 1/3$ and $\zeta(1/2) = 0$. It was observed in [8] that the argument of [2] can be generalized to prove that $\zeta(p) = 0$ for every $p > p_c$ whenever $G$ is a Cayley graph of a finitely presented amenable group. In Corollary 3.4 we prove via a different argument that in fact $\zeta(p) = 0$ for every amenable transitive graph and every $p_c \leq p < 1$. This gives a converse to Theorem 1.1, so that, combining both results, we obtain the following appealing percolation-theoretic characterization of nonamenability for transitive graphs:
Corollary 1.2. Let $G$ be a connected, locally finite, transitive graph. The following are equivalent:
1. $G$ is nonamenable.
2. $p_c(G) < 1$ and $\zeta(p) > 0$ for every $p \in (p_c, 1)$.
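To illustrate the isoperimetric ratio and the threshold $(1 - \Phi_E)/(1 + \Phi_E)$ mentioned above, the following sketch evaluates the ratio of boundary edges to total degree for balls in the $d$-regular tree. It assumes that the Cheeger constant is normalized by the sum of degrees; with this normalization a direct count gives $\Phi_E = (d-2)/d$ for the $d$-regular tree, so that the threshold equals $1/(d-1)$. The function names are hypothetical.

```python
from fractions import Fraction

def tree_ball_ratio(d, r):
    """|edge boundary| / (sum of degrees) for the ball of radius r in the d-regular tree."""
    # The sphere of radius j >= 1 contains d * (d - 1) ** (j - 1) vertices.
    size = 1 + sum(d * (d - 1) ** (j - 1) for j in range(1, r + 1))
    # Every edge leaving the ball joins the sphere of radius r to the sphere of radius r + 1.
    boundary = d * (d - 1) ** r
    return Fraction(boundary, d * size)

def counting_threshold(phi):
    """The value (1 - phi) / (1 + phi) above which the counting argument applies."""
    return (1 - phi) / (1 + phi)

if __name__ == "__main__":
    for r in (0, 1, 5, 20):
        print(r, float(tree_ball_ratio(3, r)))  # decreases towards 1/3
    print(counting_threshold(Fraction(1, 3)))   # threshold for the 3-regular tree
```

For the 3-regular tree the ratios decrease towards $1/3$ as the radius grows, and the resulting threshold $1/2$ coincides with $p_c$ of that tree, so the counting argument already covers the whole supercritical phase there.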
We note in particular that Theorem 1.1 and Corollary 1.2 resolve several questions raised by Bandyopadhyay, Steif, and Timár [8, Questions 1 and 2 and Conjecture 1].

Analyticity
One nice corollary of Theorem 1.1 is that many quantities including the percolation probability all depend analytically on p throughout the entire supercritical phase. The regularity properties of these functions have historically been a subject of great importance in percolation, motivated in part by the still uncompleted project to rigorize the heuristic computation of p c for various planar lattices by Sykes and Essam [52]. See e.g. [25,26,39] for further background.
Let us now state our general analyticity result. Let $\mathcal{H}_v$ denote the set of finite connected subgraphs of $G$ containing $v$, and say that a function $F : \mathcal{H}_v \to \mathbb{C}$ has subexponential growth if \[\limsup_{n\to\infty} \frac{1}{n} \log \sup\bigl\{|F(H)| : H \in \mathcal{H}_v, |H| = n\bigr\} \leq 0.\] It is an observation essentially due to Kesten [39], and a consequence of Morera's Theorem and the Weierstrass $M$-test, that if $F : \mathcal{H}_v \to \mathbb{C}$ has subexponential growth then the function $p \mapsto \mathbf{E}_p\bigl[F(K_v)\mathbb{1}(|K_v| < \infty)\bigr]$ is analytic in $p$ on the set $\{p \in (0, 1) : \zeta(p) > 0\}$, which always contains $(0, p_c)$ when $G$ is transitive as discussed above. More precisely, this means that for every $p_0 \in (0, 1)$ with $\zeta(p_0) > 0$ there exists $\varepsilon > 0$ and a complex analytic function on the complex ball of radius $\varepsilon$ around $p_0$ whose restriction to $(p_0 - \varepsilon, p_0 + \varepsilon)$ agrees with the function under consideration. See Proposition 3.1 for details. Thus, Theorem 1.1 has the following corollary.
Corollary 1.3. Let $G = (V, E)$ be a connected, locally finite, nonamenable, transitive graph, let $v \in V$ and let $F : \mathcal{H}_v \to \mathbb{C}$ have subexponential growth. Then $\mathbf{E}_p\bigl[F(K_v)\mathbb{1}(|K_v| < \infty)\bigr]$ is an analytic function of $p$ on $(p_c, 1)$.
Previously, it was not even known that $\theta(p)$ was differentiable on $(p_c, 1)$ under the same hypotheses. On the other hand, it is an immediate consequence of the mean-field lower bound on $\theta(p_c + \varepsilon)$ [
We remark that in amenable transitive graphs, subexponential decay of the cluster volume distribution makes analyticity in the supercritical phase a rather more delicate issue. Indeed, while infinite differentiability of $\theta(p)$, $\chi^f(p)$, $\kappa(p)$, and $\tau^f_p(u, v)$ in the supercritical phase of $\mathbb{Z}^d$ is an easy consequence of superpolynomial decay [26, Theorem 8.92], it remains open whether these quantities are analytic on the entire supercritical phase for percolation on $\mathbb{Z}^d$ with $d \geq 3$. The two-dimensional case of this problem was settled only in the very recent work of Georgakopoulos and Panagiotis [25], which also contains partial progress on various related problems.

Isoperimetry and random walk
In this section, we discuss applications of Theorem 1.1 to the isoperimetry of infinite percolation clusters on nonamenable transitive graphs. It is easily seen that percolation clusters cannot be nonamenable when $p < 1$, since infinite clusters will always contain arbitrarily large isoperimetrically 'bad regions', such as long paths. Nevertheless, one has the intuition that supercritical percolation clusters on nonamenable transitive graphs ought to be 'essentially nonamenable' in some sense. With the aim of making this intuition precise, Benjamini, Lyons, and Schramm [10] defined the anchored Cheeger constant of a connected, locally finite graph $G = (V, E)$ to be \[\Phi^*_E(G) := \lim_{n\to\infty} \inf\Bigl\{\frac{|\partial_E K|}{\sum_{u \in K}\deg(u)} : v \in K \subseteq V,\ K \text{ connected},\ n \leq |K| < \infty\Bigr\},\] where $v$ is a vertex of $G$ whose choice does not affect the value obtained, and said that $G$ has anchored expansion if $\Phi^*_E(G) > 0$. They asked [10, Question 6.5] whether every infinite cluster in supercritical Bernoulli bond percolation on $G$ has anchored expansion whenever $G$ is transitive and nonamenable. (In fact they stated the question without the assumption of transitivity. An example showing that the question has a negative answer without this assumption is given in Remark 3.5.) See also [44, Question 6.49]. Partial progress on this question was made by Chen, Peres, and Pete [19], who proved in particular that every infinite cluster has anchored expansion a.s. for $p$ sufficiently close to 1. Their result does not require transitivity. Their paper also treats some other related models, most notably establishing anchored expansion for supercritical Galton-Watson trees. Anchored expansion has subsequently been established for various other random graph models, see [7, 11, 20].
Corollary 1.4. Let $G$ be a connected, locally finite, nonamenable, transitive graph, and let $p_c < p \leq 1$. Then every infinite cluster in Bernoulli-$p$ bond percolation on $G$ has anchored expansion almost surely.
Corollary 1.4 also allows us to analyze the behaviour of the simple random walk on the infinite clusters of Bernoulli percolation under the same hypotheses. Indeed, it is a theorem of Virág [54, Theorem 1.2] that if $G$ is a bounded degree graph with anchored expansion then there exists a positive constant $c$ such that the simple random walk return probabilities on $G$ satisfy the inequality \[p_n(v, v) \leq \exp\bigl(-c n^{1/3}\bigr)\] for every vertex $v$ and every sufficiently large $n$. In Section 3.3 we establish that a matching lower bound also holds in our setting, thereby deducing the following corollary. We write $p^\omega_n(v, v)$ for the $n$-step return probability of simple random walk from $v$ on the percolation configuration $\omega$.
Corollary 1.5. Let $G$ be a connected, locally finite, nonamenable, transitive graph, let $p_c < p < 1$, and let $v$ be a vertex of $G$. Then \[p^\omega_{2n}(v, v) = \exp\bigl(-\Theta(n^{1/3})\bigr) \qquad \text{as } n \to \infty\] almost surely on the event that the cluster of $v$ is infinite.
The same $\exp(-\Theta(n^{1/3}))$ return probability asymptotics given by Corollary 1.5 also appear in many other random graph models of nonamenable flavour [7, 50]. For results on isoperimetry and random walks in supercritical percolation on $\mathbb{Z}^d$, see e.g. [9, 48] and references therein.
About the proof and organization. The proof of the main theorem, Theorem 1.1, is given in Section 2. The starting point of the proof is to use Russo's formula to express the $p$-derivative of the truncated exponential moment $\mathbf{E}_p\bigl[e^{t|E(K_v)|}\mathbb{1}(|E(K_v)| < \infty)\bigr]$ as the sum of a positive term, which corresponds to the cluster growing while remaining finite, and a negative term, which corresponds to the finite cluster becoming infinite. (To do this rigorously we instead truncate at a large finite volume.) The proof then hinges on two key ideas, each of which allows us to bound the absolute value of one of these two terms. The bounds on the negative term work by re-purposing ideas from the Burton-Keane proof that there is at most one infinite cluster in percolation on amenable transitive graphs [14]. We use here the fact that supercritical percolation on a nonamenable transitive graph always stochastically dominates an invariant percolation process with trifurcation points (Lemma 2.5), which is a sort of weak converse to Burton-Keane and is due in the unimodular case to Benjamini, Lyons, and Schramm [10]. Meanwhile, for the positive term, we first bound the derivative in terms of certain 'skinny trees of bridges', and then bound the resulting expression using an inductive argument. This is the most technical part of the paper. Finally, the derivative itself, which is the difference of these two terms, can be bounded using martingale methods. Once these three bounds are in hand, the finiteness of $\mathbf{E}_p\bigl[e^{t|E(K_v)|}\mathbb{1}(|E(K_v)| < \infty)\bigr]$ follows easily.
We then apply Theorem 1.1 to deduce Corollaries 1.2-1.5 in Section 3, generalize Theorem 1.1 to the quasi-transitive case in Section 4, and give closing remarks and open problems in Section 5.

Proof of the main theorem
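In this section we prove Theorem 1.1. Let $G = (V, E)$ be a connected, locally finite graph, and let $v$ be a vertex of $G$. We write $K_v$ for the cluster of $v$, and write $E_v = |E(K_v)|$ for the number of edges touching $K_v$. Let $\mathcal{H}_v$ be the set of all finite connected subgraphs of $G$ containing $v$. Given a function $F : \mathcal{H}_v \to \mathbb{R}$, we write \[\mathbf{E}_{p,n}\bigl[F(K_v)\bigr] := \mathbf{E}_p\bigl[F(K_v)\mathbb{1}(E_v \leq n)\bigr]\] for every $p \in [0, 1]$ and $n \geq 1$. To prove Theorem 1.1, it suffices to prove that if $G$ is transitive and nonamenable then for every $p_c(G) < p < 1$ there exists $t > 0$ such that \[\sup_{n \geq 1} \mathbf{E}_{p,n}\bigl[e^{tE_v}\bigr] < \infty.\] Note that $\mathbf{E}_{p,n}[F(K_v)]$ is a polynomial in $p$ and is therefore differentiable. We begin our analysis by expressing the $p$-derivative of $\mathbf{E}_{p,n}[F(K_v)]$ in terms of the numbers of open and closed edges touching the cluster. For each $H \in \mathcal{H}_v$, write $E_o(H)$ for the set of edges of $H$ and $\partial H$ for the set of edges of $G$ that touch $H$ but do not belong to $H$, and let \[h_p(H) := (1 - p)|E_o(H)| - p|\partial H|.\] Then for every $n \geq 1$, $p \in (0, 1)$, and $F : \mathcal{H}_v \to \mathbb{R}$ we may compute that \[\frac{d}{dp}\mathbf{E}_{p,n}\bigl[F(K_v)\bigr] = \frac{1}{p(1-p)}\mathbf{E}_{p,n}\bigl[h_p(K_v)F(K_v)\bigr].\] In many situations, it is fruitful to bound the absolute value of this expression by viewing $h_p(K_v)$ as the final value of a certain martingale (indeed, a stopped random walk) that arises when exploring the cluster of $v$ one step at a time. We will apply this strategy to bound $\frac{d}{dp}\mathbf{E}_{p,n}[e^{tE_v}]$ in Section 2.2. Next, we apply Russo's formula to give an alternative expression for the derivative in terms of pivotal edges. For each $\omega \in \{0, 1\}^E$ and $e \in E$, let $\omega^e = \omega \cup \{e\}$ and let $\omega_e = \omega \setminus \{e\}$. Russo's formula [26, Theorem 2.32] states that if $X : \{0, 1\}^E \to \mathbb{R}$ depends on at most finitely many edges, then \[\frac{d}{dp}\mathbf{E}_p\bigl[X(\omega)\bigr] = \sum_{e \in E}\mathbf{E}_p\bigl[X(\omega^e) - X(\omega_e)\bigr].\] Applying this formula to $X$ of the form $X(\omega) = F(K_v(\omega))\mathbb{1}(E_v(\omega) \leq n)$ yields that \[\frac{d}{dp}\mathbf{E}_{p,n}\bigl[F(K_v)\bigr] = U_{p,n}\bigl[F(K_v)\bigr] - D_{p,n}\bigl[F(K_v)\bigr]. \qquad (2.3)\] We denote the two terms appearing on the right hand side of this expression by \[U_{p,n}\bigl[F(K_v)\bigr] := \frac{1}{p}\mathbf{E}_{p,n}\Bigl[\sum_{e \in E}\mathbb{1}(\omega(e) = 1)\bigl(F(K_v(\omega)) - F(K_v(\omega_e))\bigr)\Bigr]\] and \[D_{p,n}\bigl[F(K_v)\bigr] := \frac{1}{1-p}\mathbf{E}_{p,n}\Bigl[F(K_v)\,\#\bigl\{e \in E : \omega(e) = 0,\ E_v(\omega^e) > n\bigr\}\Bigr]\] for every $F : \mathcal{H}_v \to \mathbb{R}$, every $p \in (0, 1)$, and every $n \geq 1$.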
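The derivative identity discussed above can be sanity-checked numerically on a small finite graph, where every cluster is finite and no truncation is needed. The sketch below assumes the normalization $h_p(H) = (1-p)a - pb$, where $a$ and $b$ are the numbers of open and closed edges touching $H$; under this assumption the identity reads $\frac{d}{dp}\mathbf{E}_p[F] = \frac{1}{p(1-p)}\mathbf{E}_p[h_p F]$, which follows from differentiating $\mathbf{P}_p(K_v = H) = p^a(1-p)^b$. The test graph and all function names are hypothetical.

```python
from itertools import product
from math import exp

def cluster_edge_counts(edges, omega, v):
    """Return (a, b): the numbers of open and closed edges touching the cluster of v."""
    cluster = {v}
    changed = True
    while changed:  # naive closure under open edges, adequate for tiny graphs
        changed = False
        for (x, y), state in zip(edges, omega):
            if state and ((x in cluster) != (y in cluster)):
                cluster.update((x, y))
                changed = True
    touching = [s for (x, y), s in zip(edges, omega) if x in cluster or y in cluster]
    a = sum(touching)
    return a, len(touching) - a

def expectation(edges, v, p, F, use_h=False):
    """E_p[F(E_v)], or E_p[h_p(K_v) F(E_v)] if use_h is set, by exact enumeration."""
    total = 0.0
    for omega in product((0, 1), repeat=len(edges)):
        weight = 1.0
        for s in omega:
            weight *= p if s else 1 - p
        a, b = cluster_edge_counts(edges, omega, v)
        h = (1 - p) * a - p * b if use_h else 1.0
        total += weight * h * F(a + b)  # a + b is the number of edges touching K_v
    return total

EDGES = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]  # a 4-cycle with one chord

def check(p, t=0.3, eps=1e-6):
    """Compare a central finite difference with the h_p formula for F = exp(t * E_v)."""
    F = lambda ev: exp(t * ev)
    numeric = (expectation(EDGES, 0, p + eps, F) - expectation(EDGES, 0, p - eps, F)) / (2 * eps)
    formula = expectation(EDGES, 0, p, F, use_h=True) / (p * (1 - p))
    return numeric, formula

if __name__ == "__main__":
    print(check(0.4))
```

The two returned values should agree up to the discretization error of the finite difference.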
Our strategy will be to use the equality (2.3) to obtain bounds on moments of certain random variables. In the remainder of this section we carry this out conditional on three supporting propositions, each of which gives control of one of the three quantities $M_{p,n}[e^{tE_v}]$, $U_{p,n}[e^{tE_v}]$, or $D_{p,n}[e^{tE_v}]$, and which will be proved in the following three subsections. We begin by stating the following proposition, which is proven in Section 2.1.
We remark that the complementary inequality D p,n F ( always holds trivially on every connected, locally finite graph. Applying Proposition 2.1 to the function F (K v ) = e tEv , we obtain that for every p c < p < 1 there exists a positive constant c p such that for every n ≥ 1, and t ≥ 0. We want to show that if t > 0 is sufficiently small then the expectation on the left is finite. To do this, we bound each term on the right hand side by the sum of a term that can be absorbed into the left hand side and a term that we will show is bounded as n → ∞ for sufficiently small values of t > 0. For the second term, this is quite straightforward to carry out using the bound The term on the second line can readily be shown to be bounded as n → ∞ for sufficiently small t via a random walk analysis, as summarized by the following proposition which is proven in Section 2.2.
Proposition 2.2. Let α > 0 and p ∈ (0, 1). Then there exists t α,p > 0 such that Bounding U p,n [e tEv ] is rather more complicated. Let H be a finite connected graph. For each edge e of H, let H e denote the subgraph of H spanned by all edges of H other than e. Given a vertex v and a set W of vertices in H, we write Piv(H, v, W ) for the set of edges e such that there exists w ∈ W such that v is not connected to w in H e , and define We will use these quantities to bound U p,n [e tEv ]. To do this, we first write , ω(e) = 1, and E v ≤ n for each k ≥ 1, where k is a Stirling number of the second kind and ! k is the number of surjective functions from a set of size k to a set of size .
for every p ∈ (0, 1) and k ≥ 1. Summing over k, we obtain that for every t > 0. Similarly to (2.5), we can bound this expression by The second term is dealt with by the following proposition, which is proven in Section 2.3.
for every locally finite graph G = (V, E), every v ∈ V and every 0 ≤ t < t α,p .
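The Stirling numbers of the second kind that appear in the moment expansion above can be computed and checked directly. The following sketch verifies, by brute force for small parameters, that $\ell!$ times the Stirling number counts surjections from a set of size $k$ onto a set of size $\ell$, together with the standard identity $n^k = \sum_{\ell} \binom{n}{\ell}\,\ell!\,S(k,\ell)$; we write $S(k, \ell)$ for the Stirling number, and the function names are hypothetical.

```python
from itertools import product
from math import comb, factorial

def stirling2(k, l):
    """Stirling number of the second kind S(k, l), via the standard recurrence."""
    if k == l:
        return 1
    if l == 0 or l > k:
        return 0
    return l * stirling2(k - 1, l) + stirling2(k - 1, l - 1)

def count_surjections(k, l):
    """Brute-force count of surjective functions from a k-set onto an l-set."""
    return sum(1 for f in product(range(l), repeat=k) if len(set(f)) == l)

def power_via_stirling(n, k):
    """Evaluate n ** k as the sum over l of C(n, l) * l! * S(k, l)."""
    return sum(comb(n, l) * factorial(l) * stirling2(k, l) for l in range(k + 1))

if __name__ == "__main__":
    print([stirling2(5, l) for l in range(6)])
    print(power_via_stirling(7, 4), 7 ** 4)
```

The identity groups the functions from a $k$-set to an $n$-set according to the size $\ell$ of their image, which is the kind of bookkeeping that converts a $k$-th moment into a sum over $\ell$-tuples of distinct edges.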
Let us now see how the proof of Theorem 1.1 can be concluded given Propositions 2.1-2.3.
Proof of Theorem 1.1. Let G be a connected, locally finite, transitive, nonamenable graph, and let p > p c (G). Letting c p be the constant from Proposition 2.1, we deduce from (2.4), (2.5), and (2.8) that for every n ≥ 1 and t ≥ 0. Taking limits as n ↑ ∞ we obtain that for every t ≥ 0. Propositions 2.2 and 2.3 imply that there exists t 0 = t 0 (p, c p ) > 0 such that the right hand side is finite for every 0 ≤ t < t 0 , completing the proof.
It now remains to prove Propositions 2.1-2.3.

Bounding the negative term
In this section we prove Proposition 2.1. Note that the proof of this proposition is the only place where the assumptions of nonamenability and transitivity are used in the proof of Theorem 1.1. The proof is also ineffective in the sense that it does not yield any explicit lower bound on the constant $c_p$, and is in fact the only ineffective step in the proof of Theorem 1.1. Let $G = (V, E)$ be a locally finite graph, let $\omega \in \{0, 1\}^E$, and let $S$ be a finite subset of $V$. We define the quantity \[P_\omega(S \to \infty) := \min\bigl\{|C| : C \subseteq E,\ S \text{ is not connected to } \infty \text{ in } \omega \setminus C\bigr\}.\] By Menger's Theorem, this quantity is equal to the maximum size of a set of edge-disjoint paths from $S$ to $\infty$ in the subgraph of $G$ spanned by $\omega$. Note in particular that $P_\omega(S \to \infty)$ is an increasing function of $\omega \in \{0, 1\}^E$. Proposition 2.1 will easily be deduced from the following proposition.
Proposition 2.4. Let $G$ be a connected, locally finite, nonamenable, transitive graph. Then for every $p > p_c(G)$ there exists a positive constant $c_p$ such that \[\mathbf{E}_p\bigl[P_\omega(S \to \infty)\bigr] \geq c_p|S| \qquad (2.9)\] for every $S \subset V$ finite.
The proof of Proposition 2.4 uses elements of the Burton-Keane [14] argument, which is usually used to establish uniqueness of the infinite cluster in percolation on amenable transitive graphs. This argument in fact establishes that the inequality (2.9) holds when there are infinitely many infinite clusters for percolation with parameter p. (In the amenable case one can then prove by contradiction that such p do not exist, since the left hand side of (2.9) is at most |∂ E S|.) To apply this argument in our setting, since we are not assuming that there are infinitely many infinite clusters in Bernoulli-p percolation, we will first need to find an appropriate automorphism-invariant percolation process that has infinitely many infinite connected components each of which is infinitely ended and which is stochastically dominated by Bernoulli p-percolation. We do this by a case analysis according to whether or not G is unimodular, applying the results of [10] in the unimodular case and of [33] in the nonunimodular case.
We will borrow in particular from the presentation of the Burton-Keane argument given in [44,Theorem 7.6]. Let G = (V, E) be a graph and let η ∈ {0, 1} E . We say that a vertex v is a furcation of η if closing all edges incident to v would split the component of v in η into at least three distinct infinite connected components.
Lemma 2.5. Let $G = (V, E)$ be a connected, locally finite, nonamenable, transitive graph, and let $p_c(G) < p \leq 1$. Then there exists an automorphism-invariant percolation process $\eta$ on $G$ such that the following hold:
1. The origin is a furcation of $\eta$ with positive probability.
2. The process $\eta$ is stochastically dominated by Bernoulli-$p$ bond percolation on $G$.
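Before beginning the proof of this lemma, let us briefly introduce the notions of unimodularity and nonunimodularity. We refer the reader to [44, Chapter 8] for further background. Let $G = (V, E)$ be a connected, locally finite, transitive graph with automorphism group $\operatorname{Aut}(G)$. The modular function $\Delta : V^2 \to (0, \infty)$ is defined by \[\Delta(u, v) = \frac{|\operatorname{Stab}_v u|}{|\operatorname{Stab}_u v|},\] where $\operatorname{Stab}_v$ is the stabilizer of $v$ in $\operatorname{Aut}(G)$ and $\operatorname{Stab}_v u$ is the orbit of $u$ under $\operatorname{Stab}_v$. We say that $G$ is unimodular if $\Delta(u, v) \equiv 1$ and that $G$ is nonunimodular otherwise. Most graphs occurring in examples are unimodular, including all Cayley graphs of finitely generated groups and all transitive amenable graphs [51].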
Proof of Lemma 2.5. Recall the definition of the uniqueness threshold p u = p u (G) = inf{p ∈ [0, 1] : ω has a unique infinite cluster P p -a.s.}. The usual Burton-Keane argument implies that if q ∈ (p c , p u ) then the origin is a furcation for Bernoulli-q percolation with positive probability; See in particular the proof of [44,Theorem 7.6]. Thus, if p c (G) < p u (G) we may take η to be Bernoulli-q percolation for some p c < q < p u ∧ p. By the main result of [33], we always have that p c (G) < p u (G) when G is nonunimodular, which completes the proof in this case. Now suppose that G is unimodular. A result of Benjamini, Lyons, and Schramm [10, Lemma 3.8 and Theorem 3.10] states that for every p ∈ (p c , 1], Bernoulli-p bond percolation on G stochastically dominates an automorphism-invariant percolation process that is almost surely a forest with infinitely-ended components (see also [5,Theorem 8.13]). This process clearly has furcations, and so meets the conditions required by the lemma.
The rest of the proof of Proposition 2.4 is very similar to the Burton-Keane argument; we include the details for completeness. We will use the following elementary combinatorial lemma. Proof. We may assume that T has at least one vertex, since the claimed bound is trivial otherwise. Then we have that |E T | = |V T | − 1 and hence that The claim then follows by rearranging.
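The degree-counting step in the proof above rests on the identity $\sum_{u}(\deg(u) - 2) = -2$, valid for any finite tree with at least one vertex. The statement of the lemma is not reproduced in this text, so the following is only a sketch of a standard consequence that is consistent with the proof: in a finite tree with at least two vertices, the number of leaves equals $2 + \sum_{u : \deg(u) \geq 3}(\deg(u) - 2)$, and is therefore at least the number of vertices of degree at least three. The random tree model and the function names are hypothetical choices made for testing.

```python
import random

def random_tree_degrees(n, rng):
    """Degree sequence of a random recursive tree on n >= 1 vertices."""
    deg = [0] * n
    for i in range(1, n):
        parent = rng.randrange(i)  # attach vertex i to a uniformly chosen earlier vertex
        deg[i] += 1
        deg[parent] += 1
    return deg

def leaf_count(deg):
    """Number of vertices of degree exactly one."""
    return sum(1 for d in deg if d == 1)

def branching_excess(deg):
    """Sum of (degree - 2) over vertices of degree at least three."""
    return sum(d - 2 for d in deg if d >= 3)

def branch_point_count(deg):
    """Number of vertices of degree at least three."""
    return sum(1 for d in deg if d >= 3)

if __name__ == "__main__":
    rng = random.Random(0)
    deg = random_tree_degrees(30, rng)
    print(leaf_count(deg), 2 + branching_excess(deg), branch_point_count(deg))
```

Since every vertex of degree at least three contributes at least one to the excess, the identity immediately gives the inequality between leaves and branch points.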
Proof of Proposition 2.4. Let $G$ be a connected, locally finite, transitive, nonamenable graph, and let $o$ be a fixed root vertex of $G$. Let $p_c < p \leq 1$, let $\omega$ be Bernoulli-$p$ bond percolation, and let $\eta$ be as in Lemma 2.5. Let $\Lambda$ be the set of furcation points of $\eta$. Since $\omega$ stochastically dominates $\eta$ we have that $\mathbf{E}_p\bigl[P_\omega(S \to \infty)\bigr] \geq \mathbb{E}\bigl[P_\eta(S \to \infty)\bigr]$ for every $S \subseteq V$ finite, and so it suffices to prove that \[\mathbb{E}\bigl[P_\eta(S \to \infty)\bigr] \geq \mathbb{P}(o \in \Lambda)\,|S| \qquad (2.10)\] for every $S \subseteq V$ finite. Since $\eta$ is automorphism-invariant and $G$ is transitive, it suffices to prove the deterministic statement \[P_\eta(S \to \infty) \geq |\Lambda \cap S|, \qquad (2.11)\] since (2.10) follows from (2.11) by taking expectations.
To prove (2.11), pick a spanning tree of each infinite cluster of $\eta$ and let $F$ be the union of all of these spanning trees. For each connected component $T$ of $F$, let $\Lambda_T$ be the set of furcations of $T$, and note that $\Lambda$ can be written as the disjoint union of the sets $\Lambda_T$. For each component $T$ of $F$, let $T'$ be obtained from $T$ by iteratively deleting all vertices of degree at most 1, and note that the set of furcations of $T'$ coincides with the set of furcations of $T$. Let $T'(S)$ be the smallest connected subgraph of $T'$ containing all the vertices of $S$ that belong to $T'$, i.e., the union of all simple paths in $T'$ both of whose endpoints lie in $S$. (Note that $T'(S)$ is the empty graph if $T'$ does not intersect $S$.) Since $T'$ does not have any degree 1 vertices, it follows that $P_{T'}(S \to \infty)$ is at least the number of degree 1 vertices in $T'(S)$, and hence by Lemma 2.6 that \[P_{T'}(S \to \infty) \geq |\Lambda_T \cap S|.\] Since the components of $F$ are disjoint, we deduce that \[P_\eta(S \to \infty) \geq \sum_T P_{T'}(S \to \infty) \geq \sum_T |\Lambda_T \cap S| = |\Lambda \cap S|\] as claimed, where the sums are over the connected components of $F$.
Proof of Proposition 2.1. Fix a vertex v of G. Let ω 1 , ω 2 be independent copies of Bernoulli-p bond percolation on G, and let ω ∈ {0, 1} E be defined by Since we can explore K v (ω 1 ) without revealing the status of any edge in E \ E(K v (ω 1 )), the configuration ω is also distributed as Bernoulli-p bond percolation on G. If K v (ω) is finite then K v (ω) is not connected to ∞ in ω 2 \ e ∈ E : ω(e) = 0, and E v (ω e ) = ∞ , and we deduce that when K v (ω) is finite. Taking expectations on both sides and using that ω 1 and ω 2 are independent, we deduce from Proposition 2.4 that where c p > 0 is the constant from Proposition 2.4. We deduce immediately that if F : and the claim follows by taking expectations over K v .

Bounding the total derivative
In this section we prove Proposition 2.2. This is the easiest of the three propositions used in the proof of Theorem 1.1; the methods used are completely standard and go back to some of the earliest work on percolation, see e.g. [24, 41] and [3, Section 3].
Proof of Proposition 2.2. Let $G = (V, E)$ be a connected, locally finite graph, let $v \in V$, and let $p \in (0, 1)$. Let $(X_i)_{i \geq 1}$ be a sequence of i.i.d. mean-zero random variables with $\mathbb{P}(X_i = 1 - p) = p$ and $\mathbb{P}(X_i = -p) = 1 - p$, and let $Z_n = \sum_{i=1}^n X_i$. Exploring the cluster of $v$ one edge at a time leads to a coupling of percolation on $G$ with the sequence $(X_i)_{i \geq 1}$ and a stopping time $T$ such that since $|Z_n| \leq n$ for every $n \geq 1$. Finally, since $|X_i| \leq 1$ for every $i \geq 1$, we deduce from Azuma's inequality that \[\mathbb{P}\bigl(|Z_n| \geq \alpha n\bigr) \leq 2\exp\Bigl(-\frac{\alpha^2 n}{2}\Bigr)\] for every $n \geq 1$, so that the right hand side of (2.12) is finite whenever $t < t_0 = \alpha^2/2$.
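The concentration step in the proof above can be checked against exact binomial tails. The sketch below uses the fact that $Z_n$ has the law of $B - np$ with $B$ a Binomial$(n, p)$ random variable, and compares the exact value of $\mathbb{P}(|Z_n| \geq \alpha n)$ with the Azuma-type bound $2\exp(-\alpha^2 n/2)$, which is the form assumed here for increments bounded by one in absolute value. The function names are hypothetical.

```python
from math import comb, exp

def tail_probability(n, p, alpha):
    """Exact P(|Z_n| >= alpha * n), where Z_n = Binomial(n, p) - n * p."""
    total = 0.0
    for j in range(n + 1):
        if abs(j - n * p) >= alpha * n:
            total += comb(n, j) * p ** j * (1 - p) ** (n - j)
    return total

def azuma_bound(n, alpha):
    """The bound 2 * exp(-alpha^2 * n / 2) for increments bounded by 1."""
    return 2 * exp(-alpha ** 2 * n / 2)

if __name__ == "__main__":
    for n in (10, 50, 200):
        print(n, tail_probability(n, 0.3, 0.2), azuma_bound(n, 0.2))
```

Summing $n e^{tn}$ against a bound of this form gives a convergent series exactly when $t < \alpha^2/2$, which matches the threshold $t_0$ appearing in the proof.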

Bounding the positive term
In this section we prove Proposition 2.3. This is the most difficult of the three propositions going into the proof of Theorem 1.1. The proof will rely on the following precise estimate, which will be proved via an inductive argument.
Proposition 2.7. Let G be a connected, locally finite graph, and let v be a vertex of G. Then the inequality holds for every k, n, m ≥ 1 and p ∈ (0, 1).
Before proving Proposition 2.7 we first explain how it implies the assertion of Proposition 2.3.
Proof of Proposition 2.3 given Proposition 2.7. Since Br k (K v , v) ≤ E v and k k ≤ e k k! for every k ≥ 1, it suffices to prove that for every α > 0 and p ∈ (0, 1) there exists λ = λ(α, p) < ∞ and C = C(α, p) < ∞ such that for every k ≥ 1. This can be deduced from Proposition 2.7 by elementary computation. Indeed, letting c = α 2 (1 − p) 8/α and letting Y be a geometric random variable with parameter e −c , we may write Since P(Y ≥ t) = e −c t ≤ e −ct for every t ≥ 0, we deduce from the identity where we used the change of variables s = t + 2k on the second line. Putting this all together we obtain that which is easily seen to imply an inequality of the desired form (2.14).
We shall prove a more refined version of Proposition 2.7 by induction on k. It will be important for us to work on arbitrary graphs in this section to facilitate the induction. Indeed, working on arbitrary graphs (or arbitrary subgraphs of a given graph) in this way is a useful trick to avoid non-monotonicity problems in inductive analyses of percolation, which we believe was first used by Kozma and Nachmias in [42] following a suggestion of Peres.
We begin with the base case k = 1. Note that Br 1 (K v , v) is bounded from above by the intrinsic radius R v of K v , i.e., the maximal graph distance in K v of a vertex from v. Thus the case k = 1 follows from the following bound on the probability of having a large skinny cluster, whose intrinsic radius is of the same order as its volume (think of m below as being at least αn).
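Lemma 2.8. Let $G$ be a connected, locally finite graph, and let $v$ be a vertex of $G$. Then \[\mathbf{P}_p\bigl(R_v \geq m,\ E_v \leq n\bigr) \leq \exp\Bigl(-\frac{m}{2}(1 - p)^{4n/m}\Bigr)\] for every $n, m \geq 1$ and $p \in [0, 1]$.
Proof. Consider exploring the cluster of $v$ as follows: at stage $i$, expose the value of those edges that touch $\partial B_{\mathrm{int}}(v, i - 1)$, the set of vertices with intrinsic distance exactly $i - 1$ from $v$, and have not yet been exposed. Stop when $\partial B_{\mathrm{int}}(v, i) = \emptyset$. If $R_v \geq m$ and $E_v \leq n$, there must exist at least $m/2$ stages $i \in \{1, \ldots, m\}$ where the sum of degrees in $G$ of the vertices in $\partial B_{\mathrm{int}}(v, i - 1)$ is at most $4n/m$. At each such stage, the conditional probability that $\partial B_{\mathrm{int}}(v, i) \neq \emptyset$ given everything that has happened up to stage $i - 1$ is at most $1 - (1 - p)^{4n/m}$. We deduce that \[\mathbf{P}_p\bigl(R_v \geq m,\ E_v \leq n\bigr) \leq \bigl(1 - (1 - p)^{4n/m}\bigr)^{m/2} \leq \exp\Bigl(-\frac{m}{2}(1 - p)^{4n/m}\Bigr)\] as claimed, where the inequality $1 - x \leq e^{-x}$ has been used in the second inequality.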
Remark 2.9. Using Lemma 2.8, we can already conclude the proof of the weaker statement that the intrinsic radius of a finite cluster has an exponential tail in supercritical percolation on a nonamenable transitive graph. This can be done via the same strategy used for the volume, but using Lemma 2.8 instead of Proposition 2.7 to bound the term U p,n [e tRv ], which is easily seen to be at For each k ≥ 1 and n, m ≥ 0 we define the quantity Note that Q k (p, n, m) is trivially equal to zero when the inequalities k ≤ m ≤ n are not satisfied. number of edges of G that K G v (q) touches, and write T v (q) = T G v (q) = Tr(K G v (q)). Write P = P G for the law of the collection of random variables (U e : e ∈ E).
Fix p ∈ [0, 1] and 1 ≤ k ≤ m ≤ n, and consider the event A = {E G v (p) = n, Lf k+1 (K G v (p), v) = m} whose probability we wish to estimate. Suppose that the event A holds, and let u 1 , . . . , u k+1 be a collection of leaves of T G v (p) = Tr(K G v (p)) attaining the maximum in the definition of Lf k+1 (K G v (p), v). (Note that this collection is not unique, but that the choice will not matter. In particular, we can and do choose u 1 , . . . , u k+1 to be a measurable function of the cluster K G v (p).) Let S be the subtree of T G v (p) spanned by the union of the geodesics connecting u 1 , . . . , u k+1 and the root o : . We say that an edge e of S is a last-branching edge if deleting it from S results in two connected components S 1 , S 2 , where S 1 contains the root and the following conditions hold: • S 2 is either a path or an isolated vertex; • The endpoint of e that belongs to S 1 is either equal to o or has degree at least three in S.
Observe that every last-branching edge of $S$ is naturally associated both to a leaf of $S$ and to a bridge of $K^G_v(p)$, which we call a last-branching bridge. In particular, we may enumerate the last-branching bridges of $K^G_v(p)$ by $e_1, \ldots, e_{k+1}$ in such a way that for each $1 \leq i \leq k + 1$ the geodesic from $o$ to $u_i$ in $S$ passes through the edge corresponding to $e_i$ but does not pass through the edge of $S$ corresponding to $e_j$ for any $j \neq i$. Write $L = \{e_1, \ldots, e_{k+1}\}$.
As the form of the recursion (2.17) suggests, we will perform surgery on a random edge in $L$. Write $p' := pe^{-1/n}$, and consider the event $B = A \cap \{$there is exactly one $p$-open edge touching $K^G_v(p)$ that is $p'$-closed, and this edge belongs to $L\}$.
Since |L| = k + 1 and there are at most n p-open edges in K G v (p) on the event A , we may compute that where we have used that 1 − e −x ≥ x 2 for every x ∈ [0, 1] in the final inequality, and hence that for the subgraph of G spanned by those edges of G that do not touch K G v (p ). To bound P(B) we now argue that where C a,b is the event that the following conditions all hold: (ii) There is exactly one p-open edge e touched by K G v (p ) that is not p -open, and this edge lies in the set L. In particular, this edge has an endpoint x that does not lie in K G v (p ).
(iii) The p-cluster K Hv The only part of the claim (2.19) that merits explanation is the implicit claim that if (ii) holds then Without loss of generality assume that e = e k+1 ∈ L. Consider the subgraph S of T G v (p ) spanned by the union of the geodesics between u 1 , . . . , u k and the root. Since e = e k+1 is the last-branching bridge associated to u k+1 , the tree S contains one of the endpoints of the edge corresponding to e and we therefore observe that where the −1 term corresponds to the edge e itself. On the other hand, suppose that v 1 , . . . , v k are leaves of T G v (p ) attaining the maximum in the definition of Lf k (K G v (p ), v) and let r ≥ 0 be the distance in T G v (p ) from e to the subgraph of T G v (p ) spanned by the union of the geodesics between the leaves v 1 , . . . , v k and the root. Then considering the subgraph of T G v (p) spanned by the union of the geodesics between the leaves v 1 , . . . , v k , the leaf u k+1 , and the root yields that (again, the +1 term corresponds to the edge e itself) and hence that as claimed. It remains to estimate the probability of the event C a,b . Let D a,b ⊇ C a,b be the simpler event that the following conditions hold: (iib) There is exactly one p-open edge e touched by K G v (p ) that is not p -open, and this edge has an endpoint x that does not lie in K G v (p ).
(iiib) The p-cluster K Hv
By definition, the probability that the condition (ib) holds is at most $Q_k(p', a, b)$. On the other hand, the conditional probability that (iiib) and (vb) hold given both that (ib) and (iib) hold and given the cluster $K^G_v(p')$ and the edge e is equal to was defined by taking a supremum over all graphs. Thus, we have that for every a, b ≥ 0. Since G was arbitrary, the claim now follows from (2.18) and (2.19).
Proof. We will first prove that for every k ≥ 1, p ∈ [0, 1], and n, m ≥ 0. The claim is trivial if the inequalities k ≤ m ≤ n are not satisfied, since in this case $Q_k(p, n, m) = 0$. We will prove (2.20) by induction on k, simultaneously for all n, m ≥ 0 and p ∈ [0, 1]. The base case k = 1 follows immediately from Lemma 2.8. Before applying the induction hypothesis to the sum on the right hand side of (2.17), we first massage the bound from Lemma 2.8 on the term $Q_1(p, n - n_1, m - m_1 - 1)$ a little by noting that The first inequality is simply Lemma 2.8. To see that the second inequality is true, note that it holds trivially when $m = m_1$, while when $m_1 < m$ it is true since in this case $m - m_1 - 1 \geq (m - m_1)/2$.
It remains only to deduce Proposition 2.7 from (2.20). It follows from the definitions that and hence by (2.20) that Applying the hockey-stick identity again to sum over r and using that $\binom{a+b}{b}$ is increasing in both b and a, we deduce that

Proofs of corollaries
In this section we apply Theorem 1.1 to deduce Corollaries 1.2-1.5.

Analyticity
In this section, we apply the following proposition to deduce Corollary 1.3 from Theorem 1.1. The proof of this proposition is well-known and very easy, but is included for completeness since we could not find a statement at the desired level of generality in the literature.
Proof of Corollary 1.3. This is immediate from Theorem 1.1 and Proposition 3.1.

Anchored expansion
The following proposition, which allows us to deduce Corollary 1.4 from Theorem 1.1, is implicit in the proof of [19,Theorem A.1]. We give a proof both for completeness and to stress that the argument does not require any isoperimetric assumptions on the ambient graph G.
Proposition 3.2. Let G = (V, E) be a connected, locally finite graph. Let $p_c < p < 1$ and suppose that ζ(p) > 0. Then every infinite cluster K of Bernoulli-p bond percolation on G has anchored expansion with anchored Cheeger constant for every α < α(p), since Markov's inequality will then imply that $\lim_{N \to \infty} \mathbf{P}_p\bigl(\bigcup_{n \geq N} \bigcup_{m \leq \alpha n} A_{n,m}\bigr) = 0$ for every α < α(p) as desired. To prove (3.2), first observe that for every S ⊆ ∂H we have that Summing over the possible choices of S with S ⊆ E(H) and |S| = m, we deduce that To conclude, simply note that if 0 < α ≤ p then αn m=1 n m p) , and it follows that the right hand side of (3.3) is finite for 0 < α < α(p) as claimed.
Remark 3.3. When G is transitive, it is a consequence of indistinguishability [28,45] that for each $p_c < p \leq 1$ there exists a deterministic constant $\phi^*_E(p)$ such that every infinite cluster has anchored Cheeger constant equal to $\phi^*_E(p)$ almost surely.
We now turn to Corollary 1.2. It is a result of Häggström, Schonmann, and Steif [29] (see also [44,Corollary 8.38]) that if G is an amenable transitive graph and ω is an automorphism-invariant percolation process on G, then every cluster K of ω has $\Phi^*_E(K) = 0$ almost surely. Thus, as discussed in the introduction, Proposition 3.2 has the following corollary.
Proof of Corollary 1.2. This is immediate from Theorem 1.1 and Corollary 3.4.
Remark 3.5. Consider the graph G formed by attaching a binary tree to each vertex of Z 3 . This graph is nonamenable, has bounded degrees, and satisfies p c (G) = p c (Z 3 ) < 1/2. Moreover, if p c (G) < p < 1/2, then Bernoulli-p percolation on G has a unique infinite cluster almost surely, which is distributed as the graph obtained by taking the unique infinite cluster of percolation on Z 3 and attaching an independent subcritical Galton-Watson tree to each vertex. This graph clearly has subexponential growth, and consequently does not have anchored expansion. This example shows that Theorem 1.1 cannot be extended to arbitrary connected, bounded degree, nonamenable graphs, and therefore gives a negative answer to [10, Question 6.5] as originally stated.

Random walk analysis
almost surely on the event that the cluster of v is infinite. This estimate will be deduced as a consequence of the following geometric claim: There exists a constant c > 0 such that the intrinsic n-ball around v has an induced subgraph isomorphic to a path of length cn for every n sufficiently large a.s. on the event that $K_v$ is infinite. (3.5)
We now prove this claim. Fix $p_c < p < 1$ and v ∈ V. Write $B_{\mathrm{int}}(v, n)$ for the intrinsic ball of radius n around v in $K_v$, and $\partial B_{\mathrm{int}}(v, n)$ for the set of vertices with intrinsic distance exactly n from v. Since $K_v$ has anchored expansion a.s. on the event that it is infinite by Corollary 1.4, it must trivially also have exponential growth a.s. on the event that it is infinite. Indeed, it follows from Theorem 1.1 and Proposition 3.2 that there exist a constant $g_1 > 1$ and a random variable $N_1$ that is almost surely finite on the event that $K_v$ is infinite such that $|\partial B_{\mathrm{int}}(v, n)| \geq g_1^n$ for every $n \geq N_1$. On the other hand, since the volume of the intrinsic n-ball of any vertex is deterministically at most $M^n$, where M is the maximum degree of G, it follows that for every $n \geq N_1$ and m ≥ 0 almost surely on the event that $K_v$ is infinite, and consequently that there exist constants $c_1 > 0$ and $g_2 > 1$ such that $\#\{u \in \partial B_{\mathrm{int}}(v, n) : u \text{ lies on an intrinsic geodesic from } v \text{ to } \partial B_{\mathrm{int}}(v, n + c_1 n)\} \geq g_2^n$ for every $n \geq N_1$ almost surely on the event that the cluster of v is infinite. Let $A_{n,m}$ be the set of vertices u in $\partial B_{\mathrm{int}}(v, n)$ such that there exists a path of length m in G starting at u that does not visit any vertex of $B_{\mathrm{int}}(v, n)$ other than at its starting point, so that $\#A_{n, c_1 n} \geq \#\{u \in \partial B_{\mathrm{int}}(v, n) : u \text{ lies on an intrinsic geodesic from } v \text{ to } \partial B_{\mathrm{int}}(v, n + c_1 n)\} \geq g_2^n$ for every $n \geq N_1$ almost surely on the event that the cluster of v is infinite.
Moreover, choosing points greedily shows that for every n, m ≥ 1 and r ≥ 1 there exists a subset $A_{n,m,r}$ of $A_{n,m}$ with volume at least $M^{-r} \#A_{n,m}$ such that any two distinct points in $A_{n,m,r}$ have distance at least r in G. Furthermore, we can and do choose $A_{n,m,r}$ in such a way that it is a measurable function of $B_{\mathrm{int}}(v, n)$. Putting the above facts together, we deduce that there exist constants $c_2 > 0$ and $g_3 > 1$ such that $\#A_{n, c_2 n, c_2 n} \geq g_3^n$ (3.6) for every $n \geq N_1$ almost surely on the event that the cluster of v is infinite. Let k ≥ 1 and let $\mathscr{A}_{n,k}$ be the event that $B_{\mathrm{int}}(v, n + k)$ contains an induced subgraph isomorphic to a path of length k. Let m ≥ 2(k + 1). For each element u of $A_{n,m,m}$, the conditional probability given $B_{\mathrm{int}}(v, n)$ that $K_v$ contains an induced subgraph isomorphic to a path of length k starting at u is at least $[p(1 - p)^{M-1}]^k$, since we can take the path of the required length starting at u that is disjoint from $B_{\mathrm{int}}(v, n)$ other than at u, find all of its edges to be open, and find to be closed all the edges that touch a vertex of the path other than u but are not included in the path. On the other hand, the separation between the points of $A_{n,m,m}$ makes all of these events independent from each other, and we deduce that for every n, k ≥ 1 and m ≥ 2(k + 1). Choosing it follows by Borel-Cantelli that there exists an almost surely finite $N_2$ such that the event $\mathscr{A}_{n, cn} \cup \{\#A_{n, 2cn, 2cn} < g_3^n\}$ occurs for every $n \geq N_2$ almost surely. Together with (3.6), this shows that the event $\mathscr{A}_{n, cn}$ holds for every $n \geq N_1 \vee N_2$ almost surely on the event that $K_v$ is infinite. This completes the proof of (3.5).
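The greedy selection step in this proof can be made concrete. The following is a minimal Python sketch and is not taken from the paper: the helper names `ball`, `greedy_separated_subset`, and `grid_graph`, the adjacency-dictionary representation, and the square grid used to try it out are all illustrative assumptions. The sketch scans the candidates in a fixed order and keeps a point whenever it lies at distance at least r from every point kept so far. Each kept point blocks only the vertices within distance r − 1 of it, and in a graph of maximum degree M ≥ 2 there are at most $M^r$ of those, which is one way to see the factor $M^{-r}$.

```python
from collections import deque

def ball(adj, source, radius):
    """Set of vertices within graph distance <= radius of source (breadth-first search)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == radius:
            continue  # do not explore beyond the requested radius
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return set(dist)

def greedy_separated_subset(adj, candidates, r):
    """Greedily pick candidates so that any two picked points are at distance >= r."""
    chosen = []
    blocked = set()
    for u in candidates:  # fixed scan order makes the output deterministic
        if u in blocked:
            continue
        chosen.append(u)
        blocked |= ball(adj, u, r - 1)  # everything closer than r to u is now excluded
    return chosen

def grid_graph(n):
    """Adjacency dict of the n-by-n square grid (hypothetical test graph, maximum degree 4)."""
    adj = {}
    for x in range(n):
        for y in range(n):
            nbrs = [(x + dx, y + dy) for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))]
            adj[(x, y)] = [(a, b) for a, b in nbrs if 0 <= a < n and 0 <= b < n]
    return adj

adj = grid_graph(12)
candidates = sorted(adj)
r = 3
chosen = greedy_separated_subset(adj, candidates, r)
```

Because the scan order is fixed in advance, the output depends only on the input set, which mirrors the requirement that the selected subset be a measurable function of the explored ball.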
It remains to deduce (3.4) from (3.5). This argument is well-known and appears in e.g. [54, Example 6.1], so we will keep the discussion brief. Let c > 0 and N < ∞ be random variables such that $B_{\mathrm{int}}(v, n^{1/3})$ contains an induced subgraph isomorphic to a path of length $cn^{1/3}$ for every n ≥ N. Pick one such induced subgraph and call it the n-pipe, and let w(n) be the vertex of the n-pipe at minimal intrinsic distance from v. In time 2n, the random walk on G can return to v using the following strategy: Walk to w(n) in exactly $d_{\mathrm{int}}(v, w(n)) = O(n^{1/3})$ steps, spend the following $2n - 2d_{\mathrm{int}}(v, w(n)) = \Theta(n)$ steps performing an excursion from w(n) to itself inside the n-pipe, then return to v in the final $d_{\mathrm{int}}(v, w(n)) = O(n^{1/3})$ steps. Each of these three stages has probability at least $e^{-Cn^{1/3}}$ to occur for an appropriate choice of constant C, concluding the proof.
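The cost of the middle stage can be illustrated numerically in a simplified model. The sketch below is an assumption-laden illustration rather than the argument of [54]: it treats the pipe as the segment {0, ..., L} of the integers, with every pipe vertex having exactly two neighbours, and computes the probability that simple random walk stays inside the segment for a given number of steps. The function names are hypothetical. In this model the survival probability decays like $\cos(\pi/(L+2))^{\text{steps}} \approx \exp(-\pi^2\,\text{steps}/(2L^2))$, so a pipe of length of order $n^{1/3}$ and of order n steps give a cost of order $\exp(-Cn^{1/3})$.

```python
from math import cos, log, pi

def stay_probability(length, steps, start):
    """Probability that simple random walk on the integers started at `start`
    remains inside {0, ..., length} during its first `steps` steps."""
    probs = [0.0] * (length + 1)
    probs[start] = 1.0
    for _ in range(steps):
        nxt = [0.0] * (length + 1)
        for i, q in enumerate(probs):
            if i > 0:
                nxt[i - 1] += q / 2  # step left, still inside the segment
            if i < length:
                nxt[i + 1] += q / 2  # step right, still inside the segment
        probs = nxt  # mass that stepped outside the segment is discarded
    return sum(probs)

def decay_rate(length):
    """Exponential decay rate per step predicted by the top eigenvalue cos(pi / (length + 2))."""
    return -log(cos(pi / (length + 2)))

length, steps = 10, 2000
empirical_rate = -log(stay_probability(length, steps, length // 2)) / steps
```

Comparing `empirical_rate` with `decay_rate(length)` shows the two agree up to a correction of order 1/steps coming from the constant prefactor.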

Extension to quasi-transitive graphs
Recall that a graph is said to be quasi-transitive if the action of its automorphism group on its vertex set has at most finitely many orbits. Theorems concerning percolation on transitive graphs can almost always be generalized to the quasi-transitive case, and ours are no exception. Once this theorem is established, extensions of Corollaries 1.3-1.5 to the quasi-transitive case follow by essentially the same proofs as in the transitive case.
Let G = (V, E) be a connected, locally finite, nonamenable, quasi-transitive graph, and let v be a vertex of G. The only place in the proof of Theorem 1.1 where transitivity is used is in the proof of Proposition 2.1. Thus, to prove Theorem 4.1, it suffices to prove that for every $p_c < p < 1$ there exist positive constants $t_0$, c and C such that $D_{p,n}\, e^{tE_v} \geq c\,\mathbf{E}_{p,n}\bigl[E_v e^{tE_v}\bigr] - C$ (4.1) for every n ≥ 1 and $0 < t \leq t_0$. Indeed, once this is established the proof may be concluded in a very similar way to the transitive case.
Let V 1 , . . . , V k be the orbits of the action of Aut(G) on V . The proof of Lemma 2.5 generalizes to show that for every p c < p < 1, there exists an automorphism-invariant percolation process η such that η has furcations almost surely and η is stochastically dominated by Bernoulli-p bond percolation on G. Thus, the proof of Proposition 2.4 yields that there exists 1 ≤ i 0 ≤ k and a positive constant c p such that for every finite set S ⊆ V . The proof of Proposition 2.1 then yields that for every t > 0 and n ≥ 1. In order to deduce an inequality of the form (4.1) from (4.3), it suffices to prove the following lemma.
Lemma 4.2. Let G = (V, E) be a connected, locally finite, quasi-transitive graph with maximum degree M, and let $V_1, \ldots, V_k$ be the orbits of the action of Aut(G) on V. Then for every 0 < p < 1 there exist positive constants α = α(p, k, M) and $t_0 = t_0(p, k, M)$ such that for every n ≥ 1 and 1 ≤ i ≤ k.
Proof of Lemma 4.2. It suffices to prove the claim for i = 1. Fix 0 < p < 1. Since G is connected, we may reorder $V_2, V_3, \ldots, V_k$ so that $V_j$ is adjacent to $\bigcup_{\ell=1}^{j-1} V_\ell$ for every 2 ≤ j ≤ k. Since Aut(G) acts transitively on $V_\ell$ for each $1 \leq \ell \leq k$, every vertex in $V_m$ must be adjacent to at least one vertex in $\bigcup_{\ell=1}^{m-1} V_\ell$ for each 2 ≤ m ≤ k. Let M be the maximum degree of G, and write $K^m_v = K_v \cap \bigcup_{\ell=1}^{m} V_\ell$ for each 1 ≤ m ≤ k. We first claim that there exists a positive constant c = c(p, M) such that for each 2 ≤ m ≤ k we have that for every s > 0. The first inequality is trivial. For the second, consider exploring $K_v$ one edge at a time. On the event under consideration, we must query some number $N \geq 2Ms/(p + 2M)$ edges with one endpoint in $V_m$ and the other in $\bigcup_{\ell=1}^{m-1} V_\ell$ and find that at most pN/2 of these edges are open. Since p/2 < p, the claimed inequality (4.5) follows by standard large deviation estimates for Binomial random variables.
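One standard estimate of the kind invoked here is the multiplicative Chernoff bound, which gives $\mathbf{P}(\mathrm{Bin}(N, p) \leq pN/2) \leq \exp(-pN/8)$. This is not necessarily the exact estimate the authors had in mind, and the constant 1/8 is specific to this particular bound. The following Python sketch, with hypothetical function names, compares the exact lower tail of a Binomial random variable with that bound.

```python
from math import comb, exp, floor

def binomial_lower_tail(N, p, threshold):
    """Exact probability that a Binomial(N, p) random variable is at most `threshold`."""
    k_max = floor(threshold)
    return sum(comb(N, k) * p**k * (1 - p) ** (N - k) for k in range(0, k_max + 1))

def chernoff_lower_bound(N, p):
    """Multiplicative Chernoff bound exp(-p N / 8) on P(Bin(N, p) <= p N / 2)."""
    return exp(-p * N / 8)

# Compare the exact tail with the bound on a few illustrative parameter choices.
comparisons = {
    (N, p): (binomial_lower_tail(N, p, p * N / 2), chernoff_lower_bound(N, p))
    for N in (10, 50, 200)
    for p in (0.2, 0.5, 0.8)
}
```

In each entry of `comparisons` the first number is the exact tail and the second is the bound; both decay exponentially in N for fixed p, which is all that the proof of (4.5) requires.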
To conclude, we note that we trivially have $|K^k_v| = |K_v| \geq E_v/M$, and apply (4.5) and a union bound to deduce that
planar map with locally finite dual $M^\dagger$, which is also quasi-transitive and nonamenable. Every edge e of M has a corresponding dual edge $e^\dagger$. (See e.g. [6, Section 2.1] for detailed definitions.) If ω is Bernoulli-p bond percolation on M then the configuration $\omega^\dagger$ defined by $\omega^\dagger(e^\dagger) = 1 - \omega(e)$ is distributed as Bernoulli-(1 − p) bond percolation on $M^\dagger$, and it follows from the results of [13] that $p_u(M) = 1 - p_c(M^\dagger)$, so that $p_u$-percolation on M is dual to $p_c$-percolation on $M^\dagger$. Suppose that e is an edge of M with endpoints x and y, and let f and g be the two faces incident to e. Let ω be Bernoulli-p bond percolation on M and let $K_1$ and $K_2$ be the clusters of f and g in $\omega^\dagger \setminus \{e^\dagger\}$. If e is closed and x is connected to y in ω then at least one of $K_1$ or $K_2$ must be finite. Moreover, if x is connected to y and $K_i$ is finite then the collection of edges other than e whose duals are in the boundary of $K_i$ must contain an open path from x to y, so that $d_{\mathrm{int}}(x, y) \leq \min\{|E(K_1)|, |E(K_2)|\}$.
Thus, it follows from sharpness of the phase transition (1.1) and Theorem 4.1 that for every $p \in (0, p_u) \cup (p_u, 1)$ there exists a constant $c_p > 0$ such that $\mathbf{P}_p\bigl(d_{\mathrm{int}}(x, y) \geq n \mid x \leftrightarrow y\bigr) \leq e^{-c_p n}$ (5.1) for every pair of adjacent vertices x and y of M and every n ≥ 1. (With a little more work one can also obtain similar bounds for non-neighbouring pairs of vertices.) This contrasts with the behaviour at $p_u$, where it is proven in [37, Theorem 6.1] that, under the same assumptions, P pu d int (x, y) ≥ n | x ↔ y n −1 for some neighbouring pairs of vertices.

Conjectures for amenable graphs
We end with some conjectures concerning supercritical percolation on general transitive graphs that could potentially unify our results with those from the Euclidean case [2,18,40,48]. We expect that these conjectures have been around for some time as folklore. We recall that if G is a connected, locally finite graph, the isoperimetric profile of G is defined to be
Conjecture 5.1. Let G = (V, E) be an infinite, connected, locally finite, transitive graph. Then for every $p_c < p < 1$ there exist positive constants $c_p$ and $C_p$ such that $\exp\bigl[-C_p \psi(C_p n)\bigr] \leq \mathbf{P}_p(n \leq |K_v| < \infty) \leq \exp\bigl[-c_p \psi(c_p n)\bigr]$ for every n ≥ 1.
Note that the nonamenable case of Conjecture 5.1 is exactly Theorem 1.1, and that the case G = Z d is covered by (1.3). In the case that G is a Cayley graph of a one-ended finitely presented group, the lower bound of Conjecture 5.1 is implicit in the proof of [8,Theorem 3]. For the same class of graphs and for p sufficiently close to 1 one can also establish the upper bound of Conjecture 5.1 via a Peierls argument.
Let us also draw attention to the following much weaker question, which also remains open.
Conjecture 5.2. Let G = (V, E) be an infinite, connected, locally finite, transitive graph. Then the truncated susceptibility $\mathbf{E}_p\bigl[|K_v| \mathbb{1}(|K_v| < \infty)\bigr]$ is finite for every $p_c < p \leq 1$.
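To see why this statement is much weaker than Conjecture 5.1, recall the tail-sum identity $\mathbf{E}_p[|K_v| \mathbb{1}(|K_v| < \infty)] = \sum_{n \geq 1} \mathbf{P}_p(n \leq |K_v| < \infty)$, so that any summable upper bound on the tail already gives finiteness. The Python sketch below sums two hypothetical tail bounds: a pure exponential, as in the nonamenable case, and a stretched exponential of the form $\exp[-c(cn)^{(d-1)/d}]$, which is the shape the conjectured upper bound would take under the assumption that $\psi(n)$ is of order $n^{(d-1)/d}$, as for $\mathbb{Z}^d$. The function names and the constants c = 0.5 and d = 3 are illustrative only.

```python
from math import exp

def truncated_susceptibility_bound(tail_bound, n_max):
    """Partial sum of tail bounds: sum over 1 <= n <= n_max of tail_bound(n)."""
    return sum(tail_bound(n) for n in range(1, n_max + 1))

def exponential_tail(c):
    """Tail bound exp(-c n), the shape given by an exponential tail estimate."""
    return lambda n: exp(-c * n)

def stretched_tail(c, d):
    """Tail bound exp(-c (c n)^((d-1)/d)), assuming psi(n) = n^((d-1)/d)."""
    return lambda n: exp(-c * (c * n) ** ((d - 1) / d))

# Partial sums stabilise quickly in both cases, reflecting summability.
geometric = truncated_susceptibility_bound(exponential_tail(0.5), 200)
stretched_1000 = truncated_susceptibility_bound(stretched_tail(0.5, 3), 1000)
stretched_4000 = truncated_susceptibility_bound(stretched_tail(0.5, 3), 4000)
```

The exponential case reproduces the geometric series value $1/(e^{c} - 1)$, and in the stretched case the partial sums up to 1000 and up to 4000 are numerically almost identical.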
In contrast to the volume, we expect that the radius of a finite supercritical cluster has an exponential tail on every transitive graph. Similar statements should also hold for the intrinsic radius.
Conjecture 5.3. Let G = (V, E) be an infinite, connected, locally finite, transitive graph. Then for every $p_c < p < 1$ there exists a positive constant $c_p$ such that $\mathbf{P}_p(K_v \leftrightarrow \partial B(v, n), |K_v| < \infty) \leq e^{-c_p n}$ for every n ≥ 1.
The nonamenable case of this conjecture is implied by Theorem 1.1, while the case $G = \mathbb{Z}^d$ was proven by Chayes, Chayes, Grimmett, Kesten, and Schonmann [18].
Finally, we conjecture that the infinite clusters in supercritical percolation on G always inherit an anchored version of any isoperimetric inequality satisfied by G. |∂ ω E K| ψ(G, c p n) : K ⊆ K v connected, v ∈ K, and n ≤ |K| < ∞ > 0 almost surely on the event that K v is infinite.
The nonamenable case of this conjecture is implied by Corollary 1.4, while the case G = Z d was established by Pete [48]. Similarly to above, for Cayley graphs of one-ended, finitely presented groups and p close to 1 the conjecture may be established via a Peierls argument, see [48,Theorem 1.5]. We remark that Conjectures 5.1 and 5.4 are also closely related to Pete's exponential cluster repulsion conjecture [49,Conjecture 12.32].