Exponential growth of ponds in invasion percolation on regular trees

In invasion percolation, the edges of successively maximal weight (the outlets) divide the invasion cluster into a chain of ponds separated by outlets. On the regular tree, the ponds are shown to grow exponentially, with law of large numbers, central limit theorem and large deviation results. The tail asymptotics for a fixed pond are also studied and are shown to be related to the asymptotics of a critical percolation cluster, with a logarithmic correction.

1 Introduction and definitions

1.1 The model: invasion percolation, ponds and outlets
Consider an infinite connected locally finite graph G, with a distinguished vertex o, the root. On each edge, place an independent Uniform[0, 1] edge weight; we may assume that a.s. the weights are all distinct. Starting from the subgraph C_0 = {o}, inductively grow a sequence of subgraphs C_i according to the following deterministic rule: at step i, examine the edges on the boundary of C_{i−1}, and form C_i by adjoining to C_{i−1} the boundary edge whose weight is minimal. The infinite union C = ∪_i C_i is called the invasion cluster.
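The greedy rule above can be implemented directly with a priority queue. The following is an illustrative sketch (not taken from the paper; the degree σ = 2 and the step count are arbitrary choices), exploiting the fact that on a tree the boundary edges can be generated lazily:

```python
import heapq
import random

def invade(sigma=2, steps=10_000, seed=0):
    """Invasion percolation on the forward regular tree of degree sigma.

    Vertices are generated lazily: when a vertex joins the cluster, its
    sigma outgoing edges get independent Uniform[0,1] weights and join the
    boundary heap.  Each step invades the boundary edge of minimal weight.
    Returns the list of invaded edge weights xi_1, ..., xi_steps.
    """
    rng = random.Random(seed)
    boundary = []          # min-heap of (weight, child_vertex)
    next_vertex = 1        # vertex 0 is the root o

    def add_boundary_edges():
        nonlocal next_vertex
        for _ in range(sigma):
            heapq.heappush(boundary, (rng.random(), next_vertex))
            next_vertex += 1

    add_boundary_edges()            # edges out of the root
    invaded = []
    for _ in range(steps):
        weight, _vertex = heapq.heappop(boundary)
        invaded.append(weight)
        add_boundary_edges()        # edges out of the newly invaded vertex
    return invaded

weights = invade()
```

For σ = 2 we have p_c = 1/2, and in a run like this the late invaded weights are mostly below or only slightly above p_c, in line with the self-organized criticality discussed below.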
Invasion percolation is closely related to ordinary (Bernoulli) percolation. For instance ([4] for G = Z^d; later greatly generalized by [11]), if G is quasi-transitive, then for any p > p_c, only a finite number of edges of weight greater than p are ever invaded. On the other hand, it is elementary to show that for any p < p_c, infinitely many edges of weight greater than p must be invaded. In other words, writing ξ_i for the weight of the i-th invaded edge, we have lim sup_{i→∞} ξ_i = p_c. So invasion percolation produces an infinite cluster using edges of weight only slightly above critical, even though there may be no infinite cluster at criticality. The fact that invasion percolation is linked to the critical value p_c, even though its definition contains no parameter, makes it an example of self-organized criticality.
Under mild hypotheses (see section 3.1), the invasion cluster has a natural decomposition into ponds and outlets. Let e_1 ∈ C be the edge whose weight Q_1 is the largest ever invaded. For n > 1, e_n is the edge in C whose weight Q_n is the highest among edges invaded after e_{n−1}. We call e_n the n-th outlet and Q_n the corresponding outlet weight. Write V̂_n for the step at which e_n was invaded, with V̂_0 = 0. The n-th pond is the subgraph of edges invaded at steps i ∈ (V̂_{n−1}, V̂_n]. Suppose an edge e, with weight p, is first examined at step i ∈ (V̂_{n−1}, V̂_n]. (That is, i is the first step at which e is on the boundary of C_{i−1}.) Then we have the following dichotomy:
• e will be invaded as part of the n-th pond, if p ≤ Q_n; or
• e will never be invaded, if p > Q_n.
This implies that the ponds are connected subgraphs and touch each other only at the outlets. Moreover, the outlets are pivotal in the sense that any infinite non-intersecting path in C starting at o must pass through every outlet. Consequently C is decomposed as an infinite chain of ponds, connected at the outlets.
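The dichotomy makes the pond decomposition a function of the invaded weight sequence alone: step i is an outlet precisely when ξ_i exceeds every later invaded weight. A small sketch (helper names are illustrative; on a finite prefix of the sequence, the last few detected "outlets" are only provisional, since a larger weight may still appear later):

```python
def outlets_and_ponds(weights):
    """Given invaded edge weights xi_1, xi_2, ..., return the outlet step
    indices and the ponds (as lists of step indices).

    Step i is an outlet iff xi_i is strictly larger than every later
    weight: exactly the running-maximum records seen from the right.
    """
    outlet_indices = []
    suffix_max = float("-inf")
    for i in range(len(weights) - 1, -1, -1):   # scan from the end
        if weights[i] > suffix_max:
            outlet_indices.append(i)
            suffix_max = weights[i]
    outlet_indices.reverse()

    # Pond n = steps after outlet n-1, up to and including outlet n.
    ponds, start = [], 0
    for i in outlet_indices:
        ponds.append(list(range(start, i + 1)))
        start = i + 1
    return outlet_indices, ponds

xi = [0.30, 0.90, 0.20, 0.70, 0.40, 0.50, 0.10]
outs, ponds = outlets_and_ponds(xi)
```

On this toy sequence the outlets are steps 1, 3, 5, 6 with strictly decreasing outlet weights 0.90 > 0.70 > 0.50 > 0.10, and each pond ends at its outlet, matching the chain-of-ponds picture.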
In this paper we take G to be a regular tree and analyze the asymptotic behaviour of the ponds, the outlets and the outlet weights. This problem can be approached in two directions: by considering the ponds as a sequence and studying the growth properties of that sequence; or by considering a fixed pond and finding its asymptotics. We will see that the sequence of ponds grows exponentially, with exact exponential constants. For a fixed pond, its asymptotics correspond to those of ordinary percolation with a logarithmic correction.
These computations are based on representing C in terms of the outlet weights Q_n, as in [1]. Conditional on (Q_n)_{n=0}^∞, each pond is an independent percolation cluster with parameter related to Q_n. In particular, the fluctuations of the ponds are a combination of fluctuations in Q_n and the additional randomness.
Surprisingly, in all but the large deviation sense, the asymptotic behaviour of the ponds is controlled by the outlet weights alone: the randomness remaining after conditioning on (Q_n)_{n=0}^∞ disappears in the limit, and the fluctuations are attributable solely to fluctuations of Q_n.

Known results
The terminology of ponds and outlets comes from the following description (see [17]) of invasion percolation. Consider a random landscape where the edge weights represent the heights of channels between locations. Pour water into the landscape at o; then as more and more water is added, it will flow into neighbouring areas according to the invasion percolation mechanism. The water level at o, and throughout the first pond, will rise until it reaches the height of the first outlet. Once water flows over an outlet, however, it will flow into a new pond where the water will only ever rise to a lower height. Note that the water level in the n th pond is the height (edge weight) of the n th outlet.
The edge weights may also be interpreted as energy barriers for a random walker exploring a random energy landscape: see [15]. If the energy levels are highly separated, then (with high probability and until some large time horizon) the walker will visit the ponds in order, spending a long time in each pond before crossing the next outlet. In this interpretation the growth rate of the ponds determines the effect of entropy on this analysis. See the extended discussion in [15].
Invasion percolation is also related to the incipient infinite cluster (IIC), at least in the cases G = Z² ([12]) and G a regular tree: see, e.g., [12], [1] and [5]. For a cylinder event E, the law of the IIC can be defined by conditioning critical percolation on the cluster of o reaching distance n and letting n → ∞, or by other limiting procedures, many of which can be proved to be equivalent to each other. Both the invasion cluster and the IIC consist of an infinite cluster that is "almost critical", in view of (1.2) or (1.3) respectively. For G = Z² ([12]) and G a regular tree ([1]), the IIC can be defined in terms of the invasion cluster: if X_k denotes a vertex chosen uniformly from among the invaded vertices within distance k of o, and τ_{X_k} E denotes the translation of E when o is sent to X_k, then P(τ_{X_k} E) converges to the IIC law of E as k → ∞. Surprisingly, despite this local equivalence, the invasion cluster and the IIC are globally different: they are mutually singular and, at least on the regular tree, have different scaling limits, although they have the same scaling exponents. The regular tree case, first considered in [16], was studied in great detail in [1]. Any infinite non-intersecting path from o must pass through every outlet; on a tree, this implies that there is a backbone, the unique infinite non-intersecting path from o. In [1] a description of the invasion cluster was given in terms of the forward maximal weight process, the outlet weights indexed by height along the backbone (see section 3.2). This parametrization in terms of the external geometry of the tree allowed the calculation of natural geometric quantities, such as the number of invaded edges within a ball. In the following, we will see that when information about the heights is discarded, the process of edge weights takes an even simpler form.
The detailed structural information in [1] was used in [2] to identify the scaling limit of the invasion cluster (again for the regular tree). Since the invasion cluster is a tree with a single infinite end, it can be encoded by its Łukasiewicz path or by its height and contour functions. Within each pond, the scaling limit of the Łukasiewicz path is computed, and the different ponds are stitched together to provide the full scaling limit.
The two-dimensional case was also studied in a series of papers by van den Berg, Damron, Járai, Sapozhnikov and Vágvölgyi ([17], [6] and [5]). There they study, among other things, the probability that the n-th pond extends a distance k from o, for n fixed. For n = 1 this is asymptotically of the same order as the probability that a critical percolation cluster extends a distance k, and for n > 1 there is a correction factor (log k)^{n−1}. Furthermore an exponential growth bound for the ponds is given. The present work was motivated in part by the question of what the corresponding results would be for the tree. Quite remarkably, they are essentially the same, suggesting that a more general phenomenon may be involved.
In the results and proofs that follow, we shall see that the sequence of outlet weights plays a dominant role. Indeed, all of the results in Theorems 2.1-2.4 are proved first for Q_n, then extended to other pond quantities using conditional tail estimates. Consequently, all of the results can be understood as consequences of the growth mechanism for the sequence Q_n. On the regular tree, we are able to give an exact description of the sequence Q_n in terms of a sum of independent random variables (see section 3.3). In more general graphs, this representation cannot be expected to hold exactly. However, the similarities between the pond behaviours, even on graphs as different as the tree and Z², suggest that an approximate analogue may hold. Such a result would provide a unifying explanation for both the exponential pond growth and the asymptotics of a fixed pond, even on potentially quite general graphs.

Summary of notation
We will primarily consider the case where G is the forward regular tree of degree σ: namely, the tree in which the root o has degree σ and every other vertex has degree σ + 1. The weight of the i-th invaded edge is ξ_i. The n-th outlet is e_n and its edge weight is Q_n. We may naturally consider e_n to be an oriented edge e_n = (v̄_n, v_n), where v̄_n is invaded before v_n. The step at which e_n is invaded is denoted V̂_n and the (graph) distance from o to v_n is L̂_n. Setting V̂_0 = L̂_0 = 0 for convenience, we write V_n = V̂_n − V̂_{n−1} and L_n = L̂_n − L̂_{n−1}.
There is a natural geometric interpretation of L_n as the length of the part of the backbone in the n-th pond, and V_n as the volume (number of edges) of the n-th pond. In particular V̂_n is the volume of the union of the first n ponds.
R_n is the length of the longest upward-pointing path in the n-th pond, and R'_n is the length of the longest upward-pointing path in the union of the first n ponds.
We shall later work with the quantity δ_n; for its definition, see (3.8).
We note the elementary relations V̂_n = V_1 + · · · + V_n and L̂_n = L_1 + · · · + L_n. Probability laws will generically be denoted P. For non-zero functions f(x) and g(x), we write f(x) ∼ g(x) to mean lim f(x)/g(x) = 1; the point at which the limit is to be taken will usually be clear from the context.
Theorems 2.1-2.2 state that a rescaled process Z converges as N → ∞, with respect to the metric of uniform convergence on compact intervals of t. These theorems say that each component of Z satisfies a law of large numbers and functional central limit theorem, with the same limiting Brownian motion for each component.
Theorem 2.2 shows that the logarithmic scaling in Theorem 2.1 cannot be replaced by a linear rescaling such as e^n(Q_n − p_c). Indeed, log((Q_n − p_c)^{−1}) has characteristic additive fluctuations of order ±√n, and therefore Q_n − p_c fluctuates by a multiplicative factor of the form e^{±√n}. As n → ∞ this factor concentrates at 0 and ∞, causing tightness to fail.
(2.4) states that (1/n) log L_n, (1/n) log R_n and (1/(2n)) log V_n satisfy large deviation principles on [0, ∞) with rate n and rate function ψ, defined in (2.5). It will be shown that ψ arises as the solution of a variational problem.

Tail behaviour of a pond
Theorems 2.1-2.3 describe the growth of the ponds as a sequence. We now consider a fixed pond and study its tail behaviour.
Theorem 2.4 gives, for n fixed, the tail asymptotics of the n-th pond as ε → 0+ and k → ∞. Using well-known asymptotics for critical percolation on the tree, we may rewrite (2.7)-(2.10) in closed form. Working in the case G = Z², [6] considers R̃_n, the maximum distance from o to a point in the first n ponds, which is essentially R'_n in our notation. [6, Theorem 1.5] gives the tail asymptotics (2.17) for R̃_n, and notes as a corollary the relation (2.18), where i↔ denotes a percolation connection in which up to i edges are allowed to be vacant ("percolation with defects"). (2.18) suggests the somewhat plausible heuristic of approximating the union of the first n ponds by the set of vertices reachable by critical percolation with at most n − 1 defects. Indeed, the proof of (2.17) uses in part a comparison to percolation with defects. By contrast, on the tree the following result holds: Theorem 2.5, which gives, for fixed n and k → ∞, the asymptotics (2.19). The dramatic contrast between (2.18) and (2.19) can be explained in terms of the number of large clusters in a box. In Z², a box of side length S generically has only one cluster of diameter of order S.
On the tree, by contrast, there are many large clusters. Indeed, a cluster of size N has of order N edges on its outer boundary, any one of which might be adjacent to another large cluster, independently of every other edge. Percolation with defects allows the best boundary edge to be chosen, whereas invasion percolation is unlikely to invade the optimal edge.

Outline of the paper
Section 3.1 states a Markov property for the outlet weights that is valid for any graph. From section 3.2 onwards, we specialize to the case where G is a regular tree. In section 3.2 we recall results from [1] that describe the structure of the invasion cluster conditional on the outlet weights Q n . Section 3.3 analyzes the Markov transition mechanism of section 3.1 and proves the results of Theorems 2.1-2.3 for Q n . Section 4.1 states conditional tail bounds for L n , R n and V n given Q n . The rest of sections 4-6 use these tail bounds to prove Theorems 2.1-2.4. The proof of the bounds in section 4.1 is given in section 7. Finally, section 8 gives the proof of Theorem 2.5.

3 Markov structure of invasion percolation
In section 3.1 we give sufficient conditions for the existence of ponds and outlets, and state a Markov property for the ponds, outlets and outlet weights. Section 3.2 summarizes some previous results from [1] concerning the structure of the invasion cluster. Finally in section 3.3 we analyze the resulting Markov chain in the special case where G is a regular tree and prove the results of Theorems 2.1-2.3 for Q n .

3.1 General graphs: ponds, outlets and outlet weights
The representation of an invasion cluster in terms of ponds and outlets is guaranteed to be valid under two assumptions; the second must hold, since otherwise there would exist somewhere an infinite percolation cluster at level p_c. We can then make the inductive definition of the outlets, and the assumptions imply that the maxima in this definition are attained. Condition on Q_n, e_n, and the union Ĉ_n of the first n ponds. We may naturally consider e_n to be an oriented edge e_n = (v̄_n, v_n), where the vertex v̄_n was invaded before v_n. The condition that e_n is an outlet, with weight Q_n, implies that there must exist an infinite path of edges with weights at most Q_n, starting from v_n and remaining in G\Ĉ_n. However, the law of the edge weights in G\Ĉ_n is not otherwise affected by Q_n, e_n, Ĉ_n. In particular we have the conditional law (3.6) on the event {q' ≤ Q_n}. In (3.6) we can replace G\Ĉ_n by the connected component of G\Ĉ_n that contains e_n.

3.2 Geometric structure of the invasion cluster: the regular tree case
In [1, section 3.1], the same outlet weights are studied, parametrized by height rather than by pond: W_k is defined to be the maximum invaded edge weight above the vertex at height k along the backbone.
A key point in the analysis in [1] is the observation that (W_k)_{k=0}^∞ is itself a Markov process. W_k is constant for long stretches, corresponding to heights k in the same pond, and the jumps of W_k occur when an outlet is encountered. The relation between the two processes is given by (3.7): the (Q_n)_{n=0}^∞ are the successive distinct values of (W_k)_{k=0}^∞, and L_n = L̂_n − L̂_{n−1} is the length of time the Markov chain W_k spends in state Q_n before jumping to state Q_{n+1}. In particular, L_n is geometric conditional on Q_n, with some parameter depending only on Q_n. As we will refer to it often, we define δ_n to be that geometric parameter: see (3.8). A further analysis (see [1, section 2.1]) shows that the off-backbone part of the n-th pond is a sub-critical Bernoulli percolation cluster with a parameter depending on Q_n, independently in each pond. We summarize these results in the following theorem: given (Q_n)_{n=0}^∞, the ponds are conditionally independent for different n, and δ_n is a continuous, strictly increasing function of Q_n satisfying δ_n ∼ σ(Q_n − p_c) as Q_n ↓ p_c (3.10). It is not at first apparent that the geometric parameter δ_n in (3.8) is the same quantity that appears in (3.9), and indeed [1] has two different notations for the two quantities: see [1, equations (3.1) and (2.14)]. Combining equations (2.3), (2.5), (2.14) and (3.1) of [1] shows that they are equivalent. For σ = 2 explicit formulas for these parameters can be found. However, all the information needed for our purposes is contained in the asymptotic relation (3.10).
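To illustrate the sort of explicit formula available for σ = 2, the survival probability can be computed by standard branching-process algebra (a sketch, under the assumption that θ(p) denotes the survival probability of Bernoulli-p percolation from the root of the forward binary tree):

```latex
% For sigma = 2 the offspring law is Binomial(2, p), with generating
% function f(z) = (1 - p + pz)^2.  The extinction probability
% q = 1 - \theta(p) is the smallest root of q = f(q), i.e.
\[
  1 - \theta(p) = \bigl(1 - p\,\theta(p)\bigr)^{2}
  \qquad\Longrightarrow\qquad
  \theta(p) = \frac{2p - 1}{p^{2}} \quad \bigl(p > p_c = \tfrac12\bigr),
\]
% so that \theta(p) \sim 8\,(p - p_c) as p \downarrow p_c, exhibiting the
% linear behaviour near criticality that underlies (3.10).
```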

3.3 The outlet weight process
The representation (3.6) simplifies dramatically when G is a regular tree. Then the connected component of G\Ĉ_n containing e_n is isomorphic to G, with e_n corresponding to the root. Therefore the dependence of Q_{n+1} on e_n and Ĉ_n is eliminated, and we have the following result, stated as (3.12)-(3.13), the latter for p_c < q' < q.
Equations (3.12) and (3.13) say that, conditional on Q n , Q n+1 is chosen from the same distribution, conditioned to be smaller. In terms of (W k ) ∞ k=0 , (3.13) describes the jumps of W k when they occur, and indeed the transition mechanism (3.13) is implicit in [1].
Since θ is a continuous function, it is simpler to consider θ(Q_n): conditional on θ(Q_n) = u, we have P(θ(Q_{n+1}) ≤ u' | θ(Q_n) = u) = u'/u for 0 < u' < u. But this is equivalent to multiplying θ(Q_n) by an independent Uniform[0, 1] variable. Noting further that the negative logarithm of a Uniform[0, 1] variable is exponential with mean 1, we have proved the following proposition.
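Iterating the multiplication by independent uniforms makes −log θ(Q_n) a sum of n i.i.d. Exp(1) variables, which is the source of both the law of large numbers and the ±√n fluctuations discussed earlier. A quick numerical check (illustrative only; θ(Q_0) is initialized to 1 for simplicity):

```python
import math
import random
import statistics

# -log of a Uniform[0,1] variable is Exp(1), so iterating
# theta(Q_{n+1}) = U_{n+1} * theta(Q_n) makes -log theta(Q_n) a sum of
# n i.i.d. mean-1 exponentials.
rng = random.Random(1)
n, trials = 2000, 200
samples = []
for _ in range(trials):
    log_theta_inv = sum(-math.log(rng.random()) for _ in range(n))
    samples.append(log_theta_inv / n)     # (1/n) log theta(Q_n)^{-1}

mean = statistics.fmean(samples)          # LLN: should be close to 1
sd = statistics.stdev(samples)            # CLT: should be of order n^{-1/2}
```

The sample mean sits near 1 and the spread scales like n^{−1/2}, matching the law of large numbers and central limit theorem proved for Q_n below.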
The increments −log(θ(Q_{n+1})/θ(Q_n)) are thus i.i.d. exponential variables of mean 1, jointly for all n.

4 Law of large numbers and central limit theorem

4.1 Tail bounds for pond statistics
Theorem 3.1 expressed L_n, R_n and V_n as random variables whose parameters are given in terms of Q_n. Their fluctuations are therefore a combination of fluctuations arising from Q_n and additional randomness. The following proposition gives bounds on the additional randomness.
Recall that δ_n is a certain function of Q_n with δ_n ∼ σ(Q_n − p_c): see Theorem 3.1.
Proposition 4.1. There exist positive constants C, c, s_0, γ_L, γ_R, γ_V such that L_n, R_n and V_n satisfy the conditional bounds (4.1)-(4.3) for all n and all S, s > 0, and (4.4)-(4.6) for s ≤ s_0.
The proofs of (4.1)-(4.6), which involve random walk and branching process estimates, are deferred to section 7.

4.2 A uniform convergence lemma
Because Theorem 2.2 involves weak convergence of several processes to the same joint limit, it will be convenient to use Skorohod's representation theorem and almost sure convergence. The following uniform convergence result will be used to extend convergence from one set of coupled random variables (δ n,N ) to another (X n,N ): see section 4.3.
Lemma 4.2. Suppose {X_{n,N}}_{n,N∈N} and {δ_{n,N}}_{n,N∈N} are positive random variables such that δ_{n,N} is decreasing in n for each fixed N, and, for positive constants a, β and C,
P(δ^a_{n,N} X_{n,N} > S) ≤ C S^{−β} (4.7)
P(δ^a_{n,N} X_{n,N} < s) ≤ C s^β (4.8)
for all S and s. Define X̄_{n,N} = max_{1≤i≤n} X_{i,N} (4.9). Then for any T > 0 and α > 0, w.p. 1,
lim_{N→∞} max_{1≤n≤NT} log(δ^a_{n,N} X̄_{n,N}) / N^α = 0. (4.10)
Proof. Let ε > 0 be given. For a fixed N, (4.7) implies
P( max_{1≤n≤NT} log(δ^a_{n,N} X̄_{n,N}) / N^α > ε ) ≤ Σ_{1≤i≤n≤NT} P( δ^a_{i,N} X_{i,N} > e^{N^α ε} ) ≤ (NT)² C e^{−β N^α ε},
where we used δ_{i,N} ≥ δ_{n,N} for i ≤ n in the third inequality. But then, since the right side is summable in N, we obtain a.s.
lim sup_{N→∞} max_{1≤n≤NT} log(δ^a_{n,N} X̄_{n,N}) / N^α ≤ ε.
Similarly, (4.8) implies
P( max_{1≤n≤NT} log(δ^a_{n,N} X_{n,N}) / N^α < −ε ) ≤ Σ_{1≤n≤NT} P( δ^a_{n,N} X_{n,N} < e^{−N^α ε} ) ≤ NT C e^{−β N^α ε} (4.13)
so that
lim inf_{N→∞} max_{1≤n≤NT} log(δ^a_{n,N} X_{n,N}) / N^α ≥ −ε (4.14)
a.s. Since ε > 0 was arbitrary and X_{n,N} ≤ X̄_{n,N}, (4.10) follows.

4.3 Proof of Theorems 2.1-2.2
The conclusions about Q_n are contained in Corollary 3.4. The other conclusions will follow from Lemma 4.2. From Corollary 3.4, we may apply Skorohod's representation theorem to produce realizations of the ponds for each N ∈ N, coupled so that the convergence holds a.s. as N → ∞. The relation (4.16) then shows that (1/a) log X_n will satisfy a central limit theorem as well, and the same holds for X̄_n. We will successively take X_n to be each of the pond quantities; the bounds (4.7)-(4.8) follow immediately from the bounds in Proposition 4.1. This proves Theorem 2.2 for L_n and V_n. For R_n, the quantity R̄_n is not the one that appears in Theorem 2.2, but the bound R_n ≤ R'_n ≤ R̄_n implies the result for R'_n as well. The lemma also implies the law of large numbers results (2.2), by taking T = 1 and using the same ponds for every N.

5 Large deviations: proof of Theorem 2.3
In this section we present a proof of the large deviation results in Theorem 2.3. As in section 4, we prove a generic result using a variable X_n and tail estimates. Theorem 2.3 then follows immediately using Corollary 3.4 and Proposition 4.1.
Note that Proposition 5.1 uses the full strength of the bounds in Proposition 4.1.
Proposition 5.1. Suppose that δ_n and X_n are positive random variables such that, for positive constants a, β, c, C, γ, s_0, the tail bounds (5.1)-(5.2) hold for all S and s, and
P( δ^a_n X_n < s | δ_n ) ≥ c s^{1/a} (5.3)
on the event {δ^a_n < γs}, for s ≤ s_0. Suppose also that (1/n) log δ^{−1}_n satisfies a large deviation principle with rate n on [0, ∞) with rate function ϕ such that ϕ(1) = 0, ϕ is continuous on (0, ∞), and ϕ is decreasing on (0, 1] and increasing on [1, ∞). Then (1/(an)) log X_n satisfies a large deviation principle with rate n on [0, ∞) with rate function ψ given by (5.4).
Proof. It is easy to check that ψ is continuous, decreasing on [0, 1] and increasing on [1, ∞), ψ(1) = 0, and ψ(u) = ϕ(u) for u ≥ 1. So it suffices to show the upper-tail estimate (5.5) for u > 0 and the lower-tail estimate (5.6) for 0 < u < 1. For (5.5), let ε > 0. Then the chain of inequalities (5.7) holds, where we used (5.1) with S = e^{anε}. The last term in (5.7) is superexponentially small, so (5.7) and the large deviation principle for (1/n) log δ^{−1}_n imply the required lim sup bound; a matching lower bound holds on the other hand. Since ϕ is continuous and ε > 0 was arbitrary, this proves (5.5). For (5.6), let u ∈ (0, 1) be given and choose v ∈ (u, 1), ε ∈ (0, u). Then for n sufficiently large we obtain the estimate (5.12). Here we used (5.3) with s = e^{−an(v−u)}. Note that if n is large enough then s ≤ s_0, and the condition δ^a_n < γs follows from v − ε < (1/n) log δ^{−1}_n. Therefore, since ϕ is decreasing on (0, 1], (5.12) is proved for u < v < 1. However, since ϕ is continuous and the function −ϕ(v) − av is decreasing in v for v ≥ 1, (5.12) holds for all v ≥ u. So take the supremum over v ≥ u to obtain
lim inf_{n→∞} (1/n) log P( (1/(an)) log X_n < u ) ≥ −ψ(u).

6 Tail asymptotics
In this section we prove the fixed-pond asymptotics from Theorem 2.4.
To extend (6.8) to X̄_n, assume inductively that P(X̄_n > k) ≍ (log k)^{n−1}/k^{1/a}. (The case n = 1 is already proved since X̄_1 = X_1.) The bound P(X̄_{n+1} > k) ≥ P(X_{n+1} > k) is immediate, and we can estimate P(X̄_{n+1} > k) ≤ P(X_{n+1} > k) + P(X̄_n > k), where the second term is of lower order by the inductive hypothesis. This completes the induction.
Proof of (2.9)-(2.10). These relations follow immediately from (6.5) and Lemma 6.1; the bounds (6.6)-(6.7) are immediate consequences of Proposition 4.1. As in section 4.3, the asymptotics for R'_n follow from the asymptotics for R̄_n and the bound R_n ≤ R'_n ≤ R̄_n.
7 Pond bounds: proof of Proposition 4.1

In this section we prove the tail bounds (4.1)-(4.6). Since the laws of L_n, R_n and V_n do not depend on n except through the value of δ_n, we will omit the subscript n in this section. For convenient reference we recall the structure of the bounds: (7.1)-(7.2) hold for all S and s, and (7.3) holds on the event {δ^a < γs}, for s ≤ s_0. We have a = 1 for X = L and X = R, and a = 2 for X = V. In (7.3) it is necessary to assume δ^a < γs. This is due only to a discretization effect: for any N-valued random variable X, we have P(δ^a X < s | δ) = 0 whenever s ≤ δ^a. Indeed, the bounds (4.4)-(4.6) can be proved with γ = 1, although this is not necessary for our purposes.
Note that, by proper choice of C and s_0, we can assume that S is large in (7.1) and s is small in (7.2) and (7.3). Since we only consider N-valued random variables X, we can assume δ is small in (7.2), say δ < 1/2 (otherwise take s < (1/2)^a without loss of generality). Moreover, Theorem 3.1 shows that L, R and V are all stochastically decreasing in δ. Consequently it suffices to prove (7.1) for δ small, say δ < 1/2. Finally the constraint δ^a < γs_0 makes δ small in (7.3) also.

7.2 The pond radius R: proof of (4.2) and (4.5)

Conditional on δ and L, R is the maximum height of a percolation cluster with parameter p_c(1 − δ) started from a path of length L.
We have R ≥ L, so (7.2) follows immediately from the corresponding bound for L. R is stochastically dominated by (7.10), where R̃_i is the maximum height of a branching process with offspring distribution Binomial(σ, p_c(1 − δ)) started from a single vertex, independently for each i. Define
a_k = P( R̃_i > k | δ ) (7.11)
for k > 0. Thus a_k is the probability that the branching process survives to generation k + 1. By comparison with a critical branching process,
a_k ≤ C_1/k (7.12)
for some constant C_1. On the other hand, a_k satisfies the recursion
a_{k+1} = 1 − f(1 − a_k), (7.13)
where f(z) is the generating function for the offspring distribution of the branching process. (This is a reformulation of the well-known recursion for the extinction probability.) In particular this yields the estimates (7.14)-(7.15). Combining (7.12) with (7.14)-(7.15), and taking k = ⌈S/2δ⌉ ≥ S/2δ and j = ⌊S/δ⌋ − ⌈S/2δ⌉ ≥ S/2δ − 2, we obtain (7.16). Using this estimate we can compute the lower bound (7.17)-(7.18): the conditional probability is at least c_{13} s, provided δ is sufficiently small compared to s, i.e., provided γ_R is small enough.
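The recursion (7.13) is easy to explore numerically. At criticality (p = p_c = 1/σ) it exhibits the a_k ≍ 1/k decay used in the comparison step; Kolmogorov's asymptotic a_k ∼ 2/(Var(offspring)·k) gives k·a_k → 4 for σ = 2. A sketch (the function name is illustrative):

```python
def survival_probs(sigma, p, kmax):
    """a_k = P(branching process with Binomial(sigma, p) offspring survives
    past generation k), via the recursion a_{k+1} = 1 - f(1 - a_k), where
    f(z) = (1 - p + p z)^sigma is the offspring generating function."""
    f = lambda z: (1 - p + p * z) ** sigma
    a = [1.0]                      # a_0 = 1: the root itself is present
    for _ in range(kmax):
        a.append(1.0 - f(1.0 - a[-1]))
    return a

# Critical case sigma = 2, p = p_c = 1/2: a_k decays like 1/k, and
# k * a_k approaches 2 / Var(offspring) = 4 from below.
a = survival_probs(sigma=2, p=0.5, kmax=1000)
```

For σ = 2 the recursion reduces to a_{k+1} = a_k − a_k²/4, from which one can check by induction that a_k < 4/k for all k ≥ 1, in line with (7.12).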

7.3 The pond volume V: proof of (4.3) and (4.6)

From Theorem 3.1, conditional on δ and L, V is the number of edges in a percolation cluster with parameter p_c(1 − δ), started from a path of length L and with no edges emerging from the top of the path. We can express V in terms of the return time of a random walk as follows.
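One standard form of such a representation (a sketch of the classical exploration identity, not necessarily the exact encoding used here, and counting the vertices of a single branching tree rather than the edges of a cluster grown from a path) is: let ξ_i be the offspring count of the i-th explored vertex and S_k = ξ_1 + · · · + ξ_k − k; the total size of the tree is the first k with S_k = −1. For subcritical mean offspring m = σp < 1 the expected size is 1/(1 − m), which the sketch checks:

```python
import random

def cluster_size(sigma, p, rng):
    """Total size of a Binomial(sigma, p) branching-process tree via the
    exploration walk: with xi_i the offspring count of the i-th explored
    vertex, S_k = xi_1 + ... + xi_k - k, the size is the first k at which
    S_k = -1 (the walk's first return below its starting level)."""
    k, walk = 0, 0
    while walk > -1:
        offspring = sum(rng.random() < p for _ in range(sigma))
        walk += offspring - 1      # each step moves by (offspring - 1)
        k += 1
    return k

rng = random.Random(7)
sizes = [cluster_size(2, 0.45, rng) for _ in range(20_000)]
mean = sum(sizes) / len(sizes)
# Subcritical: mean offspring m = sigma * p = 0.9, so E[size] = 1/(1 - m) = 10.
```

Since each step of the walk decreases it by at most 1, the walk hits −1 exactly, which is what makes the size/return-time identity work.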

8 Percolation with defects
In this section we prove Theorem 2.5, the tail asymptotics for connections by percolation with n − 1 defects. As a worst-case estimate we may assume that v_1, . . . , v_N are still at distance k from ∂B(k), so that by independence we obtain a product bound with constants c_1, c_2. If we set N = k2^{−n+1} then the second factor is of order 1, and the lower bound is proved. For the upper bound, use a slightly stronger form of the same estimate, which proves the result (the first term is an error term if n ≥ 2 and combines with the second if n = 1).