Multi-Particle Diffusion Limited Aggregation

We consider a stochastic aggregation model on $\mathbb{Z}^d$. Start with particles located at the vertices of the lattice, initially distributed according to the product Bernoulli measure with parameter $\mu$. In addition, there is an aggregate, which initially consists of the origin. Non-aggregated particles move as continuous time simple random walks obeying the exclusion rule, whereas aggregated particles do not move. The aggregate grows by attaching particles to its surface whenever a particle attempts to jump onto it. This evolution is referred to as multi-particle diffusion limited aggregation. Our main result states that if in dimension $d>1$ the initial density of particles is large enough, then with positive probability the aggregate has linearly growing arms, i.e. if $F(t)$ denotes the point of the aggregate furthest away from the origin at time $t>0$, then there exists a constant $c>0$ so that $|F(t)|>ct$ for all $t$ large enough. The key conceptual element of our analysis is the introduction and study of a new growth process. Consider a first passage percolation process, called type 1, starting from the origin. Whenever type 1 is about to occupy a new vertex, with positive probability, instead of doing so, it gives rise to another first passage percolation process, called type 2, which starts to spread from that vertex. Each vertex gets occupied only by the process that arrives at it first. This process may have three phases: an extinction phase, where type 1 eventually gets surrounded by type 2 clusters, a coexistence phase, where infinite clusters of both types emerge, and a strong survival phase, where type 1 produces an infinite cluster that successfully surrounds all type 2 clusters. Understanding the behavior of this process in its various phases is of mathematical interest in its own right. We establish the existence of a strong survival phase, and use this to show our main result.


Introduction
In this work we consider one of the classical aggregation processes, introduced in [25] (see also [29]) with the goal of providing an example of "a simple and tractable" mathematical model of dendritic growth, on which theoretical and mathematical concepts and tools could be designed and tested. Almost four decades later we still encounter tremendous mathematical challenges in studying its geometric and dynamic properties, and in understanding the driving mechanism behind the formation of fractal-like structures.
Multi-particle diffusion limited aggregation (MDLA). We consider the following stochastic aggregation model on $\mathbb{Z}^d$, $d \geq 1$. Start with an infinite collection of particles located at the vertices of the lattice, with at most one particle per vertex, and initially distributed according to the product Bernoulli measure with parameter $\mu \in (0, 1)$. In addition, there is an aggregate, which initially consists of only one special particle, placed at the origin. The system evolves in continuous time. Non-aggregated particles move as simple symmetric random walks obeying the exclusion rule, i.e. particles jump at rate 1 to a uniformly random neighbor, but if the chosen neighbor already contains another non-aggregated particle, the jump is suppressed and the particle waits for its next attempt to jump. Aggregated particles do not move. Whenever a non-aggregated particle attempts to jump onto a vertex occupied by the aggregate, the jump of this particle is suppressed, the particle becomes part of the aggregate, and it never moves from that moment onwards. Thus the aggregate grows by attaching particles to its surface whenever a particle attempts to jump onto it. This evolution will be referred to as multi-particle diffusion limited aggregation, MDLA; examples for different values of $\mu$ are shown in Fig. 1.
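The dynamics just described can be explored numerically. The following is a minimal sketch, not the construction used in the paper: it assumes $d = 2$, a finite box $\{-n, \dots, n\}^2$ in which jumps leaving the box are suppressed, and a per-particle jump rate `jump_rate`. The function name `simulate_mdla` and all of its parameters are hypothetical choices made for illustration.

```python
import random

def simulate_mdla(n, mu, t_max, jump_rate=1.0, seed=0):
    """Sketch of MDLA on the box {-n..n}^2 (assumption: jumps out of the box are suppressed)."""
    rng = random.Random(seed)
    steps = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    aggregate = {(0, 0)}  # the aggregate starts as the origin only
    # Product Bernoulli(mu) initial condition, at most one particle per vertex.
    particles = [(x, y) for x in range(-n, n + 1) for y in range(-n, n + 1)
                 if (x, y) != (0, 0) and rng.random() < mu]
    occupied = set(particles)
    t = 0.0
    while particles:
        # Superposition of independent exponential clocks, one per free particle.
        t += rng.expovariate(jump_rate * len(particles))
        if t > t_max:
            break
        i = rng.randrange(len(particles))
        x, y = particles[i]
        dx, dy = rng.choice(steps)
        target = (x + dx, y + dy)
        if target in aggregate:
            # Jump onto the aggregate: suppressed, and the particle aggregates in place.
            aggregate.add((x, y))
            occupied.discard((x, y))
            particles[i] = particles[-1]
            particles.pop()
        elif target in occupied or max(abs(target[0]), abs(target[1])) > n:
            # Exclusion rule (or box boundary): the jump is suppressed.
            continue
        else:
            occupied.discard((x, y))
            occupied.add(target)
            particles[i] = target
    return aggregate, set(particles)

aggregate, free = simulate_mdla(n=20, mu=0.8, t_max=100.0)
print(len(aggregate), len(free))
```

The finite box is only a convenience; boundary effects make this unsuitable for quantitative statements about the infinite-volume process.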
Characterizing the behavior of MDLA is a wide open and challenging problem. Existing mathematical results are limited to one dimension [8,20]. In this case, it is known that the aggregate has almost surely sublinear growth for any $\mu \in (0, 1)$, having size of order $\sqrt{t}$ by time $t$. The main obstacle preventing the aggregate from growing with positive speed is that, from the point of view of the front (i.e., the rightmost point) of the aggregate, the density of particles decreases, since the aggregate grows by forming a region of density 1, which is larger than the initial density of particles.
In dimensions two and higher, MDLA seems to exhibit a much richer and more complex behavior, which changes substantially depending on the value of $\mu$; refer to Fig. 1. For small values of $\mu$, the low density of particles affects the rate of growth of the aggregate, as it needs to wait for particles that move diffusively to find their way to its boundary. This suggests that the growth of the aggregate at small scales is governed by the evolution of the "local" harmonic measure of its boundary. This causes the aggregate to grow by protruding long fractal-like arms, similar to dendrites. On the other hand, when $\mu$ is large enough, the situation appears to be different. In this case, the aggregate is immersed in a very dense cloud of particles, and its growth follows random, dynamically evolving geodesics that deviate from occasional regions without particles. Instead of showing a dendritic type of growth, the aggregate forms a dense region characterized by the appearance of a limiting shape, similar to a first passage percolation process [9,24]. These two regimes do not seem to be mutually exclusive. For intermediate values of $\mu$, the process shows the appearance of a limiting shape at macroscopic scales, while zooming in to mesoscopic and microscopic scales reveals rather complex ramified structures similar to dendritic growth, as in Fig. 2.
The main result of this paper is to establish that, unlike in dimension one, in dimensions $d \geq 2$ MDLA has a phase of linear growth. We actually prove a stronger result, showing that the aggregate grows with positive speed in all directions. For $t \geq 0$, let $A_t \subset \mathbb{Z}^d$ be the set of vertices occupied by the aggregate by time $t$.
Remark 1.2 It is not difficult to see that the aggregate cannot grow faster than linearly. That is, there exists a constant $c_3$ such that the probability that $A_t \subset B(0, c_3 t)$ for all $t \geq t_0$ goes to 1 with $t_0$. This is the case because the growth of the aggregate is slower than the growth of a first passage percolation with exponential passage times of rate 1, which has linear growth; see, for example, [1,21].
We believe that Theorem 1.1 holds in a stronger form, with $\mathbb{P}\big(\bar{A}_t \supset B(0, c_1 t) \text{ for all } t \geq t_0\big)$ going to 1 with $t_0$. However, with positive probability, it happens that there is no particle within a large distance of the origin at time 0. In this case, in the initial stages of the process, the aggregate will grow very slowly, as if in a system with a small density of particles. We expect that the density of particles near the boundary of the aggregate will become close to $\mu$ after particles have moved for a large enough time, allowing the aggregate to start having positive speed of growth. However, particles perform a non-equilibrium dynamics due to their interaction with the aggregate, and the behavior and the effect of this initial stage of low density is not yet understood mathematically. This is related to the problem of describing the behavior of MDLA for small values of $\mu$, which is still far from reach, and raises the challenging question of whether the aggregate has positive speed of growth for any $\mu > 0$. Even at a heuristic level, it is not at all clear what the behavior of the aggregate should be for small $\mu$. On the one hand, the low density of particles causes the aggregate to grow slowly, since particles move diffusively until they are aggregated. On the other hand, since the aggregate is immersed in a cloud of particles, this effect of slow growth could be restricted to small scales only, because at very large scales the aggregate could simultaneously grow in many different directions.
We now describe the ideas of the proof of Theorem 1.1. For this we use the language of the dual representation of the exclusion process, where vertices without particles are regarded as hosting another type of particles, called holes, which perform among themselves simple symmetric random walks obeying the exclusion rule. When $\mu$ is large enough, at the initial stages of the process, the aggregate grows without encountering any hole. The growth of the aggregate is then equivalent to a first passage percolation process with independent exponential passage times. This stage is well understood: it is known that first passage percolation not only grows with positive speed, but also has a limiting shape [9,24]. However, at some moment, the aggregate will start encountering holes. We can regard the aggregate as a solid wall for holes, as they can neither jump onto the aggregate nor be attached to the aggregate. In one dimension, holes end up accumulating at the boundary of the aggregate, and this is enough to prevent positive speed of growth. The situation is different in dimensions $d \geq 2$, since the aggregate is able to deviate from any hole it encounters, advancing through the particles that lie in the neighborhood of the hole until it completely surrounds and entraps the hole. The problem is that the aggregate will find regions of holes of arbitrarily large sizes, which require a long time for the aggregate to go around them. When $\mu$ is large enough, the regions of holes will typically be well spaced out, giving sufficient room for the aggregate to grow in between the holes. One needs to show that the delays caused by deviation from holes are not large enough to prevent positive speed. A challenge is that, as holes cannot jump onto the aggregate, their motion acquires a drift whenever they are neighboring the aggregate. Hence, holes move according to a non-equilibrium dynamics, which creates difficulties in controlling the location of the holes.
In order to overcome this problem, we introduce a new process to model the interplay between the aggregate and holes.
First passage percolation in a hostile environment (FPPHE). This is a two-type first passage percolation process. At any time $t \geq 0$, let $\eta_1(t)$ and $\eta_2(t)$ denote the sets of vertices of $\mathbb{Z}^d$ occupied by type 1 and type 2, respectively. We start with $\eta_1(0)$ containing only the origin of $\mathbb{Z}^d$, and $\eta_2(0)$ being a random set obtained by selecting each vertex of $\mathbb{Z}^d \setminus \{0\}$ with probability $p \in (0, 1)$, independently of one another. Both type 1 and type 2 are growing processes; i.e., for any times $t < t'$ we have $\eta_1(t) \subseteq \eta_1(t')$ and $\eta_2(t) \subseteq \eta_2(t')$. Type 1 spreads from time 0 throughout $\mathbb{Z}^d$ at rate 1. Type 2 does not spread at first, and we refer to the elements of $\eta_2(0)$ as type 2 seeds. Whenever the type 1 process attempts to occupy a vertex hosting a type 2 seed, the occupation is suppressed and that type 2 seed is activated and starts to spread throughout $\mathbb{Z}^d$ at rate $\lambda \in (0, 1)$. The other type 2 seeds remain inactive until type 1 or already activated type 2 attempts to occupy their location. A vertex of the lattice is only occupied by the type that arrives at it first, so $\eta_1(t)$ and $\eta_2(t)$ are disjoint sets for all $t$; this causes the two types to compete with each other for space. Note that type 2 spreads at a smaller rate than type 1, but type 2 starts with a positive density of seeds while type 1 starts only from a single location. We show that it is possible to analyze MDLA via a coupling with this process by showing that a hole that has been in contact with the aggregate will remain contained inside a cluster of type 2. Since the aggregate grows in the same way as type 1, establishing that the type 1 process grows with positive speed allows us to show that MDLA has linear growth. Besides its application to studying MDLA, we believe that FPPHE is an interesting process to analyze in its own right, as it shows fascinating different phases of behavior depending on the choice of $p$ and $\lambda$. An illustration of the behavior of this process is shown in Fig. 3.
Fig. 3 FPPHE with $\lambda = 0.7$ and $p = 0.030$, $0.029$ and $0.027$, respectively. Colors represent different epochs of the growth of $\eta_1$, while the thin curve at the boundary represents the boundary between $\eta_2$ and vertices that are either unoccupied or host an inactive type 2 seed. The whole white region within this boundary is occupied by activated type 2.
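The FPPHE dynamics lend themselves to a simple event-driven simulation. The sketch below is an illustration under stated assumptions rather than the authors' simulation code: it works in $d = 2$ on a finite box $\{-n, \dots, n\}^2$, uses independent exponential passage times of rate 1 for type 1 and rate $\lambda$ for type 2, and the name `simulate_fpphe` and its parameters are hypothetical.

```python
import heapq
import random

def simulate_fpphe(n, p, lam, seed=0):
    """Sketch of FPPHE on {-n..n}^2 with exponential passage times (finite box is an assumption)."""
    rng = random.Random(seed)
    box = [(x, y) for x in range(-n, n + 1) for y in range(-n, n + 1)]
    # Type 2 seeds: each vertex other than the origin, independently with probability p.
    seeds = {v for v in box if v != (0, 0) and rng.random() < p}
    rate = {1: 1.0, 2: lam}
    occupant = {}                 # vertex -> type that occupies it
    heap = [(0.0, 1, (0, 0))]     # (arrival time, arriving type, vertex)
    while heap:
        t, kind, v = heapq.heappop(heap)
        if v in occupant:
            continue              # only the first arrival counts
        if v in seeds:
            kind = 2              # occupation suppressed; the seed is activated instead
        occupant[v] = kind
        x, y = v
        for w in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if max(abs(w[0]), abs(w[1])) <= n and w not in occupant:
                # Fresh exponential passage time for the edge (v, w).
                heapq.heappush(heap, (t + rng.expovariate(rate[kind]), kind, w))
    return occupant, seeds

occupant, seeds = simulate_fpphe(n=30, p=0.03, lam=0.7)
print(sum(1 for k in occupant.values() if k == 1), "vertices of type 1")
```

Because the box is finite, every vertex is eventually occupied; questions of survival or extinction concern the infinite lattice and can only be hinted at by such runs.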
The first phase is the extinction phase, where type 1 stops growing in finite time with probability 1. This occurs, for example, when $p > 1 - p_c$, with $p_c = p_c(d)$ being the critical probability for independent site percolation on $\mathbb{Z}^d$. In this case, with probability 1, the origin is contained in a finite cluster of vertices not occupied by type 2 seeds, and hence type 1 will eventually stop growing. This extinction phase for type 1 also arises when $p \leq 1 - p_c$ but $\lambda$ is close enough to 1 so that type 2 clusters grow quickly enough to surround type 1 and confine it to a finite set.
We show in this work that another phase exists, called the strong survival phase, which is characterized by a positive probability of appearance of an infinite cluster of type 1, while type 2 is confined to form only finite clusters. Note that type 1 cannot form an infinite cluster with probability 1, since with positive probability all neighbors of the origin contain seeds of type 2. Unlike the extinction phase, whose existence is quite trivial to show, the existence of a strong survival phase for some value of $p$ and $\lambda$ is far from obvious. Here we not only establish the existence of this phase, but we show that such a phase exists for any $\lambda < 1$ provided that $p$ is small enough. We also show that type 1 has positive speed of growth. For any $t$, we define $\bar{\eta}_1(t)$ as the set of vertices of $\mathbb{Z}^d$ that are not contained in the infinite component of $\mathbb{Z}^d \setminus \eta_1(t)$; this set comprises $\eta_1(t)$ and all vertices of $\mathbb{Z}^d \setminus \eta_1(t)$ that are separated from infinity by $\eta_1(t)$. The theorem below will be proved in Sect. 5, as a consequence of a more general theorem, Theorem 5.1.

Theorem 1.3
For any $\lambda < 1$, there exists a value $p_0 \in (0, 1)$ such that, for all $p \in (0, p_0)$, there are positive constants $c_1 = c_1(p, d)$ and $c_2 = c_2(p, d)$ for which
$$\mathbb{P}\big(\bar{\eta}_1(t) \supset B(0, c_1 t) \text{ for all } t \geq 0\big) > c_2.$$
There is a third possible regime, which we call the coexistence phase, and which is characterized by type 1 and type 2 simultaneously forming infinite clusters with positive probability. (We regard the coexistence phase as a regime of weak survival for type 1, in the sense that type 1 survives but leaves enough space for type 2 to produce at least one infinite cluster.) Whether this regime actually occurs for some value of $p$ and $\lambda$ is an open problem, and even simulations do not seem to give good evidence of the existence of this regime. For example, in the rightmost picture of Fig. 3, we observe a regime where $\eta_1$ survives, while $\eta_2$ seems to produce only finite clusters, although quite large ones. This also seems to be the behavior in the central picture of Fig. 3, though it is not as clear whether each cluster of $\eta_2$ will eventually be confined to a finite set. However, the behavior in the leftmost picture of Fig. 3 is not at all clear. The cluster of $\eta_1$ has survived until the simulation was stopped, but produced a very thin set. It is not clear whether coexistence will happen in this situation, whether $\eta_1$ will eventually stop growing, or even whether after a much longer time the "arms" produced by $\eta_1$ will eventually find one another, constraining $\eta_2$ to produce only finite clusters.
Establishing whether a coexistence phase exists for some value of $p$ and $\lambda$ is an interesting open problem. We can establish that a coexistence phase occurs in a particular example of FPPHE, where type 1 and type 2 have deterministic passage times, with all randomness coming from the locations of the seeds. In this example, all three phases occur. We discuss this in Sect. 2. See also the recent paper [6], where coexistence is established when $\mathbb{Z}^d$ is replaced by a hyperbolic non-amenable graph.
Historical remarks and related works. MDLA belongs to a class of models, first introduced in the physics and chemistry literature (see [15] and references therein), and later in the mathematics literature as well, with the goal of studying geometric and dynamic properties of static formations produced by aggregating randomly moving colloidal particles. Some numerically established quantities, such as the fractal dimension, showed striking similarities between clusters produced by aggregating particles and clusters produced in other growth processes of an entirely different nature, such as dielectric breakdown cascades and Laplacian growth models (in particular, the Hele-Shaw cell [26]). These similarities were further investigated through the introduction of the Hastings-Levitov growth model [13], which is represented as a sequence of conformal mappings. Nonetheless, it is still debated in the physics literature whether some of these models belong to the same universality class or not [4].
In the mathematics literature, the diffusion limited aggregation model (DLA), introduced in [14] following the introduction of MDLA in [25], became a paradigm object of study among aggregation models driven by diffusive particles. However, progress on understanding DLA and MDLA mathematically has been relatively modest. The main results known about DLA are bounds on its rate of growth, derived by Kesten [16,17] (see also [2]), but several variants have been introduced and studied [3,5,7,10,23,27]. Regarding MDLA, it has been rigorously studied only in the one-dimensional case [8,19,20], for which sublinear growth has been proved for all densities $\mu \in (0, 1)$ in [20].
Structure of the paper. We start in Sect. 2 with a discussion of an example of FPPHE where the passage times are deterministic, and show that this process has a coexistence phase. Then, in preparation for the proof of strong survival of FPPHE (Theorem 1.3), we state in Sect. 3 existing results on first passage percolation, and discuss in Sect. 4 a result due to Häggström and Pemantle regarding non-coexistence of a two-type first passage percolation process. This result plays a fundamental role in our analysis of FPPHE. Then, in Sect. 5, we state and prove Theorem 5.1, which is a more general version of Theorem 1.3. In Sect. 6 we relate FPPHE to MDLA, giving the proof of Theorem 1.1.

Example of coexistence in FPPHE
In this section we consider FPPHE with deterministic passage times. That is, whenever type 1 (resp., type 2) occupies a vertex $x \in \mathbb{Z}^d$, then after time 1 (resp., $1/\lambda$) type 1 (resp., type 2) will occupy all unoccupied neighbors of $x$. If both type 1 and type 2 try to occupy a vertex at the same time, we choose one of them uniformly at random. Recall that we denote by $\eta_i(t)$, $i \in \{1, 2\}$, the set of vertices occupied by type $i$ by time $t$. For simplicity, we restrict this discussion to dimension $d = 2$. Figure 4 shows a simulation of this process for $p = 0.2$ and different values of $\lambda$. In all three pictures in Fig. 4, $\eta_1$ seems to survive. However, note that the leftmost picture in Fig. 4 differs from the other two, since $\eta_2$ also seems to give rise to an infinite cluster, characterizing a regime of coexistence. See Fig. 5 for more details.
Our theorem below establishes the existence of a coexistence phase. We note that here the phase of survival for $\eta_1$ is stronger than that shown in Theorem 1.3. Here we show that for some small enough $p$, $\eta_1$ survives for any $\lambda < 1$. The actual value of $\lambda$ plays a role only in determining whether coexistence happens. In the theorem below and its proof, a directed path in $\mathbb{Z}^d$ is defined to be a path whose jumps are only along the positive direction of the coordinates.
Theorem 2.1 For any $\lambda \in (0, 1)$ and any $p \in (0, 1 - p^{\mathrm{dir}}_c)$, where $p^{\mathrm{dir}}_c = p^{\mathrm{dir}}_c(\mathbb{Z}^d)$ denotes the critical probability for directed site percolation in $\mathbb{Z}^d$, we have
$$\mathbb{P}\big(\eta_1 \text{ produces an infinite cluster}\big) > 0. \tag{1}$$
Furthermore, for any $\lambda \in (0, 1)$, there exists a positive $p_0 < 1 - p^{\mathrm{dir}}_c$ such that for any $p \in (p_0, 1 - p^{\mathrm{dir}}_c)$ we have
$$\mathbb{P}\big(\eta_1 \text{ and } \eta_2 \text{ both produce infinite clusters}\big) > 0. \tag{2}$$
Proof Consider a directed percolation process on $\mathbb{Z}^d$ where a vertex is declared to be open if it is not in $\eta_2(0)$; otherwise the vertex is closed. For any $t \geq 0$, let $C_t$ be the set of vertices reachable from the origin by a directed path of length at most $t$ in which all vertices are open. We will prove (1) by showing that $\eta_1(t) \supseteq C_t$ for all $t \geq 0$. Let $x \in \eta_2(0)$ be the vertex of $\eta_2(0)$ that is the closest to the origin in $\ell^1$ norm. Clearly, for any time $t < \|x\|_1$, we have that $\eta_1(t)$ has not yet interacted with $\eta_2(0)$, giving that $\eta_1(t) = \{y \in \mathbb{Z}^d : \|y\|_1 \leq t\} = C_t$. See Fig. 6a for an illustration. Then, at time $\|x\|_1$, $\eta_1$ tries to occupy all vertices at distance $\|x\|_1$ from the origin, leading to the configuration in Fig. 6b and activating the seed $x$ of $\eta_2(0)$, illustrated in pink in the picture. Since $\eta_1$ is faster than $\eta_2$, $\eta_1$ is able to "go around" $x$, traversing the same path as in a directed percolation process. This leads to the configuration in Fig. 6c. Note that the same behavior occurs when $\eta_1$ finds a larger set of consecutive seeds of $\eta_2$ at the same $\ell^1$ distance from the origin. For example, see what happens with the three red seeds in Fig. 6d-f. In this case, a directed percolation process does not reach any vertex inside the red triangle in Fig. 6f, as those vertices are shaded by the three red seeds. Since $\eta_2$ is slower than $\eta_1$, the cluster of $\eta_2$ that starts to grow when the three red seeds are activated cannot occupy any vertex outside of the red triangle.
A different situation occurs when $\eta_1$ finds a vertex of $\eta_2(0)$ on an axis, as with the yellow vertex of Fig. 6c. Note that, in a directed percolation process, all vertices below the yellow seed will not be reachable from the origin. In our two-type process, something similar occurs, but only for a finite number of steps. When $\eta_1$ activates the yellow seed at $x \in \eta_2(0)$, $\eta_1$ cannot immediately go around $x$ as explained above. For $\lambda$ close enough to 1, $\eta_2$ occupies the next vertex on the axis before $\eta_1$ can go around $x$. This continues for some steps, with $\eta_2$ being able to grow along the axis; see Fig. 6d, e. However, at each step $\eta_1$ will be $1 - \lambda$ faster than $\eta_2$. This will accumulate for roughly $\frac{1}{1-\lambda}$ steps, when $\eta_1$ will finally be able to go around $\eta_2$, as in Fig. 6f. This happens unless $\eta_2(0)$ happens to have a seed at a vertex neighboring one of the vertices on the axis occupied by the growth of $\eta_2$. This is illustrated by the green vertices of Fig. 6f-h. When the first green vertex off the axis is activated, $\eta_1$ will not be able to occupy the vertex to the right of the green vertex, and will encounter the next green seed before it can go around the first green seed found on the axis. The crucial fact to observe is that the clusters of $\eta_2$ that start to grow after the activation of each green seed can only occupy vertices located to the right of the seeds, and at the same vertical coordinate. This is a subset of the vertices that are shaded by the green seeds in a directed percolation process. Therefore, (1) follows since the vertices occupied by $\eta_2$ are a subset of the following set: take the union of all triangles obtained from sets of consecutive seeds away from the axis (as with the pink, red and blue seeds in Fig. 6), and take the union of semi-lines starting at seeds located on the axis or at seeds neighboring semi-lines starting from seeds of smaller $\ell^1$ distance to the origin (as with the yellow and green seeds in Fig. 6). This set is exactly the set of vertices not reached by a directed path from the origin.
Figure: Evolution of FPPHE with deterministic passage times and $\lambda = 0.59$. Black vertices represent $\eta_1$ and white vertices represent unoccupied vertices. All other colors represent clusters of $\eta_2$.
Now we turn to (2). First notice that, from the first part, we have that $\eta_1(t) \supseteq C_t$ for all $p$ and $\lambda$. Since $C_t$ does not depend on $\lambda$, once we fix $p \in (0, 1 - p^{\mathrm{dir}}_c)$, we can take $\lambda$ as close to 1 as we want, and $\eta_1$ will still produce an infinite component. Now we consider one of the axes, for example the one containing the green vertices in Fig. 6. Let $(x, 0)$ be the first vertex occupied by $\eta_2$ on that axis. For each integer $k$, we define $X_k$ as the smallest non-negative integer such that $(k, X_k)$ will be occupied by $\eta_1$. Similarly, $Y_k$ is the smallest non-negative integer such that $(k, -Y_k)$ will be occupied by $\eta_1$. Now we analyze the evolution of $X_k$; that of $Y_k$ is analogous. Assume that $X_1, X_2, \ldots, X_{k-1} = 0$. Then, with probability at least $p$ we have that $X_{k+1} \geq 1$. When this happens, $\eta_1$ will need to take at least $\frac{1}{1-\lambda}$ steps before being able to occupy the axis again. However, for each $s \geq 2$, the probability that $X_{k+s} > X_{k+1}$ is at least $p$. This gives that the probability that the random variable $X$ reaches a value above 1 before going back to zero is at least $1 - (1 - p)^{\frac{1}{1-\lambda}}$. Once we have fixed $p$, by setting $\lambda$ close enough to 1 we can make this probability very close to 1. This gives that $X_k$ has a drift upwards. Since the downwards jumps of $X_k$ are of size at most 1, this implies that at some time $X_k$ will depart from 0 and will never return to it. A similar behavior happens for $Y_k$, establishing (2).
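To get a feeling for how quickly the bound $1 - (1 - p)^{\frac{1}{1-\lambda}}$ from the proof approaches 1 as $\lambda \to 1$, one can simply evaluate it. The helper name `escape_bound` below is a hypothetical label, and the value $p = 0.2$ is borrowed from the simulation of Fig. 4 purely for illustration.

```python
def escape_bound(p, lam):
    """Lower bound 1 - (1 - p)^(1 / (1 - lam)) appearing in the proof of (2)."""
    return 1.0 - (1.0 - p) ** (1.0 / (1.0 - lam))

# For fixed p, the bound increases towards 1 as lam approaches 1.
for lam in (0.5, 0.9, 0.99):
    print(lam, escape_bound(0.2, lam))
```

For $p = 0.2$ the bound is roughly $0.36$ at $\lambda = 0.5$ and roughly $0.89$ at $\lambda = 0.9$, consistent with the claim that it can be made arbitrarily close to 1 by taking $\lambda$ close to 1.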

Preliminaries on first passage percolation
Let $\upsilon$ be a probability distribution on $(0, \infty)$ with no atoms and with a finite exponential moment. Consider a first passage percolation process $\{\xi(t)\}_t$, which starts from the origin and spreads according to $\upsilon$. More precisely, for each pair of neighboring vertices $x, y \in \mathbb{Z}^d$, let $\zeta_{x,y}$ be an independent random variable of distribution $\upsilon$. The value $\zeta_{x,y}$ is regarded as the time that $\xi$ needs to spread across the edge $(x, y)$. Note that $\zeta$ defines a random metric on $\mathbb{Z}^d$, where the distance between two vertices is the length of the shortest path between them, and the length of a path is the sum of the values of $\zeta$ over the edges of the path. Hence, given any initial configuration $\xi(0) \subset \mathbb{Z}^d$, the set $\xi(t)$ comprises all vertices of $\mathbb{Z}^d$ that are within distance $t$ from $\xi(0)$ according to the metric $\zeta$. We assume throughout the paper that $d \geq 2$.
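Since $\xi(t)$ is a ball in the random metric $\zeta$, it can be computed with Dijkstra's algorithm. The following sketch assumes $d = 2$ and restricts paths to a finite box $\{-n, \dots, n\}^2$, which may overestimate distances near the boundary; the names `fpp_distances`, `ball` and `sample` are hypothetical.

```python
import heapq
import random

def fpp_distances(n, sample, seed=0):
    """Distances from the origin in the metric zeta, restricted to the box {-n..n}^2."""
    rng = random.Random(seed)
    zeta = {}  # undirected edge -> passage time, sampled once on first use

    def passage(u, v):
        e = (u, v) if u <= v else (v, u)
        if e not in zeta:
            zeta[e] = sample(rng)
        return zeta[e]

    dist = {}
    heap = [(0.0, (0, 0))]
    while heap:
        d, v = heapq.heappop(heap)
        if v in dist:
            continue  # already finalized with a smaller distance
        dist[v] = d
        x, y = v
        for w in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if max(abs(w[0]), abs(w[1])) <= n and w not in dist:
                heapq.heappush(heap, (d + passage(v, w), w))
    return dist, zeta

def ball(dist, t):
    """The set xi(t): vertices within distance t of the origin."""
    return {v for v, d in dist.items() if d <= t}

# Exponential passage times of rate 1, as in the measure written Q in the text.
dist, zeta = fpp_distances(15, lambda rng: rng.expovariate(1.0))
print(len(ball(dist, 3.0)))
```

Any sampler for a distribution on $(0, \infty)$ can be passed as `sample`; the exponential choice here is only one instance of $\upsilon$.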
For $X \subset \mathbb{Z}^d$, let $Q^{\upsilon}_X$ be the probability measure induced by the process $\xi$ when $\xi(0) = X$. When the value of $\xi(0)$ is not important, we will simply write $Q^{\upsilon}$, and when $\upsilon$ is the exponential distribution of rate 1, we write $Q$.
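For each $t \geq 0$, consider the subset of $\mathbb{R}^d$ that is obtained by adding a unit cube centered at each point of $\xi(t)$. A celebrated theorem of Richardson [24], extended by Cox and Durrett [9], establishes that this set, rescaled by $1/t$, converges as $t \to \infty$ to a deterministic set $B_\upsilon \subset \mathbb{R}^d$. Such a result is now widely referred to as a shape theorem, and $B_\upsilon$ is referred to as the limit set. The set $B_\upsilon$ defines a norm $|\cdot|_\upsilon$ on $\mathbb{R}^d$ via $|x|_\upsilon = \inf\{r \in \mathbb{R}_+ : x \in r B_\upsilon\}$. We abuse notation and define, for any $t \geq 0$, $B_\upsilon(t)$ as the ball of radius $t$ according to the norm above: $B_\upsilon(t) = \{x \in \mathbb{Z}^d : |x|_\upsilon \leq t\}$. As before, we drop the subscript $\upsilon$ when $\upsilon$ is the exponential distribution of rate 1.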
In [18, Theorem 2], Kesten derived upper bounds on the fluctuations of $\xi(t)$ around $B_\upsilon(t)$. We state Kesten's result in Proposition 3.1 below, in a form that is more suitable for our later use. First, we need to introduce some notation. Given any set of positive values $\{\zeta_{x,y}\}_{x,y}$ assigned to the edges of the lattice, which we from now on refer to as passage times, and given any two vertices $x, y \in \mathbb{Z}^d$, let $D(x, y; \zeta)$ be the distance between $x$ and $y$ according to the metric $\zeta$. (5) We extend this notion to subsets by writing
For two vertices $x, y \in \mathbb{Z}^d$ we use the notation $x \sim y$ if $x$ and $y$ are neighbors in $\mathbb{Z}^d$.
Given a set A ⊂ Z d , we say that an event is measurable with respect to passage times inside A if the event is measurable with respect to the passage times of the edges whose both endpoints are in A.
For any t > 0, any δ ∈ (0, 1), and any set of passage times ζ , define the event Disregarding some discrepancies in the choice of the boundary, S δ t (ζ ) is the event that ξ(t) is either not contained in B υ ((1 + δ)t) or does not contain B υ ((1 − δ)t). Proposition 3.1 Let υ be a probability distribution on (0, ∞), with no atoms, and with a finite exponential moment. There exist constants c 1 , c 2 , c 3 > 0 depending on d and υ such that, for all t ≥ 1 and all δ > Moreover, we have that is measurable with respect to the passage times ζ x,y : x ∼ y and x, y ∈ B ((1 + δ)t) .
Proof First we establish (7). Note that the event inf Then, if this event does not hold, that is under inf , establishing (7). The bound in (6) follows directly from Kesten's result [18,Theorem 2].

Encapsulation of competing first passage percolation
Here we consider two first passage percolation processes that compete for space as they grow through $\mathbb{Z}^d$. One of the processes spreads throughout $\mathbb{Z}^d$ at rate 1, while the other spreads according to a distribution $\upsilon$ such that its limit shape is contained in $B(\lambda)$ for some $\lambda < 1$, with $\lambda$ being a parameter of the system. We will say that $\lambda$ is the rate of spread of the second process. We assume that the starting configuration of each process comprises only a finite set of vertices. In this case, one expects that both processes cannot simultaneously grow indefinitely; that is, one of the processes will eventually surround the other, confining it to a finite subset of $\mathbb{Z}^d$. This was studied by Häggström and Pemantle [12]. In the proof of our main result, we will employ a refined version of a result in their paper. In particular, we will give a lower bound on the probability that the faster process surrounds the slower one within some fixed time.
First we define the processes precisely. Let $\xi_1$ denote the faster process so that, for each time $t \geq 0$, $\xi_1(t)$ gives the set of vertices occupied by the faster process at time $t$. Similarly, let $\xi_2$ denote the slower process. For each pair of neighbors $x, y \in \mathbb{Z}^d$, let $\zeta^1_{x,y}$ be an independent exponential random variable of rate 1, and let $\zeta^2_{x,y}$ be an independent random variable of distribution $\upsilon$. For $i \in \{1, 2\}$, $\zeta^i_{x,y}$ represents the passage time of process $\xi_i$ through the edge $(x, y)$.
The processes start at disjoint sets $\xi_1(0), \xi_2(0) \subset \mathbb{Z}^d$. Then they spread throughout $\mathbb{Z}^d$ according to the passage times $\zeta^1$ and $\zeta^2$ with the constraint that, whenever a vertex is occupied by either $\xi_1$ or $\xi_2$, the other process cannot occupy that vertex afterwards. Therefore, for any $t \geq 0$, we obtain that $\xi_1(t)$ and $\xi_2(t)$ are disjoint sets. To define $\xi_1, \xi_2$ more precisely, we will iteratively set $s_k(x)$, for each $x \in \mathbb{Z}^d$ and $k \in \{1, 2\}$, so that at the end $s_k(x)$ is the time at which $x$ is occupied by process $k$, or $s_k(x) = \infty$ if $x$ is not occupied by process $k$. Start by setting $s_1(x) = 0$ for all $x \in \xi_1(0)$, $s_2(x) = 0$ for all $x \in \xi_2(0)$, and $s_k(x) = \infty$ for all $k \in \{1, 2\}$ and $x \notin \xi_k(0)$. Then, choose the value of $k \in \{1, 2\}$ and the pair of neighboring vertices $x, y$ with $s_k(x) < \infty$ and $s_k(y) = \infty$ (and $y$ not yet occupied by the other process) that minimizes $s_k(x) + \zeta^k_{x,y}$, and set $s_k(y) = s_k(x) + \zeta^k_{x,y}$. Iterating this step, we then set $\xi_k(t) = \{x \in \mathbb{Z}^d : s_k(x) \leq t\}$. Given two sets $X_1, X_2 \subset \mathbb{Z}^d$, let $Q^{\upsilon}_{X_1, X_2}$ denote the probability measure induced by the processes $\xi_1, \xi_2$ with initial configurations $\xi_1(0) = X_1$ and $\xi_2(0) = X_2$.
The proposition below is a more refined version of a result of Häggström and Pemantle [12, Proposition 2.2]. It establishes that if $\xi_2$ starts from inside $B(r)$ for some $r \in \mathbb{R}_+$, and $\xi_1$ starts from a single vertex outside of a larger ball $B(\alpha r)$, for some $\alpha > 1$, then there is initially a large separation between $\xi_1$ and $\xi_2$, allowing $\xi_1$ to surround $\xi_2$ with high probability. Moreover, we obtain that $\xi_1$ will eventually confine $\xi_2$ to some set $B(R)$ for some given $R$, and the probability that this happens goes to 1 with $\alpha$. We need to state this result in a high level of detail, as we will apply it at various scales later in our proofs. We say that an event is increasing (resp., decreasing) with respect to some passage times $\zeta$ if whenever the event holds for $\zeta$ it also holds for any passage times $\zeta'$ that satisfy $\zeta'_{x,y} \geq \zeta_{x,y}$ (resp., $\zeta'_{x,y} \leq \zeta_{x,y}$) for all neighboring $x, y \in \mathbb{Z}^d$.
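The iterative construction of the occupation times $s_k(x)$ described above can be sketched with a priority queue. This is an illustration under assumptions, not the paper's formal definition: it takes $d = 2$, a finite box $\{-n, \dots, n\}^2$, reads $\xi_k(t)$ as the set of vertices $x$ with $s_k(x) \leq t$, and uses the hypothetical names `compete`, `occupied_by` and `sample2`.

```python
import heapq
import random

def compete(n, start1, start2, sample2, seed=0):
    """Greedy construction of occupation times for two competing processes on {-n..n}^2."""
    rng = random.Random(seed)
    zeta = {}  # (k, edge) -> passage time of process k through that edge

    def passage(k, u, v):
        key = (k, min(u, v), max(u, v))
        if key not in zeta:
            # Process 1 uses exponential(1) times; process 2 uses the supplied sampler.
            zeta[key] = rng.expovariate(1.0) if k == 1 else sample2(rng)
        return zeta[key]

    s = {}  # vertex -> (k, s_k(vertex)); absent vertices have s_1 = s_2 = infinity
    heap = [(0.0, 1, v) for v in start1] + [(0.0, 2, v) for v in start2]
    heapq.heapify(heap)
    while heap:
        # The popped entry minimizes s_k(x) + zeta^k_{x,y} over all candidates.
        t, k, y = heapq.heappop(heap)
        if y in s:
            continue  # y is already occupied, possibly by the other process
        s[y] = (k, t)
        a, b = y
        for w in ((a + 1, b), (a - 1, b), (a, b + 1), (a, b - 1)):
            if max(abs(w[0]), abs(w[1])) <= n and w not in s:
                heapq.heappush(heap, (t + passage(k, y, w), k, w))
    return s

def occupied_by(s, k, t):
    """Vertices occupied by process k by time t."""
    return {v for v, (kk, tt) in s.items() if kk == k and tt <= t}

# Slower process started at the origin, faster process started at distance 6.
s = compete(12, {(6, 0)}, {(0, 0)}, lambda rng: rng.expovariate(0.5))
print(len(occupied_by(s, 1, float("inf"))), len(occupied_by(s, 2, float("inf"))))
```

Here the slower process is given exponential passage times of rate $0.5$, which is only one possible choice of $\upsilon$.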
that, for any λ ∈ (0, 1), any r > 1, and any α > and whose probability of occurrence satisfies In particular, when F occurs, within time T , ξ 1 encapsulates ξ 2 inside B (R).
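The iterative construction of s_k(x) described above amounts to a Dijkstra-type exploration with two competing types. The following is a minimal sketch under simplifying assumptions, not the construction used in the proofs: it works on a finite box of Z^2 rather than on Z^d, it takes υ to be the exponential distribution of rate λ, and the names `two_type_fpp` and `occupied` are hypothetical.

```python
import heapq
import random


def two_type_fpp(side, start1, start2, lam, seed=0):
    """Competing first passage percolation on the box {0, ..., side-1}^2.

    Returns a dict mapping each occupied vertex x to (k, s_k(x)), where k is
    the type that occupies x and s_k(x) is its occupation time.
    Assumption: type 1 has exponential(1) passage times, type 2 exponential(lam).
    """
    rng = random.Random(seed)
    rate = {1: 1.0, 2: lam}
    passage = {}  # (type, undirected edge) -> passage time, sampled lazily

    def zeta(k, x, y):
        key = (k, min(x, y), max(x, y))
        if key not in passage:
            passage[key] = rng.expovariate(rate[k])
        return passage[key]

    def neighbors(x):
        i, j = x
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            y = (i + di, j + dj)
            if 0 <= y[0] < side and 0 <= y[1] < side:
                yield y

    s = {}  # vertex -> (type, occupation time)
    for k, starts in ((1, start1), (2, start2)):
        for x in starts:
            s[x] = (k, 0.0)

    # Candidate occupations (s_k(x) + zeta^k_{x,y}, k, y), ordered by time.
    heap = []
    for x, (k, t) in list(s.items()):
        for y in neighbors(x):
            if y not in s:
                heapq.heappush(heap, (t + zeta(k, x, y), k, y))

    while heap:
        t, k, y = heapq.heappop(heap)
        if y in s:
            continue  # y was occupied earlier, by either type
        s[y] = (k, t)  # the minimizing candidate wins the vertex
        for z in neighbors(y):
            if z not in s:
                heapq.heappush(heap, (t + zeta(k, y, z), k, z))
    return s


def occupied(s, k, t):
    """The set xi^k(t): vertices occupied by type k by time t."""
    return {x for x, (kind, time) in s.items() if kind == k and time <= t}
```

For instance, `two_type_fpp(15, {(0, 0)}, {(14, 14)}, lam=0.5)` fills a 15 × 15 box, and `occupied(s, 1, t)` and `occupied(s, 2, t)` are disjoint for every t, as required of ξ^1(t) and ξ^2(t).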
We defer the proof of the proposition above to Appendix A. The proof follows the lines of [12, Proposition 2.2], but some steps must be performed with more care, since we need to bound the probability that F occurs, to establish bounds on R and T, to derive that F is increasing with respect to ζ^2 and decreasing with respect to ζ^1, and to obtain the measurability constraints on F. We will need to apply the above proposition in a more complex setting. For this, it is important to keep in mind the process FPPHE defined in Sect. 1, where a cluster of type 2 starts spreading from each type 2 seed when that seed is activated, and type 2 seeds are initially distributed in Z^d as a product measure. We will apply the encapsulation procedure of Proposition 4.1 to each cluster of type 2 growing out of its seed. This means that we will apply Proposition 4.1 at several scales (that is, with different values of r) and at several places of Z^d. The encapsulation happening in one place may end up interfering with the spread of type 1 and type 2 in the other places.
In order to have a version of Proposition 4.1 that can handle this situation, we will focus on one such encapsulation. For that encapsulation, we represent type 1 as ξ^1, and assume that ξ^1(0) contains at least one vertex from ∂^o B(αr). For the cluster of type 2 whose encapsulation we are considering, we let it start from ξ^2(0) ⊆ B(r). Here ξ^2 will only represent the cluster of type 2 that spreads from ξ^2(0). We will not refer to the other clusters of type 2 as ξ^2 but simply as type 2.
To model the spread of the other clusters of type 2, we introduce a positive number γ and two sequences of simply connected subsets ( ι ) ι and ( ι ) ι of Z d , such that the sets ι are all disjoint (8) and, for each ι ≥ 1, we have ι ⊂ ι ⊂ y ι + B (γ r ) for some y ι ∈ Z d , and moreover, ι \ ι is simply connected and contains ∂ i ι .
The sets ι represent the other regions of Z d (of smaller scale) where the encapsulation of a cluster of type-2 of FPPHE may be happening while ξ 1 , ξ 2 spread from ξ 1 (0), ξ 2 (0), whereas the sets ι represent the regions inside which each type 2 cluster gets confined to. (The value of γ will be quite small, so that all ι are of scale smaller than r because clusters of scale larger than r will be treated afterwards: in the proof we will consider the clusters essentially in order of their sizes.) Outside of the set ι ι , the spread of ξ 1 , ξ 2 will follow the passage times ζ 1 , ζ 2 , respectively. However, the spread of ξ 1 , ξ 2 inside ι ι may be different and quite complicated. We will not need to specify this precisely, we will only require the following properties: (P1) For each ι, if ξ 2 does not enter ι from ξ 2 (0), then ι \ ι becomes entirely occupied by ξ 1 . (P2) For any ι, x ∈ ∂ i ι , and y ∈ ι , either the time that ξ 1 takes to spread from x to y within ι is smaller than that given by the passage times ζ 1 , or y is occupied by type 2. (P3) For any ι, x ∈ ∂ i ι , and y ∈ ι , either the time that ξ 2 takes to spread from x to y within ι is larger than that given by the passage times ζ 2 , or y is occupied by type 1.
Above when we refer to the time given by the passage times ζ k , k ∈ {1, 2}, we mean the time given by the passage times ζ k when we completely ignore the presence of the cluster of type 2 that grows from within ι . Regarding properties (P2) and (P3) above, in our application of the proposition below, we will do some scaling of the passage times so that, within each ι , type 1 will actually spread at a rate faster than 1 while type 2 will spread at a rate slower than λ. Since within ι type 1 needs to do a detour around the growing cluster of type 2, we will use a coupling argument to say that even with this detour type 1 will spread inside ι faster than the passage times given by ζ 1 . Similarly, within ι type 2 may benefit from ξ 2 entering from outside and blocking type 1 as it attempts to encapsulate type 2. We will use a coupling argument to say that, even with the help from ξ 2 , type 2 will spread inside ι slower than the passage times given by ζ 2 . This will become clearer in Sect. 5.1, where we present a high-level description of the whole proof. At this point, we do not need much detail of how the spread of type 1 and type 2 happen inside each ι .
The goal of the proposition below, which is a refinement of Proposition 4.1, is to argue that with high probability the passage times ζ 1 , ζ 2 are such that ξ 1 encapsulates ξ 2 inside a ball surrounding B (r ) unless there exists a set Proposition 4.2 There exist positive constants c 1 , c 2 , c 3 depending only on d so that, for any λ ∈ (0, 1), any r > 1, and any α > and is increasing with respect to ζ 2 and decreasing with respect to ζ 1 , such that its occurrence implies that either (8) and (9), such that there exists ι for which ι ∩ B R 11−λ 10 2 = ∅ and ι does not satisfy at least one of the properties (P1)-(P3). Moreover, we obtain that 2d+4 .

Proof of Theorem 1.3
Theorem 1.3 will follow directly from Theorem 5.1 below, which we will prove in this section. The proof is quite long, so we start with an overview. For clarity's sake, we discuss the proof overview under the setting of Theorem 1.3, and only state Theorem 5.1 in Sect. 5.2.

Proof overview
We start with a high-level overview of the proof. Below we refer to Fig. 7. Since p is small enough, initially η^1 will grow without finding any seed of η^2(0), as in Fig. 7a. When η^1 activates a seed of η^2, we will apply Proposition 4.1 to establish that η^1 will go around η^2, encapsulating it inside a small ball (according to the norm | · |). This is illustrated by the encapsulation of C_1 in Fig. 7b. The yellow ball in the picture marks the region inside which the cluster of η^2 will be trapped; in Proposition 4.1 this corresponds to the ball B(10R/(11−λ)). To ensure the encapsulation of a cluster of η^2, we will need to observe the passage times inside a larger ball; for example, only to ensure the measurability requirement of the event in Proposition 4.1 we need to observe the passage times in B(R((11−λ)/10)^2). […] encapsulation of different clusters of η^2 will happen independently. However, when η^1 encounters a large cluster of η^2 seeds, as happens with the cluster C_3 in Fig. 7d, the encapsulation procedure will require a larger region to succeed.
Fig. 7 Illustration of the proof strategy of Theorem 1.3, with the application of the encapsulation procedure of Häggström and Pemantle [12] (cf. Proposition 4.1). White balls indicate seeds of η^2 that were not yet activated, black represents the growth of η^1, and yellow balls represent the regions inside which the activated clusters of η^2 got trapped. The red circle around each yellow ball corresponds to a larger area in which the passage times need to be observed in order to ensure the success of the encapsulation of the η^2 cluster; for example, only to ensure the measurability requirement of the event in Proposition 4.1 we need to observe the passage times in B(R((11−λ)/10)^2), whereas the η^2 cluster is trapped within B(10R/(11−λ)).
We will carry this out by developing a multi-scale analysis of the encapsulation procedure, where the size of the region will depend, among other things, on the size of the clusters of η^2(0). After the encapsulation takes place, as in Fig. 7e, we are left with a larger yellow ball and a larger red circle. Also, whenever two clusters of η^2(0) are close enough that their corresponding red circles intersect, as happens with C_2 in Fig. 7c, the encapsulation cannot be guaranteed to succeed. In this case, we regard these clusters as a single larger cluster, and perform the encapsulation procedure over a slightly larger region, as in Fig. 7c, d.
There is one caveat in the above description. Suppose η^1 encounters a very large cluster of η^2, for example C_3 in Fig. 7d. It is likely that during the encapsulation of C_3, inside the red circle of this encapsulation, we will find smaller clusters of η^2. This happens in Fig. 7d with C_4. This does not pose a big problem: as long as the red circles of the encapsulations of the small clusters do not intersect one another and do not intersect the yellow ball produced by the encapsulation of C_3, the encapsulation of C_3 will succeed. This is illustrated in Fig. 7e, where the encapsulation of C_4 happened inside the encapsulation of C_3. There is yet another subtlety. During the encapsulation of C_4, the advance of η^1 is slowed down, as it needs to make a detour around the growing cluster of C_4. This slowdown could cause the encapsulation of C_3 to fail. Similarly, as η^2 spreads from C_3, η^2 may find vertices that have already been occupied by η^2 due to the spread of η^2 from other non-encapsulated seeds. This would happen, for example, if the yellow ball that grows from C_3 were to intersect the yellow ball that grows from C_4. If this happens before the encapsulation of C_4 ends, then the spread of C_3 gets a small advantage. The area occupied by the spread of η^2 from C_4 can in this case be regarded as being absorbed by the spread of η^2 from C_3, causing C_3 to spread faster than if C_4 were not present. We will need to show that η^1 is not slowed down too much by possible detours around smaller clusters, and that η^2 is not sped up too much by the absorption of smaller clusters.
To do this, we will define a sequence of scales R_1, R_2, ..., with R_k increasing with k. The value of R_k represents the radius of the region inside which encapsulation takes place at scale k. (Later, when making this argument rigorous, we will need to introduce several radii for each scale k, but to simplify the discussion here we can think for the moment that R_k gives the radius of the red circles in Fig. 7, and that the radius of the yellow circles at scale k is just a constant times R_k.) The larger the cluster of seeds of η^2, the larger k must be. We will treat the scales in order, starting from scale 1. This procedure is illustrated in Fig. 8 for the encapsulation of the configuration in Fig. 7a. Once all clusters of scale k − 1 or below have been treated, we look at all remaining (untreated) clusters that are not too big to be encapsulated at scale k. If two clusters of scale k are too close to each other, so that their corresponding red circles intersect, we will not carry out the encapsulation and will treat these clusters as if they were one cluster from a larger scale, as illustrated in Fig. 8a. After disregarding these, all remaining clusters of scale k are disjoint and can be treated independently using the more refined Proposition 4.2. The sets indexed by ι in Proposition 4.2 will be the clusters of scale smaller than k that happen to fall inside the red circle of the cluster of scale k. Although small and going very fast to zero with k, the probability that the encapsulation procedure fails is still positive. So it will happen that some encapsulations fail, as illustrated by the vertex at the top of Fig. 8a. If this happens for some cluster of scale k, which is an event measurable with respect to the passage times inside a red circle of scale k containing the η^2 seeds of that cluster, we then take the whole area inside this red circle and consider it as a larger cluster of η^2(0) seeds, leaving it to be treated at a larger scale, as in Fig. 8b.
Then we turn to the next scale, as in Fig. 8c, d.
In order to handle the slowdown of η^1 due to detours imposed by smaller scales, and the speed-up of η^2 due to absorption of smaller scales, we will introduce a decreasing sequence of positive numbers ε_1, ε_2, ..., as follows. In the encapsulation of a cluster C of scale k, we will show not only that η^1 is able to encapsulate C, but also that η^1 does so sufficiently fast. We do this by coupling the spread of η^1 inside the red circle of C with a slower first passage percolation process of rate ∏_{i=1}^{k} e^{−ε_i} that evolves independently of η^2. In other words, this slower first passage percolation process does not need to make a detour around C, but pays the price by having slower passage times. We show that the spread of η^1 around C is faster than that of this slower first passage percolation process. Similarly, we show that, even after absorbing smaller scales, η^2 still spreads slowly enough inside the red circle of C, so that we can couple it with a faster first passage percolation process of rate λ ∏_{i=1}^{k} e^{ε_i}, which evolves independently of everything else. We show using this coupling that the spread of η^2 is slower than that of the faster first passage percolation process. Thus at scale k, η^1 is spreading at rate at least ∏_{i=1}^{k} e^{−ε_i} while η^2 is spreading at rate at most λ ∏_{i=1}^{k} e^{ε_i}, regardless of what happened at smaller scales. By adequately setting ε_k, we can ensure that ∏_{i=1}^{k} e^{−ε_i} > λ ∏_{i=1}^{k} e^{ε_i} for all k, allowing us to apply Proposition 4.2 at all scales.
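The requirement that the slowed-down rate of type 1 stays above the sped-up rate of type 2 at every scale can be checked numerically. The sketch below is only an illustration: it assumes the summable choice ε_1 = 0 and ε_k = ε/k^2 for k ≥ 2, which the overview does not fix, and the function names `scale_rates` and `max_eps` are hypothetical.

```python
import math


def scale_rates(lam, eps, k_max):
    """Return [(rate of type 1, rate of type 2)] for scales k = 1, ..., k_max.

    Assumed choice: eps_1 = 0 and eps_k = eps / k**2 for k >= 2, so that
    type 1 has rate prod_i exp(-eps_i) and type 2 has rate lam * prod_i exp(eps_i).
    """
    rates = []
    total = 0.0  # running sum eps_1 + ... + eps_k
    for k in range(1, k_max + 1):
        eps_k = 0.0 if k == 1 else eps / k**2
        total += eps_k
        rates.append((math.exp(-total), lam * math.exp(total)))
    return rates


def max_eps(lam):
    """Supremum of eps for which type 1 stays faster at every scale.

    Uses sum_{k >= 2} 1/k^2 = pi^2/6 - 1, so the condition
    lam * exp(2 * eps * (pi^2/6 - 1)) <= 1 is sufficient for all k.
    """
    return math.log(1.0 / lam) / (2.0 * (math.pi**2 / 6.0 - 1.0))
```

With λ = 0.5 and ε = 0.1, every pair returned by `scale_rates(0.5, 0.1, 1000)` has its first entry larger than its second, so under this assumed choice of ε_k the comparison between the two rates holds at all scales.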
The final ingredient is to develop a systematic way to argue that η^1 produces an infinite cluster. For this we introduce two types of regions, which we call contagious and infected. We start at scale 1, where all vertices of η^2(0) are contagious. Using the configuration in Fig. 7a as an example, all white balls there are contagious. The contagious vertices that neither belong to large clusters nor are close to other contagious vertices are treated at scale 1. The other contagious vertices remain contagious for scale 2. Then, for each cluster treated at scale 1, either the encapsulation procedure is successful or not. If it is successful, then the yellow balls produced by the encapsulation of these clusters are declared infected, and the vertices in these clusters are removed from the set of contagious vertices. In Fig. 8b, the yellow area represents the infected vertices after clusters of scale 1 have been treated. Recall that when an encapsulation is successful, all vertices reached by η^2 from that cluster must be contained inside the yellow area. On the other hand, if the encapsulation is not successful, then all vertices inside the red circle become contagious and go to scale 2, together with the other preselected vertices. An example of this situation is given by the cluster at the top-right corner of Fig. 8b. We carry out this procedure iteratively until there are no more contagious vertices or the origin has been disconnected from infinity by infected vertices. The proof is concluded by showing that η^2 is confined to the set of infected vertices, and that with positive probability the infected vertices will not disconnect the origin from infinity.
Roadmap of the proof We now proceed to the details of the proof. We split the proof into a few sections. In Sect. 5.2, we state Theorem 5.1, the more general version of Theorem 1.3. Then in Sect. 5.3 we set up the multi-scale analysis, specifying the sizes of the scales and some parameters. This will define boxes of multiple scales, and we will classify boxes as being either good or bad. Roughly speaking, a box will be good if the encapsulation procedure inside the box is successful. The concrete definition of good boxes is given in Sect. 5.4. In Sect. 5.5 we estimate the probability that a box is good, independently of what happens outside the box. We then introduce contagious and infected sets in Sect. 5.6, and show that η^2 is confined to the set of infected vertices. At this point, it remains to show that the set of infected vertices does not disconnect the origin from infinity. For this, we need to control the set of contagious vertices, which can actually grow as we move to larger scales (for example, this happens when some encapsulation procedure fails). The event that a vertex is contagious at some scale k depends on what happens at previous scales. We estimate the probability of such an event by establishing a recursion over scales, which we carry out in Sect. 5.7. With this we have a way to control whether a vertex is infected. In order to show that the origin is not disconnected from infinity by infected vertices, we apply the first moment method. We sum, over all contours around the origin, the probability that this contour contains only infected vertices. Since infected vertices can arise at any scale, we need to look at multi-scale paths and contours of infected vertices, which we do in Sect. 5.8. We then put all ingredients together and complete the proof of Theorem 1.3 in Sect. 5.9.
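The first moment method mentioned above can be illustrated in a drastically simplified, single-scale setting. The sketch below rests on two hypothetical assumptions that are not taken from the proof: the number of contours of length n around the origin is at most n·a^n for some lattice-dependent constant a, and a given contour of length n consists only of infected vertices with probability at most q^n. In the actual argument infected vertices arise at all scales and are not independent, which is why multi-scale contours are needed. Under these assumptions the expected number of fully infected contours is at most the sum of n (aq)^n over n ≥ n_min, which is below 1 once q is small enough.

```python
def contour_sum_bound(q, a, n_min):
    """Closed form of sum_{n >= n_min} n * (a*q)**n, or infinity if a*q >= 1.

    q     : assumed bound on the probability that a vertex is infected
    a     : assumed growth constant for the number of contours (hypothetical)
    n_min : smallest possible contour length (hypothetical, lattice dependent)
    """
    x = a * q
    if x >= 1.0:
        return float("inf")
    # sum_{n >= m} n x^n = x^m * (m - (m - 1) x) / (1 - x)^2
    return x**n_min * (n_min - (n_min - 1) * x) / (1.0 - x) ** 2


def origin_survives_with_positive_probability(q, a, n_min):
    """First moment criterion: the expected number of blocking contours is < 1."""
    return contour_sum_bound(q, a, n_min) < 1.0
```

When the bound is below 1, Markov's inequality gives that with positive probability no fully infected contour surrounds the origin, which is the shape of the conclusion sought in Sect. 5.8 and Sect. 5.9.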

General version of Theorem 1.3
In this section we will consider a generalization of FPPHE, where the passage times of η^2 can be given by any distribution, while the passage times of η^1 are exponential random variables of rate 1.
Let υ be a probability distribution on (0, ∞) with no atoms and with a finite exponential moment. By [1, Theorem 2.16], first passage percolation with passage times given by i.i.d. random variables with distribution υ has a limit shape B_υ, as in (4). Recall that B(r) = rB denotes the ball of radius r according to the norm induced by the shape theorem of first passage percolation with passage times that are exponential random variables of rate 1.
For any edge (x, y) of the lattice, let ζ^1_{x,y} be an independent exponential random variable of rate 1, and let ζ^2_{x,y} be an independent random variable distributed according to υ. For i ∈ {1, 2}, ζ^i_{x,y} is regarded as the passage time of η^i through (x, y); that is, when η^i occupies x, then after time ζ^i_{x,y} we have that η^i will occupy y, provided that y has not been occupied by the other type.
Recall that, for any t, we define η̄^1(t) as the set of vertices of Z^d that are not contained in the infinite component of Z^d \ η^1(t), which comprises η^1(t) and all vertices of Z^d \ η^1(t) that are separated from infinity by η^1(t). Theorem 1.3 follows immediately from the theorem below by taking υ to be the exponential distribution of rate λ.

Multi-scale setup
Let ∈ (0, 1/2) be fixed and small enough so that all inequalities below hold: We can define positive constants C FPP < C FPP , depending only on d, such that for all r > 0 we have Set C FPP to be the largest constant and C FPP to be the smallest constant satisfying (11). Since B (r ) is convex and has all the symmetries of the lattice Z d , we not only obtain that B (r ) is contained in the ∞ -ball of radius C FPP r but it contains the 1 -ball of radius C FPP r . Using that the latter contains the ∞ -ball of radius Given υ, we can define υ ≥ 1 as the smallest number such that Equivalently, we have υ = sup x∈B(λ) |x| υ . If υ is an exponential distribution of rate λ, we have υ = 1.
Let L 1 be a large number, and fix α > 1 so that it satisfies the conditions in Proposition 4.2. We let k be an index for the scales. For k ≥ 1, once L k has been defined, we set Also, for k ≥ 1, define where c 1 is the constant in Proposition 4.2. Since 1 − λ > 2 , we have that B R enc k contains all the passage times according to which the event in We then obtain the following bounds for L k : and The first bound follows from (16) and (11), and the fact that in (16) L k is obtained via an infimum, so any cube containing B 100k d R outer k−1 must have side length at least L k . The second bound follows from similar considerations, but applying (14) and (11).
The intuition is that L_k is the size of scale k, and R_k is the radius of the clusters of η^2(0) to be treated at scale k. The value of R_k^enc gives the radius inside which the encapsulation takes place; in the overview in Sect. 5.1 and in Figs. 7 and 8, R_k^enc will be larger than the radius of each yellow ball, so that each η^2 cluster treated at scale k will be contained inside a ball of radius R_k^enc. The radius R_k^outer is larger still and will be needed for the development of some couplings between scales; in the overview in Sect. 5.1 and in Figs. 7 and 8, R_k^outer gives the radius of the red circles.
With the definitions above we obtain for some constant c = c(d, , α, υ) > 288000d 2 α exp 1+c 1 2 υ . Iterating the above bound, we obtain Using similar reasons we can see that which allows us to conclude that wherec is a positive constant depending on α, , d and υ, and the last step follows for all k ≥ 1 by setting L 1 large enough. At each scale k ≥ 1, tessellate Z d into cubes of side-length L k , producing a collection of disjoint cubes Whenever we refer to a cube in Z d , we will only consider cubes of the form . . , d}. We will need cubes at each scale to overlap. We then define the following collection of cubes We refer to each such cube Q k (i) of scale k as a k-box, and note that Q k (i) ⊃ Q core k (i). One important property is that if a subset A ⊂ Z d is completely contained inside a cube of side length 18d L k , As described in the proof overview (see Sect. 5.1), when going from scale k to scale k + 1, we will need to consider a slowed down version of η 1 and a sped up version of η 2 . For this reason we set 1 = 0 and define for k ≥ 2 Set λ 1 1 = 1 and λ 2 1 = λ, and let ζ 1 1 = ζ 1 and ζ 2 1 = ζ 2 be the passage times used by η 1 and η 2 , respectively. For k ≥ 2, define where the third inequality follows from the bound on via (10). For each k ≥ 2, consider two collections of passage times ζ 1 k and ζ 2 k on the edges of Z d , which are given by ζ 1 λ 1 k and ζ 2 λ λ 2 k , respectively. These will be the passage times we will use in the analysis at scale k. Note that, for any given k, the passage times of ζ 1 k are independent exponential random variables of parameter λ 1 k , while for the passage times of ζ 2 k we obtain that its limit shape is contained in B λ 2 k . Moreover, up to a time scaling, having passage times ζ 1 k , ζ 2 k is equivalent to having type 1 spreading at rate 1, while type 2 spreads according to a random variable whose limit shape is contained in B be the effective rate of type 2 in comparison with that of type 1 at scale k. 
From now on, we will refer to λ_k^2 as the rate of spread of type 2 at scale k, even though type 2 may not have exponential passage times.
We obtain that Thus the effective rate of spread of type 2 is smaller than 1 at all scales. We can also define the effective passage time of type 2 at scale k as in this way, at scale k, when employing Proposition 4.2, we will take the passage times ζ 1 k , ζ 2 k and scale time by a factor of λ 1 k , so that type 1 spreads according to the passage times ζ 1 and type 2 spreads according to the passage times ζ eff k . Finally, for k ≥ 1, define Note that, using the passage times ζ 1 k , ζ 2 k , we have that T 1 k represents the time required to run each encapsulation procedure at scale k (before time is scaled by a factor of λ 1 k as mentioned above).

Definition of good boxes
For each Q k (i), we will apply Proposition 4.2 to handle the situation where Q k (i) entirely contains a cluster of η 2 (0). At scale k we will only handle the clusters that have not already been handled at a scale smaller than k. By the relation between L k and R k in (18), the cluster of η 2 inside Q k (i) will not start growing before η 1 reaches the boundary of L k i + B (R k ). By the time η 1 reaches the boundary of (For the moment we assume that L k i + B (α R k ) does not contain the origin, otherwise we will later consider that the origin has already been disconnected from infinity by η 2 .) At this point we apply Proposition 4.2 with r = R k and λ = λ eff k , obtaining values R and T such that where the second inequality follows from (15) and the last inequality follows from (23), and where the last two inequalities follow from (26) and (25), respectively. Note that in our application of Proposition 4.2 above time has been scaled by λ 1 k , since we apply it with type 1 (resp., type 2) spreading at rate 1 (resp., λ eff k ) instead of the actual rate λ 1 k (resp., λ 2 k ). This is the reason why the term 1 λ 1 k appears in the definition of T 1 k in (25). With this we get λ 1 k T 1 k in the righthand side of (27), and the actual time that the encapsulation procedure takes is 1 , be the event in the application of Proposition 4.2 with the origin at L k i, r = R k , passage times given by ζ 1 , ζ eff k , and η 1 starting from x. Here x represents the first The event G enc k (i) implies that η 1 encapsulates η 2 inside L k i +B R enc k during a time interval of length T 1 k , unless η 2 "invades" L k i + B R enc k from outside, that is, unless another cluster of η 2 starts growing and reaches the boundary of (When we apply the above argument later in the proof, we will only try to encapsulate a cluster of η 2 at scale k if the ball does not intersect other balls being treated at the same scale. 
If there is another ball being treated at the same scale k and intersecting L_k i + B(R_k^outer), then these balls will only be treated at a larger scale, so that different clusters of η^2 of the same scale cannot interfere with each other's encapsulation.) We will also define two other events G_k^1(i) and G_k^2(i), which will be measurable with respect to ζ_k^1, ζ_k^2 inside Q_k^outer(i). For any X ⊂ Z^d, let ζ_k^1|_X be the passage times that are equal to ζ_k^1 inside X and are equal to infinity everywhere else; define ζ_k^2|_X analogously. Define the event G_k^1(i) as […] The main intuition behind this event is that, during the encapsulation of a (k + 1)-box, η^1 will need to perform some small local detours when encapsulating clusters of scale k or smaller. We can capture this by using the slower passage times ζ_{k+1}^1. If G_k^1 holds for the k-boxes that are traversed during the encapsulation of a (k + 1)-box, then using the slower passage times ζ_{k+1}^1 but ignoring the actual detours around k-boxes will only slow down η^1.
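The restriction of passage times to a set X, with infinite passage times elsewhere, is what makes the events above depend only on the passage times inside X. The following minimal sketch illustrates this on a finite box of Z^2 with exponential passage times of rate 1. It assumes that an edge counts as inside X when both of its endpoints lie in X, and the names `grid_passage_times` and `restricted_passage_time` are hypothetical.

```python
import heapq
import math
import random


def grid_passage_times(side, seed=0):
    """Sample i.i.d. exponential(1) passage times on the edges of {0..side-1}^2."""
    rng = random.Random(seed)
    zeta = {}
    for i in range(side):
        for j in range(side):
            if i + 1 < side:
                zeta[((i, j), (i + 1, j))] = rng.expovariate(1.0)
            if j + 1 < side:
                zeta[((i, j), (i, j + 1))] = rng.expovariate(1.0)
    return zeta


def restricted_passage_time(zeta, region, source, target):
    """First passage time from source to target under zeta restricted to region.

    Edges with an endpoint outside `region` have infinite passage time,
    which is implemented by leaving them out of the graph.
    """
    adj = {}
    for (x, y), w in zeta.items():
        if x in region and y in region:
            adj.setdefault(x, []).append((y, w))
            adj.setdefault(y, []).append((x, w))
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        t, x = heapq.heappop(heap)
        if t > dist.get(x, math.inf):
            continue  # stale heap entry
        if x == target:
            return t
        for y, w in adj.get(x, []):
            if t + w < dist.get(y, math.inf):
                dist[y] = t + w
                heapq.heappush(heap, (t + w, y))
    return math.inf  # target is not reachable inside the region
```

Two features are worth noting: the restricted passage time is never smaller than the unrestricted one, and it is unchanged if the passage times outside the region are modified arbitrarily. The second property is the measurability that the definitions of G_k^1(i) and G_k^2(i) are designed to have.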
We also need to handle the case where the growth of η 2 is sped up by absorption of smaller scales. For i ∈ Z d , define Note that the event G 2 While η 2 travels from x to Q enc k (i), the encapsulation of Q enc k (i) may start taking place. Then, η 2 can only get a sped up inside Q enc k (i) if η 2 enters Q enc k (i) before the encapsulation of Q enc k (i) is completed. However, under G 2 k (i) and the passage times ζ 2 k , the time that η 2 takes to go from x to Q enc k (i) is larger than the time, under ζ 2 k+1 , that η 2 takes to go from x to all vertices in Q enc k (i). Therefore, under G 2 k (i), we can use the faster passage times ζ 2 k+1 to absorb the possible sped up that η 2 may get by the cluster growing inside Q enc and say that Hence, intuitively, Q k (i) being good means that η 1 successfully encapsulates the growing cluster of η 2 inside Q k (i), and this happens in such a way that the detour of η 1 during this encapsulation is faster than letting η 1 use passage times ζ 1 k+1 , and also the possible sped up that η 2 may get from clusters of η 2 coming from outside Q k (i) is slower than letting η 2 use passage times ζ 2 k+1 . We now explain why in the definition of G 1 k (i) and G 2 k (i) we calculate passage times from ∂ Q The reason is that we had to define G 1 k (i) and G 2 k (i) in such a way that they are measurable with respect to the passage times inside Q outer k (i). We do this by forcing to use only passage times inside Q outer k (i). By using the distance between ∂ Q outer k (i) and ∂ Q outer/3 k (i), we can ensure that this constraint does not change much the probability that the corresponding events occur.

Probability of good boxes
In this section we show that the events G_k^enc(i), G_k^1(i) and G_k^2(i), defined in Sect. 5.4, are likely to occur.

Lemma 5.2 There exist positive constants L
Moreover, the event G_k(i) is measurable with respect to the passage times inside Q_k^outer(i).
Before proving the lemma above, we state and prove two lemmas regarding the probability of the events G_k^1(i) and G_k^2(i).

Lemma 5.3
There exist positive constants L 0 = L 0 (d, ) and c = c(d) such that if L 1 ≥ L 0 , then for any k ≥ 1 and any i ∈ Z d we have We will show that there exists a constant c = c(d) > 0 such that and Using (28) and (29), it remains to show that Note that Thus we need to show that the last term in the right-hand side above is at least (1 + δ)τ 2 , which is equivalent to showing that Rearranging the terms, the inequality above translates to Using that exp (k + 1) −2 ≥ 1 + (k + 1) −2 and then applying the value of δ, we obtain that the left-hand side above is at least Hence, it now suffices to show that which is true since the right-hand side above is at most 11−λ 10 2 + exp ( /4) ≤ 11 10 Now we turn to establish (28) and (29). We start with (28). First note that Recall the notation S δ t from Proposition 3.1, which is the (unlikely) event that at time t first passage percolation of rate 1 does not contain B ((1 − δ)t) or is not contained in B ((1 + δ)t). Then using time scaling to go from passage times of rate λ 1 k+1 to passage times of rate 1, and using the union bound on x, we obtain where in the first inequality we used that k , and in the second inequality we applied Proposition 3.1. Now we turn to (29). We again use time scaling and the fact that where the second inequality follows since Finally, the last step of the derivation above follows from Propositon 3.1.
The next lemma shows that G_k^2(i) occurs with high probability.

Lemma 5.4
There exist positive constants L_0 = L_0(d, ε) and c = c(d, υ) such that if L_1 ≥ L_0, then for any k ≥ 1 and any i ∈ Z^d we have […].
Proof Set δ = ε/(20k^2) and fix an arbitrary x ∈ ∂^i Q_k^{outer/3}(i). Define the smallest distance between x and Q_k^enc(i) with respect to the norm | · |_υ as […] Under the passage times ζ_k^2, the time it takes to reach Q_k^enc(i) […]. Therefore, we define […] and will show later that there exists a constant c > 0 such that, uniformly over x, […] Now, under the faster passage times ζ_{k+1}^2, the time it takes to reach Q_k^enc(i) […] Under the passage times ζ_{k+1}^2, which are a scaling of ζ^2 by a factor of λ/λ_{k+1}^2, the time […]. Therefore, we set […], and will show that there exists a constant c > 0 such that […] Assuming (30) and (31) for the moment, it remains to show that […] Replacing λ_{k+1}^2 with λ_k^2 exp(ε(k + 1)^{−2}) in the definition of τ_2, (32) follows if we show that […] where the first inequality follows by the definition of R_k^outer in (15). So now it suffices to show that […] Rearranging gives that […] Using that e^{−a} ≤ 1 − a + a^2/2 for all a ≥ 0, (32) holds if the following is true […] Using the value of δ, we are left with showing […] which is true since the right-hand side is at least (1/4)(1 − ε/8) ≥ (1/4)(15/16). This establishes (32). Now we turn to establish (30) and (31), which essentially follow from Proposition 3.1. We start with (30). Scaling the passage times ζ_k^2 by λ_k^2/λ we obtain passage times distributed as υ. Hence, […] The same reasoning holds for (31), which gives […] Then the lemma follows by taking the union bound over x, and using the fact that R_k^outer is very large at all scales, so that the extra term obtained from the union bound can be absorbed in the constant c.
Proof of Lemma 5.2 Proposition 4.2 gives that G enc k (i) can be defined so that it is measurable with respect to the passage times inside Moreover, Proposition 4.1 gives a constant c 2 > 0 so that, for all large enough L 1 , we have where the last step follows by applying the bounds in (23) and c is a positive constant. By definition, the events G 1 k and G 2 k (i) are measurable with respect to the passage times inside L k i + B R outer k . So the proof is completed by using the bounds in Lemmas 5.3 and 5.4.

Contagious and infected sets
As discussed in the proof overview in Sect. 5.1, for each scale k, we will define a set C_k ⊂ Z^d as the set of contagious vertices at scale k, and also define a set I_k ⊂ Z^d as the set of infected vertices at scale k. The main intuition behind such sets is that C_k represents the vertices of Z^d that need to be handled at scale k or larger, whereas I_k represents the vertices of Z^d that may be taken by η^2 at scale k. In particular, we will show that the vertices of Z^d that will be occupied by η^2 are contained in ⋃_{k≥1} I_k.
At scale 1 we set the contagious vertices as those initially taken by η 2 ; that is, All clusters of C 1 that belong to good 1-boxes and that are not too close to contagious clusters from other 1-boxes will be "cured" by the encapsulation process described in the previous section. The other vertices of C 1 will become contagious vertices for scale 2, together with the vertices belonging to bad 1-boxes. Using this, define C bad k as the following subset of the contagious vertices: Intuitively, C bad k is the set of contagious vertices that cannot be cured at scale k since they are not far enough from other contagious vertices in other k-boxes. Now for the vertices in C k \C bad k , the definition of C bad k gives that we can select a set I k ⊂ Z d representing k-boxes such that for each x ∈ C k \C bad k there exists a unique i ∈ I k for which x ∈ Q k (i), and for each pair i, j ∈ I k , we have Q outer Then, given C k , we define I k as the set of vertices that can be taken by η 2 during the encapsulation of the good k-box, which is more precisely given by We then define inductively The lemma below gives that if the contagious sets of scales larger than k are all empty, then η 2 must be contained inside k−1 j=1 I j . Lemma 5.5 Let A ⊂ Z d be arbitrary. Then, for any k ≥ 1, either we have that there exists j > k and i ∈ Z d with Q outer or Proof We will assume that (35) does not occur; that is, the set The lemma will follow by showing that the above implies (36). We start with scale 1. Recall that C 1 contains all elements of η 2 (0). Then, all elements of C 1 \C bad 1 are handled at scale 1. Let i ∈ I 1 , so Q 1 (i) intersects C 1 \C bad 1 . If Q 1 (i) is a good box, the passage times inside Q enc 1 (i) are such that η 1 encapsulates η 2 within Q enc 1 (i) unless another cluster of η 2 enters Q enc 1 (i) from outside. When the encapsulation succeeds, we have that the cluster of η 2 growing inside Q enc 1 (i) never exits Q enc 1 (i) ⊂ I 1 . 
Before proceeding to the proof for scales larger than 1, we explain the possibility that the encapsulation above does not succeed because another cluster of η 2 (say, from Q 1 ( j)) enters Q enc 1 (i) from outside. Note that if Q outer 1 ( j) ∩ Q outer 1 (i) ≠ ∅, then the two clusters are not handled at scale 1: they will be handled together at a higher scale. Now assume that Q outer 1 ( j) and Q outer 1 (i) are disjoint and do not intersect any other region Q outer 1 from a contagious site. Thus both Q 1 (i) and Q 1 ( j) are handled at scale 1. If they are both good, the encapsulations succeed within Q enc 1 (i) and Q enc 1 ( j), and do not interfere with each other. Now assume that Q 1 (i) is good, but Q 1 ( j) is bad. In this case, we will make Q outer 1 ( j) contagious for scale 2, but up to scale 1 this does not interfere with the encapsulation within Q enc 1 (i) because these two regions are disjoint. The encapsulation of Q outer 1 ( j) will be treated at scale 2 or higher, and the fact that Q outer 1 ( j) ∩ Q outer 1 (i) = ∅ will be used to allow a coupling argument between scales.
We now explain the analysis for a scale j ∈ {2, 3, . . . , k}, assuming that we have carried out the analysis until scale j − 1. Thus, we have shown that all contagious vertices successfully handled at scale smaller than j are contained inside I 1 ∪ I 2 ∪ · · · ∪ I j−1 . Consider a cell Q j (i) of scale j with i ∈ I j . During the encapsulation of η 2 inside Q enc j (i), it may happen that η 1 advances through a cell Q j−1 (i′) that was treated at scale j − 1; that is, i′ ∈ I j−1 . (For simplicity of the discussion, we assume here that this cell is of scale j − 1, but it could be of any scale j′ ≤ j − 1.) Note that Q j−1 (i′) must be good for scale j − 1 because otherwise cell i would not be treated at scale j. The fact that Q j−1 (i′) is good implies that the time η 1 takes to go from ∂ i Q outer/3 and for each ι, the set ι ⊂ ι is given by the union of Q enc j′ (i′) over all j′, i′ for which Q outer j′ (i′) ⊂ ι . Therefore, under the event that all the cells involved in the definition of { ι } ι are good, Proposition 4.2 gives that η 2 cannot escape the set ⋃ j ι=1 I ι , after all contagious vertices of scale at most j have been analyzed. Therefore, inductively we obtain that η 2 (t) ⊂ ⋃ ∞ ι=1 I ι . For scales larger than k, we will use that (37) holds. Since for any scale j and any i ∈ Z d we have that

Recursion
Define Recall that Q core k (i) are disjoint for different i ∈ Z d , as defined in (22). Define also where c is the constant in Lemma 5.2 so that for any k ∈ N and i ∈ Z d , we have P (G k (i)) ≥ 1 − q k .
By the definition of C k from (34), in order to have Q core k (i) ∩ C k ≠ ∅ it must happen that either where ι is the unique number such that x ∈ Q core k−1 (ι). The condition above holds by the following. If (38) does not happen, then there must exist an x ∈ C k−1 ∩ Q core k (i) that was not treated at scale k − 1; that is, x ∈ C bad k−1 . Then, by the definition of C bad k−1 , it must be the case that there exists a y satisfying the conditions in (39). The values x, y as in (39) must satisfy
Proof The lemma holds for k = 1 since Q core 1 (i) ∩ C 1 ≠ ∅ is equivalent to Q core 1 (i) ∩ η 2 (0) ≠ ∅. Our goal is to apply an induction argument to establish the lemma for k > 1. First note that, since the event that a box of scale k − 1 is good is measurable with respect to passage times inside a ball of diameter R outer k−1 , we have that condition (38) is measurable with respect to the passage times inside It remains to establish the measurability result for condition (39). Note that condition (39) gives the existence of a point y in x∈Q core such that y ∈ C k−1 . Let j be the integer such that y ∈ Q core k−1 ( j). Then the induction hypothesis gives that {Q core k−1 ( j) ∩ C k−1 ≠ ∅} is measurable with respect to the passage times inside Therefore, condition (39) is measurable with respect to the passage times inside Q super k (i).

Lemma 5.7
There exists a constant c = c(d, , α, υ) > 0 such that, for all k ∈ N and all i ∈ Z d , we have Proof From the discussion above, we have that ρ k (i) is bounded above by the probability that condition (38) occurs plus the probability that condition (39) occurs. We start with condition (38). Note that Q core k−1 ( j), for j defined as in (38), must be contained inside Therefore, there is a constant c 0 depending only on d such that the number of options for the value of j is at most for some constant c 1 , where the first inequality comes from (21) and the last inequality follows from (19). Then, taking the union bound on the value of j, we obtain that the probability that condition (38) occurs is at most Now we bound the probability that condition (39) happens. For any z ∈ Z d , let ϕ(z) ∈ Z d be such that z ∈ Q core k−1 (ϕ(z)). We will need to estimate the number of different values that ϕ(x) and ϕ(y) can assume. Since x ∈ Q core k (i), we have that ϕ(x) can assume at most L k +2L k−1 L k−1 d values. For ϕ(y), first note that any Therefore, Q core k−1 (ϕ(y)) must be contained inside a cube of side length and consequently there are at most possible values for ϕ(y). Letting A k be the number of ways of choosing the Q core k−1 boxes containing x, y according to condition (39), we obtain for some constant c 2 = c 2 (d, , α, υ) > 0, where the inequality follows from (19). Now, given x, y, we want to give an upper bound for From Lemma 5.6 we have that the event Q core k−1 (ϕ(x)) ∩ C k−1 is measurable with respect to the passage times inside Q where we related R outer k−2 and R k−1 via (21). Since by (40) and (18) we have where the last inequality follows from (12), we then obtain Q super k−1 (ϕ(x)) ∩ Q super k−1 (ϕ(y)) = ∅. This gives that the events Q core k−1 (ϕ(x)) ∩ C k−1 and Q core k−1 (ϕ(y)) ∩ C k−1 are independent, yielding In the lemma below, recall that η 2 (0) is given by adding each vertex of Z d with probability p, independently of one another. 
Also let ρ̄ be such that ρ̄ ≥ sup j ρ 1 ( j).

Lemma 5.8
Fix any positive constant a. We can set L 1 large enough and then p small enough, both depending on a, α, , d and υ, such that for all k ∈ N and all i ∈ Z d , we have Proof For k = 1, then ρ k (i) is bounded above by the probability that η 2 (0) intersects Q core k (i). Once L 1 has been fixed, this probability can be made arbitrarily small by setting p small enough. Now we assume that k ≥ 2. We will expand the recursion in Lemma 5.7. Using the same constant c as in Lemma 5.7, definē Now fix k, set A −1 = 1, and define for = 0, 1, . . . , k − 1 With this, the recursion in Lemma 5.7 can be written as where in the second inequality we used that (x + y) m ≤ 2 m−1 (x m + y m ) for all x, y ∈ R and m ∈ N. Iterating the above inequality, we obtain We now claim that A ≤ 4c 2 (3(k − )) 5d(d+2) 2 for all = 0, 1, . . . , k − 1.
We can prove (44) by induction on . Note that A 0 does satisfy the above inequality. Then, using the induction hypothesis and the recursive definition of A in (42), we have Now we use that (x + 1) 5/2 ≤ 6x 3 for all x ≥ 1, which yields establishing (44). Plugging (44) into (43), we obtain Given a value of L 1 , for all small enough p we obtain thatρ is sufficiently small to yield  (19) and (20) that R j ≤ c 1 c j 2 ( j!) d L 1 for positive constants c 1 , c 2 . Therefore, for any k ≥ , we have that where in the last step we use that c Hence, for sufficiently large L 1 we obtain
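The proofs above rely on three elementary inequalities: e^{-a} ≤ 1 − a + a²/2 for a ≥ 0, (x + y)^m ≤ 2^{m−1}(x^m + y^m), and (x + 1)^{5/2} ≤ 6x³ for x ≥ 1. The following is a minimal numerical sanity check on a grid, not a proof. The grid and the tolerances are arbitrary choices made for this illustration. The second inequality is checked for nonnegative x, y, which is the range in which it is applied to probabilities; the last line records that it can fail for a negative argument with odd m.

```python
import math

def check_exp_bound(a):
    # e^{-a} <= 1 - a + a^2/2, claimed for a >= 0 (tiny slack for rounding at a = 0)
    return math.exp(-a) <= 1 - a + a * a / 2 + 1e-15

def check_power_sum(x, y, m):
    # (x + y)^m <= 2^(m-1) (x^m + y^m); follows from convexity of t -> t^m on [0, inf)
    return (x + y) ** m <= 2 ** (m - 1) * (x ** m + y ** m) * (1 + 1e-12)

def check_five_halves(x):
    # (x + 1)^(5/2) <= 6 x^3, claimed for x >= 1
    return (x + 1) ** 2.5 <= 6 * x ** 3

grid = [i / 10 for i in range(0, 201)]           # 0.0, 0.1, ..., 20.0
all_exp = all(check_exp_bound(a) for a in grid)
all_pow = all(check_power_sum(x, y, m)
              for x in grid for y in grid for m in range(1, 8))
all_52 = all(check_five_halves(x) for x in grid if x >= 1)

# With x = -1, y = 0, m = 3 the second inequality reads -1 <= -4, which is false.
fails_for_negative = not check_power_sum(-1.0, 0.0, 3)
```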

Multiscale paths of infected sets
Let x ∈ Z d be a fixed vertex. We say that = (k 1 , i 1 ), (k 2 , i 2 ), . . . , (k , i ) is a multi-scale path from x if x ∈ Q enc k 1 (i 1 ), and for each j ∈ {2, 3, . . . , } we have Q enc Given such a path, we say that the reach of is given by sup |z − x| : z ∈ (k,i)∈ Q enc k (i) , that is, the distance between x and the furthest away point of . We will only consider paths such that Q enc k j (i j ) ⊂ I k j . Recall the way the sets I κ are constructed from C κ \C bad κ , which is defined in (33). Then for any two (k, i), (k , i ) ∈ with k = k we have Q enc k (i) ∩ Q enc k (i ) = ∅. Therefore, we impose the additional restriction that on any multi-scale path = (k 1 , i 1 ), (k 2 , i 2 ), . . . , (k , i ) we have k j = k j−1 for all j ∈ {2, 3, . . . , }. Now we introduce a subset˜ of as follows. For each k ∈ N and i ∈ Z d , define Let κ 1 > κ 2 > · · · be an ordered list of the scales that appear in cells of . The set˜ will be constructed in steps, one step for each scale. First, add to˜ all cells of of scale κ 1 . Then, for each j ≥ 2, after having decided which cells of of scale at least κ j−1 we add to˜ , we add to˜ all cells (k, i) ∈ of scale k = κ j such that Q The idea behind the definitions above is that we will look at "paths" of multi-scale cells such that two neighboring cells in the path are such that their Q neigh2 regions intersect, and any two cells in the path have disjoint Q neigh regions. The first property limits the number of cells that can be a neighbor of a given cell, allowing us to control the number of such paths, while the second property allows us to argue that the encapsulation procedure behaves more or less independently for different cells of the path. = (k 1 , i 1 ), (k 2 , i 2 ), . . . , (k , i ) be a multi-scale path starting from x. Then, the subset˜ defined above is such that Fig. 9 Illustration of the relations between the variables in the proof of Lemma 5.9. A line

Lemma 5.9 Let
Proof Let ϒ be an arbitrary subset of˜ with ϒ =˜ . The first part of the lemma follows by showing that Define Clearly, ϒ neigh ⊃ ϒ, and since {Q Recall that˜ = ϒ, and since no element of˜ \ϒ was added to ϒ neigh , we have that ϒ neigh = . Using that (k,i)∈ Q enc k (i) is a connected set, we obtain a value Refer to Fig. 9 for a schematic view of the definitions in this proof. Let (k , i ) be the cell of ϒ neigh for which Q enc k (i) intersects Q enc k (i ). Since (k , i ) ∈ ϒ neigh , let (k , i ) be the element of ϒ of largest scale for which Q neigh k then (k , i ) = (k , i ). We obtain that the distance according to | · | between Q neigh k (i ) and Q enc k (i) By the construction of˜ , and the fact that (k , i ) was set as the element of largest scale satisfying Q neigh k In the former case, the distance in (47) is bounded above by 2R outer k −1 , while in the latter case the distance is zero. So we assume that the distance between Q neigh k We obtain that (k , i ) / ∈ ϒ, otherwise it would imply that (k, i) ∈ ϒ neigh violating the definition of (k, i). The distance between Q neigh k Therefore, we have that Q which gives that y ∈ Q neigh2 κ (ι).
Now we define the type of multi-scale paths we will consider.
Definition 5.10 Given x ∈ Z d and m > 0, we say that = (k 1 , i 1 ), (k 2 , i 2 ), . . . , (k , i ) is a well separated path of reach m starting from x if all the following hold: If the origin is separated from infinity by η 2 , then there must exist a multi-scale path for which the union of the Q enc k (i) over the cells (k, i) in the path contains the set occupied by η 2 that separates the origin from infinity. Then Lemma 5.9 gives the existence of a well separated path for which the union of the Q neigh2 k (i) over (k, i) in the path separates the origin from infinity.

Lemma 5.11
Fix any positive constant c. We can set L 1 large enough and then p small enough, both depending only on c, α, d, and υ, so that the following holds. For any integer ≥ 1, any given collection of (not necessarily distinct) integer numbers k 1 , k 2 , . . . , k , and any vertex x ∈ Z d , we have P ∃ a well separated path Proof For any j, since the path is infected we have Q enc k j (i j ) ⊂ I k j . This gives that there existsĩ j such that Q core . Also, the number of choices forĩ j is at most some constant c 1 , depending only on d. Since {Q neigh k j (i j )} j=1,..., is a collection of disjoint sets, if we fix the path = (k 1 , i 1 ), (k 2 , i 2 ), . . . , (k , i ), and take the union bound over the choices of i 1 ,ĩ 2 , . . . ,ĩ , we have from Lemma 5.8 that where a can be made as large as we want by properly setting L 1 and p. It remains to bound the number of well separated paths that exist starting from , the number of ways to choose the first cell is at most 12 5 Hence, the number of ways to choose i j+1 given (k j , i j ) and k j+1 is at most Therefore, we have that P ∃ a well separated path = (k 1 , i 1 ), (k 2 , i 2 ), . . . , (k , i ) from x that is infected is at most where the second inequality follows for some c 2 = c 2 (d, α, , υ) by the value of R outer k from (20), and the last inequality follows by setting a large enough and such that a ≥ 2c.
For the lemma below, define the event E κ,r = {there exists a well separated path from the origin that is infected, has only cells of scale smaller than κ, and has reach at least C FPP r }.
Let E ∞,r be the above event without the restriction that all scales must be smaller than κ. Below we restrict to r > 3 just to ensure that log log r > 0. Proof Let A r be the set of vertices of Z d of distance at most C FPP r from the origin. Set δ r = 1 (d+3) log log r and κ = δ r log r . For any large enough a depending on L 1 and p, we have where in the last inequality we use Lemma 5.8. Since a above can be chosen as large as needed (by requiring that L 1 is large enough and p is small enough), we can choose a large enough a so that If the event above does not happen, then Lemma 5.5 gives that Let be a well separated path from the origin, with all cells of scale smaller than κ, and which has reach at least r . Define m k ( ) to be the number of cells of scale k in . Since must contain at least one cell for which its Q neigh2 region is not contained in A r , we have Because of the type of bounds derived in Lemma 5.11, it will be convenient to rewrite the inequality above so that the term κ−1 k=1 2 k appears. Note that using (20) we can set a constant c 0 ≥ 2 such that R outer j ≤ c j 0 ( j!) d+2 L 1 for all j ≥ 1, which gives For any , define ϕ( ) = κ−1 k=1 m k ( )2 k . We can then split the sum over all paths according to the value of ϕ( ) of the path. Using this, Lemma 5.11, and the fact that ϕ( ) ≥ where A m is the number of ways to fix and set k 1 , k 2 , . . . , k such that ϕ( ) = j=1 2 k j = m, and c is the constant in Lemma 5.11. For each choice of , k 1 , k 2 , . . . , k , we can define a string from {0, 1} m by taking 2 k 1 consecutive 0s, 2 k 2 consecutive 1s, 2 k 3 consecutive 0s, and so on and so forth. Note that each string is mapped to at most one choice of , k 1 , k 2 , . . . , k . Therefore, A m ≤ 2 m , the number of strings in {0, 1} m . The proof is completed since c can be made arbitrarily large by setting L 1 large enough and then p small enough, and 5C FPP r 12c L 1 .
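The counting step at the end of the proof above bounds A m by 2^m by encoding each choice of scales k 1 , k 2 , . . . as a binary string with alternating runs of lengths 2^{k 1}, 2^{k 2}, . . . . The sketch below enumerates these choices for small m and checks that the encoding is injective and produces strings of length m. The function names and the assumption that every scale is at least 1 are made for this illustration only.

```python
def sequences_with_weight(m, min_scale=1):
    """All tuples (k_1, ..., k_l) with k_j >= min_scale and sum_j 2^{k_j} = m."""
    if m == 0:
        return [()]
    out = []
    k = min_scale
    while 2 ** k <= m:
        for tail in sequences_with_weight(m - 2 ** k, min_scale):
            out.append((k,) + tail)
        k += 1
    return out

def encode(ks):
    """2^{k_1} zeros, then 2^{k_2} ones, then 2^{k_3} zeros, and so on."""
    return "".join(str(j % 2) * (2 ** k) for j, k in enumerate(ks))

counts = {}
injective = True
for m in range(1, 17):
    seqs = sequences_with_weight(m)
    codes = {encode(ks) for ks in seqs}
    # Distinct sequences must give distinct strings, each of length exactly m.
    injective = injective and len(codes) == len(seqs) and all(len(c) == m for c in codes)
    counts[m] = len(seqs)
```

Since consecutive runs alternate between 0 and 1, the run lengths of a string determine the sequence of scales, which is why the map is injective and the number of sequences of weight m is at most 2^m.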

Completing the proof of Theorem 5.1
Proof of Theorem 5.1 We start by showing that η 1 grows indefinitely with positive probability. Let e 1 = (1, 0, 0, . . . , 0) ∈ Z d . Any set of vertices that separates the origin from infinity must contain a vertex of the form be 1 for some nonnegative integer b. For any b and t ≥ 0, let f b (t) = P(η 2 (t) contains be 1 and separates the origin from infinity).
For the moment, we assume that b is larger than some fixed, large enough value b 0 . Recall that B (r ) ⊆ [−C FPP r, C FPP r ] d , which gives that be 1 does not contain the origin. Hence, in order for η 2 (t) to contain be 1 and separate the origin from infinity, η 2 (t) must contain at least one vertex of distance (according to the norm | · |) greater than b/(2C FPP ) from be 1 . When η 2 (t) separates the origin from infinity, it must contain a set of sites that form a connected component according to the ∞ norm, which itself separates the origin from infinity and contains a vertex of distance (now according to the norm | · |) greater than b/(2C FPP ) from be 1 . This connected component implies the occurrence of the event in Proposition 5.12, hence Note that, as needed, the bound above does not depend on t; this will allow us to derive a bound for the survival of η 1 that is uniformly bounded away from 0 as t grows to infinity. Note also that the constant c from Proposition 5.12 can be made arbitrarily large by setting L 1 and p properly. Therefore, ∑ b≥b 0 f b (t) can be made smaller than 1, and in fact goes to zero with b 0 . Regarding the case b ≤ b 0 , for each k ≥ 1 let K k be the set of (k, i) such that R outer Note that there exists a constant c b depending on b 0 such that the cardinality of K k is at most c b for all k. Then, using Lemma 5.8, we have P ∃k : which can be made arbitrarily small since a can be made large enough by choosing L 1 large and p small. This concludes this part of the proof, since Now we turn to the proof of positive speed of growth for η 1 . Note that η 1 ∪η 2 is stochastically dominated by a first passage percolation process where the passage times are at least i.i.d. exponential random variables of rate 2, because η 2 is slower than a first passage percolation of exponential passage times of rate 1.
Then, by the shape theorem we have that there exists a constant c > 0 large enough such that Now fix any t, take c as above, and set κ = 1 + log t (log log t) 2 . For any large enough a depending on L 1 and p, we have where in the second inequality we use Lemma 5.8, and the third inequality follows because a can be chosen large enough in Lemma 5.8. The above derivation allows us to restrict to cells of scale smaller than κ. Note that since there are no contagious sets of scale κ or larger intersecting [−ct, ct] d , the spread of η 1 (t) inside [−ct, ct] d stochastically dominates a first passage percolation process of rate λ 1 κ . Thus, disregarding regions taken by η 2 , we can set a sufficiently small constant c > 0 so that, at time t,η 1 will contain a ball of radius 2c t around the origin with probability at least 1 − exp −c t d+1 2d+4 for some constant c , by Proposition 3.1. The only caveat is that, at time t, there may be regions of scale smaller than κ that are taken by η 2 and intersect the boundary of B 2c t . If we show that such regions cannot intersect ∂ i B c t , then we have that the probability that η 1 survives up to time t butη 1 (t) does not contain a ball of radius c t around the origin is at most 1 − 2 exp −a2 κ−1 − exp −c t d+1 2d+4 . This is indeed the case, since we can take a constant c such that any cell of scale smaller than κ has diameter at most where the inequalities above hold for all large enough t, completing the proof.

From MDLA to FPPHE
Here we show how to use the proof scheme for FPPHE from Sect. 5 to establish Theorem 1.1. The relation between FPPHE and MDLA is very delicate, and we will need to introduce another process, which we call the h-process. For the sake of clarity, this section is split into a few subsections.

Dual representation and Poisson clocks
We start by recalling the dual representation of the exclusion process. In this dual representation, vertices without particles are regarded as hosting another type of particle, called holes, while vertices hosting an original particle are seen as unoccupied. Using the terminology of the dual representation, in MDLA, holes perform a simple exclusion process among themselves, where they move as simple symmetric random walks obeying the exclusion rule (jumps to vertices already occupied by a hole or by the aggregate are suppressed). Then the growth of the aggregate is equivalent to a first passage percolation process which expands along its boundary edges at rate 1, but with the additional feature that the aggregate does not occupy vertices that are occupied by holes.
To be more precise, we now define MDLA in terms of Poisson clocks. A Poisson clock of rate ν is a clock that rings infinitely many times, and such that the time until the first ring, as well as the time between any two consecutive rings, are given by independent exponential random variables of rate ν. Even though edges of Z d have so far always been considered as undirected, we will need to assign an independent Poisson clock of rate 1 to each oriented edge (x → y). Then the evolution of MDLA is as follows. When the clock of (x → y) rings, if x is occupied by a hole and y is unoccupied, the hole jumps from x to y. If x is occupied by the aggregate and y is unoccupied, then the aggregate occupies y. In any other case, nothing is done. Henceforth, the Poisson clocks used to construct MDLA will be referred to as the MDLA-clocks.
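The Poisson-clock construction above can be sketched in a few lines. The following is a minimal illustration under assumptions that are not in the text: the lattice is truncated to a finite n × n box in dimension 2 (so boundary vertices have fewer incident edges), the function name and parameters are hypothetical, and the simulation stops at a fixed time. It uses the standard fact that the superposition of N independent rate-1 Poisson clocks rings at rate N, with the ringing clock uniform among the N.

```python
import random

def simulate_mdla(n=11, mu=0.9, t_max=5.0, seed=0):
    """MDLA in the dual (hole) representation on an n x n box, run up to time t_max."""
    rng = random.Random(seed)
    origin = (n // 2, n // 2)
    # 'H' = hole (no particle), '.' = unoccupied in the dual sense (a particle is there),
    # 'A' = aggregate. Particles have density mu, so holes have density 1 - mu.
    state = {(i, j): ('.' if rng.random() < mu else 'H')
             for i in range(n) for j in range(n)}
    state[origin] = 'A'
    holes_at_start = sum(1 for s in state.values() if s == 'H')
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
    # One rate-1 clock per oriented edge (x -> y) inside the box.
    edges = [((i, j), (i + di, j + dj))
             for i in range(n) for j in range(n) for di, dj in steps
             if 0 <= i + di < n and 0 <= j + dj < n]
    t = 0.0
    while True:
        t += rng.expovariate(len(edges))   # waiting time until the next ring
        if t > t_max:
            break
        x, y = rng.choice(edges)           # the edge whose clock rang
        if state[y] != '.':
            continue                       # target occupied: nothing is done
        if state[x] == 'H':
            state[x], state[y] = '.', 'H'  # the hole jumps from x to y
        elif state[x] == 'A':
            state[y] = 'A'                 # the aggregate occupies y
    return state, holes_at_start

final_state, holes_at_start = simulate_mdla()
```

In this representation holes are never created or destroyed, so the number of holes at the end equals the number at the start, while the aggregate can only grow.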

MDLA with discovery of holes
We give a different representation of MDLA, which we refer to as MDLA with discovery of holes. Each vertex of Z d will either be occupied by the aggregate, be occupied by a hole, or be unoccupied. As before, the aggregate starts from the origin. However, unlike before, each vertex of Z d \{0} is initially unoccupied, and is assigned a non-negative integer value, which is given by an independent random variable having value i ≥ 0 with probability (1 − μ) i μ. This value represents the number of holes that can be born at that vertex.
More precisely, when the MDLA-clock of an edge (x → y) rings, a few things may happen.
• If x hosts a hole and y is unoccupied, the hole jumps from x to y.
• If x belongs to the aggregate, y is unoccupied and the value of y is 0, then the aggregate occupies y.
• If x belongs to the aggregate, y is unoccupied and the value of y is i ≥ 1, then the value of y is changed to i − 1, a hole is born at y (so y becomes occupied), and the aggregate does not occupy y.
• In any other case, nothing happens.
Note that holes move independently of the values of the vertices, and perform continuous time, simple symmetric random walks (jumping at the time of the MDLA-clocks) obeying the exclusion rule; that is, whenever a hole attempts to jump onto a vertex already occupied by a hole or by the aggregate, the jump is suppressed. Note that this process is equivalent to the description of MDLA with the dual representation of the exclusion process. The only difference is that, instead of placing all holes at time 0, holes are added one by one as the process evolves. More precisely, holes are born as the aggregate tries to occupy unoccupied vertices of value at least 1.
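The update rule of MDLA with discovery of holes can be written as a single function applied at each clock ring. The sketch below is illustrative only: the names `sample_value` and `ring`, the dictionaries `occ` and `value`, and the string labels are hypothetical, and the function does not check that x and y are adjacent.

```python
import random

def sample_value(mu, rng):
    """Number of holes that can be born at a vertex: P(value = i) = (1 - mu)^i * mu."""
    i = 0
    while rng.random() >= mu:   # happens with probability 1 - mu (requires mu > 0)
        i += 1
    return i

def ring(x, y, occ, value):
    """Apply one ring of the MDLA-clock of the oriented edge (x -> y)."""
    if occ.get(y) is not None:          # y hosts a hole or the aggregate
        return "nothing"
    if occ.get(x) == "hole":
        occ[x], occ[y] = None, "hole"   # the hole jumps from x to y
        return "jump"
    if occ.get(x) == "aggregate":
        if value[y] == 0:
            occ[y] = "aggregate"        # the aggregate occupies y
            return "grow"
        value[y] -= 1                   # one fewer hole can be born at y
        occ[y] = "hole"                 # a hole is born; the aggregate does not occupy y
        return "birth"
    return "nothing"

# Small usage example on abstract vertex labels 0, 1, 2, 3.
occ = {0: "aggregate"}
value = {1: 2, 2: 0, 3: 0}
history = [ring(0, 1, occ, value),   # value of vertex 1 is 2: a hole is born there
           ring(0, 1, occ, value),   # vertex 1 now hosts a hole: nothing
           ring(1, 2, occ, value),   # the hole jumps from 1 to 2
           ring(0, 1, occ, value),   # value of vertex 1 is 1: a second hole is born
           ring(1, 3, occ, value),   # that hole jumps from 1 to 3
           ring(0, 1, occ, value)]   # value of vertex 1 is 0: the aggregate grows
```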

Backtracking jumps and overall strategy
The main idea we will use to compare MDLA and FPPHE is the following. Regardless of the location of a hole, if the hole jumps from a vertex x to a vertex y, with positive probability the MDLA-clock of (y → x) rings before the other 2(2d − 1) MDLA-clocks involving edges of the form (· → x) or (y → ·). This causes the hole to jump back to x before the hole can jump to any other vertex adjacent to y or before the aggregate or another hole can occupy x. We call this a backtracking jump. This type of jump intuitively gives that the rate at which a hole leaves a set of vertices is strictly smaller than 1. In other words, holes are slower than the aggregate. A natural approach is to set λ < 1 in FPPHE to represent the rate at which holes move (taking into consideration backtracking jumps), and then couple MDLA and FPPHE so that the following properties hold.
1. The seeds of η 2 are the vertices of value at least 1 in MDLA.
2. The aggregate contains η 1 at all times.
3. For all t ≥ 0, the holes that have been discovered by time t are contained inside η 2 (t).
Although the above idea is relatively simple, the following delicate issue prevents it from being made into a rigorous argument. Suppose that the above three properties hold up to time t, and assume that at time t the aggregate of MDLA contains vertices that do not belong to η 1 (t). Hence, at a later time the aggregate may discover a hole at a vertex x which is at the boundary of the aggregate but is not at the boundary of η 1 . In other words, the aggregate may discover a hole at a vertex x whose seed cannot be activated since x is not yet reachable by η 1 . At this time, property 3 would cease to hold.
To get around the above issue, we will employ a coupling argument to show that MDLA stochastically dominates FPPHE locally. In particular, we will use that coupling to show that the encapsulation procedure used for FPPHE (via Proposition 4.2) works as well for MDLA. Then, the multi-scale machinery developed in Sect. 5 can be used to obtain that each cluster of holes gets encapsulated by the aggregate at some finite (possibly large) scale. For this, we will use the fact that the encapsulation procedure we did for FPPHE in Proposition 4.2 is implied by the occurrence of a monotone event F, which is increasing with respect to the passage times of type 2 and decreasing with respect to the passage times of type 1.

Coupling of the initial configuration
We now formalize the coupling of the initial configurations of MDLA and FPPHE, as suggested in the previous section. For each vertex x ∈ Z d \{0}, note that the probability that x is assigned a value at least 1 in MDLA with discovery of holes is $\sum_{i \ge 1} (1-\mu)^i \mu = 1 - \mu$. Then, we set p = 1 − μ so that we can couple the vertices with value at least 1 with the location of the type-2 seeds of FPPHE. From now on, we will refer to each vertex of value at least 1 as a seed, regardless of whether we are talking about MDLA or FPPHE.

The h-process
We will not actually couple MDLA with FPPHE, but we will couple MDLA with another process {h t } t , which will be a growing subset of Z d . We call this process the h-process. The h-process will be constructed using the MDLA-clocks and the seeds, where the seeds have been coupled with MDLA as described in Sect. 6.4.
When a vertex x belongs to h t , we will say that x is infected. To avoid confusion, we will not say that x is occupied by h t since, as we explain later, a vertex that is occupied by the aggregate can also be infected. Our goal with the h-process is to obtain that the holes that have already been discovered at time t are contained in h t ∪ ∂ o h t , and the ones in ∂ o h t are the holes that will jump back to h t (in a backtracking jump).
At time 0 we set h 0 = ∅, and let the aggregate spread according to the MDLA-clocks, using the representation with discovery of holes. The h-process will evolve according to three operations: birth, expansion and halting upon encapsulation.
• Birth. If at time t a hole is discovered by the aggregate inside a cluster C ⊂ Z d of seeds,² then we infect C; that is, we add to h t the whole cluster C of seeds.
• Expansion. For each unoriented edge (x, y), we will define a passage time τ x,y (which we will specify later on and will depend on the evolution of MDLA). So if x gets infected at time t, then x infects y at time t + τ x,y ; note that y could get infected before t + τ x,y if a neighbor of y different from x infects y.
• Halting upon encapsulation. The h-process is allowed to infect vertices that are occupied by the aggregate. However, if at some moment a cluster C of h t is separated from infinity by the aggregate, which means that any path from C to infinity intersects A t , then h t will not infect any vertex of ∂ o C that already belongs to the aggregate.³ This is to guarantee that a cluster of the h-process is confined to a finite set when it gets encapsulated by the aggregate.
Now we introduce some notation. For each vertex x, let E x be the set of (unoriented) edges incident to x, and let E → x (resp., E ← x ) be the set of oriented edges going out of (resp., coming into) x. For each edge (x → y), let E x→y = E ← x ∪ E → y , which includes (y → x) but not (x → y). Let M = 4d − 1 and note that, given an edge (x → y), we have that M = |E x→y |.
Later, 1/M will be a lower bound on the probability that a hole performs a backtracking jump.
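The value 1/M can be illustrated directly. After a hole jumps from x to y, the clocks that matter are those of the oriented edges of the form (· → x) or (y → ·); there are 4d − 1 of them, and since they are independent with the same rate, each is equally likely to ring first. The sketch below enumerates these edges and estimates by simulation how often (y → x) rings first. The function names are hypothetical, and the Monte Carlo estimate is only a sanity check of the exact value 1/(4d − 1).

```python
import random

def unit_vectors(d):
    vecs = []
    for axis in range(d):
        for sign in (1, -1):
            v = [0] * d
            v[axis] = sign
            vecs.append(tuple(v))
    return vecs

def add(a, b):
    return tuple(p + q for p, q in zip(a, b))

def competing_edges(x, y):
    """Oriented edges of the form (. -> x) or (y -> .) for neighbouring x, y."""
    d = len(x)
    into_x = {(add(x, e), x) for e in unit_vectors(d)}
    out_of_y = {(y, add(y, e)) for e in unit_vectors(d)}
    return into_x | out_of_y          # (y -> x) lies in both sets and is counted once

def backtrack_frequency(d, trials, seed=0):
    """Fraction of trials in which the clock of (y -> x) rings first."""
    rng = random.Random(seed)
    x = (0,) * d
    y = (1,) + (0,) * (d - 1)
    edges = sorted(competing_edges(x, y))
    hits = 0
    for _ in range(trials):
        # Draw one independent rate-1 exponential time per edge; the smallest rings first.
        first = min(edges, key=lambda e: rng.expovariate(1.0))
        hits += first == (y, x)
    return hits / trials

sizes = {d: len(competing_edges((0,) * d, (1,) + (0,) * (d - 1))) for d in range(1, 5)}
freq_2d = backtrack_frequency(2, 20000)   # should be close to 1/7
```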
We will use the convention that if we write (x → y) ∈ ∂ e h t , we mean that x ∈ h t and y ∉ h t . We will update a set of edges H(t) as the h-process evolves, starting from H(0) = ∅. Moreover, for each (x → y) ∈ H(t), we will associate an independent Bernoulli random variable B x→y of parameter 1/M. If needed, the random variable B x→y will be redrawn independently during the evolution of the h-process. Once H(t) is specified, we define
³ If the aggregate decides to occupy a site x at the same time as the h-process decides to infect x, then we assume that the aggregate occupies x immediately before the h-process infects x. This is just a convenience to take care of the following situation. Assume that x is a neighbor of a cluster of infected sites which is disconnected from infinity by the aggregate but x itself is not disconnected from infinity by the aggregate. This implies that the aggregate occupies the neighbors of x that belong to the infected cluster. If at this time the aggregate and the h-process both decide to occupy x, we obtain that the aggregate does so first, and then the h-process does not infect x due to the halting upon encapsulation operation.

Evolution of the h-process
Here we will describe how the h-process uses the MDLA-clocks to evolve. Our description here will be precise, but will not go into the details needed to define the passage times of the h-process. This will be carried out in Sect. 6.7. Start from time 0, where we have h 0 = ∅ and H(0) = ∅. From this time, we let MDLA evolve using its MDLA-clocks. If at some time t MDLA tries to occupy a seed (that is, MDLA discovers a hole) inside some cluster C ⊂ Z d of seeds, the h-process undergoes a birth operation and we set h t = C. At this moment, we continue to let MDLA evolve using its MDLA-clocks. If new holes are discovered, new births take place and clusters are added to the h-process (that is, new clusters are infected).
We will now discuss all possibilities that could happen for the expansion of the h-process. In all cases below, we will assume that the expansion of the h-cluster is happening in an infected cluster that is not disconnected from infinity by the aggregate. Otherwise, the halting upon encapsulation would already have happened to that cluster, which would prevent it from expanding.
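Whether an infected cluster is disconnected from infinity by the aggregate is a reachability question: the cluster is encapsulated when every path from it to infinity meets the aggregate. On a computer this can only be tested inside a finite window. The sketch below is a minimal breadth-first search in dimension 2, under the assumption (made for this illustration, not in the text) that reaching the boundary of the window [−w, w]² counts as reaching infinity; the function name and arguments are hypothetical.

```python
from collections import deque

def is_encapsulated(cluster, aggregate, window):
    """True if no nearest-neighbour path from `cluster` reaches the boundary of
    [-window, window]^2 without visiting a vertex of `aggregate`."""
    # Cluster vertices that are themselves in the aggregate cannot start such a path.
    start = [v for v in cluster if v not in aggregate]
    seen = set(start)
    queue = deque(start)
    while queue:
        i, j = queue.popleft()
        if max(abs(i), abs(j)) >= window:
            return False              # escaped to the window boundary ("infinity")
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            w = (i + di, j + dj)
            if w not in seen and w not in aggregate:
                seen.add(w)
                queue.append(w)
    return True

# A closed ring of aggregate vertices at sup-norm distance 2 encapsulates the origin;
# removing one vertex of the ring opens an escape route.
ring_vertices = {(i, j) for i in range(-2, 3) for j in range(-2, 3)
                 if max(abs(i), abs(j)) == 2}
closed = is_encapsulated({(0, 0)}, ring_vertices, window=5)
opened = is_encapsulated({(0, 0)}, ring_vertices - {(2, 0)}, window=5)
```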
The h-process only expands when an edge at the boundary of the h-process or an edge from H B rings. (The edges in H B are needed to verify backtracking jumps, and "B" in the subscript actually stands for backtracking.) If an edge that is internal to the h-process (that is, both of its endpoints are already infected) rings, then the h-process does not change, even if that causes new holes to be discovered. Similarly, if an edge that is external to the h-process (that is, both endpoints are not infected) and does not belong to H B rings, and no new hole gets discovered by this operation, then the h-process does not change.
Now we assume that an edge (x → y) from the boundary of the h-process or from H B rings at time t, and discuss what occurs with the h-process at this time. We split our discussion into three cases, and at the end explain two particular situations. Let s < t be the last time before t that the h-process changed.

Case 1: A hole jumps out of the h-process
This corresponds to (x → y) ∈ ∂_o h_s with a hole at x ∈ h_s and y ∉ h_s unoccupied at time t−. Thus, the ring of (x → y) causes the hole to jump from x to y.
It could be the case that there is already an edge (y′ → y) ∈ H(t) with y′ ≠ x. If this is the case, we simply do nothing; this case is discussed further in Sect. 6.6.5. Otherwise, if there is no such edge, we add (x → y) to H(t) to verify whether the hole will do a backtracking jump to x. Moreover, we draw (independently of any previous values that this random variable could have assumed) the Bernoulli random variable B_{x→y} of parameter 1/M. If B_{x→y} = 0, which means that the hole will not backtrack to x, then we infect y at time t.
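The bookkeeping of Case 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name hole_jumps_out, the representation of H as a set of ordered pairs, and the dictionary B are hypothetical choices made for this sketch, and the MDLA dynamics themselves (the actual movement of the hole) are not modeled.

```python
import random

def hole_jumps_out(x, y, H, B, infected, M, rng):
    """Case 1 update (sketch): a hole jumps from infected x to uninfected y.

    H        -- set of oriented edges (tail, head) awaiting backtracking verification
    B        -- dict mapping an edge of H to its Bernoulli(1/M) variable
    infected -- set of vertices currently in the h-process
    All container choices are assumptions of this sketch.
    """
    # If some edge (y' -> y) with y' != x is already pending, do nothing.
    if any(head == y and tail != x for (tail, head) in H):
        return
    # Otherwise record the edge and draw B afresh, independently of past values.
    H.add((x, y))
    B[(x, y)] = 1 if rng.random() < 1.0 / M else 0
    # B = 0 means the hole will not backtrack to x, so y is infected now.
    if B[(x, y)] == 0:
        infected.add(y)
```

After a call, the edge (x, y) is pending in H unless another edge already points to y, and y is infected exactly when the freshly drawn variable equals 0.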

Case 2: Verification of backtracking jumps
This corresponds to (x → y) ∈ H_B(s). Let (u → v) be the edge from H(s) such that (x → y) ∈ E_{u→v} ⊂ H_B(s), and assume that (u → v) was added to H at time t′ ≤ s. Note that, from Sect. 6.6.1, this was done because a hole jumped from u to v at time t′. Then, while no clock from E_{u→v} rings, u will remain unoccupied and the hole will remain at v.
When the clock of an edge (x → y) from E_{u→v} rings at time t, the first thing we do is to remove (u → v) from H(t). We will say that the possibility of a backtracking jump through (u → v) has been verified. We couple the value of B_{u→v} with the MDLA-clocks so that if B_{u→v} = 1, then the first clock to ring among the ones from E_{u→v} is (v → u). If this happens, then (x → y) = (v → u) so, at time t, the hole backtracks to u. Note that both u and v are already infected. In this case, nothing else needs to be done.
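The coupling above is consistent because, among M independent clocks of equal rate, each one is the first to ring with probability 1/M, which is the parameter of B_{u→v}. The following Monte Carlo sketch checks this numerically; it assumes rate-1 clocks, labels the clocks 0, ..., M−1 with index 0 standing in for (v → u), and uses hypothetical function names.

```python
import random

def backtrack_rings_first(M, rng):
    """Ring M independent rate-1 clocks once; True if clock 0 rang first."""
    times = [rng.expovariate(1.0) for _ in range(M)]
    return min(range(M), key=times.__getitem__) == 0

def estimate_backtrack_probability(M, trials=100_000, seed=0):
    """Empirical frequency with which clock 0 is the first of the M clocks."""
    rng = random.Random(seed)
    hits = sum(backtrack_rings_first(M, rng) for _ in range(trials))
    return hits / trials
```

For any fixed M the estimate should be close to 1/M, so declaring B_{u→v} = 1 exactly on the event that (v → u) rings first gives a Bernoulli variable of the required parameter.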
On the other hand, if B_{u→v} = 0, the backtracking will not happen, and (x → y) is an edge of E_{u→v} other than (v → u). Note also that v is already infected. If (x → y) ∈ E^←_u, then u could get occupied by the aggregate or by another hole, which could prevent the hole at v from jumping back to u when the clock (v → u) rings. In this case, we do not need to do anything else.
Finally, if (x → y) ∈ E^→_v, then the hole at v = x may jump to y, if y is unoccupied. If, in addition, we have that y ∉ h_s, then the hole jumped out of the h-process, so we apply the steps described in Sect. 6.6.1 to (x → y) so that we can later verify the possibility of a backtracking jump through (x → y). In particular, we add (x → y) to H(t), sample B_{x→y}, and infect y if B_{x→y} = 0.
The purpose of the set H is to keep track of the edges over which a backtracking jump can happen. It will hold that at all times s′, H(s′) is a set of disjoint edges (with no endpoint in common) such that for any (w → z) ∈ H(s′), w ∈ h_{s′} is unoccupied in MDLA. (53)
This is quite straightforward from our description above, but we will actually prove it in Lemma 6.1, after we describe precisely how the passage times are constructed. We remark that in (53) we do not require z to host a hole. This is because of a corner case that we need to handle carefully, and which we will explain in Sect. 6.6.4.

Case 3: Expansion without jump of holes
Here we assume that (x → y) ∈ ∂_o h_t \ H_B(s) and that one of the following conditions holds.
• x ∈ h_s is unoccupied at time t−.
• x is occupied by the aggregate at time t−.
• x is occupied by a hole, but y is occupied by either a hole or the aggregate at time t− (preventing the hole from jumping from x to y).
(The case of x being occupied by a hole and y unoccupied is covered by Sect. 6.6.1.) The above three situations cause little trouble since none of them causes a hole to jump, so we simply choose to infect y with probability (M−1)/M < 1; otherwise we do nothing. This choice guarantees that the passage times τ_{·,·} stochastically dominate (but are not equal to) exponential random variables of rate 1 in this case.
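To see why infecting with probability (M−1)/M produces a passage time that dominates an exponential of rate 1 without being equal to it, one can compute the law of the waiting time directly. The following is a sketch under the assumption that the clock of (x → y) keeps ringing at rate 1 and that each ring independently leads to infection with probability q = (M−1)/M; the symbols q, L, X_i and S are introduced here for illustration only.

```latex
% L is geometric on {1,2,...} with success probability q,
% X_1, X_2, ... are i.i.d. exponential of rate 1, independent of L,
% and S = X_1 + ... + X_L is the time until y is infected.
\[
\mathbb{E}\bigl[e^{-sS}\bigr]
  = \sum_{k\ge 1} q(1-q)^{k-1}\Bigl(\frac{1}{1+s}\Bigr)^{k}
  = \frac{q}{1+s}\cdot\frac{1}{1-\frac{1-q}{1+s}}
  = \frac{q}{q+s}, \qquad s \ge 0 .
\]
% Hence S is exponential of rate q, and since q < 1,
\[
\mathbb{P}(S > t) = e^{-qt} > e^{-t} \qquad \text{for all } t > 0 .
\]
```

This is the content of Lemma B.1 specialized to q = (M−1)/M.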

B
We need to give special attention to the set H*_B. Suppose now that we have carried the above process up to a time t, when it happens that H*_B(t) ≠ ∅, where H*_B(t) is the set of edges (x → y) for which there exist a neighbor x′ of x and a neighbor y′ of y such that (x′ → x), (y → y′) ∈ H(t), and hence y ∈ h_t and (x → y) ∈ E_{x′→x} ∩ E_{y→y′} ⊂ H_B(t). (54)
Note that if t is the first time that H*_B(t) ≠ ∅, then both x and y′ host a hole. The above gives that (x → y) is involved in the backtracking jumps of both (x′ → x) and (y → y′), which is in conflict with the fact that B_{x′→x} and B_{y→y′} are independent.
To solve this, for each (x → y) ∈ H*_B(t), we will consider two Poisson clocks: the actual MDLA-clock, which will be associated to the backtracking jump of (y → y′), and a fake-clock, which will be associated to the backtracking jump of (x′ → x). So, if B_{x′→x} = 1, it means that the clock of (x → x′) will ring before the MDLA-clocks of E_{x′→x} \ (x → y) and before the fake-clock of (x → y). Similarly, if B_{y→y′} = 1, it means that the clock of (y′ → y) will ring before the other MDLA-clocks of E_{y→y′}. The evolution of MDLA simply ignores the fake-clocks. Since the fake-clocks and the MDLA-clocks are independent, there is no conflict with the independence of B_{x′→x} and B_{y→y′}. Now we explain why this does not create other problems.
If the MDLA-clock of (x → y) rings, we say that a clock from E_{y→y′} rings, whereas when the fake-clock of (x → y) rings, we say that a clock from E_{x′→x} rings. Assume that the first clock to ring among the MDLA-clocks and fake-clocks of E_{x→y} is the MDLA-clock of (x → y). Let s > t be the time at which that clock rings, and assume that this is the first clock to ring among the clocks of E_{x′→x} and E_{y→y′}. Note that in this case we have B_{y→y′} = 0. Then, the hole that is in x jumps to y, and we perform the steps described in Sect. 6.6.2 for the backtracking jump of (y → y′) when B_{y→y′} = 0. No action is taken with regard to the backtracking jump of (x′ → x). In particular, we have that (y → y′) ∉ H(s) and, more crucially, we have that (x′ → x) ∈ H(s) even if there is no hole at x. (This is the reason why in (53) we have not required y to host a hole.) The fact that (x′ → x) remained in H(s) will not cause problems because the hole that was in x jumped inside h_s (because y ∈ h_t ⊆ h_s). So, in some sense, that hole did backtrack to the h-process. We will later still process the backtracking jump of (x′ → x) even if there may not be a hole at x (which just means that no hole will jump, but the h-process may still be updated according to the decision of a backtracking jump). For example, if B_{x′→x} = 1, we will assume that there is a backtracking jump over (x′ → x) causing x not to be added to the h-process, which remains true even if x does not host a hole.
The fake-clock of (x → y) will exist while (x → y) ∈ H*_B, that is, while both (x′ → x) and (y → y′) belong to H. When this ceases to be true, the fake-clock of (x → y) is simply deleted and we will only keep track of its MDLA-clock. In the following, the term "the clocks of E_{x→y}" means the MDLA-clocks and the fake-clocks associated to the edges of E_{x→y} when considering the backtracking jump of (x → y). (55)

Holes revisiting uninfected vertices
Consider the setting in the previous section, where (x′ → x) ∈ H(s) but there is no hole at x. Note that if B_{x′→x} = 1 (so that x ∉ h_s), before the backtracking jump of (x′ → x) is processed (that is, before the MDLA-clocks of E_{x′→x} \ (x → y) and the fake-clock of (x → y) ring), it could happen that a hole jumps from a vertex x″ to x. Note that x″ ≠ x′, because x′ remained unoccupied from the time the hole jumped from x′ to x. Since (x″ → x) ∉ E_{x′→x}, this jump does not affect the backtracking jump of (x′ → x). But since x is not infected, the hole just jumped out of the h-process. Suppose that this happens at some time s′. This is the situation explained in Sect. 6.6.1, when nothing needs to be done to the h-process. The reason is that this hole just occupied the place of the yet to be verified backtracking jump of (x′ → x). Thus we only need to wait for that backtracking jump to be processed.
Note that it can also happen that much later x is still not infected and a hole jumps from x′ to x again. But, as we explained above, this can only happen after the initial backtracking jump of (x′ → x) has been processed. Since we had that B_{x′→x} = 1 (so that x remained uninfected), the second jump of a hole from x′ to x can only happen after the clock of the edge (x → x′) rings (because this happens before any edge from E^←_{x′} rings, so x′ remained unoccupied). When the edge (x → x′) rings, the memoryless property of exponential random variables guarantees that the rings of the clocks in E_{x′→x} are from this moment independent of the past. Note that at this time the value of B_{x′→x} is also redrawn independently, so this new hole jumping to x will have no correlation with the previous backtracking jump through (x′ → x).

Construction of the passage times of the h-process
In the previous section we described how the h-process uses the MDLA-clocks to evolve. Here we will use the discussion from the previous section to describe the construction of the passage times of the h-process.
For each (unoriented) edge (x, y), we will construct an ordered list x,y = x,y , (2) x,y , . . . , x,y will be independent random variables, and the number of elements κ x,y in x,y will also be a random variable. Recalling that τ x,y is the passage time of the h-process through the edge (x, y), we will construct the h-process so that Suppose the h-process has been constructed up to time t, and assume that the set of holes discovered up to time t is contained in Assume also that for any y ∈ ∂ o h t that hosts a hole, there is a unique x such that (x → y) ∈ H t . Moreover, in the past a hole jumped to y from x, and the possibility of a backtracking jump of that hole has not yet been verified.
The last property above means that if the hole jumped from x to y at some time s ≤ t, then during (s, t] the clocks of E_{x→y} have not rung (but it could be the case that a fake-clock from E_{x→y} has rung, see the discussions in Sects. 6.6.4 and 6.6.5). We will use the convention that ∂_o h_t gives the outer boundary of h_t that can be infected by the h-process. (Recall that a vertex at the boundary of h_t that is occupied by the aggregate does not get infected if the corresponding cluster of the h-process is separated from infinity by the aggregate of MDLA.) We let MDLA evolve according to its clocks until the first time t + W at which either of the following two events happens:
1. A birth operation takes place (see the definition in Sect. 6.5). Note that in this case new vertices are added to the h-process, so h_{t+W} ≠ h_t.
2. A clock (regardless of whether it is an MDLA-clock or a fake-clock) from ∂_e h_t ∪ H_B(t) rings. In this case, we will not yet observe which of these clocks rang. We will call this case a potential expansion operation.

Birth operation or addition of vertices to the h-process
Suppose that at time t +W an (unoriented) edge (x, y) becomes part of ∂ e h t+W ; assume without loss of generality that this is because x got infected at time t + W . Then we create a variable x,y whose initial value is 0. The value of x,y will be updated and will be used to add elements to the list x,y . Then we perform the following step: (59)

Potential expansion operation of the h-process
Suppose that at time t + W an edge from ∂_e h_t ∪ H_B(t) rings. The h-process will expand according to the description in Sect. 6.6, but we elaborate a little more here. We do not immediately observe which edge rings; this edge is still random and could also correspond to a fake-clock from an edge of H*_B(t). Now we want to sample which clock from ∂_e h_t ∪ H_B(t) rings first. We will denote by (x → y) the random edge whose clock rang.
However, we need to be a bit careful in the sampling of (x → y). First, let (x′ → y′) be an edge chosen uniformly at random from ∂_e h_t ∪ H_B(t), where each edge in H*_B(t) is counted with multiplicity 2 (to account for its fake-clock and MDLA-clock) while the other edges have multiplicity 1. We would like to set (x → y) = (x′ → y′). The problem is that, when a hole jumps over an edge (u → v) at some time s, where v ∉ h_s, we need to decide right away whether that hole will backtrack to u later, and this affects which edge from E_{u→v} rings first. This decision is a function of the Bernoulli variable B_{u→v}. As explained in Sect. 6.6.2, if B_{u→v} = 1, we know that the next clock to ring will be that of (v → u), whereas if B_{u→v} = 0 the next clock to ring is chosen uniformly at random from the clocks of E_{u→v} \ (v → u).
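The first, uniform step of this sampling can be sketched as follows. The sketch assumes that all clocks are independent with rate 1, so that the first clock to ring is uniform over the clocks, and it covers only the multiplicity rule; the subsequent correction that depends on the variables B_{u→v} is not modeled. The function name and data representation are hypothetical.

```python
import random

def sample_ringing_clock(candidate_edges, hb_star, rng):
    """Pick uniformly among clocks, counting fake-clocks separately (sketch).

    candidate_edges -- oriented edges in the union of the boundary and H_B
    hb_star         -- set of edges that additionally carry a fake-clock
    Returns (edge, kind) with kind equal to "mdla" or "fake".
    """
    clocks = []
    for edge in sorted(set(candidate_edges)):
        clocks.append((edge, "mdla"))
        if edge in hb_star:
            clocks.append((edge, "fake"))  # such an edge has multiplicity 2
    return rng.choice(clocks)
```

With three candidate edges of which one lies in H*_B, there are four clocks in total, so the edge in H*_B is selected with probability 1/2 and each of the others with probability 1/4.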
We can now state precisely how ( ∈ E x →x ∩ E y →y , then we say that (x → y ) ∈ E x →x only if it was the fake-clock of (x → y ) that rang, and say that (x → y ) ∈ E y →y if it was the MDLA-clock of (x → y ) that rang.
Now we describe the actions we need to do to compute the passage times as the h-process evolves according to the description in Sect. 6.6. If an edge enters ∂ e h t+W due to the ring of (x → y), we perform the steps described in Sect. 6.7.1 for that edge. Moreover, in any case, we After the above, we do the following.
If (x, y) ∈ ∂ e h t , add x,y as a new element to the list x,y , and reset x,y to 0.
Note that if we always had (x′ → y′) = (x → y), then the variables x,y would be i.i.d. exponential random variables of rate 1, since they are constructed exactly as described in Lemma B.3. However, we will have cases when (x′ → y′) ≠ (x → y), which will still give rise to the variables x,y being independent exponential random variables, but of possibly different rates. This will be explained in Sect. 6.8, after the passage times have been constructed.
Finally, if we have that (x → y) ∈ ∂_e h_t and y gets infected at time t + W, we close the list x,y, and define τ_{x,y} as in (56) (thereby concluding the construction of τ_{x,y}). This means that nothing else will be added to the list x,y. This can happen in the following cases:
• (x → y) ∈ ∂_e h_t \ H_B(t), with x not hosting a hole in MDLA or y being occupied in MDLA. This was described in Sect. 6.6.3, where y gets infected with probability (M−1)/M.
• (x → y) ∈ ∂_e h_t \ H_B(t), with x hosting a hole in MDLA, y unoccupied, there exists no other edge (· → y) ∈ H(t), and B_{x→y} = 0. This is the case that a hole jumps out of h_t (from x to y) and does not backtrack to x. This was described in Sects. 6.6.1 and 6.6.2.
and (x → y) = (v → u)). As mentioned in (52), the probability that we may infect y if there is still a hole at x = v and B_{x→y} = 0. (Note that this actually falls into the setting of the previous case, but we chose to highlight it here since a given edge from E^→_v rings at rate M/(M−1) instead of rate 1, due to the conditioning on B_{x→y} = 0.)
Iterating the above construction will produce the lists x,y for some edges (x, y). For each edge that was not visited during this procedure, we sample an independent exponential random variable of rate (M−1)/M to be its passage time. For each edge (u, v) whose construction was not completed, we add to u,v an independent exponential random variable of rate (M−1)/M, add u,v as a new element to u,v, and complete the construction of τ_{u,v}. Regardless of the value of these random variables, the evolution of the h-process will not change. Also, the final passage times of those edges stochastically dominate exponential random variables of rate (M−1)/M.

Properties of the passage times
Before establishing properties of the passage times, we establish (53).
Proof Clearly (53) holds at time 0 since H(0) = ∅. Now assume that it holds during [0, t), and that at time t a hole jumps out of h t− , from x to y; so (x → y) is added to H(t) via Case 1 (cf. Sect. 6.6.1) and x ∈ h t . Then, at time t, x is not occupied by a hole or the aggregate in MDLA and belongs to h t , while y hosts a hole in MDLA. If B x→y = 0, y gets infected at time t and (53) continues to hold. If B x→y = 1, y remains uninfected, but x remains unoccupied until at a time s > t the clock of an edge from E ← x rings, but that edge must be (y → x) since B x→y = 1. So the hole gets back to an infected vertex.
It remains to show that the edges in H(t) are disjoint. Assume that this is not the case; that is, there are edges (x → y), (u → v) ∈ H(t) with |{u, v} ∩ {x, y}| = 1. Assume that (x → y) and (u → v) were added to H at times t_x and t_u, respectively, with t_x > t_u. If u = x, then at some time during (t_u, t_x) a hole jumped into u = x in order to go to y at time t_x. But this would cause (u → v) to be removed from H by Case 2, which would imply that

Let Z^(M) be the following random variable. Take Z′ to be an exponential random variable of rate M, Z″ to be an exponential random variable of rate (M−1)/M, and Q to be a Bernoulli random variable of parameter (M−1)/M, where Z′, Z″ and Q are independent of one another. Define

Proof If an edge (w, z) is such that the construction of its passage time was not completed during the procedure above, then we know that its passage time stochastically dominates an independent exponential random variable of rate (M−1)/M, which in turn stochastically dominates Z^(M) by the second part of Lemma 6.2.
So now we consider a given edge (w, z), whose passage time was completely constructed during the procedure described in Sects. 6.6 and 6.7. Let t_w < t_z be the times such that w is infected at time t_w and z is infected at time t_z. The passage time of (w, z) is then completed at time t_z and is t_z − t_w. Now we split the proof into two parts. In the first part, assume that w does not host a hole at time t_w (for example, because it was infected according to Case 3, Sect. 6.6.3). The crucial property we will use in this case is that (w → z) cannot belong to H_B during [t_w, t_z), since an edge (z → ·) cannot be in H as z is not infected and an edge (· → w) cannot be in H since w was infected without a hole. As the h-process evolves from t_w, at each step, we add W to w,z, until we find a step where the clock that rings is the one of (w → z). This is the procedure described in Lemma B.3 for the construction of independent exponential random variables of rate 1. Hence, at the time the clock of (w → z) rings, call this time s, we have that w,z is an exponential random variable of rate 1, which is then added to w,z. If at time s we fall into the setting of Case 3 (Sect. 6.6.3), then we only infect z with probability (M−1)/M; otherwise we wait for the next time the clock of (w → z) rings, adding another exponential random variable of rate 1 to the list w,z, and iterating this procedure. If at time s we fall into the setting of Case 1 (Sect. 6.6.1), we only infect z if B_{w→z} = 0, which occurs with probability (M−1)/M; otherwise we iterate this procedure since the hole will jump back to w (or to the infected set before (z → w) rings). Therefore, each element of the list w,z is an independent exponential random variable of rate 1, and the number of elements is given by a geometric random variable of success probability (M−1)/M. Lemma B.1 gives that τ_{w,z} is in this case an exponential random variable of rate (M−1)/M.
Since this stochastically dominates Z^(M) by Lemma 6.2, this first part is completed. Now, for the second part, assume that w hosts a hole at time t_w. Assume that that hole jumped to w from a vertex w′. The first situation to imagine is that the hole jumped from w′ to w at time t_w, which causes (w′ → w) to be added to H(t_w); thus B_{w′→w} = 0. But it could also be the case that B_{w′→w} = 1. For this to happen, the hole must have jumped from w′ to w at a time t′_w < t_w, which caused (w′ → w) to be added to H(t′_w) and w not to be added to h_{t′_w}. Then, at time t_w, the clock of an edge (w″ → w) (which is not part of E_{w′→w}) rings while w is still occupied by a hole, triggering Case 3, which decides to infect w. Regardless of which of the two situations above occurs, we know that (w → z) ∈ E_{w′→w}.
If B_{w′→w} = 1, then we know the hole will do a backtracking jump from w to w′ before the clock of (w → z) rings, which will cause (w′ → w) to be removed from H. At this point, w will not have a hole anymore and we may proceed as in the first part of the proof, which implies that the passage time τ_{w,z} stochastically dominates an exponential random variable of rate (M−1)/M. The most delicate case is when B_{w′→w} = 0. Suppose that at time s a clock from E_{w′→w} rings. Note that, since |E_{w′→w}| = M, we will have that w,z = s − t_w is distributed as an exponential random variable of rate M. Let (x → y) be the edge whose clock rang at time s. Then a few cases may happen.

Proposition 6.4 If υ stochastically dominates and is not equal to the exponential distribution of rate 1, then there exists ε > 0 such that B_υ ⊂ B(1 − ε).
Proof Let FPP_1 stand for a first passage percolation process with i.i.d. exponential passage times of rate 1, and let FPP_υ stand for a first passage percolation process with i.i.d. passage times of distribution υ. This proposition is implicitly proven in [28], but the theorem in [28] states the above result only in the axial direction. To explain this, let x_n = (n, 0, 0, ..., 0) ∈ Z^d, and for any x ∈ Z^d, define T(x) and T_υ(x) to be the time that FPP_1 and FPP_υ, respectively, take to occupy x. By Kingman's subadditive ergodic theorem [21], the following limits exist almost surely: ν = lim_{n→∞} T(x_n)/n and ν_υ = lim_{n→∞} T_υ(x_n)/n. (63) By monotonicity and stochastic domination, we obtain that ν ≤ ν_υ, and the main result of [28] is to establish the strict inequality ν < ν_υ. The proof in [28] goes via a renormalization argument. First fix a large side-length and partition Z^d into cubes of that side-length. Then a cube R is said to be good if a certain "good" event happens. The good event is such that for any given path P of FPP_υ inside R, there is a positive probability (uniformly over P and R) that one can find an alternative path P′ which differs very little from P and such that the time FPP_1 takes to traverse P′ is at most the time that FPP_υ takes to traverse P minus a fixed value δ > 0. Then a percolation argument (which is by now quite standard) gives that the set of good cubes percolates on Z^d. This means that any long enough path on Z^d, say of size n, must pass through a number of good cubes of order n. (Here the side-length and δ are fixed, while n can grow.) Then, we can consider the geodesic path P that FPP_υ takes to go from 0 to x_n, and using the above reasoning we obtain an order of n good cubes R such that P has a long piece inside R.
Then, for each good cube, with positive probability we can replace the long pieces of P within the good cube with the alternative path provided by the definition of good cubes, which produces a path from 0 to x_n whose passage time in FPP_1 is smaller than its passage time in FPP_υ by an amount of order n. This implies, for example, that one obtains a value δ′ > 0, depending only on υ, for which ν ≤ (1 − δ′)ν_υ. This is the argument in [28].
An important feature of the proof in [28] is that it does not depend on the direction; this fact was already observed in [22]. In other words, instead of only considering the sequence of vertices (x_n)_n as defined above, we can consider any rational value q ∈ Q^d and a sequence x_n = qn. Then, for each x ∈ R^d, associate x to the integer point y ∈ Z^d such that x ∈ y + [−1/2, 1/2)^d, and generalize T(x) and T_υ(x) to be the time that FPP_1 and FPP_υ, respectively, take to occupy the integer point that is associated to x. Then we can define ν(q) and ν_υ(q) as in (63). The very same proof in [28] gives that one can find δ′ > 0 depending only on υ such that, uniformly over all q ∈ Q^d, one has ν(q) ≤ (1 − δ′)ν_υ(q). Then one can extend ν and ν_υ uniquely to continuous functions from R^d to R_+. Since δ′ > 0 uniformly in the choice of q, we have the existence of ε > 0 for which B_υ ⊂ B(1 − ε).
Remark 6.5 The result in [28] is in some sense more general than stated above since instead of requiring stochastic domination, it just requires that υ is less variable than an exponential random variable of rate 1; but we will not require such level of generality.
Proof of Theorem 1.1 Using Lemma 6.3, we have that the h-process grows slower than if the red seeds grow clusters of first passage percolation with passage times distributed as Z^(M). Let υ be the distribution of Z^(M). Then, by Proposition 6.4, we have the existence of λ < 1 such that B_υ ⊆ B(λ). Then, performing the whole multi-scale procedure described in Sect. 5 and using the encapsulation procedure of Sect. 4, we obtain that with positive probability the set of sites that are not infected by the h-process and that are occupied by the aggregate grows indefinitely and contains a ball (up to regions separated from infinity by this set of sites).
Note that, in the h-process, whole clusters of seeds are added when the aggregate tries to occupy a seed, while in FPPHE seeds are activated one by one. This is not a problem. In fact, the same proof works if in FPPHE the activation of a seed implies the activation of its whole cluster of seeds. The reason is that, in the encapsulation procedure of Proposition 4.2, we already assumed that ξ 2 (0) starts from time 0 from any subset of a ball B (r ) of radius r .
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

A Appendix: Proof of Propositions 4.1 and 4.2
We first establish Proposition 4.1 and at the end, in Sect. A.4, we discuss how the proof can be changed to establish Proposition 4.2. We start describing the overall strategy of the proof, and setting up the notation. The main intuition behind the proof is that since ξ 2 (0) is initially inside B (r ) and ξ 1 (0) is outside B (αr ), with α > 1 being large enough, there is enough space for ξ 1 to start growing before noticing the presence of ξ 2 . This gives enough time for ξ 1 and ξ 2 to get closer to the set predicted by the shape theorem. Then we can guarantee that ξ 1 can encapsulate ξ 2 by letting ξ 1 occupy a sequence of growing annulus sectors centered at the origin. This is illustrated in Fig. 10.
The value of C_n is related to the angle of the annulus sector at step n, which starts from the angle related to position x and increases until C_n is the full unit circle, according to the norm | · |. Let N be the step where we obtain the unit circle; i.e., N is the smallest integer so that C_N = C_{N+1}. Note that d_H(C_n, C_{n−1}) ≤ δ^2/2, where d_H stands for the Hausdorff distance. Let A^1_0 = {x} and A^1_n = {y ∈ R^d : (1 + δ)^{n−1} αr < |y| ≤ (1 + δ)^n αr and y/|y| ∈ C_n} for all n ≥ 1.

Fig. 10 A schematic view of the encapsulation procedure. The left picture shows the ball B(r) that contains ξ_2(0), and we assume that ξ_1(0) is at the boundary of B(αr). The proof goes by establishing that ξ_1 occupies the regions illustrated in black, while ξ_2 is contained in the yellow regions. The growing sequence of annulus sectors that are occupied by ξ_1 grows until it reaches a full annulus, as illustrated in the rightmost picture. At this moment, ξ_2 has been encapsulated and is confined inside a ball.

The goal is to show that, for each n = 1, 2, ..., N, ξ_1 completely occupies A^1_n after step n. Hence ξ_1 will encapsulate ξ_2 when it occupies A^1_N. As in (11), we let C_FPP be a constant such that B(r) ⊃ [−C_FPP r, C_FPP r]^d for all r > 0, which gives that B In order to show that ξ_1 occupies A^1_n for all n, we need to bound the distance between A^1_n and A^1_{n−1}. Given y ∈ A^1_n, let y′ be the closest vertex of Z^d to the point (y/|y|)(1 + δ)^{n−1} αr. Using the triangle inequality we have Since the ball in R^d centered at the point (y/|y|)(1 + δ)^{n−1} αr and of radius (1 + δ)^{n−1} αr d_H(C_{n−1}, C_n) must contain a point w ∈ R^d with w/|w| ∈ C_{n−1} and |w| = (1 + δ)^{n−1} αr, we obtain that inf z∈A 1 Thus, using that |y′ − y| ≤ ((1 + δ)^n − (1 + δ)^{n−1}) αr + 3/(2C_FPP), we obtain where the last step holds by choosing c_1 large enough in the condition of α. This leads us to define t_n = (1 + δ)^{n+1} δαr and T_n = Σ_{i=1}^{n} t_i = (1 − (1 + δ)^{−n})(1 + δ)^{n+2} αr.
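The closed form for T_n is a geometric sum. The following worked computation spells it out, using only the definition t_i = (1 + δ)^{i+1} δαr from the text.

```latex
\begin{align*}
T_n = \sum_{i=1}^{n} t_i
    &= \delta\alpha r \sum_{i=1}^{n} (1+\delta)^{i+1}
     = \delta\alpha r\,(1+\delta)^{2}\,\frac{(1+\delta)^{n}-1}{\delta} \\
    &= \bigl((1+\delta)^{n}-1\bigr)(1+\delta)^{2}\,\alpha r
     = \bigl(1-(1+\delta)^{-n}\bigr)(1+\delta)^{n+2}\,\alpha r .
\end{align*}
```

In particular, T_n is of the same order as t_n/δ, so the total elapsed time is dominated by the last few steps.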
The value t n represents the time we will wait so that ξ 1 grows from A 1 n−1 until it contains A 1 n . For all n ≥ 1, define also the sets where U n = x ∈ Z d : ∃y ∈ A 1 n−1 and z ∼ x for which |z − y| ≤ (1 + δ) n+3 δαr ⊃ A 1 n−1 ∪ A 1 n Note that A 2 n = B (r + λ(1 + δ)T n ) and the distance between A 1 n−1 and ∂ i A 1 n is at least (1 + δ)t n ; we have chosen to define A 2 n and A 1 n independently of t n because later we will apply Proposition 4.1 with a time scaling, which will only cause a change in the definition of t n in this proof.
The idea is that after occupying A 1 n−1 , ξ 1 will occupy A 1 n after a time interval of length t n , and this is achieved only using passage times inside A 1 n . At the same time, for each n, after time T n , ξ 2 will be contained inside A 2 n . The crucial part of the construction is that A 1 n ∩ A 2 n = ∅. This means that the passage times inside A 1 n that we use to guarantee that ξ 1 grows from A 1 n−1 to A 1 n do not intersect ξ 2 .

A.1 The spread of ξ 2
In this part we show that the growth of ξ 2 is not too fast, so that ξ 2 is contained inside A 2 n . Recall that, by the conditions in Proposition 4.1, α is assumed to be large enough; in particular, there exists a large c 1 such that α > By choosing c 1 large enough in the condition of α, we can guarantee that δ ≥ (λT 1 ) − 1 2d+4 ≥ (λT n ) − 1 2d+4 , which allows us to apply Proposition 3.1. Then using that we obtain for positive constants c , c . Since the number of vertices in B (r ) is of order r d , and T n ≥ T 1 is large enough by the condition in α, the lemma follows.

A.2 The spread of ξ 1
Here we show that the growth of ξ 1 is fast enough, so that ξ 1 occupies A 1 n at time T n .
endpoints belong to X and are equal to infinity everywhere else. For each integer n, define the events E^(1)_n = {D(B(r), ∂_o A^2_n; ζ_2) ≤ T_n}, E^(2)_n = {sup_{x∈A^1_n} D(A^1_{n−1}, x; ζ_1|_{A^1_n}) > t_n} and E_n = E^(1)_n ∪ E^(2)_n.
We define the event F in the proposition by F = ∩_{n=1}^{N} E^c_n. We also define R = (1 + δ)^N αr and T = T_N. By Lemmas A.1, A.2 and A.3, we have where the last inequality follows since λt_1 = (1 + δ)^2 λδαr. This establishes the bound on the probability appearing in Proposition 4.1.
Since E (1) n is measurable with respect to A 2 n and E (2) n is measurable with respect to A 1 n , we have that F is measurable with respect to the passage times inside Note that F is increasing with respect to ζ 2 and decreasing with respect to ζ 1 | To conclude the proof of the proposition, it suffices to show that F implies that ξ 1 occupies all vertices in N n=1 A 1 n , since We will use induction on n to establish a stronger result by showing that, for each n, given that ξ 1 (T n−1 ) ⊃ A 1 n−1 , the event E c n implies that ξ 1 (T n ) ⊃ A 1 n . First, for n = 0 we have from the initial condition that ξ 1 (0) ⊃ A 1 0 . Now assume that ξ 1 (T n−1 ) ⊃ A 1 n−1 . Since E n does not hold, we have that Since E (2) n does not happen, and ξ 2 (T n ) ∩ A 1 n = ∅, the passage times inside A 1 n guarantee that ξ 1 (T n ) ⊃ A 1 n , concluding the proof. |x − y| ≥ λ 1 − (1 + δ) −n (1 + δ) n+3 αr − 2γ r ≥ λT n 1 + δ − 2γ α ≥ (1 + δ/2)λT n , and Lemma A.1 follows by adjusting the constant c.
Now we turn to the effect of the i on the spread of ξ 1 . There are two aspects regarding the spread of ξ 1 : the time to spread from A 1 n−1 to A 1 n (handled in Lemma A.2) and the measurability of this event (handled in Lemma A.3). Regarding Lemma A.2, the crucial inequality is (68). But since A 1 n−1 and A 1 n may intersect i i , to ensure that ξ 1 can spread in the nth step, we need to change E (2) n to where A 1 n ( ) is the set A 1 n plus all sets i that intersect A 1 n ; that is, A 1 n ( ) = A 1 n ∪  (1) n and E (2) n are independent. In other words, that A 1 n and A 2 n do not intersect. But this is true since their definition did not change; hence 70 holds as it is.

B Appendix: Standard properties of exponential random variables
Here we state some properties of exponential random variables that we will use in the paper.
Lemma B.1 (Random sum) Fix any $q \in (0, 1)$. Let $L$ be a geometric random variable of success probability $q$. Let $X_1, X_2, \ldots$ be an i.i.d. sequence of exponential random variables of rate 1. Then $\sum_{i=1}^{L} X_i$ is an exponential random variable of rate $q$.

Lemma B.2 (Scaling and minimum) Let $X$ be an exponential random variable of rate $\theta$. Then, for any $M > 0$, we have that $X/M$ is an exponential random variable of rate $M\theta$. Moreover, for integer $M$, $X/M$ has the same distribution as the minimum of $M$ independent exponential random variables of rate $\theta$.
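Lemmas B.1 and B.2 can be checked numerically. The following is only a Monte Carlo sketch, not part of the proofs. It assumes that the geometric variable $L$ takes values in $\{1, 2, \ldots\}$ and is drawn independently of the $X_i$. The function names and the parameter values ($q = 0.25$, $\theta = 2$, $M = 3$) are hypothetical choices made for illustration.

```python
import random

def geometric(q, rng):
    """Number of Bernoulli(q) trials up to and including the first success.

    Assumption: the geometric variable is supported on {1, 2, ...}.
    """
    n = 1
    while rng.random() >= q:  # failure happens with probability 1 - q
        n += 1
    return n

def random_sum(q, rng):
    """Sum of L i.i.d. rate-1 exponentials, with L geometric(q) drawn independently."""
    L = geometric(q, rng)
    return sum(rng.expovariate(1.0) for _ in range(L))

def scaled_exponential(theta, M, rng):
    """X / M for X exponential of rate theta (Lemma B.2, first claim)."""
    return rng.expovariate(theta) / M

def min_of_exponentials(theta, M, rng):
    """Minimum of M independent exponentials of rate theta (Lemma B.2, second claim)."""
    return min(rng.expovariate(theta) for _ in range(M))

def mean(values):
    return sum(values) / len(values)

# Small demonstration with illustrative parameters.
rng = random.Random(0)
n = 10_000
q, theta, M = 0.25, 2.0, 3
print("random sum mean:", mean([random_sum(q, rng) for _ in range(n)]), "vs 1/q =", 1 / q)
print("X/M mean:       ", mean([scaled_exponential(theta, M, rng) for _ in range(n)]),
      "vs 1/(M*theta) =", 1 / (M * theta))
print("min mean:       ", mean([min_of_exponentials(theta, M, rng) for _ in range(n)]),
      "vs 1/(M*theta) =", 1 / (M * theta))
```

Agreement of the sample means with $1/q$ and $1/(M\theta)$ is only a sanity check. Comparing empirical tail frequencies with $e^{-qt}$ and $e^{-M\theta t}$ gives a slightly stronger one.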
In the lemma below we show that a collection of exponential random variables $Z_1, Z_2, \ldots, Z_k$ can be sampled by first sampling the minimum value among all of them, which is the variable $Z_I$ whose value is $W$, and then using the memoryless property of exponential random variables to say that the other ones are equal to $W$ plus an exponential random variable of the same rate.

Lemma B.3 (Decomposition on the minimum) Fix any integer $k \ge 1$. Let $X_i$, $i = 1, 2, \ldots, k$, be independent exponential random variables of rate $\theta_i$. Let $I$ be a random variable in $\{1, 2, \ldots, k\}$ which has value $i$ with probability $\theta_i / \sum_{j=1}^{k} \theta_j$. Let $W$ be an independent exponential random variable of rate $\sum_{j=1}^{k} \theta_j$. Then, if we set $Z_i$, $i = 1, 2, \ldots, k$, as

$$Z_i = \begin{cases} W & \text{if } i = I, \\ W + X_i & \text{if } i \ne I, \end{cases}$$

we obtain that the $Z_i$ are independent exponential random variables of rate $\theta_i$.
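The construction in Lemma B.3 can be sketched as a sampler: draw the index $I$ of the minimum with probability proportional to the rates, draw the minimum value $W$ at the total rate, and add a fresh exponential $X_i$ of rate $\theta_i$ to $W$ for every other index. This is an illustrative sketch only. The function name `sample_via_minimum`, the 0-based indexing and the rates used in the demonstration are assumptions made here.

```python
import random

def sample_via_minimum(thetas, rng):
    """Sample (Z_0, ..., Z_{k-1}) by first sampling the minimum, as in Lemma B.3.

    Returns the list Z together with the index I of the minimum and its value W.
    Indices are 0-based here, unlike the 1-based indices of the lemma.
    """
    k = len(thetas)
    total_rate = sum(thetas)
    # I equals i with probability theta_i / sum_j theta_j.
    I = rng.choices(list(range(k)), weights=thetas)[0]
    # W is exponential of rate sum_j theta_j, drawn independently of I.
    W = rng.expovariate(total_rate)
    # Z_I = W; every other Z_i is W plus an independent exponential of rate theta_i.
    Z = [W if i == I else W + rng.expovariate(thetas[i]) for i in range(k)]
    return Z, I, W

# Small demonstration with illustrative rates.
rng = random.Random(0)
thetas = [1.0, 2.0, 0.5]
n = 10_000
draws = [sample_via_minimum(thetas, rng)[0] for _ in range(n)]
for i, theta in enumerate(thetas):
    empirical = sum(z[i] for z in draws) / n
    print(f"Z_{i}: empirical mean {empirical:.3f}, target 1/theta = {1 / theta:.3f}")
```

If the lemma holds, each coordinate should have mean $1/\theta_i$ and the empirical covariance between distinct coordinates should be close to zero.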
Note that the above lemma can be iterated. That is, after we see that $Z_I = W$ is the minimum among the $Z_i$, the value of $Z_i$ for $i \ne I$ can be sampled by first sampling the minimum among the $X_i$ with $i \ne I$. Thus we obtain new random variables $I'$ and $W'$ so that $X_{I'} = W'$ and $Z_{I'} = W + W'$, while the other values of $Z_i$ with $i \notin \{I, I'\}$ are equal to $W + W'$ plus an independent exponential random variable of the same rate. Then, after having sampled the values of $Z_I$ and $Z_{I'}$, we can iterate the above procedure with the $Z_i$ that were not yet sampled, for $k - 2$ iterations, until all the $Z_i$'s have been obtained.
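The iterated procedure can be sketched as a loop that reveals the $Z_i$ in increasing order. This is a minimal illustration with a hypothetical function name and 0-based indices. The loop below simply runs one round per variable, so the final round, with a single index left, adds an exponential of that index's own rate.

```python
import random

def sample_in_increasing_order(thetas, rng):
    """Sample independent exponentials Z_i of rate thetas[i], smallest first.

    Each round picks the next index with probability proportional to its rate
    among the indices not yet sampled, and adds an exponential increment whose
    rate is the sum of the remaining rates (memoryless property).
    Returns (Z, order), where order lists the indices from smallest to largest Z.
    """
    remaining = list(range(len(thetas)))
    Z = [0.0] * len(thetas)
    order = []
    elapsed = 0.0  # running value W + W' + ... accumulated so far
    while remaining:
        weights = [thetas[i] for i in remaining]
        nxt = rng.choices(remaining, weights=weights)[0]
        elapsed += rng.expovariate(sum(weights))
        Z[nxt] = elapsed
        order.append(nxt)
        remaining.remove(nxt)
    return Z, order

# Small demonstration with illustrative rates.
rng = random.Random(0)
thetas = [1.0, 2.0, 0.5]
Z, order = sample_in_increasing_order(thetas, rng)
print("values:", Z)
print("order of discovery (smallest first):", order)
```

Sampling in this order is convenient when only the first few arrival times matter, since the remaining values can be left unsampled until they are needed.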