Inhomogeneous long-range percolation in the weak decay regime

We study a general class of percolation models in Euclidean space including long-range percolation, scale-free percolation, the weight-dependent random connection model and several other previously investigated models. Our focus is on the weak decay regime, in which inter-cluster long-range connection probabilities fall off polynomially with small exponent, and for which we establish several structural properties. Chief among them are the continuity of the bond percolation function and the transience of infinite clusters.


Introduction and overview
We study (edge-)inhomogeneous percolation models of the following type: let η ⊂ R^d, d ≥ 1, denote a stationary ergodic point set of unit intensity. Canonical choices of η are a homogeneous Poisson point process or the integer lattice Z^d. A more detailed discussion of the class of underlying point sets for which our results hold is given in Section 2 together with an alternative and more rigorous construction of the model. Let η′ denote an independent marking of η by i.i.d. Uniform(0, 1) random variables. We call x = (x, s) ∈ η′ a vertex in location x ∈ η with (vertex) mark s ∈ (0, 1). We denote by G = G_{φ,η} = (V(G), E(G)) the random geometric graph obtained by first choosing V(G) = η′ and then, conditionally on η′, generating the edge set E(G) by adding unoriented edges between x, y ∈ η′ independently with probability 1 − e^{−φ(xy)}, xy ∈ (η′)^[2], where A^[2] = {B ⊂ A : |B| = 2} for any set A and φ(xy) is the connection function of the model. Note that we always write xy for the set {x, y} when we refer to edges.
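The construction above can be sketched in a few lines of Python. This is only an illustration on a finite set of locations: the function names, the finite lattice box and in particular the concrete connection function `example_phi` (a product-type kernel with hypothetical parameters `beta`, `gamma`, `delta`) are assumptions made for the example and are not taken from the article.

```python
import math
import random
from itertools import combinations


def example_phi(s, t, r, beta=1.0, gamma=0.5, delta=1.5, d=2):
    """Hypothetical connection function, non-increasing in s, t and r (r > 0)."""
    return beta * (s * t) ** (-gamma) * r ** (-d * delta)


def sample_graph(points, phi, rng):
    """Sample vertex marks and edges on a finite list of distinct locations.

    Each location gets an independent uniform mark; each unordered pair
    {x, y} is joined independently with probability 1 - exp(-phi(s, t, r)),
    where s, t are the two marks and r is the Euclidean distance.
    """
    # 1 - random() lies in (0, 1], which avoids a zero mark in example_phi.
    marks = [1.0 - rng.random() for _ in points]
    edges = set()
    for i, j in combinations(range(len(points)), 2):
        r = math.dist(points[i], points[j])
        p = 1.0 - math.exp(-phi(marks[i], marks[j], r))
        if rng.random() < p:
            edges.add((i, j))
    return marks, edges


# Usage: a 5 x 5 box of the integer lattice as the location set.
rng = random.Random(0)
box = [(x, y) for x in range(5) for y in range(5)]
marks, edges = sample_graph(box, example_phi, rng)
```

The sketch makes the two layers of randomness explicit: the marks are drawn first, and the edges are then independent given locations and marks.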
Note, however, that the results in this article pertain solely to genuine long-range models in which no upper bound on the length of potential edges exists. More precisely, our goal in the present article is to analyse the weak decay regime of inhomogeneous long-range percolation, i.e. the regime in which δ_eff < 2, where δ_eff is the (dimension free) exponent of decay of the probability of a long-range connection between large clusters ϕ(s, t, r) ds dt (d log r) −1 .
We discuss this quantity (or rather a closely related one) formally in Section 2, and assume for the moment that it is well-defined. Its role is akin to that of the decay exponent δ in classical long-range percolation (see the examples above). The significance of δ_eff was conjectured in [9] and its importance for the existence of an infinite cluster for one-dimensional models was established in [10]. In [16] it was shown that δ_eff > 2 implies the existence of a subcritical phase in any dimension, even for models in which connection probabilities between vertices are allowed to depend on other vertices in their spatial vicinity.
To give an intuition of how δ_eff influences the structure of G, assume that ϕ is given via (1), for some kernel g and ρ(x) ≍ x^{−δ}. If the kernel g is bounded away from 0, then our model is merely an inhomogeneous perturbation of classical long-range percolation, for which it is well known, see e.g. [2,19], that if ρ decays weakly, namely if δ = δ_eff < 2, then the model feels little of the geometry of the embedding space R^d and behaves very unlike nearest neighbour percolation on Z^d and, in some aspects, more like a short-range model in high dimensions. As was already observed upon the invention of the model [19], this can be derived from the behaviour under rescaling: if one checks whether two large local clusters, each of size N, say, are connected directly by a long edge, then the 'gain' obtained from independent trials associated with the N^2 − N pairs of vertices asymptotically beats the spatial decay of connection probabilities and one finds a connection with high probability if the distance of the clusters is O(N^{1/d}). This observation is at the heart of the classical renormalisation group arguments for long-range percolation with δ < 2. In particular, δ moderates both the inter-point connection probabilities and the inter-cluster connection probabilities at large scales, cf. [2, Lemma 2.4]. However, as was first observed in [9], this is not true in the general weight-dependent random connection model: if the kernel g decays sharply at 0, then the inter-cluster connection probabilities also depend on g. Thus, the regime in which the model easily overcomes the geometric restrictions of the embedding space cannot be found by looking at δ alone. Instead, one needs to consider the derived exponent δ_eff ≤ δ, which depends on both δ and g and naturally appears in renormalisation arguments, see [10].
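The rescaling heuristic for the homogeneous case can be made explicit by a short computation. The following is a sketch under simplifying assumptions that are not spelled out in the text: the number of potential edges between the two clusters is taken to be of order N^2, each potential edge is present independently with probability at least c r^{-dδ} at distance r (the dimension-free convention), and the clusters are at distance at most K N^{1/d}; the constants c and K are hypothetical.

```latex
% Assumptions (illustrative, not from the source): N^2 independent trials,
% each succeeding with probability at least c r^{-d\delta}, and r \le K N^{1/d}.
\[
  \mathbb{P}(\text{no direct edge between the clusters})
  \;\le\; \bigl(1 - c\,(K N^{1/d})^{-d\delta}\bigr)^{N^{2}}
  \;\le\; \exp\bigl(-c\,K^{-d\delta}\,N^{2-\delta}\bigr)
  \;\xrightarrow[N\to\infty]{}\; 0
  \qquad \text{if } \delta < 2 .
\]
```

The second inequality uses 1 − x ≤ e^{−x}. For δ > 2 the exponent N^{2−δ} tends to 0 instead, so this bound no longer forces a connection, which is consistent with δ = 2 marking the boundary of the weak decay regime.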
Below, we establish that δ_eff < 2 is sufficient to imply a number of important results which were established for homogeneous long-range percolation with δ < 2 in [2]. Namely, under varying assumptions on the underlying vertex locations η, we prove
• an asymptotic density result for local clusters of sublinear size (Theorem 2.3),
• continuity of the bond percolation function for G in dimensions 2 and above (Theorem 2.4),
• a robustness result for the infinite cluster under removal of long edges in dimensions 2 and above (Theorem 2.6),
• transience of the infinite cluster (Theorem 2.7).
There are several technical challenges that we have to overcome which complicate the analysis of the inhomogeneous model compared to long-range percolation. The most crucial one is the presence of additional strong dependencies induced by the vertex marks, which prevents the use of a number of well-established tools for i.i.d. (long-range) percolation. Another severe drawback is, as discussed above, that the inhomogeneity influences the scaling behaviour of the models: coarse-graining a homogeneous long-range percolation model yields another homogeneous long-range percolation model, whereas coarse-graining G_{η,φ} does not lead to a model that can be readily related to some suitable G_{η′,φ′}. The solution of the first problem can be considered as the main contribution of this work on a technical level: we establish renormalisation techniques akin to those in [2] that rely solely on non-negative correlations instead of independence. In particular, our proofs are novel even for homogeneous long-range percolation and some of our results are even new for this special case, namely if η is not Poisson, deterministic or an i.i.d. percolated lattice. Unfortunately, we were not able to overcome the second challenge mentioned above in a similarly comprehensive manner and this is partly reflected in our main results: most notably, we were not able to show that the bond percolation function is continuous throughout the whole weak decay regime if d = 1.

Notation.
Throughout the article, we use the Landau symbols to denote the stronger statement that f (x)/g(x) converges to 1.
Overview of the paper. In the next section, we provide a formal construction of our model, present our main results and discuss them in more detail. Section 3 contains the proof of Theorem 2.3, which forms the basis of all other main results. Transience of the infinite cluster is obtained in Section 4 and the remaining results are proved in Section 5.

Model definition and main results
Before formulating our main results, we provide a rigorous construction of the model and definitions of the key quantities and notions involved.
Construction from a doubly marked point process. Among our fundamental assumptions are that G has a unique (if any) infinite component, and that its distribution is the same everywhere in space. We begin by discussing which vertex location sets η fall within our framework. Let Λ ⊂ R^d be either Z^d or R^d and let η denote a simple point process of finite intensity on Λ, which is stationary and ergodic under P with respect to the natural group of shifts (T_x)_{x∈Λ}, T_x(y) = x + y for all y ∈ Λ, associated with Λ. [20]. Our model is now constructed as a deterministic functional of the points of η, and two independent i.i.d. sequences of edge and vertex marks. Let X_1, X_2, . . . denote an enumeration of η and let T = {T_j : j ∈ Z} be a family of i.i.d. random variables distributed uniformly on (0, 1) independent of η. Set hence, η′ is a point process on Λ × (0, 1) with unit intensity. Let further V = {V_{i,j} : i < j ∈ Z} be a second family of i.i.d. Uniform(0, 1) random variables, independent of η′, that we call edge marks, which we assign to the elements of (η′)^[2]. We denote the point and edge marked process by ξ. For given ϕ, the graph G is now deterministically constructed from ξ as the graph with vertex set η and edge set Palm versions η′_0, ξ_0 and G_0 of η′, ξ, and G, respectively, are obtained by replacing η in the above construction by η_0. In the remainder of the paper, the enumeration of the locations plays no role and we usually denote vertices x = (x, s) ∈ η′ as in the introduction and occasionally write s_x for the mark of vertex x in location x, and V_{xy} for the edge mark associated with the pair xy ∈ (η′)^[2]. Finally, to ensure that there is at most one infinite component in G, η and φ should satisfy a suitable 'finite energy property', cf. [4,8,18], and we shall always assume tacitly that this is the case.
Monotonicity and positive correlation. The inverse vertex mark s_x^{−1} can be viewed as the weight or fitness of the vertex in location x ∈ η (giving the weight-dependent random connection model its name); the likelihood of connections should be increasing with weight and proximity. Our arguments heavily use the weak FKG-property and, to obtain non-negative correlations, we require ϕ to be decreasing in all three arguments. Formally, an increasing map of a doubly marked point configuration ξ does not decrease if either
• vertices are added to η′,
• vertex marks are decreased,
• edge marks are decreased.
In particular, if we interpret G as a map on marked point configurations, it is increasing in the above sense for the canonical partial order on random geometric graphs. An increasing event E ∈ σ(ξ) is such that 1_E is an increasing functional of ξ. We require G to satisfy the weak FKG-property, i.e. we have that for any increasing functionals f, g on configurations ξ. A sufficient condition for this is that ϕ(·, ·, ·) is non-increasing in all 3 arguments and that η has the weak FKG-property, i.e.
where f′, g′ are non-decreasing functionals of point configurations under addition of points, cf. [9,12] for related constructions.
Remark 2.2. Note that the stated conditions on ϕ and η suffice, because the edge and vertex marks are added in an i.i.d. fashion. However, the monotonicity assumption on ϕ can be relaxed, e.g. if η is a Poisson process. Then increasing the intensity of η can always be realised by adding another independent Poisson process and thus always increases the resulting graph G, even if ρ is not monotone. On the other hand, increasing intensity and contracting space, i.e. reducing all inter-location distances, are equivalent. A different direction in which our setup can be generalised is to weaken the requirement of positive correlation on η, as long as the marks remain independent of η, since most calculations only require that the model has positive correlations conditionally on η. Similarly, we believe that our techniques can be adapted without much effort to certain situations in which edge and vertex marks are weakly dependent upon each other or even upon η, as long as vertex marks and edge marks remain positively correlated. We have not attempted to strive for the most general results in this respect, since the main motivation for our model was to cover all Poisson and lattice based models with i.i.d. marks that have so far been treated in the literature in a unified setting. A model with strong positive correlations that is not covered by our approach but might be amenable to certain techniques from the present paper is the spatial preferential attachment model [14,15].
The standard situation is that both δeff(0+) = δeff(0) and that the lim inf in the definition of δeff(0) can be replaced by an actual limit. In this case, δeff(0) coincides with the exponent δ_eff discussed in Section 1. In particular, this is the case in the homogeneous case in which ϕ can be represented as a function ρ(·) of inter-location distance only that satisfies ρ(z) ≍ z^{−δ}. There, we see immediately that δeff(0) = δeff(0+) = δ, cf. [10].
More generally, if G is any stationary ergodic geometric random graph, we write where G 0 is the Palm version of G (the latter always exists, since V (G) must be distributionally invariant under shifts along Λ by stationarity).By ergodicity, we have that θ G ∈ [0, 1] is constant and corresponds to the density of the infinite component.
Our first and most general result localises the existence of an infinite cluster.We use the notation Theorem 2.3 (Local clusters of sublinear size are asymptotically dense).Let G denote an instance of inhomogeneous long-range percolation on a stationary, ergodic and positively correlated point set η such that δeff (0+) < 2. If θ G > 0, then for every λ ∈ (0, 1), we have It stands to reason, that the assertion of Theorem 2.3 can be improved to a stretched exponential bound on the probability of existence of a local cluster of linear size, at least if we assume independence for η.We plan to address this in future work.The corresponding result for classical long-range percolation was established in [3,Theorem 3.2].
Set now where G p is obtained from G by independent Bernoulli bond percolation with retention probability p.When no additional percolation is involved we also write θ = θ(1) for the density of the infinite cluster in G .The following result states that θ(p) is a continuous function of p in two or more dimensions as long as we remain in the weak decay regime.Theorem 2.4 extends [2, Thm 1.5], [6,Cor. 4] and [7,Thm. 3.3] for d ≥ 2. However, note that all three previous results correspond to the case in which δeff (0+) trivially coincides with the a priori spatial decay exponent δ (in our notation based on the WDRCM).In particular, in the special case of scale-free percolation [5,6,7], δeff (0+) < 2 ≤ δ precisely if the critical threshold is 0, in which case Theorem 2.4 is standard.The fact that percolation may occur in d = 1 only inside the weak decay regime (or possibly at its boundary) was established in [10] for the weight-dependent random connection model.However, the techniques used there are not strong enough to make assertions about the behaviour of θ(p) near the critical value.
It follows from the proof of Theorem 2.4, that percolation in G is also robust under edge truncation, which we formulate as our next theorem.Denote by G {ℓ} the graph obtained from G by removing all edges longer than ℓ > 0. Note that both Theorem 2.3 and Theorem 2.6 provide 'locality' statements for percolation, i.e. if G percolates and G n → G locally, then the G n will percolate eventually.Variants of either result should also hold outside the weak decay regime, but they are far harder to establish there.This is closely related to the fact that no version of the Grimmett-Marstrand Theorem [11] is currently known that applies to long-range percolation with polynomial tails.
Transience. A connected loop-free multigraph G = (V(G), E(G)) together with a conductance function C : E(G) → (0, ∞) is called a network. Note that we may always view C as a function defined on V(G)^[2], setting C(xy) = 0 for potential edges xy ∉ E(G). The random walk Y = (Y_i)_{i≥0} on (G, C) is obtained by reweighting the transition probabilities of simple random walk on G according to C, i.e. the walker traverses an edge incident to their current position with probability proportional to the conductance of that edge. In particular, we obtain simple random walk on G as a special case, if C is constant. We only consider locally finite networks, i.e.
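The transition rule of the random walk on a finite network can be sketched as follows. The edge-list representation as triples `(u, v, c)` and the function name are choices made for this illustration only; repeated pairs are allowed, so that multigraphs are covered by summing the conductances of parallel edges.

```python
def transition_probabilities(edges, x):
    """One-step law of the random walk on a network, started at vertex x.

    `edges` is a list of triples (u, v, c) with u != v and conductance c > 0.
    The walker moves to a neighbour y with probability proportional to the
    total conductance of the edges joining x and y.
    """
    weights = {}
    for u, v, c in edges:
        if u == x:
            weights[v] = weights.get(v, 0.0) + c
        elif v == x:
            weights[u] = weights.get(u, 0.0) + c
    total = sum(weights.values())
    if total == 0:
        raise ValueError("vertex has no incident edges")
    return {y: w / total for y, w in weights.items()}


# Usage: a triangle with conductances 1, 3 and 2.
triangle = [("a", "b", 1.0), ("a", "c", 3.0), ("b", "c", 2.0)]
step_from_a = transition_probabilities(triangle, "a")
```

With constant conductances the returned law is uniform over the neighbours, which recovers simple random walk.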
for finite G and then extend the notion to infinite graphs via a limiting procedure.In particular, by identifying all vertices at graph distance further than n from v ∈ V (G) with one vertex z n (whilst removing any loops and keeping multiple edges with their conductances) we obtain a sequence of finite networks (G n , C n ).Moreover, the limit

Theorem 2.7 (Transience of infinite cluster). Let ϕ be such that δ_eff < 2 and let η be either a Poisson process or an i.i.d. percolated version of Z^d. Then, if an infinite component in G exists, it is almost surely transient.
The proof of Theorem 2.7 is given in Section 4. Theorem 2.7 is strictly stronger than the previous transience results in [2,9,13] and in particular establishes the recurrence-transience transition conjectured for the two-dimensional soft Boolean model in [9]. In d ≥ 3 transience should of course also hold outside the weak decay regime, but this is difficult to establish for the same reason as the corresponding truncation result.

Percolation in finite boxes
Throughout the following sections, we work repeatedly with the collections of half-open cubes We write Γ m (x) = x + [−m/2, m/2) d , x ∈ mZ d for the cube of side length m centred at x ∈ mZ d and write Γ m for Γ m (0).For any bounded domain Λ ⊂ R d , we define the k-neighbourhood of Λ as To prove Theorem 2.3, we first establish some auxiliary results and develop an improved version of the renormalisation approach used in [2] to study homogeneous long-range percolation.Let us begin by setting up some notation.We say that a finite collection M of numbers in (0, 1) is µ-regular, for µ Lemma 3.1.Fix µ ∈ (0, 1/2).Any finite collection M of i.i.d.Uniform(0, 1) random variables is µ-regular with probability exceeding and, by Bernstein's inequality, By definition of I(µ, M ), we have |I(µ, M )| ≤ n 1−µ and the claimed bound follows.
The purpose of µ-regularity is to obtain lower bounds on connection probabilities for large vertex sets that depend solely on their sizes and their distance from each other.

Lemma 3.2.
There exist a constant C = C(g, ρ) > 0 and for any µ we have the uniform deterministic bound Remark 3.3.µ-regularity of a vertex set is solely a property of the i.i.d.vertex marks and, conditionally on η ′ , the event V 1 ↔V 2 is measurable with respect to the edge marks of edges joining V 1 and V 2 only.Hence, Lemma 3.1 yields a large deviation bound for untypical behaviour of vertex marks and Lemma 3.2 is essentially a large deviation bound for the i.i.d.sequence of edge marks, given that the vertex marks involved show typical behaviour.
Proof of Lemma 3.2.Let V i , i = 1, 2 denote vertex sets of size v such that all locations of vertices in V 1 ∪ V 2 are within distance D of each other and such that V 1 and V 2 have µ-regular marks.Let further F i be the empirical distribution function of the vertex marks corresponding to Since M 1 is µ-regular, this implies that A similar argument holds for F 2 and it follows that, for Now note that x i = (x i , t i ) ∈ V 1 and x j = (x j , t j ) ∈ V 2 are always connected if their corresponding edge mark satisfies which can be evaluated independently of the exact spatial positions.Since the edge mark collection {V x i x j ; x i , x j } is i.i.d. and independent of η ′ we have for the number Σ of edges between V 1 , V 2 It now follows from (2) and the non-negativity of distribution functions that for h(s, t) = ϕ(s, t, D), (s, t) ∈ (0, 1) 2 , and the assertion of the lemma follows, because the estimate is uniform in the configuration η ′ on the event that V 1 , V 2 are µ-regular.
The renormalisation scheme we use to prove Theorem 2.3 requires a number of interdependent parameters, which we now introduce. We first choose a sequence of density parameters (ϱ_n) such that ϱ_n < 1/4 for all n ∈ N and In fact, the precise polynomial decay of ϱ_n is not important as long as which is possible due to our assumption on δeff(0+). Now choose ν = ν(ϕ) satisfying and let Finally, we choose (σ_n) such that σ_n ∈ 2N + 1 for all n and such that for all but finitely many n, where ω = ω(ϕ) satisfies For the remainder of the section, one should think of the sequences (ϱ_n), (σ_n) and the numbers ν, µ and ω as having been fixed. Assume further that two large integers k ∈ N, ℓ ∈ 2N are given; we are going to specify these parameters below, dependent on the density of the infinite cluster θ and the auxiliary variable λ appearing in the formulation of Theorem 2.3. Define a sequence (m_n) = (m_n(ℓ)) of lengths via To lighten notation, we write Γ_n(x) = Γ_{m_n}(x), x ∈ m_n Z^d, for the stage-n cube at x, i.e. the cube in C(m_n) with midpoint x. Note that for any n ∈ N, each stage-n cube can be decomposed into precisely σ_n^d stage-(n − 1) cubes, which we call its subcubes. The preclusters of a cube Γ_n(x) are maximal subsets (with respect to inclusion) of η′ ∩ Γ_n(x) × (0, 1) which are contained in the same connected component of G[∆_k Γ_n(x)] (note that their definition depends on k).
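The nesting of cubes across stages can be illustrated with a small sketch. It assumes the recursion m_n = σ_n · m_{n−1} with m_0 = ℓ, which is what the decomposition of a stage-n cube into σ_n^d subcubes suggests but is not stated explicitly in the surviving text; the function names are hypothetical. The requirement that σ_n be odd is presumably what keeps subcube centres on the coarser lattice, and the code reflects this.

```python
from itertools import product


def stage_lengths(ell, sigmas):
    """Side lengths m_0, m_1, ... assuming m_n = sigma_n * m_{n-1}, m_0 = ell."""
    lengths = [ell]
    for sigma in sigmas:
        lengths.append(sigma * lengths[-1])
    return lengths


def subcube_centres(centre, side, sigma):
    """Centres of the sigma**d subcubes of the cube centre + [-side/2, side/2)^d.

    `side` must be an integer multiple of the odd integer `sigma`; each
    subcube has side length side // sigma.
    """
    if sigma % 2 != 1:
        raise ValueError("sigma must be odd")
    if side % sigma != 0:
        raise ValueError("side must be a multiple of sigma")
    sub_side = side // sigma
    half = (sigma - 1) // 2
    offsets = range(-half, half + 1)
    return [
        tuple(c + k * sub_side for c, k in zip(centre, ks))
        for ks in product(offsets, repeat=len(centre))
    ]


# Usage: stage lengths for ell = 4 and sigma = 3, 5; subcubes of a planar cube.
lengths = stage_lengths(4, [3, 5])
centres = subcube_centres((0, 0), lengths[1], 3)
```

With odd σ the cube centred at the origin is itself one of the subcubes, so the origin-centred cubes of successive stages are nested.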
A stage-0 cube is a cube Γ_0(x) ∈ C(ℓ) and is said to be alive if it contains a precluster with at least ⌈ℓ^d θ/2⌉ vertices. Similarly, the cardinality thresholds act as lower bounds for the number of vertices in the preclusters at further stages, but the condition for aliveness becomes a little more complex. For n ≥ 1, a stage-n cube Γ_n(x) is alive, if of its subcubes are alive;
B(n) at least r_n of its living subcubes contain a (µ, v_{n−1})-regular precluster;
C(n) there are (µ, v_{n−1})-regular preclusters C_1, . . ., C_{r_n}, each associated with a different subcube, which are all mutually adjacent in G.
There is some redundancy in defining aliveness via the properties A(n), B(n) and C(n), but this formulation makes it straightforward to relate the definition to probabilities. Note that the construction ensures that a living stage-n subcube always contains a precluster of size at least v_n. Furthermore, the events are increasing. The main tool needed to prove Theorem 2.3 is the following lemma.

Lemma 3.4.
There exists some constant 0 < C < ∞ depending only on ϕ and ℓ such that To establish Lemma 3.4, we proceed in several steps.For n ∈ N ∪ {0}, we denote by Γ n ∈ C(m n ) the stage-n cube centred at the origin.We now define the events and aim to give lower bounds for their probabilities.Our first result is a straightforward recursive bound for P(A ′ n ) in terms of P(A n ).Lemma 3.5.Set Proof.By translation invariance and Markov's inequality, To obtain further bounds involving the events B n and C n , we define the maximal precluster C n,k (x) ⊂ Γ n (x) × (0, 1) of a stage-n cube Γ n (x) to be its precluster of largest cardinality (note that C n,k (x) may be empty if η ∩ Γ n (x) is empty), unless there is a tie between several preclusters, in which case the maximal precluster is the (almost surely unique) one amongst them containing the vertex with the smallest mark.This definition only works for almost every configuration ξ, we thus set C n,k (x) = ∅ on the set of configurations ξ on which there are at least two preclusters of maximal size with the same minimal mark to obtain a well-defined precluster in any case.Analogously, we define R n,k (x) ⊂ Γ n (x) × (0, 1) to be the maximal (µ, v n )-regular precluster associated with a stage-n cube Γ n (x).
Remark 3.6. Note that once C_{n,k}(x) is non-empty for some configuration ξ, it remains non-empty if any vertex mark of a vertex in C_{n,k}(x) is decreased or if any edge mark of an edge adjacent to C_{n,k}(x) is decreased. The same is true for R_{n,k}(x). This fact is needed to obtain a monotonicity property of the events E(·), F(·, ·) defined in Lemma 3.7.
Let now Γ n [1], . . ., Γ n [σ d n ] denote the subcubes of Γ n , n ∈ N with corresponding centers x n (i) and maximal preclusters n .Lemma 3.7.For any n ∈ N, we have where Proof.We have the disjoint decomposition and on the first event B c n ∩ A ′ n , there has to be a living subcube that has no (µ, v n−1 )regular precluster, which implies that its largest precluster cannot be (µ, v n−1 )-regular.The second event satisfies where n , there has to be a pair of boxes containing a (µ, v n−1 )-regular precluster each, but such that the maximal such precluster in either box are not adjacent.Thus, (10) is established.
The next two lemmas complete the estimates that we need to prove Lemma 3.4.For their proofs we use the following auxiliary subsampling of vertices: Let η be given.To each stage-n cube Γ n (x) ∈ C(m n ), we assign a sample X(x, n) = {X 1 (x, n), . . ., X vn (x, n)} of tagged vertex locations in η ∩ Γ n (x), chosen uniformly without replacement and such that the families {X(x, n), x ∈ m n Z d , n ∈ {0, 1, 2, . . .}} are all mutually independent and also independent of all vertex and edge marks.If η∩Γ n (x) contains fewer than v n vertices, then we set X(x, n) = ∅.We write X(x, n), x ∈ m n Z d , n ∈ {0, 1, 2, . . .} for the vertices corresponding to the tagged sites.The configuration ξ augmented by the independent tagging is denoted ξ and the induced probability distribution on tagged configurations by P.
Lemma 3.8. Let Proof. A simple union bound and translation invariance yield Hence it remains to estimate the probability on the right. Fix n ≥ 1. We define an alternative tagging of vertex locations in Γ_n[1] depending on η as well as edge and vertex marks. Namely, we set Y = ∅ on A_{n−1}(1)^c and on where Y_1, . . ., Y_{v_{n−1}} are chosen uniformly without replacement amongst the vertex locations belonging to the maximal precluster C_{n−1,k}(x_n(1)) of Γ_n(1). Let Pη(·) := P(·|η) and denote the joint distribution of ξ and Y by P and its conditional version given a fixed point configuration η by Pη. Note that on A_{n−1}(1), η must have at least v_{n−1} points in Γ_n[1]. It follows from the uniformity of the sample X(x_n(1), n − 1) and its independence of edge and vertex marks, that since a uniform sample drawn from a finite set S conditioned to be contained in an independently generated random subset S′ ⊂ S has the same distribution as a uniform sample drawn from S′. We have where we define the conditional probabilities to equal 0 if |η ∩ Γ_n[1]| < v_{n−1} and the equalities are due to the fact that the events E(1), A_{n−1}(1) do not involve the tagging at all. However, denoting by S = {S_1, . . ., S_{v_{n−1}}} the vertex marks belonging to the tagged vertex locations in X(x_n(1), n − 1) and by T = {T_1, . . ., T_{v_{n−1}}} the vertex marks belonging to the tagged vertex locations in Y, we also have by (12). Yet S is an i.i.d. sample of v_{n−1} Uniform(0, 1) random variables under P(·|η), whenever |η ∩ Γ_n(1)| ≥ v_{n−1}. Moreover, we claim that the events A_{n−1}(1) and )} are increasing in S and that the event {S is not µ-regular} is decreasing in This is easily seen to be true for A_{n−1}(1) and E(1). The statement for )}, then decreasing any one of the marks S_i, 1 ≤ i ≤ v_n, can only increase the maximal precluster ) in size (or decrease the lowest mark, if there is a tie in sizes); in particular, this means that none of the tagged vertices can leave C_{n−1,k}(x_n(1)) if their marks are decreased. Combining (13) with (14), the FKG-inequality and Lemma 3.1, we obtain Integration over the point configurations η and inserting the result into (11) yields which concludes the proof.
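The sampling fact used in the proof above (a uniform sample from S, conditioned to lie inside an independently generated subset S′, is uniform on S′) can be checked exactly on a toy example. The sketch below conditions on a fixed realisation of S′ and works with samples without replacement viewed as k-element subsets; the function names and the toy sets are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations


def conditioned_law(S, S_prime, k):
    """Law of a uniform k-subset of S, conditioned on being contained in S_prime."""
    all_subsets = [frozenset(c) for c in combinations(sorted(S), k)]
    if not all_subsets:
        raise ValueError("S has fewer than k elements")
    prior = Fraction(1, len(all_subsets))  # unconditioned probability of each subset
    target = frozenset(S_prime)
    inside = [c for c in all_subsets if c <= target]
    mass = prior * len(inside)  # probability of the conditioning event
    if mass == 0:
        raise ValueError("conditioning event has probability zero")
    return {c: prior / mass for c in inside}


def uniform_law(S_prime, k):
    """Law of a uniform k-subset of S_prime."""
    subsets = [frozenset(c) for c in combinations(sorted(S_prime), k)]
    return {c: Fraction(1, len(subsets)) for c in subsets}


# Usage: S = {0, ..., 4}, S' = {0, 2, 3}, samples of size 2.
same_law = conditioned_law(range(5), {0, 2, 3}, 2) == uniform_law({0, 2, 3}, 2)
```

Exact rational arithmetic via `Fraction` avoids any floating-point comparison when the two laws are checked for equality.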
Lemma 3.9.Let Proof.We use a similar approach as in the proof of Lemma 3.8, albeit we need to take a little more care, since the events involved are more complicated.Let n ∈ N, 1 ≤ i < j ≤ σ d n , and the corresponding subcubes Γ n (i), Γ n (j) ⊂ Γ n be fixed.Our goal is to bound the probability P(B(i) ∩ B(j) ∩ F (i, j)).As in the previous proof, we define additional randomly tagged locations that depend on edge and vertex marks.More precisely, set Y (i) = Y (j) = ∅ on (B(i) ∩ B(j)) c and on B(i) ∩ B(j), we sample two sets of locations , respectively, uniformly and without replacement.The joint distribution of ξ and the tagged sets Y (i), Y (j) is denoted by P. The vertex mark collections corresponding to Y (i) and Y (j) are denoted by T (i) and T (j), the corresponding vertex sets by Y(i), Y(j) ⊂ η ′ , and the vertex mark sets associated with the independently tagged locations X(x n (i), n − 1) and X(x n (j), n − 1) are denoted by S(i) and S(j), respectively.The edge marks on potential edges between X(x n (i), n−1) and X(x n (j), n−1) are denoted by Arguing precisely as in the proof of Lemma 3.8, we find that where and P η , Pη and Pη denote the conditional versions of P, P and P, respectively, given a fixed point configuration η.We may thus rewrite Let us say, that two vertices x = (x, s), y = (y, t) with x, y ∈ η ∩ Γ n are strongly connected, if If V, W ⊂ η ′ ∩ Γ n × (0, 1) are disjoint vertex sets, then we set {V ⇌ W } := {∃x ∈ V, y ∈ W : x and y are strongly connected}.
We further have, that since √ dm n is an upper bound for the distance of any two vertices in Γ n and the event {S(i), S(j) are µ-regular} almost surely occurs conditionally on G.Under Pη with η placing sufficiently many points into subcubes such that Pη (B(i) ∩ B(j)) > 0, the joint distribution of S(i), S(j) and Furthermore, the event is measurable w.r.t.σ(S(i), S(j), V (i, j)) and increasing.B(i)∩B(j) is clearly an increasing event w.r.t. to the full configuration ξ and G is increasing in S(i), S(j) and V (i, j), since if ξ satisfies G, then any configuration obtained by decreasing a vertex mark in S(i) ∪ S(j) or an edge mark in V (i, j) must also be in G.For vertex marks, this is checked as in the proof of Lemma 3.8 under the additional provision that (µ, v n )-regularity be not violated for either maximal precluster, which follows from the monotonicity of that property in S(i) and S(j), respectively.For the edge marks V (i, j), this follows from the fact that the G (and therefore the composition of the maximal preclusters) can only be affected if the edge corresponding to the mark is added.But since ξ satisfies G, this means adding an edge incident to both maximal (µ, v n−1 )-regular preclusters, which can only make those clusters larger.We conclude that we may apply the FKG-inequality, Lemma 3.1 and Lemma 3.2 to (17) to obtain Combining the estimate ( 18) with ( 16) and integrating over point configurations η, we get and this estimate is uniform in the choice of subcubes Γ n (i), Γ n (j).It follows that , and the proof is concluded.
We are now ready to prove Lemma 3.4.
Proof of Lemma 3.4. By Lemma 3.7, we have thus combining Lemmas 3.5, 3.8 and 3.9 yields, for any n ∈ N, To bound the right hand side further, we first observe that, by definition of σ_n, for all but finitely many n ∈ N. Since v_n → ∞ as n → ∞, we also have for all sufficiently large n, and finally, we claim that for all sufficiently large n, which we show at the end of the proof. Inserting (20)-(22) into (19), we obtain Setting ṽ_{n−1} = v_{n−1}^ν and using the definition of δeff(µ*) < 2 as well as µ < µ*, we see that there exists some small ζ_0 > 0 with δeff(µ*) + ζ_0 < 2, such that for every ζ ∈ (0, ζ_0) there exists some N(ζ) such that for all n > N(ζ) Using that δeff(µ*) < 2 and the choice (6) of ν, it is easy to see that if we choose ζ small enough, we can find some small value µ_0 (depending on µ, ν), such that for all sufficiently large n a_n ≤ a_{n−1} From the choice of ω in (8) and the definition of v_{n−1}, it follows that v_n grows at least like a small power of n!. Since ϱ_n decays only polynomially, we can find some large and since ϱ_n < 1/4 for all n ∈ N, we conclude that which by induction yields (1 + 3ϱ_k) for all n > L.
Since (ϱ_n) is summable, the product on the right hand side converges and we obtain the uniform bound (9) asserted in the lemma.
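For completeness, the convergence of the product follows from the elementary inequality 1 + x ≤ e^x; the display below is a standard argument written out as a sketch, with the index range of the product left unspecified since it is not recoverable from the text.

```latex
% Standard bound: a product of factors (1 + 3\varrho_k) with summable
% non-negative \varrho_k is bounded, using 1 + x \le e^{x} for x \ge 0.
\[
  \prod_{k} (1 + 3\varrho_k)
  \;\le\; \prod_{k} e^{3\varrho_k}
  \;=\; \exp\Bigl(3 \sum_{k} \varrho_k\Bigr)
  \;<\; \infty .
\]
```

Since all factors are at least 1, the partial products are non-decreasing and bounded, hence convergent.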
We conclude by verifying (22). Note that and, writing ν = 1 + ε_ν with ε_ν > 0, it is sufficient to show for all sufficiently large n ∈ N, which follows from the following calculation based on the choices of ϱ_n and σ_n: We can find numbers K, R ∈ N, such that for all n > K + 1, Using the bound (8), we see that ε_ν(dω − 2) > 2 and hence we can find ε′ such that for all sufficiently large n, which concludes the proof.
The remainder of this section is devoted to the completion of the proof of Theorem 2.3.
Proof of Theorem 2.3. Let $\varepsilon \in (0, 1/2)$ and $0 < \lambda < 1$ be given. Fix $\omega$ such that and note that this condition implies (8) and let the other parameters $\mu$, $\nu$, $(\sigma_n)$ and $(\rho_n)$ be defined as before. We have not yet specified the initial cube length $\ell = m_0$ and the parameter $k$ used in the definition of preclusters, which we do now. By ergodicity, we may choose $\ell$ so large that, with probability exceeding $1 - \varepsilon/2$, there is a set $A$ of at least $\theta \ell^d/2$ vertices inside $\Gamma_0$ that belong to the infinite cluster. Since the infinite cluster is unique, there is some $k^*(\ell) < \infty$ such that all the vertices in $A$ are contained within the same cluster of $G[\Delta_k \Gamma_0]$ with probability exceeding $1 - \varepsilon/2$ if $k > k^*(\ell)$. We have thus shown that, with probability exceeding $1 - \varepsilon$, we can find a $(v_0, \mu)$-precluster in $\Gamma_0$, and conclude that $a_0 = P(\Gamma_0 \text{ is not alive}) \le \varepsilon$. Invoking Lemma 3.4, we now obtain that where $C$ depends only on $\varphi$ and $\ell$. If the stage-$L$ cube $\Gamma_L$ is alive, then it follows from the definitions of aliveness and preclusters that $\Delta_k \Gamma_L$ contains a cluster of size Let $\bar{\Gamma}_L$ denote the union of $\Gamma_L$ with the $3^d - 1$ stage-$L$ cubes neighbouring it. Since $k$ depends only on the initial cube size $\ell$, we can find $L_0 \in \mathbb{N}$ such that From the choice of $(\varrho_n)$, $(\sigma_n)$ and (23) we can also deduce the existence of constants $0 < q_1, q_2, q_3, q_4 < \infty$, such that for $L$ sufficiently large. Since we can choose $\varepsilon$ arbitrarily close to 0 and $\lambda$ arbitrarily close to 1, the conclusion of Theorem 2.3 now follows easily for the subsequence To obtain the result for the original sequence, fix $\varepsilon > 0$ and $1 > \lambda' > \lambda$ arbitrarily. Now choose $N$ so large that for all $k \ge N$ are satisfied. Then, if $n \ge M_n$, we can always find $k$ with $m_k \le n < m_{k+1}$ and such that and the proof is complete.

Transience
• for every $i \ge 2$, there is a partition of $V_i$ into subsets $v_1, v_2, \dots$ of size $\ell_{i+1}$ such that for every $k$ and every pair (u

Fix furthermore a parameter $\mu \in (0, 1/2)$ which governs the regularity of vertex weights just as in the previous section. Once again, the scaling sequence $(\sigma_n)$ tells us how fast the scales of the renormalisation scheme grow, but the construction of connections between clusters at different scales will be significantly different to make it compatible with Definition 4.1.
Recall that $C(m)$ denotes the collection of disjoint cubes of side-length $m$ centred at the points of $m\mathbb{Z}^d$. A cube for $n \ge 1$, is called a stage-$n$ cube. We now define the procedure that will allow us to conclude the renormalisability of $C_\infty$ for $(\alpha_n)$. The scheme requires us to start at some sufficiently large scale, so let $n_1 \in \mathbb{N}$ be the smallest stage of cubes we will consider. We do not fix $n_1$ yet and assume only that it is large. We now define what it means for a cube to be good or bad. We begin at the bottom levels, namely $\Gamma \in C_{n_1}$ is good if Having declared what happens at the bottom levels, we are now prepared to initiate the recursive part of the scheme. For a given level-$n$ cube $\Gamma$ with $n > n_1 + 1$, we say that a pair Based on the goodness at levels $n_1$, $n_1 + 1$ and the notion of well-connectedness, we now declare a stage-$n$ cube $\Gamma$ with $n > n_1 + 1$ to be good if there exists a subset $F$ of the good subcubes of $\Gamma$ satisfying

Remark 4.4. The recursive architecture induced by a hierarchy of good cubes is more complex than the one used in the proof of Theorem 2.3, due to the intertwining of levels $n$ and $n + 2$. However, note that the goodness of $\Gamma$ only depends on $G[\Gamma]$ and therefore is independent of the status of cubes on the same level, if $\eta$ is the (percolated) lattice or a Poisson process.
A careful inspection of the above construction shows that it indeed produces a renormalised graph sequence.
Having defined our renormalisation scheme subject to the precise choice of the parameters $\lambda$, $\mu$ and $n_1$, we are now ready to complete the proof.
Proof of Lemma 4.3. The calculation is very similar to the proof of Theorem 2.3, which allows us to recycle some of the parameters chosen there. Let, in particular, $\mu^* > 0$ be given as in (5) and define $\nu > 1$ and $\mu < \mu^*$ as in (6) and (7). Finally, choose , 3}, denote the events that the conditions $E(n)$, $E_i(n)$, $F_i(n)$ are satisfied respectively for the stage-$n$ cube $\Gamma_n$ centred at the origin. Similarly, we write $L_n$, $n \ge n_1$, for the event that $\Gamma_n$ is good. Note that transience of $C_\infty$ has either probability 0 or 1 by ergodicity. Due to translation invariance and Lemma 4.5 it is thus enough to show that and since the events $L_n$ are increasing, this can be further simplified, using the FKG inequality, to Note that by construction, $L_{n_1}$ and $L_{n_1+1}$ are defined differently from the other scales, so we will bound their probabilities separately. We first bound from above the probability of the complementary event $L_n^c$ for $n > n_1 + 1$ and begin by writing To bound P(( , one may argue as in the proof of Lemma 3.8 to obtain It follows that, for $n$ sufficiently large, for some constants $c, C > 0$ which only depend on $\mu$.

Moving on to bound $P((F^2_n)^c \cap F^1_n)$, note that any two vertices in $\Gamma_n$ are at most at distance away from each other. Let $q_n$ be the probability that two renormalised clusters of $(n-2)$-level subcubes in the same $n$-level cube are not connected. We note that by construction, any $(n-2)$-renormalised cluster in any good $(n-2)$-level cube contains at least $\prod_{i=1}^{n-2} \alpha_i$ vertices. Therefore, by Lemma 3.2, and noting that $\lambda > 1/\nu$ implies where the second inequality holds for all sufficiently large $n$.

We now proceed to bound $P(F^1_n)$. Note that by independence, the number of good subcubes of $\Gamma$ dominates a $\mathrm{Bin}(m, q)$ random variable $X$, where $q = P(L_{n-1})$, $m = \sigma_n^d$. Fixing $\Theta \in (0, 1)$, Chernoff's bound states The same bound applies to $P(L^c_{n_1+1})$, since the calculation for the complements of the defining events $E^1_{n_1+1}$, $E^2_{n_1+1}$ and $E^3_{n_1+1}$ can be done along the same lines as for the $F^i_n$. Finally, we have by Theorem 2.3 that for any $\lambda \in (0, 1)$ and any $\varepsilon > 0$, $P(L_{n_1}) > 1 - \varepsilon$ if $n_1$ is chosen sufficiently large.
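The Chernoff step in the proof above can be checked numerically. The displayed form of the bound is not reproduced here, so the sketch below assumes the standard multiplicative lower-tail version $P(X \le (1-\Theta)mq) \le \exp(-\Theta^2 mq/2)$ for $X \sim \mathrm{Bin}(m, q)$, which may differ from the exact form used in the proof. The function names and the sample values of $n$, $d$, $q$ and $\Theta$ are illustrative assumptions.

```python
import math


def binom_lower_tail(m: int, q: float, k: int) -> float:
    """Exact P(X <= k) for X ~ Bin(m, q), summed term by term."""
    return sum(math.comb(m, j) * q**j * (1 - q) ** (m - j) for j in range(k + 1))


def chernoff_lower_bound(m: int, q: float, theta: float) -> float:
    """Assumed standard bound exp(-theta^2 * m * q / 2) on P(X <= (1 - theta) * m * q)."""
    return math.exp(-(theta**2) * m * q / 2)


def compare(m: int, q: float, theta: float) -> tuple[float, float]:
    """Return (exact lower tail, Chernoff bound) at the threshold (1 - theta) * m * q."""
    k = math.floor((1 - theta) * m * q)
    return binom_lower_tail(m, q, k), chernoff_lower_bound(m, q, theta)


if __name__ == "__main__":
    d = 2          # illustrative dimension
    q = 0.9        # illustrative value for P(L_{n-1})
    theta = 0.5    # illustrative value for Theta
    for n in range(1, 5):
        m = (n + 1) ** (2 * d)  # m = sigma_n^d with sigma_n = (n + 1)^2
        exact, bound = compare(m, q, theta)
        print(f"n={n}, m={m}: exact tail {exact:.3e} <= bound {bound:.3e}")
```

Because $m = \sigma_n^d$ grows polynomially in $n$, the bound decays faster than any polynomial in $n$ once $q$ stays bounded away from 0.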
Define now the sequence $\ell_n := 1 - (n + 1)^{-3/2}$ and observe that $\prod_{i=1}^{\infty} \ell_i > 0$. We will now show that if $P(L_n) > \ell_n$, then it follows inductively that $P(L_{n+1}) > \ell_{n+1}$. We calculate where the second inequality holds if $n$ is sufficiently large. Let $n_1$ now be large enough so that all $n > n_1$ satisfy the previous assumptions about $n$ being large and furthermore let $n_1$ be large enough that $P(L_{n_1}) > 1 - 2^{-3/2}$. Then, using the same calculation yields that $P(L_{n_1+1}) > \ell_{n_1+1}$ and the claim follows for all larger $n$.
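The positivity of the infinite product of the $\ell_i$ can be verified directly. The following short derivation is a sketch that uses the elementary inequality $1 - x \ge e^{-2x}$ for $x \in [0, 1/2]$, which applies because $(i+1)^{-3/2} \le 2^{-3/2} < 1/2$ for all $i \ge 1$; the series in the exponent converges since its exponent $3/2$ exceeds 1.

```latex
\prod_{i=1}^{\infty} \ell_i
  \;=\; \prod_{i=1}^{\infty} \bigl( 1 - (i+1)^{-3/2} \bigr)
  \;\ge\; \exp\Bigl( -2 \sum_{i=1}^{\infty} (i+1)^{-3/2} \Bigr)
  \;>\; 0 .
```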
We can now write $\prod_{n \ge n_1} P(L_n) \ge \prod_{n \ge n_1} \ell_n > 0$.
Together with Lemma 4.5, this gives the existence of the renormalised graph sequence with positive probability and concludes the proof.

Continuity properties of percolation
For a given graph $G$, we denote by $pG$ the graph obtained from $G$ by independent Bernoulli vertex percolation with retention probability $p$. Recall that $\eta$ is finite range if $\eta(A)$ and $\eta(B)$ are independent whenever $A$ and $B$ are sufficiently far separated.

Proof. We renormalise the model using the cubes $C(m)$. By the finite range assumption on $\eta$, we may fix $m_0$ so large that $\eta(\Gamma_m(x))$ and $\eta(\Gamma_m(y))$ are independent for all $m \ge m_0$ and all $x, y \in m\mathbb{Z}^d$ with $|x - y| \ge 3m$. Using the classical domination result of Liggett et al. [17], it is straightforward to show that there are retention probabilities $p^*_{3,d}, q^*_{3,d} < 1$, such that any ergodic 3-independent $(p, q)$-site-bond percolation measure on $\mathbb{Z}^d$ with $p > p^*_{3,d}$ and $q > q^*_{3,d}$ almost surely produces an infinite cluster.
Note that by assumption such a choice is always possible if $\lambda$ is sufficiently close to 1. By Theorem 2.3 and the fact that $\mu$-regularity is a monotone event for a given cube, we can choose $m$ so large that the density $p_\lambda$ of $\lambda$-good cubes is arbitrarily close to 1. In particular, we can achieve $p_\lambda \in (p^*_{3,d}, 1)$. By Lemma 3.2 the maximal local clusters in two good cubes associated with neighbouring vertices in $m\mathbb{Z}^d$ are connected with probability at least and we obtain, by choice of $\lambda$ and $\mu(\lambda)$, $q_\lambda \ge 1 - e^{-m^{\lambda d (2 - \delta_\lambda)(1 + o(1))}} > q^*_{3,d}$,

Remark 2.5. The conclusion of Theorem 2.4 does not include $d = 1$, unless $\delta = \delta_{\mathrm{eff}}(0+)$, in which case it is a straightforward and minor extension of [2, Theorem 1.5]. This is rooted in the scaling behaviour of the inhomogeneous model: in general, coarse-grained versions of the model behave quite differently from the original model. For homogeneous long-range percolation the opposite is true, as discussed in the introduction, cf. [2, Lemma 2.4]. In our renormalisation arguments, the tool used to connect large clusters is Lemma 3.2 below, which is only effective for clusters close enough to each other. Therefore our proof of Theorem 2.4 relies on comparison with supercritical nearest-neighbour models, whereas the $d = 1$ case would require a comparison with a suitable supercritical long-range model.
If $G$ is any random geometric graph and $D \subset \mathbb{R}^d$ is some bounded domain, we write $G[D]$ for the subgraph of $G$ induced by vertices located in $D$.

Proposition 5.1. (Continuity of percolation from the left, $d \ge 2$) Let $d \ge 2$ and let $G$ be an instance of inhomogeneous long-range percolation on a finite range point set $\eta$ with $\delta_{\mathrm{eff}}(0+) < 2$. Assume that an infinite cluster exists. Then there exists $p < 1$ such that $pG$ contains an infinite cluster almost surely.
Remark 2.1. The choice $\Lambda = \mathbb{Z}^d$ is the most natural one for the discrete set-up. However, we do not use any symmetry properties specific to $\mathbb{Z}^d$. Our results are based on renormalisation arguments which use half-open cubes of the form $(-a, a]^d$ and their translates. All our results remain valid if one chooses $\Lambda = \{Bz : z \in \mathbb{Z}^d\}$, where $B$ is some non-singular $d \times d$ matrix, and replaces the cubes by the corresponding parallelepipeds. This changes a few constants appearing below relating volumes and distances but does not alter the content of the theorems. The same applies of course to adapting the norm $|\cdot|$ to $\Lambda$: it is usually more natural to work with the corresponding lattice distance on $\Lambda$ instead of Euclidean distance.

The canonical examples for $\eta$ are a homogeneous Poisson process and i.i.d. Bernoulli($p$) percolation on $\mathbb{Z}^d$ with $p \in (0, 1]$. For simplicity, we assume that $E\eta((-1/2, 1/2]^d) = 1$; this can always be achieved by a straightforward rescaling of the ambient space. Although some parts of our considerations are valid under the sole assumptions of ergodicity and positive correlations on $\eta$ (see the paragraph below), some of our main results require a stronger control on dependencies. We say that $\eta$ has finite range if there exists some number $K$ such that $\eta(A)$ and $\eta(B)$ are independent whenever $A, B \subset \mathbb{R}^d$ are at distance greater than $K$ from each other. Often, it is convenient to view $\eta$ from a typical point, hence we frequently work with the Palm version $\eta_0$ of $\eta$ that has a point at the origin $0 \in \mathbb{R}^d$. Note that $\eta$ is translation invariant under shifts of $\Lambda$ if and only if, under $P_0$, $\eta$ is invariant under shifting the origin into another typical point of $\eta$, see [2,

We show that $C_\infty$ is transient by explicitly constructing a transient subgraph. To this end, we use the notion of renormalised graphs [2, Def. 2.8]. An $\ell$-merger of an infinite graph $H$ is any graph $H'$ obtainable by partitioning $V(H)$ into subsets $v_1, v_2, \dots$ of size $\ell$, setting $V(H') = \{v_i, i \in \mathbb{N}\}$ and $v_i v_j \in E(H')$ if and only if there are $u_i \in v_i$, $u_j \in v_j$ with $u_i u_j \in E(H)$. The graph $G_0 = (V_0, E_0)$ is renormalised for the sequence $(\ell_n)_{n \in \mathbb{N}}$ if we can construct a sequence $G_1, G_2, \dots$ of graphs with [2,

subsets of $V_{i-1}$) and every $w_1 \in u_1$, $w_2 \in u_2$ (interpreted as a pair of subsets of $V_{i-2}$), we have that either $xy \in E_{i-2}$ for all $x \in w_1$, $y \in w_2$, or .

The wording of our definition of renormalised graphs differs from [2, Def. 2.8], but it is straightforward to check that the two formulations are in fact equivalent. Define for $n \in \mathbb{N}$ the values $\alpha_n := (n + 1)^{2\lambda d}$, $\sigma_n := (n + 1)^2$, where $\lambda \in (1/2, 1)$. Our goal is to show that $C_\infty$ contains a subgraph that is renormalised for $(\alpha_n)_{n \in \mathbb{N}}$; by [2, Lemma 2.7], this implies transience.
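To make the merger operation concrete, here is a minimal sketch on a finite graph, used as a stand-in for the infinite graph $H$ of the definition. The function names `merger`, `alpha` and `sigma` are hypothetical and chosen for illustration only. The sketch also evaluates the two scale sequences and checks the identity $\alpha_n = \sigma_n^{\lambda d}$, which follows directly from their definitions.

```python
def merger(edges, blocks):
    """Edge set of the merged graph H' for a given partition of V(H).

    edges  : iterable of pairs (u, w), the edges of H
    blocks : list of disjoint vertex sets v_1, v_2, ... covering V(H)

    Two blocks are adjacent in H' if and only if some vertex of one block
    is joined in H to some vertex of the other block.
    """
    block_of = {u: i for i, block in enumerate(blocks) for u in block}
    merged = set()
    for u, w in edges:
        i, j = block_of[u], block_of[w]
        if i != j:  # edges inside a block do not create an edge of H'
            merged.add(frozenset((i, j)))
    return merged


def sigma(n: int) -> int:
    """Scale sequence sigma_n = (n + 1)^2."""
    return (n + 1) ** 2


def alpha(n: int, lam: float, d: int) -> float:
    """Block-size sequence alpha_n = (n + 1)^(2 * lam * d)."""
    return (n + 1) ** (2 * lam * d)


if __name__ == "__main__":
    # A path on six vertices, merged with blocks of size l = 2.
    path_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
    path_blocks = [{0, 1}, {2, 3}, {4, 5}]
    print(merger(path_edges, path_blocks))  # a path on the three blocks

    lam, d = 0.75, 2  # illustrative choices with lam in (1/2, 1)
    for n in range(1, 4):
        print(n, sigma(n), alpha(n, lam, d), sigma(n) ** (lam * d))
```

In this toy example the 2-merger of a six-vertex path is a three-vertex path: merging reduces the vertex count by the factor $\ell$ while retaining every connection between distinct blocks.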