1 Introduction

Motivation and main results. We consider the level-set percolation for the Gaussian free field (GFF) on a connected, locally finite, transient graph \(G=(V,E)\). Of particular interest is the case of the hypercubic lattice \({{\mathbb {Z}}}^d\) in dimensions \(d \ge 3\). The Gaussian free field \(\varphi = (\varphi _x)_{x\in V}\) is defined as the centered Gaussian process with covariance \({\mathbb {E}}(\varphi _x \varphi _y) = g(x,y)\) for all \(x,y\in V\), where \(g(\cdot ,\cdot )\) stands for the Green function of the simple random walk on G. Given \(h\in {{\mathbb {R}}}\), we are interested in the excursion set \(\{\varphi \ge h\}\,{:}{=}\,\{x\in V:\, \varphi _x\ge h\}\) seen as a random subgraph of G (with the induced adjacency). As h varies, this defines a percolation model for which one may expect to see a phase transition in h from a percolative regime—where \(\{\varphi \ge h\}\) contains an infinite connected component—to a non-percolative regime—where all the clusters of \(\{\varphi \ge h\}\) are finite. Consider the percolation density function defined by

$$\begin{aligned} \theta (h)\,{:}{=}\,{\mathbb {P}}[|\mathcal {C}_o(h)|=\infty ], \end{aligned}$$

where \(\mathcal {C}_o(h)\) denotes the connected component (or cluster) of a fixed origin \(o\in V\) in \(\{\varphi \ge h\}\). We can then define the percolation critical point \(h_*\) given by

$$\begin{aligned} h_*(G)\,{:}{=}\,\sup \{h\in {{\mathbb {R}}}:\, \theta (h)>0\}. \end{aligned}$$

The first and most fundamental question in percolation theory is the existence of a non-trivial phase transition, which in our case corresponds to \(-\infty< h_*< +\infty \). A soft argument due to Bricmont et al. [BLM87] shows that the GFF percolates above any negative level, i.e. \(h_*({{\mathbb {Z}}}^d)\ge 0~(>-\infty )\) for all \(d\ge 3\)—see also [AS18] for a proof that \(h_*(G)\ge 0\) for every transient graph G. The strict inequality \(h_*({{\mathbb {Z}}}^d)>0\) has been recently proved [DPR18a] in the case of \({{\mathbb {Z}}}^d\) for all \(d\ge 3\). The opposite inequality \(h_*<+\infty \) is a bit more delicate. In the special case \(G={{\mathbb {Z}}}^d\), \(d\ge 3\), this was proved by Rodriguez and Sznitman [RS13]—the case \(d=3\) had already been obtained in [BLM87]. It is also known that \(h_*<+\infty \) for regular trees [Szn16] and graphs of polynomial growth satisfying certain regularity properties [DPR18b, Remark 7.2 1)], but this remains open for more general transient graphs. Remarkably, this is in contrast with the classical Bernoulli percolation, for which proving the existence of a percolative regime is in general much harder than proving the existence of a non-percolative regime—see [DCGR+20].

Once the existence of a phase transition is established, the next important question concerns the uniqueness of the critical point, i.e. whether \(h_*\) defined above is the only value at which one can see a qualitative change in the large-scale behavior of the model. This immediately raises the question of whether there are critical points at other values of h and how to define them. There are two main approaches to this question.

From a percolation theory perspective, a natural approach consists in defining alternative critical parameters \({\bar{h}}\) and \(h_{**}\), which characterize a strongly percolative and a strongly non-percolative regime, respectively. In the last decade, this approach has been successfully implemented in the case \(G={{\mathbb {Z}}}^d\): definitions appeared in many works—see e.g. [RS13, PR15, DRS14, Szn15]—and more recently it has been proved by Duminil-Copin, Goswami, Rodriguez & Severo [DCGRS20] that indeed \({\bar{h}}=h_*=h_{**}\). This equality is often referred to as “sharpness” of phase transition and is also expected to hold for other transient graphs, but this remains open. The corresponding result for Bernoulli percolation on \({{\mathbb {Z}}}^d\) was obtained in the highly influential works of Aizenman & Barsky [AB87] and Menshikov [Men86] (on the subcritical phase) and Grimmett & Marstrand [GM90] (on the supercritical phase).

From the point of view of statistical physics, a classical approach consists in considering a function (such as \(\theta \)) describing the macroscopic behavior of the model, and defining the critical points to be the singularities of that function. Uniqueness of the critical point then corresponds to the analyticity of this function on \({{\mathbb {R}}}\setminus \{h_*\}\), which is precisely the main result of the present article. Let us mention that the corresponding result for Bernoulli percolation on \({{\mathbb {Z}}}^d\) has been proved on the subcritical phase by Kesten [Kes81] and on the supercritical phase by Georgakopoulos and Panagiotis [GP]. Hermon and Hutchcroft [HH21] also proved a corresponding result for Bernoulli percolation on non-amenable transitive graphs.

In order to state our main result, we need to introduce some notation. Let \(\mathcal {X}\) denote the family of all finite subsets of V. We say that a cluster observable \(F:\mathcal {X}\rightarrow {{\mathbb {C}}}\) has subexponential growth if \(|F(S)|\le e^{o(\mathrm {cap}({\overline{S}}))}\) as \(\mathrm {cap}({\overline{S}})\rightarrow \infty \). Here \(\mathrm {cap}({\overline{S}})\) denotes the (harmonic) capacity of \({\overline{S}}\), the (vertex) closure of S—see Sect. 2 for definitions. Finally, for a cluster observable \(F:\mathcal {X}\rightarrow {{\mathbb {C}}}\) and a subset \(X\in \mathcal {X}\), consider the function \({\overline{F}}^X:{{\mathbb {R}}}\rightarrow {{\mathbb {C}}}\) defined by

$$\begin{aligned} {\overline{F}}^X(h)\,{:}{=}\,{\mathbb {E}}[F(\mathcal {C}_X(h)) \mathbbm {1}_{|\mathcal {C}_X(h)|<\infty }], \end{aligned}$$

where \(\mathcal {C}_X(h)\) denotes the union of all clusters in \(\{\varphi \ge h\}\) intersecting X.

Theorem 1.1

Let \(G={{\mathbb {Z}}}^d\), \(d\ge 3\). Then for every observable \(F:\mathcal {X}\rightarrow {{\mathbb {C}}}\) of subexponential growth and every \(X\in \mathcal {X}\), the function \({\overline{F}}^X\) is well-defined and analytic on \({{\mathbb {R}}}\setminus \{h_*\}\).

Notice that the analyticity of the percolation density \(\theta \) on \({{\mathbb {R}}}\setminus \{h_*\}\) follows from Theorem 1.1 by taking \(F\equiv 1\), for which \({\overline{F}}^{\{o\}}=1-\theta \). Besides \(\theta \), other functions of interest are the (truncated) susceptibility

$$\begin{aligned} \chi (h)\,{:}{=}\,{\mathbb {E}}[|\mathcal {C}_o(h)| \mathbbm {1}_{|\mathcal {C}_o(h)|<\infty }], \end{aligned}$$

the (finite) open clusters per vertex

$$\begin{aligned} \kappa (h)\,{:}{=}\,{\mathbb {E}}[|\mathcal {C}_o(h)|^{-1} \mathbbm {1}_{|\mathcal {C}_o(h)|<\infty }], \end{aligned}$$

the truncated k-point function

$$\begin{aligned} \tau ^{f}_X(h)\,{:}{=}\,{\mathbb {P}}[\mathcal {C}_X(h) \text { connected},~|\mathcal {C}_X(h)|<\infty ] \end{aligned}$$

and the (non-truncated) k-point function

$$\begin{aligned} \tau _X(h)\,{:}{=}\,{\mathbb {P}}[\mathcal {C}_X(h) \text { connected}], \end{aligned}$$

where \(X\in \mathcal {X}\) with \(|X|=k\). The following is a corollary of Theorem 1.1.

Corollary 1.2

Let \(G={{\mathbb {Z}}}^d\), \(d\ge 3\). Then all the functions \(\theta (h)\), \(\chi (h)\), \(\kappa (h)\), \(\tau ^{f}_X(h)\) and \(\tau _X(h)\), \(X\in \mathcal {X}\), are analytic on \({{\mathbb {R}}}\setminus \{h_*\}\).
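To make the link with Theorem 1.1 concrete, here is a sketch of the observables behind the first three functions. The convention that the empty cluster contributes zero to \(\kappa \) is an assumption on our part, and \(c'=c'(d)\) denotes the constant in the capacity lower bound \(\mathrm {cap}(K)\ge c'|K|^{\frac{d-2}{d}}\) on \({{\mathbb {Z}}}^d\).

$$\begin{aligned} \theta (h)&=1-{\overline{F}}^{\{o\}}(h) \quad \text {with } F\equiv 1,\\ \chi (h)&={\overline{F}}^{\{o\}}(h) \quad \text {with } F(S)=|S|,\\ \kappa (h)&={\overline{F}}^{\{o\}}(h) \quad \text {with } F(S)=|S|^{-1}\mathbbm {1}_{S\ne \emptyset },\\ |S|&\le \big (\mathrm {cap}(S)/c'\big )^{\frac{d}{d-2}}\le \big (\mathrm {cap}({\overline{S}})/c'\big )^{\frac{d}{d-2}}. \end{aligned}$$

The last line uses the monotonicity of the capacity. It shows that \(F(S)=|S|\) grows at most polynomially in \(\mathrm {cap}({\overline{S}})\), hence subexponentially, while the other two observables are bounded by 1.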

The only function for which Corollary 1.2 does not follow readily from Theorem 1.1 is the (non-truncated) k point function \(\tau _X(h)\). In order to deduce its analyticity, simply notice that by the uniqueness of the infinite cluster (see e.g. [RS13, Remark 1.6]) and the inclusion–exclusion principle, we can write

$$\begin{aligned} \tau _X(h)&=\tau ^{f}_X(h)+1-{\mathbb {P}}\big [\bigcup _{x\in X}\{|\mathcal {C}_x(h)|<\infty \}\big ]\\&=\tau ^{f}_X(h)+1-\sum _{\emptyset \ne Y\subset X} (-1)^{|Y|+1}{\mathbb {P}}[|\mathcal {C}_Y(h)|<\infty ]. \end{aligned}$$
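The two steps behind this identity can be unpacked as follows. This is a sketch under the assumption that the event \(\{\mathcal {C}_X(h) \text { connected}\}\) is read as "all vertices of X lie in a single cluster of \(\{\varphi \ge h\}\)", the usual meaning of a k-point function.

$$\begin{aligned} \tau _X(h)-\tau ^{f}_X(h)&={\mathbb {P}}[\mathcal {C}_X(h) \text { connected},~|\mathcal {C}_X(h)|=\infty ]={\mathbb {P}}\big [\bigcap _{x\in X}\{|\mathcal {C}_x(h)|=\infty \}\big ],\\ \bigcap _{y\in Y}\{|\mathcal {C}_y(h)|<\infty \}&=\{|\mathcal {C}_Y(h)|<\infty \} \quad \text {for every } \emptyset \ne Y\subset X. \end{aligned}$$

Uniqueness of the infinite cluster enters in the second equality of the first line: if every \(x\in X\) lies in an infinite cluster, then all these clusters coincide. Passing to the complement gives the first line of the identity for \(\tau _X\), and the inclusion–exclusion principle applied to the events \(\{|\mathcal {C}_x(h)|<\infty \}\), \(x\in X\), gives its second line.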

We remark that the analyticity of \(\tau _X(h)\) may break down on the supercritical phase if uniqueness of infinite cluster does not hold. Indeed, for Bernoulli percolation there are examples [HH21] of transitive non-amenable graphs for which \(\tau \) has a discontinuity at the uniqueness critical point \(p_u\), which in this case satisfies \(p_c<p_u<1\).

Our proof of analyticity of \({\overline{F}}^X\) makes crucial use of the following convenient series decomposition. For every integer \(N\ge 0\) and \(h\in {{\mathbb {R}}}\), consider the event

$$\begin{aligned} A^X_N(h)\,{:}{=}\,\{|\mathcal {C}_X(h)|<\infty \}\cap \{N\le \mathrm {cap}(\overline{\mathcal {C}_X(h)})<N+1\}. \end{aligned}$$

We can then write

$$\begin{aligned} {\overline{F}}^X(h)=\sum _{N=0}^\infty {\mathbb {E}}[F(\mathcal {C}_X(h)) \mathbbm {1}_{A^X_N(h)}]. \end{aligned}$$
(1.1)

With the series (1.1) in hand, it is enough to show that each term \({\overline{F}}^X_N(h)\,{:}{=}\, {\mathbb {E}}[F(\mathcal {C}_X(h)) \mathbbm {1}_{A^X_N(h)}]\) can be analytically extended to a domain of \({{\mathbb {C}}}\) containing \({{\mathbb {R}}}\setminus \{h_*\}\) on which the series converges locally uniformly. A crucial step to establish such a convergence is proving that \({\mathbb {P}}[A^X_N(h)]\) decays exponentially in N and locally uniformly in \(h\ne h_*\). This is the content of the following theorem, which we believe to be of independent interest.

Theorem 1.3

Let \(G={{\mathbb {Z}}}^d\), \(d\ge 3\). Then for every \(\varepsilon >0\) and \(X\in \mathcal {X}\), there exists \(c=c(X,\varepsilon ,d)>0\) such that \({\mathbb {P}}[A^X_N(h)]\le e^{-cN}\) for every \(N\ge 0\) and every \(h\in {{\mathbb {R}}}\) with \(|h-h_*|\ge \varepsilon \).

It is easy to prove that there exists \(c'=c'(d)>0\) such that \(\mathrm {cap}(K)\ge c'|K|^{\frac{d-2}{d}}\) for every subset \(K\subset {{\mathbb {Z}}}^d\). The following corollary thus follows readily from Theorem 1.3.

Corollary 1.4

Let \(G={{\mathbb {Z}}}^d\), \(d\ge 3\). Then for every \(\varepsilon >0\) and \(X\in \mathcal {X}\), there exists \(c=c(X,\varepsilon ,d)>0\) such that \({\mathbb {P}}[N\le |\mathcal {C}_X(h)|<\infty ]\le \exp \{-cN^{\frac{d-2}{d}}\}\) for every \(N\ge 0\) and every \(h\in {{\mathbb {R}}}\) with \(|h-h_*|\ge \varepsilon \).
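A sketch of the deduction from Theorem 1.3, where \(M_0\,{:}{=}\,\lfloor c'N^{\frac{d-2}{d}}\rfloor \) is shorthand introduced here and c is the constant of Theorem 1.3. On the event \(\{N\le |\mathcal {C}_X(h)|<\infty \}\), monotonicity of the capacity gives \(\mathrm {cap}(\overline{\mathcal {C}_X(h)})\ge \mathrm {cap}(\mathcal {C}_X(h))\ge c'N^{\frac{d-2}{d}}\), so that \(A^X_M(h)\) occurs for some \(M\ge M_0\). Hence

$$\begin{aligned} {\mathbb {P}}[N\le |\mathcal {C}_X(h)|<\infty ]\le \sum _{M\ge M_0}{\mathbb {P}}[A^X_M(h)]\le \sum _{M\ge M_0}e^{-cM}=\frac{e^{-cM_0}}{1-e^{-c}}\le \frac{e^{c}}{1-e^{-c}}\exp \{-c\,c'N^{\frac{d-2}{d}}\}. \end{aligned}$$

For large N the prefactor can be absorbed by decreasing the constant in the exponent; small values of N are not spelled out here.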

The order of exponential decay in the upper bounds provided by Theorem 1.3 and Corollary 1.4 is believed to be the correct one. Optimizing the constant c governing the rate of exponential decay is more challenging and beyond the scope of this article, but we believe that our techniques might shed some light on this problem as well.

In recent years, large deviation problems for GFF percolation events have attracted considerable attention; see e.g. [Szn15, NS20, Nit18, Szn19b, GRS21]. A common feature in these problems is a deep connection with potential theory and in particular the notion of capacity. Typically, the exponential rate of decay is given by the solution of a constrained optimization problem involving the Dirichlet energy and, in some cases, the percolation density \(\theta \) as well; see e.g. [Szn19a, Szn20, Szn21] for results in that direction for the closely related model of random interlacements. It is therefore relevant to understand the regularity of \(\theta \) in order to study these optimization problems. Motivated by this, it has been recently proved [Szn19c] that for the vacant set of random interlacements, \(\theta \) is \(C^1\) on an interval of the parameter space, which is conjectured to coincide with the supercritical phase. We expect that the techniques developed in the present article may be helpful to study similar questions for the random interlacements and other strongly correlated models as well. The proof of Theorem 1.3 is based on a coarse graining argument which is very much in the spirit of the works cited above. However, we would like to highlight a key new aspect of our work: we use a coarse graining procedure that involves, at the same time, multiple scales instead of only one. We describe this multi-scale coarse graining scheme in more detail at the end of this section.

We now discuss the case of general transient graphs. First, we observe that the (uniform) exponential decay for \({\mathbb {P}}[A^X_N(h)]\) always implies the analyticity of \({\overline{F}}^X\)—see Proposition 2.2. By a simple shift-argument, one can show that such exponential decay holds for all negative values of h on any transient graph—see Proposition 2.3. This implies the following theorem. Recall that \(h_*(G)\ge 0\) is known to hold [BLM87] for every transient graph G.

Theorem 1.5

For every transient graph G, every observable \(F:\mathcal {X}\rightarrow {{\mathbb {C}}}\) of subexponential growth and every \(X\in \mathcal {X}\), the function \({\overline{F}}^X\) is well-defined and analytic on \((-\infty ,0)\).

Under weaker assumptions on the decay of \({\mathbb {P}}[A^X_N(h)]\), we can prove that \({\overline{F}}^X\) is smooth for observables \(F:\mathcal {X}\rightarrow {{\mathbb {C}}}\) of (at most) polynomial growth, i.e. satisfying \(|F(S)|\le C|\mathrm {cap}({\overline{S}})|^C\) for all \(S\in \mathcal {X}\) and some constant \(C\in (0,\infty )\). We say that a sequence \((c_N)_{N\ge 0}\) of positive real numbers decays super-polynomially fast if \(\lim _{N\rightarrow \infty } \tfrac{\log (c_N)}{\log N}=-\infty \). We define

We also define an analogous parameter in the subcritical phase

Theorem 1.6

For every transient graph G, every observable \(F:\mathcal {X}\rightarrow {{\mathbb {C}}}\) of (at most) polynomial growth and every \(X\in \mathcal {X}\), \({\overline{F}}^X\) is well-defined and \(C^{\infty }\) on \({{\mathbb {R}}}\setminus [{\widetilde{h}},{\hat{h}}]\).

The parameters \({\widetilde{h}}\) and \({\hat{h}}\) defined above can be seen, respectively, as an alternative definition of the classical parameters \({\bar{h}}\) and \(h_{**}\) mentioned above—see [DCGRS20] for the precise definitions. Indeed, for the case \(G={{\mathbb {Z}}}^d\), one can prove that finite clusters have stretched exponential tails for \(h>h_{**}\) and \(h<{\bar{h}}\), therefore \({\bar{h}}\le {\widetilde{h}}\le h_*\le {\hat{h}}\le h_{**}\), which in turn implies \({\widetilde{h}}=h_*={\hat{h}}\) as the equality \({\bar{h}}=h_{**}\) is known in this case [DCGRS20]. It is natural to expect that the equality \({\widetilde{h}}=h_*={\hat{h}}\) holds in great generality, but sharpness of phase transition remains open beyond \({{\mathbb {Z}}}^d\). It is also natural to expect that, independently of sharpness, one might be able to bootstrap the decay of \({\mathbb {P}}[A^X_N(h)]\) from super-polynomial to exponential via a coarse graining argument, thus proving that \(\theta \) is analytic on \({{\mathbb {R}}}\setminus [{\widetilde{h}},{\hat{h}}]\). This is essentially what we do in the proof of Theorem 1.3 for the case \(G={{\mathbb {Z}}}^d\): we start from a sub-optimal decay provided by the assumption \(h\in {{\mathbb {R}}}\setminus [{\bar{h}},h_{**}]~(={{\mathbb {R}}}\setminus \{h_*\}\) by [DCGRS20]) and enhance it to the desired exponential decay through a coarse graining argument—see the discussion below for more details. On general graphs though, developing a coarse graining argument is more challenging due to a poorer understanding of their geometry.

   About the proof. As mentioned above, our proof makes crucial use of the series (1.1). We first use a shift-argument based on the Cameron–Martin formula to naturally construct an analytic extension of the function \({\overline{F}}^X_N(h)={\mathbb {E}}[F(\mathcal {C}_X(h)) \mathbbm {1}_{A^X_N(h)}]\) to the whole complex plane \({{\mathbb {C}}}\) for every \(N\ge 0\). This construction provides a simple way to effectively estimate the growth of this entire function along the imaginary direction. More precisely, we prove in Proposition 2.1 that \(|{\overline{F}}^X_N(h+it)|\le \exp \{\tfrac{1}{2}t^2(N+1)\}\overline{|F|}^X_N(h)\) for every \(N\ge 0\) and \(h,t\in {{\mathbb {R}}}\). Due to this result, it is not difficult to deduce the locally uniform convergence (and therefore analyticity) of the series (1.1) from the (uniform) exponential upper bound for \({\mathbb {P}}[A^X_N(h)]\) on the real line; see Proposition 2.2. This exponential bound is then provided in the case \(G={{\mathbb {Z}}}^d\) by Theorem 1.3, which is the most technical part of this article.

Before discussing the ideas involved in the proof of Theorem 1.3, we would like to highlight some key differences between GFF level-sets and Bernoulli percolation. Kesten’s proof [Kes81] of analyticity for Bernoulli percolation on the subcritical phase is based on a series expansion similar to (1.1), but in terms of the cluster size \(N=|\mathcal {C}_X|\). Since in the subcritical phase the cluster size decays exponentially in probability [AB87, Men86] and the expansion in the imaginary direction also grows (at most) exponentially in N, one can prove that the series converges locally uniformly near the real line and is therefore analytic. This strategy does not work in the supercritical phase though: while the expansion in the imaginary direction is still exponential in N, the decay of the cluster probabilities is exponential in its boundary size [KZ90], which is typically of order \(N^{\frac{d-1}{d}}=o(N)\). Motivated by this issue, Georgakopoulos & Panagiotis [GP] considered a series decomposition in terms of the size of (multi-)interfaces instead, in which case both the expansion in the imaginary direction and the decay on the real line are of the same exponential order. For the GFF level-sets though, none of these decompositions can work as the decay on the real line is subexponential in both the volume and boundary sizes. Nevertheless, we observe that both the imaginary expansion and the decay of cluster probability (in both subcritical and supercritical phases!) are exponential in the capacity of the cluster, thus allowing us to make effective use of the series expansion (1.1). This fact is due to an entropic repulsion phenomenon that emerges from the strong (non-integrable) correlations of the GFF, which in turn are deeply related to the potential theory attached to the random walk.

We will now outline the main ideas present in the proof of Theorem 1.3. As mentioned above, a quite substantial multi-scale coarse graining argument takes place in the proof. We start by discussing the more natural single-scale coarse graining approach with the hope of making the need for a multi-scale argument more apparent. This single-scale approach would consist in choosing an appropriate scale L and observing that on the event \(A^X_N(h)\) one can find a family \(\mathcal {F}\) of L-boxes on which an unlikely event (a so-called bad event) happens. Then one can hope to prove that, for every given \(\mathcal {F}\), the probability that all of these boxes are bad is at most \(e^{-cN}\), while keeping the combinatorial complexity (i.e. the number of possible families \(\mathcal {F}\)) of order \(e^{o(N)}\). In order to prove the desired exponential upper bound, one can use the harmonic decomposition of GFF on each box of \(\mathcal {F}\) into the sum of a local and a global field and then consider two cases: either most boxes of \(\mathcal {F}\) are globally bad, meaning that the global (harmonic) field deviates from 0, or many boxes are locally bad, meaning that an unlikely percolation event occurs for the local field. By applying a large deviation result of Sznitman [Szn15], one can prove that the probability of the first case decays exponentially in the capacity, i.e. it is smaller than \(e^{-cN}\), as desired. In the second case though, one can use independence to show that its probability is smaller than \(p_L^{|\mathcal {F}|}\), where \(p_L\) is the probability of a single L-box being locally bad. On the one hand, since the available a priori bound on \(p_L\) is only stretched exponential in L and the geometry of \(\mathcal {F}\) is completely arbitrary, one quickly notices that in order for the desired inequality \(p_L^{|\mathcal {F}|}\le e^{-cN}\) to hold uniformly in \(\mathcal {F}\), it is necessary to choose L not too large. On the other hand, because of the arbitrary geometry of \(\mathcal {F}\) again, it is necessary to take L sufficiently large in order to have a combinatorial complexity of order \(e^{o(N)}\). As a consequence, choosing such a scale L becomes impossible, suggesting the need for a multi-scale approach.

Our multi-scale coarse graining construction goes roughly as follows. For each configuration \(\varphi \in A^X_N(h)\) we construct a set of bad (and very-bad) boxes \(\mathcal {F}\) consisting of multiple scales. We do so inductively in the scales, starting from a sufficiently large scale L such that the combinatorial complexity is of order \(e^{o(N)}\). We then look at all the boxes where something unlikely happens (these boxes are called bad) and we add to \(\mathcal {F}\) all those boxes where something “very unlikely” happens (these boxes are called very-bad). Here “very unlikely” corresponds to an event depending on a box \(B\in \mathcal {F}\) for which an improved a priori upper bound of type \(q_L\le e^{-c\,\mathrm {cap}(B)}\) holds. If these boxes have capacity of order N, we are done. Otherwise, we can go down to a smaller scale \(L'<L\) and inspect the bad \(L'\)-boxes contained in the remaining L-boxes (i.e. bad but not very-bad) and add to \(\mathcal {F}\) those \(L'\)-boxes which are very-bad. By continuing this process, we eventually obtain either a family of very-bad boxes with capacity of order N or a very large number of bad boxes of the smallest scale \(L_0\). We can then prove that the probability of both cases is smaller than \(e^{-cN}\). Since each time we go down one scale we look only inside certain boxes of the previous scale, it turns out that we can do so by keeping the combinatorial complexity of order \(e^{o(N)}\), as desired. For this construction to work though, one has to define the notions of bad and very-bad boxes in a very careful way so that a certain propagation property holds; see item (iii) of Definition 3.2.

   Organization of the paper. In Sect. 2 we review the potential theory attached to the simple random walk and describe the shift-argument used to extend each term of the series (1.1) to an entire function. We then prove Theorems 1.5 and 1.6 and also deduce Theorem 1.1 from Theorem 1.3, to which the remaining sections are dedicated. In Sect. 3, we describe the large deviation argument used to prove Theorem 1.3. In Sect. 4 we prove the (deterministic) multi-scale coarse graining theorem stated in Sect. 3. Finally, in Sects. 5 and 6 we prove the decay in probability for the notions of bad and very-bad boxes introduced in Sect. 3.

2 Potential Theory and Analytic Extension

We start by introducing some notation. For any pair \(x,y\in V\) we write \(x\sim y\) if \(\{x,y\}\in E\). Given \(S\in \mathcal {X}\), we may consider its (inner) boundary \(\partial S\,{:}{=}\,\{x\in S:~\exists y\in V{\setminus } S,~x\sim y\}\), its outer boundary \(\partial ^{out} S\,{:}{=}\,\{x\in V {\setminus } S:~\exists y\in S,~x\sim y\}\) and its closure \({\overline{S}}\,{:}{=}\,S\cup \partial ^{out}S\).

We now recall some potential theory attached to simple random walk (SRW) on the graph \(G=(V,E)\), which is assumed to be locally finite, connected and transient for the SRW. We denote by \(P_x\) the canonical law of the discrete-time SRW on G starting at \(x \in V\) and write \((X_n)_{n \ge 0}\) for the corresponding process. We let \(g(\cdot ,\cdot )\) stand for the Green function of the walk,

$$\begin{aligned} g(x,y) \,{:}{=}\,\dfrac{1}{d(y)}\sum _{n=0}^{\infty } P_x [X_n = y], \quad \text { for }x,y \in V, \end{aligned}$$
(2.1)

where d(y) denotes the degree of y. It is well known that the Green function is finite (as G is transient), symmetric and positive-definite. Therefore, we can effectively define the GFF \(\varphi =(\varphi _x)_{x\in V}\) as the centered Gaussian field with covariance matrix g. In the case of \({{\mathbb {Z}}}^d\), \(d\ge 3\), it is well known that \(g(x,y)\asymp \Vert x-y \Vert _{\infty }^{2-d}\).

Given a finite and non-empty \(K \subset V\), we write

$$\begin{aligned} g_K(x,y) \,{:}{=}\,\dfrac{1}{d(y)} \sum _{n=0}^\infty P_x[X_n =y, n < T_K], \quad \text { for }x,y \in V, \end{aligned}$$
(2.2)

for the Green function of simple random walk killed on the boundary of K, where \(T_K:= \inf \{n\ge 0: X_n \in V{\setminus } K\}\). For \(x\in V\), consider the equilibrium measure \(e_K(x) \,{:}{=}\,d(x)P_x [{\widetilde{H}}_K = \infty ] \mathbbm {1}_{x\in K}\), where \({\widetilde{H}}_K\,{:}{=}\,\min \{n\ge 1:~X_n\in K\}\). The capacity of K is defined as its total mass,

$$\begin{aligned} \mathrm {cap}(K) \,{:}{=}\,\sum _{x \in K} e_K(x). \end{aligned}$$
(2.3)

The capacity is an increasing and sub-additive function, i.e. \(\mathrm {cap}(A)\le \mathrm {cap}(A\cup B)\le \mathrm {cap}(A)+\mathrm {cap}(B)\) for every \(A,B\subset V\). The following variational characterization of the capacity is useful for obtaining lower bounds:

$$\begin{aligned} \mathrm {cap}(K)=\big (\inf _{\nu } E(\nu )\big )^{-1}, \end{aligned}$$
(2.4)

where \(E(\nu )\,{:}{=}\,\sum _{x,y} \nu (x)\nu (y)g(x,y)\) and the infimum ranges over all probability measures \(\nu \) supported on K. As a direct consequence, one has the following inequality

$$\begin{aligned} \frac{|K|}{\sup _{x\in K} \sum _{y\in K} g(x,y)}\le \mathrm {cap}(K)\le \frac{|K|}{\inf _{x\in K} \sum _{y\in K} g(x,y)}. \end{aligned}$$
(2.5)

In the special case of the box \(B_L=[0,L)^d\) on \({{\mathbb {Z}}}^d\), one can conclude that

$$\begin{aligned} \mathrm {cap}(B_L) \asymp L^{d-2} \quad \text {for all } L\ge 1. \end{aligned}$$
(2.6)
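Both (2.5) and (2.6) can be checked in a few lines. The sketch below relies on the standard last-exit identity \(\sum _{y\in K} g(x,y)e_K(y)=P_x[H_K<\infty ]\), with \(H_K\) the first hitting time of K; this identity is not stated in the text and is imported here as an assumption. The measure \(\nu _{\mathrm {unif}}\) denotes the uniform probability measure on K, and the constants \(c,C,C'\) depend only on d.

$$\begin{aligned} \mathrm {cap}(K)^{-1}&\le E(\nu _{\mathrm {unif}})=|K|^{-2}\sum _{x,y\in K} g(x,y)\le |K|^{-1}\sup _{x\in K}\sum _{y\in K} g(x,y),\\ |K|&=\sum _{x\in K}\sum _{y\in K} g(x,y)e_K(y)\ge \mathrm {cap}(K)\inf _{y\in K}\sum _{x\in K} g(x,y),\\ c\,L^{2-d}|B_L|&\le \sum _{y\in B_L} g(x,y)\le C\Big (1+\sum _{r=1}^{L} r^{d-1}r^{2-d}\Big )\le C'L^2 \quad \text {for every } x\in B_L. \end{aligned}$$

The first line gives the lower bound in (2.5) and the second, together with the symmetry of g, gives the upper bound. The third line follows from \(g(x,y)\asymp \Vert x-y \Vert _{\infty }^{2-d}\) by summing over shells around x; inserting it into (2.5) yields \(\mathrm {cap}(B_L)\asymp L^{d}/L^{2}=L^{d-2}\).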

The optimizing measure in (2.4) is precisely the normalized equilibrium measure \({\overline{e}}_K(x)\,{:}{=}\,e_K(x)/\mathrm {cap}(K)\). Further, for every \(K \subset K' \subset \subset {{\mathbb {Z}}}^d\) one has the sweeping identity

$$\begin{aligned} \mathrm {cap}(K) = \sum _{x\in K'} e_{K'}(x) P_x[H_K < \infty ], \end{aligned}$$
(2.7)

where \(H_K\,{:}{=}\,\min \{n\ge 0:~X_n\in K\}\). Consider the Dirichlet inner product defined as

$$\begin{aligned} \mathcal {E}(f,g) \,{:}{=}\,-\sum _{x \in V} \Delta f(x) g(x) \end{aligned}$$
(2.8)

for every pair of functions \(f,g: V \rightarrow {{\mathbb {R}}}\) for which the sum converges (for instance, if either \(\Delta f\) or g has finite support), where \(\Delta f(x)\,{:}{=}\,\sum _{y\sim x}(f(y)-f(x))\) is the Laplacian of f. One also has the following variational characterization of capacity in terms of the Dirichlet energy

$$\begin{aligned} \mathrm {cap}(K)=\inf _{f} \mathcal {E}(f,f), \end{aligned}$$
(2.9)

where the infimum ranges over all functions f such that \(\mathcal {E}(f,f)\) is well defined and \(f(x)\ge 1\) for every \(x\in K\). The optimizing function in (2.9) is called the harmonic potential of K and is given by

$$\begin{aligned} f_K(x)\,{:}{=}\,P_x[H_K<\infty ]. \end{aligned}$$

In fact, \(f_K\) takes value 1 on K and is harmonic on \(V{\setminus } K\), i.e. \(\Delta f_K(x)=0\) for all \(x\in V{\setminus } K\).
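The identity \(\mathcal {E}(f_K,f_K)=\mathrm {cap}(K)\), which is used repeatedly below, follows from a one-step computation (our own sketch, using only the definitions above and assuming K finite). For \(x\in K\) we have \(f_K(x)=1\), and conditioning on the first step of the walk gives

$$\begin{aligned} -\Delta f_K(x)&=\sum _{y\sim x}\big (1-P_y[H_K<\infty ]\big )=d(x)P_x[{\widetilde{H}}_K=\infty ]=e_K(x),\\ \mathcal {E}(f_K,f_K)&=\sum _{x\in K} e_K(x)f_K(x)=\sum _{x\in K} e_K(x)=\mathrm {cap}(K), \end{aligned}$$

where the second line uses that \(\Delta f_K\) vanishes outside K.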

Given a function \(f:V \rightarrow {{\mathbb {C}}}\) for which the Laplacian \(\Delta f(x)\) has finite support, we introduce the complex measure

$$\begin{aligned} {\widetilde{{\mathbb {P}}}}_f(d\varphi )\,{:}{=}\,\exp \left\{ -\tfrac{1}{2}\mathcal {E}(f,f)-\mathcal {E}(f,\varphi )\right\} {\mathbb {P}}(d\varphi ). \end{aligned}$$
(2.10)

Notice that when f takes real values, the Cameron–Martin formula implies that \({\widetilde{{\mathbb {P}}}}_{f}\) is a probability measure and furthermore, the law of \(\varphi \) under \({\widetilde{{\mathbb {P}}}}_{f}\) coincides with the law of \(\big (\varphi _x-f(x)\big )\) under \({\mathbb {P}}\). This observation will allow us to extend the probability of local events to the complex plane.

Proposition 2.1

Let F be a cluster observable such that for every \(N\ge 1\) there is \(M>0\) for which \(|F(S)|\le M\) for every \(S\in \mathcal {X}\) with \(\mathrm {cap}(S)\le N\). For every \(X\in \mathcal {X}\) and \(N\ge 0\), the function \({\overline{F}}^X_N(h)={\mathbb {E}}[F(\mathcal {C}_X(h)) \mathbbm {1}_{A^X_N(h)}]\) extends to an entire function such that for every \(h,t\in {{\mathbb {R}}}\),

$$\begin{aligned} |{\overline{F}}^X_N(h+it)|\le \exp \left\{ \tfrac{1}{2}t^2(N+1)\right\} \overline{|F|}^X_N(h), \end{aligned}$$
(2.11)

where \(\overline{|F|}^X_N(h)={\mathbb {E}}[|F(\mathcal {C}_X(h))| \mathbbm {1}_{A^X_N(h)}]\).

Proof

Let \(S\in \mathcal {X}\). We start by extending \(h\mapsto {\mathbb {P}}[\mathcal {C}_X(h)=S]\) to the complex plane. For every \(z\in {{\mathbb {C}}}\), we define

$$\begin{aligned} \begin{aligned} \theta ^X_S(z)\,{:}{=}\,{\widetilde{{\mathbb {P}}}}_{zf_{{\overline{S}}}}[\mathcal {C}_X(0)=S]&= {\mathbb {E}}[\exp \left\{ -\tfrac{1}{2}z^2\mathcal {E}(f_{{\overline{S}}},f_{{\overline{S}}})-z\mathcal {E}(f_{{\overline{S}}},\varphi )\right\} \mathbbm {1}_{\mathcal {C}_X(0)=S}]\\&=\exp \{-\tfrac{1}{2}z^2\mathrm {cap}({\overline{S}})\}\sum _{k=0}^{\infty } \dfrac{{\mathbb {E}}[\left( -\mathcal {E}(f_{{\overline{S}}},\varphi )\right) ^k \mathbbm {1}_{\mathcal {C}_X(0)=S}]}{k!}z^k. \end{aligned} \end{aligned}$$
(2.12)

First notice that since the event \(\{\mathcal {C}_X(0)=S\}\) only depends on \(\varphi \) restricted to \({\overline{S}}\) and \(hf_{{\overline{S}}}=h\) on \({\overline{S}}\), it follows from the Cameron–Martin formula that \(\theta ^X_{S}(h)\) is indeed equal to \({\mathbb {P}}[\mathcal {C}_X(h)=S]\) for \(h\in {{\mathbb {R}}}\). In order to prove that \(\theta ^X_S(z)\) is analytic on \({{\mathbb {C}}}\) it suffices to show that the series in (2.12) converges locally uniformly. Indeed, this follows directly from the fact that for all \(n\ge 0\),

$$\begin{aligned} \left|\sum _{k=0}^{n} \dfrac{\left( -\mathcal {E}(f_{{\overline{S}}},\varphi )\right) ^k}{k!}z^k \right|\le \sum _{k=0}^\infty \dfrac{\left|\mathcal {E}(f_{{\overline{S}}},\varphi )\right|^k}{k!} |z|^k=\exp \left\{ |z\mathcal {E}(f_{{\overline{S}}},\varphi )|\right\} , \end{aligned}$$

and \({\mathbb {E}}[\exp \left\{ |z\mathcal {E}(f_{{\overline{S}}},\varphi )|\right\} ]\) is finite for every \(z\in {{\mathbb {C}}}\) as \(\mathcal {E}(f_{{\overline{S}}},\varphi )\) is a Gaussian random variable.

We will now obtain a bound for \(\theta ^X_S(h+it)\) in terms of \(\theta ^X_S(h)\) for \(h,t\in {{\mathbb {R}}}\). By the Cameron–Martin formula, we have

$$\begin{aligned} \theta ^X_S(h+it)={\widetilde{{\mathbb {P}}}}_{itf_{{\overline{S}}}}[\mathcal {C}_X(h)=S]= {\mathbb {E}}[\exp \left\{ \tfrac{1}{2}t^2\mathrm {cap}({\overline{S}})-it\mathcal {E}(f_{{\overline{S}}},\varphi )\right\} \mathbbm {1}_{\mathcal {C}_X(h)=S}]. \end{aligned}$$
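To see why no extra phase factor survives, one can expand the exponent in (2.12) at \(z=h+it\). In the sketch below, \(c_S\,{:}{=}\,\mathrm {cap}({\overline{S}})=\mathcal {E}(f_{{\overline{S}}},f_{{\overline{S}}})\) and \(Y\,{:}{=}\,\mathcal {E}(f_{{\overline{S}}},\varphi )\) are shorthand introduced here.

$$\begin{aligned} -\tfrac{1}{2}z^2c_S-zY=\big (-\tfrac{1}{2}h^2c_S-hY\big )+\big (\tfrac{1}{2}t^2c_S-itY-iht\,c_S\big ). \end{aligned}$$

The first bracket is the Cameron–Martin density of the real shift \(hf_{{\overline{S}}}\), under which \(\varphi \) is replaced by \(\varphi -hf_{{\overline{S}}}\). This replaces Y by \(Y-hc_S\) and the event \(\{\mathcal {C}_X(0)=S\}\) by \(\{\mathcal {C}_X(h)=S\}\), so the second bracket becomes \(\tfrac{1}{2}t^2c_S-it(Y-hc_S)-iht\,c_S=\tfrac{1}{2}t^2c_S-itY\), in agreement with the formula for \(\theta ^X_S(h+it)\).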

Since \(|\exp \left\{ -it\mathcal {E}(f_{{\overline{S}}},\varphi )\right\} |=1\) a.s., we obtain

$$\begin{aligned} |\theta ^X_S(h+it)|\le \exp \{\tfrac{1}{2}t^2\mathrm {cap}({\overline{S}})\}{\mathbb {P}}[\mathcal {C}_X(h)=S]. \end{aligned}$$
(2.13)

Finally, for every \(z\in {{\mathbb {C}}}\), we define

$$\begin{aligned} {\overline{F}}^X_N(z)\,{:}{=}\,\sum _{S\in \mathcal {A}_N} F(S)\theta ^X_S(z). \end{aligned}$$
(2.14)

Here \(\mathcal {A}_N\) denotes the family of all sets \(S\in \mathcal {X}\) such that \(N\le \mathrm {cap}({\overline{S}})<N+1\). By (2.13) and our assumption on F,

$$\begin{aligned} \sum _{S\in \mathcal {A}_N}|F(S)\theta ^X_S(z)|\le \exp \{\tfrac{1}{2}|\text {Im}(z)|^2 (N+1)\}\sup _{S\in \mathcal {A}_N} |F(S)| <\infty . \end{aligned}$$

We can then apply the Weierstrass M-test to conclude that the series in (2.14) converges locally uniformly and therefore \({\overline{F}}^X_N\) is indeed analytic on \({{\mathbb {C}}}\). The inequality (2.11) follows readily from (2.13). \(\square \)
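For completeness, here is a sketch of the step from (2.13) to (2.11). It uses \(\mathrm {cap}({\overline{S}})<N+1\) for \(S\in \mathcal {A}_N\) and the fact that the events \(\{\mathcal {C}_X(h)=S\}\), \(S\in \mathcal {A}_N\), are disjoint with union \(A^X_N(h)\).

$$\begin{aligned} |{\overline{F}}^X_N(h+it)|&\le \sum _{S\in \mathcal {A}_N}|F(S)|\exp \{\tfrac{1}{2}t^2\mathrm {cap}({\overline{S}})\}{\mathbb {P}}[\mathcal {C}_X(h)=S]\\&\le \exp \{\tfrac{1}{2}t^2(N+1)\}\sum _{S\in \mathcal {A}_N}|F(S)|{\mathbb {P}}[\mathcal {C}_X(h)=S]=\exp \{\tfrac{1}{2}t^2(N+1)\}\overline{|F|}^X_N(h). \end{aligned}$$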

With Proposition 2.1 in hand, we can now easily obtain a sufficient condition for the analyticity of \({\overline{F}}^X\).

Proposition 2.2

Let \(X\in \mathcal {X}\). If there exists a constant \(t>0\) such that \({\mathbb {P}}[A^X_N(h)]\le e^{-tN}\) for every \(N\ge 0\) and \(h\in (a,b)\), then \({\overline{F}}^X\) is analytic on \((a,b)\) for every cluster observable F of subexponential growth.

Proof

By Proposition 2.1 and our assumption on the decay of \({\mathbb {P}}[A^X_N(h)]\), we obtain that \(|{\overline{F}}^X_N(z)|\le \exp \{-\tfrac{1}{2}t(N-1)\}\sup _{S\in \mathcal {A}_N} |F(S)|\) for every \(z\in (a,b)\times (-\sqrt{t},\sqrt{t})\). By the subexponential growth of F it follows that the series \({\overline{F}}^X(z)=\sum _{N=0}^\infty {\overline{F}}^X_N(z)\) converges uniformly on \((a,b)\times (-\sqrt{t},\sqrt{t})\), hence it is analytic on that set. \(\square \)
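The exponent arithmetic behind this proof can be checked numerically. The sketch below is illustrative only: it assumes that (2.11), which is not restated in this section, takes the form \(|{\overline{F}}^X_N(z)|\le \exp \{\tfrac{1}{2}\text {Im}(z)^2(N+1)\}\sup _{S\in \mathcal {A}_N}|F(S)|\,{\mathbb {P}}[A^X_N(\text {Re}(z))]\), and all function names are hypothetical.

```python
import math


def strip_exponent(t: float, n: int, im_z: float) -> float:
    """Exponent of exp{0.5*Im(z)^2*(N+1)} * exp{-t*N}.

    This combines the ASSUMED form of (2.11) with the hypothesis
    P[A_N(h)] <= exp(-t*N) of Proposition 2.2.
    """
    return 0.5 * im_z ** 2 * (n + 1) - t * n


def claimed_exponent(t: float, n: int) -> float:
    """Exponent -0.5*t*(N-1) stated in the proof of Proposition 2.2."""
    return -0.5 * t * (n - 1)


def worst_gap(t: float, n_max: int = 200, steps: int = 50) -> float:
    """Largest value of strip_exponent - claimed_exponent on a grid of
    0 <= Im(z) <= sqrt(t) and 0 <= N <= n_max (the sign of Im(z) is irrelevant)."""
    gap = -math.inf
    for n in range(n_max + 1):
        for j in range(steps + 1):
            im_z = math.sqrt(t) * j / steps
            gap = max(gap, strip_exponent(t, n, im_z) - claimed_exponent(t, n))
    return gap
```

Under this assumption the difference of the two exponents equals \(\tfrac{1}{2}(N+1)(\text {Im}(z)^2-t)\), which is nonpositive on the strip and vanishes on its boundary, so `worst_gap` should return a value that is zero up to rounding.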

Notice that Theorem 1.1 follows directly from Proposition 2.2 and Theorem 1.3, whose proof is presented in the following sections. Theorem 1.5 follows from Proposition 2.2 and the following simple result. Recall that \(\{\varphi \ge h\}\) is known to percolate for every \(h<0\) on any transient graph [BLM87].

Proposition 2.3

For every transient graph G, \(X\in \mathcal {X}\), \(h<0\) and \(N\ge 0\), we have

$$\begin{aligned} {\mathbb {P}}[A^X_N(h)]\le \exp \{-\tfrac{1}{2}h^2 N\}. \end{aligned}$$

Proof

Let \(h<0\) and \(S\in \mathcal {A}_N\). Recall that by the Cameron–Martin formula,

$$\begin{aligned} {\mathbb {P}}[\mathcal {C}_X(h)=S]&={\widetilde{{\mathbb {P}}}}_{hf_{{\overline{S}}}}[\mathcal {C}_X(0)=S]\\&=\exp \left\{ -\tfrac{1}{2}h^2\mathcal {E}(f_{{\overline{S}}},f_{{\overline{S}}})\right\} {\mathbb {E}}[\exp \left\{ -h\mathcal {E}(f_{{\overline{S}}},\varphi )\right\} \mathbbm {1}_{\mathcal {C}_X(0)=S}]. \end{aligned}$$

Notice that on the event \(\{\mathcal {C}_X(0)=S\}\), we have \(\varphi _x\le 0\) for every \(x\in \partial ^{out} S\supset \partial {\overline{S}}\). Moreover, \(\Delta f_{{\overline{S}}}(x)=0\) for every \(x\in V\setminus \partial {\overline{S}}\) and \(\Delta f_{{\overline{S}}}(x)\le 0\) for every \(x\in \partial {\overline{S}}\). It follows that \(\exp \{-h\mathcal {E}(f_{{\overline{S}}},\varphi )\}\le 1\) on the event \(\{\mathcal {C}_X(0)=S\}\). Furthermore, \(\mathcal {E}(f_{{\overline{S}}},f_{{\overline{S}}})=\mathrm {cap}({\overline{S}})\ge N\). Overall we obtain

$$\begin{aligned} {\mathbb {P}}[\mathcal {C}_X(h)=S]\le \exp \left\{ -\tfrac{1}{2}h^2 N\right\} {\mathbb {P}}[\mathcal {C}_X(0)=S] \end{aligned}$$

and the desired inequality follows by summing over S. \(\square \)

We finish this section by proving Theorem 1.6.

Proof of Theorem 1.6

Let us write \(D(h,R)\) for the closed disk in the complex plane that is centred at h and has radius R. Consider some \(h\in {{\mathbb {R}}}\setminus [{\tilde{h}},{\hat{h}}]\) and let \(R=(N+1)^{-1/2}\). By Proposition 2.1 and the Cauchy estimate, we can bound the kth derivative of \({\overline{F}}^X_N\) as follows

$$\begin{aligned} |\partial ^k{\overline{F}}^X_N(h)|\le \dfrac{k!M_R}{R^k}, \end{aligned}$$

where \(M_R=\sup _{z\in D(h,R)} |{\overline{F}}^X_N(z)|\). The inequality (2.11) implies that

$$\begin{aligned} M_R\le e^{1/2} \sup _{S\in \mathcal {A}_N} |F(S)|\sup _{h'\in [h-R,h+R]} {\mathbb {P}}[A^X_N(h')]. \end{aligned}$$

Thus, by the (at most) polynomial growth of F and the superpolynomial decay of \({\mathbb {P}}[A^X_N(h')]\), it follows that \(|\partial ^k {\overline{F}}^X_N(h)|\) decays to 0 super-polynomially fast and uniformly on compact subsets of \({{\mathbb {R}}}\setminus [{\tilde{h}},{\hat{h}}]\). We can now conclude that the sum \(\sum _{N=0}^{\infty } |\partial ^k {\overline{F}}^X_N(h)|\) converges uniformly on compact subsets of \({{\mathbb {R}}}\setminus [{\tilde{h}},{\hat{h}}]\), hence the kth derivative of \({\overline{F}}^X(h)\) exists and is equal to \(\sum _{N=0}^{\infty } \partial ^k {\overline{F}}^X_N(h)\). \(\square \)
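To make the role of the radius \(R=(N+1)^{-1/2}\) concrete, the following sketch evaluates the Cauchy bound \(k!M_R/R^k=k!(N+1)^{k/2}M_R\) and sums it over N. The decay profile \(M_R=e^{-\sqrt{N}}\) used in the usage note is a hypothetical stand-in for a superpolynomially decaying quantity, not a bound taken from the paper, and the function names are illustrative.

```python
import math
from typing import Callable


def cauchy_bound(k: int, n: int, m_r: float) -> float:
    """Cauchy estimate k! * M_R / R^k with R = (N+1)^(-1/2)."""
    radius = (n + 1) ** -0.5
    return math.factorial(k) * m_r / radius ** k


def partial_sum(k: int, n_max: int, decay: Callable[[int], float]) -> float:
    """Sum of the Cauchy bounds for N = 0..n_max, where decay(N) plays the role of M_R."""
    return sum(cauchy_bound(k, n, decay(n)) for n in range(n_max + 1))


def toy_decay(n: int) -> float:
    """Hypothetical superpolynomial decay exp(-sqrt(N)); an assumption for illustration."""
    return math.exp(-math.sqrt(n))
```

With `toy_decay`, the partial sums for a fixed k stabilise quickly: the polynomial factor \((N+1)^{k/2}\) coming from the shrinking radius is absorbed by the superpolynomial decay, which is the mechanism used in the proof.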

3 Exponential Decay of Capacity on \({{\mathbb {Z}}}^d\)

In this section, we will introduce some definitions and state the technical results needed for the proof of Theorem 1.3. Since the capacity is sub-additive, there is always some (random) \(u\in X\) such that \(\mathrm {cap}(\overline{\mathcal {C}_u(h)})\ge \frac{\mathrm {cap}(\overline{\mathcal {C}_X(h)})}{|X|}\). By the transitivity of \({{\mathbb {Z}}}^d\) and a union bound, we can assume without loss of generality that \(X=\{o\}\) and henceforth omit X from the notation.

3.1 Markov decomposition and harmonic deviations

We start by introducing some notation. Given an integer \(L\ge 1\), let \(B_L=[0,L)^d\), \(U_L=[-L,2L)^d\), \(D_L=[-3L,4L)^d\) and \(K_L=[-100L,101L)^d\). We will write \(B_L(z)=z+B_L\), \(U_L(z)=z+U_L\), \(D_L(z)=z+D_L\) and \(K_L(z)=z+K_L\) for their translates with respect to a vertex \(z\in L{{\mathbb {Z}}}^d\). We will view \(L{{\mathbb {Z}}}^d\) both as a graph that is naturally isomorphic to \({{\mathbb {Z}}}^d\) and as the collection of all the boxes \(B_L(z)\). Given a box \(B=B_L(z)\), we consider the Gaussian fields

$$\begin{aligned} \xi ^B_x\,{:}{=}\,E_x\big [\varphi _{X_{T_{K}}}\big ] =\sum _{y} P_x[X_{T_K}=y]\varphi _y, \quad \psi ^B_x \,{:}{=}\,\varphi _x- \xi _x^B, \quad \text { for }x \in {\mathbb {Z}}^d, \end{aligned}$$
(3.1)

where \(K=K_L(z)\). One then has the decomposition

$$\begin{aligned} \varphi _x=\psi ^B_x+\xi ^B_x, \quad \forall x\in {{\mathbb {Z}}}^d. \end{aligned}$$

It is clear that \(\xi ^B_x=\varphi _x\) (and therefore \(\psi ^B_x=0\)) for every \(x\in {{\mathbb {Z}}}^d\setminus K\). The Markov property implies that \(\psi ^B\) is independent of \(\sigma (\varphi _x, x\in {{\mathbb {Z}}}^d{\setminus } K)\), hence it is independent of \(\xi ^B\). Moreover, \(\xi ^B\) is harmonic in K and the covariance matrix of \(\psi ^B\) is equal to the Green function \(g_K\) for simple random walk killed on the boundary of K. The fields \(\xi ^B\) and \(\psi ^B\) are often called harmonic and local fields, respectively. The aforementioned decomposition of \(\varphi \) is of great importance for large deviation results as it will allow us to distinguish local contributions (driven by \(\psi \)) from global ones (driven by \(\xi \)). In this subsection we focus on estimating the global contributions, which correspond to deviations of \(\xi \) and are governed by the capacity.
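The nested boxes \(B_L(z)\subset U_L(z)\subset D_L(z)\subset K_L(z)\) introduced at the start of this subsection can be encoded as products of half-open integer intervals. The sketch below is only a bookkeeping aid; the names `box`, `side_length` and `contains` are hypothetical.

```python
from typing import Dict, Tuple

Interval = Tuple[int, int]      # half-open interval [lo, hi)
Box = Tuple[Interval, ...]      # one interval per coordinate axis

# Offsets in units of L: the box of a given kind is z + [lo*L, hi*L)^d.
OFFSETS: Dict[str, Tuple[int, int]] = {
    "B": (0, 1),
    "U": (-1, 2),
    "D": (-3, 4),
    "K": (-100, 101),
}


def box(kind: str, L: int, z: Tuple[int, ...]) -> Box:
    """Return B_L(z), U_L(z), D_L(z) or K_L(z) for a vertex z of L*Z^d."""
    if any(zi % L != 0 for zi in z):
        raise ValueError("z must lie in L * Z^d")
    lo, hi = OFFSETS[kind]
    return tuple((zi + lo * L, zi + hi * L) for zi in z)


def side_length(b: Box) -> int:
    """Side length of a cubic box."""
    return b[0][1] - b[0][0]


def contains(outer: Box, inner: Box) -> bool:
    """True if inner is a subset of outer."""
    return all(
        o_lo <= i_lo and i_hi <= o_hi
        for (o_lo, o_hi), (i_lo, i_hi) in zip(outer, inner)
    )
```

In this encoding the side lengths are L, 3L, 7L and 201L, respectively; the last value is the source of the factor 201 that appears later when boxes \(K_L(z)\) are compared.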

Let \(\varepsilon >0\) and \(L\ge 1\). We say that the box \(B=B_L(z)\in L{{\mathbb {Z}}}^d\) is \((\xi ,\varepsilon )\)-good if

$$\begin{aligned} |\xi ^B_x|<\varepsilon \quad \text {for every } x\in D, \end{aligned}$$

where \(D=D_L(z)\). If B is not \((\xi ,\varepsilon )\)-good, we will call it \((\xi ,\varepsilon )\)-bad. Sznitman [Szn15] obtained a precise estimate for the probability that many boxes of the same scale are \((\xi ,\varepsilon )\)-bad. For our purposes, a multi-scale version of Sznitman’s result is necessary. To formally state this new version, we will need the following definition. Consider a family \(\mathcal {F}\) containing at least one box from each of \(L_1{{\mathbb {Z}}}^d,L_2{{\mathbb {Z}}}^d,\ldots ,L_r{{\mathbb {Z}}}^d\), where \(1\le L_1<L_2<\ldots <L_r\) are integers. We say that \(\mathcal {F}\) is well-separated if for every pair of distinct boxes \(B_{L_i}(z), B_{L_j}(w)\in \mathcal {F}\), the boxes \(K_{L_i}(z)\) and \(K_{L_j}(w)\) are disjoint, where \(1\le i\le j\le r\). We remark that for a well-separated family \(\mathcal {F}\), the local fields \(\psi ^{B}\), \(B\in \mathcal {F}\), are independent from each other, which will be useful in the following sections in estimating the probability of certain events. Finally, we define

$$\begin{aligned} \Sigma =\Sigma (\mathcal {F})\,{:}{=}\,\bigcup _{B\in \mathcal {F}} B. \end{aligned}$$
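The well-separation condition defined above amounts to pairwise disjointness of the enlarged boxes \(K_{L_i}(z)\). A minimal sketch of this check follows, assuming a box \(B_L(z)\) is encoded as the pair `(L, z)`; this encoding and the function names are illustrative choices, not notation from the paper.

```python
from itertools import combinations
from typing import Iterable, Tuple

Interval = Tuple[int, int]                 # half-open interval [lo, hi)
Box = Tuple[Interval, ...]
ScaledBox = Tuple[int, Tuple[int, ...]]    # (L, z) encodes the box B_L(z)


def k_box(L: int, z: Tuple[int, ...]) -> Box:
    """K_L(z) = z + [-100L, 101L)^d."""
    return tuple((zi - 100 * L, zi + 101 * L) for zi in z)


def disjoint(a: Box, b: Box) -> bool:
    """Two boxes are disjoint iff their intervals are disjoint along some axis."""
    return any(
        a_hi <= b_lo or b_hi <= a_lo
        for (a_lo, a_hi), (b_lo, b_hi) in zip(a, b)
    )


def well_separated(family: Iterable[ScaledBox]) -> bool:
    """True if the boxes K_L(z) of distinct members of the family are pairwise disjoint."""
    enlarged = [k_box(L, z) for L, z in set(family)]
    return all(disjoint(a, b) for a, b in combinations(enlarged, 2))
```

For instance, two boxes of scale 1 based at the origin and at \((201,0,0)\) form a well-separated family, while moving the second one to \((200,0,0)\) makes the enlarged boxes overlap.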

The following is a slight modification of Sznitman’s result.

Lemma 3.1

There is a constant \(c_0>0\) such that the following holds. For every \(\varepsilon >0\) there is a constant \(\delta >0\) such that for every well-separated collection \(\mathcal {F}\) with \(|\mathcal {F}|\le \delta \mathrm {cap}(\Sigma )\), we have

$$\begin{aligned} {\mathbb {P}}(B \text { is } (\xi ,\varepsilon )\text {-bad } \forall B\in \mathcal {F})\le \exp \left( -c_0\varepsilon ^2 \mathrm {cap}(\Sigma )\right) . \end{aligned}$$

Proof

It suffices to prove that for some constant \(c'>0\), we have

$$\begin{aligned} {\mathbb {P}}[\bigcap _{B\in \mathcal {F}}\{\sup _{D} \xi ^B\ge \varepsilon \}]\le \exp \left( -c'\varepsilon ^2 \mathrm {cap}(\Sigma )\right) . \end{aligned}$$
(3.2)

Indeed, notice that the fields \(\xi ^B\) are centered and either \(\Sigma (\mathcal {F}^-)\) or \(\Sigma (\mathcal {F}^+)\) has capacity at least \(\mathrm {cap}(\Sigma )/2\) by the sub-additivity of the capacity, where \(\mathcal {F}^-\,{:}{=}\,\{B\in \mathcal {F}: \; \inf _{D} \xi ^B\le -\varepsilon \}\) and \(\mathcal {F}^+\,{:}{=}\,\{B\in \mathcal {F}: \; \sup _{D} \xi ^B\ge \varepsilon \}\). Moreover, there are \(2^{|\mathcal {F}|}\le 2^{\delta \mathrm {cap}(\Sigma )}\) possibilities for \(\mathcal {F}^{\pm }\), so it is enough to take \(\delta >0\) sufficiently small.

The proof of (3.2) is essentially the same as in [Szn15, Corollary 4.4]. We will point out the necessary changes. The results mentioned throughout this proof are from [Szn15].

We attach to \(\mathcal {F}\) the collection F of functions f from \(\mathcal {F}\) into \({{\mathbb {Z}}}^d\) such that \(f(B)\in D\). Let \(\nu \) be the equilibrium measure of \(\Sigma \) and \(\lambda (B)=\nu (B)/\mathrm {cap}(\Sigma )\). Define

$$\begin{aligned} Z_f=\sum _{B\in \mathcal {F}} \lambda (B)\xi ^B(f(B)) \end{aligned}$$

and

$$\begin{aligned} Z=\sup _{f\in F}Z_f. \end{aligned}$$

We need to show that there exists a constant \(C=C(d)>0\) such that

$$\begin{aligned} \text {var}(Z_f)\le \dfrac{C}{\mathrm {cap}(\Sigma )} \end{aligned}$$
(3.3)

for every \(f\in F\) and

$$\begin{aligned} {\mathbb {E}}[Z]\le C \left( \frac{|\mathcal {F}|}{\mathrm {cap}(\Sigma )}\right) ^{1/2}. \end{aligned}$$
(3.4)

The first inequality can be obtained by arguing as in the proof of [Szn15, Theorem 4.2]. Due to the fact that boxes in \(\mathcal {F}\) have in general different scales, we need to slightly modify the argument from [Szn15, Theorem 4.2] in order to obtain the second inequality. Indeed, following the proof of [Szn15, Lemma 4.3] we get

$$\begin{aligned} \begin{aligned} {\mathbb {E}}[&(Z_f-Z_k)^2]\\&\le C' \sum _{B,B'\in \mathcal {F}}\lambda (B)\lambda (B') \dfrac{\Vert f(B)-k(B)\Vert _{\infty }\Vert f(B')-k(B')\Vert _{\infty }}{LL'} {\mathbb {E}}[\xi ^B(f(B))\xi ^{B'}(f(B'))] \end{aligned} \end{aligned}$$
(3.5)

for every \(f,k\in F\), where \(C'\) is a constant, L denotes the scale of B and \(L'\) denotes the scale of \(B'\). It follows from (3.5), (3.3) and the fact that \(\Vert f(B)-k(B) \Vert _\infty \le 7\,L\) that

$$\begin{aligned} {\mathbb {E}}[(Z_f-Z_k)^2]\le 49C'{\mathbb {E}}(Z_f^2)\le \dfrac{C''}{\mathrm {cap}(\Sigma )}, \end{aligned}$$

where \(C''=49CC'\). Setting \({\widetilde{Z}}_f=\sqrt{\mathrm {cap}(\Sigma )}Z_f\) we obtain

$$\begin{aligned} {\mathbb {E}}[({\widetilde{Z}}_f-{\widetilde{Z}}_k)^2]^{1/2}\le \sqrt{C''}. \end{aligned}$$

Now given \(x\in (0,\sqrt{C''}]\), for every \(L\ge 1\) we pick the largest integer l such that \(l\le 7x L/\sqrt{C''}\) and for each box \(B\in \mathcal {F}\) of scale L, we partition D into disjoint boxes, each having \(\Vert \cdot \Vert _{\infty }\)-diameter at most l. If \(f,k\in F\) are such that for every \(B\in \mathcal {F}\), f(B) and k(B) lie in the same box of D, then it follows from (3.5) that \({\mathbb {E}}[({\widetilde{Z}}_f-{\widetilde{Z}}_k)^2]^{1/2}\le x\). Arguing as in page 1820 of [Szn15] we obtain (3.4). We can now use the Borell-TIS inequality as in the proof of [Szn15, Corollary 4.4] to obtain

$$\begin{aligned} {\mathbb {P}}[\bigcap _{B\in \mathcal {F}}\{\sup _{D} \xi ^B\ge \varepsilon \}]\le \exp \left\{ -\dfrac{1}{2\sigma ^2}(\varepsilon -|{\mathbb {E}}(Z)|)_{+}^2\right\} \end{aligned}$$

with \(\sigma ^2=\sup _{F} \text {var}(Z_f)\). With (3.3) and (3.4) in hand, the desired result follows once we choose \(\delta >0\) so that \(C\sqrt{\delta }\le \varepsilon /2\) and \(\delta \) is much smaller than \(c'\). \(\square \)

Notice that by applying Lemma 3.1 to a single box \(B\in L{{\mathbb {Z}}}^d\) and recalling (2.6), we have

$$\begin{aligned} {\mathbb {P}}(B \text { is } (\xi ,\varepsilon )\text {-bad})\le \exp \left( -c\varepsilon ^2 L^{d-2}\right) . \end{aligned}$$
(3.6)

3.2 Bad Boxes and Multi-scale Coarse Graining

Our aim now is to set up the abstract multi-scale coarse graining scheme used to prove Theorem 1.3. This is encapsulated in Theorem 3.3 below, which is purely deterministic and whose proof is postponed to Sect. 4. In the next subsections, we deduce the desired exponential decay of capacity in the subcritical and supercritical phases separately by applying Theorem 3.3 with well chosen notions of “bad” and “very-bad” events.

Let us start by giving some definitions and introducing some notation. For every integer \(L\ge 1\) and \(h\in {{\mathbb {R}}}\), let \(\mathcal {C}_o(h,L)\) be the set of boxes of \(L{{\mathbb {Z}}}^d\) that contain a vertex of \(\mathcal {C}_o(h)\). We recall that the inner vertex boundary \(\partial \mathcal {C}_o(h,L)\) of \(\mathcal {C}_o(h,L)\) is defined as the set of boxes in \(\mathcal {C}_o(h,L)\) that have a neighbour in \(L{{\mathbb {Z}}}^d\setminus \mathcal {C}_o(h,L)\).

We will introduce a general framework that will allow us to study both the supercritical and the subcritical regime. To this end, we consider a family of “bad events” indexed by boxes \(B=B_L(z)\in L{{\mathbb {Z}}}^d\), satisfying certain properties. In what follows, the diameter is measured with respect to the graph metric of \({{\mathbb {Z}}}^d\).

Definition 3.2

(Admissible bad events). Given \(L\ge 1\), we say that a family of events \(\mathcal {E}^{i}_B\) with \(B=B_L(z)\in L{{\mathbb {Z}}}^d\), \(i\in \{b,vb\}\), is h-admissible if it satisfies the following properties:

  1. (i)

    \(\mathcal {E}^{b}_B\) and \(\mathcal {E}^{vb}_B\) are disjoint for every B,

  2. (ii)

    if \(L\le \mathrm {diam}(\mathcal {C}_o(h))<\infty \), then \(\mathcal {E}_B:=\mathcal {E}^{b}_B\cup \mathcal {E}^{vb}_B\) happens for every \(B\in \partial \mathcal {C}_o(h,L)\),

  3. (iii)

    if a pair \(B,B'\) of neighbouring boxes lies in \(\mathcal {C}_o(h,L)\) and \(\mathcal {E}^{b}_B\) happens, then \(\mathcal {E}_{B'}\) happens.

For our purposes, both \(\mathcal {E}^{b}_B\) and \(\mathcal {E}^{vb}_B\) will be chosen to be unlikely events, with \(\mathcal {E}^{vb}_B\) in particular being extremely unlikely, in the sense that its probability decays exponentially in \(\mathrm {cap}(B)\). Item (ii) can be thought of as an initiation property that ensures that the union of the boxes \(B\in \mathcal {C}_o(h,L)\) for which \(\mathcal {E}_B\) happens, has capacity at least \(\mathrm {cap}(\mathcal {C}_o(h))\). Item (iii) can be thought of as a propagation property. Ideally, we would like the event \(\mathcal {E}^{vb}_B\) to happen for most boxes in \(\partial \mathcal {C}_o(h,L)\). If this is not the case, then we have many boxes \(B\in \partial \mathcal {C}_o(h,L)\) for which \(\mathcal {E}^{b}_B\) happens. In this case, item (iii) ensures that for many boxes in \(\mathcal {C}_o(h,L)\) that are adjacent to \(\partial \mathcal {C}_o(h,L)\), the event \(\mathcal {E}_B\) happens. Continuing in this way we explore more and more boxes for which \(\mathcal {E}_B\) happens.

With such events in hand, we will associate to \(\mathcal {C}_o(h)\) an interface \(\mathcal {I}\) such that for each box B of \(\mathcal {I}\), \(\mathcal {E}_B\) happens. An interface \(\mathcal {I}\) is a finite collection of disjoint boxes of \(L_1{{\mathbb {Z}}}^d, L_2{{\mathbb {Z}}}^d, \ldots , L_k{{\mathbb {Z}}}^d\) for an integer \(k>0\) and \(1\le L_1<L_2<\ldots <L_k\). For most of the \(\mathcal {I}\) we will consider, o will be contained in a bounded component of \({{\mathbb {Z}}}^d \setminus \mathcal {I}\) (thus the term “interface”), but it will be more convenient for us not to add this condition in the definition. When \(\mathcal {E}_B\) happens for each box B of \(\mathcal {I}\), we will say that \(\mathcal {I}\) occurs. There are two subsets of \(\mathcal {I}\) that play an important role. The first one, denoted \(\mathcal {B}\), is the set of boxes \(B\in \mathcal {I}\) such that \(\mathcal {E}^{b}_B\) happens. The second one, denoted \(\mathcal {VB}\), is the set of all boxes \(B\in \mathcal {I}\) such that \(\mathcal {E}^{vb}_B\) happens.

In the following theorem, we construct a family of interfaces \(\mathcal {I}_N\) of small cardinality such that whenever \(A_N(h)\) happens, some interface \(\mathcal {I}\in \mathcal {I}_N\) occurs for which either \(\mathcal {VB}\) has large capacity or \(\mathcal {B}\) has large cardinality. To avoid any confusion, we remark that the notation \(\mathcal {B}\subset L_1{{\mathbb {Z}}}^d\) means that all boxes in \(\mathcal {B}\) are of scale \(L_1\).

Theorem 3.3

(Multi-scale coarse graining). Let \(\mathcal {E}^{i}_B\), \(i\in \{b,vb\}\), \(B\in L{{\mathbb {Z}}}^d\), \(L\ge 1\), be a family of events which are h-admissible for each \(L\ge 1\). For every \(\rho >0\) and \(\delta >0\), there exist constants \(0<t=t(d,\rho ,\delta )<1\), \(L_0=L_0(d,\rho ,\delta )>0\), \(N_0=N_0(d,\rho ,\delta )>0\) such that for every \(N\ge N_0\), there is a family \(\mathcal {I}_N\) of interfaces such that the following hold:

  1. (a)

    \(|{\mathcal {I}}_N|\le e^{\delta tN}\),

  2. (b)

    for every \(\mathcal {I}\in \mathcal {I}_N\), we have \(L_0\le L_1\) and \(|\mathcal {I}|\le \delta t N\),

  3. (c)

    on the event \(A_N(h)\), some \(\mathcal {I}\in \mathcal {I}_N\) occurs with \(L_k\le \mathrm {diam}(\mathcal {C}_o(h))\) and \(\mathcal {B}\subset L_1{{\mathbb {Z}}}^d\), and one of the following holds:

  1. (c1)

    \(\mathrm {cap}\big (\bigcup _{B\in \mathcal {VB}} B\big )\ge N/4d\),

  2. (c2)

    \(|\mathcal {B}|L_1^{\rho }\ge tN\).

We stress that the constants \(t,L_0\) and \(N_0\) in the above theorem depend only on d, \(\rho \) and \(\delta \) and not on the choice of \(\mathcal {E}^{b}_B\) and \(\mathcal {E}^{vb}_B\). We also remark that for our applications, \(\mathcal {E}_B\) will be chosen in such a way that its probability decays stretched exponentially with exponent the constant \(\rho \) appearing in the statement of the theorem.

3.2.1 Exponential decay in the supercritical regime

We will split the proof of Theorem 1.3 into two parts, depending on whether h belongs to the supercritical or the subcritical regime. We will first handle the supercritical regime. Our aim is to choose \(\mathcal {E}^{b}_B\) and \(\mathcal {E}^{vb}_B\) appropriately and then apply Theorem 3.3.

To this end, consider an integer \(L> 1\) and a box \(B\in L{{\mathbb {Z}}}^d\). We define \(L_0=\left\lceil L/M \right\rceil \approx L^{\frac{1}{d-1}}\log (L)\), where \(M=\left\lfloor L^{\frac{d-2}{d-1}}/\log (L)\right\rfloor \). A connected subgraph of B is called dense if it intersects at least \(\frac{3}{4}K\approx \frac{3}{4}M^d\) boxes of \(L_0{{\mathbb {Z}}}^d\) contained in B, where K is the number of boxes of \(L_0{{\mathbb {Z}}}^d\) contained in B, and has diameter at least L/5—the latter follows immediately for any connected subgraph that intersects at least \(\frac{3}{4}K\) boxes contained in B, provided that L is large enough, but we will not need this fact. Note that if M divides L, then B contains \(M^d\) boxes of \(L_0{{\mathbb {Z}}}^d\) but it contains fewer boxes if M does not divide L.
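The relation between L, M, \(L_0\) and K can be made explicit with a small computation. The sketch below assumes \(B=B_L(0)\), so that the number of boxes of \(L_0{{\mathbb {Z}}}^d\) contained in B is \(\lfloor L/L_0\rfloor ^d\); the function name `dense_scale` is hypothetical.

```python
import math
from typing import Tuple


def dense_scale(L: int, d: int) -> Tuple[int, int, int]:
    """Return (M, L0, K) for a box of side L in dimension d >= 3.

    M  = floor(L^((d-2)/(d-1)) / log L)
    L0 = ceil(L / M)
    K  = number of boxes of L0*Z^d contained in B_L(0), i.e. floor(L / L0)^d
    """
    if L <= 1 or d < 3:
        raise ValueError("need L > 1 and d >= 3")
    M = math.floor(L ** ((d - 2) / (d - 1)) / math.log(L))
    if M < 1:
        raise ValueError("L is too small for M to be at least 1")
    L0 = -(-L // M)              # integer ceiling of L / M
    K = (L // L0) ** d
    return M, L0, K
```

For example, `dense_scale(1000, 3)` gives M = 4, \(L_0=250\) and \(K=64=M^d\), since M divides L. In contrast, `dense_scale(1000, 4)` gives M = 14 and \(L_0=72\), so only \(13^4=28561<14^4\) boxes fit inside B, in line with the remark that B contains fewer boxes when M does not divide L.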

Fix \(h'<h_*\) and \(\varepsilon _0:=(h_*-h')/2\). For any \(h\le h'\), let \(\mathcal {E}^{b}_B\) be the intersection of the events

  1. (b1)

    for every \(B'\) which is either B or a neighbour of B, \(\{\varphi \ge h\}\cap B'\) contains a dense cluster,

  2. (b2)

    \(\{\varphi \ge h\}\cap B\) contains a dense cluster that is not contained in \(\mathcal {C}_o(h)\),

  3. (b3)

    B is \((\xi ,\varepsilon _0)\)-good,

and \(\mathcal {E}^{vb}_B\) be the union of the events

  1. (vb1)

    for some \(B'\) which is either B or a neighbour of B, \(\{\varphi \ge h\}\cap B'\) does not contain a dense cluster,

  2. (vb2)

    all dense clusters of \(\{\varphi \ge h\}\cap B\) are contained in \(\mathcal {C}_o(h)\), but for some neighbouring box \(B'\) of B, \(\{\varphi \ge h\}\cap B'\) contains a dense cluster that is not contained in \(\mathcal {C}_o(h)\),

  3. (vb3)

    B is \((\xi ,\varepsilon _0)\)-bad.

We shall verify that the family of events \(\mathcal {E}^{i}_B\) is h-admissible. It is straightforward to verify that \(\mathcal {E}^{b}_B\) and \(\mathcal {E}^{vb}_B\) are disjoint for every B, so that (i) holds. Let us verify (ii). Consider a box \(B\in \partial \mathcal {C}_o(h,L)\). If \(\mathcal {E}^{vb}_B\) happens, then there is nothing to show. If \(\mathcal {E}^{vb}_B\) does not happen, then the non-occurrence of (vb1) and (vb3) directly implies the occurrence of (b1) and (b3), respectively. It remains to check that (b2) holds. Let \(B'\in L{{\mathbb {Z}}}^d{\setminus } \mathcal {C}_o(h,L)\) be a neighbour of B. Since (b1) happens, \(\{\varphi \ge h\}\cap B'\) contains a dense cluster, which in turn is not contained in \(\mathcal {C}_o(h)\) as \(B'\) is disjoint from it. From this and our assumption that (vb2) does not happen, we can conclude that (b2) happens, as we wanted. Finally, let us verify (iii). Consider two neighbouring boxes \(B,B''\in \mathcal {C}_o(h,L)\) such that \(\mathcal {E}^{b}_B\) happens. If \(\mathcal {E}^{vb}_{B''}\) happens, then there is nothing to show. Otherwise, (b1) and (b3) clearly happen for \(B''\) in place of B. It is not hard to see that property (b2) happens for \(B''\), since (b2) happens for B and (vb2) does not happen for \(B''\).

The events appearing in (b2), (vb1) and (vb2) are unlikely to happen. However, it will be convenient for us to work with events that, in addition to being unlikely, are independent on different boxes that are far away from each other. For this reason, we will now introduce certain local bad and very-bad events. In what follows, given a box \(B=B_L(z)\), U stands for \(U_L(z)\) and D stands for \(D_L(z)\).

We say that B is \((\psi ,h,\varepsilon )\)-good if for every function \(g:{\overline{D}}\rightarrow {{\mathbb {R}}}\) which is harmonic in D and \(|g(x)|<\varepsilon \) for all \(x\in D\), the following happen:

  • \(\{\psi ^B+g\ge h\}\cap U\) contains a cluster of diameter at least L/5,

  • for every pair \(\mathcal {C}_1,\mathcal {C}_2\) of clusters of \(\{\psi ^B+g\ge h\}\cap U\) of diameter at least L/5, there is a path in \(\{\psi ^B+g\ge h\}\cap D\) connecting \(\mathcal {C}_1\) to \(\mathcal {C}_2\).

If B is not \((\psi ,h,\varepsilon )\)-good, we will call it \((\psi ,h,\varepsilon )\)-bad. It is not hard to see that if \(\mathcal {E}^{b}_B\) happens for some \(B\in \mathcal {C}_o(h,L)\) and \(L\le \mathrm {diam}(\mathcal {C}_o(h))\), then B is \((\psi ,h,\varepsilon _0)\)-bad (with the choice \(g=\xi ^B\)), since \(\mathcal {C}_o(h)\cap U\) contains a cluster of diameter at least L/5. The following result will be proved in Sect. 5.

Proposition 3.4

(Decay of badness). For every \(h'<h_*\) and \(0<\varepsilon < h_*-h'\), there exist constants \(c_1=c_1(h',\varepsilon )>0\) and \(\rho =\rho (d)>0\) such that for every \(h\le h'\) and \(L\ge 1\),

$$\begin{aligned} {\mathbb {P}}[B_L \text { is } (\psi ,h,\varepsilon )\text {-bad}]\le e^{-c_1 L^{\rho }}. \end{aligned}$$

We now define another local event. We say that B is \((\psi ,h,\varepsilon )\)-very-good if for every function \(g:{\overline{D}}\rightarrow {{\mathbb {R}}}\) which is harmonic in D and \(|g(x)|<\varepsilon \) for all \(x\in D\), the following happen:

  • for every \(B'\) which is either B or some neighbour of B, \(\{\psi ^B+g\ge h\}\cap B'\) contains a dense cluster,

  • for every neighbour \(B''\) of B and every pair of dense clusters of \(\{\psi ^B+g\ge h\}\cap B\) and \(\{\psi ^B+g\ge h\}\cap B''\), respectively, there is a path in \(\{\psi ^B+g\ge h\}\cap D\) visiting both dense clusters.

If B is not \((\psi ,h,\varepsilon )\)-very-good, we will call it \((\psi ,h,\varepsilon )\)-very-bad. It is not hard to see that if \(\mathcal {E}^{vb}_B\) happens and B is \((\xi ,\varepsilon _0)\)-good, then B is \((\psi ,h,\varepsilon _0)\)-very-bad. The following result will be proved in Sect. 6.

Proposition 3.5

(Decay of very-badness). For every \(h'<h_*\) and \(0<\varepsilon <h_*-h'\), there exists a constant \(c_2=c_2(h',\varepsilon )>0\) such that for every \(h\le h'\) and \(L\ge 1\) large enough,

$$\begin{aligned} {\mathbb {P}}[B_L \text { is } (\psi ,h,\varepsilon )\text {-very-bad}]\le e^{-c_2 L^{d-2}}. \end{aligned}$$

Assuming Theorem 3.3 and Propositions 3.4 and 3.5, we are now in a position to prove Theorem 1.3 for h in the supercritical regime.

Proof of Theorem 1.3 for \(h<h_*\)

Consider some \(h\le h'<h_*\) and let \(\rho >0\) be the exponent of Proposition 3.4. Consider also a constant \(\delta >0\) which will be chosen along the way to be sufficiently small. We start by applying Theorem 3.3 for the choice of events \(\mathcal {E}^{b}_B\) and \(\mathcal {E}^{vb}_B\) mentioned above to obtain a family \(\mathcal {I}_N\) as in the statement of the theorem. For each \(\mathcal {I}\in \mathcal {I}_N\), we will prove an exponential upper bound for the probability that \(\mathcal {I}\) occurs satisfying either (c1) or (c2) and then apply a union bound over all \(\mathcal {I}\in \mathcal {I}_N\).

First, let us fix \(\mathcal {I}\in \mathcal {I}_N\) and a pair of subsets \(\mathcal {I}_1,\mathcal {I}_2\subset \mathcal {I}\) such that \(\mathcal {I}_1\) satisfies \(\mathrm {cap}\big (\bigcup _{B\in \mathcal {I}_1} B\big )\ge N/4d\) and \(\mathcal {I}_2\) satisfies \(\mathcal {I}_2\subset L_1{{\mathbb {Z}}}^d\) (where \(L_1\) is the smallest scale of \(\mathcal {I}\)) and \(|\mathcal {I}_2|L_1^{\rho }\ge tN\). We will bound separately the probability that \(\mathcal {VB}=\mathcal {I}_1\) and \(\mathcal {B}=\mathcal {I}_2\). We start with the latter. Let \(\mathcal {I}'_2\) be a well-separated subset of \(\mathcal {I}_2\) that is maximal with respect to this property. By the maximality of \(\mathcal {I}'_2\), for every \(B_{L_1}(z)\in \mathcal {I}_2\), there is some \(B_{L_1}(w)\in \mathcal {I}'_2\) such that \(K_{L_1}(z)\cap K_{L_1}(w)\ne \emptyset \), hence \(|\mathcal {I}'_2|\ge 201^{-d}|\mathcal {I}_2|\). Notice that the local fields \(\psi ^B\), \(B\in \mathcal {I}'_2\) are independent of each other (since \(\mathcal {I}'_2\) is well-separated) and each box in \(\mathcal {I}'_2\) is \((\psi ,h,\varepsilon _0)\)-bad (since the boxes of \(\mathcal {I}\) have scale smaller than the diameter of \(\mathcal {C}_o(h)\)). Therefore, by Proposition 3.4 and independence, we have

$$\begin{aligned} {\mathbb {P}}[\mathcal {I} \text { occurs with }\mathcal {B}=\mathcal {I}_2]\le \exp \{-201^{-d}c_1tN\}. \end{aligned}$$

We shall now bound the probability that \(\mathcal {VB}=\mathcal {I}_1\). First, we restrict \(\mathcal {I}_1\) to a well-separated subset with capacity of order N. Let \(L_1<L_2<\ldots <L_k\) be the scales of \(\mathcal {I}_1\). Let \(\mathcal {I}_1^k\) be a subset of \(\mathcal {I}_1\cap L_k{{\mathbb {Z}}}^d\) which is well-separated and maximal with respect to this property. Proceeding inductively, for each \(i\in \{1,2,\ldots ,k\}\), let \(\mathcal {I}_1^i\) be a subset of \(\mathcal {I}_1\cap L_i{{\mathbb {Z}}}^d\) such that \(\bigcup _{j=i}^k \mathcal {I}_1^j\) is well-separated and \(\mathcal {I}_1^i\) is a maximal set with respect to this property. Finally, let \(\mathcal {I}'_1=\bigcup _{j=1}^k \mathcal {I}_1^j\). It follows from the maximality of the construction that for every \(B\in \mathcal {I}_1\) of scale \(L_i\) there exists \(B'\in \mathcal {I}'_1\) of scale \(L_j\ge L_i\) such that the \(\Vert \cdot \Vert _{\infty }\)-distance between B and \(B'\) is at most \(201L_j\). In this case, for every \(x\in B\) we have that \(P_x[H_{B'}<\infty ] \ge q\), where \(q=q(d)>0\) is a constant depending only on the dimension d (see e.g. [Law91, Proposition 2.2.2]). It then follows from the sweeping identity (2.7) that \(\mathrm {cap}(\Sigma (\mathcal {I}'_1))\ge q\mathrm {cap}(\Sigma (\mathcal {I}_1))\ge qN/4d\).
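The inductive selection of \(\mathcal {I}'_1\) described above can be read as a greedy procedure that treats the scales from the largest to the smallest. The sketch below is one possible realisation, assuming a box \(B_L(z)\) is encoded as the pair `(L, z)`; the names and the tie-breaking order within a scale are illustrative, and any order yields a maximal set at each scale.

```python
from typing import Iterable, List, Tuple

Interval = Tuple[int, int]                 # half-open interval [lo, hi)
Box = Tuple[Interval, ...]
ScaledBox = Tuple[int, Tuple[int, ...]]    # (L, z) encodes the box B_L(z)


def k_box(L: int, z: Tuple[int, ...]) -> Box:
    """K_L(z) = z + [-100L, 101L)^d."""
    return tuple((zi - 100 * L, zi + 101 * L) for zi in z)


def disjoint(a: Box, b: Box) -> bool:
    """Two boxes are disjoint iff their intervals are disjoint along some axis."""
    return any(
        a_hi <= b_lo or b_hi <= a_lo
        for (a_lo, a_hi), (b_lo, b_hi) in zip(a, b)
    )


def greedy_well_separated(family: Iterable[ScaledBox]) -> List[ScaledBox]:
    """Select a well-separated subfamily, largest scales first.

    A box is kept when its enlarged box K_L(z) is disjoint from the enlarged
    boxes of everything kept so far, so the selection at each scale is maximal
    given the boxes already chosen at larger scales.
    """
    selected: List[ScaledBox] = []
    for L, z in sorted(set(family), key=lambda b: (-b[0], b[1])):
        candidate = k_box(L, z)
        if all(disjoint(candidate, k_box(L2, z2)) for L2, z2 in selected):
            selected.append((L, z))
    return selected
```

Applied to a family of a single scale, the same routine produces a maximal well-separated subset of the kind used for \(\mathcal {I}'_2\). Every rejected box has its enlarged box intersecting that of a selected box of equal or larger scale, which is the property the capacity comparison relies on.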

Notice that each box in \(\mathcal {I}'_1\) is either \((\xi ,\varepsilon _0)\)-bad or \((\psi ,h,\varepsilon _0)\)-very-bad. Let \(\Xi (\mathcal {I}'_1)\) be the (random) union of the boxes in \(\mathcal {I}'_1\) that are \((\xi ,\varepsilon _0)\)-bad and let \(\Psi (\mathcal {I}'_1)\) be the (random) union of the boxes in \(\mathcal {I}'_1\) that are \((\psi ,h,\varepsilon _0)\)-very-bad. By the sub-additivity of the capacity,

$$\begin{aligned} \text {either} \quad \mathrm {cap}(\Xi (\mathcal {I}'_1))\ge \frac{qN}{8d} \quad \text {or} \quad \mathrm {cap}(\Psi (\mathcal {I}'_1))\ge \frac{qN}{8d}. \end{aligned}$$

Applying Lemma 3.1 and a union bound over all possibilities for \(\Xi (\mathcal {I}'_1)\) we obtain

$$\begin{aligned} {\mathbb {P}}\left[ \mathcal {I} \text { occurs with } \mathcal {VB}=\mathcal {I}_1 \text { and }\mathrm {cap}(\Xi (\mathcal {I}'_1))\ge \frac{qN}{8d}\right]&\le \sum _{\mathcal {J}} \exp \left\{ -c_0\varepsilon _0^2\mathrm {cap}(\mathcal {J})\right\} \\&\le 2^{\delta tN}\exp \left\{ \frac{-c_0\varepsilon _0^2 qN}{8d}\right\} , \end{aligned}$$

where the sum ranges over all possible \(\mathcal {J}\) such that \(\mathrm {cap}(\mathcal {J})\ge \frac{qN}{8d}\). Recall that \(|\mathcal {J}|\le |\mathcal {I}'_1|\le |\mathcal {I}|\le \delta t N\), so that we can indeed guarantee that \(\mathcal {J}\) satisfies the hypothesis of Lemma 3.1 by decreasing the value of \(\delta \) if necessary. The term \(2^{\delta tN}\) above accounts for the number of possible \(\mathcal {J}\). For the second case, notice that

$$\begin{aligned} \mathrm {cap}(\Psi (\mathcal {I}'_1))\le \sum _{B_{L_i}(z)\in \Psi (\mathcal {I}'_1)}\mathrm {cap}(B_{L_i}(z))\le C\sum _{B_{L_i}(z)\in \Psi (\mathcal {I}'_1)}L_i^{d-2} \end{aligned}$$

by the sub-additivity of the capacity and (2.6). Hence by Proposition 3.5, we have

$$\begin{aligned} {\mathbb {P}}\left[ \mathcal {I} \text { occurs with } \mathcal {VB}=\mathcal {I}_1 \text { and } \mathrm {cap}(\Psi (\mathcal {I}'_1))\ge \frac{qN}{8d}\right]&\le \sum _{\mathcal {J}}\exp \left\{ -c_2\sum _{B_{L_i}(z)\in \mathcal {J}}L_i^{d-2}\right\} \\&\le 2^{\delta tN}\exp \left\{ -\frac{c_2qN}{8Cd}\right\} , \end{aligned}$$

where the sum ranges over all possible \(\mathcal {J}\) for \(\Psi (\mathcal {I}'_1)\) such that \(\mathrm {cap}(\mathcal {J})\ge \frac{qN}{8d}\).

Since \(|\mathcal {I}_N|\le e^{\delta tN}\), applying a union bound over all \(\mathcal {I}\in \mathcal {I}_N\) and all possible \(\mathcal {I}_1,\mathcal {I}_2\subset \mathcal {I}\), and decreasing \(\delta \) even further, if necessary, we obtain that

$$\begin{aligned} {\mathbb {P}}[A_N(h)]&\le \exp \{\delta tN\}2^{\delta tN}\\&\quad \times \left( \exp \{-201^{-d}c_1tN\}+2^{\delta tN}\exp \left\{ \frac{-c_0\varepsilon _0^2 qN}{8d}\right\} +2^{\delta tN}\exp \left\{ -\frac{c_2qN}{8Cd}\right\} \right) \\&\le \exp \{-c'N\} \end{aligned}$$

for some constant \(c'>0\) depending only on \(h'\) and d, as desired. \(\square \)

3.2.2 Exponential decay in the subcritical regime

We now move on to the proof of Theorem 1.3 for h in the subcritical regime. We will implement a strategy similar to the one we used for the supercritical regime.

First, we need to choose suitably the events \(\mathcal {E}^{b}_B\) and \(\mathcal {E}^{vb}_B\). Given \(h\ge h'>h_*\), \(\varepsilon _0=(h'-h_*)/2\) and a box \(B\in L{{\mathbb {Z}}}^d\), let \(\mathcal {E}_B\) be the event that \(\{\varphi \ge h\}\cap U\) contains a cluster of diameter at least L/5, and let \(\mathcal {E}^{b}_B\,{:}{=}\,\mathcal {E}_B \cap \{B \text { is } (\xi ,\varepsilon _0)\text {-good}\}\) and \(\mathcal {E}^{vb}_B\,{:}{=}\,\mathcal {E}_B \cap \{B \text { is } (\xi ,\varepsilon _0)\text {-bad}\}\). It is straightforward to see that this family of events is h-admissible when \(\mathcal {C}_o(h)\) has diameter at least L, since then for every box \(B\in \mathcal {C}_o(h,L)\), the event \(\mathcal {E}_B\) happens.

Notice that when the event \(\mathcal {E}^{b}_B\) happens, \(\{\psi ^B\ge h-\varepsilon _0\}\cap U\) contains a cluster of diameter at least L/5. The latter happens with probability decaying stretched exponentially.

Proposition 3.6

For every \(h'>h_*\), there exist constants \(c_3=c_3(h',d)>0\) and \(\rho =\rho (d)>0\) such that for every \(h\ge h'\) and \(L\ge 1\),

$$\begin{aligned} {\mathbb {P}}[\{\psi ^B\ge h\}\cap U \text { contains a cluster of diameter at least } L/5]\le e^{-c_3 L^{\rho }}. \end{aligned}$$

Proof

This is a simple consequence of the (subcritical) sharpness of GFF percolation on \({{\mathbb {Z}}}^d\) (i.e. \(h_*=h_{**}\)) mentioned in the introduction. Indeed, by the main result of [DCGRS20], for every \(h>h_*\), there exist \(\rho =\rho (d)\in (0,1)\) and \(c=c(d,h)\) such that for every \(N \ge 1\),

$$\begin{aligned} {\mathbb {P}}[o\xleftrightarrow {\varphi \ge h}\partial B_N]\le e^{-cN^{\rho }}, \end{aligned}$$
(3.7)

where \(\{o\xleftrightarrow {\varphi \ge h}\partial B_N\}\) denotes the event that o is connected to \(\partial B_N\) in \(\{\varphi \ge h\}\). Assume that \(\{\psi ^B\ge h\}\cap D\) contains a cluster of diameter at least L/5 and let \(\varepsilon _0=(h'-h_*)/2\). Up to a probability decaying exponentially in \(L^{d-2}\), B is \((\xi ,\varepsilon _0)\)-good by (3.6). When this happens, \(\{\varphi \ge h-\varepsilon _0\}\cap D\) contains a cluster of diameter at least L/5. Since \(h-\varepsilon _0\ge h'-\varepsilon _0>h_*\), the latter event has probability decaying stretched exponentially by (3.7). \(\square \)
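
The last two steps of this proof can be made quantitative by a union bound. The following is only a sketch, under two assumptions that are not stated in this section: that D contains at most \(C'L^d\) vertices for some constant \(C'=C'(d)\), and that the constant c in (3.7) is taken at the level \(h'-\varepsilon _0\), which is legitimate because the left-hand side of (3.7) is non-increasing in h. The constant \(c''\) stands for the rate in (3.6). Any cluster of diameter at least L/5 contains, for each of its vertices x, another vertex at distance at least L/10 from x, so x is connected to \(\partial B_{\lfloor L/10\rfloor }(x)\). By translation invariance,

$$\begin{aligned} {\mathbb {P}}[\{\psi ^B\ge h\}\cap D \text { contains a cluster of diameter at least } L/5]&\le {\mathbb {P}}[B \text { is } (\xi ,\varepsilon _0)\text {-bad}]+\sum _{x\in D}{\mathbb {P}}[x \text { is connected to } \partial B_{\lfloor L/10\rfloor }(x) \text { in } \{\varphi \ge h-\varepsilon _0\}]\\&\le e^{-c''L^{d-2}}+C'L^d e^{-c\lfloor L/10\rfloor ^{\rho }}. \end{aligned}$$

Since \(\rho <1\le d-2\), the right-hand side is at most \(e^{-c_3L^{\rho }}\) for a suitable \(c_3>0\) and all sufficiently large L.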

Assuming Theorem 3.3 and Proposition 3.6, we are now in position to prove Theorem 1.3 for h in the subcritical regime.

Proof of Theorem 1.3 for \(h>h_*\)

The proof is similar to that of the case \(h<h_*\) presented in Sect. 3.2.1. Consider some \(h\ge h'> h_*\), and let \(\rho >0\) be the exponent of Proposition 3.6. Consider also a small enough constant \(\delta >0\). We can apply Theorem 3.3 to obtain a family \(\mathcal {I}_N\) satisfying the conclusion of the theorem. Fix \(\mathcal {I}\in \mathcal {I}_N\) and a pair of subsets \(\mathcal {I}_1,\mathcal {I}_2\subset \mathcal {I}\) as in the proof of the case \(h<h_*\). If \(\mathcal {B}=\mathcal {I}_2\) happens, then we restrict to a maximal well-separated subset \(\mathcal {I}'_2\) of \(\mathcal {I}_2\). Arguing as in the previous section and using Proposition 3.6 and independence, we deduce that

$$\begin{aligned} {\mathbb {P}}[\mathcal {I} \text { occurs with }\mathcal {B}=\mathcal {I}_2]\le \exp \{-201^{-d}c_3tN\}. \end{aligned}$$

If \(\mathcal{VB}\mathcal{}=\mathcal {I}_1\) happens, then we restrict to a well-separated subset \(\mathcal {I}'_1\) of \(\mathcal {I}_1\) defined as in the proof of the case \(h<h_*\) for which we have \(\mathrm {cap}(\Sigma (\mathcal {I}'_1))\ge q\mathrm {cap}(\Sigma (\mathcal {I}_1))\ge qN/4d\). Then we apply Lemma 3.1 to conclude that

$$\begin{aligned} {\mathbb {P}}[\mathcal {I} \text { occurs with } \mathcal {VB}=\mathcal {I}_1]\le \exp \left\{ -\frac{c_0\varepsilon _0^2 qN}{4d}\right\} . \end{aligned}$$

A union bound over all \(\mathcal {I}\in \mathcal {I}_N\) and over all possible subsets \(\mathcal {I}_1,\mathcal {I}_2\) of \(\mathcal {I}\) gives

$$\begin{aligned} {\mathbb {P}}[A_N(h)]\le 2^{\delta tN} \exp \{\delta tN\}\left( \exp \{-201^{-d}c_3tN\}+\exp \left\{ -\frac{c_0\varepsilon _0^2 qN}{4d}\right\} \right) \le \exp \{-c' N\} \end{aligned}$$

for some constant \(c'>0\) depending only on \(h'\) and d, as desired. \(\square \)
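
To see why the last inequality holds, it may help to track the exponents explicitly. The sketch below treats t and q as positive quantities that do not depend on N (an assumption about how they enter through Theorem 3.3 and the choice of the well-separated subset) and uses \(2^{\delta tN}e^{\delta tN}=e^{(1+\log 2)\delta tN}\):

$$\begin{aligned} 2^{\delta tN}e^{\delta tN}\exp \{-201^{-d}c_3tN\}&=\exp \left\{ -\big (201^{-d}c_3-(1+\log 2)\delta \big )tN\right\} ,\\ 2^{\delta tN}e^{\delta tN}\exp \left\{ -\frac{c_0\varepsilon _0^2 qN}{4d}\right\}&=\exp \left\{ -\Big (\frac{c_0\varepsilon _0^2 q}{4d}-(1+\log 2)\delta t\Big )N\right\} . \end{aligned}$$

Both exponents are negative multiples of N as soon as \((1+\log 2)\delta <201^{-d}c_3\) and \((1+\log 2)\delta t<c_0\varepsilon _0^2q/4d\), which is one way to read the requirement that \(\delta \) be small enough. Any \(c'\) strictly smaller than the minimum of the two resulting rates then works for N large enough.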

4 Multi-scale Coarse Graining Construction

We will now proceed with the proof of Theorem 3.3. In order to prove the theorem we need to introduce some notation. For every integer \(L\ge 1\), let \(\mathcal {B}(L)\) and \(\mathcal {VB}(L)\) be the sets of boxes \(B\in \mathcal {C}_o(h,L)\) such that \(\mathcal {E}^{b}_B\) and \(\mathcal {E}^{vb}_B\) happen, respectively. Finally, define \(\mathcal {I}(L):=\mathcal {B}(L)\cup \mathcal {VB}(L)\). Notice that by properties (i)–(iii) of Definition 3.2, we know that if \(\mathrm {diam}(\mathcal {C}_o(h))\ge L\) then \(\mathcal {B}(L)\cap \mathcal {VB}(L)=\emptyset \), \(\partial \mathcal {C}_o(h,L)\subset \mathcal {I}(L)\) and \(\partial ^{out} \mathcal {B}(L) \cap \mathcal {C}_o(h,L)\subset \mathcal {VB}(L)\).

The following lemma will be used in the proof of Theorem 3.3. For simplicity, we may henceforth identify any set of boxes \(\mathcal {F}\) with its union \(\Sigma (\mathcal {F})=\bigcup _{B\in \mathcal {F}} B\).

Lemma 4.1

If the event \(A_N(h)\) happens and in addition \(\mathrm {diam}(\mathcal {C}_o(h))\ge L\), then we have \(\mathrm {cap}\left( \mathcal{VB}\mathcal{}(L)\cup \big (\partial \mathcal {C}_o(h)\cap \mathcal {B}(L)\big )\right) \ge N/2d\).

Remark 4.2

In general, \(\partial \mathcal {C}_o(h)\) is not contained entirely in \(\mathcal {I}(L)\). See Fig. 1.

Proof

We will show that \(X\,{:}{=}\,\mathcal {VB}(L)\cup \big (\partial \mathcal {C}_o(h)\cap \mathcal {B}(L)\big )\) is a separating set of \(\mathcal {C}_o(h)\), namely that for every \(x\in \mathcal {C}_o(h)\), any infinite path starting from x must eventually visit X.

We first partition \(\mathcal {C}_o(h)\) into \(\mathcal {C}_o(h)\cap \mathcal{VB}\mathcal{}(L)\), \(\mathcal {C}_o(h)\cap \mathcal {B}(L)\) and \(\mathcal {C}_o(h)\setminus \mathcal {I}(L)\). It is clear that X is a separating set of \(\mathcal {C}_o(h)\cap \mathcal{VB}\mathcal{}(L)\). Let us show that X is also a separating set of \(\mathcal {C}_o(h)\cap \mathcal {B}(L)\). Indeed, each box of \(\partial ^{out} \mathcal {B}(L)\) lies either in \(L{{\mathbb {Z}}}^d\setminus \mathcal {C}_o(h,L)\) or in \(\mathcal {C}_o(h,L)\), and in the latter case, it must lie in \(\mathcal{VB}\mathcal{}(L)\) by property (iii) of Definition 3.2. With this observation in mind, consider an infinite path \(\gamma \) starting from some vertex in \(\mathcal {C}_o(h)\cap \mathcal {B}(L)\). If \(\gamma \) eventually visits \(\mathcal{VB}\mathcal{}(L)\), then there is nothing to show. If \(\gamma \) does not visit \(\mathcal{VB}\mathcal{}(L)\), then consider the subpath \(\gamma '\) of \(\gamma \) up to the first vertex \(u\in \partial \mathcal {C}_o(h)\) that \(\gamma \) visits. Then \(\gamma '\) visits only vertices in \(\mathcal {C}_o(h,L)\) and by our assumption, it does not visit any vertices in \(\partial ^{out} \mathcal {B}(L)\) because \(\partial ^{out} \mathcal {B}(L)\cap \mathcal {C}_o(h,L)\subset \mathcal{VB}\mathcal{}(L)\). Thus all vertices of \(\gamma '\) lie in \(\mathcal {B}(L)\). In particular, this holds for u, hence \(u\in \mathcal {B}(L)\cap \partial \mathcal {C}_o(h)\subset X\).

It remains to consider \(\mathcal {C}_o(h)\setminus \mathcal {I}(L)\). First, notice that for every component S of \(\mathcal {C}_o(h,L)\setminus \mathcal {I}(L)\), we have \(\partial ^{out} S \subset \mathcal {I}(L)\) because \(\partial \mathcal {C}_o(h,L)\subset \mathcal {I}(L)\). Moreover, \(\partial ^{out} S\cap \mathcal {B}(L)=\emptyset \) because otherwise some box of S would belong to \(\mathcal {I}(L)\) by property (iii) of Definition 3.2. Thus \(\partial ^{out} S\subset \mathcal{VB}\mathcal{}(L)\), which implies that \(\mathcal{VB}\mathcal{}(L)\) is a separating set of \(\mathcal {C}_o(h,L)\setminus \mathcal {I}(L)\), hence it is a separating set of \(\mathcal {C}_o(h){\setminus } \mathcal {I}(L)\), as desired.

We can now easily deduce that \(\mathrm {cap}(X)\ge \mathrm {cap}(\mathcal {C}_o(h))\). Notice that

$$\begin{aligned} \mathrm {cap}(\mathcal {C}_o(h))\ge \mathrm {cap}(\partial ^{out} \mathcal {C}_o(h))/2d\ge N/2d \end{aligned}$$

because when we start a simple random walk from some \(x\in \partial \mathcal {C}_o(h)\), one way to never visit \(\mathcal {C}_o(h)\) again is to first visit a given neighbour \(y\in \partial ^{out} \mathcal {C}_o(h)\) and from there to never visit \(\mathcal {C}_o(h)\cup \partial ^{out} \mathcal {C}_o(h)\) again. \(\square \)
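
The last step can be written out as follows. This is a sketch assuming the normalisation \(\mathrm {cap}(K)=\sum _{x\in K}P_x[\widetilde{H}_K=\infty ]\) for finite K, where \(P_x\) is the law of the simple random walk started at x and \(\widetilde{H}_K\) is its first return time to K (notation introduced here for illustration only; the normalisation used elsewhere in the paper may differ by a constant factor). Write \(K=\mathcal {C}_o(h)\), assumed finite. Only vertices of \(\partial K\) have a positive escape probability, every \(y\in \partial ^{out}K\) has a neighbour \(x\in \partial K\), and the walk steps from x to a given neighbour with probability 1/2d, hence

$$\begin{aligned} \mathrm {cap}(K)=\sum _{x\in \partial K}P_x[\widetilde{H}_K=\infty ]&\ge \sum _{x\in \partial K}\ \sum _{y\in \partial ^{out}K,\, y\sim x}\frac{1}{2d}P_y[\widetilde{H}_{K\cup \partial ^{out}K}=\infty ]\\&\ge \frac{1}{2d}\sum _{y\in \partial ^{out}K}P_y[\widetilde{H}_{K\cup \partial ^{out}K}=\infty ]=\frac{\mathrm {cap}(\partial ^{out}K)}{2d}. \end{aligned}$$

The final equality uses that a walk started on \(\partial ^{out}K\) which enters the finite set K must almost surely leave it again through \(\partial ^{out}K\), so that avoiding \(\partial ^{out}K\) forever and avoiding \(K\cup \partial ^{out}K\) forever are the same event up to a null set.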

Fig. 1

An illustration of an interface \(\mathcal {I}(L)\). Each box of \(\mathcal {C}_o(h,L)\) is depicted by a square. Red boxes belong to \(\mathcal {B}(L)\), blue boxes belong to \(\mathcal{VB}\mathcal{}(L)\) and uncoloured boxes belong to \(\mathcal {C}_o(h,L){\setminus } \mathcal {I}(L)\). The two curves depict \(\partial \mathcal {C}_o(h)\)

We are now ready to prove Theorem 3.3.

Proof of Theorem 3.3

Our aim is to construct an occurring multi-scale interface \(\mathcal {I}\) for every configuration on the event \(A_N(h)\). We will construct \(\mathcal {I}\) by starting from \(\mathcal {I}(2^k)\) for a certain choice of \(2^k\) and then adding boxes of smaller and smaller scales. We will divide the definition of \(\mathcal {I}\) into segments. At each step of the first segment we will add at most \(\dfrac{N}{f(N)}\) boxes, at each step of the second segment we will add at most \(\dfrac{N}{f(f(N))}\) boxes, and so on, where \(f(N)=\log ^b(N)\), \(b=3(d-2)/\rho \). The process will stop once we reach a scale of size roughly L or if it happens that (c1) or (c2) is satisfied before we reach that scale.

It suffices to prove the theorem for \(\rho \le 1\). Consider an integer \(L\ge 1\) and let \(N\ge N_0\), where \(N_0\) is a large enough constant that will be determined along the way. Assume that the event \(A_N(h)\) happens and let

$$\begin{aligned} k_{1,1}\,{:}{=}\,\max \left\{ 0\le k\le \log _2(\mathrm {diam}(\mathcal {C}_o(h))): |\mathcal {I}_{1,1}(2^k)|\ge \dfrac{N}{f(N)}\right\} , \end{aligned}$$

where \(\mathcal {I}_{1,1}(2^k)\,{:}{=}\,\mathcal {I}(2^k)\). Notice that \(k_{1,1}\) is well-defined, since \(|\mathcal {I}_{1,1}(1)|\ge |\partial \mathcal {C}_o(h)|\ge \mathrm {cap}(\mathcal {C}_o(h))\ge N/2d\), provided that \(N_0\) is large enough. By further increasing the value of \(N_0\), we can assume that \(2^{k_{1,1}}\le r\,{:}{=}\,\frac{1}{2}\mathrm {diam}(\mathcal {C}_o(h))\) because \(|\mathcal {C}_o(h,\left\lfloor r \right\rfloor )|\le 6^d\). By definition,

$$\begin{aligned} |\mathcal {I}_{1,1}(L_{1,1})|< \dfrac{N}{f(N)}, \end{aligned}$$

where \(L_{1,1}\,{:}{=}\,2^{k_{1,1}+1}\le \mathrm {diam}(\mathcal {C}_o(h))\).

We now define an interface \(\mathcal {I}'_{1,1}\) as follows. If \(|\mathcal {I}_{1,1}(L_{1,1})|\ge \frac{N}{2^df(N)}\), then we let \(\mathcal {I}'_{1,1}\,{:}{=}\,\mathcal {I}_{1,1}(L_{1,1})\). Otherwise, the number of boxes of \(L_{1,1}{{\mathbb {Z}}}^d\) that contain a box of \(\mathcal {I}_{1,1}(2^{k_{1,1}})\) is at least

$$\begin{aligned} \dfrac{|\mathcal {I}_{1,1}(2^{k_{1,1}})|}{2^d}\ge \dfrac{N}{2^d f(N)}. \end{aligned}$$

We choose \(\dfrac{N}{2^d f(N)}-|\mathcal {I}_{1,1}(L_{1,1})|\) boxes of \(\mathcal {I}_{1,1}(2^{k_{1,1}})\) that are disjoint from \(\mathcal {I}_{1,1}(L_{1,1})\) in an arbitrary way and we add them to \(\mathcal {I}_{1,1}(L_{1,1})\) to obtain an interface \(\mathcal {I}'_{1,1}\). In both cases we have

$$\begin{aligned} \dfrac{N}{2^d f(N)}\le |\mathcal {I}'_{1,1}|<\dfrac{N}{f(N)}. \end{aligned}$$
Fig. 2

An illustration of \(\mathcal {I}'_{1,1}\) on the top and \(\mathcal {I}'_{1,2}\) on the bottom. The boxes of \(\mathcal {B}'_{1,1}\) and \(\mathcal {B}'_{1,2}\) are depicted as red and the boxes of \(\mathcal{VB}\mathcal{}'_{1,1}\) and \(\mathcal{VB}\mathcal{}'_{1,2}\) are depicted as blue. Uncoloured boxes are not included in \(\mathcal {I}'_{1,1}\) or \(\mathcal {I}'_{1,2}\)

We then naturally define \(\mathcal{VB}\mathcal{}'_{1,1}:=\big (\mathcal{VB}\mathcal{}(L_{1,1})\cup \mathcal{VB}\mathcal{}(2^{k_{1,1}})\big )\cap \mathcal {I}'_{1,1}\) and \(\mathcal {B}'_{1,1}:=\big (\mathcal {B}(L_{1,1})\cup \mathcal {B}(2^{k_{1,1}})\big )\cap \mathcal {I}'_{1,1}\). For the set \(\mathcal{VB}\mathcal{}'_{1,1}\) we have

$$\begin{aligned} \text {either} \quad \mathrm {cap}(\mathcal {VB}'_{1,1})\ge N/4d \quad \text {or} \quad \mathrm {cap}(\mathcal {VB}'_{1,1})< N/4d. \end{aligned}$$

In the first case, the process stops because (c1) is satisfied, and we let \(\mathcal {I}=\mathcal{VB}\mathcal{}'_{1,1}\). In the second case, we would like to check whether (c2) is satisfied. For that purpose, we consider two cases according to whether

$$\begin{aligned} |\mathcal {B}'_{1,1}|\ge \dfrac{N}{2^df(N)} \quad \text {or} \quad |\mathcal {B}'_{1,1}|< \dfrac{N}{2^df(N)}. \end{aligned}$$

In the first case, we stop the first segment of our process. In the second case, we move on to the second step of the first segment. We remark that along the way of the second and every subsequent step, we will define some integers \(k_{i,j}, L_{i,j}\) and some collections \(\mathcal {I}'_{i,j}\) of \(2^{k_{i,j}}\)-boxes and \(L_{i,j}\)-boxes, where \(L_{i,j}=2^{k_{i,j}+1}\). To avoid repetition, let us mention that we will use the notation \(\mathcal{VB}\mathcal{}'_{i,j}:=(\mathcal{VB}\mathcal{}(L_{i,j})\cup \mathcal{VB}\mathcal{}(2^{k_{i,j}}))\cap \mathcal {I}'_{i,j}\) and \(\mathcal {B}'_{i,j}:=(\mathcal {B}(L_{i,j})\cup \mathcal {B}(2^{k_{i,j}}))\cap \mathcal {I}'_{i,j}\).

For the second step, we will require N to be large enough so that \(f(N)\ge 4d\). Now let

$$\begin{aligned} k_{1,2}=\max \{k\ge 0: |\mathcal {I}_{1,2}(2^k)|\ge \dfrac{N}{f(N)}\}, \end{aligned}$$

where \(\mathcal {I}_{1,2}(2^k)\) is the set of boxes of \(\mathcal {I}(2^k)\) that lie in some box of \(\mathcal {B}'_{1,1}\). To see that \(k_{1,2}\) is well-defined, notice first that

$$\begin{aligned} \mathrm {cap}\big (\partial \mathcal {C}_o(h)\cap \mathcal {B}'_{1,1}\big )>N/4d \end{aligned}$$
(4.1)

Indeed, as \(\mathcal{VB}\mathcal{}(L_{1,1})\subset \mathcal{VB}\mathcal{}'_{1,1}\), we obtain that \(\mathrm {cap}\left( \mathcal{VB}\mathcal{}'_{1,1}\cup \big (\partial \mathcal {C}_o(h)\cap \mathcal {B}'_{1,1}\big )\right) \ge N/2d\) by Lemma 4.1. The sub-additivity of capacity gives

$$\begin{aligned} \mathrm {cap}\left( \mathcal{VB}\mathcal{}'_{1,1}\cup \big (\partial \mathcal {C}_o(h)\cap \mathcal {B}'_{1,1}\big )\right) \le \mathrm {cap}(\mathcal{VB}\mathcal{}'_{1,1})+ \mathrm {cap}\big (\partial \mathcal {C}_o(h)\cap \mathcal {B}'_{1,1}\big ). \end{aligned}$$

Inequality (4.1) follows now from our assumption that \(\mathrm {cap}(\mathcal{VB}\mathcal{}'_{1,1})< N/4d\). Hence \(\mathcal {I}_{1,2}(1)\), which contains \(\partial \mathcal {C}_o(h)\cap \mathcal {B}'_{1,1}\), has size at least N/4d. Moreover, by our assumption that \(|\mathcal {B}'_{1,1}|< \dfrac{N}{2^df(N)}\), we obtain that \(k_{1,2}<k_{1,1}\). This proves that \(k_{1,2}\) is well-defined.
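
Spelled out, the two displays combine as follows. The passage from capacity to cardinality uses that \(\mathrm {cap}(K)\le |K|\) for every finite K, which holds under the normalisation in which the capacity is a sum of escape probabilities over the vertices of K (assumed here):

$$\begin{aligned} \frac{N}{2d}&\le \mathrm {cap}(\mathcal {VB}'_{1,1})+\mathrm {cap}\big (\partial \mathcal {C}_o(h)\cap \mathcal {B}'_{1,1}\big )<\frac{N}{4d}+\mathrm {cap}\big (\partial \mathcal {C}_o(h)\cap \mathcal {B}'_{1,1}\big ),\\ |\mathcal {I}_{1,2}(1)|&\ge \big |\partial \mathcal {C}_o(h)\cap \mathcal {B}'_{1,1}\big |\ge \mathrm {cap}\big (\partial \mathcal {C}_o(h)\cap \mathcal {B}'_{1,1}\big )>\frac{N}{4d}\ge \frac{N}{f(N)}. \end{aligned}$$

The last inequality is exactly where the requirement \(f(N)\ge 4d\) enters: it makes \(k=0\) admissible in the definition of \(k_{1,2}\).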

Let now \(L_{1,2}\,{:}{=}\,2^{k_{1,2}+1}\). Arguing as in the first step, we obtain an interface \(\mathcal {I}'_{1,2}\) such that

$$\begin{aligned} \dfrac{N}{2^d f(N)}\le |\mathcal {I}'_{1,2}|< \dfrac{N}{f(N)}, \end{aligned}$$

that is obtained from \(\mathcal {I}_{1,2}(L_{1,2})\) by adding enough \(2^{k_{1,2}}\)-boxes of \(\mathcal {I}_{1,2}(2^{k_{1,2}})\) that are disjoint from the boxes of \(\mathcal {I}_{1,2}(L_{1,2})\). At this point, we take cases according to whether

$$\begin{aligned} \mathrm {cap}\Big (\bigcup _{j=1}^2 \mathcal{VB}\mathcal{}'_{1,j}\Big )\ge N/4d \quad \text {or} \quad \mathrm {cap}\Big (\bigcup _{j=1}^2 \mathcal{VB}\mathcal{}'_{1,j}\Big )< N/4d. \end{aligned}$$

As before, if the first case happens, the process stops and we define \(\mathcal {I}=\bigcup _{j=1}^2 \mathcal{VB}\mathcal{}'_{1,j}\), while if the second case happens, then we check whether

$$\begin{aligned} |\mathcal {B}'_{1,2}|\ge \dfrac{N}{2^df(N)} \quad \text {or} \quad |\mathcal {B}'_{1,2}|< \dfrac{N}{2^df(N)}. \end{aligned}$$

Similarly, if the first case happens, we end the first segment. If the second case happens, then we continue to the third step. At this point, we need a generalisation of Lemma 4.1 which will ensure that \(|\mathcal {I}_{1,3}(1)|\ge N/4d\) and more generally that \(|\mathcal {I}_{i,j}(1)|\ge N/4d\) for the subsequent steps. This is proved in Lemma 4.3.

Continuing in this manner, we obtain a sequence of interfaces \((\mathcal {I}'_{1,j})_{j\ge 1}\), where \(\mathcal {I}'_{1,j}\) is contained in \(\mathcal {B}'_{1,j-1}\). We claim that eventually for some integer \(j_1\ge 1\),

$$\begin{aligned} \mathrm {cap}\Big (\bigcup _{i=1}^{j_1} \mathcal{VB}\mathcal{}'_{1,i}\Big )\ge N/4d \quad \text {or} \quad \dfrac{N}{2^df(N)}\le |\mathcal {B}'_{1,j_1}|<\dfrac{N}{f(N)}. \end{aligned}$$

Indeed, if \(\mathrm {cap}\Big (\bigcup _{i=1}^{j} \mathcal{VB}\mathcal{}'_{1,i}\Big )< N/4d\) for all \(j\ge 1\), then by Lemma 4.3 and the sub-additivity of the capacity we have \(\mathrm {cap}(\mathcal {I}'_{1,j})>N/4d\). On the other hand, by (2.6) and the sub-additivity of capacity again, we have

$$\begin{aligned} \mathrm {cap}(\mathcal {I}'_{1,j})\le C|\mathcal {I}'_{1,j}|L^{d-2}_{1,j}\le \dfrac{CNL^{d-2}_{1,j}}{f(N)}. \end{aligned}$$

We thus conclude that

$$\begin{aligned} L_{1,j}\ge \Big (\dfrac{f(N)}{4dC}\Big )^{\frac{1}{d-2}}. \end{aligned}$$
(4.2)

However, it follows from the definitions that \((L_{1,j})_{j\ge 1}\) is a strictly decreasing sequence, and so (4.2) cannot hold for arbitrarily large j.

We end the first segment as soon as we reach a step \(j_1\) as above. We shall now decide whether we start the second segment or not. If it happens that

$$\begin{aligned} \mathrm {cap}\Big (\bigcup _{j=1}^{j_1} \mathcal {VB}'_{1,j}\Big )\ge N/4d \quad \text {or} \quad \dfrac{N}{2^df(N)}\le |\mathcal {B}'_{1,j_1}|<\dfrac{N}{f(N)} \; \text {and} \; L_{1,j_1}\ge f(N)^{\frac{1}{\rho }}, \end{aligned}$$
(4.3)

then our process stops. In the first case, we simply set \(\mathcal {I}=\bigcup _{j=1}^{j_1} \mathcal{VB}\mathcal{}'_{1,j}\). In the second case though, we set \(\mathcal {I}=\mathcal {B}(L_{1,j_1})\cap \mathcal {I}'_{1,j_1}\) if \(|\mathcal {B}(L_{1,j_1})\cap \mathcal {I}'_{1,j_1}|\ge |\mathcal {B}(2^{k_{1,j_1}})\cap \mathcal {I}'_{1,j_1}|\) and \(\mathcal {I}=\mathcal {B}(2^{k_{1,j_1}})\cap \mathcal {I}'_{1,j_1}\) otherwise. In other words, \(\mathcal {I}\) contains only one of the sets \(\mathcal {B}(L_{1,j_1})\cap \mathcal {I}'_{1,j_1}\) and \(\mathcal {B}(2^{k_{1,j_1}})\cap \mathcal {I}'_{1,j_1}\), namely that of larger size. If (4.3) is not satisfied, then we move on to the second segment.

Arguing in a similar manner, we obtain a sequence of occurring interfaces \((\mathcal {I}'_{2,j})_{j\ge 1}\) such that \(|\mathcal {I}'_{2,j}|<N/f(f(N))\) for all \(j\ge 1\), where each \(\mathcal {I}'_{2,j}\) lies in \(\mathcal {I}'_{1,j_1}\). The segment ends when we reach a certain step \(j_2\) such that either

$$\begin{aligned} \mathrm {cap}\Big (\bigcup _{i=1}^2 \bigcup _{j=1}^{j_i} \mathcal{VB}\mathcal{}'_{i,j}\Big )\ge N/4d \quad \text {or} \quad \dfrac{N}{2^d f(f(N))}\le |\mathcal {B}'_{2,j_2}|<\dfrac{N}{f(f(N))}. \end{aligned}$$

The process stops at the end of the second segment if

$$\begin{aligned}&\mathrm {cap}\Big (\bigcup _{i=1}^2 \bigcup _{j=1}^{j_i} \mathcal{VB}\mathcal{}'_{i,j}\Big )\ge N/4d \quad \text {or} \quad \\&\dfrac{N}{2^df(f(N))}\le |\mathcal {B}'_{2,j_2}|<\dfrac{N}{f(f(N))} \; \text {and} \; L_{2,j_2}\ge f(f(N))^{\frac{1}{\rho }}. \end{aligned}$$

In that case, we set \(\mathcal {I}=\bigcup _{i=1}^2\bigcup _{j=1}^{j_i} \mathcal{VB}\mathcal{}'_{i,j}\), \(\mathcal {I}=\mathcal {B}(L_{2,j_2})\cap \mathcal {I}'_{2,j_2}\) or \(\mathcal {I}=\mathcal {B}(2^{k_{2,j_2}})\cap \mathcal {I}'_{2,j_2}\), as appropriate.

Proceeding inductively, we define sequences of occurring interfaces \((\mathcal {I}'_{1,j})_{j=1}^{j_1},(\mathcal {I}'_{2,j})_{j=1}^{j_2},\ldots \) such that \(|\mathcal {I}'_{i,j}|<N/f^{\circ i}(N)\) for all i and j, where \(f^{\circ i}\) denotes the i-fold composition of f. At the end of an arbitrary kth segment, we have either \(\mathrm {cap}\Big (\bigcup _{i=1}^k \bigcup _{j=1}^{j_i} \mathcal {VB}'_{i,j}\Big )\ge N/4d\) or \(\dfrac{N}{2^df^{\circ k}(N)}\le |\mathcal {B}'_{k,j_k}|<\dfrac{N}{f^{\circ k}(N)}\). Let \(m=m(N,L)\) be the largest integer such that \(f^{\circ m}(N)> M\,{:}{=}\,d2^d CL^{d-2}\). Notice that m is well-defined for every N such that \(f(N)> M\). If the desired conditions are not satisfied at the end of the ith segment for every \(i\le m\), we move on to the \((m+1)\)th segment. This segment plays a special role, as we are defining each \(\mathcal {I}'_{m+1,j}\) in such a way that

$$\begin{aligned} \dfrac{N}{2^d M}\le |\mathcal {I}'_{m+1,j}|< \dfrac{N}{M}. \end{aligned}$$

At the end of the \((m+1)\)th segment we have \(\mathrm {cap}\Big (\bigcup _{i=1}^{m+1} \bigcup _{j=1}^{j_i} \mathcal{VB}\mathcal{}'_{i,j}\Big )\ge N/4d\) or \(\dfrac{N}{2^dM}\le |\mathcal {B}'_{m+1,j_{m+1}}|<\dfrac{N}{M}\). Finally, we set \(\mathcal {I}=\bigcup _{i=1}^{m+1}\bigcup _{j=1}^{j_i} \mathcal{VB}\mathcal{}'_{i,j}\), \(\mathcal {I}=\mathcal {B}(L_{m+1,j_{m+1}})\cap \mathcal {I}'_{m+1,j_{m+1}}\) or \(\mathcal {I}=\mathcal {B}(2^{k_{m+1,j_{m+1}}})\cap \mathcal {I}'_{m+1,j_{m+1}}\), as appropriate.

It is not hard to see that if (c1) is not satisfied, then (c2) is satisfied for

$$\begin{aligned} t=\frac{L^{\rho }}{2^{d+1} M}. \end{aligned}$$
(4.4)

Indeed, if the process stops at the end of the ith segment for some \(i\le m\), then the smallest scale \(L_1\) of \(\mathcal {I}\) (which is either \(L_{i,j_i}\) or \(\frac{1}{2}L_{i,j_i}\)) is at least \(\frac{1}{2}L_{i,j_i}\ge \frac{1}{2} f^{\circ i}(N)^{\frac{1}{\rho }}\) and \(|\mathcal {B}|\ge \dfrac{|\mathcal {B}'_{i,j_i}|}{2}\ge \dfrac{N}{2^{d+1} f^{\circ i}(N)}\) (here we use the notation introduced above the statement of Theorem 3.3). Thus \(|\mathcal {B}|L^{\rho }_1 \ge 2^{-\rho -d-1}N\). On the other hand, if the process stops at the end of the \((m+1)\)th segment, then we can argue as in the proof of (4.2) to deduce that

$$\begin{aligned} L_{m+1,j_{m+1}}\ge \Big (\dfrac{M}{4dC}\Big )^{\frac{1}{d-2}}=2L. \end{aligned}$$

Thus the smallest scale \(L_1\) of \(\mathcal {I}\) (which is either \(L_{m+1,j_{m+1}}\) or \(\frac{1}{2}L_{m+1,j_{m+1}}\)) is at least \(\frac{1}{2}L_{m+1,j_{m+1}}\ge L\) and \(|\mathcal {B}|\ge \dfrac{|\mathcal {B}'_{m+1,j_{m+1}}|}{2}\ge \dfrac{N}{2^{d+1} M}\), which implies that \(|\mathcal {B}|L^{\rho }_1 \ge t N\). Since \(t\le 2^{-\rho -d-1}\), the desired assertion follows in both cases.
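
The identity in the last display and the closing claim \(t\le 2^{-\rho -d-1}\) can be checked directly from \(M=d2^dCL^{d-2}\). The sketch below assumes \(C\ge 1\) (an assumption made here for illustration; it is harmless if (2.6) is an upper bound, as its use above suggests) and uses \(\rho \le 1\le d-2\) and \(L\ge 1\):

$$\begin{aligned} \Big (\dfrac{M}{4dC}\Big )^{\frac{1}{d-2}}&=\big (2^{d-2}L^{d-2}\big )^{\frac{1}{d-2}}=2L,\\ t=\frac{L^{\rho }}{2^{d+1}M}&=\frac{L^{\rho -(d-2)}}{2^{2d+1}dC}\le \frac{1}{2^{2d+1}dC}\le 2^{-\rho -d-1}. \end{aligned}$$

The final inequality is equivalent to \(2^{d}dC\ge 2^{\rho }\), which holds because \(2^d d\ge 24\) and \(2^{\rho }\le 2\).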

The above construction gives us a family of interfaces \(\mathcal {I}_N\) satisfying all the properties claimed in Theorem 3.3. The only properties that do not follow immediately from the construction are that \(|\mathcal {I}_N|\le e^{\delta tN}\) and that \(|\mathcal {I}|\le \delta t N\) for every \(\mathcal {I}\in \mathcal {I}_N\). In order to prove these inequalities, we will treat each segment separately. We start with the first segment. To determine \(\mathcal {I}'_{1,j}, j=1,2,\ldots ,j_1\), we need to first determine the sequence \((L_{1,j})_{j=1}^{j_1}\). Recall that by construction we have \(L_{1,1}\le \mathrm {diam}(\mathcal {C}_o(h))\). As we mentioned above Corollary 1.4, a cluster of capacity at most N has volume (and therefore diameter) at most \(C_1N^{\frac{d}{d-2}}\le C_1N^3\), thus \(L_{1,1}\le \mathrm {diam}(\mathcal {C}_o(h))\le C_1N^3\). Therefore, \((L_{1,j})_{j=1}^{j_1}\) is simply a strictly decreasing sequence of powers of 2 with exponents at most \(\log _2(C_1N^3)\), which in turn implies that there are at most \(2^{\log _2(C_1N^3)}=C_1N^3\) possibilities for \((L_{1,j})_{j=1}^{j_1}\). Once the scales \((L_{1,j})_{j=1}^{j_1}\) are fixed, we should bound the possibilities for \(\mathcal {I}'_{1,j}, 1\le j\le j_1\). Notice that for all \(j=1,2,\ldots ,j_1\), each box of \(\mathcal {I}'_{1,j}\) is at distance at most \(C_1N^3\) from the origin and furthermore \(|\mathcal {I}'_{1,j}|\le N_1:=\lfloor N/f(N) \rfloor \). Hence, for each \(1\le j\le j_1\), the number of possibilities for \(\mathcal {I}'_{1,j}\) given \(L_{1,j}\) is at most

$$\begin{aligned} \sum _{k=1}^{N_1}{N'_1\atopwithdelims (){k}}, \end{aligned}$$

where \(N'_1=C_2N^{3d}\) is an upper bound for the number of boxes of \(L_{1,j}{{\mathbb {Z}}}^d\) and \(\frac{L_{1,j}}{2}{{\mathbb {Z}}}^d\) at distance at most \(C_1N^3\) from the origin, and the sum accounts for the possible values of \(|\mathcal {I}'_{1,j}|\). Using the inequality \({{n}\atopwithdelims (){k}}\le \left( \tfrac{n}{k}\right) ^k e^k\) and the monotonicity of the combinatorial coefficient \({{n}\atopwithdelims (){k}}\) for \(k\le n/2\) we obtain that

$$\begin{aligned} \sum _{k=1}^{N_1}{N'_1\atopwithdelims (){k}}\le N_1 \left( \dfrac{N'_1}{N_1}\right) ^{N_1} e^{N_1}\le \exp \left\{ 3N_1\log \left( \frac{N'_1}{N_1}\right) \right\} \end{aligned}$$

for every N large enough so that \(N_1\le \frac{N_1'}{2}\). Overall, there are at most

$$\begin{aligned} C_1N^3\left( \sum _{k=1}^{N_1}{N'_1\atopwithdelims (){k}}\right) ^{\log _2(C_1N^3)}\le \exp \left\{ C_3\dfrac{N\log ^2(N)}{f(N)}\right\} \end{aligned}$$
(4.5)

possibilities for the first segment. By increasing \(C_3\) if necessary, the term inside the exponential in (4.5) is also an upper bound for the number of boxes of the first segment contained in \(\mathcal {I}\).
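
The two estimates above can be unpacked as follows. This is a sketch with unspecified constants \(C'\) and \(C''\) (hypothetical names, depending only on d and b), assuming N is large enough that \(N'_1/N_1\ge e\), \(N_1\le N'_1/2\) and \(N_1\ge N/(2f(N))\). For the binomial sum, monotonicity in k and the inequality \({{n}\atopwithdelims (){k}}\le (en/k)^k\) give

$$\begin{aligned} \sum _{k=1}^{N_1}{{N'_1}\atopwithdelims (){k}}\le N_1{{N'_1}\atopwithdelims (){N_1}}\le \exp \left\{ \log N_1+N_1+N_1\log \left( \frac{N'_1}{N_1}\right) \right\} \le \exp \left\{ 3N_1\log \left( \frac{N'_1}{N_1}\right) \right\} , \end{aligned}$$

where the last step uses \(\log N_1\le N_1\le N_1\log (N'_1/N_1)\). Since \(N'_1/N_1\le 2C_2N^{3d-1}f(N)\) is polynomially bounded in N, we have \(\log (N'_1/N_1)\le C'\log N\), and therefore

$$\begin{aligned} \log \left[ C_1N^3\left( \sum _{k=1}^{N_1}{{N'_1}\atopwithdelims (){k}}\right) ^{\log _2(C_1N^3)}\right] \le \log (C_1N^3)+3\log _2(C_1N^3)\,\frac{N}{f(N)}\,C'\log N\le C''\dfrac{N\log ^2(N)}{f(N)}. \end{aligned}$$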

Moving on to the second segment, first notice that all scales \((L_{2,j})_{j=1}^{j_2}\) are powers of 2 smaller than \(f(N)^{\frac{1}{\rho }}\) (recall that (4.3) does not hold). Therefore, there are at most \(f(N)^{\frac{1}{\rho }}\) possibilities for \((L_{2,j})_{j=1}^{j_2}\). Since \(L_{1,j_1}< f(N)^{\frac{1}{\rho }}\) and every box of the second segment is contained in \(\mathcal {I}'_{1,j_1}\), which in turn contains at most \(N_1\) boxes, we deduce that for every \(j=1,2,\ldots ,j_2\), the boxes of \(\mathcal {I}'_{2,j}\) are chosen from a set of at most \(N_1 f(N)^{\frac{d}{\rho }}\le C_4 N\lfloor f(N) \rfloor ^{\frac{d}{\rho }-1}=:N'_2\) boxes. Hence for each \(1\le j\le j_2\) the number of possibilities for \(\mathcal {I}'_{2,j}\) given \(L_{2,j}\) is at most

$$\begin{aligned} \sum _{k=1}^{N_2}{{N'_2}\atopwithdelims (){k}}\le \exp \left\{ 3N_2\log \left( \frac{N'_2}{N_2}\right) \right\} , \end{aligned}$$

where \(N_2=\lfloor N/f(f(N)) \rfloor \). Overall, there are at most

$$\begin{aligned} f(N)^{\frac{1}{\rho }}\left( \sum _{k=1}^{N_2}{N'_2\atopwithdelims (){k}}\right) ^{\log _2\big (f(N)^{\frac{1}{\rho }}\big )}\le \exp \left\{ C_3\dfrac{N\log ^2(f(N))}{f(f(N))}\right\} \end{aligned}$$

possibilities for the second segment, where for the last inequality, we increase the value of \(C_3\) if necessary.
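
The same bookkeeping applies to the second segment. As a sketch, with hypothetical constants \(C'\) and \(C''\) depending only on d and \(\rho \), and assuming N is large enough that \(N_2\ge N/(2f(f(N)))\) and \(f(f(N))\le f(N)\):

$$\begin{aligned} \frac{N'_2}{N_2}&\le 2C_4f(N)^{\frac{d}{\rho }-1}f(f(N))\le 2C_4f(N)^{\frac{d}{\rho }},\qquad \text {so}\qquad \log \left( \frac{N'_2}{N_2}\right) \le C'\log (f(N)),\\ \log _2\big (f(N)^{\frac{1}{\rho }}\big )\cdot 3N_2\log \left( \frac{N'_2}{N_2}\right)&\le \frac{3C'}{\rho \log 2}\cdot \frac{N\log ^2(f(N))}{f(f(N))}\le C''\dfrac{N\log ^2(f(N))}{f(f(N))}. \end{aligned}$$

The prefactor \(f(N)^{\frac{1}{\rho }}\) contributes only \(\frac{1}{\rho }\log (f(N))\) to the exponent, which is of lower order for large N.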

Setting \(g_0(N):=N\), \(g_i(N):=f^{\circ i}(N)\) for \(1\le i\le m\) and \(g_{m+1}(N):=M\) (recall that \(M=d2^d CL^{d-2}\)), we see that for the boxes of an arbitrary ith segment, there are at most

$$\begin{aligned} \exp \left\{ C_3\dfrac{N\log ^2(g_{i-1}(N))}{g_i(N)}\right\} \end{aligned}$$

possibilities. Overall, we deduce that

$$\begin{aligned} |\mathscr {I}_N|\le \exp \left\{ C_3 N \sum _{i=1}^{m+1}\dfrac{\log ^2(g_{i-1}(N))}{g_i(N)} \right\} . \end{aligned}$$
(4.6)

Furthermore, the term inside the exponential in (4.6) is an upper bound for \(|\mathcal {I}|\).

Therefore, it remains to prove that

$$\begin{aligned} C_3\sum _{i=1}^{m+1}\dfrac{\log ^2(g_{i-1}(N))}{g_i(N)}\le \delta t, \end{aligned}$$
(4.7)

provided that L and N are large enough (recall from (4.4) that \(t=\tfrac{L^{\rho }}{2^{d+1} M}\)). We start by bounding the \((m+1)\)th term. By the definition of m, we have \(f^{\circ (m+1)}(N)\le M\), which implies \(g_m(N)=f^{\circ m}(N)\le e^{M^{\frac{1}{b}}}\). Since \(g_{m+1}(N)=M\), we have

$$\begin{aligned} \dfrac{\log ^2(g_{m}(N))}{g_{m+1}(N)}\le M^{-1+\frac{2}{b}}. \end{aligned}$$
(4.8)

Now, let us handle the sum up to the mth term. First notice that for all \(i\le m\), \(\tfrac{\log ^2(g_{i-1}(N))}{g_i(N)}=\log ^{2-b}(g_{i-1}(N))\). Now, recall that \(b=3(d-2)/\rho >2\) and observe that \(\log ^{2-b}(x)\le 2^{-1}\log ^{2-b}(f(x))\) for all \(x\ge C_5\). Since \(g_{i-1}(N)\ge g_m(N)\ge C_5\) for all L and N that are large enough, one readily deduces

$$\begin{aligned} \log ^{2-b}(g_{i-1}(N))\le 2^{-1} \log ^{2-b}(g_i(N)). \end{aligned}$$

Iterating the last inequality, we obtain that \(\log ^{2-b}(g_{i-1}(N))\le 2^{i-m} \log ^{2-b}\left( g_{m-1}(N)\right) \), which in turn implies

$$\begin{aligned} \sum _{i=1}^m \dfrac{\log ^2(g_{i-1}(N))}{g_i(N)}\le 2\log ^{2-b}\left( g_{m-1}(N)\right) . \end{aligned}$$
(4.9)

By the definition of m we know that \(f^{\circ m}(N)\ge M\), which implies \(g_{m-1}(N)=f^{\circ (m-1)}(N)\ge e^{M^{\frac{1}{b}}}\). Plugging this in (4.9) gives

$$\begin{aligned} \sum _{i=1}^m \dfrac{\log ^2(g_{i-1}(N))}{g_i(N)}\le 2M^{-1+\frac{2}{b}}. \end{aligned}$$
(4.10)
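
The passage to (4.9) and then to (4.10) can be spelled out in two lines. The first uses the geometric series, and the second uses that \(2-b<0\), so that the lower bound \(\log (g_{m-1}(N))\ge M^{\frac{1}{b}}\) turns into an upper bound on \(\log ^{2-b}(g_{m-1}(N))\):

$$\begin{aligned} \sum _{i=1}^m\log ^{2-b}(g_{i-1}(N))&\le \log ^{2-b}\left( g_{m-1}(N)\right) \sum _{i=1}^m2^{i-m}=\big (2-2^{1-m}\big )\log ^{2-b}\left( g_{m-1}(N)\right) ,\\ \log ^{2-b}\left( g_{m-1}(N)\right)&\le \big (M^{\frac{1}{b}}\big )^{2-b}=M^{-1+\frac{2}{b}}. \end{aligned}$$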

Combining (4.8) and (4.10), we deduce that

$$\begin{aligned} C_3\sum _{i=1}^{m+1}\dfrac{\log ^2(g_{i-1}(N))}{g_i(N)}\le 3C_3 M^{-1+\frac{2}{b}}. \end{aligned}$$

Recalling the definitions of M and t, we see that \(t=C_6 M^{-1+\frac{\rho }{d-2}}\). Since by definition \(b=3(d-2)/\rho \), the desired inequality (4.7) follows readily as long as \(\delta \ge \tfrac{3C_3}{C_6}M^{-\frac{\rho }{3(d-2)}}\), which can be guaranteed by making L sufficiently large. This completes the proof. \(\square \)
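
For completeness, the exponent arithmetic behind the last step reads as follows, with the value of \(C_6\) made explicit from \(M=d2^dCL^{d-2}\) and (4.4), and with \(\frac{2}{b}=\frac{2\rho }{3(d-2)}\):

$$\begin{aligned} t&=\frac{1}{2^{d+1}M}\Big (\frac{M}{d2^dC}\Big )^{\frac{\rho }{d-2}}=C_6M^{-1+\frac{\rho }{d-2}},\qquad C_6=2^{-(d+1)}\big (d2^dC\big )^{-\frac{\rho }{d-2}},\\ \frac{3C_3M^{-1+\frac{2}{b}}}{t}&=\frac{3C_3}{C_6}M^{\frac{2\rho }{3(d-2)}-\frac{\rho }{d-2}}=\frac{3C_3}{C_6}M^{-\frac{\rho }{3(d-2)}}. \end{aligned}$$

The right-hand side tends to zero as \(L\rightarrow \infty \), so for any fixed \(\delta >0\) it is eventually at most \(\delta \).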

For \(1\le i\le m+1\) and \(1\le j\le j_i\), let

$$\begin{aligned} \overline{\mathcal{VB}\mathcal{}}_{i,j}=\left( \bigcup _{k=1}^{i-1} \bigcup _{l=1}^{j_k}{\mathcal{VB}\mathcal{}}'_{k,l}\right) \cup \left( \bigcup _{l=1}^{j}{\mathcal{VB}\mathcal{}}'_{i,l}\right) . \end{aligned}$$

We now prove the lemma mentioned in the proof of the above theorem. We recall that for convenience we identify sets of boxes with the corresponding subsets of \({{\mathbb {Z}}}^d\).

Lemma 4.3

For every \(i,j\ge 1\) we have \(\mathrm {cap}\left( \overline{\mathcal{VB}\mathcal{}}_{i,j}\cup \big (\partial \mathcal {C}_o(h)\cap \mathcal {B}_{i,j}\big )\right) \ge N/2d\).

Proof

As in the proof of Lemma 4.1, the desired result will follow once we show that \(X_{i,j}=\overline{\mathcal{VB}\mathcal{}}_{i,j}\cup \big (\partial \mathcal {C}_o(h)\cap \mathcal {B}_{i,j}\big )\) is a separating set of \(\mathcal {C}_o(h)\). Recall the definitions of \(\mathcal {I}'_{i,j}\) and \(\mathcal {I}_{i,j}(L_{i,j})\). We will prove that \(X_{i,j}\) is a separating set of \(\mathcal {C}_o(h)\) in the special case where \(\mathcal {I}'_{1,1}=\mathcal {I}_{1,1}(L_{1,1}),\mathcal {I}'_{1,2}=\mathcal {I}_{1,2}(L_{1,2}),\ldots ,\mathcal {I}'_{i,j}= \mathcal {I}_{i,j}(L_{i,j})\). The general case follows easily by removing \(\left( \mathcal {I}'_{1,1}{\setminus } \mathcal {I}_{1,1}(L_{1,1})\right) \cup \left( \mathcal {I}'_{1,2}{\setminus } \mathcal {I}_{1,2}(L_{1,2})\right) \cup \ldots \cup \left( \mathcal {I}'_{i,j}{\setminus } \mathcal {I}_{i,j}(L_{i,j})\right) \) from \(X_{i,j}\).

It is clear that \(X_{i,j}\) is a separating set of \(\mathcal {C}_o(h)\cap \overline{\mathcal{VB}\mathcal{}}_{i,j}\). We claim that

$$\begin{aligned} \text {every box in } \partial ^{out} \mathcal {B}_{i,j} \text { lies either in } \overline{\mathcal {VB}}_{i,j} \text { or in } L_{i,j}{{\mathbb {Z}}}^d\setminus \mathcal {C}_o(h,L_{i,j}), \end{aligned}$$
(4.11)

which implies that \(X_{i,j}\) is a separating set of \(\mathcal {C}_o(h)\cap \mathcal {B}_{i,j}\) by arguing as in the proof of Lemma 4.1. Indeed, for \((i,j)=(1,1)\), the claim follows from property (iii). Proceeding inductively, assume that the statement holds for an arbitrary \((i,j)\). Let \((k,l)\) be the next pair of indices, i.e. \((k,l)=(i,j+1)\) if \(j<j_i\) or \((k,l)=(i+1,1)\) if \(j=j_i\). Clearly, every box in \(\partial ^{out} \mathcal {B}_{k,l}\) lies either in \(L_{k,l}{{\mathbb {Z}}}^d\setminus \mathcal {C}_o(h,L_{k,l})\), in which case there is nothing to show, or in \(\mathcal {C}_o(h,L_{k,l})\). So let us consider a box \(B\in \partial ^{out} \mathcal {B}_{k,l}\cap \mathcal {C}_o(h,L_{k,l})\). Then B has a neighbour \(B'\in \mathcal {B}_{k,l}\subset \mathcal {B}_{i,j}\), which implies that B is contained entirely in \(\mathcal {B}_{i,j}\) or in \(\partial ^{out} \mathcal {B}_{i,j}\).

If B is contained in \(\mathcal {B}_{i,j}\), then \(B\in \mathcal{VB}\mathcal{}_{k,l}\subset \overline{\mathcal{VB}\mathcal{}}_{k,l}\) because of our assumption that \(B\in \partial ^{out} \mathcal {B}_{k,l}\cap \mathcal {C}_o(h,L_{k,l})\). Let us now assume that B is contained in some box \(B''\in \partial ^{out} \mathcal {B}_{i,j}\). It follows from our inductive hypothesis that \(B''\) lies either in \(\overline{\mathcal{VB}\mathcal{}}_{i,j}\) or in \(L_{i,j}{{\mathbb {Z}}}^d{\setminus } \mathcal {C}_o(h,L_{i,j})\). Notice that \(L_{i,j}{{\mathbb {Z}}}^d{\setminus } \mathcal {C}_o(h,L_{i,j})\subset L_{k,l}{{\mathbb {Z}}}^d{\setminus } \mathcal {C}_o(h,L_{k,l})\), since any box \(B_1\in L_{k,l}{{\mathbb {Z}}}^d\) is contained in a box \(B_2\in L_{i,j}{{\mathbb {Z}}}^d\) and if \(B_1\) intersects \(\mathcal {C}_o(h)\), then so does \(B_2\). Thus B lies either in \(\overline{\mathcal{VB}\mathcal{}}_{k,l}\) or in \(L_{k,l}{{\mathbb {Z}}}^d{\setminus } \mathcal {C}_o(h,L_{k,l})\). This proves the inductive statement and the claim follows.

It remains to handle \(\mathcal {C}_o(h)\setminus Y_{i,j}\), where \(Y_{i,j}=\overline{\mathcal{VB}}_{i,j}\cup \mathcal {B}_{i,j}\). Let \(Z_{i,j}=\mathcal {C}_o(h,L_{i,j})\setminus Y_{i,j}\). Then we claim that \(\partial ^{out} Z_{i,j}\) is contained in \(\overline{\mathcal{VB}}_{i,j}\), which implies that \(\overline{\mathcal{VB}}_{i,j}\) is a separating set of \(\mathcal {C}_o(h)\setminus Y_{i,j}\). We will prove the claim inductively. For \((i,j)=(1,1)\), this follows from the proof of Lemma 4.1 where it is shown that for every component S of \(\mathcal {C}_o(h,L_{1,1})\setminus \mathcal {I}(L_{1,1})\), we have \(\partial ^{out} S\subset \mathcal{VB}(L_{1,1})\).

Assume that the statement holds for some pair \((i,j)\). We will prove it for the next pair of indices \((k,l)\). Let S be a component of \(Z_{k,l}\). Although \(S\subset Z_{k,l}\), it is possible that some box of S is contained in \(\mathcal {B}_{i,j}\). Let us assume that this is the case. Then by the connectivity of S and (4.11), all boxes of S are contained in \(\mathcal {B}_{i,j}\). Notice that \(\partial ^{out} S\) lies in \(\mathcal {C}_o(h,L_{k,l})\) because otherwise some box B of S lies in \(\partial ^{out} \mathcal {C}_o(h,L_{k,l})\cap \mathcal {B}_{i,j}\), hence B is contained in \(Y_{k,l}\), which contradicts the definition of \(Z_{k,l}\). From this we deduce that \(\partial ^{out} S\subset Y_{k,l}\). Moreover, no box of \(\partial ^{out} S\) lies in \(\mathcal {B}_{k,l}\) because otherwise some box of S lies in \(\mathcal {I}_{k,l}\subset Y_{k,l}\) by our assumption that S is contained in \(\mathcal {B}_{i,j}\). Therefore, \(\partial ^{out} S\) lies in \(\overline{\mathcal{VB}}_{k,l}\).

Let us now assume that no boxes of S are contained in \(\mathcal {B}_{i,j}\). Then S lies entirely in \(\mathcal {C}_o(h,L_{i,j})\setminus Y_{i,j}\), since \(Y_{k,l}\) contains \(\overline{\mathcal{VB}}_{i,j}\). We can now apply the induction hypothesis to deduce that \(\partial ^{out} S\) is contained in \(\overline{\mathcal{VB}}_{i,j}\subset \overline{\mathcal{VB}}_{k,l}\). This completes the inductive proof.

\(\square \)

5 Decay of Badness

In this section, we will prove Proposition 3.4. We will make use of the (supercritical) sharpness of phase transition for GFF percolation [DCGRS20] (i.e. \({\overline{h}}=h_*\)). We say that a box \(B=B_L(z)\), \(z\in L{{\mathbb {Z}}}^d\), is \((\varphi ,h)\)-good if there exists a connected component in \(\{\varphi \ge h\}\cap B\) with diameter at least L/5 and furthermore any two clusters in \(\{\varphi \ge h\}\cap U\) having diameter at least L/10 are connected to each other in \(\{\varphi \ge h\}\cap D\). By the main result of [DCGRS20], for every \(h'<h_*\) there exist \(\rho =\rho (d)\in (0,1)\) and \(c=c(d,h')\) such that for every \(h\le h'\) and \(L \ge 1\),

$$\begin{aligned} {\mathbb {P}}[B_L \text { is } (\varphi ,h)\text {-good}]\ge 1-e^{-cL^{\rho }}. \end{aligned}$$
(5.1)

Our aim is to express the event that a box is \((\psi ,h,\varepsilon )\)-bad in terms of events depending on \(\varphi \), so that we can use (5.1). For this purpose, we will make use of the following classical fact about discrete harmonic functions. For any function \(f:\overline{D_N}\rightarrow {{\mathbb {R}}}\) which is harmonic in \(D_N\), we have that

$$\begin{aligned} |f(x)-f(y)|\le C'\left\| f \right\| _{\infty }/N \end{aligned}$$
(5.2)

for neighbouring x and y in \([-2N,3N)^d\), where \(C'=C'(d)>0\) is a universal constant—see [Law91, Theorem 1.7.1]. We shall apply this result for \(f=\xi ^B\) and B being a \((\xi ,\varepsilon )\)-good box for a certain value of \(\varepsilon >0\). We first need to introduce some definitions.

Consider an integer \(N\ge 1\) and let \(L=\left\lfloor N/M \right\rfloor \approx N^{\frac{1}{\alpha +2}}\), where \(\alpha =(2d+1)^2\) and \(M=N^{\frac{\alpha +1}{\alpha +2}}\). We say that a connected subgraph \(\mathcal {C}\) of \(D_N\) is very dense if \(\mathcal {C}\cap U_N\) contains a connected subgraph of diameter at least N/5 and for every box \(B=B_L(z)\), \(z\in L{{\mathbb {Z}}}^d\) contained in \(D_N\), \(\mathcal {C}\cap B\) contains a connected subgraph of diameter at least L/5.

Given \(0<\varepsilon <h_*-h\), we say that strong local uniqueness happens in \(B_N\) if \(\{\varphi \ge h+\varepsilon \}\cap D_N\) contains a very dense cluster and furthermore, for every \(k\in \{-\left\lceil \varepsilon L^{\alpha }\right\rceil ,\ldots , \left\lceil \varepsilon L^{\alpha } \right\rceil \}\), every box \(B=B_L(z)\), \(z\in L{{\mathbb {Z}}}^d\) contained in \(D_N\) is \((\varphi ,h-rL^{-\alpha })\)-good, where \(r=k-1-7dC'\varepsilon \) and \(C'\) is the constant appearing in (5.2). We denote by \(\text {NSLU}(h,\varepsilon ,N)\) the event that strong local uniqueness does not happen in \(B_N\).

Lemma 5.1

For every \(h'<h_*\) and every \(0<\varepsilon <h_*-h'\), there exist constants \(c=c(h',d,\varepsilon )>0\) and \(\rho =\rho (d)>0\) such that for every \(h\le h'\) and \(N\ge 1\),

$$\begin{aligned} {\mathbb {P}}[\text {NSLU}(h,\varepsilon ,N)]\le e^{-c N^\rho }. \end{aligned}$$

Consider now the boxes of \(L^2{{\mathbb {Z}}}^d\) contained in \(D_N\). Given such a box B, we define \(\text {Conf}(h,\varepsilon ,B)\) as the event that there exist a set \(S\subset D\) of cardinality \(|S|\ge L\) and an integer \(k\in \{-\left\lceil \varepsilon L^{\alpha }\right\rceil ,\ldots ,\left\lceil \varepsilon L^{\alpha } \right\rceil \}\) such that \(h-kL^{-\alpha }\le \varphi _x < h-rL^{-\alpha }\) for every \(x\in S\), where \(r=k-1-7dC'\varepsilon \). In other words, when the event \(\text {Conf}(h,\varepsilon ,B)\) happens, \(\varphi _x\) is confined for at least L vertices in D. Finally, we define

$$\begin{aligned} \text {Conf}(h,\varepsilon ,N)\,{:}{=}\,\bigcup _{B} \text {Conf}(h,\varepsilon ,B), \end{aligned}$$

where the union is taken over all boxes of \(L^2{{\mathbb {Z}}}^d\) contained in \(D_N\).

Lemma 5.2

For every \(h'<h_*\) and every \(0<\varepsilon <h_*-h'\), there exist constants \(c=c(h',\varepsilon ,d)>0,\rho =\rho (d)>0\) such that for every \(h\le h'\) and \(N\ge 1\),

$$\begin{aligned} {\mathbb {P}}[\text {Conf}(h,\varepsilon ,N)]\le e^{-c N^{\rho }}. \end{aligned}$$

Proposition 3.4 follows readily by applying the following (deterministic) lemma for \(\delta =(h_*-h'-\varepsilon )/2\) together with (3.6) and Lemmas 5.1 and 5.2 above.

Lemma 5.3

Let \(h<h_*\), \(0<\varepsilon <h_*-h\) and \(0<\delta <h_*-h-\varepsilon \). For every \(N\ge 1\) large enough, if the box \(B_N\) is \((\psi ,h,\varepsilon )\)-bad, then one of the events \(\{B_{N} \text { is } (\xi ,\delta )\text {-bad}\}\), \(\text {NSLU}(h,\varepsilon +\delta ,N)\) or \(\text {Conf}(h,\varepsilon +\delta ,N)\) happens.

We now turn to the proof of each of the above lemmas.

Proof of Lemma 5.3

If \(B_{N}\) is \((\xi ,\delta )\)-bad or the event \(\text {Conf}(h,\varepsilon +\delta ,N)\) happens, then there is nothing to prove, so let us assume that \(B_{N}\) is \((\xi ,\delta )\)-good and \(\text {Conf}(h,\varepsilon +\delta ,N)\) does not happen. We need to show that \(\text {NSLU}(h,\varepsilon +\delta ,N)\) happens. To this end, if \(\{\varphi \ge h+\varepsilon +\delta \}\cap D_N\) does not contain a very dense cluster, then \(\text {NSLU}(h,\varepsilon +\delta ,N)\) happens, so let us assume that \(\{\varphi \ge h+\varepsilon +\delta \}\cap D_N\) does contain a very dense cluster \(\mathcal {C}_1\).

We claim that for some function \(f:\overline{D_{N}}\rightarrow {{\mathbb {R}}}\) which is harmonic in \(D_{N}\) and satisfies \(|f(x)|<\varepsilon +\delta \) for every \(x\in D_N\),

$$\begin{aligned}&\{\varphi +f\ge h\}\cap U_N \text { contains a cluster } \mathcal {C}_2 \text { of diameter at least } N/5 \text { which is not } \nonumber \\&\quad \text { connected to }\,\mathcal {C}_1 \text { in } \{\varphi +f\ge h\}\cap D_N. \end{aligned}$$
(5.3)

Indeed, \(B_N\) is \((\psi ,h,\varepsilon )\)-bad, so it follows from the decomposition of \(\varphi \) that there is a function f as above for which either \(\{\varphi +f\ge h\}\cap U_N\) does not contain a cluster of diameter at least N/5 or (5.3) happens. However, \(\mathcal {C}_1\) contains a cluster of \(\{\varphi +f\ge h\}\cap U_N\) of diameter at least N/5, which implies that (5.3) happens.

Fix now a box \(B=B_{L^2}(z)\in L^2{{\mathbb {Z}}}^d\) that intersects \(\mathcal {C}_2\). We assume that N is large enough so that \(D=D_{L^2}(z)\) is contained in \([-2N,3N)^d\). Notice that

$$\begin{aligned} \varphi _x+\max _{u\in D} f(u)\ge \varphi _x+f(x) \ge h \text { for } x\in \mathcal {C}_2\cap D \end{aligned}$$
(5.4)

and

$$\begin{aligned} \varphi _x+\min _{u\in D} f(u)\le \varphi _x+f(x) < h \text { for } x\in \partial ^{out} \mathcal {C}_2\cap D. \end{aligned}$$
(5.5)

Since f is harmonic in \(D_N\) and \(|f(x)|< \varepsilon +\delta \) for every \(x\in D_N\), we have that \(|f(x)-f(y)|\le C'(\varepsilon +\delta )/N\) for neighbouring x and y in \([-2N,3N)^d\) by (5.2). Since D has diameter at most \(7dL^2\), we conclude that

$$\begin{aligned} \max _{u\in D} f(u)-\min _{u\in D} f(u)\le 7dC'(\varepsilon +\delta ) L^2/N\le 7dC'(\varepsilon +\delta ) L^{-\alpha }. \end{aligned}$$
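The second inequality in the last display can be unpacked as follows. This is a short sketch that uses only the definitions \(L=\left\lfloor N/M \right\rfloor \) and \(M=N^{\frac{\alpha +1}{\alpha +2}}\) from the beginning of this section:

$$\begin{aligned} L\le \frac{N}{M}=N^{\frac{1}{\alpha +2}} \quad \Longrightarrow \quad L^{\alpha +2}\le N \quad \Longrightarrow \quad \frac{L^2}{N}\le \frac{L^2}{L^{\alpha +2}}=L^{-\alpha }. \end{aligned}$$

The first inequality comes from joining any two vertices of D by a path in D of length at most \(7dL^2\) and applying the gradient bound \(C'(\varepsilon +\delta )/N\) to each step of the path.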

Consider the smallest \(k\in \{-\left\lceil (\varepsilon +\delta ) L^{\alpha }\right\rceil ,\ldots ,\left\lceil (\varepsilon +\delta ) L^{\alpha }\right\rceil \}\) such that \(\max _{u\in D}f(u)\le kL^{-\alpha }\). Then \(\max _{u\in D}f(u)> (k-1)L^{-\alpha }\), hence

$$\begin{aligned} \min _{u\in D}f(u)\ge \max _{u\in D}f(u)-7dC'(\varepsilon +\delta ) L^{-\alpha }>(k-1-7dC'(\varepsilon +\delta ))L^{-\alpha }. \end{aligned}$$
(5.6)

We can now deduce from (5.4) and (5.5) that

$$\begin{aligned} \varphi \ge h-kL^{-\alpha } \text { on } \mathcal {C}_2\cap D \end{aligned}$$
(5.7)

and

$$\begin{aligned} \varphi < h-rL^{-\alpha } \text { on } \partial ^{out} \mathcal {C}_2\cap D, \end{aligned}$$
(5.8)

where \(r=k-1-7dC'(\varepsilon +\delta )\).
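For completeness, here is one way to spell out the step from (5.4) and (5.5) to (5.7) and (5.8); it is a sketch that only combines the choice of k with (5.6):

$$\begin{aligned} x\in \mathcal {C}_2\cap D&\quad \Longrightarrow \quad \varphi _x\ge h-\max _{u\in D}f(u)\ge h-kL^{-\alpha },\\ x\in \partial ^{out} \mathcal {C}_2\cap D&\quad \Longrightarrow \quad \varphi _x< h-\min _{u\in D}f(u)< h-(k-1-7dC'(\varepsilon +\delta ))L^{-\alpha }=h-rL^{-\alpha }. \end{aligned}$$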

Now as \(\text {Conf}(h,\varepsilon +\delta ,N)\) does not happen and (5.7) holds, for all but at most \(L-1\) vertices x of \(\mathcal {C}_2\cap D\) we have \(\varphi _x\ge h-rL^{-\alpha }\). We claim that \(\{\varphi \ge h-rL^{-\alpha }\}\cap \mathcal {C}_2\cap D\) contains a cluster of diameter at least L. Indeed, notice that \(\mathcal {C}_2\cap D\) contains a connected set of diameter at least \(3L^2\) because the graph distance between B and \(\partial D\) is \(3L^2\). Consider a path \(\gamma \) in \(\mathcal {C}_2\cap D\) connecting two vertices u and v with graph distance \(d(u,v)\ge 3L^2\). Then we have that

$$\begin{aligned} 3L^2\le \sum _{i=0}^{j-1} d(x_i,x_{i+1}), \end{aligned}$$

where \(x_0=u,x_j=v\) and \(x_1,\ldots ,x_{j-1}\) are the vertices of \(\gamma \) in between u and v such that \(h-kL^{-\alpha }\le \varphi _{x_i}<h-rL^{-\alpha }\), ordered in turn of appearance in \(\gamma \) as we move from u to v. As \(\gamma \) can have at most \(L-1\) vertices x such that \(h-kL^{-\alpha }\le \varphi _x<h-rL^{-\alpha }\), we can deduce that \(j\le L\), hence for some \(i\in \{0,1,\ldots ,j-1\}\) we have that \(d(x_i,x_{i+1})\ge \frac{3\,L^2}{j}\ge 3\,L\). The subpath \(\gamma '\) of \(\gamma \) in between \(x_{i}\) and \(x_{i+1}\) has thus diameter at least \(3L-2\ge L\) and \(\varphi _x \ge h-rL^{-\alpha }\) for every \(x\in \gamma '\). Consider the cluster \(\mathcal {C}_3\) of \(\{\varphi \ge h-rL^{-\alpha }\}\cap D\cap U_N\) containing \(\gamma '\). We will show that \(\mathcal {C}_2\) contains \(\mathcal {C}_3\), which proves the claim. To this end, recall that \(\mathcal {C}_2\) is a cluster of \(\{\varphi +f\ge h\}\cap U_N\). Notice that for every \(x\in D\), \(\varphi _x+f(x)\ge \varphi _x+\min _{u\in D}f(u)>\varphi _x+rL^{-\alpha }\) by (5.6), hence

$$\begin{aligned} \{\varphi \ge h-rL^{-\alpha }\}\cap D\subset \{\varphi +f\ge h\}\cap D. \end{aligned}$$
(5.9)

Since \(\mathcal {C}_2\) and \(\mathcal {C}_3\) overlap at \(\gamma '\), we deduce that \(\mathcal {C}_2\) contains \(\mathcal {C}_3\).

Consider a box \(B_L(w)\in L{{\mathbb {Z}}}^d\) lying in B that intersects \(\mathcal {C}_3\). We will show that \(B_L(w)\) is \((\varphi ,h-rL^{-\alpha })\)-bad, which implies that \(\text {NSLU}(h,\varepsilon +\delta ,N)\) happens, as desired. Recalling the definition of a very dense cluster, we see that \(B_L(w)\) intersects \(\mathcal {C}_1\) as well. Notice that both \(\mathcal {C}_3\cap U_L(w)\) and \(\mathcal {C}_1\cap U_L(w)\) contain a cluster of diameter at least L/5 because both \(\mathcal {C}_3\) and \(\mathcal {C}_1\) have diameter at least L/5. On the other hand, \(\mathcal {C}_3\) is not connected to \(\mathcal {C}_1\) in \(\{\varphi +f\ge h\}\cap D_L(w)\) by (5.3) and the fact that \(\mathcal {C}_3\subset \mathcal {C}_2\). Using (5.9), we can deduce that \(\mathcal {C}_3\) is also not connected to \(\mathcal {C}_1\) in \(\{\varphi \ge h-rL^{-\alpha }\}\cap D_L(w)\). Thus \(B_L(w)\) is \((\varphi ,h-rL^{-\alpha })\)-bad. \(\square \)

Proof of Lemma 5.1

Let us start by constructing a very dense cluster. Let \(R=\left\lceil L/100\right\rceil \) and let F be the set of boxes \(B\in R{{\mathbb {Z}}}^d\) such that D is contained in \(D_N\). If every box in F is \((\varphi ,h+\varepsilon )\)-good, then \(\{\varphi \ge h+\varepsilon \}\cap D_N\) contains a cluster \(\mathcal {C}\) such that \(\mathcal {C}\cap B\) contains a cluster of diameter at least R/5 for every \(B\in F\). This is because for every pair of neighbouring boxes B and \(B'\) in F, both \(\{\varphi \ge h+\varepsilon \}\cap B\) and \(\{\varphi \ge h+\varepsilon \}\cap B'\) contain a cluster of diameter at least R/5, and these two clusters are connected in \(\{\varphi \ge h+\varepsilon \}\cap D\). Provided that N is large enough, it follows that for every box \(B\in L{{\mathbb {Z}}}^d\) contained in \(D_N\), \(\mathcal {C}\cap B\) contains a cluster of diameter at least L/5 and furthermore, \(\mathcal {C}\) has diameter at least N/5. In other words, \(\mathcal {C}\) is a very dense cluster.

If for every \(k\in \{-\left\lceil \varepsilon L^{\alpha }\right\rceil ,\ldots , \left\lceil \varepsilon L^{\alpha } \right\rceil \}\), all boxes of \(L{{\mathbb {Z}}}^d\) contained in \(D_N\) are \((\varphi ,h-rL^{-\alpha })\)-good, then we have strong local uniqueness. Increasing the value of N, if necessary, we can assume that \(h-rL^{-\alpha }<h_*\) for \(k=-\left\lceil \varepsilon L^{\alpha }\right\rceil \), hence for every \(k\in \{-\left\lceil \varepsilon L^{\alpha }\right\rceil ,\ldots , \left\lceil \varepsilon L^{\alpha } \right\rceil \}\) as well. Since we are considering at most \(CM^d\) boxes in total (the boxes of \(L{{\mathbb {Z}}}^d\) contained in \(D_N\) and the boxes of F) and we are considering \(2\left\lceil \varepsilon L^{\alpha }\right\rceil +2\) different level-sets (with the \(h+\varepsilon \) level-set included), we can apply (5.1) to obtain that

$$\begin{aligned} {\mathbb {P}}[\text {NSLU}(h,\varepsilon ,N)]\le (2\left\lceil \varepsilon L^{\alpha }\right\rceil +2)CM^d e^{-cR^{\rho }}\le e^{-c'N^{\rho '}} \end{aligned}$$

for some constants \(c'=c'(h',d)>0\) and \(\rho '=\rho '(d)>0\), as desired. \(\square \)

Proof of Lemma 5.2

We will show that the probability of \(\text {Conf}(h,\varepsilon ,B)\) decays stretched exponentially for every \(B\in L^2{{\mathbb {Z}}}^d\). Then the desired result will follow from a union bound over all \(B\in L^2{{\mathbb {Z}}}^d\) lying in \(D_N\) and the fact that there are polynomially many choices for B.

In order to prove the aforementioned result, consider a subset S of D of cardinality L and an integer \(k\in \{-\left\lceil \varepsilon L^{\alpha }\right\rceil ,\ldots , \left\lceil \varepsilon L^{\alpha } \right\rceil \}\). We will estimate the probability that \(h-kL^{-\alpha }\le \varphi _x < h-rL^{-\alpha }\) for all \(x\in S\) and then apply a union bound over all possible S and k. Let us set \(h_1\,{:}{=}\,h-kL^{-\alpha }\) and \(h_2\,{:}{=}\,h-rL^{-\alpha }\). Choose a subset \(S'\) of S such that for every \(x,y\in S'\) we have \(d(x,y)\ge 2\), and \(S'\) is a maximal subset of S with respect to this property. Then \(|S'|\ge \frac{L}{2d+1}\). Now conditioning on \(\varphi _y\) for \(y\in {{\mathbb {Z}}}^d \setminus S'\), we obtain

$$\begin{aligned} {\mathbb {P}}[h_1\le \varphi _x< h_2, \forall x\in S']={\mathbb {E}}\left[ {\mathbb {P}}\left[ h_1\le \varphi _x< h_2, \forall x\in S'\mid \sigma (\varphi _y, y\in {{\mathbb {Z}}}^d \setminus S')\right] \right] =\\ {\mathbb {E}}\left[ \prod _{x\in S'} {\mathbb {P}}\left[ h_1\le \varphi _x<h_2\mid \sigma (\varphi _y, y\in {{\mathbb {Z}}}^d \setminus S') \right] \right] \le (h_2-h_1)^{|S'|}\le (h_2-h_1)^{\frac{L}{2d+1}}. \end{aligned}$$

For the second equality, we used that conditionally on all \(\varphi _y\), \(y\notin S'\), the random variables \(\varphi _x\), \(x\in S'\), are independent; this follows from the Markov property of the GFF, since no two vertices of \(S'\) are adjacent. For the first inequality, we used that

$$\begin{aligned} {\mathbb {P}}[h_1\le \varphi _x<h_2\mid \sigma (\varphi _y, y\in {{\mathbb {Z}}}^d \setminus S')]\le h_2-h_1 \end{aligned}$$

which follows from the fact that conditionally on \(\sigma (\varphi _y, y\in {{\mathbb {Z}}}^d {\setminus } S')\), \(\varphi _x\) is a normal random variable with variance 1 (the value of the mean is not important), hence its probability density function is bounded by 1.
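The lower bound \(|S'|\ge \frac{L}{2d+1}\) used above can be illustrated with a greedy construction. The following Python sketch is not taken from the paper: the function names and the test set are hypothetical, and the greedy scan is just one way to produce a maximal subset in which no two vertices are equal or adjacent. Each chosen vertex blocks at most \(2d+1\) vertices (itself and its 2d neighbours), and every vertex of S ends up blocked, which gives the bound.

```python
from itertools import product


def greedy_separated_subset(points):
    """Greedily select lattice points so that any two selected points
    are at graph distance at least 2 (neither equal nor adjacent)."""
    chosen = []
    blocked = set()  # selected points together with their 2d neighbours
    for p in points:
        if p in blocked:
            continue
        chosen.append(p)
        blocked.add(p)
        for axis in range(len(p)):
            for step in (-1, 1):
                q = list(p)
                q[axis] += step
                blocked.add(tuple(q))
    return chosen


def l1_distance(p, q):
    """Graph distance on Z^d between two lattice points."""
    return sum(abs(a - b) for a, b in zip(p, q))


# Hypothetical example: S is a 4 x 4 x 4 block in Z^3 (d = 3).
d = 3
S = list(product(range(4), repeat=d))
S_prime = greedy_separated_subset(S)
```

By construction every vertex of S that is not selected is adjacent to a selected one, so the output is maximal with respect to the separation property.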

On the other hand, D contains \((7L^2)^d\) vertices, hence there are at most \((7L^2)^{dL}\) possible subsets of D of cardinality L. A union bound over the \(2\left\lceil \varepsilon L^{\alpha } \right\rceil +1\) possible values of k and the subsets of D of cardinality L implies that

$$\begin{aligned} {\mathbb {P}}[\text {Conf}(h,\varepsilon ,B)]\le \left( 2\left\lceil \varepsilon L^{\alpha } \right\rceil +1 \right) (7L^2)^{dL} (h_2-h_1)^{\frac{L}{2d+1}}. \end{aligned}$$

By our choice of \(\alpha \),

$$\begin{aligned} \left( 2\left\lceil \varepsilon L^{\alpha } \right\rceil +1 \right) (7L^2)^{dL}(h_2-h_1)^{\frac{L}{2d+1}}&=\exp \left\{ 2dL\log (L)+\frac{-\alpha }{2d+1}L\log (L)+O(L)\right\} \\&= \exp \left\{ -L\log (L)+O(L)\right\} . \end{aligned}$$
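As a sanity check on the exponent, the three factors can be expanded separately. The following is a sketch in which the implied constants in the \(O(\cdot )\) terms are allowed to depend on d, \(\varepsilon \) and \(C'\); it uses \(h_2-h_1=(k-r)L^{-\alpha }=(1+7dC'\varepsilon )L^{-\alpha }\) and \(\alpha =(2d+1)^2\):

$$\begin{aligned} (7L^2)^{dL}&=\exp \left\{ 2dL\log (L)+dL\log (7)\right\} ,\\ (h_2-h_1)^{\frac{L}{2d+1}}&=\exp \left\{ -\frac{\alpha }{2d+1}L\log (L)+O(L)\right\} =\exp \left\{ -(2d+1)L\log (L)+O(L)\right\} ,\\ 2\left\lceil \varepsilon L^{\alpha } \right\rceil +1&=\exp \left\{ O(\log (L))\right\} . \end{aligned}$$

Multiplying the three lines, the coefficient of \(L\log (L)\) is \(2d-(2d+1)=-1\).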

This completes the proof. \(\square \)

6 Decay of Very-Badness

In this section, we will prove Proposition 3.5. First, we need to express the event that B is \((\psi ,h,\varepsilon )\)-very-bad in terms of \(\varphi \) and \(\xi \).

We say that a box B is \((\varphi ,h,\varepsilon )\)-very-good if for every function \(g:{\overline{D}}\rightarrow {{\mathbb {R}}}\) which is harmonic in D and \(|g(x)|<\varepsilon \) for all \(x\in D\), the following happen:

  • for every \(B'\) which is either B or some neighbour of B, \(\{\varphi +g\ge h\}\cap B'\) contains a dense cluster,

  • for every neighbour \(B''\) of B and every pair of dense clusters of \(\{\varphi +g\ge h\}\cap B\) and \(\{\varphi +g\ge h\}\cap B''\), respectively, there is a path in \(\{\varphi +g\ge h\}\cap D\) visiting both dense clusters.

If B is not \((\varphi ,h,\varepsilon )\)-very-good, we will call it \((\varphi ,h,\varepsilon )\)-very-bad.

We shall now introduce another event that will be used to handle the non-uniqueness of a dense cluster. We define \(H(h,\varepsilon ,B)\) to be the event that there are

  • a function \(g:{\overline{D}}\rightarrow {{\mathbb {R}}}\) which is harmonic in D and \(|g(x)|<\varepsilon \) for all \(x\in D\), and

  • a pair \(\mathcal {C}_1,\mathcal {C}_2\) of clusters of \(\{\varphi +g\ge h\}\cap U\) of diameter at least L/5,

for which there is no path in \(\{\varphi +g\ge h\}\cap D\) connecting \(\mathcal {C}_1\) with \(\mathcal {C}_2\). It is not hard to see that if \(H(h,\varepsilon ,B)\) happens and B is \((\xi ,\delta )\)-good for some \(\delta >0\), then B is \((\psi ,h,\varepsilon +\delta )\)-bad.

Recall that the definition of a dense cluster involves considering the boxes of \(L_0{{\mathbb {Z}}}^d\) that are contained in \(B_L\). In order to construct a dense cluster, we will need to work with the columns of this collection of \(L_0\)-boxes. To define them precisely, let \(\{e_1,e_2,\ldots ,e_d\}\) be the standard basis of \({{\mathbb {Z}}}^d\). Given a collection \(\mathcal {F}\) of boxes of \(R{{\mathbb {Z}}}^d\) for some \(R\ge 1\), the columns of \(\mathcal {F}\) parallel to \(e_i\), \(i\in \{1,2,\ldots ,d\}\) are defined as follows. For every sequence of integers \((y_j)_{j=1,j\ne i}^d\), the set of boxes \(B_R(z)\in \mathcal {F}\), \(z=(z_1,z_2,\ldots ,z_d)\) with \(z_j=y_j\) for every \(j\ne i\), will be called a column of \(\mathcal {F}\) parallel to \(e_i\).

We will now prove Proposition 3.5.

Proof of Proposition 3.5

Notice that if \(B_L\) is \((\psi ,h,\varepsilon )\)-very-bad and \((\xi ,\delta )\)-good for some \(\delta >0\), then it is \((\varphi ,h,\varepsilon +\delta )\)-very-bad. Applying this observation for \(\delta =(h_*-h'-\varepsilon )/2\) and using Lemma 3.1 to handle the case that \(B_L\) is \((\xi ,\delta )\)-bad, we see that (after redefining \(\varepsilon \)) it suffices to prove that for every \(h'<h_*\) and \(0<\varepsilon <h_*-h'\) there is a constant \(c=c(h',\varepsilon ,d)>0\) such that for every \(h\le h'\)

$$\begin{aligned} {\mathbb {P}}[B_L \text { is } (\varphi ,h,\varepsilon )\text {-very-bad}]\le e^{-c L^{d-2}}. \end{aligned}$$

Recall that \(L_0= \left\lceil L/M \right\rceil \), where \(M=\left\lfloor L^{\frac{d-2}{d-1}}/\log (L)\right\rfloor \). For simplicity, we will assume that M divides L, so that L/M is an integer. The general case can be treated similarly.
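To get a feeling for the two scales, the definitions of M and \(L_0\) can be evaluated numerically. The Python sketch below is only illustrative: the function name `mesoscopic_scales` and the sample value \(L=10^6\), \(d=3\) are assumptions made here, and the ceiling is kept in \(L_0\) because the sample M does not divide L.

```python
import math


def mesoscopic_scales(L, d):
    """Return (M, L0) with M = floor(L^((d-2)/(d-1)) / log L) and L0 = ceil(L / M)."""
    if d < 3 or L < 3:
        raise ValueError("expected d >= 3 and L >= 3")
    M = math.floor(L ** ((d - 2) / (d - 1)) / math.log(L))
    if M < 1:
        raise ValueError("L is too small for M to be positive")
    L0 = math.ceil(L / M)
    return M, L0


# Hypothetical sample values, chosen only for illustration.
L, d = 10**6, 3
M, L0 = mesoscopic_scales(L, d)
```

For these sample values the number of columns \(M^{d-1}\) is indeed below \(L^{d-2}/\log ^{d-1}(L)\), in line with (6.1).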

We will first focus on the existence of a dense cluster. Consider the boxes of \(L_0{{\mathbb {Z}}}^d\) contained in \(U_L\) and notice that they form a partition of \(U_L\). We will show that when only a few columns of this partition contain a \((\varphi ,h+\varepsilon )\)-bad box, \(\{\varphi \ge h+\varepsilon \}\cap B'_L\) contains a dense cluster, where \(B'_L\) is either \(B_L\) or a neighbouring box of \(B_L\). The latter easily implies that \(\{\varphi +g\ge h\}\cap B'_L\) contains a dense cluster for every function \(g:\overline{D_L}\rightarrow {{\mathbb {R}}}\) which is harmonic in \(D_L\) and \(|g(x)|<\varepsilon \) for all \(x\in D_L\). Then we will proceed to show that the probability of having many columns that contain a \((\varphi ,h+\varepsilon )\)-bad box decays exponentially in \(L^{d-2}\).

Among the columns of the partition of \(U_L\) that are parallel to \(e_i\), \(i=1,2,\ldots ,d\), consider those that contain a box which is \((\varphi ,h+\varepsilon )\)-bad. We let \(\Phi _i\) be the event that there are at least \(\frac{M^{d-1}}{10(2d-1)!}\) such columns. When the event \(\bigcup _{i=1}^d \Phi _i\) does not happen, we will show that \(\{\varphi \ge h+\varepsilon \}\cap B'_L\) contains a dense cluster. To this end, since the dense cluster needs to lie in \(B'_L\), we need to restrict to the collection of boxes \(B_{L_0}(z)\) such that \(D_{L_0}(z)\) is contained in \(B'_L\). Let us assume that L is large enough so that this collection is non-empty. This collection forms a partition of a smaller box \(B''_L\) that is contained in \(B'_L\). Notice that the number of boxes in each column of \(B''_L\) is \(M-6\). Let \(\Gamma \) be the set of boxes of the partition of \(B''_L\) that are \((\varphi ,h+\varepsilon )\)-good. Then for every \(e_i\), \(\Gamma \) contains at least

$$\begin{aligned} (M-6)^{d-1}-\frac{M^{d-1}}{10(2d-1)!}\ge \left( 1-\frac{1}{5(2d-1)!}\right) (M-6)^{d-1} \end{aligned}$$

columns parallel to \(e_i\), provided that L is large enough so that

$$\begin{aligned} \frac{M^{d-1}}{2}\le (M-6)^{d-1}. \end{aligned}$$
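The lower bound on the number of columns follows from this condition by a one-line computation, sketched here for convenience:

$$\begin{aligned} \frac{M^{d-1}}{10(2d-1)!}\le \frac{2(M-6)^{d-1}}{10(2d-1)!}=\frac{(M-6)^{d-1}}{5(2d-1)!} \quad \Longrightarrow \quad (M-6)^{d-1}-\frac{M^{d-1}}{10(2d-1)!}\ge \left( 1-\frac{1}{5(2d-1)!}\right) (M-6)^{d-1}. \end{aligned}$$

In the notation of Lemma 6.1 this corresponds to \(x=\frac{1}{5(2d-1)!}\), for which \(1-(2d-1)!x=\frac{4}{5}\).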

By Lemma 6.1 below, a connected component \(\mathcal {F}\) of \(\Gamma \) contains at least \(\frac{4}{5}(M-6)^d\) boxes. Increasing the value of L, if necessary, we can assume that \(\frac{4}{5}(M-6)^d\ge \frac{3}{4}M^d\), so that \(\mathcal {F}\) contains at least \(\frac{3}{4}M^d\) boxes. For each pair of neighbouring boxes \(B=B_{L_0}(z)\), \(B'\) in \(\mathcal {F}\), both \(\{\varphi \ge h+\varepsilon \}\cap B\) and \(\{\varphi \ge h+\varepsilon \}\cap B'\) contain a cluster of diameter at least \(L_0/5\), hence there is a path in \(\{\varphi \ge h+\varepsilon \}\cap D\) visiting both clusters, where \(D=D_{L_0}(z)\). By combining all these clusters, we obtain that \(\{\varphi \ge h+\varepsilon \}\cap B'_L\) contains a cluster \(\mathcal {C}\) visiting all boxes of \(\mathcal {F}\).

To show that \(\mathcal {C}\) is dense, it remains to estimate its diameter. Since \(\mathcal {F}\) contains at least \(\frac{3}{4}M^d\) boxes, it must intersect a column \(\mathrm {Col}(B''_L,\Gamma )\) of \(B''_L\) which is contained entirely in \(\Gamma \). Since \(\mathcal {F}\) is a connected component of \(\Gamma \), it must contain \(\mathrm {Col}(B''_L,\Gamma )\) entirely. In other words, \(\mathcal {C}\) contains a vertex from the first and the last box of \(\mathrm {Col}(B''_L,\Gamma )\), which implies that \(\mathcal {C}\) has diameter at least \((M-8)L_0\). We have that \((M-8)L_0=L-8L_0\ge L/5\), provided that L is large enough. Thus \(\mathcal {C}\) is a dense cluster.

We will now estimate \({\mathbb {P}}[\bigcup _{i=1}^d \Phi _i]\). To this end, let \(i\in \{1,2,\ldots ,d\}\) and let S be a set of \(\frac{M^{d-1}}{10(2d-1)!}\) boxes that lie in different columns of \(U_L\) parallel to \(e_i\). We will first count the possibilities for S and then estimate the probability that for a fixed S as above, all its boxes are \((\varphi ,h+\varepsilon )\)-bad. Notice that there are \(A^{d-1}\) columns parallel to \(e_i\), where \(A=3M\), and each column contains A boxes. Hence there are at most

$$\begin{aligned} 2^{A^{d-1}}A^{A^{d-1}}=\exp \left\{ A^{d-1}\log (2A)\right\} \le \exp \left\{ C\frac{L^{d-2}}{\log ^{d-2}(L)}\right\} \end{aligned}$$

possibilities for S, since we can construct S by first choosing a set of columns and then picking a box from each column of this set.
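The last inequality in the counting bound can be checked directly from \(M\le L^{\frac{d-2}{d-1}}/\log (L)\). The following sketch assumes \(L\ge 6\) and produces the explicit (non-optimised) constant \(C=2\cdot 3^{d-1}\):

$$\begin{aligned} A^{d-1}=3^{d-1}M^{d-1}\le \frac{3^{d-1}L^{d-2}}{\log ^{d-1}(L)}, \qquad \log (2A)\le \log (6L)\le 2\log (L) \quad \Longrightarrow \quad A^{d-1}\log (2A)\le 2\cdot 3^{d-1}\frac{L^{d-2}}{\log ^{d-2}(L)}. \end{aligned}$$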

Moving on to the probabilistic estimate, let \(S'\) be a subset of S which is well-separated and is maximal with respect to this property. Then it is not hard to see that \(|S'|\ge 201^{-d} |S|\). Let \(\varepsilon _0=(h_*-h'-\varepsilon )/2\). We will now consider two cases. Either at least \(|S'|/2\) boxes of \(S'\) are \((\xi ,\varepsilon _0)\)-good or at least \(|S'|/2\) boxes of \(S'\) are \((\xi ,\varepsilon _0)\)-bad. In the first case, because we have assumed that all boxes of S are \((\varphi ,h+\varepsilon )\)-bad, we can deduce that at least \(|S'|/2\) boxes of \(S'\) are \((\psi ,h+\varepsilon ,\varepsilon _0)\)-bad. Applying Proposition 3.4 and using a union bound over the subsets of \(S'\) we obtain

$$\begin{aligned} {\mathbb {P}}[\text {at least } |S'|/2 \text { boxes of } S' \text { are } (\psi ,h+\varepsilon ,\varepsilon _0)\text {-bad}]\le 2^{|S'|}\exp \{-c_1|S'|L_0^{\rho }/2\}\le \exp \{-c_2L^{d-2}\}, \end{aligned}$$

where in the last inequality we used that

$$\begin{aligned} |S'|\le M^{d-1} \le \frac{L^{d-2}}{\log ^{d-1}(L)} \end{aligned}$$
(6.1)

and

$$\begin{aligned} |S'|L_0^{\rho }\ge c_3 \frac{L^{d-2}L_0^{\rho }}{\log ^{d-1}(L)}. \end{aligned}$$
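To see why these two bounds suffice, one can compare the growth of \(L_0^{\rho }\) with the logarithmic loss. The following is a sketch under the assumption that L is large; it uses \(L_0\ge L/M\ge L^{\frac{1}{d-1}}\log (L)\):

$$\begin{aligned} L_0^{\rho }\ge L^{\frac{\rho }{d-1}} \quad \Longrightarrow \quad \frac{c_1}{2}|S'|L_0^{\rho }\ge \frac{c_1c_3}{2}L^{d-2}\,\frac{L^{\frac{\rho }{d-1}}}{\log ^{d-1}(L)}, \qquad |S'|\log (2)\le \frac{\log (2)L^{d-2}}{\log ^{d-1}(L)}. \end{aligned}$$

Since \(L^{\frac{\rho }{d-1}}/\log ^{d-1}(L)\rightarrow \infty \) while \(1/\log ^{d-1}(L)\rightarrow 0\), the exponent \(|S'|\log (2)-c_1|S'|L_0^{\rho }/2\) is at most \(-c_2L^{d-2}\) once L is large enough.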

On the other hand, if the second case holds, we can argue as follows. Let T be the set of boxes of \(S'\) that are \((\xi ,\varepsilon _0)\)-bad. Applying Lemma 6.2 below, we see that \(\mathrm {cap}(\Sigma (T))\ge rL^{d-2}\) for some constant \(r>0\). We shall now apply Lemma 3.1 and for this reason we need to check that \(|T|\le \delta rL^{d-2}\), where \(\delta \) is the constant of Lemma 3.1. This inequality follows from (6.1) by choosing L to be large enough. Hence a union bound over the subsets of \(S'\) implies that

$$\begin{aligned} {\mathbb {P}}[\text {at least } |S'|/2 \text { boxes of } S' \text { are } (\xi ,\varepsilon _0)\text {-bad}]\le 2^{|S'|}\exp \{-cr\varepsilon _0^2 L^{d-2}\}\le \exp \{-c_4 L^{d-2}\}. \end{aligned}$$

Overall, we obtain that

$$\begin{aligned} {\mathbb {P}}[\cup _{i=1}^d \Phi _i]\le \exp \left\{ C\frac{L^{d-2}}{\log ^{d-2}(L)}\right\} \left( \exp \{-c_2L^{d-2}\}+\exp \{-c_4L^{d-2}\}\right) \le \exp \{-c_5L^{d-2}\}. \end{aligned}$$

Let \(B'_L\) be a neighbouring box of \(B_L\). We shall now consider the event that for some function \(g:{\overline{D}}\rightarrow {{\mathbb {R}}}\) which is harmonic in D and \(|g(x)|<\varepsilon \) for all \(x\in D\), and a pair \(\mathcal {C}_1\), \(\mathcal {C}_2\) of dense clusters of \(\{\varphi +g\ge h\}\cap B_L\) and \(\{\varphi +g\ge h\}\cap B'_L\), respectively, there is no path in \(\{\varphi +g\ge h\}\cap D_L\) connecting \(\mathcal {C}_1\) to \(\mathcal {C}_2\). Let \(i\in \{1,2,\ldots ,d\}\) be such that \(B'_L=B_L\pm Le_i\). Notice that for every \(L_0\)-box B intersecting \(\mathcal {C}_j\), \(j=1,2\), \(\mathcal {C}_j\cap U\) contains a cluster of diameter at least \(L_0/5\). Hence each column of \(B_L\cup B'_L\) parallel to \(e_i\) that intersects both \(\mathcal {C}_1\) and \(\mathcal {C}_2\) must contain a box \(B\in L_0{{\mathbb {Z}}}^d\) such that \(H(h,\varepsilon ,B)\) happens, since otherwise, \(\mathcal {C}_1\) and \(\mathcal {C}_2\) are connected in \(\{\varphi +g\ge h\}\cap D_L\). We will show that many columns are intersected by both clusters. Indeed, it follows from the definition of a dense cluster that each of \(\mathcal {C}_1\) and \(\mathcal {C}_2\) intersects at least \(3M^{d-1}/4\) columns parallel to \(e_i\). In particular, at least \(M^{d-1}/2\) columns parallel to \(e_i\) are intersected by both \(\mathcal {C}_1\) and \(\mathcal {C}_2\), and all of them contain a box \(B\in L_0{{\mathbb {Z}}}^d\) such that \(H(h,\varepsilon ,B)\) happens.

To estimate the probability of the event that at least \(M^{d-1}/2\) columns contain a box \(B\in L_0{{\mathbb {Z}}}^d\) such that \(H(h,\varepsilon ,B)\) happens, we consider two cases. Either at least \(M^{d-1}/4\) columns contain a \((\xi ,\varepsilon _0)\)-bad box or at least \(M^{d-1}/4\) columns contain a box B such that \(H(h,\varepsilon ,B)\) happens and B is \((\xi ,\varepsilon _0)\)-good. When \(H(h,\varepsilon ,B)\) happens and B is \((\xi ,\varepsilon _0)\)-good, B is \((\psi ,h,\varepsilon +\varepsilon _0)\)-bad. In both cases, we can argue as above to obtain the desired decay. \(\square \)

We will now prove the two lemmas mentioned above. In what follows, columns refer to the usual lines of \({{\mathbb {Z}}}^d\) (of width 0, as opposed to the unions of boxes considered above).

Lemma 6.1

Let \(d\ge 2\) and \(0<x<\frac{1}{(2d-1)!}\). Consider a subset \(\Gamma \) of \(B_L\) such that for every direction \(e_i\), \(\Gamma \) contains at least \((1-x)L^{d-1}\) columns of \(B_L\) parallel to \(e_i\). Then \(\Gamma \) contains a connected set of size at least \((1-(2d-1)!x)L^d\).

Proof

We will prove inductively on the dimension that the statement of the lemma holds for all \(\Gamma \) (as in the statement) and all \(0<x<\frac{1}{(2d-1)!}\).

For \(d=2\), the statement holds because any pair of vertical and horizontal columns shares a common vertex. Let us assume that it holds for some \(d\ge 2\). We will prove it for \(d+1\). Consider a subset \(\Gamma \) of the \((d+1)\)-dimensional box \(B_L\) as in the statement of the lemma. For each \(i=2,3,\ldots ,d+1\), let \(m_i\) be the number of \(k\in \{0,1,\ldots ,L-1\}\) such that \(\Gamma \cap \left( \{k\}\times [0,L)^d\right) \) contains at most \((1-y)L^{d-1}\) columns of the d-dimensional box \(\{k\}\times [0,L)^d\) parallel to \(e_i\), where \(y=2dx\). Notice that \(\Gamma \) contains at most

$$\begin{aligned} m_i(1-y)L^{d-1}+(L-m_i)L^{d-1}=L^d-m_iyL^{d-1} \end{aligned}$$

columns parallel to \(e_i\), because for the remaining \(L-m_i\) elements of the set \(\{0,1,\ldots ,L-1\}\), \(\Gamma \cap \left( \{k\}\times [0,L)^d\right) \) contains at most \(L^{d-1}\) columns. Hence \(L^d-m_iyL^{d-1}\ge (1-x)L^d\), which implies that \(m_i\le \frac{Lx}{y}=\frac{L}{2d}\).
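Spelled out, the bound on \(m_i\) and its consequence for the number of remaining slices read as follows (a short sketch using only \(y=2dx\) and the fact that i ranges over the d values \(2,3,\ldots ,d+1\)):

$$\begin{aligned} L^d-m_iyL^{d-1}\ge (1-x)L^d \quad \Longleftrightarrow \quad m_i\le \frac{Lx}{y}=\frac{L}{2d}, \qquad \text {hence} \qquad L-\sum _{i=2}^{d+1} m_i\ge L-d\cdot \frac{L}{2d}=\frac{L}{2}. \end{aligned}$$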

Consider one of the remaining \(L-\sum _{i=2}^{d+1} m_i\ge L/2\) sets \(\Gamma \cap \left( \{k\}\times [0,L)^d\right) \) and notice that it satisfies the assumption of our inductive hypothesis for x replaced by y. Hence \(\Gamma \cap \left( \{k\}\times [0,L)^d\right) \) contains a connected set S of size at least \((1-z)L^d\), where \(z=(2d-1)!y=(2d)!x\). To find a connected set of the desired cardinality, consider some \(l\ne k\), \(l\in \{0,1,\ldots ,L-1\}\) and notice that among the at least \((1-x)L^d\) columns of \(B_L\) parallel to \(e_l\) that lie in \(\Gamma \), S meets at least \((1-x-z)L^d\ge (1-(2d+1)!x)L^d\) of them. The union of S with the columns that it meets forms a connected set of size at least \((1-(2d+1)!x)L^{d+1}\). This completes the proof. \(\square \)
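The base case \(d=2\) of Lemma 6.1 can be checked by brute force on small examples. The Python sketch below is an illustration only and not part of the proof: the helper names, the random construction of \(\Gamma \) and the sample values \(L=20\), \(x=0.1\) are all assumptions made here. It builds a set containing at least \((1-x)L\) full lines in each of the two directions, computes its largest nearest-neighbour connected component by breadth-first search, and compares it with the bound \((1-(2d-1)!x)L^d=(1-6x)L^2\).

```python
import math
import random
from collections import deque


def largest_component_size(cells):
    """Size of the largest nearest-neighbour connected component of a finite subset of Z^2."""
    cells = set(cells)
    seen = set()
    best = 0
    for start in cells:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            a, b = queue.popleft()
            size += 1
            for nb in ((a + 1, b), (a - 1, b), (a, b + 1), (a, b - 1)):
                if nb in cells and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        best = max(best, size)
    return best


def random_gamma(L, x, rng):
    """Random subset of [0, L)^2 containing at least (1 - x) L full lines in each direction."""
    n_full = math.ceil((1 - x) * L)
    rows = rng.sample(range(L), n_full)
    cols = rng.sample(range(L), n_full)
    gamma = {(r, c) for r in rows for c in range(L)}
    gamma |= {(r, c) for c in cols for r in range(L)}
    # A few extra vertices, which may or may not attach to the main component.
    gamma |= {(rng.randrange(L), rng.randrange(L)) for _ in range(L)}
    return gamma


# Hypothetical sample values; (2d - 1)! = 3! = 6 for d = 2.
L, x = 20, 0.1
gamma = random_gamma(L, x, random.Random(0))
bound = (1 - 6 * x) * L**2
largest = largest_component_size(gamma)
```

In two dimensions every full row meets every full column, so the union of the full lines is connected and already has \(L^2-(L-a)(L-b)\) vertices when there are a full rows and b full columns; this is far above the bound of the lemma, which is not sharp in this case.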

Lemma 6.2

For every \(0<r<1\), there is \(c=c(r,d)>0\) such that the following holds. For every \(\Gamma \subset B_L\) that contains one vertex from at least \(rL^{d-1}\) columns parallel to \(e_1\), \(\mathrm {cap}(\Gamma )\ge cL^{d-2}\).

Proof

We will assume without loss of generality that any column parallel to \(e_1\) intersects \(\Gamma \) at 0 or exactly 1 vertex. Let \(F_1\) be the face of \(B_L\) intersecting all columns of \(B_L\) parallel to \(e_1\) and let \(\Gamma '\) be the projection of \(\Gamma \) to \(F_1\). We claim that \(\mathrm {cap}(\Gamma )\ge t \mathrm {cap}(\Gamma ')\) for some constant \(t=t(d)>0\). Indeed, recall the variational characterization of the capacity (2.4). Let \(\nu '\) be the probability measure supported on \(\Gamma '\) such that \(\mathrm {cap}(\Gamma ')=E(\nu ')^{-1}\) and define \(\nu \) to be the probability measure supported on \(\Gamma \) such that \(\nu (x)=\nu '(x')\), where \(x'\) is the projection of x to \(F_1\). Then \(\mathrm {cap}(\Gamma )\ge E(\nu )^{-1}\). Notice that by projecting \(\Gamma \) onto \(F_1\), the distance between its vertices decreases. Since the Green’s function \(g(x,y)\) is asymptotically decreasing in the distance \(\Vert x-y\Vert \), we have \(E(\nu ')\ge t(d)E(\nu )\) and the claim follows.

We will now lower bound the capacity of \(\Gamma '\) by applying (2.5). To this end, notice that \(\Gamma '\) contains at least \(rL^{d-1}\) vertices and consider some vertex \(x\in \Gamma '\). Since the number of vertices in \(F_1\) that are at \(\Vert \cdot \Vert _{\infty }\)-distance k from x is of order \(k^{d-2}\), it is not hard to see that there are constants \(t_1=t_1(d,r)>0\) and \(t_2=t_2(d,r)>0\) such that for at least \(t_1L\) values of \(k\in \{0,1,\ldots ,L-1\}\), \(\Gamma '\) contains at least \(t_2 k^{d-2}\) vertices at distance k from x. The desired lower bound on \(\mathrm {cap}(\Gamma ')\) follows now from (2.5). \(\square \)