1 Introduction

1.1 Metastability in the Ising model

Consider Glauber dynamics for the supercritical Ising model on the hypercubic lattice (\(d\ge 2\)) started in the minus configuration but with a positive external magnetic field \(h\). Aizenman and Lebowitz predicted that the model initially settles in a metastable minus phase, eventually relaxing to the plus phase on a time scale that grows exponentially with \(1/h^{d-1}\) as \(h\rightarrow 0\) [1].

To be more precise, let \(\beta \) denote the inverse-temperature and let \(\beta _\mathrm{c}\) denote the critical inverse-temperature. Suppose \(\beta >\beta _\mathrm{c}\). Let \(\mu ^+,\mu ^-\) denote the plus and minus phases of the equilibrium Ising model. Start the Glauber dynamics at time \(0\) with all vertices initially taking minus spin. Let \(\sigma ^-_{t}\) denote the state of the Glauber dynamics at time \(t\). With \(\beta \) fixed, let \(h \rightarrow 0\) with \(t=\exp (\lambda /h^{d-1})\). A heuristic argument suggests that if \(\lambda _1\) is sufficiently small and \(\lambda _2\) is sufficiently large then for every local observable \(f\):

  (i) \(\mathbb{E }[f(\sigma ^{-}_{t})]\rightarrow \mu ^-(f)\) if \(\lambda < \lambda _1\).

  (ii) \(\mathbb{E }[f(\sigma ^{-}_{t})]\rightarrow \mu ^+(f)\) if \(\lambda > \lambda _2\).

Part (i) is a lower bound on the relaxation time and part (ii) is an upper bound. Schonmann [21] proved this behavior in dimensions \(d\ge 2\). However, his proof left open the question of whether or not \(\lambda _1=\lambda _2\). Schonmann and Shlosman [22] settled this question in dimension two, proving that the above holds with \(\lambda _1=\lambda _2\); the transition is sharp in a logarithmic sense. Their proof refines the heuristic argument and shows that the critical value of \(\lambda \) is a simple function of the cost of creating a so-called critical droplet, which in turn is a function of the surface tension of the Wulff shape. The proof takes advantage of specific features of the two dimensional Ising model such as duality. When considering disordered models, in two and higher dimensions, these simplifying features no longer exist. New arguments from the \(L_1\)-theory of phase coexistence have to be used instead.

The focus of this paper will be the dilute Ising model. For the purpose of comparison, we note that the proof of our main result (Theorem 1.2.1 below) implies that the upper bound of [22] for the undiluted Ising model extends to higher dimensions. To avoid certain technicalities we assume that \(\beta \in (\beta _0,\infty )\setminus \mathcal{N }\) (see Sect. 1.2).

Let \(\mu ^h\) denote the equilibrium, undiluted Ising measure with a magnetic field \(h>0\). With reference to (4.4), let \(\mathsf{{E}}^\mathrm{undil}_\mathrm{c}\) denote the quantity defined by taking \(p=1\) and \(\theta =2\pi \); the significance of these choices will be explained shortly. The cost of creating a critical droplet under \(\mu ^h\) is \(\mathsf{{E}}^\mathrm{undil}_\mathrm{c}/h^{d-1}\) and \(\mathsf{E}^\mathrm{undil}_\mathrm{c}=\mathrm{{O}}(\beta )\) as \(\beta \rightarrow \infty \). Define a local observable to be a function that only depends on the spins in the region \([-1/h,1/h]^d\).

Theorem 1.1.1

Consider the value

$$\begin{aligned} \lambda _2^\mathrm{undil}=\frac{\mathsf{E}^\mathrm{undil}_\mathrm{c}}{d+1}. \end{aligned}$$

Let \(\lambda >\lambda _2^\mathrm{undil}\). For every number \(C_0>0\) there is a constant \(C>0\) such that for \(h>0\), for every local observable \(f\),

$$\begin{aligned} \left|\mathbb{E }\left( f\left(\sigma ^{-}_{\exp (\lambda /h^{d-1})}\right) \right) - \mu ^h (f) \right| \le C \exp (-C_0/h). \end{aligned}$$

This is an improvement on the upper bound in [21] and corresponds to the upper bound predicted by the heuristic of [22]. Proving rigorously the lower bound suggested by [22] in dimensions three and higher requires the development of new arguments which we postpone to a future work.

The dilute Ising model is a variant of the Ising model obtained by randomizing the edge coupling strengths. The impact of dilution on the relaxation of Glauber dynamics has been studied in [10, 19]. In the Griffiths phase, which corresponds to the sub-critical regime, the disorder is proven to lead to a slowdown of the dynamics. In the phase transition regime, metastability has been investigated for the random field Curie-Weiss model [5].

We will consider the Ising model on \(\mathbb{Z }^d\) diluted in the simplest way possible. Let \(p\in (0,1)\). Independently, delete each edge with probability \(1-p\). When \(p\) is sufficiently large (larger than the percolation threshold for \(\mathbb{Z }^d\)), the remaining edges form a supercritical percolation cluster (and an infinite number of finite clusters). From this point of view, the undiluted Ising model is a limiting case of the dilute Ising model corresponding to \(p=1\). It is natural to ask how the relaxation time depends on \(p\). In this paper we show that even a small dilution can greatly reduce the relaxation time.

1.2 The dilute Ising model

Let \(\mathbb{Z }^d\) represent the hypercubic lattice. The Ising model assigns each site of \(\mathbb{Z }^d\) a spin of \(\pm 1\). Let \(\Sigma =\{\pm 1\}^{\mathbb{Z }^d}\) denote the set of Ising configurations.

Let \(E=\{\{x,y\}:\Vert x-y\Vert _1=1\}\) denote the set of nearest neighbor edges of \(\mathbb{Z }^d\). The equilibrium Ising measure with local coupling strengths \(J=(J(e):e\in E)\) and external magnetic field \(h\) is defined using the formal Hamiltonian

$$\begin{aligned} -\frac{1}{2} \sum _{e = \{x,y\} \in E} J(e) \sigma (x) \sigma (y) - \frac{1}{2} \sum _{x\in \mathbb{Z }^d} h \sigma (x),\quad \quad \sigma \in \Sigma . \end{aligned}$$
(1.1)

We will consider local coupling strengths with the Bernoulli distribution. Let \(\mathbb{Q }\) denote the product measure such that for each edge \(e, \mathbb{Q }[J(e)=1]=p\) and \(\mathbb{Q }[J(e)=0]=1-p\).
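For illustration only, the dilution can be sampled on a finite box as in the following sketch. It is not part of the argument: the function name `sample_couplings`, the box \(\{0,\dots ,L-1\}^d\) and the use of Python's `random` module are choices made here.

```python
import itertools
import random


def sample_couplings(L, d, p, seed=None):
    """Sample i.i.d. Bernoulli(p) couplings J(e) on the nearest-neighbour
    edges of the box {0,...,L-1}^d (hypothetical helper, illustration only).

    Returns a dict mapping each edge (x, y), with y = x + e_k, to 0 or 1.
    """
    rng = random.Random(seed)
    couplings = {}
    for x in itertools.product(range(L), repeat=d):
        for k in range(d):
            if x[k] + 1 < L:
                y = x[:k] + (x[k] + 1,) + x[k + 1:]
                # J(e) = 1 with probability p; J(e) = 0 means the edge is deleted.
                couplings[(x, y)] = 1 if rng.random() < p else 0
    return couplings
```

Taking `p = 1.0` keeps every edge, which corresponds to the undiluted model on the box.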

It is well known that when \(h\not =0\), the Ising measure \(\mu ^{J,h}\) is well defined by the Gibbs formalism for any inverse-temperature \(\beta > 0\) and local coupling strengths \(J\ge 0\) [11, 15]. Consider the spontaneous magnetization of the Ising measure,

$$\begin{aligned} m^* = \lim _{h\rightarrow 0+} \mathbb{Q }\left[\mu ^{J,h} \left( \sigma (0) \right)\right]\!. \end{aligned}$$
(1.2)

When \(\beta \) is sufficiently large, \(m^* > 0\) and there is said to be phase coexistence. For such \(\beta \) there are two different Gibbs measures at \(h=0\), corresponding to the limits \(h\rightarrow 0+\) and \(h\rightarrow 0-\). The critical inverse-temperature is

$$\begin{aligned} \beta _\mathrm{c}(p)=\inf \left\{ \beta > 0 : m^* > 0 \right\} \!. \end{aligned}$$

For notational simplicity we will drop the explicit dependence on \(p\). The critical value \(\beta _\mathrm{c}\) is finite if and only if \(p > p_\mathrm{c}\), where \(p_\mathrm{c}\) is the threshold for bond percolation on \((\mathbb{Z }^d,E)\); \(\beta _\mathrm{c}\) diverges as \(p\rightarrow p_\mathrm{c}\) and \(\beta _\mathrm{c}\) tends to the undiluted value as \(p\rightarrow 1\).

As well as defining the equilibrium Ising model, the formal Hamiltonian defines a dynamic model. Let \((\sigma ^{-}_t)_{t\ge 0}\) denote a Markov chain on the set \(\Sigma \) of Ising configurations, starting at time \(0\) with minus spins everywhere, and evolving with time according to Glauber dynamics. Given a set of coupling strengths \(J\), let \(\mathbb{E }_J\) denote expectation with respect to the Glauber dynamics. Our results extend to some other dynamics such as the Metropolis dynamics (see Sect. 2.4).
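To make the dynamics concrete, one common instance of Glauber dynamics is the heat-bath rule, under which an updated site is resampled from its conditional equilibrium distribution given its neighbours. The sketch below assumes this rule together with the formal Hamiltonian (1.1); the choice of heat-bath rates and the names `plus_probability` and `heat_bath_update` are assumptions made here for illustration.

```python
import math
import random


def plus_probability(beta, h, neighbours):
    """Heat-bath probability that the updated spin equals +1.

    `neighbours` lists pairs (J(e), sigma(y)) over the edges e = {x, y} at the
    updated site x. Under (1.1), changing sigma(x) from -1 to +1 lowers the
    energy by sum_y J(e) sigma(y) + h.
    """
    a = beta * (sum(J * s for J, s in neighbours) + h)
    # Evaluate 1 / (1 + exp(-a)) without overflow for large |a|.
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)


def heat_bath_update(sigma, x, beta, h, neighbours, rng=random):
    """Resample sigma[x] in place from its conditional distribution."""
    sigma[x] = 1 if rng.random() < plus_probability(beta, h, neighbours) else -1
    return sigma
```

A site all of whose edges have been deleted (\(J(e)=0\)) feels only the field \(h\), so it turns plus with probability greater than \(1/2\) regardless of its neighbours.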

Heuristically, as explained in the next section, we expect the dilute Ising model to show a metastability property similar to the undiluted Ising model, albeit on a shorter timescale. Our main result is an upper bound on the relaxation time. However, we do not give a lower bound on the relaxation time so some doubt persists regarding the heuristic picture—we have not ruled out the possibility of the system relaxing on a time scale shorter than \(\exp (\lambda /h^{d-1})\) for all \(\lambda >0\). Proving such a lower bound is a work in progress.

To state our main result we must introduce some quantities that will be defined rigorously later. These quantities all implicitly depend on the value of \(p\), which for the purpose of our analysis we take to be fixed in the interval of interest \((p_\mathrm{c},1)\).

Let \(\beta _0\) denote the minimum value such that for all \(\beta >\beta _0\) the assumptions of slab percolation (see Sect. 2.3) and spatial mixing (see Sect. 5.4) hold. Let \(\mathcal{N }\subset (0,\infty )\) denote the set of zero measure defined by (2.4). We will always assume that the inverse temperature satisfies \(\beta \in (\beta _0,\infty )\setminus \mathcal{N }\).

We will use \(\theta \) to describe the shape of our catalysts. Values of \(\theta \in (0,\pi )\) will correspond to real catalysts. For the purpose of comparison, we will also give a bound on the mixing time that does not depend on the presence of catalysts; this corresponds to the case \(\theta =2\pi \).

Let \(C_\mathrm{dil}^\theta \) measure the cost of forming catalysts as defined in Sect. 6. As \(p\) increases and \(\theta \) decreases catalysts become rarer: for \(\theta \in (0,\pi )\) and \(p\in (p_\mathrm{c},1)\), \(C_\mathrm{dil}^\theta =\mathrm{{O}}(\frac{1}{\theta }\log \frac{1}{1-p})\) (6.12). For \(\theta =2\pi \) take \(C_\mathrm{dil}^\theta =0\).

For \(\theta \in (0,\pi )\cup \{2\pi \}\) let \(\mathsf{{E}}^\theta _\mathrm{c}/h^{d-1}\) denote the cost of creating a critical droplet in a catalytic cone with angle \(\theta \) as defined in Sect. 4.2. We will see that the smaller \(\theta \) is, the better the catalytic effect: \(\mathsf{E}^\theta _\mathrm{c}\) has order \(\beta \theta ^{d-1}\). Taking the limit \(p\rightarrow 1\) with \(\theta =2\pi \) we get the undiluted case: \(\mathsf{E}^{2\pi }_\mathrm{c}\rightarrow \mathsf{E}^\mathrm{undil}_\mathrm{c}\).

Theorem 1.2.1

For \(\theta \in (0,\pi )\cup \{2\pi \}\) consider the value

$$\begin{aligned} \lambda _2^\theta = \frac{\mathsf{E}^\theta _\mathrm{c}+ C_\mathrm{dil}^\theta }{d+1}. \end{aligned}$$

Let \(\lambda >\lambda _2^\theta \). For every number \(C_0>0\) there are constants \(C,c>0\) such that for \(h>0\), for every local observable \(f\),

$$\begin{aligned}&\mathbb{Q }\left[\, \left| \mathbb{E }_J \left( f\left(\sigma ^{-}_{\exp (\lambda /h^{d-1})}\right) \right) - \mu ^{J,h} (f) \right| \le C \Vert f\Vert _\infty \exp (-C_0/h) \, \right] \\&\quad \ge 1-C\exp (-c /\sqrt{h}). \end{aligned}$$

Taking \(\theta =\beta ^{-1/d}\) we see that

$$\begin{aligned} \frac{1}{\lambda _2^\mathrm{undil}} \inf _{\theta \in (0,\pi )} \lambda _2^\theta \rightarrow 0 \quad \text{ as}\quad \beta \rightarrow \infty . \end{aligned}$$
(1.3)
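To see why (1.3) is plausible, one can track only the orders of magnitude quoted above. This is a heuristic sketch and not the proof. It assumes that \(\mathsf{E}^\mathrm{undil}_\mathrm{c}\) is of exact order \(\beta \) (the text above states only the upper bound), and the constants \(a,b,c>0\) are placeholders introduced here; they may depend on \(p\) and \(d\) but not on \(\beta \) or \(\theta \), with \(b\) absorbing the factor \(\log \frac{1}{1-p}\).

$$\begin{aligned} \lambda _2^\theta \le \frac{a\,\beta \theta ^{d-1} + b\,\theta ^{-1}}{d+1}, \qquad \lambda _2^\mathrm{undil} \ge \frac{c\,\beta }{d+1}. \end{aligned}$$

With \(\theta =\beta ^{-1/d}\), which lies in \((0,\pi )\) once \(\beta \ge 1\), both terms in the first numerator are of order \(\beta ^{1/d}\), so that

$$\begin{aligned} \frac{\lambda _2^\theta }{\lambda _2^\mathrm{undil}} \le \frac{(a+b)\,\beta ^{1/d}}{c\,\beta } = \frac{a+b}{c}\,\beta ^{-(d-1)/d} \rightarrow 0 \quad \text{ as}\quad \beta \rightarrow \infty . \end{aligned}$$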

The surprising message of Theorem 1.2.1 is that however close \(p\) is to one, at low temperatures the dilute Ising model relaxes much more quickly than the corresponding undiluted Ising model.

We expect, based on the undiluted Ising model [6], that the slab percolation threshold is equal to \(\beta _\mathrm{c}\). Further study of slab percolation and spatial-mixing properties for the dilute Ising model may extend the domain of validity of Theorem 1.2.1 to the whole supercritical regime \((\beta _\mathrm{c},\infty )\).

The case \(\theta =2\pi \) is included as it describes the rate of creation of critical droplets in regions of typical dilution. The effect of dilution on the rate of creation of critical droplets in such regions depends on the sign of \(\mathsf{{E}}^\mathrm{undil}_\mathrm{c}-\mathsf{E}^{2\pi }_\mathrm{c}\).

1.3 Heuristic

The metastability phenomenon for the undiluted Ising model [22] is related to the rate of nucleation of plus droplets with linear size order \(1/h\). Consider a small neighborhood of the origin. Initially all the spins are minuses. Small clusters of plus spins quickly form and then disappear. After a short time the system looks like it has reached equilibrium with minus spins in the majority. However, if we look at a much larger region we will be in for a surprise. A small number of larger droplets of plus spin will have formed and started to spread. They will eventually merge and cover the whole region, leaving the majority of spins in the plus state.

The rate at which droplets of plus phase form, and what happens to the droplets once they have formed, depends on their energy. Let \(\mathcal{V }\subset \mathbb{R }^d\) with unit volume. For \(b>0\) let \(\mathsf{{E}}^\mathcal{V }(b)\) denote, up to a factor of \(h^{d-1}\), the energy of a plus droplet with the shape \((b/h)\mathcal{V }\). \(\mathsf{{E}}^\mathcal{V }(b)\) can be estimated as a balance between the surface tension at the phase boundary and the effect of the magnetic field \(h\),

$$\begin{aligned} \mathsf{{E}}^\mathcal{V }(b)/h^{d-1} = (b/h)^{d-1} \mathcal{F }(\mathcal{V }) - \beta h m^* (b/h)^d. \end{aligned}$$

Here \(\mathcal{F }(\mathcal{V })\) is the surface tension of \(\mathcal{V }\) (see Sect. 3.3) and \(m^*\) is the mean magnetization in the plus phase (1.2).

Let \(B_\mathrm{c}^\mathcal{V }= \frac{(d-1) \mathcal{F }(\mathcal{V })}{d \beta m^*}\). The energy function \(\mathsf{{E}}^\mathcal{V }(b)\) is increasing on the interval \((0,B_\mathrm{c}^\mathcal{V })\) and decreasing beyond \(B_\mathrm{c}^\mathcal{V }\). Droplets with \(b<B_\mathrm{c}^\mathcal{V }\) are unstable and tend to be eroded by the surrounding minus spins. Droplets with \(b>B_\mathrm{c}^\mathcal{V }\) are expected to spread. The nucleation of a droplet with \(b>B_\mathrm{c}^\mathcal{V }\) requires that the system overcomes an energy barrier \(\mathsf{{E}}^\mathcal{V }_\mathrm{c}/h^{d-1}\), where

$$\begin{aligned} \mathsf{{E}}^\mathcal{V }_\mathrm{c}:= \mathsf{{E}}^\mathcal{V }(B_\mathrm{c}^\mathcal{V })=\left(\frac{\mathcal{F }(\mathcal{V })}{d}\right)^d \left(\frac{d-1}{\beta m^*}\right)^{d-1}. \end{aligned}$$
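The closed form for \(\mathsf{{E}}^\mathcal{V }_\mathrm{c}\) can be checked numerically. The sketch below takes the rescaled energy to be \(b^{d-1}\mathcal{F }(\mathcal{V })-\beta m^* b^d\), with the bulk term carrying a factor \(\beta \) as the closed form requires. The function names and any sample parameter values are ours, chosen for illustration.

```python
def droplet_energy(b, F, beta, m_star, d):
    """Rescaled energy E(b) = b^(d-1) F - beta m* b^d of a droplet of size b/h."""
    return b ** (d - 1) * F - beta * m_star * b ** d


def critical_size(F, beta, m_star, d):
    """Maximiser of E(b): solves (d-1) b^(d-2) F = d beta m* b^(d-1)."""
    return (d - 1) * F / (d * beta * m_star)


def critical_energy(F, beta, m_star, d):
    """Closed form E_c = (F/d)^d ((d-1)/(beta m*))^(d-1)."""
    return (F / d) ** d * ((d - 1) / (beta * m_star)) ** (d - 1)
```

Evaluating `droplet_energy` at `critical_size(...)` reproduces `critical_energy(...)`, and nearby sizes on either side give strictly smaller values, in line with the description of the barrier.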

Given the inverse-temperature \(\beta \) there is a unique shape \(\mathcal{W }\), known as the Wulff shape, with minimal surface tension; see the definition of \(\mathcal{W }_{2\pi }(1)\) in Sect. . Setting \(\mathcal{V }=\mathcal{W }\) minimizes \(\mathsf{{E}}^\mathcal{V }_\mathrm{c}\). The critical droplet shape is \((B_\mathrm{c}^\mathcal{W }/h) \mathcal{W }\).

In any small neighborhood the rate at which copies of the critical droplet form is approximately \(\exp (-\mathsf{{E}}^\mathcal{W }_\mathrm{c}/h^{d-1})\). Droplets larger than the critical droplet spread out with roughly uniform speed and eventually invade the whole space. The space-time cone of points from which one can reach the origin by time \(t\) (when growing at a fixed speed) has size \(\mathrm{{O}}(t^{d+1})\). If \(t=\exp (\lambda /h^{d-1})\) with

$$\begin{aligned} \lambda > \lambda _c = \frac{\mathsf{{E}}^\mathcal{W }_\mathrm{c}}{d+1} \end{aligned}$$
(1.4)

then we should expect to see a critical droplet form, and then spread to cover the origin, by time \(t\). This heuristic picture has been turned into a rigorous proof for the two dimensional Ising model [22].
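The threshold (1.4) comes from a first-moment count, which can be written out as follows. This is only the heuristic, with all constants and polynomial corrections suppressed; the symbol \(N_t\), standing for the expected number of critical droplets nucleated in the space-time cone of the origin by time \(t\), is introduced here for illustration.

$$\begin{aligned} N_t \approx t^{d+1} \exp \left(-\frac{\mathsf{{E}}^\mathcal{W }_\mathrm{c}}{h^{d-1}}\right) = \exp \left(\frac{(d+1)\lambda - \mathsf{{E}}^\mathcal{W }_\mathrm{c}}{h^{d-1}}\right) \quad \text{ when}\quad t=\exp (\lambda /h^{d-1}). \end{aligned}$$

As \(h\rightarrow 0\), the right-hand side diverges if \(\lambda >\mathsf{{E}}^\mathcal{W }_\mathrm{c}/(d+1)\) and vanishes if \(\lambda <\mathsf{{E}}^\mathcal{W }_\mathrm{c}/(d+1)\).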

The dilute Ising model is self averaging so \(m^*\) and \(\mathcal{F }(\mathcal{W })\) can unambiguously be defined with respect to the (quenched) dilute Ising model. Both \(m^*\) and \(\mathcal{F }(\mathcal{W })\) are believed to be continuous functions of \(p\), so we might expect a small dilution to have a small effect on the rate of mixing. However, we must be careful. In other models, such as random walk in random environment [26], it is well known that a small amount of randomness can change the asymptotic behavior [4, 23].

Dilution seems to be capable of slowing down the dynamics. Consider an expanding droplet of plus phase. If it encounters an area of high dilution it may get blocked and have to seep around the obstruction, slowing down its progress.

However, areas of high dilution can also speed up the dynamics. The limiting factor in the undiluted Ising model is the rate at which plus droplets nucleate. Nucleation of plus droplets is infrequent due to the high cost of phase coexistence on their boundaries. The dilution creates atypical regions, which we will call catalysts, where the surface tension is unusually low and so the rate of nucleation is unusually high.

The natural human response to catalysts is to try to classify them. Some catalysts do not seem to have much effect on the relaxation time. Consider (when \(d=2\)) a circle where all the edges crossing its perimeter have been diluted. If a plus droplet forms inside the circle, there is no way for it to spread outwards.

We therefore want to focus on catalysts that create a sheltered region to help plus droplets nucleate, but are not so closed off they prevent plus droplets from escaping. There seem to be two competing factors. Large catalysts will be relatively rare and so the droplets they help to nucleate will take a long time to reach the origin. Conversely, small catalysts cannot do a great deal to increase the rate at which critical droplets nucleate.

We conjecture that there is an optimal catalyst shape that determines the relaxation time of the system. However, we do not know how to calculate the optimal shape. In this paper we look at a restricted class of catalysts: surfaces of diluted edges that form open-bottom cones. We control the nucleation rate in the cones, and the subsequent growth of the droplet to regions of more typical dilution. This approach leads to an upper bound on the relaxation time that is much smaller than the time predicted by the formula (1.4) with quenched surface tension. Indeed, (1.3) shows that asymptotically in \(\beta \) the values of \(\lambda _2\) differ greatly.

According to the heuristic, the upper bound on the relaxation time can be improved by finding more efficient classes of catalysts. We expect trumpet shaped catalysts to be more efficient than cone shaped catalysts; they should be able to speed up nucleation in a similar way but using fewer diluted edges. Note also that the surface of diluted edges from which the catalyst is composed does not need to be particularly smooth. Allowing some roughness on the microscopic level increases the density of the class of catalysts, likely leading to an improved bound.

Our priority in this paper was to show that disorder can greatly increase the rate of mixing. We have not addressed the issue of the lower bound for the metastable time for disordered models. That requires bounding above the catalytic effect and is the subject of a work in progress.

1.4 Outline of the paper

In Sect. 2 we define the dilute Ising model and recall some of its basic features. The random-cluster representation is used to state a coarse graining property.

In Sects. 3 and 4 we look at the Ising model without a magnetic field. In Sect. 3 we describe the \(L_1\)-theory of phase coexistence. The theory can describe both the typical cost of phase coexistence and the cost of phase coexistence in the neighborhood of catalysts. To combine the two cases we consider the cone \({\mathcal{A }}_\theta :=\left\{ {\mathbf{{x}}}\in {\mathbb{R }}^d: x_1 \ge \Vert {\mathbf{{x}}}\Vert _2 \cos (\theta /2)\right\} \) where either \(\theta \in (0,\pi )\) or \(\theta =2\pi \). In Sect. 4 we look at generalizations of the Wulff shape to \(\mathcal{A }_\theta \). The Wulff shape is the shape with minimal surface tension given its volume. The Wulff shape can be used to quantify the large deviations of the equilibrium Ising model.
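The role of the two ranges of \(\theta \) can be seen from a direct membership test for \(\mathcal{A }_\theta \). The following sketch is illustrative only; the function name `in_cone` is ours. For \(\theta =2\pi \) the condition reads \(x_1\ge -\Vert {\mathbf{{x}}}\Vert _2\), which every point satisfies, so the cone is the whole space; for \(\theta \in (0,\pi )\) it is a genuine cone around the direction \({\mathbf{{e}}}_1\).

```python
import math


def in_cone(x, theta):
    """Return True if x lies in A_theta = {x : x_1 >= ||x||_2 cos(theta/2)}."""
    norm = math.sqrt(sum(c * c for c in x))
    return x[0] >= norm * math.cos(theta / 2)
```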

In Sect. 5 we reintroduce the magnetic field. We justify the energy function featured in the heuristic. We prove regularity results concerning cluster boundaries. The motivation for this is to study the spectral gap of the dilute Ising model in finite regions with various boundary conditions.

Finally in Sect. 6 we use the accumulated results to prove Theorem 1.2.1. We do this by proving that the cone shaped regions act as catalysts. To show that the clusters of plus phase formed in the catalysts grow we consider another type of cone: space-time cones that are Wulff shaped spatially and growing in size with time.

1.5 Notation

Throughout the paper \(C,c,c_\mathrm{stb},c_\mathrm{hs}\), etc, will be used to refer to positive numbers that may depend on \(p,\beta \) and \(\theta \) but not on \(h\). We will recycle \(C\) and \(c\) to refer to various less important positive constants; the values they represent will change from appearance to appearance.

Let \(\mathcal{S }^{d-1}\) denote the set of unit vectors in \(\mathbb{R }^d\). Let \({\mathbf{{e}}}_1,\dots ,{\mathbf{{e}}}_d\) denote the canonical basis vectors. We will use bold to differentiate continuous variables \({\mathbf{{x}}}\in \mathbb{R }^d\) from lattice points \(x\in {\mathbb{Z }}^d\).

Consider \(\mathcal{A },\mathcal{B }\subset \mathbb{R }^d, {\mathbf{{x}}}\in \mathbb{R }^d\) and \(c\ge 0\). Let \(\mathcal{A }+\mathcal{B }=\{{\mathbf{{a}}}+{\mathbf{{b}}}:{\mathbf{{a}}}\in \mathcal{A },{\mathbf{{b}}}\in \mathcal{B }\}\) denote the sum of the two sets. Let \(\mathcal{A }+{\mathbf{{x}}}\) denote the translation of \(\mathcal{A }\) by \({\mathbf{{x}}}\). Let \(c\mathcal{A }\) denote \(\{c\,{\mathbf{{a}}}:{\mathbf{{a}}}\in \mathcal{A }\}\), the set \(\mathcal{A }\) scaled by a factor of \(c\).

2 Properties of the dilute Ising model

2.1 Definition of \(\mu ^{J,\zeta ,h}_\Lambda \)

Let \(J=(J(e):e\in E)\) be a given realization of the coupling strengths. We will now define formally the Ising measure \(\mu ^{J,\zeta ,h}_\Lambda \) with a magnetic field \(h\in \mathbb{R }\) and boundary conditions \(\zeta \in \Sigma \) on a finite domain \(\Lambda \subset \mathbb{Z }^d\) at inverse-temperature \(\beta > 0\).

Define the external vertex boundary \(\partial \Lambda \) of \(\Lambda \):

$$\begin{aligned} \partial \Lambda&= \partial ^+\Lambda \cup \partial ^-\Lambda \quad \quad \,\text{ where}\\ \partial ^\pm \Lambda&= \{x\not \in \Lambda : \exists y\in \Lambda ,\ \{x,y\}\in E \text{ and}\;\zeta (x)=\pm 1\}. \end{aligned}$$

Taking w to stand for wired, define edge sets for \(\Lambda \):

$$\begin{aligned} E (\Lambda )&= \left\{ \left\{ x, y \right\} \in E : x, y \in \Lambda \right\} \!,\\ E^\pm (\Lambda )&= \left\{ \left\{ x, y \right\} \in E : x \in \Lambda \text{ and}\; y\in \partial ^\pm \Lambda \right\} \!,\\ E^\mathrm{w}(\Lambda )&= E (\Lambda ) \cup E^\pm (\Lambda )\!. \end{aligned}$$

The set of spin configurations compatible with \(\zeta \) outside \(\Lambda \) is

$$\begin{aligned} \Sigma _\Lambda ^\zeta :=\{\sigma \in \Sigma :\forall x \not \in \Lambda ,\ \sigma (x)=\zeta (x)\}. \end{aligned}$$

To make sense of the Ising Hamiltonian we must limit the sums in (1.1) to the terms that depend on \((\sigma (x):x\in \Lambda )\). We take this as an opportunity to adjust the Hamiltonian by an additive constant; this will have no effect on the resulting probability measure, but it will be convenient in the next section when we define the random-cluster measure. We take the Ising Hamiltonian on \(\Lambda \) to be the function \(H^{J,\zeta ,h}_\Lambda :\Sigma _\Lambda ^\zeta \rightarrow \mathbb{R }\),

$$\begin{aligned} H^{J,\zeta ,h}_\Lambda (\sigma ) = \sum _{e =\{x, y\} \in E^\mathrm{w}(\Lambda )} J(e) 1_{\{\sigma (x)\not =\sigma (y)\}}+\sum _{x\in \Lambda } h 1_{\{\sigma (x)=-1\}}. \end{aligned}$$

The dilute Ising measure \(\mu ^{J,\zeta ,h}_{\Lambda }\) at inverse-temperature \(\beta \) is defined by

$$\begin{aligned} \mu ^{J,\zeta ,h}_{\Lambda } (\{\sigma \}) = \frac{1}{Z^{J,\zeta ,h}_\Lambda } \exp \left( - \beta H_{\Lambda }^{J,\zeta ,h} (\sigma ) \right) \end{aligned}$$

where \(Z^{J,\zeta ,h}_\Lambda \) is a normalizing constant, the partition function, defined by

$$\begin{aligned} Z^{J,\zeta ,h}_\Lambda = \sum _{\sigma \in \Sigma ^\zeta _\Lambda } \exp \left( - \beta H_{\Lambda }^{J,\zeta ,h} (\sigma ) \right)\!. \end{aligned}$$
(2.1)
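On a very small domain the definitions above can be evaluated by brute force. The sketch below is a minimal illustration, assuming the domain, its boundary spins and the edge set \(E^\mathrm{w}(\Lambda )\) are supplied explicitly; the function names and the data layout (dicts keyed by sites and edges) are ours.

```python
import itertools
import math


def hamiltonian(sigma, zeta, wired_edges, J, h):
    """H(sigma) = sum_e J(e) 1[end spins of e differ] + h #{x : sigma(x) = -1}.

    `sigma` maps the sites of Lambda to +1/-1, `zeta` maps boundary sites to
    +1/-1, `wired_edges` lists the edges of E^w(Lambda) as pairs (x, y), and
    `J` maps each such edge to its coupling strength.
    """
    spin = {**zeta, **sigma}
    bond_term = sum(J[e] for e in wired_edges if spin[e[0]] != spin[e[1]])
    field_term = h * sum(1 for s in sigma.values() if s == -1)
    return bond_term + field_term


def gibbs_measure(sites, zeta, wired_edges, J, h, beta):
    """Enumerate all configurations on `sites`; return {config: probability}."""
    weights = {}
    for values in itertools.product((-1, 1), repeat=len(sites)):
        sigma = dict(zip(sites, values))
        energy = hamiltonian(sigma, zeta, wired_edges, J, h)
        weights[values] = math.exp(-beta * energy)
    Z = sum(weights.values())  # the partition function (2.1)
    return {config: w / Z for config, w in weights.items()}
```

The enumeration has \(2^{|\Lambda |}\) terms, so this is only useful as a sanity check on domains with a handful of sites.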

We have used \(\sigma \) above to index summations over \(\Sigma ^\zeta _\Lambda \). It has also been used as a random variable—the mean spin at the origin is written \(\mu ^{J,\zeta ,h}_\Lambda (\sigma (0))\). Furthermore, given a set \(V\subset \mathbb{Z }^d\) we will write \(\sigma (V)\) to denote the average spin in \(V\),

$$\begin{aligned} \sigma (V)=\frac{1}{|V|}\sum _{x\in V} \sigma (x)\in [-1,+1]. \end{aligned}$$
(2.2)

2.2 The random-cluster representation for \(\mu ^{J,\zeta ,h}_\Lambda \)

The spin-spin correlations in the Ising model can be described by the \(q=2\) case of the random-cluster model [14]. We have to be extra careful because of the general boundary conditions \(\zeta \in \Sigma \), dilute coupling strengths \((J(e))\), and the magnetic field \(h\ge 0\). In this section we will describe a random-cluster representation \(\phi ^{J,\zeta ,h}_\Lambda \) for the Ising model \(\mu ^{J,\zeta ,h}_\Lambda \) and a joint measure \(\varphi ^{J,\zeta ,h}_\Lambda \).

The Ising measure \(\mu ^{J,\zeta ,h}_\Lambda \) was defined using the graph \((\Lambda ,E^\mathrm{w}(\Lambda ))\). Add to this graph a ghost vertex \(\mathfrak{g }\) through which the magnetic field will act, and a set of ghost edges \(E^\mathfrak{g }(\Lambda )=\{\{\mathfrak{g },x\},x\in \Lambda \}\).

When defining the random-cluster model on a given graph, for each edge \(e\) there is an interaction-strength parameter \(p_e\in [0,1]\). There is also another parameter, \(q>0\), that influences the number of clusters that are formed. In order to describe the correlations of the dilute Ising model we will fix

$$\begin{aligned} q=2 \quad \text{ and}\quad p_e={\left\{ \begin{array}{ll} 1-\exp (-\beta J(e)),&e\in E^\mathrm{w}(\Lambda ),\\ 1-\exp (-\beta h),&e\in E^\mathfrak{g }(\Lambda ). \end{array}\right.} \end{aligned}$$

The state space of \(\phi ^{J,\zeta ,h}_\Lambda \) is \(\Omega _\Lambda =\{0,1\}^{E^\mathrm{w}(\Lambda )\cup E^\mathfrak{g }(\Lambda )}\). With \(\omega \in \Omega _\Lambda \), an edge \(e\) is open if \(\omega (e)=1\) and closed if \(\omega (e)=0\). Two vertices of \(\Lambda \) are connected if they are joined by paths of open edges either

  (i) to each other inside \((\Lambda ,E(\Lambda ))\), or

  (ii) both to \(\partial ^+\Lambda \cup \{\mathfrak{g }\}\), or

  (iii) both to \(\partial ^-\Lambda \).

A cluster is a maximal collection of connected vertices. Let \(V_+\), \(V_-\) denote the clusters connected to \(\partial ^+\Lambda \cup \{\mathfrak{g }\}\), \(\partial ^-\Lambda \), respectively. Let \(n=n(\Lambda ,\omega )\) count the number of other clusters in \(\Lambda \). Label these clusters \(V_1,\ldots ,V_n\).

We will define the random-cluster probability measure \(\phi ^{J,\zeta ,h}_\Lambda \) using a coupling probability measure \(\varphi ^{J,\zeta ,h}_\Lambda \) defined on \(\Sigma ^\zeta _\Lambda \times \Omega _\Lambda \). The marginal distribution of \(\varphi ^{J,\zeta ,h}_\Lambda \) on \(\Sigma ^\zeta _\Lambda \) will be the Ising measure \(\mu ^{J,\zeta ,h}_\Lambda \). The marginal distribution of \(\varphi ^{J,\zeta ,h}_\Lambda \) on \(\Omega _\Lambda \) defines the random-cluster measure \(\phi ^{J,\zeta ,h}_\Lambda \).

With reference to [Section 1.4] GspsRC define the coupled probability measure \(\varphi ^{J,\zeta ,h}_\Lambda \) as follows. Let \((\sigma ,\omega )\in \Sigma ^\zeta _\Lambda \times \Omega _\Lambda \). Recall the notation (2.2) for average spins. Note that \(\sigma (V)=\pm 1\) if and only if \(\sigma (x)=\sigma (y)\) for all \(x,y\in V\). We will say that \(\sigma \) is an \(\omega \)-admissible configuration if

$$\begin{aligned} \sigma (V_+)=+1, \quad \sigma (V_-)=-1\quad \,\text{ and} \quad \sigma (V_i)=\pm 1,\ i=1,\ldots ,n. \end{aligned}$$

If \(\sigma \) is \(\omega \)-admissible, with reference to (2.1) let

$$\begin{aligned} \varphi ^{J,\zeta ,h}_\Lambda (\{(\sigma ,\omega )\})&= \frac{1}{Z^{J,\zeta ,h}_\Lambda } \prod _e p_e^{\omega (e)} (1 - p_e)^{1 - \omega (e)}, \end{aligned}$$

otherwise let \(\varphi ^{J,\zeta ,h}_\Lambda (\{(\sigma ,\omega )\})=0\).

For configurations \(\omega \in \Omega _\Lambda \) under which \(\partial ^+\Lambda \cup \{\mathfrak{g }\}\) is connected to \(\partial ^-\Lambda \), there are no \(\omega \)-admissible configurations. Let \(\mathrm{D}_\Lambda ^\zeta \subset \Omega _\Lambda \) represent the set of configurations such that the vertices in \(\partial ^+\Lambda \cup \{\mathfrak{g }\}\) are not connected to the vertices of \(\partial ^-\Lambda \); \(\mathrm{D}^\zeta _\Lambda \) is the support of \(\phi ^{J,\zeta ,h}_\Lambda \). For \(\omega \in \mathrm{D}^\zeta _\Lambda \), there are \(2^{n(\Lambda ,\omega )}\) \(\omega \)-admissible configurations and

$$\begin{aligned} \phi ^{J,\zeta ,h}_\Lambda \left( \{\omega \} \right)=\frac{2^{n(\Lambda ,\omega )}}{Z^{J,\zeta ,h}_\Lambda } \prod _e p_e^{\omega (e)} (1 - p_e)^{1 - \omega (e)}. \end{aligned}$$

The coupling \(\varphi ^{J,\zeta ,h}_\Lambda \) has a probabilistic interpretation. To sample an Ising configuration \(\sigma \sim \mu ^{J,\zeta ,h}_\Lambda \) given a sample \(\omega \sim \phi ^{J,\zeta ,h}_\Lambda \), set \(\sigma (V_+)=+1\), set \(\sigma (V_-)=-1\), and independently for \(i=1,\ldots ,n\) set

$$\begin{aligned} \sigma (V_i)={\left\{ \begin{array}{ll} +1&\text{with probability } 1/2,\\ -1&\text{otherwise}.\\ \end{array}\right.} \end{aligned}$$

It is sometimes easier to ignore the ghost edges. Let \(r(\omega )\) denote the edge configuration obtained by closing all the ghost edges. We will say that \(V\subset \Lambda \) is a real cluster if \(V\) is an \(r(\omega )\)-cluster. If \(V\) is a real cluster under \(\phi ^{J,\zeta ,h}_\Lambda \), and not connected to \(\partial ^\pm \Lambda \), then

$$\begin{aligned} \sigma (V)={\left\{ \begin{array}{ll} +1&\text{with probability } e^{\beta h |V|}/(1+e^{\beta h |V|}),\\ -1&\text{otherwise}.\\ \end{array}\right.} \end{aligned}$$
(2.3)
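Formula (2.3) can be recovered by summing over the ghost edges attached to \(V\). The sketch below carries out that sum and compares it with the closed form. It is a consistency check under the assumptions spelled out in the comments, with function names chosen here for illustration.

```python
import math


def plus_probability_closed_form(beta, h, size):
    """Right-hand side of (2.3) for a real cluster with `size` vertices."""
    a = beta * h * size
    return math.exp(a) / (1.0 + math.exp(a))


def plus_probability_via_ghost_edges(beta, h, size):
    """Same probability, obtained by summing over the ghost edges of the cluster.

    Each ghost edge is closed with weight exp(-beta h). If all `size` ghost
    edges are closed the cluster is free: it contributes a factor q = 2 to the
    random-cluster weight and receives spin +1 with probability 1/2. Otherwise
    it is joined to the ghost vertex and receives spin +1.
    """
    all_closed = math.exp(-beta * h * size)
    weight_free = 2.0 * all_closed
    weight_ghost = 1.0 - all_closed
    return (weight_ghost + 0.5 * weight_free) / (weight_ghost + weight_free)
```

Both functions reduce to \(1/(1+e^{-\beta h|V|})\), so large real clusters are plus with probability close to one even when \(h\) is small.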

Let \(x\leftrightarrow y\) denote the event that \(x\) and \(y\) are in the same real cluster, and let \(A\leftrightarrow B\) denote the event that \(a \leftrightarrow b\) for some \(a\in A\) and \(b\in B\). Note that when \(h=0\), all the clusters are real clusters.

There are two special cases of the \(h=0\) random-cluster model, wired and free boundary conditions. Wired boundary conditions refers to either all-plus or all-minus boundary conditions. Under wired boundary conditions \(\mathrm{D}^\zeta _\Lambda = \Omega _\Lambda \). The limit \(\phi ^{J,\mathrm{w}}=\lim _{\Lambda \rightarrow \mathbb{Z }^d} \phi ^{J,+,0}_\Lambda \) is called the wired random-cluster measure on \(\mathbb{Z }^d\). Free boundary conditions refers to pretending that \(\partial \Lambda =\varnothing \) whilst defining \(\phi ^{J,\zeta ,h}_\Lambda \). The resulting measure \(\phi ^{J,\mathrm{f},0}_\Lambda \) depends only on \((J(e):e\in E(\Lambda ))\). The measure \(\phi ^{J,\mathrm{f}}\) obtained by taking the limit of \(\phi ^{J,\mathrm{f},0}_\Lambda \) as \(\Lambda \rightarrow \mathbb{Z }^d\) is called the free random-cluster measure on \(\mathbb{Z }^d\).

Let \(0\leftrightarrow \infty \) denote the event that the origin is in an infinite real-cluster. The set

$$\begin{aligned} \mathcal{N }:=\left\{ \beta :\mathbb{Q }\left[\mu ^{J,\mathrm{f}}(0 \leftrightarrow \infty )\right] < \mathbb{Q }\left[\mu ^{J,\mathrm{w}}(0\leftrightarrow \infty )\right]\right\} \end{aligned}$$
(2.4)

is at most countable [24]. It is conjectured that \(\mathcal{N }=\varnothing \).

2.3 Coarse graining

Coarse graining is an important technique in the study of percolation and the random-cluster model. The open edges of the random-cluster model percolate for \(\beta >\beta _\mathrm{c}\). Slab percolation is a stronger property than percolation [24]. In three and higher dimensions, slab percolation refers to percolation in a slab \(\mathbb{Z }^{d-1}\times \{1,\dots ,n\}\). In two dimensions it refers to the existence of spanning clusters in rectangles with arbitrarily high aspect ratios. The slab-percolation threshold is defined

$$\begin{aligned} \hat{\beta }_\mathrm{c}=\inf \{ \beta >0 :\, \text{ slab} \text{ percolation} \text{ occurs} \text{ under} \, \mathbb{Q }[\mu ^{J,\mathrm{f}}]\} \ge \beta _\mathrm{c}. \end{aligned}$$

It is conjectured that \(\hat{\beta }_\mathrm{c}=\beta _\mathrm{c}\).

With \(K\) a positive integer, let

$$\begin{aligned} \mathbb{B }_K = [-K/2, K/2)^d \cap \mathbb{Z }^d. \end{aligned}$$

For \(i\in \mathbb{Z }^d\), let \(\mathbb{B }_K(i) := \mathbb{B }_K + K i\) denote a copy of \(\mathbb{B }_K\) centered at \(K i\).
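
As a concrete illustration of this tiling, the following Python sketch (not part of the original argument) lists the vertices of \(\mathbb{B }_K(i)\) and recovers the index \(i\) of the box containing a given vertex. The function names `box_vertices` and `box_index` are hypothetical, chosen only for illustration; integer arithmetic is used so that odd \(K\) is handled without rounding.

```python
from itertools import product


def box_vertices(i, K):
    """Vertices of B_K(i) = ([-K/2, K/2)^d intersected with Z^d) + K*i."""
    # In each coordinate, the K integers in [K*i_k - K/2, K*i_k + K/2)
    # start at K*i_k - floor(K/2), for both even and odd K.
    ranges = [range(K * ik - K // 2, K * ik - K // 2 + K) for ik in i]
    return list(product(*ranges))


def box_index(x, K):
    """Index i of the unique box B_K(i) that contains the vertex x."""
    # x_k is in [K*i_k - K/2, K*i_k + K/2) iff
    # 2*x_k + K is in [2*K*i_k, 2*K*i_k + 2*K).
    return tuple((2 * c + K) // (2 * K) for c in x)
```

For instance, with \(K=3\) and \(d=2\) the box \(\mathbb{B }_3((0,0))\) consists of the nine vertices with coordinates in \(\{-1,0,1\}\), and every one of them is mapped back to \((0,0)\) by `box_index`.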

Let \(\Lambda \subset \mathbb{Z }^d\) and let \(\omega \in \Omega _\Lambda \) denote an edge configuration for \(\Lambda \). If \(\mathbb{B }_K(i)\subset \Lambda \) and, looking at the restriction of \(\omega \) to \(\mathbb{B }_K(i)\), if there is a unique real-cluster \(A\subset \mathbb{B }_K(i)\) connecting the \(2d\) faces of \(\mathbb{B }_K(i)\), let \(\mathbb{B }_K^\dagger (i)=A\); let \(\mathbb{B }_K^\ddagger (i)\) denote the real \(\Lambda \)-cluster containing \(\mathbb{B }_K^\dagger (i)\). Otherwise, let \(\mathbb{B }_K^\dagger (i)=\mathbb{B }_K^\ddagger (i)=\varnothing \).

Let \(\varepsilon _\mathrm{cg}>0\). Recall the definition (1.2) of the spontaneous magnetization \(m^*\).

Definition 2.3.1

A box \(\mathbb{B }_K(i) \subset \Lambda \) is \(\varepsilon _\mathrm{cg}\)-good if:

  1. (i)

    \(\mathbb{B }_K^\dagger (i)\) is connected (by paths of length one) to each \(\mathbb{B }_K^\dagger (j)\) such that \(\mathbb{B }_K(j)\subset \Lambda \) and \(\Vert i-j\Vert _1=1\).

  2. (ii)

    The diameters of the real-clusters of \(\Lambda \setminus (\cup _j \mathbb{B }_K^\dagger (j))\) intersecting \(\mathbb{B }_K(i)\) are at most \(K/2\).

  3. (iii)

    \(\mathbb{B }_K^\ddagger (i)\cap \mathbb{B }_K(i)\) contains between \(K^dm^*(1-\varepsilon _\mathrm{cg})\) and \(K^dm^*(1+\varepsilon _\mathrm{cg})\) vertices.

Otherwise \(\mathbb{B }_K(i)\) is \(\varepsilon _\mathrm{cg}\)-bad.

Note that any box not entirely contained in \(\Lambda \) is automatically \(\varepsilon _\mathrm{cg}\)-bad. If a box is \(1\)-bad then it is also \(\varepsilon _\mathrm{cg}\)-bad for all \(\varepsilon _\mathrm{cg}\in (0,1)\). Recall that we have fixed \(\beta \in (\beta _0,\infty )\setminus \mathcal{N }\).

To allow unusually high dilution on the edge boundary of \(\Lambda \), let \(J\sim \mathbb{Q }\), and write \(J^{\prime }\stackrel{\Lambda }{\sim } J\) if \(J^{\prime }\) is a collection of coupling strengths that agrees with \(J\) in \(E(\Lambda )\). Combining [24, Theorem 2.1 and Proposition 2.2] yields:

Proposition 2.3.2

There are constants \(c_\mathrm{cg}=c_\mathrm{cg}(\varepsilon _\mathrm{cg})>0\) and \(K_0=K_0(\varepsilon _\mathrm{cg})\in \mathbb{N }\) such that for \(K\ge K_0\) and any \(\mathbb{B }_K(i_1),\dots ,\mathbb{B }_K(i_n)\subset \Lambda \), with \(\mathbb{Q }\)-probability \(1-\exp (-c_\mathrm{cg}K n)\),

$$\begin{aligned} \sup _{J^{\prime }\stackrel{\Lambda }{\sim } J} \varphi ^{J^{\prime },+,0}_\Lambda (\mathbb{B }_K(i_1),\dots ,\mathbb{B }_K(i_n)\,\text{ are}\,\varepsilon _\mathrm{cg}-\text{ bad}) \le \exp (-c_\mathrm{cg}K n). \end{aligned}$$

Coarse graining gives a crude measure of the cost of phase coexistence. Consider a path of neighboring boxes \(\mathbb{B }_K(i_1),\ldots ,\mathbb{B }_K(i_j)\) in \(\Lambda \). If the first box and last box are not connected by a path of open edges, then there must be a \(1\)-bad box somewhere along the path of boxes. Moreover, there must be a surface of at least \(\lfloor K/K_0(1)\rfloor ^{d-1}\,1\)-bad \(\mathbb{B }_{K_0(1)}\)-boxes separating \(\mathbb{B }_K(i_1)\) from \(\mathbb{B }_K(i_j)\) [7, cf. Lemma 4.2]. We obtain the following corollary to Proposition 2.3.2. Let \(c_\mathrm{cg}^{\prime }\) denote a positive constant, independent of \(\varepsilon _\mathrm{cg}\).

Corollary 2.3.3

Let \(K\ge K_0(1)\). For \(k=1,\ldots ,n\), let \(\mathbb{B }_K(i^k_1),\ldots ,\mathbb{B }_K(i^k_{j_k})\) denote a simple path of neighboring boxes in \(\Lambda \) with length \(j_k\le \exp (\sqrt{K})\). Assume the \(n\) chains are disjoint. Let \(A\) denote the event that for each \(k\), \(\mathbb{B }_K(i^k_1)\) is not connected to \(\mathbb{B }_K(i^k_{j_k})\) in \(\cup _{l=1}^{j_k} \mathbb{B }_K(i^k_l)\). With \(\mathbb{Q }\)-probability \(1-\exp (-c_\mathrm{cg}^{\prime }K^{d-1}n)\),

$$\begin{aligned} \sup _{J^{\prime }\stackrel{\Lambda }{\sim } J} \varphi ^{J^{\prime },+,0}_\Lambda (A) \le \exp (-c_\mathrm{cg}^{\prime }K^{d-1}n). \end{aligned}$$

We can quantify the extent to which a magnetic field and mixed boundary conditions affect the coarse graining property. Recall that \(E^\pm (\Lambda )\) denotes the set of edges connecting \(\Lambda \) to the external vertex boundary \(\partial ^\pm \Lambda \).

Lemma 2.3.4

Let \(\zeta \in \Sigma \). For any \(\mathbb{B }_K(i_1),\dots ,\mathbb{B }_K(i_n)\subset \Lambda \), with \(\mathbb{Q }\)-probability \(1-\exp (- c_\mathrm{cg}K n)\),

$$\begin{aligned} \sup _{J^{\prime }\stackrel{\Lambda }{\sim } J} \varphi ^{J^{\prime },\zeta ,h}_\Lambda (\mathbb{B }_K(i_1),\ldots ,\mathbb{B }_K(i_n)\,\, \text{ are}\,\varepsilon _\mathrm{cg}\text{-bad})\\ \le \exp (\beta |E^\pm (\Lambda )| + \beta h |\Lambda | - c_\mathrm{cg}K n). \end{aligned}$$

Proof

Let \((\sigma ,\omega )\) be an element of the support of \(\varphi ^{J^{\prime },+,0}_\Lambda \); under \(\omega \) no ghost edges are open. Let \(B\) denote the set of configurations that agree with \((\sigma ,\omega )\) as far as the vertices and the real edges are concerned,

$$\begin{aligned} \frac{\varphi ^{J^{\prime },\zeta ,h}_\Lambda (B)}{\varphi ^{J^{\prime },+,0}_\Lambda (\{(\sigma ,\omega )\})} \le \frac{Z^{J^{\prime },+,0}_\Lambda }{Z^{J^{\prime },\zeta ,h}_\Lambda }. \end{aligned}$$

The right-hand side is bounded above by \(\exp (\beta |E^\pm (\Lambda )|+\beta h |\Lambda |)\). The lemma follows by Proposition 2.3.2. \(\square \)

2.4 The graphical construction of the Glauber dynamics

For \(\xi \in \Sigma ^\zeta _\Lambda \), let \((\sigma _t=\sigma ^{\xi }_{\Lambda ,\zeta ,h;t})_{t\ge 0}\) denote the dynamic Ising model, started at time \(0\) in state \(\xi \) and evolving according to the Glauber dynamics (also known as heat-bath dynamics) in \(\Lambda \) with boundary conditions \(\zeta \). The Glauber dynamics can be described by the following graphical construction. Let \(\sigma _0=\xi \). Place a rate-one Poisson process at each vertex. Label the points of the Poisson processes \((x_i,t_i)\) with \(t_1<t_2<\ldots \); to each point \((x_i,t_i)\) attach a uniform \([0,1]\) random variable \(U_i\). Let \(\sigma _{t-}\) denote the Ising configuration immediately before time \(t\). For each point of the Poisson process, resample the spin at \(x_i\) from the Ising measure conditional on the state of the neighboring spins. If the probability that \(\sigma (x_i)=+1\) conditional on \(\{\sigma (y)=\sigma _{t_i-}(y):y\sim x_i\}\) is \(q\), set \(\sigma _{t_i}(x_i)=+1\) if \(U_i>1-q\), and \(-1\) otherwise. The dynamics are:

  1. (i)

    Monotonic with respect to the boundary and initial conditions.

  2. (ii)

    Finite range: to update site \(x\) only requires knowledge of the neighbors.

  3. (iii)

    Bounded with respect to the transition rates.

Our results are also valid for other dynamics, such as the Metropolis dynamics, that share these properties.
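
The graphical construction described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the construction used in the proofs: the Gibbs weight is assumed proportional to \(\exp (\beta \sum _e J(e)\sigma (x)\sigma (y)+\beta h\sum _x\sigma (x))\), which may differ from the normalization of (2.1) by constant factors; the region is a small two-dimensional box; and the helper names (`build_box`, `configuration`, `plus_probability`, `glauber_run`) are hypothetical. The rate-one clocks at the \(|\Lambda |\) vertices are simulated through their superposition, a single clock of rate \(|\Lambda |\) whose rings are assigned to uniformly chosen vertices.

```python
import math
import random


def build_box(L, p, seed):
    """Box {0,...,L-1}^2 with i.i.d. couplings: J(e) = 1 with probability p, else 0."""
    rng = random.Random(seed)
    sites = [(a, b) for a in range(L) for b in range(L)]
    neighbors, J = {}, {}
    for (a, b) in sites:
        neighbors[(a, b)] = [(a + 1, b), (a - 1, b), (a, b + 1), (a, b - 1)]
        for y in neighbors[(a, b)]:
            e = frozenset(((a, b), y))
            if e not in J:
                J[e] = 1.0 if rng.random() < p else 0.0
    return sites, neighbors, J


def configuration(sites, neighbors, interior_spin, boundary_spin):
    """Constant spins inside the box and a constant boundary condition outside."""
    sigma = {y: boundary_spin for x in sites for y in neighbors[x]}
    sigma.update({x: interior_spin for x in sites})
    return sigma


def plus_probability(x, sigma, neighbors, J, beta, h):
    """Heat-bath probability q of spin +1 at x given the neighboring spins
    (assumed Gibbs weight: exp(beta * sum J s_x s_y + beta * h * sum s_x))."""
    field = h + sum(J[frozenset((x, y))] * sigma[y] for y in neighbors[x])
    return 1.0 / (1.0 + math.exp(-2.0 * beta * field))


def glauber_run(initial, sites, neighbors, J, beta, h, t_max, seed):
    """Run the dynamics up to time t_max; boundary spins are never updated."""
    rng = random.Random(seed)
    sigma = dict(initial)
    t = 0.0
    while True:
        t += rng.expovariate(len(sites))  # next ring of the superposed clocks
        if t > t_max:
            return sigma
        x = rng.choice(sites)             # vertex x_i whose clock rang
        u = rng.random()                  # the uniform variable U_i
        q = plus_probability(x, sigma, neighbors, J, beta, h)
        sigma[x] = 1 if u > 1.0 - q else -1
```

Because the clock rings and the uniforms \(U_i\) are drawn without reference to the current configuration, two runs with the same seed share the same graphical construction. Started from ordered initial and boundary conditions, they remain ordered at all times, which illustrates property (i).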

2.5 Stochastic orderings

Given two measures \(\mu _1,\mu _2\) on a set \(\mathbb{R }^\Lambda \), we will say that \(\mu _1\) is stochastically dominated by \(\mu _2\), and we will write \(\mu _1\le _{\mathrm{st}}\ \mu _2\), if there is a coupling \((\sigma _1,\sigma _2)\) on \(\mathbb{R }^\Lambda \times \mathbb{R }^\Lambda \) such that

  1. (i)

    \(\sigma _1 \sim \mu _1\),

  2. (ii)

    \(\sigma _2 \sim \mu _2\), and

  3. (iii)

    \(\sigma _1\le \sigma _2\) with probability one.

Holley’s inequality [14, Theorem 2.1] can be used to prove stochastic orderings for the ferromagnetic Ising model. The Ising model on a fixed graph \(\Lambda \) is stochastically increasing with respect to the magnetic field and the boundary conditions: for \(h_1\le h_2\) and any \(\zeta _1\le \zeta _2\),

$$\begin{aligned} \mu ^{J,\zeta _1,h_1}_\Lambda \le _{\mathrm{st}}\ \mu ^{J,\zeta _2,h_2}_\Lambda . \end{aligned}$$

The effect of expanding the region depends on the boundary conditions. With \(\Delta \subset \Lambda \),

$$\begin{aligned} \mu ^{J,+,h}_\Delta \ge _{\mathrm{st}}\ \mu ^{J,+,h}_\Lambda \quad \text{ but} \quad \mu ^{J,-,h}_\Delta \le _{\mathrm{st}}\ \mu ^{J,-,h}_\Lambda . \end{aligned}$$

Under plus boundary conditions, the random-cluster representation is stochastically increasing with \(h\in [0,\infty )\) and \(J\),

$$\begin{aligned} \phi ^{J_1,+,h_1}_\Lambda \le _{\mathrm{st}}\ \phi ^{J_2,+,h_2}_\Lambda \quad \text{ if}\quad h_2\ge h_1 \ge 0\, \text{ and} \, \forall e,\ J_2(e)\ge J_1(e) \ge 0. \end{aligned}$$

Note that sending \(J(e)\) to zero or infinity on the boundary allows us to compare free and wired boundary conditions.

3 \(L_1\)-theory

3.1 Microscopic and mesoscopic scales

Recall that we have fixed \(p\in (p_\mathrm{c},1)\) and \(\beta \in (\beta _0,\infty )\setminus \mathcal{N }\). To take advantage of the coarse graining result, we introduce some notation. With reference to Theorem 1.2.1, let \(h>0\). Define

$$\begin{aligned} K= \lfloor h^{-1/(2d)} \rfloor , \quad \quad N= K\lfloor h^{-1}/K \rfloor \approx h^{-1}. \end{aligned}$$
(3.1)

We will call \(N\) the macroscopic scale. This is the scale at which nucleation of plus droplets occurs. The number \(K\) denotes a mesoscopic scale. We will consider regions with size order \(N\) composed of boxes \(\mathbb{B }_K(i)\). The mesoscopic scale has been chosen so that we can use \(L_1\)-theory (\(K\gg \log N\)) and so that the effect of the magnetic field on the mesoscopic scale is negligible (\(hK^d\approx 0\)).
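
To make the choice (3.1) concrete, the following Python sketch evaluates \(K\) and \(N\) for a given field strength. The function name `scales` is hypothetical, and the example values of \(h\) and \(d\) are chosen purely for illustration.

```python
import math


def scales(h, d):
    """Scales K and N from (3.1), for a field strength 0 < h <= 1."""
    K = math.floor(h ** (-1.0 / (2 * d)))
    N = K * math.floor((1.0 / h) / K)  # largest multiple of K below 1/h
    return K, N


K, N = scales(1e-3, 3)
print(K, N, 1e-3 * K ** 3)
```

With \(h=10^{-3}\) and \(d=3\) this gives \(K=3\) and \(N=999\), so that \(hK^d=0.027\). The relation \(K\gg \log N\) is asymptotic and is not yet visible at this value of \(h\), where \(\log N\approx 6.9\).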

Let \(\mathcal{D }\subset \mathbb{R }^d\) denote a connected region. Let \(\mathbb{D }(\mathcal{D },N,K)\) denote a discretized version of \(\mathcal{D }\) composed of mesoscopic boxes,

$$\begin{aligned} \mathbb{D }(\mathcal{D },N,K)=\bigcup _{i\in I} \mathbb{B }_K(i), \quad \quad I=\left\{ x\in \mathbb{Z }^d:\frac{K}{N}x+\left[-\frac{K}{2N},\frac{K}{2N}\right]^d\subset \mathcal{D }\right\} . \end{aligned}$$
(3.2)

We have used \(h>0\) to define a set \(\Lambda =\mathbb{D }(\mathcal{D },N,K)\) with size \(\mathrm{{O}}(N)=\mathrm{{O}}(h^{-1})\) on which we wish to study the Ising model \(\mu ^{J,\zeta ,h}_\Lambda \). Given \(\Lambda \), we will also want to consider the Ising measure \(\mu ^{J,\zeta ,0}_\Lambda \). From now on, \(h\) will always determine the scale \(N\) but will not always indicate the strength of the magnetic field. The coarse graining implies that under \(\mu ^{J,\mathrm{f},0}_\Lambda \), \(\sigma (\mathbb{B }_K(i))\) is close to either \(\pm m^*\) with high probability. This motivates the definition below of the magnetization profile \(\mathbb{M }^\zeta _K:\mathbb{R }^d\rightarrow \mathbb{R }\) associated with the Ising configuration \(\sigma \).

Let \(\Lambda \) denote an arbitrary finite subset of \(\mathbb{Z }^d\). Let \(\zeta \in \Sigma \) denote a boundary condition. Recall the notation (2.2) and that under \(\mu ^{J,\zeta ,h}_\Lambda \), \(\sigma (x)=\zeta (x)\) for \(x\not \in \Lambda \).

For \(\sigma \in \Sigma _\Lambda ^\zeta \), define the profile \(\mathbb{M }_K^\zeta :\mathbb{R }^d\rightarrow \mathbb{R }\) as follows. For \({\mathbf{{x}}}\in \mathbb{R }^d\), choose \(i\) such that \(N{\mathbf{{x}}}\in [-K/2,K/2)^d+K i\). Let

$$\begin{aligned} \mathbb{M }_K^\zeta ({\mathbf{{x}}})= \frac{1}{2}\times {\left\{ \begin{array}{ll} 1+\sigma (\mathbb{B }_K(i))/m^*,&\mathbb{B }_K(i)\subset \Lambda ,\\ 1+\sigma (\mathbb{B }_K(i)),&\text{ otherwise.}\\ \end{array}\right.} \end{aligned}$$
(3.3)

To understand this definition, suppose that \(\Lambda \) takes the form \(\mathbb{D }(\mathcal{D },N,K)\) for some \(\mathcal{D }\subset \mathbb{R }^d\), and suppose that \(\zeta \) is constant on mesoscopic boxes. For \(x\in \mathcal{D }\), \(\mathbb{M }_K^\zeta (x)\approx 1\) indicates plus phase, \(\mathbb{M }_K^\zeta (x)\approx 0\) indicates minus phase. Outside \(\mathcal{D }\), the values 1 and 0 correspond to \(\zeta =+\) and \(\zeta =-\) boundary conditions, respectively. The definition of the profile outside \(\mathcal{D }\) is important when we integrate over the profile, particularly in Sect. 4.5.

The idea behind \(L_1\)-theory is that \(\mathbb{M }_K^\zeta \) can be approximated by the class of bounded variation profiles (Sect. 3.3). The large deviations of \(\mathbb{M }_K^\zeta \) can be described in terms of surface tension.

3.2 Surface tension in a parallelepiped

In statistical physics, surface tension is the excess free energy per unit area due to the presence of an interface. The definitions of surface tension in [22] and [25] differ by a factor of \(\beta \); we have chosen to follow [25].

Let \(({\mathbf{{n}}},{\mathbf{{u}}}_2,\dots ,{\mathbf{{u}}}_d)\) denote an orthonormal basis for \(\mathbb{R }^d\) and let \(\mathcal{R }\) denote the rectangular parallelepiped

$$\begin{aligned} \mathcal{R }_{L,H}({\mathbf{{n}}},{\mathbf{{u}}}_2,\ldots ,{\mathbf{{u}}}_d):=\left\{ t_1 {\mathbf{{n}}}+ \sum _{k = 2}^d t_k{\mathbf{{u}}}_k: \mathbf{t}\in \left[-\frac{H}{2},\frac{H}{2}\right]\times \left[-\frac{L}{2},\frac{L}{2}\right]^{d-1}\right\} . \end{aligned}$$

\(\mathcal{R }\) is centered at the origin, has height \(H\) in the direction \({\mathbf{{n}}}\) and extension \(L\) in the other directions.

Let \(\Lambda =\mathbb{D }(\mathcal{R },N,1)\) denote a discrete version of \(\mathcal{R }\) (3.2). The box \(\Lambda \) has sides of length \(NL\) in the directions \({\mathbf{{u}}}_2, \ldots , {\mathbf{{u}}}_d\). The surface tension can be written in terms of either the Ising model partition function (2.1) or the random-cluster representation.

Definition 3.2.1

Let \(\zeta \) denote the configuration in \(\Sigma \) given by \(\zeta (y)=+1\) if \(y\cdot {\mathbf{{n}}}\ge 0\) and \(\zeta (y)=-1\) otherwise. The surface tension \(\tau ^{J}_\Lambda \) is defined by

$$\begin{aligned} \tau ^{J}_\Lambda = \frac{1}{(NL)^{d-1}} \log \frac{Z^{J,+,0}_\Lambda }{Z^{J,\zeta ,0}_\Lambda } =\frac{1}{(NL)^{d-1}} \log \frac{1}{\phi ^{J,+,0}_\Lambda (\mathrm{D}^\zeta _\Lambda )}. \end{aligned}$$

Let \(J\sim \mathbb{Q }\). Surface tension converges in probability as \(N\rightarrow \infty \) [25, Theorem 1.3]:

Proposition 3.2.2

For \(\beta > 0\) and \({\mathbf{{n}}}\in \mathcal{S }^{d-1}\), there exists \(\tau ({\mathbf{{n}}})\ge 0\), the surface tension perpendicular to \({\mathbf{{n}}}\), such that for all parallelepipeds \(\mathcal{R }=\mathcal{R }_{L,H}({\mathbf{{n}}},{\mathbf{{u}}}_2,\dots ,{\mathbf{{u}}}_d)\),

$$\begin{aligned} \tau ^J_\Lambda \xrightarrow {\mathbb{Q }\text{-probability}}\,\tau ({\mathbf{{n}}})\quad \text{ as}\, N\rightarrow \infty . \end{aligned}$$
(3.4)

Surface tension is strictly positive at temperatures below the threshold for slab percolation [25, Proposition 2.11]:

Proposition 3.2.3

There are constants \(C,c>0\) such that for \({\mathbf{{n}}}\in \mathcal{S }^{d-1}\) and \(\beta > \hat{\beta }_\mathrm{c}\), \(c \tau (\mathbf{{e}}_1) \le \tau ({\mathbf{{n}}})\le C\tau (\mathbf{{e}}_1)\le C \beta \).

We note for completeness that we are discussing quenched surface tension. Annealed surface tension, which will not be used in this paper, describes the cost of phase coexistence under the averaged measure \(\mathbb{Q }\mu ^{J,\zeta ,h}\). When studying large deviations under the averaged measure, the environment \(J\) changes to reduce the surface tension. Although we are interested in the large deviations of \(J\), we prefer to control them ‘by hand’ using the \(\mathbb{Q }_\theta \) notation defined in (4.5). This is less efficient in terms of the size of the large deviation needed. However, it is much simpler.

3.3 Surface tension in cones

\(L_1\)-theory describes the Ising model at equilibrium [8, 9, 25]. We will restrict our attention to certain subsets of \(\mathbb{R }^d\) with zero surface tension at the boundary. At the microscopic scale, this corresponds to sampling \(J\sim \mathbb{Q }\) conditioned on the existence of a surface of edges with \(J(e)=0\).

For \(\theta \in (0,2\pi ]\) define a linear cone \(\mathcal{A }_\theta \),

$$\begin{aligned} \mathcal{A }_\theta :=\{{\mathbf{{x}}}\in \mathbb{R }^d: x_1 \ge \Vert {\mathbf{{x}}}\Vert _2 \cos (\theta /2)\}. \end{aligned}$$

For \(\theta =2\pi \), \(\mathcal{A }_\theta \) is simply the whole of \(\mathbb{R }^d\). Let \(\mathcal{L }^d\) denote the Lebesgue measure on \(\mathcal{A }_\theta \) and let \(\mathcal{V }(u,\varepsilon )\) denote the \(L_1\)-ball (with respect to \(\mathcal{L }^d\)) of radius \(\varepsilon \) about \(u:\mathcal{A }_\theta \rightarrow \mathbb{R }\).

The perimeter \(\mathcal{P }(U)\) of a Borel set \(U \subset \mathcal{A }_\theta \) can be written in terms of functions of bounded variation (see [2, Chapter 3] and [25, Section 3.1]). The set of bounded variation profiles \(\mathrm{BV}\) is taken to be

$$\begin{aligned} \mathrm{BV}:= \left\{ U: U \subset \mathcal{A }_\theta \,\text{ is} \text{ a} \text{ Borel} \text{ set} \text{ and}\,\mathcal{P }(U) <\infty \right\} \!. \end{aligned}$$

We call the \(U\in \mathrm{BV}\) profiles because their indicator functions approximate the magnetization profile \(\mathbb{M }_K^\zeta \).

Bounded variation profiles \(U \in \mathrm{BV}\) have a reduced boundary \(\partial ^\star U\) and an outer normal \({\mathbf{{n}}}^U:\partial ^\star U \rightarrow \mathcal{S }^{d-1}\). Let \(\mathcal{H }^{d-1}\) denote the \(d-1\) dimensional Hausdorff measure on \(\mathcal{A }_\theta \); \(\mathcal{H }^{d-1} (\partial ^\star U) =\mathcal{P }(U)\). We will write \(\partial U\) to refer to the reduced boundary of \(U\) excluding (when \(\theta <2\pi \)) the boundary of \(\mathcal{A }_\theta \),

$$\begin{aligned} \partial U := \partial ^\star U\setminus \partial ^\star \mathcal{A }_\theta . \end{aligned}$$
(3.5)

The outer normal \({\mathbf{{n}}}^U\) defined on \(\partial U\) is Borel measurable. With reference to (3.4), this allows us to define the surface tension and energy of bounded variation profiles for the dilute Ising model. Define the surface tension \(\mathcal{F }\) by

$$\begin{aligned} \mathcal{F }(U) = \int _{\partial U} \tau ({\mathbf{{n}}}^U({\mathbf{{x}}})) d\mathcal{H }^{d-1} ({\mathbf{{x}}}) ,\quad U \in \mathrm{BV}. \end{aligned}$$
(3.6)

Define the energy \(\mathcal{E }\) by

$$\begin{aligned} \mathcal{E }(U)=\mathcal{F }(U) - \beta m^* \mathcal{L }^d(U), \quad \quad U \in \mathrm{BV}. \end{aligned}$$
(3.7)

The motivation for these quantities is that they measure, in the following sense, the cost of phase coexistence associated with the Ising model. Sample \(J\) from \(\mathbb{Q }\) conditional on \(J(e)=0\) for all \(e=\{x,y\}\) such that \(x\) but not \(y\) is in \(\mathbb{D }(\mathcal{A }_\theta ,N,K)\) (3.2). This makes the surface tension on \(\partial ^\star \mathcal{A }_\theta \) zero. Let \(\mathcal{D }\) denote a compact subset of \(\mathcal{A }_\theta \). For a profile of bounded variation \(U\subset \mathcal{D }\), the surface of \(\mathbb{D }(U,N,K)\) has size \(\mathrm{{O}}(1/h^{d-1})\) and \(\mathbb{D }(U,N,K)\) has volume \(\mathrm{{O}}(1/h^d)\) (3.1). Heuristically, we expect the probability of seeing the plus phase in \(\mathbb{D }(U,N,K)\) and the minus phase in \(\mathbb{D }(\mathcal{D }\setminus U,N,K)\) to be approximately

$$\begin{aligned} \exp \left(-\frac{\mathcal{F }(U)}{h^{d-1}} \right)&\quad \text{ under}\quad \mu ^{J,-,0}_{\mathbb{D }(\mathcal{D },N,K)}, \quad \text{ and}\\ \exp \left(\frac{1}{h^{d-1}} \left[\inf _{U^{\prime }} \mathcal{E }(U^{\prime })-\mathcal{E }(U) \right]\right)&\quad \text{ under}\quad \mu ^{J,-,h}_{\mathbb{D }(\mathcal{D },N,K)}. \end{aligned}$$

The infimum is over profiles \(U^{\prime }\in \mathrm{BV}\) compatible with the boundary conditions. For general \(\mathcal{D }\), it is difficult to evaluate the infimum—the conflicting contributions of the positive field and negative boundary conditions may lead to a complicated equilibrium magnetization profile under \(\mu ^{J,-,h}_{\mathbb{D }(\mathcal{D },N,K)}\). The problem is simpler if \(\mathcal{D }\) is the Wulff shape.

4 The Wulff shape in \(\mathcal{A }_\theta \)

4.1 Wulff, Winterbottom and Summertop shapes

Let \(U\subset \mathcal{A }_\theta \) denote a set of bounded variation. Consider the problem of minimizing \(\mathcal{F }(U)\) given that \(U\) has volume \(b^d\).

Proposition 4.1.1

Let \(\theta \in (0,\pi ]\cup \{2\pi \}\). The problem of finding \(U\in \mathrm{BV}\) with volume \(b^d\) and minimal surface tension has a unique solution when \(\theta <\pi \); for \(\theta =\pi \) and \(\theta =2\pi \) the solution is unique up to translations. There is a scaling constant \(w_\theta \) such that the solution is the convex shape

$$\begin{aligned} \mathcal{W }_\theta (b)=w_\theta b \{{\mathbf{{x}}}\in \mathcal{A }_\theta : \forall {\mathbf{{n}}}\in \mathcal{S }^{d-1},\ {\mathbf{{x}}}\cdot {\mathbf{{n}}} \le \tau ({\mathbf{{n}}})\}. \end{aligned}$$
(4.1)

If we omit the \(b\), take \(b=w_\theta ^{-1}\) so that \(\mathcal{W }_\theta \equiv \mathcal{W }_\theta (w_\theta ^{-1})\).

Special cases of \(\mathcal{W }_\theta \) are known by a variety of names. When \(\theta =2\pi \), \(\mathcal{W }_\theta \) is the Wulff shape. When \(\theta =\pi \), \(\mathcal{W }_\theta \) is the Winterbottom shape. When \(\theta \in (0,\pi )\) and \(d=2\), \(\mathcal{W }_\theta \) is the Summertop shape [27]. We will refer to \(\mathcal{W }_\theta \) as the Wulff shape in \(\mathcal{A }_\theta \).

Proof of Proposition 4.1.1

Let \(U\) denote a compact subset of \(\mathcal{A }_\theta \). With reference to [17, (G)], the formula (3.6) that defines the surface tension \(\mathcal{F }(U)\) is equivalent to

$$\begin{aligned} \mathcal{F }(U)=\lim _{\varepsilon \rightarrow 0} \frac{|(U+\varepsilon \mathcal{W }_{2\pi })\cap \mathcal{A }_\theta | - |U|}{\varepsilon }. \end{aligned}$$

We can use (4.1) to give a lower bound on the cost of phase coexistence: let

$$\begin{aligned} \underline{\mathcal{F }}(U)=\lim _{\varepsilon \rightarrow 0} \frac{|U+\varepsilon \mathcal{W }_\theta |-|U|}{\varepsilon }; \end{aligned}$$

\(\underline{\mathcal{F }}\) takes the cost of phase coexistence orthogonal to \({\mathbf{{n}}}\) to be \(\tau ({\mathbf{{n}}})\times 1_{\{{\mathbf{{n}}}\in \mathcal{A }_\theta \}}\). Observe that:

  1. (i)

    For any \({\mathbf{{x}}}\in \mathcal{A }_\theta \) and \(\varepsilon >0\),

    $$\begin{aligned} {\mathbf{{x}}}+\varepsilon \mathcal{W }_\theta \subset ({\mathbf{{x}}}+\varepsilon \mathcal{W }_{2\pi }) \cap \mathcal{A }_\theta . \end{aligned}$$

    Thus for \(U\in \mathrm{BV}\), \(\underline{\mathcal{F }}(U)\le \mathcal{F }(U)\).

  2. (ii)

    By the Brunn–Minkowski theorem, \(\mathcal{W }_\theta (b)\) is the unique shape in \(\mathcal{A }_\theta \) (up to translations) with volume \(b^d\) and minimal \(\underline{\mathcal{F }}\)-surface cost.

  3. (iii)

    By the convexity of \(\mathcal{W }_{2\pi }\), when \(U=\mathcal{W }_\theta \),

    $$\begin{aligned} (U+\varepsilon \mathcal{W }_{2\pi })\cap \mathcal{A }_\theta = U+\varepsilon \mathcal{W }_\theta \end{aligned}$$

    and so \(\underline{\mathcal{F }}(\mathcal{W }_\theta (b))=\mathcal{F }(\mathcal{W }_\theta (b))\).

Therefore any shape with volume \(b^d\) in \(\mathcal{A }_\theta \) with minimal \(\mathcal{F }\)-surface tension must take the shape \(\mathcal{W }_\theta (b)\). The claim of uniqueness when \(\theta <\pi \) follows from the fact that for any \({\mathbf{{x}}}\in \mathcal{A }_\theta \setminus \{0\}\), \(\mathcal{F }({\mathbf{{x}}}+\mathcal{W }_\theta (b))>\mathcal{F }(\mathcal{W }_\theta (b))\). \(\square \)

4.2 Critical droplets

Let \(\mathsf{{E}}^\theta (b)\) account for the cost of filling \(\mathcal{W }_\theta (b)\) with the plus phase,

$$\begin{aligned} \mathsf{{E}}^\theta (b):=\mathcal{E }(\mathcal{W }_\theta (b))= b^{d-1}\mathcal{F }(\mathcal{W }_\theta (1))- b^d \beta m^*. \end{aligned}$$
(4.2)

The positive term represents the cost of phase coexistence; the negative term represents the benefit of conforming to the magnetic field. Let \(B_\mathrm{c}^\theta \) denote the maximizer of \(\mathsf{{E}}^\theta \), and let \(B_\mathrm{root}^\theta \) denote the positive root of \(\mathsf{{E}}^\theta \),

$$\begin{aligned} B_\mathrm{c}^\theta =\frac{d-1}{d} \frac{\mathcal{F }(\mathcal{W }_\theta (1))}{\beta m^*}, \quad \quad B_\mathrm{root}^\theta =\frac{\mathcal{F }(\mathcal{W }_\theta (1))}{\beta m^*}. \end{aligned}$$
(4.3)

The significance of \(B_\mathrm{root}^\theta \) is that the Ising measure with minus boundary conditions and magnetic field \(h\) on \(\mathbb{D }(\mathcal{W }_\theta (b),N,K)\) favors the minus phase if \(b<B_\mathrm{root}^\theta \), whereas it favors the plus phase if \(b>B_\mathrm{root}^\theta \). Let

$$\begin{aligned} \mathsf{{E}}^\theta _\mathrm{c}:=\mathsf{{E}}^\theta (B_\mathrm{c}^\theta )=\left(\frac{\mathcal{F }(\mathcal{W }_\theta (1))}{d}\right)^d \left(\frac{d-1}{\beta m^*}\right)^{d-1}. \end{aligned}$$
(4.4)

The maximum \(\mathsf{{E}}^\theta _\mathrm{c}\) of \(\mathsf{{E}}^\theta \) characterizes the energy barrier to creating arbitrarily large plus droplets in the cone \(\mathcal{A }_\theta \), starting from the minus phase.
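
For completeness, the formulas (4.3) and (4.4) can be checked directly from (4.2). Writing \(F_1:=\mathcal{F }(\mathcal{W }_\theta (1))\), a shorthand introduced only for this remark, and restricting to \(b>0\),

$$\begin{aligned} \frac{\mathrm{d}}{\mathrm{d}b}\mathsf{{E}}^\theta (b)&=(d-1)b^{d-2}F_1-d\,b^{d-1}\beta m^*=0 \iff b=\frac{d-1}{d}\frac{F_1}{\beta m^*}=B_\mathrm{c}^\theta ,\\ \mathsf{{E}}^\theta (b)&=b^{d-1}\left(F_1-b\beta m^*\right)=0 \iff b=\frac{F_1}{\beta m^*}=B_\mathrm{root}^\theta ,\\ \mathsf{{E}}^\theta (B_\mathrm{c}^\theta )&=\left(B_\mathrm{c}^\theta \right)^{d-1}\left(F_1-\frac{d-1}{d}F_1\right) =\left(\frac{d-1}{d}\frac{F_1}{\beta m^*}\right)^{d-1}\frac{F_1}{d} =\left(\frac{F_1}{d}\right)^d\left(\frac{d-1}{\beta m^*}\right)^{d-1}. \end{aligned}$$

Since \(\mathsf{{E}}^\theta \) is positive on \((0,B_\mathrm{root}^\theta )\) and negative beyond \(B_\mathrm{root}^\theta \), the stationary point is indeed the maximizer, and \(B_\mathrm{c}^\theta =\frac{d-1}{d}B_\mathrm{root}^\theta <B_\mathrm{root}^\theta \).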

As the \(J(e)\) are independent we can find regions that resemble, due to high local dilution, \(\mathbb{D }(\mathcal{W }_\theta (b),N,K)\) for any \(\theta \in (0,\pi )\) and any \(b>0\). However, to maximize the number of catalysts we do not want to take \(b\) any larger than we have to.

Proposition 4.2.1

The diameter of the critical droplet is bounded uniformly over \(\beta >\hat{\beta }_\mathrm{c}\) and \(\theta \in (0,\pi )\cup \{2\pi \}\). As \(\theta \rightarrow 0, \mathsf{{E}}^\theta _\mathrm{c}=\mathrm{{O}}(\beta \theta ^{d-1})\).

Proof

For \(U\in \mathrm{BV}, \mathcal{F }(U)\) has order \(\beta \mathcal{H }^{d-1}(\partial U)\) by Proposition 3.2.3. Choose \(u_\theta \) such that the cone

$$\begin{aligned} \mathcal{U }_\theta := u_\theta \{{\mathbf{{x}}}\in \mathcal{A }_\theta :x_1\le 1\} \end{aligned}$$

has unit volume. As \(\theta \rightarrow 0\), both \(\mathcal{U }_\theta \) and \(\mathcal{W }_\theta (1)\) have length of order \(\theta ^{-(d-1)/d}\) in the \(x_1\)-direction. By the optimality of the Wulff shape, \(\mathcal{F }(\mathcal{W }_\theta (1))\le \mathcal{F }(\mathcal{U }_\theta )\) which has order \(\beta (\theta u_\theta )^{d-1}\). Substitute this approximation into (4.3) and (4.4). \(\square \)
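
Spelled out, and treating \(d\) and \(m^*\) as constants (an assumption made only for this remark), the substitution reads as follows. Since \(u_\theta \) has order \(\theta ^{-(d-1)/d}\), the product \(\theta u_\theta \) has order \(\theta ^{1/d}\), and therefore

$$\begin{aligned} \mathcal{F }(\mathcal{W }_\theta (1))&=\mathrm{{O}}\left(\beta (\theta u_\theta )^{d-1}\right)=\mathrm{{O}}\left(\beta \theta ^{(d-1)/d}\right),\\ \mathsf{{E}}^\theta _\mathrm{c}&=\left(\frac{\mathcal{F }(\mathcal{W }_\theta (1))}{d}\right)^d \left(\frac{d-1}{\beta m^*}\right)^{d-1} =\mathrm{{O}}\left(\frac{\beta ^d\theta ^{d-1}}{\beta ^{d-1}}\right)=\mathrm{{O}}\left(\beta \theta ^{d-1}\right). \end{aligned}$$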

4.3 Notation for Wulff shapes

Consider the discrete analogue \(\mathbb{A }_\theta \) of \(\mathcal{A }_\theta \) at microscopic scale \(N\) and mesoscopic scale \(K\) (3.2),

$$\begin{aligned} \mathbb{A }_\theta :=\mathbb{D }(\mathcal{A }_\theta ,N,K), \end{aligned}$$

and the discrete analogues of the Wulff shape,

$$\begin{aligned} \mathbb{W }_\theta (b):=\mathbb{D }(\mathcal{W }_\theta (b),N,K), \quad \quad b\ge 0. \end{aligned}$$

For \(\theta \not =2\pi \), the edge boundary of \(\mathbb{A }_\theta \) is infinite, thus the probability of finding a pattern of dilution that carves out a translate of \(\mathbb{A }_\theta \) anywhere in \(\mathbb{Z }^d\) is zero. Instead let \(B_\mathrm{max}^\theta >0\) denote a fixed, but as yet unknown, quantity. We will limit our attention to the region \(\mathbb{W }_\theta (B_\mathrm{max}^\theta )\). To impose free boundary conditions on the portion of the boundary of \(\mathbb{W }_\theta (B_\mathrm{max}^\theta )\) corresponding to \(\partial ^\star \mathcal{A }_\theta \), let \(\mathbb{Q }_\theta \) denote the dilution measure \(\mathbb{Q }\) conditioned appropriately,

$$\begin{aligned} \mathsf{{Catalyst}}(\theta )&:= \{J:\text{ all} \text{ edges} \text{ connecting} \,\mathbb{W }_\theta (B_\mathrm{max}^\theta ) \text{ to} \,\partial \mathbb{A }_\theta \text{ are} \text{ closed}\}\nonumber \\ \text{ and}\,\mathbb{Q }_\theta&:= \mathbb{Q }[ \,\,\cdot \,\mid \mathsf{{Catalyst}}(\theta ) ]. \end{aligned}$$
(4.5)

The probability of \(\mathsf{{Catalyst}}(\theta )\) is simply \((1-p)\) raised to the power of the number of edges connecting \(\mathbb{W }_\theta (B_\mathrm{max}^\theta )\subset \mathbb{A }_\theta \) to the external vertex boundary \(\partial \mathbb{A }_\theta \) of \(\mathbb{A }_\theta \). \(\mathcal{W }_\theta (b)\) has size order \(b\theta ^{-(d-1)/d}\) along the \(x_1\)-direction and order \(b\theta ^{1/d}\) in the directions \(x_2,\dots ,x_d\). The number of edges relevant to \(\mathsf{{Catalyst}}(\theta )\) is therefore order \((B_\mathrm{max}^\theta \theta ^{-(d-1)/d})\times (B_\mathrm{max}^\theta \theta ^{1/d})^{d-2}\) so

$$\begin{aligned} \mathbb{Q }[\mathsf{{Catalyst}}(\theta )]= \exp \left(\log (1-p) \left(B_\mathrm{max}^\theta \right)^{d-1}\mathrm{{O}}(\theta ^{-1/d}) / h^{d-1}\right)\!. \end{aligned}$$
(4.6)
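
The exponent of \(\theta \) in (4.6) comes from the following count, recorded here as a short check. The factor \(h^{-(d-1)}\) arises because, by (3.1), lengths in the discretization are multiplied by \(N\approx h^{-1}\).

$$\begin{aligned} \left(B_\mathrm{max}^\theta \theta ^{-(d-1)/d}\right)\left(B_\mathrm{max}^\theta \theta ^{1/d}\right)^{d-2} =\left(B_\mathrm{max}^\theta \right)^{d-1}\theta ^{\frac{-(d-1)+(d-2)}{d}} =\left(B_\mathrm{max}^\theta \right)^{d-1}\theta ^{-1/d}. \end{aligned}$$

For example, when \(d=2\) the count reduces to \(B_\mathrm{max}^\theta \theta ^{-1/2}\), the order of the length of the two straight sides of the cone.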

As well as Wulff shaped regions, we also need to consider Wulff ‘annuli’: the difference between two Wulff shapes. With \(0\le b_1\le b_2\le B_\mathrm{max}^\theta \), let

$$\begin{aligned} \mathcal{W }_\theta (b_1,b_2):= \mathcal{W }_\theta (b_2)\setminus \mathcal{W }_\theta (b_1) \quad \text{ and}\quad \mathbb{W }_\theta (b_1,b_2):= \mathbb{W }_\theta (b_2)\setminus \mathbb{W }_\theta (b_1). \end{aligned}$$

When \(\theta =2\pi \), \(\mathcal{W }_\theta (b_1,b_2)\) is annular; otherwise it is simply-connected. There can be three parts to the boundary of \(\mathcal{W }_\theta (b_1,b_2)\):

  1. (i)

    the inner boundary \(\partial \,\mathcal{W }_\theta (b_1)\),

  2. (ii)

    the outer boundary \(\partial \,\mathcal{W }_\theta (b_2)\), and

  3. (iii)

    the free part of the boundary \(\partial ^\star \mathcal{A }_\theta \cap \partial ^\star \mathcal{W }_\theta (b_1,b_2)\).

If \(b_1=0\) then there is no inner boundary and \(\mathbb{W }_\theta (b_1,b_2)=\mathbb{W }_\theta (b_2)\). If \(\theta =2\pi \) then the free part of the boundary is empty.

Let \((+,-)\) denote an Ising configuration that is equal to \(+1\) on \(\mathbb{W }_\theta (b_1)\), and equal to \(-1\) on \(\mathbb{A }_\theta \setminus \mathbb{W }_\theta (b_2)\); see Fig. 1. We will show in Sect. 4.6 that for \(b\in [b_1,b_2]\), the probability that \(\mathbb{M }_K^{(+,-)}\) is close to \(\mathcal{W }_\theta (b)\) under \(\mu ^{J,(+,-),0}_\Lambda \) is approximately

$$\begin{aligned} \exp \left(\frac{\mathcal{F }(\mathcal{W }_\theta (b_1))-\mathcal{F }(\mathcal{W }_\theta (b))}{h^{d-1}} \right). \end{aligned}$$

As well as \((+,-)\) boundary conditions, we will also consider boundary conditions of \((-,+), (+,+)\) and \((-,-)\). We may simplify \((+,+)\) to + and \((-,-)\) to \(-\).

Fig. 1 From left to right: \(\mathcal{W }_\theta \) is the intersection of \(\mathcal{A }_\theta \) with \(\mathcal{W }_{2\pi }\); the set \(\mathbb{W }_\theta (b_1,b_2)\) with \((+,-), (-,+)\) and \((+,+)\) boundary conditions. The dotted lines indicate free boundary conditions

We will also need to consider some sets that only differ from \(\mathbb{W }_\theta (b)\) and \(\mathbb{W }_\theta (b_1,b_2)\) at the mesoscopic scale. For each \(b\ge 0\), \(\mathbb{W }_\theta (b)\) is composed of boxes \(\mathbb{B }_K(i)\subset \mathbb{A }_\theta \), each containing \(K^d\) vertices. Therefore as a function of \(b, |\mathbb{W }_\theta (b)|\) is non-decreasing and a step function (i.e. piece-wise constant and cadlag). Let \(\{x_1,x_2,\dots \}\) denote an ordering of \(\mathbb{A }_\theta \) such that taking

$$\begin{aligned} \Delta _\theta ^n:=\{x_1,\dots ,x_n\} \quad \quad (n\ge 0) \end{aligned}$$
(4.7)

we have \(\mathbb{W }_\theta (b)=\Delta _\theta ^{|\mathbb{W }_\theta (b)|}\) for \(b\ge 0\). Let \(\mathbb{W }_\theta ^{\prime }(\,\cdot \,)\) denote a second increasing family of subsets of \(\mathbb{A }_\theta \) such that

  1. (i)

    \(\mathbb{W }_\theta (b)=\mathbb{W }_\theta ^{\prime }(b)\) at the points \(b\) of discontinuity of \(|\mathbb{W }_\theta (b)|\),

  2. (ii)

    \(\mathbb{W }_\theta (b)\subset \mathbb{W }_\theta ^{\prime }(b)\),

  3. (iii)

    for each \(n\), for some \(b_\theta (n)\), \(\mathbb{W }_\theta ^{\prime }(b_\theta (n))=\Delta _\theta ^n\).

Let \(\mathbb{W }_\theta ^{\prime }(b_1,b_2)=\mathbb{W }_\theta ^{\prime }(b_2)\setminus \mathbb{W }_\theta (b_1)\).

4.4 High and uniformly high \(\mathbb{Q }_\theta \)-probability

We will say that an event occurs with high \(\mathbb{Q }_\theta \)-probability if under \(\mathbb{Q }_\theta \) it occurs with probability at least \(1-C\exp (-c/\sqrt{h})\). Note that by taking \(C=\exp (c/\sqrt{h_0})\), we only have to consider \(h\in (0,h_0)\) where \(h_0\) can be arbitrarily small.
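The remark about the constant \(C\) can be checked numerically. The following is a minimal sketch, assuming illustrative values of \(c\) and \(h_0\) that are not taken from the paper: with \(C=\exp (c/\sqrt{h_0})\) the lower bound \(1-C\exp (-c/\sqrt{h})\) is non-positive for every \(h\ge h_0\), so the definition only constrains \(h\in (0,h_0)\).

```python
import math

def high_probability_floor(h, c, h0):
    """Return 1 - C*exp(-c/sqrt(h)) for the choice C = exp(c/sqrt(h0))."""
    # Combine the two exponentials so that h == h0 gives exactly 0.
    return 1.0 - math.exp(c / math.sqrt(h0) - c / math.sqrt(h))

# Illustrative values only; c and h0 are assumptions, not values from the paper.
c, h0 = 1.0, 0.01
for h in (0.005, 0.01, 0.02, 0.5):
    print(f"h={h}: floor={high_probability_floor(h, c, h0):.4f}")
```

For \(h\ge h_0\) the printed floor is at most zero, so the probability bound is vacuous there.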

Given \(B_\mathrm{max}^\theta >0\), suppose that

$$\begin{aligned} 0\le b_1\le b\le b_2\le B_\mathrm{max}^\theta \quad \text{ and}\quad \Lambda =\mathbb{W }_\theta ^{\prime }(b_1,b_2). \end{aligned}$$
(4.8)

Given a class of events defined in terms of \(b,b_1\) and \(b_2\) we will say that they occur with uniformly high \(\mathbb{Q }_\theta \)-probability if each event occurs with probability at least \(1-C\exp (-c/\sqrt{h})\), uniformly over (4.8). Abusing this notation, we may place some additional restriction on \(b_1\) and \(b_2\); for example fixing \(b_1=0\). In that case interpret (4.8) with the additional restriction in place.

4.5 \(L_1\)-theory under \((\mathrm{w},-)\) boundary conditions

The \(L_1\)-theory developed in [25] describes the dilute Ising model in cubes \(\Lambda =\{1,\dots ,N\}^d\) under the measure \(\mu ^{J,-,0}_\Lambda \) with \(J\sim \mathbb{Q }\). The proofs in [25] are easily adapted to sets of the form \(\mathbb{W }_\theta ^{\prime }(b_1,b_2)\) with \(J\sim \mathbb{Q }_\theta \). Moreover, the methodology accommodates the \((\mathrm{w},-)\) boundary conditions described below. Considering \((\mathrm{w},-)\) boundary conditions allows us in Sect. 4.6 to measure the cost of phase coexistence between the inner and outer boundary under Dobrushin-style \((+,-)\) boundary conditions. The cost of phase coexistence under \((+,-)\) boundary conditions is then used in Sect. 5.1 to predict the effect of adding the magnetic field by calculating the trade-off between favoring the plus phase and the minus phase.

Consider \(\Lambda \) as in (4.8). Let \((\mathrm{w},-)\) denote wired boundary conditions on the inner boundary of \(\Lambda \), and minus boundary conditions on the outer boundary of \(\Lambda \). If \(b_1=0\), \((\mathrm{w},-)\) simply means minus boundary conditions.

This is equivalent to starting from \((+,-)\) or \((-,-)\) boundary conditions, and then replacing the inner boundary of \(\Lambda \) with a single Ising spin variable,

$$\begin{aligned} \mu ^{J,(\mathrm{w},-),0}_{\Lambda } (\{\sigma \}) =\frac{1}{\sum _{s=\pm 1}Z^{J,(s,-),0}_\Lambda } \cdot {\left\{ \begin{array}{ll} \exp \left( - \beta H_{\Lambda }^{J,(+,-),0}(\sigma )\right)\!,&\sigma \in \Sigma ^{(+,-)}_\Lambda ,\\ \exp \left( - \beta H_{\Lambda }^{J,(-,-),0}(\sigma )\right)\!,&\sigma \in \Sigma ^{(-,-)}_\Lambda .\\ \end{array}\right.} \end{aligned}$$

This is a measure on \(\Sigma ^{(+,-)}_\Lambda \cup \Sigma ^{(-,-)}_\Lambda \). Let \(\partial ^\mathrm{w}\Lambda \) denote the wired inner-boundary of \(\Lambda \). Let \(\sigma (\partial ^\mathrm{w}\Lambda )=\pm 1\) according to whether \(\sigma \in \Sigma ^{(+,-)}_\Lambda \) or \(\sigma \in \Sigma ^{(-,-)}_\Lambda \). Define \(\mathbb{M }_K^{(\mathrm{w},-)}\) to be either \(\mathbb{M }_K^{(+,-)}\) or \(\mathbb{M }_K^{(-,-)}\) according to \(\sigma (\partial ^\mathrm{w}\Lambda )\).
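To make the definition concrete, here is a toy sketch of the \((\mathrm{w},-)\) measure. It assumes a one-dimensional chain of \(n\) spins with unit couplings, zero field and the Hamiltonian \(-\sum \sigma (x)\sigma (y)\) over nearest neighbours; none of these choices come from the paper, and the chain is only a stand-in for \(\Lambda \). The function `compare_plus_and_wired` (a hypothetical name) compares the \((+,-)\) probability of an event with the \((\mathrm{w},-)\) probability divided by \(Z^{(+,-)}/\sum _s Z^{(s,-)}\), which is the comparison used later in (4.10).

```python
import math
from itertools import product

def weights(n, s, beta, J=1.0):
    """Unnormalised Boltzmann weights for n chain spins between an inner
    boundary spin s and an outer boundary spin -1 (zero field, toy model)."""
    out = {}
    for sigma in product((-1, 1), repeat=n):
        chain = (s,) + sigma + (-1,)
        energy = -J * sum(a * b for a, b in zip(chain, chain[1:]))
        out[sigma] = math.exp(-beta * energy)
    return out

def wired_minus(n, beta):
    """Return the weights, Z^{(s,-)} for s = +1, -1, and the (w,-) measure
    on pairs (s, sigma), where s is the single wired inner-boundary spin."""
    w = {s: weights(n, s, beta) for s in (+1, -1)}
    Z = {s: sum(w[s].values()) for s in w}
    total = Z[+1] + Z[-1]
    mu_w = {(s, sigma): w[s][sigma] / total for s in w for sigma in w[s]}
    return w, Z, mu_w

def compare_plus_and_wired(n, beta, m):
    """For A = {sum of spins >= m}, return mu^{(+,-)}(A) and
    mu^{(w,-)}(A) * (Z^{(+,-)} + Z^{(-,-)}) / Z^{(+,-)}."""
    w, Z, mu_w = wired_minus(n, beta)
    mu_plus = sum(v for sigma, v in w[+1].items() if sum(sigma) >= m) / Z[+1]
    mu_wired = sum(v for (s, sigma), v in mu_w.items() if sum(sigma) >= m)
    return mu_plus, mu_wired * (Z[+1] + Z[-1]) / Z[+1]

lhs, rhs = compare_plus_and_wired(n=6, beta=0.8, m=2)
print(lhs <= rhs)
```

The comparison holds because the \((\mathrm{w},-)\) probability of the event includes the contribution from \(\sigma (\partial ^\mathrm{w}\Lambda )=+1\), which equals the \((+,-)\) probability times \(Z^{(+,-)}/\sum _s Z^{(s,-)}\).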

The \((\mathrm{w},-)\) measure with \(h=0\) has a natural random-cluster representation with wired-inner and wired-outer boundary conditions. The corresponding coupled measure is

$$\begin{aligned} \varphi ^{J,(\mathrm{w},-),0}_\Lambda (\{(\sigma ,\omega )\})=\frac{\sum _{s=\pm 1} Z^{J,(s,-),0}_\Lambda \varphi _\Lambda ^{J,(s,-),0}(\{\sigma ,\omega \})}{\sum _{s=\pm 1}Z^{J,(s,-),0}_\Lambda }. \end{aligned}$$
(4.9)

Note that only the \(s=\sigma (\partial ^\mathrm{w}\Lambda )\) term in the numerator of (4.9) is positive.

Let \(\mathrm{D}^{(\mathrm{w},-)}_\Lambda \) refer to the event that the inner boundary \(\partial ^\mathrm{w}\Lambda \) and the outer boundary \(\partial ^-\Lambda \) are not connected in the random-cluster representation. For \(\varepsilon >0\), let \(\mathrm{D}_{b,\varepsilon }\) denote the event that the inner-boundary takes the plus spin and that the phase profile is in the \(\mathcal{L }^d\)-ball of radius \(\varepsilon \) around \(\mathcal{W }_\theta (b)\),

$$\begin{aligned} \mathrm{D}_{b,\varepsilon }:=\mathrm{D}^{(\mathrm{w},-)}_\Lambda \cap \{\sigma (\partial ^\mathrm{w}\Lambda )=1\}\cap \left\{ \mathbb{M }_K^{(\mathrm{w},-)} \in \mathcal{V }(\mathcal{W }_\theta (b), \varepsilon )\right\} . \end{aligned}$$

Parts (i) and (ii) below follow from the proofs in [25] of Proposition 3.10 and Theorem 1.11 respectively.

Proposition 4.5.1

Let \(b\in [b_1,b_2]\) and \(\varepsilon _\mathrm{wm}>0\). With high \(\mathbb{Q }_\theta \)-probability:

  1. (i)

    The probability of phase coexistence is bounded below,

    $$\begin{aligned} \varphi ^{J,(\mathrm{w},-),0}_\Lambda \left(\mathrm{D}_{b,\varepsilon _\mathrm{wm}} \right) \ge \exp \left(\frac{-\mathcal{F }(\mathcal{W }_\theta (b)) -\varepsilon _\mathrm{wm}}{h^{d-1}}\right). \end{aligned}$$
  2. (ii)

    The probability of phase coexistence is bounded above,

    $$\begin{aligned} \mu ^{J,(\mathrm{w},-),0}_\Lambda \left(\int \mathbb{M }_K^{(\mathrm{w},-)} \,\mathrm{d}\mathcal{L }^d \ge b^d \right) \le \exp \left(\frac{-\mathcal{F }(\mathcal{W }_\theta (b))+\varepsilon _\mathrm{wm}}{h^{d-1}}\right). \end{aligned}$$

4.6 Large deviations under \((+,-)\) boundary conditions

With \(b\) and \(\Lambda \) as in (4.8), we will give upper and lower bounds for the cost of phase coexistence under mixed boundary conditions in the absence of an external magnetic field.

Under \((+,-)\) boundary conditions, the Ising measure favors the minus phase because the minus boundary is bigger. Let \(\varepsilon _\mathrm{pm}>0\). Large deviations of the magnetization \(\mathbb{M }_K^{(+,-)}\) defined in (3.3) away from the minus phase are controlled as follows.

Proposition 4.6.1

(Upper bound) With high \(\mathbb{Q }_\theta \)-probability

$$\begin{aligned} \mu ^{J,(+,-),0}_\Lambda \left( \int \mathbb{M }_K^{(+,-)} \,\mathrm{d}\mathcal{L }^d \ge b^d \right) \le \exp \left(\frac{\mathcal{F }(\mathcal{W }_\theta (b_1))-\mathcal{F }(\mathcal{W }_\theta (b))+\varepsilon _\mathrm{pm}}{h^{d-1}}\right)\!. \end{aligned}$$

Proposition 4.6.2

(Lower bound) With high \(\mathbb{Q }_\theta \)-probability

$$\begin{aligned} \mu ^{J,(+,-),0}_\Lambda \left( \int \mathbb{M }_K^{(+,-)} \,\mathrm{d}\mathcal{L }^d \ge b^d \right) \ge \exp \left(\frac{\mathcal{F }(\mathcal{W }_\theta (b_1))-\mathcal{F }(\mathcal{W }_\theta (b))-\varepsilon _\mathrm{pm}}{h^{d-1}}\right)\!. \end{aligned}$$

Proof of Proposition 4.6.1

By the definition of the \((\mathrm{w},-)\) measure,

$$\begin{aligned}&\mu ^{J,(+,-),0}_\Lambda \left(\int \mathbb{M }_K^{(+,-)} \,\mathrm{d}\mathcal{L }^d \ge b^d\right)\nonumber \\&\quad \le \mu ^{J,(\mathrm{w},-),0}_\Lambda \left(\int \mathbb{M }_K^{(\mathrm{w},-)} \,\mathrm{d}\mathcal{L }^d \ge b^d \right)\cdot \left(\frac{Z^{J,(+,-),0}_\Lambda }{\sum _{s=\pm 1}Z^{J,(s,-),0}_\Lambda }\right)^{-1}. \end{aligned}$$
(4.10)

Proposition 4.5.1 part (ii) provides an upper bound on the first term on the right-hand side of (4.10),

$$\begin{aligned} \mu ^{J,(\mathrm{w},-),0}_\Lambda \left(\int \mathbb{M }_K^{(\mathrm{w},-)} \,\mathrm{d}\mathcal{L }^d \ge b^d \right)\le \exp \left(-\frac{\mathcal{F }(\mathcal{W }_\theta (b))-\varepsilon _\mathrm{wm}}{h^{d-1}}\right). \end{aligned}$$

For \(\omega \in \mathrm{D}^{(\mathrm{w},-)}_\Lambda \), the \(s=+1\) and \(s=-1\) terms in the numerator of the right-hand side of (4.9) are equal, so

$$\begin{aligned} \phi ^{J,(\mathrm{w},-),0}_\Lambda (\omega )=\frac{2Z^{J,(+,-),0}_\Lambda \phi ^{J,(+,-),0}_\Lambda (\omega )}{\sum _{s=\pm 1}Z^{J,(s,-),0}_\Lambda }. \end{aligned}$$

Summing over \(\omega \),

$$\begin{aligned} \phi ^{J,(\mathrm{w},-),0}_\Lambda (\mathrm{D}^{(\mathrm{w},-)}_\Lambda ) = \frac{2Z^{J,(+,-),0}_\Lambda }{\sum _{s=\pm 1}Z^{J,(s,-),0}_\Lambda }. \end{aligned}$$
(4.11)

By Proposition 4.5.1 part (i) with \(b=b_1\),

$$\begin{aligned} \phi ^{J,(\mathrm{w},-),0}_\Lambda (\mathrm{D}^{(\mathrm{w},-)}_\Lambda ) \ge \varphi ^{J,(\mathrm{w},-),0}_\Lambda (\mathrm{D}_{b_1,\varepsilon _\mathrm{wm}}) \ge \exp \left(-\frac{\mathcal{F }(\mathcal{W }_\theta (b_1))+\varepsilon _\mathrm{wm}}{h^{d-1}}\right). \end{aligned}$$
(4.12)

Combining inequalities (4.11) and (4.12) produces an upper bound for the second term on the right-hand side of (4.10). The proposition follows by taking \(\varepsilon _\mathrm{wm}=\varepsilon _\mathrm{pm}/3\). \(\square \)

Proof of Proposition 4.6.2

With \(\varepsilon _\mathrm{wm}>0\) let \(b^{\prime }=\root d \of {b^d+\varepsilon _\mathrm{wm}}\) so that \(\mathrm{D}_{b^{\prime },\varepsilon _\mathrm{wm}}\) implies \(\int \mathbb{M }_K^{(+,-)} \,\mathrm{d}\mathcal{L }^d \ge b^d\). By Eq. (4.11) and the definition of the \((\mathrm{w},-)\) measure,

$$\begin{aligned} \varphi ^{J,(+,-),0}_\Lambda \left(\mathrm{D}_{b^{\prime },\varepsilon _\mathrm{wm}} \right) = \frac{2 \varphi ^{J,(\mathrm{w},-),0}_\Lambda \left(\mathrm{D}_{b^{\prime },\varepsilon _\mathrm{wm}}\right)}{ \phi ^{J,(\mathrm{w},-),0}_\Lambda (\mathrm{D}^{(\mathrm{w},-)}_\Lambda )}. \end{aligned}$$

Proposition 4.5.1 part (i) gives a lower bound for the numerator on the right-hand side. Proposition 4.5.1 part (ii) with \(b=b_1\) gives an upper bound on the denominator on the right-hand side. Take \(\varepsilon _\mathrm{wm}\) sufficiently small. \(\square \)

Proposition 4.6.3

With reference to (4.8), for fixed \(\varepsilon _\mathrm{pm}\) both Proposition 4.6.1 and Proposition 4.6.2 hold with uniformly high \(\mathbb{Q }_\theta \)-probability.

Proof

Consider Proposition 4.6.1; the case of Proposition 4.6.2 follows similarly. Controlling uniformly the \(\mathbb{Q }_\theta \)-probability can be reduced to the control of a finite number of events. Let \(M(b_1,b_2,b)\) denote the increasing event \(\left\{ \sigma :\int \mathbb{M }_K^{(+,-)} \,\mathrm{d}\mathcal{L }^d \ge b^d\right\} \) with \(\mathbb{M }_K^{(+,-)}\) defined with respect to \(\mathbb{W }_\theta ^{\prime }(b_1,b_2)\).

The measure \(\mu ^{J,(+,-),h}_\Lambda \) is stochastically increasing with \(b_1\) and \(b_2\). Let \(C,\varepsilon >0\) and let

$$\begin{aligned} b_1^{\prime }=\varepsilon \lceil b_1/\varepsilon \rceil , \quad b_2^{\prime }=\varepsilon \lceil b_2/\varepsilon \rceil ,\quad b^{\prime }=\varepsilon \left\lfloor \varepsilon ^{-1}\root d \of {b^d-C\varepsilon (B_\mathrm{max}^\theta )^{d-1}} \ \right\rfloor . \end{aligned}$$

For triples \((b_1,b_2,b)\) satisfying (4.8), the triple \((b_1^{\prime },b_2^{\prime },b^{\prime })\) takes a finite number of values. With high \(\mathbb{Q }_\theta \)-probability,

$$\begin{aligned} \mu ^{J,(+,-),0}_{\mathbb{W }_\theta ^{\prime }(b_1^{\prime },b_2^{\prime })} \left( M(b_1^{\prime },b_2^{\prime },b^{\prime }) \right) \le \exp \left(\frac{\mathcal{F }(\mathcal{W }_\theta (b_1^{\prime }))-\mathcal{F }(\mathcal{W }_\theta (b^{\prime }))+\varepsilon _\mathrm{pm}/2}{h^{d-1}}\right). \end{aligned}$$

\(C\) can be chosen such that if \(\mathcal{F }(\mathcal{W }_\theta (b_1))-\mathcal{F }(\mathcal{W }_\theta (b))+\varepsilon _\mathrm{pm}<0\) and \(\varepsilon \) is sufficiently small then \(M(b_1,b_2,b) \implies M(b_1^{\prime },b_2^{\prime },b^{\prime })\) and

$$\begin{aligned} \mathcal{F }(\mathcal{W }_\theta (b_1^{\prime })) -\mathcal{F }(\mathcal{W }_\theta (b^{\prime })) +\varepsilon _\mathrm{pm}/2\le \mathcal{F }(\mathcal{W }_\theta (b_1))-\mathcal{F }(\mathcal{W }_\theta (b))+\varepsilon _\mathrm{pm}. \end{aligned}$$

\(\square \)
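The reduction to finitely many events in the proof above can be sketched as follows. This is an illustration under stated assumptions: the values of \(d\), \(C\), \(\varepsilon \) and \(B_\mathrm{max}^\theta \) are arbitrary, the function name `round_to_grid` is hypothetical, and clamping a negative radicand to zero is a choice made here rather than in the paper. Exact rationals are used for \(b_1\), \(b_2\) and \(\varepsilon \) so that the ceilings are not affected by floating-point error.

```python
from fractions import Fraction
from math import ceil, floor

def round_to_grid(b1, b2, b, eps, C, B_max, d):
    """Map (b1, b2, b) to the grid triple (b1', b2', b') from the proof.

    b1 and b2 are rounded up to multiples of eps. The volume b**d is reduced
    by C*eps*B_max**(d-1) before taking the d-th root and rounding down.
    Clamping a negative radicand to 0 is an assumption of this sketch.
    """
    b1p = eps * ceil(b1 / eps)
    b2p = eps * ceil(b2 / eps)
    radicand = max(float(b) ** d - float(C * eps) * float(B_max) ** (d - 1), 0.0)
    bp = eps * floor(radicand ** (1.0 / d) / float(eps))
    return b1p, b2p, bp

# Illustrative parameters (assumptions, not values from the paper).
d, B_max, C, eps = 2, Fraction(1), Fraction(1), Fraction(1, 10)
steps = [Fraction(k, 50) for k in range(51)]  # sample points in [0, B_max]
triples = {
    round_to_grid(b1, b2, b, eps, C, B_max, d)
    for b1 in steps for b in steps for b2 in steps
    if b1 <= b <= b2
}
print(len(triples), "distinct grid triples")
```

However finely \((b_1,b_2,b)\) is sampled, the number of distinct grid triples stays bounded in terms of \(B_\mathrm{max}^\theta /\varepsilon \), which is what allows a union bound over the corresponding events.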

5 Spatial and Markov chain mixing

We will describe the dilute Ising model at equilibrium under mixed boundary conditions and a magnetic field. These results will be used in the next section to show that sufficiently large plus droplets spread in a predictable way.

5.1 Stability of minimum energy profiles

Consider the Ising measure with a magnetic field \(h\) and \((+,-)\) boundary conditions. Recall the definitions of the energy functions \(\mathcal{E }\) (3.7) and \(\mathsf{{E}}^\theta \) (4.2) and see Fig. 2.

Fig. 2 The energy function \(\mathsf{{E}}^\theta (b)\) as used in Sect. 5.1 to describe \((+,-)\) boundary conditions. If \(\mathsf{{E}}^\theta (b_1)>\mathsf{E}^\theta (b_2)\) the plus phase is favored (left). If \(\mathsf{E}^\theta (b_1)<\mathsf{E}^\theta (b_2)\) the minus phase is favored (middle) unless one conditions on the existence of a critical droplet of plus phase (right)

With reference to (4.8), consider the case \(\mathsf{{E}}^\theta (b_1)>\mathsf{E}^\theta (b_2)\). The minimum value of \(\mathsf{E}^\theta (b)\) is attained when \(b=b_2\). Geometrically, this means that the profile \(U\) with \(\mathcal{W }_\theta (b_1)\subset U\subset \mathcal{W }_\theta (b_2)\) that minimizes \(\mathcal{E }(U)\) is \(\mathcal{W }_\theta (b_2)\). The minimizer is unique and stable.

Proposition 5.1.1

Let \(\varepsilon _\mathrm{stb}\in (0,1)\) such that \(b_1< b_2(1-\varepsilon _\mathrm{stb})\) and \(\mathsf{{E}}^\theta (b_1)>\mathsf{{E}}^\theta (b_2(1-\varepsilon _\mathrm{stb}))\). There is a constant \(c_\mathrm{stb}=c_\mathrm{stb}(\varepsilon _\mathrm{stb})>0\), independent of \(B_\mathrm{max}^\theta \), such that for profiles \(U\in \mathrm{BV}\),

$$\begin{aligned} \mathcal{L }^d(U)\in [b_1^d,b_2^d (1-\varepsilon _\mathrm{stb})^d] \implies \mathcal{E }(U) \ge \mathcal{E }(\mathcal{W }_\theta (b_2))+2b_2 c_\mathrm{stb}. \end{aligned}$$
(5.1)

Given \(\varepsilon _\mathrm{stb}\), with uniformly high \(\mathbb{Q }_\theta \)-probability,

$$\begin{aligned} \mu ^{J,(+,-),h}_\Lambda \left(\int \mathbb{M }_K^{(+,-)} \,\mathrm{d}\mathcal{L }^d \le b_2^d (1-\varepsilon _\mathrm{stb})^d \right)\le \exp \left(-\frac{b_2c_\mathrm{stb}}{h^{d-1}}\right). \end{aligned}$$
(5.2)

Proof

We will first show inequality (5.1) with

$$\begin{aligned} c_\mathrm{stb}:=\frac{(d-1) \mathcal{F }(\mathcal{W }_\theta (1)) \left(B_\mathrm{c}^\theta \right)^{d-2} \varepsilon ^2 }{ 4d^2 \root d \of {1-\varepsilon }} \quad \text{ where}\quad \varepsilon :=1-(1-\varepsilon _\mathrm{stb})^d. \end{aligned}$$

By the optimality of the Wulff shape, Proposition , we can assume that \(U=\mathcal{W }_\theta (b)\) for some \(b\). The unique maximum of \(\mathsf{{E}}^\theta \) occurs at \(B_\mathrm{c}^\theta \). As \(\mathsf{E}^\theta (b_1)>\mathsf{{E}}^\theta (b_2(1-\varepsilon _\mathrm{stb}))\) the function \(\mathsf{E}^\theta \) must be decreasing in the region \([b_2(1-\varepsilon _\mathrm{stb}),b_2]\); it is optimal to consider

$$\begin{aligned} b=b_2(1-\varepsilon _\mathrm{stb})\ge B_\mathrm{c}^\theta . \end{aligned}$$
(5.3)

With \(b\) as above,

$$\begin{aligned} \mathsf{{E}}^\theta (b)-\mathsf{{E}}^\theta (b_2)&= \mathcal{F }(\mathcal{W }_\theta (1))(b^{d-1}-b_2^{d-1}) - \beta m^*(b^d-b_2^d) \nonumber \\&= \mathcal{F }(\mathcal{W }_\theta (1))b_2^{d-1}((1-\varepsilon )^{(d-1)/d}-1) + \beta m^* \varepsilon b_2^d. \end{aligned}$$
(5.4)

By (4.3) and (5.3),

$$\begin{aligned} \beta m^*\varepsilon b_2^d \ge \beta m^* \varepsilon \frac{b_2^{d-1} B_\mathrm{c}^\theta }{\root d \of {1-\varepsilon }} = \mathcal{F }(\mathcal{W }_\theta (1)) b_2^{d-1} \frac{\varepsilon }{\root d \of {1-\varepsilon }} \frac{d-1}{d}. \end{aligned}$$
(5.5)

Let \(f(\varepsilon )=(1-\varepsilon )-\root d \of {1-\varepsilon }\). For \(\varepsilon \in (0,1)\), \(f^{\prime \prime \prime }(\varepsilon )>0\) so

$$\begin{aligned} f(\varepsilon )-f(0)-\varepsilon f^{\prime }(0) \ge f^{\prime \prime }(0) \varepsilon ^2/2,\quad \varepsilon \in [0,1], \end{aligned}$$

or more explicitly

$$\begin{aligned} (1-\varepsilon )-\root d \of {1-\varepsilon }+\varepsilon \frac{d-1}{d} \ge \varepsilon ^2\cdot \frac{d-1}{2d^2}, \quad \varepsilon \in [0,1]. \end{aligned}$$
(5.6)

Substituting (5.5) into (5.4) and then using (5.6) gives

$$\begin{aligned} \mathsf{{E}}^\theta (b)-\mathsf{{E}}^\theta (b_2)\ge \mathcal{F }(\mathcal{W }_\theta (1)) b_2^{d-1} \cdot \frac{ \varepsilon ^2 }{\root d \of {1-\varepsilon }} \cdot \frac{d-1}{2d^2} \ge 2b_2c_\mathrm{stb}\end{aligned}$$

as required.
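The chain of inequalities leading to (5.1) can be sanity-checked numerically. The sketch below assumes the form \(\mathsf{E}^\theta (b)=\mathcal{F }(\mathcal{W }_\theta (1))b^{d-1}-\beta m^*b^d\) implied by (5.4) and the value \(B_\mathrm{c}^\theta =(d-1)\mathcal{F }(\mathcal{W }_\theta (1))/(d\beta m^*)\) implied by the equality in (5.5). The parameter names `F1` and `beta_m` and all numerical values are illustrative assumptions.

```python
def energy(b, F1, beta_m, d):
    """E^theta(b) = F1*b**(d-1) - beta_m*b**d, the form used in (5.4)."""
    return F1 * b ** (d - 1) - beta_m * b ** d

def critical_size(F1, beta_m, d):
    """B_c = (d-1)*F1/(d*beta_m), read off from the equality in (5.5)."""
    return (d - 1) * F1 / (d * beta_m)

def c_stb(eps_stb, F1, beta_m, d):
    """The constant c_stb defined at the start of the proof."""
    eps = 1.0 - (1.0 - eps_stb) ** d
    B_c = critical_size(F1, beta_m, d)
    return ((d - 1) * F1 * B_c ** (d - 2) * eps ** 2
            / (4 * d ** 2 * (1.0 - eps) ** (1.0 / d)))

def taylor_gap(eps, d):
    """Left-hand side minus right-hand side of (5.6)."""
    lhs = (1 - eps) - (1 - eps) ** (1.0 / d) + eps * (d - 1) / d
    return lhs - eps ** 2 * (d - 1) / (2 * d ** 2)

def stability_margin(b2, eps_stb, F1, beta_m, d):
    """E(b) - E(b2) - 2*b2*c_stb at b = b2*(1 - eps_stb), assuming (5.3)."""
    b = b2 * (1 - eps_stb)
    if b < critical_size(F1, beta_m, d):
        raise ValueError("condition (5.3) fails: b < B_c")
    gap = energy(b, F1, beta_m, d) - energy(b2, F1, beta_m, d)
    return gap - 2 * b2 * c_stb(eps_stb, F1, beta_m, d)

# Illustrative checks with F1 = beta_m = 1 (assumed values).
print(min(taylor_gap(k / 100, d) for d in (2, 3, 4) for k in range(1, 100)) >= 0)
print(stability_margin(1.0, 0.2, 1.0, 1.0, 2) >= 0)
```

A non-negative margin corresponds to the final inequality \(\mathsf{{E}}^\theta (b)-\mathsf{{E}}^\theta (b_2)\ge 2b_2c_\mathrm{stb}\) for the sampled parameters.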

We will now show inequality (5.2). Let \(S\) count, up to an additive constant, the number of plus spins in \(\Lambda \),

$$\begin{aligned} S:=\sum _{x\in \Lambda } \frac{\sigma (x)+m^*}{2}. \end{aligned}$$

By the definition of \(\Lambda \), \(\mathbb{M }_K^{(+,-)}\) and \(S\),

$$\begin{aligned} \left|m^*b_1^d+h^d S - m^* \int \mathbb{M }_K^{(+,-)} \,\mathrm{d}\mathcal{L }^d\right|=\mathrm{{O}}(K/N). \end{aligned}$$

The magnetic field corresponds to a Radon-Nikodym derivative controlled by \(S\). With \(Z:=\mu ^{J,(+,-),0}_\Lambda \left[\exp \left(\beta h S \right)\right]\),

$$\begin{aligned} \frac{\mu ^{J,(+,-),h}_\Lambda ( \{\sigma \})}{\mu ^{J,(+,-),0}_\Lambda (\{\sigma \})} = \frac{1}{Z}\exp \left(\beta h S \right)\!. \end{aligned}$$

We will find a lower bound on \(Z\). Consider the \(\mathcal{L }^d\)-neighborhood of \(\mathcal{W }_\theta (b)\): when \(h\) is sufficiently small

$$\begin{aligned} \mathbb{M }_K^{(+,-)} \in \mathcal{V }(\mathcal{W }_\theta (b),\varepsilon _\mathrm{pm}) \implies |h^{d}\, S - m^*(b^d-b_1^d)| =\mathrm{{O}}(\varepsilon _\mathrm{pm}). \end{aligned}$$

Applying Proposition 4.6.2 with \(b=b_2\),

$$\begin{aligned} h^{d-1} \log Z&\ge h^{d-1} \log \mu ^{J,(+,-),0}_\Lambda (\mathbb{M }_K^{(+,-)} \in \mathcal{V }(\mathcal{W }_\theta (b_2),\varepsilon _\mathrm{pm}))\\&\quad +\beta m^* ( b_2^d-b_1^d) -\mathrm{{O}}(\varepsilon _\mathrm{pm})\\&\ge \mathcal{F }(\mathcal{W }_\theta (b_1))-\mathcal{F }(\mathcal{W }_\theta (b_2))+\beta m^* ( b_2^d-b_1^d) -\mathrm{{O}}(\varepsilon _\mathrm{pm})\\&\ge \mathcal{E }(\mathcal{W }_\theta (b_1))-\mathcal{E }(\mathcal{W }_\theta (b_2))-\mathrm{{O}}(\varepsilon _\mathrm{pm}). \end{aligned}$$

Let \(n=\lfloor (b_2(1-\varepsilon _\mathrm{stb})-b_1)/\varepsilon _\mathrm{pm}\rfloor \). We can write

$$\begin{aligned}&\left\{ \sigma :\int \mathbb{M }_K^{(+,-)} \,\mathrm{d}\mathcal{L }^d \le b_2^d (1-\varepsilon _\mathrm{stb})^d \right\} \\&\quad \subset \bigcup _{b\in \{b_1,b_1+\varepsilon _\mathrm{pm},\dots ,b_1+\varepsilon _\mathrm{pm}n\}} \left\{ \sigma :\int \mathbb{M }_K^{(+,-)} \,\mathrm{d}\mathcal{L }^d \in [b^d,(b+\varepsilon _\mathrm{pm})^d] \right\} . \end{aligned}$$

By Proposition 4.6.1, for \(b\in \{b_1,b_1+\varepsilon _\mathrm{pm},\dots ,b_1+\varepsilon _\mathrm{pm}n\}\),

$$\begin{aligned}&\mu ^{J,(+,-),0}_\Lambda \left[\exp \left( \beta h S \right);\int \mathbb{M }_K^{(+,-)} \,\mathrm{d}\mathcal{L }^d\in [b^d,(b+\varepsilon _\mathrm{pm})^d ]\right]\\&\quad \le \exp \left(h^{1-d} \left[\mathcal{F }(\mathcal{W }_\theta (b_1))-\mathcal{F }(\mathcal{W }_\theta (b))+ \beta m^* [(b+\varepsilon _\mathrm{pm})^d-b_1^d]+ \mathrm{{O}}(\varepsilon _\mathrm{pm}) \right] \right)\\&\quad \le \exp \left(h^{1-d} \left[\mathcal{E }(\mathcal{W }_\theta (b_1))-\mathcal{E }(\mathcal{W }_\theta (b))+ \mathrm{{O}}(\varepsilon _\mathrm{pm})\right]\right)\!. \end{aligned}$$

The left-hand side of (5.2) is therefore at most

$$\begin{aligned} \sum _{b\in \{b_1,b_1+\varepsilon _\mathrm{pm},\dots ,b_1+\varepsilon _\mathrm{pm}n\}} \exp \left(h^{1-d} [\mathcal{E }(\mathcal{W }_\theta (b_2))-\mathcal{E }(\mathcal{W }_\theta (b)) +\mathrm{{O}}(\varepsilon _\mathrm{pm}) ]\right)\!. \end{aligned}$$

Taking \(\varepsilon _\mathrm{pm}\) small with respect to \(\varepsilon _\mathrm{stb}\), inequality (5.2) follows by (5.1). \(\square \)
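The slab decomposition used in the proof can be illustrated with a short sketch. The function name `slabs` and the parameter values are assumptions made for illustration. Each interval \([b,b+\varepsilon _\mathrm{pm}]\) returned below corresponds to the volume range \([b^d,(b+\varepsilon _\mathrm{pm})^d]\) in the union above, and exact rationals are used so that the floor defining \(n\) is computed without rounding error.

```python
from fractions import Fraction
from math import floor

def slabs(b1, b2, eps_stb, eps_pm):
    """Return the intervals [b, b + eps_pm] for b = b1, b1 + eps_pm, ...,
    b1 + n*eps_pm with n = floor((b2*(1 - eps_stb) - b1) / eps_pm)."""
    target = b2 * (1 - eps_stb)
    n = floor((target - b1) / eps_pm)
    return [(b1 + k * eps_pm, b1 + (k + 1) * eps_pm) for k in range(n + 1)]

# Illustrative values (assumptions, not taken from the paper).
cover = slabs(Fraction(1, 4), Fraction(1), Fraction(1, 5), Fraction(1, 20))
print(len(cover), cover[0], cover[-1])
```

The \(n+1\) slabs are adjacent and the last one ends at or beyond \(b_2(1-\varepsilon _\mathrm{stb})\), so the number of terms in the final sum is of order \(1/\varepsilon _\mathrm{pm}\) and does not affect the exponential rate.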

Consider now the case \(\mathsf{{E}}^\theta (b_1)<\mathsf{{E}}^\theta (b_2)\). The optimum profile matching \((+,-)\) boundary conditions is \(\mathcal{W }_\theta (b_1)\) so the minus phase is dominant.

Proposition 5.1.2

Let \(\varepsilon _\mathrm{stb}>0\). Suppose that \(\mathsf{{E}}^\theta \left(b_1+\varepsilon _\mathrm{stb}\right)< \mathsf{{E}}^\theta (b_2)\). There is a constant \(c_\mathrm{stb}^{\prime }=c_\mathrm{stb}^{\prime }(\varepsilon _\mathrm{stb})>0\) such that with uniformly high \(\mathbb{Q }_\theta \)-probability

$$\begin{aligned} \mu ^{J,(+,-),h}_\Lambda \left(\int \mathbb{M }_K^{(+,-)} \,\mathrm{d}\mathcal{L }^d \ge (b_1 +\varepsilon _\mathrm{stb})^d \right)\le \exp \left(- \frac{c_\mathrm{stb}^{\prime }}{h^{d-1}}\right)\!. \end{aligned}$$

We will omit the proof of Proposition 5.1.2 as it is similar to the proof of Proposition 5.1.1.

Now consider the case \(b_1=0\) and \(b_2>B_\mathrm{c}^\theta \) under minus boundary conditions. In order to get a stability property that does not depend on the sign of \(\mathsf{{E}}^\theta (b_2)\) we will condition on seeing a large region of the plus phase. With \(\varepsilon _\mathcal{C }>0\) let

$$\begin{aligned} \hat{\mu }^{J,-,h}_\Lambda&:= \mu ^{J,-,h}_\Lambda ( \,\cdot \,\mid \mathcal{C }) \quad \text{ where} \nonumber \\ \mathcal{C }=\mathcal{C }(\varepsilon _\mathcal{C })&:= \left\{ \sigma \in \Sigma _\Lambda ^-:\int _{\mathcal{W }_\theta (B_\mathrm{c}^\theta +\varepsilon _\mathcal{C })} \mathbb{M }_K^{-}\,\mathrm{d}\mathcal{L }^d \ge (B_\mathrm{c}^\theta )^d \right\} . \end{aligned}$$
(5.7)

Proposition 5.1.3

Let \(\varepsilon _\mathrm{stb},\varepsilon _\mathcal{C }>0\). Suppose \(b_1=0\) and \(b_2(1-\varepsilon _\mathrm{stb})> B_\mathrm{c}^\theta +\varepsilon _\mathcal{C }\). Recall \(c_\mathrm{stb}\) from Proposition 5.1.1. Given \(\varepsilon _\mathrm{stb}\) and \(\varepsilon _\mathcal{C }\), with uniformly high \(\mathbb{Q }_\theta \)-probability,

$$\begin{aligned} \hat{\mu }^{J,-,h}_\Lambda \left(\int \mathbb{M }_K^{-} \,\mathrm{d}\mathcal{L }^d \le b_2^d (1-\varepsilon _\mathrm{stb})^d \right)\le \exp \left(-\frac{b_2c_\mathrm{stb}}{h^{d-1}}\right). \end{aligned}$$

Proof

The proof of (5.1) (with \(B_\mathrm{c}^\theta \) playing the role of \(b_1\)) implies that for all \(U\in \mathrm{BV}\),

$$\begin{aligned} \mathcal{L }^d(U)\in [(B_\mathrm{c}^\theta )^d,b_2^d (1-\varepsilon _\mathrm{stb})^d] \implies \mathcal{E }(U) \ge \mathcal{E }(\mathcal{W }_\theta (b_2))+2b_2 c_\mathrm{stb}. \end{aligned}$$

Therefore \(\mathsf{{E}}^\theta (b_2 (1-\varepsilon _\mathrm{stb}))-\mathsf{{E}}^\theta (b_2)\ge 2b_2c_\mathrm{stb}\).

Returning to the context of Proposition 5.1.3, we have \(b_1=0\). Let \(a=\min \{0,\mathsf{{E}}^\theta (b_2)\}\) denote the minimum of \(\mathsf{{E}}^\theta (b)\) for \(b\in [b_1,b_2]\). Let \(b\) denote the minimum value in the range \([b_2(1-\varepsilon _\mathrm{stb}),b_2]\) such that

$$\begin{aligned} b_2^d-b^d \le (B_\mathrm{c}^\theta +\varepsilon _\mathcal{C })^d-(B_\mathrm{c}^\theta )^d. \end{aligned}$$

Treating the magnetic field as a Radon-Nikodym derivative as in the proof of Proposition 5.1.1,

$$\begin{aligned} \mu ^{J,-,h}_\Lambda \left(\int \mathbb{M }_K^{-} \,\mathrm{d}\mathcal{L }^d \in \left[(B_\mathrm{c}^\theta )^d,b_2^d(1-\varepsilon _\mathrm{stb})^d\right] \right)\le \exp \left(\frac{a-\mathsf{{E}}^\theta (b_2)-3b_2c_\mathrm{stb}/2}{h^{d-1}} \right) \end{aligned}$$

and

$$\begin{aligned} \mu ^{J,-,h}_\Lambda (\mathcal{C })\ge \mu ^{J,-,h}_\Lambda \left(\int \mathbb{M }_K^{-} \,\mathrm{d}\mathcal{L }^d \in \left[b^d,b_2^d\right] \right) \ge \exp \left(\frac{a-\mathsf{{ E}}^\theta (b_2)-b_2c_\mathrm{stb}/2}{h^{d-1}}\right). \end{aligned}$$

\(\square \)

We will now consider two different boundary conditions. By Proposition 4.6.1, plus/minus symmetry when \(h=0\), and monotonicity, the plus phase is dominant under \(\mu ^{J,(-,+),h}_\Lambda \).

Proposition 5.1.4

Let \(\varepsilon _\mathrm{stb}>0\). There is a constant \(c_\mathrm{stb}^{\prime \prime }=c_\mathrm{stb}^{\prime \prime }(\varepsilon _\mathrm{stb})>0\) such that with uniformly high \(\mathbb{Q }_\theta \)-probability,

$$\begin{aligned} \mu ^{J,(-,+),h}_\Lambda \left(\int _{\mathcal{W }_\theta (b_1,b_2)} \mathbb{M }_K^{(-,+)} \,\mathrm{d}\mathcal{L }^d \le b_2^d-(b_1+\varepsilon _\mathrm{stb})^d \right)\le \exp \left(- \frac{c_\mathrm{stb}^{\prime \prime }}{h^{d-1}}\right). \end{aligned}$$

Finally, consider boundary conditions of plus on the inner boundary and free on the outer boundary.

Proposition 5.1.5

Let \(\varepsilon _\mathrm{stb}>0\) and suppose \(b_2^d\ge 2\varepsilon _\mathrm{stb}\). With uniformly high \(\mathbb{Q }_\theta \)-probability,

$$\begin{aligned} \mu ^{J,(+,\mathrm{f}),h}_\Lambda \left(\int _{\mathcal{W }_\theta (b_1,b_2)} \mathbb{M }_K^+ \,\mathrm{d}\mathcal{L }^d \le b_2^d-\varepsilon _\mathrm{stb}\right)\le \exp \left(- \frac{\varepsilon _\mathrm{stb}}{h^{d-1}}\right)\!. \end{aligned}$$

We omit the proof as it is similar to the others in this section.

5.2 Phase labels in \(\mathbb{A }_\theta \)

Lemma 2.3.4 provides a simple measure of the cost of phase coexistence in a region conditional on the boundary. Using phase labels—defined in terms of the coarse graining—to describe the boundary conditions allows for a sharper bound.

Take \(\Lambda \) as in (4.8). Let \(\omega \in \Omega _\Lambda \), and let \(\sigma \in \Sigma _\Lambda ^\zeta \) denote an \(\omega \)-admissible spin configuration. Define phase labels in terms of the coarse graining with \(\varepsilon _\mathrm{cg}=1\):

$$\begin{aligned} \Psi (i)= {\left\{ \begin{array}{ll} +1,&\mathbb{B }_K(i) \text{ is 1-good and } \sigma (\mathbb{B }_K^\dagger (i))=+1,\\ -1,&\mathbb{B }_K(i) \text{ is 1-good and } \sigma (\mathbb{B }_K^\dagger (i))=-1,\\ 0,&\text{otherwise}.\\ \end{array}\right.} \end{aligned}$$
(5.8)

Let \(\Gamma =\cup _{i\in I}\mathbb{B }_K(i)\) denote a subset of \(\Lambda \) composed of whole mesoscopic boxes. Let \(\partial I\) denote the set of points \(i\in \mathbb{Z }^d\) at \(L_\infty \)-distance \(2\) from \(I\) such that \(\mathbb{B }_K(i)\subset \mathbb{A }_\theta \). Let \(\psi :I\cup \partial I\rightarrow \{\pm 1,0\}\) denote a label configuration assigning labels to the boxes \(\mathbb{B }_K(i)\). Let \(\mathrm{Int}_\psi \) (respectively \(\mathrm{Ext}_\psi \)) denote the event that \(\Psi (i)=\psi (i)\) for \(i\in I\) (respectively \(\partial I\)). Given \(\psi \), let \(f_I^0\) count the number of \(0\) labels in \(I\). Let \(f_{\partial I}^+,f_{\partial I}^-,f_{\partial I}^0\) count the number of boxes in \(\partial I\) with phase labels \(+1, -1\) and \(0\). Let \(f_I^\leftrightarrow \) count the maximal number of disjoint paths (the maximal flow) inside \(I\) between the \(+1\) labels and \(-1\) labels.
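The bookkeeping for phase labels can be sketched as follows. The inputs are hypothetical: `is_good` stands for the box being 1-good and `spin` for the value \(\sigma (\mathbb{B }_K^\dagger (i))\), both assumed to be computed elsewhere from the coarse graining. The sketch covers the label (5.8) and the counts \(f_I^0, f_{\partial I}^+, f_{\partial I}^-, f_{\partial I}^0\); the maximal flow \(f_I^\leftrightarrow \) is not computed here.

```python
from collections import Counter

def phase_label(is_good, spin):
    """Phase label as in (5.8): the spin of a 1-good box, and 0 otherwise."""
    if is_good and spin in (+1, -1):
        return spin
    return 0

def label_counts(psi, I, boundary):
    """Return f_I^0 and the triple (f^+, f^-, f^0) counted over the boundary.

    psi maps box indices to labels in {+1, -1, 0}; I and boundary are
    collections of box indices (boundary plays the role of the set dI).
    """
    f_I0 = sum(1 for i in I if psi[i] == 0)
    c = Counter(psi[i] for i in boundary)
    return f_I0, (c[+1], c[-1], c[0])

# A toy one-dimensional label configuration (illustrative only).
psi = {(0,): +1, (1,): 0, (2,): -1, (4,): +1, (-2,): 0}
print(label_counts(psi, I=[(0,), (1,), (2,)], boundary=[(4,), (-2,)]))
```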

We can find constants \(C_\mathrm{cut},c_\mathrm{cut}>0\) such that the following hold.

Lemma 5.2.1

With \(\mathbb{Q }_\theta \)-probability \(1-\exp (- c_\mathrm{cut}f_I^\leftrightarrow K^{d-1})\),

$$\begin{aligned} \varphi ^{J,\zeta ,h}_\Lambda (\mathrm{Int}_\psi \cap \mathrm{Ext}_\psi ) \le \exp \big ( C_\mathrm{cut}(f_{\partial I}^0+f_{\partial I}^-)K^{d-1}- c_\mathrm{cut}f_I^\leftrightarrow K^{d-1}\big ). \end{aligned}$$

Lemma 5.2.2

With \(\mathbb{Q }_\theta \)-probability \(1-\exp (- c_\mathrm{cut}[f_I^0 K+f_I^\leftrightarrow K^{d-1}])\),

$$\begin{aligned}&\varphi ^{J,\zeta ,h}_\Lambda (\mathrm{Int}_\psi \cap \mathrm{Ext}_\psi ) \le \exp \big ( C_\mathrm{cut}\min \{f_{\partial I}^0+f_{\partial I}^-,f_{\partial I}^0+f_{\partial I}^+\}K^{d-1}\\&\quad + \beta h |\Gamma |4^d- c_\mathrm{cut}[f_I^0 K+f_I^\leftrightarrow K^{d-1}]\big ). \end{aligned}$$

Proof of Lemma 5.2.1

Let \(\Gamma _k\) denote the union of the mesoscopic boxes at distance at most \(k\) from \(\Gamma \),

$$\begin{aligned} \Gamma _k=\bigcup \{\mathbb{B }_K(j):\exists \, i\in I,\ \Vert i-j\Vert _\infty \le k\}. \end{aligned}$$

Note that \(\Gamma \subset \Gamma _1\subset \Gamma _2\), and for \(i\in \partial I\), \(\mathbb{B }_K(i)\subset \Gamma _2\setminus \Gamma _1\). Write \(\omega =\omega _\mathfrak{g }\oplus \omega _\mathrm{int}\oplus \omega _\mathrm{ext}\) where \(\omega _\mathfrak{g }\) is the configuration of the ghost edges and \(\omega _\mathrm{int}\) is the configuration of the real edges \(E(\Gamma _1)\).
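The nesting of the sets \(\Gamma _k\) can be checked on box indices. The sketch below works with the index set \(I\) only and ignores the constraint \(\mathbb{B }_K(i)\subset \mathbb{A }_\theta \), which is an assumption made to keep the example short; the function names are hypothetical.

```python
from itertools import product

def neighbourhood(I, k):
    """Indices j with ||i - j||_inf <= k for some i in I, i.e. the indices
    of the boxes whose union is Gamma_k."""
    d = len(next(iter(I)))
    offsets = list(product(range(-k, k + 1), repeat=d))
    return {tuple(a + o for a, o in zip(i, off)) for i in I for off in offsets}

def outer_shell(I):
    """Indices at L_inf-distance exactly 2 from I (the set dI, ignoring
    the requirement that the box lies inside A_theta)."""
    return neighbourhood(I, 2) - neighbourhood(I, 1)

I = {(0, 0), (1, 0)}
G1, G2 = neighbourhood(I, 1), neighbourhood(I, 2)
print(I <= G1 <= G2, outer_shell(I) <= G2 - G1)
```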

We would like to condition on the event \(\mathrm{Ext}_\psi \). However, the resulting measure is complicated because \(\mathrm{Ext}_\psi \) carries information about not just \(\omega _\mathrm{ext}\) and \((\sigma (\mathbb{B }_K^\dagger (i)):i\in \partial I)\), but also about \(\omega _\mathrm{int}\). Let \(\mathrm{Ext}^{\prime }_\psi \) denote the event that \(\omega _\mathrm{ext}\), and the states of the vertices in \(\Lambda \setminus {\Gamma _1}\), are compatible with \(\mathrm{Ext}_\psi \). Let \(\mathrm{Int}^{\prime }_\psi \) denote the event that \(\omega _\mathrm{int}\) is compatible with \(\mathrm{Int}_\psi \).

The coupled measure \(\varphi ^{J,\zeta ,h}_\Lambda \) has a property related to the finite energy property of the regular random-cluster model. The probability of edge \(e\) being closed, conditional on all the other edge states and all the spin states, is bounded away from \(0\):

$$\begin{aligned} \varphi ^{J,\zeta ,h}_\Lambda (\omega (e)=0 \mid \sigma \text{ and} \, (\omega (f))_{f\not =e}) \ge 1-p_e. \end{aligned}$$
(5.9)

This tells us the cost of conditioning on edges being closed. Surgically closing edges saves us from having to consider mixed boundary conditions.

Let \(T\) denote the event that all edges spanning between a box \(\mathbb{B }_K(i)\) with \(i\in \partial I\) and \(\psi (i)\le 0\) and a box \(\mathbb{B }_K(j)\subset {\Gamma _1}\) are closed. By (5.9), for some constant \(C_\mathrm{cut}>0\),

$$\begin{aligned} \varphi ^{J,\zeta ,h}_\Lambda (\mathrm{Int}^{\prime }_\psi \cap \mathrm{Ext}^{\prime }_\psi ) \le \varphi ^{J,\zeta ,h}_\Lambda (\mathrm{Int}^{\prime }_\psi \cap \mathrm{Ext}^{\prime }_\psi \cap T) \exp \left(C_\mathrm{cut}(f_{\partial I}^0 + f_{\partial I}^-) K^{d-1}\right).\nonumber \\ \end{aligned}$$
(5.10)

Suppose that \(\omega _\mathrm{ext}\) splits \(\Lambda \setminus {\Gamma _1}\) into \(\omega _\mathrm{ext}\)-clusters \(W_+,W_-,W_1,\ldots ,W_n\). We stress that \(\omega _\mathrm{ext}\) is a partial edge configuration: the \(\omega _\mathrm{ext}\)-clusters in \(\Lambda \setminus {\Gamma _1}\) may be connected by an \((\omega _\mathfrak{g }\oplus \omega _\mathrm{int})\)-path. Assume that the event \(\mathrm{Ext}^{\prime }_\psi \cap T\) holds. If \(\mathbb{B }_K^\ddagger (i)\), \(i\in \partial I\), intersects \({\Gamma _1}\) then \(\psi (i)=1\). Without loss of generality we can assume that

  1. (i)

    \(W_1,\dots ,W_l\) correspond to \(\mathbb{B }_K^\ddagger (i)\) with \(i\in \partial I\) and \(\psi (i)=+1\),

  2. (ii)

    \(W_{l+1},\dots ,W_m\) correspond to clusters with diameter less than \(K/2\),

  3. (iii)

    \(W_{m+1},\dots ,W_n\) (and \(W_-\)) are not connected to \({\Gamma _1}\).

Consider the conditional measure \(\varphi ^{J,\zeta ,h}_\Lambda ( \,\cdot \,\mid \mathrm{Ext}^{\prime }_\psi \cap T,\omega _\mathrm{ext})\). The clusters \(W_+\) and \(W_1,\dots ,W_l\) act as plus boundary conditions. The spins of the clusters \(W_{l+1},\dots ,W_m\) are unknown so the clusters act as wired boundary conditions. Let \(\hat{\phi }^{h}\) denote the marginal measure on \(\omega _\mathrm{int}\) corresponding to \(\varphi ^{J,\zeta ,h}_\Lambda ( \,\cdot \,\mid \mathrm{Ext}^{\prime }_\psi \cap T,\omega _\mathrm{ext})\),

$$\begin{aligned} \varphi ^{J,\zeta ,h}_\Lambda (\mathrm{Int}^{\prime }_\psi \cap \mathrm{Ext}^{\prime }_\psi \cap T)\le \sup _{\omega _\mathrm{ext}\in \mathrm{Ext}^{\prime }_\psi \cap T} \hat{\phi }^h(\mathrm{Int}^{\prime }_\psi ). \end{aligned}$$
(5.11)

Let \(A\) denote the decreasing event that there are no \(\omega _\mathrm{int}\)-open paths in \(\Gamma _1\) between \(\mathbb{B }_K\)-boxes with \(\psi (i)=+1\) and \(\psi (i)=-1\); note that \(\mathrm{Int}^{\prime }_\psi \subset A\). By monotonicity, as \(A\) is a decreasing event and \(\hat{\phi }^h\) is increasing with \(h\in [0,\infty )\),

$$\begin{aligned} \hat{\phi }^h(\mathrm{Int}^{\prime }_\psi )\le \hat{\phi }^h(A)\le \hat{\phi }^0(A). \end{aligned}$$
(5.12)

With reference to Corollary 2.3.3, we can find a collection of \(f_I^\leftrightarrow \) disjoint chains of boxes such that \(A\) implies that for each chain, the first box is not connected to the last box. With \(\mathbb{Q }_\theta \)-probability \(1-\exp (- c_\mathrm{cg}^{\prime }f_I^\leftrightarrow K^{d-1})\),

$$\begin{aligned} \hat{\phi }^0(A)\le \exp (-c_\mathrm{cg}^{\prime }f_I^\leftrightarrow K^{d-1}). \end{aligned}$$
(5.13)

Collecting (5.10)–(5.13) gives the lemma. \(\square \)

Proof of Lemma 5.2.3

We will first consider the case \(f_{\partial I}^+\ge f_{\partial I}^-\). Let \({\Gamma _1}, \mathrm{Int}^{\prime }_\psi ,\mathrm{Ext}^{\prime }_\psi ,\) \(T, W_1,\dots ,W_n\) and \(\hat{\phi }^h\) be defined as above.

Recall inequality (5.11). With reference to (2.3), the sizes of the \(W_{l+1},\dots ,W_m\) affect the interaction (under \(\hat{\phi }^h\)) of open clusters in \({\Gamma _1}\) with the external magnetic field. Recall that the clusters \(W_{l+1},\dots ,W_m\) have diameter at most \(K/2\):

$$\begin{aligned} \hat{\phi }^h(\mathrm{Int}^{\prime }_\psi ) / \hat{\phi }^0(\mathrm{Int}^{\prime }_\psi ) \le \exp (\beta h |\Gamma _1\cup W_{l+1}\cup \dots \cup W_m|) \le \exp (\beta h |\Gamma | 4^d). \end{aligned}$$
(5.14)

By Proposition 2.3.2 and Corollary 2.3.3, there is a positive constant \(c\) such that with \(\mathbb{Q }_\theta \)-probability \(1-\exp (-c\, \max \{f_I^0 K,f_I^\leftrightarrow K^{d-1}\})\),

$$\begin{aligned} \hat{\phi }^0(\mathrm{Int}^{\prime }_\psi )\le \exp (-c\, \max \{f_I^0 K,f_I^\leftrightarrow K^{d-1}\}). \end{aligned}$$
(5.15)

Collecting (5.10)–(5.11) and (5.14)–(5.15) gives the lemma in the case \(f_{\partial I}^+\ge f_{\partial I}^-\). The proof in the case \(f_{\partial I}^->f_{\partial I}^+\) follows by swapping \(+\) and \(-\) in the definition of \(T\) and \(\hat{\phi }^h\). \(\square \)

5.3 Hausdorff stability of random-cluster boundaries

Consider the context of Proposition 5.1.1: \(\mathsf{{ E}}^\theta (b_1)>\mathsf{{E}}^\theta (b_2)\). Under \(\mu ^{J,(+,-),h}_\Lambda \) the plus phase is dominant so the minus boundary does not affect the bulk of the domain. In this section, we will show that, in a random-cluster sense, the \(\partial ^-\Lambda \) boundary-cluster is small.

Proposition 5.3.1

Let \(\varepsilon _\mathrm{hs}>0\). Suppose that \(\mathsf{{E}}^\theta (b_1)>\mathsf{{ E}}^\theta (b_2(1-\varepsilon _\mathrm{hs}))\). There is a constant \(c_\mathrm{hs}=c_\mathrm{hs}(\varepsilon _\mathrm{hs})>0\), independent of \(B_\mathrm{max}^\theta \), such that with uniformly high \(\mathbb{Q }_\theta \)-probability

$$\begin{aligned} \phi ^{J,(+,-),h}_\Lambda (\partial ^-\Lambda \leftrightarrow \mathbb{W }_\theta (b_1,b_2(1-\varepsilon _\mathrm{hs}))) \le \exp (-c_\mathrm{hs}b_2 /h). \end{aligned}$$

The proof develops the technique of truncation used in [7]. The differences in geometry, and the presence of a magnetic field, pose extra challenges.

Proof of Proposition 5.3.1

Let \(w\) denote the minimum \(L_\infty \)-distance between \(\partial \mathcal{W }_\theta (1)\) and the origin,

$$\begin{aligned} w=\inf _{{\mathbf{{x}}}\in \partial \mathcal{W }_\theta (1)} \Vert {\mathbf{{x}}}\Vert _\infty . \end{aligned}$$
(5.16)

Let \(R=\lfloor \varepsilon _\mathrm{hs}b_2 N w/(8 K) \rfloor \) and let \(S=\lfloor (b_2-b_1) N w / (2K)\rfloor \). Define mesoscopic layers

$$\begin{aligned} \mathbb{H }_l=\mathbb{W }_\theta \left(b_2-\frac{2l K}{wN},\ b_2-\frac{2(l-1) K}{wN}\right), \quad \quad l=1,\dots ,S. \end{aligned}$$

These \(S\) layers divide the annulus \(\mathbb{W }_\theta (b_1,b_2)\). Let \(\mathbb{H }_{j,k}=\cup _{l=j}^k \mathbb{H }_l\). The annulus \(\mathbb{W }_\theta (b_2(1-\varepsilon _\mathrm{hs}),b_2)\) corresponds to the first \(4R\) layers, \(\mathbb{H }_{1,4R}\), with \(\mathbb{H }_1\) the outermost layer. The layers are essentially \((d-1)\)-dimensional, resembling scalar multiples of \(\partial \mathcal{W }_\theta \); Fig. 3 shows the case \(d=2\). The layers have been constructed so that:

  1. (i)

    The mesoscopic boxes in \(\mathbb{H }_l\) form a surface separating \(\mathbb{H }_{l-1}\) from \(\mathbb{H }_{l+1}\).

  2. (ii)

    Each \(\mathbb{H }_l\) is between \(2\) and \(2d\) mesoscopic boxes thick.
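The layer construction above can be sketched numerically. The following is a minimal illustration, not part of the proof: the function name `layer_radii` and the sample values of \(b_1,b_2,\varepsilon _\mathrm{hs},N,K,w\) are assumptions chosen only to make the arithmetic visible. It computes \(R\), \(S\) and the inner and outer scale factors of each \(\mathbb{H }_l\), and checks that the first \(4R\) layers stay inside \(\mathbb{W }_\theta (b_2(1-\varepsilon _\mathrm{hs}),b_2)\).

```python
import math

def layer_radii(b1, b2, eps_hs, N, K, w):
    """Return R, S and the (inner, outer) scale factors of H_1, ..., H_S.

    Illustrative helper (hypothetical name); the formulas follow the
    definitions of R, S and H_l given in the text.
    """
    R = math.floor(eps_hs * b2 * N * w / (8 * K))
    S = math.floor((b2 - b1) * N * w / (2 * K))
    step = 2 * K / (w * N)  # thickness of one layer in Wulff-scale units
    layers = [(b2 - l * step, b2 - (l - 1) * step) for l in range(1, S + 1)]
    return R, S, layers

# Sample values (assumed for illustration only).
R, S, layers = layer_radii(b1=0.5, b2=1.0, eps_hs=0.2, N=10_000, K=10, w=0.5)
inner_4R = layers[4 * R - 1][0]  # inner scale factor of layer H_{4R}
print(R, S, inner_4R)
```

Because \(R\) is rounded down, the inner edge of \(\mathbb{H }_{4R}\) sits at scale \(b_2-8RK/(wN)\ge b_2(1-\varepsilon _\mathrm{hs})\), so the first \(4R\) layers never overshoot the annulus.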

Define mesoscopic phase labels \(\Psi =(\Psi (i):\mathbb{B }_K(i)\subset \mathbb{H }_{1,S})\) according to (5.8). We will show that with high probability, a surface of \(+1\) boxes separates \(\mathbb{H }_1\) from \(\mathbb{H }_{4R}\).

Given a phase label \(\psi :\{i:\mathbb{B }_K(i)\subset \mathbb{H }_{1,S}\}\rightarrow \{\pm 1,0\}\), define the profile \({\vec {f}}\) of \(\psi \) as follows. For \(s\in \{-1,0,1\}\), let \(f_l^s\) count the number of \(s\)-boxes in \(\mathbb{H }_l\),

$$\begin{aligned} f_l^s=\#\{i:\mathbb{B }_K(i)\subset \mathbb{H }_l \ \text{ and} \ \psi (i)=s\}. \end{aligned}$$

Let \({\vec {f}}=(f_l^s)_{l=1,\dots ,S}^{s=-1,0,1}\). We will write \(\mathcal{F }({\vec {f}})\) to denote the set of phase labels \(\psi \) compatible with \({\vec {f}}\). The number of configurations of the phase labels \(\psi \) compatible with \({\vec {f}}\) is limited by the definition of the coarse graining. Surfaces of \(0\)-boxes must separate the plus-boxes from the minus-boxes. The surfaces of \(0\)-boxes cannot separate \(\mathbb{H }_l\) into more than \(f_l^0+1\) connected components. Therefore the number of ways of assigning the labels in layer \(l\) is bounded by

$$\begin{aligned} \genfrac(){0.0pt}{}{|\mathbb{H }_l|K^{-d}}{f_l^0} \exp ([f_l^0+1] \log 2) =\exp (f_l^0 \mathrm{{O}}(\log N)). \end{aligned}$$
(5.17)

We will say that a profile \({\vec {f}}\) is spanning if \(f_l^{-1}+f_l^0>0\) for \(l=1,\dots ,4R\). Recall from (3.1) that \(K\approx N^{1/(2d)}\). The number of spanning profiles is less than

$$\begin{aligned} \prod _{l=1}^S \left(\frac{|\mathbb{H }_l|}{K^d}\right)^3 = \mathrm{{O}}\left(\left(N/K\right)^{d-1}\right)^{\mathrm{{O}}(N/K)} \end{aligned}$$

which grows more slowly than \(\exp (cN)\) for every positive constant \(c\). It is therefore sufficient to find a constant \(c_\mathrm{hs}=c_\mathrm{hs}(\varepsilon _\mathrm{hs})>0\) and a \(\mathbb{Q }_\theta \)-event \(\mathcal{J }\) such that

$$\begin{aligned} \mathcal{J }\subset \left\{ J:\forall {\vec {f}} \text{ spanning,} \, \varphi ^{J,(+,-),h}_\Lambda (\Psi \in \mathcal{F }({\vec {f}}))\le \exp (-2c_\mathrm{hs}b_2 N)\right\} \end{aligned}$$
(5.18)

and \(\mathcal{J }\) has uniformly high \(\mathbb{Q }_\theta \)-probability. The event \(\mathcal{J }\) is defined as follows. Let \(\mathcal{J }=\varnothing \) for \(h\ge h_0\) (for some \(h_0>0\)) so we can assume that \(h\) is arbitrarily small. We will appeal below to Proposition 5.1.1 and Lemmas , 5.2.1 and 5.2.2. For \(h<h_0\), let \(\mathcal{J }\) denote the intersection of the associated \(\mathbb{Q }_\theta \)-events.

Let \(\alpha ,\hat{\alpha }>0\). Consider three constraints on the label profile:

$$\begin{aligned} \sum _{l=1}^S f_l^0&\le N^{d-1}/\sqrt{K},\end{aligned}$$
(5.19)
$$\begin{aligned} \sum _{l=1}^S f_l^{-1}&\le \alpha \left(\frac{b_2N}{K}\right)^d,\end{aligned}$$
(5.20)
$$\begin{aligned} \max _{l=R,\dots ,3R} f_l^{-1}&\le \hat{\alpha }\left(\frac{b_2N}{K}\right)^{d-1}. \end{aligned}$$
(5.21)

For \(J\in \mathcal{J }\) we will check the inequality in (5.18) in four parts. We will show:

  1. (I)

    For any \(c_\mathrm{hs}>0\), if (5.19) fails then the inequality in (5.18) holds.

  2. (II)

    For any \(\alpha >0\), if (5.19) holds but (5.20) fails, then the inequality in (5.18) holds if \(c_\mathrm{hs}\) is sufficiently small.

  3. (III)

    For any \(\hat{\alpha }>0\), if (5.19) and (5.20) hold but (5.21) fails, then the inequality in (5.18) holds provided that \(\alpha \) and \(c_\mathrm{hs}\) are sufficiently small.

  4. (IV)

    If (5.19)–(5.21) all hold, then the inequality in (5.18) holds if \(\alpha ,\hat{\alpha }\) and \(c_\mathrm{hs}\) are sufficiently small.
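The case split (I)–(IV) is exhaustive: each profile is handled by the first of the constraints (5.19), (5.20), (5.21) that it violates, or by part (IV) if it violates none. A minimal sketch of this bookkeeping follows. The function names, the list-of-labels input format and the toy numbers are assumptions made for illustration; the toy layers do not respect the actual geometry of the \(\mathbb{H }_l\).

```python
from collections import Counter

def profile(labels_by_layer):
    """labels_by_layer[l-1] lists the labels (-1, 0, +1) of the boxes in H_l.

    Returns one Counter per layer, so f[l-1][s] plays the role of f_l^s.
    """
    return [Counter(layer) for layer in labels_by_layer]

def is_spanning(f, R):
    """Spanning: each of the layers H_1, ..., H_{4R} holds a 0-box or a (-1)-box."""
    return all(f[l - 1][-1] + f[l - 1][0] > 0 for l in range(1, 4 * R + 1))

def which_part(f, R, N, K, b2, d, alpha, alpha_hat):
    """Return 'I', 'II', 'III' or 'IV' by the first of (5.19)-(5.21) that fails."""
    total_zero = sum(c[0] for c in f)
    total_minus = sum(c[-1] for c in f)
    max_minus = max(f[l - 1][-1] for l in range(R, 3 * R + 1))
    if total_zero > N ** (d - 1) / K ** 0.5:                  # (5.19) fails
        return "I"
    if total_minus > alpha * (b2 * N / K) ** d:               # (5.20) fails
        return "II"
    if max_minus > alpha_hat * (b2 * N / K) ** (d - 1):       # (5.21) fails
        return "III"
    return "IV"

# Toy example with S = 5 layers and R = 1 (values assumed for illustration).
toy = [[-1, 1, 1], [0, 1, 1], [-1, 1, 1], [0, 1, 1], [1, 1, 1]]
f = profile(toy)
spanning = is_spanning(f, R=1)
part_a = which_part(f, R=1, N=100, K=4, b2=1.0, d=2, alpha=0.01, alpha_hat=0.01)
part_b = which_part(f, R=1, N=100, K=4, b2=1.0, d=2, alpha=0.01, alpha_hat=0.1)
print(spanning, part_a, part_b)
```

Loosening \(\hat{\alpha }\) in the toy example moves the same profile from part (III) to part (IV), which mirrors how the proof first fixes \(\hat{\alpha }\) and then shrinks \(\alpha \).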

For part (I), choose \({\vec {f}}\) such that (5.19) fails. Inequality (5.19) is an upper bound on the volume of bad boxes. We will apply Lemma 2.3.4 for \(\psi \in \mathcal{F }({\vec {f}})\) with \(\varepsilon _\mathrm{cg}=1\) and \(n=\sum _{l=1}^S f_l^0>N^{d-1}/\sqrt{K}\). The positive terms in the exponential in Lemma 2.3.4 are

$$\begin{aligned} \beta |\partial ^\pm \Lambda | + \beta h |\Lambda | = \mathrm{{O}}(b_2^d N^{d-1}). \end{aligned}$$

The absolute value of the negative term is greater than \(c_\mathrm{cg}K (N^{d-1}/\sqrt{K})\), which is of higher order in \(N\) by (3.1). For \(h\) sufficiently small,

$$\begin{aligned} \varphi ^{J,(+,-),h}_\Lambda (\Psi =\psi )\le \exp (-c_\mathrm{cg}n K/2), \end{aligned}$$

and so by (5.17),

$$\begin{aligned} \varphi ^{J,(+,-),h}_\Lambda (\Psi \in \mathcal{F }({\vec {f}}))\le \exp \left(n \mathrm{{O}}(\log N)-c_\mathrm{cg}nK/2\right)\!. \end{aligned}$$

Whatever the value of the constant \(c_\mathrm{hs}>0\), for \(h\) sufficiently small the right-hand side above is less than \(\exp (-2c_\mathrm{hs}b_2 N)\).
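For the reader's convenience, the order-of-magnitude comparison behind this last claim can be written out. This is a sketch under the assumptions that \(b_2\) is bounded, that \(N\rightarrow \infty \) as \(h\rightarrow 0\), and that \(K\approx N^{1/(2d)}\) as in (3.1); the implied constants are not tracked.

```latex
% Since K ~ N^{1/(2d)}, we have (log N)/K -> 0, so for N large
% the O(log N) entropy term is at most c_cg K / 4.
$$\begin{aligned}
n\,\mathrm{O}(\log N) - c_\mathrm{cg}\, n K/2
  &\le -c_\mathrm{cg}\, n K/4 \\
  &\le -\tfrac{1}{4} c_\mathrm{cg}\, N^{d-1}\sqrt{K}
     && \text{using } n > N^{d-1}/\sqrt{K} \\
  &\le -2 c_\mathrm{hs} b_2 N
     && \text{for } N \text{ large, as } d\ge 2 \text{ and } \sqrt{K}\rightarrow \infty .
\end{aligned}$$
```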

Now to part (II). Inequality (5.20) is an upper bound on the volume of minus phase. We can expect the majority of boxes to have label \(+1\) because of Proposition 5.1.1.

Recall the symbol \(\varepsilon _\mathrm{cg}\) used in the definition of the coarse graining. Let \(\psi \) denote a label configuration. Let \(n^+\) count the number of phase labels \(\psi (i)=+1\) such that \(\mathbb{B }_K(i)\) is also \(\varepsilon _\mathrm{cg}\)-good. Define \(n^-\) similarly, with \(-1\) in place of \(+1\). Let \(n^0\) count the number of \(\varepsilon _\mathrm{cg}\)-bad boxes in \(\mathbb{H }_{1,S}\). Note that

$$\begin{aligned} \sum _{l=1}^S f_l^{-1} \le n^0+n^-. \end{aligned}$$

Let \(M(\varepsilon _\mathrm{stb})\) refer to the \(\mu ^{J,(+,-),h}_\Lambda \)-event in Proposition 5.1.1. Under \(M(\varepsilon _\mathrm{stb})\),

$$\begin{aligned} \frac{1}{N^d m^*} \sum _{x\in \Lambda }\sigma (x)\ge b_2^d-b_1^d-2\varepsilon _\mathrm{stb}b_2^d+\mathrm{{O}}(K^{-1}). \end{aligned}$$
(5.22)

For each box \(\mathbb{B }_K(i)\) counted by \(n^-\) there are at least \((1-\varepsilon _\mathrm{cg})K^dm^*\) vertices in \(\mathbb{B }_K^\ddagger (i)\cap \mathbb{B }_K(i)\). For each box \(\mathbb{B }_K(i)\) counted by \(n^+\) there are at most \((1+\varepsilon _\mathrm{cg})K^dm^*\) vertices in \(\mathbb{B }_K^\ddagger (i)\cap \mathbb{B }_K(i)\). There are at most \(n^0K^d\) vertices in \(\varepsilon _\mathrm{cg}\)-bad boxes. The remaining vertices lie in clusters with diameter less than \(K/2\). Small clusters only interact weakly with the magnetic field. If an open cluster has volume less than \((K/2)^d\) then, by (2.3), the odds of it taking plus spin are at most \(\exp (\beta h(K/2)^d)\) to \(1\). Conditional on \(\Psi =\psi \), with probability \(1-\exp (-b_2 N)\),

$$\begin{aligned} \frac{1}{K^d m^*} \sum _{x\in \Lambda }\sigma (x) \le n^+ (1+\varepsilon _\mathrm{cg}) + \frac{n^0}{m^*} - n^- (1-\varepsilon _\mathrm{cg}) + \left(\frac{N}{K}\right)^d \mathrm{{O}}(K^{-1}).\qquad \end{aligned}$$
(5.23)

By the argument from part (I) we can assume that \(n^0\le N^{d-1}/\sqrt{K}\). The number of \(\varepsilon _\mathrm{cg}\)-good boxes in \(\Lambda \) is therefore

$$\begin{aligned} n^++n^- = \left(\frac{N}{K}\right)^d [b_2^d-b_1^d+\mathrm{{O}}(K^{-1})]. \end{aligned}$$
(5.24)

By (5.22)–(5.24),

$$\begin{aligned} n^-\le \left(\frac{N}{K}\right)^d [\varepsilon _\mathrm{stb}b_2^d + \varepsilon _\mathrm{cg}(b_2^d-b_1^d)/2 +\mathrm{{O}}(K^{-1}) ]. \end{aligned}$$

Taking \(\varepsilon _\mathrm{stb}=\alpha /2\) and \(\varepsilon _\mathrm{cg}\) sufficiently small, we see that (5.20) holds with high probability. Taking \(2c_\mathrm{hs}< \min \{1,c_\mathrm{stb}(\varepsilon _\mathrm{stb})\}\) we have completed part (II) of the proof of (5.18).
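The step from (5.22)–(5.24) to the bound on \(n^-\) can be spelled out as follows. This is a sketch of the algebra only; the shorthand \(V\) and \(T\) is introduced here for illustration and is not used elsewhere, and the inequalities hold on the event \(M(\varepsilon _\mathrm{stb})\) and on the high-probability event of (5.23).

```latex
% Shorthand (illustrative): V = b_2^d - b_1^d,  T = (N/K)^d.
% By (5.24), n^+ = T [V + O(K^{-1})] - n^-.
$$\begin{aligned}
T\,[V - 2\varepsilon_\mathrm{stb} b_2^d + \mathrm{O}(K^{-1})]
  &\le \frac{1}{K^d m^*}\sum_{x\in\Lambda}\sigma(x)
     && \text{by (5.22), multiplied by } T \\
  &\le n^+(1+\varepsilon_\mathrm{cg}) - n^-(1-\varepsilon_\mathrm{cg})
       + \frac{n^0}{m^*} + T\,\mathrm{O}(K^{-1})
     && \text{by (5.23)} \\
  &= T\,V(1+\varepsilon_\mathrm{cg}) - 2n^- + \frac{n^0}{m^*} + T\,\mathrm{O}(K^{-1})
     && \text{by (5.24)}.
\end{aligned}$$
% With n^0 <= N^{d-1}/sqrt(K) = T K^{d-1/2}/N and K ~ N^{1/(2d)},
% the term n^0/m^* is absorbed into T O(K^{-1}). Rearranging:
$$\begin{aligned}
2n^- \le T\,[\varepsilon_\mathrm{cg} V + 2\varepsilon_\mathrm{stb} b_2^d + \mathrm{O}(K^{-1})].
\end{aligned}$$
```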

Now for part (III). Choose \({\vec {f}}\) such that (5.19)–(5.20) are satisfied but (5.21) is not. Let \(1\le k< l< m< n\le 4R\) and let \(\psi \in \mathcal{F }({\vec {f}})\). See Fig. 3.

Fig. 3 Part (III): The shaded region \(U\) corresponds to the blocks with phase label \(-1\). The arrows indicate paths from \(-1\) phase labels to \(+1\) phase labels. The black lines marking the intersection of the shaded region with \(\mathbb{H }_m\) and \(\mathbb{H }_l\) correspond to \(A_1\) and \(A_2\) in Lemma 5.3.2, respectively. The intersection of \(\partial U\) with \(\mathbb{H }_{l,m}\) corresponds to the set \(S\)

We can apply Lemma 5.2.1 with \(\Gamma \) equal to the set of mesoscopic boxes in \(\mathbb{H }_{k+1,n-1}\) in the neighborhood of a \(0\) or a \(-1\) box,

$$\begin{aligned} \Gamma&= \bigcup _{i\in I}\mathbb{B }_K(i),\nonumber \\ I&= \{i:\exists j,\, \mathbb{B }_K(i),\mathbb{B }_K(j)\subset \mathbb{H }_{k+1,n-1}, \psi (j)<1 \text{ and} \, \Vert i-j\Vert _\infty \le 1\}.\qquad \end{aligned}$$
(5.25)

Note that \(\Gamma \) is not necessarily connected.

Let \(\psi \in \mathcal{F }({\vec {f}})\). Recall that the quantities \(f_{\partial I}^+,f_{\partial I}^0,f_{\partial I}^-,f_I^\leftrightarrow \) in the statement of Lemma 5.2.1 count the number of \(+1\), \(0\) and \(-1\) phase labels in \(\partial I\), and the maximum flow from plus to minus labels in \(\Gamma \). To estimate the quantity \(f_I^\leftrightarrow \) we need a geometric lemma. Recall the definition of \(\partial \) (3.5).

Lemma 5.3.2

Suppose that \(0<a_1<a_2\) and \(U\subset \mathcal{A }_\theta \). Let \(A_i=U\cap \partial \mathcal{W }_\theta (a_i), i=1,2\). Let \(S\) denote the portion of \(\partial U\) contained in the closure of \(\mathcal{W }_\theta (a_1,a_2)\). Then

$$\begin{aligned} \mathcal{H }^{d-1}(S) \ge d^{-1/2}[\mathcal{H }^{d-1}(A_1)-\mathcal{H }^{d-1}(A_2)]. \end{aligned}$$

Proof

Let \(P\) denote the linear projection \(x\rightarrow (a_1/a_2)x\). \(S\) must separate \(A_1\setminus P A_2\) from \((P^{-1}A_1) \setminus A_2\). The surface area of \(A_1\setminus P A_2\) is at least \(\mathcal{H }^{d-1}(A_1)-\mathcal{H }^{d-1}(A_2)\) as \(P\) is a contraction. Let \({\mathbf{{x}}}\in \partial \mathcal{W }_\theta \). Recall that \(\mathcal{W }_\theta =\mathcal{A }_\theta \cap \mathcal{W }_{2\pi }\); by the symmetry of \(\mathcal{W }_{2\pi }\), the angle between the vector \({\mathbf{{x}}}\) and the normal vector \({\mathbf{{n}}}^{\mathcal{W }_\theta }({\mathbf{{x}}})\) is at most \(\cos ^{-1}(d^{-1/2})\). \(\square \)

Lemma 5.3.2 implies that there is a constant \(c\in (0,1)\) such that

$$\begin{aligned} f_I^\leftrightarrow \ge c f_m^{-1} - c^{-1}(f_l^{-1}+f_l^0). \end{aligned}$$

Given \(\hat{\alpha }\), if \(\alpha \) is sufficiently small then we can choose \(k< l< m< n\) such that

$$\begin{aligned} f_k^0+f_k^{-1}&\le (b_2N/K)^{d-1}\times \hat{\alpha }c_\mathrm{cut}c /(6C_\mathrm{cut}),\nonumber \\ f_l^0+f_l^{-1}&\le (b_2N/K)^{d-1}\times \hat{\alpha } c^2/6,\nonumber \\ f_m^{-1}&\ge (b_2N/K)^{d-1}\times \hat{\alpha },\\ f_n^0+f_n^{-1}&\le (b_2N/K)^{d-1}\times \hat{\alpha }c_\mathrm{cut}c /(6C_\mathrm{cut}). \end{aligned}$$

Lemma 5.2.1 gives

$$\begin{aligned}&\varphi ^{J,(+,-),h}_\Lambda (\mathrm{Int}_\psi \cap \mathrm{Ext}_\psi )\nonumber \\&\quad \le \exp \left(C_\mathrm{cut}(f_k^0+f_k^{-1}+f_n^0+f_n^{-1})K^{d-1} -c_\mathrm{cut}\, f_I^\leftrightarrow K^{d-1}\right)\nonumber \\&\quad \le \exp (-\hat{\alpha }c_\mathrm{cut}c(b_2N)^{d-1}/2). \end{aligned}$$
(5.26)
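The constants in the choice of \(k<l<m<n\) are tuned so that the exponent in (5.26) comes out as stated. A short check of the arithmetic, writing \(A=(b_2N/K)^{d-1}\) as an illustrative shorthand:

```latex
% Boundary cost from layers k and n:
$$\begin{aligned}
C_\mathrm{cut}(f_k^0+f_k^{-1}+f_n^0+f_n^{-1})K^{d-1}
  \le 2\,C_\mathrm{cut}\cdot A\,\frac{\hat{\alpha} c_\mathrm{cut} c}{6 C_\mathrm{cut}}\,K^{d-1}
  = \tfrac{1}{3}\,\hat{\alpha} c_\mathrm{cut} c\,(b_2N)^{d-1}.
\end{aligned}$$
% Flow lower bound from layers l and m:
$$\begin{aligned}
f_I^\leftrightarrow \ge c f_m^{-1} - c^{-1}(f_l^{-1}+f_l^0)
  \ge A\left(c\hat{\alpha} - c^{-1}\,\frac{\hat{\alpha} c^2}{6}\right)
  = \tfrac{5}{6}\,\hat{\alpha} c\,A.
\end{aligned}$$
% Combining, the exponent is at most
$$\begin{aligned}
\left(\tfrac{1}{3}-\tfrac{5}{6}\right)\hat{\alpha} c_\mathrm{cut} c\,(b_2N)^{d-1}
  = -\tfrac{1}{2}\,\hat{\alpha} c_\mathrm{cut} c\,(b_2N)^{d-1}.
\end{aligned}$$
```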

We complete part (III) by checking that the right-hand side of (5.26), when multiplied by the size of the set \(\mathcal{F }({\vec {f}})\) [cf. (5.17) and (5.19)], is less than \(\exp (-2c_\mathrm{hs}b_2 N)\).

Before we start part (IV) of the proof of (5.18), we will give an isoperimetric inequality for \(\partial \mathcal{W }_\theta \). The surface \(\partial \mathcal{W }_\theta \) is \(d-1\) dimensional, so subsets of \(\partial \mathcal{W }_\theta \) have \(d-2\) dimensional boundaries.

Lemma 5.3.3

There is a positive constant \(u=u(\theta )\) such that for any \(A\subset \partial \mathcal{W }_\theta \),

$$\begin{aligned} \frac{\mathcal{L }^{d-1}[A]}{\mathcal{L }^{d-1}[\partial \mathcal{W }_\theta ]}\le \frac{1}{2} \implies \mathcal{H }^{d-2}[\partial A] \ge u \left(\mathcal{L }^{d-1}[A]\right)^{(d-2)/(d-1)}. \end{aligned}$$

Proof

First consider the case \(\theta =2\pi \). Let \(w\) denote the minimum \(L_\infty \)-distance between \(\partial \mathcal{W }_\theta \) and the origin (5.16). By convexity, \(\partial \mathcal{W }_\theta \) lies inside the \(L_2\)-annulus with inner radius \(w\) and outer radius \(wd\).

Consider the projection \(P\) of \(\partial \mathcal{W }_\theta \) onto the unit sphere \(\mathcal{S }^{d-1}\). Associated with \(P\) are two Radon-Nikodym derivatives, one for the \(d-1\) dimensional Lebesgue measures on the domain and codomain, and one for the \(d-2\) dimensional Hausdorff measures on the domain and codomain. By symmetry and convexity (see [20, Theorem 2.2.4]) both Radon-Nikodym derivatives are bounded away from \(0\) and \(\infty \). The result follows from Lévy’s isoperimetric inequality for the unit sphere.

A similar argument works when \(0<\theta <\pi \). Consider a projection from \(\mathcal{W }_\theta \) to the \((d-1)\)-dimensional unit ball. \(\square \)

Now for part (IV) of the proof of (5.18). Let \({\vec {f}}\) denote a spanning profile and let \(\psi \in \mathcal{F }({\vec {f}})\). Choose \(k\) and \(n\) such that \(R\le k \le 2R \le n \le 3R\). We will apply Lemma 5.2.2 with \(\Gamma \) defined according to (5.25).

If (5.21) holds with \(\hat{\alpha }\) sufficiently small then Lemma 5.3.3 implies that for some constant \(c>0\),

$$\begin{aligned} f_I^\leftrightarrow \ge c \sum _{l=k+1}^{n-1} (f_l^{-1})^{(d-2)/(d-1)}. \end{aligned}$$

Let

$$\begin{aligned} g_l:= c_\mathrm{cut}f_l^0 K+ c_\mathrm{cut}c (f_l^{-1})^{(d-2)/(d-1)} K^{d-1}. \end{aligned}$$

Lemma 5.2.2 gives

$$\begin{aligned}&\varphi ^{J,(+,-),h}_\Lambda (\mathrm{Int}_\psi \cap \mathrm{Ext}_\psi )\nonumber \\&\le \exp \left(C_\mathrm{cut}(f_k^0+f_k^{-1}+f_n^0+f_n^{-1})K^{d-1} + \beta h | \Gamma | 4^d -\sum _{l=k+1}^{n-1}g_l\right).\quad \end{aligned}$$
(5.27)

The term corresponding to the magnetic field in (5.27) is bounded by (5.21). If \(\hat{\alpha }\) is sufficiently small,

$$\begin{aligned} \beta h |\Gamma | 4^d \le \beta h (12K)^d \sum _{l=k+1}^{n-1} \left(f_l^0 + f_l^{-1}\right) \le \frac{1}{4} \sum _{l=k+1}^{n-1} g_l. \end{aligned}$$

With reference to (5.17) and the assumption that \(h\) is small, the number of ways of choosing \((\psi (i):\mathbb{B }_K(i)\subset \mathbb{H }_{k,n})\) is at most

$$\begin{aligned} \prod _{l=k}^n \exp (f_l^0 \mathrm{{O}}(\log N))\le \exp \left((f_k^0+f_n^0)\mathrm{{O}}(\log N) + \frac{1}{4} \sum _{l=k+1}^{n-1} g_l\right). \end{aligned}$$

Thus if \(h\) is sufficiently small,

$$\begin{aligned} \varphi ^{J,(+,-),h}_\Lambda (\Psi \in \mathcal{F }({\vec {f}})) \le \exp \left(2C_\mathrm{cut}(f_k^0+f_k^{-1}+f_n^0+f_n^{-1})K^{d-1}-\frac{1}{2} \sum _{l=k+1}^{n-1}g_l\right). \end{aligned}$$
(5.28)
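The bookkeeping that leads from (5.27) to (5.28) can be summarized as follows. This is a sketch assuming \(K\) is large enough that the \(\mathrm{O}(\log N)\) factors are dominated by \(c_\mathrm{cut}K/4\) and by \(C_\mathrm{cut}K^{d-1}\); the shorthand \(B\) and \(G\) is introduced here only for illustration.

```latex
% Shorthand (illustrative):
%   B = C_cut (f_k^0 + f_k^{-1} + f_n^0 + f_n^{-1}) K^{d-1},
%   G = \sum_{l=k+1}^{n-1} g_l.
% Each interior layer contributes an entropy factor exp(f_l^0 O(log N)) <= exp(g_l / 4),
% because g_l >= c_cut f_l^0 K and (log N)/K -> 0.
$$\begin{aligned}
\underbrace{B + \beta h|\Gamma|4^d - G}_{\text{exponent in (5.27)}}
 \;+\; \underbrace{(f_k^0+f_n^0)\,\mathrm{O}(\log N) + \tfrac{1}{4}G}_{\text{entropy of the labels}}
 &\le B + \tfrac{1}{4}G - G + B + \tfrac{1}{4}G \\
 &= 2B - \tfrac{1}{2}G .
\end{aligned}$$
```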

In our notation, [7, (4.40)] states that if \({\vec {f}}\) satisfies (5.19)–(5.20) with \(\alpha \) sufficiently small then there is a positive constant \(c\) such that

$$\begin{aligned}&\min _{R<k<2R} \left\{ 2C_\mathrm{cut}(f_k^0+f_k^{-1}) K^{d-1} - \frac{1}{2} \sum _{l=k+1}^{2R} g_l \right\} \le -c b_2 N, \text{ and}\nonumber \\&\min _{2R<n<3R} \left\{ 2C_\mathrm{cut}(f_n^0+f_n^{-1}) K^{d-1} - \frac{1}{2} \sum _{l=2R+1}^{n-1} g_l\right\} \le -c b_2 N. \end{aligned}$$
(5.29)

Part (IV) of the proof of inequality (5.18) follows from (5.28) and (5.29). This completes the proof of Proposition 5.3.1. \(\square \)

We will give three analogous results below. They follow, mutatis mutandis, from the proof of Proposition 5.3.1. Consider first the context of Proposition 5.1.3.

Proposition 5.3.4

Recall the event \(\mathcal{C }\) defined by (5.7). Suppose that \(b_1=0\) and \(b_2(1-\varepsilon _\mathrm{hs})> B_\mathrm{c}^\theta +\varepsilon _\mathcal{C }\). Given \(\varepsilon _\mathcal{C }\) and \(\varepsilon _\mathrm{hs}\), with uniformly high \(\mathbb{Q }_\theta \)-probability

$$\begin{aligned}&\varphi ^{J,-,h}_\Lambda (\partial ^-\Lambda \leftrightarrow \mathbb{W }_\theta (b_1,b_2(1-\varepsilon _\mathrm{hs}))\mid \mathcal{C })\le \exp (-c_\mathrm{hs}b_2/h). \end{aligned}$$

Proof

In part (II) of the proof of Proposition 5.3.1 replace Proposition 5.1.1 with Proposition 5.1.3. \(\square \)

In the context of Proposition 5.1.4, the plus phase is dominant and so the minus boundary does not affect the bulk of the domain.

Proposition 5.3.5

There is a constant \(c_\mathrm{hs}^{\prime }=c_\mathrm{hs}^{\prime }(\varepsilon _\mathrm{hs},B_\mathrm{max}^\theta )>0\) such that with uniformly high \(\mathbb{Q }_\theta \)-probability

$$\begin{aligned} \phi ^{J,(-,+),h}_\Lambda (\partial ^-\Lambda \leftrightarrow \mathbb{W }_\theta (b_1+\varepsilon _\mathrm{hs},b_2)) \le \exp (-c_\mathrm{hs}^{\prime }/h). \end{aligned}$$

Proof

Proposition 5.3.5 differs from Proposition 5.3.1 in that it shows that the inner (rather than the outer) boundary condition has limited influence. Let

$$\begin{aligned} \mathbb{H }_l&= \mathbb{W }_\theta \left(b_1+\frac{2(4R-l)K}{wN},\ b_1+\frac{2(4R+1-l)K}{wN}\right),\quad l=1,\ldots ,4R,\\ \mathbb{H }_l&= \mathbb{W }_\theta \left(b_1+\frac{2(l-1)K}{wN},\ b_1+\frac{2 l K}{wN}\right),\quad l=4R+1,\ldots ,S, \end{aligned}$$

with \(R=\lfloor \varepsilon _\mathrm{hs}N w/(8 K)\rfloor \) and \(S=\lfloor (b_2-b_1)Nw/(2K)\rfloor \). The proof of Proposition 5.3.1, mutatis mutandis, shows that a surface of \(+1\) boxes separates \(\mathbb{H }_1\) from \(\mathbb{H }_{4R}\). In (5.20)–(5.21), replace \(b_2N/K\) with \(N/K\). Proposition 5.1.4 replaces Proposition 5.1.1 in part (II) of the proof.

Notice that the proof of part (III) has become slightly more flexible; the additional flexibility will be important below in the proof of Proposition 5.3.6. The quantity \(c_\mathrm{hs}^{\prime }\) is allowed to depend on \(B_\mathrm{max}^\theta \). This means that Lemma 5.2.2 can be used in place of Lemma 5.2.1; the extra term due to the magnetic field can be controlled by taking \(\alpha \), and therefore \(|\Gamma |\), sufficiently small. \(\square \)

Consider the context of Proposition 5.1.2. The minus phase is dominant so the plus boundary does not affect the bulk of the domain.

Proposition 5.3.6

Suppose that \(\mathsf{{E}}^\theta (b_1+\varepsilon _\mathrm{hs})<\mathsf{{E}}^\theta (b_2)\). There is a constant \(c_\mathrm{hs}^{\prime \prime }=c_\mathrm{hs}^{\prime \prime }(\varepsilon _\mathrm{hs})>0\) such that with uniformly high \(\mathbb{Q }_\theta \)-probability

$$\begin{aligned} \phi ^{J,(+,-),h}_\Lambda (\partial ^+\Lambda \leftrightarrow \mathbb{W }_\theta (b_1+\varepsilon _\mathrm{hs},b_2)) \le \exp (-c_\mathrm{hs}^{\prime \prime }/h). \end{aligned}$$

Proof

The proof can be obtained from the proof of Proposition 5.3.5 by swapping the roles of plus and minus. The sign of the magnetic field has to stay the same, but, for example, in (5.20) replace \(f_l^{-1}\) with \(f_l^{+1}\), etc. Proposition 5.1.2 replaces Proposition 5.1.4 in part (II) of the proof. \(\square \)

5.4 Hausdorff stability implies spatial mixing

In Proposition 5.3.6 we showed that under \(\mathbb{Q }_\theta [\mu ^{J,(+,-),h}_\Lambda ]\), with high probability, a surface of \(-1\) mesoscopic blocks separates the region \(\mathbb{W }_\theta ^{\prime }(b_1+\varepsilon _\mathrm{hs},b_2)\) from the plus boundary \(\partial ^+\Lambda \). In two dimensions, by planar duality, this implies that there are no Ising spin-clusters connecting \(\partial ^+\Lambda \) to \(\mathbb{W }_\theta ^{\prime }(b_1+\varepsilon _\mathrm{hs},b_2)\). By the Ising model’s domain Markov property, and monotonicity, we can compare \(\mathbb{Q }_\theta [\mu ^{J,(+,-),h}_\Lambda ]\) to \(\mathbb{Q }_\theta [\mu ^{J,(-,-),h}_\Lambda ]\) in \(\mathbb{W }_\theta ^{\prime }(b_1+\varepsilon _\mathrm{hs},b_2)\).

In contrast, in higher dimensions, especially close to the critical temperature, we cannot make such a comparison. By the definition of the coupling, adjacent sites can be in the same Ising spin-cluster even if they are not connected under the random-cluster model. Thus the Ising spin-cluster associated with \(\partial ^-\Lambda \) under \(\mu ^{J,(+,-),h}_\Lambda \) may be much larger than the cluster associated with \(\partial ^-\Lambda \) under the random-cluster representation \(\phi ^{J,(+,-),h}_\Lambda \). Instead we will appeal to a spatial mixing property, stated below as Proposition 5.4.1.

Although great progress has been made in the study of the Ising model’s spatial mixing properties, much is still not known. In the \(h=0\) case, for example, it has not been proven that exponential decay of correlations holds down to the critical inverse-temperature. The presence of a magnetic field poses additional difficulties: when the boundary conditions favor one phase, a magnetic field with the opposite polarity acts to weaken the dominant phase.

We conjecture that Proposition 5.4.1, and the coarse graining property, hold for all \(\beta >\beta _\mathrm{c}\). If that is the case then Theorem 1.2.1 holds up to the critical point. For simplicity we will reuse the constants \(c_\mathrm{hs},c_\mathrm{hs}^{\prime }\) and \(c_\mathrm{hs}^{\prime \prime }\), adjusting their values if necessary.

Proposition 5.4.1

There is a finite \(\beta _0\) such that if \(\beta >\beta _0\) then for \(\varepsilon _\mathrm{hs}>0\), with uniformly high \(\mathbb{Q }_\theta \)-probability:

  1. (i)

    For \(x\in \mathbb{W }_\theta (b_1,b_2(1-2\varepsilon _\mathrm{hs}))\),

    $$\begin{aligned}&\varphi ^{J,(+,-),h}_\Lambda \left(\sigma (x)=1 \,\big |\, \partial ^-\Lambda \, \leftrightarrow \!\!\!\!\!/\mathbb{W }_\theta (b_1,b_2(1-\varepsilon _\mathrm{hs}))\right) \\&\quad \ge \varphi ^{J,(+,+),h}_\Lambda (\sigma (x)=1) - \exp (-c_\mathrm{hs}b_2/h). \end{aligned}$$
  2. (ii)

    For \(x\in \mathbb{W }_\theta ^{\prime }(b_1+2\varepsilon _\mathrm{hs},b_2)\),

    $$\begin{aligned}&\varphi ^{J,(-,+),h}_\Lambda \left(\sigma (x)=1 \,\big |\, \partial ^-\Lambda \leftrightarrow \!\!\!\!\!/\mathbb{W }_\theta (b_1+\varepsilon _\mathrm{hs},b_2)\right) \\&\quad \ge \varphi ^{J,(+,+),h}_\Lambda (\sigma (x)=1) - \exp (-c_\mathrm{hs}^{\prime }/h). \end{aligned}$$
  3. (iii)

    If \(\mathsf{{E}}^\theta (b_1+2\varepsilon _\mathrm{hs})<\mathsf{{E}}^\theta (b_2)\), then for \(x\in \mathbb{W }_\theta ^{\prime }(b_1+2\varepsilon _\mathrm{hs},b_2)\),

    $$\begin{aligned}&\varphi ^{J,(+,-),h}_\Lambda \left(\sigma (x)=1 \,\big |\, \partial ^+\Lambda \leftrightarrow \!\!\!\!\!/\mathbb{W }_\theta (b_1+\varepsilon _\mathrm{hs},b_2)\right) \\&\quad \le \varphi ^{J,(-,-),h}_\Lambda (\sigma (x)=1) + \exp (-c_\mathrm{hs}^{\prime \prime }/h). \end{aligned}$$
  4. (iv)

    If \(b_1>B_\mathrm{c}^\theta \), for \(x\in \mathbb{W }_\theta ^{\prime }(b_1+\varepsilon _\mathrm{hs}b_2,b_2)\),

    $$\begin{aligned}&\varphi ^{J,(+,-),h}_\Lambda (\sigma (x)=1) \\&\quad \le \varphi ^{J,-,h}_{\mathbb{W }_\theta ^{\prime }(b_2)}\left(\sigma (x)=1 \,\Bigg |\, \int _{\mathbb{W }_\theta (b_1)}\mathbb{M }_K^- \,\mathrm{d}\mathcal{L }^d\ge (B_\mathrm{c}^\theta )^d\right) + \exp (-c_\mathrm{hs}b_2 /h). \end{aligned}$$

The statement of Proposition 5.4.1 is fine-tuned to suit our needs: we have only considered spatial mixing in Wulff-shaped regions. Also, the restriction in part (iii) is stricter than necessary.

We will prove Proposition 5.4.1 after first showing how it can be used with the results of Section 5.3. By part (i), in the context of Proposition 5.3.1 with uniformly high \(\mathbb{Q }_\theta \)-probability, for \(x\in \mathbb{W }_\theta (b_1,b_2(1-2\varepsilon _\mathrm{hs}))\),

$$\begin{aligned} |\mu ^{J,(+,-),h}_\Lambda (\sigma (x)=1)-\mu ^{J,+,h}_\Lambda (\sigma (x)=1)| \le 2\exp (-c_\mathrm{hs}b_2 /h). \end{aligned}$$
(5.30)

By part (ii), in the context of Proposition 5.3.5 with uniformly high \(\mathbb{Q }_\theta \)-probability, for \(x\in \mathbb{W }_\theta ^{\prime }(b_1+2\varepsilon _\mathrm{hs},b_2)\),

$$\begin{aligned} |\mu ^{J,(-,+),h}_\Lambda (\sigma (x)=1)-\mu ^{J,+,h}_\Lambda (\sigma (x)=1)| \le 2\exp (-c_\mathrm{hs}^{\prime }/h). \end{aligned}$$
(5.31)

By part (iii), in the context of Proposition 5.3.6 with uniformly high \(\mathbb{Q }_\theta \)-probability, for \(x\in \mathbb{W }_\theta ^{\prime }(b_1+2\varepsilon _\mathrm{hs},b_2)\),

$$\begin{aligned} |\mu ^{J,(+,-),h}_\Lambda (\sigma (x)=1)-\mu ^{J,-,h}_\Lambda (\sigma (x)=1)| \le 2\exp (-c_\mathrm{hs}^{\prime \prime } /h). \end{aligned}$$
(5.32)

Suppose that \(b_2>b_1>B_\mathrm{root}^\theta \) and consider \(x\in \mathbb{W }_\theta ^{\prime }(b_1+\varepsilon _\mathrm{hs}b_2,b_2)\). By part (iv), and by Proposition 5.1.1 applied to \(\mathbb{W }_\theta (b_1)\) with \(\varepsilon _\mathrm{stb}=1-B_\mathrm{c}^\theta /b_1\), with uniformly high \(\mathbb{Q }_\theta \)-probability,

$$\begin{aligned}&|\mu ^{J,(+,-),h}_{\mathbb{W }_\theta ^{\prime }(b_1,b_2)} (\sigma (x)=1)-\mu ^{J,-,h}_{\mathbb{W }_\theta ^{\prime }(0,b_2)} (\sigma (x)=1)| \nonumber \\&\le \exp (-c_\mathrm{stb}b_1/h^{d-1})+\exp (-c_\mathrm{hs}b_2 /h). \end{aligned}$$
(5.33)

Proposition 5.4.1 is also relevant to the conditioned measure defined in (5.7). With \(b_1=0\), the total variation distance between \(\hat{\mu }^{J,-,h}_\Lambda \) and \(\mu ^{J,-,h}_\Lambda (\,\cdot \,\mid \partial ^-\Lambda \leftrightarrow \!\!\!\!\!/\mathbb{W }_\theta (b_2(1-\varepsilon _\mathrm{hs})))\) is bounded by Proposition 5.1.5, monotonicity, and Proposition 5.3.4. Thus by part (i) of Proposition 5.4.1, for some constant \(c>0\), with uniformly high \(\mathbb{Q }_\theta \)-probability for \(x\in \mathbb{W }_\theta ^{\prime }(b_2(1-2\varepsilon _\mathrm{hs}))\),

$$\begin{aligned} |\hat{\mu }^{J,-,h}_\Lambda (\sigma (x)=1)-\mu ^{J,+,h}_\Lambda (\sigma (x)=1)| \le \exp (-c/h). \end{aligned}$$
(5.34)

If \(b_2>b_1\ge B_\mathrm{c}^\theta +2\varepsilon _\mathcal{C }\) then part (iv) can be used to compare \(\mu ^{J,(+,-),h}_\Lambda \) and \(\hat{\mu }^{J,-,h}_{\mathbb{W }_\theta ^{\prime }(0,b_2)}\). For \(x\in \mathbb{W }_\theta ^{\prime }(b_1+\varepsilon _\mathrm{hs}b_2,b_2)\), with uniformly high \(\mathbb{Q }_\theta \)-probability for some positive constant \(c\),

$$\begin{aligned} |\mu ^{J,(+,-),h}_\Lambda (\sigma (x)=1)-\hat{\mu }^{J,-,h}_{\mathbb{W }_\theta ^{\prime }(0,b_2)} (\sigma (x)=1)| \le \exp (-c/h). \end{aligned}$$
(5.35)

Proof of Proposition 5.4.1

We will prove part (i); the other parts are similar. By monotonicity, it is sufficient to show that for \(x\in \mathbb{W }_\theta (b_1,b_2(1-2\varepsilon _\mathrm{hs}))\),

$$\begin{aligned}&\mu ^{J,(+,\mathrm{f}),h}_{\mathbb{W }_\theta (b_1,b_2(1-\varepsilon _\mathrm{hs}))}\left(\sigma (x)=1\right)\ge \mu ^{J,(+,+),h}_{\mathbb{W }_\theta (b_1,b_2(1-\varepsilon _\mathrm{hs}))}\left(\sigma (x)=1\right) -\exp (-c_\mathrm{hs}b_2 /h). \end{aligned}$$

To allow us to deal with \(p\) close to \(p_\mathrm{c}\), we will formally define spin clusters in terms of the graph of undiluted edges.

Definition 5.4.2

A \(\sigma \)-spin cluster is a maximal set of vertices of like spin that are connected by \(J=1\) edges.

Let \(\mathsf{{Disconnect}}\) denote the event that there are no minus \(\sigma \)-spin clusters stretching from the inner- to the outer-boundary of \(\mathbb{W }_\theta (b_2(1-2\varepsilon _\mathrm{hs}),b_2(1-\varepsilon _\mathrm{hs}))\). By monotonicity

$$\begin{aligned} \mu ^{J,(+,\mathrm{f}),h}_{\mathbb{W }_\theta (b_1,b_2(1-\varepsilon _\mathrm{hs}))}(\sigma (x)=1\mid \mathsf{{Disconnect}})\ge \mu ^{J,(+,+),h}_{\mathbb{W }_\theta (b_1,b_2(1-\varepsilon _\mathrm{hs}))}(\sigma (x)=1) \end{aligned}$$

so we need to show that

$$\begin{aligned} \mu ^{J,(+,\mathrm{f}),h}_{\mathbb{W }_\theta (b_1,b_2(1-\varepsilon _\mathrm{hs}))}(\mathsf{{Disconnect}})\ge 1-\exp (-c_\mathrm{hs}b_2/h). \end{aligned}$$

We will do this using a stronger coarse-graining property.

Definition 5.4.3

Consider a box \(\mathbb{B }_K(i)\subset \Lambda \). If

  1. (i)

    \(\mathbb{B }_K(i)\) is \(\varepsilon _\mathrm{cg}\)-good and

  2. (ii)

    the \(\sigma \)-spin clusters composed of vertices with spin \(-\sigma (\mathbb{B }^\dagger _K(i))\) intersecting \(\mathbb{B }_K(i)\) have diameter at most \(K/2\)

then say that \(\mathbb{B }_K(i)\) is \(\varepsilon _\mathrm{cg}\)-Ising-good.

To see that boxes at the mesoscopic scale \(K\approx h^{-1/(2d)}\) are likely \(\varepsilon _\mathrm{cg}\)-Ising-good, we will consider a second, smaller mesoscopic scale that is independent of \(h\).

With \(p\in (p_\mathrm{c},1)\) fixed, as \(\beta \rightarrow \infty \) the averaged random-cluster measure \(\mathbb{Q }[\phi ^{J,h}]\) converges weakly to product measure with density \(p\); the density of edges with \(J(e)=1\) but \(\omega (e)=0\) goes to zero. Taking \(k\) large as a function of \(\varepsilon _\mathrm{cg}\), and then taking \(\beta \) large as a function of \(k\), we can make the \(\mathbb{Q }[\varphi ^{J,+,0}_\Lambda ]\)-probability that \(\mathbb{B }_{k}(i)\subset \Lambda \) is \(\varepsilon _\mathrm{cg}\)-Ising-good arbitrarily close to \(1\). By a standard renormalization argument we can find \(\beta _0,K_1(\varepsilon _\mathrm{cg})\) and \(c>0\) such that if \(\beta >\beta _0\) and \(K>K_1(\varepsilon _\mathrm{cg})\) then \(\mathbb{B }_K(i)\subset \Lambda \) is \(\varepsilon _\mathrm{cg}\)-Ising-good with \(\mathbb{Q }[\varphi ^{J,\mathrm{f},0}_\Lambda ]\)-probability \(1-\exp (-c K)\).

The result now follows by adapting the proof of Proposition 5.3.1. Substitute ‘\(\varepsilon _\mathrm{cg}\)-Ising-good’ for ‘\(\varepsilon _\mathrm{cg}\)-good’ in the definition of the phase labels and, because of the free outer boundary conditions, use Proposition 5.1.5 in place of Proposition 5.1.1. If the profile \({\vec {f}}\) associated with the phase label configuration \(\Psi \) is not spanning then the event \(\mathsf{{Disconnect}}\) holds.

Parts (ii) and (iii) of Proposition 5.4.1 follow from the proofs of Propositions 5.3.5 and 5.3.6, respectively, by substituting ‘\(\varepsilon _\mathrm{cg}\)-Ising-good’ for ‘\(\varepsilon _\mathrm{cg}\)-good’. Part (iv) follows from the proof of Proposition 5.3.4; with high probability there is a surface of plus spins separating the inner- and outer-boundaries of \(\mathbb{W }_\theta (b_1,b_1+\varepsilon _\mathrm{hs}b_2)\) under \(\mu ^{J,-,h}_\Lambda [\,\cdot \,\mid \int _{\mathbb{W }_\theta (b_1)}\mathbb{M }_K^- \,\mathrm{d}\mathcal{L }^d\ge (B_\mathrm{c}^\theta )^d]\). \(\square \)

5.5 Spectral gap of the dynamics

The Glauber dynamics for \(\mu ^{J,\zeta ,h}_\Lambda \) can be studied by introducing a block dynamics [7, 16, 19, 22]. Inequalities (5.30)–(5.32) imply that [22, Propositions 3.5.1–3.5.3] can be extended from the Ising model on \(\mathbb{Z }^2\) to the dilute Ising model on \(\mathbb{Z }^d\) with \(d\ge 2\).

Proposition 5.5.1

Let \(b_1=0\) and \(\varepsilon >0\). With uniformly high \(\mathbb{Q }_\theta \)-probability,

$$\begin{aligned} \mathrm{gap}(\Lambda ,+,h)\ge \exp (-\varepsilon /h^{d-1}). \end{aligned}$$

Proposition 5.5.2

Consider the case \(b_1>B_\mathrm{c}^\theta \). Let \(\varepsilon >0\). With uniformly high \(\mathbb{Q }_\theta \)-probability,

$$\begin{aligned} \mathrm{gap} (\Lambda ,(+,-),h) \ge \exp ( - \varepsilon /h^{d-1}). \end{aligned}$$

Proposition 5.5.3

Let \(b_1=0\) and \(\varepsilon >0\). One can choose \(b_2\) slightly larger than \(B_\mathrm{c}^\theta \) such that with high \(\mathbb{Q }_\theta \)-probability,

$$\begin{aligned} \mathrm{gap}(\Lambda ,-,h)\ge \exp (-\varepsilon /h^{d-1}). \end{aligned}$$

Propositions 5.5.1–5.5.3 are similar enough to [22, Propositions 3.5.1–3.5.3] that we will just sketch the proof of Proposition 5.5.1 and omit the proofs of Propositions 5.5.2–5.5.3.

Proposition 5.5.1 tells us that if we start the Glauber dynamics for \(\mu ^{J,+,h}_\Lambda \) in the minus configuration, the dynamics quickly reach the plus phase because a wall of plus phase invades \(\Lambda \) from the boundary \(\partial ^+\Lambda \). In two dimensions, the growth of the plus phase can be understood in terms of advancing surfaces of plus spins. In higher dimensions, we cannot rely on such surfaces existing, but the argument can be carried through nonetheless.

Proof of Proposition 5.5.1

With \(\varepsilon _\mathrm{block}>0\), let \(n=\lfloor (b_2-b_1)/\varepsilon _\mathrm{block}\rfloor -1\). Consider a sequence of overlapping annuli that cover \(\Lambda \),

$$\begin{aligned}&\Delta _j=\mathbb{W }_\theta (b_1+(j-1)\varepsilon _\mathrm{block},b_1+(j+1)\varepsilon _\mathrm{block}),\quad \quad j=1,2,\dots ,n-1,\\&\Delta _n=\mathbb{W }_\theta ^{\prime }(b_1+(n-1)\varepsilon _\mathrm{block},b_2). \end{aligned}$$
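The overlapping cover can be made concrete with a small sketch. The helper below is hypothetical and only tracks the inner and outer radius parameters of each block \(\Delta _j\), not the lattice sets themselves; the function name `annulus_radii` and its argument names are assumptions introduced here for illustration.

```python
import math
from typing import List, Tuple


def annulus_radii(b1: float, b2: float, eps_block: float) -> List[Tuple[float, float]]:
    """Return the (inner, outer) radius parameters of the blocks Delta_1, ..., Delta_n.

    Illustrative only: the actual blocks are Wulff-shaped annuli in the lattice.
    """
    if eps_block <= 0 or b2 <= b1:
        raise ValueError("need eps_block > 0 and b2 > b1")
    # n = floor((b2 - b1) / eps_block) - 1, as in the text.
    n = math.floor((b2 - b1) / eps_block) - 1
    if n < 1:
        raise ValueError("eps_block is too large: fewer than one block")
    # Delta_j for j = 1, ..., n-1 has width 2 * eps_block, so neighbours overlap by eps_block.
    blocks = [(b1 + (j - 1) * eps_block, b1 + (j + 1) * eps_block) for j in range(1, n)]
    # The last block Delta_n absorbs the remainder up to b2.
    blocks.append((b1 + (n - 1) * eps_block, b2))
    return blocks
```

With \(b_1=0\), \(b_2=3\) and \(\varepsilon _\mathrm{block}=1/2\) this gives five blocks, each overlapping the next, with the final block reaching \(b_2\).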

Consider a block dynamics \((\sigma ^\mathrm{B}_t)_{t\ge 0}\) for \(\mu ^{J,\zeta ,h}_\Lambda \) with blocks \(\Delta _1,\dots ,\Delta _n\); update each block \(\Delta _j\) at rate 1, resampling the block conditional on the configuration restricted to \(\Lambda \setminus \Delta _j\). The equilibrium distribution of the block dynamics is the same as for the single spin dynamics, namely \(\mu ^{J,+,h}_\Lambda \).

Let \(\mathrm{gap}(\Lambda ,\zeta ,h)\) denote the spectral gap of the regular Glauber dynamics and let \(\mathrm{gap}(\Lambda ,\{\Delta _1,\dots ,\Delta _n\},\zeta ,h)\) denote the spectral gap of \((\sigma ^\mathrm{B}_t)\). For \(\varepsilon >0\), if \(\varepsilon _\mathrm{block}\) and \(h_0\) are sufficiently small and \(0<h<h_0\) then comparing the two dynamics we obtain [22] that

$$\begin{aligned} \mathrm{gap}(\Lambda ,\zeta ,h) \ge \exp (-\varepsilon /h^{d-1}) \mathrm{gap}(\Lambda ,\{\Delta _1,\dots ,\Delta _n\},\zeta ,h). \end{aligned}$$
(5.36)

The graphical construction can be extended to the block dynamics by coupling from the past: if block \(\Delta _j\) is to be updated at time \(t\), use a copy of the regular graphical construction in \(\Delta _j\) over the time interval \((-\infty ,0]\) with boundary conditions \(\sigma ^\mathrm{B}_{t-}\) to produce the configuration \(\sigma ^\mathrm{B}_t\). As \(t\rightarrow \infty \) the probability with respect to the graphical construction that \(\sigma ^\mathrm{B}_t\) depends on \(\sigma ^\mathrm{B}_0\) tends to zero.

We will show that with uniformly high \(\mathbb{Q }_\theta \)-probability, the probability that \(\sigma ^\mathrm{B}_1\) depends on \(\sigma ^\mathrm{B}_0\) is bounded away from one. This implies that the spectral gap of the block dynamics is bounded away from zero and so Proposition 5.5.1 follows by (5.36).

Suppose that in the time interval \([0,1]\) there are exactly \(n\) block updates in the order \(\Delta _n,\Delta _{n-1},\dots ,\Delta _1\). Suppose also that just before the update of block \(\Delta _j\), the spins in \(\Lambda \setminus \cup _{i=1}^j\Delta _i\) do not depend on \(\sigma ^\mathrm{B}_0\). The probability then that just after the update of block \(\Delta _j\) the spins in the larger set \(\Lambda \setminus \cup _{i=1}^{j-1}\Delta _i\) depend on \(\sigma ^\mathrm{B}_0\) is at most \(2|\Lambda |\exp (-c_\mathrm{hs}^{\prime }/h)\); apply inequality (5.31) in the region \(\cup _{i=j}^n\Delta _i\subset \Lambda \). Inductively, we see that such a sequence of updates leaves \(\sigma ^\mathrm{B}_1\) independent of \(\sigma ^\mathrm{B}_0\) with probability at least \(1-2n|\Lambda |\exp (-c_\mathrm{hs}^{\prime }/h)\). \(\square \)
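To see why such a sequence of updates yields a bound that is uniform in \(h\), one can make the count explicit. The following is a sketch under the assumption, consistent with the description of the block dynamics above, that the \(n\) blocks are updated by independent rate-1 Poisson clocks and that the update randomness is independent of the clocks; \(p_\mathrm{order}\) is notation introduced here for illustration only.

$$\begin{aligned} p_\mathrm{order}&:=\mathbb{P }\big (\text{each block is updated exactly once in }[0,1]\text{, in the order }\Delta _n,\dots ,\Delta _1\big )=\frac{e^{-n}}{n!},\\ \mathbb{P }\big (\sigma ^\mathrm{B}_1\text{ does not depend on }\sigma ^\mathrm{B}_0\big )&\ge \frac{e^{-n}}{n!}\Big (1-2n|\Lambda |\exp (-c_\mathrm{hs}^{\prime }/h)\Big ). \end{aligned}$$

Here \(e^{-1}\) is the probability that a rate-1 Poisson clock rings exactly once in \([0,1]\), and \(1/n!\) is the probability of any one fixed ordering of \(n\) independent uniform times. Since \(n\) depends only on \(b_1,b_2\) and \(\varepsilon _\mathrm{block}\), and \(|\Lambda |\) grows only polynomially in \(1/h\), the right-hand side stays bounded away from zero as \(h\rightarrow 0\).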

6 Space-time cones and rescaling

In this section we will turn the heuristic description of plus-cluster nucleation from Sect. 1.3 into a proof of Theorem 1.2.1. We will focus on the case \(\theta \in (0,\pi )\); the case \(\theta =2\pi \) is simpler.

We will apply the results in Sect. 5 with two different values of \(\theta \). We will first take \(\theta \) from the statement of Theorem 1.2.1 to specify the types of catalyst cones we look to for the formation of droplets of plus phase. We also take \(\theta =2\pi \) to describe the spread of plus phase in regions of typical dilution. Figure 4 shows how the results of this section come together. A critical droplet forms in a subset of a catalyst cone, grows to fill the whole of the catalyst cone, escapes the catalyst cone, then spreads to regions of typical dilution.

In Sects. 6.1–6.3 and 6.5 we take \(\theta \in (0,\pi )\cup \{2\pi \}\). In Sects. 6.4 and 6.6 we take \(\theta \in (0,\pi )\), but also refer to full Wulff shapes \(\mathbb{W }_{2\pi }(\,\cdot \,)\).

Recall that \(B_\mathrm{max}^\theta \) determines the size of the catalyst cones (4.5). Take \(B_\mathrm{min}^\theta \in (B_\mathrm{c}^\theta ,B_\mathrm{max}^\theta )\) to be fixed. We will take \(B_\mathrm{min}^\theta \) from Proposition 5.5.3 to maximize the rate of nucleation of plus droplets. We will choose \(B_\mathrm{min}^{2\pi }\gg B_\mathrm{root}^{2\pi }\) to take advantage of Proposition 6.5.1.

6.1 The graphical construction in space-time regions

Before we give the proof of Theorem 1.2.1, we need to extend the Ising dynamics to allow the size of the graph to change with time. With \(\Gamma _0,\Gamma _1,\dots ,\Gamma _n \subset \mathbb{Z }^d\) and \(t_0<t_1<\dots <t_{n+1}\), consider the space-time region

$$\begin{aligned} \Gamma =\mathrm{ST}(\Gamma _0,\dots ,\Gamma _n;t_0<\dots <t_{n+1}):=\bigcup _{i=0}^n \Gamma _i \times [ t_i, t_{i+1} ]. \end{aligned}$$
(6.1)

The graphical construction for the Ising model \(\mu ^{J,\zeta ,h}_\Lambda \) described in Sect. 2.4 can be extended to \(\Gamma \).

  1. (i)

    Let \(s\) denote the start time.

  2. (ii)

    Let \(\xi \) denote an initial configuration compatible with boundary conditions \(\zeta \) at time \(s\), i.e. if \(s\in [t_i,t_{i+1})\) then \(\xi \in \Sigma _{\Gamma _i}^\zeta \).

  3. (iii)

    Let \(\sigma ^{s,\xi }_{\Gamma ,\zeta ,h;s}=\xi \).

  4. (iv)

    If a vertex \(x\) is added to the dynamics at time \(t_i\) (i.e. \(x\in \Gamma _i\setminus \Gamma _{i-1}\)) then the spin \(\sigma ^{s,\xi }_{\Gamma ,\zeta ,h;t_i}(x)\) is taken to be \(\zeta (x)\) to match the boundary conditions. The spin at \(x\) may then change with each arrival of the corresponding Poisson process.

  5. (v)

    If \(x\) is removed from the dynamics at time \(t_i\) (i.e. \(x\in \Gamma _{i-1}\setminus \Gamma _i\)) then the spin at \(x\) is immediately switched to \(\zeta (x)\) to conform to the boundary conditions.
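Rules (i)-(v) can be sketched as a small data structure. The class below is a hypothetical illustration, not part of the construction in Sect. 2.4: the names `SpaceTimeRegion`, `slice_index` and `cross_boundary` are assumptions, the Poisson-clock updates between slice changes are omitted, and time is assigned to slices with the half-open convention \([t_i,t_{i+1})\) used in rule (ii).

```python
from bisect import bisect_right
from typing import Dict, Hashable, Iterable, Sequence

Vertex = Hashable


class SpaceTimeRegion:
    """ST(Gamma_0, ..., Gamma_n; t_0 < ... < t_{n+1}) stored as slices and times."""

    def __init__(self, slices: Sequence[Iterable[Vertex]], times: Sequence[float]):
        if len(times) != len(slices) + 1:
            raise ValueError("need exactly one more time than slices")
        if any(a >= b for a, b in zip(times, times[1:])):
            raise ValueError("times must be strictly increasing")
        self.slices = [frozenset(s) for s in slices]
        self.times = list(times)

    def slice_index(self, t: float) -> int:
        """Index i with t in [t_i, t_{i+1}); the final time is assigned to the last slice."""
        if not self.times[0] <= t <= self.times[-1]:
            raise ValueError("t lies outside the region")
        return min(bisect_right(self.times, t) - 1, len(self.slices) - 1)

    def cross_boundary(
        self, i: int, spins: Dict[Vertex, int], zeta: Dict[Vertex, int]
    ) -> Dict[Vertex, int]:
        """Apply rules (iv) and (v) when passing from slice i-1 to slice i at time t_i."""
        if not 1 <= i < len(self.slices):
            raise ValueError("i must index a slice with a predecessor")
        old, new = self.slices[i - 1], self.slices[i]
        # Rule (v): vertices leaving the region are dropped, i.e. they revert to zeta.
        updated = {x: s for x, s in spins.items() if x in new}
        # Rule (iv): vertices entering the region start from the boundary value zeta(x).
        for x in new - old:
            updated[x] = zeta[x]
        return updated
```

The returned dictionary holds only the spins that are still dynamic; every vertex outside the current slice is understood to carry the boundary value \(\zeta (x)\).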

The graphical construction of \(\mathbb{P }_J\) allows us to link together the Ising dynamics run in overlapping space-time regions. This can be used to chain together the different steps involved in the growth of a region of plus phase.

Remark 6.1.1

Consider two space-time regions such that the top layer of the first region covers the start of the second region:

$$\begin{aligned} \Gamma&= \mathrm{ST}(\Gamma _0,\Gamma _1,\dots ,\Gamma _m;t_0<t_1<\dots <t_{m+1}),\\ \Delta&= \mathrm{ST}(\Delta _0,\Delta _1,\dots ,\Delta _n;u_0<u_1<\dots <u_{n+1}),\\ \Gamma _m&= \Delta _0\,\text{ and}\,u_0= t_m<t_{m+1}\le u_1. \end{aligned}$$

If \(\sigma ^{t_0,\xi }_{\Gamma ,-,h;t_{m+1}}= \sigma ^{t_m,+}_{\Gamma ,-,h;t_{m+1}}\) and \(\sigma ^{u_0,+}_{\Delta ,-,h;u_{n+1}}=\sigma ^{u_n,+}_{\Delta ,-,h;u_{n+1}}\) then

$$\begin{aligned} \sigma ^{t_0,\xi }_{\Gamma \cup \Delta ,-,h;u_{n+1}}= \sigma ^{u_n,+}_{\Delta ,-,h;u_{n+1}}. \end{aligned}$$

6.2 Droplet creation in a Summertop cone

Let \(\theta \in (0,\pi )\cup \{2\pi \}\) and \(\delta >0\). With reference to Proposition 5.5.3 choose \(B_\mathrm{min}^\theta \in (B_\mathrm{c}^\theta ,B_\mathrm{root}^\theta )\) such that with high \(\mathbb{Q }_\theta \)-probability,

$$\begin{aligned} \mathrm{gap}(\mathbb{W }_\theta (B_\mathrm{min}^\theta ),-,h)\ge \exp (-\delta /(2h^{d-1})). \end{aligned}$$
(6.2)

Let \(\Lambda =\mathbb{W }_\theta (B_\mathrm{min}^\theta )\). Heuristically, we expect critical droplets to form in \(\Lambda \) at rate \(\exp (-\mathsf{{E}}^\theta _\mathrm{c}/h^{d-1})\). Let \(\hat{\mu }^{J,-,h}_\Lambda =\mu ^{J,-,h}_\Lambda ( \,\cdot \,\mid \mathcal{C })\) denote the conditional measure defined by (5.7) with \(\varepsilon _\mathcal{C }=(B_\mathrm{min}^\theta -B_\mathrm{c}^\theta )/3\).

Proposition 6.2.1

With high \(\mathbb{Q }_\theta \)-probability,

$$\begin{aligned} \mathbb{P }_J[\sigma ^{0,-}_{\Lambda ,-,h;\exp (\delta /h^{d-1})}\in \mathcal{C }]\ge \tfrac{1}{2} \mu ^{J,-,h}_\Lambda (\mathcal{C })\ge 2\exp (-\mathsf{{E}}^\theta _\mathrm{c}/h^{d-1}). \end{aligned}$$

Proof

Taking \(a=0\) in the last inequality in the proof of Proposition 5.1.3, we can assume that \(\mu ^{J,-,h}_\Lambda (\mathcal{C })\ge 4\exp (-\mathsf{{E}}^\theta _\mathrm{c}/h^{d-1})\). By (6.2) and a Markov chain mixing inequality (see [21, (59)]) the total variation distance between \(\sigma ^{0,-}_{\Lambda ,-,h;\exp (\delta /h^{d-1})}\) and \(\mu ^{J,-,h}_\Lambda \) is less than \(2\exp (-\mathsf{{E}}^\theta _\mathrm{c}/h^{d-1})\). \(\square \)
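For completeness, the two bounds in the proof combine as follows. Here \(T_\delta =\exp (\delta /h^{d-1})\) and \(d_\mathrm{TV}\) denotes the total variation distance just mentioned; both abbreviations are used only for this display.

$$\begin{aligned} \mathbb{P }_J[\sigma ^{0,-}_{\Lambda ,-,h;T_\delta }\in \mathcal{C }]&\ge \mu ^{J,-,h}_\Lambda (\mathcal{C })-d_\mathrm{TV}\\&\ge \mu ^{J,-,h}_\Lambda (\mathcal{C })-2\exp (-\mathsf{{E}}^\theta _\mathrm{c}/h^{d-1})\\&\ge \mu ^{J,-,h}_\Lambda (\mathcal{C })-\tfrac{1}{2}\mu ^{J,-,h}_\Lambda (\mathcal{C })=\tfrac{1}{2}\mu ^{J,-,h}_\Lambda (\mathcal{C })\ge 2\exp (-\mathsf{{E}}^\theta _\mathrm{c}/h^{d-1}). \end{aligned}$$

The third line uses \(\mu ^{J,-,h}_\Lambda (\mathcal{C })\ge 4\exp (-\mathsf{{E}}^\theta _\mathrm{c}/h^{d-1})\) twice: once to absorb the error term and once for the final lower bound.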

6.3 Growing in a Summertop cone

In this section we will use the “inverted space-time pyramids” of [22] to show that under \(\mathbb{Q }_\theta \), droplets of plus phase tend to expand from \(\mathbb{W }_\theta (B_\mathrm{min}^\theta )\) to \(\mathbb{W }_\theta (B_\mathrm{max}^\theta )\) with high probability.

With \(\delta >0\), and with reference to (4.7) and (6.1), consider the space-time region

$$\begin{aligned} \triangledown&= \triangledown (B_\mathrm{min}^\theta ,B_\mathrm{max}^\theta ,\delta ,\theta ):=\mathrm{ST}(\Lambda _0,\dots ,\Lambda _n;t_0<\dots <t_{n+1}),\\ \Lambda _i&:= \Delta _\theta ^{i+|\mathbb{W }_\theta (B_\mathrm{min}^\theta )|},\quad t_i:=i\exp (\delta /h^{d-1}), \quad n:=|\mathbb{W }_\theta (B_\mathrm{max}^\theta )|-|\mathbb{W }_\theta (B_\mathrm{min}^\theta )|. \end{aligned}$$

For each \(i\) let \(\hat{\mu }^{J,-,h}_{\Lambda _i}=\mu ^{J,-,h}_{\Lambda _i} ( \,\cdot \,\mid \mathcal{C })\) with \(\varepsilon _\mathcal{C }=(B_\mathrm{min}^\theta -B_\mathrm{c}^\theta )/3\).

For \(\eta \in \Sigma _{\mathbb{W }_\theta (B_\mathrm{min}^\theta )}^-\) consider the event

$$\begin{aligned} G_\eta :=\left\{ \sigma ^{0,\eta }_{\triangledown ,-,h;t_{n+1}}= \sigma ^{t_n,+}_{\triangledown ,-,h;t_{n+1}}\right\} . \end{aligned}$$

For \(\eta \in \mathcal{C }\), \(G_\eta \) describes the plus phase spreading from \(\mathbb{W }_\theta (B_\mathrm{min}^\theta )\) to \(\mathbb{W }_\theta (B_\mathrm{max}^\theta )\) in time \(t_n\). The event \(G_\eta \) depends only on the elements of the graphical construction contained in \(\triangledown \). Here is an extension of [22, Proposition 3.2.2] to the dilute Ising model.

Proposition 6.3.1

There are positive constants \(\delta _0,C,C_1\) such that if \(0<\delta \le \delta _0\) then with high \(\mathbb{Q }_\theta \)-probability

$$\begin{aligned} \int \mathbb{P }_J(G_\eta ) \,\mathrm{d}\hat{\mu }^{J,-,h}_{\Lambda _0}(\eta ) \ge 1-C \exp (-C_1/h). \end{aligned}$$

Proof of Proposition 6.3.1

We will assume that \(J\) belongs to a certain event with high \(\mathbb{Q }_\theta \)-probability; the event is defined implicitly by our use of results from Sect. 5.

For \(i=1,\dots ,n\) and \(\zeta \in \Sigma _{\Lambda _{i-1}}^-\) let \(G^i_\zeta =\left\{ \sigma ^{t_{i-1},\zeta }_ {\triangledown ,-,h;t_{i+1}}=\sigma ^{t_i,+}_{\triangledown ,-,h;t_{i+1}}\right\} \);

$$\begin{aligned} G^1_\eta \cap \left( \bigcap _{i=2}^n G^i_+\right) \implies G_\eta . \end{aligned}$$

By monotonicity \(G^i_\zeta \subset G^i_+\). It is sufficient to show that for each \(i\),

$$\begin{aligned} \int \mathbb{P }_J(G^i_\zeta ) \,\mathrm{d}\hat{\mu }^{J,-,h}_{\Lambda _{i-1}}(\zeta ) \ge 1-C \exp (-C_1/h). \end{aligned}$$
(6.3)
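To see why (6.3) suffices, one can combine the implication above with monotonicity and a union bound. The display below is a sketch; it relies on the assumption that \(n\le |\mathbb{W }_\theta (B_\mathrm{max}^\theta )|\) grows only polynomially in \(1/h\), which reflects the Wulff shape \(\mathbb{W }_\theta (b)\) having volume of order \((bN)^d\) with \(N\approx 1/h\).

$$\begin{aligned} \int \mathbb{P }_J(G_\eta ^\mathrm{c}) \,\mathrm{d}\hat{\mu }^{J,-,h}_{\Lambda _0}(\eta )&\le \int \mathbb{P }_J((G^1_\eta )^\mathrm{c}) \,\mathrm{d}\hat{\mu }^{J,-,h}_{\Lambda _0}(\eta )+\sum _{i=2}^n \mathbb{P }_J((G^i_+)^\mathrm{c})\\&\le \sum _{i=1}^n \int \mathbb{P }_J((G^i_\zeta )^\mathrm{c}) \,\mathrm{d}\hat{\mu }^{J,-,h}_{\Lambda _{i-1}}(\zeta )\le nC\exp (-C_1/h). \end{aligned}$$

The second inequality uses \(G^i_\zeta \subset G^i_+\), so that \(\mathbb{P }_J((G^i_+)^\mathrm{c})\le \mathbb{P }_J((G^i_\zeta )^\mathrm{c})\) for every \(\zeta \). The polynomial factor \(n\) can then be absorbed by decreasing \(C_1\) slightly and increasing \(C\).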

With reference to Proposition 5.1.3, if we start the dynamics with initial distribution \(\hat{\mu }^{J,-,h}_{\Lambda _{i-1}}\) we expect to stay inside \(\mathcal{C }\) for a long time. Let \((\hat{\sigma }^{s,\zeta }_{\triangledown ,-,h;t})_{t\ge s}\) denote the Markov chain obtained from the graphical construction by suppressing any jumps from \(\mathcal{C }\) to \(\mathcal{C }^\mathrm{c}\). By introducing a stopping time

$$\begin{aligned} \tau ^\zeta _i=\inf \{t\ge t_i : \sigma ^{t_i,\zeta }_{\triangledown ,-,h;t}\not = \hat{\sigma }^{t_i,\zeta }_{\triangledown ,-,h;t}\}, \end{aligned}$$

we will see that the modified dynamics are likely to agree with the regular dynamics over the interval \([t_i,t_{i+1}]\).

Let \(\sigma ^x\) denote the configuration obtained from \(\sigma \) by flipping the spin at \(x\), and let

$$\begin{aligned} \partial \mathcal{C }=\{\sigma \in \mathcal{C }:\exists x, \sigma ^x \in \mathcal{C }^\mathrm{c}\}. \end{aligned}$$

Proposition 5.1.3 gives an upper bound on \(\hat{\mu }^{J,-,h}_{\Lambda _i}(\partial \mathcal{C })\). Given \(B_\mathrm{min}^\theta \) we can find \(\delta _0>0\) such that

$$\begin{aligned} \hat{\mu }^{J,-,h}_{\Lambda _i}(\partial \mathcal{C }) \le \exp (-3\delta _0/h^{d-1}). \end{aligned}$$
(6.4)

If the starting state \(\zeta \) is sampled from \(\hat{\mu }^{J,-,h}_{\Lambda _i}\) then the process \((\hat{\sigma }^{t_i,\zeta }_{\triangledown ,-,h;t})_{t\in [t_i,t_{i+1}]}\) is stationary. By (6.4) (cf. [22, (2.12)]), if \(\delta \le \delta _0\) and \(h\) is sufficiently small,

$$\begin{aligned} \int \mathbb{P }_J(\tau ^\zeta _i\le t_{i+1}) \,\mathrm{d}\hat{\mu }^{J,-,h}_{\Lambda _i}(\zeta ) \le \exp (-\delta _0/h^{d-1}). \end{aligned}$$

For some \(x\), \(\Lambda _i=\Lambda _{i-1}\cup \{x\}\). We will need a bound on the effect that adding this extra vertex has on the conditional Ising measures \((\hat{\mu }^{J,-,h}_{\Lambda _i})_i\). By the Ising model’s finite-energy property, for any \(h_0>0\),

$$\begin{aligned} \alpha :=\inf _{0<h<h_0} \inf _{\zeta \in \Sigma } \inf _J \inf _{s=\pm 1} \mu ^{J,\zeta ,h}_{\{0\}} (\sigma (0)=s) >0. \end{aligned}$$
(6.5)

By the Ising model’s Markov property, for \(\zeta \in \mathcal{C }\cap \Sigma _{\Lambda _{i-1}}^-\),

$$\begin{aligned} \hat{\mu }^{J,-,h}_{\Lambda _i} (\zeta )/\hat{\mu }^{J,-,h}_{\Lambda _{i-1}} (\zeta ) = \mu ^{J,-,h}_{\Lambda _i}(\sigma (x)=-1\mid \mathcal{C })\ge \alpha . \end{aligned}$$

Therefore (cf. [22, (3.28)]),

$$\begin{aligned}&\int \mathbb{P }_J((G^i_\zeta )^\mathrm{c})\,\mathrm{d}\hat{\mu }^{J,-,h}_{\Lambda _{i-1}}(\zeta )\nonumber \\ \quad&=\int \mathbb{P }_J(\sigma ^{t_{i-1},\zeta }_{\triangledown ,-,h;t_{i+1}} \not = \sigma ^{t_i,+}_{\triangledown ,-,h;t_{i+1}})\,\mathrm{d}\hat{\mu }^{J,-,h}_{\Lambda _{i-1}}(\zeta )\nonumber \\ \quad&\le \int \mathbb{P }_J(\sigma ^{t_i,\zeta }_{\triangledown ,-,h;t_{i+1}} \not = \sigma ^{t_i,+}_{\triangledown ,-,h;t_{i+1}}) \,\mathrm{d}\hat{\mu }^{J,-,h}_{\Lambda _{i-1}}(\zeta ) + \int \mathbb{P }_J(\tau ^\zeta _{i-1}\le t_i) \,\mathrm{d}\hat{\mu }^{J,-,h}_{\Lambda _{i-1}}(\zeta ) \nonumber \\ \quad&\le \alpha ^{-1} \int \mathbb{P }_J(\sigma ^{t_i,\zeta }_{\triangledown ,-,h;t_{i+1}}\not = \sigma ^{t_i,+}_{\triangledown ,-,h;t_{i+1}}) \,\mathrm{d}\hat{\mu }^{J,-,h}_{\Lambda _i}(\zeta ) + \exp (-\delta _0/h^{d-1}). \end{aligned}$$
(6.6)

Set

$$\begin{aligned} b_\gamma =(1-\gamma )B_\mathrm{c}^\theta +\gamma B_\mathrm{min}^\theta , \quad \quad \gamma \in [0,1]. \end{aligned}$$
(6.7)

Thus \(B_\mathrm{c}^\theta =b_0<b_{1/3}<b_{2/3}<b_1=B_\mathrm{min}^\theta \). Choose \(b\) such that \(\mathbb{W }_\theta ^{\prime }(b)=\Lambda _i\). By monotonicity and the invariance of the modified dynamics with respect to \(\hat{\mu }^{J,-,h}_{\mathbb{W }_\theta ^{\prime }(b)}\),

$$\begin{aligned}&\int \mathbb{P }_J(\sigma ^{t_i,\zeta }_{\triangledown ,-,h;t_{i+1}}\not = \sigma ^{t_i,+}_{\triangledown ,-,h;t_{i+1}}) \,\mathrm{d}\hat{\mu }^{J,-,h}_{\mathbb{W }_\theta ^{\prime }(b)}(\zeta )\\&\quad \le \int \mathbb{P }_J(\sigma ^{t_i,+}_{\triangledown ,-,h;t_{i+1}}>\hat{\sigma }^ {t_i,\zeta }_{\triangledown ,-,h;t_{i+1}})+\hat{\mu }^{J,-,h}_{\mathbb{W }_\theta ^{\prime }(b)} (\tau _i^\zeta \le t_{i+1}) \,\mathrm{d}\hat{\mu }^{J,-,h}_{\mathbb{W }_\theta ^{\prime }(b)}(\zeta )\\&\quad \le \int \sum _{y\in \mathbb{W }_\theta ^{\prime }(b)} \Big \{ \mathbb{P }_J(\sigma ^{t_i,+}_{\triangledown ,-,h;t_{i+1}}(y)=1)- \mathbb{P }_J(\hat{\sigma }^{t_i,\zeta }_{\triangledown ,-,h;t_{i+1}}(y)=1)\Big \} \,\mathrm{d}\hat{\mu }^{J,-,h}_{\mathbb{W }_\theta ^{\prime }(b)}(\zeta )\\&\qquad +\exp (-\delta _0/h^{d-1}) \\&\quad \le \sum _{y\in \mathbb{W }_\theta (b_{2/3})} \Big \{\mathbb{P }_J(\sigma ^{0,+}_{\mathbb{W }_\theta ^{\prime }(b),+,h;\exp (\delta /h^{d-1})}(y)=1) -\hat{\mu }^{J,-,h}_{\mathbb{W }_\theta ^{\prime }(b)}(\sigma (y)=1)\Big \}\\&\qquad +\sum _{y\in \mathbb{W }_\theta ^{\prime }(b_{2/3},b)} \Big \{\mathbb{P }_J(\sigma ^{0,+}_{\mathbb{W }_\theta ^{\prime }(b_{1/3},b),(+,-),h; \exp (\delta /h^{d-1})}(y)=1)-\hat{\mu }^{J,-,h}_{\mathbb{W }_\theta ^{\prime }(b)}(\sigma (y)=1)\Big \}\\&\qquad +\exp (-\delta _0/h^{d-1}). \end{aligned}$$

For \(y\in \mathbb{W }_\theta (b_{2/3})\), by Proposition 5.5.1 and Markov chain mixing [21, (59)],

$$\begin{aligned}&|\mathbb{P }_J(\sigma ^{0,+}_{\mathbb{W }_\theta ^{\prime }(b),+,h;\exp (\delta /h^{d-1})}(y)=1)- \mu ^{J,+,h}_{\mathbb{W }_\theta ^{\prime }(b)}(\sigma (y)=1)|\\&\le \exp \left[-\exp (\delta /h^{d-1})\mathrm{gap}(\mathbb{W }_\theta ^{\prime }(b),+,h)\right] \Big /\mu ^{J,+,h}_{\mathbb{W }_\theta ^{\prime }(b)}(\sigma = +)\\&\le \exp \left[-\exp (\delta /(2h^{d-1}))\right]. \end{aligned}$$
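The last inequality can be checked along the following lines. This is a sketch: the choice \(\varepsilon =\delta /4\) in Proposition 5.5.1 and the crude bound \(\mu ^{J,+,h}_{\mathbb{W }_\theta ^{\prime }(b)}(\sigma =+)\ge \alpha ^{|\mathbb{W }_\theta ^{\prime }(b)|}\), with \(\alpha \) the finite-energy constant of (6.5), are assumptions made here for illustration and need not be the constants used in [21] or [22].

$$\begin{aligned} \exp (\delta /h^{d-1})\,\mathrm{gap}(\mathbb{W }_\theta ^{\prime }(b),+,h)&\ge \exp (\delta /h^{d-1})\exp (-\delta /(4h^{d-1}))=\exp (3\delta /(4h^{d-1})),\\ \frac{\exp \left[-\exp (\delta /h^{d-1})\,\mathrm{gap}(\mathbb{W }_\theta ^{\prime }(b),+,h)\right]}{\mu ^{J,+,h}_{\mathbb{W }_\theta ^{\prime }(b)}(\sigma =+)}&\le \exp \left[-\exp (3\delta /(4h^{d-1}))+|\mathbb{W }_\theta ^{\prime }(b)|\log (1/\alpha )\right]. \end{aligned}$$

Because \(|\mathbb{W }_\theta ^{\prime }(b)|\) grows only polynomially in \(1/h\), while \(\exp (3\delta /(4h^{d-1}))-\exp (\delta /(2h^{d-1}))\) grows faster than any power of \(1/h\), the exponent on the right is at most \(-\exp (\delta /(2h^{d-1}))\) once \(h\) is sufficiently small.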

Similarly for \(y\in \mathbb{W }_\theta ^{\prime }(b_{2/3},b)\), by Proposition 5.5.2,

$$\begin{aligned}&|\mathbb{P }_J(\sigma ^{0,+}_{\mathbb{W }_\theta ^{\prime }(b_{1/3},b),(+,-),h;\exp (\delta /h^{d-1})}(y)=1)- \mu ^{J,(+,-),h}_{\mathbb{W }_\theta ^{\prime }(b_{1/3},b)}(\sigma (y)=1)|\\&\quad \le \exp \left[-\exp (\delta /h^{d-1})\mathrm{gap}(\mathbb{W }_\theta ^{\prime }(b_{1/3},b),(+,-),h)\right] \Big / \mu ^{J,+,h}_{\mathbb{W }_\theta ^{\prime }(b_{1/3},b)}(\sigma =+)\\&\quad \le \exp \left[-\exp (\delta /(2h^{d-1}))\right]. \end{aligned}$$

By the above

$$\begin{aligned}&\int \mathbb{P }_J(\sigma ^{t_i,\zeta }_{\triangledown ,-,h;t_{i+1}}\not =\sigma ^{t_i,+}_ {\triangledown ,-,h;t_{i+1}}) \,\mathrm{d}\hat{\mu }^{J,-,h}_{\mathbb{W }_\theta ^{\prime }(b)}(\zeta )\nonumber \\&\le \sum _{y\in \mathbb{W }_\theta (b_{2/3})}\mu ^{J,+,h}_{\mathbb{W }_\theta ^{\prime }(b)} (\sigma (y)=+1)-\hat{\mu }^{J,-,h}_{\mathbb{W }_\theta ^{\prime }(b)}(\sigma (y)=+1)\nonumber \\&\quad +\sum _{y\in \mathbb{W }_\theta ^{\prime }(b_{2/3},b)}\mu ^{J,(+,-),h}_{\mathbb{W }_\theta ^{\prime }(b_{1/3},b)} (\sigma (y)=+1)-\hat{\mu }^{J,-,h}_{\mathbb{W }_\theta ^{\prime }(b)}(\sigma (y)=+1)\nonumber \\&\quad + \exp (-\delta _0/h^{d-1})+ |\mathbb{W }_\theta ^{\prime }(b)|\exp \left[-\exp (\delta /(2h^{d-1}))\right]\!. \end{aligned}$$
(6.8)

By (5.34) and (5.35),

$$\begin{aligned}&\sum _{y\in \mathbb{W }_\theta (b_{2/3})}\mu ^{J,+,h}_{\mathbb{W }_\theta ^{\prime }(b)} (\sigma (y)=+1)-\hat{\mu }^{J,-,h}_{\mathbb{W }_\theta ^{\prime }(b)}(\sigma (y)=+1)\nonumber \\&\qquad +\sum _{y\in \mathbb{W }_\theta ^{\prime }(b_{2/3},b)}\mu ^{J,(+,-),h}_{\mathbb{W }_\theta ^{\prime }(b_{1/3},b)}(\sigma (y)=+1)- \hat{\mu }^{J,-,h}_{\mathbb{W }_\theta ^{\prime }(b)}(\sigma (y)=+1)\nonumber \\&\quad \le |\mathbb{W }_\theta ^{\prime }(b)| \exp (-c/h). \end{aligned}$$
(6.9)

Inequality (6.3) now follows by (6.6), (6.8) and (6.9). \(\square \)

6.4 Escaping from Summertop cones

In Proposition 6.3.1 we considered space-time pyramids. Consider now “space-time parallelepipeds”. From now on we will write \(2\pi \) in place of \(\theta \) to make it clear that \(\theta \) refers to the angle of the catalyst cone. Let \(a\) denote a positive constant and let \(b=1.01 B_\mathrm{c}^{2\pi }\). We can find a sequence of graphs \(\Lambda _0,\Lambda _1,\dots ,\Lambda _n\) such that

  1. (i)

    \(\Lambda _0=\mathbb{W }_{2\pi }(b)\),

  2. (ii)

    \(\Lambda _n=\mathbb{W }_{2\pi }(b)+aN\mathbf{{e}}_1\),

  3. (iii)

    \(\Lambda _{i+1}\) differs from \(\Lambda _i\) by adding a vertex or removing a vertex,

  4. (iv)

    for any \(i\), for some \(k_i\in (0,aN)\), \(\Lambda _i\) differs from \(\mathbb{W }_{2\pi }(b)+k_i\mathbf{{e}}_1\) by at most a mesoscopic layer of vertices around the boundary, and

  5. (v)

    \(n=\mathrm{{O}}(a b^{d-1}/h^d)\).

Let

$$\begin{aligned} \lozenge =\lozenge (a,b,\delta )=\mathrm{ST}(\Lambda _0,\dots ,\Lambda _n;t_0<\dots <t_{n+1}), \end{aligned}$$
(6.10)

with \(t_i:=i\exp (\delta /h^{d-1})\). In Fig. 4, the dotted lines indicate the area swept out by a space-time parallelepiped that starts inside the copy of \(\mathbb{W }_\theta (B_\mathrm{max}^\theta )\).

For \(\eta \in \Sigma _{\Lambda _0}^-\) consider the event

$$\begin{aligned} G_\eta :=\left\{ \sigma ^{0,\eta }_{\lozenge ,-,h;t_{n+1}}= \sigma ^{t_{n},+}_{\lozenge ,-,h;t_{n+1}}\right\} . \end{aligned}$$

Here is an extension of Proposition 6.3.1 to \(\lozenge \). Let \(\varepsilon _\mathcal{C }=(b- B_\mathrm{c}^{2\pi })/3\) and let \(\hat{\mu }^{J,-,h}_{\Lambda _i}:=\mu ^{J,-,h}_{\Lambda _i} ( \,\cdot \,\mid \mathcal{C }_i)\) with

$$\begin{aligned} \mathcal{C }_i=\left\{ \sigma :\int _{\mathcal{W }_{2\pi }(B_\mathrm{c}^{2\pi }+\varepsilon _\mathcal{C })+(k_i/N)\mathbf{{e}}_1} \mathbb{M }_K^{-}\,\mathrm{d}\mathcal{L }^d \ge (B_\mathrm{c}^{2\pi })^d\right\} . \end{aligned}$$
Fig. 4 The \(\mathsf{{Nuc}}(x,i)\) nucleation event in a catalyst \(\mathsf{{Cat}}(x)\)

Proposition 6.4.1

Let \(\lozenge \) be defined according to (6.10) with \(\delta \le \delta _0\). There are positive constants \(C,C_1\) such that with high \(\mathbb{Q }\)-probability

$$\begin{aligned} \int \mathbb{P }_J(G_\eta ) \,\mathrm{d}\hat{\mu }^{J,-,h}_{\Lambda _0}(\eta ) \ge 1-C \exp (-C_1/h). \end{aligned}$$

Proof

We can adapt the proof of Proposition 6.3.1, showing that (6.3) holds when, for example, \(\Lambda _i=\Lambda _{i-1}\setminus \{x\}\) for some vertex \(x\) on the boundary of \(\Lambda _{i-1}\). Let \((\hat{\sigma }^{s,\zeta }_{\lozenge ,-,h;t})_{t\ge s}\) denote the Markov chain obtained from the graphical construction by suppressing any jumps from \(\mathcal{C }_{i-1}\) to \(\mathcal{C }_{i-1}^\mathrm{c}\). The only place where the change is important is in inequality (6.6). Recall that the spins of vertices leaving \(\lozenge \) are set to \(-1\). Let \(\mu \) denote the measure obtained by sampling from \(\hat{\mu }^{J,-,h}_{\Lambda _{i-1}}\) and then setting the spin at \(x\) equal to \(-1\). Let \(\mu ^{\prime }=\mu ^{J,-,h}_{\Lambda _i}(\,\cdot \,\mid \mathcal{C }_{i-1})\). By the definition of \(\alpha \) (6.5),

$$\begin{aligned} \mu (\zeta ) \le \alpha ^{-1}\mu ^{\prime }(\zeta ), \quad \quad \zeta \in \Sigma _i^-. \end{aligned}$$

In place of (6.6) we have that

$$\begin{aligned}&\int \mathbb{P }_J((G^i_\zeta )^\mathrm{c})\,\mathrm{d}\hat{\mu }^{J,-,h}_{\Lambda _{i-1}}(\zeta )\\&\quad =\int \mathbb{P }_J(\sigma ^{t_{i-1},\zeta }_{\lozenge ,-,h;t_{i+1}} \not = \sigma ^{t_i,+}_{\lozenge ,-,h;t_{i+1}})\,\mathrm{d}\hat{\mu }^{J,-,h}_{\Lambda _{i-1}}(\zeta )\\&\quad \le \int \mathbb{P }_J(\sigma ^{t_i,\zeta }_{\lozenge ,-,h;t_{i+1}} \not = \sigma ^{t_i,+}_{\lozenge ,-,h;t_{i+1}}) \,\mathrm{d}\mu (\zeta ) + \int \mathbb{P }_J(\tau ^\zeta _{i-1}\le t_i) \,\mathrm{d}\hat{\mu }^{J,-,h}_{\Lambda _{i-1}}(\zeta ) \\&\quad \le \alpha ^{-1} \int \mathbb{P }_J(\sigma ^{t_i,\zeta }_{\lozenge ,-,h;t_{i+1}}\not =\sigma ^{t_i,+}_{\lozenge ,-,h;t_{i+1}}) \,\mathrm{d}\mu ^{\prime }(\zeta ) + \exp (-\delta _0/h^{d-1}). \end{aligned}$$

The rest of the proof follows mutatis mutandis. \(\square \)

6.5 Stronger growth beyond \(B_\mathrm{root}^\theta \)

In Proposition 6.3.1 we require \(B_\mathrm{min}^\theta > B_\mathrm{c}^\theta \). If in addition \(B_\mathrm{min}^\theta >B_\mathrm{root}^\theta \) then we get the following stronger result corresponding to [22, Proposition 3.2.1].

Proposition 6.5.1

Let \(B_\mathrm{max}^{2\pi }>B_\mathrm{min}^{2\pi }>B_\mathrm{root}^{2\pi }\) and \(\delta >0\). Consider \(\triangledown =\triangledown (B_\mathrm{min}^{2\pi },B_\mathrm{max}^{2\pi },\delta ,2\pi )\). There are positive constants \(C,C_1\) such that with high \(\mathbb{Q }\)-probability

$$\begin{aligned} \int \mathbb{P }_J(G_\eta ) \,\mathrm{d}\mu ^{J,-,h}_{\mathbb{W }_{2\pi }(B_\mathrm{min}^{2\pi })}(\eta )\ge 1-C \exp (-C_1/h). \end{aligned}$$

Moreover, \(C_1\) is a function of \(B_\mathrm{min}^{2\pi }\) and \(C_1\rightarrow \infty \) as \(B_\mathrm{min}^{2\pi }\rightarrow \infty \).

Proof of Proposition 6.5.1

Taking \(\theta =2\pi \), the proof of this proposition is very similar to the proof of Proposition 6.3.1. Define \(b_\gamma =(1-\gamma )B_\mathrm{root}^{2\pi }+\gamma B_\mathrm{min}^{2\pi }\) in place of (6.7). We can then simply replace \(\hat{\mu }^{J,-,h}_{\Lambda _i}\) with \(\mu ^{J,-,h}_{\Lambda _i}\). The need for the modified dynamics and the stopping time has disappeared; the \(\exp (-\delta _0/h^{d-1})\) terms can be removed from the proof.

Let \(\varepsilon _\mathrm{hs}=(B_\mathrm{min}^{2\pi }-B_\mathrm{root}^{2\pi })/(8B_\mathrm{min}^{2\pi })\). In the proof of (6.9), inequalities (5.30) and (5.33) take the place of (5.34) and (5.35), respectively. We can therefore replace the term \(\exp (-c/h)\) with \(2\exp (-c_\mathrm{hs}B_\mathrm{min}^{2\pi }/h)\). This yields the claim that \(C_1\rightarrow \infty \) as \(B_\mathrm{min}^{2\pi }\rightarrow \infty \). \(\square \)

6.6 Proof of Theorem 1.2.1

The quantities \(\lambda , \lambda _2^\theta , C_0, f, \theta \) and \(h\,(\approx \!\!N^{-1})\) are from the statement of the theorem. Take \(\theta \in (0,\pi )\); the proof is easily modified to deal with the case \(\theta =2\pi \). There are three parts to the proof.

  1. (1)

    Define a rescaled space-time lattice.

  2. (2)

    Define a class of catalysts that create droplets of plus phase capable of spreading to nearby regions of space with typical dilution.

  3. (3)

    Show that in areas of typical dilution, large droplets of plus phase propagate with essentially linear speed.

  • Step (1) With reference to Proposition 6.5.1, choose \(B_\mathrm{min}^{2\pi }\) such that \(C_1\ge C_0+1\). Let \(R=\text{ diameter}\,(\mathcal{W }_{2\pi }(B_\mathrm{min}^{2\pi }+1))\). With reference to (5.30), take \(B_\mathrm{max}^{2\pi }\) to be greater than \(3 (B_\mathrm{min}^{2\pi }+1)\) and large enough that with high \(\mathbb{Q }\)-probability, for \(x\in [-1/h,1/h]^d\),

    $$\begin{aligned} |\mu ^{J,-,h}_{\mathbb{W }_{2\pi }(B_\mathrm{max}^{2\pi })}(\sigma (x)) -\mu ^{J,h}(\sigma (x))| \le \exp (-C_1/h). \end{aligned}$$
    (6.11)

    Consider a rescaled space-time lattice consisting of a collection of overlapping translations of \(\triangledown =\triangledown (B_\mathrm{min}^{2\pi },B_\mathrm{max}^{2\pi },\delta ,2\pi )\): let

    $$\begin{aligned} \triangledown _{(x,i)}=\triangledown + (R Nx,iT), \quad \quad (x,i)\in \mathbb{Z }^d\times \mathbb{N }, \end{aligned}$$

    with \(T\) denoting the time from the start of the first slice of \(\triangledown \) to the start of the final slice of \(\triangledown \),

    $$\begin{aligned} T:=|\mathbb{W }_{2\pi }(B_\mathrm{min}^{2\pi },B_\mathrm{max}^{2\pi })|\times T_\delta \quad \text{ with}\quad T_\delta :=\exp (\delta /h^{d-1}). \end{aligned}$$

    Time-wise, the top slice of \(\triangledown _{(x,i)}\) overlaps the bottom slice of \(\triangledown _{(x,i+1)}\). By the choice of \(B_\mathrm{min}^{2\pi },B_\mathrm{max}^{2\pi }\) and \(R\),

    $$\begin{aligned}&[\mathbb{W }_{2\pi }(B_\mathrm{min}^{2\pi }) + R N \mathbf{{e}}_1] \cap \mathbb{W }_{2\pi }(B_\mathrm{min}^{2\pi }) = \varnothing , \quad \text{ and}\\&[\mathbb{W }_{2\pi }(B_\mathrm{min}^{2\pi }) + R N \mathbf{{e}}_1] \subset \mathbb{W }_{2\pi }(B_\mathrm{max}^{2\pi }). \end{aligned}$$

    If \(\Vert x-y\Vert _1=1\), then \(\triangledown _{(x,0)}\) and \(\triangledown _{(y,0)}\) do not intersect at time 0, but they then ‘invade’ each other: at time \(T\), \(\triangledown _{(x,0)}\) covers \(\triangledown _{(y,1)}\).

The notion of \(k\)-dependence [18] is useful on the rescaled lattice. We will say that a collection \((E_x:x\in \mathbb{Z }^d)\) of \(\mathbb{Q }\)-measurable events is \(k\)-dependent if for all \(A,B\subset \mathbb{Z }^d\) such that the \(L_\infty \)-distance from \(A\) to \(B\) is greater than \(k\), \((E_x:x\in A)\) is independent of \((E_x:x\in B)\). Similarly, a collection \((E_z:z\in \mathbb{Z }^d\times \mathbb{N })\) of \(\mathbb{E }_J\)-measurable events is \(k\)-dependent if \((E_z:z\in A)\) and \((E_z:z\in B)\) are independent for \(A,B\subset \mathbb{Z }^d\times \mathbb{N }\) at \(L_\infty \)-distance greater than \(k\).

The number of occurrences of low density \(k\)-dependent processes is approximately Poisson [3]. High density \(k\)-dependent processes stochastically dominate high density product measures [18].
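As a concrete illustration of \(k\)-dependence (a toy example, not taken from the paper), the following sketch builds events on a one-dimensional box from i.i.d. variables \(\xi _y\) by letting \(E_x\) hold when \(\xi _y=1\) for all \(|y-x|\le r\). Since \(E_x\) only looks at a window of radius \(r\), events at distance greater than \(2r\) use disjoint sets of variables and are therefore independent, which is the same mechanism that makes the events below \(2r\)-dependent. The names `block_events`, `window` and `windows_disjoint` are hypothetical.

```python
def window(x, r):
    """Set of sites y with |y - x| <= r that the event E_x inspects."""
    return set(range(x - r, x + r + 1))


def block_events(xi, r):
    """Return {x: E_x} where E_x holds iff xi[y] == 1 for all |y - x| <= r.

    Only sites x whose whole window lies inside the list `xi` are included.
    """
    n = len(xi)
    return {x: all(xi[y] == 1 for y in window(x, r)) for x in range(r, n - r)}


def windows_disjoint(x, y, r):
    """True iff E_x and E_y depend on disjoint sets of the underlying variables."""
    return not (window(x, r) & window(y, r))


if __name__ == "__main__":
    xi = [1, 1, 1, 0, 1, 1, 1, 1, 1]
    print(block_events(xi, r=1))
    # Windows are disjoint exactly when |x - y| > 2r, hence 2r-dependence.
    print(all(windows_disjoint(0, y, 1) == (abs(y) > 2) for y in range(-6, 7)))
```

With independent inputs, disjoint windows give independent events, so the field is \(2r\)-dependent in the sense defined above.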

  • Step (2) Recall that the Wulff shape \(\mathbb{W }_\theta (b)\) is defined to have a volume of approximately \((bN)^d\). It has size \(b\theta ^{-(d-1)/d}N\) in the \(\mathbf{{e}}_1\)-direction and size \(b\theta ^{1/d}N\) in the \(\mathbf{{e}}_2,\dots ,\mathbf{{e}}_d\)-directions. With reference to Proposition 4.2.1 we must take two things into account when choosing the size of the catalyst cone \(\mathbb{W }_\theta (B_\mathrm{max}^\theta )\). Firstly, having chosen \(B_\mathrm{min}^\theta \) to maximize the rate of nucleation of plus clusters (6.2) we must have \(B_\mathrm{max}^\theta \ge B_\mathrm{min}^\theta =\mathrm{{O}}(\theta ^{(d-1)/d})\). Secondly, \(B_\mathrm{max}^\theta \) must be large enough that a translation of \(\mathbb{W }_{2\pi }(1.01 B_\mathrm{c}^{2\pi })\) fits inside \(\mathbb{W }_\theta (b)\); \(\mathbb{W }_{2\pi }(1.01 B_\mathrm{c}^{2\pi })\) has diameter of order \(N\) as \(\beta \rightarrow \infty \) and \(\theta \rightarrow 0\) so \(B_\mathrm{max}^\theta \) must be of order \(\theta ^{-1/d}\). With reference to (4.5)-(4.6), this cone of dilution has probability

    $$\begin{aligned} \mathbb{Q }[\mathsf{{Catalyst}}(\theta )]=\exp (-C_\mathrm{dil}^\theta /h^{d-1}) \quad \text{ with}\quad C_\mathrm{dil}^\theta =\mathrm{{O}}\left(\frac{1}{\theta }\log \frac{1}{1-p}\right).\qquad \end{aligned}$$
    (6.12)

    For a catalyst to have the desired effect, the edges that do not need to be closed should have typical dilution. With reference to Fig. 4, we will identify catalysts as regions of typical dilution that are close to a region of high dilution.

We will write \(\mathsf{{Nuc}}{(x,i)}\) to denote the nucleation of a droplet of plus phase in the vicinity of the point \((x,i)\) of the rescaled lattice: choose \(r\in \mathbb{N }\) minimal such that

$$\begin{aligned} \left[\mathbb{W }_\theta (B_\mathrm{max}^\theta )-r{\mathbf{{e}}}_1\right] \cap \mathbb{W }_{2\pi }(B_\mathrm{max}^{2\pi }) = \varnothing ; \end{aligned}$$

then with

$$\begin{aligned} \square _{(x,i)}=[-rRN,rRN]^d\times [0,T+T_\delta ]+(RNx,iT) \end{aligned}$$

let

$$\begin{aligned} \mathsf{{Nuc}}(x,i):=\left\{ \sigma ^{iT,-}_{\square _{(x,i)},-,h;(i+1)T+T_\delta } \ge \sigma ^{(i+1)T,+}_{\triangledown _{(x,i)},-,h;(i+1)T+T_\delta } \right\} . \end{aligned}$$

Let \(\mathsf{{Cat}}(x)\) denote the event that for all \(i\in \mathbb{N }\),

$$\begin{aligned} \mathbb{P }_J(\mathsf{{Nuc}}(x,i))\ge \exp \left(-\mathsf{{E}}^\theta _\mathrm{c}/h^{d-1}\right). \end{aligned}$$

If the event \(\mathsf{{Catalyst}}(\theta )\), translated by \(RNx-rRN{\mathbf{{e}}}_1\), occurs, then \(\mathsf{{Cat}}(x)\) occurs with high \(\mathbb{Q }\)-probability. Figure 4 shows how \(\mathsf{{Nuc}}{(x,i)}\) can be written as the concatenation, using Remark 6.1.1, of the events described in Propositions 6.2.1-6.4.1:

  1. (1)

    A droplet of plus phase with the shape \(\mathbb{W }_\theta (B_\mathrm{min}^\theta )\) forms in a region of high dilution that resembles \(\mathbb{W }_\theta (B_\mathrm{max}^\theta )\) [Proposition 6.2.1].

  2. (2)

    The droplet expands in the sheltered region to cover a copy of \(\mathbb{W }_{2\pi }(1.01 B_\mathrm{c}^{2\pi })\) [Proposition 6.3.1].

  3. (3)

    The droplet of plus phase spreads to the right [Proposition 6.4.1] and

  4. (4)

    expands to cover a copy of \(\mathbb{W }_{2\pi }(B_\mathrm{max}^{2\pi })\) [Proposition 6.3.1 with \(\theta \) taken to be \(2\pi \)].

In the applications of Propositions 6.2.1-6.4.1 take the value of \(\delta \) to be \(\min \{\delta _0, (\lambda -\lambda _2^\theta )/6\}\). By symmetry, the probability of \(\mathsf{{Cat}}(x)\) is greater than \(\mathbb{Q }[\mathsf{{Catalyst}}(\theta )]\).

The \(\mathsf{{Cat}}(\,\cdot \,)\) and \(\mathsf{{Nuc}}{(\,\cdot \,,\,\cdot \,)}\) events are \(2r\)-dependent processes with respect to \(\mathbb{Z }^d\) and \(\mathbb{Z }^d\times \mathbb{N }\), respectively.

  • Step (3) Once a \(\mathsf{{Nuc}}{(\,\cdot \,,\,\cdot \,)}\) event occurs, we should expect the droplet of plus phase to spread with speed one on the rescaled lattice. However, now that an area of high dilution has been used to form a critical droplet, areas of high dilution become an obstacle, as they can prevent the droplet from spreading. First, we show that any catalyst \(\mathsf{{Cat}}(\,\cdot \,)\) is likely to be connected to the origin by a path of typical points of the rescaled lattice. Second, we show that plus phase spreads along this path with speed close to one. For \(x\in \mathbb{Z }^d\), let \(\mathsf{{Conductive}}(x)\) denote the event that the \(\mathbb{Q }\)-measurable event from Proposition 6.5.1, translated by \(R Nx\), holds. For each \(x\in \mathbb{Z }^d\), \(\mathsf{{Conductive}}(x)\) holds with high \(\mathbb{Q }\)-probability. When \(h\) is small, the \(\mathsf{{Conductive}}(\,\cdot \,)\) vertices therefore form a supercritical \(k\)-dependent site percolation process on \(\mathbb{Z }^d\) \((k\le 2r)\). Let \(\mathsf{{Backbone}}\) denote the unique infinite cluster of \(\mathsf{{Conductive}}(\,\cdot \,)\) vertices. Sites of the rescaled lattice that do not belong to \(\mathsf{{Backbone}}\) correspond to the rare regions of \(\mathbb{Z }^d\) where, due to an unusual local pattern of dilution, the dynamics relax over a time scale different from \(\exp (\lambda _2/h^{d-1})\). By a Peierls argument (or using [18, Theorem 0.0] and [12, Section 2.1]), with high \(\mathbb{Q }\)-probability the origin of the rescaled lattice belongs to \(\mathsf{{Backbone}}\). For \(x\in \mathsf{{Backbone}}\), let \(\mathsf{{Spread}}{(x,i)}\) denote the translation by \((R Nx,iT)\) of the \(\mathbb{P }_J\)-measurable event \(G_+\) from Proposition 6.5.1,

    $$\begin{aligned} \mathsf{{Spread}}{(x,i)}=\left\{ \sigma ^{iT,+}_{\triangledown _{(x,i)},-,h;(i+1)T+T_\delta }= \sigma ^{(i+1)T,+}_{\triangledown _{(x,i)},-,h;(i+1)T+T_\delta }\right\} . \end{aligned}$$

    The \(\mathsf{{Spread}}{(\,\cdot \,,\,\cdot \,)}\) events are \(k\)-dependent with \(k\le 2r\). Space-time paths of \(\mathsf{{Spread}}{(\,\cdot \,,\,\cdot \,)}\) events show how clusters of plus phase spread on \(\mathsf{{Backbone}}\) once they have formed.

Take \(M\) maximal such that \(\triangledown _{(0,4M)}\) finishes before time \(\exp (\lambda /h^{d-1})\). Let \(\mathsf{{Catalysts}}\) denote the set of points \(x\in \mathsf{{Backbone}}\) such that \(\mathsf{{Cat}}(x)\) occurs and \(x\) is connected to the origin inside \(\mathsf{{Backbone}}\) by a simple path of length \(m\in \{M,M+1,\dots ,2M\}\) (i.e. a path composed of \(\mathsf{{Conductive}}(\,\cdot \,)\) vertices).
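The definition of \(\mathsf{{Catalysts}}\) can be explored on a finite box with a breadth-first search. The sketch below is illustrative only: it assumes the \(\mathsf{{Conductive}}(\,\cdot \,)\) sites and the \(\mathsf{{Cat}}(\,\cdot \,)\) sites are given as finite sets of integer tuples, and it uses the connected cluster of the origin as a finite stand-in for \(\mathsf{{Backbone}}\). A shortest path is always simple, so a site at graph distance between \(M\) and \(2M\) from the origin certainly qualifies; sites whose only admissible simple paths are longer than the shortest one are missed, so the result is a subset of the set in the definition. The function names are hypothetical.

```python
from collections import deque


def bfs_distances(conductive, origin):
    """Graph distance from `origin` to every site of its nearest-neighbour
    cluster inside the set `conductive` (sites are tuples of ints)."""
    if origin not in conductive:
        return {}
    dist = {origin: 0}
    queue = deque([origin])
    dim = len(origin)
    while queue:
        x = queue.popleft()
        for axis in range(dim):
            for step in (-1, 1):
                y = x[:axis] + (x[axis] + step,) + x[axis + 1:]
                if y in conductive and y not in dist:
                    dist[y] = dist[x] + 1
                    queue.append(y)
    return dist


def candidate_catalysts(conductive, cat_sites, M, origin):
    """Sites where Cat holds whose shortest conductive path to the origin
    has length in {M, ..., 2M}; a sufficient condition, not the full set."""
    dist = bfs_distances(conductive, origin)
    return {x for x in cat_sites if x in dist and M <= dist[x] <= 2 * M}


if __name__ == "__main__":
    conductive = {(0, 0), (1, 0), (2, 0), (3, 0), (0, 1)}
    cat_sites = {(2, 0), (3, 0), (0, 1), (5, 5)}
    print(candidate_catalysts(conductive, cat_sites, M=2, origin=(0, 0)))
```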

Supercritical percolation clusters have positive density and surface-order large deviations [13], so for some constant \(c\), with high \(\mathbb{Q }\)-probability

$$\begin{aligned} |\mathsf{{Catalysts}}|\ge c M^d \mathbb{Q }[\mathsf{{Catalyst}}(\theta )]. \end{aligned}$$

Say that a \(\mathbb{P }_J\)-measurable event occurs with high probability if

$$\begin{aligned} \exists C>0 \text{ such that, with high } \mathbb{Q }\text{-probability,}\quad \mathbb{P }_J(\,\cdot \,) \ge 1-C\exp (-C_0/h). \end{aligned}$$

With \(\lambda >\lambda _2\), the expected number of \((x,i)\in \mathsf{{ Catalysts}}\times \{1,\dots ,M\}\) such that \(\mathsf{{Nuc}}{(x,i)}\) occurs is greater than

$$\begin{aligned} \exp \left(-\mathsf{{E}}^\theta _\mathrm{c}/h^{d-1}\right)\cdot \mathbb{Q }\left[|\mathsf{{ Catalysts}}|\right] \cdot M \gg 1. \end{aligned}$$
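The order of magnitude behind this bound can be sketched as follows. This is a heuristic reading of the displays above, not a statement from the paper: it assumes that the number of slices \(|\mathbb{W }_{2\pi }(B_\mathrm{min}^{2\pi },B_\mathrm{max}^{2\pi })|\) and the constant \(c\) contribute only factors that are subexponential in \(1/h^{d-1}\), and it writes \(\approx \) for equality up to such factors. Under that assumption \(T\approx T_\delta =\exp (\delta /h^{d-1})\), and the maximality of \(M\) gives \(M\approx \exp ((\lambda -\delta )/h^{d-1})\). Combining (6.12) with the lower bound on \(|\mathsf{{Catalysts}}|\),

$$\begin{aligned} \exp \left(-\mathsf{{E}}^\theta _\mathrm{c}/h^{d-1}\right)\cdot c M^d\, \mathbb{Q }[\mathsf{{Catalyst}}(\theta )] \cdot M \approx \exp \left(\frac{(d+1)(\lambda -\delta )-\mathsf{{E}}^\theta _\mathrm{c}-C_\mathrm{dil}^\theta }{h^{d-1}}\right), \end{aligned}$$

which diverges as \(h\rightarrow 0\) as soon as \((d+1)(\lambda -\delta )>\mathsf{{E}}^\theta _\mathrm{c}+C_\mathrm{dil}^\theta \). The hypothesis \(\lambda >\lambda _2\), together with the small choice of \(\delta \), is presumably what secures an inequality of this type; the precise definition of \(\lambda _2^\theta \) is given elsewhere in the paper.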

Thus a \(\mathsf{{Nuc}}{(x,i)}\) event does occur for some \((x,i)\in \mathsf{{ Catalysts}}\times \{1,\dots ,M\}\) with high probability. Let \(\mathsf{{ y}}=(y_0,y_1,\dots ,y_m)\in (\mathbb{Z }^d)^{m+1}\) denote the corresponding path of \(\mathsf{{Conductive}}(\,\cdot \,)\) vertices from \(y_0=x\) to \(y_m=0\).

The growth of the region of plus phase along the path \(\mathsf{{y}}\) corresponds to a supercritical oriented site-percolation cluster on the \(1+1\) dimensional space-time graph \(\{0,1,\dots ,m\}\times \{i,i+1,\dots ,4M\}\); see Fig. 5. The horizontal axis corresponds to the path of length \(m\in [M,2M]\) from the successful catalyst \(y_0=x\) to the origin \(y_m=0\). The vertical axis corresponds to time.

Fig. 5 Growth of the plus phase on the rescaled lattice from \(\triangledown _{(x,i)}\) to \(\triangledown _{(0,4M)}\). In directions steeper than \(45^\circ \) there are infinite open paths with high probability

Let \(\mathsf{{Expand}}\) denote the event that there is an open space-time path in Fig. 5 from the bottom left hand corner to the top right hand corner, i.e. there is a sequence \((j_k : k=i,\dots ,4M)\) such that

$$\begin{aligned} j_i=0,\quad \quad j_{4M}=m,\quad \quad \forall k,\ |j_k-j_{k+1}|\le 1 \quad \quad \text{ and}\,\quad \quad \forall k,\ \mathsf{{Spread}}(y_{j_k},k). \end{aligned}$$

A Peierls argument in the \(1+1\) dimensional setting implies that \(\mathsf{{Expand}}\) occurs with high probability.
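The event \(\mathsf{{Expand}}\) is a reachability statement on the space-time graph, so for a given realisation it can be checked by a forward sweep in time. The sketch below is illustrative only: it assumes the outcomes of the \(\mathsf{{Spread}}(y_j,k)\) events are supplied as a Boolean table `spread`, with row `t` standing for time \(k=i+t\) and column `j` for the path vertex \(y_j\). The name `expand_occurs` is hypothetical.

```python
def expand_occurs(spread):
    """Check for an open space-time path from the bottom-left corner to the
    top-right corner.

    spread[t][j] is True iff Spread(y_j, i + t) holds, for t = 0..4M-i and
    j = 0..m. A path is a sequence (j_t) with j_0 = 0, j_last = m,
    |j_t - j_{t+1}| <= 1, and spread[t][j_t] True for every t.
    """
    if not spread or not spread[0]:
        return False
    width = len(spread[0])
    # reachable[j]: some admissible open path ends at column j at the current time
    reachable = [False] * width
    reachable[0] = bool(spread[0][0])
    for row in spread[1:]:
        nxt = [False] * width
        for j in range(width):
            if row[j]:
                neighbours = (jj for jj in (j - 1, j, j + 1) if 0 <= jj < width)
                nxt[j] = any(reachable[jj] for jj in neighbours)
        reachable = nxt
    return reachable[-1]


if __name__ == "__main__":
    grid = [
        [True, False, False],
        [True, True, False],
        [False, True, True],
        [False, False, True],
    ]
    print(expand_occurs(grid))
```

Because the column index changes by at most one per time step, the table needs at least \(m+1\) rows for the event to be possible, which is why the time horizon \(4M\) is taken larger than the path length \(m\le 2M\).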

To finish the proof of Theorem 1.2.1, let \(t=\exp (\lambda /h^{d-1})\ge 4MT+T_\delta \). With reference to the graphical construction, (6.11) and the event \(\mathsf{{Expand}}\), with high probability the spins in the support of \(f\) do not depend on the starting configuration of the dynamics. The spins thus correspond to samples from the equilibrium distribution \(\mu ^{J,h}\). \(\square \)