Thermodynamics for spatially inhomogeneous magnetization and Young-Gibbs measures

We derive thermodynamic functionals for spatially inhomogeneous magnetization on a torus in the context of an Ising spin lattice model. We calculate the corresponding free energy and pressure (by applying an appropriate external field using a quadratic Kac potential) and show that they are related via a modified Legendre transform. The local properties of the infinite volume Gibbs measure, related to whether a macroscopic configuration is realized as a homogeneous state or as a mixture of pure states, are also studied by constructing the corresponding Young-Gibbs measures.


Introduction
In continuum mechanics, in order to describe the properties of a material, one studies a minimization problem of a given free energy functional with respect to an appropriate order parameter. The physical properties of the system are encoded in this functional which, in accordance with the second law of thermodynamics, is a convex function. Of particular interest is the case when we are in the regime of phase transition between pure states, which corresponds to a linear segment in the graph of the above functional with respect to the order parameter. In such a case, the solution of the minimization problem can be realized as a fine mixture of the two pure phases of the system. This is the case of occurrence of microstructures, a phenomenon observed in materials with significant technological implications. The percentage of each phase in this mixture has been successfully described by the use of Young measures. For an overview, one can look at [12] and the references therein. On the other hand, from an atomistic viewpoint and at finite temperature, there is a well-developed rigorous theory of phase transitions. For example, in the case of the Ising model, each pure phase is described via an extremal Gibbs measure and mixtures via convex combinations of the extremal ones. In this paper, we connect the two descriptions and derive a macroscopic continuum mechanics theory for scalar order parameter starting from statistical mechanics. In this context, we study the appearance of microstructures in our model by constructing Young-Gibbs measures, as they were introduced by Kotecký and Luckhaus in [10] for the case of elasticity. For more analogues of Young measures in the analysis of the collective behaviour in interacting particle systems, see [14].
To fix ideas, we take the Ising model with nearest-neighbour ferromagnetic interaction as the reference Hamiltonian. To allow macroscopically inhomogeneous magnetization profiles, we have to patch together such Ising models, one for each given macroscopic magnetization. The desired profile can be obtained either by directly imposing a canonical constraint or by adding an external magnetic field to the Hamiltonian. We follow the second strategy and implement it by using a Kac potential that acts at an intermediate scale and penalizes deviations from an associated average magnetization (in other words, it fixes the magnetization in a weaker sense than the canonical constraint). We study the Lebowitz-Penrose limit of the corresponding free energy and pressure and show equivalence of ensembles. As a result, for every macroscopic magnetization, there is a unique external field that produces it. Note that this is not true for the nearest-neighbour Ising model in the phase transition regime at zero external field. Indeed, thanks to the Kac term, we are able to fix a given value of the magnetization at large scales, but this is still not possible at smaller ones. In fact, what we observe at these smaller scales is the persistence of the two pure states of the Ising model, in proportions determined by the overall macroscopic magnetization.
It is worth mentioning that, for the case of the canonical ensemble with homogeneous magnetization, the actual geometry of the location of the pure states has been investigated in the celebrated result of the construction of the Wulff shape for the Ising model, [3]. In an inhomogeneous setup, the equivalent problem would be to further investigate how such shapes corresponding to two neighbouring macroscopic points are connected, but this is a challenging question beyond the scope of our paper.
To summarize, the presence of the Kac term in the Hamiltonian produces the phenomenon of microstructure as a competition between the Ising factor, which prefers aligned spins, and the long-range averages, which the Kac term tends to keep fixed. As a consequence, modulated patterns made out of the pure states are created and macroscopic values of the magnetization are realized in this manner. The percentage of each pure state in such a mixture is captured by the Young measure. However, it would be desirable to study more detailed properties, such as the geometric shape of these structures. At zero temperature, there have been several studies at both the mesoscopic-macroscopic scale (without any claim of being exhaustive, we refer to [11,2,1] for a rigorous analysis) and the microscopic scale for lattice models, as in a recent series of works by Giuliani, Lebowitz and Lieb; see [8] and the references therein. It would be of fundamental importance to develop such a theory at finite temperature, as one would like to incorporate fluctuation-driven phenomena. However, this is still beyond the available techniques.
The paper is organized as follows: in Section 2, we present the model and the main theorems. The proof of the limiting free energy and pressure is given in Section 3. This is a standard result that essentially follows by putting together the results for the homogeneous case, which are recalled in Appendix A. In Section 4, we prove equivalence of ensembles. In Section 5, as a corollary of the large deviations, we show that spin averages in domains larger than the Kac scale converge in probability to the fixed macroscopic configuration. The second part of the paper investigates what happens when we take such averages in domains smaller than the Kac scale. We see that, in the phase transition regime, local averages converge in probability to averages with respect to a mixture of the pure states which, in accordance with the theory of the deterministic case, we call a Young-Gibbs measure. The relevant proofs are given in Section 6, with some details left for Appendix B.

Notation and results
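Let $\mathbb{T} := [-\tfrac12, \tfrac12)^d$ be the macroscopic torus and let $\Lambda_\varepsilon \subset \mathbb{Z}^d$ be its microscopic version, a box of side-length $\varepsilon^{-1}$. The Hamiltonian (2.1) on $\Lambda_\varepsilon$ is the sum of two parts. The first part is the nearest-neighbour Ising Hamiltonian (2.2), where $x \sim y$ means that $x$ and $y$ are nearest-neighbour sites, assuming periodic boundary conditions in the box $\Lambda_\varepsilon$. The second part is $K_{\Lambda_\varepsilon,\gamma,\alpha}(\sigma) := \sum_{x\in\Lambda_\varepsilon} \big(I^\gamma_x(\sigma) - \alpha(\varepsilon x)\big)^2$, (2.3) where $I^\gamma_x(\sigma) := \sum_{y\in\Lambda_\varepsilon} J_\gamma(x,y)\,\sigma(y)$ (2.4) is an average of the configuration $\sigma$ around a vertex $x \in \Lambda_\varepsilon$. We introduce another small parameter $\gamma > 0$, and the Kac interaction is $J_\gamma(x,y) := \gamma^d \phi(\gamma(x-y))$, (2.5) where $\phi$ is an even function that vanishes outside the unit ball $\{r \in \mathbb{R}^d : |r| < 1\}$ and integrates to 1. The difference $x - y$ appearing in the right-hand side of (2.5) is a difference modulo $\Lambda_\varepsilon$. Hence, the second term forces the averages of spin configurations to follow $\alpha$. Given (2.1), the associated finite volume Gibbs measure is defined by $\mu_{\Lambda_\varepsilon,\gamma,\alpha}(\sigma) := \frac{1}{Z_{\Lambda_\varepsilon,\gamma,\alpha}}\, e^{-\beta H_{\Lambda_\varepsilon,\gamma,\alpha}(\sigma)}$, where $Z_{\Lambda_\varepsilon,\gamma,\alpha}$ is the normalizing constant. Note that, throughout this paper, we omit the dependence on $\beta$ from the notation.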
To study inhomogeneous magnetizations, we assume that locally at the macroscopic scale (i.e. the scale of the torus $\mathbb{T}$) the magnetization takes a given value, which can however vary slowly as we move from one point to another. To describe what "locally" and "varying slowly" mean, we introduce an intermediate scale $l$ of the form $2^{-p}$, $p \in \mathbb{N}$. Again, $\lim_{l\to 0}$ stands for $\lim_{p\to\infty}$. Let $\{C_{l,1}, \dots, C_{l,N_l}\}$ be the natural partition $\mathcal{C}_l$ of $\mathbb{T}$ into $N_l = l^{-d}$ cubes of side-length $l$, and let $\{\Delta_{\varepsilon^{-1}l,1}, \dots, \Delta_{\varepsilon^{-1}l,N_l}\}$ be its microscopic version, denoted by $\mathcal{D}_{\varepsilon^{-1}l}$. Its elements are given by $\Delta_{\varepsilon^{-1}l,i} := (\varepsilon^{-1}C_{l,i}) \cap \mathbb{Z}^d$ for every $i = 1, \dots, N_l$. With an abuse of notation, for every $i$, we identify the set $\Delta_{\varepsilon^{-1}l,i} \subset \mathbb{Z}^d$ with the set $\bigcup_{x\in\Delta_{\varepsilon^{-1}l,i}} \Delta_1(x)$ in $\mathbb{R}^d$, where $\Delta_1(x)$ is the cube of size 1 centered at $x$. Note that $|\Delta_{\varepsilon^{-1}l}|$ is the volume of the set $\Delta_{\varepsilon^{-1}l}$, but also the number of points of $\mathbb{Z}^d$ within it. Given $u \in C(\mathbb{T}, (-1,1))$, let $u^{(l)} : \mathbb{T} \to (-1,1)$ be the piece-wise constant approximation of $u$ at scale $l$: for all $r \in C_{l,i}$, $u^{(l)}(r) := \frac{1}{|C_l|}\int_{C_{l,i}} u(r')\,dr'$. (2.7) Here $|C_l| = l^d$ denotes the volume of any of the cubes $C_{l,1}, \dots, C_{l,N_l}$. For $A$ and $B$ non-empty subsets of $\mathbb{Z}^d$ such that $A \subset B$, and for $\sigma \in \Omega_B$, we define the average magnetization of $\sigma$ in $A$ by $m_A(\sigma) := \frac{1}{|A|}\sum_{x\in A}\sigma(x)$. (2.8) For $n \in \mathbb{N}$, we define the set $I_n := \{-1, -1+\tfrac{2}{n}, \dots, 1-\tfrac{2}{n}, 1\}$. (2.9) Observe that, under this definition, $I_{|A|}$ is the set of all possible (discrete) values that $m_A$ can assume. For $t \in [-1,1]$, let $\lceil t\rceil_n$ be the value in $I_n$ corresponding to the best approximation of $t$ from above: $\lceil t\rceil_n := \min\{t' \in I_n : t' \ge t\}$. (2.10)
Furthermore, we consider the set $\Omega_{\Lambda_\varepsilon,l}(u)$, defined in (2.11), of all configurations whose locally averaged magnetization $m_{\Delta_{\varepsilon^{-1}l,i}}$ is close to the average of $u$ in the corresponding macroscopic coarse-grained box $C_{l,i}$ (see (2.7)), for every $i = 1, \dots, N_l$. We have: Theorem 2.1 (Free energy and pressure). For $u \in C(\mathbb{T}, (-1,1))$ and $\alpha \in C(\mathbb{T}, \mathbb{R})$, the limit (2.12) holds. This limit gives the infinite volume free energy associated to the Hamiltonian (2.1). Here $f_\beta$ is the infinite volume free energy associated to the Hamiltonian (2.2) (see Theorem A.1). Similarly, we obtain the infinite volume pressure (2.13). Moreover, for the functional $I_\alpha : C(\mathbb{T}, (-1,1)) \to \mathbb{R}$, we obtain the Large Deviations limit (2.15), where the set $\Omega_{\Lambda_\varepsilon,l}(u)$ is defined in (2.11).
The proof is given in Section 3.
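As a concrete illustration of the discretization entering Theorem 2.1, the following worked example lists the admissible magnetizations in a box of four sites and evaluates the approximation from above. It assumes the reading $I_n = \{-1 + 2k/n : k = 0, \dots, n\}$ of the set of discrete values, which is what the remark on the values of $m_A$ suggests; the numbers $n = 4$ and $t = 0.3$ are chosen purely for illustration.

```latex
% Worked example (illustrative values n = 4, t = 0.3; not taken from the paper).
% Assumption: I_n = { -1 + 2k/n : k = 0, ..., n }, the values of m_A when |A| = n.
\[
  I_4 = \Bigl\{-1,\ -\tfrac12,\ 0,\ \tfrac12,\ 1\Bigr\},
  \qquad
  \lceil 0.3 \rceil_4 = \min\{t' \in I_4 : t' \ge 0.3\} = \tfrac12 .
\]
% In general the discretization error is smaller than the mesh size of I_n:
\[
  0 \le \lceil t \rceil_n - t < \tfrac{2}{n}, \qquad t \in [-1,1].
\]
```

In particular, the error made by replacing a prescribed magnetization by an admissible one vanishes as the box grows.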
Remark 2.2. The minimization problem in Theorem 2.1 can be easily solved; indeed, since $f_\beta$ is convex, symmetric with respect to the origin and $\lim_{t\to\pm1} f'_\beta(t) = \pm\infty$, the associated Euler-Lagrange equation has a unique solution $u := \tilde u(\alpha)$ for every number $\alpha \in \mathbb{R}$. On the other hand, for a given $u \in (-1,1)$, if we choose $\hat\alpha(u) := u + \tfrac12 f'_\beta(u)$, then we can say that the Hamiltonian $H_{\Lambda_\varepsilon,\gamma,\alpha}$ with $\alpha = \hat\alpha(u)$ fixes the magnetization profile $u$ in the sense of large deviations. The same is true pointwise for functions, namely, $x \mapsto \tilde u(\alpha(x))$ is the minimizer of $F_\alpha$ in $C(\mathbb{T}, (-1,1))$.
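The field chosen in Remark 2.2 can be read off from the first-order condition. The following short computation is a sketch, assuming that the functional being minimized has the pointwise integrand $f_\beta(u) + (u-\alpha)^2$ (the form that appears in the proof of (2.12)) and that $f_\beta$ is differentiable at the point considered.

```latex
% Sketch: pointwise Euler-Lagrange equation for the integrand f_beta(u) + (u - alpha)^2.
\[
  \frac{d}{du}\Bigl[f_\beta(u) + (u-\alpha)^2\Bigr]
  = f'_\beta(u) + 2(u-\alpha) = 0
  \quad\Longleftrightarrow\quad
  \alpha = u + \tfrac12 f'_\beta(u).
\]
% Uniqueness: u -> f'_beta(u) + 2(u - alpha) is strictly increasing (f_beta is convex,
% the quadratic term is strictly convex) and tends to -infinity / +infinity as u -> -1 / +1,
% so it vanishes at exactly one point of (-1, 1).
```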
In Remark 2.2, we have established a relation between a fixed macroscopic magnetization $u$ and the way to obtain it by imposing an appropriate external field $\hat\alpha(u)$ via a grand canonical ensemble with Hamiltonian $H_{\Lambda_\varepsilon,\gamma,\hat\alpha(u)}$. There is, however, an important difference with respect to the Ising model with homogeneous external magnetic field: in that case, the correspondence between values of the external field and values of the magnetization is not one-to-one, because $f_\beta$ is constant on the interval $[-m_\beta, m_\beta]$ (to be specified later). On the contrary, in our model, we obtain a one-to-one correspondence because of the presence of the Kac term which, acting at an intermediate scale, assigns a value to the magnetization according to the external field. This is manifested by a new quadratic term appearing in the free energy.
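The one-to-one correspondence can be made explicit. The sketch below assumes the formula $\alpha = u + \tfrac12 f'_\beta(u)$ from Remark 2.2 and, for the duality in part (b), the sign convention $P(\alpha) = -\min_v [f_\beta(v) + (v-\alpha)^2]$ for constant profiles; the paper's normalization of the pressure may differ.

```latex
% (a) Monotonicity: f'_beta is non-decreasing (convexity), hence for u_1 < u_2
\[
  \bigl(u_2 + \tfrac12 f'_\beta(u_2)\bigr) - \bigl(u_1 + \tfrac12 f'_\beta(u_1)\bigr)
  \;\ge\; u_2 - u_1 \;>\; 0 .
\]
% On the plateau |u| < m_beta, where f_beta is constant, f'_beta = 0 and the field is alpha = u.
% (b) Duality under the assumed convention P(alpha) = -min_v [ f_beta(v) + (v - alpha)^2 ]:
\[
  -P(\alpha) - (u-\alpha)^2
  = \min_{v}\bigl[f_\beta(v) + (v-\alpha)^2\bigr] - (u-\alpha)^2
  \;\le\; f_\beta(u),
\]
% with equality when alpha is the field associated with u, so that
\[
  f_\beta(u) = \sup_{\alpha \in \mathbb{R}} \bigl[-P(\alpha) - (u-\alpha)^2\bigr].
\]
```

Part (a) shows that distinct magnetizations always require distinct fields, even inside the phase transition region where $f_\beta$ itself carries no information.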
In the following theorem, we prove a duality relation between the free energy that corresponds to the Ising part (2.2) of the Hamiltonian and the pressure $P(\alpha)$, obtained through a modified Legendre transform with the external field action given by (2.3). Theorem 2.3 (Equivalence of ensembles). For $\alpha \in C(\mathbb{T}, \mathbb{R})$, the identity (2.17) holds. Conversely, for $u \in C(\mathbb{T}, (-1,1))$, the identity (2.18) holds. The proof is given in Section 4. As we mentioned before, the Kac potential acts at an intermediate scale $\gamma^{-1}$ and tends to fix the average of the spin values in any box larger than $\gamma^{-1}$. To state this result properly, we recall the empirical magnetization defined in (2.8) and, with a slight abuse of notation, we extend it to a function from $\mathbb{T}$ to $[-1,1]$ given by $r \mapsto m_{B_R(\varepsilon^{-1}r)}$, in such a way that it is constant in each small cube of side-length $\varepsilon$. Here $B_R(x)$ is the ball of radius $R$ with center $x$, taking into consideration the periodicity in $\Lambda_\varepsilon$. The first result asserts that, for $\alpha \in C(\mathbb{T}, \mathbb{R})$ and $R_\gamma \gg \gamma^{-1}$, empirical averages converge in probability to the magnetization profile $u = \tilde u(\alpha)$. Formally, we define the test operator $L_{\omega,g} : L^1(\mathbb{T}, [-1,1]) \to \mathbb{R}$, depending on a function $\omega \in C(\mathbb{T}, \mathbb{R})$ and on a Lipschitz function $g : [-1,1] \to \mathbb{R}$, by $L_{\omega,g}(u) := \int_{\mathbb{T}} \omega(r)\, g(u(r))\, dr$. (2.19)
Under this definition, the following theorem asserts that the operator applied to the empirical average converges to $L_{\omega,g}(u)$ in $\mu_{\Lambda_\varepsilon,\gamma,\alpha}$-probability. Note that this convergence is slightly different from the usual convergence in probability, since the measure $\mu_{\Lambda_\varepsilon,\gamma,\alpha}$ changes as $\varepsilon \to 0$.
Theorem 2.4. Let $u \in C(\mathbb{T}, (-1,1))$ and choose $\alpha := \hat\alpha(u)$ as in Remark 2.2. Then, for $L_{\omega,g}$ given in (2.19), $R_\gamma \gg \gamma^{-1}$ and $\delta > 0$, the limit (2.21) holds. The proof is given in Section 5. As will be evident from the proof, in the above case $R_\gamma \gg \gamma^{-1}$ the test function $g$ is not relevant.
A different scenario is observed when considering a smaller scale $R$: the value of the random sequence $m_{B_R(x)}(\sigma)$ may oscillate and, as a consequence, its limiting value may not be just the average. In this case, we study more detailed properties of the underlying microscopic magnetizations. We refer to these as the "microscopic" spin statistics of the measure $\mu_{\Lambda_\varepsilon,\gamma,\alpha}$ (as opposed to the "macroscopic" statistics given by large deviations). More precisely, we investigate how the limiting value $u(r)$ in (2.21) is realized at intermediate scales, as a homogeneous state or as a mixture of the pure states, and how one can retain such information in the limit. This is reminiscent of the theory of Young measures as applied to describe microstructure; see [12] for an overview. In fact, in order to describe it in our case, we will construct the appropriate Young measure. It is known that the set $\mathcal{G}(\beta,h)$ of infinite volume Gibbs measures associated to (2.22) is a nonempty, weakly compact, convex set of probability measures on $\Omega_{\mathbb{Z}^d}$. More specifically, in $d = 2$ and for any pair $(\beta,h)$, the set $\mathcal{G}(\beta,h)$ is the convex hull of two extremal elements $\mu^{nn}_{h,\pm}$, the infinite volume limits of (2.23) with $\pm$ boundary conditions. Any non-extremal Gibbs measure can be uniquely expressed as a convex combination of these two elements: if $G \in \mathcal{G}(\beta,h)$, then there exists a unique $\lambda_G \in [0,1]$ such that $G = \lambda_G\, \mu^{nn}_{h,+} + (1-\lambda_G)\, \mu^{nn}_{h,-}$. We define the magnetization at the origin, $\varphi(h)$, as the expectation of $\sigma_0$. The function $\varphi : \mathbb{R} \to (-1,1)$ is odd, strictly increasing, continuous at every point $h \neq 0$, and satisfies $\lim_{h\to\pm\infty} \varphi(h) = \pm 1$. There exists a critical value $\beta_c > 0$ such that the limit $m_\beta := \lim_{h\to 0^+} \varphi(h)$ is positive if and only if $\beta > \beta_c$; it is the so-called spontaneous magnetization. Note that it also coincides with the magnetization associated to $\mu^{nn}_{0,+}$: $m_\beta = \int \sigma_0\, \mu^{nn}_{0,+}(d\sigma)$. For $\beta \le \beta_c$, we have $m_\beta = 0$. In this case, for every $m \in (-1,1)$, there exists a unique value $h = h(m) \in \mathbb{R}$ such that $\varphi(h) = m$.
If $m_\beta > 0$, the same is true for values of the magnetization such that $|m| > m_\beta$. But what happens if $|m| \le m_\beta$? This has been investigated in [6], where the canonical infinite volume Gibbs measure has been constructed. Since every magnetization $u \in [-m_\beta, m_\beta]$ can be uniquely written as a convex combination $u = \lambda_u m_\beta + (1-\lambda_u)(-m_\beta)$ (2.28) with $\lambda_u \in [0,1]$, $u$ is the magnetization associated to the probability $G_u := \lambda_u\, \mu^{nn}_{0,+} + (1-\lambda_u)\, \mu^{nn}_{0,-}$. (2.29) Hence, although "macroscopically" one observes the value $u$ of the magnetization, at intermediate (still diverging) scales one observes mixtures of the $m_\beta$ and $-m_\beta$ phases with a frequency given by $\lambda_u$.
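The weight $\lambda_u$ is determined by elementary algebra. The computation below solves the convex-combination identity for $\lambda_u$, taking $\lambda_u$ to be the weight of the $+m_\beta$ phase and assuming $m_\beta > 0$; the numerical values $m_\beta = 0.8$ and $u = 0.2$ are hypothetical and serve only as an illustration.

```latex
\[
  u = \lambda_u m_\beta + (1-\lambda_u)(-m_\beta)
  \quad\Longleftrightarrow\quad
  \lambda_u = \frac{u + m_\beta}{2 m_\beta} \in [0,1]
  \quad\text{for } |u| \le m_\beta .
\]
% Illustration with hypothetical values m_beta = 0.8, u = 0.2:
\[
  \lambda_u = \frac{0.2 + 0.8}{1.6} = 0.625,
  \qquad
  0.625 \cdot 0.8 + 0.375 \cdot (-0.8) = 0.5 - 0.3 = 0.2 .
\]
```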
The purpose of the next theorem is to investigate the above fact for inhomogeneous magnetizations, namely by "imposing" a macroscopic profile $u(r)$ in a grand canonical fashion, as described in Remark 2.2. For low enough temperature and for $|u(r)| < m_\beta$, at large scales (beyond $\gamma^{-1}$), the system with Hamiltonian (2.1) tends to fix $u(r)$ while, at smaller ones, it allows (large) fluctuations as long as their average over regions of order $\gamma^{-1}$ is compatible with $u(r)$. Indeed, the result states that, in boxes of scale up to $\gamma^{-1}$, one of the two pure phases $\pm m_\beta$ is observed while, at scales larger than $\gamma^{-1}$, we see $u(r)$. To capture this phenomenon, we use the observable $L_{\omega,g}$ given by (2.19). With a slight abuse of notation, we can also view $L_{\omega,g}$ as acting on Young measures $\nu(r) \in \mathcal{P}([-1,1])$ as follows: $L_{\omega,g}(\nu) := \int_{\mathbb{T}} \omega(r)\, \langle \nu(r), g \rangle\, dr$. (2.30)
Theorem 2.6 (Parametrization by Young measures). Let $u \in C(\mathbb{T}, (-1,1))$ and let $\alpha = \hat\alpha(u) \in C(\mathbb{T}, \mathbb{R})$ be its associated external field (given by the solution of (2.16)). We have the following cases:
(i) Case $R_\gamma \gg \gamma^{-1}$. For every $\delta > 0$, $L_{\omega,g}(m_{B_{R_\gamma}})$ converges in probability to $L_{\omega,g}(\nu_u)$, where the functional $L_{\omega,g}$ is defined in (2.19) and (2.30), and the Young measure is given by $\nu_u(r) := \delta_{u(r)}$ for every $r \in \mathbb{T}$. Here $\delta_{u(r)}$ is the Dirac measure concentrated at $u(r)$.
(ii) Case $R = O(1)$, for $d = 2$ and $\beta > \log\sqrt{5}$. The same convergence holds with $\nu_u$ replaced by a Young measure $\nu_{u,R}$. Here, for $r \in \mathbb{T}$ and $E \subset [-1,1]$ a Borel subset, $\nu_{u,R}(r)(E)$ is defined in terms of $\lambda_{u(r)}$ and $h(u(r))$, which are given in (2.28) and the discussion preceding it, respectively.
(iii) Case $1 \ll R \ll \gamma^{-1}$. Under the same hypotheses as in the previous item ($d = 2$ and $\beta > \log\sqrt{5}$), for every $\delta > 0$, the analogous convergence in probability holds.
The case $R_\gamma \gg \gamma^{-1}$ is only a restatement of Theorem 2.4. The proof of the case $R = O(1)$ is given in Section 6. The case $1 \ll R \ll \gamma^{-1}$ follows as a corollary of the previous case and is briefly presented in Subsection 6.1.
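To see that the two uses of $L_{\omega,g}$ are consistent, one can evaluate the pairing on the simplest Young measures. The sketch below assumes that the bracket in (2.30) denotes integration of $g$ against the probability measure $\nu(r)$; the two-point measure in the second display is an illustrative mixture with a weight $\lambda(r)$ introduced here, not the measure $\nu_{u,R}$ of the theorem.

```latex
% Assumption: <nu(r), g> := \int_{[-1,1]} g(t) nu(r)(dt).
\[
  \nu_u(r) = \delta_{u(r)}
  \;\Longrightarrow\;
  L_{\omega,g}(\nu_u) = \int_{\mathbb{T}} \omega(r)\, g(u(r))\, dr = L_{\omega,g}(u).
\]
% Illustrative two-point mixture of the pure-phase values +m_beta and -m_beta:
\[
  \nu(r) = \lambda(r)\,\delta_{m_\beta} + (1-\lambda(r))\,\delta_{-m_\beta}
  \;\Longrightarrow\;
  \langle \nu(r), g\rangle = \lambda(r)\, g(m_\beta) + (1-\lambda(r))\, g(-m_\beta).
\]
% Example with g(t) = t^2: the mixture gives m_beta^2 for every lambda(r), whereas the Dirac
% measure at the mean u(r) = (2 lambda(r) - 1) m_beta gives (2 lambda(r) - 1)^2 m_beta^2.
```

The example with $g(t) = t^2$ shows why a nonlinear test function is needed to distinguish a mixture of pure phases from a homogeneous state with the same mean.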

Proof of Theorem 2.1
In this section we prove the limits (2.12) and (2.13). Then, the limit in (2.15) is a direct consequence.
3.1. Proof of (2.12). We first prove it for α and u constant and then for the general case.
3.1.1. Constant u and α. For $u \in I_{|\Lambda_\varepsilon|}$, we introduce the finite volume free energy $F_{\Lambda_\varepsilon,\gamma,\alpha}(u)$ associated to the Hamiltonian (2.1) by (3.1). We proceed in three steps: we first show that the limit $\lim_{\varepsilon\to0} F_{\Lambda_\varepsilon,\gamma,\alpha}(\lceil u\rceil_{|\Lambda_\varepsilon|})$ exists for every $\gamma$; we continue with a coarse-graining approximation and conclude by establishing lower and upper bounds.
Step 1: existence of the limit $\lim_{\varepsilon\to0} F_{\Lambda_\varepsilon,\gamma,\alpha}(\lceil u\rceil_{|\Lambda_\varepsilon|})$ for fixed $\gamma > 0$. Since $\varepsilon = 2^{-q}$, with a slight abuse of notation we denote the volume by $\Lambda_q$ in order to keep track of the dependence on $q$. We have $|\Lambda_{q+1}| = 2^d |\Lambda_q|$. We also define the sequence of magnetizations $u_q := \lceil u\rceil_{|\Lambda_q|}$. It suffices to prove that the sequence $(F_{\Lambda_q,\gamma,\alpha}(u_q))_q$ is bounded below and that the inequality (3.3) holds for every $q$, where $(a_q)_q$ is a sequence of non-negative numbers such that $\sum_q a_q < \infty$.
The fact that the sequence (F Λq,γ,α (u q )) q is bounded from below follows from the inequalities and the fact that the right hand side of (3.4) converges to the pressure with zero external field; see Theorem A.1. To show (3.3), we write: To find an upper bound for we use the same sub-additive argument leading to (A.6) in the proof of Theorem A.1. Indeed, repeating the argument appearing there, it can be proved that where C is independent of q.
On the other hand, to estimate we use the following continuity lemma whose proof is also given in Subsection A.1.
Lemma 3.1. If t and t ′ are consecutive elements of I |Λq| , then where C is a constant that depends only on the dimension d.

The upper bound follows after using this lemma repeatedly: indeed, $u_{q+1}$ can be obtained from $u_q$ by moving through consecutive elements of $I_{|\Lambda_{q+1}|}$ in at most $2^d$ steps. To conclude, define $a_q$ as the sum of the resulting error bounds and observe that $\sum_q a_q < \infty$.
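The convergence criterion used in Step 1 is elementary; here is a sketch of why it works. It assumes that the inequality (3.3) has the form $F_{q+1} \le F_q + a_q$, writing $F_q$ as shorthand (introduced here) for $F_{\Lambda_q,\gamma,\alpha}(u_q)$; if the inequality in the paper is oriented differently, the argument adapts with the obvious changes.

```latex
% Shorthand (ours): F_q := F_{Lambda_q, gamma, alpha}(u_q), with a_q >= 0 and sum_q a_q < infinity.
\[
  G_q := F_q - \sum_{j<q} a_j
  \quad\Longrightarrow\quad
  G_{q+1} - G_q = F_{q+1} - F_q - a_q \le 0 .
\]
% (G_q) is non-increasing and bounded below by inf_q F_q - sum_j a_j > -infinity,
% hence it converges, and therefore
\[
  \lim_{q\to\infty} F_q = \lim_{q\to\infty} G_q + \sum_{j} a_j .
\]
```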
Step 2: approximation by coarse-graining. We consider a microscopic parameter L γ of the form 2 m , m ∈ Z + depending on γ such that γL γ → 0 as γ → 0. In the sequel, in order to simplify notation we drop the dependence on γ from the scale L. Recall that by We define a new coarse-grained interaction J As before, |∆ L | denotes the cardinality of a generic box ∆ L,i . Since it assumes constant values for all x ∈ ∆ L,k and y ∈ ∆ L,k ′ we also introduce the notation Note that, for any k, we have Comparing to J γ , we have the error where the constant C depends on d and Dφ ∞ (the sup norm of the first derivative of φ). For a macroscopic parameter l (to be chosen εL in this case) and for r ∈ C l,k , we define the piece-wise constant approximation of α at scale l as in (2.7): With this definition, (3.15) implies that Note that using the notationJ For the nearest-neighbour part of the Hamiltonian, there are |∂C l |N εL = L d−1 (ε −1 /L) d = L −1 |Λ ε | nearest neighbours between the boxes ∆ L,1 , . . . , ∆ L,N εL ; hence we have where H nn ∆ k,i is considered with periodicity in the box ∆ L,i . Thus, to calculate (3.1), we sum over all possible values u 1 , . . . , u N εL of the magnetization in the boxes ∆ L,k , with k = 1, . . . , N εL , and obtain with a vanishing error of the order |Λ ε |(γL + εL + L −1 ), as follows from (3.17), (3.18) and (3.19).
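The count of broken nearest-neighbour bonds can be checked directly. The computation below only uses $|\Lambda_\varepsilon| = \varepsilon^{-d}$ and the fact that there are $(\varepsilon^{-1}/L)^d$ boxes of side $L$, each with a boundary of order $L^{d-1}$ sites. The closing example with $L = \gamma^{-1/2}$ is one admissible choice, made here for illustration only and ignoring the requirement that $L$ be a power of 2.

```latex
\[
  \underbrace{L^{d-1}}_{\text{boundary of one box}}
  \cdot
  \underbrace{(\varepsilon^{-1}/L)^{d}}_{\text{number of boxes}}
  = L^{-1}\varepsilon^{-d} = L^{-1}\,|\Lambda_\varepsilon| .
\]
% Relative size of the three error terms in |Lambda_eps| (gamma L + eps L + 1/L),
% with the illustrative choice L = gamma^{-1/2} (so that gamma L -> 0 and L -> infinity):
\[
  \gamma L + \varepsilon L + L^{-1}
  = 2\gamma^{1/2} + \varepsilon\,\gamma^{-1/2}
  \xrightarrow[\varepsilon \to 0]{} 2\gamma^{1/2}
  \xrightarrow[\gamma \to 0]{} 0 .
\]
```

The order of the limits matters: $\varepsilon \to 0$ is taken first at fixed $\gamma$, which removes the term $\varepsilon L$ before $L$ is sent to infinity.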
Like before, we are using the simplified notation N = N εL . In the Appendix, Theorem A.1, we prove that the convergence to the free energy f β is uniform, hence the sum can be approximated by e −βL d f β (u k ) with an error bounded by e βL d s(L) , with s(L) → 0 as L → ∞. Then, the overall error is also vanishing: Finally, we are left with (3.23) Step 3: upper and lower bounds. To obtain a lower bound of (3.23), we bound the sum in (3.23) by the maximum contribution. Note that the cardinality of the sum vanishes in the limit ε → 0 after taking the logarithm and dividing by β|Λ ε |. Then the problem reduces to studying the minimum where G : [−1, 1] N → R is the function defined by (3.25) Moreover, using the convexity of f β , we have Furthermore, from the convexity of the function t → t −ᾱ (εL) i 2 , using (3.14), we obtain (3.27) Thus, using (3.26) and (3.27), expression (3.24) can be bounded from below by which converges to f β (u) + (u − α) 2 , and the lower bound follows.
For the upper bound of (3.23), we take one particular elementũ 1 , ...,ũ N that realizes the value of the lower bound. In this way, we obtain a lower bound for the sum over all possible values of u in (3.23) that leads to the desired upper bound. The idea is that these values should be as close as possible to ⌈u⌉ |Λε| and satisfy Let u − and u + be the best possible approximations of ⌈u⌉ |Λε| in I L d from below and from above, respectively. We have: Notice that identity (3.29) is satisfied by construction; moreover, it holds that for every i, with ε small enough and L large enough. As f ′ β is bounded in [a, b] (see Theorem A.1) and the function t → t 2 is Lipschitz over bounded subsets of R, it follows that (3.33) we conclude the proof of the upper bound and with that the proof of (3.2).
3.1.2. General u and α. For a macroscopic scale l of the form 2 −p , p ∈ N, recall the macroscopic partition C l of T and let D ε −1 l be its microscopic version, both with N = l −d elements. Given the function u ∈ C(T, (−1, 1)), we definē and C l,i is the macroscopic version of ∆ ε −1 l,i . Note that the averageū does not depend on ε, while the upper boundŪ (l) i it does due to the given discretization accuracy. Similarly, for α ∈ C(T, R), we consider its coarse-grained version α (l) as in (3.16). We next apply the previous result (for constant values of u and α) and pass to the limit l → 0. To implement this procedure, we approximate the Hamiltonian (2.1) by the sum of Hamiltonians over the boxes of the partitions with periodic boundary conditions. Neglecting the interactions between neighbouring boxes, for the Ising part of the Hamiltonian, we break N l · |∂∆ ε −1 l | ∼ |Λ ε |εl −1 many interactions. Hence Recalling the definition (2.11) of the set Ω Λε,l , we have After applying the previous result for u and α constant to each one of the Hamiltonians (recall the definition (2.1)), taking the log, dividing by −β|Λ ε | and passing to the limits lim γ→0 lim ε→0 , we obtain (3.39) Take finally lim l→0 to obtain´T f β (u(r)) + (u(r) − α(r)) 2 dr and complete the proof of (2.12).

3.2. Proof of (2.13). This is similar to the previous proof. For the case of $\alpha$ constant, the existence of the limit $\varepsilon \to 0$ for fixed $\gamma$ can be proved by the same sub-additivity argument as before, but without the extra effort needed to keep the canonical constraint in the sequence of boxes of increasing size. Then, by the same coarse-graining argument, similarly to (3.23), we obtain (3.40). For an upper bound, given (3.26) and (3.27), we take the maximum over all choices of $u_1, \dots, u_N$ and bound (3.40) accordingly. The latter quantity is further bounded by $-\min_{u\in[-1,1]} \{ f_\beta(u) + (u-\alpha)^2 \}$, and the upper bound follows. For a lower bound, we keep a single term of the sum (the one in which all the $u_i$ are equal). For a general $\alpha \in C(\mathbb{T}, \mathbb{R})$, we consider the partition $\mathcal{D}_{\varepsilon^{-1}l}$ of $N_l$ many elements and, in each box, we apply the previous result; this gives (3.44). Note that, for every $\alpha \in \mathbb{R}$, since $f_\beta$ is convex and $u \mapsto (u-\alpha)^2$ is strictly convex, the function $u \mapsto f_\beta(u) + (u-\alpha)^2$ is strictly convex, so its derivative is strictly increasing. Hence the equation $f'_\beta(u) + 2(u-\alpha) = 0$ has only one solution $\tilde u(\alpha)$. Since $\alpha \mapsto \tilde u(\alpha)$ is a continuous function, taking the limit $l \to 0$ in (3.44), by the Dominated Convergence Theorem we obtain $-\int_{\mathbb{T}} dr\, \big[ f_\beta(\tilde u(\alpha(r))) + (\tilde u(\alpha(r)) - \alpha(r))^2 \big]$.

Proof of Theorem 2.4
We prove it first for α and u constant, by taking ω constant as well. The general case will follow by applying this case to piecewise constant approximations at an intermediate scale.
5.1. Constant u and α. We first prove the following exponential bound: for every δ > 0, there is a positive number I(δ) such that where we used the fact that g is Lipschitz with constant K. This implies that Now notice that, for every y ∈ Z d , we have where s(γ) → 0 when γ → 0, uniformly in y. As a consequence, we have where we recall the definition of I γ z in (2.4). It follows that 1 The correction term s(γ)O(1) + O(γ −1 /R γ ) vanishes when γ → 0. It follows that, for γ small enough and for every δ > 0, the following estimate holds: To estimate the latter expression, we observe that, ∀δ ′ > 0, Thus, we reduced the problem to the following lemma, whose proof is given in the Appendix A.2: Lemma 5.1. For every c, δ > 0 and γ small enough, we have that where I γ x is defined in (2.4).

5.2.
The inhomogeneous case. Like before, we consider a macroscopic scale characterized by the parameter l, which we take to be equal to 2 −p for p ∈ N. Recall that C l is the corresponding macroscopic partition of T into N l := l −d many sets denoted by C l,k , k = 1, . . . , N l . We denote their microscopic versions by ∆ ε −1 l,k . Let u (l) and ω (l) respectively be the piece-wise constant approximations of u and ω defined as (2.7). Since g is bounded and continuous and ω is uniformly continuous, we have be the magnetization considering periodic boundary conditions in ∆ ε −1 l,i . Note thatm B Rγ (x) coincides with m B Rγ (x) if the distance between x and Λ ε \ ∆ ε −1 l,i is larger than R γ . Then, since g is Lipschitz, we have Replacing in (5.11) and splitting over the boxes of the partition D ε −1 l , we obtain Then, defining for l and ε small enough, we have We notice that, for ζ > 0, Choosing ζ < δ/2 and setting δ ′ := 1 To proceed, we apply the result obtained in the first step for α and u constant. For this purpose, we define a new probability measureμ Λε,γ,α (l) defined on the union of the boxes ∆ ε −1 l,i with periodic boundary conditions in each of them and with external field α (l) as defined in (3.16). Then, by neglecting the interactions between the boxes ∆ ε −1 l,i , i = 1, . . . , N l , for any set B we obtain that Let us denote by ⌈δ ′ N l ⌉ the smallest integer not smaller than δ ′ N l . It follows from (5.1) that there exists I(ζ) > 0 such that (5.21) If we choose l small enough, the coefficient of ε −d inside the exponential is negative and thus we obtain concluding the proof of the inhomogeneous case as well as of Theorem 2.4.

Young-Gibbs measures, proof of Theorem 2.6
As mentioned before, the first case is just a restatement of Theorem 2.4. For the second case, it suffices to prove an exponential bound for the constant case and then the inhomogeneous case follows by the strategy in Subsection 5.2. The last case is a direct consequence and it will be given at the end of this section. Hence, for the rest of this section, we restrict ourselves to constant α, u and ω. We first prove the case |u| > m β and then the more difficult one: |u| ≤ m β . The hypotheses over the dimension and β are needed only in the second case.
Case |u| > m β . Let f be a local function and f x its translation by x ∈ Λ ε . For simplicity of notation, we use f instead of g(m B R ). Then, for ω constant and for fixed δ > 0, it suffices to prove an exponential bound for µ Λε,γ,α (E δ ), where with h such that E µ nn h (σ 0 ) = u, i.e., for h := f ′ β (u). We expand the Hamiltonian H Λε,γ,α as follows: When considering the corresponding measure, the constant terms cancel with the normalization. Note that To treat the second term, for some parameter ζ > 0, we consider the random variable D ζ (σ) := 1 |Λ ε | |{x ∈ Λ ε : |I γ x (σ) − u| > ζ}|, (6.5) which gives the density of bad Kac averages. Then, using the inequality and Lemma 5.1 (for appropriate choice of ζ and δ ′ ), the problem reduces to finding exponential bounds for the first term. Notice that, using (5.4) and (5.5), we have for some s(γ) → 0 as γ → 0. Then the first term on the right-hand side of (6.6) is bounded by where C 1 is a positive constant and whereẐ Λε,γ,α is the partition function associated toĤ Λε,γ,α . For σ such that D ζ (σ) ≤ δ ′ , we have From Lemma 5.1, we have that µ nn Λε,h (D ζ ≤ δ ′ ) > 1 2 , for ε small enough. Moreover, it is a standard result that there exists C 4 (δ) > 0 such that for ε small enough. For the exponential bound (6.11), we refer to [5], Theorem V.6.1. Actually, this theorem gives the result for f a local magnetization, that is, for f of the form f (σ) = 1 |∆| x∈∆ σ(x); in our case, this is enough as every local function can be written as a linear combination of local magnetizations. Under these considerations, by appropriately choosing ζ, δ ′ and for γ small, the right hand side of (6.10) is bounded by 3e −C 5 (δ)|Λε| for some constant C 5 (δ) > 0, and the result follows.
Case $|u| \le m_\beta$. In this case, for $f$ a local function, we seek an exponential bound for the probability (6.12), where $G_u := \lambda_u \mu^{nn}_+ + (1-\lambda_u)\mu^{nn}_-$ with $\lambda_u$ as in (2.29). Comparing to (6.1), we notice that, instead of the measure $\mu^{nn}_h$ with the external field corresponding to $u$, we have the canonical measure $G_u$. Hence, in order to work with realizations of the measure $G_u$, we need to introduce a scale $K$ and prove that, for boxes at this scale, the relevant measures are $\mu^{nn}_+$ or $\mu^{nn}_-$ and that they appear in proportions that agree with the overall fixed magnetization $u$. There are two main obstacles. The first is that the Kac term in the original measure cannot directly fix the magnetization via large deviations as in Theorem 2.4, since we are looking at averages at a scale smaller than $\gamma^{-1}$; in particular, (5.5) is not true. The second is to show that, at the smaller scale $K$, only the nearest-neighbour part of the Hamiltonian is effective. Hence, we introduce another scale $L \gg K$, at which the Kac term acts on all spins in the same way. Then, inside a box of this scale, only the nearest-neighbour interactions are relevant.
To proceed with this strategy, we fix a microscopic scale K of the form 2 m and call ∆ K,1 , . . . , ∆ K,N K the partition of Λ ε into N K := (εK) −d (6.13) boxes of side-length K. We call ∆ 0 K,i the boxes with the same center as ∆ K,i and distance √ K from their complement ∆ c K,i . We next introduce the notions of "circuit" and of "bad box". Definition 6.1 (circuit). It is easier to define the lack of circuit. For a sign τ = ±, we say that a configuration σ ∈ {−1, 1} Λε does not have a τ -circuit in ∆ K,i if there exists a path of vertices . . , k − 1, and σ x i = −τ for every i = 1, . . . , k. In other words, if the connected components of −τ that intersects the boundary of ∆ K,i do not intersect ∆ 0 K,i . If we are not interested in distinguishing the sign of the circuit, we just say that σ has a circuit.
Observe that the existence of a τ -circuit can be decided from the outside configuration.
On the other hand, we call a box ζ-good if it is not ζ-bad. We can further specify it saying it is Let N bad K,ζ and N good K,ζ,τ be the number of ζ-bad and (ζ, τ )-good boxes, respectively. To conclude the proof of the case |u| ≤ m β , it suffices to prove that the probability of having a large density of ζ-bad boxes is small and that the density of (ζ, +)-good boxes is λ u ; this is the content of the following lemma. Lemma 6.3. For every ζ, δ > 0, there exists C(ζ, δ) > 0 such that the following exponential bounds hold for every ε and γ small enough and K large enough: Before giving its proof, we see how the case |u| ≤ m β follows from it. For ζ, δ ′ > 0 (they will later depend on δ), (6.12) is bounded by The exponential bound for the second term is given by Lemma 6.3. To control the first one, we decompose the average as follows: and observe that, for σ such that Then the first term of (6.18) is bounded by Subtracting and adding E µ nn τ (f ), we have 1 Choosing ζ = δ 4 , the first term in the last expression is smaller than δ 4 , thus (6.21) is bounded by which, by Lemma 6.3, decays exponentially.
Proof of Lemma 6.3.
(i) We first notice that the criterion for a box to be "bad" is based only on the nearest-neighbour interaction part of the measure. Therefore, instead of estimating (6.16) using $\mu_{\Lambda_\varepsilon,\gamma,\alpha}$, we reduce ourselves to an estimate using only the Ising part. To do that, we introduce another intermediate scale $L$ of order $\gamma^{-1+a}$, for $a > 0$, and we first condition over all possible values of the magnetization on this scale: we divide $\Lambda_\varepsilon$ into boxes $\Delta_{L,1}, \dots, \Delta_{L,N_L}$, with $N_L = (\varepsilon L)^{-d}$ (recall (6.13)) and, in each box $\Delta_{L,i}$, the new order parameter $m_{\Delta_{L,i}}(\sigma)$ takes values in $I_{|\Delta_{L,i}|}$. We denote this new configuration space by $M_L := \prod_{i=1}^{N_L} I_{|\Delta_{L,i}|}$. Then, by conditioning on a set of configurations with a given average magnetization in $M_L$, the Kac part of the Hamiltonian is essentially constant, so we are only left with the nearest-neighbour interaction.
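As an illustration of the coarse-grained order parameter $m_{\Delta_{L,i}}(\sigma)$, the sketch below averages the spins of a two-dimensional configuration over disjoint $L \times L$ blocks. The function name `block_magnetizations` and the list-of-rows representation are assumptions made for this example only.

```python
def block_magnetizations(sigma, L):
    """Average the spins of a square configuration over disjoint L x L blocks.

    `sigma` is a list of rows of +1/-1 spins whose side is a multiple of L
    (an assumption of this sketch). Entry [bx][by] of the result is the
    empirical magnetization of the block with block coordinates (bx, by).
    """
    n = len(sigma)
    if n % L != 0:
        raise ValueError("the side of the configuration must be a multiple of L")
    blocks = n // L
    return [
        [
            sum(sigma[bx * L + i][by * L + j] for i in range(L) for j in range(L)) / (L * L)
            for by in range(blocks)
        ]
        for bx in range(blocks)
    ]
```

Each block average takes one of the $|\Delta_L| + 1$ values $-1, -1 + 2/|\Delta_L|, \dots, 1$, which is the discrete set $I_{|\Delta_L|}$ used above.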
To proceed with this plan, we follow the coarse-graining procedure as in Section 2.1; recall the effective interaction $J$ (note that $\alpha$ is constant). Recalling the error (3.17), for $L = \gamma^{-1+a}$, we obtain that … Note that in the splitting in (6.25) we do not specify the boundary conditions, as with an extra lower order (surface) error we can choose them ad libitum. Hence, we have to estimate $\mu^{\mathrm{nn}}_{\Lambda_\varepsilon,0}(\{N^{\mathrm{bad}}_{K,\zeta} > \delta N_K\})$. We split it into a product over the measures $\mu^{\mathrm{nn}}_{\Delta_{L,i},0,+}$, assuming $+$ boundary conditions and making an error of lower order. Then, we focus on a box $\Delta_L$ and denote by $N_{K,L}$ (respectively $N^{\mathrm{bad}}_{K,L,\zeta}$) the number of boxes (respectively bad boxes) of size $K$ in $\Delta_L$. In order to conclude, it suffices to show that there is $r(\delta,\zeta,K) > 0$ such that …

The proof of (6.27) is lengthy and is outlined below, after the end of the proof of Lemma 6.3. Furthermore, this decay estimate should win against the accumulating errors of order $\gamma^a |\Lambda_\varepsilon|$ in (6.25) and (6.26). This is true since $\gamma^a K^d \ll r(\delta,\zeta,K)$ for $\gamma$ small enough, after using the fact that $|\Delta_L| = N_{K,L} K^d$. We also need a lower bound of (6.26). For that, it suffices to show that for every $i$: … The proof is given in Appendix B.3, concluding the proof of item (i) of Lemma 6.3.
(ii) To prove (6.17), for $u$ constant, in a box $\Lambda_\varepsilon$ we have: … From Definition 6.2, in the good boxes we have a circuit of $\pm$ spins. Then, using (B.1), for every … for $K$ large and a generic box $\Delta_K$. We consider the measure $\mu_{\Lambda_\varepsilon,\gamma,\alpha}$ and use the estimate (5.1). We split the measure over the boxes $(\Delta_{K,i})_i$ as before and, using (6.29) as well as the estimate (6.16), we obtain (6.17). This concludes the proof of Lemma 6.3.
In the sequel, we first prove the remaining estimate (6.27). Here we present the strategy and state the main lemmas. For the proofs we refer to Appendix B.1 and B.2. The section will conclude with the proof of (2.34).
Proof of (6.27). Given a box $\Delta_L$, let $I \subset \{1, \dots, N_{K,L}\}$ denote the indices of the boxes $\Delta_K$ within it.
Definition 6.4. Given $I \subset \{1, \dots, N_{K,L}\}$ and $a \in \{-,+\}^I$, we define $X'_{I,a}$ to be the set of configurations where there is some circuit around $\Delta^0_{K,i}$ for all $i \in I$ and (6.14) is true. On the other hand, we define $X''_I$ to be the set of configurations for which there is no circuit for any of the boxes in $I$.
Asking for more than $\delta N_{K,L}$ bad boxes, $0 < \delta < 1$, is equivalent to saying that at least one of the two cases described in Definition 6.4 has to occur for more than $\delta N_{K,L}/2$ boxes, hence: … To estimate the first contribution, we have the following lemma:

Lemma 6.5. There is a positive constant $c$ so that, for any $I \subset \{1, \dots, N_{K,L}\}$ and $i \in I$, the following is true:
… (6.32)
where $\zeta$ is the precision parameter in the criterion (6.14) of bad boxes.
To obtain (6.27), we need to iterate the result of Lemma 6.5 and get $\mu^{\mathrm{nn}}_{\Delta_L,0,+}\big(\bigcup_{I:\,|I| \ge \delta N_{K,L}/2} X'_{I,a}\big) \le$ … which agrees with the right-hand side of (6.27), since $r(\delta,\zeta,K) := -\delta \log(\zeta^{-2} K^{-d})$ is sufficiently large when $K$ is taken large for $\zeta$ and $\delta$ fixed.
To find an estimate for the second contribution in (6.31), we use the random-cluster formulation. We give a complete description of the method in Appendix B.2, where we also provide the proof of the following lemma: … Note that here, for simplicity of the proof, we can use empty boundary conditions by making an extra error of smaller order. From (6.31), (6.33) and (6.34), we conclude the proof of (6.27).

6.1. Proof of (2.34). When $R \to \infty$, for any translation invariant measure $\mu$ (either $\mu^{\mathrm{nn}}_{0,\pm}$ or $\mu^{\mathrm{nn}}_{h(u)}$ for some $|u| > m_\beta$) we have that … Similarly, if $R$ depends on $\gamma$ and we pass simultaneously to the limit in such a way that $1 \ll R_\gamma \ll \gamma^{-1}$.

Appendix A. Homogeneous magnetization
For the nearest-neighbour interaction and for $h \in \mathbb{R}$, we define the finite volume pressure by … Moreover, for $u \in I_{|\Lambda_\varepsilon|}$, we define the finite volume free energy by … and extend the domain of $f_{\Lambda_\varepsilon,\beta}$ to $[-1,1]$ by linear interpolation between the values of $f_{\Lambda_\varepsilon,\beta}$ at the neighbouring points of $I_{|\Lambda_\varepsilon|}$. We next prove the existence of the infinite volume free energy and pressure.
… Similarly, the sequence of functions $(p_{\Lambda_\varepsilon,\beta})_\varepsilon$ converges point-wise to a function $p_\beta : \mathbb{R} \to \mathbb{R}$, called the pressure, which is given by …

Proof. This is a classical result (see e.g. [5]) with the exception of the uniform convergence of the free energy, which is given here. With a slight abuse of notation we use $\Lambda_q := \Lambda_\varepsilon$ with $\varepsilon = 2^{-q}$.
Observe that $\Lambda_{q+1}$ is the disjoint union of the sub-domains $\Lambda_{q,1}, \dots, \Lambda_{q,2^d}$, each of which is a translation of $\Lambda_q$. For a configuration $\sigma \in \Omega_{\Lambda_{q+1}}$, we call $\sigma_i$, $i = 1, \dots, 2^d$, its projections over these sub-domains, i.e., $\sigma = \sigma_1 \vee \dots \vee \sigma_{2^d}$, where $\vee$ denotes the concatenation on the sub-domains. Let $u \in I_{|\Lambda_q|}$. Observe that if $m_{\Lambda_{q,i}}(\sigma_i) = u$ for all $i = 1, \dots, 2^d$, then $m_{\Lambda_{q+1}}(\sigma) = u$. Note also that there are $O(2^d |\partial\Lambda_q|)$ edges connecting vertices of different sub-domains, where $\partial\Lambda_q$ denotes the boundary of the set. As a consequence, after defining … we neglect the contributions between the sub-domains, so for some $C > 0$ we obtain: … Taking the logarithm and dividing by $-\beta|\Lambda_{q+1}|$, we get …

For a configuration $\sigma \in \Omega_{\Lambda_{q+1}}$, let $N_+(\sigma) := |\{x \in \Lambda_{q+1} : \sigma(x) = 1\}|$ be the associated number of pluses. There is a correspondence between $I_{|\Lambda_{q+1}|}$ and the set $[0, |\Lambda_{q+1}|] \cap \mathbb{Z}$ containing all possible numbers of pluses. Let $u$ and $u'$ be consecutive elements of $I_{|\Lambda_{q+1}|}$ such that $u < u'$, and let $n$ and $n+1$ be, respectively, their associated numbers of pluses. Then … where $\sigma' \ge \sigma$ means $\sigma'(x) \ge \sigma(x)$ for every $x \in \Lambda_{q+1}$. In the latter sum, the configurations $\sigma'$ differ from $\sigma$ in just one site. As every site has $2d$ neighbours, we have … Replacing in (A.7), using the fact that $|\{\sigma' : \sigma' \ge \sigma,\ N_+(\sigma') = n+1\}| = |\Lambda_{q+1}| - n$ (i.e., the number of minuses in the configuration $\sigma$) and the bound … Taking the logarithm and dividing by $-\beta|\Lambda_{q+1}|$, we get … For $f_{\Lambda_{q+1},\beta}(u') - f_{\Lambda_{q+1},\beta}(u)$ the same bound can be obtained by replacing the number of pluses $N_+$ by the number of minuses $N_-$. Then …

For $u \in I_{|\Lambda_{q+1}|}$, let $u^-$ and $u^+$ be the best approximations in $I_{|\Lambda_q|}$ of $u$ from below and from above: … For $u \in I_{|\Lambda_{q+1}|} \setminus I_{|\Lambda_q|}$, using (A.11) repeatedly, we get … after using (A.6) and the fact that $2^{-q} \ll$ … Let $a_q := O\big(\frac{\log|\Lambda_{q+1}|}{|\Lambda_{q+1}|}\big)$ and observe that $a := \sum_q a_q$ is finite.
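The counting fact used above, namely that the configurations $\sigma' \ge \sigma$ with exactly one more plus are as many as the minuses of $\sigma$, can be checked by brute force on a tiny system. This is only an illustrative sketch: the helper names `one_flip_up_count` and `magnetization` are hypothetical, and a configuration is represented as a tuple of $\pm 1$ values with the lattice geometry ignored.

```python
from itertools import product

def one_flip_up_count(sigma):
    """Count configurations sigma' >= sigma (site by site) with one more plus spin."""
    count = 0
    for candidate in product((-1, 1), repeat=len(sigma)):
        dominates = all(c >= s for c, s in zip(candidate, sigma))
        if dominates and candidate.count(1) == sigma.count(1) + 1:
            count += 1
    return count

def magnetization(sigma):
    """Empirical magnetization; with n pluses among N sites it equals (2n - N) / N."""
    return sum(sigma) / len(sigma)
```

For instance, `one_flip_up_count((1, -1, -1, 1, -1))` enumerates all $2^5$ configurations and keeps those obtained from the given one by turning a single minus into a plus. Consecutive magnetization values differ by $2/N$, which is the spacing of the grid $I_N$.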
From the above estimates and the fact that $f_{\Lambda_q,\beta}$ is defined by linear interpolation, we obtain
… (A.14)
for every $u \in [-1,1]$ and every $q$. Let $g_q := f_{\Lambda_q,\beta} - \sum_{i=0}^{q-1} a_i$. Inequality (A.14) implies that
$$g_{q+1}(u) \le g_q(u) \tag{A.15}$$
for every $u \in [-1,1]$. The point-wise convergence of the free energy guarantees the point-wise convergence of $(g_q)_q$ to $f_\beta - a$. Then $(g_q)_q$ is a sequence of continuous functions defined on a compact set that converges point-wise and monotonically to $f_\beta - a$. Under these hypotheses, Dini's theorem asserts that the convergence is uniform, which yields the uniform convergence of $f_{\Lambda_q,\beta}$ to $f_\beta$.
… (A.16)
We can now repeat the arguments of the proof of Theorem A.1: using (A.11) and (A.6) with error $O(2^{-q}\gamma^{-d})$, for $t$ and $t'$ consecutive elements of $I_{|\Lambda_q|}$ we have that … where $C$ is a constant that depends only on the dimension.
Appendix B. Estimates on "bad boxes".
Before proceeding with the estimates on "bad boxes", we state a theorem for the infinite volume Gibbs measures for the Ising model:
… is the critical value of the inverse temperature in dimension $d$), there are two different probability measures $\mu^{\mathrm{nn}}_{0,\pm}$ on $\{-1,1\}^{\mathbb{Z}^d}$ such that, for any sequence of increasing volumes $(\Lambda_n)_n$, the sequence $\mu^{\mathrm{nn}}_{\Lambda_n,0,\pm}$ (with $\pm$ boundary conditions) converges weakly to $\mu^{\mathrm{nn}}_{0,\pm}$. More precisely, for $\Delta \subset \Lambda$ finite subsets of $\mathbb{Z}^d$ and $f : \{-1,1\}^{\mathbb{Z}^d} \to \mathbb{R}$ a function that depends only on spins inside $\Delta$, there exists a positive constant $C$ such that
… (B.1)
Furthermore, exponential decay of correlations holds: if the functions $f$ and $g$ depend on spins inside the finite regions $\Delta_f$ and $\Delta_g$, respectively, then there exist positive constants $C_1$ and $C_2$ such that
… (B.2)
The proof is standard and can be found in [15], Theorem 2.5 and its proof in Section 2.6.2. See also Theorem 2.18.

B.1. Proof of Lemma 6.5. Let $\Delta^{00}_K$ be the cube with the same center as $\Delta^0_K$ and at distance $K^{1/2}$ from its complement. Given a set $S \subset \Lambda_\varepsilon$, an accuracy parameter $\zeta$ and a radius $R > 0$, we define the following set of configurations: … We have that for any $\tau \in \{+,-\}$ and for $L$ large enough
… (B.4)
Given $i \in I$ and $C \in \mathcal{K}_i$, the sets $G_{C,i}$ and $X'_{I \setminus i}$ are $C^c$-measurable, while the set $C_{\Delta^0_{K,i},\zeta}$ is $C$-measurable. Hence, using (B.4) we obtain … where we have used the restricted measures $\mu^{\mathrm{nn}}_{C,0,+}$ and $\mu^{\mathrm{nn}}_{C^c,0,+}$ instead of $\mu^{\mathrm{nn}}_{\Delta_L,0,+}$. From the exponential decay of correlations (B.2), there are two positive constants $C_1$ and $C_2$ such that … Then, using the Chebyshev inequality and the weak convergence to an infinite volume limit (B.1), we obtain: … for some $c > 0$, where $R$ is such that $R \ll K^{1/2}$. Then from (B.5) we obtain: … which gives the right-hand side of (6.32) by using the fact that the events $G_{C,i}$, for $C \in \mathcal{K}_i$, are disjoint.
B.2. Proof of Lemma 6.6. For completeness of the presentation, we first give a short description of the method and then proceed with the proof of Lemma 6.6. We restrict ourselves to dimension 2, but we expect that a similar result should also be true in higher dimensions. As before, we divide $\Delta_L$ into boxes $\Delta_{K,i}$, and call $N_{K,L} = L^2/K^2$. Recall that $\Delta^0_{K,i}$ stands for the box with the same center as $\Delta_{K,i}$ and at distance $\sqrt{K}$ from its complement. Let $E(\Delta_L)$ be the set of edges connecting vertices in $\Delta_L$: $E(\Delta_L) := \{\{x,y\} \subset \Delta_L : |x - y| = 1\}$. The random-cluster probability for $\omega \in \{0,1\}^{E(\Delta_L)}$ is defined by
$$\phi(\omega) := \frac{1}{Z'} \prod_{e \in E(\Delta_L)} p^{\omega_e} (1-p)^{1-\omega_e} \, 2^{\mathrm{Cl}(\omega)}, \tag{B.9}$$
where $p := 1 - e^{-2\beta}$, $\mathrm{Cl}(\omega)$ is the number of connected components (or clusters) associated to $\omega$, and $Z'$ is the normalizing constant. The Edwards-Sokal probability $Q$, see [4], is defined on the product space $\{0,1\}^{E(\Delta_L)} \times \{-1,1\}^{\Delta_L}$ and has the random-cluster probability $\phi$ as its first marginal and the Ising probability $\mu_{\Delta_L,0,\emptyset}$ (with empty boundary conditions) as its second marginal. The main property of $Q$ is that the conditional probability $Q(\cdot\,|\,\omega)$ is obtained by assigning to each cluster of $\omega$, independently, the spin value $+1$ or $-1$ with probability $\tfrac{1}{2}$ each. In this way, if $x, y \in \Delta_L$ and $\omega \in \{0,1\}^{E(\Delta_L)}$ are such that $x$ and $y$ are connected by a path of edges $e_1, \dots, e_k$ with $\omega_{e_i} = 1$ for every $i$, then $Q(\sigma(x) = \sigma(y)\,|\,\omega) = 1$.
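The conditional law $Q(\cdot\,|\,\omega)$ described above can be sketched as follows: find the clusters of the open edges and give each cluster an independent uniform sign. The function name `sample_spins_given_bonds`, the union-find bookkeeping, and the representation of vertices as hashable labels are assumptions of this sketch, not part of the text.

```python
import random

def sample_spins_given_bonds(vertices, open_edges, rng=random):
    """Sample spins given a bond configuration: one fair +1/-1 sign per cluster.

    `vertices` is an iterable of hashable labels and `open_edges` a list of
    pairs (x, y) of vertices joined by an open edge (omega_e = 1).
    """
    parent = {v: v for v in vertices}

    def find(v):
        # Union-find root lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for x, y in open_edges:
        parent[find(x)] = find(y)  # merge the clusters of x and y

    cluster_spin = {}
    spins = {}
    for v in parent:
        root = find(v)
        if root not in cluster_spin:
            cluster_spin[root] = rng.choice((-1, 1))  # independent fair sign
        spins[v] = cluster_spin[root]
    return spins
```

By construction, two vertices joined by a path of open edges share a root and therefore always receive the same spin, which is the property $Q(\sigma(x) = \sigma(y)\,|\,\omega) = 1$.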
We can define a partial order on the space $\{0,1\}^{E(\Delta_L)}$ by $\omega \preceq \omega'$ if and only if $\omega_e \le \omega'_e$ for every $e \in E(\Delta_L)$. A function $f : \{0,1\}^{E(\Delta_L)} \to \mathbb{R}$ is increasing (resp. decreasing) if and only if $f(\omega) \le f(\omega')$ (resp. $f(\omega) \ge f(\omega')$) for every $\omega, \omega'$ such that $\omega \preceq \omega'$; an event $A \subset \{0,1\}^{E(\Delta_L)}$ is increasing (resp. decreasing) if the indicator function $\mathbf{1}_A$ is an increasing (resp. decreasing) function. For probabilities $P$ and $P'$ on $\{0,1\}^{E(\Delta_L)}$, we say that $P$ is stochastically dominated by $P'$, and write $P \le_{\mathrm{st}} P'$, if and only if $\int f\,dP \le \int f\,dP'$ for every increasing function $f$. The latter property holds if and only if $\int f\,dP \ge \int f\,dP'$ for every decreasing function $f$. Let $B_\rho$ be the Bernoulli probability on $\{0,1\}^{E(\Delta_L)}$ with parameter $\rho := \frac{1 - e^{-2\beta}}{1 + e^{-2\beta}}$:
$$B_\rho(\omega) := \prod_{e \in E(\Delta_L)} \rho^{\omega_e} (1-\rho)^{1-\omega_e}. \tag{B.10}$$
The random-cluster probability satisfies $B_\rho \le_{\mathrm{st}} \phi$; in particular, $\phi(A) \le B_\rho(A)$ for every decreasing event $A$. We are now ready to give the proof of Lemma 6.6.
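As a quick numerical sanity check of the parameters involved, the sketch below compares $\rho$ with two equivalent expressions: $p/(2-p)$, which is the usual comparison parameter $p/(p + q(1-p))$ for cluster weight $q = 2$, and $\tanh\beta$. The function names are hypothetical and the identities are elementary algebra rather than statements taken from the text.

```python
import math

def random_cluster_p(beta):
    """Edge parameter p = 1 - exp(-2 beta) of the random-cluster probability."""
    return 1.0 - math.exp(-2.0 * beta)

def bernoulli_rho(beta):
    """Bernoulli parameter rho = (1 - exp(-2 beta)) / (1 + exp(-2 beta))."""
    return (1.0 - math.exp(-2.0 * beta)) / (1.0 + math.exp(-2.0 * beta))

def rho_from_p(p, q=2):
    """Comparison parameter p / (p + q (1 - p)); for q = 2 this is p / (2 - p)."""
    return p / (p + q * (1.0 - p))
```

For any $\beta > 0$ one has $\rho < p$, consistent with the dominated Bernoulli measure opening edges less often than $p$.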
Proof of Lemma 6.6: We need to introduce some terminology. Let $\mathbb{Z}^2_* := \mathbb{Z}^2 + (\tfrac{1}{2}, \tfrac{1}{2})$ be the dual set of vertices of $\mathbb{Z}^2$. For an edge $e = xy \in E(\mathbb{Z}^2)$, where $E(\mathbb{Z}^2)$ is the set of edges of $\mathbb{Z}^2$, we define its dual edge $e^*$ as the one obtained after rotating it by 90 degrees around its middle point; for an edge subset $A \subset E(\mathbb{Z}^2)$, we define $A^* := \{e^* : e \in A\}$. For any subset of edges $E$, let the support of $E$ be the set of vertices that are endpoints of some edge in $E$. For a subset $R \subset \Delta_L$, we define its dual set of vertices $R^*$ as the support of $E(R)^*$. The inner boundary of $R^*$ is defined by $\partial^\circ R^* := \{x \in R^* : |x - y| = 1 \text{ for some } y \in \mathbb{Z}^2_* \setminus R^*\}$. For a configuration $\omega \in \{0,1\}^{E(R)}$, we define its dual configuration $\omega^* \in \{0,1\}^{E(R)^*}$ by $\omega^*_{e^*} = 1 - \omega_e$. We say that an edge $e^* \in E(R)^*$ is $\omega^*$-open if $\omega^*_{e^*} = 1$. Associated to $\omega^*$, and for a fixed box $\Delta_{K,i}$, we call $J_{\Delta_{K,i}}(\omega^*) \subset E(\Delta_{K,i})^*$ the set of edges "penetrating from the outside of $\Delta_{K,i}$", i.e., the dual edges that are $\omega^*$-open and are connected to $\partial^\circ(\Delta^*_{K,i})$ by a path of $\omega^*$-open edges. We say that $\omega \in \{0,1\}^{E(\Delta_L)}$ has a circuit of open edges in $\Delta_{K,i}$ if $J_{\Delta_{K,i}}(\omega^*) \cap E(\Delta^0_{K,i})^* = \emptyset$ (this is the formal way of saying that $\omega$ has a self-avoiding path of open edges living in $E(\Delta_{K,i})$ that surrounds $\Delta^0_{K,i}$).

Consider the random-cluster probability $\phi$ associated to $\mu_{\Delta_L,0,\emptyset}$ (defined in (B.9)) and the corresponding Edwards-Sokal coupling $Q$ between $\phi$ and $\mu_{\Delta_L,0,\emptyset}$. The fundamental property of $Q$ implies that, for every $\Delta_{K,i}$, … we consider its complement. Let $R_1$, $R_2$, $R_3$ and $R_4$ be the rectangles of dimension … that satisfy $\bigcup_{i=1}^4 R_i = \Delta_K \setminus \Delta^0_K$ (see Figure 1). Let $R$ be one of these rectangles and, without loss of generality, suppose it to be horizontal, that is, of dimension … We denote by $T(R^*)$ and $B(R^*)$ the top and the bottom of the support of the (dual) set of edges $R^*$, that is, the vertices with highest and lowest second coordinate (see Figure 2).
We say that a configuration $\omega \in \{0,1\}^{E(R)}$ is good lengthwise if its dual configuration $\omega^* \in \{0,1\}^{E(R)^*}$ does not have any path of open edges connecting $T(R^*)$ with $B(R^*)$; in this case, we say $\omega^*$ is bad transversally. Observe that a sufficient condition for a configuration $\omega \in \{0,1\}^{E(\Delta_K)}$ to have a circuit is that, for every $1 \le i \le 4$, the projection $\omega_{E(R_i)}$ is good lengthwise. We have
… (B.15)
In the last inequality, we used the fact that the event $\{\omega \in \{0,1\}^{E(R)} : \omega \text{ is good lengthwise}\}$ is increasing and that the Bernoulli probability satisfies the FKG property; see [9]. To estimate the probability of the last set, we consider its complement: …

Thus, to get a lower bound for the left-hand side of (6.28) we restrict to $G_\eta$. As a consequence, in each subdomain the corresponding probabilities can be bounded as in (B.24) and we are left with only the boundary terms, which are of order $L^{d-1}$ (for a box with $|\Delta_L| = L^d$). We have $\mu^{\mathrm{nn}}_{\Delta_L,0}(\{m_{\Delta_L} = \eta\}) \ge \mu^{\mathrm{nn}}_{\Delta_L,0}(G_\eta)$ … If $|\eta| > m_\beta$, then we choose $h := h(\eta)$ as in the discussion following (2.27) and obtain: … for an appropriate choice of $c$. Following the steps above, (B.28) implies (6.28) for the case $|\eta| > m_\beta$ and concludes the proof.