Large Deviations for the Macroscopic Motion of an Interface

We study the most probable way an interface moves on a macroscopic scale from an initial to a final position within a fixed time, in the context of large deviations for a stochastic microscopic lattice system of Ising spins with Kac interaction evolving under Glauber (non-conservative) dynamics. Such interfaces separate the two stable phases of a ferromagnetic system and on the macroscopic scale are represented by sharp transitions. We derive quantitative estimates for the upper and the lower bound of the cost functional that penalizes all possible deviations and obtain explicit error terms which remain valid on the macroscopic scale. Furthermore, using the result of a companion paper on the minimizers of this cost functional for the macroscopic motion of the interface in a fixed time, we prove that the probability of such events can concentrate on nucleations should the transition happen fast enough.

of Ising spins with Glauber dynamics by a multi-scale procedure, see [11,18]. First, a spatial scaling of the order of the (diverging) interaction range of the Kac potential is applied to obtain a deterministic limit on the so-called mesoscale, which follows a nonlocal evolution equation, see [8,11]. This equation is then rescaled diffusively to obtain the macroscopic evolution law, in this case motion by mean curvature. For an appropriate choice of the parameters both limits can be performed simultaneously, yielding a macroscopic (and deterministic) evolution law for the phase boundary directly from the microscopic system. It is natural to ask for the corresponding large deviations result, i.e., for the probability of macroscopic interfaces evolving differently from the deterministic limit law. This is particularly interesting when studying metastable phenomena of transitions from one local equilibrium to another, as one needs to quantify such large deviations, which cannot be captured by the deterministic evolution (for the present context of Glauber dynamics and Kac potential we also refer to [22]). For the first step, i.e., deviations from the limit equation on the mesoscale, this has been achieved by F. Comets [7]. In the present paper and its companion [6] we extend this result and derive the probability of large deviations for the macroscopic limit evolution starting from the microscopic Ising–Kac model. The technical difficulties are related to the fact that almost all of the system will be in one of the two phases, i.e., contribute zero to the large deviations cost, while a deviation happens only at the interface. This means that the exponential decay rate of the probability of our events is smaller than the number of random variables involved. As a consequence of these difficulties, our final result holds in one dimension only (i.e., no curvature), while several partial results do not depend on the dimension.
If we were to follow the technique used in [7] we would obtain errors which either diverge under a further parabolic rescaling or cannot be explicitly quantified with respect to the small parameter. Therefore, in this paper we use a different technique: we introduce coarse-grained time-space-magnetization boxes and explicitly quantify all possible transitions in the coarse-grained state space.
Let us explain the setting of this paper more precisely. We fix a macroscopic space-time scale (ξ, τ) and we consider the particular example of an interface which is forced to move from a starting position ξ = 0 (at τ = 0) to a final position ξ = R within a fixed time T. If such a motion occurs with constant velocity V = R/T, linear response theory and Onsager's principle suggest that the power (per unit area) needed is given by V²/μ, where μ is a mobility coefficient. Our goal is to verify the limits of validity of this law in a stochastic model of interacting spins which mesoscopically gives rise to a model of interfaces.
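For orientation, the V²/μ law already follows from a one-line variational computation, under the standing assumption that the dissipation is quadratic in the interface velocity:

```latex
% Among velocity profiles v(t) with \int_0^T v\,dt = R, Cauchy--Schwarz gives
R^2 \;=\; \Big(\int_0^T v(t)\,dt\Big)^{\!2} \;\le\; T\int_0^T v(t)^2\,dt ,
\qquad\text{hence}\qquad
\frac1\mu\int_0^T v(t)^2\,dt \;\ge\; \frac{R^2}{\mu T} \;=\; \frac{V^2}{\mu}\,T ,
```

with equality exactly for the constant velocity v ≡ V = R/T; so constant-speed motion is optimal for any cost that is quadratic in the velocity.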
In [9] the same question has been studied starting from a model at the mesoscopic scale (x, t) and examining the motion of the interface at the macroscopic scale after a diffusive rescaling: x = ε⁻¹ξ and t = ε⁻²τ, where ε is a small parameter eventually going to zero. The authors considered a nonlocal evolution equation obtained as a gradient flow of a certain functional penalizing interfaces. An interface can be described as a non-homogeneous stationary solution of this equation; therefore, in order to produce orbits where the interface is moving (i.e., non-stationary) the authors included an additional external force. To select among all possible forces they considered as a cost functional an L²-norm of the external force, whose minimizer provides the best mechanism for the motion of the interface. In our case, however, starting from a microscopic model of spins, instead of postulating an action functional we actually derive it as a large deviations functional. Then, in order to find the best mechanism for the macroscopic motion of the interface one has to study its minimizers. This is addressed in the companion paper [6], where we use a strategy closely related to the one in [9], but with the extra complication that the new functional turns out to give a softer penalization on deviating profiles than the L² norm considered in [9].
There is a significant number of works in the literature studying closely related problems, mostly in the context of the stochastic Allen–Cahn equation. In [20,21,23], the authors study a minimization problem over all possible "switching paths" related to the Allen–Cahn equation: the cost functional is the L²-norm of the forcing in the Allen–Cahn equation, which is what one would heuristically expect if one could define the large deviations rate functional for the Allen–Cahn equation with space-time white noise. Their results deal with the meso-to-macro limit of those rate functionals, but do not connect these rigorously to a stochastic process on the microscale. On the other hand, the corresponding large deviations have been studied in [14,16,17]. Furthermore, combining the above results, the large deviations asymptotics under diffusive rescaling of space and time are obtained in [5] (see also the companion paper [4]): the authors consider coloured noise and take both the intensity and the spatial correlation length of the noise to zero while simultaneously performing the meso-to-macro limit. This double limit is similar in spirit to our work, with the difference that our noise is microscopic and the "noise to zero" limit is replaced by a "micro-to-meso" limit. However, they state the large deviations principle directly in the limit, while we only obtain quantitative estimates for the upper and lower bound which are valid at this macroscopic scale; hence it would be interesting, as future work, to carry out such an analysis in our case as well, possibly also in higher dimensions. Choosing another small parameter γ, we consider the microscopic lattice system S_γ, given by the intersection of the mesoscopic domain with γZ, as viewed from the mesoscale.

The Microscopic Model
where β is the inverse temperature and Z_{β,h} the normalization (partition function). We introduce the Glauber dynamics, which satisfies the detailed balance condition with respect to the Gibbs measure defined above, in terms of a continuous-time Markov chain. Let λ : {−1, +1}^{S_γ} → R₊ be a bounded function and p(·, ·) a transition probability on {−1, +1}^{S_γ} that vanishes on the diagonal: p(σ, σ) = 0 for every σ ∈ {−1, +1}^{S_γ}. Consider the space endowed with the Borel σ-algebra that makes the variables σ_n ∈ {−1, +1}^{S_γ} and τ_n ∈ R₊ measurable. For each σ ∈ {−1, +1}^{S_γ}, let P_σ be the probability measure under which (i) {σ_n}_{n∈N} is a Markov chain with transition probability p starting from σ and (ii) given {σ_n}_{n∈N}, the random variables τ_n are independent and distributed according to an exponential law of parameter λ(σ_n). Any realization of the process can be described in terms of the infinite sequence of pairs (σ_n, t_n), where t_0 = 0 and t_{n+1} = t_n + τ_n, determining the state into which the process jumps and the time at which the jump occurs. The space of realizations of the Glauber dynamics is also equivalent to D(R₊, {−1, +1}^{S_γ}), namely the Skorohod space of càdlàg trajectories (continuous from the right and with limits from the left). From [19] we have that for every P_σ the sequence (σ_n, t_n) is an inhomogeneous Markov chain with infinitesimal transition probability given by (2.5). The flip rate λ is given by λ(σ) = Σ_{x∈S_γ} c(x, σ) and the transition probability by p(σ, σ^x) = c(x, σ)/λ(σ), where σ^x is the configuration obtained from σ by flipping the spin located at x. The flip rates c(x, σ) for a single spin at x in the configuration σ are defined below; for later use we also express the rates in the form (2.7). Note that the flip rate is bounded both from above and below: c_m ≤ c(x, σ) ≤ c_M, as in (2.8).
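As an illustration of this construction, here is a minimal Gillespie-type simulation of such a continuous-time spin-flip chain; the heat-bath form of the rates, the uniform Kac kernel, the periodic boundary and all numerical parameters are illustrative assumptions made for the sketch, not the paper's exact choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def kac_field(sigma, g):
    # local Kac field: average of the spins within distance g
    # (uniform kernel and periodic boundary are assumptions of this sketch)
    k = np.ones(2 * g + 1) / (2 * g + 1)
    return np.convolve(np.r_[sigma[-g:], sigma, sigma[:g]], k, mode="valid")

def glauber_step(sigma, beta, g):
    # heat-bath flip rates c(x, sigma) = 1/(1 + exp(2 beta sigma_x h_x)),
    # bounded away from 0 and 1 as in (2.8)
    h = kac_field(sigma, g)
    rates = 1.0 / (1.0 + np.exp(2.0 * beta * sigma * h))
    lam = rates.sum()                          # lambda(sigma): total flip rate
    tau = rng.exponential(1.0 / lam)           # exponential waiting time
    x = rng.choice(len(sigma), p=rates / lam)  # p(sigma, sigma^x) = c(x, sigma)/lambda(sigma)
    sigma[x] *= -1                             # jump to the flipped configuration
    return tau

sigma = rng.choice(np.array([-1, 1]), size=200)
t = 0.0
for _ in range(1000):
    t += glauber_step(sigma, beta=1.5, g=10)
```

The pair (waiting time, jump target) drawn at each step realizes exactly the sequence (σ_n, t_n) described above.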

The Mesoscopic Model
For x ∈ S_γ, we divide the domain into intervals I_i of equal length, for some b > 0 to be determined in Sect. 7. Denoting by I(x) the interval that contains the microscopic point x ∈ S_γ, we consider the block spin transformation given by (2.9). In the sequel we will also need to specify it by the index i ∈ I of the coarse cell, i.e., denote it by m_γ(σ; i, t), or use a time-independent version m_γ(σ; i) as well.
In [11] it has been proved that as γ → 0 the function m_γ(σ; x, t) converges in a suitable topology to m(x, t), the solution of the nonlocal evolution equation ∂_t m = −m + tanh{β J ⋆ m}. (2.10) Furthermore, this equation is related to the gradient flow of the free energy functional F(m) = ∫ φ_β(m(x)) dx + (1/4) ∫∫ J(x, y) [m(x) − m(y)]² dx dy, (2.11) where φ_β(m) is the "mean field excess free energy" and S(m) the entropy: S(m) = −((1−m)/2) ln((1−m)/2) − ((1+m)/2) ln((1+m)/2). We also define f(m) := δF/δm = (1/β) arctanh(m) − J ⋆ m. Thus F is a Lyapunov functional for the equation (2.10): d/dt F(m(·, t)) = −∫ [(1/β) arctanh(m) − J ⋆ m][m − tanh{β J ⋆ m}] dx ≤ 0, since the two factors inside the integral have the same sign. This structure will be essential in the sequel.
Concerning the stationary solutions of the Eq. (2.10) in R, it has been proved that the two constant functions m^{(±)}(x) := ±m_β, with m_β > 0 solving the mean field equation m_β = tanh{βm_β}, are stable stationary solutions of (2.10) and are interpreted as the two pure phases of the system, with positive and negative magnetization.
Interfaces, which are the objects of this paper, are built from particular stationary solutions of (2.10). Such solutions, called instantons, exist for any β > 1 and we denote them by m̄_ξ(x), where ξ is a parameter called the center of the instanton. Denoting m̄ := m̄_0, we have that m̄_ξ(x) = m̄(x − ξ). (2.13) The instanton m̄ satisfies m̄ = tanh{β J ⋆ m̄}. (2.14) It is an increasing, antisymmetric function which converges exponentially fast to ±m_β as x → ±∞, see e.g. [12], and there are positive α and a so that lim_{x→∞} e^{αx} (m_β − m̄(x)) = a, (2.15) see [10], Theorem 3.1. Moreover, any other solution of (2.14) which is strictly positive [respectively negative] as x → ∞ [respectively x → −∞] is a translate of m̄(x), see [13]. Note also that in the case of finite volume the solution with Neumann boundary conditions is close to m̄, see [3], Section 3.
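The instanton can be found numerically by iterating the fixed-point map of (2.14); the uniform kernel J = (1/2)1_{[−1,1]}, the constant boundary extension and all parameters below are illustrative assumptions of this sketch.

```python
import numpy as np

# Fixed-point iteration m <- tanh(beta * J * m) for the instanton (2.14).
beta, dx = 2.0, 0.05
x = np.linspace(-10.0, 10.0, 401)            # mesoscopic spatial grid
pad = int(round(1.0 / dx))                   # kernel half-width in grid points
kw = np.ones(2 * pad + 1)
kw /= kw.sum()                               # discretized J, total mass 1

m = np.where(x >= 0, 0.9, -0.9)              # antisymmetric step initial datum
for _ in range(400):
    # extend by the boundary values (a Neumann-type extension) before convolving
    mp = np.r_[np.full(pad, m[0]), m, np.full(pad, m[-1])]
    m = np.tanh(beta * np.convolve(mp, kw, mode="valid"))

m_beta = 0.9575                              # approximate root of m = tanh(2m)
```

The map preserves monotonicity and the limit connects −m_β to +m_β, i.e., a discrete instanton; far from the center the profile relaxes to the pure phases exponentially fast, in line with (2.15).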

The Macroscopic Scale
This consists of the rescaled space-time domain over the time horizon T. The corresponding profiles are rescaled versions of the functions on the mesoscopic domain. In particular, the mesoscopically diffuse instanton becomes a sharp interface between the two phases.

Large Deviations at the Macroscopic Scale
We consider an instanton initially at macroscopic position 0 and move it to a final position R within a fixed time T = R/V, where V is a given value of the average velocity. At the mesoscopic scale, functions that satisfy the above requirement are profiles in the set (2.16). Due to the stationarity of m̄, no element of U[ε⁻¹R, ε⁻²T] is a solution of the equation (2.10). In order to produce such a motion, in [9] the authors added an external force to the equation (2.10). Then the optimal motion of the interface can be found by minimizing an appropriately chosen cost functional. Following their reasoning, given a profile φ(x, t) in (2.16) with time derivative φ̇(x, t), we suppose that the profiles under investigation are solutions of equation (2.10) with an additional external force b: φ̇ = −φ + tanh{β J ⋆ φ} + b. In [9] the cost functional has been chosen to be the L²-norm of b. In the present paper we derive such an action functional by considering the underlying microscopic process and studying the probability of observing such a deviating event. Note that this is a large deviation away from a typical profile that satisfies the mesoscopic equation (2.10). The problem is formulated as follows: show that the probability of the event under investigation is logarithmically equivalent to the minimal cost computed over the class U[ε⁻¹R, ε⁻²T] as γ → 0. Here we are using the symbol ∼ to denote a suitable notion of distance that will be formally given below in Definition 3.1. In [7] the probability for the transition from the neighborhood of one stable equilibrium to another has been studied by establishing the analogue of the Freidlin–Wentzell estimates, see [15]. The corresponding cost functional for T × [0, T] is given by (2.20), with density given in (2.21). However, in our case we have to perform the same task for the rescaled time and space domain in order to obtain a result which is valid also at the macroscopic scale.
This is technically challenging since, when the time horizon as well as the volume scale with ε(γ), the error estimates providing (2.20) are not bounded as γ → 0. To overcome this, we follow a different approach by coarse-graining the space of realizations of the process in all time, space and magnetization coordinates. Then, in order to calculate the probability of an event, we intersect it with all possible coarse-grained "tubelets". The final result comes from an explicit calculation of the probability of such a tubelet and agrees with (2.20).

Properties of the Cost Functional
After a simple manipulation we can write H in the following form (committing a small abuse of notation). It is then a straightforward calculation to see that the following holds uniformly in u ∈ [−1, 1] and w ∈ (−1, 1). Note that the cost assumed in [9] approximates the regime where b is small; hence it gives a stronger penalization of the deviating profiles than the one derived from the microscopic system.
For further properties we refer the reader to [7]. In particular, in the sequel we will use the fact (2.22). The minimization of the cost functional over the class U[ε⁻¹R, ε⁻²T] is addressed in the companion paper [6]. To get a rough idea, the cost of an instanton moving with ε-small velocity is given by (2.23), where m̄′ is the derivative of m̄. Following [9] it can be shown that other ways to move the instanton continuously are more expensive. However, in such systems one can also observe the phenomenon of nucleation, namely the appearance of droplets of one phase inside the other. In [1] and [2] it has been proved that for such a profile the cost is bounded by twice the free energy computed at the instanton, so it can be comparable to the cost (2.23) of the translating instanton. This will be properly stated in the main results in the next section.
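A sketch of the scaling behind (2.23), assuming the mesoscopic equation takes the form of (2.10) and measuring the forcing in L² (the weight defining μ only changes the constant):

```latex
% Travelling profile \phi(x,t)=\bar m(x-\epsilon Vt): since \bar m is stationary,
\dot\phi \;=\; -\epsilon V\,\bar m'(x-\epsilon Vt)
\;=\; \big[-\phi+\tanh\{\beta J\ast\phi\}\big] + b,
\qquad b \;=\; -\epsilon V\,\bar m'(x-\epsilon Vt),
% so, over the mesoscopic horizon T_\epsilon=\epsilon^{-2}T,
\int_0^{\epsilon^{-2}T}\!\!\int b^2\,dx\,dt
\;=\; \epsilon^{2}V^{2}\,\|\bar m'\|_{L^2}^{2}\,\epsilon^{-2}T
\;=\; V^{2}T\,\|\bar m'\|_{L^2}^{2},
```

which is of order one: the ε² from the small forcing exactly compensates the diffusively long time horizon, so the translating instanton has finite macroscopic cost.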

Main Results
We divide the domain into coarse cells I_i, whose length is chosen in terms of some number λ₀ > 0 to be determined later in (5.33). Note that each I_i contains γ⁻¹|I_i| many lattice sites of S_γ. Given such a coarse cell, we define the set of all discretized paths. Similarly, for a time-independent function m ∈ L^∞, we denote by σ_t ∈ {m}_δ (or {σ_t ∼ m} if we do not want to specify the parameter δ) the corresponding relation. Given a set A ⊂ D(R₊, {−1, +1}^{S_γ}), to each σ ∈ A we can associate a discretized profile a and a φ ∈ C¹ such that σ ∈ {a}_δ and σ ∈ {φ}_δ, respectively.

Definition 3.2 For
The main results of this paper are the following quantitative estimates, in which again lim_{γ→0} c_γ = lim_{γ→0} δ_γ = 0.
The above theorem is a quantitative version (for finite γ) of a Large Deviation Principle (LDP) for γ⁻¹ε⁻¹ many random variables with a rate of only γ⁻¹. Note that if we wanted to write a statement directly in the limit γ → 0, one should study the Γ-limit of the functional I^{(γ)}, which might be a delicate issue since we would need to express the limiting functional over singular functions and with the appropriate topology for the LDP to hold. However, we can find both the minimal value and the profiles at which it is attained in the limit γ → 0. This is the content of the companion paper [6], where we obtain a lower bound for the cost functional. We start with a definition.

Definition 3.4
Given R, T > 0 and the mobility coefficient μ := 4‖m̄′‖⁻²_{L²(dν)} > 0, we define in (3.8) the cost corresponding to n nucleations and the related translations, where V = R/T, F is the free energy (2.11) and m̄ the instanton, given in (2.14).
Note that the first term in (3.8) corresponds to the cost of n nucleations while the second to the cost of displacement of 2n + 1 fronts (with the smaller velocity V /(2n + 1)).
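Schematically, with a hypothetical nucleation cost `c_nuc` and mobility `mu` standing in for the exact constants of (3.8), the optimal number of nucleations is obtained by a discrete minimization:

```python
def cost(n, V, T, mu=1.0, c_nuc=1.0):
    # schematic version of (3.8): n nucleations plus 2n+1 fronts,
    # each moving at the reduced speed V/(2n+1)
    return n * c_nuc + V ** 2 * T / ((2 * n + 1) * mu)

def optimal_n(V, T, mu=1.0, c_nuc=1.0, nmax=50):
    # brute-force minimizer over the number of nucleations
    return min(range(nmax), key=lambda n: cost(n, V, T, mu, c_nuc))
```

Slow transitions select n = 0 (pure translation), fast ones select n ≥ 1, and n increases by integer steps as V grows, which is the quantization effect discussed in the text.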
The proof of this theorem is given in the companion paper [6]. Combining the results of Theorems 3.3 and 3.5 we obtain a corollary about the optimal macroscopic motion of the interface. We start with some definitions: from the cost (3.8) we consider the set of optimal nucleation numbers, which contains at most two elements. One can check that for certain values of R and T, n and n + 1 nucleations have the same cost for some n, since we can obtain the same minimal value with one nucleation less but a higher velocity of the newly created fronts. Hence the number of nucleations quantizes the cost. We then define the set of profiles that have, for some time t ∈ T, at least the optimal number of nucleations: given δ > 0 we define the corresponding set of mesoscopic paths M_δ. Note also that here we assume that the nucleations occur simultaneously, as this is the most efficient way, see [6]. The fact that the instanton has travelled at least ε⁻¹R is represented by the corresponding set, where P_{σ₀} denotes the law of the magnetization process starting at σ₀, with σ₀ ∈ {m̄}_γ as in (2.13).
The proof follows from the previous results. The key point is that if we consider the costs corresponding to the sets (A_{δγ})^c ∩ C_{δγ} and C_{δγ}, then, by using the corresponding estimates from Theorem 3.3 for the closed and the open sets, the former cost is strictly larger: in the first set we do not include the optimal number of nucleations, hence the cost is higher than in the second. The proof then follows by applying the estimates of Theorem 3.3 to the conditional probability.

Strategy of the Proof of Theorem 3.3
Given a closed set C ⊂ D(R₊, {−1, +1}^{S_γ}) and the parameter in (3.1), consider the corresponding discretized set. Now choose δ as half of that parameter and partition the sample space to get an upper bound by restricting to the set of tubelets intersecting C, given in (3.4). Since we would like to work with smooth functions, we also define the following intermediate space. Definition 3.7 We denote by PC_{|I|}Aff_{Δt} the space of functions which are piecewise constant in space (on intervals of length |I|) and linear in time (on intervals of length Δt). Given a discretized profile a, φ_a is the linear interpolation between the values a(x, (j − 1)Δt) and a(x, jΔt). With the above choices, suppose we are able to find, for a given tubelet {a}_δ, an estimate of the form (3.17); here f_{i,j}(a) will be a discrete version of the density of the cost functional we are after.
In the second inequality we bounded the sum by its maximal term times the cardinality. Denoting by N_s, N_t and N_m the number of space, time and magnetization coarse cells, respectively, we have the corresponding bound for the cardinality, which is negligible at the exponential scale, for all c < 1, as γ → 0.
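Explicitly, if every tubelet is specified by one of N_m magnetization values in each of the N_s N_t space-time cells, then (a rough count; the exact parameters are fixed in Sects. 3 and 7):

```latex
\#\{\text{tubelets}\} \;\le\; N_m^{\,N_s N_t},
\qquad
\ln \#\{\text{tubelets}\} \;=\; N_s N_t \ln N_m .
% With \Delta t = \gamma^{c}, the right-hand side is only polynomial in the
% small parameters, so \gamma \ln \#\{\text{tubelets}\}\to 0 for all c<1:
% replacing the sum over tubelets by its maximal term is free at the
% exponential scale \gamma^{-1}.
```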
In order to prove (3.17), in Section 4 we divide T into time intervals with fewer (respectively more) spin flips than a fixed number. We call these time intervals good (respectively bad). We first show that the probability of having more than a given number (still diverging) of bad time intervals is negligible. In this way we partition the space of realizations by considering good and bad time intervals, which we study separately. In each case we obtain a different form of f. In Section 5 we study the probability of the tubelet in a good time interval and, by appropriately approximating it by a Poisson process for the number of positive and negative spin flips, we obtain a formula for the density of the cost functional under the assumption that the fixed magnetization profiles a are far enough from their boundary values ±1. This assumption will be removed later in Appendix 1 by showing that the probability of the process being close to any profile a can be approximated, within some allowed error, by the probability of the process being close to another profile ã as above. Another key step in the derivation of the cost in the good time intervals is to replace the random rates by deterministic ones; this is done in Section 5.3. Then, in Section 6 we treat the case of bad time intervals. More specifically, we first show a rough upper bound for the probability in a given time interval which, together with the estimated number of bad time intervals, shows that the bad time intervals have vanishing contribution to the cost. We conclude with Section 7, where we prove that the discretized sum is a convergent Riemann sum yielding the cost functional we are after. To do that, we replace the discrete values a by the corresponding profile φ_a and subsequently obtain the cost functional over such functions, given by I^{(γ)}(φ_a) as in (2.20).
Finally, in Lemma 7.2 we argue that it is enough to minimize over smoother versions of such functions, i.e., we restrict our attention to the set given in (3.5). Once we have the upper bound we can look at where the infimum occurs. For the lower bound we pick a collection {a*_{i,j}}_{i,j} which corresponds to the infimum and bound the probability of an open set O from below by the probability of this particular profile, i.e., (3.20). We skip the explicit proof of the lower bound as it is a straightforward repetition of the steps for obtaining the upper bound, with small alterations which will be discussed throughout the proof.

Too Many Jumps are Negligible
We distinguish two types of time intervals, namely those with fewer (we call them good) or more (bad) spin flips than a fixed number N̄, chosen slightly larger than the expected number of jumps within time Δt; the precise choice involves some λ₁ > 0 to be determined in (7.20). For the time interval [jΔt, (j + 1)Δt) we denote the number of jumps within this interval by N(σ_t, j). We decompose the path space X in (2.4) accordingly: D^{(k)}_{j₁,...,j_k} = {N(σ_t, j) > N̄ for j ∈ {j₁, . . . , j_k} and N(σ_t, j) ≤ N̄ otherwise} is the set of realizations with k bad time intervals, indexed by j₁, . . . , j_k. We then decompose the probability on the left-hand side of (3.17) over the number of bad intervals and select k̄ such that P({a}_δ ∩ D_{k̄}) is negligible. Note that the number of jumps in an interval is stochastically dominated by a Poisson random variable of parameter λ̄Δt, where λ̄ := max_σ λ(σ). Therefore, given a configuration σ₀, the probability of more than k̄ bad intervals is negligible if we choose k̄ via some η₂ ≡ η₂(γ) = |ln γ|^{−λ₂}, with λ₁ > λ₂ > 0, (4.6) so that η₁ ≪ η₂, as required in Sect. 7, formula (7.4). Notice that k̄ → ∞ as γ → 0, since Δt = γ^c while all other parameters grow logarithmically in γ. Thus, overall we show that the probability of having too many bad time strips is negligible, so for the upper bound we estimate it by the probability of the set {a}_δ ∩ D^c_{k̄}. This can be further bounded by taking the cardinality of the sum over k and j₁ < . . . < j_k and then the maximum over (k, {j₁, . . . , j_k}). We call k*, {j*₁, . . . , j*_{k*}} the choice where the maximum is attained. On the good time strips (j ∉ {j*₁, . . . , j*_{k*}}) we derive a discrete version of the density of the cost functional. On the bad time strips (j ∈ {j*₁, . . . , j*_{k*}}) we obtain upper and lower bounds and show that, since these strips are few, the corresponding cost is negligible. Note also that for the lower bound (3.20) we can simply restrict our attention to the good part D^c_0.
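The domination of the jump count by a Poisson(λ̄Δt) variable is what makes bad intervals rare; the standard Chernoff bound for Poisson tails (with purely illustrative numbers) can be checked numerically:

```python
import math

def poisson_tail(mu, k):
    # exact P(N >= k) for N ~ Poisson(mu), via the complementary sum
    return 1.0 - sum(math.exp(-mu) * mu ** j / math.factorial(j) for j in range(k))

def chernoff_bound(mu, k):
    # optimized exponential Markov bound: P(N >= k) <= e^{-mu} (e mu / k)^k, for k > mu
    return math.exp(-mu) * (math.e * mu / k) ** k

mu, k = 10.0, 30          # mu plays the role of lambda-bar * Delta-t (illustrative)
exact, bound = poisson_tail(mu, k), chernoff_bound(mu, k)
```

Choosing N̄ a factor above λ̄Δt thus makes each interval bad with stretched-exponentially small probability, and a union bound over intervals gives the negligibility used in this section.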

Good Time Intervals
In this section we compute the probability in a good time interval [(j − 1)Δt, jΔt).

Coarse-Grained Spin Flip Markov Process {σ̄_t}_{t≥0}
We establish a new spin flip Markov process {σ̄_t}_{t≥0}, defined on the same space and in a similar fashion as {σ_t}_{t≥0}, but which does not distinguish among the spins of the same coarse cell I_i, i ∈ I. The new infinitesimal transition probability is given by (5.1), where p̄(·, ·) and λ̄ are given below. Recalling the coarse-graining over space intervals I_i, i ∈ I, we first define in (5.2) the coarse-grained interaction potential J̄_γ by averaging J_γ(x, y) over the corresponding cells; note also that for all x ∈ I_i and y ∈ I_{i′} we have the bound (5.3). The coarse-grained rates c̄(x, σ) for x ∈ I_i are defined accordingly; then the flip rate λ̄ and the transition probability p̄ are given in terms of them. In the next lemma we compare the processes σ and σ̄. Lemma 5.1 For any profile a there exists c > 0 such that for γ > 0 small enough (5.6) holds, where η₁ is given in (4.2) and C*(γ) = |I| ‖J‖_∞ + γ ‖J‖_∞. Remark 5.2 Note that after multiplying by γ, taking logarithms, and considering all time intervals, the error in (5.6) is negligible. Proof We compare the rates of the processes σ_t and σ̄_t: for any x ∈ I_i, from (5.3) and the properties of F in (2.7), starting from the same configuration σ we have that there exists c > 0 such that (5.8) holds.

Using (5.3) we obtain the error estimate, which further gives the corresponding bound on the rates. Passing to the Radon–Nikodym derivative between the laws of the processes σ_t and σ̄_t (see e.g. [19], Appendix 1, Proposition 2.6), we obtain the upper bound γ⁻¹ε⁻¹C*(γ)Δt for the integral in (5.11) and N̄C*(γ), with N̄ as in (4.1), for the sum, which together yield the bounds of (5.6).
Let L̄ be the generator of the new process {σ̄_t}_{t≥0}. We consider the magnetization density at each coarse cell I_i of the new process {σ̄_t}_{t≥0} as in (2.9) and (with slight abuse of notation) obtain (5.13) for functions of the form f(σ) = g(m_γ(σ)). To see this, we first denote the new coarse-grained process by m(t) ≡ {m_i(t)}_{i∈I}, whose generator L is given by (5.14) with rates c̄_±(i; m) as in (5.15), where, by a slight abuse of notation compared to (5.5), h(i; m) denotes the coarse-grained field and F_∓(h) = e^{∓βh}/(e^{−βh} + e^{βh}). (5.17) When f(σ) = g(m_γ(σ)), then L̄f(σ) = Lg(m_γ(σ)). By induction on n, we have that L̄ⁿf(σ) = Lⁿg(m_γ(σ)), and expanding e^{L̄t}f in a power series we obtain (5.13).

Poisson Process for the Jumps
To compute the probability for the coarse-grained process we realize the coarse-grained Glauber dynamics by constructing, for each m_i, two independent Poisson processes whose jump times {t^i_{±,n}(m_i), n ∈ N}, called "random times", and then taking the product over all i ∈ I. Hence we can construct the process m(t) := {m_i(t)}_{i∈I}, t ≥ 0, as follows: if at time s ≥ 0 the process is in m, then it remains in m until the minimum over i of the times t^i_± := min_{n∈N} t^i_{±,n}(m_i) occurs. Then, for that i, the magnetization m_i increases (respectively decreases) by 2/(γ⁻¹|I|). The case min_i t^i_− = min_i t^i_+ has probability 0.
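A single-cell caricature of this two-clock construction; the heat-bath rates with a frozen field h, and all numbers, are assumptions made for the sketch (in the paper the field is itself determined by the neighbouring cells):

```python
import math
import random

random.seed(7)

# One coarse cell of K = gamma^{-1}|I| spins with two Poisson clocks; each
# plus (minus) ring moves m by +2/K (-2/K).  The rates below are an
# illustrative heat-bath choice in the spirit of (5.15)-(5.17).
K, beta, h = 400, 1.0, 0.5

def c_plus(m):   # per-site rate of a minus spin flipping to plus
    return 0.5 * (1.0 - m) * math.exp(beta * h) / (2.0 * math.cosh(beta * h))

def c_minus(m):  # per-site rate of a plus spin flipping to minus
    return 0.5 * (1.0 + m) * math.exp(-beta * h) / (2.0 * math.cosh(beta * h))

m, t = 0.0, 0.0
while t < 50.0:
    rp, rm = K * c_plus(m), K * c_minus(m)
    t += random.expovariate(rp + rm)                 # first of the two clocks rings
    m += 2.0 / K if random.random() < rp / (rp + rm) else -2.0 / K
```

The drift identity 2(c₊(m) − c₋(m)) = tanh(βh) − m shows the cell magnetization relaxes towards tanh(βh), consistent with the mean-field structure of the coarse-grained rates.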

From Random to Deterministic Rates
The complication in the construction of m(t) resides in the fact that we need to know how the random times are interrelated. Furthermore, the values of m_i and m_j (at the two coarse-grained boxes I_i and I_j, respectively) are correlated via the interaction potential J̄_γ. Hence, for both of the above reasons, the analysis becomes much simpler if we make the intensities of the random times independent of the current value m_i. To this end, we make them depend on some deterministic value of the profile which remains close to m_i during the whole time interval of length Δt. As a result, there will be only two rates for each i ∈ I: one for the plus jumps and one for the minus jumps. Let N^±_{i,j−1} be the number of plus/minus random times during the time interval [(j − 1)Δt, jΔt] that occur in the i-th space interval. Note that, for simplicity of notation, in N^±_{i,j−1} we do not carry the dependence on Δt. The change of the magnetization over this time interval is then determined by N^±_{i,j−1}. To formulate this idea we introduce new deterministic rates c̄_±(i, a) in (5.18), depending on the fixed configuration a ≡ {a_{i,j}}_{i,j}, where F_∓ is given in (5.17). Our goal is to use the distribution of the random variable 2(N^−_{i,j−1} − N^+_{i,j−1}). More precisely, in Lemma 5.3 below, we show that the law of two independent Poisson processes with deterministic intensities γ⁻¹|I|c̄_±(i, a) is close to the law of two independent Poisson processes with intensities γ⁻¹|I|c̄_±(i, m((j − 1)Δt)).
By approximating the mean-field process by one with constant intensities γ⁻¹|I|c̄_±(i, a) (one for the plus and one for the minus species), the resulting process is independent in each space box indexed by i ∈ I. The Poisson probability of the occurrence of n random times at a given space box within a time interval of length Δt is explicit. We then consider the event (5.21), where the random variable N_{i,j} stands for the number of jumps within the corresponding time interval.

Lemma 5.3
Let ν^i = P_{γ⁻¹|I|c̄₊(i,a)} × P_{γ⁻¹|I|c̄₋(i,a)} be the law of the product of two independent Poisson processes with intensities γ⁻¹|I|c̄₊(i, a) and γ⁻¹|I|c̄₋(i, a), respectively. Then, for any configuration a and δ > 0, the estimate (5.22) holds. Remark 5.4 Finally, note that the error is negligible for the choice δ ≡ δ_γ related to (3.1), since it vanishes after considering all time intervals

under the requirement (5.7) and the fact that Δt (in δ_γ) is a power of γ. Moreover, we denote by ν^i_{m_i((j−1)Δt)}(·) the conditional probability of an event which starts at m_i((j − 1)Δt).
Proof We consider a process {m̃(t)}_{t≥0} whose rates are constant and equal to γ⁻¹|I|c̄_±(i, a) as in (5.18). Comparing the rates c̄_±(i, m) and c̄_±(i, a), given in (5.15) and (5.18) respectively, we obtain the estimate (5.24), where we have used (5.3) in a slightly different case, namely for r, r′ ∈ R rather than just on S_γ. Recalling C*(γ) from (5.9), we obtain (5.25). Using (5.11), we get (5.26). Furthermore, since the processes m̃_i are independent with respect to i ∈ I, we can write (5.26) in product form over i, and similarly for the lower bound. Last, it is easy to see that, given an initial condition m_i((j − 1)Δt) ∈ {a_{i,j−1}}_δ, to every element of the set {m_i(jΔt) ∈ {a_{i,j}}_δ, N_{i,j} ≤ N̄} there corresponds exactly one element of B^δ_{i,j−1}(a); hence the right-hand side of (5.27) equals that of (5.22), which concludes the proof of the lemma.
Remark 5.5 Note that if, instead of the definition (5.2) for the coarse potential, we used a different one which is more common in the literature, e.g. see [24], formula (4.2.5.2), then the estimate (5.24) would be simpler and equal to cδ.
The next task is the asymptotic analysis of (5.20). In the lemma below we compute the cost functional for the Poisson process. Lemma 5.6 Given a profile a ≡ {a_{i,j}}_{i,j}, let ν^i = P_{γ⁻¹|I|c̄₊(i,a)} × P_{γ⁻¹|I|c̄₋(i,a)} be the law of two independent Poisson processes with intensities γ⁻¹|I|c̄₊(i, a) and γ⁻¹|I|c̄₋(i, a), respectively. Then, for d_{i,j−1} = (a_{i,j} − a_{i,j−1})/Δt and B^δ_{i,j−1}(a) as in (5.21), with some δ > 0 small, e.g. δ = Δt·η₀ with η₀ as in (3.2), the estimate (5.29) holds for α > 0 small, with the density f given in (5.30),
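The optimization behind this lemma is the classical Legendre transform of Poisson probabilities: with K playing the role of γ⁻¹|I|, the probability that a Poisson(Kλ) count equals Kx decays exponentially in K with rate x ln(x/λ) − x + λ, which can be checked numerically via Stirling's formula (the numbers below are illustrative):

```python
import math

def poisson_log_pmf(mu, k):
    # log P(N = k) for N ~ Poisson(mu)
    return -mu + k * math.log(mu) - math.lgamma(k + 1)

def empirical_rate(lam, x, K):
    # -(1/K) log P(Poisson(K lam) = K x)
    return -poisson_log_pmf(K * lam, int(K * x)) / K

def poisson_rate(lam, x):
    # Legendre transform of the Poisson log-moment generating function
    return x * math.log(x / lam) - x + lam

lam, x = 1.0, 2.0
vals = [empirical_rate(lam, x, K) for K in (100, 1000, 10000)]
```

The convergence of `vals` to `poisson_rate(lam, x)` is the mechanism producing the density f and the optimal values x̂^± below.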
and the optimal values x̂^±_{i,j−1} satisfy (5.32). Remark 5.7 The error in (5.29) is negligible if we choose η₀ appropriately. Moreover, for later use, we also consider a Δt-dependent version of f in (5.30). Note that for the values x̂^±_{i,j−1} given in (5.32), the identity (5.34) holds. The proof of the lemma will be given in Appendix 1. The next step is to show that the stochastic dynamics prefers to drive the system towards profiles a which stay away from the boundary values ±1. We introduce the threshold δ := Δt · η₃, with η₃ ≡ η₃(γ) := |ln γ|^{−λ₃}, λ₃ > 0, (5.35) where λ₃ will be determined in (7.20), and consider the corresponding class of profiles. In the following lemma we prove that, given a profile a, we can construct a new profile ã in this class which the process m prefers to follow with higher or comparable probability.

Remark 5.9
Note that the error is negligible after multiplying by γ and taking logarithms, once we consider all space-time coarse-grained boxes. The proof is given in Appendix 1. We summarize what we have done so far: putting together the results of Lemmas 5.1, 5.3, 5.6 and 5.8 and considering the number of all time-space coarse cells, we have lower and upper bounds, for γ > 0 small enough and for some c > 0, as in (5.38) and (5.39). Note that the error is negligible under the requirements in the corresponding lemmas.

Bad Time Intervals
Going back to (4.7) and the discussion below it, for the terms in {a} δ ∩ D c k with j ∉ { j * 1 , . . . , j * k } we use the formula derived in the previous section. On the other hand, for the terms with j ∈ { j * 1 , . . . , j * k } we obtain upper and lower bounds by replacing the rates with the corresponding constant ones c m and c M as in (2.8). Hence, for the case of the upper bound (and similarly for the lower bound), we construct a new process σ̂, a Markov process with infinitesimal transition probability P̂. In the new process we have replaced the rates by constant ones in such a way as to obtain an upper bound; it is easy to check that (6.3) holds, where P̂ is the law of the new process {σ̂ t } t≥0 . To compute the upper and lower bounds for the new process we proceed as before and consider the corresponding mean field process {m i (t)} i∈I, t≥0 with the corresponding constant rates. Defining the Poisson representation of the process as in Subsection 5.2, we obtain upper (g 1 ) and lower (g 2 ) bounds similar to (5.38) and (5.39), respectively, where instead of f we have g 1,2 as defined in (6.4)-(6.5). Here ẑ ± are computed following Appendix 1. Note that we also have the rough lower bound g 1,2 (ẑ ± ; a) ≥ −c b , where c b is a positive constant, since h ≥ 0. Now we have all the ingredients to derive the discrete version of the cost functional in the space × T .

Derivation of the Cost Functional
We recall from Definition 3.7 the space PC |I | Aff Δt ( × T ) of all functions (7.1) which are linear interpolations in time between the values a(x, ( j − 1)Δt) and a(x, jΔt) and piecewise constant in space. We also consider the function ψ a in (7.2), which agrees with the time derivative of φ a in each open interval. We further recall that {k * , j * 1 , . . . , j * k * } = argmax {k, j 1 ,..., j k } P({a} δ ∩ D (k) j 1 ,..., j k ) and, for simplicity, we write J * := { j * 1 , . . . , j * k * }. Then, for a ∈ ¯ γ , from (4.7), (5.39), (6.2) and (6.5) we obtain (7.3), for some ã ∈ ¯ δ γ and k̂ as in (4.5). A similar lower bound is obtained following the same reasoning. To have a negligible error in (7.3) we need the constraints (5.7), (5.33), (5.37) and (7.4), the last of which holds by the choice made in (4.6). The next step is to replace f by the density H of the cost functional.

Lemma 7.1 For every a ∈ ¯ δ γ , with δ as in (5.35) and φ a , ψ a as in (7.1) and (7.2), there is a constant C γ → 0 as γ → 0 such that (7.5) holds. Both F(x ± ; a) and H (φ a , ψ a ) are functions on × T given by (7.6) and (7.7), with f (x ± ; a) as given in (5.30) and where the x, t dependence is hidden in φ a and ψ a .
Proof We first estimate the difference between x + i, j as in (8.9) and y ≡ y(φ a , ψ a ), defined analogously with φ a and ψ a in place of a. The rates c ± (φ a ) are defined analogously to c ± (i, a) in (5.18), with φ a in place of a j−1 (x). By comparing with the rates c ± (i, a) we obtain that, for x ∈ I i and t ∈ [( j − 1)Δt, jΔt), the estimate (7.9) holds for some c > 0. Moreover, the above rates satisfy certain identities; from these, after some straightforward cancellations, we rewrite the function H (φ a , ψ a ) in (7.7) in terms of h, defined in (5.31). Notice the similarity with f (x ± i, j ; a), where φ a and c ± (φ a ) have replaced a and c ± (i, a), respectively.
Then, for the difference | f (x ± i, j ; a) − H (φ a , ψ a )|, it suffices to estimate the following two terms, as the others can be treated in a similar fashion. The first term is given in (7.9), so we require that −3 |φ̇ a Δt| (1−α)/2 → 0 as γ → 0. (7.10) Note that if all allowed spin-flips occur in the same space coarse-grained box, we have the bound (7.11), where N was chosen in (4.1). Thus, requirement (7.10) is easily satisfied since Δt = γ c . The main difficulty is in the second term since, in some regimes, |x + i, j | may be large while at the same time 1 + φ a is small. This occurs when the given profile a (and hence also φ a ) is very close to the boundary value −1 (recall the lower bound 1 + φ a ≥ Δt · η 3 from (5.36)) with a negative derivative which can also be large in absolute value, as given by (7.11). By the symmetry of the problem, the same holds for a profile going up and being close to the upper boundary +1, in which case the "bad" term is |x − i, j | · | ln (1−a)/(1−φ a ) |. More precisely, this corresponds to the regime in (8.7). We fix a threshold η 4 ≡ η 4 (γ ) := | ln γ | −λ 4 , λ 4 > 0, (7.12) such that η 4 >> Δt, and we split the integral of |x + i, j | · | ln (1+a)/(1+φ a ) | over the set {1 + φ a > Δt η 4 } and its complement. For the first set we have an estimate, and we choose Δt << η 4 << · η 1 · |I |. (7.13) Under this condition we obtain the bound (7.14). This vanishes provided that η 4 << η 1 2 · |I | 2 · 4 , i.e., λ 4 > 2λ 1 + 2b + 4a, (7.15) which also covers the previous requirement (7.13).
In the complement, recalling the properties (2.22) of the functional, we exploit the fact that ψ a ln(1 + φ a ) ∈ L 1 ( × T ) for ψ a = φ̇ a ; this gives (7.16). On the other hand, we also have that (1+a)/(1+φ a ) > 1, which, by (7.11) and the fact that |1 + φ a | ≥ Δt η 3 , implies (7.17). From (7.16) and (7.17) we obtain (7.18). The other parameters can then be chosen as follows: η 0 from requirement (5.33), η 2 from (7.4) and η 4 from (7.15). The parameters a and b, for and |I | respectively, have more freedom, but within the limits of the above constraints. Finally, the error C γ in (7.5) is given by the right-hand sides of (7.14) and (7.18), which vanish as γ → 0.
Putting together good and bad time intervals from (5.29) and (6.4)-(6.5), we obtain the bound (7.21) for the last two factors of (7.3). Since both f and g 1,2 are integrable functions on × T and |J * |/( −2 /Δt) is negligible, using again Lemma 7.1 we have that (7.21) equals ∫ ×T H (φ a , ψ a ) dx dt plus a vanishing error as γ → 0. We conclude the last step of the strategy (3.16) by restricting to a class of smoother functions:

Lemma 7.2 Given a closed set C ⊂ D(R + , {−1, +1} S γ ), for some δ, δ′ > 0 we denote by ¯ δ′ γ,δ (C) the set of profiles in ¯ γ,δ (C) defined in (3.4) with the extra property that |a ± 1| > δ′. Then, for such a profile a ∈ ¯ δ′ γ,δ (C) and δ, δ′ chosen as before, we have the bound (7.22), whose right-hand side involves I ×T (φ) + C γ , with the same C γ as in (7.5).
Proof Mollified versions of (φ a , ψ a ) are elements of U δ (C), to which we can restrict ourselves, thereby obtaining a lower bound. Furthermore, mollified functions are close in L 1 to the original ones, and the same is true for their images under integrable functions such as those appearing in H (φ a , φ̇ a ). Hence, we can approximate H (φ a , φ̇ a ) by H evaluated at mollified versions of φ a , with a negligible error similar to the one in Lemma 7.1. This is a standard calculation and the details are omitted.
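To make the interpolation (7.1)-(7.2) concrete, here is a minimal numerical sketch (function and variable names are ours, not the paper's): φ a is linear in time between consecutive coarse values a i, j−1 and a i, j and piecewise constant in space, and ψ a is its time derivative on each open interval, i.e. the difference quotient d i, j−1.

```python
import numpy as np

def phi(a, dt, x_idx, t):
    """Sketch of (7.1): linear-in-time, piecewise-constant-in-space
    interpolation of a coarse-grained profile a[i, j] (i: space cell,
    j: time cell of width dt)."""
    j = min(int(t // dt), a.shape[1] - 2)   # time cell containing t
    s = (t - j * dt) / dt                   # fraction of the cell elapsed
    return (1 - s) * a[x_idx, j] + s * a[x_idx, j + 1]

def psi(a, dt, x_idx, t):
    """Sketch of (7.2): the time derivative of phi on each open
    interval, i.e. the difference quotient (a[i,j+1]-a[i,j])/dt."""
    j = min(int(t // dt), a.shape[1] - 2)
    return (a[x_idx, j + 1] - a[x_idx, j]) / dt
```

For instance, with one space cell and values (−1, 0, 1) at times 0, dt, 2dt, phi is the broken line through those values and psi its constant slope on each interval.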
Given a ∈ ¯ γ , we denote by R δ i, j (a) the range of values that the pair of random variables (X i, j−1 , K i, j ) can take. This is determined by the set {|K i, j − d i, j−1 Δt| < δ} for K i, j and by the set [m δ i, j (K i, j ), M δ i, j (a)] for N + i, j−1 , where M δ i, j (a) is defined in (5.19). Note that in M δ i, j (a) the minimum is over the number of pluses at time ( j − 1)Δt and the number of minuses at the next time jΔt, as the number of pluses that become minuses can exceed neither of them. By (5.20) we have: P γ −1 |I |c + (i,a) (N + i, j−1 = n + i, j−1 ).

(8.5)
For n + i, j−1 and n + i, j−1 + γ −1 |I |k i, j large enough, we apply Stirling's formula to (8.5) and, using (5.20), obtain the expression (8.6), where f Δt (x ± i, j−1 ; a) is given in (5.34) and x ± i, j−1 represents the number of occurrences of the random times N ± i, j−1 divided by γ −1 |I |. Recall also that x − i, j−1 = x + i, j−1 + γ −1 |I |k i, j . Moreover, note that in the latter sum k i, j denotes a number rescaled by γ −1 |I |, while in the sum in (8.5) it is not rescaled.
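As a numerical sanity check of this Stirling step (not from the paper; the intensity lam and ratio x below are generic stand-ins for γ −1 |I |c ± and x ± i, j−1 ), one can compare the exact Poisson log-probability with its large-deviation approximation:

```python
import math

def log_poisson_pmf(lam, n):
    """Exact log P(N = n) for N ~ Poisson(lam), via lgamma for n!."""
    return -lam + n * math.log(lam) - math.lgamma(n + 1)

def ld_exponent(lam, x):
    """Leading-order Stirling approximation -lam*(x ln x - x + 1),
    with the convention 0 ln 0 = 0."""
    return -lam if x == 0 else -lam * (x * math.log(x) - x + 1)

# With lam large, the two expressions differ only by an O(log lam)
# prefactor term, uniformly over x in a compact set away from 0:
lam = 10_000.0
errors = [abs(log_poisson_pmf(lam, int(lam * x)) - ld_exponent(lam, x))
          for x in (0.5, 1.0, 2.0)]
```

The discrepancy comes from the polynomial prefactor in Stirling's formula and is negligible on the exponential scale γ −1 |I | used in (8.6).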

Asymptotics of the Poisson Process, Proof of Lemma 5.6
We give the asymptotic analysis of the Poisson process.
Proof of Lemma 5.6 We optimize the exponent of (8.6) with respect to x + i, j−1 ∈ γ −1 |I |R δ i, j (a) (viewing k i, j as a parameter), using the fact that x − i, j−1 = x + i, j−1 + γ −1 |I |k i, j . The optimal value is given by (8.9), which further implies that |R δ i, j (a)| ≤ |R δ i, j (ã)| and |c ± (i, ã) − c ± (i, a)| ≤ βδ |I |. Hence the only terms in (8.12), (8.13) and (8.14) that contribute to the estimate are those involving the ratio and the difference of the rates. Thus, in this case, we obtain: ≤ e γ −1 |I | [ 2 ln(1 + βδ |I |/c m ) + 2β|I |δ Δt] .
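The optimization step can be illustrated in simplified notation (writing x ± for the rescaled counts, c ± for the intensities and k for their fixed difference; all rescalings by γ −1 |I | and Δt are suppressed, so this is a sketch of the standard two-rate Poisson computation, not the paper's exact (8.9)): minimizing the sum of the two rate functions under the constraint x − = x + + k gives

```latex
% Minimize, over x^+ > 0 with x^- = x^+ + k fixed,
%   g(x^+) = x^+\ln\tfrac{x^+}{c^+} - x^+ \;+\; x^-\ln\tfrac{x^-}{c^-} - x^- .
% Since dx^-/dx^+ = 1, the critical-point equation is
\[
  g'(x^+) \;=\; \ln\frac{x^+}{c^+} + \ln\frac{x^-}{c^-} \;=\; 0
  \quad\Longleftrightarrow\quad \hat{x}^{+}\hat{x}^{-} = c^{+}c^{-},
\]
% which, together with \hat{x}^- = \hat{x}^+ + k, yields
\[
  \hat{x}^{+} \;=\; \frac{-k + \sqrt{k^{2} + 4\,c^{+}c^{-}}}{2},
  \qquad
  \hat{x}^{-} \;=\; \hat{x}^{+} + k .
\]
% Convexity of g guarantees that this critical point is the unique minimizer.
```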

Remark 7.3
In some realizations and in some boxes it may also happen that the number of plus or minus jumps is finite. We show that in such a case we can still work with profiles away from ±1. Consider Case 1, with finitely many plus jumps, when a is close to +1; the other cases can be treated similarly. Then, in (8.5), for the probability of jumps P γ −1 |I |c + (i,a) (N − i, j−1 = n − i, j−1 ), as given in (5.20), we use the injective map ι as in Case 1 and obtain