Journal of Statistical Physics, Volume 147, Issue 6, pp 1094–1112

Gibbs-Non-Gibbs Transitions via Large Deviations: Computable Examples

Authors

  • Frank Redig
    • Delft Institute of Applied Mathematics, Technische Universiteit Delft
    • Mathematisch Instituut, Universiteit Leiden
Open Access Article

DOI: 10.1007/s10955-012-0523-9

Cite this article as:
Redig, F. & Wang, F. J Stat Phys (2012) 147: 1094. doi:10.1007/s10955-012-0523-9

Abstract

We give new and explicitly computable examples of Gibbs-non-Gibbs transitions of mean-field type, using the large deviation approach introduced in (van Enter et al. in Mosc. Math. J. 10:687–711, 2010). These examples include Brownian motion with small variance and related diffusion processes, such as the Ornstein-Uhlenbeck process, as well as birth and death processes. We show for a large class of initial measures and diffusive dynamics both short-time conservation of Gibbsianness and dynamical Gibbs-non-Gibbs transitions.

Keywords

Dynamical Gibbs-non-Gibbs transition, Feng-Kurtz formalism, Bad configurations, Unique and non-unique histories

1 Introduction

Starting from [12], dynamical Gibbs-non-Gibbs transitions have been considered by several authors, see e.g. [2, 9, 11]. In these studies, one considers a lattice spin system started from a Gibbs measure μ at time zero and evolves it according to a Markovian dynamics (e.g. Glauber dynamics) with stationary Gibbs measure ν≠μ. The question is then whether \(\mu_{t}\), the time-evolved measure at time t>0, is a Gibbs measure. Typically this is the case for short times, whereas for longer times there can be transitions from Gibbs to non-Gibbs (loss) and back from non-Gibbs to Gibbs (recovery). The notion of a “bad configuration”, i.e., a point of essential discontinuity of the conditional probabilities of the measure \(\mu_{t}\), is crucial here. Such a configuration \(\eta_{\mathit{spec}}\) is typically identified by looking at the joint distribution of the system at time 0 and at time t. If, conditioned on \(\eta_{\mathit{spec}}\), the system at time zero has a phase transition, then typically \(\eta_{\mathit{spec}}\) is a bad configuration.

In the context of mean-field models, the authors in [6] started with an analysis of the most probable trajectories (in the sense of large deviations) of a system conditioned to arrive at time T at a given configuration. The setting of [6] is the Curie-Weiss model subjected to a spin-flip dynamics. A Gibbs-non-Gibbs transition is in this context rephrased as a phenomenon of “competing histories”, i.e., for special terminal conditions \(x_{\mathit{spec}}\) and times T not too small, multiple trajectories can minimize the rate function, and these trajectories can be selected by suitably approximating \(x_{\mathit{spec}}\). Multiple histories were then shown to lead to jumps in conditional probabilities indicating non-Gibbsian behavior in the mean-field setting, see e.g. [5, 7, 14]. These special conditionings leading to multiple histories are the analogue of “bad configurations” (essential points of discontinuity of conditional probabilities of the measure at time t) in the (lattice) Gibbs-non-Gibbs transition scenario. This “trajectory-large-deviation approach” has then been studied in more generality, including the lattice case, in [13].

In this paper, we apply the trajectory-large-deviation approach in several examples, both for diffusion processes and for birth and death processes. This leads to new and explicitly computable Gibbs-non-Gibbs transitions of mean-field type. For processes of diffusion type, we first treat an explicit example for the rate function of the initial measure, and as dynamics Brownian motion with small variance or the Ornstein-Uhlenbeck process. In all cases, we obtain the explicit form of the conditioned trajectories, and explicit formulas for the bad configuration and the time at which it becomes bad. In the case of general Markovian diffusion processes in a symmetric potential landscape, we show under reasonable conditions short-time Gibbsianness as well as appearance of bad configurations at large times. Next, we treat the case of continuous-time random walk with small increments, as arises e.g. naturally in the context of (properly rescaled) population dynamics. In that case, the Euler-Lagrange trajectories can be explicitly computed for some particular choices of the “birth and death” rates. Constant birth and death rates are the analogue of the Brownian motion case, whereas linear birth and death rates are the analogue of the Ornstein-Uhlenbeck process, but in that case the cost of optimal trajectories becomes a much more complicated expression.

Our paper is organized as follows. In Sect. 2 we introduce some elements of the Feng-Kurtz formalism, and define the notion of bad configurations in the present setting. In Sect. 3 we treat diffusion processes with small variance, with an explicit form for the initial rate function. In Sect. 3.3 we treat the case of Brownian motion dynamics with different cases for the rate function of the initial measure. In Sect. 4 we treat the Ornstein-Uhlenbeck process and diffusions with general drift. Finally, in Sect. 5, we treat one-dimensional random walks with small increments, such as rescaled birth and death processes.

2 The Feng-Kurtz Scheme, Euler-Lagrange Trajectories, Bad Configurations

We study Markov processes \(\{X^{n}_{t}: 0\leq t\leq T\}\) taking values in \(\mathbb{R}^{d}\), parametrized by a natural number n. This parameter tunes the “amount of noise” in the process, i.e., as n→∞, the process becomes deterministic, and the measure on trajectories satisfies the large deviation principle with rate n and with a rate function of the form
https://static-content.springer.com/image/art%3A10.1007%2Fs10955-012-0523-9/MediaObjects/10955_2012_523_Equ1_HTML.gif
(1)
This means more precisely that
https://static-content.springer.com/image/art%3A10.1007%2Fs10955-012-0523-9/MediaObjects/10955_2012_523_Equ2_HTML.gif
(2)
to be interpreted in the usual sense of the large deviation principle with a suitable topology on the set of trajectories. The form (1) naturally follows from the Markov property.

Notice that the form of the rate function does not depend on the choice of this topology. So one usually starts with the weakest topology, i.e., the product topology, and then, if possible, strengthens the topology by showing exponential tightness. See [1] for an illustration of this strategy in the context of theorems like Mogulskii’s theorem.

Since in this paper we are only interested in finding out optimal trajectories, i.e., minimizers of the rate function over a set of trajectories with prescribed terminal condition and open-start condition, we will not have to worry about the strongest topology in which the large deviation principle (2) holds, but we are rather after (as explicit as possible) solutions of Euler-Lagrange problems associated to the rate function.

In [4] a scheme is given to compute the “Lagrangian” L, see also [13] for an illustration of this scheme in the large-deviation view on Gibbs-non-Gibbs transitions. First one computes the “Hamiltonian”
https://static-content.springer.com/image/art%3A10.1007%2Fs10955-012-0523-9/MediaObjects/10955_2012_523_Equ3_HTML.gif
(3)
where https://static-content.springer.com/image/art%3A10.1007%2Fs10955-012-0523-9/MediaObjects/10955_2012_523_IEq2_HTML.gif is the generator of the process \(\{X^{n}_{t}: 0\leq t\leq T\}\) (acting on the x-variable), \(p\in\mathbb{R}^{d}\) is the “momentum”, and 〈.,.〉 denotes the inner product. Under regularity conditions on https://static-content.springer.com/image/art%3A10.1007%2Fs10955-012-0523-9/MediaObjects/10955_2012_523_IEq4_HTML.gif (e.g. strict convexity), the associated Lagrangian is then given by the Legendre transform
https://static-content.springer.com/image/art%3A10.1007%2Fs10955-012-0523-9/MediaObjects/10955_2012_523_Equ4_HTML.gif
(4)
As an example, consider
$$X^n_t= n^{-1/2}B_t $$
with generator
https://static-content.springer.com/image/art%3A10.1007%2Fs10955-012-0523-9/MediaObjects/10955_2012_523_Equb_HTML.gif
then we have
https://static-content.springer.com/image/art%3A10.1007%2Fs10955-012-0523-9/MediaObjects/10955_2012_523_Equc_HTML.gif
and associated Lagrangian
$$L(x,v)=\frac{v^2}{2} $$
which produces the rate function of the well-known Schilder’s theorem
$$\mathbb{P} \bigl(\bigl\{X^n_t:0\leq t\leq T\bigr\} \approx\gamma \bigr) \approx\exp \biggl(-\frac{n}{2}\int _0^T \dot{\gamma}_s^2 ds \biggr) $$
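The two steps of the scheme can be written out for this example in dimension one. The computation below assumes the usual Feng-Kurtz form of the Hamiltonian, \(H(x,p)=\lim_{n\to\infty}\frac{1}{n}e^{-npx}\mathcal{L}_{n}e^{npx}\), and the generator \(\mathcal{L}_{n}=\frac{1}{2n}\frac{d^{2}}{dx^{2}}\) of \(n^{-1/2}B_{t}\); the symbol \(\mathcal{L}_{n}\) is notation chosen here for illustration.
$$\frac{1}{n}e^{-npx}\,\frac{1}{2n}\frac{d^{2}}{dx^{2}}e^{npx}=\frac{1}{n}\cdot\frac{n^{2}p^{2}}{2n}=\frac{p^{2}}{2},\qquad L(x,v)=\sup_{p} \biggl(pv-\frac{p^{2}}{2} \biggr)=\frac{v^{2}}{2} $$
The supremum is attained at p=v, which gives the quadratic Lagrangian of Schilder’s theorem.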
To proceed, we also want the initial point of our process to have some fluctuations. More precisely, we need for the starting point of our process an initial measure \(\mu_{n}\) (depending on n) on \(\mathbb{R}^{d}\), satisfying the large deviation principle with rate n and rate function i(x), i.e., in the sense of large deviations, we assume
$$ \mathbb{P} \bigl(X^n_0\in A \bigr)= \mu_n(A)\approx\exp\Bigl(-n\inf_{x\in A} i(x)\Bigr) $$
(5)
We call the triple \((\{X^{n}_{t}:0\leq t\leq T\}, L,i)\) a stochastic system with small noise.

We continue now with the definition of a bad configuration in this framework. This is motivated by the definition of a bad configuration in the context of mean-field models [6], and can be viewed as the large-deviation rephrasing of “a phase transition at time zero conditioned on a special configuration at time T”.

Definition 2.1

Let \((\{X^{n}_{t}:0\leq t\leq T\},L,i)\) be a stochastic system with small noise. We say that a point \(b\in\mathbb{R}^{d}\) is bad at time T if the following two conditions hold.
  1. 1.

    Conditional on \(X^{n}_{T}=b\), \(X^{n}_{0}\) does not converge (as n→∞) to a point-mass in distribution.

     
  2. 2.

    There exist two sequences \(b^{+}_{k}\to b\), \(b^{-}_{k}\to b\) and δ>0 such that the variational distance between the distribution \(\mu(0,T;b^{+}_{k})\) of \(X^{n}_{0}|X^{n}_{T}=b^{+}_{k}\) and the distribution \(\mu(0,T;b^{-}_{k})\) of \(X^{n}_{0}|X^{n}_{T}=b^{-}_{k}\) is at least δ for k large enough.

     

The simplest example, which also follows the most common scenario, is where the distribution of \(X^{n}_{0}|X^{n}_{T}=b\) converges to \(\frac{1}{2}(\delta_{-a}+\delta_{a})\), and for c>b, \(X^{n}_{0}|X^{n}_{T}=c\) converges to \(\delta_{\alpha(c)}\) where α(c)→a as c↓b, whereas for c<b, \(X^{n}_{0}|X^{n}_{T}=c\) converges to \(\delta_{\alpha'(c)}\) where α′(c)→−a as c↑b. This means that, conditioned to be at location b at time T, the process has two “favorite” initial spots, which can be “selected” by approaching b from the right or from the left.

This is the analogue of a phase transition, where the phases can be selected by appropriately approximating the bad configuration, see [12].

3 Diffusion Processes with Small Variance Conditioned on the Future

In this section we present examples where \(X^{n}_{t}\) is a diffusion process. We show also how from the large deviation approach we gain a new understanding of “short-time Gibbsianness” for a general class of drifts of the diffusion, or initial rate functions.

3.1 Brownian Motion

To start with, we consider Brownian motion with small variance \(\frac{1}{n}\) starting from an initial distribution satisfying the large deviation principle (with rate n) with a non-convex rate function having two minima at locations −a,a, with a>0. More precisely, we consider the process
$$ X^n_t =\frac{1}{\sqrt{n}} B_t $$
(6)
starting from an initial distribution \(\mu_{n}\) such that, informally written,
$$ \mathbb{P} \bigl(X^n_0\in dx \bigr)= \mu_n (dx) \approx e^{-n i(x)} dx $$
(7)
For i we make the explicit choice:
$$ i(x)= \bigl(x^2-a^2\bigr)^2 $$
(8)
i.e., a non-convex, non-negative function with zeros at −a,a and a local maximum at x=0 (i(x) with a=2 is plotted in Fig. 1).
https://static-content.springer.com/image/art%3A10.1007%2Fs10955-012-0523-9/MediaObjects/10955_2012_523_Fig1_HTML.gif
Fig. 1

\(i(x)=(x^{2}-a^{2})^{2}\) with a=2

This specific choice is for the sake of explicit analytic computability but many results are true for a general class of rate functions that have a similar graph with two zeros located at −a,a and a maximum at zero.

More formally, we require that the sequence of initial probability measures \(\{\mu_{n}, n\in\mathbb{N}\}\) satisfies the large deviation principle with rate function i given by (8). Such rate functions arise naturally in the context of mean-field models with continuous spins and spin-Hamiltonian depending on the magnetization.

We are then interested in the most probable trajectory γ with initial point distributed according to \(\mu_{n}\) and final point \(\gamma_{T}=0\). More precisely, by application of Schilder’s theorem, the trajectory \(\{X^{n}_{t}: 0\leq t\leq T\}\) satisfies the LDP with rate function
https://static-content.springer.com/image/art%3A10.1007%2Fs10955-012-0523-9/MediaObjects/10955_2012_523_Equ9_HTML.gif
(9)
The optimal trajectory we are looking for is hence
https://static-content.springer.com/image/art%3A10.1007%2Fs10955-012-0523-9/MediaObjects/10955_2012_523_Equf_HTML.gif
The Euler-Lagrange trajectories (extrema of the cost \(\frac{1}{2}\int_{0}^{T}\dot{\gamma}_{s}^{2} ds\) corresponding to https://static-content.springer.com/image/art%3A10.1007%2Fs10955-012-0523-9/MediaObjects/10955_2012_523_IEq23_HTML.gif ) are linear in t:
$$\gamma_t = A+ Bt $$
By the terminal condition \(\gamma_{T}=0\), we have B=−A/T.
The cost https://static-content.springer.com/image/art%3A10.1007%2Fs10955-012-0523-9/MediaObjects/10955_2012_523_IEq24_HTML.gif of this trajectory can then be rewritten as a function of the starting point \(\gamma_{0}=A\):
https://static-content.springer.com/image/art%3A10.1007%2Fs10955-012-0523-9/MediaObjects/10955_2012_523_Equ10_HTML.gif
(10)
with
$$ \alpha(a,T)= \biggl(\frac{1}{2T}-2a^2 \biggr) $$
(11)
The behavior of this cost depends on the sign of α. If α≥0, then there is a unique minimum at A=0; this case corresponds to
$$T \leq\frac{1}{4a^2}=: T_{\mathit{crit}} $$
If α<0 then there are two minima \(A=A_{\pm}\) given by
$$ A_{\pm} = \pm\sqrt{-\alpha(a,T)/2}=\pm\sqrt{a^2-(4T)^{-1}} $$
(12)

We thus conclude that, as n→∞, the starting point is most probably 0 for small T and most (and equally) probably one of \(A_{\pm}\) for large T, which converge to ±a when T→∞. Hence we have non-uniqueness of histories.
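The formulas (11) and (12) are easy to check numerically. The sketch below is only an illustration: it evaluates the cost \(i(A)+\frac{A^{2}}{2T}\) of the straight path from A to 0 for the choice (8), and compares the closed-form minimizers with a brute-force grid search. The function names and the grid resolution are choices made here, not part of the text.

```python
import math

def alpha(a, T):
    """Coefficient alpha(a, T) = 1/(2T) - 2a^2 of (11)."""
    return 1.0 / (2.0 * T) - 2.0 * a * a

def t_crit(a):
    """Critical time 1/(4a^2), where alpha changes sign."""
    return 1.0 / (4.0 * a * a)

def cost_to_zero(A, a, T):
    """i(A) plus the cost A^2/(2T) of the straight path from A to 0."""
    return (A * A - a * a) ** 2 + A * A / (2.0 * T)

def optimal_starts(a, T):
    """Most probable starting points given X_T = 0, following (12)."""
    al = alpha(a, T)
    if al >= 0.0:
        return [0.0]
    root = math.sqrt(-al / 2.0)
    return [-root, root]

def grid_argmin(a, T, steps=40001):
    """Brute-force minimiser on [0, 2a]; by symmetry the half-line suffices."""
    grid = [2.0 * a * k / (steps - 1) for k in range(steps)]
    return min(grid, key=lambda A: cost_to_zero(A, a, T))

if __name__ == "__main__":
    a = 2.0
    for T in (0.05, 1.0, 100.0):
        print(T, optimal_starts(a, T), grid_argmin(a, T))
```

For a=2 the critical time is 1/16, so T=0.05 gives the single starting point 0, while T=1 gives two starting points at distance \(\sqrt{3.75}\) from the origin.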

Let us denote by μ(n,T,0) the distribution of \(X^{n}_{0}\) conditioned on \(X^{n}_{T}=0\). Then we have
  1. 1.
    Small times, unique history. If \(T\leq T_{\mathit{crit}}\) then
    $$\lim_{n\to\infty}\mu(n,T,0)=\delta_0 $$
     
  2. 2.
    Large times, non-unique history. If \(T>T_{\mathit{crit}}\) then
    $$\lim_{n\to\infty}\mu(n,T,0)= \frac{1}{2}(\delta_{A_{+}}+ \delta_{A_{-}}) $$
     
  3. 3.
    Limit of large times
    $$\lim_{T\to\infty}\lim_{n\to\infty}\mu(n,T,0)=\frac{1}{2}(\delta_{a}+ \delta_{-a}) $$
     
Let us now condition on \(X^{n}_{T}=b\neq 0\). Then the most probable trajectory is still a straight line \(\gamma^{b}_{t}=A+Bt\) but now with terminal condition A+BT=b, i.e., B=(b−A)/T. Its cost, expressed in terms of the starting point \(\gamma_{0}=A\), is
https://static-content.springer.com/image/art%3A10.1007%2Fs10955-012-0523-9/MediaObjects/10955_2012_523_Equ13_HTML.gif
(13)
This is the cost function https://static-content.springer.com/image/art%3A10.1007%2Fs10955-012-0523-9/MediaObjects/10955_2012_523_IEq29_HTML.gif of (10) plus a linear term \(-\frac{b}{T}A +\frac{b^{2}}{2T} \). Minimization of https://static-content.springer.com/image/art%3A10.1007%2Fs10955-012-0523-9/MediaObjects/10955_2012_523_IEq31_HTML.gif leads to the equation
$$ 4A^3 +2\alpha A = \frac{b}{T} $$
(14)
We then have two cases:
  1. 1.

    α≥0, i.e., \(T\leq T_{\mathit{crit}}\). Equation (14) has a unique real solution, corresponding to a unique minimum \(A_{b}\) of \(E_{b}(A)\). This minimum converges to zero as b→0. Hence, 0 is good for \(T\leq T_{\mathit{crit}}\).

     
  2. 2.

    α<0. For |b| small enough, Equation (14) has three real solutions. For b>0 we have one positive and two negative solutions. The positive solution, denoted \(A(+,b,T)>\sqrt{-\alpha/2}\), gives the minimum. The negative solutions correspond to a maximum and a local minimum. For b<0 the situation is exactly the opposite: the unique negative solution \(A(-,b,T)<-\sqrt {-\alpha/2}\) corresponds to the global minimum whereas the two positive solutions give a maximum and a local minimum. Hence 0 is bad for all \(T>T_{\mathit{crit}}\).

     
In particular, in the limit T→∞, the positive, resp. negative, minimum of the rate function of the distribution at time zero is selected by taking the right, resp. left, limit of the conditioning:
$$\lim_{c\to0 , c>0}\lim_{T\to\infty}\lim_{n\to\infty}\mathbb{P}\bigl(X^n_0 = \cdot| X^n_T = c\bigr)=\delta_a $$
and, similarly,
$$\lim_{c\to0 , c<0}\lim_{T\to\infty}\lim_{n\to\infty}\mathbb{P}\bigl(X^n_0 = \cdot| X^n_T = c\bigr)=\delta_{-a} $$
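The selection mechanism can be illustrated numerically. The following sketch minimizes the cost \(i(A)+\frac{(b-A)^{2}}{2T}\) of a straight path from A to b for the choice (8), and shows that the optimal starting point jumps from the negative to the positive well as b crosses 0 when T exceeds the critical time. The helper names, the grid size and the number of Newton steps are assumptions made for this illustration.

```python
import math

def cost(A, b, a, T):
    """i(A) + (b - A)^2 / (2T) with i(A) = (A^2 - a^2)^2."""
    return (A * A - a * a) ** 2 + (b - A) ** 2 / (2.0 * T)

def d_cost(A, b, a, T):
    """Derivative in A; its zeros solve 4A^3 + 2*alpha*A = b/T, cf. (14)."""
    return 4.0 * A * (A * A - a * a) - (b - A) / T

def d2_cost(A, a, T):
    """Second derivative in A (independent of b)."""
    return 12.0 * A * A - 4.0 * a * a + 1.0 / T

def optimal_start(b, a, T, steps=4001):
    """Global minimiser for b != 0: coarse grid search, then Newton polishing."""
    lo, hi = -2.0 * a - abs(b), 2.0 * a + abs(b)
    grid = [lo + (hi - lo) * k / (steps - 1) for k in range(steps)]
    A = min(grid, key=lambda x: cost(x, b, a, T))
    for _ in range(50):
        A -= d_cost(A, b, a, T) / d2_cost(A, a, T)
    return A

if __name__ == "__main__":
    a = 2.0
    for T in (0.05, 1.0):  # below and above the critical time 1/16
        print(T, optimal_start(-0.01, a, T), optimal_start(0.01, a, T))
```

Below the critical time both conditionings give starting points close to 0; above it they are separated by more than \(2\sqrt{-\alpha/2}\), however small |b| is.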

Summarizing our findings, let us denote by https://static-content.springer.com/image/art%3A10.1007%2Fs10955-012-0523-9/MediaObjects/10955_2012_523_IEq34_HTML.gif the set of bad configurations. Then we have

Theorem 3.1

  1. 1.

    Short times: no bad configurations. For \(T \leq\frac{1}{4a^{2}}\), https://static-content.springer.com/image/art%3A10.1007%2Fs10955-012-0523-9/MediaObjects/10955_2012_523_IEq36_HTML.gif .

     
  2. 2.

    Large times: unique bad configuration. For \(T > \frac{1}{4a^{2}}\), https://static-content.springer.com/image/art%3A10.1007%2Fs10955-012-0523-9/MediaObjects/10955_2012_523_IEq38_HTML.gif .

     

3.2 Brownian Motion with Constant Drift

The case of Brownian motion with constant drift V>0 is treated similarly. The Euler-Lagrange trajectories are once more linear in t, but the cost is now
$$i(\gamma_0) + \frac{1}{2}\int_0^T( \dot{\gamma}_s- V)^2 ds $$
which for \(\gamma_{t}=A+Bt\) ending in \(\gamma_{T}=b\) can be computed explicitly and gives
$$E_{b,V} (A)= \frac{T}{2} \biggl(\frac{b-A}{T}- V \biggr)^2 + i(A) $$
of which a similar analysis can be given. In particular, choosing b=VT we see that the cost reduces to \(\frac{A^{2}}{2T}+i(A)\), which is identical to the zero drift case conditioned to be at zero at time T, and hence this is a bad point for \(T>T_{\mathit{crit}}\), where \(T_{\mathit{crit}}\) is the same critical time as for the zero drift case. The analysis around this bad point is identical. Notice that the “limiting deterministic dynamics” is \(\dot{x}= V\) and the bad point \(x_{\mathit{spec}}=VT\) is precisely where this dynamics ends up at time T when started from zero.

3.3 Other Rate Functions for the Initial Measure and Corresponding Behavior of Brownian Motion

We now consider other possible scenarios for different rate functions associated to the initial measure, and for the Brownian motion with small variance as dynamics. The starting measure \(\mu_{n}(dx)\) satisfies the large deviation principle with rate function i(x). As a consequence, the minimizing trajectory to arrive at position b at time T is \(\gamma_{t}=Bt+A\) with B=(b−A)/T and has cost
https://static-content.springer.com/image/art%3A10.1007%2Fs10955-012-0523-9/MediaObjects/10955_2012_523_Equ15_HTML.gif
(15)
The following scenarios can then occur
  1. 1.

    i(A) is strictly convex: no bad configurations. Indeed, in that case https://static-content.springer.com/image/art%3A10.1007%2Fs10955-012-0523-9/MediaObjects/10955_2012_523_IEq40_HTML.gif is also strictly convex (as a sum of two strictly convex functions) and hence has a unique minimum. In this scenario, there are no bad configurations, and the optimal conditioned trajectory is always unique. This corresponds to “high-temperature initial measure” and “infinite-temperature dynamics”, which always conserves Gibbsianness.

     
  2. 2.
    Initial field: loss without recovery, with a compensating bad configuration. As an example we can take \(i(A)=(A^{2}-a^{2})^{2}+A+r\). For a>1, this rate function has one local minimum in the vicinity of x=a, a local maximum in the vicinity of x=0 and its (absolute) minimum in the vicinity of x=−a. This corresponds to an initial field (favoring the minimizer x=−a). i(A) with a=2 and r=2.01539 is plotted in Fig. 2. The minimization of https://static-content.springer.com/image/art%3A10.1007%2Fs10955-012-0523-9/MediaObjects/10955_2012_523_IEq41_HTML.gif leads to the equation
    $$ 4A^3+2\alpha(a,T) A = \frac{b}{T} - 1 $$
    (16)
    By an analysis of (16) similar to that of (14), we obtain that there is no bad point when \(T\leq T_{\mathit{crit}}=\frac{1}{4a^{2}}\), but b=T is bad for all \(T>T_{\mathit{crit}}\). The bad point “compensates” the initial field, and therefore has to become larger (and positive) when time T increases.
    https://static-content.springer.com/image/art%3A10.1007%2Fs10955-012-0523-9/MediaObjects/10955_2012_523_Fig2_HTML.gif
    Fig. 2

    \(i(A)=(A^{2}-a^{2})^{2}+A+r\) with a=2 and r=2.01539

     
  3. 3.
    Non-symmetric rate function. To see that the symmetry of the initial rate function is not a necessary requirement to produce bad configurations, we have the following example. Let \(i(A)=7A^{6}-24A^{5}+9A^{4}+38A^{3}-42A^{2}+40\) (see Fig. 3). This rate function has two global minima at A=−1 and A=2 and a local maximum at A=0. The cost function corresponding to trajectories arriving at b at time T is
    https://static-content.springer.com/image/art%3A10.1007%2Fs10955-012-0523-9/MediaObjects/10955_2012_523_Equp_HTML.gif
    For fixed b, and T large enough, this function has two local minima, located at \(A_{1}(b,T)<A_{2}(b,T)\). Let us denote, for fixed T,
    https://static-content.springer.com/image/art%3A10.1007%2Fs10955-012-0523-9/MediaObjects/10955_2012_523_Equq_HTML.gif
    If, as a function of b, \(D_{T}\) changes sign, then by continuity there must be a value of b where \(D_{T}(b)=0\), i.e., where the minima of https://static-content.springer.com/image/art%3A10.1007%2Fs10955-012-0523-9/MediaObjects/10955_2012_523_IEq43_HTML.gif are at equal height. This b is then a bad point at time T. For T=1 we have \(D_{T}(0.499)\approx-0.00182497<0\) and \(D_{T}(0.4999)\approx0.000868034>0\), so at T=1 there is a bad point at b∈(0.499,0.4999). We observe that this b depends on T and tends to 0.5 as T increases. From numerical computation, we have b∈(0.4999,0.49999) for T=4, b∈(0.49999,0.499999) for T=39 and b∈(0.499999,0.4999999) for T=1000.
    https://static-content.springer.com/image/art%3A10.1007%2Fs10955-012-0523-9/MediaObjects/10955_2012_523_Fig3_HTML.gif
    Fig. 3

    \(i(A)=7A^{6}-24A^{5}+9A^{4}+38A^{3}-42A^{2}+40\)

     
  4. 4.

    General symmetric rate function. For any rate function i(A) which is symmetric with respect to A=0 and which has its minima at A≠0, b=0 is bad when T is large enough. Indeed, by (15) the cost to arrive at 0 is \(i(A)+\frac{A^{2}}{2T}\), which is symmetric in A and whose minimum is attained at non-zero A as soon as T is large enough.

     
  5. 5.

    General short-time Gibbsianness. For every rate function i which is twice continuously differentiable with second derivative bounded from below, we show that for T small enough there is a unique minimum \(A_{b}\) of https://static-content.springer.com/image/art%3A10.1007%2Fs10955-012-0523-9/MediaObjects/10955_2012_523_IEq45_HTML.gif . This is the analogue of “short-time” Gibbsianness obtained in the lattice case via cluster expansions [10] or conditional Dobrushin uniqueness [8] and can be proved as follows.

    We look at the equation (see (15))
    $$ i'(A)=-\frac{A}{T}+\frac{b}{T}=:f(A) $$
    (17)
    Put \(d=\inf_{A} i''(A)\). If d≥0 the cost is strictly convex for every T>0, so assume d<0. Then we conclude, for
    $$ T<-\frac{1}{d} $$
    (18)
    that (17) has only one real solution \(A_{b}\). Indeed, suppose (17) had more than one real solution, and look at any two adjacent intersection points \(A_{1}\) and \(A_{2}\) of i′(A) and f(A). By the intermediate value theorem, we get
    $$ \min\bigl(i''(A_1),i''(A_2) \bigr)<-\frac{1}{T}<d=\inf_{A}i''(A) $$
    (19)
    This is a contradiction. Furthermore, because \(i''(A_{b})>-\frac{1}{T}\), we have
    https://static-content.springer.com/image/art%3A10.1007%2Fs10955-012-0523-9/MediaObjects/10955_2012_523_Equ20_HTML.gif
    (20)
    Therefore \(A_{b}\) is a minimum.
     
  6. 6.

    Non-Gibbsianness for all times. An example where b=0 is bad for all T>0 is i(0)=0, \(i(A)=\int_{0}^{|A|}|x|\cos^{2}\frac {1}{x} dx\) for A≠0. This follows from the facts that i″(A) is not bounded from below when A→0 and i(A) is symmetric about A=0. To see that indeed for all T>0, the value 0 is a bad point, we see that the line f(A)=−A/T always intersects the graph of the derivative of the rate function.
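The numerical values quoted for the non-symmetric rate function of scenario 3 can be reproduced with a few lines of code. The sketch below is an illustration under stated assumptions: it takes the cost of arriving at b to be \(i(A)+\frac{(b-A)^{2}}{2T}\) as in (15), reads \(D_{T}(b)\) as the cost at the left minimum minus the cost at the right minimum (a sign convention inferred from the quoted values), and locates the two local minima by bisection on fixed brackets that are only claimed to be valid for T≥1 and 0≤b≤1.

```python
def i_rate(A):
    """Rate function 7A^6 - 24A^5 + 9A^4 + 38A^3 - 42A^2 + 40 (Horner form)."""
    return ((((7.0 * A - 24.0) * A + 9.0) * A + 38.0) * A - 42.0) * A * A + 40.0

def d_i_rate(A):
    """Derivative 42A^5 - 120A^4 + 36A^3 + 114A^2 - 84A."""
    return ((((42.0 * A - 120.0) * A + 36.0) * A + 114.0) * A - 84.0) * A

def cost(A, b, T):
    """Initial cost plus cost of the straight path from A to b in time T."""
    return i_rate(A) + (b - A) ** 2 / (2.0 * T)

def d_cost(A, b, T):
    """Derivative of the cost with respect to the starting point A."""
    return d_i_rate(A) + (A - b) / T

def bisect_root(f, lo, hi, iters=200):
    """Root of f on [lo, hi], assuming f(lo) < 0 < f(hi)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def D(b, T):
    """Height difference between the local minima near A=-1 and A=2."""
    # Brackets assumed valid for T >= 1 and 0 <= b <= 1 (hypothetical choice).
    A1 = bisect_root(lambda A: d_cost(A, b, T), -2.0, -0.5)
    A2 = bisect_root(lambda A: d_cost(A, b, T), 1.0, 3.0)
    return cost(A1, b, T) - cost(A2, b, T)

def bad_point(T, iters=60):
    """Value of b in [0, 1] at which the two minima are at equal height."""
    return bisect_root(lambda b: D(b, T), 0.0, 1.0, iters)

if __name__ == "__main__":
    print(D(0.499, 1.0), D(0.4999, 1.0))
    for T in (1.0, 4.0, 39.0):
        print(T, bad_point(T))
```

Run as a script, this prints the two values of \(D_{T}\) at T=1 and the location of the bad point for a few times T, which can be compared with the intervals given in scenario 3.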

     

Remark 3.1

In order to understand better the connection between our large-deviation-based notion of badness and badness in the sense of conditional probabilities in the mean-field setting, we first remark that the initial measure \(\mu_{n}(dx)\approx e^{-ni(x)} dx\) can be produced as follows.

Start from an independent standard normal a priori measure on \(\mathbb{R}^{n}\)
$$\alpha_n (dx_1,\ldots, dx_n)= \prod _{i=1}^n \frac{e^{-\frac{1}{2} x_i^2}}{\sqrt{2\pi}} dx_i $$
Under this measure \(\alpha_{n}\), the “average magnetization” \(\overline{x}_{n}=(1/n)\sum_{i=1}^{n} x_{i}\) satisfies the large deviation principle with rate function \(\tilde{\i}(x)= x^{2}/2\). If we tilt the a priori measure \(\alpha_{n}\) with the function \(F= F(\overline{x}_{n})\), i.e., if we consider the measure
$$\mu_n^F= e^{nF(\overline{x}_n)} \alpha_n (dx_1,\ldots,dx_n) $$
then under \(\mu_{n}^{F}\), \(\overline{x}_{n}\) satisfies the large deviation principle with rate function \(i_{F}(x)=(\tilde{\i}(x)- F(x))- C\) where \(C=\inf_{x} (\tilde{\i} (x)-F(x))\). Making then the choice
$$F(x)=\frac{x^2}{2} - \bigl(x^2-a^2 \bigr)^2 $$
leads to a measure \(\mu_{n}^{F}\) such that \(\overline{x}_{n}\) satisfies the large deviation principle with rate function (8).

Next, if we start \((x_{1},\ldots,x_{n})\) from this measure \(\mu^{F}_{n}\) and apply independent Brownian motions \((W^{1}_{t},\ldots,W^{n}_{t})\), then the “magnetization” at time t>0 evolves exactly as the process \(X^{n}_{t}\) in (6).

Therefore, if we have at least two optimal trajectories conditioned to arrive at a certain magnetization at time T>0, and these trajectories can be selected by approximating this magnetization appropriately, then we have an essential discontinuity at this magnetization of the conditional distribution \(m\mapsto\mu^{F}_{n}(t) (dx_{1}|m)\) as a function of the magnetization \(m= (1/n)\sum_{i=2}^{n} x_{i}\). Such a discontinuity is referred to as non-Gibbsianness in the mean-field context, see [5, 7] for more details.
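The first two steps of this remark can be checked directly from the definitions above. For the choice of F made here,
$$\tilde{\i}(x)-F(x)=\frac{x^{2}}{2}-\frac{x^{2}}{2}+ \bigl(x^{2}-a^{2} \bigr)^{2}= \bigl(x^{2}-a^{2} \bigr)^{2},\qquad C=\inf_{x} \bigl(x^{2}-a^{2} \bigr)^{2}=0 $$
so that \(i_{F}\) coincides with (8). Likewise, the average of n independent standard Brownian motions is a centered Gaussian process with independent increments and variance t/n at time t, hence
$$\frac{1}{n}\sum_{i=1}^{n}W^{i}_{t}\ \stackrel{d}{=}\ \frac{1}{\sqrt{n}}B_{t} $$
as processes, which is the dynamics (6) for the increment of the magnetization.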

4 The Ornstein-Uhlenbeck Process

As a second example, we consider the process \(X^{n}_{t}\) to be the solution of
$$ dX_t= -\kappa X_t dt + \frac{1}{\sqrt{n}} dB_t $$
and the initial point distributed as in the previous section, in (7), (8).
The cost function for the large deviation principle of the trajectories now becomes
https://static-content.springer.com/image/art%3A10.1007%2Fs10955-012-0523-9/MediaObjects/10955_2012_523_Equ21_HTML.gif
(21)
The Euler-Lagrange trajectories extremizing \(\frac{1}{2} \int_{0}^{T} (\dot{\gamma}_{s} +\kappa\gamma_{s})^{2} ds\) are given by
$$\gamma_t= Ae^{\kappa t} + Be^{-\kappa t} $$
By the terminal condition \(\gamma_{T}=0\) we have
$$\gamma_t = -Be^{-2\kappa T}e^{\kappa t} + B e^{-\kappa t} $$
The cost function for such a trajectory can then be evaluated explicitly and gives
https://static-content.springer.com/image/art%3A10.1007%2Fs10955-012-0523-9/MediaObjects/10955_2012_523_Equ22_HTML.gif
(22)
where
https://static-content.springer.com/image/art%3A10.1007%2Fs10955-012-0523-9/MediaObjects/10955_2012_523_Equ23_HTML.gif
(23)
An analysis similar to that of the previous section can now be carried out. We have a unique minimum at B=0 of the cost function https://static-content.springer.com/image/art%3A10.1007%2Fs10955-012-0523-9/MediaObjects/10955_2012_523_IEq65_HTML.gif for
$$ T\leq T_{\mathit{crit}}:=-\frac{1}{2\kappa} \log \biggl( \frac{2a^2}{2a^2+\kappa} \biggr) $$
(24)
and for \(T>T_{\mathit{crit}}\), 0 becomes the unique bad point for this process.
The cost of an optimal trajectory ending up at b at time T can also be expressed as a function of the starting point \(\gamma_{0}\), which gives the explicit expression
https://static-content.springer.com/image/art%3A10.1007%2Fs10955-012-0523-9/MediaObjects/10955_2012_523_Equ25_HTML.gif
(25)
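As a sanity check on the critical time (24), one can integrate the path cost numerically along the Euler-Lagrange trajectory ending at 0. The sketch below is an illustration, not a substitute for (22)-(25): it parametrizes the trajectory by its starting point \(\gamma_{0}\), compares a Simpson-rule value of \(\frac{1}{2}\int_{0}^{T}(\dot{\gamma}_{s}+\kappa\gamma_{s})^{2}ds\) with the closed form \(\frac{\kappa\gamma_{0}^{2}}{e^{2\kappa T}-1}\) obtained by carrying out the integral, and verifies that at the critical time this quadratic term exactly balances the term \(-2a^{2}\gamma_{0}^{2}\) of the initial rate function (8). All function names are chosen here for illustration.

```python
import math

def t_crit_ou(a, kappa):
    """Critical time (24) for the Ornstein-Uhlenbeck dynamics."""
    return -math.log(2.0 * a * a / (2.0 * a * a + kappa)) / (2.0 * kappa)

def el_path(t, gamma0, kappa, T):
    """Euler-Lagrange path from gamma0 at time 0 to 0 at time T: (position, velocity)."""
    decay = math.exp(-2.0 * kappa * T)
    B = gamma0 / (1.0 - decay)
    up, down = math.exp(kappa * t), math.exp(-kappa * t)
    return B * (down - decay * up), -kappa * B * (down + decay * up)

def path_cost_simpson(gamma0, kappa, T, steps=2000):
    """Composite Simpson rule for (1/2) int_0^T (velocity + kappa*position)^2 ds."""
    h = T / steps  # steps must be even
    total = 0.0
    for k in range(steps + 1):
        g, v = el_path(k * h, gamma0, kappa, T)
        weight = 1.0 if k in (0, steps) else (4.0 if k % 2 else 2.0)
        total += weight * 0.5 * (v + kappa * g) ** 2
    return total * h / 3.0

def path_cost_closed(gamma0, kappa, T):
    """Closed form kappa * gamma0^2 / (exp(2*kappa*T) - 1) of the same integral."""
    return kappa * gamma0 ** 2 / (math.exp(2.0 * kappa * T) - 1.0)

if __name__ == "__main__":
    a, kappa = 2.0, 0.7
    print(t_crit_ou(a, kappa))
    print(path_cost_simpson(1.5, kappa, 0.5), path_cost_closed(1.5, kappa, 0.5))
```

In the limit κ→0 the critical time tends to \(\frac{1}{4a^{2}}\), the value found for Brownian motion.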

4.1 The Ornstein-Uhlenbeck Process with Constant External Field

With a constant external field, the equation for the process \(X^{n}_{t}\) reads
$$ dX^n_t= \bigl(-\kappa X^n_t + E\bigr)dt + \frac{1}{\sqrt{n}} dB_t $$
(26)
where E>0 is a constant representing a (constant) external field. As rate function of the initial measure we choose, as before, (8). The cost of the trajectory is now given by \(\int_{0}^{T} L (\gamma_{s},\dot{\gamma}_{s}) ds\) with \(L (\gamma_{s},\dot {\gamma}_{s}) = \frac{1}{2}(\dot{\gamma}_{s} +\kappa\gamma_{s}-E)^{2}\). The Euler-Lagrange trajectories are of the form
$$\gamma_t = Ae^{\kappa t} + Be^{-\kappa t} + \frac{E}{\kappa} $$
The trajectory cost of an Euler-Lagrange trajectory is given by \(\kappa A^{2}(e^{2\kappa T}-1)\). From this, we derive that the total cost of a trajectory to end up at time T in \(\gamma_{T}=b\) is given, as a function of \(\gamma_{0}\), by
https://static-content.springer.com/image/art%3A10.1007%2Fs10955-012-0523-9/MediaObjects/10955_2012_523_Equy_HTML.gif
The same analysis can then be performed. The “critical” time at which a unique bad point starts to appear is the same as in the zero-field case, i.e., given by (24). This bad point is given by
$$ b= \frac{E}{\kappa}\bigl(1-e^{-\kappa T}\bigr) $$
(27)
which corresponds to the point at which the deterministic evolution \(\dot{x}_{t}= -\kappa x_{t} +E\) arrives at time T when starting from \(x_{0}=0\). Notice that the total cost to arrive at this bad point b is given by
$$i(\gamma_0)+\frac{\kappa\gamma_0^2}{e^{2\kappa T}-1} $$
which is symmetric around \(\gamma_{0}=0\). Moreover, for T large the path cost contribution \(\frac{\kappa\gamma_{0}^{2}}{e^{2\kappa T}-1}\) vanishes exponentially fast, and hence for large T two minima exist.
The corresponding optimal trajectories to arrive at the bad point b are starting from
$$\gamma_0^{\pm} =\pm\sqrt{ a^2 - \frac{\kappa}{2(e^{2\kappa T} -1)}} $$
and explicitly given by
https://static-content.springer.com/image/art%3A10.1007%2Fs10955-012-0523-9/MediaObjects/10955_2012_523_Equab_HTML.gif
The trajectory with plus resp. minus sign can be selected by conditioning to arrive at \(b^{+}>b\), resp. \(b^{-}<b\), and letting \(b^{+}\downarrow b\), resp. \(b^{-}\uparrow b\). Here we plot a limiting process with a=2, κ=0.7, T=30, hence \(\gamma_{0}^{+}\approx2.0\), and a corresponding conditioned process with E=0.1, hence b≈0.142857, see Fig. 4.
https://static-content.springer.com/image/art%3A10.1007%2Fs10955-012-0523-9/MediaObjects/10955_2012_523_Fig4_HTML.gif
Fig. 4

A limiting process (the purple line) with a=2,κ=0.7,T=30, hence \(\gamma_{0}^{+}\approx2.0\), and a corresponding conditioned process (the blue line) with E=0.1, b≈0.142857 (Color figure online)
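The parameter values used for Fig. 4 can be reproduced from (27) and from the formula for \(\gamma_{0}^{\pm}\). The sketch below also integrates the deterministic evolution \(\dot{x}_{t}=-\kappa x_{t}+E\) from \(x_{0}=0\) with a classical Runge-Kutta scheme, to confirm that it ends at the bad point. The function names and the step count are choices made for this illustration.

```python
import math

def bad_point(E, kappa, T):
    """Bad point (27): b = (E/kappa) * (1 - exp(-kappa*T))."""
    return E / kappa * (1.0 - math.exp(-kappa * T))

def optimal_starts(a, kappa, T):
    """Starting points of the optimal trajectories arriving at the bad point."""
    c = a * a - kappa / (2.0 * (math.exp(2.0 * kappa * T) - 1.0))
    if c <= 0.0:  # at or below the critical time: unique start at 0
        return [0.0]
    return [-math.sqrt(c), math.sqrt(c)]

def deterministic_endpoint(E, kappa, T, steps=3000):
    """RK4 integration of dx/dt = -kappa*x + E from x(0) = 0 up to time T."""
    def f(x):
        return -kappa * x + E
    x, h = 0.0, T / steps
    for _ in range(steps):
        k1 = f(x)
        k2 = f(x + 0.5 * h * k1)
        k3 = f(x + 0.5 * h * k2)
        k4 = f(x + h * k3)
        x += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return x

if __name__ == "__main__":
    a, kappa, T, E = 2.0, 0.7, 30.0, 0.1
    print(bad_point(E, kappa, T))
    print(optimal_starts(a, kappa, T))
    print(deterministic_endpoint(E, kappa, T))
```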

4.2 General Drift

Let us now consider the process \(X^{n}_{t}\) with a general drift f(x) and variance \(\frac{1}{n}\), i.e., the solution of
$$dX_t =-f(X_t) dt +\frac{1}{\sqrt{n}} dB_t $$
We assume f:ℝ→ℝ to be Lipschitz, and odd: f(−x)=−f(x). For the rate function of the initial point \(X^{n}_{0}\) we choose as before (7), (8). The rate function of the trajectory is now given by
https://static-content.springer.com/image/art%3A10.1007%2Fs10955-012-0523-9/MediaObjects/10955_2012_523_Equ28_HTML.gif
(28)
and the minimization problem for the optimal trajectory ending at zero, \(\gamma_{T}=0\), now becomes to find
https://static-content.springer.com/image/art%3A10.1007%2Fs10955-012-0523-9/MediaObjects/10955_2012_523_Equ29_HTML.gif
(29)
The Euler-Lagrange equations for minimal cost trajectories are given by
$$\frac{d^2\gamma_s}{ds^2}= f(\gamma_s)f' ( \gamma_s) $$
These equations correspond to classical motion in a potential U satisfying U′=−ff′, which gives as a possible choice \(U= -\frac{1}{2} f^{2}\). Notice that this formal potential U has no physical meaning, but we need it if we want to translate the framework of the Euler-Lagrange equations to Hamilton equations. Indeed, the corresponding Hamiltonian is
$$ H(p,q)= \frac{p^2}{2} + U(q) $$
(30)
In particular, under the Euler-Lagrange equations,
$$ \frac{\dot{\gamma}^2_t}{2}- \frac{1}{2} \bigl(f(\gamma_t) \bigr)^2= E $$
(31)
is a constant of motion. Further, we have the open-start and terminal condition
$$ \begin{array}{rcl} i'(\gamma_0)&=& \dot{\gamma}_0 + f( \gamma_0) \\[9pt] \gamma_T&=&0 \end{array} $$
(32)
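For completeness, here is how these three statements follow, assuming (in line with (21) for f(x)=κx) that the Lagrangian of the trajectory cost is \(L(x,v)=\frac{1}{2}(v+f(x))^{2}\). The Euler-Lagrange equation reads
$$\frac{d}{ds}\frac{\partial L}{\partial v}=\frac{\partial L}{\partial x} \quad\Longleftrightarrow\quad \ddot{\gamma}_s+f'(\gamma_s)\dot{\gamma}_s= \bigl(\dot{\gamma}_s+f(\gamma_s) \bigr)f'(\gamma_s) \quad\Longleftrightarrow\quad \ddot{\gamma}_s=f(\gamma_s)f'(\gamma_s) $$
and along its solutions
$$\frac{d}{ds} \biggl(\frac{\dot{\gamma}_s^{2}}{2}-\frac{f(\gamma_s)^{2}}{2} \biggr)=\dot{\gamma}_s \bigl(\ddot{\gamma}_s-f(\gamma_s)f'(\gamma_s) \bigr)=0 $$
Finally, since the starting point is free, the boundary term of the first variation of \(i(\gamma_{0})+\int_{0}^{T}L(\gamma_{s},\dot{\gamma}_{s})ds\) must vanish at s=0:
$$i'(\gamma_0)-\frac{\partial L}{\partial v} (\gamma_0,\dot{\gamma}_0 )=0 \quad\Longleftrightarrow\quad i'(\gamma_0)=\dot{\gamma}_0+f(\gamma_0) $$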
We can think of these equations as having \(\gamma_{0}\) and E as parameters. The terminal condition then gives a relation between E and \(\gamma_{0}\). Notice that the trajectory of zero energy, E=0, γ≡0, is always a solution since f(0)=0. We want to show that under some reasonable assumptions, for T small, it is the only solution. For this we make the following assumptions. Call https://static-content.springer.com/image/art%3A10.1007%2Fs10955-012-0523-9/MediaObjects/10955_2012_523_IEq77_HTML.gif the collection of all trajectories γ:[0,T]→ℝ ending at 0, i.e., with \(\gamma_{T}=0\), and with “energy” E, i.e., such that
$$\frac{\dot{\gamma}^2_t}{2}- \frac{1}{2} \bigl(f(\gamma_t)\bigr)^2= E $$
for all 0≤t≤T. We now impose the following conditions.
  1. There exist a function φ:ℝ→[0,∞), a time \(T_{0}>0\) and a constant C>0 such that φ(0)=0, φ(E)>0 for \(E\not=0\), and such that for all \(T\leq T_{0}\) and for all \(\gamma\in S_{T}(E)\), \(\gamma_{0}\dot{\gamma}_{0}<0\),
    $$ |\dot{\gamma}_0|\geq\varphi(E) $$
    (33)
    and
    $$ |\gamma_0| \leq C\varphi(E) T $$
    (34)
  2. The drift function f is locally monotone around 0, i.e., there exists \(x_{0}>0\) such that f restricted to \([0,x_{0}]\) and to \([-x_{0},0]\) is monotone.
The first condition states that if T is small, and one wants to end at \(\gamma_{T}=0\) from \(\gamma_{0}>0\), then the derivative at time zero should be negative, and vice versa for \(\gamma_{0}<0\). The second part of the condition states that there exist a lower bound for the initial derivative and an upper bound for \(\gamma_{0}\).

Coming back to the previous examples: for the Brownian motion case, for all \(\gamma\in S_{T}(E)\) we have \(\gamma_{t}= \pm\sqrt{2E} (t-T)\), hence \(\gamma_{0}= \mp\sqrt{2E}T\), \(\dot{\gamma}_{0}=\pm\sqrt{2E}\), and we can choose \(\varphi(E)= \sqrt{2E}\). For the Ornstein-Uhlenbeck case we have \(\gamma_{t}=B\bigl(e^{\kappa t}-e^{\kappa(2T-t)}\bigr)\), E=−2ABα and if \(\gamma_{T}=0\) we find \(\gamma_{0}= \sqrt{2E/\kappa} \sinh(\kappa T)\), \(\dot{\gamma}_{0}= -\sqrt{2E/\kappa} \cosh(\kappa T)\), which clearly satisfies the conditions, with \(\varphi(E)=\sqrt {2E/\kappa}\).

The open-start condition requires
$$\dot{\gamma}_0 + f(\gamma_0)= 4\gamma_0 \bigl(\gamma_0^2-a^2\bigr) $$
Hence, for \(\gamma\in S_{T}(E)\) such that \(\gamma_{0}>0\):
https://static-content.springer.com/image/art%3A10.1007%2Fs10955-012-0523-9/MediaObjects/10955_2012_523_Equ35_HTML.gif
(35)
which is clearly a contradiction for T sufficiently small. Hence for T sufficiently small, there does not exist E≠0 with \(\gamma\in S_{T}(E)\) satisfying the open-start condition. As a consequence, under these assumptions, for small T the zero trajectory is the only solution of the minimization problem (29).
For large times, assume that the drift is such that from any starting point one can travel to the origin at arbitrarily small cost if one has sufficient time, i.e., for all \(x_{0}>0\),
$$\lim_{T\to\infty} \inf \biggl\{ \int_0^T \bigl(\dot{ \gamma}_s + f(\gamma_s)\bigr)^2 ds: \gamma_0= x_0, \gamma_T=0 \biggr\}=0 $$
Then for T large enough there exist \(x_{0}\not = 0\) and a trajectory γ starting from \(x_{0}\) and ending at \(\gamma_{T}=0\) such that \(i(x_{0})<i(0)/2\) and
$$ \int_0^T\bigl(\dot{ \gamma}_s + f(\gamma_s)\bigr)^2 ds < i(0)/2 $$
This trajectory γ clearly has lower cost than the zero trajectory, and by symmetry, −γ is a trajectory with identical cost. Therefore, 0 becomes a bad point.

5 Approximately Deterministic Walks in d=1

An “approximately deterministic random walk” is a continuous-time random walk with small increments performed at high rate, i.e., a random walk \(X^{N}_{t}\) on ℝ that, starting at X 0=x, makes increments of size ±1/N with rates Nb(x), resp. Nd(x). In other words, \(X^{N}_{t}\) is a Markov process on ℝ with generator
$$ \mathcal{L}_N f(x)= N b(x) \biggl(f \biggl(x+\frac{1}{N} \biggr)-f(x) \biggr) + N d(x) \biggl(f \biggl(x-\frac{1}{N} \biggr)-f(x) \biggr) $$
(36)

Such walks arise naturally in the context of population dynamics, see e.g. [3]. The notation b(x) and d(x) is also reminiscent of this interpretation and we will call these quantities birth resp. death rates.
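To illustrate the phrase “approximately deterministic”, the following sketch simulates such a walk by a direct (Gillespie-type) scheme: wait an exponential time with the total jump rate, then jump by +1/N or −1/N with probabilities proportional to the birth and death rates. The function names, the choice of rates \(b(x)=1-x\), \(d(x)=1+x\) and the parameter values used to try it out are illustrative assumptions.

```python
import math
import random

def birth(x):
    # Illustrative birth rate b(x) = 1 - x, clipped at zero (assumption).
    return max(1.0 - x, 0.0)

def death(x):
    # Illustrative death rate d(x) = 1 + x, clipped at zero (assumption).
    return max(1.0 + x, 0.0)

def simulate_walk(x0, T, N, b, d, rng):
    """Return X_T for the walk with jumps +-1/N at rates N*b(x), N*d(x)."""
    x, t = x0, 0.0
    while True:
        up, down = N * b(x), N * d(x)
        total = up + down
        if total <= 0.0:
            return x  # no jump possible any more
        t += rng.expovariate(total)  # exponential waiting time
        if t > T:
            return x
        # Jump up with probability up/total, otherwise down.
        x += 1.0 / N if rng.random() * total < up else -1.0 / N
```

For large N the output should be close to the solution of \(\dot{x}=b(x)-d(x)\), which for these rates is \(x_0e^{-2T}\); fluctuations are of order \(N^{-1/2}\).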

We ask then the same large deviation question, i.e., we start the process \(X^{N}_{t}\) from an initial distribution μ N satisfying the large deviation principle with rate function (8)—or some natural modification of it if we have to restrict the state space—and look for the minimizing trajectory(ies) that end at time T at the origin (or at a more general bad point if the dynamics has a drift, see later).

The large deviation function for the trajectories can be computed using the Feng-Kurtz scheme, i.e., denoting \(f^{N}_{p}(x)= e^{Npx}\) and writing \(\mathcal{L}_N\) for the generator (36), we compute the Hamiltonian
$$ H(x,p)= \lim_{N\to\infty} \frac{1}{N} \frac{(\mathcal{L}_N f^N_p)(x)}{f^N_p(x)} = b(x) \bigl(e^{p}-1\bigr) + d(x) \bigl(e^{-p}-1\bigr) $$
(37)
and the corresponding Lagrangian
$$ L(x,v)= \sup_{p\in\mathbb{R}} \bigl(pv - H(x,p)\bigr) = v\log \biggl(\frac{v+\sqrt{v^2+4b(x)d(x)}}{2b(x)} \biggr) - \sqrt{v^2+4b(x)d(x)} + b(x)+d(x) $$
(38)
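The Legendre transform defining the Lagrangian can be checked numerically. The sketch below assumes the Hamiltonian \(H(x,p)=b(x)(e^{p}-1)+d(x)(e^{-p}-1)\), as in the Hamilton equations (41), and compares a closed-form expression for \(\sup_{p}(pv-H)\), obtained by solving \(v=be^{p}-de^{-p}\) for p, with a brute-force maximization over a grid. The function names, the grid and the values of b, d and v are illustrative assumptions.

```python
import math

def hamiltonian(p, b, d):
    # H at a fixed position x, with b = b(x) and d = d(x).
    return b * (math.exp(p) - 1.0) + d * (math.exp(-p) - 1.0)

def lagrangian(v, b, d):
    # Closed form of sup_p (p*v - H): the maximizer solves v = b e^p - d e^{-p}.
    root = math.sqrt(v * v + 4.0 * b * d)
    p_star = math.log((v + root) / (2.0 * b))
    return v * p_star - root + b + d

def lagrangian_brute_force(v, b, d, p_max=5.0, n=10001):
    # Maximize p*v - H over an equally spaced grid in [-p_max, p_max].
    best = -math.inf
    for k in range(n):
        p = -p_max + 2.0 * p_max * k / (n - 1)
        best = max(best, p * v - hamiltonian(p, b, d))
    return best
```

In particular `lagrangian(b - d, b, d)` vanishes, in line with the fact that the velocity \(b(x)-d(x)\) of the limiting differential equation carries no cost.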
For the trajectories of \(\{X^{N}_{t}:0\leq t\leq T\}\), we have
$$ \mathbb{P}\bigl(X^N_{\cdot}\approx\gamma\bigr) \approx e^{-N\int_0^T L (\gamma_s, \dot {\gamma}_s)\,ds} $$
(39)
where the informal notation has to be interpreted as usual in the sense of the large deviation principle.
The equations for the optimal trajectories, i.e. for the minimizers of the “action”
$$ \int_0^T L (\gamma_s, \dot{\gamma}_s)\,ds $$
(40)
can now more conveniently be written in terms of the Hamiltonian (the Lagrangian is a more complicated expression to deal with).
Introducing the canonical coordinates (x,p) we have the Hamilton equations, together with the terminal condition and the open-start condition corresponding to the choice of the distribution of \(X^{N}_{0}\).
$$ \everymath{\displaystyle} \begin{array}{rcl} \dot{x}_t&=&\frac{\partial H}{\partial p} (x_t,p_t)= b(x)e^p-d(x)e^{-p} \\[9pt] \dot{p}_t &=&-\frac{\partial H}{\partial x} (x_t,p_t)=-b'(x) \bigl(e^p-1\bigr)-d'(x) \bigl(e^{-p}-1\bigr) \end{array} $$
(41)
with conditions
$$ \everymath{\displaystyle} \begin{array}{rcl} x_T&=&0 \\[9pt] p_0 &=& i'(x_0)= 4x_0 \bigl(x_0^2-a^2\bigr) \end{array} $$
(42)
where \(i\) is the quartic rate function from (8). The total “energy” is a constant of motion along minimizing trajectories, so we put H(x,p)=E and we can rewrite the Hamilton equations (41) as
$$ \everymath{\displaystyle} \begin{array}{rcl} E+b(x)+d(x)+\dot{x}&=& 2b(x) u \\[9pt] E+b(x)+d(x)-\dot{x}&=& 2d(x) u^{-1} \end{array} $$
(43)
where u=e p . This leads to
$$ \dot{x}^2 = E^2 + 2E \bigl(b(x)+d(x)\bigr) + \bigl(b(x)-d(x)\bigr)^2 $$
(44)
So we can now think of the cost of a trajectory as a function of two parameters: the starting point and the energy, \((x_{0},E)\). Zero energy corresponds to the “typical trajectory” following the limiting differential equation \(\dot{x}= b(x)-d(x)\), which means that the cost of the Lagrangian part of the rate function is zero, and only the cost due to the starting point \(x_{0}\) has to be paid. Non-zero energy trajectories have a strictly positive cost of the Lagrangian part of the rate function. The additional terminal condition \(x_{T}=b\), where b denotes the prescribed end point (b=0 in (42), not to be confused with the birth rate b(x)), eliminates one of these variables (e.g. E), so that we can think of the cost of the trajectory as a function of a single variable (e.g. \(x_{0}\)).
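Both the conservation of H along the Hamilton equations (41) and the identity (44) can be checked numerically. The sketch below integrates (41) with a classical Runge-Kutta step for the illustrative rates \(b(x)=\gamma(1-x)\), \(d(x)=1+x\); the value of γ, the initial data, the time horizon and the function names are assumptions made for this example only.

```python
import math

GAMMA = 2.0  # illustrative field strength (assumption of this sketch)

def b(x):
    return GAMMA * (1.0 - x)

def d(x):
    return 1.0 + x

def db(x):
    return -GAMMA  # derivative of b

def dd(x):
    return 1.0  # derivative of d

def hamiltonian(x, p):
    return b(x) * (math.exp(p) - 1.0) + d(x) * (math.exp(-p) - 1.0)

def flow(x, p):
    # Right-hand sides of the Hamilton equations (41).
    xdot = b(x) * math.exp(p) - d(x) * math.exp(-p)
    pdot = -db(x) * (math.exp(p) - 1.0) - dd(x) * (math.exp(-p) - 1.0)
    return xdot, pdot

def rk4(x, p, T, steps=1000):
    h = T / steps
    for _ in range(steps):
        k1x, k1p = flow(x, p)
        k2x, k2p = flow(x + 0.5 * h * k1x, p + 0.5 * h * k1p)
        k3x, k3p = flow(x + 0.5 * h * k2x, p + 0.5 * h * k2p)
        k4x, k4p = flow(x + h * k3x, p + h * k3p)
        x += h * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        p += h * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
    return x, p

def residual_44(x, p):
    # Left-hand side minus right-hand side of (44), with E = H(x, p).
    E = hamiltonian(x, p)
    xdot, _ = flow(x, p)
    s, r = b(x) + d(x), b(x) - d(x)
    return xdot ** 2 - (E * E + 2.0 * E * s + r * r)
```

The time horizon is kept short on purpose: for these rates the momentum equation can blow up in finite time, so a fixed-step integrator is only trustworthy well before that.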

We now concentrate on three important particular cases.

5.1 Constant Birth and Death Rates

If b and d do not depend on x, then the equation for the momentum shows that \(p_{t}=C\) is constant, hence we have linear Euler-Lagrange trajectories, and correspondingly the same analysis and phenomena as in the Brownian motion case of the previous section.

5.2 Mean-Field Independent Spin Flips

A special case, corresponding to independent spin-flip dynamics, is b(x)=(1−x), d(x)=(1+x). Moreover, the x-variable is now restricted to [−1,1]. As in the case x∈ℝ we assume that initially \(x_{0}\) is distributed according to a measure \(\mu_{N}(dx)\) on [−1,1] satisfying the large deviation principle with the non-convex rate function (8) for x∈[−1,1] and +∞ otherwise. In particular, a∈(0,1).

The Hamilton equations then read
$$ \everymath{\displaystyle} \begin{array}{rcl} \dot{x}&=& -x\bigl(e^p+e^{-p}\bigr) + e^p-e^{-p} \\[9pt] \dot{p}&=&e^p-e^{-p} \end{array} $$
(45)
Taking the time derivative of the first equation and using the second equation eliminates p and leads to the simple second-order equation for x:
$$ \frac{d^2 x}{dt^2}= 4x $$
(46)
with solutions
$$x(t)= C_1e^{2t} + C_2 e^{-2t} $$
where C 1,C 2 are determined by the open-start condition and the terminal condition. This case was treated before in the context of the Curie-Weiss model subjected to independent spin flips in [6, 10].
The equation for the momentum can be integrated and gives
$$\tanh(p_t/2)= \pm C e^{2t} $$
Furthermore, since
$$E= (1-x) \bigl(e^p-1\bigr) + (1+x) \bigl( e^{-p}-1\bigr) $$
is a constant of motion, we find as possible solutions for x, using that x T =0:
$$x_t =\pm\sqrt{\frac{E}{4} \biggl(1+\frac{E}{4} \biggr)}\, \bigl(e^{2(t-T)}-e^{2(T-t)} \bigr) $$
In particular, as in the Brownian motion case, the zero-energy trajectory (E=0) yields x t =0. The relation between the energy, initial position and initial momentum is
$$p_0= \log \biggl(\frac{2+E +\sqrt{(2+E)^2 - 4(1-x_0^2)}}{2(1-x_0)} \biggr) $$
Zero-energy thus corresponds to zero initial momentum and zero initial position.
In general, the initial points are symmetrically distributed around the origin and related to the energy via
$$x_0= \pm\sqrt{\frac{E}{4} \biggl(1+\frac{E}{4} \biggr)}\, \bigl(e^{-2T}-e^{2T} \bigr) $$
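The closed-form expressions of this subsection can be cross-checked numerically. The sketch below evaluates the trajectory with prefactor \(\sqrt{(E/4)(1+E/4)}\), the initial momentum \(p_0\) as a function of E and \(x_0\), and the Hamiltonian for \(b(x)=1-x\), \(d(x)=1+x\). The function names and the sample values of E and T are illustrative assumptions.

```python
import math

def amplitude(E):
    # Prefactor sqrt((E/4) * (1 + E/4)) of the explicit solution.
    return math.sqrt((E / 4.0) * (1.0 + E / 4.0))

def x_path(t, E, T, sign=1.0):
    # Trajectory with energy E ending at x_T = 0; sign selects the +- branch.
    return sign * amplitude(E) * (math.exp(2.0 * (t - T)) - math.exp(2.0 * (T - t)))

def p0_from(E, x0):
    # Initial momentum as a function of the energy and the initial position.
    root = math.sqrt((2.0 + E) ** 2 - 4.0 * (1.0 - x0 * x0))
    return math.log((2.0 + E + root) / (2.0 * (1.0 - x0)))

def hamiltonian(x, p):
    # H(x, p) for b(x) = 1 - x and d(x) = 1 + x.
    return (1.0 - x) * (math.exp(p) - 1.0) + (1.0 + x) * (math.exp(-p) - 1.0)
```

With these definitions, `hamiltonian(x0, p0_from(E, x0))` returns E for `x0 = x_path(0.0, E, T)`, and a finite-difference second derivative of `x_path` reproduces (46).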
Whether or not a non-zero energy solution can be the minimizer is determined by the open-start condition:
$$ p_0= i'(x_0)= 4x_0\bigl(x_0^2-a^2\bigr) $$
(47)
This can be viewed now as an equation for E. For small T>0,
$$x_0= x_0(E,T)\approx C(E) T,\qquad p_0= p_0(E,T)\approx cE $$
which implies that a non-zero energy solution of (47) cannot exist for small T. For large T, a non-zero energy solution exists, yielding two symmetric solutions for \(x_{0}\).
Alternatively, the trajectory cost \(C_{T}(\gamma_{0})\) of a trajectory starting at \(\gamma_{0}\) and ending at time T at b=0 has the following important properties:
  1. Symmetry: \(C_{T}(-\gamma_{0})=C_{T}(\gamma_{0})\).
  2. Small time behavior: \(\lim_{T\to0} C_{T}(\gamma_{0})=\infty\) for all \(\gamma_{0}\neq0\).
  3. Large time behavior: \(\lim_{T\to\infty} C_{T}(\gamma_{0})=0\) for all \(\gamma_{0}\).
From these properties it follows that for small T there are no bad points, and for large T zero is the unique bad point. Notice that contrary to the Curie-Weiss model situation analyzed in [6] there are no non-neutral (non-zero) bad configurations due to the fact that the rate function of the initial measure is here simply a fourth-order polynomial.
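The appearance of two optimal starting points can be made visible numerically. The sketch below evaluates the total cost \(i(\gamma_0)+C_T(\gamma_0)\) on a grid of starting points and returns the minimizing one. It rests on several assumptions that are not stated in this form in the text: the initial rate function is taken as \(i(x)=(x^2-a^2)^2\), which has the derivative \(4x(x^2-a^2)\) used in (47); the Lagrangian is the closed-form Legendre transform of \(H(x,p)=(1-x)(e^p-1)+(1+x)(e^{-p}-1)\); the path from \(\gamma_0\) to 0 is the solution \(\gamma_0\sinh(2(T-t))/\sinh(2T)\) of (46); and the value a=0.5, the grid and the quadrature rule are arbitrary choices.

```python
import math

A = 0.5  # illustrative value of a in (0,1) (assumption)

def initial_rate(x):
    # Quartic with derivative 4x(x^2 - a^2); the additive constant is assumed.
    return (x * x - A * A) ** 2

def lagrangian(x, v):
    # Legendre transform of H for b(x) = 1 - x, d(x) = 1 + x, valid for |x| < 1.
    b, d = 1.0 - x, 1.0 + x
    root = math.sqrt(v * v + 4.0 * b * d)
    return v * math.log((v + root) / (2.0 * b)) - root + b + d

def path_cost(g0, T, steps=400):
    # Midpoint rule for the action along x(t) = g0 sinh(2(T-t)) / sinh(2T).
    h = T / steps
    s = math.sinh(2.0 * T)
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        x = g0 * math.sinh(2.0 * (T - t)) / s
        v = -2.0 * g0 * math.cosh(2.0 * (T - t)) / s
        total += lagrangian(x, v) * h
    return total

def best_start(T, n=191, g_max=0.95):
    # Grid search for the starting point with the lowest total cost.
    grid = [-g_max + 2.0 * g_max * k / (n - 1) for k in range(n)]
    return min(grid, key=lambda g: initial_rate(g) + path_cost(g, T))
```

For a short horizon such as T=0.1 the grid minimizer is the origin, whereas for T=1 it sits away from the origin, with its mirror image having the same cost by symmetry. The grid search returns only one of the two.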

5.3 Independent Spin-Flips in a Field

This corresponds to the choice b(x)=γ(1−x), d(x)=(1+x), x∈[−1,1]. Here γ>1 corresponds to a bias in the plus direction (positive magnetic field). The limiting deterministic trajectory is given by
$$ \everymath{\displaystyle} \begin{array}{rcl} \frac{dx_t}{dt}&=& -(1+\gamma) x_t + (\gamma-1) \\[6pt] x_t& =& x_0 e^{-(1+\gamma)t} + \frac{\gamma-1}{1+\gamma} \bigl(1-e^{-(1+\gamma)t} \bigr) \end{array} $$
(48)
This is the zero-energy trajectory starting from x 0.
Using (44) we find that for a given energy E, the solution for x is of the form
$$ x_t= x(E,C,t)= C_1 e^{ t(1+\gamma)} + C_2 e^{- t(1+\gamma)} + C_3 $$
(49)
with
https://static-content.springer.com/image/art%3A10.1007%2Fs10955-012-0523-9/MediaObjects/10955_2012_523_Equ50_HTML.gif
(50)
where C is an integration constant.

Remark 5.1

  1. For E=0 we have \(C_{3}=(\gamma-1)(1+\gamma)^{-1}\), which corresponds to the limiting value of the zero-energy trajectory.
  2. If γ=1 and E≠0, we find \(C_{3}=0\) and recover the solution of the form \(C_{1}e^{2t}+C_{2}e^{-2t}\) corresponding to the optimal trajectories of the independent spin-flip dynamics.
The general form of an optimal trajectory arriving at time T at x T =b and starting from x 0=γ 0 is
$$x(t)= (b-C_3) \frac{\sinh(\delta t)}{\sinh(\delta T)} + (\gamma_0-C_3) \frac{\sinh(\delta(T-t))}{\sinh(\delta T)} +C_3 $$
with δ=(1+γ) and where C 3 is given in (50). Notice the analogy with the case of the Ornstein-Uhlenbeck process in a constant field (27). As in that case, the bad point is time-dependent and given by
$$b= \frac{\gamma-1}{\gamma+1} \bigl(1-e^{-\delta T} \bigr) $$
which is the point at which the limiting deterministic dynamics arrives at time T when started from x 0=0. The trajectory cost C T (γ 0) to arrive at this bad point satisfies the same properties as the trajectory cost C T (γ 0) of the previous subsection (zero-field case). Hence, for T large two minimizing γ 0 of the total cost function appear which correspond to two optimal trajectories.
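The formulas of this subsection are easy to evaluate. The sketch below implements the zero-energy trajectory (48), the time-dependent bad point, and the general form of a trajectory between prescribed end points. The constant \(C_3\) is passed in as an argument rather than computed from (50); the function and parameter names are illustrative assumptions.

```python
import math

def deterministic_x(x0, t, gamma):
    # Zero-energy trajectory (48) started from x0.
    delta = 1.0 + gamma
    decay = math.exp(-delta * t)
    return x0 * decay + (gamma - 1.0) / (gamma + 1.0) * (1.0 - decay)

def bad_point(T, gamma):
    # Time-dependent bad point b = (gamma-1)/(gamma+1) * (1 - exp(-delta*T)).
    delta = 1.0 + gamma
    return (gamma - 1.0) / (gamma + 1.0) * (1.0 - math.exp(-delta * T))

def optimal_path(t, T, start, end, c3, gamma):
    # General form of a trajectory from `start` at time 0 to `end` at time T;
    # c3 is supplied by the caller (hypothetical parameter of this sketch).
    delta = 1.0 + gamma
    s = math.sinh(delta * T)
    return ((end - c3) * math.sinh(delta * t) / s
            + (start - c3) * math.sinh(delta * (T - t)) / s
            + c3)
```

With the zero-energy value \(C_3=(\gamma-1)/(\gamma+1)\), start 0 and end `bad_point(T, gamma)`, the function `optimal_path` coincides with `deterministic_x(0.0, t, gamma)`.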

Acknowledgement

We thank Aernout van Enter and Olaf de Leeuw for useful discussions and suggestions.

Open Access

This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.

Copyright information

© The Author(s) 2012