A Thermodynamic Formalism for Continuous Time Markov Chains with Values on the Bernoulli Space: Entropy, Pressure and Large Deviations

Journal of Statistical Physics

Abstract

In this paper we analyze the ergodic properties of continuous time Markov chains with values on the one-dimensional spin lattice \(\{1,\dots,d\}^{{\mathbb{N}}}\) (also known as the Bernoulli space). Initially, we consider as the infinitesimal generator the operator \(\mathcal{L}_{A}-I\), where \(\mathcal{L}_{A}\) is a discrete time Ruelle operator (transfer operator) and \(A:\{1,\dots,d\}^{{\mathbb{N}}}\to\mathbb{R}\) is a given fixed Lipschitz function. The associated continuous time stationary Markov chain will define the a priori probability.

Given a Lipschitz interaction \(V:\{1,\dots,d\}^{{\mathbb{N}}}\to\mathbb{R}\), we are interested in the Gibbs (equilibrium) state for such V. This will be another continuous time stationary Markov chain. In order to analyze this problem we use a continuous time Ruelle operator (transfer operator) naturally associated to V. Among other things, we show that a continuous time Perron-Frobenius Theorem holds when V is a Lipschitz function.

We also introduce an entropy, which is negative (see also Lopes et al., Entropy and variational principle for one-dimensional lattice systems with a general a priori probability: positive and zero temperature, arXiv, 2012), and we consider a variational principle of pressure. Finally, we analyze large deviations properties for the empirical measure in the continuous time setting using results by Y. Kifer (Trans. Am. Math. Soc. 321(2):505–524, 1990). In the last appendix of the paper we explain why the techniques we develop here can be applied to the analysis of convergence of a certain version of the Metropolis algorithm.


References

  1. Baraviera, A., Leplaideur, R., Lopes, A.O.: Selection of ground states in the zero temperature limit for a one-parameter family of potentials. SIAM J. Appl. Dyn. Syst. 11(1), 243–260 (2012)

  2. Baraviera, A., Exel, R., Lopes, A.: A Ruelle operator for continuous time Markov chains. São Paulo J. Math. Sci. 4(1), 1–16 (2010)

  3. Baraviera, A.T., Cioletti, L.M., Lopes, A., Mohr, J., Souza, R.R.: On the general one-dimensional XY model: positive and zero temperature, selection and non-selection. Rev. Math. Phys. 23(10), 1063–1113 (2011)

  4. Baladi, V.: Positive Transfer Operators and Decay of Correlations. World Scientific, Singapore (2000)

  5. Baladi, V., Smania, D.: Linear response formula for piecewise expanding unimodal maps. Nonlinearity 21(4), 677–711 (2008)

  6. Berger, N., Kenyon, C., Mossel, E., Peres, Y.: Glauber dynamics on trees and hyperbolic graphs. Probab. Theory Relat. Fields 131(3), 311–340 (2005)

  7. Contreras, G., Lopes, A.O., Thieullen, Ph.: Lyapunov minimizing measures for expanding maps of the circle. Ergod. Theory Dyn. Syst. 21, 1379–1409 (2001)

  8. Craizer, M.: Teoria Ergódica das Transformações Expansoras. Master's dissertation, IMPA, Rio de Janeiro (1985)

  9. Dembo, A., Zeitouni, O.: Large Deviations Techniques and Applications. Springer, Berlin (1998)

  10. Donsker, M., Varadhan, S.: Asymptotic evaluation of certain Markov process expectations for large time, I. Commun. Pure Appl. Math. 28, 1–47 (1975)

  11. Deuschel, J.-D., Stroock, D.: Large Deviations. Academic Press, Boston (1989)

  12. den Hollander, F.: Large Deviations. AMS, Providence (2000)

  13. Diaconis, P., Saloff-Coste, L.: Nash inequalities for finite Markov chains. J. Theor. Probab. 9(2) (1996)

  14. Diaconis, P., Saloff-Coste, L.: What do we know about the Metropolis algorithm? J. Comput. Syst. Sci. 57(1), 2–36 (1998). 27th Annual ACM Symposium on the Theory of Computing (STOC 95), Las Vegas, NV

  15. Dupuis, P., Liu, Y.: On the large deviation rate function for the empirical measures of reversible jump Markov processes. arXiv (2013)

  16. Ellis, R.: Entropy, Large Deviations, and Statistical Mechanics. Springer, Berlin (1985)

  17. Ethier, S.N., Kurtz, T.G.: Markov Processes: Characterization and Convergence. Wiley, New York (2005)

  18. Feng, J., Kurtz, T.: Large Deviations for Stochastic Processes. AMS, Providence (2006)

  19. Kifer, Y.: Large deviations in dynamical systems and stochastic processes. Trans. Am. Math. Soc. 321(2), 505–524 (1990)

  20. Kifer, Y.: Principal eigenvalues, topological pressure, and stochastic stability of equilibrium states. Isr. J. Math. 70(1), 1–47 (1990)

  21. Kipnis, C., Landim, C.: Scaling Limits of Interacting Particle Systems. Grundlehren der Mathematischen Wissenschaften, vol. 320. Springer, Berlin (1999)

  22. Lebeau, G.: Introduction à l'analyse de l'algorithme de Metropolis. Preprint. http://math.unice.fr/~sdescomb/MOAD/CoursLebeau.pdf

  23. Léonard, C.: Large deviations for Poisson random measures and processes with independent increments. Stoch. Process. Appl. 85, 93–121 (2000)

  24. Liggett, T.: Continuous Time Markov Processes. AMS, Providence (2010)

  25. Lopes, A., Mengue, J., Mohr, J., Souza, R.R.: Entropy and variational principle for one-dimensional lattice systems with a general a priori probability: positive and zero temperature. arXiv (2012)

  26. Lopes, A.O.: An analogy of the charge distribution on Julia sets with the Brownian motion. J. Math. Phys. 30(9), 2120–2124 (1989)

  27. Lopes, A.O.: Entropy and large deviations. Nonlinearity 3(2), 527–546 (1990)

  28. Lopes, A.O., Mohr, J., Souza, R., Thieullen, Ph.: Negative entropy, zero temperature and stationary Markov chains on the interval. Bull. Braz. Math. Soc. 40, 1–52 (2009)

  29. Lopes, A.O.: Thermodynamic formalism, maximizing probabilities and large deviations. Lecture Notes, Dynamique en Cornouaille (2012, in press)

  30. Parry, W., Pollicott, M.: Zeta functions and the periodic orbit structure of hyperbolic dynamics. Astérisque 187–188 (1990)

  31. Protter, P.: Stochastic Integration and Differential Equations. Springer, Berlin (1990)

  32. Randall, D., Tetali, P.: Analyzing Glauber dynamics by comparison of Markov chains. J. Math. Phys. 41(3), 1598–1615 (2000)

  33. Revuz, D., Yor, M.: Continuous Martingales and Brownian Motion, 3rd edn. Springer, Berlin (1999)

  34. Ruelle, D.: Thermodynamic Formalism. Addison-Wesley, Reading (1978)

  35. Stroock, D.: An Introduction to Markov Processes. Springer, Berlin (2000)

  36. Stroock, D.: An Introduction to the Theory of Large Deviations. Springer, Berlin (1984)

  37. Stroock, D., Zegarlinski, B.: On the ergodic properties of Glauber dynamics. J. Stat. Phys. 81(5–6), 1007–1019 (1995)

  38. Yosida, K.: Functional Analysis. Springer, Berlin (1978)

Author information

Correspondence to Artur Lopes.

Appendices

Appendix A: The Spectrum of the Infinitesimal Generator on \({{\mathbb{L}}}^{2} (\mu_{A})\) and the Dirichlet Form

For any \(f\in {\mathbb{L}}^{2} (\mu_{A})\) the Dirichlet form of f is

Notice that

(26)

Indeed,

On the other hand,

These two equalities imply that

From expression (26) we see that the vanishing of the Dirichlet form implies f=0.

We point out that we will consider below eigenfunctions in \({{\mathbb{L}}}^{2} (\mu_{A})\) which are not necessarily Lipschitz.

Dirichlet forms are quite important (see [21]), among other reasons because they are particularly useful when there is a spectral gap. However, this will not be the case here.

Proposition 26

Let \(V: \{1,\dots,d\}^{{\mathbb{N}}}\to {\mathbb{R}}\) be a Lipschitz function such that supV−infV<2. There are eigenvalues c for the generator in \({{\mathbb{L}}}^{2} (\mu_{A})\) such that [(supV−2)∨0]<c<infV. Each eigenvalue has infinite multiplicity. Therefore, in this case, there is no spectral gap.

Proof

The existence of positive eigenvalues c for the operator satisfying [(supV−2)∨0]<c<infV will be obtained by solving the twisted cohomological equation. To simplify the reasoning, we present the proof for the case \(E=\{0,1\}^{\mathbb{N}}\). From Sect. 2.2 in [5], we know that, given functions \(z:E\to\mathbb {R}\) and \(C:E\to \mathbb{R}\), one can solve in α the twisted cohomological equation

$$\begin{aligned} \frac{z(y)}{C(y)}= \frac{1}{C(y)}\alpha(y) - \alpha \bigl( \sigma(y) \bigr), \end{aligned}$$
(27)

in the case that |C|<1. Indeed, just take

$$\begin{aligned} \alpha(y)= \sum_{j=0}^\infty \frac{\frac{z(\sigma^j(y))}{C(\sigma^j(y) )}}{(C(y) C(\sigma(y))\dots C(\sigma^j (y)))^{-1}}. \end{aligned}$$

Note that this function α is measurable and bounded but not Lipschitz.
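The series above can be evaluated exactly for points that are eventually the fixed point 000…, since the tail of the sum becomes geometric. The following sketch checks numerically that α solves the twisted cohomological equation in the form α(y)=z(y)+C(y)α(σ(y)); the specific choices of A, V and c are illustrative assumptions, not values taken from the paper.

```python
# Numerical check that alpha solves alpha(y) = z(y) + C(y) * alpha(sigma(y)),
# i.e. the twisted cohomological equation (27), on E = {0,1}^N.
# Points are tuples implicitly extended by an infinite tail of 0's.
# The choices of A, V and c below are illustrative assumptions.
import math

c = 0.3
v = {0: 0.5, 1: 0.9}            # V(y) depends only on y_0; sup V - inf V < 2
A = math.log(0.5)               # normalized potential: e^A + e^A = 1

def V(y):     return v[y[0] if y else 0]
def z(y):     return (-1.0) ** (y[0] if y else 0) * math.exp(-A)
def sigma(y): return y[1:]      # the shift map (the 0-tail is unchanged)
def C(y):     return 1.0 - V(sigma(y)) + c   # here |C| < 1

def alpha(y):
    # alpha(y) = sum_j z(sigma^j y) * C(y) C(sigma y) ... C(sigma^{j-1} y);
    # past the 0-tail the series is geometric and is summed exactly.
    total, prod = 0.0, 1.0
    for j in range(len(y)):
        total += z(y[j:]) * prod
        prod *= C(y[j:])
    C0 = C(())                  # value of C at the fixed point 000...
    return total + prod * z(()) / (1.0 - C0)

y = (1, 0, 1, 1)
lhs = alpha(y)
rhs = z(y) + C(y) * alpha(sigma(y))
print(abs(lhs - rhs))
```

Both sides agree up to floating-point error for any eventually-zero sequence y.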

Take \(z(y)=(-1)^{y_{0}} e^{-A(y)}\), where \(y=(y_{0},y_{1},y_{2},\dots)\). Now, for fixed c∈([(supV−2)∨0],infV), consider C(y)=1−V(σ(y))+c. Notice that |C|<1. Then, (27) becomes

$$\begin{aligned} (-1)^{y_0}= e^{A(y)} \bigl\{\alpha(y) - \alpha \bigl(\sigma(y) \bigr) \bigl(1-V \bigl(\sigma(y) \bigr)+c \bigr) \bigr\}. \end{aligned}$$

Let \(x\in\{1,\dots,d\}^{{\mathbb{N}}}\). Adding the equations above when y=0x and when y=1x, we get

because σ(0x)=x=σ(1x), and the potential A is normalized.

It is also easy to show that, by slightly modifying the argument, one can obtain an infinite-dimensional set of possible α associated to the same eigenvalue. □

Appendix B: Basic Tools for Continuous Time Markov Chains

In this section we present the proofs of Lemmas 3 and 5. In order to do that, we present another way to analyze the properties of a continuous time Markov chain.

Suppose the process \(\{X_t, t\geq0\}\) is a continuous time Markov chain. Alternatively, we can describe it through its skeleton chain (see [24, 31]). Let \(\{\xi_{n}\}_{n\in {\mathbb{N}}}\) be a discrete time Markov chain with transition probability given by \(p(x,y)=\mathbf{1}_{[\sigma(y)=x]} e^{A(y)}\). Consider a sequence of random variables \(\{\tau_{n}\}_{n\in {\mathbb{N}}}\), independent and identically distributed according to an exponential law of parameter 1. For n≥0, define

$$\begin{aligned} T_0=0,\qquad T_{n+1}=T_n+ \tau_n=\tau_0+\tau_1+\cdots+\tau _n. \end{aligned}$$

Thus, \(X_t\) can be rewritten as \(\sum_{n=0}^{+\infty} \xi_{n} \mathbf{1}_{[T_{n}\leq t<T_{n+1}]}\), for all t≥0.
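The skeleton-chain construction above translates directly into a simulation: holding times are i.i.d. Exp(1) and each jump moves from x to a preimage ax with probability \(e^{A(ax)}\). The sketch below uses the assumed constant normalized potential A≡log(1/d) and finite words as depth-truncated points; both are illustrative assumptions.

```python
# Simulation of the skeleton chain: states are finite words standing for
# depth-truncated points of {1,...,d}^N (an assumption of this sketch);
# a jump prepends a symbol a with probability e^{A(a x)}, and holding
# times tau_n are i.i.d. Exp(1).  A is the assumed constant normalized
# potential A = log(1/d).
import math, random

d = 2

def A(y):
    return math.log(1.0 / d)    # sum over a of e^{A(a x)} equals 1

def sample_path(x, T, rng):
    """Return the jump times T_n and states xi_n observed up to time T."""
    times, states, t = [0.0], [x], 0.0
    while True:
        t += rng.expovariate(1.0)            # tau_n ~ Exp(1)
        if t >= T:
            return times, states
        u, acc = rng.random(), 0.0
        for a in range(1, d + 1):            # jump to the preimage a.x
            acc += math.exp(A((a,) + x))     # with probability e^{A(ax)}
            if u <= acc:
                x = (a,) + x
                break
        times.append(t); states.append(x)

rng = random.Random(1)
times, states = sample_path((1,), T=5.0, rng=rng)
# X_t = xi_n on [T_n, T_{n+1}); each jump prepends exactly one symbol:
assert all(len(s) == len(states[0]) + n for n, s in enumerate(states))
```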

Proof of Lemma 3

Using the above, we are able to describe expression (1) in a different way:

$$\begin{aligned} P_{T}^V (f) (x) =& {\mathbb{E}}_{x} \bigl[e^{\int_0^{T} V(X_r) dr} f(X_{T}) \bigr] =\sum _{n=0}^{+\infty} {\mathbb{E}}_{x} \bigl[e^{\int_0^{T} V(X_r) dr} f(X_{T})\textbf {1}_{[T_n\leq T< T_{n+1}]} \bigr] \\ =& \sum_{n=0}^{+\infty} {\mathbb{E}}_{x} \bigl[e^{T_1 V(\xi_0)+ (T_2-T_1)V(\xi_1)+\cdots +(T_n-T_{n-1}) V(\xi_{n-1})+(T-T_n) V(\xi_{n})} f(\xi_n)\mathbf{1}_{[T_n\leq T< T_{n+1}]} \bigr] \\ =& \sum_{n=0}^{+\infty} {\mathbb{E}}_{x} \bigl[e^{\tau_0 V(\xi_0)+\tau_1 V(\xi_1)+\cdots+\tau _{n-1} V(\xi_{n-1})+(T-\sum_{i=0}^{n-1}\tau_i) V(\xi_{n})} f(\xi_n)\mathbf{1}_{[\sum_{i=0}^{n-1}\tau_i\leq T< \sum _{i=0}^{n}\tau_i]} \bigr] \\ =& {\mathbb{E}}_{x} \bigl[e^{T V(\xi_0)} f(\xi_0) \mathbf{1}_{[ T< \tau_0]} \bigr]+ \sum_{n=1}^{+\infty} \sum _{a_1=1}^{d}\dots\sum_{a_n=1}^{d} {\mathbb{E}}_{x} \bigl[e^{\tau_0 V(\xi_0)+\cdots+(T-\sum_{i=0}^{n-1}\tau _i) V(\xi_{n})} \\ &{} \times f(\xi_n) \mathbf{1}_{[\sum_{i=0}^{n-1}\tau_i\leq T< \sum _{i=0}^{n}\tau_i]}\mathbf{1}_{[\xi_{1}=a_1x,\dots,\xi_{n}=a_n\dots a_1x]} \bigr], \end{aligned}$$

where \(\sigma^{n}(a_{n}\dots a_{1}x)=x\). The first term above is equal to \(e^{TV(x)}f(x)e^{-T}\). The summand in the second one is equal to

$$\begin{aligned} &{\mathbb{E}}_{x} \bigl[e^{\tau_0 V(\xi_0)+\cdots+(T-\sum_{i=0}^{n-1}\tau _i) V(\xi_{n})} f( \xi_n)\mathbf{1}_{[\sum_{i=0}^{n-1}\tau_i\leq T< \sum _{i=0}^{n}\tau_i]} |\xi_{1}=a_1x, \dots,\xi_{n}=a_n\dots a_1x \bigr] \\ &\quad\times {\mathbb{P}}_{x} [\xi_{1}=a_1x,\dots, \xi_{n}=a_n\dots a_1x ]. \end{aligned}$$

Using the transition probability of the Markov chain {ξ n } n , we get

$$\begin{aligned} {\mathbb{P}}_{x} [\xi_{1}=a_1x,\dots, \xi_{n}=a_n\dots a_1x ]= e^{A(a_1x)}\dots e^{A(a_n\dots a_1x)}. \end{aligned}$$

Recalling that the random variables {τ i } are independent and identically distributed according to an exponential law of parameter 1, we have

$$\begin{aligned} & {\mathbb{E}}_{x} \bigl[e^{\tau_0 V(\xi_0)+\cdots+(T-\sum_{i=0}^{n-1}\tau _i) V(\xi_{n})} f( \xi_n)\mathbf{1}_{[\sum_{i=0}^{n-1}\tau_i\leq T< \sum _{i=0}^{n}\tau_i]} |\xi_{1}=a_1x, \dots,\xi_{n}=a_n\dots a_1x \bigr] \\ &\quad{}= {\mathbb{E}}_{x} \bigl[e^{\tau_0 V(x)+\cdots+(T-\sum_{i=0}^{n-1}\tau_i) V(a_n\dots a_1x)} f(a_n\dots a_1x)\mathbf{1}_{[\sum_{i=0}^{n-1}\tau_i\leq T< \sum _{i=0}^{n}\tau_i]} \bigr] \\ &\quad= f(a_n\dots a_1x) \int_0^\infty dt_{n} \dots \int_0^\infty dt_{0} e^{t_0 V(x)+\cdots+(T-\sum_{i=0}^{n-1}t_i) V(a_n\dots a_1x)} \\ &\qquad{}\times\mathbf{1}_{[\sum_{i=0}^{n-1}t_i\leq T< \sum_{i=0}^{n}t_i]}e^{-t_0} \! \dots e^{-t_n}. \end{aligned}$$

Therefore,

$$\begin{aligned} &P_{T}^V (f) (x)= {\mathbb{E}}_{x} \bigl[e^{\int_0^{T} V(X_r) dr} f(X_{T}) \bigr]=e^{T V(x)} f(x)e^{-T} \\ &\quad{}+ \sum_{n=1}^{+\infty} \sum _{a_1=1}^{d}\dots\sum_{a_n=1}^{d} e^{A(a_1x)}\dots e^{A(a_n\dots a_1x)} f(a_n\dots a_1x) \\ &\quad{}\times\int_0^\infty dt_{n} \dots\int _0^\infty dt_{0} e^{t_0 V(x)+\cdots+(T-\sum_{i=0}^{n-1}t_i) V(a_n\dots a_1x)} \mathbf{1}_{[\sum_{i=0}^{n-1}t_i\leq T< \sum _{i=0}^{n}t_i]}e^{-t_0}\dots e^{-t_n}. \end{aligned}$$

 □
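On a finite state space, the expansion just proved is the Feynman-Kac formula \(P^{V}_{T}f=e^{T(L+\mathrm{diag}\,V)}f\), with L the generator of the chain. The following sketch (a randomly generated 3-state chain, an assumption made for illustration rather than the paper's setting) compares a Monte Carlo average over skeleton-chain paths with a truncated matrix exponential.

```python
# Finite-state Feynman-Kac check (the 3-state chain is an assumption of
# this sketch): E_x[e^{int_0^T V(X_r) dr} f(X_T)] computed by Monte Carlo
# over skeleton-chain paths versus the matrix exponential e^{T(L+diag V)} f.
import numpy as np

rng = np.random.default_rng(3)
n, T = 3, 1.0
P = rng.random((n, n)); np.fill_diagonal(P, 0.0)
P /= P.sum(axis=1, keepdims=True)        # embedded jump chain p(x, y)
L = P - np.eye(n)                        # generator with unit jump rates
V = rng.random(n); f = rng.random(n)

def mat_exp(M, terms=60):
    out, term = np.eye(len(M)), np.eye(len(M))
    for k in range(1, terms):            # truncated Taylor series of e^M
        term = term @ (M / k)
        out = out + term
    return out

def mc_estimate(x, trials=50_000):
    acc = 0.0
    for _ in range(trials):
        s, t, w = x, 0.0, 0.0
        while True:
            tau = rng.exponential(1.0)   # holding time ~ Exp(1)
            if t + tau >= T:
                w += (T - t) * V[s]      # last stretch, up to time T
                break
            w += tau * V[s]; t += tau
            s = rng.choice(n, p=P[s])    # jump of the skeleton chain
        acc += np.exp(w) * f[s]
    return acc / trials

est = mc_estimate(0)
exact = (mat_exp(T * (L + np.diag(V))) @ f)[0]
print(est, exact)                        # agree up to Monte Carlo error
```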

Proof of Lemma 5

We begin analyzing

(28)

and \(e^{T V(x)} e^{-T}\leq e^{TC_{V}d(x,y)} e^{T V(y)} e^{-T}\). Since the potential A is also Lipschitz, we get

$$\begin{aligned} e^{A(a_1x)}\dots e^{A(a_n\dots a_1x)} \leq& e^{C_A(\theta+\cdots+\theta^n)d(x,y)} e^{A(a_1y)}\dots e^{A(a_n\dots a_1y)} \\ \leq& e^{C_A\theta(1-\theta)^{-1}d(x,y)} e^{A(a_1y)}\dots e^{A(a_n\dots a_1y)}. \end{aligned}$$
(29)

By the hypothesis on f, we get

$$\begin{aligned} f(a_n\dots a_1x)\leq e^{C_f\theta^n d(x,y)} f(a_n \dots a_1y)\leq e^{C_f\theta d(x,y)} f(a_n\dots a_1y). \end{aligned}$$

Thus,

 □

Appendix C: Radon-Nikodym Derivative

Let \(\{\mathcal{F}_{t}\}_{t\geq0}\) be the natural filtration.

Proposition 27

The Radon-Nikodym derivative of the measure \({\mathbb{P}}_{\mu}\) (associated to the a priori process) with respect to the admissible measure \(\tilde{{\mathbb{P}}}_{\mu}\) (see Definition 10), restricted to \(\mathcal{F}_{T}\), is

Proof

The probabilities \(\tilde{{\mathbb{P}}}_{\mu}\) and \({\mathbb{P}}_{\mu}\) restricted to \(\mathcal{F}_{T}\) are equivalent, because the initial measure and the allowed jumps are the same. Thus, the expectation under \({\mathbb{E}}_{\mu}\) of any bounded \(\mathcal{F}_{T}\)-measurable function is

The goal here is to obtain a formula for the Radon-Nikodym derivative \(\frac{\, \mathrm{d}{\mathbb{P}}_{\mu}}{\, \mathrm{d}\tilde{{\mathbb{P}}}_{\mu}}\). Since every bounded measurable function can be approximated by functions depending only on a finite number of coordinates, it is enough to work with such functions. For k≥1, consider a sequence of times \(0\leq t_{1}<\dots<t_{k}\leq T\) and a bounded function \(F: (\{1,\dots,d\}^{{\mathbb{N}}} )^{k}\to {\mathbb{R}}\). Using the skeleton chain presented in the proof of Lemma 3, we get

$$\begin{aligned} {\mathbb{E}}_{\mu} \bigl[F(X_{t_1}, \dots,X_{t_k}) \bigr]=\sum_{n\geq0} {\mathbb{E}}_{\mu} \bigl[F(X_{t_1},\dots,X_{t_k}) \mathbf{1}_{[T_n\leq T< T_{n+1}]} \bigr] . \end{aligned}$$

Since \(F(X_{t_{1}},\dots,X_{t_{k}})\) restricted to the set \([T_{n}\leq T<T_{n+1}]\) depends only on \(\xi_{1},T_{1},\dots,\xi_{n},T_{n}\), there exist functions \(\bar{F}_{n}\) such that

$$\begin{aligned} {\mathbb{E}}_{\mu} \bigl[F(X_{t_1}, \dots,X_{t_k}) \bigr]=\sum_{n\geq0} {\mathbb{E}}_{\mu} \bigl[\bar{F}_n(\xi_1, T_1, \dots, \xi_n,T_n)\textbf {1}_{[T_n\leq T< T_{n+1}]} \bigr] . \end{aligned}$$

Through calculations similar to those used in Corollary 2.2 of Appendix 1 in [21], the last expectation is equal to

$$\begin{aligned} \sum_{n\geq0} {\mathbb{E}}_{\mu} \bigl[\bar{F}_n(\xi_1, T_1, \dots, \xi_n,T_n) \textbf {1}_{[T_n\leq T]} e^{-\lambda(\xi_n)(T-T_n)} \bigr]. \end{aligned}$$
(30)

Then we need to compute, for each \(n\in {\mathbb{N}}\) and every bounded measurable function \(G: (\{1,\dots,d\}^{{\mathbb{N}}}\times(0,\infty) )^{n}\to {\mathbb{R}}\), the expectation

$$\begin{aligned} {\mathbb{E}}_{\mu} \bigl[G(\xi_1, T_1,\dots, \xi_n,T_n) \bigr]=\int _{\{ 1,\dots,d\}^{{\mathbb{N}}}}{\mathbb{E}}_{x} \bigl[G(\xi_1, T_1,\dots, \xi _n,T_n) \bigr] \, \mathrm{d} \mu(x). \end{aligned}$$

Notice that, for all \(x\in\{1,\dots,d\}^{{\mathbb{N}}}\),

$$\begin{aligned} &{\mathbb{E}}_{x} \bigl[G(\xi_1, T_1,\dots, \xi_n,T_n) \bigr]= \sum _{a_1=1}^{d}\dots\sum_{a_n=1}^{d} e^{A(a_1x)}\dots e^{A(a_n\dots a_1x)} \\ &\qquad{}\times\biggl\{\int_0^\infty dt_{n-1} \dots\int_0^\infty dt_{0} e^{-t_0}\dots e^{-t_{n-1}} G(a_1x,t_0, \ldots,a_n\dots a_1 x,t_{n-1}+ \cdots+t_0) \biggr\} \\ &\quad{}= \sum_{a_1=1}^{d}\dots\sum _{a_n=1}^{d} e^{\tilde{A}(a_1x)}\dots e^{\tilde{A}(a_n\dots a_1x)} \biggl\{ \int_0^\infty dt_{n-1} \dots \int_0^\infty dt_{0} \tilde{\gamma}(x)e^{-\tilde{\gamma}(x)t_0}\dots \\ &\qquad{}\times\tilde{\gamma}(a_{n-1}\dots a_1x)e^{-\tilde{\gamma}(a_{n-1}\dots a_1x)t_{n-1}} e^{A(a_1x)-\tilde{A}(a_1x)}\dots e^{A(a_n\dots a_1x)-\tilde{A}(a_n\dots a_1x)} \frac{e^{(\tilde{\gamma}(x)-1)t_0}}{\tilde{\gamma}(x)}\dots \\ &\qquad{}\times \frac {e^{(\tilde{\gamma}(a_{n-1}\dots a_1x)-1)t_{n-1}}}{\tilde{\gamma}(a_{n-1}\dots a_1x)}G(a_1x,t_0, \ldots,a_n\dots a_1 x,t_{n-1}+ \cdots+t_0) \biggr\} \\ &\quad{}=\tilde{{\mathbb{E}}}_{x} \Biggl[G(\xi_1, T_1, \dots, \xi_n,T_n) \exp \Biggl\{\sum _{i=0}^{n-1} \bigl(\tilde{\gamma}(\xi_i)-1 \bigr)\tau_i \Biggr\}\prod_{i=0}^{n-1} e^{A(\xi_{i+1})-\tilde{A}(\xi_{i+1})} \genfrac {}{}{}{0}{1}{\tilde{\gamma}(\xi _i)} \Biggr]. \end{aligned}$$

We can write \(\sum_{i=0}^{n-1}(\tilde{\gamma}(\xi_{i})-1)\tau_{i}\) as

$$\begin{aligned} \sum_{i=0}^{n-1} \bigl(\tilde{\gamma}( \xi_i)-1 \bigr)\int_{0}^{T_n} \textbf {1}_{[T_i\leq s< T_{i+1}]} \, \mathrm{d}s =& \int_{0}^{T_n} \sum_{i=0}^{\infty} \bigl(\tilde{\gamma}( \xi_i)-1 \bigr) \mathbf{1}_{[T_i\leq s< T_{i+1}]} \, \mathrm{d}s\\ =& \int _{0}^{T_n} \bigl(\tilde{\gamma}(X_s)-1 \bigr) \, \mathrm{d}s, \end{aligned}$$

and we can write \(\prod_{i=0}^{n-1} e^{A(\xi_{i+1})-\tilde{A}(\xi_{i+1})} \frac {1}{\tilde{\gamma}(\xi_{i})}\) as

$$\begin{aligned} & \exp \Biggl\{\sum_{i=0}^{n-1} \bigl(A(\xi_{i+1})-\tilde{A}(\xi_{i+1})-\log \tilde{\gamma}( \xi_i) \bigr) \Biggr\} \\ &\quad{}= \exp \Biggl\{\sum_{i=0}^{n-1} \mathbf{1}_{[\sigma(\xi_{i+1})=\xi _{i}]} \bigl(A(\xi_{i+1})-\tilde{A}( \xi_{i+1})-\log\tilde{\gamma}\bigl(\sigma(\xi_{i+1}) \bigr) \bigr) \Biggr\} \\ &\quad{}=\exp \biggl\{\sum_{s\leq T_n}\mathbf{1}_{[\sigma(X_s)=X_{s^-}]} \bigl(A(X_s)-\tilde{A}(X_s)-\log \bigl(\tilde{\gamma} \bigl(\sigma (X_s) \bigr) \bigr) \bigr) \biggr\}. \end{aligned}$$

The expectation under \({\mathbb{P}}_{x}\) of G(ξ 1,T 1,…,ξ n ,T n ) becomes

$$\begin{aligned} &\tilde{{\mathbb{E}}}_{x} \biggl[G( \xi_1, T_1,\dots, \xi_n,T_n) \exp \biggl\{\int_{0}^{T_n} \bigl(\tilde{\gamma}(X_s)-1 \bigr) \, \mathrm{d}s\\ &\quad{}+\sum_{s\leq T_n} \mathbf{1}_{[\sigma(X_s)=X_{s^-}]} \bigl(A(X_s)-\tilde{A}(X_s)- \log \bigl(\tilde{\gamma} \bigl(\sigma (X_s) \bigr) \bigr) \bigr) \biggr\} \biggr]. \end{aligned}$$

Using the formula above in (30), the expectation under \({\mathbb{E}}_{\mu}\) of \(F(X_{t_{1}},\dots,X_{t_{k}})\) is equal to

$$\begin{aligned} &\sum_{n\geq0} \tilde{{\mathbb{E}}}_{\mu} \biggl[\bar{F}_n(\xi_1, T_1,\dots, \xi_n,T_n) \mathbf{1}_{[T_n\leq T]} e^{-\lambda(\xi_n)(T-T_n)} \\ &\quad{}\times \exp \biggl\{\int_{0}^{T_n} \bigl( \tilde{\gamma}(X_s)-1 \bigr) \, \mathrm{d}s+\sum _{s\leq T_n}\mathbf{1}_{[\sigma(X_s)=X_{s^-}]} \bigl(A(X_s)- \tilde{A}(X_s)-\log \bigl(\tilde{\gamma} \bigl(\sigma (X_s) \bigr) \bigr) \bigr) \biggr\} \biggr]. \end{aligned}$$

Once again, using calculations similar to Corollary 2.2 of Appendix 1 in [21], we rewrite the expression above as

$$\begin{aligned} &\sum_{n\geq0} \tilde{{\mathbb{E}}}_{\mu} \biggl[\bar{F}_n(\xi_1, T_1,\dots, \xi_n,T_n) \mathbf{1}_{[T_n\leq T< T_{n+1}]} \\ &\quad\times \exp \biggl\{\int_{0}^{T} \bigl( \tilde{\gamma}(X_s)-1 \bigr) \, \mathrm{d}s+\sum _{s\leq T}\mathbf{1}_{[\sigma(X_s)=X_{s^-}]} \bigl(A(X_s)- \tilde{A}(X_s)-\log \bigl(\tilde{\gamma} \bigl(\sigma (X_s) \bigr) \bigr) \bigr) \biggr\} \biggr], \end{aligned}$$

and this sum is equal to

$$\begin{aligned} & \tilde{{\mathbb{E}}}_{\mu} \biggl[F(X_{t_1}, \dots,X_{t_k}) \exp \biggl\{\int_{0}^{T} \bigl(\tilde{\gamma}(X_s)-1 \bigr) \, \mathrm{d}s\\ &\quad{}+\sum _{s\leq T}\mathbf{1}_{[\sigma(X_s)=X_{s^-}]} \bigl(A(X_s)- \tilde{A}(X_s)-\log \bigl(\tilde{\gamma} \bigl(\sigma (X_s) \bigr) \bigr) \bigr) \biggr\} \biggr]. \end{aligned}$$

This finishes the proof. □
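For a single Poisson clock, the change of measure just derived reduces to the classical likelihood ratio \(e^{(\mu-\lambda)T}(\lambda/\mu)^{N_{T}}\) between two Poisson processes of rates λ and μ. Below is a Monte Carlo check of this special case; the rates and horizon are assumptions of the sketch.

```python
# Monte Carlo check of the likelihood ratio between two Poisson clocks:
# E_lam[F(N_T)] = E_mu[F(N_T) e^{(mu-lam)T} (lam/mu)^{N_T}].
# The rates lam, mu and the horizon T are assumptions of the sketch.
import math, random

rng = random.Random(7)
lam, mu, T, trials = 1.5, 2.0, 1.0, 200_000

def poisson_count(rate):
    t, count = 0.0, 0
    while True:
        t += rng.expovariate(rate)       # waiting times ~ Exp(rate)
        if t > T:
            return count
        count += 1

F = lambda k: float(k == 2)              # test function F(N_T)

direct = sum(F(poisson_count(lam)) for _ in range(trials)) / trials

tilted = 0.0
for _ in range(trials):
    k = poisson_count(mu)
    tilted += F(k) * math.exp((mu - lam) * T) * (lam / mu) ** k
tilted /= trials

print(direct, tilted)   # both ~ P(Poisson(1.5) = 2) = e^{-1.5} 1.5^2 / 2
```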

Appendix D: Proof of Lemma 11

Proof of Lemma 11

We claim that

$$\begin{aligned} M^G_T(\omega)=\sum _{s\leq T}\mathbf{1}_{\{\sigma(\omega_s)= \omega_{s^-}\}} G(\omega_s) - \int_0^T\tilde{\gamma}( \omega_s)G( \omega_s) \, \mathrm{d}s \end{aligned}$$

is a \(\tilde{{\mathbb{P}}}_{\mu}\)-martingale. The lemma then follows from \(\tilde{{\mathbb{E}}}_{\mu}[M^{G}_{T} ]=\tilde{{\mathbb{E}}}_{\mu}[M^{G}_{0} ]=0\). In order to prove this claim, it is enough to prove that

$$\begin{aligned} M_T(\omega)=\sum _{s\leq T}\mathbf{1}_{\{\sigma(\omega_s)= \omega_{s^-}\}} - \int_0^T \tilde{\gamma}(\omega_s) \, \mathrm{d}s \end{aligned}$$
(31)

is a \(\tilde{{\mathbb{P}}}_{\mu}\)-martingale, because then \(M^{G}_{T}=\int G \, \mathrm{d}M_{T}\) will be a \(\tilde{{\mathbb{P}}}_{\mu}\)-martingale (see [33]).

Now, we prove (31). Let \(\{\mathcal{F}_{t}\}_{t\geq0}\) be the natural filtration. For all S<T, we prove that \(\tilde{{\mathbb{E}}}_{\mu}[M_{T}\mid\mathcal{F}_{S}]=M_{S}\). By the Markov property, we only need to show that \(\tilde{{\mathbb{E}}}_{x} [M_{t} ]=0 \).

Consider the space of trajectories ω such that \(\omega_{0}=x\). Observe that, for every such ω,

$$\begin{aligned} \int_0^t \tilde{\gamma}(\omega_s) \, \mathrm{d}s= \sum _{k\geq 1}\sum_{i_1=1}^{d} \cdots\sum_{i_k=1}^{d}\tilde{ \gamma}(i_k\dots i_1x) \int_0^t \mathbf{1}_{[\omega_s=i_k\dots i_1x]} \, \mathrm{d}s. \end{aligned}$$
(32)

For all s≥0 and \(y \in\{1,\dots,d\}^{{\mathbb{N}}}\), let \(N_{s}(y)\) denote the number of times the exponential clock at site y has rung up to time s. Thus, the first term on the right side of (31) can be rewritten as

$$\begin{aligned} \sum_{s\leq t} \mathbf{1}_{\{\sigma(\omega_s)=\omega_{s^-}\}} = \sum_{k\geq1}\sum _{i_1=1}^{d}\cdots\sum_{i_k=1}^{d} N_t(i_k\dots i_1x), \end{aligned}$$
(33)

for every such ω.

Since (32) and (33) hold, in order to conclude this proof it is sufficient to show that

$$\begin{aligned} \tilde{{\mathbb{E}}}_x \biggl[N_t(y)-\tilde{\gamma}(y) \int_0^t \mathbf{1}_{[X_s=y]} \, \mathrm{d}s \biggr]=0, \end{aligned}$$
(34)

for all \(y \in\{1,\dots,d\}^{{\mathbb{N}}}\).

Let \(0=t_{0}<t_{1}<\dots<t_{n}=t\) be a partition of the interval [0,t]. The expression (34) can be rewritten as

$$\begin{aligned} \sum_{i=0}^{n-1}\tilde{{\mathbb{E}}}_x \biggl[ N_{t_{i+1}}(y)-N_{t_{i}}(y)-\tilde{ \gamma}(y)\int_{t_i}^{t_{i+1}} \mathbf{1}_{[X_s=y]} \, \mathrm{d}s \biggr]. \end{aligned}$$

Observe that

$$\begin{aligned} \tilde{{\mathbb{E}}}_x \biggl[ \int _{t_i}^{t_{i+1}} \mathbf{1}_{[X_s=y]} \, \mathrm{d}s \biggr] =&\tilde{{\mathbb{E}}}_y \biggl[ \int_{0}^{t_{i+1}-t_i} \mathbf{1}_{[X_s=y]} \, \mathrm{d}s \biggr] \\ =&\tilde{{\mathbb{E}}}_y \biggl[ \int_{0}^{t_{i+1}-t_i} \mathbf{1}_{[X_s=y]} \, \mathrm{d}s \mathbf{1}_{[N_{t_{i+1}-t_i}(y)=0]} \biggr]\\ &{}+ \tilde{{\mathbb{E}}}_y \biggl[ \int_{0}^{t_{i+1}-t_i} \mathbf{1}_{[X_s=y]} \, \mathrm{d}s \mathbf{1}_{[N_{t_{i+1}-t_i}(y)>0]} \biggr] \\ =&(t_{i+1}-t_i)+O_{\tilde{\gamma}} \bigl((t_{i+1}-t_i)^2 \bigr), \end{aligned}$$

where the function \(O_{\tilde{\gamma}}\) satisfies \(O_{\tilde{\gamma }}(h)\leq C_{\tilde{\gamma}} h\). Then, we only need to prove that

$$\begin{aligned} \tilde{{\mathbb{E}}}_x \bigl[ N_{t_{i+1}}(y)-N_{t_{i}}(y) \bigr]=\tilde{\gamma }(y) (t_{i+1}-t_i). \end{aligned}$$

By the Markov property, it is enough to see that \(\tilde{{\mathbb{E}}}_{x}[ N_{h}(y)]=\tilde{\gamma}(y)h\). This is a consequence of \(\tilde{\gamma}(y)\) being the parameter of the exponential clock at the site y. □
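The last identity \(\tilde{{\mathbb{E}}}_{x}[N_{h}(y)]=\tilde{\gamma}(y)h\) is the mean of a Poisson clock of rate \(\tilde{\gamma}(y)\). A quick Monte Carlo check, with an assumed rate and horizon chosen only for illustration:

```python
# Monte Carlo check of E[N_h(y)] = gamma(y) * h for the exponential clock
# at a site y.  The rate and the horizon h are assumptions of the sketch.
import random

def count_rings(rate, h, rng):
    t, rings = 0.0, 0
    while True:
        t += rng.expovariate(rate)       # waiting times ~ Exp(rate)
        if t > h:
            return rings
        rings += 1

rng = random.Random(0)
rate, h, trials = 2.5, 1.5, 20_000
mean = sum(count_rings(rate, h, rng) for _ in range(trials)) / trials
print(mean, rate * h)                    # mean number of rings ~ gamma * h
```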

Appendix E: Basic Properties of Q(V)

Lemma 28

|Q(V)−Q(U)|≤∥VU.

Proof

Since

$$\begin{aligned} P_{T}^{V}(1) (x)={\mathbb{E}}_x \bigl[e^{\int_0^T V(X_r) dr} \bigr] \leq {\mathbb{E}}_x \bigl[ e^{T \Vert V-U\Vert_{\infty}}e^{\int_0^T U(X_r) dr} \bigr]= e^{T \Vert V-U\Vert_{\infty}} P_{T}^{U}(1) (x), \end{aligned}$$

then,

$$\begin{aligned} \bigl\vert Q(V)-Q(U)\bigr\vert =&\lim_{T \to\infty} \frac{1}{T} \log \frac{\sup_{x\in\{1,\dots,d\}^{{\mathbb{N}}}} P_{T}^V(1)(x)}{\sup_{x\in\{ 1,\dots,d\}^{{\mathbb{N}}}} P_{T}^U(1)(x)} \\ \leq& \lim_{T \to\infty} \frac{1}{T} \log \frac{\sup_{x\in\{ 1,\dots,d\}^{{\mathbb{N}}}} e^{T \Vert V-U\Vert_{\infty}} (P_{T}^U 1)(x)}{\sup_{x\in\{1,\dots,d\}^{{\mathbb{N}}}} P_{T}^U(1)(x)} \\ = &\Vert V-U\Vert_{\infty}. \end{aligned}$$

 □

Lemma 29

The functional VQ(V) is convex, i.e., for all α∈(0,1), we have

$$\begin{aligned} Q \bigl(\alpha V+(1-\alpha)U \bigr)\leq\alpha Q(V)+(1-\alpha)Q(U). \end{aligned}$$

Proof

Using Hölder's inequality, we have

$$\begin{aligned} P_{T}^{\alpha V+(1-\alpha)U}(1) (x) =&{\mathbb{E}}_x \bigl[e^{\int_0^T \alpha V(X_r) dr}e^{\int_0^T (1-\alpha) U(X_r) dr} \bigr] \\ \leq& \bigl({\mathbb{E}}_x \bigl[e^{\int_0^T V(X_r) dr} \bigr] \bigr)^{\alpha} \bigl({\mathbb{E}}_x \bigl[e^{\int_0^T U(X_r) dr} \bigr] \bigr)^{(1-\alpha)}. \end{aligned}$$

Thus,

$$\begin{aligned} Q \bigl(\alpha V+(1-\alpha)U \bigr) =&\lim _{T \to\infty} \frac{1}{T} \log \sup_{x\in\{1,\dots,d\}^{{\mathbb{N}}}} P_{T}^{\alpha V+(1-\alpha)U}(1) (x) \\ \leq& \lim_{T \to\infty} \frac{1}{T} \log \Bigl(\sup _{x\in\{1,\dots,d\}^{{\mathbb{N}}}}{\mathbb{E}}_x \bigl[e^{\int_0^T V(X_r) dr} \bigr] \Bigr)^{\alpha}\\ &{}\times \Bigl(\sup_{x\in\{1,\dots,d\}^{{\mathbb{N}}}}{\mathbb{E}}_x \bigl[e^{\int_0^T U(X_r) dr} \bigr] \Bigr)^{(1-\alpha)} \\ =&\alpha\lim_{T \to\infty} \frac{1}{T} \log \sup _{x\in\{1,\dots,d\}^{{\mathbb{N}}}}{\mathbb{E}}_x \bigl[e^{\int_0^T V(X_r) dr} \bigr]\\ &{}+(1- \alpha) \lim_{T \to\infty} \frac{1}{T} \log\sup _{x\in\{1,\dots,d\} ^{{\mathbb{N}}}}{\mathbb{E}}_x \bigl[e^{\int_0^T U(X_r) dr} \bigr]. \end{aligned}$$

 □
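For a finite-state chain with generator L, Q(V) is the largest real eigenvalue of L+diag(V) (the growth rate of the Feynman-Kac semigroup), so both Lemma 28 and Lemma 29 can be checked numerically. The sketch below uses a randomly generated 3-state generator, an assumption made only for illustration.

```python
# Finite-state check of Lemmas 28 and 29 (the 3-state generator below is
# a randomly generated assumption of the sketch): Q(V) is the largest
# real eigenvalue of L + diag(V).
import numpy as np

rng = np.random.default_rng(0)
n = 3
L = rng.random((n, n))
np.fill_diagonal(L, 0.0)
np.fill_diagonal(L, -L.sum(axis=1))      # generator: row sums are 0

def Q(V):
    return max(np.linalg.eigvals(L + np.diag(V)).real)

V, U = rng.random(n), rng.random(n)
# Lemma 28: |Q(V) - Q(U)| <= ||V - U||_inf
assert abs(Q(V) - Q(U)) <= np.max(np.abs(V - U)) + 1e-12
# Lemma 29: convexity of V -> Q(V)
a = 0.4
assert Q(a * V + (1 - a) * U) <= a * Q(V) + (1 - a) * Q(U) + 1e-12
print("both inequalities hold")
```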

Appendix F: The Associated Symmetric Process and the Metropolis Algorithm

We can consider in our setting an extra parameter \(\beta\in\mathbb {R}\), which plays the role of the inverse temperature. For a given fixed potential V we consider the new potential βV, \(\beta\in\mathbb{R}\), and, applying what we did before, we get continuous time equilibrium states described by \(\gamma_{\beta}:=\gamma_{\beta V}\) and \(B_{\beta}:=B_{\beta V}\), in the previous notation. In other words, we consider the associated infinitesimal generator, β>0, and the associated main eigenvalue \(\lambda_{\beta}:=\lambda_{\beta V}\). We denote by L V,β the infinitesimal generator of the process that is the continuous time Gibbs state for the potential βV; then L V,β acts on functions f as \(L^{V,\beta}(f)(x)= \gamma_{\beta}(x) \sum_{\sigma (y)=x}e^{B_{\beta}(y)} [f(y)-f(x) ]\). We are interested in the stationary probability \(\mu_{\beta}:=\mu_{B_{\beta V}, \gamma_{\beta V}}\) for the semigroup \(\{e^{ t L^{V,\beta} }, t\geq0\}\), and in its weak limit as β→∞. This limit corresponds to the continuous time Gibbs state at temperature zero (see [7, 25, 28] for related results).

The dual of L V,β on the Hilbert space \({\mathbb{L}}^{2}( \mu_{\beta })\) can be expressed in terms of the Koopman operator. Notice that the probability \(\mu_{\beta}\) is also stationary for the continuous time process with the symmetric infinitesimal generator \(L_{sym}^{V,\beta}:=\frac{1}{2} (L^{V,\beta} + {L^{V,\beta}}^{*})\). In this new process the particle at x can jump to a σ-preimage y with probability \(\frac{1}{2}e^{B_{\beta}(y)}\), or, with probability \(\frac{1}{2}\), to the forward image σ(x); in both cases the jump occurs after an exponential time of parameter \(\gamma_{\beta}(x)\).

The eigenfunction of the continuous time Markov chain with infinitesimal generator \(L_{sym}^{V,\beta}\) can be different from the one for the generator L V,β. Given V and β, we denote by \(\lambda(\beta)_{sym}\) the main eigenvalue obtained from βV and the generator \(L_{sym}^{V,\beta}\). The eigenvalues of \(L^{V,\beta}\) and \({L^{V,\beta}}^{*}\) are the same as before. Now, we look briefly at how to obtain \(\lambda(\beta)_{sym}\). From the symmetry assumption (see [11]), we get, for a fixed β,

The second equality is due to Definition (12), and the last one follows from the expression of the dual of \(L^{V,\beta}\) on \({\mathbb{L}}^{2}(\mu _{\beta})\).

Suppose now that β increases to ∞. One can then ask about the asymptotic behavior of the stationary Gibbs probability \(\mu_{\beta}\). One should first analyze what happens with the optimal (or almost optimal) ϕ in the maximization problem above. In order to answer this question, we use the Cauchy-Schwarz inequality in \({\mathbb{L}}^{2}(\mu_{\beta})\) and obtain

Note that, for a fixed large β, the positive value \(\gamma_{\beta}(x)=1-\beta V(x)+\lambda_{\beta V}\) becomes small near the supremum of V. This means that \(\frac{1}{\gamma_{\beta}(x)}\) becomes large near the supremum of V. Moreover, for fixed β, the part \(\int \beta V |\phi| \frac{1}{ \gamma_{\beta}} \frac{\, \mathrm{d}\mu_{B_{\beta}}}{\int\frac{1}{\gamma_{\beta}} \, \mathrm{d} \mu_{B_{\beta}}} \) of the above expression increases if we consider |ϕ| with more and more of its mass concentrated near the supremum of βV. Note also that, for fixed β, the remaining part of the above expression is bounded and depends only on ϕ. The supremum of \(\int \beta V |\phi| \frac{1}{ \gamma_{\beta}} \frac{\, \mathrm{d}\mu_{B_{\beta}}}{\int\frac{1}{\gamma_{\beta}} \, \mathrm{d} \mu_{B_{\beta}}} \) grows with β at least at order β.

Therefore, for large β, the maximum above should be attained by a ϕ=ϕ β in \({\mathbb{L}}^{2}(\mu_{\beta})\) which is more and more concentrated near the supremum of βV. In this way, as β→∞ the "almost" optimal ϕ tends to localize at the points where the supremum of V is attained. If there is a unique point \(z_{0}\) where V is maximal, then \(\lambda_{\beta}\sim\beta V(z_{0})\). The probability \(\mu_{\beta}\) will converge to the Dirac delta at the point \(z_{0}\). This procedure is quite similar to determining ground states for a given potential via an approximation by Gibbs states at very small temperature (see for instance [1]).
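The concentration described above can be illustrated in the simplest caricature (an assumption of this sketch, not the paper's construction) where the stationary measure has weights proportional to \(e^{\beta V(x)}\) on a finite state space: as β grows, the normalized weights pile up at the unique maximizer of V.

```python
# beta -> infinity concentration on a finite state space (an assumption
# of this sketch): Gibbs weights e^{beta V} normalize to a probability
# that converges to the Dirac delta at the unique maximizer of V.
import numpy as np

V = np.array([0.2, 0.9, 0.5])            # unique maximum at index 1
for beta in (1.0, 10.0, 100.0):
    w = np.exp(beta * (V - V.max()))     # subtract max for stability
    mu = w / w.sum()
    print(beta, mu.round(4))
# mu converges to the Dirac delta at argmax V
```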

The Metropolis algorithm has several distinct applications. In one of them, it is used to maximize a function on a quite large space (see [14, 22]). Suppose V has a unique point of maximal value. The basic idea is to produce a random algorithm that explores the state space and localizes the point of maximum; a deterministic algorithm may get stuck at a local maximum. The use of continuous time paths brings some advantages to the method: the randomness assures that the algorithm does not get stuck at a point of local maximum of the function V. The setting we consider here has several similarities with the usual procedure. When we take β large, the probability \(\mu_{\beta}\) will be very close to the Dirac delta at the point of maximum of V, as we just saw. This is so because the parameter \(\frac{1}{\gamma_{\beta}(x)}\) of the exponential distribution becomes large near the supremum of V. In the classical Metropolis algorithm there is a link between β and t which is necessary for convergence (the cooling schedule in [35]). In a forthcoming paper, using the large deviation results obtained above, we will investigate the following question: given small ϵ and δ, and with t and β chosen in a suitable way, with probability bigger than 1−δ the empirical path on the one-dimensional spin lattice stays within distance ϵ of the maximal value for a proportion 1−δ of the time t.

Lopes, A., Neumann, A. & Thieullen, P. A Thermodynamic Formalism for Continuous Time Markov Chains with Values on the Bernoulli Space: Entropy, Pressure and Large Deviations. J Stat Phys 152, 894–933 (2013). https://doi.org/10.1007/s10955-013-0796-7
