Large Deviations for Finite State Markov Jump Processes with Mean-Field Interaction Via the Comparison Principle for an Associated Hamilton–Jacobi Equation

We prove the large deviation principle (LDP) for the trajectory of a broad class of finite state mean-field interacting Markov jump processes via a general analytic approach based on viscosity solutions. Examples include generalized Ehrenfest models as well as Curie–Weiss spin flip dynamics with singular jump rates. The main step in the proof of the LDP, which is of independent interest, is the proof of the comparison principle for an associated collection of Hamilton–Jacobi equations. Additionally, we show that the LDP provides a general method to identify a Lyapunov function for the associated McKean–Vlasov equation.


Introduction
We consider the large deviation behaviour of a sequence of non-linear Markov jump processes [20] on a finite state space. The non-linearity is, for example, motivated by mean-field dynamics, where the rates depend on the empirical distribution. The limiting behaviour of the processes under consideration is governed by a first-order differential equation, and we study the fluctuations around this limiting behaviour.
Feng and Kurtz [16] propose a general strategy for proving large deviation principles for sequences of processes. Their approach is based on the observation that large deviations have many similarities with weak convergence; e.g. both principles can be defined in terms of an upper bound for closed sets and a lower bound for open sets. The weak limiting behaviour of a sequence of Markov processes can be studied using linear semigroups and their generators. Appropriate convergence of the generators implies weak convergence of the Markov processes on the Skorokhod space. Correspondingly, they show that the large deviation behaviour of a sequence of Markov processes can be studied via their associated non-linear generators. A major step towards proving the large deviation principle consists of showing that the limiting non-linear generator H generates a semigroup. The Crandall-Liggett generation theorem [6] gives sufficient conditions for this, one of which is checking whether the range of (𝟙 − λH), λ > 0, is dense in the Banach space under consideration. In this case, we work with (C(E), ||·||), where E is a compact metric space. An alternative to the range condition is then solving the equation f − λHf − h = 0, λ > 0, h ∈ C(E), in the viscosity sense.
Viscosity solutions of certain types of differential equations were introduced by Crandall and Lions [8], see also [7], and are a powerful method to obtain uniqueness and existence results for these differential equations. For the processes under consideration, we obtain a non-linear operator H and show that the comparison principle holds for the equation f − λHf − h = 0, for any λ > 0 and h ∈ C(E). For this we use methods described in [7] using appropriate distance functions. The comparison principle implies that there is at most one viscosity solution to the equation, which can be given explicitly by a variational formula. These solutions are then used to define an extension Ĥ of H that satisfies the conditions of the Crandall-Liggett theorem. This, in turn, is sufficient to prove the large deviation principle and to show that the large deviation rate function can be decomposed into a rate for the large deviations at time zero and an integral over a 'Lagrangian'.
Similar problems have been considered before. Comets [5] studied the large deviations for Glauber dynamics for a long range spin model and gave a decomposition of the rate function into a static and a dynamical part, where the dynamical part is given by the integral over a Lagrangian. Léonard [22] extends the large deviation principle to a setting that also covers the non-linear jump process case. S. Feng [17] proves the large deviation principle for an empirical jump process with a mean-field interaction term that involves jumps with unbounded rate. The functional analytic approach to the large deviation problem in this paper differs from the approach in any of the papers above and gives the Lagrangian form of the rate function in a straightforward way. Compared to the jump-process results in [22], we cover more general sequences of processes, as the non-linear jump kernel can be different for various n.
The literature on large deviations of diffusion processes is more extensive. The limiting behaviour for these models is described by a McKean-Vlasov equation and is well studied. A basic reference is [10]. Feng and Kurtz [16, Section 13.3] study this problem using methods on which our approach is based, and more recently, large deviations have been proven using weak convergence and control theory methods in [3].
The program to study Gibbs-non-Gibbs transitions via large deviations proposed by van Enter, den Hollander, Fernández, and Redig [1,26] motivates this work. They conjecture an equivalence between a bifurcation in the large deviation rate function and the transition from a Gibbs measure to a non-Gibbs measure. Steps to verify the conjecture of the program have been made in the jump process case in [14,18,19,21,24] and in the diffusion process case in [12]. A basic ingredient of this approach is the path-space large deviation principle. Our general results cover the jump process models considered in these papers and, moreover, hold also for the mean-field Potts model, for any potential with Lipschitz derivative.
The paper is organised as follows. In Section 2, we start with our basic set-up and give some known weak convergence results. In Section 3, we proceed with the large deviation results by Feng and Kurtz [16] and give the main conditions under which the large deviation principle holds in our set-up. We give some examples of situations for which these conditions are satisfied in Section 4. Additionally, we show how these results can be used to obtain H-theorems for certain systems. Proofs of the comparison principle for the examples mentioned in Section 4 are given in Section 5.

Preliminaries and weak convergence results
We consider a compact metric space E ⊆ R^d and, for n ≥ 1, compact metric spaces E_n ⊆ R^d. For every n, we have a Borel measurable map η_n : E_n → E, and we assume that E = lim_n η_n(E_n) in the sense that for every x ∈ E there are x_n ∈ E_n such that η_n(x_n) → x. As usual, we write C(E) and C(E_n) for the continuous functions on E and E_n, and we write C^1(E) for functions f that are defined and once continuously differentiable on a neighbourhood of E in R^d. Similarly, we denote by C^{1,l}(E) the functions on E that can be extended to differentiable functions with Lipschitz derivative on a neighbourhood of E.
In all of our examples, E n will be some discrete subset of R d , and E will be some limiting closed set.In these cases, η n will be the identity map.For example, for n even: Assume that for each n ∈ N, we have a jump process Y n on E n , generated by a bounded infinitesimal generator A n .We denote the associated strongly continuous contraction semigroup by {S(t)} t≥0 .To study the limiting behaviour of the sequence of processes Y n , we map all of them to E. We denote X n := η n (Y n ).The existence of a strongly continuous limiting semigroup {S(t)} t≥0 on C(E) in the sense that for all f ∈ C(E), all T ≥ 0 and all allows us to study to study the limiting behaviour of the process X n .We will consider this question from the point of view of the generators A n .First, we need some definitions, which we will also use for non-linear operators.Definition 2.1.Suppose that for each n we have an operator (B n , D(B n )), For an operator (B, D(B)), we write We say that an operator (B, D(B)) is dissipative: if for all λ > 0 and f, g The dissipativity of B, will imply the injectivity of the inverse operators R(λ, B) := (½ − λB) −1 .An easy way to check dissipativity of an operator is through the positive maximum principle.We say that an operator (B, D(B)), B : D(B) ⊆ C(E) → C(E), satisfies the positive maximum principle if for any two functions f, g ∈ D(B), we have the following: The proof of the following result is straightforward.be an open ball centered at y with radius ε > 0, which has x on its boundary, but S ∩ E = ∅.Then, we say that ν(x) = y − x is normal to E at x.For later reference, we repeat our basic set-up.
Assumption 2.3. Assume that for each n ∈ N, we have a bounded generator (A_n, C(E_n)) of a strongly continuous contraction semigroup {S_n(t)}_{t≥0}. Finally, suppose that there exists an operator (A, C^{1,l}(E)) such that A ⊆ ex-lim_n A_n and which satisfies Af(x) = A(x, ∇f(x)) = ⟨∇f(x), F(x)⟩ for some continuous vector field F : E → R^d that satisfies
(a) F can be extended to a continuous vector field F : R^d → R^d,
(b) there exists some M ≥ 0 such that ⟨x − y, F(x) − F(y)⟩ ≤ M|x − y|^2 for all x, y in R^d; for example, this is satisfied if F is globally Lipschitz,
(c) ⟨ν(x), F(x)⟩ ≤ 0 whenever ν(x) is normal to E at x. Intuitively, this means that the vector field at the boundary of E points into E.
Let Y_n be a Feller process associated to {S_n(t)}_{t≥0} and denote X_n := η_n(Y_n).
Remark 2.4. The domain of A can be extended to C^1(E) without any problems. For the study of the non-linear generator H, it will turn out that the domain C^{1,l}(E) is easier to work with. We keep the domains of both operators the same so that we can work with a single definition of viscosity solutions in Definition 2.7. The restriction to this smaller domain has no effect, as C^1(E) is contained in the graph norm closure of C^{1,l}(E).
Given an operator A of the form Af(x) = ⟨∇f(x), F(x)⟩ as in the assumption above, with a vector field F : E → R^d, we now construct the semigroup that has an extension of A as its generator. The approach follows Section II.3.28 in [13].
First, we extend F to a globally Lipschitz vector field F : R d → R d .From standard results, it follows that there exists a continuous flow Φ : R × R d → R d , i.e.Φ is continuous and for all s, t ∈ R, x ∈ R d we have Φ(t+s, x) = Φ(t, Φ(s, x)), Φ(0, x) = x, that satisfies As a consequence, we obtain the following result Theorem 2.6 (Trotter-Kurtz).Let Assumption 2.3 be satisfied and let F be continuously differentiable.Then, we have for every T ≥ 0 and Furthermore, suppose that X n (0) → x 0 in distribution.Then, we have that Proof.The first statement follows by the Trotter-Kurtz approximation theorem [15, Theorem 1.6.1].The final result follows by Corollary 4.8.7 in [15].
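The flow Φ and the transport semigroup f ↦ f(Φ(t, ·)) used in this construction can be illustrated numerically. The following is only a minimal sketch under stated assumptions: the drift `vector_field` with constant rates `r_plus` and `r_minus` on E = [−1, 1] is a hypothetical choice made for illustration (for equal rates it reduces to F(x) = −2x), and the flow is approximated with a classical Runge-Kutta scheme rather than computed exactly.

```python
import math


def vector_field(x, r_plus=1.0, r_minus=1.0):
    """Assumed drift F(x) = (1 - x) * r_plus - (1 + x) * r_minus on E = [-1, 1]."""
    return (1.0 - x) * r_plus - (1.0 + x) * r_minus


def flow(t, x, steps=1000):
    """Approximate Phi(t, x): the value at time t of the solution of
    dy/ds = F(y) with y(0) = x, using the classical fourth-order Runge-Kutta scheme."""
    h = t / steps
    for _ in range(steps):
        k1 = vector_field(x)
        k2 = vector_field(x + 0.5 * h * k1)
        k3 = vector_field(x + 0.5 * h * k2)
        k4 = vector_field(x + h * k3)
        x += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return x


def semigroup(t, f, x):
    """Transport semigroup S(t) f (x) = f(Phi(t, x))."""
    return f(flow(t, x))
```

With these assumed rates the exact flow is Φ(t, x) = x e^{−2t}, so the numerical flow can be checked against it, and the flow property Φ(t + s, x) = Φ(t, Φ(s, x)) can be verified up to discretisation error. Since F(−1) > 0 and F(1) < 0 for this choice, trajectories started in [−1, 1] stay in [−1, 1].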
To use the Trotter-Kurtz approximation theorem, it is essential that C^{1,l}(E) is a core for the generator B = A, which follows from the fact that F is continuously differentiable. To obtain (2.2) under the relaxed condition that F is merely Lipschitz, we introduce a new solution concept for solving the equation f − λAf = h for given h ∈ C(E) and λ > 0. Consider the following differential equation
\[ F(x, f(x), \nabla f(x)) = 0, \qquad x \in E, \tag{2.3} \]
where E is a compact subset of R^d and F : E × R × R^d → R is a continuous function.
Definition 2.7. We say that u is a (viscosity) subsolution of Equation (2.3) if u is bounded, upper semi-continuous and if for every f ∈ C^{1,l}(E) and x ∈ E such that u − f has a local maximum at x, we have
\[ F(x, u(x), \nabla f(x)) \le 0. \]
We say that u is a (viscosity) supersolution of Equation (2.3) if u is bounded, lower semi-continuous and if for every f ∈ C^{1,l}(E) and x ∈ E such that u − f has a local minimum at x, we have
\[ F(x, u(x), \nabla f(x)) \ge 0. \]
We say that u is a (viscosity) solution of Equation (2.3) if it is both a subsolution and a supersolution.
Definition 2.8. We say that Equation (2.3) satisfies the comparison principle if for every subsolution u and supersolution v we have u ≤ v.
Note that if the comparison principle is satisfied, then a viscosity solution is unique.
Denote by R(λ, B) = (𝟙 − λB)^{-1} the resolvent corresponding to the semigroup {S(t)}_{t≥0}. In the case that F is continuously differentiable, we have B = A by Lemma 2.5 and, hence, that Ran R(λ, B) = D(B) = D(A). This in turn implies that (𝟙 − λA)R(λ, B)h = h for all h ∈ C(E) and λ > 0. In the case that F is merely Lipschitz, we will show that f = R(λ, B)h is a viscosity solution to the equation f − λAf = h. We start with some preparations.
(a) Suppose there is a unique x_0 ∈ E such that f(x_0) = sup_x f(x). Suppose that there exists ε_0 > 0 such that for 0 < ε < ε_0, we have (b) Suppose there is a unique x_0 ∈ E such that f(x_0) = inf_x f(x). Suppose that there exists ε_0 > 0 such that for 0 < ε < ε_0, we have The proof of this proposition is inspired by the proof of Theorem 8.27 (a) in [16].
and first suppose that there is a unique Recall that the resolvent equation holds: for all µ > 0, we have where we use in line three that R(µ, B) is a positive contractive operator.As this holds for all µ > 0, we obtain by Lemma 2.9 (a) that Theorem 2.11.Let Assumption 2.3 be satisfied.Let (B, D(B)) be the generator of {S(t)} t≥0 introduced in Lemma 2.5.Then, (a) For each λ > 0 and h ∈ C(E), R(λ, B)h is the unique viscosity solution to (b) We have for every T ≥ 0 and Furthermore, suppose that X n (0) → x 0 in distribution.Then, we have that Proof.First of all, the graph of the operator (B, D(B)) because R(λ, B) = (½ − λB) −1 and the range of R(λ, B) equals D(B).By Proposition 2.10, R(λ, B)h is a viscosity solution to f (x)−λA(x, ∇f (x))−h(x) = 0, which is also the unique solution by the comparison principle proven in Lemma 5.6.Therefore, we obtain by Theorem 6.13 in [16] and (2.5) that for every T ≥ 0 and The final result follows by Corollary 4.8.7 in [15].

Large deviation results
We now turn to the large deviation question that is natural in view of the result of Theorem 2.11. We follow the approach of Feng and Kurtz [16] and extend the assumptions made in Section 2.
Assumption 3.1. Let Assumption 2.3 be satisfied. Define for each n the exponential semigroup
\[ V_n(t) f = \frac{1}{n} \log S_n(t) e^{nf}. \]
The generator of {V_n(t)}_{t≥0} is given by
\[ H_n f = \frac{1}{n} e^{-nf} A_n e^{nf}. \]
Suppose that there exists an operator (H, C^{1,l}(E)) ⊆ ex-lim_n H_n of the form Hf(x) = H(x, ∇f(x)). Additionally, we assume that for every x ∈ E, the map p → H(x, p) is convex and continuous.
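To make the passage from the non-linear generators H_n f = (1/n) e^{−nf} A_n e^{nf} to a limiting Hamiltonian concrete, the following sketch compares the two on a single test function. Everything specific here is an assumption made for illustration: the generator A_n is taken to be a hypothetical one-dimensional chain on [−1, 1] that jumps by ±2/n with rates n(1 − x)/2 · `r_plus` and n(1 + x)/2 · `r_minus`, and the candidate limit is H(x, p) = (1 − x)/2 · r_plus (e^{2p} − 1) + (1 + x)/2 · r_minus (e^{−2p} − 1).

```python
import math


def H_n(f, x, n, r_plus=1.0, r_minus=1.0):
    """(1/n) e^{-n f} A_n e^{n f} at x, for the assumed chain with jumps +-2/n.

    The prefactor n in the jump rates of A_n cancels against 1/n.
    """
    up = 0.5 * (1.0 - x) * r_plus * (math.exp(n * (f(x + 2.0 / n) - f(x))) - 1.0)
    down = 0.5 * (1.0 + x) * r_minus * (math.exp(n * (f(x - 2.0 / n) - f(x))) - 1.0)
    return up + down


def H_limit(x, p, r_plus=1.0, r_minus=1.0):
    """Assumed limiting Hamiltonian H(x, p) evaluated at momentum p = f'(x)."""
    up = 0.5 * (1.0 - x) * r_plus * (math.exp(2.0 * p) - 1.0)
    down = 0.5 * (1.0 + x) * r_minus * (math.exp(-2.0 * p) - 1.0)
    return up + down


def f_test(x):
    """Smooth test function with derivative f'(x) = x."""
    return 0.5 * x * x


# Error |H_n f(x) - H(x, f'(x))| at x = 0.3 for two system sizes.
error_small_n = abs(H_n(f_test, 0.3, 100) - H_limit(0.3, 0.3))
error_large_n = abs(H_n(f_test, 0.3, 10000) - H_limit(0.3, 0.3))
```

For this smooth test function the error shrinks as n grows, which is the pointwise content of H ⊆ ex-lim H_n in this toy setting. Note that H(x, 0) = 0 and that p → H(x, p) is convex, in line with the convexity requirement above.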

Remark 3.2.
In the examples that we will consider, the domain of H can easily be extended to C 1 (E).However, for functions f ∈ C 1 (E) \ C 1,l (E) it is not immediately clear how to construct the trajectory necessary for Condition 3.6 (b).

Non-linear semigroup theory
As in the linear case, we can proceed with our study if we can prove that the semigroups {V_n(t)}_{t≥0} have a limiting semigroup {V(t)}_{t≥0} generated by (H, C^{1,l}(E)). This can be made precise by the Crandall-Liggett theorem, extended with Proposition 5.5 in [16].
Theorem 3.3 (Crandall-Liggett [6]). Suppose that (H, D(H)) is dissipative and that the range condition is satisfied: for all λ > 0, the range of 𝟙 − λH is dense in C(E). Then the limit
\[ V(t) f = \lim_{m \to \infty} \Big( \mathbb{1} - \frac{t}{m} H \Big)^{-m} f \]
exists and defines a strongly continuous contraction semigroup on C(E). Under Assumption 3.1, we have for every T ≥ 0 and
The range condition is usually hard to verify, as classical solutions to f(x) − λH(x, ∇f(x)) = h(x) are hard to find. An alternative is solving this equation in the viscosity sense. This way, we obtain an extension Ĥ of H that satisfies the range condition. That this extension is also dissipative is indicated by the fact that R(λ, H)h being a subsolution exactly means that for f ∈ D(H) and x such that R(λ, H)h(x) − f(x) is maximal, we have Ĥ(R(λ, H)h)(x) − Hf(x) ≤ 0, which corresponds to one of the bounds of the positive maximum principle. Intuitively, if the range condition is satisfied, there is at most one dissipative maximal extension of the operator H.
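The exponential formula behind the Crandall-Liggett theorem, V(t)f = lim_m (𝟙 − (t/m)H)^{−m} f, is easiest to see in the linear case. The sketch below is an illustration only, not part of the paper's argument: it replaces H by the generator Q of a hypothetical two-state Markov chain with jump rates `a` and `b`, iterates the resolvent (𝟙 − hQ)^{−1}, and compares the result with the explicitly known semigroup exp(tQ).

```python
import math


def resolvent(f, h, a, b):
    """Solve (1 - h Q) g = f for the two-state generator Q = [[-a, a], [b, -b]].

    The matrix 1 - h Q equals [[1 + h a, -h a], [-h b, 1 + h b]] and has
    determinant 1 + h (a + b), so it is inverted in closed form.
    """
    det = 1.0 + h * (a + b)
    g0 = ((1.0 + h * b) * f[0] + h * a * f[1]) / det
    g1 = (h * b * f[0] + (1.0 + h * a) * f[1]) / det
    return (g0, g1)


def exponential_formula(f, t, m, a, b):
    """Approximate exp(tQ) f by m resolvent steps of size t / m."""
    g = f
    for _ in range(m):
        g = resolvent(g, t / m, a, b)
    return g


def exact_semigroup(f, t, a, b):
    """exp(tQ) f: relaxation towards the stationary mean at rate a + b."""
    mean = (b * f[0] + a * f[1]) / (a + b)
    decay = math.exp(-(a + b) * t)
    return (mean + decay * (f[0] - mean), mean + decay * (f[1] - mean))


f0 = (1.0, -2.0)
approx = exponential_formula(f0, 1.0, 10000, 1.0, 2.0)
exact = exact_semigroup(f0, 1.0, 1.0, 2.0)
```

Each resolvent step is a contraction in the supremum norm, which is the linear counterpart of dissipativity, and the iterates converge to the exact semigroup as m grows. In the non-linear setting no closed-form inverse is available, which is why the range condition is replaced by solving the resolvent equation in the viscosity sense.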
Theorem 3.4 (Feng and Kurtz [16], Theorem 6.13).Let Assumption 3.1 be satisfied.Furthermore, suppose that for all λ > 0 and h ∈ C(E) the comparison principle holds for Then, (a) For each λ > 0 and h ∈ C(E), there exists a unique viscosity solution to Equation (3.1), which we denote by R(λ, H)h.Also, the operator R(λ, H) : is dissipative and satisfies the range condition.
(c) Ĥ generates a strongly continuous contraction semigroup {V(t)}_{t≥0} on C(E) given by (d) We have for every T ≥ 0 and
Proof. To apply Theorem 6.13 in [16], we need to check that Assumption 3.1 implies that the operators (H_n, C(E_n)) are dissipative and satisfy the range condition. The first statement follows by Lemmas 2.2 and 3.5, the latter of which we prove below. The range condition for (H_n, C(E_n)) follows from [16, Lemma 5.7], as H_n f = (1/n) e^{−nf} A_n e^{nf} for a bounded linear operator A_n.
Lemma 3.5. Suppose the operator (A, C(E)) is given by
\[ Af(x) = \int r(x, dy) \, [f(y) - f(x)], \]
where for every x, r(x, dy) is a non-negative measure of bounded variation. Then the operator (B_n, C(E)) defined by B_n f = (1/n) e^{−nf} A e^{nf} satisfies the positive maximum principle.
Proof. The operator B_n is given by
\[ B_n f(x) = \frac{1}{n} \int r(x, dy) \, \big[ e^{n(f(y) - f(x))} - 1 \big], \]
which directly shows that B_n satisfies the positive maximum principle.
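The monotonicity used in this proof can be checked numerically on a finite state space. The sketch below is illustrative and rests on assumptions: the kernel r(x, dy) is replaced by a hypothetical non-negative rate matrix `rates` on the states 0, ..., d − 1, and the positive maximum principle is tested in the form "if f − g is maximal at x0, then B_n f(x0) − B_n g(x0) ≤ 0".

```python
import math
import random


def nonlinear_generator(rates, f, n):
    """B_n f(x) = (1/n) * sum_y r(x, y) * (exp(n (f(y) - f(x))) - 1) on states 0..d-1."""
    d = len(f)
    return [
        sum(rates[x][y] * (math.exp(n * (f[y] - f[x])) - 1.0) for y in range(d)) / n
        for x in range(d)
    ]


def maximum_principle_gap(rates, f, g, n):
    """Return B_n f(x0) - B_n g(x0), where x0 maximises f - g.

    At x0 we have f(y) - f(x0) <= g(y) - g(x0) for every y, so each
    exponential term for f is dominated by the one for g and the gap is <= 0.
    """
    d = len(f)
    x0 = max(range(d), key=lambda x: f[x] - g[x])
    return nonlinear_generator(rates, f, n)[x0] - nonlinear_generator(rates, g, n)[x0]
```

A quick experiment draws random non-negative rates and random functions f and g and confirms that the returned gap is never positive, up to rounding. Only non-negativity of the rates is used, mirroring the hypothesis of the lemma.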
The semigroup {V(t)}_{t≥0} is of great importance for the large deviation problem, as shown in [16, Theorem 5.15]. However, the characterisation of the semigroup as a limit does not tell us very much. In the next section, we will find an alternative and more informative expression for V(t).

The Nisio semigroup
We introduce a new semigroup, the Nisio semigroup 𝐕(t), for which we will prove that V(t) = 𝐕(t). The new semigroup is given as a variational problem where one optimises a payoff f(γ(t)) that depends on the state γ(t) ∈ E, but where a cost is paid that depends on the whole trajectory {γ(s)}_{0≤s≤t}. This cost is accumulated over time and is given by the Lagrangian. Given a continuous and convex operator Hf(x) = H(x, ∇f(x)), we define this Lagrangian by taking the Legendre-Fenchel transform:
\[ L(x, u) := \sup_{p \in \mathbb{R}^d} \{ \langle p, u \rangle - H(x, p) \}. \]
As p → H(x, p) is convex and continuous, it follows by the Fenchel-Moreau theorem that also
\[ H(x, p) = \sup_{u \in \mathbb{R}^d} \{ \langle p, u \rangle - L(x, u) \}. \]
Let AC be the space of absolutely continuous trajectories in E ⊆ R^d, and let AC_x be the set of trajectories in AC that start in x. Using L, we define the Nisio semigroup for measurable functions f on E:
\[ \mathbf{V}(t) f(x) = \sup_{\gamma \in AC_x} \Big\{ f(\gamma(t)) - \int_0^t L(\gamma(s), \dot{\gamma}(s)) \, ds \Big\}. \]
For the semigroup to be well behaved, we need the following two conditions. (b) For all g ∈ C^{1,l}(E) and x ∈ E, there exists a trajectory γ ∈ AC_x such that for all t ≥ 0:
Lemma 3.7. Suppose Condition 3.6 is satisfied. For all c ≥ 0, the set
Proof. By lower semi-continuity, {L ≤ c} is closed, so we need to prove that the set is contained in some compact set. Without loss of generality, we can assume that Condition 3.6 (a) is satisfied with As a consequence, we have by Young's inequality for (x, u)
Lemma 3.8. Let Condition 3.6 be satisfied. For every f ∈ C^{1,l}(E), there exists a right continuous and non-decreasing function The proof uses the approach given in Lemma 10.21 in [16].
Proof.Denote and for all c > 0 Note that Γ(c) < ∞ as H is continuous by Assumption 3.1 and the compactness of the closure of cN .Assume without loss of generality that Γ(1) = 1.
For the neighbourhood N , we define the Minkowski norm for every c > 0. As H(c) < ∞ by Condition 3.6, this directly yields for every c > 0 The function is an strictly increasing and satisfies r −1 φ(r) → ∞.Now pick f ∈ C 1,l (E), and set Condition 3.6, together with Lemmas 3.7 and 3.8 imply Conditions 8.9, 8.10 and 8.11 in Feng and Kurtz [16].Note that we do not need to work with relaxed controls as in [16], as our control space equals the linear space R d and because L is convex in the second coordinate.As a consequence, we obtain the result stated in Proposition 8.13 in [16].Proposition 3.9.Under Condition 3.6, we have (a) For every M ≥ 0 and T ≥ 0, the set (b) {V(t)} t≥0 is a strongly continuous contraction semigroup on the space of upper semi-continuous functions on E that are bounded from above.
To connect to the resolvents introduced in Theorem 3.4, we define the following variational resolvent: Following the first part of the proof of Theorem 8.27 in [16], we obtain the following important result.
Lemma 3.10. Let Condition 3.6 be satisfied. For λ > 0 and h ∈ C(E), the function R(λ)h is a viscosity solution to f − λHf − h = 0.
As a consequence of the last lemma, we see that if f(x) − λH(x, ∇f(x)) − h(x) = 0 satisfies the comparison principle for all λ > 0 and h ∈ C(E), then R(λ)h = R(λ, H)h as given in Theorem 3.4. Additionally, in Lemma 8.18 in [16] it is proven for f ∈ C(E) that which yields that V(t)f = 𝐕(t)f, i.e. the limiting semigroup coincides with the Nisio semigroup. Using this identification, Feng and Kurtz obtained the following result.
In the next section, we introduce the models for which we will prove the large deviation result using Theorem 3.11 and show that Assumption 3.1 and Condition 3.6 are satisfied.Afterwards, we will prove the comparison principle for these specific models.

The Ehrenfest model in one dimension
We consider a model with n spins, modelled as σ = (σ(1), . . ., σ(n)) ∈ {−1, 1}^n. Let V : R → R be a differentiable function with Lipschitz derivative, to be interpreted as a potential. The spins evolve according to mean-field Markovian dynamics under the influence of the potential V. We let the generator be where σ^i denotes the configuration Let x := (1/n) Σ_{j≤n} σ(j) be the magnetisation of the spin configuration. As the potential is evaluated in this magnetisation, this leads to an effective Markov process {Y_n(t)}_{t≥0} on the set E_n := {−1, −1 + 2/n, . . ., 1 − 2/n, 1} of possible magnetisations.
Let E = [−1, 1] and let η_n : E_n → E be the embedding. The limiting behaviour of X_n = η_n(Y_n) is determined by the limiting generator (A, C^1(E)) ⊆ ex-lim A_n given by Af(x) = ⟨∇f(x), F(x)⟩, where the vector field F is given by Clearly, F is extendable to a vector field on a neighbourhood of [−1, 1], and as F(−1) > 0 and F(1) < 0, Assumption 2.3 is satisfied. The limiting Hamiltonian (H, C^{1,l}(E)) ⊆ ex-lim H_n is given by and, hence, Assumption 3.1 is satisfied. Writing X_n(t) = η_n(Y_n(t)), we aim to prove the large deviation principle for {X_n(t)}_{t≥0} on D_E(R^+), and to obtain the Lagrangian form of the rate function. For this we turn to Theorem 3.11. However, the methods that we use are easily extended to more general limiting schemes.
Instead of the special case of the Ehrenfest model with a potential V , we look at generators A n of the form where r n,+ , r n,− ≥ 0 and the vector field defining the limiting generator Af (x) = ∇f (x), F(x) is given by r − , r + ≥ 0. The corresponding non-linear generator (H, C 1,l (E)) is given by For approximating Markov processes of the form above, we obtain the following theorem.
We are left to prove Condition 3.6 (b). Fix g ∈ C^{1,l}(E) and x ∈ E. We introduce the vector field and consider the operator A_g defined by Note that F_g is Lipschitz, as g ∈ C^{1,l}(E) and E is compact. As we did for A, we find a flow Φ_g satisfying As before, we have a semigroup S_g(t)f(x) := f(Φ_g(t, x)), which is generated by an extension of (A_g, C^1(E)). We denote γ_g(t) = Φ_g(t, x) and obtain that γ_g is absolutely continuous with γ̇_g(t) = F_g(γ_g(t)). Now consider For any x ∈ E, the difference A_g f(x) − Hf(x) is given by and consists of two terms of the form c[b e^a − (e^b − 1)], for some constant c > 0.
The supremum over b ∈ R of such a term is attained at b = a, since the derivative in b, namely e^a − e^b, vanishes only there and the term is concave in b. This implies that sup for all x ∈ E, so in particular for x = γ_g(t).
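The Legendre-Fenchel transform that defines the Lagrangian can be evaluated explicitly for Hamiltonians made of two exponential terms. The sketch below is a hedged illustration: it assumes a one-dimensional Hamiltonian of the form H(p) = a (e^{2p} − 1) + b (e^{−2p} − 1) at a fixed point x, with hypothetical positive coefficients `a` and `b` standing in for the position-dependent jump intensities, and compares a closed-form maximiser with a numerical ternary search.

```python
import math


def hamiltonian(p, a, b):
    """Assumed jump Hamiltonian H(p) = a (e^{2p} - 1) + b (e^{-2p} - 1), with a, b > 0."""
    return a * (math.exp(2.0 * p) - 1.0) + b * (math.exp(-2.0 * p) - 1.0)


def lagrangian_closed_form(v, a, b):
    """L(v) = sup_p {p v - H(p)}.

    Setting the derivative v - 2a e^{2p} + 2b e^{-2p} to zero and writing
    w = e^{2p} gives 2a w^2 - v w - 2b = 0, whose positive root is used below.
    """
    w = (v + math.sqrt(v * v + 16.0 * a * b)) / (4.0 * a)
    p_star = 0.5 * math.log(w)
    return p_star * v - hamiltonian(p_star, a, b)


def lagrangian_numeric(v, a, b, lo=-20.0, hi=20.0, iters=200):
    """Ternary search for the maximum of the concave map p -> p v - H(p)."""
    def objective(p):
        return p * v - hamiltonian(p, a, b)

    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) < objective(m2):
            lo = m1
        else:
            hi = m2
    return objective(0.5 * (lo + hi))
```

For this assumed Hamiltonian the drift is the derivative of H at p = 0, that is v = 2(a − b), and the Lagrangian vanishes exactly there; every other velocity carries a strictly positive cost. This matches the role of the Lagrangian as the cost of deviating from the limiting flow.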

The Ehrenfest model with d dimensional spins
We now consider the evolution of d-dimensional spins: σ = (σ(1), . . ., σ(n)) ∈ ({−1, 1}^d)^n. For k ≤ n, we denote the i-th coordinate of σ(k) by σ_i(k). For each coordinate, we have non-negative functions r^i_{n,+}, r^i_{n,−} : E_n → R^+. The evolution on the level of the magnetisations is given by , where e_i is the vector consisting of 0's, and a 1 in the i-th component.
The limiting generators are given by where r i − , r i + : E → R + and As in the one-dimensional case, it is clear that Assumption 3.1 is satisfied.
Theorem 4.2. Suppose that where r^i_+, r^i_− are Lipschitz continuous and bounded away from 0. Furthermore, suppose that {X_n(0)}_{n≥1} satisfies the large deviation principle on E with good rate function I_0. Then, {X_n}_{n≥1} satisfies the large deviation principle on D_E(R^+) with good rate function I given by
Proof. We check the conditions for Theorem 3.11.
The proof of Condition 3.6 (b) is as in the proof of Theorem 4.1.

Mean-field Markov jump processes
Our third example is on the large deviation behaviour of copies of a Markov process on {1, . . ., d} that evolve under the influence of some mean-field interaction.We have n processes modelled as σ = (σ(1), . . ., σ(n)) ∈ {1, . . ., d} n and we have a representation in δ xi , for some x i ∈ {1, . . ., d} .
Of course, E n can also be seen as a discrete subset of E := P({1, . . ., d}) = {µ ∈ R d | µ i ≥ 0, i µ i = 1} and this directly gives us the embedding η n : E n → E.
We take some jump kernels r n : {1, . . ., d}×{1, . . ., d}×E n → R + , r{1, . . ., d}× {1, . . ., d} × E → R + .The evolution of σ is given by the infinitesimal generator where σ i,b is the configuration obtained from σ by changing the i-th coordinate to b.As an example, we can take which, up to transformation of the spaces E n and E, includes the one-dimensional Ehrenfest model as a special case.On E n , we obtain an effective dynamics Y n with generator If (4.4) is satisfied, we obtain as limiting generators (A, C 1 (E)) of which we easily see that Assumption 2.3 is satisfied, and (H, C 1,l (E)) given by We have the following result regarding the large deviations of the processes Additionally, suppose that {X n (0)} n≥1 satisfies the large deviation principle on E with good rate function I 0 .Then, {X n } n≥1 satisfies the large deviation principle on D E (R + ) with good rate function I given by Proof.Again, we check the conditions for Theorem 3.11.First of all, the comparison principle follows from Proposition 5.9.Condition 3.6 (a) is satisfied with where R = sup a,b,µ∈E r(a, b, µ).Condition 3.6 (b) follows as in Theorem 4.1, by taking the flow corresponding to the semigroup generated by As an example, we will consider Glauber dynamics for the Potts model, see also [9,18].
Example 4.4.We consider Glauber dynamics for the Potts model, see also [9,18].For inverse temperature β ≥ 0 and n ≥ 1, we consider the measure We set e 1 , . . ., e d to be the d unit vectors in R d , then it follows that we can rewrite the above expression as where V : R d → R is given by V (x) = − x, x .Correspondingly, we can define Glauber dynamics for the Potts model by A straightforward calculation gives the limiting Hamiltonian: Clearly, if Y n are the processes generated by A n , and if {X n (0)} n≥1 satisfies the large deviation principle, then Theorem 4.3 yields the large deviation principle for {X n } n≥0 on D E (R + ).

H-theorems
As a small corollary to the large deviation results, we show that we can obtain H-theorems for certain types of dynamics, see e.g. [4, Section III.9] or [25] and references therein. One tries to obtain a Lyapunov function for the solutions to the differential equation ẋ(t) = F(x(t)). Often, an entropy turns out to be a Lyapunov function. The large deviation principle explains this fact and gives a method to obtain a suitable Lyapunov function.
Suppose that we have Markov jump processes on sets E_n generated by A_n of any of the three types given in Sections 4.1, 4.2, or 4.3. Furthermore, suppose that Assumption 3.1 is satisfied with limiting operators (A, C^1(E)) given by Af(x) = ⟨F(x), ∇f(x)⟩ for some Lipschitz vector field F, and non-linear generator (H, C^{1,l}(E)) for which the comparison principle holds for f(x) − λHf(x) − h(x) = 0 for all λ > 0 and h ∈ C(E).
Proposition 4.5. Suppose there exist measures µ_n ∈ P(E_n) that are invariant for the dynamics generated by A_n. Additionally, suppose that the measures µ̃_n defined by µ̃_n(B) = µ_n(η_n^{-1}(B)) satisfy the large deviation principle on E with good rate function I_0. Then I_0 is non-increasing along the flow generated by F.
Note that the conditions for the proposition are satisfied in Example 4.4 with µ_n being the pushforward of ν_n to E_n.
Proof. Fix t ≥ 0 and some starting point x_0. Set x(t) = Φ(t, x_0), where Φ(t, x) is the flow generated by F. We show that I_0(x(t)) ≤ I_0(x_0). Let Y_n(0) be distributed as µ_n. Then it follows by one of the Theorems 4.1, 4.2, 4.3, that the large deviation principle holds for As µ_n is invariant for the Markov process generated by A_n, also the sequence {η_n(Y_n(t))}_{n≥1} satisfies the large deviation principle on E with good rate function I_0. Combining these two facts, the contraction principle [11, Theorem 4.2.1] yields Note that L(x(s), ẋ(s)) = 0 for all s by Equations (4.1) and (4.2), as F = F_0 in terms of the proof of Theorem 4.1.
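The statement I_0(x(t)) ≤ I_0(x_0) can be observed directly in the simplest case. The sketch below is an illustration under explicit assumptions: spins flip independently at rate 1 (infinite temperature), so that the limiting drift is F(x) = −2x with flow Φ(t, x) = x e^{−2t}, and the invariant measures are products of fair ±1 coins, whose magnetisation satisfies the large deviation principle with the Cramér rate function used in `rate_function`.

```python
import math


def rate_function(x):
    """I_0(x) = (1+x)/2 * log(1+x) + (1-x)/2 * log(1-x) on [-1, 1].

    This is the Cramer rate function of the mean of independent fair +-1
    spins, with the convention 0 * log(0) = 0 at the end points.
    """
    def xlogx(u):
        return u * math.log(u) if u > 0.0 else 0.0

    return 0.5 * (xlogx(1.0 + x) + xlogx(1.0 - x))


def flow(t, x):
    """Exact flow of the assumed drift F(x) = -2x."""
    return x * math.exp(-2.0 * t)


def rate_along_flow(x0, times):
    """Evaluate I_0 along the trajectory t -> Phi(t, x0)."""
    return [rate_function(flow(t, x0)) for t in times]


values = rate_along_flow(0.9, [0.1 * k for k in range(30)])
```

Along the sampled trajectory the values of I_0 decrease towards 0, the value at the unique zero x = 0 of the drift, so I_0 acts as a Lyapunov function in this toy case. For interacting models the rate function of the invariant measures plays the same role, although it is rarely available in closed form.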

Viscosity solutions
We start with an introduction to viscosity solutions without explicit reference to the actual problem under consideration. Again consider the first order differential equation where E is a compact subset of R^d and F is a continuous function The notions of viscosity solutions and the comparison principle for (5.1) were given in Definition 2.7. We do not expect the comparison principle to hold for arbitrary differential equations. However, we will assume for all x ∈ E, s, r ∈ R and p ∈ R^d. Furthermore, we assume for the moment that there exists a modulus of continuity ω, i.e. ω : R^+ → R^+ continuous and ω(0) = 0, such that for x, y ∈ E, s ∈ R and α > 0: These two conditions will give the following preliminary proposition. Before proving the proposition, we give an example of how these conditions can be applied. Later on, we will vastly generalise the conditions for examples of this type.
for f ∈ C^{1,l}(E). Our goal is to see whether (H, C^{1,l}[−1, 1]) generates a semigroup, for which we need to check the range condition. In other words, for every λ > 0 and h ∈ C(E), we want to find a solution for Now clearly the difference |h(y) − h(x)| is bounded by ω_h(|x − y|), where ω_h is the modulus of continuity of h. For the second term note that where the inequality follows because the terms in both products have opposite sign. Clearly, this approach generalises to Hamiltonians of the form where r_−, r_+ are continuous and non-negative, r_− is increasing with r_−(−1) = 0, and r_+ is decreasing with r_+(1) = 0.
The proof of Proposition 5.1 relies on the following well known result applied to Ψ(x, y) = 2 −1 |x − y| 2 , see Lemma 3.1 and Proposition 3.7 in [7].
(ii) All limit points of (x α , y α ) are of the form (z, z) and for these limit points we have u(z Proof of Proposition 5.1.Let u be a subsolution and v a supersolution to (5.1).We argue by contradiction, and assume that δ Thus Lemma 5.3 yields α|x α −y α | 2 → 0 and for any limit point z of the sequence x α , we have u(z) − v(z) = sup x∈E u(x) − v(x) > 0.
The proof gives us the following generalisation in the case that we are interested in. We say that Ψ : E^2 → R^+ is a good distance function if Ψ(x, y) = 0 implies x = y, it is lower semi-continuous, differentiable with Lipschitz derivative in both components, and (∇Ψ(·, y))(x) = −(∇Ψ(x, ·))(y) for all x, y ∈ E.
Proof. We directly see that Condition (5.2) is satisfied. In the proof of Proposition 5.1 above, Condition (5.3) was used to bound (5.7) from above. We replace this bound by noting that The first term converges to 0 as α → ∞, and by the assumptions in the corollary the second term can also be made arbitrarily small.
The next lemma gives additional control on the sequences x α , y α .
Then we have that Proof.As v is a supersolution, we see

The comparison principle for linear operators
We start with checking the comparison principle for linear first order differential operators.f (x) − λA(x, ∇f (x)) − h(x) = 0, where (A, C 1,l (E)) is given by Af (x) = ∇f (x), F(x) .
for some vector field F.
As 1 − y α is bounded away from 0 for α large enough, we obtain from the first term that v α = e 2α(xα−yα) −1 is bounded and contains a converging subsequence v α ′ .We obtain as in the proof where z ∈ (−1, 1) that Suppose we can find a further subsequence α ) − 1 is bounded and contains a converging subsequence.This would conclude the proof as in the argument above.
In the case that sup Therefore, we assume x α(k) > 0 for all k.In this case, we use y α(k) > x α(k) > 0 and that r − is bounded away from 0 to write By the bound in (5.10), and the obvious lower bound, we see that the nonnegative sequence For a sequence of real numbers {b(k)} and a sequence of non-negative real numbers {c(k)}, we have the basic inequality lim inf As a consequence, we obtain lim inf This concludes the proof of (5.9) for the case that z = −1.

Multi-dimensional Ehrenfest model
We generalise the Ehrenfest model to d dimensions.
Proof. Let u be a subsolution and v a supersolution to f(x) − λH(x, ∇f(x)) − h(x) = 0. As in the proof of Proposition 5.7, we check the condition for Corollary 5.4. Again, for α and without loss of generality let z be such that x_α, y_α → z. Denote by x_{α,i} and y_{α,i} the i-th coordinates of x_α and y_α, respectively. We prove that the lim inf of the difference of the two Hamiltonians, which is a sum over i of terms of the form (5.11) below, is at most 0, by constructing a subsequence α(n) → ∞ such that the first term in the sum converges to 0. From this sequence, we find a subsequence such that the second term converges to zero, and so on. Therefore, we will assume that we have a sequence α(n) → ∞ for which the first i − 1 terms of the difference of the two Hamiltonians vanish, and prove that we can find a subsequence for which the i-th term
\[ \Big[ \frac{1 - x_{\alpha,i}}{2} r^i_+(x_\alpha) - \frac{1 - y_{\alpha,i}}{2} r^i_+(y_\alpha) \Big] \big[ e^{\alpha (x_{\alpha,i} - y_{\alpha,i})} - 1 \big] + \Big[ \frac{1 + x_{\alpha,i}}{2} r^i_-(x_\alpha) - \frac{1 + y_{\alpha,i}}{2} r^i_-(y_\alpha) \Big] \big[ e^{\alpha (y_{\alpha,i} - x_{\alpha,i})} - 1 \big] \tag{5.11} \]
vanishes. This follows directly as in the one-dimensional case, arguing depending on the situation z_i ∈ (−1, 1), z_i = −1 or z_i = 1.
Clearly, Ψ is differentiable in both components and satisfies (∇Ψ(·, ν))(µ) = −(∇Ψ(µ, ·))(ν). Finally, using the fact that Σ_i µ(i) = Σ_i ν(i) = 1, we find that Ψ(µ, ν) = 0 implies that µ = ν. We conclude that Ψ is a good distance function.
The right hand side is bounded above by our observation above and bounded below by −1, so we take a subsequence of α, also denoted by α, such that the right hand side converges.Also note that for α large enough the right hand side is non-negative.Therefore, it suffices to show that lim inf α→∞ µ α (i)r(i, j, µ α ) ν α (i)r(i, j, ν α ) ≤ 1, which follows as in the proof of Proposition 5.7.

Lemma 2.2. If an operator (B, D(B)) satisfies the positive maximum principle, then it is dissipative.
We say that the range condition for (B, D(B)) is satisfied if for all λ > 0 the range Ran(𝟙 − λB) is dense in C(E). In terms of the inverse operators, this means that the closure of B, defined by taking the graph topology closure of the graph of B, has surjective inverses (𝟙 − λB)^{-1}. Finally, we introduce normal vectors to E in R^d. Let x ∈ E and let S = B_ε(y)

Condition 3.6. The operator (H, C^{1,l}(E)) satisfies (a) There exists a bounded symmetric neighbourhood N in R^d of 0 such that sup_{x∈E, p∈N} H(x, p) < ∞.

Theorem 3.11 (Feng and Kurtz [16], Corollary 8.29). Let Assumption 3.1 be satisfied, let f(x) − λH(x, ∇f(x)) − h(x) = 0 satisfy the comparison principle for every λ > 0 and h ∈ C(E), and additionally assume Condition 3.6. Suppose that the sequence X_n(0) satisfies the large deviation principle with good rate function I_0. Then, {X_n}_{n≥1} is exponentially tight in D_E(R^+) and satisfies the large deviation principle with rate function I given by
\[ I(\gamma) = \begin{cases} I_0(\gamma(0)) + \int_0^\infty L(\gamma(s), \dot{\gamma}(s)) \, ds & \text{if } \gamma \in AC, \\ \infty & \text{otherwise.} \end{cases} \]

Theorem 4.3. Suppose that for all a, b ∈ {1, . . ., d}, we have lim_{n→∞} sup_{µ∈E_n} |r_n(a, b, µ) − r(a, b, η_n(µ))| = 0 (4.4) and that we have (a) The function µ → r(a, b, µ) is either identically equal to 0 or is bounded away from 0. (b) The function µ → r(a, b, µ) can be extended to a neighbourhood U of E in R^d and is Lipschitz continuous on U.

Example 5.2 (The Hamiltonian of the infinite temperature Ehrenfest model). The Hamiltonian of the infinite temperature Ehrenfest model is given by
\[ H(x, p) = \frac{1 - x}{2} \big[ e^{2p} - 1 \big] + \frac{1 + x}{2} \big[ e^{-2p} - 1 \big] \]