Couplings via comparison principle and exponential ergodicity of SPDEs in the hypoelliptic setting

We develop a general framework for studying ergodicity of order-preserving Markov semigroups. We establish natural and in a certain sense optimal conditions for existence and uniqueness of the invariant measure and exponential convergence of transition probabilities of an order-preserving Markov process. As an application, we show exponential ergodicity and exponentially fast synchronization-by-noise of the stochastic reaction-diffusion equation in the hypoelliptic setting. This refines and complements corresponding results of Hairer, Mattingly (2011).


Introduction
The goal of this article is to build a framework for analyzing ergodic properties of order-preserving Markov processes and to provide simple, verifiable, yet sufficiently general conditions for exponential ergodicity. This framework turns out to be especially powerful for investigating ergodicity of order-preserving stochastic PDEs with highly degenerate additive forcing. Our main example is the stochastic reaction-diffusion equation in the hypoelliptic setting. We show that even if noise enters the system only through one Brownian motion, then (under certain conditions) this SPDE has a unique invariant measure and its transition probabilities converge to it exponentially fast in the Wasserstein metric. We also establish exponentially fast synchronization-by-noise of the solutions to this equation. This refines [14, Remark 8.22] and complements [14, Theorem 8.21].
In the mathematical physics literature there is a growing interest in the ergodic behavior of nonlinear PDEs forced by noise that is smooth in space and acts only on a few Fourier modes, see, e.g., [24,25,12,3,19]. Since the noise is smooth in space, it is usually relatively easy to show that these SPDEs have a unique solution and that this solution is a Markov process. On the other hand, since the solution at any fixed time is an infinite-dimensional random variable and the noise acts only on finitely many degrees of freedom, these processes do not get enough noise and hence they are typically only Feller but not strong Feller. This makes analyzing their ergodic behavior much more challenging.
Indeed, recall that ergodicity of strong Feller processes can be established using the classical approach, which combines a local mixing condition on a certain set (the small set condition) and a recurrence condition, see, e.g., [26]. Unfortunately, this method is usually not applicable to Markov processes which are only Feller and not strong Feller, because they do not have good mixing properties (see also a detailed discussion in [15, Section 1]). To study ergodicity of these processes, three alternative strategies have been suggested recently.
The first approach was developed in [12,13,14]. It introduces the asymptotic strong Feller (ASF) property, which serves as a replacement for the strong Feller property. It is shown there that if a Markov process satisfies ASF as well as certain recurrence and topological irreducibility conditions then the process is exponentially ergodic. Note that for many Markov processes verifying ASF might be rather challenging. In particular, while this method works quite well for stochastic Navier-Stokes (SNS) equations on a torus in the hypoelliptic setting, it is not so clear how to check ASF for the SNS equation on a bounded domain, see the discussion in [8,Section 1].
Another approach establishes exponential ergodicity using generalized couplings [25,9,15,8,20,1]. Recall that a coupling is a pair of stochastic processes with given marginal distributions. By contrast, a generalized coupling is a pair of stochastic processes, whose marginals are not necessarily equal to a prescribed pair of probability distributions, but are in a sense close to this pair. Clearly, constructing a generalized coupling is much easier than constructing a coupling. Furthermore, it is shown in the papers mentioned above that existence of a generalized coupling with certain nice properties yields exponential ergodicity. This approach works quite well for a large class of SPDEs in the effectively elliptic setting (that is, when noise acts in a finite but large enough number of directions), but is less useful for studying SPDEs in the hypoelliptic setting.
Finally, the third main approach was introduced in [15]. It utilizes the notion of a d-small set (a generalization of a small set), which is particularly well adapted to the study of Markov processes with bad mixing properties. This approach provides another set of sufficient conditions for exponential ergodicity, and it works quite well for stochastic delay equations and nonlinear autoregressions. Unfortunately, verifying this set of conditions for SPDEs is rather difficult.
The new approach developed in this paper is in a sense orthogonal to all of the strategies mentioned above. It is specifically targeted at order-preserving Markov semigroups, that is, semigroups which map increasing bounded functions to increasing bounded functions. On the one hand, this significantly reduces the applicability of the approach; for example, it cannot be used to study stochastic Navier-Stokes equations. On the other hand, for order-preserving Markov processes (e.g., stochastic reaction-diffusion equations) it allows one to obtain exponential ergodicity under very weak assumptions; this is rather difficult (or maybe impossible) to achieve with other methods.
The main result (Theorems 2.3 and 2.4) is quite general. It shows that if an order-preserving Markov semigroup additionally satisfies a swap condition (i.e., two Markov processes started with initial conditions $x$, $y$ with $x \preceq y$ can change their order by time 1 with a small but positive probability), then under a standard Lyapunov-type condition as well as a certain technical assumption the process is exponentially ergodic. We also show that this swap condition cannot be omitted.
We apply the obtained theorems to establish exponential ergodicity of stochastic reaction-diffusion equations on a $d$-dimensional torus $\mathbb{T}^d$, $d \in \mathbb{N}$:

$$du(t, \xi) = [\Delta u(t, \xi) + f(u(t, \xi), \xi)]\,dt + \sum_{k=1}^{m} \sigma_k(\xi)\,dW^k(t), \quad \xi \in \mathbb{T}^d,\ t \ge 0, \tag{1.1}$$

where $(W^1, W^2, \dots, W^m)$ are independent standard Brownian motions; $f$, $\sigma_k$ are continuous functions from $\mathbb{R} \times \mathbb{T}^d$ to $\mathbb{R}$ and from $\mathbb{T}^d$ to $\mathbb{R}$, respectively, satisfying certain conditions. It is clear that if $m = 0$ (no noise), then this equation might have multiple invariant measures. On the other hand, if $m = \infty$ (noise acts in every direction), then the process $u$ is strong Feller and it can be shown by the classical methods that it has a unique invariant measure [4, Sections 7 and 11]. Thus, it is natural to ask what is the smallest number $m$ of directions that have to be perturbed by noise so that equation (1.1) still has a unique invariant measure. Using the ASF method described above, it was shown in [14, Remark 8.22] that if $f$ is a polynomial and $m = 3$, then equation (1.1) has a unique invariant measure and is exponentially ergodic. We extend this result and show exponential ergodicity of $u$ already if $m = 1$ (that is, when noise acts only in one direction), see Theorem 4.5 and Remark 4.6; we also do not rely on a specific form of $f$. Furthermore, in Theorem 4.8, we show that any two solutions to (1.1) launched with the same noise from different initial conditions converge to each other exponentially fast (synchronization by noise).
The idea that order preservation helps to obtain better convergence rates of a Markov process is not new; it can probably be traced back to the works of Tweedie, Roberts and Lund from the late 1990s [23,27]. Note however that the methods developed in those papers rely on the small set condition. Since in the current paper we study processes with bad mixing properties, where this condition might not hold, the ideas of [23,27] cannot be applied in our case.
It is interesting to compare our results with [6]. In that paper the authors consider an order-preserving random dynamical system (RDS) with two additional properties: it has a unique invariant measure and it weakly converges to this measure. It is shown there that this implies that any two trajectories of the RDS converge to each other in probability. By contrast, in the current paper we start with an order-preserving Markov process and prove uniqueness of an invariant measure and convergence of transition probabilities.
Our main tool is a new version of the coupling method specifically tailored for order-preserving Markov processes, see the proof of Theorem 2.3. This is combined with an analysis of the relations between stochastic domination and expected distance of random variables, see Section 3. There we continue the study initiated in [6, Proposition 2.4]. Note however that the methods introduced in [6] cannot be used to get a quantitative bound even in the case where the Markov process has state space $\mathbb{R}$, see Example 3.2. Therefore we apply a new technique.
While we study in detail only the stochastic reaction-diffusion equations on the torus, the strategy developed in this paper should also work in a very similar way for other order-preserving SPDEs, including stochastic reaction-diffusion equations on a bounded domain and stochastic porous medium equations. It might also be possible to apply this method to obtain exponential ergodicity of certain singular SPDEs (for example the $\Phi^4_2$ and $\Phi^4_3$ models), but this will be the subject of further research.

The rest of the paper is organized as follows. We present our main results in Section 2. In Section 3 we investigate the relations between stochastic domination and average distance of random variables. Section 4 is devoted to a detailed study of ergodicity of stochastic reaction-diffusion equations. The proofs of the main results are placed in Section 5.
Convention on constants. Throughout the paper $C$ denotes a positive constant whose value may change from line to line.

Acknowledgements. [...]nos Dareiotis for their help, patience and detailed explanations of some parts of the theory of parabolic PDEs. We would also like to thank Máté Gerencsér, Lenya Ryzhik and Alessandra Lunardi for useful comments. Part of the work on the project was done during the visit of OB to the Institute of Science and Technology Austria (IST). OB is very grateful to IST Austria for their support and hospitality.

A general framework for ergodicity for order-preserving Markov processes
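We begin by introducing the basic notation. Let $(E, \rho)$ be a Polish space with a partial order $\preceq$ such that the set

$$\Gamma := \{(x, y) \in E \times E : x \preceq y\} \tag{2.1}$$

is closed. Let $\mathcal{E}$ be the Borel $\sigma$-algebra on $E$ and let $\mathcal{P}(E)$ be the set of probability measures on $(E, \mathcal{E})$. Let $\{P_t(x, A),\ x \in E,\ A \in \mathcal{E}\}_{t \ge 0}$ be a Markov transition function over $E$ and denote by $\{P_x,\ x \in E\}$ the corresponding Markov family; that is, $P_x$ is the law of the Markov process $\{X_t,\ t \ge 0\}$ with the given transition function and initial condition $X_0 = x$. The law of $X$ will be understood in the sense of finite-dimensional distributions; that is, we will not rely on the trajectory-wise properties of $X$.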
For a measurable function $r : E \times E \to [0, 1]$, we consider the corresponding coupling distance $W_r : \mathcal{P}(E) \times \mathcal{P}(E) \to \mathbb{R}_+$ given by

$$W_r(\mu, \nu) := \inf_{\lambda \in \mathcal{C}(\mu, \nu)} \int_{E \times E} r(x, y)\, \lambda(dx, dy),$$

where $\mathcal{C}(\mu, \nu)$ is the set of all couplings between $\mu$ and $\nu$, i.e., probability measures on $(E \times E, \mathcal{E} \otimes \mathcal{E})$ with marginals $\mu$ and $\nu$. If $r$ is a lower semicontinuous metric on $E$, then $W_r$ is the usual Kantorovich-Wasserstein distance. If $r$ is the discrete metric, i.e., $r(x, y) = \mathbb{1}(x \neq y)$, then $W_r$ is the total variation distance.

Let us now recall the standard definitions related to the partial order $\preceq$; we refer to, e.g., [21, Chapter IV] for a detailed discussion.
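Definition 2.1. (i) A function $f : E \to \mathbb{R}$ is called increasing if $f(x) \le f(y)$ for any $x, y \in E$ such that $x \preceq y$.
(ii) Let $\mu, \nu \in \mathcal{P}(E)$ be two probability measures. We say that $\nu$ stochastically dominates $\mu$ and denote it by $\mu \preceq_{st} \nu$ if for any bounded measurable increasing function $f : E \to \mathbb{R}$ one has $\int_E f\, d\mu \le \int_E f\, d\nu$.
(iii) We say that a Markov transition function $\{P_t\}_{t \ge 0}$ is order preserving if for any $t > 0$ and $x, y \in E$ such that $x \preceq y$ we have $P_t(x, \cdot) \preceq_{st} P_t(y, \cdot)$.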
In other words, a Markov transition function is order preserving if it maps bounded increasing functions to bounded increasing functions. Examples of Markov processes with an order-preserving transition function include stochastic reaction-diffusion equations, stochastic porous media equations and others, see, e.g., [6].
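To make stochastic domination concrete, here is a minimal Python sketch on $E = \mathbb{R}$ with the usual order. It is only an illustration and not part of the argument: the two normal laws, the helper name `monotone_coupling` and the sample size are assumptions chosen for the example. Feeding one shared uniform variable into both quantile functions produces a pair $(X, Y)$ with the prescribed marginals and $X \le Y$, which is the kind of ordered coupling that characterizes $\mu \preceq_{st} \nu$ on the real line.

```python
import random
from statistics import NormalDist


def monotone_coupling(mu, nu, n_samples, seed=0):
    """Sample pairs (X, Y) with X ~ mu and Y ~ nu driven by one shared uniform."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_samples):
        # Keep the uniform draw strictly inside (0, 1) so that inv_cdf is defined.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        pairs.append((mu.inv_cdf(u), nu.inv_cdf(u)))
    return pairs


# Illustrative choice (an assumption): nu is mu shifted to the right,
# so nu stochastically dominates mu.
mu = NormalDist(mu=0.0, sigma=1.0)
nu = NormalDist(mu=0.5, sigma=1.0)

pairs = monotone_coupling(mu, nu, 10_000)
ordered = all(x <= y for x, y in pairs)
print("X <= Y for every sampled pair:", ordered)
```

The quantile construction is specific to totally ordered spaces such as $\mathbb{R}$; on a general partially ordered Polish space the existence of an ordered coupling is an abstract statement and no such explicit recipe is claimed here.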
Remark 2.2. Strassen's theorem (see, e.g., [21, Theorem IV.2.4]) provides the following coupling definition of stochastic domination, which is equivalent to the one stated above. We have $\mu \preceq_{st} \nu$ if and only if there exist random elements $X, Y : \Omega \to E$ such that $\mathrm{Law}(X) = \mu$, $\mathrm{Law}(Y) = \nu$ and $X \preceq Y$.

Now we are ready to present our main results.
Theorem 2.3. Assume that there exist functions $V$ and $\varphi$ on $E$ and a function $d$ on $E \times E$ such that the following conditions hold:
(1) the Markov transition function $(P_t)_{t \ge 0}$ is order-preserving;
(2) the function $V$ is a Lyapunov function, that is, there exist constants $\gamma, K > 0$ such that the Lyapunov inequality (2.2) holds;
(3) if $x, y \in E$ and $x \preceq y$, then $0 \le d(x, y) \le \varphi(y) - \varphi(x)$;
(4) for any $x \in E$ we have $M(x) := \sup_{t \ge 0} P_t \varphi^2(x) < \infty$;
(5) there exist sets $A, B \in \mathcal{E}$ and $\varepsilon > 0$ such that $A \preceq B$ and for any $x \in \{V \le 4K/\gamma\}$ we have

$$P_1(x, A) > \varepsilon \quad \text{and} \quad P_1(x, B) > \varepsilon. \tag{2.3}$$

Then for any $\theta > 0$ there exist constants $C, \lambda > 0$ such that for any $x, y \in E$ the distance $W_{d \wedge 1}(P_t(x, \cdot), P_t(y, \cdot))$ satisfies the exponential bound (2.4).

Theorem 2.4. Suppose that the conditions of Theorem 2.3 hold together with the additional conditions (6) and (7). Then the Markov semigroup has a unique invariant measure $\pi$. Further, for any $\theta > 0$ there exist constants $C, \lambda > 0$ such that for any $x \in E$

$$W_{\rho \wedge 1}(P_t(x, \cdot), \pi) \le C (1 + V(x))^{\theta} \exp(-\lambda t), \quad t \ge 0. \tag{2.5}$$

Sketch of the proof of Theorems 2.3 and 2.4. Here, for the convenience of the reader, we provide just a very brief roadmap of the proof; a complete proof is given in Section 5. Fix $x, y \in E$, $t > 0$. The proof splits into two independent parts. First, we use a new version of the coupling method and utilize conditions (1), (2), (5) to construct random elements $Z^x$, $Z^y$, $\tilde Z^x$ taking values in $E$ and satisfying the bound (2.6) with some universal constants $C_1, C_2 > 0$. Second, using the ideas developed in Section 3, we transform the bound (2.6) into inequality (2.7), where $C_3, C_4 > 0$ are again some universal constants. It is at this step that we use conditions (3) and (4). Since $\mathrm{Law}(Z^x) = P_t(x, \cdot)$ and $\mathrm{Law}(Z^y) = P_t(y, \cdot)$, inequality (2.7) yields (2.4) and (2.5).
Theorems 2.3 and 2.4 provide general sufficient conditions for an order-preserving Markov process to be ergodic. Condition (3) of Theorem 2.3 is actually a condition on the space $(E, \preceq)$ and $d$ rather than on the Markov semigroup. As shown below, it is satisfied in many natural situations, including $E = \mathbb{R}$ and $E = L^p$, $p \ge 1$. Thus, the only additional assumption for exponential ergodicity (apart from the standard Lyapunov and moment-type conditions) is the swap condition (5). It says that the state space $E$ contains two sets, one preceding the other, and that, locally uniformly in the initial condition, the Markov process has a small chance to be in either of these sets. The following simple example explains why this condition cannot be dropped.
Example 2.5. Introduce the following trivial order on $E$: for $x, y \in E$ we have $x \preceq y$ if and only if $x = y$. It is clear that for this order any Markov semigroup is order-preserving. We also see that the set $\Gamma$ defined in (2.1) is closed; furthermore, conditions (3) and (4) of Theorem 2.3 trivially hold with $\varphi \equiv 0$. Thus, any Markov process on $E$ that has a Lyapunov function satisfies conditions (1)-(4) of Theorem 2.3. It is well known that this is not enough for uniqueness of the invariant measure. Thus, the swap condition (5) cannot be omitted.
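The mechanism behind Example 2.5 can be seen on the smallest possible state space. The following Python sketch is an illustration under stated assumptions: the two-point space, the identity kernel and the helper `push_forward` are hypothetical choices, and $A \preceq B$ is read as "every point of $A$ precedes every point of $B$", which under the trivial order forces $A = B = \{a\}$ for a single state $a$. The chain that never moves is order-preserving for the trivial order, every probability vector is invariant, and no single state is reached from both starting points with positive probability, so a swap condition of type (5) fails.

```python
def push_forward(pi, P):
    """Return the row vector pi P for a finite-state transition matrix P."""
    n = len(P)
    return [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]


# Identity kernel on the two-point space {0, 1}: the chain never moves.
P = [[1.0, 0.0],
     [0.0, 1.0]]

candidates = {
    "delta_0": [1.0, 0.0],
    "delta_1": [0.0, 1.0],
    "mixture": [0.3, 0.7],
}

# Every candidate is invariant, so the invariant measure is not unique.
invariant = {name: push_forward(pi, P) == pi for name, pi in candidates.items()}

# Under the trivial order (assumed reading of A <= B) the sets A and B must
# coincide with one singleton {a}; check whether some state a is hit from
# every starting point with positive probability.
swap_possible = any(min(P[x][a] for x in range(2)) > 0.0 for a in range(2))

print(invariant, swap_possible)
```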
We would also like to emphasize that the swap condition (5) is very different in nature from the small set condition or other minorization-type conditions, which are imposed within the classical framework, see, e.g., [28]. Indeed, a minorization condition guarantees good mixing properties of transition kernels and, in particular, bounds on the total variation distance between the kernels. On the other hand, the swap condition does not yield such bounds since nothing is assumed about mixing on the sets $A$ and $B$. Lemma 5.9 shows how the swap condition can be verified for the stochastic reaction-diffusion equation.

Now let us provide natural examples of spaces $E$ for which condition (3) of Theorem 2.3 holds.
Example 2.6. Put $E = \mathbb{R}$ equipped with the standard distance, $d(x, y) := |x - y|$, $x, y \in \mathbb{R}$, and consider the standard order $\le$. Then condition (3) holds for the function $\varphi(x) := x$, $x \in \mathbb{R}$.

Example 2.7. Put $E := L^p(D, \mathbb{R})$, where $p \ge 1$ and $D \subset \mathbb{R}^n$, $n \in \mathbb{N}$, is an arbitrary domain. Let $\|\cdot\|_{L^p}$ be the standard $L^p$ norm in this space. Consider the following partial order:

$$x \preceq_{pos} y \ \text{ if and only if } \ x(\xi) \le y(\xi) \text{ for almost all } \xi \in D, \tag{2.8}$$

and let $d(x, y) := \|x - y\|^p_{L^p}$, $x, y \in L^p(D, \mathbb{R})$. Then there exists a function $\varphi$ such that the partial order $\preceq_{pos}$ and the function $d$ introduced above satisfy condition (3) of Theorem 2.3. Furthermore, the set $\Gamma$ defined in (2.1) is closed. We postpone the proof of this statement to Section 5.

Stochastic domination and distance between random variables
In this section we explore the connections between stochastic domination and expected distance of random variables, thus continuing the analysis initiated in [6]. Recall that we are given a Polish space E with metric ρ. The following statement played a key role in establishing synchronization-by-noise results in [6].
Suppose that $X_t$ and $Y_t$ converge weakly as $t \to \infty$ to a random element with the law $\mu$. Then $\rho(X_t, Y_t) \to 0$ in probability as $t \to \infty$.

It is natural to ask whether the above statement can be quantified. More precisely, assume additionally that the bounds (3.1) hold for some rate function $r$ going to 0 as $t \to \infty$. One can ask whether these bounds guarantee a quantitative estimate on $\mathsf{E}[\rho(X_t, Y_t) \wedge 1]$. Quite surprisingly, the answer to this question is negative: without any additional assumptions $\mathsf{E}[\rho(X_t, Y_t) \wedge 1]$ might tend to 0 very slowly even when $r(t)$ converges exponentially fast. This is illustrated by the following example.
Example 3.2. Let $E = \mathbb{R}$ be equipped with the standard order $\le$. Let $(p_i)_{i \in \mathbb{Z}_+}$ be a sequence of positive numbers summing up to 1. Let $X$ be a random variable which with [...] Then for each $n$ we clearly have $X_n \le Y_n$ and it is immediate to see that for $\mu := \mathrm{Law}(X)$ one has [...] On the other hand, [...] which is much slower than $p_n 2^{-n}$.
Thus, to quantify the bounds in (3.1) we need to impose extra assumptions. This is where condition (3) of Theorem 2.3 comes into play.
Proof. We begin by observing that since $X \preceq Y$ a.s., condition (3) of Theorem 2.3 yields [...]. Then we continue (3.4) as follows, using the fact that $d$ is bounded by 1: [...] Since $K > 1$ was arbitrary, this yields the statement of the lemma.
The lemma above will be very useful for obtaining exponential bounds on synchronization by noise. On the other hand, to complete the second part of the proof of Theorem 2.3 (see its sketch above in Section 2), we need to solve the opposite problem. More precisely, Lemma 3.3 considers the case when one random element is less than the other everywhere but their laws are different. The following lemma studies the situation where the laws of the random elements coincide, but one is less than the other with probability smaller than 1.
Then for any $p, q > 1$ such that $1/p + 1/q = 1$ we have [...]

Again, we see that the statement of the lemma is satisfied only under some extra conditions (one needs condition (3) of Theorem 2.3 and a reasonable bound on $\mathsf{E}|\varphi(X)|^q$). The following two examples show that these extra conditions cannot be dropped.
Then it is easy to see that $f_k \preceq_{pos} f_l$ whenever $k \le l \le n$. Let $\xi$ be uniformly distributed on the set $\{1, 2, \dots, n\}$. Put $X := f_\xi$ and $Y = \tilde X := f_{\xi + 1}$, where the summation is taken mod $n$. Then $X$ and $\tilde X$ are bounded, [...]

It is easy to see that in this example condition (3) of Theorem 2.3 does not hold. Indeed, if a function $\varphi$ satisfies this condition, then for any $n \in \mathbb{N}$, $k \in \{1, 2, \dots, n\}$ we have [...] (3.5), where $\mathbf{1}$ and $\mathbf{0}$ denote the elements of $E$ identically equal to 1 and 0, respectively. Since $n$ was arbitrary, we see that (3.5) implies that $\varphi(\mathbf{1})$ cannot be finite, which is impossible.
Let n ∈ N and let X be uniformly distributed on the set {1, 2, . . . , n}.
Thus, we see that in this example condition (3) of Theorem 2.3 holds, but E|ϕ(X)| = EX = n/2 + 1/2 can be arbitrarily large for large n.
Proof of Lemma 3.4. We begin by observing that the function $\varphi$ is increasing, that is, $x \preceq y$ implies $\varphi(x) \le \varphi(y)$ for any $x, y \in E$. Fix $p, q > 1$ such that $1/p + 1/q = 1$. Without loss of generality, we can also assume that $\mathsf{E}|\varphi(X)|^q < \infty$ (otherwise the statement of the lemma is trivial).

Exponential ergodicity of the stochastic reaction-diffusion equation in the hypoelliptic setting
In case of ambiguity, a solution to (1.1) with initial condition $x \in L^2(\mathbb{T}^d)$ will be denoted by $u^x$. We refer to [7,5,11] for a detailed discussion of relations between different notions of a solution to SPDEs.
We will make the following assumption on f and σ i .
Assumption A. The function $f$ is jointly continuous and locally Lipschitz in the first variable. Moreover, the following condition holds (dissipativity outside of a compact set): [...] Suppose that $\sigma_k \in C^2(\mathbb{T}^d)$ for each $k = 1, \dots, m$.
Recall the definition of the order $\preceq_{pos}$ in (2.8).

Theorem 4.4. Suppose that Assumption A holds. Then
(i) for any $u_0 \in C^2(\mathbb{T}^d)$ equation (1.1) has a unique analytically and probabilistically strong solution $u$ with initial condition $u_0$; furthermore, $u$ has a version which is continuous on $[0, \infty) \times \mathbb{T}^d$;
(ii) for any $u_0 \in L^2(\mathbb{T}^d)$ equation (1.1) has a unique analytically generalized strong and probabilistically strong solution $u$ with initial condition $u_0$; furthermore, $u$ has a version which is continuous on $(0, \infty) \times \mathbb{T}^d$;
(iii) this solution of equation (1.1) is a homogeneous Markov process with state space $L^2(\mathbb{T}^d)$;
(iv) there exists a set $\Omega' \subset \Omega$ of full measure such that if $x, y \in L^2(\mathbb{T}^d)$ and $x \preceq_{pos} y$, then $u^x(t, \omega) \preceq_{pos} u^y(t, \omega)$, $t \ge 0$, $\omega \in \Omega'$;
(v) the corresponding Markov transition function is order preserving with respect to the order $\preceq_{pos}$;
(vi) there exists a constant $C > 0$ such that the following energy estimates hold for any [...] where the constants $K_1$ and $K_2$ were defined in (4.3).
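The pathwise comparison in item (iv) can be observed in a crude numerical experiment. The Python sketch below is not the construction used in the paper: it assumes $d = 1$, the hypothetical nonlinearity $f(u) = u - u^3$, a hypothetical smooth noise profile `sigma` with $m = 1$, and an explicit Euler finite-difference scheme whose step sizes are chosen small enough that one step is a monotone map on the range of values visited. Two solutions are driven by the same Brownian increments from ordered initial data, and the code records whether the order is kept at every step.

```python
import math
import random


def step(u, dt, h, noise):
    """One explicit Euler step for du = (u_xx + u - u^3) dt + sigma dW, periodic grid."""
    n = len(u)
    new = []
    for i in range(n):
        # u[i - 1] with i = 0 wraps around to the last grid point (periodicity).
        lap = (u[(i + 1) % n] - 2.0 * u[i] + u[i - 1]) / h**2
        new.append(u[i] + dt * (lap + u[i] - u[i] ** 3) + noise[i])
    return new


def simulate(n=32, dt=1e-3, n_steps=2000, seed=0):
    rng = random.Random(seed)
    h = 2.0 * math.pi / n
    grid = [i * h for i in range(n)]
    sigma = [1.0 + 0.5 * math.cos(xi) for xi in grid]  # hypothetical noise profile
    x = [math.sin(xi) - 1.0 for xi in grid]            # lower initial condition
    y = [math.sin(xi) + 1.0 for xi in grid]            # upper initial condition
    ordered = True
    for _ in range(n_steps):
        dW = rng.gauss(0.0, math.sqrt(dt))             # one Brownian increment (m = 1)
        noise = [s * dW for s in sigma]                # same noise for both solutions
        x, y = step(x, dt, h, noise), step(y, dt, h, noise)
        ordered = ordered and all(a <= b + 1e-12 for a, b in zip(x, y))
    # Discrete L^2 distance between the two solutions at the final time.
    gap = math.sqrt(h * sum((b - a) ** 2 for a, b in zip(x, y)))
    return ordered, gap


ordered, gap = simulate()
print("order preserved at every step:", ordered)
print("final discrete L2 gap:", gap)
```

The experiment only illustrates order preservation for the discretized dynamics; it says nothing rigorous about the SPDE itself, and the rate at which the gap shrinks depends on the assumed $f$ and `sigma`.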
The proof of the theorem is given in Section 5.2. Let us make here a couple of remarks about the theorem. Usually in the literature it is assumed additionally that $f$ is a polynomial, or that $f$ is globally Lipschitz, or at least that some growth bounds on $|f(x, \xi)|$ hold, see, e.g., [5, Section 7], [7, Section 4.2], [14, Section 8.3]. Here we establish existence and uniqueness of the solutions to (1.1) without these additional restrictions. The main challenge is that the corresponding Nemytskii operator [...]. Therefore it is difficult to apply here any fixed point principle. Furthermore, if $u^n$ is a sequence of solutions to (1.1) with smooth initial conditions $u^n_0$, and $u^n$ converges to some $u$ in $L^2$, then it is not clear at all that this $u$ is a solution to (1.1). To overcome these obstacles, inspired by some ideas from [7], we have extended the corresponding PDE result [22, Proposition 7.3.1] to $L^2$ initial conditions.

The fact that for irregular initial data ($u_0 \in L^2(\mathbb{T}^d)$) equation (1.1) might not have an analytically strong solution is not surprising. Indeed, even for the standard heat equation $\partial_t u = \Delta u$ on $\mathbb{T}^1$ we have $\|\Delta u(t)\|_{L^2} \le C t^{-1} \|u(0)\|_{L^2}$ and thus $\int_0^t \|\Delta u(s)\|_{L^2}\, ds$ might be infinite.

Now let us present the main result of this section and establish ergodicity of the stochastic reaction-diffusion equation in the hypoelliptic setting. We introduce the following condition.
Assumption B. There exist $\varepsilon > 0$ and $\lambda_k \in \mathbb{R}$, $k = 1, \dots, m$, such that [...]

Theorem 4.5. Suppose that Assumptions A and B hold. Then the stochastic reaction-diffusion equation (1.1) has a unique invariant measure $\pi$. Furthermore, there exist $C > 0$, $\lambda > 0$ such that for all $x \in L^2(\mathbb{T}^d)$ the bound (4.8) holds.

[...] (4.9)

Proof of Theorem 4.5. The Markov semigroup corresponding to (1.1) is order-preserving with respect to $\preceq_{pos}$ by Theorem 4.4(v) and thus condition (1) of Theorem 2.3 holds. Condition (2) of Theorem 2.3 is satisfied thanks to the energy bound (4.6). It follows from Example 2.7 and the definition of $\varphi$ that conditions (3) and (4) of Theorem 2.3 are met. By Lemma 5.9, condition (5) holds. Conditions (6) and (7) hold trivially. Thus all conditions of Theorems 2.3 and 2.4 are satisfied. Therefore the process $u$ has a unique invariant measure and (4.8) holds.
Finally, we establish exponentially fast synchronization by noise for solutions to (1.1): that is, we will prove that any two solutions with initial conditions x, y ∈ L 2 launched with the same noise will converge to each other in probability exponentially fast.
Theorem 4.8. Suppose that Assumptions A and B hold. Then there exist $C > 0$, $\lambda > 0$ such that for any $x, y \in L^2(\mathbb{T}^d)$ the bound (4.10) holds.

Note that the fact that $\mathsf{E}[\|u^x(t) - u^y(t)\|_{L^2} \wedge 1] \to 0$ as $t \to \infty$ follows immediately from Theorem 4.5 and [6, Proposition 2.4]. Unfortunately, as discussed in Section 3, the techniques developed in [6] do not provide a quantitative bound on the convergence rate. Therefore to prove this theorem we use the toolkit developed in Section 3.
Now let us check that all the conditions of Lemma 3.3 hold. Let $E = L^2(\mathbb{T}^d)$ with partial order $\preceq_{pos}$. Set $d(x, y) := \|x - y\|^2_{L^2} \wedge 1$, $x, y \in L^2(\mathbb{T}^d)$, and define $\varphi$ as in (4.9). Then it follows from Example 2.7 that condition (3) of Theorem 2.3 is satisfied. Furthermore, if $d(x, y) = 1$, then we have [...] If $d(x, y) < 1$, then one has [...] where in the second inequality we used the following bound: [...] Combining (4.12) and (4.13), we get [...] where we put $\psi(x) := 4\sqrt{2}\,(1 + \|x\|^2_{L^2})$, $x \in L^2(\mathbb{T}^d)$. Thus, bound (3.2) holds. It is also clear that by definition we have $X \preceq_{pos} Y$. Further we have [...] Finally, by Theorem 4.5 there exist $C, \lambda > 0$ such that [...] Thus, all the conditions of Lemma 3.3 are met. Therefore inequality (3.3) and the energy bound (4.7) yield [...] Since $\mathsf{E}\, d(X, Y) \le 1$, the above bound implies [...] Taking into account (4.11), we obtain (4.10).

Proofs of the results of Section 2
To prove Theorem 2.3 we need a couple of lemmas. The first lemma is quite standard and is in the spirit of [26,Section 15.2]. The lemma deals with the connection between the finiteness of the exponential moment of the first return time of a Markov process to a set and the existence of a Lyapunov function. However we were not able to find in the literature precisely this statement (we found only a number of related ones). Thus, for the convenience of the reader and for the sake of completeness we provide the full proof of the lemma.
Let us consider a measurable space $(\tilde E, \tilde{\mathcal{E}})$ and a Markov kernel $Q$ on it. Let $Z = (Z_n)_{n \in \mathbb{Z}_+}$ be a Markov process with transition kernel $Q$. For a set $A \in \tilde{\mathcal{E}}$ introduce the first return time to $A$: $\tau_A := \inf\{n \ge 1 : Z_n \in A\}$.

[...] (i) We have [...]

Proof. (i). The proof uses a standard argument. First note that it follows from the Lyapunov condition (5.1) and the definition of $r$ that [...] Then by Dynkin's formula for discrete time Markov chains (see, e.g., [26, Theorem 11.3.1]) we have for any [...] If $x \in \tilde E \setminus \{V \le M\}$, then by definition for any $i \in [1, \tau^{(n)}]$ we have $Z_{i-1} \in \tilde E \setminus \{V \le M\}$. Therefore, (5.6) implies for any $i \in [1, n]$ [...] Combining this with (5.7), we get [...] which by Fatou's lemma yields (5.2). If $x \in \{V \le M\}$, then by the Markov property and bound (5.2), [...] which implies (5.3).
(ii). First, let us note that we can assume that $Q(x, A)$ equals exactly $\varepsilon$ for any $x \in \{V \le M\}$. Indeed, this can only increase the random variable $\tau_A$. Introduce random variables [...] We see that the event $\{I_n = 1\}$ corresponds to reaching the desired set $A$ after the $n$th visit to the set $\{V \le M\}$. We have for any [...] Note that by the Markov property and the definition above, for any [...] where in the last inequality we used (5.2) and the fact that $l < r$. Applying the Cauchy-Schwarz inequality and (5.3), we deduce [...] where the last inequality follows from the definitions of $l$ and $r$. Thus, (5.10) together with (5.8) and (5.9) yields [...] which implies (5.5).
Step 1. In this step we fix $x, y \in E$ and $t > 0$. Let $\{X^x(s), s \ge 0\}$ and $\{X^y(s), s \ge 0\}$ be independent Markov processes with the laws $P_x$ and $P_y$, respectively. Let $\mathcal{F}_s := \sigma(X^x(r), X^y(r), r \le s)$, $s \le t$, be the natural filtration. Introduce stopping times

$$\tau^{x \preceq y} := \inf\{n \in \mathbb{Z}_+ : X^x(n) \preceq X^y(n)\}, \qquad \tau^{y \preceq x} := \inf\{n \in \mathbb{Z}_+ : X^y(n) \preceq X^x(n)\}.$$
Note that by definition the stopping times $\tau^{x \preceq y}$ and $\tau^{y \preceq x}$ take only countably many values.
Let us now extend the state space $E$ and add an additional element denoted by ♦. We assume that ♦ $\preceq$ ♦ and do not impose any further partial order relations between ♦ and other elements of $E$.
We claim now that $\mathrm{Law}(\eta^x) \preceq_{st} \mathrm{Law}(\eta^y)$. Indeed, let $f : E \cup \{♦\} \to \mathbb{R}$ be an arbitrary bounded measurable increasing function. Then [...] Note that for any $i = 0, \dots, \lfloor t \rfloor$ we have [...] Recall that the kernel $P_{t-i}$ is order preserving and thus for any $z_1, z_2 \in E$ such that $z_1 \preceq z_2$ we have $P_{t-i} f(z_1) \le P_{t-i} f(z_2)$. Since on the set $\{\tau^{x \preceq y} = i\}$ we have $X^x(i) \preceq X^y(i)$, we can continue (5.12) in the following way: [...] Combining this with (5.11), we finally deduce [...] Since $f$ was an arbitrary bounded measurable increasing function, we see that indeed $\mathrm{Law}(\eta^x) \preceq_{st} \mathrm{Law}(\eta^y)$.
It follows from the construction that $\mathsf{P}(\eta^x = X^x(t)) \ge 1 - \mathsf{P}(\tau^{x \preceq y} > t)$ and $\mathsf{P}(\eta^y = X^y(t)) \ge 1 - \mathsf{P}(\tau^{x \preceq y} > t)$. Thus, applying the gluing lemma twice (see, e.g., [30, p. 23]), we deduce that there exist random variables $Z^x$ and $Z^y$ such that $\mathrm{Law}(Z^x) = \mathrm{Law}(X^x(t))$, $\mathrm{Law}(Z^y) = \mathrm{Law}(X^y(t))$; [...] In a similar way, we construct another pair of random variables $\tilde Z^x$ and $\tilde Z^y$ with the following properties: $\mathrm{Law}(\tilde Z^x) = \mathrm{Law}(X^x(t))$, $\mathrm{Law}(\tilde Z^y) = \mathrm{Law}(X^y(t))$; [...]

Step 3. Now it follows from Step 2 and Lemma 5.2 that [...] (5.13). Thus everything boils down to bounding the tail probabilities of the stopping times $\tau^{y \preceq x}$ and $\tau^{x \preceq y}$ at time $t$. We will do it using Lemma 5.1.
Step 4. First we note that, clearly, $\{(x_1, x_2) \in E \times E : x_1 \preceq x_2\} \supset A \times B$ (recall the definitions of the sets $A$ and $B$ in condition (5) of the theorem). Therefore,

$$\tau^{x \preceq y} \le \inf\{n \in \mathbb{N} : (X^x(n), X^y(n)) \in A \times B\}. \tag{5.14}$$

Now we apply Lemma 5.1(ii) to the state space $\tilde E := E \times E$, the kernel $Q$ on it defined by $Q((x_1, x_2),$ [...], the set $A \times B$, and the Lyapunov function $V(x_1,$ [...] It follows from (2.2) and the Gronwall inequality that [...] This, combined with (5.14) and the Chebyshev inequality, implies [...] Using exactly the same argument, we also get that [...] Substituting these bounds into (5.13), we finally deduce

$$W_{d \wedge 1}(P_t(x, \cdot), P_t(y, \cdot)) \le C (1 + V(x) + V(y)) (1 + M(x)) e^{-\lambda t/2}.$$
Taking the limit in the right-hand side of the above inequality we get $\pi_1 = \pi_2$.
Proof of Example 2.7. First let us show that the set $\Gamma$ defined in (2.1) is closed. Let $(x_n)_{n \in \mathbb{Z}_+}$, $(y_n)_{n \in \mathbb{Z}_+}$ be two sequences of elements of $L^p(D, \mathbb{R})$ such that $x_n \preceq_{pos} y_n$ and $\|x_n - x\|_{L^p} \to 0$, $\|y_n - y\|_{L^p} \to 0$ as $n \to \infty$. We claim now that $x \preceq_{pos} y$. Indeed, by passing to an appropriate subsequence $(n_k)$ we see that $x_{n_k} \to x$ almost everywhere (a.e.) and $y_{n_k} \to y$ a.e. as $k \to \infty$. Since for each $k$ we have $x_{n_k} \le y_{n_k}$ a.e., it is now easily seen that $x \le y$ a.e. and thus $x \preceq_{pos} y$. Therefore, the set $\Gamma$ is closed.

Now let us introduce the function $\varphi$. Put [...] Let us verify that the partial order $\preceq_{pos}$ and the functions $\varphi$ and $d$ satisfy condition (3) of Theorem 2.3. Fix any $x, y \in L^p(D, \mathbb{R})$ such that $x \preceq_{pos} y$. Then, using the inequality $(a + b)^p \le 2^{p-1}(a^p + b^p)$ valid for all nonnegative real $a, b$, we have [...] and thus condition (3) of Theorem 2.3 holds.
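Condition (3) in the $L^p$ setting can be sanity-checked numerically on a grid. The Python sketch below is an illustration under explicit assumptions: the candidate $\varphi(x) = 2^{p-1} \sum_i x_i |x_i|^{p-1}$ is a hypothetical choice (the function used in the proof above may differ), quadrature weights are omitted, and the helper names are invented for this example. The check relies on the elementary inequality $(a + b)^p \le 2^{p-1}(a^p + b^p)$ for mixed signs and on superadditivity of $t \mapsto t^p$ for equal signs, and verifies $0 \le d(x, y) \le \varphi(y) - \varphi(x)$ for random pointwise-ordered vectors.

```python
import random


def d_p(x, y, p):
    """Discrete analogue of d(x, y) = ||x - y||_{L^p}^p (grid weights omitted)."""
    return sum(abs(a - b) ** p for a, b in zip(x, y))


def phi(x, p):
    """Hypothetical candidate phi(x) = 2^(p-1) * sum_i x_i |x_i|^(p-1)."""
    return 2 ** (p - 1) * sum(a * abs(a) ** (p - 1) for a in x)


def check_condition_3(p, n_points=50, n_trials=1000, seed=0):
    """Test 0 <= d(x, y) <= phi(y) - phi(x) on random pairs with x <= y pointwise."""
    rng = random.Random(seed)
    for _ in range(n_trials):
        x = [rng.uniform(-3.0, 3.0) for _ in range(n_points)]
        # y dominates x at every grid point, i.e. x precedes y in the pos order.
        y = [a + rng.uniform(0.0, 2.0) for a in x]
        gap = phi(y, p) - phi(x, p)
        # Small tolerances absorb floating-point rounding in the equality cases.
        if not (0.0 <= d_p(x, y, p) <= gap * (1 + 1e-9) + 1e-9):
            return False
    return True


results = {p: check_condition_3(p) for p in (1, 1.5, 2, 3)}
print(results)
```

A passing check is of course no proof; it only confirms that this particular candidate is consistent with condition (3) on the sampled data.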

Proofs of the results of Section 4
We begin with the following auxiliary statement, which provides Gronwall-type bounds that will be very useful in the sequel.

Lemma 5.3. Let $C_1, C_2, C_3, C_4$ be positive constants. Let $X$ be a continuous $\mathcal{F}_t$-adapted process taking values in $[0, \infty)$ such that $X(0) = x$ and [...] where $M$ is a continuous local $\mathcal{F}_t$-martingale with $M(0) = 0$ and [...] Then the following bounds hold: [...] where $C = C(C_1, C_2, C_3, C_4) > 0$ is some constant independent of $t$ and $x$.
Proof. We begin with the proof of (5.20). For $n \in \mathbb{N}$ introduce the stopping time $\tau_n := \inf\{t \ge 0 : |M(t)| \ge n\}$. Then it follows from (5.18) that for any $n \in \mathbb{N}$, $t \ge 0$ [...] Using Fatou's lemma, we deduce that for any $t \ge 0$ we have $\mathsf{E} X(t) \le x + C_2 t$. This and (5.19) imply that for any $t \ge 0$ we have $\mathsf{E} \langle M \rangle_t \le C_4 t + C_3 x t + C_2 C_3 t^2/2 < \infty$. Therefore $M$ is a martingale and (5.20) follows immediately from (5.18).
To establish (5.21), we introduce a process $Y$, which is a solution to the following equation: [...] (5.22). We claim that for any $t \ge 0$ we have $X(t) \le Y(t)$. Indeed, assume the contrary and suppose that for some $T_0 > 0$ we have $X(T_0) > Y(T_0)$. Then arguing as in [16, Proof of Proposition 9.2], we introduce [...] Since the processes $X$ and $Y$ are continuous we have $S_0 < T_0$ and $X(t) > Y(t)$ for $t \in (S_0, T_0)$. Then (5.18) and (5.22) imply [...] which contradicts the fact that $X(T_0) > Y(T_0)$. Therefore such $T_0$ does not exist and $X(t) \le Y(t)$ for any $t \ge 0$.
Note now that by Itô's formula and (5.19) where $C_5 := (2C_2 + C_3)^2/(4C_1) + C_4$ and $N$ is a continuous local martingale. Consider a stopping time $\sigma_n := \inf\{t \ge 0 : |N(t)| \ge n\}$. Then, using (5.23), we get for any $0 \le s \le t$ Therefore, the Gronwall inequality yields The proof of (5.21) is completed by passing to the limit in the above inequality using Fatou's lemma.
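The origin of the constant $C_5$ can be made explicit. The computation below is a sketch under two assumptions that are not visible in the text above: that $Y$ solves $dY = (-C_1 Y + C_2)\,dt + dM$, and that (5.19) reads $d\langle M\rangle_t \le (C_3 X(t) + C_4)\,dt$. It also uses $0 \le X \le Y$ and writes $N(t) := 2\int_0^t Y(s)\,dM(s)$:

```latex
\begin{align*}
d\,Y(t)^2 &= 2Y(t)\big(-C_1 Y(t) + C_2\big)\,dt + d\langle M\rangle_t + 2Y(t)\,dM(t) \\
          &\le \big(-2C_1 Y(t)^2 + (2C_2 + C_3)\,Y(t) + C_4\big)\,dt + dN(t), \\
(2C_2 + C_3)\,y &\le C_1 y^2 + \frac{(2C_2 + C_3)^2}{4C_1}, \qquad y \ge 0, \\
d\,Y(t)^2 &\le \big(-C_1 Y(t)^2 + C_5\big)\,dt + dN(t).
\end{align*}
```

The middle line is just the inequality $\big(\sqrt{C_1}\,y - (2C_2+C_3)/(2\sqrt{C_1})\big)^2 \ge 0$, and it is this step that produces the term $(2C_2+C_3)^2/(4C_1)$ in $C_5$.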
In all the remaining theorems and lemmas of this section we assume that Assumption A is satisfied. To establish Theorem 4.4 we split the process $u$ into two parts. The first part, denoted by $w$, is the stochastic convolution, that is, the unique analytically and probabilistically strong solution of the following equation with the initial condition $w(0) = 0$.
Lemma 5.5. For any $v_0 \in C^2(\mathbb{T}^d)$ equation (5.27) has a unique analytically strong solution $v$ with the initial condition $v_0$. Furthermore, $v \in C([0,T], C^2(\mathbb{T}^d))$ and there exists $C > 0$ such that for any t
Proof. We begin with the uniqueness part. Suppose that $v$ and $\bar v$ are two strong solutions to equation (5.27) with the same initial condition $v_0 \in C^2(\mathbb{T}^d)$. Then, by the chain rule where the last inequality follows from assumption (4.4). Since $v(0) = \bar v(0)$, an application of the Gronwall lemma immediately implies that $v(t) = \bar v(t)$ for any $t \in [0,T]$.
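The Gronwall step in the uniqueness argument can be spelled out as follows. This is only a sketch: the constant $K$ is a hypothetical placeholder for whatever constant assumption (4.4) supplies, and the differential inequality is assumed to have the form shown here.

```latex
\[
g(t) := \|v(t) - \bar v(t)\|_{L^2}^2, \qquad
g'(t) \le 2K\,g(t)
\;\Longrightarrow\;
\frac{d}{dt}\Big(e^{-2Kt} g(t)\Big) \le 0
\;\Longrightarrow\;
g(t) \le e^{2Kt}\,g(0) = 0 .
\]
```

Since $g \ge 0$, this forces $g \equiv 0$ on $[0,T]$, which is the claimed uniqueness.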
Finally, to obtain the desired bound on $\|v(t)\|_{L^\infty}$ in terms of $\|v_0\|_{L^2}$ rather than $\|v_0\|_{L^\infty}$, we fix $t \in [0,T]$ and consider again PDE (5.31) with the initial condition $\psi(0,\xi) := (v_0(\xi) \vee 0) + M_t$. By the comparison principle, we have $v(t,\xi) \le \psi(t,\xi)$, $\xi \in \mathbb{T}^d$. (5.32) Since $\psi(0,\xi) \ge M_t$ for all $\xi \in \mathbb{T}^d$ and the drift is nonnegative, it is immediate to see that $\psi(s,\xi) \ge M_t$ for all $s \in [0,t]$, $\xi \in \mathbb{T}^d$. Therefore, taking into account (5.26), we get that Introduce now $\psi^+$, which is the strong solution of the following PDE with the initial condition $\psi^+(0) := \psi(0)$. Using (5.33), we see that the comparison principle implies that Let $p_s$ be the heat kernel on the torus $\mathbb{T}^d$. Then it is straightforward to see that for any Combining this with (5.32) and (5.34), we deduce $v(t,\xi)$ By a similar argument, we get This yields (5.28).
Now let us move on to less regular initial data.
Proof. We begin with the uniqueness part. Suppose that $v$ and $\bar v$ are two analytically generalized strong solutions to equation (1.1) with the same initial condition $v_0 \in L^\infty(\mathbb{T}^d)$.
Fix arbitrary $t > 0$. Arguing as in the proof of Lemma 5.5, we see that by the Gronwall lemma there exists $C = C(t) > 0$ such that for any $\delta \in (0,t)$ we have Since $v(0) = \bar v(0)$, by passing to the limit as $\delta \to 0$ in (5.35), we deduce that $v(t) = \bar v(t)$.
Note again that the function (5.30) is locally Lipschitz in $x$, Hölder in $t$ and continuous in $\xi$. Therefore, by [22, Proposition 7.3.1.i] there exists $\delta > 0$ such that on the interval $[0,\delta]$ equation (5.27) has an analytically generalized strong solution $v$ with initial condition $v_0$ and $v \in C((0,\delta], C^2(\mathbb{T}^d))$. Since $v(\delta) \in C^2(\mathbb{T}^d)$, by Lemma 5.5 we can construct an analytically strong solution $v$ on $[\delta, T]$ with the initial condition $v(\delta)$ and $v \in C([\delta,T], C^2(\mathbb{T}^d))$. By gluing these two solutions together we get an analytically generalized strong solution $v \in C((0,T], C^2(\mathbb{T}^d))$.
To consider even less regular initial data (recall that we are interested in initial conditions from $L^2(\mathbb{T}^d)$) we need the following lemma about approximations of solutions to (5.27).
Let $v^n$ be the analytically strong solution of (5.27) with the initial condition $v_0^n$. Then (iii) if $\bar v$ is an analytically generalized strong solution of (5.27) with the initial condition $v_0$, then for each $t \in [0,T]$ we have $\bar v(t,\xi) = v(t,\xi)$ for almost all $\xi \in \mathbb{T}^d$.
Proof. (i). Let $n, m \in \mathbb{Z}_+$. Then arguing as in the derivation of (5.29) we get by the chain rule and assumption (4.4) By the Gronwall lemma, this implies that Now the statement of the lemma follows immediately from the completeness of the space $C([0,T], L^2(\mathbb{T}^d))$ ([17, Theorem I.4.19]).
(ii). Fix $t \in (0,T]$. By Lemma 5.5, we have the following uniform over $n$ bound Since $v^n(t)$ converges to $v(t)$ in $L^2$, bound (5.36) implies $v(t) \in L^\infty(\mathbb{T}^d)$.
(iii). Let $\bar v$ be an analytically generalized strong solution of (5.27) with the initial condition $v_0$. Then arguing as in part (i) of the lemma, we get for any $\delta > 0$ By passing to the limit as $\delta \to 0$ and using the fact that by definition $\bar v(\delta) \to v_0$ in $L^2$, we get sup However, by part (i) of the lemma This implies that $\bar v(t) = v(t)$ as elements of $L^2(\mathbb{T}^d)$ for any $t \in [0,T]$.
The next lemma establishes existence of solutions to equation (5.27) with $L^2$ initial data. It relies on Lemma 5.7 and extends [22, Proposition 7.3.1].
Lemma 5.8. For any $v_0 \in L^2(\mathbb{T}^d)$ equation (5.27) has a unique analytically generalized strong solution $v$ with the initial condition $v_0$. Furthermore, we have $v \in C((0,T], C^2(\mathbb{T}^d))$.
Proof. The proof of the uniqueness part is the same as in Lemma 5.6.
Let us show existence of a solution to (5.27). Let $(v_0^n)_{n\in\mathbb{Z}_+}$ be a sequence of $C^2(\mathbb{T}^d)$ functions converging in $L^2$ to $v_0 \in L^2(\mathbb{T}^d)$. Let $v^n$ be the analytically strong solution of (5.27) with the initial condition $v_0^n$ (it exists thanks to Lemma 5.5). By Lemma 5.7(i,ii) there exists $v \in C([0,T], L^2(\mathbb{T}^d))$ such that for any $t \in (0,T]$ we have $v(t) \in L^\infty(\mathbb{T}^d)$ and $\|v^n(t) - v(t)\|_{L^2} \to 0$ as $n \to \infty$. (5.37) We claim now that $v$ is an analytically generalized strong solution to (5.27) with the initial condition $v_0$. Indeed, we have by construction $v(0) = v_0$ and $v \in C([0,T], L^2(\mathbb{T}^d))$. Fix any $\varepsilon \in (0,T]$ and let us verify that identity (4.2) holds.
By (5.37), we have $v(\varepsilon/2) \in L^\infty$. Therefore, by Lemma 5.6 there exists a process $\bar v \in C((\varepsilon/2, T], C^2(\mathbb{T}^d))$, which is an analytically generalized strong solution of (5.27) on the interval $[\varepsilon/2, T]$ with the initial condition $v(\varepsilon/2)$. Therefore, by Lemma 5.7(iii) we have $\bar v(t) = v(t)$ for any $t > \varepsilon/2$. Thus, identity (4.2) holds and for any $t \ge \varepsilon$ the function $v(t)$ has a version which is in $C^2(\mathbb{T}^d)$. Since $\varepsilon$ was arbitrary, this implies the statement of the lemma.
$u(t,\xi,\omega) := v(t,\xi,\omega) + w(t,\xi,\omega)$, $t \in [0,T]$, $\xi \in \mathbb{T}^d$, $\omega \in \Omega'$ and $u(t,\xi,\omega) = 0$ for $\omega \in \Omega \setminus \Omega'$. It follows from Lemmas 5.4 and 5.5 that the function $u$ is an analytically strong solution to (1.1) with the initial condition $u_0$. To show that $u$ is adapted, we introduce a function $u^n$, $n \in \mathbb{Z}_+$, which is an analytically and probabilistically strong solution on $[0,T]$ to the following equation where for $x \in \mathbb{R}$, $\xi \in \mathbb{T}^d$ we put $f_n(x,\xi) := f((x \wedge n) \vee (-n), \xi)$. Since $f_n$ is uniformly bounded, it follows from [18, Chapter II, Theorem 2.1] that $u^n$ is well-defined; thus, identity (5.38) holds on some set $\Omega_n$ of full measure. On the other hand, by uniqueness, we have |u n (t, ξ)| n} and $u^n = u$ on $A_n$. Since for each $\omega \in \Omega'$ the function $u$ is bounded, we see that $\mathbb{P}(A_n) \to 1$ as $n \to \infty$ and $A_n \subset A_{n+1}$. This implies that $u^n$ converges to $u$ a.s. as $n \to \infty$. Since each $u^n$ is $(\mathcal{F}_t)$-adapted, their limit $u$ is also $(\mathcal{F}_t)$-adapted. Therefore $u$ is a probabilistically strong solution of (1.1) on $[0,T]$. Since $T$ was arbitrary, this and uniqueness imply that $u$ is an analytically and probabilistically strong solution of (1.1) on $[0,\infty)$. The continuity of $u$ follows from the continuity of $v$ and $w$ (Lemmas 5.4 and 5.5).
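The truncation $f_n$ used in the adaptedness argument simply clamps the first argument of $f$ to $[-n, n]$. Written out case by case (a reminder only, using nothing beyond the definition of $f_n$ given above):

```latex
\[
(x \wedge n) \vee (-n) =
\begin{cases}
-n, & x < -n,\\
x,  & |x| \le n,\\
n,  & x > n,
\end{cases}
\qquad\text{so that}\qquad
f_n(x,\xi) = f(x,\xi) \quad\text{whenever } |x| \le n .
\]
```

In particular, the truncated and the original nonlinearities agree along any trajectory that stays in $[-n,n]$, which is why the truncated solution coincides with $u$ as long as the latter is bounded by $n$.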
(ii). The proof of the uniqueness part is the same as in Lemma 5.6. To show existence, put again $u(t,\xi,\omega) := v(t,\xi,\omega) + w(t,\xi,\omega)$, $t \in [0,T]$, $\xi \in \mathbb{T}^d$, $\omega \in \Omega'$ and $u(t,\xi,\omega) = 0$ for $\omega \in \Omega \setminus \Omega'$. Here $v$ is an analytically generalized strong solution to (5.27) with the initial condition $v(0) = u_0$. It follows from Lemmas 5.4 and 5.6 that the function $u$ is an analytically generalized strong solution to (1.1) with the initial condition $u_0$. To show that $u$ is adapted, we consider $(u_0^n)_{n\in\mathbb{Z}_+}$, a sequence of $C^2(\mathbb{T}^d)$ functions converging in $L^2$ to $u_0 \in L^2(\mathbb{T}^d)$. Let $u^n$ be the analytically and probabilistically strong solution of (1.1) with the initial condition $u_0^n$ (it exists thanks to part (i) of the theorem). By Lemma 5.7(iii), for each $t > 0$ the function $u^n(t)$ converges to $u(t)$ in $L^2$ $\mathbb{P}$-a.s. Since $u^n(t)$ is $(\mathcal{F}_t)$-adapted, we see that $u(t)$ is also $(\mathcal{F}_t)$-adapted. Therefore $u$ is a probabilistically strong solution of (1.1) on $[0,T]$. The proof ends in the same way as in part (i) of the theorem.
(iii). We begin with the following two observations. First, we note that part (ii) of the theorem implies that the generalized strong solution of equation (1.1) is unique.
Second, for $x \in L^2(\mathbb{T}^d)$ let us denote by $u^x$ the generalized strong solution of (1.1) with the initial condition $x$. Then, by the chain rule and Gronwall's inequality, we get that for each $t > 0$ there exists $C(t) > 0$ such that for any $x, y \in L^2(\mathbb{T}^d)$ we have where we also took into account assumption (4.4). Now the Markov property of $u$ follows from these two observations by repeating verbatim the argument from [2, Proposition 4.1].
(iv). Take the set $\Omega'$ defined in Lemma 5.4. Fix $t > 0$, $\omega \in \Omega'$, $x, y \in L^2(\mathbb{T}^d)$ such that $x \preceq_{\mathrm{pos}} y$. Let $u^x$, $u^y$ be the generalized strong solutions of (1.1) with the initial conditions $x$ and $y$ respectively. Let us show now that $u^x(t,\omega) \preceq_{\mathrm{pos}} u^y(t,\omega)$ a.s.
We begin with the case when $x, y \in C^2(\mathbb{T}^d)$. Let $w$ be the stochastic convolution (recall its definition in Lemma 5.4). Let $v^x$, $v^y$ be analytically strong solutions of (5.27) with the initial conditions $x$ and $y$ respectively. Then, by Lemma 5.5, $v^x$ and $v^y$ are continuous on $[0,t] \times \mathbb{T}^d$. Therefore, by the comparison principle [29, Theorem 10.1] we have $v^x(t,\omega) \preceq_{\mathrm{pos}} v^y(t,\omega)$ and hence $u^x(t,\omega) = v^x(t,\omega) + w(t,\omega) \preceq_{\mathrm{pos}} v^y(t,\omega) + w(t,\omega) = u^y(t,\omega)$.
In the general case, we consider $(x_n)_{n\in\mathbb{Z}_+}$ and $(y_n)_{n\in\mathbb{Z}_+}$, two sequences of $C^2(\mathbb{T}^d)$ functions converging in $L^2$ to $x$ and $y$, respectively. By the above, for each $n \in \mathbb{Z}_+$ we have $u^{x_n}(t,\omega) \preceq_{\mathrm{pos}} u^{y_n}(t,\omega)$. (5.39) On the other hand, by part (ii) of the theorem and Lemma 5.7 we have $u^{x_n}(t,\omega) \to u^x(t,\omega)$, $u^{y_n}(t,\omega) \to u^y(t,\omega)$ in $L^2(\mathbb{T}^d)$ as $n \to \infty$. This together with (5.39) yields $u^x(t) \preceq_{\mathrm{pos}} u^y(t)$.
(v). This follows immediately from part (iv) of the theorem and Strassen's theorem.
(vi). Fix $x \in L^2(\mathbb{T}^d)$. Then $u^x$ is an analytically generalized strong solution of (1.1). Therefore, by Itô's lemma, taking into account (4.3), we deduce for any $0 < s \le t$ $\|u^x(t)\|_{L^2}^2$ Using the fact that $u^x$ is continuous in $L^2(\mathbb{T}^d)$, we can pass to the limit in (5.40) as $s \to 0$ to deduce that (5.40) is also valid for $s = 0$. Therefore all the conditions of Lemma 5.3 are satisfied for the process $X(t) := \|u^x(t)\|_{L^2}^2$. Now inequality (4.6) follows from (5.20) and inequality (4.7) follows from (5.21).
Before we formulate and prove the final lemma we recall that under Assumption A the solution $u$ with initial condition $u_0 \in L^2(\mathbb{T}^d)$ satisfies the following mild form of the SPDE (1.1) [5, Theorem 5.4] $u(t,\xi) =$ where we used the Cauchy-Schwarz inequality to obtain convergence of the first term.
For an arbitrary element $x \in L^2(\mathbb{T}^d)$ let $S_x$ and $L_x$ be the sets of elements from $L^2(\mathbb{T}^d)$ that are smaller (respectively larger) than $x$, that is It is easy to see that the sets $S_x$ and $L_x$ are closed. For $a \in \mathbb{R}$ put $\check a(\xi) := a$, $\xi \in \mathbb{T}^d$.
Lemma 5.9. Suppose that Assumptions A and B hold. Let $(P_t)$ be the semigroup associated with equation (1.1). Then for any $M > 0$ there exists $\varepsilon > 0$ such that
Proof. By symmetry it suffices to prove the first claim. We show it in two steps.
Step 1. At this step we will use only Assumption A. We claim that for every $M > 0$ and $T \in (0,1]$ there exist $\Gamma \in \mathbb{R}$ and $t_0 \in (0,T]$ such that $\inf_{x \in L^2(\mathbb{T}^d),\ \|x\|_{L^2} \le M} P_{t_0}(x, S_{\check\Gamma}) > 0$.
First we note that it is sufficient to prove the claim only for large enough $M$. Observe now that by (4.3) there exists $\gamma \in \mathbb{R}$ such that $f(z,\xi) \le 0$ for all $z \ge \gamma$ and all $\xi \in \mathbb{T}^d$. Fix arbitrary $\beta > \gamma$, and assume that $M$ is large enough so that $\|\check\beta\|_{L^2} \le M$. By order preservation (Theorem 4.4 (iv)) it suffices to show that for every $M > 0$, $T \in (0,1]$ there exist $\Gamma \in \mathbb{R}$ and $t_0 \in (0,T]$ such that Fix now arbitrary $x \in L^2(\mathbb{T}^d)$ satisfying $\check\beta \preceq_{\mathrm{pos}} x$ and $\|x\|_{L^2} \le M$. Then, using order preservation again, we see that $\mathbb{P}(\Omega(x)) \ge \kappa$.
Using (5.41), we see that on the set $\widetilde\Omega \cap \Omega(x)$ (which has measure at least $\kappa/2$) the solution $u^x$ satisfies for all $\xi \in \mathbb{T}^d$ $u^x(t_0,\xi)$ where we used the Cauchy-Schwarz inequality in the first inequality and the fact that $f(z,\xi) \le 0$ for $z \ge \gamma$ and $\xi \in \mathbb{T}^d$. The proof of the claim of Step 1 is complete.
Step 2. This is the only part of the proof of Theorem 4.5 where Assumption B is used. We claim that for every $\Gamma \in \mathbb{R}$ and $\tau > 0$ we have $P_\tau(\check\Gamma, S_{\check 0}) > 0$.