1 Introduction

The goal of this article is to build a framework for analyzing ergodic properties of order-preserving Markov processes and to provide simple, verifiable, yet general enough sufficient conditions for exponential ergodicity. This framework turns out to be especially powerful for investigating ergodicity of order-preserving stochastic PDEs with highly degenerate additive forcing. Our main example is the stochastic reaction–diffusion equation in the hypoelliptic setting. We show that even if noise comes to the system only from one Brownian motion, then (under certain conditions) this SPDE has a unique invariant measure and its transition probabilities converges to it exponentially in the Wasserstein metric. We also establish exponentially fast synchronization-by-noise of the solutions to this equation. This refines [18, Remark 8.22] and complements [18, Theorem 8.21].

In the mathematical physics literature there is a growing interest in ergodic behavior of nonlinear PDEs forced by smooth in space noise acting only on a few Fourier modes, see, e.g., [5, 16, 23, 29, 30]. Since the noise is smooth in space, it is usually relatively easy to show that these SPDEs have a unique solution and that this solution is a Markov process. On the other hand, since the solution at any fixed time is an infinite-dimensional random variable and the noise acts only onto finitely many degrees of freedom, these processes do not get “enough” noise and hence they are typically only Feller but not strong Feller, see also the discussion in [15, Section 9]. This makes analyzing their ergodic behavior much more challenging.

Indeed, recall that ergodicity of strong Feller processes can be established using the standard classical approach, which combines a local mixing condition on a certain set (the small set condition) and a recurrence condition, see, e.g., [31]. Unfortunately, this method is usually not applicable for Markov processes which are only Feller and not strong Feller because they do not have good mixing properties, see also a detailed discussion in [19, Section 1]). To study ergodicity of these processes three alternative strategies have been suggested recently.

The first approach was developed in [16,17,18], see also [15, Section 11]. It introduces the asymptotic strong Feller (ASF) property, which serves as a replacement for the strong Feller property. It is shown there that if a Markov process satisfies ASF as well as certain recurrence and topological irreducibility conditions then the process is exponentially ergodic. Note that for many Markov processes verifying ASF might be rather challenging. In particular, while this method works quite well for stochastic Navier–Stokes (SNS) equations on a torus in the hypoelliptic setting, it is not so clear how to check ASF for the SNS equation on a bounded domain, see the discussion in [11, Section 1].

Another approach establishes exponential ergodicity using generalized couplings [2, 11, 12, 19, 24, 30]. Recall that a coupling is a pair of stochastic processes with given marginal distributions. By contrast, a generalized coupling is a pair of stochastic processes, whose marginals are not necessarily equal to a prescribed pair of probability distributions, but are in a sense close to this pair. Clearly, constructing a generalized coupling is much easier than constructing a coupling. Furthermore, it is shown in the papers mentioned above that existence of a generalized coupling with certain nice properties yields exponential ergodicity. This approach works quite well for a large class of SPDEs in the effectively elliptic setting (that is, when noise acts in a finite but large enough number of directions), but is less useful for studying SPDEs in the hypoelliptic setting.

Finally, the third main approach was introduced in [19]. It utilizes the notion of a d-small set (a generalization of a small set), which is particularly well adapted to the study of Markov processes with bad mixing properties. This approach provides another set of sufficient conditions for exponential ergodicity, and it works quite well with stochastic delay equations and nonlinear autoregressions. Unfortunately, verifying this set of conditions for SPDEs is rather difficult.

Our new approach developed in this paper is somehow orthogonal to all of the strategies mentioned above. It is specifically targeted at order preserving Markov semigroups, that is, the semigroups which map increasing bounded functions to increasing bounded functions. On the one hand, this significantly reduces the applicability of this approach; for example, it cannot be used to study stochastic Navier–Stokes equations. On the other hand, for order-preserving Markov processes (e.g., stochastic reaction–diffusion equations) it allows to obtain exponential ergodicity under very weak assumptions; this is rather difficult (or maybe impossible) to achieve with other methods.

The main result (Theorems 2.3 and 2.4) is quite general. It shows that if an order-preserving Markov semigroup satisfies additionally a swap condition (i.e., two Markov processes started with initial conditions x, y with \(x\preceq y\) can change their order by time 1 with a small but positive probability), then under a standard Lyapunov-type condition as well as a certain technical assumption the process is exponentially ergodic. We also show that this swap condition cannot be omitted.

We apply the obtained theorems to establish exponential ergodicity of stochastic reaction–diffusion equations on a d-dimensional torus \(\mathbb {T}^d\), \(d\in \mathbb {N}\)

$$\begin{aligned} d u(t,\xi )=[\Delta u(t,\xi )+f(u(t,\xi ),\xi )]dt+\sum _{k=1}^m \sigma _k(\xi )dW^k(t), \quad \xi \in \mathbb {T}^d,\, t\geqslant 0, \end{aligned}$$
(1.1)

where \((W^1, W^2, \ldots , W^m)\) are independent standard Brownian motions; f, \(\sigma _k\) are continuous functions acting from \(\mathbb {R}\times \mathbb {T}^d\rightarrow \mathbb {R}\) and \(\mathbb {T}^d\rightarrow \mathbb {R}\), respectively and satisfying certain conditions. It is clear that if \(m=0\) (no noise), then this equation might have multiple invariant measures. On the other hand, if \(m=\infty \) (noise acts in every direction), then, under certain additional assumptions on \(\sigma \), the process u is strong Feller and it can be shown by the classical methods that it has a unique invariant measure [6, Sections 7 and 11]. Thus, it is natural to ask what the smallest number of directions m that have to be perturbed by noise is, so that Eq. (1.1) still has a unique invariant measure.

Using the ASF method described above, it was shown in [18, Remark 8.22] that if f is a polynomial and \(m=3\), then Eq. (1.1) has a unique invariant measure and is exponentially ergodic. We extend this result and show exponential ergodicity of u already if \(m=1\) (that is, when noise acts only in one direction), see Theorem 4.5 and Remark 4.6; we also do not rely on a specific form of f. Note that the convergence to the invariant measure is established in the Wasserstein metric; Theorem 4.8 shows that in the case \(m<\infty \) this is optimal. Namely, if no additional assumptions are imposed, then, contrary to the case \(m=\infty \), the transition probabilities of Markov process (1.1) might not converge to the invariant measure in the total variation metric. Finally, in Theorem 4.9, we show that any two solutions to (1.1) launched with the same noise from different initial conditions converge to each other exponentially fast (synchronization by noise).

The idea that order-preservation helps to obtain better convergence rates of a Markov process is not new; it was used to study interacting particle systems since 1970s, see [25]. However, it is not clear at all how to extend the techniques used in this book beyond the framework of interacting particle models. Order-preserving Markov processes on a general state spaces were considered in [28, 32]. However, the methods developed there rely on the small set condition. Since in the current paper we study processes on a general state space with bad mixing properties, where this condition might not hold, unfortunately the ideas of [25, 28, 32] cannot be applied in our case.

It is interesting to compare our results with [8]. In that paper the authors consider an order-preserving random dynamical system (RDS) with two additional properties: it has a unique invariant measure and it weakly converges to this measure. It is shown there that this implies that any two trajectories of the RDS converge to each other in probability. By contrast, in the current paper we start with an order-preserving Markov process and prove uniqueness of an invariant measure and convergence of transition probabilities.

Our main tool is a new version of the coupling method specifically tailored for order-preserving Markov processes, see the proof of Theorem 2.3. This is combined with an analysis of the relations between stochastic domination and expected distance of random variables, see Sect. 3. There we continue the study initiated in [4, Proposition 1] and [8, Proposition 2.4]. Note however, that the methods introduced in [4] and [8] cannot be used to get a quantitative bound even in the case where the Markov process has state space \(\mathbb {R}\), see Example 3.2. Therefore we apply a new technique.

While we study in detail only the stochastic reaction–diffusion equations on the torus, the strategy developed in this paper should also work in a very similar way for other order-preserving SPDEs including stochastic reaction–diffusion equation on a bounded domain and stochastic porous medium equations. We would like to mention also that after the preprint of our paper became available, our technique was extended and adapted to study regularization by noise for singular SPDEs [10].

The rest of the paper is organized as follows. We present our main results in Sect. 2. In Sect. 3 we investigate the relations between stochastic domination and average distance of random variables. Section 4 is devoted to a detailed study of ergodicity of stochastic reaction–diffusion equations. The proofs of the main results are placed in Sect. 5.

Convention on constants. Throughout the paper C denotes a positive constant whose value may change from line to line.

2 A General Framework for Ergodicity for Order-Preserving Markov Processes

We begin by introducing the basic notation. Let \((E,\rho )\) be a Polish space with partial order \(\preceq \) such that the set

$$\begin{aligned} \Gamma := \{(x,y)\in E\times E:x\preceq y\} \end{aligned}$$
(2.1)

is closed. Let \(\mathcal {E}=\mathcal {B}(E)\) be the Borel \(\sigma \)-field. For sets \(A,B\in \mathcal {E}\) we will write \(A\preceq B\) if for any \(a\in A\), \(b\in B\) we have \(a\preceq b\). Denote by \(\mathcal {P}(E)\) the set of all probability measures on \((E,\mathcal {E})\). Let \(\{P_t\}_{t\geqslant 0}=\{P_t(x,A),x\in E, A\in {\mathcal {E}}\}_{t\geqslant 0}\) be a Markov transition function over E and denote by \(\{\mathsf {P}_x, x\in E\}\) the corresponding Markov family; that is \(\mathsf {P}_x\) is the law of the Markov process \(\{X_t, t\geqslant 0\}\) with the given transition function and initial condition \(X_0=x\). The law of X will be understood in the sense of finite-dimensional distributions; that is, we will not rely on the trajectory-wise properties of X.

For a measurable function \(r:E\times E\rightarrow [0,1]\), we consider the corresponding coupling distance \(W_r:{\mathcal {P}}(E)\times {\mathcal {P}}(E)\rightarrow \mathbb {R}_+\) given by

$$\begin{aligned} W_r(\mu ,\nu ) := \inf _{\lambda \in \mathcal {C}(\mu ,\nu )}\int _{E\times E} r(x,y) \lambda (dx,dy),\quad \mu ,\nu \in {\mathcal {P}}(E), \end{aligned}$$

where \(\mathcal {C}(\mu ,\nu )\) is the set of all couplings between \(\mu \) and \(\nu \), i.e., probability measures on \((E\times E,{\mathcal {E}}\otimes {\mathcal {E}})\) with marginals \(\mu \) and \(\nu \). If r is a lower semicontinuous metric on E, then \(W_r\) is the usual Kantorovich–Wasserstein distance. If r is the discrete metric, i.e., \(r(x, y) ={{\,\mathrm{\mathbb {1}}\,}}(x\ne y)\), then \(W_r\) is the total variation distance, which will be denoted further by \(d_{TV}\).

Let us now recall the standard definitions related to the partial order \(\preceq \); we refer to, e.g., [26, Chapter IV] for a detailed discussion.

Definition 2.1

  1. (i)

    A function \(f:E\rightarrow \mathbb {R}\) is called increasing if for any \(x,y\in E\) with \(x\preceq y\) we have \(f(x)\leqslant f(y)\).

  2. (ii)

    Let \(\mu ,\nu \in {\mathcal {P}}(E)\) be two probability measures. We say that \(\nu \) stochastically dominates \(\mu \) and denote it by \(\mu \preceq _{st}\nu \) if for any bounded measurable increasing function \(f:E\rightarrow \mathbb {R}\) we have

    $$\begin{aligned} \int _E fd\mu \leqslant \int _E fd\nu . \end{aligned}$$
  3. (iii)

    We say that a Markov transition function \(\{P_t\}_{t\geqslant 0}\) is order preserving if for any \(t>0\) and \(x,y\in E\) such that \(x\preceq y\) we have

    $$\begin{aligned} P_t(x,\cdot )\preceq _{st}P_t(y,\cdot ). \end{aligned}$$

In other words, a Markov transition function is order preserving if it maps bounded increasing functions to bounded increasing functions. Examples of Markov processes with an order preserving transition function include stochastic reaction-diffusion equations, stochastic porous media equations and others, see, e.g., [8].

Remark 2.2

Strassen’s theorem (see, e.g., [26, Theorem IV.2.4]) provides the following coupling definition of stochastic domination, which is equivalent to the one stated above. We have \(\mu \preceq _{st}\nu \) if and only if there exist random elements \(X,Y:\Omega \rightarrow E\) such that \({{\,\mathrm{Law}\,}}(X)=\mu \), \({{\,\mathrm{Law}\,}}(Y)=\nu \) and \(X\preceq Y\).

Now we are ready to present our main results.

Theorem 2.3

Suppose that there exist measurable functions \(V:E\rightarrow [0,\infty )\), \(\varphi :E\rightarrow \mathbb {R}\), \(d:E\times E\rightarrow [0,\infty )\), such that the following conditions hold:

  1. (1)

    the Markov transition function \((P_t)_{t\geqslant 0}\) is order-preserving;

  2. (2)

    the function V is a Lyapunov function, that is, there exist constants \(\gamma , K>0\) such that

    $$\begin{aligned} P_t V(x)\leqslant V(x)-\gamma \int _0^t P_s V(x)\,ds+Kt,\quad t\geqslant 0,\, x\in E; \end{aligned}$$
    (2.2)
  3. (3)

    if \(x,y\in E\) and \(x\preceq y\), then \(0\leqslant d(x,y)\leqslant \varphi (y)-\varphi (x)\);

  4. (4)

    for any \(x\in E\) we have \(M(x) := \sup _{t\geqslant 0} P_t \varphi ^2(x)<\infty \);

  5. (5)

    there exist sets \(A, B\in \mathcal {E}\) and \(\varepsilon >0\) such that \(A\preceq B\) and for any \(x\in \{V\leqslant 4K/\gamma \}\) we have

    $$\begin{aligned} P_1(x,A)>\varepsilon \text { and } P_1(x,B)>\varepsilon . \end{aligned}$$
    (2.3)

Then for any \(\theta >0\) there exist constants \(C,\lambda >0\) such that for any \(x,y\in E\)

$$\begin{aligned} W_{d\wedge 1}(P_t(x,\cdot ), P_t(y,\cdot ))\leqslant C (1+V(x)+V(y))^{\theta }(1+M(x))^{\theta } \exp (-\lambda t),\quad t\geqslant 0. \end{aligned}$$
(2.4)

Theorem 2.4

Suppose that all conditions of Theorem 2.3 are satisfied. Assume further that

  1. (6)

    there exists \(\delta \in (0,1]\) such that \(\rho \leqslant d^{\delta }\), where \(\rho \) is the metric on the Polish space E.

  2. (7)

    there exists \(K>0\), \(\kappa >0\) such that \(M(x)^\kappa \leqslant K+K V(x)\) for all \(x\in E\).

Then the Markov semigroup has a unique invariant measure \(\pi \). Further, for any \(\theta >0\) there exist constants \(C,\lambda >0\) such that for any \(x\in E\)

$$\begin{aligned} W_{\rho \wedge 1}(P_t(x,\cdot ), \pi )\leqslant C (1+V(x))^{\theta } \exp (-\lambda t),\quad t\geqslant 0. \end{aligned}$$
(2.5)

Proof

(Sketch of the proof of Theorems 2.3 and 2.4) Here, for the convenience of the reader, we provide just a very brief roadmap of the proof; a complete proof is given in Sect. 5.

Fix \(x,y\in E\), \(t>0\). The proof splits into two independent parts. First, we use a new version of the coupling method and utilize conditions (1), (2), (5) to construct random elements \(Z^x, Z^y, \widetilde{Z}^x\) taking values in E with the following properties:

$$\begin{aligned}&{{\,\mathrm{Law}\,}}(Z^x)={{\,\mathrm{Law}\,}}(\widetilde{Z}^x)=P_t(x,\cdot ),\quad {{\,\mathrm{Law}\,}}(Z^y)=P_t(y,\cdot );\nonumber \\&\mathsf {P}(Z^x\preceq Z^y \preceq \widetilde{Z}^x)\geqslant 1-C_1(1+V(x)+V(y))e^{-C_2 t}, \end{aligned}$$
(2.6)

for some universal constants \(C_1, C_2>0\). Second, using the ideas developed in Sect. 3, we transform the bound (2.6) into the following inequality:

$$\begin{aligned} \mathsf {E}[d(Z^x,Z^y)\wedge 1]\leqslant 1-C_3(1+V(x)+V(y))(1+M(x))e^{-C_4 t}, \end{aligned}$$
(2.7)

where \(C_3, C_4>0\) are again some universal constants. It is at this step, where we are using conditions (3) and (4). Since \({{\,\mathrm{Law}\,}}(Z^x)=P_t(x,\cdot )\) and \({{\,\mathrm{Law}\,}}(Z^y)=P_t(y,\cdot )\), inequality (2.7) yields (2.4) and (2.5). \(\square \)

Theorems 2.3 and 2.4 provide general sufficient conditions for an order-preserving Markov process to be ergodic. Condition (3) of Theorem 2.3 is actually a condition on the space \((E,\preceq )\) and d rather than on the Markov semigroup. As shown below it is satisfied in many natural situations, including \(E=\mathbb {R}\), \(E=L_p\), \(p\in [1,\infty )\), \(E={\mathscr {B}}^{\alpha }_{p,\infty }\), \(\alpha <0\), \(p\in [1,\infty )\); the latter stands for the Besov space of regularity \(\alpha \) and integrability p, see, e.g., [1, Section 2.7 and Proposition 2.93]. Thus, the only additional assumption for exponential ergodicity (apart from the standard Lyapunov and moment-type conditions) is the swap condition (5). It tells that the state space E contains two sets, one preceding the other, and locally uniformly in the initial condition the Markov process has a small chance to be in either of these sets. The following simple example explains why this condition cannot be dropped.

Example 2.5

Introduce the following trivial order on E: for \(x,y\in E\) we have \(x\preceq y\) if and only if \(x=y\). It is clear, that for this order any Markov semigroup is order-preserving. We also see that the set \(\Gamma \) defined in (2.1) is closed; furthermore, conditions (3) and (4) of Theorem 2.3 trivially hold with \(\varphi \equiv 0\). Thus, any Markov process on E, that has a Lyapunov function satisfies conditions (5)–(4) of Theorem 2.3. It is well-known that this is not enough for uniqueness of the invariant measure. Thus, the swap condition (5) cannot be omitted.

We also would like to emphasize that the swap condition (5) is very different in nature from the small set condition or other minorization-type conditions, which are imposed within the classical framework, see e.g., [33]. Indeed, a minorization condition guarantees good mixing properties of transition kernels and, in particular, bounds on the total variation distance between the kernels. On the other hand, the swap condition does not yield such bounds since nothing was assumed about mixing on the sets A and B. Lemma 5.9 shows how the swap condition can be verified for the stochastic reaction-diffusion equation.

Now let us provide natural examples of spaces E for which condition (3) of Theorem 2.3 holds.

Example 2.6

Let S be a countable set. Put \(E=\{0,1\}^S\) equipped with the distance \(d(x,y) := \sum _{i=1}^\infty 2^{-i}|x_i-y_i|\), where \(x,y\in \{0,1\}^S\) and \(x=(x_1,x_2,\dots )\), \(y=(y_1,y_2,\dots )\). This space often appears within the context of interacting particle systems. Consider the following partial order: if \(x,y\in \{0,1\}^S\), then

$$\begin{aligned} x\preceq _{inc} y \text { if and only if } x_i\leqslant y_i \text { for all } i\in \mathbb {N}. \end{aligned}$$

Then condition (3) holds for the function \(\varphi (x) := \sum _{i=1}^\infty 2^{-i}x_i\), \(x\in \{0,1\}^S\).

Example 2.7

Put \(E=\mathbb {R}\) equipped with the standard distance, \(d(x,y) := |x-y|\), \(x,y\in \mathbb {R}\), and consider the standard order \(\leqslant \). Then condition (3) holds for the function \(\varphi (x) := x\), \(x\in \mathbb {R}\).

Example 2.8

Put \(E := L_p(D,\mathbb {R})\), where \(p\geqslant 1\) and an arbitrary domain \(D\subset \mathbb {R}^n\), \(n\in \mathbb {N}\). Let \(\Vert \cdot \Vert _{L_p}\) be the standard \(L_p\) norm in this space. Consider the following partial order

$$\begin{aligned} x\preceq _{pos}y \text { if and only if } x(\xi )\leqslant y(\xi ) \text { for almost all }\xi \in D \end{aligned}$$
(2.8)

and let \(d(x,y) := \Vert x-y\Vert _{L_p(D)}^p\), \(x,y\in L_p(D,\mathbb {R})\). Then there exists a function \(\varphi \) such that the partial order \(\preceq _{pos}\) and function d introduced above satisfy condition (3) of Theorem 2.3. Furthermore, the set \(\Gamma \) defined in (2.1) is closed. We postpone the proof of this statement to Sect. 5.

Example 2.9

This example is due to [10]. Put \(E := {\mathscr {B}}^{\alpha }_{p,\infty }(D,\mathbb {R})\), where \(p\geqslant 1\), \(\alpha <0\) and an arbitrary domain \(D\subset \mathbb {R}^n\), \(n\in \mathbb {N}\). Let \(\Vert \cdot \Vert _{{\mathscr {B}}^{\alpha }_{p,\infty }}\) be the standard Besov norm in this space, see, e.g., [1, Definition 2.68]. Consider the following partial order

$$\begin{aligned} x\preceq _{Besov} y \text { if and only if } \langle x,\varphi \rangle \preceq _{pos}\langle y,\varphi \rangle \text { for any non-negative } \varphi \in \mathcal {C}^\infty \end{aligned}$$

and let \(d(x,y) := \Vert x-y\Vert _{{\mathscr {B}}^{\alpha }_{p,\infty }}^p\), \(x,y\in {\mathscr {B}}^{\alpha }_{p,\infty }(D,\mathbb {R})\). Then, as shown in [10, Lemma A.1], there exists a function \(\varphi \) such that the partial order \(\preceq _{Besov}\) and function d satisfy condition (3) of Theorem 2.3.

3 Stochastic Domination and Distance Between Random Variables

In this section we explore the connections between stochastic domination and expected distance of random variables, thus continuing the analysis initiated in [8]. Recall that we are given a Polish space E with metric \(\rho \). The following statement played a key role in establishing synchronization-by-noise results in [8].

Proposition 3.1

([8, Proposition 2.4]). Let \(X_t\), \(Y_t\) be two stochastic processes taking values in E such that \(X_t(\omega )\preceq Y_t(\omega )\) for all \(t\geqslant 0\), \(\omega \in \Omega \). Suppose that \(X_t\) and \(Y_t\) converge weakly as \(t\rightarrow \infty \) to a random element with the law \(\mu \) . Then,

$$\begin{aligned} \mathsf {E}[\rho (X_t, Y_t)\wedge 1]\rightarrow 0\quad \text {as } t\rightarrow \infty . \end{aligned}$$
(3.1)

It is natural to ask whether the above statement can be quantified. More precisely, assume additionally that

$$\begin{aligned} d_{TV}({{\,\mathrm{Law}\,}}(X_t),\mu )\leqslant r(t),\quad d_{TV}({{\,\mathrm{Law}\,}}(Y_t),\mu )\leqslant r(t), \end{aligned}$$

for some rate function r going to 0 as \(t\rightarrow \infty \). One can ask whether these bounds guarantee a quantitative estimate on \(\mathsf {E}[\rho (X_t, Y_t)\wedge 1]\). Quite surprisingly, the answer to this question is negative: without any additional assumptions \(\mathsf {E}[\rho (X_t, Y_t)\wedge 1]\) might tend to 0 very slowly even when r(t) converges exponentially fast. This is illustrated by the following example.

Example 3.2

Let \(E=\mathbb {R}\) be equipped with the standard order \(\leqslant \). Let \((p_i)_{i\in \mathbb {Z}_+}\) be a sequence of positive numbers summing up to 1. Let X be a random variable which with probability \(p_i\) is uniformly distributed on \([2^i,2^{i+1}]\), \(i\in \mathbb {Z}_+\). Define for \(n\in \mathbb {Z}_+\)

$$\begin{aligned} Y_n := X{{\,\mathrm{\mathbb {1}}\,}}(X\notin [2^n,2^{n+1}])+(X+1){{\,\mathrm{\mathbb {1}}\,}}(X\in [2^n,2^{n+1}]);\quad X_n=X. \end{aligned}$$

Then for each n we clearly have \(X_n\leqslant Y_n\) and it is immediate to see that for \(\mu := {{\,\mathrm{Law}\,}}(X)\) one has

$$\begin{aligned} d_{TV}({{\,\mathrm{Law}\,}}(X_n),\mu )=0;\quad d_{TV}({{\,\mathrm{Law}\,}}(Y_n),\mu )\leqslant p_n 2^{-n}. \end{aligned}$$

On the other hand,

$$\begin{aligned} \mathsf {E}[|X_n-Y_n|\wedge 1]=p_n, \end{aligned}$$

which is much slower than \(p_n 2^{-n}\).

Thus, to quantify bounds in (3.1) we need to impose extra assumptions. This is where condition (3) of Theorem 2.3 pops up.

Lemma 3.3

Assume that condition (3) of Theorem 2.3 is satisfied for measurable functions \(\varphi :E\rightarrow \mathbb {R}\), \(d:E\times E\rightarrow [0,1]\). Suppose further that there exist a function \(\psi :E\rightarrow \mathbb {R}_+\) and \(k\in (0,1)\) such that

$$\begin{aligned} |\varphi (x)-\varphi (y)|\leqslant d(x,y)^k(\psi (x)+\psi (y)),\quad x,y\in E. \end{aligned}$$
(3.2)

Let XY be random elements \(\Omega \rightarrow E\) such that \(X\preceq Y\) a.s., \(\mathsf {E}|\varphi (X)|<\infty \), \(\mathsf {E}|\varphi (Y)|<\infty \) and \(W_d({{\,\mathrm{Law}\,}}(X),{{\,\mathrm{Law}\,}}(Y))\leqslant \varepsilon \) for some \(\varepsilon >0\). Then we have

$$\begin{aligned} \mathsf {E}d(X,Y)\leqslant \varepsilon ^{k}\bigl ((\mathsf {E}[\psi (X)^{1/(1-k)}])^{1-k}+(\mathsf {E}[\psi (Y)^{1/(1-k)}])^{1-k}\bigr ). \end{aligned}$$
(3.3)

Proof

We begin by observing that since \(X\preceq Y\) a.s., condition (3) of Theorem 2.3 yields

$$\begin{aligned} \mathsf {E}d(X,Y)\leqslant \mathsf {E}\varphi (Y)-\mathsf {E}\varphi (X). \end{aligned}$$
(3.4)

Fix \(K>1\) and let \(\widetilde{X}\), \(\widetilde{Y}\) be random elements such that \({{\,\mathrm{Law}\,}}(\widetilde{X})={{\,\mathrm{Law}\,}}(X)\), \({{\,\mathrm{Law}\,}}(\widetilde{Y})={{\,\mathrm{Law}\,}}(Y)\) and \(\mathsf {E}d(\widetilde{X},\widetilde{Y})\leqslant K\varepsilon \) (they exist since \(W_d({{\,\mathrm{Law}\,}}(X),{{\,\mathrm{Law}\,}}(Y))\leqslant \varepsilon \)). Then we continue (3.4) as follows, using the fact that d is bounded by 1:

$$\begin{aligned} \mathsf {E}d(X,Y)&\leqslant \mathsf {E}\varphi (\widetilde{Y})-\mathsf {E}\varphi (\widetilde{X})\\&\leqslant \mathsf {E}[d(\widetilde{X},\widetilde{Y})^k(\psi (\widetilde{X})+\psi (\widetilde{Y}))]\\&\leqslant K^{k}\varepsilon ^{k}\bigl ((\mathsf {E}[\psi (\widetilde{X})^{1/(1-k)}])^{1-k}+(\mathsf {E}[\psi (\widetilde{Y})^{1/(1-k)}])^{1-k}\bigr )\\&=K^{k}\varepsilon ^{k}\bigl ((\mathsf {E}[\psi (X)^{1/(1-k)}])^{1-k}+(\mathsf {E}[\psi (Y)^{1/(1-k)}])^{1-k}\bigr ). \end{aligned}$$

Since \(K>1\) was arbitrary, this yields the statement of the lemma. \(\square \)

The lemma above will be very useful for obtaining exponential bounds on the synchronization-by-noise. On the other hand, to complete the second part of the proof of Theorem 2.3 (see its sketch above in Sect. 2), we need to solve the opposite problem. More precisely, Lemma 3.3 considers the case when one random element is less than the other everywhere but their laws are different. The following lemma studies the situation where the laws of the random elements coincide, but one is less than the other with probability smaller than 1.

Lemma 3.4

Assume that condition (3) of Theorem 2.3 is satisfied for measurable functions \(\varphi :E\rightarrow \mathbb {R}\), \(d:E\times E\rightarrow [0,1]\). Let \(X, Y, \widetilde{X}\) be random elements \(\Omega \rightarrow E\) such that \({{\,\mathrm{Law}\,}}(X)={{\,\mathrm{Law}\,}}(\widetilde{X})\) and

$$\begin{aligned} \mathsf {P}(X\preceq Y\preceq \widetilde{X})\geqslant 1-\varepsilon \end{aligned}$$

for some \(\varepsilon >0\). Then for any \(p,q>1\) such that \(1/p+1/q=1\) we have

$$\begin{aligned} \mathsf {E}d(X,Y)\leqslant 2 \varepsilon ^{1/p}(\mathsf {E}|\varphi (X)|^q)^{1/q}+\varepsilon . \end{aligned}$$

Again, we see that the statement of the lemma is satisfied only under some extra conditions (one needs condition (3) of Theorem 2.3 and a reasonable bound on \(\mathsf {E}|\varphi (X)|^q\)). The following two examples show that these extra conditions cannot be dropped.

Example 3.5

Let \(E=\mathcal {C}([0,1],\mathbb {R})\) be equipped with the sup norm \(\Vert \cdot \Vert \) and order \(\preceq _{pos}\) (defined as in (2.8)). For \(x,y\in E\) put \(d(x,y) := \Vert x-y\Vert \wedge 1\). Let \(n\in \mathbb {N}\). For \(k\in \{1,2,\ldots ,n\}\) let \(f_k\) be an element of \(\mathcal {C}([0,1],\mathbb {R})\) taking values in [0, 1] such that

$$\begin{aligned} f_k(\xi )=1\,\,\,\text {for }\xi \in [0,k/n];\quad f_k(\xi )=0\,\,\,\text {for }\xi \in [(k+1)/n,1]. \end{aligned}$$

Then it is easy to see that \(f_k\preceq _{pos}f_l\) whenever \(k\leqslant l\leqslant n\). Let \(\eta \) be uniformly distributed on the set \(\{1,2,\ldots ,n\}\). Put \(X := f_\eta \) and \(Y=\widetilde{X} := f_{\eta +1}\), where the summation is taken \(\mod n\). Then X and \(\widetilde{X}\) are bounded, \({{\,\mathrm{Law}\,}}(X)={{\,\mathrm{Law}\,}}(\widetilde{X})\), \(\mathsf {P}(X\preceq Y\preceq \widetilde{X})=1-1/n\), but \(\mathsf {E}[\Vert X-Y\Vert \wedge 1]=1\).

It is easy to see that in this example condition (3) of Theorem 2.3 does not hold. Indeed, if a function \(\varphi \) satisfies this condition, then for any \(n\in \mathbb {N}\), \(k\in \{1,2,\ldots ,n-1\}\) we have \(\varphi (f_{k+1})\geqslant \varphi (f_{k})+d(f_{k},f_{k+1})=\varphi (f_{k})+1\). Therefore

$$\begin{aligned} \varphi (\check{1})=\varphi (f_n)\geqslant \varphi (f_1)+n-1\geqslant \varphi (\check{0})+n, \end{aligned}$$
(3.5)

where \(\check{1}\) and \(\check{0}\) denote elements of E identically equal to 1 and 0, respectively. Since n was arbitrary, we see that (3.5) implies that \(\varphi (\check{1})\) cannot be finite, which is impossible.

Example 3.6

Let \(E=\mathbb {R}\) be equipped with the standard order \(\leqslant \). For \(x,y\in E\) put \(d(x,y) := |x-y|\wedge 1\). Let \(n\in \mathbb {N}\) and let X be uniformly distributed on the set \(\{1,2,\ldots ,n\}\). Let \(Y=\widetilde{X}=(X+1){{\,\mathrm{\mathbb {1}}\,}}(X<n)+{{\,\mathrm{\mathbb {1}}\,}}(X=n)\). Then condition (3) of Theorem 2.3 holds with \(\varphi (x)=x\), \({{\,\mathrm{Law}\,}}(X)={{\,\mathrm{Law}\,}}(\widetilde{X})\), \(\mathsf {P}(X\preceq Y\preceq \widetilde{X})=1-1/n\), but \(\mathsf {E}[|X-Y|\wedge 1]=1\).

Thus, we see that in this example condition (3) of Theorem 2.3 holds, but \(\mathsf {E}|\varphi (X)|=\mathsf {E}X=n/2+1/2\) can be arbitrarily large for large n.

Proof of Lemma 3.4

We begin by observing that the function \(\varphi \) is increasing, that is \(x\preceq y\) implies \(\varphi (x)\leqslant \varphi (y)\) for any \(x,y\in E\). Fix \(p,q>1\) such that \(1/p+1/q=1\). Without loss of generality, we can also assume that \(\mathsf {E}|\varphi ( X)|^q<\infty \) (otherwise the statement of the lemma is trivial).

Let \(\Lambda := \{\omega : X(\omega )\preceq Y(\omega )\preceq \widetilde{X}(\omega )\}\). Then, by assumption, \(\mathsf {P}(\Lambda )\geqslant 1-\varepsilon \) and we get (recall that \(d\leqslant 1\))

$$\begin{aligned} \mathsf {E}d(X,Y)&=\mathsf {E}[d(X,Y){{\,\mathrm{\mathbb {1}}\,}}(\Lambda )]+\mathsf {E}[d(X,Y){{\,\mathrm{\mathbb {1}}\,}}(\Omega {\setminus }\Lambda )]\nonumber \\&\leqslant \mathsf {E}[(\varphi (Y)-\varphi (X)){{\,\mathrm{\mathbb {1}}\,}}(\Lambda )]+\mathsf {P}(\Omega {\setminus }\Lambda )\nonumber \\&\leqslant \mathsf {E}[(\varphi (\widetilde{X})-\varphi (X)){{\,\mathrm{\mathbb {1}}\,}}(\Lambda )]+\varepsilon \nonumber \\&=\mathsf {E}(\varphi (\widetilde{X})-\varphi (X))-\mathsf {E}[(\varphi (\widetilde{X})-\varphi (X)){{\,\mathrm{\mathbb {1}}\,}}(\Omega {\setminus }\Lambda )]+\varepsilon . \end{aligned}$$
(3.6)

Since \({{\,\mathrm{Law}\,}}(X)={{\,\mathrm{Law}\,}}(\widetilde{X})\), we see that \(\mathsf {E}(\varphi (\widetilde{X})-\varphi (X))=0\). Applying the Cauchy–Schwarz inequality, we finally obtain from (3.6)

$$\begin{aligned} \mathsf {E}d(X,Y)&\leqslant 2 \varepsilon ^{1/p}(\mathsf {E}|\varphi (X)|^q)^{1/q}+\varepsilon . \end{aligned}$$

\(\square \)

4 Exponential Ergodicity of the Stochastic Reaction–Diffusion Equation in the Hypoelliptic Setting

Fix a filtered probability space \((\Omega , \mathcal {F},(\mathcal {F}_t)_{t\geqslant 0},\mathsf {P})\). We consider the stochastic reaction–diffusion equation (1.1) evolving on a d-dimensional torus \(\mathbb {T}^d\), \(d\in \mathbb {N}\). Note that we have equipped this equation with periodic boundary conditions just to simplify the exposition and to emphasize the main ideas. Everything should work in a similar way for Eq. (1.1) on a bounded domain equipped with Dirichlet boundary conditions.

We are interested in analytically and probabilistically strong solutions to this equation. Recall the following notions.

Definition 4.1

(see, e.g., [9, Definition 1.7]). Let \(u_0\in L_2(\mathbb {T}^d)\). An \(L_2(\mathbb {T}^d)\)-valued, continuous in time process \(u:\Omega \times [0,\infty )\rightarrow L_2(\mathbb {T}^d)\) is called an analytically generalized strong solution to (1.1) with initial condition \(u_0\) if \(u(0)=u_0\) and for any \(\varepsilon >0\), \(t>\varepsilon \) the following holds:

$$\begin{aligned}&\int _\varepsilon ^t \Vert \Delta u(s)\Vert _{L_2(\mathbb {T}^d)}+\Vert f(u(s,\cdot ),\cdot )\Vert _{L_2(\mathbb {T}^d)}ds<\infty ,\quad \mathsf {P}\text {-a.s.}; \end{aligned}$$
(4.1)
$$\begin{aligned}&u(t)=u(\varepsilon )+\int _\varepsilon ^t (\Delta u(s)+f(u(s,\cdot ),\cdot ))ds+\sum _{k=1}^m \sigma _k (W^k(t)-W^k(\varepsilon )),\quad \mathsf {P}\text {-a.s.}.\nonumber \\ \end{aligned}$$
(4.2)

Definition 4.2

(see, e.g., [7, Section 6.1], [9, Definition 1.3]). Let \(u_0\in L_2(\mathbb {T}^d)\). An \(L_2(\mathbb {T}^d)\)-valued, continuous in time process \(u:\Omega \times [0,\infty )\rightarrow L_2(\mathbb {T}^d)\) is called an analytically strong solution to (1.1) with initial condition \(u_0\) if conditions (4.1) and (4.2) hold for \(\varepsilon =0\) and any \(t>0\).

Definition 4.3

(see, e.g., [14, Remark 5.2]). A solution to (1.1) (strong or generalized strong) is called a probabilistically strong solution if it is adapted to the filtration \((\mathcal {F}_t)_{t\geqslant 0}\).

In case of ambiguity, a solution to (1.1) with initial condition \(x\in L_2(\mathbb {T}^d)\) will be denoted by \(u^x\). We refer to [7, 9, 14] for a detailed discussion of relations between different notions of a solution to SPDEs.

We will make the following assumption on f and \(\sigma _i\).

Assumption A

The function f is jointly continuous and locally Lipschitz in the first variable. Moreover, the following condition holds (dissipativity outside of a compact set): there exist \(K_1, K_2, K_3>0\) such that for any \(x,y\in \mathbb {R}\), \(\xi \in \mathbb {T}^d\)

$$\begin{aligned}&x f(x,\xi )\leqslant K_1-K_2 x^2; \end{aligned}$$
(4.3)
$$\begin{aligned}&(x-y) (f(x,\xi )-f(y,\xi ))\leqslant K_3(x-y)^2. \end{aligned}$$
(4.4)

Suppose that

$$\begin{aligned} \sigma _k\in \mathcal {C}^2(\mathbb {T}^d),\quad \text {for each }k=1,\ldots ,m. \end{aligned}$$
(4.5)

As shown in Theorem 4.4 below, the condition on \(\sigma \) guarantees local existence of strong solution and condition (4.3) prevents blow-up. Conditions (4.3) and (4.4) are satisfied, for example, for \(f(x,\xi ) := K x-x^3\), \(K\in \mathbb {R}\).

Recall the definition of the order \(\preceq _{pos}\) in (2.8).

Theorem 4.4

Suppose that Assumption A holds. Then

  1. (i)

    for any \(u_0\in \mathcal {C}^2(\mathbb {T}^d)\) Eq. (1.1) has a unique analytically and probabilistically strong solution u with initial condition \(u_0\); furthermore u has a version which is continuous on \([0,\infty )\times \mathbb {T}^d\);

  2. (ii)

    for any \(u_0\in L_2(\mathbb {T}^d)\) Eq. (1.1) has a unique analytically generalized strong and probabilistically strong solution u with initial condition \(u_0\); furthermore u has a version which is continuous on \((0,\infty )\times \mathbb {T}^d\);

  3. (iii)

    this solution of Eq. (1.1) is a homogeneous Markov process with state space \(L_2(\mathbb {T}^d)\);

  4. (iv)

    there exists a set \(\Omega '\subset \Omega \) of full measure such that if \(x,y\in L_2(\mathbb {T}^d)\) and \(x\preceq _{pos}y\), then

    $$\begin{aligned} u^x(t,\omega )\preceq _{pos}u^y(t,\omega ),\quad t\geqslant 0,\,\omega \in \Omega '; \end{aligned}$$
  5. (v)

    the corresponding Markov transition function is order preserving with respect to the order \(\preceq _{pos}\);

  6. (vi)

    for any \(x\in L_2(\mathbb {T}^d)\), \(t\geqslant 0\) we have \(\mathsf {E}\Vert u^x(t)\Vert _{L_2(\mathbb {T}^d)}^2<\infty \); moreover there exists a constant \(C>0\) such that the following energy estimates hold for any \(x\in L_2(\mathbb {T}^d)\), \(0\leqslant s\leqslant t\)

    $$\begin{aligned} \mathsf {E}\Vert u^x(t)\Vert _{L_2(\mathbb {T}^d)}^2&\leqslant \mathsf {E}\Vert u^x(s)\Vert _{L_2(\mathbb {T}^d)}^2-K_2\int _s^t\mathsf {E}\Vert u^x(r)\Vert _{L_2(\mathbb {T}^d)}^2\, dr\nonumber \\&\,\quad +(K_1+\Vert \sigma \Vert _{L_2(\mathbb {T}^d)}^2)(t-s), \end{aligned}$$
    (4.6)
    $$\begin{aligned} \mathsf {E}\Vert u^x(t)\Vert _{L_2(\mathbb {T}^d)}^4&\leqslant \Vert x\Vert _{L_2}^4\exp (-K_2t) +C, \end{aligned}$$
    (4.7)

    where the constants \(K_1\) and \(K_2\) were defined in (4.3).

The proof of the theorem is given in Sect. 5.2. Let us make here a couple of remarks about the theorem. Usually in the literature it is assumed additionally that f is a polynomial or f is globally Lipschitz or that at least some growth bounds on \(|f(x,\xi )|\) hold, see, e.g., [7, Section 7], [9, Section 4.2], [18, Section 8.3]. Here we established existence and uniqueness of the solutions to (1.1) without these additional restrictions. The main challenge is that the corresponding Nemytskii operator

$$\begin{aligned} F(u)(\xi ) := f(u(\xi ),\xi ) \end{aligned}$$

is not \(L_2(\mathbb {T}^d)\rightarrow L_2(\mathbb {T}^d)\) (or at least \(L_p(\mathbb {T}^d)\rightarrow L_2(\mathbb {T}^d)\) for some large p). Therefore it is difficult to apply here any fixed point principle. Furthermore if \(u^n\) is a sequence of solutions to (1.1) with smooth initial conditions \(u_0^n\), and \(u^n\) converges to some u in \(L_2\), then it is not clear at all that this u is a solution to (1.1). To overcome these obstacles, inspired by some ideas from [9], we have extended the corresponding PDE result [27, Propositions 7.3.1] to \(L_2\) initial conditions.

The fact that for irregular initial data (\(u_0\in L_2(\mathbb {T}^d)\)) Eq. (1.1) might not have an analytically strong solution is not surprising. Indeed, even for the standard heat equation \(\partial _t u=\Delta u\) on \(\mathbb {T}^1\) we have \(\Vert \Delta u(t)\Vert _{L_2(\mathbb {T}^d)}\leqslant Ct^{-1}\Vert u(0)\Vert _{L_2(\mathbb {T}^d)}\) and thus \(\int _0^t \Vert \Delta u(s)\Vert _{L_2(\mathbb {T}^d)}\,ds\) might be infinite.

Now let us present the main result of this section and establish ergodicity of the stochastic reaction–diffusion equation in the hypoelliptic setting. By Theorem 4.4(iii) the solution to (1.1) is a Markov process with the state space \(L_2(\mathbb {T}^d)\). Let \((P_t)_{t\geqslant 0}\) be the family of Markov transition probabilities associated with this process. We introduce the following condition.

Assumption B

There exist \(\varepsilon >0\) and \(\lambda _k\in \mathbb {R}\), \(k=1,\ldots ,m\) such that

$$\begin{aligned} \sum _{i=1}^m\lambda _k\sigma _k(\xi )>\varepsilon \quad \text {for all } \xi \in \mathbb {T}^d. \end{aligned}$$

Theorem 4.5

Suppose that Assumptions A and B hold. Then the stochastic reaction–diffusion equation (1.1) has a unique invariant measure \(\pi \). Furthermore, there exist \(C>0\), \(\lambda > 0\) such that for all \(x\in L_2(\mathbb {T}^d)\) we have

$$\begin{aligned} W_{\Vert \cdot \Vert _{L_2}\wedge 1}(P_t(x,\cdot ),\pi )\leqslant C(1+\Vert x\Vert _{L_2}^2)\exp (-\lambda t),\quad t\geqslant 0. \end{aligned}$$
(4.8)

Remark 4.6

Assumption B is satisfied for example if \(m\geqslant 1\) and \(\sigma _1(\xi )>\varepsilon \) for all \(\xi \in \mathbb {T}^d\).

Remark 4.7

Assumption B is imposed only to guarantee the swap condition, see Lemma 5.9. If the solution to (1.1) satisfies (5.43), then only Assumptions A is needed in Theorem 4.5.

As mentioned in the introduction, it is clear that if \(m=0\), then Eq. (1.1) might have multiple invariant measures. On the other hand, if \(m=\infty \), Eq. (1.1) has a unique invariant measure (of course, under certain assumption on the rate of decay of \(\Vert \sigma _k\Vert _{L_2}\) as \(k\rightarrow \infty \)) [6, Sections 7 and 11]. The first result showing uniqueness of the invariant measure for finite m (that is when noise acts not in all directions) is due to Hairer [12], who showed it for m large enough. This was later improved by Hairer and Mattingly [18, Remark 8.22], where uniqueness of the invariant measure and exponential ergodicity is proven if \(d=1\) and \(m=3\). Remark 4.6 refines this result and shows that if the noise acts only in one direction and the corresponding \(\sigma _1\) is positive, then the solution to Eq. (1.1) is exponentially ergodic. Note that contrary to [18, Remark 8.22] we also do not rely here on the specific polynomial form of the drift f and do not assume that the space dimension \(d\leqslant 3\) as in [18, Assumption RD.2].

The next result shows that, in general, the convergence of transition probabilities to the invariant measure in (4.8) might not take place in the total variation metric, and thus the \(W_{\Vert \cdot \Vert _{L_2}\wedge 1}\) metric can not be replaced in (4.8) by the \(d_{TV}\) metric. As discussed in the introduction, this happens due to the fact that the stochastic system do not get “enough” noise. A related result for the case when the drift f is linear can be found in [15, Proposition 9.7].

Theorem 4.8

Suppose that Assumption A holds for a function f which does not depend on space (so \(f=f(x)\), \(x\in \mathbb {R}\)). Then there exists a function \(\sigma \) satisfying additionally Assumption B and an initial condition \(x\in L_2(\mathbb {T}^d)\), such that

$$\begin{aligned} d_{TV}(P_t(x,\cdot ),\pi )=1, \end{aligned}$$

for any \(t\geqslant 0\).

The proof of Theorem 4.8 is given in Sect. 5.2.

Proof of Theorem 4.5

Let us verify that all conditions of Theorems 2.3 and 2.4 are satisfied. Let \(E=L_2(\mathbb {T}^d)\) with partial order \(\preceq _{pos}\) (recall its definition in (2.8)). Set

$$\begin{aligned}&V(x)=\Vert x\Vert _{L_2}^2,\,\,d(x,y) := \Vert x-y\Vert _{L_2}^2\quad x,y\in L_2(\mathbb {T}^d),\nonumber \\&\varphi (x) := 2\int _{\mathbb {T}^d} x(z)^2{{\,\mathrm{sign}\,}}(x(z))\,dz,\quad x\in L_2(\mathbb {T}^d). \end{aligned}$$
(4.9)

The Markov semigroup corresponding to (1.1) is order-preserving with respect to \(\preceq _{pos}\) by Theorem 4.4(v) and thus condition (1) of Theorem 2.3 holds. Condition (2) of Theorem 2.3 is satisfied thanks to energy bound (4.6). It follows from Example 2.8 and the definition of \(\varphi \) that condition (3) of Theorem 2.3 is met. Condition (4) follows from (4.6) with \(M(x) := 2\Vert x\Vert _{L_2(\mathbb {T}^d)}^4+C\), \(x\in L_2(\mathbb {T}^d)\). By Lemma 5.9 condition (5) holds. Conditions (6) and (7) hold trivially with \(\delta =1/2\) and \(\kappa =1/2\), respectively. Thus, all conditions of Theorems 2.3 and 2.4 are satisfied. Therefore the process u has a unique invariant measure and (4.8) holds. \(\square \)

Finally, we establish exponentially fast synchronization by noise for solutions to (1.1): that is, we will prove that any two solutions with initial conditions \(x,y\in L_2(\mathbb {T}^d)\) launched with the same noise will converge to each other in probability exponentially fast.

Theorem 4.9

Suppose that Assumptions A and B hold. Then there exist \(C>0\), \(\lambda > 0\) such that for any \(x,y\in L_2(\mathbb {T}^d)\) we have

$$\begin{aligned} \mathsf {E}[\Vert u^x(t)-u^y(t)\Vert _{L_2(\mathbb {T}^d)}\wedge 1]\leqslant C(1+\Vert x\Vert _{L_2(\mathbb {T}^d)}^2+\Vert y\Vert _{L_2(\mathbb {T}^d)}^2)\exp (-\lambda t),\quad t\geqslant 0. \end{aligned}$$
(4.10)

Note that the fact that \(\mathsf {E}[\Vert u^x(t)-u^y(t)\Vert _{L_2(\mathbb {T}^d)}\wedge 1]\rightarrow 0\) as \(t\rightarrow \infty \) follows immediately from Theorem 4.5 and [8, Proposition 2.4]. Unfortunately, as discussed in Sect. 3, the techniques developed in [8] do not provide a quantitative bound on convergence rate. Therefore to prove this theorem we use the toolkit developed in Sect. 3.

Proof of Theorem 4.9

Fix arbitrary \(x_0,y_0\in L_2(\mathbb {T}^d)\), \(t\geqslant 0\). Define

$$\begin{aligned} z_{-}(\xi )\, := \, x_0(\xi )\wedge y_0(\xi ),\,\,z_{+}(\xi ) := x_0(\xi )\vee y_0(\xi ),\quad \xi \in \mathbb {T}^d. \end{aligned}$$

Put \(X := u^{z_{-}}(t)\), \(Y := u^{z_{+}}(t)\). Then by the order-preserving property (Theorem 4.4(iv)) we have

$$\begin{aligned} X\preceq _{pos}u^{x_0}(t)\preceq _{pos}Y,\,\,\,\,X\preceq _{pos}u^{y_0}(t)\preceq _{pos}Y,\quad {\text {a.s.}} \end{aligned}$$

Therefore

$$\begin{aligned}{}[\Vert u^{x_0}(t)-u^{y_0}(t)\Vert _{L_2(\mathbb {T}^d)}\wedge 1]\leqslant [\Vert X-Y\Vert _{L_2(\mathbb {T}^d)}\wedge 1]. \end{aligned}$$
(4.11)

Now let us check that all the conditions of Lemma 3.3 hold. Let \(E=L_2(\mathbb {T}^d)\) with partial order \(\preceq _{pos}\). Set \(d(x,y) := \Vert x-y\Vert _{L_2}^2\wedge 1\), \(x,y\in L_2(\mathbb {T}^d)\), and define \(\varphi \) as in (4.9). Then it follows from Example 2.8 that condition (3) of Theorem 2.3 is satisfied. Furthermore, if \(d(x,y)=1\), then we have

$$\begin{aligned} |\varphi (x)-\varphi (y)|\leqslant (|\varphi (x)|+|\varphi (y)|)\leqslant 2d(x,y)^{1/2}(\Vert x\Vert _{L_2}^2+\Vert y\Vert _{L_2}^2). \end{aligned}$$
(4.12)

If \(d(x,y)<1\), then one has

$$\begin{aligned} |\varphi (x)-\varphi (y)|&\leqslant 2\int _{\mathbb {T}^d} \bigl |x(z)^2{{\,\mathrm{\mathbb {1}}\,}}(x(z)>0)-y(z)^2{{\,\mathrm{\mathbb {1}}\,}}(y(z)>0)\bigr |\,dz\nonumber \\&\quad +\, 2\int _{\mathbb {T}^d} \bigl |x(z)^2{{\,\mathrm{\mathbb {1}}\,}}(x(z)<0)-y(z)^2{{\,\mathrm{\mathbb {1}}\,}}(y(z)<0)\bigr |\,dz\nonumber \\&\leqslant 4\int _{\mathbb {T}^d} |x(z)-y(z)|(|x(z)|+|y(z)|)\,dz\nonumber \\&\leqslant 4\sqrt{2} \Vert x-y\Vert _{L_2}(\Vert x\Vert _{L_2}+\Vert y\Vert _{L_2})\nonumber \\&\leqslant 4\sqrt{2} d(x,y)^{1/2}(2+\Vert x\Vert ^2_{L_2}+\Vert y\Vert ^2_{L_2}), \end{aligned}$$
(4.13)

where in the second inequality we used the following bound

$$\begin{aligned} |a^2{{\,\mathrm{\mathbb {1}}\,}}(a>0)-b^2{{\,\mathrm{\mathbb {1}}\,}}(b>0)\bigr |\leqslant |a-b|(|a|+|b|),\quad a,b\in \mathbb {R}. \end{aligned}$$

Combining (4.12) and (4.13), we get

$$\begin{aligned} |\varphi (x)-\varphi (y)|\leqslant d(x,y)^{1/2}(\psi (x)+\psi (y)), \end{aligned}$$

where we put \(\psi (x) := 4\sqrt{2}(1+\Vert x\Vert ^2_{L_2})\), \(x\in L_2(\mathbb {T}^d)\). Thus, bound (3.2) holds.

It is also clear that by definition we have \(X\preceq _{pos}Y\). Further, by Theorem 4.4(vi) we have

$$\begin{aligned} \mathsf {E}|\varphi (X)|\leqslant 2\mathsf {E}\Vert X\Vert _{L_2}^2=2\mathsf {E}\Vert u^{z_{-}}(t)\Vert _{L_2}^2<\infty \end{aligned}$$

and, similarly, \(\mathsf {E}|\varphi (Y)|<\infty \). Finally, by Theorem 4.5 there exist \(C,\lambda >0\) such that

$$\begin{aligned} W_d({{\,\mathrm{Law}\,}}(X),{{\,\mathrm{Law}\,}}(Y))&\leqslant W_{\Vert \cdot \Vert _{L_2}\wedge 1}({{\,\mathrm{Law}\,}}(X),{{\,\mathrm{Law}\,}}(Y))\\&\leqslant C(1+\Vert z_{-}\Vert _{L_2}^2+\Vert z_{+}\Vert _{L_2}^2)\exp (-\lambda t)\\&= C(1+\Vert x_0\Vert _{L_2}^2+\Vert y_0\Vert _{L_2}^2)\exp (-\lambda t). \end{aligned}$$

Thus, all the conditions of Lemma 3.3 are met. Therefore inequality (3.3) and energy bound (4.7) yield

$$\begin{aligned} \mathsf {E}d(X,Y)&\leqslant C(1+\Vert x_0\Vert _{L_2}^2+\Vert y_0\Vert _{L_2}^2)\exp (-\lambda t/2)(1+\sqrt{\mathsf {E}\Vert X\Vert _{L_2}^4}+\sqrt{\mathsf {E}\Vert Y\Vert _{L_2}^4})\\&\leqslant C(1+\Vert x_0\Vert _{L_2}^2+\Vert y_0\Vert _{L_2}^2)\exp (-\lambda t/2)(1+\Vert z_-\Vert _{L_2}^2+\Vert z_+\Vert _{L_2}^2)\\&= C(1+\Vert x_0\Vert _{L_2}^2+\Vert y_0\Vert _{L_2}^2)^2\exp (-\lambda t/2). \end{aligned}$$

This bound allows us to conclude.

$$\begin{aligned} \mathsf {E}\sqrt{d(X,Y)}\leqslant \sqrt{\mathsf {E}d(X,Y)}\leqslant C(1+\Vert x_0\Vert _{L_2}^2+\Vert y_0\Vert _{L_2}^2)\exp (-\lambda t/4). \end{aligned}$$

Taking into account (4.11), we obtain (4.10). \(\square \)

5 Proofs

5.1 Proofs of the results of Sect. 2

To prove Theorem 2.3 we need a couple of lemmas. The first lemma is quite standard and is in the spirit of [31, Section 15.2]. The lemma deals with the connection between the finiteness of the exponential moment of the first return time of a Markov process to a set and the existence of a Lyapunov function. However we were not able to find in the literature precisely this statement (we found only a number of related ones). Thus, for the convenience of the reader and for the sake of completeness we provide the full proof of the lemma.

Let us consider a measurable space \((\widetilde{E}, \widetilde{\mathcal {E}})\) and a Markov kernel Q on it. Let \(Z=(Z_n)_{n\in \mathbb {Z}_+}\) be a Markov process with transition kernel Q. For a set \(A\in \widetilde{\mathcal {E}}\) introduce the first return time to A

$$\begin{aligned} \tau _A := \inf \{n\geqslant 1:Z_n\in A\}. \end{aligned}$$

Lemma 5.1

Suppose there exists a measurable function \(\mathcal {V}:\widetilde{E}\rightarrow [1,\infty )\) such that for some \(\lambda \in (0,1)\), \(K>0\)

$$\begin{aligned} Q\mathcal {V}\leqslant \lambda \mathcal {V}+ K. \end{aligned}$$
(5.1)

Fix

$$\begin{aligned} M>(K/(1-\lambda ))\vee 1 \end{aligned}$$
(5.2)

and put \(r := (\lambda +K/M)^{-1}\).

  1. (i)

    We have

    $$\begin{aligned}&\mathsf {E}_x r^{\tau _{\{\mathcal {V}\leqslant M\}}}\leqslant \mathcal {V}(x),\quad x\in \widetilde{E}{\setminus }\{\mathcal {V}\leqslant M\}; \end{aligned}$$
    (5.3)
    $$\begin{aligned}&\mathsf {E}_x r^{\tau _{\{\mathcal {V}\leqslant M\}}}\leqslant rQ\mathcal {V}(x),\quad x\in \{\mathcal {V}\leqslant M\}. \end{aligned}$$
    (5.4)
  2. (ii)

    Further, suppose that for a set \(A\in \widetilde{\mathcal {E}}\) we have for some \(\varepsilon >0\)

    $$\begin{aligned} \inf _{x\in \{\mathcal {V}\leqslant M\}}Q(x,A)\geqslant \varepsilon . \end{aligned}$$
    (5.5)

    Then, for any \(1<l<r^{\frac{|\log (1-\varepsilon )|\wedge \log M}{2\log M}}\) there exists a constant \(C=C(\lambda ,l,K,M,\varepsilon )>0\) such that for any \(x\in \widetilde{E}\)

    $$\begin{aligned} \mathsf {E}_x l^{\tau _A}\leqslant C \mathcal {V}(x). \end{aligned}$$
    (5.6)

Proof

(i) The proof uses the standard argument. First note, that it follows from the Lyapunov condition (5.1) and the definition of r that

$$\begin{aligned} Q\mathcal {V}(x)\leqslant r^{-1}\mathcal {V}(x),\quad x\in \widetilde{E}{\setminus }\{\mathcal {V}\leqslant M\}. \end{aligned}$$
(5.7)

Put now for \(n\in \mathbb {Z}_+\)

$$\begin{aligned} \tau ^{(n)} := \tau _{\{\mathcal {V}\leqslant M\}}\wedge n \wedge \inf \{k\in \mathbb {Z}_+:r^k \mathcal {V}(Z_k)\geqslant n\}. \end{aligned}$$

Then by Dynkin’s formula for discrete time Markov chains (see, e.g., [31, Theorem 11.3.1]) we have for any \(x\in \widetilde{E}\)

$$\begin{aligned} \mathsf {E}_x r^{\tau ^{(n)}}\mathcal {V}(Z_{\tau ^{(n)}})=\mathcal {V}(x)+\mathsf {E}_x\Bigl [\sum _{i=1}^{n}r^{i-1}{{\,\mathrm{\mathbb {1}}\,}}(i-1<\tau ^{(n)})(\mathsf {E}[r\mathcal {V}(Z_i)|Z_{i-1}]-\mathcal {V}(Z_{i-1}))\Bigr ]. \end{aligned}$$
(5.8)

If \(x\in \widetilde{E}{\setminus }\{\mathcal {V}\leqslant M\}\), then by definition for any \(i\in [1,\tau ^{(n)}]\) we have \(Z_{i-1}\in \widetilde{E}{\setminus }\{\mathcal {V}\leqslant M\}\). Therefore, (5.7) implies for any \(i\in [1,n]\)

$$\begin{aligned} r^{i-1}{{\,\mathrm{\mathbb {1}}\,}}(i-1<\tau ^{(n)})(\mathsf {E}[r\mathcal {V}(Z_i)|Z_{i-1}]-\mathcal {V}(Z_{i-1}))\leqslant 0. \end{aligned}$$

Combining this with (5.8), we get

$$\begin{aligned} \mathsf {E}_x r^{\tau ^{(n)}}\leqslant \mathsf {E}_x r^{\tau ^{(n)}}\mathcal {V}(Z_{\tau ^{(n)}})\leqslant \mathcal {V}(x),\quad x\in \widetilde{E}{\setminus }\{\mathcal {V}\leqslant M\}, \end{aligned}$$

which by Fatou’s lemma yields (5.3).

If \(x\in \{\mathcal {V}\leqslant M\}\), then by the Markov property and bound (5.3),

$$\begin{aligned} \mathsf {E}_x r^{\tau _{\{\mathcal {V}\leqslant M\}}}= & {} r \mathsf {P}_x (Z_1\in \{\mathcal {V}\leqslant M\}) + r\mathsf {E}_x [\mathcal {V}(Z_1){{\,\mathrm{\mathbb {1}}\,}}(Z_1\notin \{\mathcal {V}\leqslant M\})]\leqslant r \mathsf {E}_x \mathcal {V}(Z_1)\\= & {} r Q\mathcal {V}(x), \end{aligned}$$

which is (5.4).

(ii) Fix \(l\in (1,r^{\frac{|\log (1-\varepsilon )|\wedge \log M}{2\log M}})\). For \(n\in \mathbb {Z}_+\) denote by \(\tau ^M(n)\) the nth return time to the set \(\{\mathcal {V}\leqslant M\}\):

$$\begin{aligned}&\tau ^M(1) := \inf \{n\geqslant 0:Z_n\in \{\mathcal {V}\leqslant M\}\};\\&\tau ^M(n) := \inf \{n\geqslant \tau ^M(n-1)+1:Z_n\in \{\mathcal {V}\leqslant M\}\},\quad n\geqslant 2. \end{aligned}$$

Introduce random variables

$$\begin{aligned} I_n := {{\,\mathrm{\mathbb {1}}\,}}(Z_{\tau ^M(n)+1}\in A),\quad n\in \mathbb {Z}_+. \end{aligned}$$

We see that the event \(\{I_n=1\}\) corresponds to the reaching of the desired set A after nth visit to the set \(\{\mathcal {V}\leqslant M\}\). We have for any \(x\in \widetilde{E}\)

$$\begin{aligned} \mathsf {E}_x l^{\tau _A}&\leqslant \sum _{k=1}^\infty \mathsf {E}_x [l^{\tau ^M(k)+1}{{\,\mathrm{\mathbb {1}}\,}}(I_1=\cdots =I_{k-1}=0, I_k=1)]\nonumber \\&\leqslant \sum _{k=1}^\infty \mathsf {E}_x [l^{\tau ^M(k)+1}{{\,\mathrm{\mathbb {1}}\,}}(I_1=\cdots =I_{k-1}=0)]. \end{aligned}$$
(5.9)

Define

$$\begin{aligned} D(M,l) := \sup _{x\in \{\mathcal {V}\leqslant M\}}\mathsf {E}_x [l^{\tau _{\{\mathcal {V}\leqslant M\}}}{{\,\mathrm{\mathbb {1}}\,}}(Z_1\notin A)]. \end{aligned}$$

Note that by the strong Markov property and the definition above, for any \(x\in \widetilde{E}\)

$$\begin{aligned}&\mathsf {E}_x [l^{\tau ^M(k)+1}{{\,\mathrm{\mathbb {1}}\,}}(I_1=\cdots =I_{k-1}=0)]\nonumber \\&\qquad =\mathsf {E}_x \bigl [l^{\tau ^M(k-1)+1}{{\,\mathrm{\mathbb {1}}\,}}(I_1=\cdots =I_{k-2}=0)\mathsf {E}_{Z_{\tau ^M(k-1)}} (l^{\tau _{\{\mathcal {V}\leqslant M\}}}{{\,\mathrm{\mathbb {1}}\,}}(Z_1\notin A))\bigr ]\nonumber \\&\qquad \leqslant D(M,l)\mathsf {E}_x [l^{\tau ^M(k-1)+1}{{\,\mathrm{\mathbb {1}}\,}}(I_1=\cdots =I_{k-2}=0)]\nonumber \\&\qquad \leqslant D(M,l)^{k-1}\mathsf {E}_x [l^{\tau ^M(1)+1}]\nonumber \\&\qquad \leqslant l D(M,l)^{k-1} (\mathcal {V}(x)\vee M), \end{aligned}$$
(5.10)

where in the last inequality we used (5.3) and the fact that \(l<r\). Applying the Cauchy–Schwarz inequality and (5.4), we deduce

$$\begin{aligned} D(M,l)\leqslant \sup _{x\in \{\mathcal {V}\leqslant M\}}(\mathsf {E}_x [l^{2\tau _{\{\mathcal {V}\leqslant M\}}}])^{1/2}(1-\varepsilon )^{1/2}\leqslant M^{\log l/\log r}(1-\varepsilon )^{1/2}<1, \end{aligned}$$
(5.11)

where the last inequality follows from the definitions of l and r. Thus, (5.11) together with (5.9), (5.10), and the obvious bound \((\mathcal {V}(x)\vee M)\leqslant M\mathcal {V}(x)\) yields

$$\begin{aligned} \mathsf {E}_x l^{\tau _A}&\leqslant l (\mathcal {V}(x)\vee M) \sum _{k=1}^\infty D(M,l)^{k-1}\leqslant C(\lambda ,l,K,M,\varepsilon ) \mathcal {V}(x), \end{aligned}$$

which implies (5.6). \(\square \)

The next lemma explains why conditions (3) and (4) are imposed in Theorem 2.3.

Lemma 5.2

Assume that condition (3) of Theorem 2.3 is satisfied for measurable functions \(\varphi :E\rightarrow \mathbb {R}\), \(d:E\times E\rightarrow [0,1]\). Let \(X,\, Y,\, \widetilde{X},\, \widetilde{Y}\) be random elements \(\Omega \rightarrow E\) such that \({{\,\mathrm{Law}\,}}(X)={{\,\mathrm{Law}\,}}(\widetilde{X})\), \({{\,\mathrm{Law}\,}}(Y)={{\,\mathrm{Law}\,}}(\widetilde{Y})\) and

$$\begin{aligned} \mathsf {P}(X\preceq Y)\geqslant 1-\varepsilon _1,\,\,\mathsf {P}(\widetilde{Y}\preceq \widetilde{X})\geqslant 1-\varepsilon _2 \end{aligned}$$

for some \(\varepsilon _1,\varepsilon _2>0\). Then

$$\begin{aligned} \mathsf {E}d(X,Y)\leqslant 2\sqrt{\varepsilon _1+\varepsilon _2}(\mathsf {E}\varphi ( X)^2)^{1/2}+\varepsilon _1+\varepsilon _2. \end{aligned}$$

Proof

It follows from the gluing lemma (see, e.g., [38, p. 23]) that there exist random elements \(Z_1, Z_2, Z_3\) such that \({{\,\mathrm{Law}\,}}(Z_1,Z_2)={{\,\mathrm{Law}\,}}(X,Y)\) and \({{\,\mathrm{Law}\,}}(Z_2,Z_3)={{\,\mathrm{Law}\,}}(\widetilde{Y},\widetilde{X})\). Clearly, we also have \({{\,\mathrm{Law}\,}}(Z_1)={{\,\mathrm{Law}\,}}(Z_3)\).

Let \(\Lambda \) be the set \(\{\omega : Z_1(\omega )\preceq Z_2(\omega )\preceq Z_3(\omega )\}\). Then by above

$$\begin{aligned} \mathsf {P}(\Lambda )\geqslant \mathsf {P}(Z_1\preceq Z_2)+\mathsf {P}(Z_2\preceq Z_3)-1= \mathsf {P}(X\preceq Y)+\mathsf {P}(\widetilde{Y}\preceq \widetilde{X})-1\geqslant 1-\varepsilon _1-\varepsilon _2. \end{aligned}$$

Thus, we immediately get from Lemma 3.4 that

$$\begin{aligned} \mathsf {E}d(X,Y)=\mathsf {E}d(Z_1,Z_2)\leqslant 2\sqrt{\varepsilon _1+\varepsilon _2}(\mathsf {E}\varphi ( X)^2)^{1/2}+\varepsilon _1+\varepsilon _2. \end{aligned}$$

\(\square \)

Proof of Theorem 2.3

The proof of the theorem consists of several steps.

Step 1. In this step we fix \(x,y\in E\) and \(t>0\). Let \(\{X^x(s),s\geqslant 0\}\) and \(\{X^y(s), s\geqslant 0\}\) be independent Markov processes with the laws \(\mathsf {P}_x\) and \(\mathsf {P}_y\), correspondingly. Let \(\mathcal {F}_s := \sigma (X^x(r),X^y(r),r\leqslant s)\), \(s\leqslant t\), be the natural filtration. Introduce stopping times

$$\begin{aligned}&\tau _{x\preceq y} := \inf \{n\in \mathbb {Z}_+: X^x(n)\preceq X^y(n)\},\\&\tau _{y\preceq x} := \inf \{n\in \mathbb {Z}_+: X^y(n)\preceq X^x(n)\}. \end{aligned}$$

Note that by definition these stopping times \(\tau _{x\preceq y}\) and \(\tau _{y\preceq x}\) take only countably many values.

Let us now extend the state space E and add an additional element denoted by \(\diamondsuit \). We assume that \(\diamondsuit \preceq \diamondsuit \) and do not impose any further partial order relations between \(\diamondsuit \) and other elements of E.

Introduce the following random elements:

$$\begin{aligned}&\eta ^x := X^x(t){{\,\mathrm{\mathbb {1}}\,}}(\tau _{x\preceq y}\leqslant t)+\diamondsuit {{\,\mathrm{\mathbb {1}}\,}}(\tau _{x\preceq y}>t),\\&\eta ^y := X^y(t){{\,\mathrm{\mathbb {1}}\,}}(\tau _{x\preceq y}\leqslant t)+\diamondsuit {{\,\mathrm{\mathbb {1}}\,}}(\tau _{x\preceq y}>t). \end{aligned}$$

We claim now that \({{\,\mathrm{Law}\,}}(\eta ^x)\preceq _{st}{{\,\mathrm{Law}\,}}(\eta ^y)\). Indeed, let \(f:E\cup \{\diamondsuit \}\rightarrow \mathbb {R}\) be an arbitrary bounded measurable increasing function. Then

$$\begin{aligned} \mathsf {E}f(\eta ^x)=f(\diamondsuit )\mathsf {P}(\tau _{x\preceq y}>t)+\sum _{i=0}^{\lfloor t\rfloor }\mathsf {E}[f(X^x(t)) {{\,\mathrm{\mathbb {1}}\,}}(\tau _{x\preceq y}=i)]. \end{aligned}$$
(5.12)

Note that for any \(i=0,\ldots ,\lfloor t\rfloor \) we have

$$\begin{aligned} \mathsf {E}[f(X^x(t)){{\,\mathrm{\mathbb {1}}\,}}(\tau _{x\preceq y}=i)]&=\mathsf {E}\bigl [{{\,\mathrm{\mathbb {1}}\,}}(\tau _{x\preceq y}=i)\mathsf {E}\bigl (f(X^x(t))|\mathcal {F}_i\bigr )\bigr ]\nonumber \\&=\mathsf {E}\Bigl [{{\,\mathrm{\mathbb {1}}\,}}(\tau _{x\preceq y}=i)\int _E f(z)P_{t-i}(X^x(i),dz)\Bigr ]. \end{aligned}$$
(5.13)

Recall that the kernel \(P_{t-i}\) is order preserving and thus for any \(z_1,z_2\in E\) such that \(z_1\preceq z_2\) we have

$$\begin{aligned} \int _E f(z)P_{t-i}(z_1,dz)\leqslant \int _E f(z)P_{t-i}(z_2,dz). \end{aligned}$$

Since on the set \(\{\tau _{x\preceq y}=i\}\) we have \(X^x(i)\preceq X^y(i)\), we can continue (5.13) in the following way:

$$\begin{aligned} \mathsf {E}[f(X^x(t)){{\,\mathrm{\mathbb {1}}\,}}(\tau _{x\preceq y}=i)]&\leqslant \mathsf {E}\Bigl [{{\,\mathrm{\mathbb {1}}\,}}(\tau _{x\preceq y}=i)\int _E f(z)P_{t-i}(X^y(i),dz)\Bigr ]\\&=\mathsf {E}[f(X^y(t)){{\,\mathrm{\mathbb {1}}\,}}(\tau _{x\preceq y}=i)]. \end{aligned}$$

Combining this with (5.12), we finally deduce

$$\begin{aligned} \mathsf {E}f(\eta ^x)\leqslant f(\diamondsuit )\mathsf {P}(\tau _{x\preceq y}>t)+\sum _{i=0}^{\lfloor t\rfloor }\mathsf {E}[f(X^y(t)) {{\,\mathrm{\mathbb {1}}\,}}(\tau _{x\preceq y}=i)]=\mathsf {E}f(\eta ^y). \end{aligned}$$

Since f was an arbitrary bounded measurable increasing function, we see that indeed \({{\,\mathrm{Law}\,}}(\eta ^x)\preceq _{st}{{\,\mathrm{Law}\,}}(\eta ^y)\).

Step 2. We have shown that \({{\,\mathrm{Law}\,}}(\eta ^x)\preceq _{st}{{\,\mathrm{Law}\,}}(\eta ^y)\). Therefore, by the Strassen theorem (see, e.g., [26, Theorem IV.2.4]) there exist random variables \(\xi ^x\) and \(\xi ^y\) such that \({{\,\mathrm{Law}\,}}(\xi ^x)={{\,\mathrm{Law}\,}}(\eta ^x)\), \({{\,\mathrm{Law}\,}}(\xi ^y)={{\,\mathrm{Law}\,}}(\eta ^y)\) and \(\xi ^x\preceq \xi ^y\) a.s.

It follows from the construction, that \(\mathsf {P}(\eta ^x\ne X^x(t))\leqslant \mathsf {P}(\tau _{x\preceq y}>t)\) and \(\mathsf {P}(\eta ^y\ne X^y(t))\leqslant \mathsf {P}(\tau _{x\preceq y}>t)\). Now we will apply twice the gluing lemma (see, e.g., [38, p. 23]). First, we apply it to the pairs \((X^x(t),\eta ^x)\) and \((\xi ^x,\xi ^y)\) and construct two random variables \(Y^x\) and \(Y^y\) such that \({{\,\mathrm{Law}\,}}(Y^x)={{\,\mathrm{Law}\,}}(X^x(t))\), \({{\,\mathrm{Law}\,}}(Y^y)={{\,\mathrm{Law}\,}}(\xi ^y)={{\,\mathrm{Law}\,}}(\eta ^y)\) and \(\mathsf {P}(Y^x\preceq Y^y)\geqslant 1-\mathsf {P}(\tau _{x\preceq y}>t)\). Then, we apply the gluing lemma to the pairs \((Y^x,Y^y)\) and \((\eta ^y,X^y)\). We deduce that there exist random variables \(Z^x\) and \(Z^y\) such that

$$\begin{aligned}&{{\,\mathrm{Law}\,}}(Z^x)={{\,\mathrm{Law}\,}}(X^x(t)),\quad {{\,\mathrm{Law}\,}}(Z^y)={{\,\mathrm{Law}\,}}(X^y(t));\\&\mathsf {P}(Z^x\preceq Z^y)\geqslant 1-2\mathsf {P}(\tau _{x\preceq y}>t). \end{aligned}$$

In a similar way, we construct another pair of random variables \(\widetilde{Z}^x\) and \(\widetilde{Z}^y\) with the following properties

$$\begin{aligned}&{{\,\mathrm{Law}\,}}(\widetilde{Z}^x)={{\,\mathrm{Law}\,}}(X^x(t)),\quad {{\,\mathrm{Law}\,}}(\widetilde{Z}^y)={{\,\mathrm{Law}\,}}(X^y(t));\\&\mathsf {P}(\widetilde{Z}^y\preceq \widetilde{Z}^x)\geqslant 1-2\mathsf {P}(\tau _{y\preceq x}>t). \end{aligned}$$

Step 3. Now it follows from Step 2 and Lemma 5.2 that

$$\begin{aligned} W_{d\wedge 1}(P_t(x,\cdot ),P_t(y,\cdot ))&\leqslant \mathsf {E}(d(Z^x, Z^y)\wedge 1)\nonumber \\&\leqslant 2\sqrt{2} (\mathsf {P}(t\leqslant \tau _{y\preceq x})+\mathsf {P}\bigl (t\leqslant \tau _{x\preceq y})\bigr )^{1/2}(1+M(x)^{1/2}). \end{aligned}$$
(5.14)

Thus everything boils down to bounding the probabilities \(\mathsf {P}(t\leqslant \tau _{y\preceq x})\) and \(\mathsf {P}\bigl (t\leqslant \tau _{x\preceq y})\). We will do it using Lemma 5.1.

Step 4. First we note that, clearly, \(\{(x_1,x_2)\in E\times E:x_1\preceq x_2\}\supset A\times B\) (recall the definitions of the sets A and B in condition (5) of the theorem). Therefore,

$$\begin{aligned} \tau _{x\preceq y}\leqslant \inf \{n\in \mathbb {N}: (X^x(n),X^y(n))\in A\times B\}. \end{aligned}$$
(5.15)

Now we apply Lemma 5.1(ii) to the state space \(\widetilde{E} := E\times E\), the kernel Q on it defined by

$$\begin{aligned} Q((x_1,x_2),A_1\times A_2) := P_1(x_1,A_1)P_1(x_2,A_2),\quad x_1,x_2\in E,\,A_1,A_2\in {\mathcal {E}}, \end{aligned}$$

the set \(A\times B\), and the Lyapunov function \(\mathcal {V}(x_1,x_2) := 1+V(x_1)+V(x_2)\), \(x_1,x_2\in E\). It follows from (2.2) and the Gronwall inequality that

$$\begin{aligned} Q\mathcal {V}(x_1,x_2)\leqslant e^{-\gamma }\mathcal {V}(x_1,x_2)+(2K/\gamma +1)(1-e^{-\gamma }). \end{aligned}$$

Therefore, condition (5.1) is met. Take \(M=4K/\gamma +1\). It is clear that this choice of M satisfies (5.2). Thanks to assumption (2.3) and the definitions of \(\mathcal {V}\) and the sets A, B, we see that condition (5.5) holds for the set \(A\times B\) in place of A and the positive constant M chosen above. Therefore all conditions of Lemma 5.1(ii) are satisfied and, thus there exist constants \(C>0\), \(\lambda >0\) that do not depend on xy such that

$$\begin{aligned} \mathsf {E}_{x,y} e^{\lambda \tau _{A\times B}}\leqslant C(1+V(x)+V(y)). \end{aligned}$$

This, combined with (5.15) and the Chebyshev inequality implies

$$\begin{aligned} \mathsf {P}(t\leqslant \tau _{y\preceq x})\leqslant C(1+V(x)+V(y)) e^{-\lambda t}. \end{aligned}$$

Using exactly the same argument, we also get that

$$\begin{aligned} \mathsf {P}(t\leqslant \tau _{x\preceq y})\leqslant C(1+V(x)+V(y)) e^{-\lambda t}. \end{aligned}$$

Substituting these bounds into (5.14), we finally deduce

$$\begin{aligned} W_{d\wedge 1}(P_t(x,\cdot ),P_t(y,\cdot ))\leqslant C (1+V(x)+V(y))(1+M(x))e^{-\lambda t/2}. \end{aligned}$$

Taking into account that obviously \(W_{d\wedge 1}(P_t(x,\cdot ),P_t(y,\cdot ))\leqslant 1\), we obtain (2.4). \(\square \)

Proof of Theorem 2.4

We begin by observing that by Jensen’s inequality and condition (6) of the theorem we have for any measures \(\mu ,\nu \in {\mathcal {P}}(E)\)

$$\begin{aligned} W_{\rho \wedge 1}(\mu ,\nu )\leqslant W_{d^\delta \wedge 1}(\mu ,\nu )\leqslant (W_{d\wedge 1}(\mu ,\nu ))^\delta . \end{aligned}$$
(5.16)

We apply now Theorem 2.3 with \(\theta := \frac{\kappa }{\delta (1+\kappa )}\). Taking into account (5.16), we obtain that there exist \(C>0\), \(\lambda >0\) such that for any \(x,y\in E\) we have

$$\begin{aligned} W_{\rho \wedge 1}(P_t(x,\cdot ),P_t(y,\cdot ))\leqslant C(1+V(x)+V(y))e^{-\lambda t},\quad t\geqslant 0 \end{aligned}$$
(5.17)

where we have also used condition (7) of the theorem.

From here the proof follows the standard line of argument. Note that by [21, Theorem 17.24] for any fixed \(t\geqslant 0\), the mapping \(x\mapsto P_t(x,\cdot )\) is measurable. Therefore by [38, Theorem 4.8 and Corollary 5.22] the mapping \((x,y)\mapsto W_{\rho \wedge 1}(P_t(x,\cdot ),P_t(y,\cdot ))\) is measurable and the function \(W_{\rho \wedge 1}\) is convex in both arguments. Therefore we immediately get from (5.17) that for any \(\mu ,\nu \in {\mathcal {P}}(E)\)

$$\begin{aligned} W_{\rho \wedge 1}(P_t \mu ,P_t \nu )\leqslant C(1+\mu (V)+\nu (V))e^{-\lambda t},\quad t\geqslant 0. \end{aligned}$$
(5.18)

Now we are ready to prove the existence of an invariant measure for \(\{P_t,t\geqslant 0\}\). Fix any \(x\in E\). We make use of (5.18) to obtain that for any \(t,s>0\)

$$\begin{aligned} W_{\rho \wedge 1}(P_t(x,\cdot ),P_{t+s}(x,\cdot ))&=W_{\rho \wedge 1}(P_t \delta _x,P_{t}(P_s\delta _x))\\&\leqslant C(1+V(x)+P_sV(x))e^{-\lambda t}\\&\leqslant C(1+2V(x)+K/\gamma )e^{-\lambda t}. \end{aligned}$$

Thus the sequence of measures \(\{P_t(x,\cdot )\}\) is Cauchy in the space \(({\mathcal {P}}(E),W_{\rho \wedge 1})\). Since this space is complete (see, e.g., [38, Theorem 6.18]), there exists a measure \(\pi \in {\mathcal {P}}(E)\) such that \(W_{\rho \wedge 1}(P_t(x,\cdot ),\pi )\rightarrow 0\) as \(t\rightarrow \infty \). Hence for any \(y\in E\) we have

$$\begin{aligned} W_{\rho \wedge 1}(P_t(y,\cdot ),\pi )\leqslant W_{\rho \wedge 1}(P_t(x,\cdot ),\pi )+W_{\rho \wedge 1}(P_t(x,\cdot ),P_t(y,\cdot ))\rightarrow 0\,\,\text {as } t\rightarrow \infty . \end{aligned}$$

Therefore for any \(t\geqslant 0\) by the monotone convergence theorem we deduce

$$\begin{aligned} W_{\rho \wedge 1}(P_t \pi ,\pi )\leqslant \int _E W_{\rho \wedge 1}(P_t(y,\cdot ),\pi )\,\pi (dy)\rightarrow 0\,\,\text {as } t\rightarrow \infty , \end{aligned}$$

where we have used again that the function \(W_{\rho \wedge 1}\) is convex. Therefore the measure \(\pi \) is invariant for \(\{P_t,t\geqslant 0\}\).

Now let us show the uniqueness of the invariant measure. First, note that if \(\pi \) is an arbitrary invariant measure for \(\{P_t,t\geqslant 0\}\), then \(\pi (V)<\infty \), see [13, Proposition 4.24]. Thus if \(\pi _1\) and \(\pi _2\) are two invariant measures, then by (5.18) we have for any \(t\geqslant 0\)

$$\begin{aligned} W_{\rho \wedge 1}(\pi _1, \pi _2)=W_{\rho \wedge 1}(P_t \pi _1,P_t \pi _2)\leqslant C(1+\pi _1(V)+\pi _2(V))e^{-\lambda t}. \end{aligned}$$

Taking the limit in the right-hand side of the above inequality we get \(\pi _1=\pi _2\).

Finally, bound (2.5) follows directly from (5.18) and the fact that \(\pi (V)<\infty \). \(\square \)

Proof of Example 2.8

First let us show that the set \(\Gamma \) defined in (2.1) is closed. Let \((x_n)_{n\in \mathbb {Z}_+}\), \((y_n)_{n\in \mathbb {Z}_+}\) be two sequences of elements of \(L_p(D,\mathbb {R})\) such that \(x_n\preceq _{pos}y_n\) and \(\Vert x_n-x\Vert _{L_p(D)}\rightarrow 0\), \(\Vert y_n-y\Vert _{L_p(D)}\) as \(n\rightarrow \infty \). We claim now that \(x\preceq _{pos}y\).

Indeed, by passing to an appropriate subsequence \((n_k)\) we see that \(x_{n_k}\rightarrow x\) almost everywhere (a.e.) and \(y_{n_k}\rightarrow y\) a.e. as \(n\rightarrow \infty \). Since for each k we have \(x_{n_k}\leqslant y_{n_k}\) a.e., it is now easily seen that \(x\leqslant y\) a.e. and thus \(x\preceq _{pos}y\). Therefore, the set \(\Gamma \) is closed.

Now let us introduce the function \(\varphi \). Put

$$\begin{aligned} \varphi (x) := 2^{p-1}\int _D \bigl |x(z)|^p{{\,\mathrm{sign}\,}}(x(z))\,dz,\quad x\in L_p(D,\mathbb {R}). \end{aligned}$$

Let us verify that the partial order \(\preceq _{pos}\) and functions \(\varphi \) and d satisfy condition (3) of Theorem 2.3. Our original proof of this was a bit long; the following simple proof was suggested to us by the referee, to whom we are very grateful for this.

Fix any \(x,y\in L_p(D,\mathbb {R})\) such that \(x\preceq _{pos}y\). Note that for any real numbers \(a<b\), \(p\geqslant 1\) one has

$$\begin{aligned} |a-b|^p\leqslant 2^{p-1}(|b|^p{{\,\mathrm{sign}\,}}(b) - |a|^p{{\,\mathrm{sign}\,}}(a)). \end{aligned}$$

Using this inequality, we get

$$\begin{aligned} 0\leqslant d(x,y)&=\Vert x-y\Vert _{L_p(D)}^p\\&\leqslant 2^{p-1} \int _D \bigl (|y(z)|^p {{\,\mathrm{sign}\,}}(y(z))-|x(z)|^p{{\,\mathrm{sign}\,}}(x(z))\,dz\\&=\varphi (y)-\varphi (x), \end{aligned}$$

and thus condition (3) of Theorem 2.3 holds. \(\square \)

5.2 Proofs of the results of Sect. 4

We begin with the following auxiliary statement which provides Gronwall-type bounds that will be very useful in the sequel.

Lemma 5.3

Let \(C_1,C_2,C_3,C_4\) be positive constants. Let X be a continuous \(\mathcal {F}_t\)-adapted process taking values in \([0,\infty )\) such that \(X(0)=x\) and

$$\begin{aligned} X(t)\leqslant X(s)-C_1\int _s^tX(r)dr+M(t)-M(s)+C_2(t-s),\quad 0\leqslant s\leqslant t, \end{aligned}$$
(5.19)

where M is a continuous local \(\mathcal {F}_t\)-martingale with \(M(0)=0\) and

$$\begin{aligned} d\langle M\rangle _t\leqslant (C_3 X(t)+C_4)dt. \end{aligned}$$
(5.20)

Then for any \(t\geqslant 0\) we have \(\mathsf {E}X(t)<+\infty \) and the following bounds hold:

$$\begin{aligned}&\mathsf {E}X(t)\leqslant \mathsf {E}X(s)-C_1\int _s^t\mathsf {E}X(r)dr+C_2(t-s),\quad 0\leqslant s\leqslant t, \end{aligned}$$
(5.21)
$$\begin{aligned}&\mathsf {E}X(t)^2\leqslant x^2 \exp (-C_1t)+C,\quad t\geqslant 0, \end{aligned}$$
(5.22)

where \(C=C(C_1,C_2,C_3,C_4)>0\) is some constant independent of t and x.

Proof

We begin with the proof of (5.21). For \(n\in \mathbb {N}\) introduce the stopping time \(\tau _n := \inf \{t\geqslant 0: |M(t)|\geqslant n\}\). Then it follows from (5.19) that for any \(n\in \mathbb {N}\), \(t\geqslant 0\)

$$\begin{aligned} \mathsf {E}X(t\wedge \tau _n )\leqslant x+C_2 (t\wedge \tau _n). \end{aligned}$$

Using Fatou’s lemma, we deduce that for any \(t\geqslant 0\) we have \(\mathsf {E}X(t)\leqslant x+C_2 t\) and thus \(\mathsf {E}X(t)<\infty \). This and (5.20) imply that for any \(t\geqslant 0\) we have \(\mathsf {E}\langle M\rangle _t\leqslant C_4t+C_3xt+C_2C_3t^2/2<\infty \). Therefore M is a martingale and (5.21) follows immediately from (5.19).

To establish (5.22), we introduce a process Y, which is a solution to the following equation

$$\begin{aligned} Y(t)= x-C_1\int _0^tY(r)dr+M(t)+C_2t,\quad t\geqslant 0. \end{aligned}$$
(5.23)

We claim that for any \(t\geqslant 0\) we have \(X(t)\leqslant Y(t)\). Indeed, assume the contrary and suppose that for some \(T_0>0\) we have \(X(T_0)> Y(T_0)\). Then, arguing as in [20, Proof of Proposition 9.2], we introduce

$$\begin{aligned} S_0 := \sup \{t\in [0,T_0]:X(t)\leqslant Y(t)\}. \end{aligned}$$

Since the processes X and Y are continuous we have \(S_0<T_0\) and \(X(t)> Y(t)\) for \(t\in (S_0,T_0)\). Then (5.19) and (5.23) imply

$$\begin{aligned} X(T_0)-Y(T_0)\leqslant X(S_0)-Y(S_0)-C_1\int _{S_0}^{T_0}(X(t)-Y(t))\,dt<0 \end{aligned}$$

which contradicts the fact that \(X(T_0)> Y(T_0)\). Therefore such \(T_0\) does not exist and \(X(t)\leqslant Y(t)\) for any \(t\geqslant 0\).

Note now that by Ito’s formula and (5.20)

$$\begin{aligned} d Y(t)^2&=(-2C_1Y(t)^2+2C_2Y(t))dt+d\langle M\rangle _t+dN(t)\nonumber \\&\leqslant (-2C_1Y(t)^2+(2C_2+C_3)Y(t)+C_4)dt+dN(t)\nonumber \\&\leqslant (-C_1Y(t)^2+C_5)dt+dN(t), \end{aligned}$$
(5.24)

where \(C_5 := (2C_2+C_3)^2/(4C_1)+C_4\) and N is a continuous local martingale.

Consider a stopping time \(\sigma _n := \inf \{t\geqslant 0: |N(t)|\geqslant n\}\). Then, using (5.24), we get for any \(0\leqslant s\leqslant t\)

$$\begin{aligned} \mathsf {E}Y(t\wedge \sigma _n)^2\leqslant \mathsf {E}Y(s\wedge \sigma _n)^2-C_1\int _{s}^t \mathsf {E}Y(r\wedge \sigma _n)^2\,dr+C_5(t-s). \end{aligned}$$

Therefore, the Gronwall inequality yields

$$\begin{aligned} \mathsf {E}X(t\wedge \sigma _n)^2\leqslant \mathsf {E}Y(t\wedge \sigma _n)^2\leqslant x^2 \exp (-C_1t)+C_5/C_1,\quad t\geqslant 0. \end{aligned}$$

The proof of (5.22) is completed by passing to the limit in the above inequality using Fatou’s lemma. \(\square \)

In all the remaining theorems and lemmas of this section we will assume that Assumption A is satisfied. To establish Theorem 4.4 we split the process u into the following two parts. The first part, denoted by w, is the stochastic convolution, that is a unique analytically and probabilistically strong solution of the following equation

$$\begin{aligned} d w(t,\xi )=\Delta w(t,\xi )dt+\sum _{k=1}^m \sigma _k(\xi )dW^k(t), \quad \xi \in \mathbb {T}^d,\, t\geqslant 0, \end{aligned}$$
(5.25)

with the initial condition \(w(0)=0\).

Lemma 5.4

The function w is well-defined. Furthermore, there exists a set \(\Omega '\subset \Omega \) such that \(\mathsf {P}(\Omega ')=1\) and for any \(\varepsilon >0\), \(T>0\), \(\omega \in \Omega '\) there exists \(C=C(\varepsilon ,T,\omega )\) such that for any \(\xi _1,\xi _2\in \mathbb {T}^d\), \(t_1,t_2\in [0,T]\) one has

$$\begin{aligned}&|w(t_1,\xi _1,\omega )-w(t_2,\xi _2,\omega )|\leqslant C(|t_1-t_2|^{1/2-\varepsilon }+|\xi _1-\xi _2|^{1-\varepsilon }). \end{aligned}$$
(5.26)

Proof

Existence and uniqueness of a strong solution of (5.25) follows from [7, Example 5.39] and assumption (4.5). Bound (5.26) follows from [7, Theorem 5.15(ii)]. \(\square \)

For \(T>0\) denote now

$$\begin{aligned} M_T=M_T(\omega )\, := \, \sup _{\begin{array}{c} t\in [0,T]\\ \xi \in \mathbb {T}^d \end{array}}|w(t,\xi ,\omega )|. \end{aligned}$$
(5.27)

By Lemma 5.4, for any \(\omega \in \Omega '\), \(T>0\) we have \(M_T(\omega )<\infty \).

The second ingredient of decomposition of u is much more tricky. We fix now arbitrary \(\omega \in \Omega '\), \(T>0\) and consider the (deterministic) PDE

$$\begin{aligned} d v(t,\xi )=\Delta v(t,\xi )dt+f(v(t,\xi )+w(t,\xi ,\omega ),\xi )dt, \quad \xi \in \mathbb {T}^d,\, t\in [0,T] \end{aligned}$$
(5.28)

with the initial condition \(v(0)=v_0\in \mathcal {C}^2(\mathbb {T}^d)\). We begin with the case when the initial condition is smooth enough.

Lemma 5.5

For any \(v_0\in \mathcal {C}^2(\mathbb {T}^d)\) Eq. (5.28) has a unique analytically strong solution v with the initial condition \(v_0\). Furthermore \(v\in \mathcal {C}([0,T],\mathcal {C}^2(\mathbb {T}^d))\) and there exists \(C>0\) such that for any \(t\in [0,T]\)

$$\begin{aligned} \Vert v(t)\Vert _{L_\infty (\mathbb {T}^d)}\leqslant C(\Vert v_0\Vert _{L_2(\mathbb {T}^d)}+M_t)(1+t^{-d/4})+C_ft, \end{aligned}$$
(5.29)

where \(C_f := \sup _{\begin{array}{c} x\geqslant 0\\ \eta \in \mathbb {T}^d \end{array}}({{\,\mathrm{sign}\,}}(x)f(x,\eta ))\vee 0\).

Proof

We begin with the uniqueness part. Suppose that v and \(\bar{v}\) are two strong solutions to Eq. (5.28) with the same initial condition \(v_0\in \mathcal {C}^2(\mathbb {T}^d)\). Then, by the chain rule

$$\begin{aligned}&\frac{d}{dt}\Vert v(t)-\bar{v}(t)\Vert _{L_2(\mathbb {T}^d)}^2\nonumber \\&\quad =2\int _{\mathbb {T}^d}(v(t,\xi )-\bar{v}(t,\xi ))\bigl (f(v(t,\xi )+w(t,\xi ,\omega ),\xi )-f(\bar{v}(t,\xi )+w(t,\xi ,\omega ),\xi )\bigr )\,d\xi \nonumber \\&\quad -2\Vert \nabla (v(t)-\bar{v}(t))\Vert _{L_2(\mathbb {T}^d)}^2\nonumber \\&\quad \leqslant C \Vert v(t)-\bar{v}(t)\Vert _{L_2(\mathbb {T}^d)}^2, \end{aligned}$$
(5.30)

where the last inequality follows from assumption (4.4). Since \(v(0)=\bar{v}(0)\), an application of the Gronwall lemma immediately implies that \(v(t)=\bar{v}(t)\) for any \(t\in [0,T]\).

To show existence of a strong solution to (5.28) we fix the initial condition \(v_0\in \mathcal {C}^2(\mathbb {T}^d)\) and \(T>0\). Note that the function

$$\begin{aligned} \mathbb {R}\times [0,T]\times \mathbb {T}^d\ni (x,t,\xi )\mapsto f(x+w(t,\xi ,\omega ),\xi ) \end{aligned}$$
(5.31)

is locally Lipschitz in x, Hölder in t and continuous in \(\xi \) thanks to (5.26) and Assumption A. Therefore it satisfies [27, Inequality (7.3.2)]. Thus, by [27, Propositions 7.3.1.i, 7.3.1.ii, 7.1.10.iii] equation (5.28) has a local solution on some interval \([0,\delta ]\), \(\delta >0\) and this solution is in \(\mathcal {C}([0,\delta ],\mathcal {C}^2(\mathbb {T}^d))\). Let us now show that Eq. (5.28) has a global solution.

It follows from (4.3) that there exist \(x_-\leqslant x_+\in \mathbb {R}\) such that \(f(x,\xi )\geqslant 0\) for any \(x\leqslant x_-\), \(\xi \in \mathbb {T}^d\) and \(f(x,\xi )\leqslant 0\) for any \(x\geqslant x_+\), \(\xi \in \mathbb {T}^d\). Consider the following PDE

$$\begin{aligned} d \psi (t,\xi )=\Delta \psi (t,\xi )dt+(f(\psi (t,\xi )+w(t,\xi ,\omega ),\xi )\vee 0)dt, \quad \xi \in \mathbb {T}^d,\, t\geqslant 0, \end{aligned}$$
(5.32)

with the initial condition \(\psi (0)=(x_++M_T)\vee \sup _\eta v_0(\eta )\) (recall the definition of \(M_T\) in (5.27)). By above, the constant function

$$\begin{aligned} \psi (t,\xi )\, := \, (x_++M_T)\vee \sup _\eta v_0(\eta ) \end{aligned}$$

solves this equation on [0, T]. On the other hand, since \(v(0,\xi )\leqslant \psi (0,\xi )\) and \(f(x+w(t,\xi ,\omega ),\xi )\leqslant f(x+w(t,\xi ,\omega ),\xi )\vee 0\), the comparison theorem for reaction–diffusion PDEs [36, Theorem 10.1] implies that for every interval \([0,\delta ]\) where Eq. (5.28) has a local solution we have

$$\begin{aligned} v(t,\xi )\leqslant (x_++M_T)\vee \sup _\eta v_0(\eta ),\quad t\in [0,\delta ],\, \xi \in \mathbb {T}^d. \end{aligned}$$

By a similar argument,

$$\begin{aligned} v(t,\xi )\geqslant (x_--M_T)\wedge \inf _\eta v_0(\eta ),\quad t\in [0,\delta ],\, \xi \in \mathbb {T}^d. \end{aligned}$$

Thus, for any interval \([0,\delta ]\) where Eq. (5.28) has a local solution, this solution is uniformly bounded by a constant which does not depend on \(\delta \). Therefore, by [27, Proposition 7.3.1.v] Eq. (5.28) has a global solution on [0, T]. By [27, Proposition 7.1.10.iii] this solution is in \(\mathcal {C}([0,T],\mathcal {C}^2(\mathbb {T}^d))\).

Finally, to obtain the desired bound on \(\Vert v(t)\Vert _{L_\infty (\mathbb {T}^d)}\) in terms of \(\Vert v_0\Vert _{L_2(\mathbb {T}^d)}\) rather than \(\Vert v_0\Vert _{L_\infty (\mathbb {T}^d)}\) we fix \(t\in [0,T]\) and consider again PDE (5.32) with the initial condition \(\psi (0,\xi ) := (v_0(\xi )\vee 0)+M_t\). By the comparison principle, we have

$$\begin{aligned} v(t,\xi )\leqslant \psi (t,\xi ), \quad \xi \in \mathbb {T}^d. \end{aligned}$$
(5.33)

Since \(\psi (0,\xi )\geqslant M_t\) for all \(\xi \in \mathbb {T}^d\) and the drift is nonnegative, it is immediate to see that \(\psi (s,\xi )\geqslant M_t\) for all \(s\in [0,t]\), \(\xi \in \mathbb {T}^d\). Therefore, taking into account (5.27), we get that for all \(s\in [0,t]\), \(\xi \in \mathbb {T}^d\)

$$\begin{aligned} f(\psi (s,\xi )+w(s,\xi ,\omega ),\xi )\vee 0\leqslant 0\vee \sup _{\begin{array}{c} x\geqslant 0\\ \eta \in \mathbb {T}^d \end{array}}f(x,\eta )=:C_+<\infty . \end{aligned}$$
(5.34)

Introduce now \(\psi _+\), which is the strong solution of the following PDE

$$\begin{aligned} d \psi _+(s,\xi )=\Delta \psi _+(s,\xi )ds+C_+ds, \quad \xi \in \mathbb {T}^d,\, s\in [0,t], \end{aligned}$$

with the initial condition \(\psi _+(0) := \psi (0)\). Using (5.34), we see that the comparison principle implies that

$$\begin{aligned} \psi (t,\xi )\leqslant \psi _+(t,\xi ), \quad \xi \in \mathbb {T}^d. \end{aligned}$$
(5.35)

Let \(p_s\) be the heat kernel on the torus \(\mathbb {T}^d\). Then it is straightforward to see that for any \(\xi \in \mathbb {T}^d\)

$$\begin{aligned} |\psi _+(t,\xi )|= & {} |p_t*\psi _0(\xi )+C_+t|\leqslant C_+t+\Vert \psi _0\Vert _{L_2(\mathbb {T}^d)}\Bigl (\int _{\mathbb {T}^d} p_t(\xi )^2\,d\xi \Bigr )^{1/2}\\\leqslant & {} C_+t+C\Vert \psi _0\Vert _{L_2(\mathbb {T}^d)}(1+t^{-d/4}). \end{aligned}$$

Combining this with (5.33) and (5.35), we deduce

$$\begin{aligned} v(t,\xi )\leqslant & {} C_+t+C\Vert \psi _0\Vert _{L_2(\mathbb {T}^d)}(1+t^{-d/4})\\\leqslant & {} t(\sup _{\begin{array}{c} x\geqslant 0\\ \eta \in \mathbb {T}^d \end{array}}f(x,\eta )\vee 0)+C(\Vert v_0\Vert _{L_2(\mathbb {T}^d)}+M_t)(1+t^{-d/4}). \end{aligned}$$

By a similar argument, we get

$$\begin{aligned} v(t,\xi )\geqslant t(\inf _{\begin{array}{c} x\leqslant 0\\ \eta \in \mathbb {T}^d \end{array}}f(x,\eta )\wedge 0)-C(\Vert v_0\Vert _{L_2(\mathbb {T}^d)}+M_t)(1+t^{-d/4}). \end{aligned}$$

This yields (5.29). \(\square \)

Now let us move on to less regular initial data.

Lemma 5.6

For any \(v_0\in L_\infty (\mathbb {T}^d)\) Eq. (5.28) has a unique analytically generalized strong solution v with initial condition \(v_0\). Furthermore, we have \(v\in \mathcal {C}((0,T],\mathcal {C}^2(\mathbb {T}^d))\).

Proof

We begin with the uniqueness part. Suppose that v and \(\bar{v}\) are two analytically generalized strong solutions to Eq. (1.1) with the same initial condition \(v_0\in L_\infty (\mathbb {T}^d)\). Fix arbitrary \(t>0\). Arguing as in the proof of Lemma 5.5, we see that by the Gronwall lemma there exists \(C=C(t)>0\) such that for any \(\delta \in (0,t)\) we have

$$\begin{aligned} \Vert v(t)-\bar{v}(t)\Vert _{L_2(\mathbb {T}^d)}^2\leqslant C \Vert v(\delta )-\bar{v}(\delta )\Vert _{L_2(\mathbb {T}^d)}^2. \end{aligned}$$
(5.36)

By definition, \(v(\delta )\rightarrow v(0)\) and \(\bar{v}(\delta )\rightarrow \bar{v}(0)\) in \(L_2(\mathbb {T}^d)\) as \(\delta \rightarrow 0\). Since \(v(0)=\bar{v}(0)\), by passing to the limit as \(\delta \rightarrow 0\) in (5.36), we deduce that \(v(t)=\bar{v}(t)\).

Note again that the function defined in (5.31) is locally Lipschitz in x, Hölder in t and continuous in \(\xi \). Therefore, by [27, Proposition 7.3.1.i] there exists \(\delta >0\) such that on the interval \([0,\delta ]\) Eq. (5.28) has an analytically generalized strong solution v with initial condition \(v_0\) and \(v\in \mathcal {C}((0,\delta ],\mathcal {C}^2(\mathbb {T}^d))\). Since \(v(\delta )\in \mathcal {C}^2(\mathbb {T}^d)\), by Lemma 5.5 we can construct an analytically strong solution v on \([\delta ,T]\) with the initial condition \(v(\delta )\) and \(v\in \mathcal {C}([\delta ,T],\mathcal {C}^2(\mathbb {T}^d))\). By gluing these two solutions together we get an analytically generalized strong solution \(v\in \mathcal {C}((0,T],\mathcal {C}^2(\mathbb {T}^d))\). \(\square \)

To consider even less regular initial data (recall that we are interested in the initial conditions from \(L_2(\mathbb {T}^d)\)) we need the following lemma about approximations of solutions to (5.28).

Lemma 5.7

Let \((v^n_0)_{n\in \mathbb {Z}_+}\) be a sequence of \(\mathcal {C}^2(\mathbb {T}^d)\) functions converging in \(L_2\) to \(v_0\in L_2(\mathbb {T}^d)\). Let \(v^n\) be the analytically strong solution of (5.28) with the initial condition \(v^n_0\). Then

  1. (i)

    \(v^n\) converges in \(\mathcal {C}([0,T],L_2(\mathbb {T}^d))\) to some \(v\in \mathcal {C}([0,T],L_2(\mathbb {T}^d))\);

  2. (ii)

    for each \(t\in (0,T]\) we have \(v(t)\in L_\infty (\mathbb {T}^d)\);

  3. (iii)

    if \(\bar{v}\) is an analytically generalized strong solution of (5.28) with the initial condition \(v_0\), then for each \(t\in [0,T]\) we have \(\bar{v}(t,\xi )=v(t,\xi )\) for almost all \(\xi \in \mathbb {T}^d\).

Proof

(i) Let \(n,m\in \mathbb {Z}_+\). Then arguing as in the derivation of (5.30) we get by the chain rule and assumption (4.4)

$$\begin{aligned} \frac{d}{dt}\Vert v^n(t)-v^m(t)\Vert ^2_{L_2(\mathbb {T}^d)}\leqslant C\Vert v^n(t)-v^m(t)\Vert ^2_{L_2(\mathbb {T}^d)}. \end{aligned}$$

By the Gronwall lemma, this implies that

$$\begin{aligned} \sup _{t\in [0,T]}\Vert v^n(t)-v^m(t)\Vert ^2_{L_2(\mathbb {T}^d)}\leqslant C(T)\Vert v^n_0-v^m_0\Vert ^2_{L_2(\mathbb {T}^d)}. \end{aligned}$$

Now the statement of the lemma follows immediately from the completeness of the space \(\mathcal {C}([0,T],L_2(\mathbb {T}^d))\) ([21, Theorem I.4.19]).

(ii) Fix \(t\in (0,T]\). By Lemma 5.5, we have the following uniform over \(n\geqslant N\) bound, where N is large enough:

$$\begin{aligned} \Vert v^n(t)\Vert _{L_\infty (\mathbb {T}^d)}\leqslant C(t)(\Vert v^n_0\Vert _{L_2(\mathbb {T}^d)}+M_t)+C_ft\leqslant 2C(t)(\Vert v_0\Vert _{L_2(\mathbb {T}^d)}+M_t)+C_ft. \end{aligned}$$
(5.37)

Since \(v^n(t)\) converges to v(t) in \(L_2\), bound (5.37) implies \(v(t)\in L_\infty (\mathbb {T}^d)\).

(iii) Let \(\bar{v}\) be an analytically generalized strong solution of (5.28) with the initial condition \(v_0\). Then arguing as in part (i) of the lemma, we get for any \(\delta >0\)

$$\begin{aligned} \sup _{t\in [\delta ,T]}\Vert \bar{v}(t)-v^n(t)\Vert ^2_{L_2(\mathbb {T}^d)}\leqslant C(T)\Vert \bar{v}(\delta )-v^n(\delta )\Vert ^2_{L_2(\mathbb {T}^d)}. \end{aligned}$$

By passing to the limit as \(\delta \rightarrow 0\) and using the fact that by definition \(\bar{v}(\delta )\rightarrow v_0\) in \(L_2\), we get

$$\begin{aligned} \sup _{t\in [0,T]}\Vert \bar{v}(t)-v^n(t)\Vert ^2_{L_2(\mathbb {T}^d)}\leqslant C(T)\Vert v_0-v^n_0\Vert ^2_{L_2(\mathbb {T}^d)}. \end{aligned}$$

However, by part (i) of the lemma

$$\begin{aligned} \sup _{t\in [0,T]}\Vert v(t)-v^n(t)\Vert ^2_{L_2(\mathbb {T}^d)}\leqslant C(T)\Vert v_0-v^n_0\Vert ^2_{L_2(\mathbb {T}^d)}. \end{aligned}$$

This implies that \(\bar{v}(t)=v(t)\) as elements of \(L_2(\mathbb {T}^d)\) for any \(t\in [0,T]\). \(\square \)

The next lemma establishes existence of solutions to Eq. (5.28) with \(L_2(\mathbb {T}^d)\) initial data. It relies on Lemma 5.7 and extends [27, Propositions 7.3.1].

Lemma 5.8

For any \(v_0\in L_2(\mathbb {T}^d)\) Eq. (5.28) has a unique analytically generalized strong solution v with the initial condition \(v_0\). Furthermore, we have \(v\in \mathcal {C}((0,T],\mathcal {C}^2(\mathbb {T}^d))\).

Proof

The proof of the uniqueness part is the same as in Lemma 5.6.

Let us show existence of a solution to (5.28). Let \((v^n_0)_{n\in \mathbb {Z}_+}\) be a sequence of \(\mathcal {C}^2(\mathbb {T}^d)\) functions converging in \(L_2\) to \(v_0\in L_2(\mathbb {T}^d)\). Let \(v^n\) be the analytically strong solution of (5.28) with the initial condition \(v^n_0\) (it exists thanks to Lemma 5.5). By Lemma 5.7(i,ii) there exists \(v\in \mathcal {C}([0,T],L_2(\mathbb {T}^d))\) such that

$$\begin{aligned} v(t)\in L_{\infty }(\mathbb {T}^d),\,\,t\in (0,T] \quad \text {and }\sup _{t\in [0,T]}\Vert v^n(t)-v(t)\Vert _{L_2}\rightarrow 0\,\,\text {as } n\rightarrow \infty . \end{aligned}$$
(5.38)

We claim now that v is an analytically generalized strong solution to (5.28) with the initial condition \(v_0\).

Indeed, we have by construction \(v(0)=v_0\) and \(v\in \mathcal {C}([0,T],L_2(\mathbb {T}^d))\) as a limit of the functions from the space \(\mathcal {C}([0,T],L_2(\mathbb {T}^d))\). Fix any \(\varepsilon \in (0,T]\) and let us verify that identity (4.2) holds.

By (5.38), we have \(v(\varepsilon /2)\in L_\infty \). Therefore, by Lemma 5.8 there exists a process \(\bar{v}\in \mathcal {C}((\varepsilon /2,T],\mathcal {C}^2(\mathbb {T}^d))\), which is an analytically generalized strong solution of (5.28) on interval \([\varepsilon /2,T]\) with the initial condition \(v(\varepsilon /2)\). Therefore, by Lemma 5.7(iii) we have \(\bar{v}(t)=v(t)\) for any \(t>\varepsilon /2\). Thus, identity (4.2) holds and for any \(t\geqslant \varepsilon \) the function v(t) has a version which is in \(\mathcal {C}^2(\mathbb {T}^d)\). Since \(\varepsilon \) was arbitrary, this implies the statement of the lemma. \(\square \)

Now we are ready to present the proof of Theorem 4.4.

Proof of Theorem 4.4

(i) The proof of the uniqueness part is the same as in Lemma 5.5. To show existence of a strong solution to (1.1) we fix an initial condition \(u_0\in \mathcal {C}^2(\mathbb {T}^d)\), \(T>0\) and first show existence on the time interval [0, T]. Let v be an analytically strong solution to (5.28) with the initial condition \(v(0)=u_0\). Recall the definition of \(\Omega '\) from Lemma 5.4 and put now

$$\begin{aligned} u(t,\xi ,\omega ) := v(t,\xi ,\omega )+w(t,\xi ,\omega ),\quad t\in [0,T], \xi \in \mathbb {T}^d, \omega \in \Omega ' \end{aligned}$$

and \(u(t,\xi ,\omega )=0\) for \(\omega \in \Omega {\setminus }\Omega '\). It follows from Lemmas 5.4 and 5.5 that the function u is an analytically strong solution to (1.1) with the initial condition \(u_0\). To show the adaptiveness of u, we introduce a function \(u^n\), \(n\in \mathbb {Z}_+\), which is an analytically and probabilistically strong solution on [0, T] to the following equation

$$\begin{aligned} u^n(t,\xi )=u_0(\xi )+\int _0^t [\Delta u^n(s,\xi )+f^n(u^n(s,\xi ),\xi )]dt+\sum _{k=1}^m \sigma _k(\xi )W^k(t), \quad \xi \in \mathbb {T}^d,\, t\geqslant 0, \end{aligned}$$
(5.39)

where for \(x\in \mathbb {R}\), \(\xi \in \mathbb {T}^d\) we put \(f^n(x,\xi ) := f((x\wedge n)\vee (-n),\xi )\). Since \(f^n\) is uniformly bounded, it follows from [22, Chapter II, Theorem 2.1] that \(u^n\) is well-defined; thus, identity (5.39) holds on some set \(\Omega _n\) of full measure. On the other hand, by uniqueness, we have

$$\begin{aligned} A_n:= & {} \cap _{k=1}^\infty \Omega _k\cap \Omega '\cap \{\sup _{t\in [0,T],\xi \in \mathbb {T}^d}|u(t,\xi )|\leqslant n\}\\= & {} \cap _{k=1}^\infty \Omega _k\cap \Omega '\cap \{\sup _{t\in [0,T],\xi \in \mathbb {T}^d}|u^n(t,\xi )|\leqslant n\} \end{aligned}$$

and \(u^n=u\) on \(A_n\). Since for each \(\omega \in \Omega '\) the function u is bounded, we see that \(\mathsf {P}(A_n)\rightarrow 1\) as \(n\rightarrow \infty \) and \(A_n\subset A_{n+1}\). This implies that \(u^n\) converges to u a.s. as \(n\rightarrow \infty \). Since each \(u^n\) is \((\mathcal {F}_t)\)-adapted, their limit u is also \((\mathcal {F}_t)\)-adapted. Therefore u is a probabilistically strong solution of (1.1) on [0, T]. Since T was arbitrary, this and uniqueness imply that u is an analytically and probabilistically strong solution of (1.1) on \([0,\infty )\). The continuity of u follows from continuity of v and w (Lemmas 5.4 and 5.5).

(ii) The proof of the uniqueness part is the same as in Lemma 5.6. To show existence fix \(T>0\) and put again now

$$\begin{aligned} u(t,\xi ,\omega )\, := \, v(t,\xi ,\omega )+w(t,\xi ,\omega ),\quad t\in [0,T], \xi \in \mathbb {T}^d, \omega \in \Omega ' \end{aligned}$$

and \(u(t,\xi ,\omega )=0\) for \(\omega \in \Omega {\setminus }\Omega '\). Here v is an analytically generalized strong solution to (5.28) with the initial condition \(v(0)=u_0\). It follows from Lemmas 5.4 and 5.6 that the function u is an analytically generalized strong solution to (1.1) with the initial condition \(u_0\). To show the adaptiveness we consider \((u^n_0)_{n\in \mathbb {Z}_+}\), a sequence of \(\mathcal {C}^2(\mathbb {T}^d)\) functions converging in \(L_2\) to \(u_0\in L_2(\mathbb {T}^d)\). Let \(u^n\) be the analytically and probabilistically strong solution of (1.1) with the initial condition \(u^n_0\) (it exists thanks to part (i) of the theorem). By Lemma 5.7(iii), we have that for each \(t>0\) the function \(u^n(t)\) converges to u(t) in \(L_2(\mathbb {T}^d)\) \(\mathsf {P}\)-a.s. Since \(u^n(t)\) is \(\mathcal {F}_t\)-adapted, we see that u(t) is also \((\mathcal {F}_t)\)-adapted. Therefore u is a probabilistically strong solution of (1.1) on [0, T]. The proof ends in the same way as in the part (i) of the theorem.

(iii) We begin with the following two observations. First, we note that part (ii) of the theorem implies that the generalized strong solution of equation (1.1) is unique.

Second, for \(x\in L_2(\mathbb {T}^d)\) let us denote by \(u^x\) the generalized strong solution of (1.1) with the initial condition x. Then, by the chain rule and Gronwall’s inequality we get that for each \(t>0\) there exists \(C(t)>0\) such that for any \(x,y \in L_2(\mathbb {T}^d)\) we have

$$\begin{aligned} \Vert u^x(t)-u^y(t)\Vert ^2_{L_2(\mathbb {T}^d)}\leqslant C(t)\Vert x-y\Vert ^2_{L_2(\mathbb {T}^d)}, \end{aligned}$$

where we also took into account assumption (4.4).

Now the Markov property of u follows from these two properties by repeating literally the argument from [3, Proposition 4.1].

(iv) Take the set \(\Omega '\) defined in Lemma 5.4. Fix \(t>0\), \(\omega \in \Omega '\), \(x,y\in L_2(\mathbb {T}^d)\) such that \(x\preceq _{pos}y\). Let \(u^x\), \(u^y\) be the generalized strong solutions of (1.1) with the initial conditions x and y respectively. Let us show now that \(u^x(t,\omega )\preceq _{pos}u^y(t,\omega )\) a.s.

We begin with the case when \(x,y\in \mathcal {C}^2(\mathbb {T}^d)\). Let w be the stochastic convolution (recall its definition in Lemma 5.4). Let \(v^x\), \(v^y\) be analytically strong solutions of (5.28) with the initial conditions x and y respectively. Then by Lemma 5.5\(v^x\) and \(v^y\) are continuous on \([0,t]\times \mathbb {T}^d\). Therefore, by comparison principle [36, Theorem 10.1] we have \(v^x(t,\omega )\preceq _{pos}v^y(t,\omega )\) and hence \(u^x(t,\omega )=v^x(t,\omega )+w(t,\omega )\preceq _{pos}v^y(t,\omega )+w(t,\omega )=u^y(t,\omega )\).

In the general case, we consider \((x^n)_{n\in \mathbb {Z}_+}\) and \((y^n)_{n\in \mathbb {Z}_+}\), two sequences of \(\mathcal {C}^2(\mathbb {T}^d)\) functions converging in \(L_2\) to x and y, respectively. By the above, for each \(n\in \mathbb {Z}_+\) we have

$$\begin{aligned} u^{x^n}(t,\omega )\preceq _{pos}u^{y^n}(t,\omega ). \end{aligned}$$
(5.40)

On the other hand, by part (ii) of the theorem and Lemma 5.7 we have \(u^{x^n}(t,\omega )\rightarrow u^{x}(t,\omega )\), \(u^{y^n}(t,\omega )\rightarrow u^{y}(t,\omega )\) in \(L_2(\mathbb {T}^d)\) as \(n\rightarrow \infty \). This together with (5.40) yields \(u^{x}(t)\preceq _{pos}u^{y}(t)\).

(v) Follows immediately from part (iv) of the theorem and the Strassen’s theorem.

(vi) Fix \(x\in L_2(\mathbb {T}^d)\). Then \(u^x\) is an analytically generalized strong solution of (1.1). Therefore by Ito’s lemma, taking into account (4.3), we deduce for any \(0<s\leqslant t\)

$$\begin{aligned} \Vert u^x(t)\Vert _{L_2(\mathbb {T}^d)}^2\leqslant&\Vert u^x(s)\Vert _{L_2(\mathbb {T}^d)}^2-2K_2\int _s^t \Vert u^x(r)\Vert _{L_2(\mathbb {T}^d)}^2\, dr+(2K_1+\Vert \sigma \Vert _{L_2(\mathbb {T}^d)}^2)(t-s)\nonumber \\&+M(t)-M(s), \end{aligned}$$
(5.41)

where the constants \(K_1\), \(K_2\) are defined in (4.3) and M is a continuous local martingale with \(M(0)=0\) and

$$\begin{aligned} d\langle M\rangle _t=4\sum _{k=1}^m\int _{\mathbb {T}^d}(u^x(t,\xi ))^2\sigma _k(\xi )^2\,d\xi dt\leqslant 4\Vert u^x(t)\Vert _{L_2(\mathbb {T}^d)}^2 \sum _{k=1}^m\Vert \sigma _k\Vert ^2_{L_\infty (\mathbb {T}^d)}dt. \end{aligned}$$

Using the fact that \(u^x\) is continuous in \(L_2(\mathbb {T}^d)\), we can pass to the limit in (5.41) as \(s\rightarrow 0\) to deduce that (5.41) is also valid for \(s=0\).

Therefore all the conditions of Lemma 5.3 are satisfied for the process \(X(t):=\Vert u^x(t)\Vert _{L_2(\mathbb {T}^d)}^2\). Thus, \(\mathsf {E}\Vert u^x(t)\Vert _{L_2(\mathbb {T}^d)}^2<\infty \) for any \(t>0\), inequality (4.6) follows from (5.21) and inequality (4.7) follows from (5.22). \(\square \)

Before we formulate and prove the final lemma we recall that under Assumption A the solution u with initial condition \(u_0 \in L_2(\mathbb {T}^d)\) satisfies the following mild form of the SPDE (1.1) [7, Theorem 5.4]

$$\begin{aligned} u(t,\xi )&=\int _{\mathbb {T}^d} p_t(\xi -\zeta )u(\varepsilon ,\zeta )d\zeta + \sum _{k=1}^m\int _\varepsilon ^t \int _{\mathbb {T}^d} p_{t-s}(\xi -\zeta )\sigma _k(\zeta )\,d\zeta \,dW^k(s)\\&\quad +\, \int _\varepsilon ^t \int _{\mathbb {T}^d} p_{t-s}(\xi -\zeta ) f(u(s,\zeta ),\zeta )\,d\zeta \,ds, \end{aligned}$$

for each \(t >0\), \(\varepsilon \in (0,t]\) and \(\xi \in \mathbb {T}^d\), where p denotes the fundamental solution of the heat equation on \(\mathbb {T}^d\). Letting \(\varepsilon \downarrow 0\), we see that almost surely

$$\begin{aligned} u(t,\xi )&=\int _{\mathbb {T}^d} p_t(\xi -\zeta )u_0(\zeta )d\zeta + \sum _{k=1}^m\int _0^t \int _{\mathbb {T}^d} p_{t-s}(\xi -\zeta )\sigma _k(\zeta )\,d\zeta \,dW^k(s)\nonumber \\&\quad +\, \lim _{\varepsilon \downarrow 0}\int _\varepsilon ^t \int _{\mathbb {T}^d} p_{t-s}(\xi -\zeta ) f(u(s,\zeta ),\zeta )\,d\zeta \,ds, \end{aligned}$$
(5.42)

where we used the Cauchy-Schwarz inequality to obtain convergence of the first term.

For an arbitrary element \(x\in L_2(\mathbb {T}^d)\) let \(S_x\) and \(L_x\) be the sets of elements from \(L_2(\mathbb {T}^d)\) that are smaller (respectively larger) than x, that is

$$\begin{aligned} S_{x} := \{f\in L_2(\mathbb {T}^d): f\preceq x\},\,\, L_x := \{f\in L_2(\mathbb {T}^d): x \preceq f\}. \end{aligned}$$

It is easy to see that the sets \(S_{x}\) and \(L_x\) are closed. For \(a \in \mathbb {R}\) put

$$\begin{aligned} \check{a}(\xi ) := a,\quad \xi \in \mathbb {T}^d. \end{aligned}$$

Lemma 5.9

Suppose that Assumptions A and B hold. Let \((P_t)\) be the semigroup associated with Eq. (1.1). Then for any \(M>0\) there exists \(\varepsilon >0\) such that

$$\begin{aligned} \inf _{\begin{array}{c} x\in L_2(\mathbb {T}^d)\\ \Vert x\Vert _{L_2}\leqslant M \end{array}} P_1(x,S_{\check{0}})>\varepsilon ,\quad \quad \inf _{\begin{array}{c} x\in L_2(\mathbb {T}^d)\\ \Vert x\Vert _{L_2}\leqslant M \end{array}} P_1(x,L_{\check{0}})>\varepsilon . \end{aligned}$$
(5.43)

Proof

By symmetry it suffices to prove the first claim. We show it in two steps.

Step 1. At this step we will use only Assumption A. We claim that for every \(M>0\) and \(T\in (0,1]\) there exist \(\Gamma \in \mathbb {R}\) and \(t_0\in (0,T]\) such that

$$\begin{aligned} \inf _{\begin{array}{c} x\in L_2(\mathbb {T}^d)\\ \Vert x\Vert _{L_2}\leqslant M \end{array}} P_{t_0}(x,S_{\check{\Gamma }})>0. \end{aligned}$$

First we note, that it is sufficient to prove the claim only for large enough M. Observe now that by (4.3) there exists \(\gamma \in \mathbb {R}\) such that \(f(z,\xi )\leqslant 0\) for all \(z \geqslant \gamma \) and all \(\xi \in \mathbb {T}^d\). Fix arbitrary \(\beta >\gamma \), and assume that M is large enough so that \(\Vert \check{\beta }\Vert _{L_2}\leqslant M\). By order preservation (Theorem 4.4 (iv)) it suffices to show that for every \(M>0\), \(T \in (0,1]\) there exist \(\Gamma \in \mathbb {R}\) and \(t_0\in (0,T]\) such that

$$\begin{aligned} \inf _{\begin{array}{c} x\in L_2(\mathbb {T}^d), \check{\beta }\preceq _{pos}x\\ \Vert x\Vert _{L_2}\leqslant M \end{array}} P_{t_0}(x,S_{\check{\Gamma }})>0. \end{aligned}$$

By continuity of solutions (Theorem 4.4) there exists \(t_0 \in (0,T]\) such that

$$\begin{aligned} \kappa := \mathsf {P}\big (\check{\gamma } \preceq _{pos}u^{\check{\beta }}(s) \text{ for } \text{ all } s \in [0,t_0]\big )>0. \end{aligned}$$

For \(x \in L_2(\mathbb {T}^d)\) we define

$$\begin{aligned} \Omega (x)\, := \, \big \{\omega : \, \check{\gamma }\preceq _{pos}u^x(s) \text{ for } \text{ all } s\in [0,t_0]\big \}. \end{aligned}$$

Fix now arbitrary \(x\in L_2(\mathbb {T}^d)\) satisfying \(\check{\beta }\preceq _{pos}x\) and \(\Vert x\Vert _{L_2}\leqslant M\). Then, using order preservation again, we see that

$$\begin{aligned} \mathsf {P}\big ( \Omega (x)\big ) \geqslant \kappa . \end{aligned}$$

Choose \(\tilde{\Gamma }\) so large that

$$\begin{aligned} \mathsf {P}\big (\tilde{\Omega }\big ):= & {} \mathsf {P}\Big (\sum _{k=1}^m \int _0^r\!\int _{\mathbb {T}^d}p_{r-s}(\xi -\zeta )\sigma _k(\zeta )\,d\zeta \,dW^k(s) \leqslant \tilde{\Gamma } \text{ for } \text{ all } 0 \leqslant r \leqslant t_0,\, \xi \in \mathbb {T}^d\Big )\\\geqslant & {} 1 - \frac{\kappa }{2}. \end{aligned}$$

Using (5.42), we see that on the set \(\tilde{\Omega }\cap \Omega (x)\) (which has measure at least \(\kappa /2\)) the solution \(u^x\) satisfies for all \(\xi \in \mathbb {T}^d\)

$$\begin{aligned} u^x(t_0,\xi )\leqslant \Big (\int _{\mathbb {T}^d}p_{t_0}^2(\xi -\zeta )\,d\zeta \Big )^{1/2}\Vert x\Vert _{L_2}+ \tilde{\Gamma }\leqslant \Big (\int _{\mathbb {T}^d}p_{t_0}^2(\xi -\zeta )\,d\zeta \Big )^{1/2}M+\tilde{\Gamma }=:\Gamma , \end{aligned}$$

where we used the Cauchy–Schwarz inequality in the first inequality and the fact that \(f(z,\xi )\leqslant 0\) for \(z \geqslant \gamma \) and \(\xi \in \mathbb {T}^d\). The proof of the claim of Step 1 is complete.

Step 2. This is the only part of the proof of Theorem 4.5, where Assumption B is used. We claim that for every \(\Gamma \in \mathbb {R}\) and \(\tau >0\) we have

$$\begin{aligned} P_\tau (\check{\Gamma },S_{\check{0}})>0. \end{aligned}$$

To show this claim, first of all, we assume, without loss of generality, that \(\Gamma >0\). It suffices to show that for some \(\tau _0=\tau _0(\Gamma )>0\) the statement holds for every \(\tau \in (0,\tau _0]\). Fix numbers \(\lambda _k\), \(k=1,\ldots ,m\) as in Assumption B with \(\varepsilon =1\) and define \(\Lambda (\xi ) := \sum _{k=1}^m\lambda _k\sigma _k(\xi )\geqslant 1\) and \({\hat{\Lambda }} := \max _{\xi \in \mathbb {T}^d}\Lambda (\xi )\). Choose \(Q > {\hat{\Lambda }} (\Gamma +2)+2\) and fix \(\tau _0\in (0,1]\) such that \(\tau _0|f(z,\xi )|\leqslant 1\) for all \(\xi \in \mathbb {T}^d\) and \(z \in [\Gamma -Q,\Gamma +2]\) (such \(\tau _0\) exists since f is continuous). Fix \(\tau \in (0,\tau _0]\) and \(\Theta := \frac{\Gamma +2}{\tau }\). Then, for

$$\begin{aligned} \widetilde{W}^k(s)\, := \,\Theta \lambda _k s +W^k(s),\; s \geqslant 0, \,k\in \{1,\ldots ,m\} \end{aligned}$$

and \(t>0\), the solution \(u^{\check{\Gamma }}\) satisfies

$$\begin{aligned} u^{\check{\Gamma }}(t,\xi )&=\Gamma - \int _0^t\int _{\mathbb {T}^d}p_{t-s}(\xi -\zeta )\Lambda (\zeta )\,\Theta \,d\zeta \,ds \\&\quad + \sum _{k=1}^m \int _0^t\int _{\mathbb {T}^d}p_{t-s}(\xi -\zeta )\sigma _k(\zeta )\,d\zeta \,d\widetilde{W}^k(s)\\&\quad +\, \int _0^t\int _{\mathbb {T}^d}p_{t-s}(\xi -\zeta )f(u^{\check{\Gamma }}(s,\zeta ),\zeta )\,d\zeta \,ds\\&=: \Gamma -I_1(t,\xi )+I_2(t,\xi ,\omega )+I_3(t,\xi ,\omega ). \end{aligned}$$

Note that for each \(t \geqslant 0\),

$$\begin{aligned} I_1(t,\xi )\in [\Theta t,\Theta {\hat{\Lambda }} t]. \end{aligned}$$

Let

$$\begin{aligned} {\hat{\Omega }} := \{\omega :\,|I_2(s,\xi ,\omega )|\leqslant 1 \text{ for } \text{ all } s \in [0,1],\xi \in \mathbb {T}^d\}. \end{aligned}$$

Then \(\mathsf {P}({\hat{\Omega }})>0\). Let . If \(\psi <\tau \) and \(u(\psi ,\xi )=\Gamma +2\) for some \(\xi \in \mathbb {T}^d\), then, on the set \({\hat{\Omega }} \cap \{\psi <\tau \}\) we have

$$\begin{aligned} \Gamma +2\leqslant \Gamma - \Theta \psi +1+1, \end{aligned}$$

so \(\psi =0\) which is impossible (since u is continuous). If, on the other hand, \(\psi <\tau \) and \(u(\psi ,\xi )=\Gamma -Q\) for some \(\xi \in \mathbb {T}^d\), then, on the set \(\widetilde{\Omega }\cap \{\psi <\tau \}\), we have

$$\begin{aligned} \Gamma -Q \geqslant \Gamma -\Theta {\hat{\Lambda }} \psi -2, \end{aligned}$$

which is also impossible by the definition of Q and \(\Theta \).

Thus, \(\psi \geqslant \tau \) almost surely on \({\hat{\Omega }}\). Therefore, on \({\hat{\Omega }}\), we have

$$\begin{aligned} u^{\check{\Gamma }}(\tau ,\xi )\leqslant \Gamma - \Theta \tau +2\leqslant 0 \end{aligned}$$

for every \(\xi \in \mathbb {T}^d\), so the proof of Step 2 is complete.

Using order preservation once more, we see that Step 1 and Step 2 imply the assertion in the lemma. \(\square \)

Finally, before we present the proof of Theorem 4.8, we need to introduce some additional notation. If for some \(a\in \mathbb {R}\), \(x\in L_2(\mathbb {T}^d)\) we have \(x(\xi )=a\) for all \(\xi \in \mathbb {T}^d\), then we will write

$$\begin{aligned} \Pi x = a;\,\,\, \Pi ^{-1}a =x. \end{aligned}$$

Introduce also the set of all constants in the space \(L_2(\mathbb {T}^d)\), that is,

$$\begin{aligned} A_c := \{\Pi ^{-1}a\mid a\in \mathbb {R}\}\subset L_2(\mathbb {T}^d). \end{aligned}$$

Let \((P_t)_{t\geqslant 0}\) be the transition function associated with the solution to (1.1).

Lemma 5.10

Consider Eq. (1.1) with \(m=1\) and \(\sigma _1\equiv 1\). Suppose that the drift f does not depend on space. This equation has the following properties:

  1. (i)

    if \(u(0)\in A_c\), then \(u(t)\in A_c\) a.s. for any \(t\geqslant 0\);

  2. (ii)

    if \(u(0)\notin A_c\), then \(u(t)\notin A_c\) a.s. for any \(t\geqslant 0\);

  3. (iii)

    this equation has a unique invariant measure \(\pi \) and \(\pi (A_c)=1\).

Proof

(i) Suppose that \(u(0)\in A_c\). Let \(a=\Pi [u(0)]\). Consider a stochastic differential equation

$$\begin{aligned} X_t^a=a+\int _0^t f(X^a_s)\,ds+W_t. \end{aligned}$$
(5.44)

Since the function f is locally Lipschitz and satisfies the one-sided growth condition, Eq. (5.44) has a unique strong solution [34, Proposition 2.1(b)]. It is immediate to see that the function \(u(t,\xi ) := X^a(t)\) is an analytically and probabilistically strong solution of (1.1) with the initial condition u(0). By Theorem 4.4 this is the unique solution to (1.1). Thus \(u(t)\in A_c\) for any \(t\geqslant 0\).

(ii) Fix the initial condition \(x\notin A_c\). First, consider the case \(x\in \mathcal {C}^2(\mathbb {T}^d)\). Let \(a_1,a_2\) be real numbers such that \(a_1\leqslant x(\xi )\leqslant a_2\) for all \(\xi \in \mathbb {T}^d\). Let \(\Omega ''\subset \Omega \) be the set of full measure where the trajectories of the Brownian motion W are continuous and identity (4.2) holds for \(\varepsilon =0\) for the initial conditions \(\Pi ^{-1}a_1\), \(\Pi ^{-1}a_2\), and x. Set \(\Omega _0 := \Omega ''\cap \Omega '\), where \(\Omega '\) is defined in Theorem 4.4(iv). It follows that \(\mathsf {P}(\Omega _0)=1\).

Assume now the contrary. That is suppose that for some \(T>0\), \(\omega \in \Omega _0\) we have \(u^x(0)=x\) and \(u^x(T,\omega )\in A_c\). Denote \(b := \Pi [u^x(T,\omega )]\). By Theorem 4.4(iv) and part (i) of the theorem, we have for all \(\xi \in \mathbb {T}^d\)

$$\begin{aligned} X_T^{a_1}(\omega )=u^{\Pi ^{-1}a_1}(T,\xi ,\omega )\leqslant u^{x}(T,\xi ,\omega ) \leqslant u^{\Pi ^{-1}a_2}(T,\xi ,\omega )=X_T^{a_2}(\omega ). \end{aligned}$$

It follows immediately from the Gronwall lemma, the fact that f is locally Lipschitz, and the comparison principle that solutions to (5.44) are continuous with respect to the initial condition. Therefore there exists some initial condition \(a\in (a_1,a_2)\) such that \(X^a(0,\omega )=a\) and \(X^a(T,\omega )=b\). Denote now \(v(t,\xi ) := u(T-t,\xi ,\omega )-X^a(T-t,\omega )\), \(t\in [0,T]\), \(\xi \in \mathbb {T}^d\). Then we have \(v(0,\xi )=0\) for any \(\xi \in \mathbb {T}^d\). On the other hand, for any \(t\in [0,T]\), \(\xi \in \mathbb {T}^d\) we have

$$\begin{aligned}&|\Delta v(t,\xi )+\partial _t v(t,\xi )|\nonumber \\&\qquad =|\Delta u(T-t,\xi ,\omega )-f(X^a(T-t,\omega ))-\Delta u(T-t,\xi ,\omega )-f(u(T-t,\xi ,\omega ))|\nonumber \\&\qquad \leqslant C|X^a(T-t,\omega )-(u(T-t,\xi ,\omega ))|\nonumber \\&\qquad =C|v(t,\xi )|, \end{aligned}$$
(5.45)

where we also used that the function f is locally Lipschitz and the processes \(u(\omega )\) and \(X^a(\omega )\) are bounded. However, inequality (5.45) together with the fact that \(v(0)=0\) implies that \(v(t)=0\) for all \(t\in [0,T]\) [37, Theorem 3.0.4]. This yields that

$$\begin{aligned} 0=v(T,\xi )=u(0,\xi )-a,\quad \xi \in \mathbb {T}^d. \end{aligned}$$

However it was assumed that \(u(0)\notin A_c\). This contradiction finishes the proof for the case \(x\in \mathcal {C}^2(\mathbb {T}^d)\).

In the general case, \(x\in L_2(\mathbb {T}^d)\), we will use the Markov property of the solutions to (1.1), that is, Theorem 4.4(iii). We have for any \(0<s<t\)

$$\begin{aligned} P_t(x,A_c)= & {} \int _{L_2(\mathbb {T}^d)}P_s(x,dy)P_{t-s}(y,A_c)= \int _{\mathcal {C}^2(\mathbb {T}^d)}P_s(x,dy)P_{t-s}(y,A_c)\\= & {} \mathsf {P}(u^x(s)\in A_c), \end{aligned}$$

where the first identity is the Markov property, the second identity follows from the fact that \(u^x(s)\in \mathcal {C}^2(\mathbb {T}^d)\) a.s. by Lemma 5.8, and the third identity follows from the fact that \(P_{t-s}(y,A_c)=0\) if \(y\in \mathcal {C}^2(\mathbb {T}^d){\setminus } A_c\) established above. Since s was arbitrary, we can pass to the limit as \(s\rightarrow 0\). Using the fact that the process \(u^x(\omega )\) is by definition continuous in \(L_2(\mathbb {T}^d)\) and \(x\notin A_c\), we get

$$\begin{aligned} \mathsf {P}(u^x(s)\in A_c)\rightarrow 0,\quad \text {as }s\rightarrow 0, \end{aligned}$$

which completes the proof.

(iii) It follows from Theorem 4.5 that SPDE (1.1) has a unique invariant measure \(\pi \) and that for any \(x\in L_2(\mathbb {T}^d)\) one has \(P_t(x,\cdot )\rightarrow \pi \) weakly as \(t\rightarrow \infty \). Fix \(x_0\in A_c\). Then, by part (i) of the proof and the Portmanteau theorem (see, e.g., [35, Theorem III.1.1(II)]), one has

$$\begin{aligned} 1=\limsup _{t\rightarrow \infty }P_t(x_0,A_c)\leqslant \pi (A_c), \end{aligned}$$

where we also used the fact that the set \(A_c\) is closed in \(L_2(\mathbb {T}^d)\). This proves the statement of the lemma. \(\square \)

Proof of Theorem 4.8

Take \(m=1\) and \(\sigma _1\equiv 1\). Fix any \(x\in L_2(\mathbb {T}^d){\setminus } A_c\). Then for any \(t>0\) by Lemma 5.10(ii) one has \(P_t(x,A_c)=0\). On the other hand, by Lemma 5.10(iii), \(\pi (A_c)=1\). This implies the statement of the theorem. \(\square \)