1 Introduction

Shigesada, Kawasaki, and Teramoto (SKT) suggested in their seminal paper [37] a deterministic cross-diffusion system for two competing species, which is able to describe the segregation of the populations. A random influence of the environment or a lack of knowledge of certain biological parameters motivates the introduction of noise terms, leading to the following stochastic system for n species, where \(u_i\) denotes the population density of the ith species:

$$\begin{aligned} \textrm{d}u_i - {\text {div}}\bigg (\sum _{j=1}^n A_{ij}(u)\nabla u_j\bigg ) \textrm{d}t = \sum _{j=1}^n \sigma _{ij}(u)\textrm{d}W_j(t)\quad \text{ in } {\mathcal {O}},\ t>0,\ i=1,\ldots ,n, \end{aligned}$$
(1)

with initial and no-flux boundary conditions

$$\begin{aligned} u_i(0)=u_i^0\quad \text{ in } {\mathcal {O}}, \quad \sum _{j=1}^n A_{ij}(u)\nabla u_j\cdot \nu = 0\quad \text{ on } \partial {\mathcal {O}},\ t>0,\ i=1,\ldots ,n, \end{aligned}$$
(2)

and diffusion coefficients

$$\begin{aligned} A_{ij}(u) = \delta _{ij}\bigg (a_{i0} + \sum _{k=1}^n a_{ik}u_k\bigg ) + a_{ij}u_i, \quad i,j=1,\ldots ,n, \end{aligned}$$
(3)

where \({\mathcal {O}}\subset {{\mathbb {R}}}^d\) (\(d\ge 1\)) is a bounded domain, \(\nu \) is the exterior unit normal vector to \(\partial {\mathcal {O}}\), \((W_1,\ldots ,W_n)\) is an n-dimensional cylindrical Wiener process, and \(a_{ij}\ge 0\) for \(i=1,\ldots ,n\), \(j=0,\ldots ,n\) are parameters. The stochastic framework is detailed in Sect. 2.
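For concreteness, the coefficient matrix (3) can be assembled numerically. The following minimal NumPy sketch (the function name and array layout are our own choices, not part of the paper) evaluates \(A(u)\) for given parameters:

```python
import numpy as np

def skt_diffusion_matrix(u, a0, a):
    """Evaluate A_ij(u) = delta_ij*(a_i0 + sum_k a_ik u_k) + a_ij u_i, cf. (3).

    u  : (n,) vector of population densities
    a0 : (n,) diffusion coefficients a_{i0}
    a  : (n, n) self- and cross-diffusion coefficients a_{ij}
    """
    # diagonal part: a_i0 + sum_k a_ik u_k; dense part: a_ij * u_i
    return np.diag(a0 + a @ u) + a * u[:, None]

# example with n = 2 species
A = skt_diffusion_matrix(np.array([1.0, 2.0]),
                         np.array([1.0, 1.0]),
                         np.array([[1.0, 2.0], [3.0, 4.0]]))
```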

The deterministic analog of (1)–(3) generalizes the two-species model of [37] to an arbitrary number of species. The deterministic model can be derived rigorously from nonlocal population systems [19, 35], stochastic interacting particle systems [8], and finite-state jump Markov models [2, 13]. The original system in [37] also contains a deterministic environmental potential and Lotka–Volterra terms, which are neglected here for simplicity.

We call \(a_{i0}\) the diffusion coefficients, \(a_{ii}\) the self-diffusion coefficients, and \(a_{ij}\) for \(i\ne j\) the cross-diffusion coefficients. We say that system (1)–(3) is with self-diffusion if \(a_{i0}\ge 0\) and \(a_{ii} > 0\) for all \(i=1,\ldots ,n\), and without self-diffusion if \(a_{i0}>0\) and \(a_{ii}=0\) for all \(i=1,\ldots ,n\).

The aim of this work is to prove the existence of global nonnegative martingale solutions to system (1)–(3), allowing for large cross-diffusion coefficients. The existence of a local pathwise mild solution to (1)–(3) with \(n=2\) was shown in [30, Theorem 4.3] under the assumption that the diffusion matrix is positive definite. Global martingale solutions to an SKT model with quadratic instead of linear coefficients \(A_{ij}(u)\) were found in [18]. Besides detailed balance, that result requires a moderate smallness condition on the cross-diffusion coefficients. We prove the existence of global martingale solutions to the SKT model for general coefficients satisfying detailed balance. This result seems to be new.

There are two major difficulties in the analysis of system (1). The first difficulty is the fact that the diffusion matrix associated to (1) is generally neither symmetric nor positive semidefinite. In particular, standard semigroup theory is not applicable. These issues have been overcome in [9, 10] in the deterministic case by revealing a formal gradient-flow or entropy structure. The task is to extend this idea to the stochastic setting.

In the deterministic case, one usually uses an implicit Euler time discretization [24]. In the stochastic case, the Itô integral forces an explicit Euler scheme, but this excludes entropy estimates. An alternative is the Galerkin scheme, which reduces the infinite-dimensional stochastic system to a finite-dimensional one; see, e.g., the proof of [32, Theorem 4.2.4]. This is possible only if energy-type (\(L^2\)) estimates are available, i.e. if \(u_i\) can be used as a test function. In the present case, however, only entropy estimates are available, with the test function \(\log u_i\), which is not an element of the Galerkin space.
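To illustrate why the Itô integral dictates an explicit time stepping, here is a minimal Euler–Maruyama sketch for a scalar SDE \(\textrm{d}u=\mu (u)\textrm{d}t+\sigma (u)\textrm{d}W\); the drift and noise coefficients are illustrative toy choices, not those of (1):

```python
import numpy as np

# Explicit Euler-Maruyama for du = mu(u) dt + sigma(u) dW: the Ito integral
# requires the non-anticipating (left-endpoint) evaluation sigma(u_old) * dW,
# which rules out an implicit-in-time treatment of the noise term.
rng = np.random.default_rng(3)
mu = lambda u: -u                 # illustrative drift
sigma = lambda u: 0.5 * u         # illustrative multiplicative noise
T, N = 1.0, 1000
dt = T / N
u = 1.0                           # initial datum
for _ in range(N):
    dW = rng.normal(0.0, np.sqrt(dt))
    u += mu(u) * dt + sigma(u) * dW   # all coefficients at the old iterate
```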

In the following, we describe our strategy to overcome these difficulties. We say that system (1) has an entropy structure if there exists a function \(h:[0,\infty )^n\rightarrow [0,\infty )\), called an entropy density, such that the deterministic analog of (1) can be written in terms of the entropy variables (or chemical potentials) \(w_i=\partial h/\partial u_i\) as

$$\begin{aligned} \partial _t u_i(w) - {\text {div}}\bigg (\sum _{j=1}^n B_{ij}(w)\nabla w_j\bigg ) = 0, \quad i=1,\ldots ,n, \end{aligned}$$
(4)

where \(w=(w_1,\ldots ,w_n)\), \(u_i\) is interpreted as a function of w, and \(B(w)=A(u(w))h''(u(w))^{-1}\) with \(B=(B_{ij})\) is positive semidefinite. For the deterministic analog of (1), it was shown in [11] that the entropy density is given by

$$\begin{aligned} h(u) = \sum _{i=1}^n\pi _i \big (u_i(\log u_i-1)+1\big ), \quad u\in [0,\infty )^n, \end{aligned}$$
(5)

where the numbers \(\pi _i>0\) are assumed to satisfy \(\pi _ia_{ij}=\pi _ja_{ji}\) for all \(i,j=1,\ldots ,n\). This condition is the detailed-balance condition for the Markov chain associated to \((a_{ij})\), and \((\pi _1,\ldots ,\pi _n)\) is the corresponding reversible stationary measure [11]. Using \(w_i=\pi _i\log u_i\) in (4) as a test function and summing over \(i=1,\ldots ,n\), a formal computation shows that

$$\begin{aligned} \frac{\textrm{d}}{\textrm{d}t}\int _{\mathcal {O}}h(u)\textrm{d}x + 2\int _{\mathcal {O}}\sum _{i=1}^n\pi _i\bigg (2a_{i0}|\nabla \sqrt{u_i}|^2 + a_{ii}|\nabla u_i|^2 + \sum _{j\ne i}a_{ij}|\nabla \sqrt{u_iu_j}|^2\bigg )\textrm{d}x = 0. \end{aligned}$$
(6)

A similar expression holds in the stochastic setting; see (29). It provides \(L^2\) estimates for \(\nabla \sqrt{u_i}\) if \(a_{i0}>0\) and for \(\nabla u_i\) if \(a_{ii}>0\). Moreover, once the existence of a solution w to an approximate version of (1) is established, the positivity of \(u_i(w)=\exp (w_i/\pi _i)\) follows immediately (and the nonnegativity of the limit after passing to the de-regularization limit).
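The entropy structure can also be checked numerically: under the detailed-balance condition, the mobility matrix \(B=A(u)h''(u)^{-1}\) with \(h''(u)=\textrm{diag}(\pi _i/u_i)\) is symmetric and positive semidefinite. The following sketch samples random admissible parameters (our own choice) and verifies both properties:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
pi = rng.uniform(0.5, 2.0, n)            # reversible measure pi_i > 0
c = rng.uniform(0.0, 5.0, (n, n))        # symmetric seed matrix c + c^T
a = (c + c.T) / (2 * pi[:, None])        # enforces pi_i a_ij = pi_j a_ji
a0 = rng.uniform(0.0, 5.0, n)

min_eig, sym_err = np.inf, 0.0
for _ in range(100):
    u = rng.uniform(1e-3, 10.0, n)                 # positive densities
    A = np.diag(a0 + a @ u) + a * u[:, None]       # diffusion matrix (3)
    B = A @ np.diag(u / pi)                        # B = A * (h'')^{-1}, h from (5)
    sym_err = max(sym_err, np.abs(B - B.T).max())  # symmetry defect
    min_eig = min(min_eig, np.linalg.eigvalsh((B + B.T) / 2).min())
```

Dropping the detailed-balance construction of `a` (e.g., taking a generic nonnegative matrix) typically destroys the symmetry of B, which is exactly why Assumption (A3) is needed.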

To define the approximate scheme, our idea is to “regularize” the entropy variable w. Instead of the algebraic mapping \(w\mapsto u(w)\), we introduce the mapping \(Q_\varepsilon (w)=u(w) + \varepsilon L^*Lw\), where \(L:D(L)\rightarrow H\) with domain \(D(L)\subset H\) is a suitable operator and \(L^*\) is its dual; see Sect. 3 for details. The operator L is chosen in such a way that all elements of D(L) are bounded functions, implying that u(w) is well defined. Introducing the regularization operator \(R_\varepsilon :D(L)'\rightarrow D(L)\) as the inverse of \(Q_\varepsilon :D(L)\rightarrow D(L)'\), the approximate scheme for (1) reads, in compact form,

$$\begin{aligned} \textrm{d}v(t) = {\text {div}}\big (B(R_\varepsilon (v))\nabla R_\varepsilon (v)\big )\textrm{d}t + \sigma \big (u(R_\varepsilon (v))\big )\textrm{d}W(t), \quad t>0. \end{aligned}$$
(7)

The existence of a local weak solution \(v^\varepsilon \) to (7) with suitable initial and boundary conditions is proved by applying the abstract result of [32, Theorem 4.2.4]; see Theorem 13. The entropy inequality for \(w^\varepsilon :=R_\varepsilon (v^\varepsilon )\) and \(u^\varepsilon :=u(w^\varepsilon )\),

$$\begin{aligned} {{\mathbb {E}}}&\sup _{0<t<T\wedge \tau _R}\int _{\mathcal {O}}h(u^\varepsilon (t))\textrm{d}x + \frac{\varepsilon }{2}{{\mathbb {E}}}\sup _{0<t<T\wedge \tau _R}\Vert Lw^\varepsilon (t)\Vert _{L^2({\mathcal {O}})}^2 \\&{}+ {{\mathbb {E}}}\sup _{0<t<T\wedge \tau _R}\int _0^t\int _{\mathcal {O}}\nabla w^\varepsilon (s):B(w^\varepsilon (s))\nabla w^\varepsilon (s)\textrm{d}x\textrm{d}s \le C(u^0,T), \end{aligned}$$

up to some stopping time \(\tau _R>0\) allows us to extend the local solution to a global one (Proposition 16).

For the de-regularization limit \(\varepsilon \rightarrow 0\), we need suitable uniform bounds. The entropy inequality provides gradient bounds for \(u_i^\varepsilon \) in the case with self-diffusion and for \((u_i^\varepsilon )^{1/2}\) in the case without self-diffusion. Based on these estimates, we use the Gagliardo–Nirenberg inequality to prove uniform bounds for \(u_i^\varepsilon \) in \(L^q(0,T;L^q({\mathcal {O}}))\) with \(q\ge 2\). Such an estimate is crucial to define, for instance, the product \(u_i^\varepsilon u_j^\varepsilon \). Furthermore, we show a uniform estimate for \(u_i^\varepsilon \) in the Sobolev–Slobodeckij space \(W^{\alpha ,p}(0,T;D(L)')\) for some \(\alpha <1/2\) and \(p>2\) such that \(\alpha p>1\). These estimates are needed to prove the tightness of the laws of \((u^\varepsilon )\) in some sub-Polish space and to conclude strong convergence in \(L^2\) thanks to the Skorokhod–Jakubowski theorem.

For the uniform estimates, we need to distinguish the cases with and without self-diffusion. In the former case, we obtain an \(L^2(0,T;H^1({\mathcal {O}}))\) estimate for \(u_i^\varepsilon \), such that the product \(u_i^\varepsilon \nabla u_j^\varepsilon \) is integrable, and we can pass to the limit in the coefficients \(A_{ij}(u^\varepsilon )\). Without self-diffusion, we can only conclude that \((u_i^\varepsilon )\) is bounded in \(L^2(0,T;W^{1,1}({\mathcal {O}}))\), and products like \(u_i^\varepsilon \nabla u_j^\varepsilon \) may not be integrable. To overcome this issue, we use the fact that

$$\begin{aligned} {\text {div}}\bigg (\sum _{j=1}^n A_{ij}(u^\varepsilon )\nabla u_j^\varepsilon \bigg ) = \Delta \bigg (u_i^\varepsilon \bigg (a_{i0} + \sum _{j=1}^n a_{ij} u_j^\varepsilon \bigg )\bigg ) \end{aligned}$$
(8)

and write (1) in a “very weak” formulation by applying the Laplace operator to the test function. Since the bound in \(L^2(0,T;W^{1,1}({\mathcal {O}}))\) implies a bound in \(L^2(0,T;L^2({\mathcal {O}}))\) in two space dimensions, products like \(u_i^\varepsilon u_j^\varepsilon \) are integrable. In the deterministic case, we can exploit the \(L^2\) bound for \(\nabla (u_i^\varepsilon u_j^\varepsilon )^{1/2}\) to find a bound for \(u_i^\varepsilon u_j^\varepsilon \) in \(L^1(0,T;L^1({\mathcal {O}}))\) in any space dimension, but the limit involves an identification that we could not extend to the martingale solution concept.
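Identity (8) is an exact algebraic consequence of the product rule, \(u_j\nabla u_i + u_i\nabla u_j = \nabla (u_iu_j)\). Assuming SymPy is available, the following sketch verifies the identity symbolically for \(n=2\) in one space dimension:

```python
import sympy as sp

# Symbolic check of identity (8) for n = 2, d = 1:
#   d/dx( sum_j A_ij(u) u_j' )  ==  d^2/dx^2( u_i * (a_i0 + sum_j a_ij u_j) ).
x = sp.symbols('x')
u = [sp.Function('u1')(x), sp.Function('u2')(x)]
a0 = sp.symbols('a10 a20')
a = [[sp.Symbol('a%d%d' % (i + 1, j + 1)) for j in range(2)] for i in range(2)]

residuals = []
for i in range(2):
    # row i of the diffusion matrix (3)
    A_i = [(a0[i] + sum(a[i][k] * u[k] for k in range(2))) * (1 if i == j else 0)
           + a[i][j] * u[i] for j in range(2)]
    lhs = sum(sp.diff(A_i[j] * sp.diff(u[j], x), x) for j in range(2))
    rhs = sp.diff(u[i] * (a0[i] + sum(a[i][j] * u[j] for j in range(2))), x, 2)
    residuals.append(sp.expand(lhs - rhs))
```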

On an informal level, we may state our main result as follows. We refer to Sect. 2 for the precise formulation.

Theorem 1

(Informal statement) Let \(a_{ij}\ge 0\) satisfy the detailed-balance condition, let the stochastic diffusion \(\sigma _{ij}\) be Lipschitz continuous with values in the space of Hilbert–Schmidt operators, and let a certain interaction condition between the entropy and the stochastic diffusion hold (see Assumption (A5) below). Then there exists a global nonnegative martingale solution to (1)–(3), in the case with self-diffusion in any space dimension and in the case without self-diffusion in at most two space dimensions.

We discuss examples for \(\sigma _{ij}(u)\) in Sect. 7. Here, we only remark that an admissible diffusion term is

$$\begin{aligned} \sigma _{ij}(u)=\delta _{ij}u_i^\alpha \sum _{k=1}^\infty a_k(e_k,\cdot )_{U}, \quad i,j=1,\ldots ,n, \end{aligned}$$
(9)

where \(1/2\le \alpha \le 1\), \(\delta _{ij}\) is the Kronecker symbol, \(a_k\ge 0\) decays sufficiently fast, and \((e_k)\) is an orthonormal basis of the Hilbert space U with inner product \((\cdot ,\cdot )_{U}\).
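Because the noise (9) is diagonal and acts on U through a fixed functional, its Hilbert–Schmidt norm factorizes. A numerical sketch with \(\alpha =1/2\) on \({\mathcal {O}}=(0,1)\) (the sample densities and the crude quadrature are our own choices):

```python
import numpy as np

# Hilbert-Schmidt norm of (9): since sigma_ij = delta_ij * u_i^alpha * F with the
# fixed functional F = sum_k a_k (e_k, .)_U, it factorizes as
#   ||sigma(u)||_HS^2 = (sum_k a_k^2) * sum_i ||u_i^alpha||_{L^2(O)}^2.
alpha = 0.5
a_k = 1.0 / np.arange(1.0, 200.0) ** 2         # square-summable sequence a_k
x = np.linspace(0.0, 1.0, 1001)                # grid on O = (0,1)
u = np.stack([2 + np.sin(np.pi * x), 1 + x])   # two sample densities u_1, u_2
quad = lambda f: np.mean(f)                    # crude quadrature on (0,1)

hs_factored = np.sum(a_k ** 2) * sum(quad(ui ** (2 * alpha)) for ui in u)
hs_direct = sum(quad((ak * ui ** alpha) ** 2) for ui in u for ak in a_k)
```

With \(\alpha \le 1\), this factorization is what yields the linear-growth bound in Assumption (A4).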

We end this section by giving a brief overview of the state of the art for the deterministic SKT model. First existence results for the two-species model were proven under restrictive conditions on the parameters, for instance in one space dimension [26], for the triangular system with \(a_{21}=0\) [33], or for small cross-diffusion parameters, since in the latter situation the diffusion matrix becomes positive definite [17]. Amann [1] proved that a priori estimates in the \(W^{1,p}({\mathcal {O}})\) norm with \(p>d\) are sufficient to conclude the global existence of solutions to quasilinear parabolic systems, and he applied this result to the triangular SKT system. The first global existence proof without any restriction on the parameters \(a_{ij}\) (except nonnegativity) was achieved in [22] in one space dimension. This result was generalized to several space dimensions in [9, 10] and to the whole space problem in [21]. SKT-type systems with nonlinear coefficients \(A_{ij}(u)\), but still for two species, were analyzed in [15, 16]. Global existence results for SKT-type models with an arbitrary number of species and under a detailed-balance condition were first proved in [11] and later generalized in [31].

This paper is organized as follows. We present our notation and the main results in Sect. 2. The operators needed to define the approximate scheme are introduced in Sect. 3. In Sect. 4, the existence of solutions to a general approximate scheme is proved and the corresponding entropy inequality is derived. Theorems 4 and 5 are shown in Sects. 5 and 6, respectively. Section 7 is concerned with examples for \(\sigma _{ij}(u)\) satisfying our assumptions. Finally, the proofs of some auxiliary lemmas are presented in Appendix A, and Appendix B states a tightness criterion that (slightly) extends [5, Corollary 2.6] to the Banach space setting.

2 Notation and main result

2.1 Notation and stochastic framework

Let \({\mathcal {O}}\subset {{\mathbb {R}}}^d\) (\(d\ge 1\)) be a bounded domain. The Lebesgue and Sobolev spaces are denoted by \(L^p({\mathcal {O}})\) and \(W^{k,p}({\mathcal {O}})\), respectively, where \(p\in [1,\infty ]\), \(k\in {{\mathbb {N}}}\), and \(H^k({\mathcal {O}})=W^{k,2}({\mathcal {O}})\). For notational simplicity, we generally do not distinguish between \(W^{k,p}({\mathcal {O}})\) and \(W^{k,p}({\mathcal {O}};{{\mathbb {R}}}^n)\). We set \(H_N^m({\mathcal {O}}) = \{v\in H^m({\mathcal {O}}):\nabla v\cdot \nu =0\) on \(\partial {\mathcal {O}}\}\) for \(m\ge 2\). If \(u=(u_1,\ldots ,u_n)\in X\) is some vector-valued function in the normed space X, we write \(\Vert u\Vert _X^2=\sum _{i=1}^n\Vert u_i\Vert _X^2\). The inner product of a Hilbert space H is denoted by \((\cdot ,\cdot )_H\), and \(\langle \cdot ,\cdot \rangle _{V',V}\) is the dual product between the Banach space V and its dual \(V'\). If \(F:U\rightarrow V\) is a Fréchet differentiable function between Banach spaces U and V, we write \(\textrm{D}F[v]:U\rightarrow V\) for its Fréchet derivative, for any \(v\in U\).

Given two square matrices \(A=(A_{ij})\), \(B=(B_{ij})\in {{\mathbb {R}}}^{n\times n}\), \(A:B=\sum _{i,j=1}^n A_{ij}B_{ij}\) is the Frobenius matrix product, \(\Vert A\Vert _F=(A:A)^{1/2}\) the Frobenius norm of A, and \({\text {tr}}A =\sum _{i=1}^n A_{ii}\) the trace of A. The constants \(C>0\) in this paper are generic and their values change from line to line.

Let \((\Omega ,{\mathcal {F}},{{\mathbb {P}}})\) be a probability space endowed with a complete right-continuous filtration \({{\mathbb {F}}}=({\mathcal {F}}_t)_{t\ge 0}\) and let H be a Hilbert space. Then \(L^0(\Omega ;H)\) consists of all measurable functions from \(\Omega \) to H, and \(L^2(\Omega ;H)\) consists of all H-valued random variables v such that \({{\mathbb {E}}}\Vert v\Vert _H^2=\int _\Omega \Vert v(\omega )\Vert _H^2{{\mathbb {P}}}(\textrm{d}\omega )<\infty \). Let U be a separable Hilbert space and \((e_k)_{k\in {{\mathbb {N}}}}\) be an orthonormal basis of U. The space of Hilbert–Schmidt operators from U to \(L^2({\mathcal {O}})\) is defined by

$$\begin{aligned} {{\mathcal {L}}}_2(U;L^2({\mathcal {O}})) = \bigg \{F:U\rightarrow L^2({\mathcal {O}}) \text{ linear, } \text{ continuous }: \sum _{k=1}^\infty \Vert Fe_k\Vert _{L^2({\mathcal {O}})}^2 < \infty \bigg \}, \end{aligned}$$

and it is endowed with the norm \(\Vert F\Vert _{{{\mathcal {L}}}_2(U;L^2({\mathcal {O}}))} = (\sum _{k=1}^\infty \Vert Fe_k\Vert _{L^2({\mathcal {O}})}^2)^{1/2}\).
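In finite dimensions, this definition reduces to the Frobenius norm. A quick sanity check, with U replaced by \({{\mathbb {R}}}^7\), \(L^2({\mathcal {O}})\) replaced by \({{\mathbb {R}}}^5\), and the standard basis as \((e_k)\):

```python
import numpy as np

# For a matrix F : R^7 -> R^5 and the standard basis (e_k) of R^7, the
# Hilbert-Schmidt norm sum_k ||F e_k||^2 equals the squared Frobenius norm of F.
rng = np.random.default_rng(4)
F = rng.normal(size=(5, 7))
hs_sq = sum(np.sum((F @ ek) ** 2) for ek in np.eye(7))
```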

Let \(W=(W_1,\ldots ,W_n)\) be an n-dimensional U-cylindrical Wiener process, taking values in the separable Hilbert space \(U_0\supset U\) and adapted to the filtration \({{\mathbb {F}}}\). We can write \(W_j=\sum _{k=1}^\infty e_k W_j^k\), where \((W_j^k)\) is a sequence of independent standard one-dimensional Brownian motions [12, Section 4.1.2]. Then \(W_j(\omega )\in C^0([0,\infty );U_0)\) for a.e. \(\omega \) [32, Section 2.5.1].
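In practice, one simulates one component \(W_j\) by truncating the series \(\sum _k e_k W_j^k\); the full series converges only in the larger space \(U_0\), not in U itself. A sketch with a hypothetical Fourier cosine basis on \({\mathcal {O}}=(0,1)\) (the basis choice and truncation parameters are illustrative):

```python
import numpy as np

# Truncated simulation of W_j ~ sum_k e_k W_j^k with K modes: e_k is an
# orthonormal cosine basis of L^2(0,1), and the W_j^k are independent scalar
# Brownian motions sampled on a uniform time grid.
rng = np.random.default_rng(1)
K, N, T = 50, 1000, 1.0                       # modes, time steps, horizon
dt = T / N
x = np.linspace(0.0, 1.0, 201)
e = np.vstack([np.ones_like(x)]
              + [np.sqrt(2) * np.cos(k * np.pi * x) for k in range(1, K)])
Wk = np.cumsum(rng.normal(0.0, np.sqrt(dt), (N, K)), axis=0)  # K Brownian paths
W_field = Wk @ e                              # W_j(t_m, x): shape (N, len(x))
quad = lambda f: ((f[0] + f[-1]) / 2 + f[1:-1].sum()) / (len(f) - 1)
```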

2.2 Assumptions

We impose the following assumptions:

  1. (A1)

    Domain: \({\mathcal {O}}\subset {{\mathbb {R}}}^d\) (\(d\ge 1\)) is a bounded domain with Lipschitz boundary. Let \(T>0\) and set \(Q_T={\mathcal {O}}\times (0,T)\).

  2. (A2)

Initial datum: \(u^0=(u_1^0,\ldots ,u_n^0)\in L^\infty (\Omega ;L^2({\mathcal {O}};{{\mathbb {R}}}^n))\) is an \({\mathcal {F}}_0\)-measurable random variable satisfying \(u^0(x)\ge 0\) for a.e. \(x\in {\mathcal {O}}\) \({{\mathbb {P}}}\)-a.s.

  3. (A3)

    Diffusion matrix: \(a_{ij}\ge 0\) for \(i=1,\ldots ,n\), \(j=0,\ldots ,n\) and there exist \(\pi _1,\ldots ,\pi _n>0\) such that \(\pi _i a_{ij}=\pi _j a_{ji}\) for all \(i,j=1,\ldots ,n\) (detailed-balance condition).

  4. (A4)

    Multiplicative noise: \(\sigma =(\sigma _{ij})\) is an \(n\times n\) matrix, where \(\sigma _{ij}:L^2({\mathcal {O}};{{\mathbb {R}}}^n)\rightarrow {{\mathcal {L}}}_2(U;L^2({\mathcal {O}}))\) is \({\mathcal {B}}(L^2({\mathcal {O}};{{\mathbb {R}}}^n))/{\mathcal {B}}({{\mathcal {L}}}_2(U;L^2({\mathcal {O}})))\)-measurable and \({{\mathbb {F}}}\)-adapted. Furthermore, there exists \(C_\sigma >0\) such that for all u, \(v\in L^2({\mathcal {O}};{{\mathbb {R}}}^n)\),

    $$\begin{aligned} \Vert \sigma (u)-\sigma (v)\Vert _{{{\mathcal {L}}}_2(U;L^2({\mathcal {O}}))}&\le C_\sigma \Vert u-v\Vert _{L^2({\mathcal {O}})}, \\ \Vert \sigma (v)\Vert _{{{\mathcal {L}}}_2(U;L^2({\mathcal {O}}))}&\le C_\sigma (1+\Vert v\Vert _{L^2({\mathcal {O}})}). \end{aligned}$$
  5. (A5)

    Interaction between entropy and noise: There exists \(C_h>0\) such that for all \(u\in L^\infty ({\mathcal {O}}\times (0,T))\),

    $$\begin{aligned}&\bigg \{\int _0^t\sum _{k=1}^\infty \sum _{i,j=1}^n \bigg (\int _{\mathcal {O}}\frac{\partial h}{\partial u_i}(u(s)) \sigma _{ij}(u(s))e_k\textrm{d}x\bigg )^2\textrm{d}s\bigg \}^{1/2}\\&\quad \le C_h\bigg (1 + \int _0^t\int _{\mathcal {O}}h(u(s))\textrm{d}x\textrm{d}s\bigg ), \\&\int _0^t\sum _{k=1}^\infty \int _{\mathcal {O}}{\text {tr}}\big [ (\sigma (u)e_k)^T h''(u)\sigma (u)e_k\big ](s)\textrm{d}x\textrm{d}s \\&\quad \le C_h\bigg (1 + \int _0^t\int _{\mathcal {O}}h(u(s))\textrm{d}x\textrm{d}s\bigg ), \end{aligned}$$

    where h is the entropy density defined in (5).

Remark 2

(Discussion of the assumptions)

  1. (A1)

    The Lipschitz regularity of the boundary \(\partial {\mathcal {O}}\) is needed to apply the Sobolev and Gagliardo–Nirenberg inequalities.

  2. (A2)

    The regularity condition on \(u^0\) can be weakened to \(u^0\in L^p(\Omega ;L^2({\mathcal {O}};{{\mathbb {R}}}^n))\) for sufficiently large \(p\ge 2\) (only depending on the space dimension); it is used to derive the higher-order moment estimates.

  3. (A3)

    The detailed-balance condition is also needed in the deterministic case to reveal the entropy structure of the system; see [11].

  4. (A4)

    The Lipschitz continuity of the stochastic diffusion \(\sigma (u)\) is a standard condition for stochastic PDEs; see, e.g., [36].

  5. (A5)

    This is the most restrictive assumption. It compensates for the singularity of \((\partial h/\partial u_i)(u)=\pi _i\log u_i\) at \(u_i=0\). We show in Lemma 34 that

    $$\begin{aligned} \sigma _{ij}(u)(\cdot ) = \frac{u_i\delta _{ij}}{1+u_i^{1/2+\eta }} \sum _{k=1}^\infty a_k(e_k,\cdot )_U \end{aligned}$$

    satisfies Assumption (A5), where \(\eta >0\) and \((a_k)\in \ell ^2({{\mathbb {R}}})\). Taking into account the gradient estimate from the entropy inequality (see (6)), we can allow for more general stochastic diffusion terms like (9); see Lemma 35.
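The role of the damping \(1+u_i^{1/2+\eta }\) can be seen already at the pointwise level: it tames the logarithmic singularity of \(\partial h/\partial u_i\) at zero and its growth at infinity. A numerical sketch of this heuristic (the constant 50 is an illustrative choice for these parameters, not the optimal \(C_h\)):

```python
import numpy as np

# Pointwise heuristic behind (A5) for the example above: the damped coefficient
# g(u) = u / (1 + u^{1/2+eta}) tames dh/du = pi_i * log(u), so |log(u) * g(u)|
# is controlled by 1 + h_i(u), where h_i(u) = pi_i * (u*(log(u) - 1) + 1) is one
# summand of the entropy density (5).
pi_i, eta = 1.0, 0.1
u = np.logspace(-8.0, 6.0, 2000)
lhs = np.abs(pi_i * np.log(u) * u / (1 + u ** (0.5 + eta)))
h = pi_i * (u * (np.log(u) - 1.0) + 1.0)
```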

Remark 3

(Reaction terms) It is possible to include additional nonlinear, continuous reaction terms \(f:{{\mathbb {R}}}^{n}\rightarrow {{\mathbb {R}}}^{n}\) satisfying

$$\begin{aligned} \int _0^t \sum _{i=1}^{n} \int _{\mathcal {O}}f_{i}(u)\frac{\partial h}{\partial u_{i}} \textrm{d}x\textrm{d}s \le C_f\bigg (1 + \int _0^t\int _{\mathcal {O}}h(u(s))\textrm{d}x\textrm{d}s\bigg ). \end{aligned}$$

A prominent choice is given by the so-called Lotka–Volterra source terms

$$\begin{aligned} f_{i}(u)=\bigg (b_{i0}-\sum _{j=1}^{n}b_{ij}u_{j}\bigg )u_{i},\quad i=1,\ldots ,n, \end{aligned}$$

where \(b_{ij}\ge 0\) for \(i=1,\dots ,n\), \(j=0,1,\dots ,n\). Considering the entropy density h given by (5), it is easy to see that this reaction term even improves the integrability of the solution, due to bounds for terms of the form \(u_{i}^{2}\log (u_{i})\), \(i=1,\dots ,n\).
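A single-species numerical sketch of this bound (the parameters and the constant 20 are illustrative choices, not optimal constants):

```python
import numpy as np

# Sketch of the reaction-term condition in Remark 3 for one species: with the
# Lotka-Volterra term f(u) = (b0 - b*u)*u and h'(u) = pi*log(u), the entropy
# production f(u)*h'(u) is bounded above by C*(1 + h(u)), since the quadratic
# part contributes -b*pi*u^2*log(u) <= 0 for u >= 1.
pi_i, b0, b = 1.0, 2.0, 0.5
u = np.logspace(-8.0, 8.0, 4000)
prod = (b0 - b * u) * u * pi_i * np.log(u)
h = pi_i * (u * (np.log(u) - 1.0) + 1.0)
```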

2.3 Main results

Let \(T>0\), \(m\in {{\mathbb {N}}}\) with \(m>d/2+1\), and \(D(L)=H_N^m({\mathcal {O}})\).

Definition 1

(Martingale solution) A martingale solution to (1)–(3) is a triple \(({\widetilde{U}},{\widetilde{W}},{\widetilde{u}})\) such that \({\widetilde{U}} =({{\widetilde{\Omega }}},\widetilde{{\mathcal {F}}},{{\widetilde{{{\mathbb {P}}}}}},{{\widetilde{{{\mathbb {F}}}}}})\) is a stochastic basis with filtration \({{\widetilde{{{\mathbb {F}}}}}}=(\widetilde{{\mathcal {F}}}_t)_{t\ge 0}\), \({\widetilde{W}}\) is an n-dimensional cylindrical Wiener process, and \({\widetilde{u}}=({\widetilde{u}}_1,\ldots ,{\widetilde{u}}_n)\) is a continuous \(D(L)'\)-valued \({\widetilde{{{\mathbb {F}}}}}\)-adapted process such that \({\widetilde{u}}_i\ge 0\) a.e. in \({\mathcal {O}}\times (0,T)\) \({{\widetilde{{{\mathbb {P}}}}}}\)-a.s.,

$$\begin{aligned} {\widetilde{u}}_i\in L^0({{\widetilde{\Omega }}};C^0([0,T];D(L)'))\cap L^0({{\widetilde{\Omega }}};L^2(0,T;H^1({\mathcal {O}}))), \end{aligned}$$
(10)

the law of \({\widetilde{u}}_i(0)\) is the same as for \(u_i^0\), and for all \(\phi \in D(L)\), \(t\in (0,T)\), \(i=1,\ldots ,n\), \({{\widetilde{{{\mathbb {P}}}}}}\)-a.s.,

$$\begin{aligned} \langle {\widetilde{u}}_i(t),\phi \rangle _{D(L)',D(L)}&= \langle {\widetilde{u}}_i(0),\phi \rangle _{D(L)',D(L)} - \sum _{j=1}^n\int _0^t\int _{\mathcal {O}}A_{ij}({\widetilde{u}}(s)) \nabla {\widetilde{u}}_j(s)\cdot \nabla \phi \textrm{d}x\textrm{d}s \nonumber \\&\quad \;\,+ \sum _{j=1}^n\int _{\mathcal {O}}\bigg (\int _0^t\sigma _{ij}({\widetilde{u}}(s)) \textrm{d}{\widetilde{W}}_j(s)\bigg )\phi \textrm{d}x. \end{aligned}$$
(11)

Our main results read as follows.

Theorem 4

(Existence for the SKT model with self-diffusion) Let Assumptions (A1)–(A5) be satisfied and let \(a_{ii}>0\) for \(i=1,\ldots ,n\). Then (1)–(3) has a global nonnegative martingale solution in the sense of Definition 1.

Theorem 5

(Existence for the SKT model without self-diffusion) Let Assumptions (A1)–(A5) be satisfied, let \(d\le 2\), and let \(a_{i0}>0\) for \(i=1,\ldots ,n\). We strengthen Assumption (A4) slightly by assuming that for all \(v\in L^2({\mathcal {O}};{{\mathbb {R}}}^n)\),

$$\begin{aligned} \Vert \sigma (v)\Vert _{{\mathcal {L}}_2(U;L^2({\mathcal {O}}))}\le C_\sigma (1+\Vert v\Vert _{L^2({\mathcal {O}})}^\gamma ), \end{aligned}$$

where \(\gamma <1\) if \(d=2\) and \(\gamma =1\) if \(d=1\). Then (1)–(3) has a global nonnegative martingale solution in the sense of Definition 1 with the exception that (10) and (11) are replaced by

$$\begin{aligned} {\widetilde{u}}_i\in L^0({{\widetilde{\Omega }}};C^0([0,T];D(L)'))\cap L^0({{\widetilde{\Omega }}};L^2(0,T;W^{1,1}({\mathcal {O}}))) \end{aligned}$$

and, for all \(\phi \in D(L)\cap W^{2,\infty }({\mathcal {O}})\),

$$\begin{aligned} \langle {\widetilde{u}}_i(t),\phi \rangle _{D(L)',D(L)}&= \langle {\widetilde{u}}_i(0),\phi \rangle _{D(L)',D(L)} \\&\quad - \int _0^t\int _{\mathcal {O}}{\widetilde{u}}_i(s) \bigg (a_{i0}+\sum _{j=1}^n a_{ij}{\widetilde{u}}_j(s)\bigg )\Delta \phi \textrm{d}x\textrm{d}s \\&\quad + \sum _{j=1}^n\int _{\mathcal {O}}\bigg (\int _0^t\sigma _{ij}({\widetilde{u}}(s)) \textrm{d}{\widetilde{W}}_j(s)\bigg )\phi \textrm{d}x. \end{aligned}$$

The weak formulation for the SKT system without self-diffusion is weaker than the one with self-diffusion, since we only have the gradient regularity \(\nabla \widetilde{u}_i\in L^1({\mathcal {O}})\), and \(A_{ij}(\widetilde{u})\) may be nonintegrable. However, system (1) can be written in Laplacian form according to (8), which allows for the “very weak” formulation stated in Theorem 5. The condition on \(\gamma \) for \(d=2\) is needed to prove the fractional time regularity of the approximate solutions.

Remark 6

(Nonnegativity of the solution) The a.s. nonnegativity of the population densities is a consequence of the entropy structure, since the approximate densities satisfy \(u_i^\varepsilon = u_i(R_\varepsilon (v^\varepsilon )) = \exp (R_\varepsilon (v^\varepsilon )_i/\pi _i)>0\) a.e. in \(Q_T\). This may be surprising, since we do not assume that the noise vanishes at zero, i.e. that \(\sigma _{ij}(u)=0\) if \(u_i=0\). This condition is replaced by the weaker integrability condition for \(\sigma _{ij}(u)\log u_i\) in Assumption (A5). A similar, but pointwise, condition was imposed in the deterministic case; see Hypothesis (H3) in [25, Section 4.4]. The examples in Sect. 7 satisfy \(\sigma _{ij}(u)=0\) if \(u_i=0\). \(\square \)

3 Operator setup

In this section, we introduce the operators needed to define the approximate scheme.

3.1 Definition of the connection operator L

We define an operator L that “connects” two Hilbert spaces V and H satisfying \(V\subset H\). This abstract operator allows us to define a regularization operator that “lifts” the dual space \(V'\) to V.

Proposition 7

(Operator L) Let V and H be separable Hilbert spaces such that the embedding \(V\hookrightarrow H\) is continuous and dense. Then there exists a bounded, self-adjoint, positive operator \(L:D(L)\rightarrow H\) with domain \(D(L)=V\). Moreover, it holds for L and its dual operator \(L^*:H\rightarrow V'\) (we identify H and its dual \(H'\)) that, for some \(0<c<1\),

$$\begin{aligned} c\Vert v\Vert _V \le \Vert L(v)\Vert _H = \Vert v\Vert _V, \quad \Vert L^*(w)\Vert _{V'}\le \Vert w\Vert _H, \quad v\in V,\ w\in H. \end{aligned}$$
(12)

We slightly abuse notation by denoting both the dual and the adjoint operator by \(A^*\). The proof is similar to that of [27, Theorem 1.12]. For the convenience of the reader, we present the full proof.

Proof

We first construct some auxiliary operator by means of the Riesz representation theorem. Let \(w\in H\). The mapping \(V\rightarrow {{\mathbb {R}}}\), \(v\mapsto (v,w)_H\), is linear and bounded. Hence, there exists a unique element \(\widetilde{w}\in V\) such that \((v,\widetilde{w})_{V}=(v,w)_H\) for all \(v\in V\). This defines the linear operator \(G:H\rightarrow V\), \(G(w):=\widetilde{w}\), such that

$$\begin{aligned} (v,w)_H = (v,G(w))_{V}\quad \text{ for } \text{ all } v\in V,\ w\in H. \end{aligned}$$

The operator G is bounded and symmetric, since \(\Vert G(w)\Vert _V=\Vert \widetilde{w}\Vert _V = \Vert w\Vert _H\) and

$$\begin{aligned} (G(w),v)_H = (G(w),G(v))_V = (w,G(v))_H\quad \text{ for } \text{ all } v,w\in H. \end{aligned}$$
(13)

This means that G is self-adjoint as an operator on H. Choosing \(v=w\in H\) in (13) gives \((G(v),v)_H=\Vert G(v)\Vert _V^2\ge 0\), i.e., G is positive. We claim that G is also one-to-one. Indeed, let \(G(w)=0\) for some \(w\in H\). Then \(0=(v,G(w))_V=(v,w)_H\) for all \(v\in V\) and, by the density of the embedding \(V\hookrightarrow H\), for all \(v\in H\). This implies that \(w=0\) and shows the claim.

The properties of G allow us to define \(\Lambda :=G^{-1}:D(\Lambda )\rightarrow H\), where \(D(\Lambda )={\text {ran}}(G)\subset V\) denotes the domain of \(\Lambda \). By definition, this operator satisfies

$$\begin{aligned} (v,\Lambda (w))_H = (v,w)_V\quad \text{ for } \text{ all } v\in V,\ w\in D(\Lambda ). \end{aligned}$$

Hence, for all \(v,w\in D(\Lambda )\), we have \((v,\Lambda (w))_H=(v,w)_V=(\Lambda (v),w)_H\), i.e., \(\Lambda \) is symmetric. Since \(G=G^*\), we have \(D(\Lambda ^*)={\text {ran}}(G^*) ={\text {ran}}(G)=D(\Lambda )\) and consequently, \(\Lambda \) is self-adjoint. Moreover, \(\Lambda \) is densely defined (since \(V\hookrightarrow H\) is dense). As a densely defined, self-adjoint operator, it is also closed. Finally, \(\Lambda \) is one-to-one and positive:

$$\begin{aligned} C\Vert \Lambda (v)\Vert _H\Vert v\Vert _V \ge \Vert \Lambda (v)\Vert _H\Vert v\Vert _H \ge (\Lambda (v),v)_H = (v,v)_V = \Vert v\Vert _V^2\ge 0 \end{aligned}$$

for all \(v\in D(\Lambda )\) and some \(C>0\) and consequently, \(\Vert \Lambda (v)\Vert _H\ge C^{-1}\Vert v\Vert _V\).

Therefore, we can define the square root of \(\Lambda \), \(\Lambda ^{1/2}:D(\Lambda ^{1/2})\rightarrow H\), which is densely defined and closed. Its domain can be obtained by closing \(D(\Lambda )\) with respect to

$$\begin{aligned} \Vert \Lambda ^{1/2}(v)\Vert _H = (\Lambda ^{1/2}(v),\Lambda ^{1/2}(v))_H^{1/2} = (\Lambda (v),v)_H^{1/2} = (v,v)_V^{1/2} = \Vert v\Vert _V \end{aligned}$$
(14)

for \(v\in D(\Lambda ^{1/2})\). In particular, the graph norm \(\Vert \cdot \Vert _H + \Vert \Lambda ^{1/2}(\cdot )\Vert _H\) is equivalent to the norm in V. We claim that \(D(\Lambda ^{1/2})=V\). To prove this, let \(w\in V\) be orthogonal to \(D(\Lambda ^{1/2})\). Then \((w,v)_V=0\) for all \(v\in D(\Lambda ^{1/2})\) and, since \(D(\Lambda )\subset D(\Lambda ^{1/2})\), in particular for all \(v\in D(\Lambda )\). It follows that \(0=(w,v)_V=(w,\Lambda (v))_H\) for \(v\in D(\Lambda )\). Since \(\Lambda \) is the inverse of \(G:H\rightarrow V\), we have \({\text {ran}}(\Lambda )=H\), and it holds that \((w,\xi )_H=0\) for all \(\xi \in H\), implying that \(w=0\). This shows the claim.

Finally, we define \(L:=\Lambda ^{1/2}:D(L)=V\rightarrow H\), which is a positive and self-adjoint operator. Estimate (14) shows that \(\Vert L(v)\Vert _H = \Vert v\Vert _V\) for \(v\in V\). We deduce from the equivalence between the norm in V and the graph norm of L that, for some \(C>0\) and all \(v\in V\),

$$\begin{aligned} \Vert v\Vert _V \le C(\Vert L(v)\Vert _H+\Vert v\Vert _H) = C(\Vert L(v)\Vert _H + \Vert L^{-1}L(v)\Vert _H) \le C(1+\Vert L^{-1}\Vert )\Vert L(v)\Vert _H, \end{aligned}$$

which proves the lower bound in (12). The dual operator \(L^*:H\rightarrow V'\) is bounded too, since it holds for all \(w\in H\) that

$$\begin{aligned} \Vert L^*(w)\Vert _{V'} = \sup _{\Vert v\Vert _V=1}|(w,L(v))_H| \le \sup _{\Vert v\Vert _V=1}\Vert w\Vert _H\Vert v\Vert _V = \Vert w\Vert _H. \end{aligned}$$

This ends the proof. \(\square \)
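The construction in Proposition 7 can be mimicked in finite dimensions, which makes the key identity \(\Vert L(v)\Vert _H=\Vert v\Vert _V\) of (14) directly checkable. In the sketch below, the SPD Gram matrices S and M are stand-ins (our own choice) for the \(H^m_N\) and \(L^2\) inner products:

```python
import numpy as np

# Finite-dimensional sketch of Proposition 7: on V = H = R^m with inner products
# (v,w)_V = v^T S w and (v,w)_H = v^T M w, the operator Lambda = M^{-1} S is
# H-self-adjoint and positive, and its H-self-adjoint square root L = Lambda^{1/2}
# satisfies ||L v||_H = ||v||_V, as in (14).
rng = np.random.default_rng(2)
m = 6
B1, B2 = rng.normal(size=(m, m)), rng.normal(size=(m, m))
S = B1 @ B1.T + m * np.eye(m)                 # Gram matrix of (.,.)_V (SPD)
M = B2 @ B2.T + m * np.eye(m)                 # Gram matrix of (.,.)_H (SPD)

lam, P = np.linalg.eigh(M)
M_half = P @ np.diag(np.sqrt(lam)) @ P.T      # M^{1/2}
M_hinv = P @ np.diag(1.0 / np.sqrt(lam)) @ P.T
mu, Q = np.linalg.eigh(M_hinv @ S @ M_hinv)   # symmetric form of Lambda
L = M_hinv @ Q @ np.diag(np.sqrt(mu)) @ Q.T @ M_half   # L = Lambda^{1/2}

v = rng.normal(size=m)
norm_V = np.sqrt(v @ S @ v)                   # ||v||_V
Lv = L @ v
norm_H = np.sqrt(Lv @ M @ Lv)                 # ||L v||_H
```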

We apply Proposition 7 to \(V=H^m_N({\mathcal {O}})\) and \(H=L^2({\mathcal {O}})\), recalling that \(H^m_N({\mathcal {O}})=\{v\in H^m({\mathcal {O}}):\nabla v\cdot \nu =0\) on \(\partial {\mathcal {O}}\}\) and \(m>d/2+1\). Then, by Sobolev’s embedding, \(D(L)\hookrightarrow W^{1,\infty }({\mathcal {O}})\). Observe the following two properties that are used later:

$$\begin{aligned}&\Vert L^*L(v)\Vert _{V'}\le \Vert v\Vert _V, \quad \Vert L^*(w)\Vert _{V'}\le \Vert w\Vert _H \quad \text{ for } \text{ all } v\in V,\ w\in H. \end{aligned}$$
(15)

The following lemma is used in the proof of Proposition 16 to apply Itô’s lemma.

Lemma 8

(Operator \(L^{-1}\)) Let \(L^{-1}:{\text {ran}}(L)\rightarrow D(L)\) be the inverse of L and let \(D(L^{-1}):=\overline{D(\Lambda )}\) be the closure of \(D(\Lambda )\) with respect to \(\Vert L^{-1}(\cdot )\Vert _H\). Then \(D(L)'\) is isometric to \(D(L^{-1})\). In particular, it holds that \((L^{-1}(v),L^{-1}(w))_{H} = (v,w)_{D(L)'}\) for all v, \(w\in D(L)'\).

Proof

The proof is essentially contained in [27, p. 136ff] and we only sketch it. Let \(F\in D(L^{-1})'\). Then \(|F(v)|\le C\Vert L^{-1}(v)\Vert _H\) for all \(v\in D(\Lambda )\) and, as a consequence, \(|F(Lu)|\le C\Vert u\Vert _H\) for \(u=L^{-1}(v)\in D(L)\). The density of \(L^{-1}(D(\Lambda ))\) in H guarantees the unique representation \(F(Lu)=(u,w)_H\) for some \(w\in H\), and we can represent F in the form \(F(v)=(L^{-1}v,w)_H=(v,L^{-1}w)_H\), where \(L^{-1}w\in D(L)\). This shows that every element of \(D(L^{-1})'\) can be identified with an element of D(L).

Conversely, if \(w\in D(L)\), we consider functionals of the type \(v\mapsto (v,w)_H\) for \(v\in D(\Lambda )\), which are bounded in \(\Vert L^{-1}(\cdot )\Vert _H\). These functionals can be extended by continuity to functionals F belonging to \(D(L^{-1})'\). The proof in [27, p. 137] shows that \(\Vert F\Vert _{D(L^{-1})'}=\Vert w\Vert _{D(L)}\). We conclude that \(D(L^{-1})'\) is isometric to D(L). Since Hilbert spaces are reflexive, \(D(L^{-1})\) is isometric to \(D(L)'\). \(\square \)
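In finite dimensions, the isometry of Lemma 8 reduces to a linear-algebra identity that can be checked directly. The following sketch is purely illustrative and not part of the analysis: it assumes L is modelled by a symmetric positive definite matrix, so that the dual inner product \((v,w)_{D(L)'}=(L^{-1}v,L^{-1}w)_H\) coincides with the quadratic form of \((LL^T)^{-1}\):

```python
import numpy as np

# Illustrative finite-dimensional check of Lemma 8 (assumption: the operator L
# is modelled by a symmetric positive definite matrix on R^N).
rng = np.random.default_rng(0)
N = 6
A = rng.standard_normal((N, N))
L = A @ A.T + N * np.eye(N)      # symmetric positive definite model of L
Linv = np.linalg.inv(L)

v, w = rng.standard_normal(N), rng.standard_normal(N)

# the D(L)'-inner product realized through L^{-1}, as in Lemma 8
dual_ip = (Linv @ v) @ (Linv @ w)

# the same pairing written as the quadratic form of (L L^T)^{-1}
gram = np.linalg.inv(L @ L.T)
assert np.isclose(dual_ip, v @ gram @ w)
print("isometry identity verified")
```

The identity \(v^T L^{-T}L^{-1}w = v^T(LL^T)^{-1}w\) holds for any invertible matrix; symmetry of L is assumed only to match the self-adjoint setting of the text.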

Lemma 9

(Operator u) The mapping \(u:=(h')^{-1}\) from D(L) to \(L^\infty ({\mathcal {O}})\) is Fréchet differentiable and, as a mapping from D(L) to \(D(L)'\), monotone.

Proof

Let \(w\in D(L)\hookrightarrow L^\infty ({\mathcal {O}})\) (here we use \(m>d/2\)). Then \(u(w)=(x\mapsto u(w(x))) \in L^\infty ({\mathcal {O}})\), showing that \(u:D(L)\rightarrow L^\infty ({\mathcal {O}})=(L^1({\mathcal {O}}))'\hookrightarrow D(L)'\) is well defined. It follows from the mean-value theorem that for all w, \(\xi \in D(L)\),

$$\begin{aligned} \Vert u(w+\xi )-u(w)-u'(w)\xi \Vert _{L^\infty ({\mathcal {O}})} \le C\Vert \xi \Vert _{D(L)}^2\bigg \Vert \int _0^1 (1-s)u''(w+s\xi )\textrm{d}s\bigg \Vert _{L^\infty ({\mathcal {O}})}. \end{aligned}$$

Since \(u''\) maps bounded sets to bounded sets, the integral is bounded. Thus, \(u:D(L)\rightarrow L^\infty ({\mathcal {O}})\) is Fréchet differentiable. For the monotonicity, we use the convexity of h and hence the monotonicity of \(h'\):

$$\begin{aligned} \langle u(v)-u(w),v-w\rangle _{D(L)',D(L)}&= (u(v)-u(w),v-w)_{L^2({\mathcal {O}})} \\&= (u(v)-u(w),h'(u(v))-h'(u(w)))_{L^2({\mathcal {O}})} \ge 0 \end{aligned}$$

for all v, \(w\in D(L)\). This proves the lemma. \(\square \)

3.2 Definition of the regularization operator \(R_\varepsilon \)

First, we define another operator, denoted by \(Q_{\varepsilon }\), that maps D(L) to \(D(L)'\). Its inverse is the desired regularization operator.

Lemma 10

(Operator \(Q_\varepsilon \)) Let \(\varepsilon >0\) and define \(Q_\varepsilon :D(L)\rightarrow D(L)'\) by \(Q_\varepsilon (w)=u(w)+\varepsilon L^*Lw\), where \(w\in D(L)\). Then \(Q_\varepsilon \) is Fréchet differentiable, strongly monotone, coercive, and invertible. Its Fréchet derivative \(\textrm{D}Q_\varepsilon [w](\xi )=u'(w)\xi + \varepsilon L^*L\xi \) for w, \(\xi \in D(L)\) is continuous, strongly monotone, coercive, and invertible.

Proof

The mapping \(Q_\varepsilon \) is well defined since \(w\in D(L)\hookrightarrow L^\infty ({\mathcal {O}})\) implies that \(u(w)\in L^\infty ({\mathcal {O}})\) and hence, \(\Vert u(w)\Vert _{D(L)'}\le C\Vert u(w)\Vert _{L^1({\mathcal {O}})}\) is finite. We show that \(Q_\varepsilon \) is strongly monotone. For this, let v, \(w\in D(L)\) and compute

$$\begin{aligned} \langle Q_\varepsilon&(v)-Q_\varepsilon (w),v-w\rangle _{D(L)',D(L)} \nonumber \\&= (u(v)-u(w),v-w)_H + \varepsilon \langle L^*L(v-w),v-w\rangle _{D(L)',D(L)} \nonumber \\&\ge \varepsilon \langle L^*L(v-w),v-w\rangle _{D(L)',D(L)} = \varepsilon \Vert L(v-w)\Vert _H^2 \ge \varepsilon c\Vert v-w\Vert _{D(L)}^2 \end{aligned}$$
(16)

where we used the monotonicity of \(w\mapsto u(w)\) and the lower bound in (12). The coercivity of \(Q_\varepsilon \) is a consequence of the strong monotonicity:

$$\begin{aligned} \langle Q_\varepsilon (v),v\rangle _{D(L)',D(L)}&= \langle Q_\varepsilon (v)-Q_\varepsilon (0),v-0\rangle _{D(L)',D(L)} + \langle Q_\varepsilon (0),v\rangle _{D(L)',D(L)} \\&\ge \varepsilon c\Vert v\Vert _{D(L)}^2 + (u(0),v)_H \ge \varepsilon c\Vert v\Vert _{D(L)}^2 - C|u(0)|\,\Vert v\Vert _{D(L)} \end{aligned}$$

for \(v\in D(L)\). Based on these properties, the invertibility of \(Q_\varepsilon \) now follows from Browder’s theorem [20, Theorem 6.1.21].

Next, we show the properties for \(\textrm{D}Q_\varepsilon \). The operator \(\textrm{D}Q_\varepsilon [w]:D(L)\rightarrow D(L)'\) is well defined for all \(w\in D(L)\), since

$$\begin{aligned} \Vert u'(w)\xi \Vert _{D(L)'}&\le C\Vert u'(w)\xi \Vert _{L^2({\mathcal {O}})}\le C\Vert u'(w)\Vert _{L^2({\mathcal {O}})} \Vert \xi \Vert _{L^\infty ({\mathcal {O}})}\\&\le C\Vert u'(w)\Vert _{L^2({\mathcal {O}})}\Vert \xi \Vert _{D(L)} \end{aligned}$$

for all \(\xi \in D(L)\hookrightarrow L^\infty ({\mathcal {O}})\). The strong monotonicity of \(\textrm{D}Q_\varepsilon [w]\) for \(w\in D(L)\) follows from the positive semidefiniteness of \(u'(w)=h''(u(w))^{-1}\) and the lower bound in (12):

$$\begin{aligned} \langle \textrm{D}&Q_\varepsilon [w](\xi )-\textrm{D}Q_\varepsilon [w](\eta ),\xi -\eta \rangle _{D(L)',D(L)} \\&= (u'(w)(\xi -\eta ),\xi -\eta )_H + \varepsilon \langle L^*L(\xi -\eta ),\xi -\eta \rangle _{D(L)',D(L)} \\&\ge \varepsilon \Vert L(\xi -\eta )\Vert _H^2 \ge \varepsilon c\Vert \xi -\eta \Vert _{D(L)}^2 \end{aligned}$$

for \(\xi \), \(\eta \in D(L)\). The choice \(\eta =0\) yields immediately the coercivity of \(\textrm{D}Q_\varepsilon [w]\). The invertibility of \(\textrm{D}Q_\varepsilon [w]\) follows again from Browder’s theorem. \(\square \)

Lemma 10 shows that the inverse of \(Q_\varepsilon \) exists. We set \(R_\varepsilon :=Q_\varepsilon ^{-1}:D(L)'\rightarrow D(L)\), which is the desired regularization operator. It has the following properties.

Lemma 11

(Operator \(R_\varepsilon \)) The operator \(R_\varepsilon :D(L)'\rightarrow D(L)\) is Fréchet differentiable, strictly monotone, and Lipschitz continuous with Lipschitz constant \(C/\varepsilon \), where \(C>0\) does not depend on \(\varepsilon \). The Fréchet derivative satisfies

$$\begin{aligned} \textrm{D}R_\varepsilon [v] = (\textrm{D}Q_\varepsilon [R_\varepsilon (v)])^{-1} = (u'(R_\varepsilon (v))+\varepsilon L^*L)^{-1} \quad \text{ for } v\in D(L)', \end{aligned}$$

and is bounded with \(\Vert \textrm{D}R_\varepsilon [v](\xi )\Vert _{D(L)}\le \varepsilon ^{-1}C\Vert \xi \Vert _{D(L)'}\) for v, \(\xi \in D(L)'\).

Proof

We show first the Lipschitz continuity of \(R_\varepsilon \). Let \(v_1\), \(v_2\in D(L)'\). Then there exist \(w_1\), \(w_2\in D(L)\) such that \(v_1=Q_\varepsilon (w_1)\), \(v_2=Q_\varepsilon (w_2)\). Hence, using (12) and (16),

$$\begin{aligned} \Vert R_\varepsilon (v_1)-R_\varepsilon (v_2)\Vert _{D(L)}^2&= \Vert w_1-w_2\Vert _{D(L)}^2 \le C\Vert L(w_1-w_2)\Vert _H^2 \\&\le \varepsilon ^{-1}C\langle Q_\varepsilon (w_1)-Q_\varepsilon (w_2),w_1-w_2\rangle _{D(L)',D(L)} \\&\le \varepsilon ^{-1}C\Vert Q_\varepsilon (w_1)-Q_\varepsilon (w_2)\Vert _{D(L)'}\Vert w_1-w_2\Vert _{D(L)} \\&= \varepsilon ^{-1}C\Vert v_1-v_2\Vert _{D(L)'}\Vert R_\varepsilon (v_1)-R_\varepsilon (v_2)\Vert _{D(L)}, \end{aligned}$$

proving that \(R_\varepsilon \) is Lipschitz continuous with Lipschitz constant \(C/\varepsilon \). The Fréchet differentiability is a consequence of the inverse function theorem and \(\textrm{D}R_\varepsilon [v] = (\textrm{D}Q_\varepsilon [R_\varepsilon (v)])^{-1}\) for \(v\in D(L)'\).

We verify the strict monotonicity of \(R_\varepsilon \). Let v, \(w\in D(L)'\) with \(v\ne w\). Because of the strong monotonicity of \(Q_\varepsilon \), we have

$$\begin{aligned} \langle v-w,R_\varepsilon (v)-R_\varepsilon (w)\rangle _{D(L)',D(L)}&= \big \langle Q_\varepsilon (R_\varepsilon (v))-Q_\varepsilon (R_\varepsilon (w)), R_\varepsilon (v)-R_\varepsilon (w)\big \rangle _{D(L)',D(L)} \\&\ge \varepsilon c\Vert R_\varepsilon (v)-R_\varepsilon (w)\Vert _{D(L)}^2 > 0, \end{aligned}$$

where the right-hand side is indeed positive, since \(R_\varepsilon \) is one-to-one and hence \(R_\varepsilon (v)\ne R_\varepsilon (w)\) for \(v\ne w\).

Next, we show that \(\textrm{D}R_\varepsilon [v]\) is Lipschitz continuous. Let \(w_1\), \(w_2\in D(L)\). By Lemma 10, \(\textrm{D}Q_\varepsilon [w]\) is strongly monotone. Thus, for any \(w\in D(L)\),

$$\begin{aligned} \varepsilon c\Vert w_1-w_2\Vert _{D(L)}^2&\le \langle \textrm{D}Q_\varepsilon [w](w_1)-\textrm{D}Q_\varepsilon [w](w_2), w_1-w_2\rangle _{D(L)',D(L)} \\&\le \Vert \textrm{D}Q_\varepsilon [w](w_1)-\textrm{D}Q_\varepsilon [w](w_2)\Vert _{D(L)'}\Vert w_1-w_2\Vert _{D(L)}. \end{aligned}$$

Let \(v_1=\textrm{D}Q_\varepsilon [w](w_1)\) and \(v_2=\textrm{D}Q_\varepsilon [w](w_2)\). We infer that

$$\begin{aligned} \Vert (\textrm{D}&Q_\varepsilon [w])^{-1}(v_1)-(\textrm{D}Q_\varepsilon [w])^{-1}(v_2)\Vert _{D(L)} = \Vert w_1-w_2\Vert _{D(L)} \\&\le \varepsilon ^{-1}C\Vert \textrm{D}Q_\varepsilon [w](w_1)-\textrm{D}Q_\varepsilon [w](w_2)\Vert _{D(L)'} = \varepsilon ^{-1}C\Vert v_1-v_2\Vert _{D(L)'}, \end{aligned}$$

showing the Lipschitz continuity of \((\textrm{D}Q_\varepsilon [w])^{-1}\) and hence of \(\textrm{D}R_\varepsilon [v] = (\textrm{D}Q_\varepsilon [R_\varepsilon (v)])^{-1}\). Finally, choosing \(w=R_\varepsilon (v)\) and \(v_2=0\) gives \(\Vert \textrm{D}R_\varepsilon [v](v_1)\Vert _{D(L)}\le \varepsilon ^{-1}C\Vert v_1\Vert _{D(L)'}\). \(\square \)
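The construction of \(R_\varepsilon =Q_\varepsilon ^{-1}\) can be mimicked numerically. The following minimal sketch is not part of the analysis: it assumes a single species with Boltzmann-type entropy \(h(u)=u(\log u-1)\), so that \(u(w)=e^w\), and models L by a symmetric positive definite matrix. \(Q_\varepsilon \) is then inverted by Newton's method, whose Jacobian \(\operatorname {diag}(u'(w))+\varepsilon L^TL\) is the finite-dimensional analogue of \(\textrm{D}Q_\varepsilon [w]\):

```python
import numpy as np

N, eps = 8, 0.1
# SPD model of the operator L (hypothetical stand-in): identity plus a 1D Laplacian
lap = 2.0 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)
L = np.eye(N) + lap

def Q(w):
    # Q_eps(w) = u(w) + eps * L^T L w, with u(w) = exp(w) (Boltzmann entropy, n = 1)
    return np.exp(w) + eps * L.T @ L @ w

def R(v, tol=1e-12, max_iter=50):
    # Invert Q_eps by Newton's method; the Jacobian
    # DQ_eps[w] = diag(u'(w)) + eps * L^T L is symmetric positive definite
    w = np.zeros_like(v)
    for _ in range(max_iter):
        res = Q(w) - v
        if np.linalg.norm(res) < tol:
            break
        J = np.diag(np.exp(w)) + eps * L.T @ L
        w -= np.linalg.solve(J, res)
    return w

v = np.linspace(0.5, 1.5, N)
w = R(v)
print(np.linalg.norm(Q(w) - v) < 1e-8)  # True: R numerically inverts Q_eps
```

Note that \(u(R_\varepsilon (v))=e^w\) stays componentwise positive regardless of the sign of v, which mirrors how the entropy formulation enforces positivity of the densities.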

4 Existence of approximate solutions

In the previous section, we have introduced the regularization operator \(R_\varepsilon :D(L)'\rightarrow D(L)\). The entropy variable w is replaced by the regularized variable \(R_\varepsilon (v)\) for \(v\in D(L)'\). Since \(v = Q_\varepsilon (R_\varepsilon (v)) = u(R_\varepsilon (v)) + \varepsilon L^*LR_\varepsilon (v)\), we consider the regularized problem

$$\begin{aligned}&\textrm{d}v = {\text {div}}\big (B(R_\varepsilon (v))\nabla R_\varepsilon (v)\big )\textrm{d}t + \sigma \big (u(R_\varepsilon (v))\big )\textrm{d}W(t) \quad \text{ in } {\mathcal {O}},\ t\in [0,T\wedge \tau ), \end{aligned}$$
(17)
$$\begin{aligned}&v(0) = u^0\quad \text{ in } {\mathcal {O}}, \quad \nabla R_\varepsilon (v)\cdot \nu = 0 \quad \text{ on } \partial {\mathcal {O}},\ t>0, \end{aligned}$$
(18)

recalling that \(B(w)=A(u(w))h''(u(w))^{-1}\) for \(w\in {{\mathbb {R}}}^n\).

We clarify the notion of solution to problem (17)–(18). Let \(T>0\), let \(\tau \) be an \({{\mathbb {F}}}\)-adapted stopping time, and let v be a continuous, \(D(L)'\)-valued, \({{\mathbb {F}}}\)-adapted process. We call \({(v,\tau )}\) a local strong solution to (17) if

$$\begin{aligned} v(\omega ,\cdot ,\cdot )\in L^2([0,T\wedge \tau (\omega ));D(L)')\cap C^0([0,T\wedge \tau (\omega ));D(L)') \end{aligned}$$

for a.e. \(\omega \in \Omega \) and for all \(t\in [0,T\wedge \tau )\),

$$\begin{aligned}&v(t) = v(0) + \int _0^t{\text {div}}\big (B(R_\varepsilon (v(s)))\nabla R_\varepsilon (v(s))\big )\textrm{d}s + \int _0^t\sigma \big (u(R_\varepsilon (v(s)))\big )\textrm{d}W(s), \end{aligned}$$
(19)
$$\begin{aligned}&\nabla R_\varepsilon (v)\cdot \nu = 0\quad \text{ on } \partial {\mathcal {O}}\quad {{\mathbb {P}}}\text{-a.s. } \end{aligned}$$
(20)

It can be verified that \(R_\varepsilon \) is strongly measurable and, if v is progressively measurable, also progressively measurable. Furthermore, if w is progressively measurable then so is u(w), and if \(v\in C^0([0,T];D(L)')\), we have \(R_\varepsilon (v)\in C^0([0,T];D(L))\) and \(u(R_\varepsilon (v))\in L^\infty (Q_T)\). Finally, if \(v\in L^0(\Omega ;L^p(0,T;D(L)'))\) for \(1\le p\le \infty \), then \({\text {div}}(B(R_\varepsilon (v))\nabla R_\varepsilon (v))\in L^0(\Omega ;L^p(0,T;D(L)'))\). Therefore, the integrals in (19) are well defined. The local strong solution is called a global strong solution if \({{\mathbb {P}}}(\tau =\infty )=1\). Given \(t>0\) and a process \(v\in L^2(\Omega ;C^0([0,t];D(L)'))\), we introduce the stopping time

$$\begin{aligned} \tau _R:= \inf \{s\in [0,t]:\Vert v(s)\Vert _{D(L)'}>R\}\quad \text{ for } R>0. \end{aligned}$$

The stopping time \(\tau _{R}\) is \({{\mathbb {P}}}\)-a.s. positive. Indeed, by Chebyshev’s inequality, it holds for \(\delta >0\) that

$$\begin{aligned} {{\mathbb {P}}}(\tau _R>\delta ) \ge {{\mathbb {P}}}\Big (\sup _{0<t<\delta }\Vert v(t\wedge \tau _R)\Vert _{D(L)'} \le R\Big ) \ge 1 - \frac{1}{R^2}{{\mathbb {E}}}\sup _{0<t<\delta } \Vert v(t\wedge \tau _R)\Vert _{D(L)'}^2. \end{aligned}$$

Then, inserting (19) and using the properties of the operators introduced in Sect. 3, we can show that \({{\mathbb {P}}}(\tau _R>\delta )\ge 1-C(\delta )\), where \(C(\delta )\rightarrow 0\) as \(\delta \rightarrow 0\), which proves the claim.

We impose the following general assumptions.

  1. (H1)

    Entropy density: Let \({\mathcal {D}}\subset {{\mathbb {R}}}^n\) be a domain and let \(h\in C^2({\mathcal {D}};[0,\infty ))\) be such that \(h':{\mathcal {D}}\rightarrow {{\mathbb {R}}}^n\) and \(h''(u)\in {{\mathbb {R}}}^{n\times n}\) for \(u\in {\mathcal {D}}\) are invertible and there exists \(C>0\) such that \(|u|\le C(1+h(u))\) for all \(u\in {\mathcal {D}}\).

  2. (H2)

    Initial datum: \(u^0=(u_1^0,\ldots ,u_n^0)\in L^\infty (\Omega ;L^2({\mathcal {O}};{{\mathbb {R}}}^n))\) is \({\mathcal {F}}_0\)-measurable satisfying \(u^0(x)\in {\mathcal {D}}\) for a.e. \(x\in {\mathcal {O}}\) \({{\mathbb {P}}}\)-a.s.

  3. (H3)

    Diffusion matrix: \(A=(A_{ij})\in C^1({\overline{{\mathcal {D}}}};{{\mathbb {R}}}^{n\times n})\) grows at most linearly and the matrix \(h''(u)A(u)\) is positive semidefinite for all \(u\in {\mathcal {D}}\).

Remark 12

(Discussion of the assumptions) Hypothesis (H1) and the positive semidefiniteness condition of \(h''(u)A(u)\) in (H3) are necessary for the entropy structure of the general cross-diffusion system. The entropy density (5) with \({\mathcal {D}}=(0,\infty )^n\) satisfies Hypothesis (H1), and the diffusion matrix (3) fulfills (H3). The differentiability of A is needed to apply [32, Prop. 4.1.4] (stating that the assumptions of the abstract existence Theorem 4.2.2 are satisfied) and can be weakened to continuity, weak monotonicity, and coercivity conditions. The growth condition for A is technical; it guarantees that the integral formulation associated to (1) is well defined. Hypothesis (H2) guarantees that \(h(u^0)\) is well defined. \(\square \)
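For orientation, we sketch the prototypical entropy density, assuming that (5) (not restated in this section) takes the Boltzmann form commonly used for the SKT model, with weights \(\pi _i>0\):

```latex
% Boltzmann-type entropy (assumed prototype for (5)), with weights \pi_i > 0:
h(u) = \sum_{i=1}^n \pi_i\bigl(u_i(\log u_i - 1) + 1\bigr),
\qquad u \in \mathcal{D} = (0,\infty)^n,
\\[2mm]
\frac{\partial h}{\partial u_i}(u) = \pi_i \log u_i,
\qquad
h''(u) = \operatorname{diag}\Bigl(\frac{\pi_i}{u_i}\Bigr),
\qquad
u_i(w) = e^{w_i/\pi_i}.
```

Here \(h'\) and \(h''(u)\) are invertible on \({\mathcal {D}}\), and since \(u_i(\log u_i-1)+1\ge u_i\) for \(u_i\ge e^2\) while \(u_i\) is bounded otherwise, the linear bound \(|u|\le C(1+h(u))\) required in (H1) holds.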

We consider general approximate stochastic cross-diffusion systems, since the existence result for (17) may be useful also for other stochastic cross-diffusion systems.

Theorem 13

(Existence of approximate solutions) Let Assumptions (A1)–(A2), (A4)–(A5), (H1)–(H3) be satisfied and let \(\varepsilon >0\), \(R>0\). Then problem (17)–(18) has a unique local solution \({(v^\varepsilon ,\tau _R)}\).

Proof

We want to apply Theorem 4.2.4 and Proposition 4.1.4 of [32]. To this end, we need to verify that the operator \(M:D(L)'\rightarrow D(L)'\), \(M(v):= {\text {div}}(B(R_\varepsilon (v))\nabla R_\varepsilon (v))\), is Fréchet differentiable and has at most linear growth, \(\textrm{D}M[v]-cI\) is negative semidefinite for all \(v\in D(L)'\) and some \(c>0\), and \(\sigma \) is Lipschitz continuous.

By the regularity of the matrix A and the entropy density h, the operator \(D(L)\rightarrow D(L)'\), \(w\mapsto {\text {div}}(B(w)\nabla w)\), is Fréchet differentiable. Then the Fréchet differentiability of \(R_\varepsilon \) (see Lemma 11) and the chain rule imply that the operator M is also Fréchet differentiable with derivative

$$\begin{aligned} \textrm{D}M[v](\xi ) = {\text {div}}\big (\textrm{D}B[R_\varepsilon (v)](\textrm{D}R_\varepsilon [v](\xi )) \nabla R_\varepsilon (v)\big ) + {\text {div}}\big (B(R_\varepsilon (v))\nabla \textrm{D}R_\varepsilon [v](\xi )\big ), \end{aligned}$$

where v, \(\xi \in D(L)'\). We claim that this derivative is locally bounded, i.e. if \(\Vert v\Vert _{D(L)'}\le K\) then \(\Vert \textrm{D}M[v](\xi )\Vert _{D(L)'}\le C(K)\Vert \xi \Vert _{D(L)'}\). For this, we deduce from the Lipschitz continuity of \(R_\varepsilon \) (Lemma 11) and the property \(u(R_\varepsilon (v))\in L^\infty ({\mathcal {O}})\) for \(v\in D(L)'\) that

$$\begin{aligned} \Vert B(R_\varepsilon (v))\Vert _{L^\infty ({\mathcal {O}})} + \Vert \textrm{D}B[R_\varepsilon (v)]\Vert _{L^\infty ({\mathcal {O}})}&\le C(1+\Vert R_\varepsilon (v)\Vert _{D(L)})\\&\le C(\varepsilon )(1+\Vert v\Vert _{D(L)'}), \end{aligned}$$

where \(\textrm{D}B[R_\varepsilon (v)]\) is interpreted as a matrix. Recalling from Lemma 11 that

$$\begin{aligned} \Vert \textrm{D}R_\varepsilon [v](\xi )\Vert _{D(L)}\le C(\varepsilon )\Vert \xi \Vert _{D(L)'} \quad \text{ for } \text{ all } \xi \in D(L)', \end{aligned}$$

we obtain for \(\Vert v\Vert _{D(L)'}\le K\) and \(\xi \in D(L)'\):

$$\begin{aligned} \Vert \textrm{D}M[v](\xi )\Vert _{D(L)'}&\le C\big \Vert \textrm{D}B[R_\varepsilon (v)](\textrm{D}R_\varepsilon [v](\xi ))\nabla R_\varepsilon (v)\\&\quad + B(R_\varepsilon (v))\nabla \textrm{D}R_\varepsilon [v](\xi )\big \Vert _{L^1({\mathcal {O}})} \\&\le C\Vert \textrm{D}B[R_\varepsilon (v)](\textrm{D}R_\varepsilon [v](\xi ))\Vert _{L^\infty ({\mathcal {O}})} \Vert \nabla R_\varepsilon (v)\Vert _{L^1({\mathcal {O}})} \\&\quad \, + C\Vert B(R_\varepsilon (v))\Vert _{L^\infty ({\mathcal {O}})}\Vert \nabla \textrm{D}R_\varepsilon [v](\xi )\Vert _{L^1({\mathcal {O}})} \\&\le C\Vert \textrm{D}B[R_\varepsilon (v)]\Vert _{L^\infty ({\mathcal {O}})}\Vert \textrm{D}R_\varepsilon [v](\xi )\Vert _{D(L)} \Vert R_\varepsilon (v)\Vert _{D(L)} \\&\quad \,+ C\Vert B(R_\varepsilon (v))\Vert _{L^\infty ({\mathcal {O}})} \Vert \textrm{D}R_\varepsilon [v](\xi )\Vert _{D(L)} \\&\le C(\varepsilon )(1+\Vert v\Vert _{D(L)'})\Vert \xi \Vert _{D(L)'}\le C(\varepsilon ,K)\Vert \xi \Vert _{D(L)'}. \end{aligned}$$

This proves the claim. Thus, if \(\Vert v\Vert _{D(L)'}\le K\), there exists \(c>0\) such that

$$\begin{aligned} (\xi ,\textrm{D}M[v](\xi )- c\xi )_{D(L)'}\le 0\quad \text{ for } \xi \in D(L)'. \end{aligned}$$

Moreover, by Lemma 11 again,

$$\begin{aligned} \Vert M(v)\Vert _{D(L)'}&\le C\Vert B(R_\varepsilon (v))\nabla R_\varepsilon (v)\Vert _{L^1({\mathcal {O}})} \le C\Vert \nabla R_\varepsilon (v)\Vert _{L^1({\mathcal {O}})} \\&\le C\Vert R_\varepsilon (v)\Vert _{D(L)} \le \varepsilon ^{-1}C(1+\Vert v\Vert _{D(L)'}). \end{aligned}$$

It follows from Assumption (A4) and Lemma 9 that for v, \({\bar{v}}\in D(L)'\) with \(\Vert v\Vert _{D(L)'}\le K\) and \(\Vert {\bar{v}}\Vert _{D(L)'}\le K\),

$$\begin{aligned} \Vert \sigma (u(R_\varepsilon (v)))-\sigma (u(R_\varepsilon ({\bar{v}})))\Vert _{{\mathcal {L}}_2(U;D(L)')}&\le C\Vert \sigma (u(R_\varepsilon (v)))-\sigma (u(R_\varepsilon ({\bar{v}}))) \Vert _{{\mathcal {L}}_2(U;L^2({\mathcal {O}}))} \\&\le C(K)\Vert u(R_\varepsilon (v))-u(R_\varepsilon ({\bar{v}}))\Vert _{L^2({\mathcal {O}})} \\&\le C(K)\Vert R_\varepsilon (v)-R_\varepsilon ({\bar{v}})\Vert _{D(L)} \le C(\varepsilon ,K)\Vert v-{\bar{v}}\Vert _{D(L)'}, \end{aligned}$$

where C(K) also depends on the \(L^\infty ({\mathcal {O}})\) norms of \(u'(R_\varepsilon (v))\) and \(u'(R_\varepsilon ({\bar{v}}))\).

These estimates show that the assumptions of [32, Theorem 4.2.4] are satisfied in the ball \(\{v\in D(L)':\Vert v\Vert _{D(L)'}\le K\}\). An inspection of the proof of that theorem, which is based on the Galerkin method and Itô’s lemma, shows that local bounds are sufficient to conclude the existence of a local solution v up to the stopping time \(\tau _R\). The boundary conditions follow from \(R_\varepsilon (v)\in D(L)=H^m_N({\mathcal {O}})\) and the definition of the space \(H^m_N({\mathcal {O}})\). \(\square \)

For the entropy estimate we need two technical lemmas whose proofs are deferred to Appendix A.

Lemma 14

Let \(w\in D(L)\), \(a=(a_{ij})\in L^1({\mathcal {O}};{{\mathbb {R}}}^{n\times n})\), and \(b=(b_{ij})\in D(L)^{n\times n}\) satisfying \(\textrm{D}R_\varepsilon [w](a) = b\). Then

$$\begin{aligned} \int _{\mathcal {O}}a:b\textrm{d}x \le \int _{\mathcal {O}}{\text {tr}}[a^T u'(w)^{-1}a]\textrm{d}x. \end{aligned}$$

Lemma 15

Let \(v^0\in L^p(\Omega ;L^1({\mathcal {O}}))\) for some \(p\ge 1\) satisfy \({{\mathbb {E}}}\int _{\mathcal {O}}h(v^0)\textrm{d}x\le C\). Then

$$\begin{aligned} \int _{\mathcal {O}}h(u(R_\varepsilon (v^0)))\textrm{d}x + \frac{\varepsilon }{2}\Vert LR_\varepsilon (v^0)\Vert _{L^2({\mathcal {O}})}^2 \le \int _{\mathcal {O}}h(v^0)\textrm{d}x. \end{aligned}$$

We turn to the entropy estimate.

Proposition 16

(Entropy inequality) Let \({(v^\varepsilon ,\tau _R)}\) be a local solution to (17)–(18) and set \(v^R(t)=v^\varepsilon (\omega ,t\wedge \tau _R(\omega ))\) for \(\omega \in \Omega \), \(t\in (0,\tau _R(\omega ))\). Then there exists a constant \(C(u^0,T)>0\), depending on \(u^0\) and T but not on \(\varepsilon \) and R, such that

$$\begin{aligned} {{\mathbb {E}}}&\sup _{0<t<T\wedge \tau _R}\int _{\mathcal {O}}h(u^\varepsilon (t))\textrm{d}x + \frac{\varepsilon }{2}{{\mathbb {E}}}\sup _{0<t<T\wedge \tau _R}\Vert Lw^\varepsilon (t)\Vert _{L^2({\mathcal {O}})}^2 \\&{}+ {{\mathbb {E}}}\sup _{0<t<T\wedge \tau _R}\int _0^t\int _{\mathcal {O}}\nabla w^\varepsilon (s):B(w^\varepsilon (s))\nabla w^\varepsilon (s)\textrm{d}x\textrm{d}s \le C(u^0,T), \end{aligned}$$

where \(u^\varepsilon :=u(R_\varepsilon (v^R))\) and \(w^\varepsilon :=R_\varepsilon (v^R)\).

Proof

The result follows from Itô’s lemma using a regularized entropy. More precisely, we want to apply the Itô lemma in the version of [29, Theorem 3.1]. To this end, we verify the assumptions of that theorem. Basically, we need a twice differentiable function \({{\mathcal {H}}}\) on a Hilbert space H, whose derivatives satisfy some local growth conditions on H and V, where V is another Hilbert space such that the embedding \(V\hookrightarrow H\) is dense and continuous. We choose \(V=H=D(L)'\) and the regularized entropy

$$\begin{aligned} {{\mathcal {H}}}(v):= \int _{\mathcal {O}}h(u(R_\varepsilon (v)))\textrm{d}x + \frac{\varepsilon }{2}\Vert LR_\varepsilon (v)\Vert _{L^2({\mathcal {O}})}^2, \quad v\in D(L)'. \end{aligned}$$
(21)

Recall that \(R_\varepsilon (v) = h'(u(R_\varepsilon (v)))\) for \(v\in D(L)'\), since \(u=u(w)\) is the inverse of \(h'\). Then, in view of the regularity assumptions for h and Lemma 11, \({{\mathcal {H}}}\) is Fréchet differentiable with derivative

$$\begin{aligned} \textrm{D}{{\mathcal {H}}}[v](\xi )&= \int _{\mathcal {O}}\big (h'(u(R_\varepsilon (v)))u'(R_\varepsilon (v)) \textrm{D}R_\varepsilon [v](\xi ) + \varepsilon L\textrm{D}R_\varepsilon [v](\xi )\cdot LR_\varepsilon (v)\big )\textrm{d}x \\&= \big \langle (u'(R_\varepsilon (v))+\varepsilon L^*L)\textrm{D}R_\varepsilon [v](\xi ),R_\varepsilon (v) \big \rangle _{D(L)',D(L)} \\&= \big \langle \textrm{D}Q_\varepsilon [R_\varepsilon (v)]\textrm{D}R_\varepsilon [v](\xi ),R_\varepsilon (v) \big \rangle _{D(L)',D(L)} = \langle \xi ,R_\varepsilon (v)\rangle _{D(L)',D(L)}, \end{aligned}$$

where v, \(\xi \in D(L)'\). In other words, \(\textrm{D}{{\mathcal {H}}}[v]\) can be identified with \(R_\varepsilon (v)\in D(L)\). In a similar way, we can prove that \(\textrm{D}{{\mathcal {H}}}[v]\) is Fréchet differentiable with

$$\begin{aligned} \textrm{D}^2 {{\mathcal {H}}}[v](\xi ,\eta ) = \langle \xi ,\textrm{D}R_\varepsilon [v](\eta )\rangle _{D(L)',D(L)} \quad \text{ for } v,\,\xi ,\,\eta \in D(L)'. \end{aligned}$$

We have, thanks to the Lipschitz continuity of \(R_\varepsilon \) and \(\textrm{D}R_\varepsilon [v]\) (see Lemma 11) for all v, \(\xi \in D(L)'\) with \(\Vert v\Vert _{D(L)'}\le K\) for some \(K>0\),

$$\begin{aligned} |\textrm{D}{{\mathcal {H}}}[v](\xi )|&\le \Vert R_\varepsilon (v)\Vert _{D(L)}\Vert \xi \Vert _{D(L)'} \le C(\varepsilon )(1+\Vert v\Vert _{D(L)'})\Vert \xi \Vert _{D(L)'} \\&\le C(\varepsilon ,K)\Vert \xi \Vert _{D(L)'}, \\ |\textrm{D}^2 {{\mathcal {H}}}[v](\xi ,\xi )|&\le \Vert \textrm{D}R_\varepsilon [v](\xi )\Vert _{D(L)}\Vert \xi \Vert _{D(L)'} \le C(\varepsilon )\Vert \xi \Vert _{D(L)'}^2. \end{aligned}$$

Finally, for any \(\eta \in D(L)'\), we need an estimate for the mapping \(D(L)'\rightarrow {{\mathbb {R}}}\), \(v\mapsto \textrm{D}{{\mathcal {H}}}[v](\eta )\). We have identified \(\textrm{D}{{\mathcal {H}}}[v]\) with \(R_\varepsilon (v)\in D(L)\), but we need an identification in \(D(L)'\). As in Lemma 8, the operator L can be constructed in such a way that the Riesz representative in \(D(L)'\) of a functional acting on \(D(L)'\) can be expressed via the application of \(L^*L\) to an element of D(L). Indeed, for \(F\in D(L)\) and \(\xi \in D(L)'\), we infer from Lemma 8 that

$$\begin{aligned} \langle \xi ,F\rangle _{D(L)',D(L)}&= (L^{-1}\xi ,LF)_{L^2({\mathcal {O}})} = ((LL^{-1})L^{-1}\xi ,LF)_{L^2({\mathcal {O}})} \\&= (L^{-1}\xi ,L^{-1}L^*LF)_{L^2({\mathcal {O}})} = (L^*LF,\xi )_{D(L)'}. \end{aligned}$$

Hence, we can associate \(\textrm{D}{{\mathcal {H}}}[v]\) with \(L^*LR_\varepsilon (v) \in D(L)'\). Then, by the first estimate in (15) and the Lipschitz continuity of \(R_\varepsilon \),

$$\begin{aligned} \Vert L^*LR_\varepsilon (v)\Vert _{D(L)'}&\le C\Vert R_\varepsilon (v)\Vert _{D(L)} \le C\Vert R_\varepsilon (v)-R_\varepsilon (0)\Vert _{D(L)} + C\Vert R_\varepsilon (0)\Vert _{D(L)} \\&\le C(\varepsilon )(1+\Vert v\Vert _{D(L)'})\quad \text{ for } \text{ all } v\in D(L)', \end{aligned}$$

giving the desired estimate for \(\textrm{D}{{\mathcal {H}}}[v]\) in \(D(L)'\). Thus, the assumptions of the Itô lemma, as stated in [29], are satisfied.

To simplify the notation, we set \(u^\varepsilon := u(R_\varepsilon (v^R))\) and \(w^\varepsilon := R_\varepsilon (v^R)\) in the following. By Itô’s lemma, using \(\textrm{D}{{\mathcal {H}}}[v^R]=h'(u^\varepsilon )\) and \(\textrm{D}^2 {{\mathcal {H}}}[v^R]=\textrm{D}R_\varepsilon [v^R]\), we have

$$\begin{aligned} {{\mathcal {H}}}(v^R(t))&= {{\mathcal {H}}}(v(0)) + \int _0^t\big \langle {\text {div}}\big (B(w^\varepsilon )\nabla h'(u^\varepsilon (s))\big ),w^\varepsilon (s) \big \rangle _{D(L)',D(L)}\textrm{d}s \nonumber \\&\quad + \sum _{k=1}^\infty \sum _{i,j=1}^n\int _0^t\int _{\mathcal {O}}\frac{\partial h}{\partial u_i}(u^\varepsilon (s))\sigma _{ij}(u^\varepsilon (s))e_k\textrm{d}x\textrm{d}W_j^k(s) \nonumber \\&\quad + \frac{1}{2}\sum _{k=1}^\infty \int _0^t\int _{\mathcal {O}}\textrm{D}R_\varepsilon [v^R(s)] \big (\sigma (u^\varepsilon (s))e_k\big ):\big (\sigma (u^\varepsilon (s))e_k\big ) \textrm{d}x\textrm{d}s. \end{aligned}$$
(22)

Lemma 15 shows that the first term on the right-hand side can be estimated from above by \(\int _{\mathcal {O}}h(u^0)\textrm{d}x\). Using \(w^\varepsilon = R_\varepsilon (v^R)=h'(u^\varepsilon )\) and integrating by parts, the second term on the right-hand side can be written as

$$\begin{aligned} \int _0^t\big \langle&{\text {div}}\big (B(w^\varepsilon )\nabla h'(u^\varepsilon (s))\big ),w^\varepsilon (s) \big \rangle _{D(L)',D(L)}\textrm{d}s \\&= -\int _0^t\int _{\mathcal {O}}\nabla w^\varepsilon (s):B(w^\varepsilon )\nabla w^\varepsilon (s)\textrm{d}x\textrm{d}s \le 0. \end{aligned}$$

The boundary integral vanishes because of the choice of the space \(D(L)=H^m_N({\mathcal {O}})\). The last inequality follows from Assumption (A3), which implies that \(B(w^\varepsilon ) = A(u(w^\varepsilon ))h''(u(w^\varepsilon ))^{-1}\) is positive semidefinite. We reformulate the last term in (22) by applying Lemma 14 with \(a=\sigma (u^\varepsilon )e_k\) and \(b=\textrm{D}R_\varepsilon [v^R](\sigma (u^\varepsilon )e_k)\):

$$\begin{aligned} \int _{\mathcal {O}}&\textrm{D}R_\varepsilon [v^R] \big (\sigma (u^\varepsilon )e_k\big ):\big (\sigma (u^\varepsilon )e_k\big )\textrm{d}x \\&\le \int _{\mathcal {O}}{\text {tr}}\big [(\sigma (u^\varepsilon )e_k)^T u'(w^\varepsilon )^{-1} \sigma (u^\varepsilon )e_k\big ]\textrm{d}x. \end{aligned}$$

Taking the supremum in (22) over \((0,T_R)\), where \(T_R\le T\wedge \tau _R\), and the expectation yields

$$\begin{aligned} {{\mathbb {E}}}&\sup _{0<t<T_R}\int _{\mathcal {O}}h(u^\varepsilon (t))\textrm{d}x + \frac{\varepsilon }{2}{{\mathbb {E}}}\sup _{0<t<T_R}\Vert Lw^\varepsilon \Vert _{L^2({\mathcal {O}})}^2 \nonumber \\&\quad \; + {{\mathbb {E}}}\sup _{0<t<T_R}\int _0^t\int _{\mathcal {O}}\nabla w^\varepsilon (s):B(w^\varepsilon )\nabla w^\varepsilon (s) \textrm{d}x\textrm{d}s - {{\mathbb {E}}}\int _{\mathcal {O}}h(u^0)\textrm{d}x\nonumber \\&\le {{\mathbb {E}}}\sup _{0<t<T_R}\sum _{k=1}^\infty \sum _{i,j=1}^n\int _0^t\int _{\mathcal {O}}\frac{\partial h}{\partial u_i}(u^\varepsilon (s))\sigma _{ij}(u^\varepsilon (s))e_k\textrm{d}x\textrm{d}W_j^k(s) \nonumber \\&\quad \;+ \frac{1}{2}{{\mathbb {E}}}\sup _{0<t<T_R}\sum _{k=1}^\infty \int _0^t\int _{\mathcal {O}}{\text {tr}}\big [(\sigma (u^\varepsilon (s))e_k)^T u'(w^\varepsilon (s))^{-1} \sigma (u^\varepsilon (s))e_k\big ]\textrm{d}x\textrm{d}s \nonumber \\&=: I_1 + I_2. \end{aligned}$$
(23)

We apply the Burkholder–Davis–Gundy inequality [32, Theorem 6.1.2] to \(I_1\) and use Assumption (A5):

$$\begin{aligned} I_1&\le C{{\mathbb {E}}}\sup _{0<t<T_R}\bigg \{\int _0^t\sum _{k=1}^\infty \sum _{i,j=1}^n \bigg (\int _{\mathcal {O}}\frac{\partial h}{\partial u_i}(u^\varepsilon (s))\sigma _{ij}(u^\varepsilon (s))e_k \textrm{d}x\bigg )^2\textrm{d}s\bigg \}^{1/2} \\&\le C{{\mathbb {E}}}\sup _{0<t<T_R}\bigg (1+\int _0^t\int _{\mathcal {O}}h(u^\varepsilon (s))\textrm{d}x\textrm{d}s\bigg ). \end{aligned}$$

Also the remaining integral \(I_2\) can be bounded from above by Assumption (A5):

$$\begin{aligned} I_2 \le C{{\mathbb {E}}}\sup _{0<t<T_R}\bigg (1+\int _0^t\int _{\mathcal {O}}h(u^\varepsilon (s))\textrm{d}x\textrm{d}s\bigg ). \end{aligned}$$

Therefore, (23) becomes

$$\begin{aligned} {{\mathbb {E}}}&\sup _{0<t<T_R}\int _{\mathcal {O}}h(u^\varepsilon (t))\textrm{d}x + \frac{\varepsilon }{2}{{\mathbb {E}}}\sup _{0<t<T_R}\Vert Lw^\varepsilon \Vert _{L^2({\mathcal {O}})}^2 \nonumber \\&\quad \; + {{\mathbb {E}}}\sup _{0<t<T_R}\int _0^t\int _{\mathcal {O}}\nabla w^\varepsilon (s):B(w^\varepsilon )\nabla w^\varepsilon (s) \textrm{d}x\textrm{d}s - {{\mathbb {E}}}\int _{\mathcal {O}}h(u^0)\textrm{d}x\nonumber \\&\le C{{\mathbb {E}}}\sup _{0<t<T_R}\bigg (1+\int _0^t\int _{\mathcal {O}}h(u^\varepsilon (s))\textrm{d}x\textrm{d}s\bigg ) \nonumber \\&\le C + C{{\mathbb {E}}}\int _0^{T_R}\sup _{0<s<t}\int _{\mathcal {O}}h(u^\varepsilon (s))\textrm{d}x\textrm{d}t. \end{aligned}$$
(24)

We apply Gronwall’s lemma to the function \(F(t)=\sup _{0<s<t}\int _{\mathcal {O}}h(u^\varepsilon (s))\textrm{d}x\) to find that

$$\begin{aligned} {{\mathbb {E}}}\sup _{0<t<T_R}\int _{\mathcal {O}}h(u^\varepsilon (t))\textrm{d}x \le C(u^0,T). \end{aligned}$$

Using this bound in (24) then finishes the proof. \(\square \)

The entropy inequality allows us to extend the local solution to a global one.

Proposition 17

Let \({(v^\varepsilon ,\tau _R)}\) be a local solution to (19)–(20), constructed in Theorem 13. Then \(v^\varepsilon \) can be extended to a global solution to (19)–(20).

Proof

With the notation \(u^\varepsilon = u(R_\varepsilon (v^\varepsilon ))\) and \(w^\varepsilon =R_\varepsilon (v^\varepsilon )\), we observe that \(v^\varepsilon =Q_\varepsilon (R_\varepsilon (v^\varepsilon )) = u(R_\varepsilon (v^\varepsilon ))+\varepsilon L^*LR_\varepsilon (v^\varepsilon ) = u^\varepsilon + \varepsilon L^*Lw^\varepsilon \). Thus, we have for \(T_R\le T\wedge \tau _R\),

$$\begin{aligned} {{\mathbb {E}}}&\sup _{0<t<T_R}\Vert v^\varepsilon (t)\Vert _{D(L)'} \le {{\mathbb {E}}}\sup _{0<t<T_R}\Vert u^\varepsilon \Vert _{D(L)'} + \varepsilon {{\mathbb {E}}}\sup _{0<t<T_R}\Vert L^*Lw^\varepsilon (t)\Vert _{D(L)'} \\&\le C{{\mathbb {E}}}\sup _{0<t<T_R}\Vert u^\varepsilon \Vert _{L^1({\mathcal {O}})} + \varepsilon {{\mathbb {E}}}\sup _{0<t<T_R}\Vert L^*Lw^\varepsilon (t)\Vert _{D(L)'}. \end{aligned}$$

We know from Hypothesis (H1) that \(|u^\varepsilon |\le C(1+h(u^\varepsilon ))\). Therefore, taking into account the entropy inequality and the second inequality in (15),

$$\begin{aligned} {{\mathbb {E}}}\sup _{0<t<T_R}\Vert v^\varepsilon (t)\Vert _{D(L)'}&\le C + C{{\mathbb {E}}}\sup _{0<t<T_R}\Vert h(u^\varepsilon (t))\Vert _{L^1({\mathcal {O}})} \\&\quad + \varepsilon C{{\mathbb {E}}}\sup _{0<t<T_R} \Vert Lw^\varepsilon (t)\Vert _{L^2({\mathcal {O}})} \le C(u^0,T). \end{aligned}$$

This allows us to perform the limit \(R\rightarrow \infty \) and to conclude that we have indeed a solution \(v^\varepsilon \) in (0, T) for any \(T>0\). \(\square \)

5 Proof of Theorem 4

We prove the global existence of martingale solutions to the SKT model with self-diffusion.

5.1 Uniform estimates

Let \(v^\varepsilon \) be a global solution to (19)–(20) and set \(u^\varepsilon =u(R_\varepsilon (v^\varepsilon ))\). We assume that A(u) is given by (3) and that \(a_{ii}>0\) for \(i=1,\ldots ,n\). We start with some uniform estimates, which are a consequence of the entropy inequality in Proposition 16.

Lemma 18

(Uniform estimates) There exists a constant \(C(u^0,T)>0\) such that for all \(\varepsilon >0\) and \(i,j=1,\ldots ,n\) with \(i\ne j\),

$$\begin{aligned} {{\mathbb {E}}}\Vert u_i^\varepsilon \Vert _{L^\infty (0,T;L^1({\mathcal {O}}))}&\le C(u^0,T), \end{aligned}$$
(25)
$$\begin{aligned} a_{i0}^{1/2}{{\mathbb {E}}}\Vert (u_i^\varepsilon )^{1/2}\Vert _{L^2(0,T;H^1({\mathcal {O}}))} + a_{ii}^{1/2}{{\mathbb {E}}}\Vert u_i^\varepsilon \Vert _{L^2(0,T;H^1({\mathcal {O}}))}&\le C(u^0,T), \nonumber \\ a_{ij}^{1/2}{{\mathbb {E}}}\Vert \nabla (u_i^\varepsilon u_j^\varepsilon )^{1/2}\Vert _{L^2(0,T;L^2({\mathcal {O}}))}&\le C(u^0,T). \end{aligned}$$
(26)

Moreover, we have the estimate

$$\begin{aligned} \varepsilon {{\mathbb {E}}}\Vert LR_\varepsilon (v^\varepsilon )\Vert _{L^\infty (0,T;L^2({\mathcal {O}}))}^2 + {{\mathbb {E}}}\Vert v^\varepsilon \Vert _{L^\infty (0,T;D(L)')}^2 \le C(u^0,T). \end{aligned}$$
(27)

Proof

Let \(v^\varepsilon \) be a global solution to (19)–(20). We observe that \(R_\varepsilon (v^\varepsilon ) = h'(u(R_\varepsilon (v^\varepsilon ))) = h'(u^\varepsilon )\) implies that \(\nabla R_\varepsilon (v^\varepsilon ) = h''(u^\varepsilon )\nabla u^\varepsilon \). It is shown in [11, Lemma 4] that for all \(z\in {{\mathbb {R}}}^n\) and \(u\in (0,\infty )^n\),

$$\begin{aligned} z^T h''(u)A(u)z \ge \sum _{i=1}^n\pi _i\bigg (a_{i0}\frac{z_i^2}{u_i} + 2a_{ii}z_i^2\bigg ) + \frac{1}{2}\sum _{i,j=1,\,i\ne j}^n \pi _i a_{ij}\bigg (\sqrt{\frac{u_j}{u_i}}z_i + \sqrt{\frac{u_i}{u_j}}z_j\bigg )^2. \end{aligned}$$
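This pointwise inequality is easy to test numerically. The following sketch is not part of the paper: it assumes the Boltzmann-type entropy \(h(u)=\sum _i\pi _iu_i(\log u_i-1)\), so that \(h''(u)=\textrm{diag}(\pi _i/u_i)\), and symmetric coefficients \(a_{ij}=a_{ji}\) with weights \(\pi _i=1\), which satisfy the detailed-balance condition \(\pi _ia_{ij}=\pi _ja_{ji}\) required in [11].

```python
import numpy as np

def entropy_form_gap(n=3, trials=200, seed=0):
    """Minimum of z^T h''(u) A(u) z minus the right-hand side over random samples."""
    rng = np.random.default_rng(seed)
    gap = np.inf
    for _ in range(trials):
        u = rng.uniform(0.1, 5.0, n)          # positive densities
        z = rng.normal(size=n)
        a0 = rng.uniform(0.0, 2.0, n)         # a_{i0}
        a = rng.uniform(0.0, 2.0, (n, n))
        a = (a + a.T) / 2                     # symmetric, so pi_i = 1 is admissible
        pi = np.ones(n)
        # A_ij(u) = delta_ij (a_{i0} + sum_k a_{ik} u_k) + a_{ij} u_i, cf. (3)
        A = np.diag(a0 + a @ u) + a * u[:, None]
        lhs = z @ np.diag(pi / u) @ A @ z     # assumed Hessian h''(u) = diag(pi_i/u_i)
        rhs = sum(pi[i] * (a0[i] * z[i] ** 2 / u[i] + 2 * a[i, i] * z[i] ** 2)
                  for i in range(n))
        rhs += 0.5 * sum(pi[i] * a[i, j] * (np.sqrt(u[j] / u[i]) * z[i]
                                            + np.sqrt(u[i] / u[j]) * z[j]) ** 2
                         for i in range(n) for j in range(n) if i != j)
        gap = min(gap, lhs - rhs)
    return gap

print(entropy_form_gap())  # nonnegative up to floating-point rounding
```

Under these assumptions the off-diagonal contributions match exactly, so the reported gap is zero up to rounding.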

Using \(B(R_\varepsilon (v^\varepsilon ))=A(u^\varepsilon )h''(u^\varepsilon )^{-1}\) and the previous inequality with \(z=\nabla u^\varepsilon \), we find that

$$\begin{aligned} \nabla R_\varepsilon (v^\varepsilon )&:B(R_\varepsilon (v^\varepsilon ))\nabla R_\varepsilon (v^\varepsilon ) = \nabla u^\varepsilon :h''(u^\varepsilon )\big (A(u^\varepsilon )h''(u^\varepsilon )^{-1}\big )h''(u^\varepsilon )\nabla u^\varepsilon \nonumber \\&= \nabla u^\varepsilon :h''(u^\varepsilon )A(u^\varepsilon )\nabla u^\varepsilon \nonumber \\&\ge \sum _{i=1}^n\pi _i\big (4a_{i0}|\nabla (u_i^\varepsilon )^{1/2}|^2 + 2a_{ii}|\nabla u_i^\varepsilon |^2\big ) + 2\sum _{i\ne j}\pi _i a_{ij}|\nabla (u^\varepsilon _i u^\varepsilon _j)^{1/2}|^2. \end{aligned}$$
(28)

Therefore, the entropy inequality in Proposition 16 becomes

$$\begin{aligned} {{\mathbb {E}}}\sup _{0<t<T}\int _{\mathcal {O}}h(u^\varepsilon (t))\textrm{d}x&+ {{\mathbb {E}}}\sup _{0<t<T}\frac{\varepsilon }{2}\Vert LR_\varepsilon (v^\varepsilon (t))\Vert _{L^2({\mathcal {O}})}^2 \nonumber \\&+ {{\mathbb {E}}}\int _0^T\int _{\mathcal {O}}\sum _{i=1}^n\pi _i\big (4a_{i0}|\nabla (u_i^\varepsilon )^{1/2}|^2 + 2a_{ii}|\nabla u_i^\varepsilon |^2\big )\textrm{d}x\textrm{d}s \nonumber \\&+ 2{{\mathbb {E}}}\int _0^T\int _{\mathcal {O}}\sum _{i\ne j}\pi _i a_{ij}|\nabla (u^\varepsilon _i u^\varepsilon _j)^{1/2}|^2\textrm{d}x\textrm{d}s \le C(u^0,T). \end{aligned}$$
(29)

This is the stochastic analog of the entropy inequality (6). By Hypothesis (H1), we have \(|u|\le C(1+h(u))\) and consequently,

$$\begin{aligned} {{\mathbb {E}}}\sup _{0<t<T}\Vert u^\varepsilon (t)\Vert _{L^1({\mathcal {O}})} \le C{{\mathbb {E}}}\sup _{0<t<T}\int _{\mathcal {O}}h(u^\varepsilon (t))\textrm{d}x + C \le C(u^0,T), \end{aligned}$$

which proves (25). Estimate (26) then follows from (29) and the Poincaré–Wirtinger inequality.

It remains to show estimate (27). We deduce from the second inequality in (15) that

$$\begin{aligned} \Vert v^\varepsilon (t)\Vert _{D(L)'}&= \Vert Q_\varepsilon (R_\varepsilon (v^\varepsilon (t)))\Vert _{D(L)'} = \Vert u(R_\varepsilon (v^\varepsilon (t))) + \varepsilon L^*LR_\varepsilon (v^\varepsilon (t))\Vert _{D(L)'} \\&\le C\Vert u(R_\varepsilon (v^\varepsilon (t)))\Vert _{L^1({\mathcal {O}})} + \varepsilon \Vert L^*LR_\varepsilon (v^\varepsilon (t))\Vert _{D(L)'} \\&\le C\Vert u^\varepsilon (t)\Vert _{L^1({\mathcal {O}})} + \varepsilon C\Vert LR_\varepsilon (v^\varepsilon (t))\Vert _{L^2({\mathcal {O}})}. \end{aligned}$$

This shows that

$$\begin{aligned} {{\mathbb {E}}}\sup _{0<t<T}\Vert v^\varepsilon (t)\Vert _{D(L)'}&\le C{{\mathbb {E}}}\sup _{0<t<T}\Vert u^\varepsilon (t)\Vert _{L^1({\mathcal {O}})} + \varepsilon C{{\mathbb {E}}}\sup _{0<t<T}\Vert LR_\varepsilon (v^\varepsilon (t))\Vert _{L^2({\mathcal {O}})} \\&\le C(u^0,T), \end{aligned}$$

ending the proof. \(\square \)

We also need higher-order moment estimates.

Lemma 19

(Higher-order moments I) Let \(p\ge 2\). There exists a constant \(C(p,u^0,T)\), which is independent of \(\varepsilon \), such that

$$\begin{aligned} {{\mathbb {E}}}\Vert u^\varepsilon \Vert _{L^\infty (0,T;L^1({\mathcal {O}}))}^p&\le C(p,u^0,T), \end{aligned}$$
(30)
$$\begin{aligned} a_{i0}^{p/2}{{\mathbb {E}}}\Vert (u_i^\varepsilon )^{1/2}\Vert _{L^2(0,T;H^1({\mathcal {O}}))}^p + a_{ii}^{p/2}{{\mathbb {E}}}\Vert u_i^\varepsilon \Vert _{L^2(0,T;H^1({\mathcal {O}}))}^p&\le C(p,u^0,T), \end{aligned}$$
(31)
$$\begin{aligned} a_{ij}^{p/2}{{\mathbb {E}}}\Vert \nabla (u_i^\varepsilon u_j^\varepsilon )^{1/2}\Vert _{L^2(0,T;L^2({\mathcal {O}}))}^p&\le C(p,u^0,T). \end{aligned}$$
(32)

Moreover, we have

$$\begin{aligned} {{\mathbb {E}}}\bigg (\varepsilon \sup _{0<t<T}\Vert LR_\varepsilon (v^\varepsilon (t))\Vert _{L^2({\mathcal {O}})}^2\bigg )^p + {{\mathbb {E}}}\bigg (\sup _{0<t<T}\Vert v^\varepsilon (t)\Vert _{D(L)'}\bigg )^p \le C(p,u^0,T). \end{aligned}$$
(33)

Proof

Proceeding as in the proof of Proposition 16 and taking into account identity (22) and inequality (28), we obtain

$$\begin{aligned} {{\mathcal {H}}}(v^\varepsilon (t))&+ \int _0^t\int _{\mathcal {O}}\sum _{i=1}^n\pi _i\big (4a_{i0}|\nabla (u_i^\varepsilon )^{1/2}|^2 + 2a_{ii}|\nabla u_i^\varepsilon |^2\big )\textrm{d}x\textrm{d}s \\&\quad + 2\int _0^t\int _{\mathcal {O}}\sum _{i\ne j}\pi _i a_{ij} |\nabla (u^\varepsilon _i u^\varepsilon _j)^{1/2}|^2\textrm{d}x\textrm{d}s \\&\le {{\mathcal {H}}}(v^\varepsilon (0)) + \sum _{k=1}^\infty \sum _{i,j=1}^n\int _0^t\int _{\mathcal {O}}\pi _i\log u^\varepsilon _i(s)\sigma _{ij}(u^\varepsilon (s))e_k\textrm{d}x\textrm{d}W_j^k(s) \\&\quad + \frac{1}{2}\sum _{k=1}^\infty \int _0^t\int _{\mathcal {O}}{\text {tr}} \big [(\sigma (u^\varepsilon (s))e_k)^T h''(u^\varepsilon (s))\sigma (u^\varepsilon (s))e_k\big ]\textrm{d}x\textrm{d}s, \end{aligned}$$

recalling Definition 21 of \({{\mathcal {H}}}(v^\varepsilon )\). We raise this inequality to the pth power, take the expectation, apply the Burkholder–Davis–Gundy inequality (for the second term on the right-hand side), and use Assumption (A5) to find that

$$\begin{aligned} {{\mathbb {E}}}&\bigg (\sup _{0<t<T}\int _{\mathcal {O}}h(u^\varepsilon (t))\textrm{d}x + \varepsilon \sup _{0<t<T}\Vert LR_\varepsilon (v^\varepsilon (t))\Vert _{L^2({\mathcal {O}})}^2\bigg )^p \nonumber \\&\quad \;+ C{{\mathbb {E}}}\bigg (\int _0^T\int _{\mathcal {O}}\sum _{i=1}^n\pi _i a_{i0} |\nabla (u_i^\varepsilon (s))^{1/2}|^2\textrm{d}x\textrm{d}s\bigg )^p \nonumber \\&\quad \;+ C{{\mathbb {E}}}\bigg (\int _0^T\int _{\mathcal {O}}\sum _{i=1}^n\pi _i a_{ii} |\nabla u_i^\varepsilon (s)|^2\textrm{d}x\textrm{d}s\bigg )^p \nonumber \\&\quad \;+ C{{\mathbb {E}}}\bigg (\int _0^T\int _{\mathcal {O}}\sum _{i\ne j}\pi _i a_{ij} |\nabla (u^\varepsilon _i u^\varepsilon _j)^{1/2}|^2\textrm{d}x\textrm{d}s\bigg )^p \nonumber \\&\le C(p,u^0) + C{{\mathbb {E}}}\bigg (\int _0^T\sum _{k=1}^\infty \sum _{i,j=1}^n \bigg (\int _{\mathcal {O}}\log u^\varepsilon _i(s)\sigma _{ij}(u^\varepsilon (s))e_k\textrm{d}x\bigg )^2 \textrm{d}s\bigg )^{p/2} \nonumber \\&\quad \;+ C{{\mathbb {E}}}\bigg (\int _0^T\sum _{k=1}^\infty \int _{\mathcal {O}}{\text {tr}} \big [(\sigma (u^\varepsilon (s))e_k)^T h''(u^\varepsilon (s))(\sigma (u^\varepsilon (s))e_k)\big ]\textrm{d}x\textrm{d}s \bigg )^p \nonumber \\&\le C(p,u^0) + C{{\mathbb {E}}}\bigg (\int _0^T\int _{\mathcal {O}}h(u^\varepsilon (s))\textrm{d}x\textrm{d}s\bigg )^p. \end{aligned}$$
(34)

We drop the term \(\varepsilon \Vert LR_\varepsilon (v^\varepsilon (t))\Vert _{L^2({\mathcal {O}})}^2\) on the left-hand side and apply Gronwall’s lemma. Then, taking into account that the entropy dominates the \(L^1({\mathcal {O}})\) norm, thanks to Hypothesis (H1), and applying the Poincaré–Wirtinger inequality, we obtain estimates (30)–(32). Going back to (34), we infer that

$$\begin{aligned} {{\mathbb {E}}}\bigg (\varepsilon \sup _{0<t<T}\Vert LR_\varepsilon (v^\varepsilon (t))\Vert _{L^2({\mathcal {O}})}^2\bigg )^p&\le C(p,u^0) + C(p,T){{\mathbb {E}}}\int _0^T\bigg (\int _{\mathcal {O}}h(u^\varepsilon (s))\textrm{d}x\bigg )^p \textrm{d}s \\&\le C(p,u^0,T). \end{aligned}$$

Combining the previous estimates and arguing as in the proof of Lemma 18, we have

$$\begin{aligned} {{\mathbb {E}}}\bigg (\sup _{0<t<T}\Vert v^\varepsilon (t)\Vert _{D(L)'}\bigg )^p&= {{\mathbb {E}}}\bigg (\sup _{0<t<T}\Vert u^\varepsilon (t)+\varepsilon L^*LR_\varepsilon (v^\varepsilon (t))\Vert _{D(L)'}\bigg )^p \\&\le C{{\mathbb {E}}}\bigg (\sup _{0<t<T}\Vert u^\varepsilon (t)\Vert _{L^1({\mathcal {O}})}\bigg )^p + C{{\mathbb {E}}}\bigg (\varepsilon ^2\sup _{0<t<T}\Vert LR_\varepsilon (v^\varepsilon (t))\Vert _{L^2({\mathcal {O}})}^2\bigg )^{p/2} \\&\le C(p,u^0,T). \end{aligned}$$

This ends the proof. \(\square \)

Using the Gagliardo–Nirenberg inequality, we can derive further estimates. We recall that \(Q_T={\mathcal {O}}\times (0,T)\).

Lemma 20

(Higher-order moments II) Let \(p\ge 2\). There exists a constant \(C(p,u^0,T)>0\), which is independent of \(\varepsilon \), such that

$$\begin{aligned} {{\mathbb {E}}}\Vert u_i^\varepsilon \Vert _{L^{2+2/d}(Q_T)}^p&\le C(p,u^0,T), \end{aligned}$$
(35)
$$\begin{aligned} {{\mathbb {E}}}\Vert u_i^\varepsilon \Vert _{L^{2+4/d}(0,T;L^2({\mathcal {O}}))}^p&\le C(p,u^0,T). \end{aligned}$$
(36)

Proof

We apply the Gagliardo–Nirenberg inequality:

$$\begin{aligned} {{\mathbb {E}}}\bigg (\int _0^T\Vert u_i^\varepsilon \Vert _{L^r({\mathcal {O}})}^s\textrm{d}t\bigg )^{p/s}&\le C{{\mathbb {E}}}\bigg (\int _0^T\Vert u_i^\varepsilon \Vert _{H^1({\mathcal {O}})}^{\theta s} \Vert u_i^\varepsilon \Vert _{L^1({\mathcal {O}})}^{(1-\theta )s}\textrm{d}t\bigg )^{p/s} \\&\le C{{\mathbb {E}}}\bigg (\Vert u_i^\varepsilon \Vert _{L^\infty (0,T;L^1({\mathcal {O}}))}^{(1-\theta )s}\int _0^T \Vert u_i^\varepsilon \Vert _{H^1({\mathcal {O}})}^{2}\textrm{d}t\bigg )^{p/s} \\&\le C\big ({{\mathbb {E}}}\Vert u_i^\varepsilon \Vert _{L^\infty (0,T;L^1({\mathcal {O}}))}^{2(1-\theta )p}\big )^{1/2} \big ({{\mathbb {E}}}\Vert u_i^\varepsilon \Vert _{L^2(0,T;H^1({\mathcal {O}}))}^{4p/s}\big )^{1/2} \le C, \end{aligned}$$

where \(r>1\) and \(\theta \in (0,1]\) are related by \(1/r=1-\theta (d+2)/(2d)\) and \(s=2/\theta \ge 2\), so that \(\theta s=2\). The right-hand side is bounded in view of estimates (30) and (31). Estimate (35) follows after choosing \(r=s\), implying that \(r=2+2/d\), and (36) follows from the choice \(s=2+4/d\), implying that \(r=2\). \(\square \)
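The exponent bookkeeping can be double-checked with exact rational arithmetic. The following sketch (illustrative only, not from the paper) verifies for \(d=1,\ldots ,10\) that the relation \(1/r=1-\theta (d+2)/(2d)\) with \(\theta =2/s\) gives \(r=2+2/d\) for the choice \(r=s\), and \(r=2\) for the choice \(s=2+4/d\).

```python
from fractions import Fraction

def r_from_s(s, d):
    """Solve 1/r = 1 - theta*(d+2)/(2d) for r, with theta = 2/s."""
    theta = Fraction(2) / Fraction(s)
    return 1 / (1 - theta * Fraction(d + 2, 2 * d))

for d in range(1, 11):
    # the choice r = s has the solution r = 2 + 2/d
    s = 2 + Fraction(2, d)
    assert r_from_s(s, d) == s
    # the choice s = 2 + 4/d forces r = 2
    assert r_from_s(2 + Fraction(4, d), d) == 2
print("Gagliardo-Nirenberg exponents verified")
```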

Next, we show some bounds for the fractional time derivative of \(u^\varepsilon \). This result is used to establish the tightness of the laws of \((u^\varepsilon )\) in a sub-Polish space. Alternatively, the tightness property can be proved by verifying the Aldous condition; see, e.g., [18]. We recall the definition of the Sobolev–Slobodeckij spaces. Let X be a Banach space and let \(p\ge 1\), \(\alpha \in (0,1)\). Then \(W^{\alpha ,p}(0,T;X)\) is the set of all functions \(v\in L^p(0,T;X)\) for which

$$\begin{aligned} \Vert v\Vert _{W^{\alpha ,p}(0,T;X)}^p&= \Vert v\Vert _{L^p(0,T;X)}^p + |v|_{W^{\alpha ,p}(0,T;X)}^p \\&= \int _0^T\Vert v\Vert _X^p \textrm{d}t + \int _0^T\int _0^T\frac{\Vert v(t)-v(s)\Vert _X^p}{|t-s|^{1+\alpha p}}\textrm{d}t\textrm{d}s < \infty . \end{aligned}$$

With this norm, \(W^{\alpha ,p}(0,T;X)\) becomes a Banach space. We need the following technical lemma, which is proved in Appendix A.
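For concreteness, here is a minimal numerical sketch (not from the paper) that approximates the scalar seminorm \(|v|_{W^{\alpha ,p}(0,T;{{\mathbb {R}}})}\) for a function sampled on a uniform grid; the grid size and parameters below are illustrative.

```python
import numpy as np

def slobodeckij_seminorm(v, T, alpha, p):
    """Riemann-sum approximation of
    ( int_0^T int_0^T |v(t)-v(s)|^p / |t-s|^{1+alpha p} dt ds )^{1/p}."""
    v = np.asarray(v, dtype=float)
    t = np.linspace(0.0, T, len(v))
    dt = t[1] - t[0]
    diff = np.abs(v[:, None] - v[None, :]) ** p
    dist = np.abs(t[:, None] - t[None, :])
    np.fill_diagonal(dist, np.inf)   # drop the singular diagonal t = s
    return float(((diff / dist ** (1 + alpha * p)).sum() * dt * dt) ** (1 / p))

t = np.linspace(0.0, 1.0, 400)
print(slobodeckij_seminorm(np.ones_like(t), 1.0, 0.4, 2))  # constants have zero seminorm
print(slobodeckij_seminorm(t, 1.0, 0.4, 2))                # v(t) = t: finite seminorm
```

For \(v(t)=t\), \(p=2\), \(\alpha =0.4\), the integrand is \(|t-s|^{0.2}\), so the seminorm is finite even though the kernel is singular on the diagonal.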

Lemma 21

Let \(g\in L^1(0,T)\) and \(\delta <2\), \(\delta \ne 1\). Then

$$\begin{aligned} \int _0^T\int _0^T|t-s|^{-\delta }\int _{s\wedge t}^{t\vee s}g(r)\textrm{d}r\textrm{d}t\textrm{d}s < \infty . \end{aligned}$$
(37)
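As an illustration of (37) (not part of the proof), take \(g\equiv 1\) on \((0,T)\); then the inner integral equals \(|t-s|\) and the double integral has the closed form \(2T^{3-\delta }/((2-\delta )(3-\delta ))\). A midpoint-rule check with illustrative grid sizes:

```python
import numpy as np

def lemma21_integral(delta, T=1.0, n=800):
    """Midpoint-rule approximation of (37) for g == 1, where the inner
    integral equals |t-s|, i.e. of  int int |t-s|^{1-delta} dt ds."""
    t = (np.arange(n) + 0.5) * (T / n)
    dist = np.abs(t[:, None] - t[None, :])
    off = ~np.eye(n, dtype=bool)              # drop the measure-zero diagonal
    vals = np.zeros((n, n))
    vals[off] = dist[off] ** (1.0 - delta)
    return float(vals.sum() * (T / n) ** 2)

def exact(delta, T=1.0):
    # closed form: 2 T^{3-delta} / ((2-delta)(3-delta))
    return 2 * T ** (3 - delta) / ((2 - delta) * (3 - delta))

print(lemma21_integral(0.5), exact(0.5))   # delta < 1: smooth integrand
print(lemma21_integral(1.5), exact(1.5))   # 1 < delta < 2: singular but integrable
```

The approximation matches the closed form in the smooth case \(\delta =1/2\) and stays finite, approaching the closed form from below, in the singular case \(\delta =3/2\).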

We obtain the following uniform bounds for \(u^\varepsilon \) and \(v^\varepsilon \) in Sobolev–Slobodeckij spaces.

Lemma 22

(Fractional time regularity) Let \(\alpha <1/2\). There exists a constant \(C(u^0,T)>0\) such that, for \(p:=(2d+4)/d >2\),

$$\begin{aligned} {{\mathbb {E}}}\Vert u^\varepsilon \Vert _{W^{\alpha ,p}(0,T;D(L)')}^p&\le C(u^0,T), \nonumber \\ \varepsilon ^p{{\mathbb {E}}}\Vert L^*LR_\varepsilon (v^\varepsilon )\Vert _{W^{\alpha ,p}(0,T;D(L)')}^p + {{\mathbb {E}}}\Vert v^\varepsilon \Vert _{W^{\alpha ,p}(0,T;D(L)')}^p&\le C(u^0,T). \end{aligned}$$
(38)

Since \(p>2\), we can choose \(\alpha <1/2\) such that \(\alpha p>1\). Then the continuous embedding \(W^{\alpha ,p}(0,T)\hookrightarrow C^{0,\beta }([0,T])\) for \(\beta =\alpha -1/p>0\) implies that

$$\begin{aligned} {{\mathbb {E}}}\Vert u^\varepsilon \Vert _{C^{0,\beta }([0,T];D(L)')}^p\le C(u^0,T). \end{aligned}$$
(39)

Proof

First, we derive the \(W^{\alpha ,p}\) estimate for \(v^\varepsilon \); the estimate for \(u^\varepsilon \) then follows from the definition \(v^\varepsilon = u^\varepsilon + \varepsilon L^*LR_\varepsilon (v^\varepsilon )\) together with inequality (15) and the Lipschitz continuity of \(R_\varepsilon \) (Lemma 11). Equation (17) reads in terms of \(u^\varepsilon \) as

$$\begin{aligned} \textrm{d}v^\varepsilon _i = {\text {div}}\bigg (\sum _{j=1}^n A_{ij}(u^\varepsilon )\nabla u^\varepsilon _j\bigg )\textrm{d}t + \sum _{j=1}^n\sigma _{ij}(u^\varepsilon )\textrm{d}W_j, \quad i=1,\ldots ,n. \end{aligned}$$

We know from (33) that \({{\mathbb {E}}}\Vert v^\varepsilon \Vert _{L^\infty (0,T;D(L)')}^p\) is bounded. Thus, to prove the bound for the second term in (38), it remains to estimate the following seminorm:

$$\begin{aligned} {{\mathbb {E}}}|v_i^\varepsilon |_{W^{\alpha ,p}(0,T;D(L)')}^p&= {{\mathbb {E}}}\int _0^T\int _0^T\frac{\Vert v_i^\varepsilon (t)-v_i^\varepsilon (s)\Vert _{D(L)'}^p}{ |t-s|^{1+\alpha p}}\textrm{d}t\textrm{d}s \\&\le C{{\mathbb {E}}}\int _0^T\int _0^T|t-s|^{-1-\alpha p}\bigg \Vert \int _{s\wedge t}^{t\vee s} {\text {div}}\sum _{j=1}^n A_{ij}(u^\varepsilon (r))\nabla u_j^\varepsilon (r)\textrm{d}r\bigg \Vert _{D(L)'}^p \textrm{d}t\textrm{d}s \\&\quad + C{{\mathbb {E}}}\int _0^T\int _0^T|t-s|^{-1-\alpha p} \bigg \Vert \int _{s\wedge t}^{t\vee s} \sum _{j=1}^n\sigma _{ij}(u^\varepsilon (r))\textrm{d}W_j(r)\bigg \Vert _{D(L)'}^p\textrm{d}t\textrm{d}s \\&=: C(J_1 + J_2). \end{aligned}$$

We need some preparations before we can estimate \(J_1\). We observe that

$$\begin{aligned} \bigg \Vert \sum _{j=1}^n A_{ij}(u^\varepsilon )\nabla u_j^\varepsilon \bigg \Vert _{L^1({\mathcal {O}})}&= \bigg \Vert \bigg (a_{i0}+2a_{ii}u_i^\varepsilon +\sum _{j\ne i}a_{ij}u_j^\varepsilon \bigg )\nabla u_i^\varepsilon + \sum _{j\ne i}a_{ij}u_i^\varepsilon \nabla u_j^\varepsilon \bigg \Vert _{L^1({\mathcal {O}})} \\&\le C\Vert \nabla u_i^\varepsilon \Vert _{L^1({\mathcal {O}})} + C\Vert u^\varepsilon \Vert _{L^2({\mathcal {O}})} \Vert \nabla u^\varepsilon \Vert _{L^2({\mathcal {O}})}. \end{aligned}$$
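The regrouping of the flux can be checked pointwise: with \(A\) given by (3), one has \(\sum _jA_{ij}(u)\nabla u_j=(a_{i0}+2a_{ii}u_i+\sum _{j\ne i}a_{ij}u_j)\nabla u_i+\sum _{j\ne i}a_{ij}u_i\nabla u_j\). A quick numerical verification with random data (illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 4, 2
u = rng.uniform(0.1, 3.0, n)       # densities u_i
g = rng.normal(size=(n, d))        # stand-ins for the gradients grad u_j
a0 = rng.uniform(0.0, 2.0, n)      # a_{i0}
a = rng.uniform(0.0, 2.0, (n, n))  # a_{ij}
# A_ij(u) = delta_ij (a_{i0} + sum_k a_{ik} u_k) + a_{ij} u_i, cf. (3)
A = np.diag(a0 + a @ u) + a * u[:, None]
flux = A @ g                       # row i: sum_j A_ij grad u_j
for i in range(n):
    coeff = a0[i] + 2 * a[i, i] * u[i] + sum(a[i, j] * u[j] for j in range(n) if j != i)
    regrouped = coeff * g[i] + sum(a[i, j] * u[i] * g[j] for j in range(n) if j != i)
    assert np.allclose(flux[i], regrouped)
print("flux decomposition verified")
```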

It follows from the embedding \(L^1({\mathcal {O}})\hookrightarrow D(L)'\) that

$$\begin{aligned} J_1&\le {{\mathbb {E}}}\int _0^T\int _0^T|t-s|^{-1-\alpha p}\bigg (\int _{s\wedge t}^{t\vee s} \bigg \Vert {\text {div}}\sum _{j=1}^n A_{ij}(u^\varepsilon (r))\nabla u_j^\varepsilon (r)\bigg \Vert _{D(L)'}\textrm{d}r \bigg )^p\textrm{d}t\textrm{d}s \\&\le C{{\mathbb {E}}}\int _0^T\int _0^T|t-s|^{-1-\alpha p}\bigg (\int _{s\wedge t}^{t\vee s} \bigg \Vert \sum _{j=1}^nA_{ij}(u^\varepsilon (r))\nabla u_j^\varepsilon (r)\bigg \Vert _{L^1({\mathcal {O}})}\textrm{d}r \bigg )^p\textrm{d}t\textrm{d}s \\&\le C{{\mathbb {E}}}\int _0^T\int _0^T|t-s|^{-1-\alpha p}\bigg (\int _{s\wedge t}^{t\vee s} \Vert \nabla u^\varepsilon (r)\Vert _{L^2({\mathcal {O}})}\textrm{d}r\bigg )^p \textrm{d}t\textrm{d}s \\&\quad \; + C{{\mathbb {E}}}\int _0^T\int _0^T|t-s|^{-1-\alpha p} \bigg (\int _{s\wedge t}^{t\vee s} \Vert u^\varepsilon (r)\Vert _{L^2({\mathcal {O}})}\Vert \nabla u^\varepsilon (r)\Vert _{L^2({\mathcal {O}})}\textrm{d}r\bigg )^p \textrm{d}t\textrm{d}s \\&=: J_{11} + J_{12}. \end{aligned}$$

We use Hölder’s inequality and fix \(p=(2d+4)/d\) to obtain

$$\begin{aligned} J_{11}\le C{{\mathbb {E}}}\int _0^T\int _0^T|t-s|^{-1-\alpha p}|t-s|^{p/2} \bigg (\int _{s\wedge t}^{t\vee s}\Vert \nabla u^\varepsilon (r)\Vert _{L^2({\mathcal {O}})}^2\textrm{d}r\bigg )^{p/2} \textrm{d}t\textrm{d}s. \end{aligned}$$

In view of estimates (31) and (37), the right-hand side is finite if \(1+\alpha p-p/2<2\) or, equivalently, \(\alpha <(d+1)/(d+2)\), and this holds true since \(\alpha <1/2\). Applying Hölder’s inequality again, we have

$$\begin{aligned} J_{12}&\le C{{\mathbb {E}}}\int _0^T\int _0^T|t-s|^{-1-\alpha p}\bigg ( \int _{s\wedge t}^{t\vee s}\Vert u^\varepsilon (r)\Vert _{L^2({\mathcal {O}})}^2\textrm{d}r\bigg )^{p/2} \bigg (\int _{s\wedge t}^{t\vee s}\Vert \nabla u^\varepsilon (r)\Vert _{L^2({\mathcal {O}})}^2\textrm{d}r\bigg )^{p/2} \textrm{d}t \textrm{d}s \\&\le C{{\mathbb {E}}}\int _0^T\int _0^T|t-s|^{-1-\alpha p}|t-s|^{p/(d+2)}\bigg ( \int _{s\wedge t}^{t\vee s}\Vert u^\varepsilon (r)\Vert _{L^2({\mathcal {O}})}^{(2d+4)/d}\textrm{d}r \bigg )^{pd/(2d+4)} \\&\quad \; \times \bigg (\int _{s\wedge t}^{t\vee s}\Vert \nabla u^\varepsilon (r)\Vert _{L^2({\mathcal {O}})}^2 \textrm{d}r\bigg )^{p/2}\textrm{d}t\textrm{d}s \\&\le C\bigg \{{{\mathbb {E}}}\bigg (\int _0^T\int _0^T|t-s|^{-1-\alpha p+p/(d+2)} \bigg (\int _{s\wedge t}^{t\vee s}\Vert u^\varepsilon (r)\Vert _{L^2({\mathcal {O}})}^{(2d+4)/d}\textrm{d}r\bigg ) \textrm{d}t\textrm{d}s\bigg )^2\bigg \}^{1/2}\\&\quad \; \times \bigg \{{{\mathbb {E}}}\bigg (\int _0^T\Vert \nabla u^\varepsilon (r) \Vert _{L^2({\mathcal {O}})}^2\textrm{d}r\bigg )^{p}\bigg \}^{1/2}. \end{aligned}$$

Because of estimates (31), (36), and (37), the right-hand side is finite if \(1+\alpha p-p/(d+2)<2\), which is equivalent to \(\alpha <1/2\).
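The two exponent conditions can be double-checked with exact rational arithmetic: with \(p=(2d+4)/d\), the condition for \(J_{11}\) reads \(\alpha <(1+p/2)/p=(d+1)/(d+2)\), and the condition for \(J_{12}\) reads \(\alpha <(1+p/(d+2))/p=1/2\). A short check:

```python
from fractions import Fraction

for d in range(1, 11):
    p = Fraction(2 * d + 4, d)
    # J_11: 1 + alpha*p - p/2 < 2  <=>  alpha < (1 + p/2)/p
    assert (1 + p / 2) / p == Fraction(d + 1, d + 2)
    # J_12: 1 + alpha*p - p/(d+2) < 2  <=>  alpha < (1 + p/(d+2))/p
    assert (1 + p / (d + 2)) / p == Fraction(1, 2)
print("exponent thresholds verified")
```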

To estimate \(J_2\), we use the embedding \(L^2({\mathcal {O}})\hookrightarrow D(L)'\), the Burkholder–Davis–Gundy inequality, the linear growth of \(\sigma \) from Assumption (A4), and the Hölder inequality:

$$\begin{aligned} J_2&\le C\int _0^T\int _0^T|t-s|^{-1-\alpha p} {{\mathbb {E}}}\bigg \Vert \int _{s\wedge t}^{t\vee s} \sum _{j=1}^n\sigma _{ij}(u^\varepsilon (r))\textrm{d}W_j(r)\bigg \Vert _{L^2({\mathcal {O}})}^p\textrm{d}t\textrm{d}s \\&\le C\int _0^T\int _0^T|t-s|^{-1-\alpha p} {{\mathbb {E}}}\bigg (\int _{s\wedge t}^{t\vee s}\sum _{k=1}^\infty \sum _{j=1}^n \Vert \sigma _{ij}(u^\varepsilon (r))e_k\Vert _{L^2({\mathcal {O}})}^2\textrm{d}r \bigg )^{p/2}\textrm{d}t \textrm{d}s \\&\le C\int _0^T\int _0^T|t-s|^{-1-\alpha p+(p-2)/2}\int _{s\wedge t}^{t\vee s}{{\mathbb {E}}}\sum _{j=1}^n\big (1+\Vert u_j^\varepsilon (r)\Vert _{L^2({\mathcal {O}})}^p\big )\textrm{d}r\textrm{d}t\textrm{d}s. \end{aligned}$$

By (36) and (37), the right-hand side is finite if \(1+\alpha p-(p-2)/2<2\). Since \((p-2)/2=2/d\), this condition is equivalent to \(\alpha <1/2\) and is therefore satisfied. We conclude that \((v^\varepsilon )\) is bounded in \(L^p(\Omega ;W^{\alpha ,p}(0,T;D(L)'))\) with \(p=(2d+4)/d\).

Next, we derive the uniform bounds for \(u^\varepsilon \). By definition of \(v^\varepsilon \) and the \(W^{\alpha ,p}\) seminorm,

$$\begin{aligned} {{\mathbb {E}}}|u^\varepsilon |_{W^{\alpha ,p}(0,T;D(L)')}^p&= {{\mathbb {E}}}|v^\varepsilon - \varepsilon L^*LR_\varepsilon (v^\varepsilon )|_{W^{\alpha ,p}(0,T;D(L)')}^p \\&\le C{{\mathbb {E}}}\int _0^T\int _0^T\frac{\Vert v^\varepsilon (t)-v^\varepsilon (s)\Vert _{D(L)'}^p}{|t-s|^{1+\alpha p}} \textrm{d}t\textrm{d}s \\&\quad \; + C{{\mathbb {E}}}\int _0^T\int _0^T \frac{\varepsilon ^p\Vert L^*LR_\varepsilon (v^\varepsilon (t)) -L^*LR_\varepsilon (v^\varepsilon (s))\Vert _{D(L)'}^p}{|t-s|^{1+\alpha p}}\textrm{d}t\textrm{d}s. \end{aligned}$$

It follows from (15) and the Lipschitz continuity of \(R_\varepsilon \) (Lemma 11) that

$$\begin{aligned} \Vert L^*LR_\varepsilon (v^\varepsilon (t))-L^*LR_\varepsilon (v^\varepsilon (s))\Vert _{D(L)'}&\le \Vert R_\varepsilon (v^\varepsilon (t))-R_\varepsilon (v^\varepsilon (s))\Vert _{L^2({\mathcal {O}})} \\&\le \varepsilon ^{-1}C\Vert v^\varepsilon (t)-v^\varepsilon (s)\Vert _{D(L)'}. \end{aligned}$$

Then we find that

$$\begin{aligned} {{\mathbb {E}}}|u^\varepsilon |_{W^{\alpha ,p}(0,T;D(L)')}^p&\le C{{\mathbb {E}}}\int _0^T\int _0^T\frac{\Vert v^\varepsilon (t)-v^\varepsilon (s)\Vert _{D(L)'}^p}{|t-s|^{1+\alpha p}} \textrm{d}t\textrm{d}s = C{{\mathbb {E}}}|v^\varepsilon |_{W^{\alpha ,p}(0,T;D(L)')}^p, \end{aligned}$$

which finishes the proof. \(\square \)

5.2 Tightness of the laws of \((u^\varepsilon )\)

We show that the laws of \((u^\varepsilon )\) are tight in a certain sub-Polish space. For this, we introduce the following spaces:

  • \(C^0([0,T];D(L)')\) is the space of continuous functions \(u:[0,T]\rightarrow D(L)'\) with the topology \({\mathbb {T}}_1\) induced by the norm \(\Vert u\Vert _{C^0([0,T];D(L)')} =\sup _{0\le t\le T}\Vert u(t)\Vert _{D(L)'}\);

  • \(L_w^2(0,T;H^1({\mathcal {O}}))\) is the space \(L^2(0,T;H^1({\mathcal {O}}))\) with the weak topology \({\mathbb {T}}_2\).

We define the space

$$\begin{aligned} \widetilde{Z}_T:= C^0([0,T];D(L)')\cap L_w^2(0,T;H^1({\mathcal {O}})), \end{aligned}$$

endowed with the topology \(\widetilde{{\mathbb {T}}}\) that is the maximum of the topologies \({\mathbb {T}}_1\) and \({\mathbb {T}}_2\). The space \(\widetilde{Z}_T\) is a sub-Polish space, since \(C^0([0,T];D(L)')\) is separable and metrizable and

$$\begin{aligned} f_m(u) = \int _0^T(u(t),v_m(t))_{H^1({\mathcal {O}})}\textrm{d}t, \quad u\in L_w^2(0,T;H^1({\mathcal {O}})),\ m\in {{\mathbb {N}}}, \end{aligned}$$

where \((v_m)_m\) is a dense subset of \(L^2(0,T;H^1({\mathcal {O}}))\), defines a countable family \((f_m)\) of point-separating functionals on \(L_w^2(0,T;H^1({\mathcal {O}}))\). In the following, we choose a number \(s^*\ge 1\) such that

$$\begin{aligned} s^*< \frac{2d}{d-2}\quad \text{ if } d\ge 3, \quad s^*<\infty \quad \text{ if } d=2, \quad s^* \le \infty \quad \text{ if } d=1. \end{aligned}$$
(40)

Then the embedding \(H^1({\mathcal {O}})\hookrightarrow L^{s^*}({\mathcal {O}})\) is compact.

Lemma 23

The set of laws of \((u^\varepsilon )\) is tight in

$$\begin{aligned} Z_T = \widetilde{Z}_T\cap L^2(0,T;L^{s^*}({\mathcal {O}})) \end{aligned}$$

with the topology \({\mathbb {T}}\) that is the maximum of \(\widetilde{{\mathbb {T}}}\) and the topology induced by the \(L^2(0,T;\) \(L^{s^*}({\mathcal {O}}))\) norm, where \(s^*\) is given by (40).

Proof

We apply Chebyshev’s inequality for the first moment and use estimate (39) with \(\beta =\alpha -1/p>0\) to obtain, for any \(\eta >0\) and \(\delta >0\),

$$\begin{aligned} \sup _{\varepsilon>0}\,&{{\mathbb {P}}}\bigg (\sup _{\begin{array}{c} s,t\in [0,T], \\ |t-s|\le \delta \end{array}} \Vert u^\varepsilon (t)-u^\varepsilon (s)\Vert _{D(L)'}>\eta \bigg )\\&\le \sup _{\varepsilon>0}\frac{1}{\eta }{{\mathbb {E}}}\bigg ( \sup _{\begin{array}{c} s,t\in [0,T], \\ |t-s|\le \delta \end{array}} \Vert u^\varepsilon (t)-u^\varepsilon (s)\Vert _{D(L)'}\bigg ) \\&\le \frac{\delta ^\beta }{\eta }\sup _{\varepsilon>0}{{\mathbb {E}}}\bigg (\sup _{\begin{array}{c} s,t\in [0,T], \\ |t-s|\le \delta \end{array}}\frac{\Vert u^\varepsilon (t)-u^\varepsilon (s)\Vert _{D(L)'}}{|t-s|^\beta }\bigg ) \le \frac{\delta ^\beta }{\eta }\sup _{\varepsilon >0}{{\mathbb {E}}}\Vert u^\varepsilon \Vert _{C^{0,\beta }([0,T];D(L)')} \le C\frac{\delta ^\beta }{\eta }. \end{aligned}$$

This means that for all \(\theta >0\) and all \(\eta >0\), there exists \(\delta >0\) such that

$$\begin{aligned} \sup _{\varepsilon>0}\,{{\mathbb {P}}}\bigg (\sup _{s,t\in [0,T], \, |t-s|\le \delta } \Vert u^\varepsilon (t)-u^\varepsilon (s)\Vert _{D(L)'}>\eta \bigg ) \le \theta , \end{aligned}$$

which is equivalent to the Aldous condition [5, Section 2.2]. Applying [38, Lemma 5, Theorem 3] with the spaces \(X=H^1({\mathcal {O}})\) and \(B=D(L)'\), we conclude that \((u^\varepsilon )\) is precompact in \(C^0([0,T];D(L)')\). Then, proceeding as in the proof of the basic criterion for tightness [34, Chapter II, Section 2.1], we see that the set of laws of \((u^\varepsilon )\) is tight in \(C^0([0,T];D(L)')\).

Next, by Chebyshev’s inequality again and estimate (31) with \(p=2\), for all \(K>0\),

$$\begin{aligned} {{\mathbb {P}}}\big (\Vert u^\varepsilon \Vert _{L^2(0,T;H^1({\mathcal {O}}))} > K\big ) \le \frac{1}{K^2}{{\mathbb {E}}}\Vert u^\varepsilon \Vert _{L^2(0,T;H^1({\mathcal {O}}))}^2 \le \frac{C}{K^2}. \end{aligned}$$

This implies that for any \(\delta >0\), there exists \(K>0\) such that \({{\mathbb {P}}}(\Vert u^\varepsilon \Vert _{L^2(0,T;H^1({\mathcal {O}}))}\le K)\ge 1-\delta \). Since closed balls with respect to the norm of \(L^2(0,T;H^1({\mathcal {O}}))\) are weakly compact, we infer that the set of laws of \((u^\varepsilon )\) is tight in \(L^2_w(0,T;H^1({\mathcal {O}}))\).

The tightness in \(L^2(0,T;L^{s^*}({\mathcal {O}}))\) follows from Lemma 37 in Appendix B with \(p=q=2\) and \(r=2+4/d\). \(\square \)

Lemma 24

The set of laws of \((\sqrt{\varepsilon }L^*LR_\varepsilon (v^\varepsilon ))\) is tight in

$$\begin{aligned} Y_T:= L_w^2(0,T;D(L)')\cap L^\infty _{w*}(0,T;D(L)') \end{aligned}$$

with the associated topology \({\mathbb {T}}_Y\).

Proof

We apply the Chebyshev inequality and use the inequality \(\Vert L^*LR_\varepsilon (v^\varepsilon )\Vert _{D(L)'}\le C\Vert LR_\varepsilon (v^\varepsilon )\Vert _{L^2({\mathcal {O}})}\) and estimate (27):

$$\begin{aligned} {{\mathbb {P}}}\big (\sqrt{\varepsilon }\Vert L^*LR_\varepsilon (v^\varepsilon )\Vert _{L^2(0,T;D(L)')}>K\big ) \le \frac{\varepsilon }{K^2}{{\mathbb {E}}}\Vert L^*LR_\varepsilon (v^\varepsilon )\Vert _{L^2(0,T;D(L)')}^2 \le \frac{C}{K^2} \end{aligned}$$

for any \(K>0\). Since closed balls in \(L^2(0,T;D(L)')\) are weakly compact, the set of laws of \((\sqrt{\varepsilon }L^*LR_\varepsilon (v^\varepsilon ))\) is tight in \(L_w^2(0,T;D(L)')\). The second claim follows from an analogous argument. \(\square \)

5.3 Convergence of \((u^\varepsilon )\)

Let \(\text{ P }(X)\) be the space of probability measures on X. We consider the space \(Z_T\times Y_T\times C^0([0,T];U_0)\), equipped with the probability measure \(\mu ^\varepsilon :=\mu _u^\varepsilon \times \mu _w^\varepsilon \times \mu _W^\varepsilon \), where

$$\begin{aligned} \mu _u^\varepsilon (\cdot )&= {{\mathbb {P}}}(u^\varepsilon \in \cdot )\in \text{ P }(Z_T), \\ \mu _w^\varepsilon (\cdot )&= {{\mathbb {P}}}(\sqrt{\varepsilon } L^*LR_\varepsilon (v^\varepsilon )\in \cdot )\in \text{ P }(Y_T), \\ \mu _W^\varepsilon (\cdot )&= {{\mathbb {P}}}(W\in \cdot ) \in \text{ P }(C^0([0,T];U_0)), \end{aligned}$$

recalling the choice (40) of \(s^*\). The set of measures \((\mu ^\varepsilon )\) is tight, since the set of laws of \((u^\varepsilon )\) and \((\sqrt{\varepsilon }L^*LR_\varepsilon (v^\varepsilon ))\) are tight in \((Z_T,{\mathbb {T}})\) and \((Y_T,{\mathbb {T}}_Y)\), respectively. Moreover, \((\mu _W^\varepsilon )\) consists of one element only and is consequently weakly compact in \(C^0([0,T];U_0)\). By Prokhorov’s theorem, \((\mu _W^\varepsilon )\) is tight. Hence, \(Z_T\times Y_T\times C^0([0,T];U_0)\) satisfies the assumptions of the Skorokhod–Jakubowski theorem [6, Theorem C.1]. We infer that there exists a subsequence of \((u^\varepsilon ,\sqrt{\varepsilon }L^*LR_\varepsilon (v^\varepsilon ))\), which is not relabeled, a probability space \(({\widetilde{\Omega }},\widetilde{{\mathcal {F}}},{\widetilde{{{\mathbb {P}}}}})\) and, on this space, \((Z_T\times Y_T\times C^0([0,T];U_0))\)-valued random variables \((\widetilde{u},\widetilde{w},\widetilde{W})\) and \((\widetilde{u}^\varepsilon , \widetilde{w}^\varepsilon ,\widetilde{W}^\varepsilon )\) such that \((\widetilde{u}^\varepsilon , \widetilde{w}^\varepsilon ,\widetilde{W}^\varepsilon )\) has the same law as \((u^\varepsilon ,\sqrt{\varepsilon }L^*LR_\varepsilon (v^\varepsilon ),W)\) on \({{\mathcal {B}}}(Z_T\times Y_T\times C^0([0,T];U_0))\) and, as \(\varepsilon \rightarrow 0\),

$$\begin{aligned} (\widetilde{u}^\varepsilon ,\widetilde{w}^\varepsilon ,\widetilde{W}^\varepsilon )\rightarrow (\widetilde{u},\widetilde{w},\widetilde{W})\quad \text{ in } Z_T\times Y_T\times C^0([0,T];U_0)\quad {\widetilde{{{\mathbb {P}}}}}\text{-a.s. } \end{aligned}$$

By the definition of \(Z_T\) and \(Y_T\), this convergence means \({\widetilde{{{\mathbb {P}}}}}\)-a.s.,

$$\begin{aligned} \widetilde{u}^\varepsilon \rightarrow \widetilde{u}&\quad \text{ strongly } \text{ in } C^0([0,T];D(L)'), \\ \widetilde{u}^\varepsilon \rightharpoonup \widetilde{u}&\quad \text{ weakly } \text{ in } L^2(0,T;H^1({\mathcal {O}})), \\ \widetilde{u}^\varepsilon \rightarrow \widetilde{u}&\quad \text{ strongly } \text{ in } L^2(0,T;L^{s^*}({\mathcal {O}})), \\ \widetilde{w}^\varepsilon \rightharpoonup \widetilde{w}&\quad \text{ weakly } \text{ in } L^2(0,T;D(L)'), \\ \widetilde{w}^\varepsilon \rightharpoonup \widetilde{w}&\quad \text{ weakly* } \text{ in } L^\infty (0,T;D(L)'), \\ \widetilde{W}^\varepsilon \rightarrow \widetilde{W}&\quad \text{ strongly } \text{ in } C^0([0,T];U_0). \end{aligned}$$

We derive some regularity properties for the limit \(\widetilde{u}\). We note that \(\widetilde{u}\) is a \(Z_T\)-Borel random variable, since \({{\mathcal {B}}}(Z_T\times Y_T\times C^0([0,T];U_0))\) is a subset of \({{\mathcal {B}}}(Z_T)\times {{\mathcal {B}}}(Y_T)\times {{\mathcal {B}}}(C^0([0,T];U_0))\). We deduce from estimates (30) and (31) and the fact that \(u^\varepsilon \) and \(\widetilde{u}^\varepsilon \) have the same law that

$$\begin{aligned} \sup _{\varepsilon>0}{\widetilde{{{\mathbb {E}}}}}\Vert \widetilde{u}^\varepsilon \Vert _{L^2(0,T;H^1({\mathcal {O}}))}^p + \sup _{\varepsilon >0}{\widetilde{{{\mathbb {E}}}}}\Vert \widetilde{u}^\varepsilon \Vert _{L^\infty (0,T;D(L)')}^p < \infty . \end{aligned}$$

We infer the existence of a further subsequence of \((\widetilde{u}^\varepsilon )\) (not relabeled) that is weakly converging in \(L^p({\widetilde{\Omega }};L^2(0,T;H^1({\mathcal {O}})))\) and weakly* converging in \(L^p({\widetilde{\Omega }};C^0([0,T];D(L)'))\) as \(\varepsilon \rightarrow 0\). Because \(\widetilde{u}^\varepsilon \rightarrow \widetilde{u}\) in \(Z_T\) \({\widetilde{{{\mathbb {P}}}}}\)-a.s., we conclude that the limit function satisfies

$$\begin{aligned} {\widetilde{{{\mathbb {E}}}}}\Vert \widetilde{u}\Vert _{L^2(0,T;H^1({\mathcal {O}}))}^p + {\widetilde{{{\mathbb {E}}}}}\Vert \widetilde{u}\Vert _{L^\infty (0,T;D(L)')}^p < \infty . \end{aligned}$$

Let \({\widetilde{{{\mathbb {F}}}}}\) and \({\widetilde{{{\mathbb {F}}}}}^\varepsilon \) be the filtrations generated by \((\widetilde{u},\widetilde{w},\widetilde{W})\) and \((\widetilde{u}^\varepsilon ,\widetilde{w}^\varepsilon ,\widetilde{W})\), respectively. By following the arguments of the proof of [7, Proposition B4], we can verify that these new random variables indeed induce stochastic processes. The progressive measurability of \(\widetilde{u}^\varepsilon \) is a consequence of [4, Appendix B]. Set \(\widetilde{W}^{\varepsilon ,k}(t):=(\widetilde{W}^\varepsilon (t),e_k)_U\). We claim that the \(\widetilde{W}^{\varepsilon ,k}\) for \(k\in {{\mathbb {N}}}\) are independent, standard \(\widetilde{{\mathcal {F}}}_t\)-Wiener processes. The adaptedness is a direct consequence of the definition; the independence of the \(\widetilde{W}^{\varepsilon ,k}\) and the independence of the increments \(\widetilde{W}^{\varepsilon ,k}(t)-\widetilde{W}^{\varepsilon ,k}(s)\) from \(\widetilde{{\mathcal {F}}}_s\) are inherited from \((W(t),e_k)_U\). Passing to the limit \(\varepsilon \rightarrow 0\) in the characteristic function, by using dominated convergence, we find that the processes \((\widetilde{W}(t),e_k)_U\) are \(\widetilde{{\mathcal {F}}}_t\)-martingales with the correct marginal distributions. We deduce from Lévy’s characterization theorem that \(\widetilde{W}\) is indeed a cylindrical Wiener process.

By definition, \(u_i^\varepsilon =u_i(R_\varepsilon (v^\varepsilon ))=\exp (R_\varepsilon (v^\varepsilon ))\) is positive in \(Q_T\) a.s. We claim that \(\widetilde{u}_i\) is nonnegative a.e. in \(Q_T\) a.s. as well.

Lemma 25

(Nonnegativity) It holds that \(\widetilde{u}_i\ge 0\) a.e. in \(Q_T\) \({\widetilde{{{\mathbb {P}}}}}\)-a.s. for all \(i=1,\ldots ,n\).

Proof

Let \(i\in \{1,\ldots ,n\}\). Since \(u_i^\varepsilon > 0\) in \(Q_T\) a.s., we have \({{\mathbb {E}}}\Vert (u_i^\varepsilon )^-\Vert _{L^2(0,T;L^2({\mathcal {O}}))}=0\), where \(z^-=\min \{0,z\}\). The function \(u_i^\varepsilon \) is \(Z_T\)-Borel measurable, and so is its negative part. Therefore, using the equivalence of the laws of \(u_i^\varepsilon \) and \(\widetilde{u}_i^\varepsilon \) in \(Z_T\) and writing \(\mu _i^\varepsilon \) and \({\widetilde{\mu }}_i^\varepsilon \) for the laws of \(u_i^\varepsilon \) and \(\widetilde{u}_i^\varepsilon \), respectively, we obtain

$$\begin{aligned} {\widetilde{{{\mathbb {E}}}}}\Vert (\widetilde{u}_i^\varepsilon )^-\Vert _{L^2(Q_T)}&= \int _{L^2(Q_T)}\Vert y^-\Vert _{L^2(Q_T)}\textrm{d}{\widetilde{\mu }}_i^\varepsilon (y)\\&= \int _{L^2(Q_T)}\Vert y^-\Vert _{L^2(Q_T)}\textrm{d}\mu _i^\varepsilon (y) = {{\mathbb {E}}}\Vert (u_i^\varepsilon )^-\Vert _{L^2(Q_T)} = 0. \end{aligned}$$

This shows that \(\widetilde{u}_i^\varepsilon \ge 0\) a.e. in \(Q_T\) \({\widetilde{{{\mathbb {P}}}}}\)-a.s. The convergence (up to a subsequence) \(\widetilde{u}^\varepsilon \rightarrow \widetilde{u}\) a.e. in \(Q_T\) \({\widetilde{{{\mathbb {P}}}}}\)-a.s. then implies that \(\widetilde{u}_i\ge 0\) a.e. in \(Q_T\) \({\widetilde{{{\mathbb {P}}}}}\)-a.s. \(\square \)

The following lemma is needed to verify that \((\widetilde{u},\widetilde{W})\) is a martingale solution to (1)–(2).

Lemma 26

It holds for all \(t\in [0,T]\), \(i=1,\ldots ,n\), \(\phi _1\in L^2({\mathcal {O}})\), and \(\phi _2\in D(L)\) that

$$\begin{aligned}&\lim _{\varepsilon \rightarrow 0}{{\widetilde{{{\mathbb {E}}}}}}\int _0^T\big ({\widetilde{u}}_{i}^\varepsilon (t) -{\widetilde{u}}_{i}(t), \phi _1\big )_{L^2({\mathcal {O}})}\textrm{d}t = 0, \end{aligned}$$
(41)
$$\begin{aligned}&\lim _{\varepsilon \rightarrow 0}{{\widetilde{{{\mathbb {E}}}}}}\big \langle {\widetilde{u}}_{i}^\varepsilon (0) -{\widetilde{u}}_{i}(0),\phi _2\big \rangle _{D(L)',D(L)} = 0, \end{aligned}$$
(42)
$$\begin{aligned}&\lim _{\varepsilon \rightarrow 0}{{\widetilde{{{\mathbb {E}}}}}}\int _0^T\big \langle \sqrt{\varepsilon }\widetilde{w}_i^\varepsilon (t), \phi _2\big \rangle _{D(L)',D(L)}\textrm{d}t = 0, \end{aligned}$$
(43)
$$\begin{aligned}&\lim _{\varepsilon \rightarrow 0}{{\widetilde{{{\mathbb {E}}}}}}\langle \sqrt{\varepsilon }\widetilde{w}_i^\varepsilon (0), \phi _2\rangle _{D(L)',D(L)} = 0, \end{aligned}$$
(44)
$$\begin{aligned}&\lim _{\varepsilon \rightarrow 0}{{\widetilde{{{\mathbb {E}}}}}}\int _0^T\bigg |\sum _{j=1}^n\int _0^t\int _{\mathcal {O}}\big ( A_{ij}({\widetilde{u}}^\varepsilon (s))\nabla {\widetilde{u}}_j^\varepsilon (s) - A_{ij}({\widetilde{u}}(s))\nabla {\widetilde{u}}_j(s)\big )\cdot \nabla \phi _2 \textrm{d}x\textrm{d}s\bigg |\textrm{d}t = 0, \end{aligned}$$
(45)
$$\begin{aligned}&\lim _{\varepsilon \rightarrow 0}{{\widetilde{{{\mathbb {E}}}}}}\int _0^T\bigg |\sum _{j=1}^n\int _0^t \Big (\sigma _{ij}({\widetilde{u}}^\varepsilon (s))\textrm{d}{\widetilde{W}}_j^\varepsilon (s) -\sigma _{ij}({\widetilde{u}}(s))\textrm{d}{\widetilde{W}}_j(s),\phi _1\Big )_{L^2({\mathcal {O}})}\bigg |^2 \textrm{d}t = 0. \end{aligned}$$
(46)

Proof

The proof is a combination of the uniform bounds and Vitali’s convergence theorem. Convergences (41) and (42) have been shown in the proof of [18, Lemma 16], and (43) is a direct consequence of (38) and

$$\begin{aligned} {\widetilde{{{\mathbb {E}}}}}\bigg (\int _0^T\langle \sqrt{\varepsilon }\widetilde{w}_i^\varepsilon (t),\phi _2 \rangle _{D(L)',D(L)}\textrm{d}t\bigg )^p &\le \varepsilon ^{p/2}{\widetilde{{{\mathbb {E}}}}}\bigg (\int _0^T\Vert \widetilde{w}_i^\varepsilon (t)\Vert _{D(L)'} \Vert \phi _2\Vert _{D(L)}\textrm{d}t\bigg )^p \le \varepsilon ^{p/2}C. \end{aligned}$$

Convergence (44) follows from \(\widetilde{w}_i^\varepsilon \rightharpoonup \widetilde{w}_i\) weakly* in \(L^\infty (0,T;D(L)')\). We establish (45):

$$\begin{aligned} \bigg |\sum _{j=1}^n&\int _0^t\int _{\mathcal {O}}\big ( A_{ij}({\widetilde{u}}^\varepsilon (s))\nabla {\widetilde{u}}_j^\varepsilon (s) - A_{ij}({\widetilde{u}}(s))\nabla {\widetilde{u}}_j(s)\big )\cdot \nabla \phi _2\textrm{d}x\textrm{d}s\bigg | \\&\le \sum _{j=1}^n\int _0^T\Vert A_{ij}({\widetilde{u}}^\varepsilon (s))-A_{ij}({\widetilde{u}}(s))\Vert _{L^2({\mathcal {O}})} \Vert \nabla {\widetilde{u}}_j^\varepsilon (s)\Vert _{L^2({\mathcal {O}})}\Vert \nabla \phi _2\Vert _{L^\infty ({\mathcal {O}})}\textrm{d}s \\&\quad \;+ \sum _{j=1}^n\bigg |\int _0^t\int _{\mathcal {O}}A_{ij}({\widetilde{u}}(s)) \nabla ({\widetilde{u}}_j^\varepsilon (s) - {\widetilde{u}}_j(s))\cdot \nabla \phi _2 \textrm{d}x\textrm{d}s\bigg | =: I_1^\varepsilon +I_2^\varepsilon . \end{aligned}$$

By the Lipschitz continuity of \(A\), the strong convergence \(\widetilde{u}^\varepsilon \rightarrow \widetilde{u}\) in \(L^2(Q_T)\), and the uniform bound for \(\nabla {\widetilde{u}}^\varepsilon \), we have \(I_1^\varepsilon \rightarrow 0\) as \(\varepsilon \rightarrow 0\) \({\widetilde{{{\mathbb {P}}}}}\)-a.s. At this point, we use the embedding \(D(L)\hookrightarrow W^{1,\infty }({\mathcal {O}})\). The second integral \(I_2^\varepsilon \) also converges to zero, since \(A_{ij}({\widetilde{u}})\nabla \phi _2 \in L^2(0,T;L^2({\mathcal {O}}))\) and \(\nabla {\widetilde{u}}_j^\varepsilon \rightharpoonup \nabla {\widetilde{u}}_j\) weakly in \(L^2(0,T;L^2({\mathcal {O}}))\). This shows that \({{\widetilde{{{\mathbb {P}}}}}}\)-a.s.,

$$\begin{aligned} \lim _{\varepsilon \rightarrow 0}\int _0^T\int _{\mathcal {O}}A_{ij}({\widetilde{u}}^\varepsilon (s)) \nabla {\widetilde{u}}_j^\varepsilon (s)\cdot \nabla \phi _2\textrm{d}x\textrm{d}s = \int _0^T\int _{\mathcal {O}}A_{ij}({\widetilde{u}}(s))\nabla {\widetilde{u}}_j(s) \cdot \nabla \phi _2\textrm{d}x\textrm{d}s. \end{aligned}$$

A straightforward estimation and bound (31) lead to

$$\begin{aligned} {{\widetilde{{{\mathbb {E}}}}}}&\bigg |\int _0^T\int _{\mathcal {O}}A_{ij}({\widetilde{u}}^\varepsilon (s)) \nabla {\widetilde{u}}_j^\varepsilon (s)\cdot \nabla \phi _2\textrm{d}x\textrm{d}s\bigg |^p \\&\le \Vert \nabla \phi _2\Vert _{L^\infty ({\mathcal {O}})}^p{{\widetilde{{{\mathbb {E}}}}}}\bigg (\int _0^T \bigg \Vert \sum _{j=1}^n A_{ij}({\widetilde{u}}^\varepsilon (s))\nabla {\widetilde{u}}_j^\varepsilon (s) \bigg \Vert _{L^1({\mathcal {O}})}\textrm{d}s\bigg )^p \le C. \end{aligned}$$

Hence, Vitali’s convergence theorem gives (45).

It remains to prove (46). By Assumption (A4), \({{\widetilde{{{\mathbb {P}}}}}}\)-a.s.,

$$\begin{aligned} \int _0^T\Vert \sigma _{ij}({\widetilde{u}}^\varepsilon (s))-\sigma _{ij}({\widetilde{u}}(s)) \Vert _{{{\mathcal {L}}}_2(U;L^2({\mathcal {O}}))}^2\textrm{d}s \le C_\sigma \Vert {\widetilde{u}}^\varepsilon -{\widetilde{u}}\Vert _{L^2(0,T;L^2({\mathcal {O}}))}^2\rightarrow 0. \end{aligned}$$

This convergence and \({\widetilde{W}}^\varepsilon \rightarrow {\widetilde{W}}\) in \(C^0([0,T];U_0)\) imply, by [14, Lemma 2.1], that

$$\begin{aligned} \int _0^t\sigma _{ij}({\widetilde{u}}^\varepsilon )\textrm{d}{\widetilde{W}}^\varepsilon \rightarrow \int _0^t\sigma _{ij}({\widetilde{u}})\textrm{d}{\widetilde{W}}\quad \text{ in } L^2(0,T;L^2({\mathcal {O}}))\ {{\widetilde{{{\mathbb {P}}}}}}\text{-a.s. } \end{aligned}$$

By Assumption (A4) again,

$$\begin{aligned} {{\widetilde{{{\mathbb {E}}}}}}&\bigg (\int _0^T\Vert \sigma _{ij}({\widetilde{u}}^\varepsilon (s))-\sigma _{ij} ({\widetilde{u}}(s))\Vert ^2_{{{\mathcal {L}}}_2(U;L^2({\mathcal {O}}))}\textrm{d}s\bigg )^p \\&\le C + C{{\widetilde{{{\mathbb {E}}}}}}\bigg (\int _0^T\big (\Vert {\widetilde{u}}^\varepsilon (s)\Vert _{L^2({\mathcal {O}})}^2 + \Vert {\widetilde{u}}(s)\Vert _{L^2({\mathcal {O}})}^2\big )\textrm{d}s\bigg )^p \le C. \end{aligned}$$

We infer from Vitali’s convergence theorem that

$$\begin{aligned} \lim _{\varepsilon \rightarrow 0}{{\widetilde{{{\mathbb {E}}}}}}\int _0^T\Vert \sigma _{ij}({\widetilde{u}}^\varepsilon (s)) -\sigma _{ij}({\widetilde{u}}(s))\Vert _{{{\mathcal {L}}}_2(U;L^2({\mathcal {O}}))}^2\textrm{d}s = 0. \end{aligned}$$

The estimate

$$\begin{aligned} {{\widetilde{{{\mathbb {E}}}}}}&\bigg |\bigg (\int _0^T\sigma _{ij}({\widetilde{u}}^\varepsilon (s)) \textrm{d}{\widetilde{W}}_j^\varepsilon (s) - \int _0^T\sigma _{ij}({\widetilde{u}}(s)) \textrm{d}{\widetilde{W}}_j(s),\phi _1\bigg )_{L^2({\mathcal {O}})}\bigg |^2 \\&\le C\Vert \phi _1\Vert _{L^2({\mathcal {O}})}^2{{\widetilde{{{\mathbb {E}}}}}}\int _0^T \big (\Vert \sigma _{ij}({\widetilde{u}}^\varepsilon (s))\Vert _{{{\mathcal {L}}}_2(U;L^2({\mathcal {O}}))}^2 + \Vert \sigma _{ij}({\widetilde{u}}(s))\Vert _{{{\mathcal {L}}}_2(U;L^2({\mathcal {O}}))}^2\big )\textrm{d}s \\&\le C\Vert \phi _1\Vert _{L^2({\mathcal {O}})}^2\bigg \{1 + {{\widetilde{{{\mathbb {E}}}}}}\bigg (\int _0^T\big ( \Vert {\widetilde{u}}^\varepsilon (s)\Vert _{L^2({\mathcal {O}})}^2 + \Vert {\widetilde{u}}(s)\Vert _{L^2({\mathcal {O}})}^2\big ) \textrm{d}s\bigg )\bigg \} \le C \end{aligned}$$

for all \(\phi _1\in L^2({\mathcal {O}})\) and the dominated convergence theorem yield (46). \(\square \)

To show that the limit is indeed a solution, we define, for \(t\in [0,T]\), \(i=1,\ldots ,n\), and \(\phi \in D(L)\),

$$\begin{aligned} \Lambda _i^\varepsilon ({\widetilde{u}}^\varepsilon ,{\widetilde{w}}^\varepsilon ,{\widetilde{W}}^\varepsilon ,\phi )(t)&:= \langle {\widetilde{u}}_i^\varepsilon (0),\phi \rangle + \sqrt{\varepsilon }\langle {\widetilde{w}}_i^\varepsilon (0),\phi \rangle \\&\quad \;\; - \sum _{j=1}^n\int _0^t\int _{\mathcal {O}}A_{ij}({\widetilde{u}}^\varepsilon (s)) \nabla {\widetilde{u}}_j^\varepsilon (s)\cdot \nabla \phi \textrm{d}x\textrm{d}s \\&\quad \;\; + \sum _{j=1}^n\bigg (\int _0^t\sigma _{ij}({\widetilde{u}}^\varepsilon (s))\textrm{d}{\widetilde{W}}^\varepsilon _j(s),\phi \bigg )_{L^2({\mathcal {O}})}, \\ \Lambda _i({\widetilde{u}},{\widetilde{w}},{\widetilde{W}},\phi )(t)&:= \langle {\widetilde{u}}_i(0),\phi \rangle - \sum _{j=1}^n\int _0^t\int _{\mathcal {O}}A_{ij}({\widetilde{u}}(s))\nabla {\widetilde{u}}_j(s) \cdot \nabla \phi \textrm{d}x\textrm{d}s \\&\quad \;\; + \sum _{j=1}^n\bigg (\int _0^t\sigma _{ij}({\widetilde{u}}(s))\textrm{d}{\widetilde{W}}_j(s),\phi \bigg )_{L^2({\mathcal {O}})}. \end{aligned}$$

The following corollary is a consequence of the previous lemma.

Corollary 27

It holds for any \(\phi _1\in L^2({\mathcal {O}})\) and \(\phi _2\in D(L)\) that

$$\begin{aligned} \lim _{\varepsilon \rightarrow 0}\big \Vert ({\widetilde{u}}_i^\varepsilon ,\phi _1)_{L^2({\mathcal {O}})} - ({\widetilde{u}}_i,\phi _1)_{L^2({\mathcal {O}})}\big \Vert _{L^1({{\widetilde{\Omega }}}\times (0,T))}&= 0, \\ \lim _{\varepsilon \rightarrow 0}\Vert \Lambda _i^\varepsilon ({\widetilde{u}}^\varepsilon ,\sqrt{\varepsilon }{\widetilde{w}}^\varepsilon , {\widetilde{W}}^\varepsilon ,\phi _2) - \Lambda _i({\widetilde{u}},0,{\widetilde{W}},\phi _2) \Vert _{L^1({{\widetilde{\Omega }}}\times (0,T))}&= 0. \end{aligned}$$

Since \(v^\varepsilon \) is a strong solution to (17), it satisfies for a.e. \(t\in [0,T]\) \({{\mathbb {P}}}\)-a.s., \(i=1,\ldots ,n\), and \(\phi \in D(L)\),

$$\begin{aligned} (v_i^\varepsilon (t),\phi )_{L^2({\mathcal {O}})} = \Lambda _i^\varepsilon (u^\varepsilon ,\varepsilon L^*LR_\varepsilon (v^\varepsilon ),W,\phi )(t) \end{aligned}$$

and in particular,

$$\begin{aligned} \int _0^T{{\mathbb {E}}}\big |(v_i^\varepsilon (t),\phi )_{L^2({\mathcal {O}})} - \Lambda _i^\varepsilon (u^\varepsilon ,\varepsilon L^*LR_\varepsilon (v^\varepsilon ),W,\phi )(t)\big |\textrm{d}t = 0. \end{aligned}$$

We deduce from the equivalence of the laws of \((u^\varepsilon ,\varepsilon L^*LR_\varepsilon (v^\varepsilon ),W)\) and \(({\widetilde{u}}^\varepsilon ,\sqrt{\varepsilon }{\widetilde{w}}^\varepsilon ,{\widetilde{W}})\) that

$$\begin{aligned} \int _0^T{{\widetilde{{{\mathbb {E}}}}}}\big |\big ({\widetilde{u}}^\varepsilon _i(t) +\sqrt{\varepsilon }{\widetilde{w}}_i^\varepsilon (t),\phi \big )_{L^2({\mathcal {O}})} - \Lambda _i^\varepsilon \big ({\widetilde{u}}^\varepsilon ,\sqrt{\varepsilon }{\widetilde{w}}^\varepsilon , {\widetilde{W}}^\varepsilon ,\phi \big )(t)\big |\textrm{d}t = 0. \end{aligned}$$

By Corollary 27, we can pass to the limit \(\varepsilon \rightarrow 0\) to obtain

$$\begin{aligned} \int _0^T{{\widetilde{{{\mathbb {E}}}}}}\big |({\widetilde{u}}_i(t),\phi )_{L^2({\mathcal {O}})} - \Lambda _i({\widetilde{u}},0,{\widetilde{W}},\phi )(t)\big |\textrm{d}t = 0. \end{aligned}$$

This identity holds for all \(i=1,\ldots ,n\) and all \(\phi \in D(L)\). This shows that

$$\begin{aligned} \big |({\widetilde{u}}_i(t),\phi )_{L^2({\mathcal {O}})} - \Lambda _i({\widetilde{u}},0,{\widetilde{W}},\phi )(t)\big | = 0 \quad \text{ for } \text{ a.e. } t\in [0,T]\ {{\widetilde{{{\mathbb {P}}}}}}\text{-a.s. },\ i=1,\ldots ,n. \end{aligned}$$

We infer from the definition of \(\Lambda _i\) that

$$\begin{aligned} ({\widetilde{u}}_i(t),\phi )_{L^2({\mathcal {O}})}&= ({\widetilde{u}}_i(0),\phi )_{L^2({\mathcal {O}})} - \sum _{j=1}^n\int _0^t \int _{\mathcal {O}}A_{ij}({\widetilde{u}}(s))\nabla {\widetilde{u}}_j(s) \cdot \nabla \phi \textrm{d}x\textrm{d}s \\&\quad \;\;+ \sum _{j=1}^n\bigg (\int _0^t\sigma _{ij} ({\widetilde{u}}(s))\textrm{d}{\widetilde{W}}_j(s),\phi \bigg )_{L^2({\mathcal {O}})} \end{aligned}$$

for a.e. \(t\in [0,T]\) and all \(\phi \in D(L)\). Set \({\widetilde{U}}=({{\widetilde{\Omega }}},\widetilde{{\mathcal {F}}},{{\widetilde{{{\mathbb {P}}}}}}, {{\widetilde{{{\mathbb {F}}}}}})\). Then \(({\widetilde{U}},{\widetilde{W}},{\widetilde{u}})\) is a martingale solution to (1)–(3).

6 Proof of Theorem 5

We turn to the existence proof of the SKT model without self-diffusion.

6.1 Uniform estimates

Let \(v^\varepsilon \) be a global solution to (19)–(20) and set \(u^\varepsilon =u(R_\varepsilon (v^\varepsilon ))\). We assume that A(u) is given by (3) and that \(a_{i0}>0\), \(a_{ii}=0\) for \(i=1,\ldots ,n\). The uniform estimates of Lemmas 18 and 19 are still valid. Since \(a_{ii}=0\), we obtain an \(H^1({\mathcal {O}})\) bound for \((u_i^\varepsilon )^{1/2}\) instead of \(u_i^\varepsilon \), which yields weaker bounds than those in Lemma 20.

Lemma 28

Let \(p\ge 2\) and set \(\rho _1:=(d+2)/(d+1)\). Then there exists a constant \(C(p,u^0,T)>0\), which is independent of \(\varepsilon \), such that

$$\begin{aligned} {{\mathbb {E}}}\Vert u_i^\varepsilon \Vert _{L^2(0,T;W^{1,1}({\mathcal {O}}))}^p&\le C(p,u^0,T), \end{aligned}$$
(47)
$$\begin{aligned} {{\mathbb {E}}}\Vert u_i^\varepsilon \Vert _{L^{1+2/d}(Q_T)}^p&\le C(p,u^0,T), \end{aligned}$$
(48)
$$\begin{aligned} {{\mathbb {E}}}\Vert u_i^\varepsilon \Vert _{L^{4/d}(0,T;L^2({\mathcal {O}}))}^p&\le C(p,u^0,T), \end{aligned}$$
(49)
$$\begin{aligned} {{\mathbb {E}}}\Vert u_i^\varepsilon \Vert _{L^{\rho _1}(0,T;W^{1,\rho _1}({\mathcal {O}}))}^p&\le C(p,u^0,T). \end{aligned}$$
(50)

Proof

The identity \(\nabla u_i^{\varepsilon }=2(u_i^{\varepsilon })^{1/2}\nabla (u_i^{\varepsilon })^{1/2}\) and the Hölder inequality show that

$$\begin{aligned} {{\mathbb {E}}}\Vert \nabla u_i^{\varepsilon }\Vert _{L^2(0,T;L^1({\mathcal {O}}))}^{p}&\le C{{\mathbb {E}}}\bigg (\int _0^T\Vert (u_i^{\varepsilon })^{1/2}\Vert _{L^2({\mathcal {O}})}^2 \Vert \nabla (u_i^{\varepsilon })^{1/2}\Vert _{L^2({\mathcal {O}})}^2\textrm{d}t\bigg )^{p/2} \\&\le C{{\mathbb {E}}}\bigg (\Vert u_i^{\varepsilon }\Vert _{L^\infty (0,T;L^1({\mathcal {O}}))}\int _0^T \Vert \nabla (u_i^{\varepsilon })^{1/2}\Vert _{L^2({\mathcal {O}})}^2\textrm{d}t\bigg )^{p/2} \\&\le C\big ({{\mathbb {E}}}\Vert u_i^{\varepsilon }\Vert _{L^\infty (0,T;L^1({\mathcal {O}}))}^{p}\big )^{1/2} \big ({{\mathbb {E}}}\Vert \nabla (u_i^{\varepsilon })^{1/2}\Vert _{L^2(0,T;L^2({\mathcal {O}}))}^{2p}\big )^{1/2}. \end{aligned}$$

Because of (30) and (31), the right-hand side is bounded. Using (30) again, we infer that (47) holds. Estimate (48) is obtained from the Gagliardo–Nirenberg inequality similarly as in the proof of Lemma 20:

$$\begin{aligned} {{\mathbb {E}}}\bigg (\int _0^T\Vert (u_i^{\varepsilon })^{1/2}\Vert _{L^{r}({\mathcal {O}})}^{s}\textrm{d}t\bigg )^{p/s}&\le C\big ({{\mathbb {E}}}\Vert (u_i^{\varepsilon })^{1/2}\Vert _{L^\infty (0,T;L^2({\mathcal {O}}))}^{2(1-\theta )p} \big )^{1/2} \\&\quad \;\times \big ({{\mathbb {E}}}\Vert (u_i^{\varepsilon })^{1/2}\Vert _{L^2(0,T;H^1({\mathcal {O}}))}^{4p/s}\big )^{1/2} \le C, \end{aligned}$$

where \(s=2/\theta \ge 2\) and \(1/r=1/2-\theta /d=1/2-2/(ds)\). Choosing \(r=(2d+4)/d\) gives \(s=r\), and \(r=4\) leads to \(s=8/d\); this proves estimates (48) and (49). Finally, (50) follows from Hölder’s inequality:

$$\begin{aligned} \Vert \nabla u_i^\varepsilon \Vert _{L^{\rho _1}(Q_T)}&= 2\Vert (u_i^\varepsilon )^{1/2}\nabla (u_i^\varepsilon )^{1/2}\Vert _{L^{\rho _1}(Q_T)}\\&\le 2\Vert (u_i^\varepsilon )^{1/2}\Vert _{L^{(2d+4)/d}(Q_T)}\Vert \nabla (u_i^\varepsilon )^{1/2}\Vert _{L^2(Q_T)} \\&\le 2\Vert u_i^\varepsilon \Vert _{L^{1+2/d}(Q_T)}^{1/2}\Vert (u_i^\varepsilon )^{1/2}\Vert _{L^2(0,T;H^1({\mathcal {O}}))} \end{aligned}$$

and taking the expectation and using (48) and (31) ends the proof. \(\square \)
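For completeness, the exponent arithmetic behind these choices can be checked directly from the relations \(s=2/\theta \) and \(1/r=1/2-\theta /d\):

$$\begin{aligned} r=\frac{2d+4}{d}\ &\Longrightarrow \ \theta =\frac{d}{d+2},\quad s=\frac{2}{\theta }=\frac{2d+4}{d}=r; \qquad r=4\ \Longrightarrow \ \theta =\frac{d}{4},\quad s=\frac{2}{\theta }=\frac{8}{d}; \\ \frac{d}{2d+4}+\frac{1}{2}&=\frac{d+(d+2)}{2d+4}=\frac{d+1}{d+2}=\frac{1}{\rho _1}, \end{aligned}$$

the last identity being the Hölder condition used in the final estimate of the proof.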

The following lemma is needed to derive the fractional time estimate.

Lemma 29

Let \(p\ge 2\) and set \(\rho _2:=(2d+2)/(2d+1)\). Then it holds for any \(i,j=1,\ldots ,n\) with \(i\ne j\):

$$\begin{aligned} {{\mathbb {E}}}\Vert u_i^\varepsilon u_j^\varepsilon \Vert _{L^{\rho _2}(0,T;W^{1,\rho _2}({\mathcal {O}}))}^p \le C(p,u^0,T). \end{aligned}$$
(51)

Proof

The Hölder inequality and (30) immediately yield

$$\begin{aligned} {{\mathbb {E}}}\Vert (u_i^\varepsilon u_j^\varepsilon )^{1/2}\Vert _{L^\infty (0,T;L^1({\mathcal {O}}))}^p \le C, \end{aligned}$$

and we conclude from the Poincaré–Wirtinger inequality, estimate (32), and the previous estimate that

$$\begin{aligned} {{\mathbb {E}}}\Vert (u_i^\varepsilon u_j^\varepsilon )^{1/2}\Vert _{L^2(0,T;H^1({\mathcal {O}}))}^p \le C. \end{aligned}$$
(52)

By the Gagliardo–Nirenberg inequality, with \(\theta =d/(d+1)\),

$$\begin{aligned}&\int _0^T\Vert (u_i^\varepsilon u_j^\varepsilon )^{1/2}\Vert _{L^{2(d+1)/d}({\mathcal {O}})}^{2(d+1)/d}\textrm{d}t \\&\quad \le C\int _0^T\Vert (u_i^\varepsilon u_j^\varepsilon )^{1/2}\Vert _{H^1({\mathcal {O}})}^{2\theta (d+1)/d} \Vert (u_i^\varepsilon u_j^\varepsilon )^{1/2}\Vert _{L^1({\mathcal {O}})}^{2(1-\theta )(d+1)/d}\textrm{d}t \\&\quad \le C\Vert (u_i^\varepsilon u_j^\varepsilon )^{1/2}\Vert _{L^\infty (0,T;L^1({\mathcal {O}}))}^{2(1-\theta )(d+1)/d} \int _0^T\Vert (u_i^\varepsilon u_j^\varepsilon )^{1/2}\Vert _{H^1({\mathcal {O}})}^2 \textrm{d}t \\&\quad = C\Vert (u_i^\varepsilon u_j^\varepsilon )^{1/2}\Vert _{L^\infty (0,T;L^1({\mathcal {O}}))}^{2/d} \Vert (u_i^\varepsilon u_j^\varepsilon )^{1/2}\Vert _{L^2(0,T;H^1({\mathcal {O}}))}^2. \end{aligned}$$

Taking the expectation and applying the Hölder inequality, we infer that

$$\begin{aligned} {{\mathbb {E}}}\Vert (u_i^\varepsilon u_j^\varepsilon )^{1/2}\Vert _{L^{2(d+1)/d}(Q_T)}^p \le C. \end{aligned}$$
(53)

Finally, the identity \(\nabla (u_i^\varepsilon u_j^\varepsilon ) = 2(u_i^\varepsilon u_j^\varepsilon )^{1/2} \nabla (u_i^\varepsilon u_j^\varepsilon )^{1/2}\) and Hölder’s inequality lead to

$$\begin{aligned} \int _0^T&\Vert \nabla (u_i^\varepsilon u_j^\varepsilon )\Vert _{L^{\rho _2}({\mathcal {O}})}^{\rho _2}\textrm{d}t \le C\int _0^T\Vert (u_i^\varepsilon u_j^\varepsilon )^{1/2}\Vert _{L^{2(d+1)/d}({\mathcal {O}})}^2 \Vert \nabla (u_i^\varepsilon u_j^\varepsilon )^{1/2}\Vert _{L^2({\mathcal {O}})}^2\textrm{d}t \\&\le C\bigg (\int _0^T\Vert (u_i^\varepsilon u_j^\varepsilon )^{1/2}\Vert _{L^{2(d+1)/d}({\mathcal {O}})}^{2(d+1)/d} \textrm{d}t\bigg )^{1-\rho _2/2}\bigg (\int _0^T \Vert \nabla (u_i^\varepsilon u_j^\varepsilon )^{1/2}\Vert _{L^2({\mathcal {O}})}^2\textrm{d}t\bigg )^{\rho _2/2}. \end{aligned}$$

The bounds (52)–(53) yield, after taking the expectation and applying Hölder’s inequality again, the conclusion (51). \(\square \)
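The exponents in the last application of Hölder’s inequality can likewise be verified directly:

$$\begin{aligned} \frac{1}{\rho _2}=\frac{d}{2(d+1)}+\frac{1}{2}=\frac{d+(d+1)}{2(d+1)}=\frac{2d+1}{2d+2}, \qquad \frac{2\rho _2}{2-\rho _2}=\frac{2(2d+2)/(2d+1)}{2d/(2d+1)}=\frac{2(d+1)}{d}, \end{aligned}$$

which explains both the spatial pairing \(L^{2(d+1)/d}\)–\(L^2\) and the temporal exponents \(1-\rho _2/2\) and \(\rho _2/2\) in the final display.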

We show now that the fractional time derivative of \(u^\varepsilon \) is uniformly bounded.

Lemma 30

(Fractional time regularity) Let \(d\le 2\). Then there exist \(0<\alpha <1\), \(p>1\), and \(\beta >0\) such that \(\alpha p>1\) and

$$\begin{aligned} {{\mathbb {E}}}\Vert u^\varepsilon \Vert _{W^{\alpha ,p}(0,T;D(L)')}^p + {{\mathbb {E}}}\Vert u^\varepsilon \Vert _{C^{0,\beta }([0,T];D(L)')}^p \le C. \end{aligned}$$
(54)

Proof

We proceed similarly as in the proof of Lemma 22. First, we estimate the diffusion part, setting

$$\begin{aligned} g(t) = \int _0^t\bigg \Vert a_{i0}\nabla u_i^\varepsilon + \sum _{j\ne i}a_{ij} \nabla (u_i^\varepsilon u_j^\varepsilon )\bigg \Vert _{L^1({\mathcal {O}})}\textrm{d}r. \end{aligned}$$

Then, using \(D(L)\subset W^{1,\infty }({\mathcal {O}})\) (which holds due to the assumption \(m>d/2+1\)),

$$\begin{aligned} {{\mathbb {E}}}&\int _0^T\int _0^T|t-s|^{-1-\alpha p}\bigg \Vert \int _{s\wedge t}^{t\vee s} {\text {div}}\sum _{j=1}^n A_{ij}(u^\varepsilon (r))\nabla u_j^\varepsilon (r)\textrm{d}r\bigg \Vert _{D(L)'}^p\textrm{d}t\textrm{d}s \\&\le C{{\mathbb {E}}}\int _0^T\int _0^T|t-s|^{-1-\alpha p}\bigg (\int _{s\wedge t}^{t\vee s} \bigg \Vert a_{i0}\nabla u_i^\varepsilon + \sum _{j\ne i}a_{ij} \nabla (u_i^\varepsilon u_j^\varepsilon )\bigg \Vert _{L^1({\mathcal {O}})}\textrm{d}r\bigg )^p\textrm{d}t\textrm{d}s \\&\le C{{\mathbb {E}}}\int _0^T\int _0^T\frac{|g(t)-g(s)|^p}{|t-s|^{1+\alpha p}}\textrm{d}t\textrm{d}s \le C{{\mathbb {E}}}\Vert g\Vert _{W^{\alpha ,p}(0,T;{{\mathbb {R}}})}^p. \end{aligned}$$

The embedding \(W^{1,p}(0,T;{{\mathbb {R}}})\hookrightarrow W^{\alpha ,p}(0,T;{{\mathbb {R}}})\) and estimates (51) and (50) show that for \(1\le p\le \rho _1 = (d+2)/(d+1)\),

$$\begin{aligned} {{\mathbb {E}}}&\Vert g\Vert _{W^{\alpha ,p}(0,T;{{\mathbb {R}}})}^p \le C{{\mathbb {E}}}\Vert g\Vert _{W^{1,p}(0,T;{{\mathbb {R}}})}^p = C{{\mathbb {E}}}\Vert \partial _t g\Vert _{L^p(0,T;{{\mathbb {R}}})}^p + C{{\mathbb {E}}}\Vert g\Vert _{L^p(0,T;{{\mathbb {R}}})}^p \\&\le C{{\mathbb {E}}}\int _0^T\bigg \Vert a_{i0}\nabla u_i^\varepsilon (t) + \sum _{j\ne i}a_{ij} \nabla (u_i^\varepsilon u_j^\varepsilon )(t)\bigg \Vert _{L^1({\mathcal {O}})}^p \textrm{d}t \\&\quad \;\;+ C{{\mathbb {E}}}\int _0^T\int _0^t\bigg \Vert a_{i0}\nabla u_i^\varepsilon (r) + \sum _{j\ne i}a_{ij}\nabla (u_i^\varepsilon u_j^\varepsilon )(r)\bigg \Vert _{L^1({\mathcal {O}})}^p \textrm{d}r\textrm{d}t \le C. \end{aligned}$$

Next, we consider the stochastic part, using the Burkholder–Davis–Gundy inequality, Hölder’s inequality, and the sublinear growth condition in the statement of the theorem:

$$\begin{aligned} {{\mathbb {E}}}&\int _0^T\int _0^T|t-s|^{-1-\alpha p}\bigg \Vert \int _{s\wedge t}^{t\vee s} \sum _{j=1}^n\sigma _{ij}(u^\varepsilon (r))\textrm{d}W_j(r)\bigg \Vert _{L^2({\mathcal {O}})}^p\textrm{d}t\textrm{d}s \\&\le C{{\mathbb {E}}}\int _0^T\int _0^T|t-s|^{-1-\alpha p}\bigg (\int _{s\wedge t}^{t\vee s} \sum _{j=1}^n\Vert \sigma _{ij}(u^\varepsilon (r))\Vert _{{{\mathcal {L}}}_2(U;L^2({\mathcal {O}}))}^2\textrm{d}r \bigg )^{p/2}\textrm{d}t\textrm{d}s \\&\le C\int _0^T\int _0^T|t-s|^{-1-\alpha p+p/2-1}{{\mathbb {E}}}\int _{s\wedge t}^{t\vee s} \sum _{j=1}^n\Vert \sigma _{ij}(u^\varepsilon (r))\Vert _{{{\mathcal {L}}}_2(U;L^2({\mathcal {O}}))}^p\textrm{d}r \textrm{d}t\textrm{d}s \\&\le C\int _0^T\int _0^T|t-s|^{-1-\alpha p+p/2-1}{{\mathbb {E}}}\int _{s\wedge t}^{t\vee s} \sum _{j=1}^n(1+\Vert u^\varepsilon (r)\Vert _{L^2({\mathcal {O}})}^{\gamma p})\textrm{d}r\textrm{d}t\textrm{d}s \le C. \end{aligned}$$

The last step follows from estimate (49) (assuming that \(1\le \gamma p\le 4/d\)) and Lemma 21, since \(1+\alpha p-p/2+1<2\) if and only if \(\alpha <1/2\). We conclude that the second term of the right-hand side of

$$\begin{aligned} v^\varepsilon (t) = v^\varepsilon (0) + \int _0^t{\text {div}}(A(u^\varepsilon (s))\nabla u^\varepsilon (s))\textrm{d}s + \int _0^t\sigma (u^\varepsilon (s))\textrm{d}W(s) \end{aligned}$$

is uniformly bounded in \({{\mathbb {E}}}|\cdot |_{W^{\alpha ,p}(0,T;D(L)')}\) for \(\alpha <1\) and \(p\le (d+2)/(d+1)\), while the third term is uniformly bounded in that norm for \(\alpha <1/2\) and \(p\le 4/(\gamma d)\). In both cases, we can choose \(\alpha \) such that \(\alpha p>1\). At this point, we need the condition \(\gamma <1\) if \(d=2\). (The result holds for any space dimension if \(\gamma <2/d\).) Taking into account (33), \((v^\varepsilon )\) is bounded in \(W^{\alpha ,p}(0,T;D(L)')\). The embedding \(W^{\alpha ,p}(0,T;D(L)')\hookrightarrow C^{0,\beta }([0,T];D(L)')\) for \(\beta =\alpha -1/p>0\) implies that \((v^\varepsilon )\) is bounded in the latter space.
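To summarize the constraints just derived, the admissible choices of \(\alpha \) and \(p\) can be recorded as follows:

$$\begin{aligned} &\text{diffusion part: }\ p=\rho _1=\frac{d+2}{d+1}>1,\quad \alpha \in \Big (\frac{d+1}{d+2},1\Big )\ \Longrightarrow \ \alpha p>1; \\ &\text{stochastic part: }\ \alpha <\frac{1}{2},\ \alpha p>1\ \Longrightarrow \ p>2,\quad \text{which is compatible with } p\le \frac{4}{\gamma d}\ \text{ if and only if }\ \gamma <\frac{2}{d}. \end{aligned}$$

For \(d=2\), the last condition reduces to \(\gamma <1\), in accordance with the discussion above.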

We turn to the estimate of \(u^\varepsilon \) in the \(W^{\alpha ,p}(0,T;D(L)')\) norm:

$$\begin{aligned} {{\mathbb {E}}}\Vert u^\varepsilon \Vert _{W^{\alpha ,p}(0,T;D(L)')}^p \le C\big ({{\mathbb {E}}}\Vert v^\varepsilon \Vert _{W^{\alpha ,p}(0,T;D(L)')}^p + \varepsilon {{\mathbb {E}}}\Vert L^*LR_\varepsilon (v^\varepsilon )\Vert _{W^{\alpha ,p}(0,T;D(L)')}^p\big ). \end{aligned}$$

It remains to consider the last term. In view of estimate (15) and the Lipschitz continuity of \(R_\varepsilon \) with Lipschitz constant \(C/\varepsilon \), we obtain

$$\begin{aligned} {{\mathbb {E}}}&|\varepsilon L^*LR_\varepsilon (v^\varepsilon )|_{W^{\alpha ,p}(0,T;D(L)')}^p \\&= \varepsilon ^p{{\mathbb {E}}}\int _0^T\int _0^T|t-s|^{-1-\alpha p}\Vert L^*LR_\varepsilon (v^\varepsilon (t)) - L^*LR_\varepsilon (v^\varepsilon (s))\Vert _{D(L)'}^p\textrm{d}t\textrm{d}s \\&\le \varepsilon ^p C{{\mathbb {E}}}\int _0^T\int _0^T|t-s|^{-1-\alpha p}\Vert R_\varepsilon (v^\varepsilon (t)) - R_\varepsilon (v^\varepsilon (s))\Vert _{D(L)}^p \textrm{d}t\textrm{d}s \\&\le \varepsilon ^p C{{\mathbb {E}}}\int _0^T\int _0^T|t-s|^{-1-\alpha p}\frac{C}{\varepsilon ^p} \Vert v^\varepsilon (t)-v^\varepsilon (s)\Vert _{D(L)'}^p\textrm{d}t\textrm{d}s \\&= C{{\mathbb {E}}}\Vert v^\varepsilon \Vert _{W^{\alpha ,p}(0,T;D(L)')}^p \le C. \end{aligned}$$

Moreover, by (15) and the Lipschitz continuity of \(R_\varepsilon \) again,

$$\begin{aligned} \Vert \varepsilon L^*LR_\varepsilon (v^\varepsilon )\Vert _{L^p(0,T;D(L)')}^p \le \varepsilon ^p C\Vert R_\varepsilon (v^\varepsilon )\Vert _{L^p(0,T;D(L))}^p \le C\Vert v^\varepsilon \Vert _{L^p(0,T;D(L)')}^p \le C, \end{aligned}$$

where we used estimate (33). This finishes the proof. \(\square \)

6.2 Tightness of the laws of \((u^\varepsilon )\)

The tightness is shown in a different sub-Polish space than in Sect. 5.2:

$$\begin{aligned} \widetilde{Z}_T:= C^0([0,T];D(L)')\cap L_w^{\rho _1}(0,T;W^{1,\rho _1}({\mathcal {O}})), \end{aligned}$$

endowed with the topology \(\widetilde{{\mathbb {T}}}\) that is the maximum of the topology of \(C^0([0,T];D(L)')\) and the weak topology of \(L^{\rho _1}(0,T;W^{1,\rho _1}({\mathcal {O}}))\), recalling that \(\rho _1=(d+2)/(d+1)>1\).

Lemma 31

The family of laws of \((u^\varepsilon )\) is tight in

$$\begin{aligned} Z_T:= \widetilde{Z}_T \cap L^2(0,T;L^{2}({\mathcal {O}})) \end{aligned}$$

with the topology that is the maximum of \(\widetilde{{\mathbb {T}}}\) and the topology induced by the \(L^2(0,T;L^{2}({\mathcal {O}}))\) norm.

Proof

The tightness in \(L^2(0,T;L^q({\mathcal {O}}))\) for \(q<d/(d-1)=2\) is a consequence of the compact embedding \(W^{1,1}({\mathcal {O}})\hookrightarrow L^q({\mathcal {O}})\) as well as estimates (47) and (54). In fact, we can extend this result up to \(q=2\) because of the uniform bound of \(u_i^\varepsilon \log u_i^\varepsilon \) in \(L^\infty (0,T;L^1({\mathcal {O}}))\), which originates from the entropy estimate. Indeed, we just apply [3, Prop. 1], using additionally (26) with \(a_{i0}>0\). Then the tightness in \(L^2(0,T;L^2({\mathcal {O}}))\) follows from Lemma 37. Finally, the tightness in \(\widetilde{Z}_T\) is shown as in the proof of Lemma 23 in Appendix B. \(\square \)

In three space dimensions, we do not obtain tightness in \(L^2(0,T;L^2({\mathcal {O}}))\) but only in the larger space \(L^{4/3}(0,T;L^2({\mathcal {O}}))\). This follows similarly to the proof of Lemma 23, taking into account the compact embedding \(W^{1,\rho _1}({\mathcal {O}})\hookrightarrow L^2({\mathcal {O}})\), which holds as long as \(d\le 3\), as well as estimates (50) and (54). Unfortunately, this result does not seem to be sufficient to identify the limit of the product \(\widetilde{u}_i^\varepsilon \widetilde{u}_j^\varepsilon \). Therefore, we restrict ourselves to the two-dimensional case.

The following result is shown exactly as in Lemma 24.

Lemma 32

The family of laws of \((\sqrt{\varepsilon }L^*LR_\varepsilon (v^\varepsilon ))\) is tight in \(Y_T=L_w^2(0,T;D(L)')\cap L^\infty _{w*}(0,T;D(L)')\).

Arguing as in Sect. 5.3, the Skorokhod–Jakubowski theorem implies the existence of a subsequence, a probability space \(({{\widetilde{\Omega }}},\widetilde{{\mathcal {F}}},{\widetilde{{{\mathbb {P}}}}})\), and, on this space, \((Z_T\times Y_T\times C^0([0,T];U_0))\)-valued random variables \((\widetilde{u}^\varepsilon ,\widetilde{w}^\varepsilon ,\widetilde{W}^\varepsilon )\) and \((\widetilde{u},\widetilde{w},\widetilde{W})\) such that \((\widetilde{u}^\varepsilon ,\widetilde{w}^\varepsilon ,\widetilde{W}^\varepsilon )\) has the same law as \((u^\varepsilon ,\sqrt{\varepsilon }L^*LR_\varepsilon (v^\varepsilon ),W)\) on \({\mathcal {B}}(Z_T\times Y_T\times C^0([0,T];U_0))\) and, as \(\varepsilon \rightarrow 0\) and \({\widetilde{{{\mathbb {P}}}}}\)-a.s.,

$$\begin{aligned} (\widetilde{u}^\varepsilon ,\widetilde{w}^\varepsilon ,\widetilde{W}^\varepsilon ) \rightarrow (\widetilde{u},\widetilde{w},\widetilde{W}) \quad \text{ in } Z_T\times Y_T\times C^0([0,T];U_0). \end{aligned}$$

This convergence means that \({\widetilde{{{\mathbb {P}}}}}\)-a.s.,

$$\begin{aligned} \widetilde{u}^\varepsilon \rightarrow \widetilde{u}&\quad \text{ strongly } \text{ in } C^0([0,T];D(L)'), \\ \nabla \widetilde{u}^\varepsilon \rightharpoonup \nabla \widetilde{u}&\quad \text{ weakly } \text{ in } L^{\rho _1}(Q_T), \\ \widetilde{u}^\varepsilon \rightarrow \widetilde{u}&\quad \text{ strongly } \text{ in } L^{2}(Q_T), \\ \widetilde{w}^\varepsilon \rightharpoonup \widetilde{w}&\quad \text{ weakly } \text{ in } L^2(0,T;D(L)'), \\ \widetilde{w}^\varepsilon \rightharpoonup \widetilde{w}&\quad \text{ weakly* } \text{ in } L^\infty (0,T;D(L)'), \\ \widetilde{W}^\varepsilon \rightarrow \widetilde{W}&\quad \text{ strongly } \text{ in } C^0([0,T];U_0). \end{aligned}$$

The remainder of the proof is very similar to that of Sect. 5.3, using slightly weaker convergence results. The most difficult part is the convergence of the nonlinear term \(\nabla (\widetilde{u}_i^\varepsilon \widetilde{u}_j^\varepsilon )\), since the previous convergences do not allow us to pass to the limit in \(\widetilde{u}_i^\varepsilon \nabla \widetilde{u}_j^\varepsilon \), because \(\rho _1<2\). The idea is to consider the “very weak” formulation, passing to the limit in \(\widetilde{u}_i^\varepsilon \widetilde{u}_j^\varepsilon \Delta \phi \) instead of \(\nabla (\widetilde{u}_i^\varepsilon \widetilde{u}_j^\varepsilon )\cdot \nabla \phi \) for suitable test functions \(\phi \). Indeed, let \(\phi \in L^\infty (0,T;C_0^\infty ({\mathcal {O}}))\). Since \(\widetilde{u}_i^\varepsilon \rightarrow \widetilde{u}_i\) strongly in \(L^2(0,T;L^2({\mathcal {O}}))\) \({\widetilde{{{\mathbb {P}}}}}\)-a.s., we have

$$\begin{aligned} \int _0^T\int _{\mathcal {O}}\nabla (\widetilde{u}_i^\varepsilon \widetilde{u}_j^\varepsilon )\cdot \nabla \phi \textrm{d}x\textrm{d}t = -\int _0^T\int _{\mathcal {O}}\widetilde{u}_i^\varepsilon \widetilde{u}_j^\varepsilon \Delta \phi \textrm{d}x\textrm{d}t \rightarrow -\int _0^T\int _{\mathcal {O}}\widetilde{u}_i\widetilde{u}_j\Delta \phi \textrm{d}x\textrm{d}t. \end{aligned}$$

It follows from the equivalence of the laws that

$$\begin{aligned} {\widetilde{{{\mathbb {E}}}}}\bigg (\int _0^T\int _{\mathcal {O}}\widetilde{u}_i^\varepsilon \widetilde{u}_j^\varepsilon \Delta \phi \textrm{d}x\textrm{d}t\bigg )^2 \le C, \end{aligned}$$

and we conclude from Vitali’s theorem that

$$\begin{aligned} {\widetilde{{{\mathbb {E}}}}}\bigg |\int _0^T\int _{\mathcal {O}}\big (\widetilde{u}_i^\varepsilon \widetilde{u}_j^\varepsilon - \widetilde{u}_i\widetilde{u}_j\big )(t)\Delta \phi \textrm{d}x\textrm{d}t\bigg |\rightarrow 0\quad \text{ as } \varepsilon \rightarrow 0. \end{aligned}$$

By density, this convergence holds for all test functions \(\phi \in L^\infty (0,T;W^{2,\infty }({\mathcal {O}}))\) such that \(\nabla \phi \cdot \nu =0\) on \(\partial {\mathcal {O}}\). This ends the proof of Theorem 5.

Remark 33

(Three space dimensions) The three-dimensional case is delicate since \(u_i^\varepsilon \) lies in a space larger than \(L^2(Q_T)\). We may exploit the regularity (51) for \(\nabla (u_i^\varepsilon u_j^\varepsilon )\), but this leads only to the existence of random variables \({\widetilde{\eta }}_{ij}^\varepsilon \) and \({\widetilde{\eta }}_{ij}\) with \(i,j=1,\ldots ,n\) and \(i\ne j\) on the space \(X_T=L_w^{\rho _2}(0,T;L^{\rho _2}({\mathcal {O}}))\) such that \({\widetilde{\eta }}_{ij}^\varepsilon \) and \(u_i^\varepsilon u_j^\varepsilon \) have the same law on \({\mathcal {B}}(X_T)\) and, as \(\varepsilon \rightarrow 0\),

$$\begin{aligned} {\widetilde{\eta }}_{ij}^\varepsilon \rightharpoonup {\widetilde{\eta }}_{ij} \quad \text{ weakly } \text{ in } X_T. \end{aligned}$$

Similar arguments as before lead to the limit

$$\begin{aligned} {\widetilde{{{\mathbb {E}}}}}\bigg |\int _0^T\int _{\mathcal {O}}\nabla ({\widetilde{\eta }}_{ij}^\varepsilon - {\widetilde{\eta }}_{ij})(t)\cdot \nabla \phi (t)\textrm{d}x\textrm{d}t\bigg |\rightarrow 0, \end{aligned}$$

but we cannot easily identify \({\widetilde{\eta }}_{ij}\) with \(\widetilde{u}_i\widetilde{u}_j\). \(\square \)

7 Discussion of the noise terms

We present some examples of admissible terms \(\sigma (u)\). Recall that \((e_k)_{k\in {{\mathbb {N}}}}\) is an orthonormal basis of U.

Lemma 34

The stochastic diffusion

$$\begin{aligned} \sigma _{ij}(u) = \delta _{ij}s(u_i)\sum _{\ell =1}^\infty a_\ell (e_\ell ,\cdot )_{U}, \quad s(u_i) = \frac{u_i}{1+u_i^{1/2+\eta }} \end{aligned}$$

satisfies Assumption (A5) for \(\eta >0\) and \((a_\ell )\in \ell ^2({{\mathbb {R}}})\).

Proof

With the entropy density h given by (5), we compute \((\partial h/\partial u_i)(u)=\pi _i\log u_i\) and \((\partial ^2 h/\partial u_i\partial u_j)(u)=(\pi _i/u_i)\delta _{ij}\). Therefore, by Jensen’s inequality and the elementary inequalities \(|u_i\log u_i|\le C(1+u_i^{1+\eta })\) for any \(\eta >0\) and \(|u|\le C(1+h(u))\),

$$\begin{aligned} J_1&:= \bigg \{\int _0^T\sum _{k=1}^\infty \sum _{i,j=1}^n \bigg (\int _{\mathcal {O}}\frac{\partial h}{\partial u_i}(u)\sigma _{ij}(u)e_k\textrm{d}x\bigg )^2\textrm{d}s \bigg \}^{1/2} \\&= \bigg \{\sum _{k=1}^\infty a_k^2\int _0^T\sum _{i=1}^n\bigg (\int _{\mathcal {O}}\pi _i\frac{u_i\log u_i}{1+u_i^{1/2+\eta }}\textrm{d}x\bigg )^2\textrm{d}s \bigg \}^{1/2} \\&\le C\bigg \{\sum _{i=1}^n\int _0^T\bigg (\int _{\mathcal {O}}\frac{1+u_i^{1+\eta }}{1+u_i^{1/2+\eta }}\textrm{d}x\bigg )^2\textrm{d}s\bigg \}^{1/2} \\&\le C\bigg \{\sum _{i=1}^n\int _0^T\int _{\mathcal {O}}\bigg (\frac{1+u_i^{1+\eta }}{1+u_i^{1/2+\eta }}\bigg )^2\textrm{d}x\textrm{d}s\bigg \}^{1/2} \\&\le C\bigg \{\sum _{i=1}^n\int _0^T\int _{\mathcal {O}}(1+u_i)\textrm{d}x\textrm{d}s\bigg \}^{1/2} \le C\bigg (1+\int _0^T\int _{\mathcal {O}}h(u)\textrm{d}x\textrm{d}s\bigg ). \end{aligned}$$

The second condition in Assumption (A5) becomes

$$\begin{aligned} J_2&:= \int _0^T\sum _{k=1}^\infty \int _{\mathcal {O}}{\text {tr}}\big [(\sigma (u)e_k)^T h''(u)\sigma (u)e_k\big ]\textrm{d}x\textrm{d}s \\&\;= \sum _{k=1}^\infty a_k^2\sum _{i=1}^n\int _0^T\int _{\mathcal {O}}\frac{\pi _iu_i}{(1+u_i^{1/2+\eta })^2}\textrm{d}x\textrm{d}s \le C({\mathcal {O}},T). \end{aligned}$$

Thus, Assumption (A5) is satisfied. \(\square \)

The proof shows that \(J_1\) can be estimated if \(s(u_i)^2\log (u_i)^2\) is bounded from above by \(C(1+h(u))\). This is the case if \(s(u_i)\) behaves like \(u_i^\alpha \) with \(\alpha <1/2\). Furthermore, \(J_2\) can be estimated if \(s(u_i)^2/u_i\) is bounded, which is possible if \(s(u_i)=u_i^\alpha \) with \(\alpha \ge 1/2\). Thus, to satisfy the growth restriction and at the same time avoid the singularity at \(u_i=0\), we have chosen \(\sigma _{ij}\) as in Lemma 34. This example is rather artificial. To include more general choices, we generalize our approach. In fact, it is sufficient to estimate the integrals in inequality (23) in such a way that the entropy inequality of Proposition 16 holds. The idea is to exploit the gradient bound for \(u_i\) in the estimation of \(J_1\) and \(J_2\).
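The two conditions just described can be illustrated numerically for the choice of Lemma 34; the following sketch (grid and constants are ad hoc, and \(1+u\) serves as a proxy for \(C(1+h(u))\) via \(|u|\le C(1+h(u))\)) checks that \(s(u)^2/u\) stays bounded and that \(s(u)^2\log (u)^2\) grows at most linearly:

```python
import numpy as np

# Ad hoc numerical illustration of the two admissibility conditions for
# s(u) = u/(1 + u^{1/2+eta}) from Lemma 34 (grid and constants are ours).
eta = 0.1
u = np.logspace(-8.0, 8.0, 4000)   # densities over many orders of magnitude
s = u / (1.0 + u**(0.5 + eta))

# Condition for J_2: s(u)^2/u must be bounded.
assert (s**2 / u).max() < 10.0

# Condition for J_1: s(u)^2 log(u)^2 <= C(1+u), which suffices since
# |u| <= C(1 + h(u)).
assert ((s * np.abs(np.log(u)))**2 / (1.0 + u)).max() < 20.0
```

The first ratio peaks below 1 and the second around 14 (near \(\log u=1/(2\eta )\)), consistent with the bounds used in the proof.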

Consider a trace-class, positive, and symmetric operator Q on \(L^2({\mathcal {O}})\) and the space \(U=Q^{1/2}(L^2({\mathcal {O}}))\), equipped with the norm \(\Vert Q^{-1/2}(\cdot )\Vert _{L^2({\mathcal {O}})}\). In the following, we work with a U-cylindrical Wiener process \(W^Q\). This setting is equivalent to a spatially colored noise on \(L^2({\mathcal {O}})\) in the form of a Q-Wiener process (with \(Q\ne \textrm{Id}\)). The latter viewpoint provides, in our opinion, a more intuitive insight. In particular, the operator Q is constructed from the eigenfunctions and eigenvalues described below.

Let \((\eta _k)_{k\in {{\mathbb {N}}}}\) be a basis of \(L^2({\mathcal {O}})\), consisting of the normalized eigenfunctions of the Laplacian subject to Neumann boundary conditions with eigenvalues \(\lambda _k\ge 0\), and set \(a_k=(1+\lambda _k)^{-\rho }\) for some \(\rho >0\) such that \(\sum _{k=1}^\infty a_k^2\Vert \eta _k\Vert _{L^\infty ({\mathcal {O}})}^2<\infty \). Since \(\lambda _k\le Ck^{2/d}\) [28, Corollary 2] and \(\Vert \eta _k\Vert _{L^\infty ({\mathcal {O}})}\le Ck^{(d-1)/2}\) [23, Theorem 1], we may choose \(\rho >(d/2)^2\). Considering a sequence of independent Brownian motions \((W_1^k,\ldots ,W_n^k)_{k\in {{\mathbb {N}}}}\), we assume the noise to be of the form \(W^Q=(W_1^Q,\ldots ,W_n^Q)\), where

$$\begin{aligned} W_j^Q(t) = \sum _{k=1}^\infty a_k e_k W_j^k(t), \quad j=1,\ldots ,n,\ t>0, \end{aligned}$$

and \((e_k)_{k\in {{\mathbb {N}}}}=(a_k\eta _k)_{k\in {{\mathbb {N}}}}\) is a basis of \(U=Q^{1/2}(L^2({\mathcal {O}}))\).
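The construction above can be made concrete in one space dimension. The sketch below (all numerical choices are ours: \({\mathcal {O}}=(0,1)\), cosine Neumann eigenfunctions, truncation level K, and \(\rho =1\)) samples a truncated colored-noise field of the form \(\sum _k a_k\eta _k(x)W^k(t)\), one natural normalization of the expansion above:

```python
import numpy as np

# Truncated colored noise on O = (0,1): Neumann eigenfunctions
# eta_k(x) = sqrt(2) cos(k*pi*x) (eta_0 = 1), eigenvalues lambda_k = (k*pi)^2,
# weights a_k = (1 + lambda_k)^{-rho}; in d = 1 any rho > (d/2)^2 = 1/4 works.
rng = np.random.default_rng(0)
K, rho = 64, 1.0
x = np.linspace(0.0, 1.0, 201)
k = np.arange(K)
lam = (np.pi * k)**2
a = (1.0 + lam)**(-rho)

eta = np.sqrt(2.0) * np.cos(np.pi * k[:, None] * x)  # shape (K, len(x))
eta[0] = 1.0                                         # constant eigenfunction

def sample_field(t, n_steps=100):
    """Sample sum_k a_k eta_k(x) W^k(t) with independent Brownian motions W^k."""
    dt = t / n_steps
    increments = np.sqrt(dt) * rng.standard_normal((n_steps, K))
    Wk_t = increments.sum(axis=0)    # scalar values W^k(t)
    return (a * Wk_t) @ eta          # field evaluated on the grid x

field = sample_field(t=1.0)
assert field.shape == x.shape and np.isfinite(field).all()
```

The rapid decay of \(a_k\) makes the series converge in \(L^2({\mathcal {O}})\), so the truncation error is controlled by the tail \(\sum _{k>K}a_k^2\).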

Lemma 35

For the SKT model with self-diffusion, let \(\sigma _{ij}(u)=\delta _{ij}u_i^\alpha \) for \(1/2\le \alpha \le 1\), \(i,j=1,\ldots ,n\), interpreted as a map from \(L^2({\mathcal {O}})\) to \({{\mathcal {L}}}_2(H^\beta ({\mathcal {O}});L^2({\mathcal {O}}))\), where \(\beta >\rho \). Then the entropy inequality (29) holds, i.e., \(\sigma _{ij}\) is admissible for Theorem 4.

Proof

We can write inequality (23) for \(0<T<T_R\) as

$$\begin{aligned} {{\mathbb {E}}}&\sup _{0<t<T}\int _{\mathcal {O}}h(u^\varepsilon (t))\textrm{d}x + \frac{\varepsilon }{2}{{\mathbb {E}}}\sup _{0<t<T}\Vert Lw^\varepsilon (t)\Vert _{L^2({\mathcal {O}})}^2 \nonumber \\&\quad \;+ {{\mathbb {E}}}\sup _{0<t<T}\int _0^t\int _{\mathcal {O}}\nabla w^\varepsilon (s):B(w^\varepsilon )\nabla w^\varepsilon (s) \textrm{d}x\textrm{d}s - {{\mathbb {E}}}\int _{\mathcal {O}}h(u^0)\textrm{d}x \nonumber \\&\le {{\mathbb {E}}}\sup _{0<t<T}\bigg \{\int _0^t\sum _{k=1}^\infty \sum _{i,j=1}^n\bigg ( \int _{\mathcal {O}}\pi _i\log u_i^\varepsilon (s)\sigma _{ij}(u^\varepsilon (s))e_k\textrm{d}x\bigg )^2 \textrm{d}s\bigg \}^{1/2} \nonumber \\&\quad \;+ \frac{1}{2}{{\mathbb {E}}}\sup _{0<t<T}\sum _{k=1}^\infty \sum _{i=1}^n \int _0^t\int _{\mathcal {O}}(\sigma _{ii}(u^\varepsilon )e_k)\frac{\pi _i}{u_i^\varepsilon } (\sigma _{ii}(u^\varepsilon )e_k)\textrm{d}x\textrm{d}s \nonumber \\&=: J_3 + J_4, \end{aligned}$$
(55)

recalling that \(w^\varepsilon =R_\varepsilon (v^\varepsilon )\) and \(u^\varepsilon =u(w^\varepsilon )\). We simplify \(J_3\) and \(J_4\), using the definition \(e_k=a_k\eta _k\):

$$\begin{aligned} J_3&= {{\mathbb {E}}}\sup _{0<t<T}\bigg \{\sum _{k=1}^\infty a_k^2\int _0^t\sum _{i=1}^n\pi _i^2\bigg ( \int _{\mathcal {O}}u_i^\varepsilon (s)^\alpha \log u_i^\varepsilon (s)\eta _k\textrm{d}x\bigg )^2\textrm{d}s\bigg \}^{1/2} \\&\le C{{\mathbb {E}}}\sup _{0<t<T}\bigg \{\sum _{k=1}^\infty a_k^2\int _{\mathcal {O}}\eta _k^2\textrm{d}x \int _0^t\sum _{i=1}^n\int _{\mathcal {O}}(u_i^\varepsilon (s)^\alpha \log u_i^\varepsilon (s))^2 \textrm{d}x\textrm{d}s\bigg \}^{1/2} \\&\le C\sum _{i=1}^n{{\mathbb {E}}}\Vert (u_i^\varepsilon )^\alpha \log u_i^\varepsilon \Vert _{L^2(0,T;L^2({\mathcal {O}}))}, \\ J_4&= \sum _{k=1}^\infty a_k^2{{\mathbb {E}}}\sup _{0<t<T}\sum _{i=1}^n\pi _i \int _0^{t}\int _{\mathcal {O}}(u_i^\varepsilon )^{2\alpha }(u_i^\varepsilon )^{-1}\eta _k^2\textrm{d}x\textrm{d}s \\&\le C\sum _{k=1}^\infty a_k^2\Vert \eta _k\Vert _{L^\infty ({\mathcal {O}})}^2\sum _{i=1}^n {{\mathbb {E}}}\Vert (u_i^\varepsilon )^{2\alpha -1}\Vert _{L^1(0,T;L^1({\mathcal {O}}))} \\&\le C\sum _{i=1}^n{{\mathbb {E}}}\Vert (u_i^\varepsilon )^{2\alpha -1}\Vert _{L^1(0,T;L^1({\mathcal {O}}))}. \end{aligned}$$

The last inequality follows from our assumption on \((a_k)\). By (28), we can estimate the integrand of the third integral on the left-hand side of (55) according to

$$\begin{aligned} \nabla w^\varepsilon :B(w^\varepsilon )\nabla w^\varepsilon \ge 2\sum _{i=1}^n\pi _ia_{ii}|\nabla u_i^\varepsilon |^2. \end{aligned}$$

Hence, because of \(|u|\le C(1+h(u))\), we can formulate (55) as

$$\begin{aligned} {{\mathbb {E}}}&\sup _{0<t<T}\Vert h(u^\varepsilon (t))\Vert _{L^1({\mathcal {O}})} + \frac{\varepsilon }{2}{{\mathbb {E}}}\sup _{0<t<T}\Vert Lw^\varepsilon (t)\Vert _{L^2({\mathcal {O}})}^2 + C{{\mathbb {E}}}\Vert \nabla u^\varepsilon \Vert _{L^2(0,T;L^2({\mathcal {O}}))}^2 \\&\le C + C\sum _{i=1}^n{{\mathbb {E}}}\Vert (u_i^\varepsilon )^{\alpha }\log u_i^\varepsilon \Vert _{L^2(0,T;L^2({\mathcal {O}}))} + C\sum _{i=1}^n{{\mathbb {E}}}\Vert (u_i^\varepsilon )^{2\alpha -1}\Vert _{L^1(0,T;L^1({\mathcal {O}}))}. \end{aligned}$$

It is sufficient to consider the case \(\alpha =1\), since the estimates for \(\alpha <1\) follow from this case. Then, using \(|u^\varepsilon |\le C(1+h(u^\varepsilon ))\),

$$\begin{aligned} {{\mathbb {E}}}&\Vert h(u^\varepsilon )\Vert _{L^\infty (0,T;L^1({\mathcal {O}}))} + {{\mathbb {E}}}\Vert u^\varepsilon \Vert _{L^\infty (0,T;L^1({\mathcal {O}}))}\nonumber \\&\quad \;+ \varepsilon {{\mathbb {E}}}\Vert Lw^\varepsilon \Vert _{L^\infty (0,T;L^2({\mathcal {O}}))}^2 + {{\mathbb {E}}}\Vert \nabla u^\varepsilon \Vert _{L^2(0,T;L^2({\mathcal {O}}))}^2 \nonumber \\&\le C + C\sum _{i=1}^n{{\mathbb {E}}}\Vert u_i^\varepsilon \log u_i^\varepsilon \Vert _{L^2(0,T;L^2({\mathcal {O}}))} + C{{\mathbb {E}}}\Vert u^\varepsilon \Vert _{L^1(0,T;L^1({\mathcal {O}}))}. \end{aligned}$$
(56)

Now, we use the following lemma, which is proved in Appendix A.

Lemma 36

Let \(d\ge 2\) and let \(v\in L^2(0,T;H^1({\mathcal {O}}))\) satisfy \(v\log v\in L^\infty (0,T;L^1({\mathcal {O}}))\). Then for any \(\delta >0\), there exists \(C(\delta )>0\) such that

$$\begin{aligned}&\Vert v\log v\Vert _{L^2(0,T;L^2({\mathcal {O}}))} \\&\quad \le \delta \big (\Vert v\log v\Vert _{L^1(0,T;L^1({\mathcal {O}}))} + \Vert v\Vert _{L^\infty (0,T;L^1({\mathcal {O}}))} + \Vert \nabla v\Vert _{L^2(0,T;L^2({\mathcal {O}}))}^2\big ) \\&\qquad + C(\delta )\Vert v\Vert _{L^1(0,T;L^1({\mathcal {O}}))}. \end{aligned}$$

It follows from (56) that, for any \(\delta >0\),

$$\begin{aligned} {{\mathbb {E}}}&\Vert h(u^\varepsilon )\Vert _{L^\infty (0,T;L^1({\mathcal {O}}))} + {{\mathbb {E}}}\Vert u^\varepsilon \Vert _{L^\infty (0,T;L^1({\mathcal {O}}))}\\&\quad \;+ \varepsilon {{\mathbb {E}}}\Vert Lw^\varepsilon \Vert _{L^\infty (0,T;L^2({\mathcal {O}}))}^2 + {{\mathbb {E}}}\Vert \nabla u^\varepsilon \Vert _{L^2(0,T;L^2({\mathcal {O}}))}^2 \\&\le C + C(\delta ){{\mathbb {E}}}\Vert u^\varepsilon \Vert _{L^1(0,T;L^1({\mathcal {O}}))} + \delta C\sum _{i=1}^n{{\mathbb {E}}}\Vert u_i^\varepsilon \log u_i^\varepsilon \Vert _{L^1(0,T;L^1({\mathcal {O}}))} \\&\quad \;+ \delta C\big ({{\mathbb {E}}}\Vert u^\varepsilon \Vert _{L^\infty (0,T;L^1({\mathcal {O}}))} + {{\mathbb {E}}}\Vert \nabla u^\varepsilon \Vert _{L^2(0,T;L^2({\mathcal {O}}))}^2\big ). \end{aligned}$$

For sufficiently small \(\delta >0\), the last terms on the right-hand side can be absorbed by the corresponding terms on the left-hand side, leading to

$$\begin{aligned} {{\mathbb {E}}}&\Vert h(u^\varepsilon )\Vert _{L^\infty (0,T;L^1({\mathcal {O}}))} + {{\mathbb {E}}}\Vert u^\varepsilon \Vert _{L^\infty (0,T;L^1({\mathcal {O}}))}\\&\quad \;+ \varepsilon {{\mathbb {E}}}\Vert Lw^\varepsilon \Vert _{L^\infty (0,T;L^2({\mathcal {O}}))}^2 + {{\mathbb {E}}}\Vert \nabla u^\varepsilon \Vert _{L^2(0,T;L^2({\mathcal {O}}))}^2 \\&\le C + C\int _0^T\Vert u^\varepsilon \Vert _{L^\infty (0,t;L^1({\mathcal {O}}))}\textrm{d}t \quad \text{ for } \text{ all } T>0. \end{aligned}$$
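Concretely, setting \(\varphi (T):={{\mathbb {E}}}\Vert u^\varepsilon \Vert _{L^\infty (0,T;L^1({\mathcal {O}}))}\) and noting that \(t\mapsto \Vert u^\varepsilon \Vert _{L^\infty (0,t;L^1({\mathcal {O}}))}\) is nondecreasing, the previous inequality implies in particular

$$\begin{aligned} \varphi (T) \le C + C\int _0^T\varphi (t)\textrm{d}t, \quad \text{ and } \text{ hence }\quad \varphi (T)\le Ce^{CT}\quad \text{ for } \text{ all } T>0, \end{aligned}$$

so that the remaining terms on the left-hand side are bounded uniformly in \(\varepsilon \) on every finite time interval.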

Gronwall’s lemma ends the proof. \(\square \)

In the case without self-diffusion, we have an \(H^1({\mathcal {O}})\) estimate for \((u_i^\varepsilon )^{1/2}\) only, and it can be seen that stochastic diffusion terms of the type \(\delta _{ij}u_i^\alpha \) for \(\alpha >1/2\) are not admissible. However, we may choose \(\sigma _{ij}(u)e_k = \delta _{ij}u_i^\alpha (1+u_i^\beta )^{-1}a_k\eta _k\) for \(1/2\le \alpha <1\) and \(\beta \ge \alpha /2\).
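The modified choice can again be illustrated numerically. The sketch below (sample values of \(\alpha \) and the grid are ours) checks that \(s(u)=u^\alpha (1+u^\beta )^{-1}\) with \(\beta =\alpha /2\) grows at most like \(C(1+u^{1/2})\), consistent with the \(H^1({\mathcal {O}})\) bound available for \((u_i^\varepsilon )^{1/2}\), while \(s(u)^2/u\) remains bounded:

```python
import numpy as np

# Ad hoc numerical illustration for s(u) = u^alpha/(1 + u^beta) with
# 1/2 <= alpha < 1 and beta = alpha/2 (constants and grid are ours).
u = np.logspace(-8.0, 8.0, 4000)
for alpha in (0.5, 0.7, 0.9):
    beta = alpha / 2.0
    s = u**alpha / (1.0 + u**beta)
    # Growth at most like C(1 + sqrt(u)):
    assert (s / (1.0 + np.sqrt(u))).max() < 2.0
    # Boundedness of s(u)^2/u (the J_2-type condition):
    assert (s**2 / u).max() < 2.0
```

Both ratios stay below 1 on this grid: for large u, \(s(u)/(1+\sqrt{u})\sim u^{(\alpha -1)/2}\) and \(s(u)^2/u\sim u^{\alpha -1}\) decay since \(\alpha <1\).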