1 Introduction

We are concerned with the numerical approximation of the 2D stochastic Navier–Stokes equations in a smooth bounded domain \(\mathcal {O}\subset \mathbb R^2\) supplemented with no-slip boundary conditions. They describe the flow of a homogeneous incompressible fluid in terms of the velocity field \(\textbf{u}\) and pressure function p defined on a filtered probability space \((\Omega ,{\mathfrak {F}},({\mathfrak {F}}_t),\mathbb P)\) and read as

$$\begin{aligned} \left\{ \begin{array}{ll} \mathrm d\textbf{u}=\mu \Delta \textbf{u}\, \textrm{d}t-(\textbf{u}\cdot \nabla )\textbf{u}\, \textrm{d}t-\nabla p\, \textrm{d}t+\Phi (\textbf{u})\mathrm dW &{} \text{ in } {\mathcal {O}}_T,\\ \textrm{div}\textbf{u}=0\qquad \qquad \qquad \qquad \qquad \,\,\,\,&{} \text{ in } {\mathcal {O}}_T,\\ \textbf{u}(0)=\textbf{u}_0\,\qquad \qquad \qquad \qquad \qquad &{} \text{ in } \mathcal {O},\end{array}\right. \end{aligned}$$
(1.1)

\(\mathbb P\)-a.s. in \({\mathcal {O}}_T:=(0,T)\times \mathcal {O}\), where \(T>0\), \(\mu >0\) is the viscosity and \(\textbf{u}_0\) is a given initial datum. The momentum equation is driven by a cylindrical Wiener process W and the diffusion coefficient \(\Phi \) takes values in the space of Hilbert-Schmidt operators; see Sect. 2.1 for details.

Existence, regularity and long-time behaviour of solutions to (1.1) have been studied extensively over the last three decades, and we refer to [23] for a complete picture. Most of the available results consider (1.1) with periodic boundary conditions. In some cases this choice merely simplifies the presentation. For instance, the existence of stochastically strong solutions to (1.1) is not affected by the boundary condition. For the spatial regularity of solutions, the situation is completely different:

  • If \(\mathcal {O}={\mathbb {T}}^2\) — the two-dimensional torus — and (1.1) is supplemented with periodic boundary conditions one can obtain estimates in any Sobolev space provided the data (initial datum and diffusion coefficient) are sufficiently regular; cf. [23, Corollary 2.4.13].

  • If, on the other hand, \(\mathcal {O}\) is a bounded domain with smooth boundary and (1.1) is supplemented with the no-slip boundary condition

    $$\begin{aligned} \textbf{u}=\textbf{0}\quad \text{ on } (0,T)\times \partial \mathcal {O}\text{, } \end{aligned}$$
    (1.2)

    it is still an open problem if the solution satisfies

    $$\begin{aligned} \mathbb E\big [\Vert \nabla \textbf{u}(T)\Vert _{L^2_x}^2\big ]<\infty \end{aligned}$$
    (1.3)

    for any given \(T<\infty \), cf. [18, 22]. Regularity estimates are only known up to a (possibly large) stopping time, and even with this restriction the spatial regularity seems limited; see Lemma 3.1 (c) and Remark 3.2.

Moment estimates such as (1.3) are crucial for the numerical analysis. If they are not available, it is unclear how to obtain convergence rates for a discretisation of (1.1). Consequently, most, if not all, available results are concerned with the space-periodic problem. In particular, it is shown in [5] and [12] for the space-periodic problem that for any \(\xi >0\)

$$\begin{aligned} \begin{aligned}&{\mathbb {P}}\bigg [\frac{\max _m\Vert {\textbf {u}}(t_m)-{\textbf {u}}_{h,m}\Vert _{L^2_x}^2+\sum _{m=1}^M \tau \Vert \nabla {\textbf {u}}(t_m)-\nabla {\textbf {u}}_{h,m}\Vert _{L^2_x}^2}{h^{2\beta }+\tau ^{2\alpha }}>\xi \bigg ]\rightarrow 0 \end{aligned} \end{aligned}$$
(1.4)

as \(h,\tau \rightarrow 0\) (where \(\alpha <\frac{1}{2}\) and \(\beta <1\) are arbitrary); see also [3, 4] for related results. Here \(\textbf{u}\) is the solution to (1.1) and \(\textbf{u}_{h,m}\) the approximation of \(\textbf{u}(t_m)\) with discretisation parameters \(\tau =T/M\) (time) and h (space). The relation (1.4) tells us that the convergence in probability is of order (almost) 1/2 in time and 1 in space. It seems to be an intrinsic feature of SPDEs with general non-Lipschitz nonlinearities such as (1.1) that the more common concept of a pathwise error (an error measured in \(L^2(\Omega )\)) is too strong (see [26] for first contributions). Hence (1.4) is the best result we can hope for. The proof of (1.4) is based on estimates in \(L^2(\Omega )\), which are localised with respect to the sample set. The size of the neglected sets shrinks asymptotically with respect to the discretisation parameters and is consequently not seen in (1.4). The localised \(L^2(\Omega )\)-estimates in question rely on an iterative argument in the m-th step of which one can only control the discrete solution up to the step \(m-1\) (to avoid problems with \(({\mathfrak {F}}_t)\)-adaptedness), while the continuous solution is estimated by means of the global regularity estimates being available in the periodic setting (recall the discussion above).

In this work, we consider for (1.1) the semi-implicit space-time discretisation scheme (4.1), with general stable mixed finite element pairings as detailed in (2.8)–(2.10). Many pairings that satisfy criterion (2.10) are available in the literature; see e.g. [6, 7, 17, 19]. The convective term is treated in a semi-implicit, symmetrized way, a well-known strategy from the deterministic setting that goes back to [29] and enhances the stability of the discretised nonlinearity for functions that are only discretely divergence-free; see (4.1)\(_2\). As a result, only linear (coupled) problems have to be solved in the m-th iteration. Because of the Dirichlet data for (1.1), the error analysis of scheme (4.1) is more difficult than in the periodic situation. In fact, in the Dirichlet case estimates in stronger norms for the solution of (1.1) are only known up to a (possibly large) stopping time since the equality \(\int _{{\mathcal O}} (\textbf{u}\cdot \nabla )\textbf{u}\cdot \Delta \textbf{u}\, \textrm{d}x=0\) is no longer available. When the Dirichlet case is incorporated into the framework of the localised estimates, the iterative argument just mentioned fails: controlling the continuous solution in the m-th step only until the time \(t_{m-1}\) is insufficient for the estimates, while “looking into” the interval \([t_{m-1},t_m]\) in this set-up destroys the martingale character of certain stochastic integrals we have to estimate. We overcome this problem by using an approach based on discrete stopping times, which replaces the localised \(L^2(\Omega )\)-estimates from earlier contributions. This allows us to control all quantities even in the interval \([t_{m-1},t_m]\) and, at the same time, preserves the martingale property of the stochastic integrals (see also the discussion in Remark 4.3). As a result we obtain ‘global-in-\(\Omega \)’ estimates up to the discrete stopping time; cf. Theorem 4.2.
The discrete stopping times are constructed such that they converge to T, where T can be any given end-time. Consequently, the convergence in probability as in (1.4) follows for the Dirichlet-case, see our main result in Theorem 4.4. We believe that this strategy will be of use also for other SPDEs with non-Lipschitz nonlinearities.

We work under the structural assumption of a solenoidal diffusion coefficient which vanishes at the boundary. This is crucial in the regularity estimate from Lemma 3.1 (b) in order to control the correction term \(V^N(t)\) in the proof. Due to the counterexamples concerning the regularity for stochastic PDEs in bounded domains, see [21], this seems to be unavoidable. In fact, the same assumptions are made in the analytical paper [18] on which we build.

2 Mathematical Framework

2.1 Probability Setup

Let \((\Omega ,\mathfrak F,(\mathfrak F_t)_{t\ge 0},\mathbb {P})\) be a stochastic basis with a complete, right-continuous filtration. The process W is a cylindrical \({\mathfrak {U}}\)-valued Wiener process, that is, \(W(t)=\sum _{j\ge 1}\beta _j(t) e_j\) with \((\beta _j)_{j\ge 1}\) being mutually independent real-valued standard Wiener processes relative to \((\mathfrak F_t)_{t\ge 0}\), and \((e_j)_{j\ge 1}\) a complete orthonormal system in a separable Hilbert space \(\mathfrak {U}\). Let us now give the precise definition of the diffusion coefficient \(\varPhi \) taking values in the set of Hilbert-Schmidt operators \(L_2(\mathfrak U;{\mathbb {H}})\), where \({\mathbb {H}}\) can take the role of various Hilbert spaces. We define \(L^2_{\textrm{div}}(\mathcal {O},\mathbb R^2)\) and \(W^{1,2}_{0,\textrm{div}}(\mathcal {O},\mathbb R^2)\) to be the closure of \(C^\infty _{c,\textrm{div}}(\mathcal {O},\mathbb R^2)\) – the solenoidal \(C^{\infty }_c(\mathcal {O},\mathbb R^2)\)-functions – in \(L^2(\mathcal {O},\mathbb R^2)\) and \(W^{1,2}_{0}(\mathcal {O},\mathbb R^2)\), respectively, see e.g. [15, Chapter III]. We also work with fractional Sobolev spaces \(W^{\sigma ,p}(0,T;X)\) for \(p\in (1,\infty )\) and \(\sigma \in (0,1)\) and a Banach space \((X;\Vert \cdot \Vert _X)\) with norm given by

$$\begin{aligned} \Vert f\Vert _{W^{\sigma ,p}(0,T;X)}^p:=\Vert f\Vert _{L^p(0,T;X)}^p+\int _{0}^T\int _{0}^T\frac{\Vert f(t)-f(s)\Vert _X^p}{|t-s|^{1+\sigma p}}\,\textrm{d}s\, \textrm{d}t. \end{aligned}$$

Similarly, \(W^{\sigma ,p}(\mathcal {O},\mathbb R^2)\) is the fractional Sobolev space with norm given by

$$\begin{aligned} \Vert v\Vert _{W^{\sigma ,p}_x}^p:=\Vert v\Vert _{L^p_x}^p+\int _{\mathcal {O}}\int _{\mathcal {O}}\frac{|v(x)-v(y)|^p}{|x-y|^{2+\sigma p}}\, \textrm{d}x\, \textrm{d}y. \end{aligned}$$

We assume that \(\Phi (\textbf{u})\in L_2(\mathfrak U;L^2_{\textrm{div}}(\mathcal {O},\mathbb R^2))\) for \(\textbf{u}\in L^2_{\textrm{div}}(\mathcal {O},\mathbb R^2)\), and \(\Phi (\textbf{u})\in L_2({\mathfrak {U}};W^{1,2}_{0,\textrm{div}}(\mathcal {O},\mathbb R^2))\) for \(\textbf{u}\in W^{1,2}_{0,\textrm{div}}(\mathcal {O},\mathbb R^2)\), together with

$$\begin{aligned} \Vert \Phi (\textbf{u})-\Phi (\textbf{v})\Vert _{L_2(\mathfrak U;L^2_x)}&\le \,c\Vert \textbf{u}-\textbf{v}\Vert _{L^2_x}\qquad \forall \textbf{u},\textbf{v}\in L^2_{\textrm{div}}(\mathcal {O},\mathbb R^2), \end{aligned}$$
(2.1)
$$\begin{aligned} \Vert \Phi (\textbf{u})\Vert _{L_2(\mathfrak U;W^{1,2}_x)}&\le \,c\big (1+\Vert \textbf{u}\Vert _{W^{1,2}_x}\big )\qquad \forall \textbf{u}\in W^{1,2}_{0,\textrm{div}}(\mathcal {O},\mathbb R^2), \end{aligned}$$
(2.2)
$$\begin{aligned} \Vert D\Phi (\textbf{u})\Vert _{L_2({\mathfrak {U}};{\mathcal {L}}( L^{2}_x;L^2_x))}&\le \,c\qquad \forall \textbf{u}\in L^{2}_{\textrm{div}}(\mathcal {O},\mathbb R^2). \end{aligned}$$
(2.3)

If we are interested in higher regularity, some further assumptions are in place and we require additionally that \(\Phi (\textbf{u})\in L_2({\mathfrak {U}};W^{2,2}(\mathcal {O},\mathbb R^2))\) for \(\textbf{u}\in W^{2,2}\cap W^{1,2}_{0,\textrm{div}}(\mathcal {O},\mathbb R^2)\), together with

$$\begin{aligned}&\Vert \Phi (\textbf{u})\Vert _{L_2(\mathfrak U;W^{2,2}_x)}\le \,c\big (1+\Vert \textbf{u}\Vert _{W^{1,4}_x}^2+\Vert \textbf{u}\Vert _{W^{2,2}_x}\big )\quad \forall \textbf{u}\in W^{2,2}\cap W^{1,2}_{0,\textrm{div}}(\mathcal {O},\mathbb R^2), \end{aligned}$$
(2.4)
$$\begin{aligned}&\Vert D^2\Phi (\textbf{u})\Vert _{L_2({\mathfrak {U}};{\mathcal {L}}( L^{2}_x\times L^2_x;L^2_x))}\le \,c\quad \forall \textbf{u}\in L^{2}_{\textrm{div}}(\mathcal {O},\mathbb R^2). \end{aligned}$$
(2.5)

Assumption (2.1) allows us to define stochastic integrals. Given an \(({\mathfrak {F}}_t)\)-adapted process \(\textbf{u}\in L^2(\Omega ;C([0,T];L^2_{\textrm{div}}(\mathcal {O})))\), the stochastic integral

$$\begin{aligned} t\mapsto \int _0^t\varPhi (\textbf{u})\,\textrm{d}W \end{aligned}$$

is a well-defined process taking values in \(L^2_{\textrm{div}}(\mathcal {O},\mathbb R^2)\); see [13] for a detailed construction. Moreover, we can multiply by test functions to obtain

$$\begin{aligned} \bigg (\int _0^t \varPhi (\textbf{u})\,\mathrm dW,{\varvec{\phi }}\bigg )_{L^2_x}=\sum _{j\ge 1} \int _0^t( \varPhi (\textbf{u}) e_j,{\varvec{\phi }})_{L^2_x}\,\mathrm d\beta _j \qquad \forall \, {\varvec{\phi }}\in L^2(\mathcal {O},\mathbb R^2). \end{aligned}$$

Similarly, we can define stochastic integrals with values in \(W^{1,2}_{0,\textrm{div}}(\mathcal {O},\mathbb R^2)\) and \(W^{2,2}(\mathcal {O},\mathbb R^2)\), respectively, if \(\textbf{u}\) belongs to the corresponding class.
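The pairing identity above can be illustrated in a finite-dimensional setting. The following is a minimal sketch, not the construction from [13]: it assumes the noise is truncated to `J` modes, the state space is replaced by \(\mathbb R^n\), and the integral is replaced by a left-point (Itô-type) sum on a uniform time grid. All names (`phi_ops`, `test_vec`, the helper functions) are hypothetical, and the random matrices only stand in for \(\Phi (\textbf{u}(t_m))\).

```python
import numpy as np

def truncated_stochastic_integral(phi_ops, dW):
    """Left-point sum  sum_m Phi_m dW_m  for a noise truncated to J modes.

    phi_ops: array (M, n, J); phi_ops[m] is Phi at time t_m as an n x J matrix,
             so column j plays the role of Phi e_j (illustrative assumption).
    dW:      array (M, J); increments of J independent Brownian motions.
    """
    return np.einsum("mnj,mj->n", phi_ops, dW)

def paired_stochastic_integral(phi_ops, dW, test_vec):
    """Sum over modes j of the scalar sums  sum_m (Phi_m e_j, test) d(beta_j)."""
    coeffs = np.einsum("mnj,n->mj", phi_ops, test_vec)  # (Phi_m e_j, test)
    return float(np.sum(coeffs * dW))

rng = np.random.default_rng(0)
M, n, J, T = 200, 5, 8, 1.0
tau = T / M
dW = rng.normal(scale=np.sqrt(tau), size=(M, J))  # Brownian increments
# Stand-in for Phi(u(t_m)); in an actual scheme phi_ops[m] may only depend
# on information available up to time t_m (adaptedness).
phi_ops = rng.normal(size=(M, n, J))
test_vec = rng.normal(size=n)

lhs = float(truncated_stochastic_integral(phi_ops, dW) @ test_vec)
rhs = paired_stochastic_integral(phi_ops, dW, test_vec)
```

In this discrete setting both sides agree up to rounding, which is the algebraic content of the identity; the analytical work lies in the passage to infinitely many modes and continuous time.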

2.2 The Concept of Solutions

In dimension two, pathwise uniqueness for analytically weak solutions is known under the assumption (2.1); we refer the reader for instance to Capiński–Cutland [11], Capiński [10]. Consequently, we may work with the definition of a weak pathwise solution.

Definition 2.1

Let \((\Omega ,\mathfrak {F},(\mathfrak {F}_t)_{t\ge 0},\mathbb {P})\) be a given stochastic basis with a complete right-continuous filtration and an \((\mathfrak {F}_t)\)-cylindrical Wiener process W. Let \(\textbf{u}_0\) be an \(\mathfrak {F}_0\)-measurable random variable with values in \(L^2_{\textrm{div}}(\mathcal {O},\mathbb R^2)\). Then \(\textbf{u}\) is called a weak pathwise solution to (1.1) with the initial condition \(\textbf{u}_0\) provided

  1. (a)

    the velocity field \(\textbf{u}\) is \((\mathfrak {F}_t)\)-adapted and

    $$\begin{aligned} \textbf{u}\in C_{\textrm{loc}}([0,\infty );L^2_{{{\,\textrm{div}\,}}}(\mathcal {O},\mathbb R^2))\cap L^2_{\textrm{loc}}(0,\infty ; W^{1,2}_{0,\textrm{div}}(\mathcal {O},\mathbb R^2))\quad \mathbb P\text {-a.s.}, \end{aligned}$$
  2. (b)

    the momentum equation

    $$\begin{aligned}&\int _{\mathcal {O}}\textbf{u}(t)\cdot {\varvec{\varphi }}\, \textrm{d}x-\int _{\mathcal {O}}\textbf{u}_0\cdot {\varvec{\varphi }}\, \textrm{d}x\\&\quad =-\int _0^t\int _{\mathcal {O}}(\textbf{u}\cdot \nabla )\textbf{u}\cdot {\varvec{\varphi }}\, \textrm{d}x\,\textrm{d}s-\mu \int _0^t\int _{\mathcal {O}}\nabla \textbf{u}:\nabla {\varvec{\varphi }}\, \textrm{d}x\,\textrm{d}s\\&\qquad +\int _0^t\int _{\mathcal {O}}\Phi (\textbf{u})\cdot {\varvec{\varphi }}\, \textrm{d}x\,\textrm{d}W \end{aligned}$$

    holds \(\mathbb P\)-a.s. for all \({\varvec{\varphi }}\in W^{1,2}_{0,{{\,\textrm{div}\,}}}(\mathcal {O},\mathbb R^2)\) and all \(t\ge 0\).

Theorem 2.2

Suppose that \(\Phi \) satisfies (2.1). Let \((\Omega ,\mathfrak {F},(\mathfrak {F}_t)_{t\ge 0},\mathbb {P})\) be a stochastic basis with a complete right-continuous filtration and an \((\mathfrak {F}_t)\)-cylindrical Wiener process W. Let \(\textbf{u}_0\) be an \(\mathfrak {F}_0\)-measurable random variable such that \(\textbf{u}_0\in L^r(\Omega ;L^2_{\textrm{div}}(\mathcal {O},\mathbb R^2))\) for some \(r>2\). Then there exists a unique weak pathwise solution to (1.1) in the sense of Definition 2.1 with the initial condition \(\textbf{u}_0\).

We give the definition of a strong pathwise solution to (1.1) which exists up to a stopping time \({\mathfrak {t}}\), cf. [18, 22]. The velocity field here belongs \(\mathbb P\)-a.s. to \(C([0,{\mathfrak {t}}];W^{1,2}_{0,{{\,\textrm{div}\,}}}(\mathcal {O},\mathbb R^2))\).

Definition 2.3

Let \((\Omega ,\mathfrak {F},(\mathfrak {F}_t)_{t\ge 0},\mathbb {P})\) be a stochastic basis with a complete right-continuous filtration and an \((\mathfrak {F}_t)\)-cylindrical Wiener process W. Let \(\textbf{u}_0\) be an \(\mathfrak {F}_0\)-measurable random variable with values in \(W^{1,2}_{0,{{\,\textrm{div}\,}}}(\mathcal {O},\mathbb R^2)\). The tuple \((\textbf{u},{\mathfrak {t}})\) is called a strong pathwise solution to (1.1) with the initial condition \(\textbf{u}_0\) provided

  1. (a)

    \({\mathfrak {t}}\) is a \(\mathbb P\)-a.s. strictly positive \(({\mathfrak {F}}_t)\)-stopping time;

  2. (b)

    the velocity field \(\textbf{u}\) is \((\mathfrak {F}_t)\)-adapted and

    $$\begin{aligned} \textbf{u}(\cdot \wedge {\mathfrak {t}}) \in C_{\textrm{loc}}([0,\infty );W^{1,2}_{0,{{\,\textrm{div}\,}}}(\mathcal {O},\mathbb R^2))\cap L^2_{\textrm{loc}}(0,\infty ;W^{2,2}(\mathcal {O},\mathbb R^2)) \quad \mathbb P\text {-a.s.}, \end{aligned}$$
  3. (c)

    the momentum equation

    $$\begin{aligned} \begin{aligned}&\int _{\mathcal {O}}\textbf{u}(t\wedge \mathfrak t)\cdot {\varvec{\varphi }}\, \textrm{d}x-\int _{\mathcal {O}}\textbf{u}_0\cdot {\varvec{\varphi }}\, \textrm{d}x\\&\quad =-\int _0^{t\wedge \mathfrak t}\int _{\mathcal {O}}(\textbf{u}\cdot \nabla )\textbf{u}\cdot {\varvec{\varphi }}\, \textrm{d}x\,\textrm{d}s+\mu \int _0^{t\wedge \mathfrak t}\int _{\mathcal {O}}\Delta \textbf{u}\cdot {\varvec{\varphi }}\, \textrm{d}x\,\textrm{d}s\\ {}&\qquad +\int _{\mathcal {O}}\int _0^{t\wedge \mathfrak t}\Phi (\textbf{u})\cdot {\varvec{\varphi }}\,\textrm{d}W\, \textrm{d}x\end{aligned} \end{aligned}$$
    (2.6)

    holds \(\mathbb P\)-a.s. for all \({\varvec{\varphi }}\in C^{\infty }_{c,{{\,\textrm{div}\,}}}(\mathcal {O},\mathbb R^2)\) and all \(t\ge 0\).

Note that (2.6) certainly implies the corresponding formulation in Definition 2.1. The reverse implication is only true for analytically strong solutions.

We finally define what a maximal strong pathwise solution is.

Definition 2.4

(Maximal strong pathwise solution) Fix a stochastic basis with a cylindrical Wiener process and an initial condition as in Definition 2.3. A triplet

$$\begin{aligned} (\textbf{u},(\mathfrak {t}_R)_{R\in \mathbb N},\mathfrak {t}) \end{aligned}$$

is a maximal strong pathwise solution to system (1.1) provided

  1. (a)

    \(\mathfrak {t}\) is a \(\mathbb P\)-a.s. strictly positive \((\mathfrak {F}_t)\)-stopping time;

  2. (b)

    \((\mathfrak {t}_R)_{R\in \mathbb {N}}\) is an increasing sequence of \((\mathfrak {F}_t)\)-stopping times such that \(\mathfrak {t}_R<\mathfrak {t}\) on the set \([\mathfrak {t}<\infty ]\), as well as \(\lim _{R\rightarrow \infty }\mathfrak {t}_R={\mathfrak {t}}\) \(\mathbb P\)-a.s., and

    $$\begin{aligned} {\mathfrak {t}}_R:=\inf \big \{t\in [0,\infty ):\,\,\Vert \textbf{u}(t)\Vert _{W^{1,2}_x}\ge R\big \}\quad \text {on}\quad [\mathfrak {t}<\infty ], \end{aligned}$$
    (2.7)

    with the convention that \(\mathfrak {t}_R=\infty \) if the set above is empty;

  3. (c)

    each tuple \((\textbf{u},\mathfrak {t}_R)\), for \(R\in \mathbb {N}\), is a local strong pathwise solution in the sense of Definition 2.3.
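The stopping times in (2.7) are first exceedance times of the \(W^{1,2}_x\)-norm. A minimal sketch of their discrete analogue on a time grid is given below; the function name and the sample values are hypothetical and only illustrate the convention \(\mathfrak {t}_R=\infty \) when the level R is never reached, together with the monotonicity in R.

```python
import math

def first_exceedance_time(times, norms, R):
    """Return the first grid time t_m with norms[m] >= R, or math.inf if none.

    Discrete analogue of t_R = inf{t >= 0 : ||u(t)||_{W^{1,2}} >= R},
    with the convention that the infimum of the empty set is infinity.
    """
    for t, value in zip(times, norms):
        if value >= R:
            return t
    return math.inf

times = [0.0, 0.1, 0.2, 0.3, 0.4]
norms = [0.5, 1.2, 2.7, 1.9, 3.4]  # hypothetical values of ||u(t_m)||_{W^{1,2}}
stopping_times = [first_exceedance_time(times, norms, R) for R in (1, 2, 3, 4)]
```

Because the decision at time \(t_m\) only uses the values up to \(t_m\), such a hitting time is adapted to the information available on the grid.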

We talk about a global solution if we have (in the framework of Definition 2.4) \({\mathfrak {t}}=\infty \) \(\mathbb P\)-a.s. Otherwise it is called a local solution. The following result concerning the existence of a global solution is shown in [24]; see also [18] for a similar statement. In the 3D case strong solutions are only known to exist locally, cf. [2, 8, 20].

Theorem 2.5

Suppose that (2.1)–(2.3) hold and that \(\textbf{u}_0\in L^2(\Omega ,W^{1,2}_{0,{{\,\textrm{div}\,}}}(\mathcal {O},\mathbb R^2))\). Then there is a unique global maximal strong pathwise solution to (1.1) in the sense of Definition 2.4.

2.3 Finite Elements

We work with a standard finite element set-up for incompressible fluid mechanics; see e.g. [17]. We denote by \(\mathscr {T}_h\) a quasi-uniform subdivision [6] of \(\mathcal {O}\) into triangles of maximal diameter \(h>0\). For \(K\subset \mathcal {O}\) and \(\ell \in \mathbb {N}_0\) we denote by \(\mathscr {P}_\ell (K)\) the polynomials on K of degree less than or equal to \(\ell \). Let us characterize the finite element spaces \(V^h(\mathcal {O},\mathbb R^2)\) and \(P^h(\mathcal {O})\) as

$$\begin{aligned} V^{h,i}(\mathcal {O},\mathbb R^2)&:= {\{{\textbf{v}_h \in W^{1,2}_0(\mathcal {O},\mathbb R^2)\,:\, \textbf{v}_h|_{K} \in (\mathscr {P}_i(K))^2\quad \forall K\in {\mathscr {T}}_h}\}}, \end{aligned}$$
(2.8)
$$\begin{aligned} P^{h,j}(\mathcal {O})&:={\{{\pi _h \in L^{2}(\mathcal {O})/\mathbb R\,:\, \pi _h|_{K} \in \mathscr {P}_j(K)\quad \forall K\in {\mathscr {T}}_h}\}}, \end{aligned}$$
(2.9)

where \(i,j\ge 0\). In order to guarantee stability of our approximation we relate \(V^h(\mathcal {O},\mathbb R^2)\) and \(P^h(\mathcal {O})\) by the discrete inf-sup condition, i.e., there exists a positive constant C not depending on h such that

$$\begin{aligned} \sup _{\textbf{v}_h\in V^{h,i}(\mathcal {O},\mathbb R^2)} \frac{\int _{\mathcal {O}}\textrm{div}\textbf{v}_h\,\pi _h\, \textrm{d}x}{\Vert \nabla \textbf{v}_h\Vert _{L^2_x}}\ge \,C\,\Vert \pi _h\Vert _{L^2_x}\quad \,\forall \pi _h\in P^{h,j}(\mathcal {O}) \, . \end{aligned}$$
(2.10)

A well-known class of inf-sup stable pairings are the ‘conforming Stokes elements’, with the simplest choice \(i=2\) in (2.8) and \(j=0\) in (2.9); see e.g. [7, Ch. 6] or [19, Rem. 3.4] for further admissible examples of pairings.
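For a concrete pairing, the constant in (2.10) can be checked numerically once the matrices are assembled. The sketch below assumes three hypothetical inputs that are not part of the source: the divergence matrix `B`, the symmetric positive definite velocity stiffness matrix `A` (gradient inner product), and the pressure mass matrix `M` on the mean-free pressure space. It uses the standard fact that the supremum over \(\textbf{v}_h\) equals \((q^\top BA^{-1}B^\top q)^{1/2}\), so the inf-sup constant is the square root of the smallest eigenvalue of \(BA^{-1}B^\top q=\lambda Mq\).

```python
import numpy as np

def discrete_inf_sup_constant(B, A, M):
    """Inf-sup constant of a velocity/pressure pair in matrix form.

    B: (m, n) divergence matrix, B[k, l] = int div(v_l) * pi_k dx
    A: (n, n) SPD velocity stiffness matrix (gradient inner product)
    M: (m, m) SPD pressure mass matrix
    Returns the square root of the smallest eigenvalue of
    B A^{-1} B^T q = lambda M q.
    """
    schur = B @ np.linalg.solve(A, B.T)      # pressure Schur complement
    L = np.linalg.cholesky(M)                # M = L L^T
    L_inv = np.linalg.inv(L)
    reduced = L_inv @ schur @ L_inv.T        # equivalent standard eigenproblem
    eigenvalues = np.linalg.eigvalsh(0.5 * (reduced + reduced.T))
    return float(np.sqrt(max(eigenvalues[0], 0.0)))

# Toy matrices (not a finite element assembly), chosen so the result is known.
A = np.eye(3)
M = np.eye(2)
B = np.array([[2.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
beta = discrete_inf_sup_constant(B, A, M)
```

Evaluating this quantity on a sequence of refined meshes and observing that it stays bounded away from zero is a common empirical check of (2.10); it does not replace a proof.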

We define the space of discretely solenoidal finite element functions by

$$\begin{aligned} V^{h,i}_{\textrm{div}}(\mathcal {O},\mathbb R^2)&:= \bigg \{\textbf{v}_h\in V^{h,i}(\mathcal {O},\mathbb R^2):\,\,\int _{\mathcal {O}}\textrm{div}\textbf{v}_h\,\,\pi _h\, \textrm{d}x=0\quad \forall \pi _h\in P^{h,j}(\mathcal {O})\bigg \}. \end{aligned}$$

Let \(\Pi _h:L^2(\mathcal {O},\mathbb R^2)\rightarrow V_{\textrm{div}}^{h,i}(\mathcal {O},\mathbb R^2)\) be the \(L^2(\mathcal {O},\mathbb R^2)\)-orthogonal projection onto \(V_{\textrm{div}}^{h,i}(\mathcal {O},\mathbb R^2)\). The following results concerning the approximability of \(\Pi _h\) are well-known (see, for instance [19, Lemma 4.3]): there is \(c>0\) independent of h such that we have

$$\begin{aligned} \Vert \textbf{v}-\Pi _h \textbf{v}\Vert _{L^2_x}+ h\Vert \nabla \textbf{v}-\nabla \Pi _h \textbf{v}\Vert _{L^2_x}&\le \,c\,h \Vert \nabla \textbf{v}\Vert _{L^2_x} \end{aligned}$$
(2.11)

for all \(\textbf{v}\in W^{1,2}_{0,\textrm{div}}(\mathcal {O},\mathbb R^2)\); moreover, the arguments in [19, Section 4] together with standard interpolation arguments (see e.g. [17, Lemma A.2]) also imply for \(\beta \in (0,1]\) that

$$\begin{aligned} \Vert \textbf{v}-\Pi _h \textbf{v}\Vert _{L^2_x}+ h\Vert \nabla \textbf{v}-\nabla \Pi _h \textbf{v}\Vert _{L^2_x}&\le \,c\,h^{1+\beta } \Vert \textbf{v}\Vert _{W^{1+\beta ,2}_x} \end{aligned}$$
(2.12)

for all \(\textbf{v}\in W^{1+\beta ,2}\cap W^{1,2}_{0,\textrm{div}}(\mathcal {O},\mathbb R^2)\). Similarly, if \(\Pi _h^\pi :L^2(\mathcal {O})/\mathbb R\rightarrow P^{h,j}(\mathcal {O})\) denotes the \(L^2(\mathcal {O})\)-orthogonal projection onto \(P^{h,j}(\mathcal {O})\), we have

$$\begin{aligned} \Vert p-\Pi _h^\pi p\Vert _{L^2_x}&\le \,c h\, \Vert \nabla p\Vert _{L^2_x} \end{aligned}$$
(2.13)

for all \(p\in W^{1,2}(\mathcal {O})/\mathbb R\).
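The first-order rate in h for the pressure projection can be observed in a simplified setting. The sketch below is only an illustration under strong assumptions: it works on the interval (0, 1) instead of a triangulated 2D domain, uses piecewise constants (j = 0), and approximates integrals by a midpoint rule; the function `projection_error` and the test function are hypothetical.

```python
import numpy as np

def projection_error(p, num_cells, quad_points=200):
    """L2(0,1) error of the L2-orthogonal projection onto piecewise constants."""
    h = 1.0 / num_cells
    # Midpoint quadrature nodes inside each cell, shape (num_cells, quad_points).
    offsets = (np.arange(quad_points) + 0.5) / quad_points
    x = (np.arange(num_cells)[:, None] + offsets[None, :]) * h
    values = p(x)
    # The L2 projection onto constants on a cell is the cell average.
    cell_means = values.mean(axis=1, keepdims=True)
    squared_error = np.sum((values - cell_means) ** 2) * h / quad_points
    return float(np.sqrt(squared_error))

p = lambda x: np.sin(2.0 * np.pi * x)   # smooth, mean-free test function
errors = [projection_error(p, n) for n in (8, 16, 32)]
ratios = [errors[k] / errors[k + 1] for k in range(2)]
```

Halving h roughly halves the error, so the entries of `ratios` are close to 2, in line with a bound of the form \(c\,h\,\Vert \nabla p\Vert _{L^2_x}\).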

3 Regularity of Solutions

In this section we analyse the regularity of the continuous solution as well as the associated pressure function. For various purposes we need the Helmholtz projection \({\mathcal {P}}:L^p(\mathcal {O},\mathbb R^2)\rightarrow L^p_{\textrm{div}}(\mathcal {O},\mathbb R^2)\), for \(1<p<\infty \), given by

$$\begin{aligned} {\mathcal {P}}{\varvec{\phi }}:={\varvec{\phi }}-\nabla \Delta _{{\mathcal {O}}}^{-1}\textrm{div}{\varvec{\phi }}. \end{aligned}$$
(3.1)

Here \(\Delta ^{-1}_{{\mathcal {O}}}\textrm{div}\) is the solution operator to the equation

$$\begin{aligned} \Delta \mathfrak {h}=\textrm{div}\textbf{g}\quad \text {in}\quad {\mathcal {O}},\quad \nu _{{\mathcal {O}}}\cdot (\nabla \mathfrak h-\textbf{g})=0\quad \text {on}\quad \partial {\mathcal {O}}, \end{aligned}$$

where \(\nu _{{\mathcal {O}}}\) denotes the unit normal of \(\partial {\mathcal {O}}\). Note that \(\nabla \Delta ^{-1}_{\mathcal {O}}\textrm{div}\) satisfies (since \(\partial \mathcal {O}\) was assumed to be sufficiently smooth)

$$\begin{aligned} \nabla \Delta ^{-1}_{\mathcal {O}}\textrm{div}&:W^{r,p}(\mathcal {O},\mathbb R^2)\rightarrow W^{r,p}(\mathcal {O},\mathbb R^2), \end{aligned}$$
(3.2)

for all \(p\in (1,\infty )\) and all \(r\in \mathbb N_0\), where \(W^{0,p}(\mathcal {O},\mathbb R^2)=L^p(\mathcal {O},\mathbb R^2)\); see [1] for the case \(r\in \mathbb N\) and [15, Chapter IV] for the case \(r=0\). Clearly, (3.2) transfers to \({\mathcal {P}}\).
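The structure of (3.1) is easiest to see in a setting where \(\Delta ^{-1}\) is explicit. The sketch below swaps the bounded domain with its Neumann problem for the torus \([0,2\pi )^2\), which is an assumption made purely for illustration: there the projection acts on Fourier coefficients as \(\hat{{\varvec{\phi }}}(k)-k\,(k\cdot \hat{{\varvec{\phi }}}(k))/|k|^2\). The function names and the test field are hypothetical, and the test field is band-limited so that the Nyquist mode plays no role.

```python
import numpy as np

def _wave_numbers(n):
    k = np.fft.fftfreq(n, d=1.0 / n)          # integer wave numbers
    return np.meshgrid(k, k, indexing="ij")

def helmholtz_projection_periodic(field):
    """Project a periodic 2D vector field of shape (2, n, n) onto
    divergence-free fields via  phi_hat - k (k . phi_hat) / |k|^2."""
    kx, ky = _wave_numbers(field.shape[1])
    k_sq = kx**2 + ky**2
    k_sq[0, 0] = 1.0                          # avoid 0/0; the mean is unchanged
    f_hat = np.fft.fft2(field, axes=(1, 2))
    k_dot_f = kx * f_hat[0] + ky * f_hat[1]
    projected_hat = np.stack([f_hat[0] - kx * k_dot_f / k_sq,
                              f_hat[1] - ky * k_dot_f / k_sq])
    return np.real(np.fft.ifft2(projected_hat, axes=(1, 2)))

def spectral_divergence(field):
    """Divergence of a periodic field, computed with Fourier differentiation."""
    kx, ky = _wave_numbers(field.shape[1])
    f_hat = np.fft.fft2(field, axes=(1, 2))
    return np.real(np.fft.ifft2(1j * (kx * f_hat[0] + ky * f_hat[1])))

n = 32
x = 2.0 * np.pi * np.arange(n) / n
X, Y = np.meshgrid(x, x, indexing="ij")
field = np.stack([np.sin(X) * np.cos(Y) + np.cos(2.0 * Y),
                  np.cos(X) * np.sin(Y) + np.sin(X)])   # divergence 2 cos(X) cos(Y)
projected = helmholtz_projection_periodic(field)
```

The projected field has vanishing discrete divergence and is left unchanged by a second application of the projection. On a bounded domain the Fourier multiplier has to be replaced by a solver for the Neumann problem stated above.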

With the help of the Helmholtz projection we can define the Stokes operator as

$$\begin{aligned} {\mathcal {A}}:={\mathcal {P}}\Delta :W^{2,p}\cap W^{1,p}_{0,\textrm{div}}(\mathcal {O},\mathbb R^2)\rightarrow L^p_{\textrm{div}}(\mathcal {O},\mathbb R^2). \end{aligned}$$
(3.3)

Due to well-known estimates for the Stokes system there is \(c>0\) such that

$$\begin{aligned} \Vert \textbf{u}\Vert _{W^{r+2,p}_x}\le \,c\,\Vert \mathcal A\textbf{u}\Vert _{W^{r,p}_x},\quad \textbf{u}\in W^{r+2,p}\cap W^{1,p}_{0,\textrm{div}}(\mathcal {O},\mathbb R^2), \end{aligned}$$
(3.4)

for all \(p\in (1,\infty )\) and all \(r\in \mathbb N_0\), see, e.g., [15, Thm. IV. 6.1.], which uses sufficient smoothness of \(\partial \mathcal {O}\). Moreover, there is a system of eigenfunctions to the Stokes operator \((\textbf{u}_k)\subset W^{1,2}_{0,\textrm{div}}(\mathcal {O},\mathbb R^2)\) with strictly positive eigenvalues \((\lambda _k)\) such that \(\lambda _k\rightarrow \infty \) as \(k\rightarrow \infty \). It is possible to choose the \(\textbf{u}_k\)’s such that the system \((\textbf{u}_k)\) is orthonormal in \(L^2(\mathcal {O},\mathbb R^2)\) and orthogonal in \(W^{1,2}_0(\mathcal {O},\mathbb R^2)\). Finally, we can assume that the \(\textbf{u}_k\)’s are sufficiently smooth due to the assumed smoothness of \(\partial \mathcal {O}\). Since \(\mathcal A\) is positive, its root \(\mathcal A^{1/2}\) is well-defined with domain \(W^{1,p}_{0,\textrm{div}}(\mathcal {O},\mathbb R^2)\), and we have

$$\begin{aligned}&\Vert \nabla \textbf{u}\Vert _{L^p_x}\le c\big \Vert \mathcal A^{1/2}\textbf{u}\big \Vert _{L^p_x}\le C\Vert \nabla \textbf{u}\Vert _{L^p_x},\quad \textbf{u}\in W^{1,p}_{0,\textrm{div}}(\mathcal {O},\mathbb R^2), \end{aligned}$$
(3.5)
$$\begin{aligned}&\int _\mathcal {O}\mathcal A^{1/2}\textbf{u}\cdot \textbf{w}\, \textrm{d}x=\int _{\mathcal {O}}\textbf{u}\cdot \mathcal A^{1/2} \textbf{w}\, \textrm{d}x,\quad \textbf{u}\in W^{1,p}_{0,\textrm{div}}(\mathcal {O},\mathbb R^2),\,\,\textbf{w}\in W^{1,p'}_{0,\textrm{div}}(\mathcal {O},\mathbb R^2), \end{aligned}$$
(3.6)

where \(c,C>0\); cf. [16].

3.1 Estimates for the Continuous Solution

In this section we derive crucial estimates for the maximal strong pathwise solution from Definition 2.4, which hold up to the stopping time \({\mathfrak {t}}_R\). Here \(R>0\) is a fixed truncation parameter and \(T>0\) an arbitrary but fixed time.

Lemma 3.1

Let \((\Omega ,\mathfrak {F},(\mathfrak {F}_t)_{t\ge 0},\mathbb {P})\) be a given stochastic basis with a complete right-continuous filtration and an \((\mathfrak {F}_t)\)-cylindrical Wiener process W.

  1. (a)

    Assume that \(\textbf{u}_0\in L^r(\Omega ,L^{2}_{\textrm{div}}(\mathcal {O},\mathbb R^2))\) for some \(r\ge 2\) and that \(\Phi \) satisfies (2.1). Then we have

    $$\begin{aligned} \mathbb E\bigg [\bigg (\sup _{0\le t\le T}\Vert \textbf{u}(t)\Vert _{L^2_x}^2+\int _0^{T}\Vert \nabla \textbf{u}\Vert _{L^2_x}^2\, \textrm{d}t\bigg )^{\frac{r}{2}}\bigg ]\le \,c\,\mathbb E\Big [1+\Vert \textbf{u}_0\Vert _{L^2_x}^r\Big ], \end{aligned}$$
    (3.7)

    where \(\textbf{u}\) is the weak pathwise solution to (1.1); cf. Definition 2.1.

  2. (b)

    Assume that \(\textbf{u}_0\in L^r(\Omega ,W^{1,2}_{0,\textrm{div}}(\mathcal {O},\mathbb R^2))\) for some \(r\ge 2\) and that \(\Phi \) satisfies (2.1)–(2.3). Then we have

    $$\begin{aligned} \begin{aligned} \mathbb E\bigg [\bigg (\sup _{0\le t\le T}\Vert \textbf{u}(t\wedge \mathfrak t_R)\Vert ^2_{W^{1,2}_x}&+\int _0^{T\wedge \mathfrak t_R}\Vert \textbf{u}\Vert ^2_{W^{2,2}_x}\, \textrm{d}t\bigg )^{\frac{r}{2}}\bigg ]\\&\le \,cR^{3r}\,\mathbb E\Big [1+\Vert \textbf{u}_0\Vert _{W^{1,2}_x}^r\Big ], \end{aligned} \end{aligned}$$
    (3.8)

    where \((\textbf{u},({\mathfrak {t}}_R)_{R\in \mathbb N},{\mathfrak {t}})\) is the maximal strong pathwise solution to (1.1); cf. Definition 2.4.

  3. (c)

    Assume that \(\textbf{u}_0\in L^r(\Omega ,W^{2,2}(\mathcal {O},\mathbb R^2))\cap L^{5r}(\Omega ,W^{1,2}_{0,\textrm{div}}(\mathcal {O},\mathbb R^2))\) for some \(r\ge 2\), that \({\mathcal {A}}\textbf{u}_0-{\mathcal {P}}(\textbf{u}_0\cdot \nabla \textbf{u}_0)|_{\partial {\mathcal {O}}}=0\) \({\mathbb {P}}\)-a.s., and that (2.1)–(2.5) hold. Then we have for all \(\beta <1\)

    $$\begin{aligned} \begin{aligned} \mathbb E\bigg [\bigg (\sup _{0\le t\le T}\Vert \textbf{u}(t\wedge \mathfrak t_R)\Vert _{W^{1+\beta ,2}_x}^2&+\int _0^{T\wedge \mathfrak t_R}\Vert \textbf{u}\Vert ^2_{W^{2+\beta ,2}_x}\, \textrm{d}t\bigg )^{\frac{r}{2}}\bigg ]\\ {}&\le \,cR^{5r}\,\mathbb E\Big [1+\Vert \textbf{u}_0\Vert _{W^{2,2}_x}^r+\Vert \textbf{u}_0\Vert _{W^{1,2}_x}^{2r}\Big ], \end{aligned} \end{aligned}$$
    (3.9)

    where \((\textbf{u},({\mathfrak {t}}_R)_{R\in \mathbb N},{\mathfrak {t}})\) is the maximal strong pathwise solution to (1.1); cf. Definition 2.4.

Here \(c=c(r,T,\beta )>0\) is independent of R.

Proof

Part (a) is the standard a priori estimate, which is a consequence of applying Itô’s formula to \(t\mapsto \Vert \textbf{u}\Vert _{L^2_x}^2\).
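The identity behind part (a) can be sketched as follows. This is a formal computation, assuming that Itô’s formula may be applied directly (a rigorous argument would typically pass through an approximation), and it is added here for orientation rather than taken from the source. Since \(\int _{\mathcal {O}}(\textbf{u}\cdot \nabla )\textbf{u}\cdot \textbf{u}\, \textrm{d}x=0\) for \(\textbf{u}\in W^{1,2}_{0,\textrm{div}}(\mathcal {O},\mathbb R^2)\), one expects

$$\begin{aligned} \Vert \textbf{u}(t)\Vert _{L^2_x}^2+2\mu \int _0^t\Vert \nabla \textbf{u}\Vert _{L^2_x}^2\,\textrm{d}s&=\Vert \textbf{u}_0\Vert _{L^2_x}^2+2\sum _{j\ge 1}\int _0^t\big (\Phi (\textbf{u})e_j,\textbf{u}\big )_{L^2_x}\,\textrm{d}\beta _j\\&\quad +\int _0^t\Vert \Phi (\textbf{u})\Vert _{L_2(\mathfrak U;L^2_x)}^2\,\textrm{d}s. \end{aligned}$$

By (2.1) we have \(\Vert \Phi (\textbf{u})\Vert _{L_2(\mathfrak U;L^2_x)}\le c(1+\Vert \textbf{u}\Vert _{L^2_x})\), so the last term can be absorbed by Gronwall’s lemma, while the stochastic integral is handled with the Burkholder–Davis–Gundy inequality after taking the supremum in time and the power \(\frac{r}{2}\).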

For part (b) we follow [24], where the solution to a truncated problem is considered. For \(R>1\) and \(\zeta \in C_c^\infty ([0,2))\) with \(0\le \zeta \le 1\) and \(\zeta =1\) in [0, 1] we set \(\zeta _R:=\zeta (R^{-1}\cdot )\). Similar to Definition 2.1 we seek an \((\mathfrak {F}_t)\)-adapted stochastic process \(\textbf{u}^R\) with

$$\begin{aligned} \textbf{u}^R \in C([0,T];L^2_{{{\,\textrm{div}\,}}}(\mathcal {O},\mathbb R^2))\cap L^2(0,T; W^{1,2}_{0,{{\,\textrm{div}\,}}}(\mathcal {O},\mathbb R^2))\quad \mathbb P\text {-a.s.} \end{aligned}$$

such that

$$\begin{aligned}&\int _{\mathcal {O}}\textbf{u}^R(t)\cdot {\varvec{\varphi }}\, \textrm{d}x=\int _{\mathcal {O}}\textbf{u}_0\cdot {\varvec{\varphi }}\, \textrm{d}x+\int _0^t \zeta _R(\Vert \nabla \textbf{u}^{R}\Vert _{L^2_x}) \int _{\mathcal {O}}\textbf{u}^{R}\otimes \textbf{u}^{R} :\nabla {\varvec{\varphi }}\, \textrm{d}x\,\textrm{d}s\nonumber \\&\quad -\mu \int _0^t\int _{\mathcal {O}}\nabla \textbf{u}^R:\nabla {\varvec{\varphi }}\, \textrm{d}x\,\textrm{d}s+\int _0^t\int _{\mathcal {O}}\Phi (\textbf{u}^R)\cdot {\varvec{\varphi }}\, \textrm{d}x\,\textrm{d}W \end{aligned}$$
(3.10)

holds \(\mathbb P\)-a.s. for all \({\varvec{\varphi }}\in W^{1,2}_{0,{{\,\textrm{div}\,}}}(\mathcal {O},\mathbb R^2)\) and all \(t\in [0,T]\). Arguing as in [24, Lemma 3.7] one can show that a unique global strong pathwise solution to (3.10) exists in the class \(C([0,T];W^{1,2}_{0,\textrm{div}}(\mathcal {O},\mathbb R^2))\) (see Footnote 1), and that it satisfies

$$\begin{aligned} \begin{aligned}&\mathbb E\bigg [\sup _{0\le t\le T}\Vert \nabla \textbf{u}^R(t)\Vert _{L^2_x}^2+\int _{0}^T\Vert \nabla ^2\textbf{u}^R\Vert _{L^2_x}^2\, \textrm{d}t\bigg ]\le \,c(r,R,T). \end{aligned} \end{aligned}$$
(3.11)

The proof of (3.11) in [24] is based on a Galerkin approximation which we mimic now in order to prove (3.8) and (3.9).

1) Galerkin approximation. Let \((\textbf{u}_k)\subset W^{1,2}_{0,\textrm{div}}(\mathcal {O},\mathbb R^2)\) be a system of eigenfunctions to the Stokes operator, cf. (3.3). For \(N\in \mathbb N\) let \(\mathbb H^N:=\textrm{span}\{\textbf{u}_1,\dots ,\textbf{u}_N\}\), and consider the unique solution \(\textbf{u}^{R,N}\) to

$$\begin{aligned} \begin{aligned} \int _{\mathcal {O}}\textbf{u}^{R,N}(t)\cdot {\varvec{\varphi }}\, \textrm{d}x=&\,\int _{\mathcal {O}}\textbf{u}_0\cdot {\varvec{\varphi }}\, \textrm{d}x-\mu \int _0^t\int _{\mathcal {O}}\nabla \textbf{u}^{R,N}:\nabla {\varvec{\varphi }}\, \textrm{d}x\,\textrm{d}s\\&+\int _0^t\zeta _R(\Vert \nabla \textbf{u}^{R,N}\Vert _{L^2_x})\int _{\mathcal {O}}\textbf{u}^{R,N}\otimes \textbf{u}^{R,N} :\nabla {\varvec{\varphi }}\, \textrm{d}x\,\textrm{d}s\\&+\int _0^t\int _{\mathcal {O}}\Phi (\textbf{u}^{R,N})\cdot {\varvec{\varphi }}\, \textrm{d}x\,\textrm{d}W \end{aligned} \end{aligned}$$
(3.12)

for all \({\varvec{\varphi }}\in {\mathbb {H}}^N\). By \({\mathcal {P}}_N\) we denote the \(L^2(\mathcal {O},\mathbb R^2)\)-projection onto \({\mathbb {H}}^N\). Problem (3.12) can be written as a system of SDEs with Lipschitz-continuous coefficients. Hence it is clear that there is a unique strong solution, i.e., an \(({\mathfrak {F}}_t)\)-adapted process defined on \((\Omega ,{\mathfrak {F}},{\mathbb {P}})\) with values in \(C([0,T];{\mathbb {H}}^N)\) and moments of order r. Arguing as in [24, Prop. 3.2] one can prove that as \(N\rightarrow \infty \)

$$\begin{aligned} \sup _{0\le t\le T}\Vert \textbf{u}^R(t)-\textbf{u}^{R,N}(t)\Vert _{L^2_x}^2&+\int _{0}^T\Vert \nabla (\textbf{u}^R-\textbf{u}^{R,N})\Vert _{L^2_x}^2\, \textrm{d}t\rightarrow 0 \end{aligned}$$
(3.13)

in probability. Applying Itô’s formula to \(t\mapsto \Vert \textbf{u}^{R,N}\Vert _{L^2_x}^2\) and using the cancellation of the convective term one can prove for \(r\ge 2\)

$$\begin{aligned} \mathbb E\bigg [\bigg (\sup _{0\le t\le T}\Vert \textbf{u}^{R,N}(t)\Vert _{L^2_x}^2+\int _{0}^T\Vert \nabla \textbf{u}^{R,N}\Vert _{L^2_x}^2\, \textrm{d}t\bigg )^{\frac{r}{2}}\bigg ]\le \,c\,\mathbb E\Big [1+\Vert \textbf{u}_0\Vert _{L^2_x}^r\Big ], \end{aligned}$$
(3.14)

where \(c=c(r,T)\) is independent of N and R.
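
The cancellation used for (3.14) can be made explicit. The following is a sketch of the standard computation, assuming that \(\textbf{v}\in {\mathbb {H}}^N\) is regular enough to integrate by parts; such a \(\textbf{v}\) is divergence-free and vanishes on \(\partial \mathcal {O}\), so that

$$\begin{aligned} \int _{\mathcal {O}}(\textbf{v}\cdot \nabla )\textbf{v}\cdot \textbf{v}\, \textrm{d}x=\frac{1}{2}\int _{\mathcal {O}}\textbf{v}\cdot \nabla |\textbf{v}|^2\, \textrm{d}x=-\frac{1}{2}\int _{\mathcal {O}}(\textrm{div}\,\textbf{v})\,|\textbf{v}|^2\, \textrm{d}x=0. \end{aligned}$$

With \(\textbf{v}=\textbf{u}^{R,N}(t)\) the convective term drops out of the Itô expansion of \(\Vert \textbf{u}^{R,N}\Vert _{L^2_x}^2\) whatever the value of \(\zeta _R\), which is consistent with the constant in (3.14) being independent of R.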

2) Proof of (3.8). By construction we have \({\mathcal {A}}\textbf{u}^{R,N}\in C([0,T];{\mathbb {H}}^N)\) \(\mathbb P\)-a.s. such that we can apply Itô’s formula to \(t\mapsto (\textbf{u}^{R,N}(t),\mathcal A\textbf{u}^{R,N}(t))_{L^2_x}\) and use (3.12). This yields using \(\textbf{u}^{R,N}|_{\partial \mathcal {O}}=0\)

$$\begin{aligned}&\Vert \nabla \textbf{u}^{R,N}(t)\Vert _{L^2_x}^2=- \big (\textbf{u}^{R,N}(t),\Delta \textbf{u}^{R,N}(t)\big )_{L^2_x}=- \big (\textbf{u}^{R,N}(t),\mathcal A\textbf{u}^{R,N}(t)\big )_{L^2_x}\nonumber \\&\quad =\Vert \mathcal P_N\nabla \textbf{u}_0\Vert ^2_{L^2_x} +2\int _0^t \zeta _R(\Vert \nabla \textbf{u}^{R,N}\Vert _{L^2_x})\big ((\textbf{u}^{R,N}\cdot \nabla )\textbf{u}^{R,N},\mathcal A\textbf{u}^{R,N}\big )_{L^2_x}\,\textrm{d}s\nonumber \\&\qquad -2\mu \int _0^t\Vert \mathcal A\textbf{u}^{R,N}\Vert _{L^2_x}^2\,\textrm{d}s+2\sum _{k=1}^N\int _0^t\big (\Phi (\textbf{u}^{R,N})e_k,\mathcal A\textbf{u}^{R,N}\big )_{L^2_x}\,\textrm{d}\beta _k\nonumber \\&\qquad +\sum _{k=1}^N\lambda _k\int _0^t\big (\Phi (\textbf{u}^{R,N})e_k,\textbf{u}_k\big )_{L^2_x}^2\,\textrm{d}s\nonumber \\&\quad =:\textrm{I}^N(t)+\dots +\textrm{V}^N(t) \end{aligned}$$
(3.15)

\({\mathbb {P}}\)-a.s. for all \(t\in [0,T].\) We estimate now the terms \(\textrm{II}^N\), \(\textrm{IV}^N\) and \(\textrm{V}^N\). First of all, we have by definition of \(\zeta _R\)

$$\begin{aligned} \textrm{II}^N(t)&\le 2\int _0^t\zeta _R(\Vert \nabla \textbf{u}^{R,N}\Vert _{L^2_x})\Vert \textbf{u}^{R,N}\Vert _{L^4_x}\Vert \nabla \textbf{u}^{R,N}\Vert _{L^4_x}\Vert \mathcal A\textbf{u}^{R,N}\Vert _{L^2_x}\textrm{d}s\\&\le 2\int _0^t\zeta _R(\Vert \nabla \textbf{u}^{R,N}\Vert _{L^2_x})\Vert \textbf{u}^{R,N}\Vert _{L^2_x}^{\frac{1}{2}}\Vert \nabla \textbf{u}^{R,N}\Vert _{L^2_x}\Vert \mathcal A\textbf{u}^{R,N}\Vert ^{\frac{3}{2}}_{L^2_x}\textrm{d}s\\&\le cR^{3/2}\int _0^t\Vert \mathcal A\textbf{u}^{R,N}\Vert ^{\frac{3}{2}}_{L^2_x}\textrm{d}s\le \delta \int _0^t\Vert \mathcal A\textbf{u}^{R,N}\Vert ^{2}_{L^2_x}\textrm{d}s+c(\delta )R^6, \end{aligned}$$

where \(\delta >0\) is arbitrary. Moreover, we obtain by definition of \(\textbf{u}_k\) and using (3.6) (and recalling that \(\Phi (\textbf{u}^{R,N})e_k\in W^{1,2}_{0,\textrm{div}}(\mathcal {O},\mathbb R^2)\) for all \(k\in \mathbb N\) by assumption)

$$\begin{aligned} \textrm{V}^N(t){} & {} =\sum _{k=1}^N\int _0^t\big (\Phi (\textbf{u}^{R,N})e_k,\sqrt{\lambda _k}\textbf{u}_k\big )_{L^2_x}^2\,\textrm{d}s=\sum _{k=1}^N\int _0^t\big (\Phi (\textbf{u}^{R,N})e_k, \mathcal A^{1/2}\textbf{u}_k\big )^2\,\textrm{d}s \\{} & {} =\sum _{k=1}^N\int _0^t\big (\mathcal A^{1/2}\Phi (\textbf{u}^{R,N})e_k,\textbf{u}_k\big )_{L^2_x}^2\,\textrm{d}s. \end{aligned}$$

Furthermore, since \(\Vert \textbf{u}_k\Vert _{L^2_x}=1\),

$$\begin{aligned} \textrm{V}^N(t){} & {} \le \sum _{k\ge 1}\int _0^t\Vert \mathcal A^{1/2}\Phi (\textbf{u}^{R,N})e_k\Vert _{L^2_x}^2\Vert \textbf{u}_k\Vert ^2_{L^2_x}\,\textrm{d}s \le \,c\sum _{k\ge 1}\int _0^t\Vert \nabla \Phi (\textbf{u}^{R,N})e_k\Vert _{L^2_x}^2\,\textrm{d}s\\{} & {} =c\int _0^t\Vert \Phi (\textbf{u}^{R,N})\Vert ^2_{L_2({\mathfrak {U}};W^{1,2}_x)}\,\textrm{d}s\le \,c\int _0^t\big (1+\Vert \textbf{u}^{R,N}\Vert ^2_{W^{1,2}_x}\big )\,\textrm{d}s, \end{aligned}$$

using (2.2) in the last step. The expectation of the right-hand side is bounded by (3.14). Finally, by the Burkholder-Davis-Gundy inequality and (2.1),

$$\begin{aligned}{} & {} \mathbb E\bigg [\bigg (\sup _{0\le t\le T}|\textrm{IV}^N(t)|\bigg )^{\frac{r}{2}}\bigg ]\\{} & {} \quad \le c\,\mathbb E\bigg [\bigg (\sup _{0\le t \le T}\Big |\int _0^t\sum _{k=1}^N\big (\Phi (\textbf{u}^{R,N})e_k ,{\mathcal {A}}\textbf{u}^{R,N}\big )_{L^2_x}\,\mathrm d\beta _k\Big |\bigg )^{\frac{r}{2}}\bigg ]\\{} & {} \quad \le c\,\mathbb E\bigg [\bigg (\sum _{k\ge 1}\int _0^T\big ( \Phi (\textbf{u}^{R,N})e_k ,{\mathcal {A}}\textbf{u}^{R,N}\big )_{L^2_x}^2\, \textrm{d}t\bigg )^{\frac{r}{4}}\bigg ]\\{} & {} \quad \le c\,\mathbb E\bigg [\bigg (\sum _{k\ge 1}\int _0^T \Vert \Phi (\textbf{u}^{R,N})e_k\Vert _{L^2_x}^2\Vert {\mathcal {A}}\textbf{u}^{R,N}\Vert ^2_{L^2_x}\, \textrm{d}t\bigg )^{\frac{r}{4}}\bigg ]\\{} & {} \quad \le c\,\mathbb E\bigg [\bigg (\int _0^T \big (1+\Vert \textbf{u}^{R,N}\Vert _{L^2_x}^2\big )\Vert \mathcal A\textbf{u}^{R,N}\Vert ^2_{L^2_x}\, \textrm{d}t\bigg )^{\frac{r}{4}}\bigg ]\\{} & {} \quad \le c(\delta )\,\mathbb E\bigg [\bigg (1+\sup _{0\le t\le T}\Vert \textbf{u}^{R,N}\Vert _{L^2_x}^2\bigg )^{\frac{r}{2}}\bigg ]+ \delta \,\mathbb E\bigg [\bigg (\int _0^T\Vert \mathcal A\textbf{u}^{R,N}\Vert _{L^2_x}^2\, \textrm{d}t\bigg )^{\frac{r}{2}}\bigg ]\\{} & {} \quad \le c(\delta )+\delta \,\mathbb E\bigg [\bigg (\int _0^T\Vert \mathcal A\textbf{u}^{R,N}\Vert _{L^2_x}^2\, \textrm{d}t\bigg )^{\frac{r}{2}}\bigg ] \end{aligned}$$

using (3.14), where again \(\delta >0\) is arbitrary. Choosing \(\delta \) small enough and using (3.4) we conclude that

$$\begin{aligned} \mathbb E\bigg [\bigg (\sup _{0\le t\le T}\Vert \nabla \textbf{u}^{R,N}(t)\Vert _{L^2_x}^2{} & {} +\int _{0}^T\Vert \nabla ^2\textbf{u}^{R,N}\Vert _{L^2_x}^2\, \textrm{d}t\bigg )^{\frac{r}{2}}\bigg ] \nonumber \\{} & {} \le \,cR^{3r}\,\mathbb E\Big [1+\Vert \textbf{u}_0\Vert _{W^{1,2}_x}^r\Big ]\,, \end{aligned}$$
(3.16)

uniformly in N. This implies that \((\textbf{u}^{R,N})_{N\in \mathbb N}\) is a bounded sequence in the function space generated by the left-hand side of (3.16). After taking a subsequence we obtain a limit object \(\textbf{u}^R\) which is the unique global strong solution to (3.10) recalling (3.13). Furthermore, we can pass to the limit \(N\rightarrow \infty \) and obtain a corresponding estimate for \(\textbf{u}^R\) due to lower semi-continuity of the involved functionals. Since \(\textbf{u}^R(\cdot \wedge \mathfrak t_R)=\textbf{u}(\cdot \wedge {\mathfrak {t}}_R)\) we obtain (3.8).
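
The power of R in (3.16) can be traced back to the absorption step in the estimate of \(\textrm{II}^N\). As a sketch, assuming as in that estimate that the cut-off yields \(\zeta _R(\Vert \nabla \textbf{u}^{R,N}\Vert _{L^2_x})\Vert \textbf{u}^{R,N}\Vert _{L^2_x}^{\frac{1}{2}}\Vert \nabla \textbf{u}^{R,N}\Vert _{L^2_x}\le cR^{3/2}\), Young's inequality with the conjugate exponents 4/3 and 4 gives for every \(\delta >0\)

$$\begin{aligned} cR^{3/2}\,\Vert \mathcal A\textbf{u}^{R,N}\Vert ^{\frac{3}{2}}_{L^2_x}\le \delta \,\Vert \mathcal A\textbf{u}^{R,N}\Vert ^{2}_{L^2_x}+c(\delta )\,R^{6}, \end{aligned}$$

since \(\big (\Vert \mathcal A\textbf{u}^{R,N}\Vert ^{3/2}_{L^2_x}\big )^{4/3}=\Vert \mathcal A\textbf{u}^{R,N}\Vert ^{2}_{L^2_x}\) and \((R^{3/2})^4=R^6\). Integrating over \((0,t)\subset (0,T)\) gives the bound for \(\textrm{II}^N\) with a constant that additionally depends on T, and raising \(R^6\) to the power \(\frac{r}{2}\) produces the factor \(R^{3r}\) in (3.16).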

3) Proof of (3.9). The verification of part (c) proceeds in two steps. In the first step we show an improved version of (3.16). Applying Itô’s formula to the mapping

$$\begin{aligned} t\mapsto \Vert \nabla \textbf{u}^{R,N}(t)\Vert _{L^2_x}^2\big (\textbf{u}^{R,N}(t),\mathcal A\textbf{u}^{R,N}(t)\big )_{L^2_x}, \end{aligned}$$

equation (3.15) yields

$$\begin{aligned} \Vert \nabla \textbf{u}^{R,N}(t)\Vert _{L^2_x}^4{} & {} =\Vert \mathcal P_N\nabla \textbf{u}_0\Vert _{L^2_x}^4-4\mu \int _0^t\Vert \nabla \textbf{u}^{R,N}\Vert _{L^2_x}^2\Vert \mathcal A\textbf{u}^{R,N}\Vert _{L^2_x}^2\,\textrm{d}s\\{} & {} \quad +4\int _0^t \zeta _R(\Vert \nabla \textbf{u}^{R,N}\Vert _{L^2_x})\Vert \nabla \textbf{u}^{R,N}\Vert _{L^2_x}^2\big ((\textbf{u}^{R,N}\cdot \nabla )\textbf{u}^{R,N},\mathcal A\textbf{u}^{R,N}\big )_{L^2_x}\,\textrm{d}s\\{} & {} \quad +4\sum _{k=1}^N\int _0^t\Vert \nabla \textbf{u}^{R,N}\Vert _{L^2_x}^2\big (\Phi (\textbf{u}^{R,N})e_k,\mathcal A\textbf{u}^{R,N}\big )_{L^2_x}\,\textrm{d}\beta _k\\{} & {} \quad +2\sum _{k=1}^N\lambda _k\int _0^t\Vert \nabla \textbf{u}^{R,N}\Vert _{L^2_x}^2\big (\Phi (\textbf{u}^{R,N})e_k,\textbf{u}_k\big )_{L^2_x}^2\,\textrm{d}s\\{} & {} \quad +2\sum _{k=1}^N\int _0^t\big (\Phi (\textbf{u}^{R,N})e_k ,\mathcal A\textbf{u}^{R,N}\big )_{L^2_x}^2\, \textrm{d}s. \end{aligned}$$

Following now step by step the arguments from the proof of (3.16) above we arrive at

$$\begin{aligned} \mathbb E\bigg [\bigg (\sup _{0\le t\le T}\Vert \nabla \textbf{u}^{R,N}(t)\Vert _{L^2_x}^4&+\int _0^T\Vert \nabla \textbf{u}^{R,N}\Vert _{L^2_x}^2\Vert \nabla ^2\textbf{u}^{R,N}\Vert _{L^2_x}^2\, \textrm{d}t\bigg )^{\frac{r}{2}}\bigg ]\\ {}&\le \,cR^{3r}\,\mathbb E\Big [1+\Vert \textbf{u}_0\Vert _{W^{1,2}_x}^{2r}\Big ]. \end{aligned}$$

Again we can pass to the limit in N obtaining

$$\begin{aligned} \begin{aligned} \mathbb E\bigg [\bigg (\sup _{0\le t\le T}\Vert \nabla \textbf{u}^{R}(t)\Vert _{L^2_x}^4&+\int _0^T\Vert \nabla \textbf{u}^{R}\Vert _{L^2_x}^2\Vert \nabla ^2\textbf{u}^{R}\Vert _{L^2_x}^2\, \textrm{d}t\bigg )^{\frac{r}{2}}\bigg ]\\ {}&\le \,cR^{3r}\,\mathbb E\Big [1+\Vert \textbf{u}_0\Vert _{W^{1,2}_x}^{2r}\Big ]. \end{aligned} \end{aligned}$$
(3.17)

Now we turn to the proof of (3.9) stated in part (c) for which we use the mild formulation of (3.10).

(\(\textbf{c}_1\)) Due to the regularity proved in (3.16) and (3.17), [25, Proposition F.0.5, (i)] applies and we can write

$$\begin{aligned} \textbf{u}^{R}(t)&=e^{-t\mathcal A}\textbf{u}_0+\int _0^te^{-(t-s)\mathcal A}\textbf{g}_R\,\textrm{d}s+\int _0^t e^{-(t-s)\mathcal A}\Phi (\textbf{u}^{R})\,\mathrm dW,\\ \text {where}\quad \textbf{g}_R:&= \zeta _R(\Vert \nabla \textbf{u}^{R}\Vert _{L^2_x})\mathcal P[(\textbf{u}^{R}\cdot \nabla )\textbf{u}^{R}]. \end{aligned}$$

Here \((e^{-t\mathcal A})_{t\ge 0}\) denotes the analytic semigroup on \(L^2_{\textrm{div}}(\mathcal {O},\mathbb R^2)\) generated by the Stokes operator \(\mathcal A\). Setting

$$\begin{aligned} \textbf{Y}^{R}(t)&:=e^{-t\mathcal A}\textbf{u}_0+\int _0^te^{-(t-s)\mathcal A} \textbf{g}_R\,\textrm{d}s,\\ \textbf{Z}^{R}(t)&:=\int _0^t e^{-(t-s)\mathcal A}\Phi (\textbf{u}^{R})\,\mathrm dW, \end{aligned}$$

we consider now the deterministic and stochastic contribution separately. We note that \(\textbf{Y}^R\) is the unique solution to a deterministic Stokes problem with initial datum \(\textbf{u}_0\) and forcing \(\textbf{g}_R\), whereas \(\textbf{Z}^R\) solves a stochastic Stokes problem with homogeneous initial datum and diffusion coefficient \(\Phi (\textbf{u}^R)\) – both equipped with homogeneous Dirichlet boundary conditions.

By Ladyzhenskaya’s inequality we have

$$\begin{aligned} \mathbb E\bigg [\bigg (\int _0^T\Vert \textbf{g}_R\Vert _{L^2_x}^2\, \textrm{d}t\bigg )^{\frac{r}{2}}\bigg ]&\le \mathbb E\bigg [\bigg (\int _0^T\Vert \textbf{u}^R\Vert _{L^4_x}^2\Vert \nabla \textbf{u}^R\Vert _{L^4_x}^2\, \textrm{d}t\bigg )^{\frac{r}{2}}\bigg ]\\&\le \,c\,\mathbb E\bigg [\bigg (\int _0^T\Vert \textbf{u}^R\Vert _{L^2_x}\Vert \nabla \textbf{u}^R\Vert _{L^2_x}^2\Vert \nabla ^2\textbf{u}^R\Vert _{L^2_x}\, \textrm{d}t\bigg )^{\frac{r}{2}}\bigg ]\\&\le \,cR^{3r}\,\mathbb E\Big [1+\Vert \textbf{u}_0\Vert _{W^{1,2}_x}^{2r}\Big ], \end{aligned}$$

where we used (3.17) in the last step.
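
The two steps in the preceding estimate can be spelled out as follows. This is a sketch under stated assumptions: the cut-off satisfies \(0\le \zeta _R\le 1\), the projection \({\mathcal {P}}\) is bounded on \(L^2(\mathcal {O},\mathbb R^2)\), and the two-dimensional Ladyzhenskaya inequality is applied to \(\textbf{u}^R\) and to \(\nabla \textbf{u}^R\) in the same form as in the display above. Then

$$\begin{aligned} \Vert \textbf{u}^R\Vert _{L^4_x}^2\Vert \nabla \textbf{u}^R\Vert _{L^4_x}^2\le \,c\,\Vert \textbf{u}^R\Vert _{L^2_x}\Vert \nabla \textbf{u}^R\Vert _{L^2_x}^2\Vert \nabla ^2\textbf{u}^R\Vert _{L^2_x}, \end{aligned}$$

and the Cauchy-Schwarz inequality in time gives

$$\begin{aligned} \int _0^T\Vert \textbf{u}^R\Vert _{L^2_x}\Vert \nabla \textbf{u}^R\Vert _{L^2_x}^2\Vert \nabla ^2\textbf{u}^R\Vert _{L^2_x}\, \textrm{d}t\le \sup _{0\le t\le T}\Vert \textbf{u}^R\Vert _{L^2_x}\bigg (\int _0^T\Vert \nabla \textbf{u}^R\Vert _{L^2_x}^2\, \textrm{d}t\bigg )^{\frac{1}{2}}\bigg (\int _0^T\Vert \nabla \textbf{u}^R\Vert _{L^2_x}^2\Vert \nabla ^2\textbf{u}^R\Vert _{L^2_x}^2\, \textrm{d}t\bigg )^{\frac{1}{2}}. \end{aligned}$$

The last factor is the quantity controlled in (3.17), while the first two factors are of the type controlled in (3.14) after passing to the limit \(N\rightarrow \infty \).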

(\(\textbf{c}_2\)) Interpolating \(W^{1/2,2}(0,T;W^{1,2}(\mathcal {O},\mathbb R^2))\) between \(W^{1,2}(0,T;L^2(\mathcal {O},\mathbb R^2))\) and \(L^2(0,T;W^{2,2}(\mathcal {O},\mathbb R^2))\) and applying \({\mathbb {P}}\)-a.s. classical estimates for the Stokes system yields

$$\begin{aligned} \begin{aligned}&\mathbb E\bigg [\bigg (\Vert \textbf{Y}^R\Vert _{W^{1/2,2}(0,T;W^{1,2}_x)}^2\bigg )^{\frac{r}{2}}\bigg ]\\ {}&\quad \le \,c\,\mathbb E\bigg [\bigg (\Vert \textbf{Y}^R\Vert _{W^{1,2}(0,T;L^{2}_x)}^2+\Vert \textbf{Y}^R\Vert _{L^{2}(0,T;W^{2,2}_x)}^2\bigg )^{\frac{r}{2}}\bigg ]\\&\quad \le \,c\,\mathbb E\bigg [\bigg (\int _0^T\Vert \textbf{g}_R\Vert _{L^2_x}^2\, \textrm{d}t\bigg )^{\frac{r}{2}}\bigg ]\le \,cR^{3r}\,\mathbb E\Big [1+\Vert \textbf{u}_0\Vert _{W^{1,2}_x}^{2r}\Big ]. \end{aligned} \end{aligned}$$
(3.18)

(\(\textbf{c}_3\)) For \(\textbf{Z}^R\) we apply the recent results from [30, Theorems 25 and 28] proving for any \(\sigma <1\)

$$\begin{aligned} \begin{aligned}&\mathbb E\bigg [\Vert \textbf{Z}^R\Vert ^2_{C^{\sigma /2}([0,T];L^2_x)}+\Vert \textbf{Z}^R\Vert _{W^{\sigma /2,2}(0,T;W^{1,2}_x)}^2\bigg ]\\ {}&\quad \le \,c\,\mathbb E\bigg [1+\sup _{0\le t\le T}\Vert \textbf{u}^R\Vert ^{4}_{W^{1,2}_x}\bigg ]\\&\quad \le \,c\,\mathbb E\bigg [1+\Vert \textbf{u}_0\Vert ^{4}_{W^{1,2}_x}\bigg ] \end{aligned} \end{aligned}$$
(3.19)

using also (2.2) and (3.17). Combining (3.18) and (3.19) and recalling that \(\textbf{u}^R\) is the sum of \(\textbf{Y}^R\) and \(\textbf{Z}^R\) gives

$$\begin{aligned} \begin{aligned} \mathbb E\bigg [\Vert \textbf{u}^R\Vert ^2_{C^{\sigma /2}([0,T];L^2_x)}+\Vert \textbf{u}^R\Vert _{W^{\sigma /2,2}(0,T;W^{1,2}_x)}^2\bigg ]&\le \,c\,\mathbb E\bigg [1+\Vert \textbf{u}_0\Vert ^{4}_{W^{1,2}_x}\bigg ]. \end{aligned} \end{aligned}$$
(3.20)

(\(\textbf{c}_4\)) Due to our assumption on the noise from (2.4) we know that \(\Phi (\textbf{u}^R)e_k\), with \(k\in \mathbb N\), belongs to the domain of the Stokes operator such that we can write

$$\begin{aligned} \mathcal A\textbf{Z}^{R}(t)=\int _0^t e^{-(t-s)\mathcal A}\mathcal A\Phi (\textbf{u}^{R})\,\mathrm dW. \end{aligned}$$

We conclude that \(\mathcal A\textbf{Z}^R\) is the unique weak pathwise solution to the stochastic Stokes problem with zero initial datum, homogeneous boundary conditions and diffusion coefficient \(\mathcal A\Phi (\textbf{u}^R)\). It is standard to derive for \(r\ge 2\) the estimate

$$\begin{aligned}&\mathbb E\bigg [\bigg (\sup _{0\le t\le T}\Vert \mathcal A\textbf{Z}^{R}\Vert ^2_{L^{2}_x}+\int _0^T\Vert \nabla \mathcal A\textbf{Z}^R\Vert ^2_{L^2_x}\,\textrm{d}s\bigg )^{\frac{r}{2}}\bigg ]\\&\quad \le \,c\,\mathbb E\bigg [\bigg (\int _0^T\Vert \mathcal A\Phi (\textbf{u}^{R})\Vert ^2_{L_2(\mathfrak U;L^2_x)}\,\mathrm ds\bigg )^{\frac{r}{2}}\bigg ]\\&\quad \le \,c\,\mathbb E\bigg [\bigg (\int _0^T\Vert \Phi (\textbf{u}^{R})\Vert ^2_{L_2(\mathfrak U;W^{2,2}_x)}\,\mathrm ds\bigg )^{\frac{r}{2}}\bigg ]\\&\quad \le \,c\,\mathbb E\bigg [\bigg (\int _0^{T}\big (1+\Vert \textbf{u}^{R}\Vert _{W^{1,2}_x}^2\Vert \textbf{u}^{R}\Vert ^2_{W^{2,2}_x}+\Vert \textbf{u}^{R}\Vert ^2_{W^{2,2}_x}\big )\,\textrm{d}s\bigg )^{\frac{r}{2}}\bigg ], \end{aligned}$$

applying Itô’s formula to \(t\mapsto \Vert \mathcal A\textbf{Z}^R\Vert _{L^2_x}^2\) and using the Burkholder-Davis-Gundy inequality (and (2.4) in the last step). The properties of the Stokes operator from (3.4) yield

$$\begin{aligned}&\mathbb E\bigg [\bigg (\sup _{0\le t\le T}\Vert \textbf{Z}^{R}\Vert ^2_{W^{2,2}_x}+\int _0^T\Vert \textbf{Z}^R\Vert ^2_{W^{3,2}_x}\,\textrm{d}s\bigg )^{\frac{r}{2}}\bigg ]\\&\quad \le \,c\,\mathbb E\bigg [\bigg (\int _0^{T}\big (1+\Vert \textbf{u}^{R}\Vert _{W^{1,2}_x}^2\Vert \textbf{u}^{R}\Vert ^2_{W^{2,2}_x}+\Vert \textbf{u}^{R}\Vert ^2_{W^{2,2}_x}\big )\,\textrm{d}s\bigg )^{\frac{r}{2}}\bigg ]. \end{aligned}$$

(\(\textbf{c}_5\)) Sharpening the estimates for \(\textbf{Y}^R\) is slightly more involved, as the convective term \(\textbf{g}_R\) does not lie in the domain of the Stokes operator since it does not necessarily have a zero trace. We can choose \(p<2\) such that the embedding \(W^{1,p}(\mathcal {O})\hookrightarrow W^{\sigma ,2}(\mathcal {O})\) holds. We obtain by continuity of \({\mathcal {P}}\), cf. (3.2),

$$\begin{aligned} \Vert \textbf{g}_R\Vert _{W^{\sigma ,2}_x}&\le \,c\,\Vert \textbf{g}_R\Vert _{W^{1,p}_x}\le \,c \Vert (\textbf{u}^{R}\cdot \nabla )\textbf{u}^{R}\Vert _{W^{1,p}_x}\\&\le \,c\Vert \nabla \textbf{u}^R\Vert _{L^{2p}_x}^2+c\Vert \textbf{u}^R\Vert _{L^q_x}\Vert \nabla ^2\textbf{u}^R\Vert _{L^2_x}\le \,c\Vert \nabla \textbf{u}^R\Vert _{L^2_x}\Vert \nabla ^2\textbf{u}^R\Vert _{L^2_x}, \end{aligned}$$

where we used Hölder’s inequality with exponents 2/p and \(q:=2/(2-p)\) as well as Sobolev’s embedding \(W^{1,2}(\mathcal {O},\mathbb R^2)\hookrightarrow L^q(\mathcal {O},\mathbb R^2)\) and Ladyzhenskaya’s inequality. By (3.17) we conclude that

$$\begin{aligned} \textbf{g}_R\in L^2(0,T;W^{\sigma ,2}(\mathcal {O},\mathbb R^2))\quad {\mathbb {P}}\text {-a.s.} \end{aligned}$$
(3.21)

We argue now similarly for the temporal regularity of order \(\sigma /2\) obtaining for any \(\sigma '\in (\sigma ,1)\)

$$\begin{aligned}&\Vert \textbf{g}_R\Vert _{W^{\sigma /2,p}(0,T;L^p_x)}^p\\ {}&\quad \le \,c\int _0^T\int _0^T\frac{\Vert \textbf{u}^R(t)\nabla \textbf{u}^R(t)-\textbf{u}^R(s)\nabla \textbf{u}^R(s)\Vert _{L^p_x}^p}{|t-s|^{1+p\sigma /2}}\, \textrm{d}t\,\textrm{d}s\\&\quad \le \,c\int _0^T\int _0^T\bigg (\frac{\Vert \textbf{u}^R(t)-\textbf{u}^R(s)\Vert _{L^2_x}}{|t-s|^{\sigma '/2}}\Vert \nabla \textbf{u}^R(t)\Vert _{L^q_x}\bigg )^p\frac{\, \textrm{d}t\,\textrm{d}s}{|t-s|^{1+\frac{(\sigma -\sigma ')p}{2}}}\\&\qquad + \,c\int _0^T\int _0^T\bigg (\frac{\Vert \textbf{u}^R(s)\Vert _{L^q_x}\Vert \nabla \textbf{u}^R(t)-\nabla \textbf{u}^R(s)\Vert _{L^2_x}}{|t-s|^{\sigma /2}}\bigg )^p\frac{\, \textrm{d}t\,\textrm{d}s}{|t-s|}\\&\quad \le \,c\Vert \textbf{u}^R\Vert ^p_{C^{\sigma '/2}([0,T];L^2_x)}\int _0^T\Vert \nabla \textbf{u}^R(t)\Vert _{L^q_x}^p\, \textrm{d}t\\&\qquad + \,c\sup _{0\le s\le t}\Vert \textbf{u}^R(s)\Vert _{L^q_x}^p\int _0^T\int _0^T\frac{\Vert \nabla \textbf{u}^R(t)-\nabla \textbf{u}^R(s)\Vert _{L^2_x}^p}{|t-s|^{1+p\sigma /2}}\, \textrm{d}t\,\textrm{d}s\\&\quad \le \,c\Vert \textbf{u}^R\Vert ^p_{C^{\sigma '/2}([0,T];L^2_x)}\int _0^T\big (1+\Vert \textbf{u}^R(t)\Vert _{W^{2,2}_x}^2\big )\, \textrm{d}t\\&\qquad + \,c\sup _{0\le s\le t}\Vert \textbf{u}^R(s)\Vert _{W^{1,2}_x}^p\Vert \textbf{u}^R\Vert _{W^{\sigma /2,p}(0,T;W^{1,2}_x)}^p\\&\quad \le \,c\bigg (\Vert \textbf{u}^R\Vert ^2_{C^{\sigma '/2}([0,T];L^2_x)}+\Vert \textbf{u}^R\Vert _{W^{\sigma '/2,2}(0,T;W^{1,2}_x)}^2+1\bigg )\\&\qquad +c\bigg (\,\sup _{0\le s\le t}\Vert \textbf{u}^R(s)\Vert _{W^{1,2}_x}^2+\int _0^T\Vert \textbf{u}^R\Vert _{W^{2,2}_x}^2\, \textrm{d}t\bigg )^q. \end{aligned}$$

The expectation of the right-hand side is bounded using (3.9) and (3.20); in particular, for any \(\sigma <1\)

$$\begin{aligned} \textbf{g}_R\in W^{\sigma /2,2}(0,T;L^{2}(\mathcal {O},\mathbb R^2))\quad \mathbb P\text {-a.s.} \end{aligned}$$
(3.22)

after decreasing the value of \(\sigma \) and using the embedding \(W^{\sigma /2,p}(0,T)\hookrightarrow W^{\sigma '/2,2}(0,T)\) for an appropriate choice of \(\sigma >\sigma '\) and \(p<2\). By (3.21) and (3.22), classical results on the Stokes system (see [28, Thm. 15] and note the compatibility assumption \({\mathcal {A}}\textbf{u}_0-\mathcal P(\textbf{u}_0\cdot \nabla \textbf{u}_0)|_{\partial {\mathcal {O}}}=0\) \(\mathbb P\)-a.s.) and interpolation yield

$$\begin{aligned} \textbf{Y}^R\in W^{1+\sigma /2,2}(0,T;L^{2}(\mathcal {O},\mathbb R^2))\cap L^2(0,T;W^{2+\sigma ,2}(\mathcal {O},\mathbb R^2))\quad {\mathbb {P}}\text {-a.s.} \end{aligned}$$

and thus, again by interpolation and appropriate choice of \(\sigma \in (\beta ,1)\) and the embedding \(W^{\alpha ,2}(0,T)\hookrightarrow L^\infty (0,T)\) for \(\alpha >1/2\),

$$\begin{aligned} \textbf{Y}^R\in L^\infty (0,T;W^{1+\beta ,2}(\mathcal {O},\mathbb R^2))\cap L^2(0,T;W^{2+\beta ,2}(\mathcal {O},\mathbb R^2))\quad {\mathbb {P}}\text {-a.s.} \end{aligned}$$
(3.23)

together with

$$\begin{aligned}&\sup _{0\le t\le T}\Vert \textbf{Y}^{R}\Vert _{W^{1+\beta ,2}_x}^2+\int _0^T\Vert \textbf{Y}^{R}\Vert _{W^{2+\beta ,2}_x}^2\,\mathrm ds\\&\quad \le \,c\bigg [\Vert \textbf{u}_0\Vert _{W^{1+\sigma ,2}_x}^2+\Vert \textbf{g}_R\Vert ^2_{W^{\sigma /2,2}_t(L^2_x)}+\int _0^T\Vert \textbf{g}_R\Vert _{W^{\sigma ,2}_x}^2\,\mathrm ds\bigg ]\quad {\mathbb {P}}\text {-a.s.} \end{aligned}$$

Combining the estimates for \(\textbf{Y}^{R}\) and \(\textbf{Z}^{R}\), choosing \(\kappa \) sufficiently small and using (3.16) and (3.17) we arrive at

$$\begin{aligned} \mathbb E\bigg [\bigg (\sup _{0\le t\le T}\Vert \textbf{u}^{R}(t)\Vert _{W^{1+\sigma ,2}_x}^2&+\int _0^{T}\Vert \textbf{u}^{R}\Vert ^2_{W^{2+\sigma ,2}_x}\, \textrm{d}t\bigg )^{\frac{r}{2}}\bigg ]\\ {}&\le \,cR^{5r}\,\mathbb E\Big [1+\Vert \textbf{u}_0\Vert _{W^{2,2}_x}^r+\Vert \textbf{u}_0\Vert _{W^{1,2}_x}^{2r}\Big ] \end{aligned}$$

uniformly in R. \(\square \)

Remark 3.2

1. It does not seem possible to prove Lemma 3.1 (c) for \(\beta \ge 1\), see (3.23). In fact, even for the deterministic Stokes system high regularity is only possible if the forcing is regular in space and time or belongs to the domain of the Stokes operator. Since neither is true for the convective term \({\mathcal {P}}(\textbf{u}\cdot \nabla )\textbf{u}\) (its temporal regularity is restricted by that of the driving Wiener process) we conjecture that the spatial regularity from Lemma 3.1 (c) is optimal. Interestingly, this is just enough to prove an optimal convergence rate for the discretisation of (1.1) in Theorem 4.4.

2. Using a recent result from [30] we can show that the gradient of the velocity field, and hence the convective term, has a fractional time derivative of order \(\beta /2<1/2\). This is optimal in view of the limited regularity of the driving Wiener process in the momentum equation. It is classical for deterministic parabolic equations (see [28] for the Stokes equations and [27] for the heat equation) that the solution gains two spatial derivatives and one temporal derivative compared to the right-hand side. Hence the regularity of the latter has to be measured in space and time with respect to the parabolic scaling; pure space regularity does not transfer unless additional assumptions are in place, so that we can only hope for \(2+\beta \) spatial derivatives.

3.2 Regularity of the Pressure

Since we will be working with discretely divergence-free function spaces in the finite-element analysis for (4.1) in Sect. 4, it is inevitable to introduce the pressure function. Note that the strong formulation of the momentum equation in (2.6) even allows test functions from the class \(L^2_{\textrm{div}}(\mathcal {O},\mathbb R^2)\) (using a standard smooth approximation argument), i.e., functions which do not have zero traces on \(\partial \mathcal {O}\). Hence for \({\varvec{\phi }}\in C^\infty _c(\mathcal {O},\mathbb R^2)\) we can insert

$$\begin{aligned} {\mathcal {P}}{\varvec{\phi }}={\varvec{\phi }}-\nabla \Delta _{\mathcal O}^{-1}\textrm{div}{\varvec{\phi }}\end{aligned}$$

with the Helmholtz projection \({\mathcal {P}}\); cf. (3.1). We obtain

$$\begin{aligned} \int _{\mathcal {O}}\textbf{u}(t\wedge {\mathfrak {t}}_R)\cdot {\varvec{\phi }}\, \textrm{d}x&-\int _0^{t\wedge {\mathfrak {t}}_R}\int _{\mathcal {O}}\mu \Delta \textbf{u}\cdot {\varvec{\phi }}\,\textrm{d}x\, \textrm{d}\sigma +\int _0^{t\wedge {\mathfrak {t}}_R}\int _{\mathcal {O}}(\textbf{u}\cdot \nabla )\textbf{u}\cdot {\varvec{\phi }}\,\textrm{d}x\, \textrm{d}\sigma \nonumber \\&\quad =\int _{\mathcal {O}}\textbf{u}(0)\cdot {\varvec{\phi }}\, \textrm{d}x+\int _0^{t\wedge {\mathfrak {t}}_R}\int _{\mathcal {O}}\pi \,\textrm{div}{\varvec{\phi }}\,\textrm{d}x\, \textrm{d}\sigma \nonumber \\&\qquad +\int _{\mathcal {O}}\int _0^{t\wedge {\mathfrak {t}}_R}\Phi (\textbf{u})\,\mathrm dW\cdot {\varvec{\phi }}\, \textrm{d}x, \end{aligned}$$
(3.24)

where

$$\begin{aligned} \pi&=-\Delta ^{-1}_{\mathcal {O}}\textrm{div}\big ((\textbf{u}\cdot \nabla )\textbf{u}-\mu \Delta \textbf{u}\big ). \end{aligned}$$

In the following we will analyse how the regularity of \(\textbf{u}\) transfers to \(\pi \), where again \(R>0\) is a fixed truncation parameter and \(T>0\) an arbitrary but fixed time.

Lemma 3.3

  1. (a)

    Under the assumptions of Lemma 3.1 (b) we have

    $$\begin{aligned} \mathbb E\bigg [\bigg (\int _0^{T\wedge \mathfrak t_R}\Vert \pi \Vert _{W^{1,2}_x}^2\, \textrm{d}t\bigg )^{\frac{r}{4}}\bigg ]\le \,cR^{3r}\,\mathbb E\Big [1+\Vert \textbf{u}_0\Vert _{W^{1,2}_x}^r\Big ]. \end{aligned}$$
  2. (b)

    Under the assumptions of Lemma 3.1 (c) we have

    $$\begin{aligned} \mathbb E\bigg [\bigg (\int _0^{T\wedge \mathfrak t_R}\Vert \pi \Vert _{W^{2,2}_x}^2\, \textrm{d}t\bigg )^{\frac{r}{4}}\bigg ]\le \,cR^{5r}\,\mathbb E\Big [1+\Vert \textbf{u}_0\Vert _{W^{2,2}_x}^r\Big ]. \end{aligned}$$

Here \(c=c(r,T)>0\) is independent of R.

Proof

Ad (a). Arguing as in [5, Corollary 2.5] and using (3.2) we obtain

$$\begin{aligned}&\mathbb E\bigg [\bigg (\int _0^{T\wedge \mathfrak t_R}\Vert \pi \Vert _{W^{1,2}_x}^2\, \textrm{d}t\bigg )^{\frac{r}{4}}\bigg ]\\&\quad \le \,c\,\mathbb E\bigg [\bigg (1+\sup _{t\in [0, T\wedge \mathfrak t_R]}\Vert \textbf{u}\Vert _{W^{1,2}_x}^2+\int _0^{T\wedge \mathfrak t_R}\Vert \textbf{u}\Vert _{W^{2,2}_x}^2\, \textrm{d}t\bigg )^{\frac{r}{2}}\bigg ]. \end{aligned}$$

Consequently, Lemma 3.1 (b) implies (a).

Ad (b). Using (3.2) we have for \(p>2\) close to 2 and \(q:=\frac{2p}{p-2}\)

$$\begin{aligned}&\mathbb E\bigg [\bigg (\int _0^{T\wedge \mathfrak t_R}\Vert \pi \Vert _{W^{2,2}_x}^2\, \textrm{d}t\bigg )^{\frac{r}{4}}\bigg ]\le \,c\,\mathbb E\bigg [\bigg (\int _0^{T\wedge \mathfrak t_R}\Vert \textbf{u}\cdot \nabla \textbf{u}\Vert _{W^{1,2}_x}^2\, \textrm{d}t\bigg )^{\frac{r}{4}}\bigg ]\\&\quad \le \,c\,\mathbb E\bigg [\bigg (\int _0^{T\wedge \mathfrak t_R}\Vert \nabla \textbf{u}\Vert _{L^{4}_x}^4\, \textrm{d}t+\int _0^{T\wedge \mathfrak t_R}\Vert \textbf{u}\Vert _{L^q_x}^2\Vert \nabla ^2\textbf{u}\Vert _{L^{p}_x}^2\, \textrm{d}t\bigg )^{\frac{r}{4}}\bigg ]\\&\quad \le \,c\,\mathbb E\bigg [\bigg (\int _0^{T\wedge \mathfrak t_R}\Vert \textbf{u}\Vert _{W^{1+\beta ,2}_x}^4\, \textrm{d}t+\sup _{0\le t\le T\wedge {\mathfrak {t}}_R}\Vert \textbf{u}\Vert _{W^{1,2}_x}^2\int _0^{T\wedge \mathfrak t_R}\Vert \textbf{u}\Vert _{W^{2+\beta ,2}_x}^2\, \textrm{d}t\bigg )^{\frac{r}{4}}\bigg ]\\&\quad \le \,c\,\mathbb E\bigg [\bigg (\sup _{0\le t\le T\wedge \mathfrak t_R}\Vert \textbf{u}\Vert _{W^{1+\beta ,2}_x}^2+\int _0^{T\wedge \mathfrak t_R}\Vert \textbf{u}\Vert _{W^{2+\beta ,2}_x}^2\, \textrm{d}t\bigg )^{\frac{r}{2}}\bigg ] \end{aligned}$$

using the embeddings \(W^{1+\beta ,2}(\mathcal {O},\mathbb R^2)\hookrightarrow W^{1,4}(\mathcal {O},\mathbb R^2)\) and \(W^{2+\beta ,2}(\mathcal {O},\mathbb R^2)\hookrightarrow W^{2,p}(\mathcal {O},\mathbb R^2)\), which hold for an appropriate choice of \(\beta \in (0,1)\). Hence using Lemma 3.1 (c) completes the proof. \(\square \)

Corollary 3.4

  1. (a)

    Let the assumptions of Lemma 3.1 (b) be satisfied for some \(r>2\). For all \(\alpha <\frac{1}{2}\) we have

    $$\begin{aligned} \mathbb E\Big [\Big (\Vert \textbf{u}(\cdot \wedge \mathfrak t_R)\Vert _{C^\alpha ([0,T];L^{2}_x)}\Big )^{\frac{r}{2}}\Big ]\le cR^{3r}\,\mathbb E\Big [1+\Vert \textbf{u}_0\Vert _{W^{1,2}_x}^r\Big ]. \end{aligned}$$
    (3.25)
  2. (b)

    Let the assumptions of Lemma 3.1 (c) be satisfied for some \(r>2\). For all \(\alpha <\frac{1}{2}\) we have

    $$\begin{aligned} \mathbb E\Big [\Big (\Vert \textbf{u}(\cdot \wedge \mathfrak t_R)\Vert _{C^\alpha ([0,T];W^{1,2}_x)}\Big )^{\frac{r}{2}}\Big ]\le cR^{5r}\,\mathbb E\Big [1+\Vert \textbf{u}_0\Vert _{W^{2,2}_x}^r+\Vert \textbf{u}_0\Vert _{W^{1,2}_x}^{2r}\Big ]. \end{aligned}$$
    (3.26)

Here \(c=c(r,T,\alpha )>0\) is independent of R.

Proof

As in [5, Corollary 2.6] we can combine Lemmas 3.1 and 3.3 to conclude the result concerning the time regularity of \(\textbf{u}\) from (a). As far as (b) is concerned we analyse each term in equation (3.24) separately. Lemma 3.1(b) implies

$$\begin{aligned} \int _0^{\cdot \wedge {\mathfrak {t}}_R}\Delta \textbf{u}\, \textrm{d}\sigma \in L^r(\Omega ;L^{2}(0,T;W^{\beta ,2}(\mathcal {O},\mathbb R^2))), \end{aligned}$$

whereas Lemmas 3.1(c) and 3.3(b) yield

$$\begin{aligned} \int _0^{\cdot \wedge \mathfrak t_R}\big (\textrm{div}(\textbf{u}\otimes \textbf{u})+\nabla \pi \big )\, \textrm{d}\sigma \in L^{\frac{r}{2}}(\Omega ;L^{2}(0,T;W^{\beta ,2}(\mathcal {O},\mathbb R^2))). \end{aligned}$$

Finally, we have

$$\begin{aligned} \int _0^{\cdot \wedge {\mathfrak {t}}_R}\Phi (\textbf{u})\,\mathrm dW\in L^{r}(\Omega ;C^{\alpha }([0,T];L^2(\mathcal {O},\mathbb R^2))) \end{aligned}$$

by combining Lemma 3.1(a) with (2.2). We conclude that

$$\begin{aligned} \mathbb E\Big [\Big (\Vert \textbf{u}(\cdot \wedge \mathfrak t_R)\Vert _{C^\alpha ([0,T];W^{\beta ,2}_x)}\Big )^{\frac{r}{2}}\Big ]\le cR^{5r}\,\mathbb E\Big [1+\Vert \textbf{u}_0\Vert _{W^{2,2}_x}^r+\Vert \textbf{u}_0\Vert _{W^{1,2}_x}^{2r}\Big ] \end{aligned}$$

for all \(\beta <1\). Interpolating this with the estimate from Lemma 3.1(c) gives the claim. \(\square \)

4 Error Analysis: Direct Comparison

Now we consider a fully practical scheme combining a semi-implicit Euler scheme in time with a finite element approximation in space. It is defined on the given filtered probability space \((\Omega ,{\mathfrak {F}},({\mathfrak {F}}_t),\mathbb P)\) on which W as well as the maximal strong pathwise solution to (1.1) are defined. For a given \(h>0\) let \(\textbf{u}_{h,0}\) be an \({\mathfrak {F}}_0\)-measurable random variable with values in \(V^{h,i}_{\textrm{div}}(\mathcal {O},\mathbb R^2)\) (for instance \(\Pi _h\textbf{u}_0\); see (2.11)). We aim at constructing iteratively a sequence of random variables \((\textbf{u}_{h,m},p_{h,m})\) such that for every \(({\varvec{\phi }}, \chi ) \in V^{h,i}(\mathcal {O},\mathbb R^2) \times P^{h,j}(\mathcal {O})\) it holds true \(\mathbb P\)-a.s.

$$\begin{aligned} \begin{aligned}&\int _{\mathcal {O}}\textbf{u}_{h,m}\cdot {\varvec{\phi }}\, \textrm{d}x+\tau \int _{\mathcal {O}}\big ((\textbf{u}_{h,m-1}\cdot \nabla )\textbf{u}_{h,m}+(\textrm{div}\textbf{u}_{h,m-1})\textbf{u}_{h,m}\big )\cdot {\varvec{\phi }}\, \textrm{d}x\\&\quad +\mu \,\tau \int _{\mathcal {O}}\nabla \textbf{u}_{h,m}:\nabla {\varvec{\phi }}\, \textrm{d}x-\tau \int _{\mathcal {O}}p_{h,m}\,\textrm{div}{\varvec{\phi }}\, \textrm{d}x\\ {}&=\int _{\mathcal {O}}\textbf{u}_{h,m-1}\cdot {\varvec{\phi }}\, \textrm{d}x+\int _{\mathcal {O}}\Phi (\textbf{u}_{h,m-1})\,\Delta _mW\cdot {\varvec{\phi }}\, \textrm{d}x\, , \\&\quad \int _{\mathcal {O}} \textrm{div} \textbf{u}_{h,m} \cdot \chi \, \textrm{d}x= 0\, , \end{aligned} \end{aligned}$$
(4.1)

where \(\Delta _m W=W(t_m)-W(t_{m-1})\). Here the interval [0, T] is decomposed into an equidistant grid of time points \(t_m=m\tau =m\frac{T}{M}\) with \(M\in \mathbb N\). For our theoretical analysis it is convenient to work with the pressure-free formulation of (4.1): For every \({\varvec{\phi }}\in V^{h,i}_{\textrm{div}}(\mathcal {O},\mathbb R^2)\) it holds true \(\mathbb P\)-a.s.

$$\begin{aligned} \begin{aligned}&\int _{\mathcal {O}}\textbf{u}_{h,m}\cdot {\varvec{\phi }}\, \textrm{d}x+\tau \int _{\mathcal {O}}\big ((\textbf{u}_{h,m-1}\cdot \nabla )\textbf{u}_{h,m}+(\textrm{div}\textbf{u}_{h,m-1})\textbf{u}_{h,m}\big )\cdot {\varvec{\phi }}\, \textrm{d}x\\&\quad =\int _{\mathcal {O}}\textbf{u}_{h,m-1}\cdot {\varvec{\phi }}\, \textrm{d}x-\mu \,\tau \int _{\mathcal {O}}\nabla \textbf{u}_{h,m}:\nabla {\varvec{\phi }}\, \textrm{d}x\\&\qquad +\int _{\mathcal {O}}\Phi (\textbf{u}_{h,m-1})\,\Delta _mW\cdot {\varvec{\phi }}\, \textrm{d}x. \end{aligned} \end{aligned}$$
(4.2)
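
The time-stepping structure of (4.2), with the diffusion term treated implicitly and the noise evaluated at the previous iterate, can be illustrated on a scalar toy problem. The following Python sketch is only an analogy and not the finite element scheme itself: it assumes the scalar linear equation \(\mathrm dX=-\mu X\, \textrm{d}t+\sigma X\,\mathrm dW\), drops the convective term entirely, and the names `semi_implicit_euler`, `sigma` and `rng` are hypothetical.

```python
import math
import random


def semi_implicit_euler(x0, mu, sigma, T, M, rng):
    """Scalar toy analogue of the time stepping in (4.2) (illustrative only).

    The linear damping term is treated implicitly and the noise coefficient
    is evaluated at the previous iterate, i.e.
        x_m + tau * mu * x_m = x_{m-1} + sigma * x_{m-1} * dW_m.
    """
    tau = T / M                      # equidistant step size, t_m = m * tau
    x = x0
    path = [x]
    for _ in range(M):
        # Wiener increment W(t_m) - W(t_{m-1}) ~ N(0, tau)
        dW = rng.gauss(0.0, math.sqrt(tau))
        # solve the linear equation for the new iterate
        x = (x + sigma * x * dW) / (1.0 + mu * tau)
        path.append(x)
    return path


if __name__ == "__main__":
    sample = semi_implicit_euler(1.0, 2.0, 0.5, 1.0, 100, random.Random(1))
    print(len(sample), sample[-1])
```

Because the new iterate enters only linearly, each step of the toy recursion reduces to a single division; in the setting of (4.2) the analogous step would be the solution of a linear system in \(V^{h,i}_{\textrm{div}}(\mathcal {O},\mathbb R^2)\).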

We quote the following result concerning the solution \((\textbf{u}_{h,m})_{m=1}^M\) to (4.2) from [9, Lemma 3.1].

Lemma 4.1

Fix \(T>0\). Assume that \(\textbf{u}_{h,0}\in L^{2^q}(\Omega ,V_{\textrm{div}}^{h,i}(\mathcal {O},\mathbb R^2))\) with \(q\in \mathbb N\) is an \({\mathfrak {F}}_0\)-measurable random variable. Suppose that \(\Phi \) satisfies (2.1). Then the iterates \((\textbf{u}_{h,m})_{m=1}^M\) given by (4.2) are \(({\mathfrak {F}}_{t_m})\)-measurable. Moreover, the following estimate holds uniformly in M and h:

$$\begin{aligned} \begin{aligned} \mathbb E\bigg [\max _{1\le m\le M}\Vert \textbf{u}_{h,m}\Vert ^{2^q}_{L^{2}_x}&+\tau \sum _{m=1}^M\Vert \textbf{u}_{h,m}\Vert ^{2^{q}-2}_{L^{2}_x}\Vert \nabla \textbf{u}_{h,m}\Vert ^2_{L^2_x}\bigg ]\\ {}&\le \,c(q,T)\mathbb E\Big [1+\Vert \textbf{u}_{h,0}\Vert _{L_x^2}^{2^q}\Big ]. \end{aligned} \end{aligned}$$
(4.3)

Our error analysis for (4.2) is based on an auxiliary problem which coincides with (4.2) until a discrete stopping time. As we shall see below both problems coincide with high probability. For every \(m\ge 1\) we introduce the discrete stopping time

$$\begin{aligned} {\mathfrak {t}}_m^R:=\max _{1\le n\le m}\big \{t_n:t_n\le {\mathfrak {t}}_R\big \}, \end{aligned}$$
(4.4)

which is obviously an \(({\mathfrak {F}}_{t_m})\)-stopping time (but not an \(({\mathfrak {F}}_t)\)-stopping time). Setting \(\tau _{m}^R:=\mathfrak t_m^R-{\mathfrak {t}}_{m-1}^R\) we introduce \(\textbf{u}_{h,m}^R\) as the \(V_{\textrm{div}}^{h,i}(\mathcal {O},\mathbb R^2)\)-valued solution to

$$\begin{aligned} \begin{aligned}&\int _{\mathcal {O}}\textbf{u}_{h,m}^R\cdot {\varvec{\varphi }}\, \textrm{d}x+\tau _{m}^R\int _{\mathcal {O}}\big ((\textbf{u}^R_{h,m-1}\cdot \nabla )\textbf{u}_{h,m}^R+(\textrm{div}\textbf{u}_{h,m-1}^R)\textbf{u}_{h,m}^R\big )\cdot {\varvec{\phi }}\, \textrm{d}x\\&\quad =\int _{\mathcal {O}}\textbf{u}_{h,m-1}^R\cdot {\varvec{\varphi }}\, \textrm{d}x-\mu \,\tau _{m}^R\int _{\mathcal {O}}\nabla \textbf{u}_{h,m}^R:\nabla {\varvec{\phi }}\, \textrm{d}x\\&\qquad +\frac{\tau _{m}^R}{\tau }\int _{\mathcal {O}}\Phi (\textbf{u}_{h,m-1}^R)\,\Delta _m W\cdot {\varvec{\varphi }}\, \textrm{d}x\end{aligned} \end{aligned}$$
(4.5)

for every \({\varvec{\phi }}\in V^{h,i}_{\textrm{div}}(\mathcal {O},\mathbb R^2)\). Obviously \(\textbf{u}_{h,m}^R=\textbf{u}_{h,m}\) in \([t_m= {\mathfrak {t}}_m^R]\). Our main effort is dedicated to the proof of an error estimate for (4.5) in the following theorem, for which \(R>0\) is a fixed truncation parameter and \(T>0\) an arbitrary but fixed time.

Theorem 4.2

Let \(\textbf{u}_0\in L^8(\Omega ,W^{2,2}(\mathcal {O},\mathbb R^2))\cap L^{20}(\Omega ;W^{1,2}_{0,\textrm{div}}(\mathcal {O},\mathbb R^2))\) be \(\mathfrak F_0\)-measurable with \({\mathcal {A}}\textbf{u}_0-\mathcal P(\textbf{u}_0\cdot \nabla \textbf{u}_0)|_{\partial {\mathcal {O}}}=0\) \({\mathbb {P}}\)-a.s., and assume that \(\Phi \) satisfies (2.1)–(2.5). Let

$$\begin{aligned} (\textbf{u},(\mathfrak {t}_R)_{R\in \mathbb N},\mathfrak {t}) \end{aligned}$$

be the unique maximal global strong pathwise solution to (1.1) in the sense of Definition 2.4. Let \(({\mathfrak {t}}_m^R)_{m=1}^M\) be defined by (4.4). Then we have for all \(R\in \mathbb N\) and all \(\alpha <\frac{1}{2}\), \(\beta <1\)

$$\begin{aligned} \begin{aligned}&\mathbb E\bigg [\max _{1\le m\le M}\Vert \textbf{u}(\mathfrak t^R_{m})-\textbf{u}_{h,m}^R\Vert _{L^2_x}^2+\sum _{m=1}^M \tau _{m}^R\Vert \nabla \textbf{u}(\mathfrak t_{m}^R)-\nabla \textbf{u}_{h,m}^R\Vert _{L^2_x}^2\bigg ]\\&\quad \le \,ce^{cR^4}\,\big (h^{2\beta }+\tau ^{2\alpha }\big ), \end{aligned} \end{aligned}$$
(4.6)

where \((\textbf{u}_{h,m}^R)_{m=1}^M\) is the solution to (4.5) with \(\textbf{u}_{h,0}^R=\Pi _h\textbf{u}_0\). The constant c in (4.6) is independent of \(\tau \), h and R.

Remark 4.3

1. In previous papers concerning the periodic problem, in particular [12], the idea is to consider the equation for the error in the m-th step and multiply by the indicator function of a set \(\Omega ^{h,\tau }_{m-1}\subset \Omega \). Hereby \(\Omega ^{h,\tau }_{m-1}\subset \Omega \) is \(\mathfrak F_{t_{m-1}}\)-measurable and certain quantities up to time \(t_{m-1}\) are bounded in \(\Omega ^{h,\tau }_{m-1}\). It is, however, not necessary to control the continuous solution in this way since global estimates are available, see, e.g., [12, Lemma 2.1] or [5, Lemma 2].

In our situation, having only stopped estimates as in Lemma 3.1, it is necessary to also control the continuous solution. For certain quantities, having control until time \(t_{m-1}\) is not sufficient (see, for instance, the estimates for \(I_2(m)\) and \(I_3(m)\) below, where norms of \(\textbf{u}\) over \([t_{m-1},t_m]\) appear). Using \({\mathfrak {F}}_{t_{m}}\)-measurable sets \(\Omega ^{h,\tau }_{m}\subset \Omega \) instead is not possible either as it destroys the martingale property of \({\mathscr {M}}^1\) given below in (4.7).

Both problems are overcome by the use of the discrete stopping time \({\mathfrak {t}}_m^R\): we can control norms of \(\textbf{u}\) over \([t_{m-1},t_m]\), and \({\mathscr {M}}^1\) is estimated at time \(\mathfrak t_R\ge {\mathfrak {t}}_m^R\) such that the martingale property can be used.

2. The (discrete) gradient of the noise term in (4.1) need not be subtracted here, as is done in [14], since a simultaneous space-time error analysis is used to prove Theorem 4.4 below.

Our main result is now a direct consequence of Theorem 4.2: Setting for \(\varepsilon >0\) arbitrary \(R=c^{-1/4}\root 4 \of {-2\varepsilon \log \min \{h,\tau \}}\), we have for any \(\xi >0\)

$$\begin{aligned}&{\mathbb {P}}\bigg [\frac{\max _{1\le m\le M}\Vert \textbf{u}(t_m)-\textbf{u}_{h,m}\Vert _{L^2_x}^2+\sum _{m=1}^M \tau \Vert \nabla \textbf{u}(t_m)-\nabla \textbf{u}_{h,m}\Vert _{L^2_x}^2}{h^{2\beta -2\varepsilon }+\tau ^{2\alpha -2\varepsilon }}>\xi \bigg ]\\&\quad \le {\mathbb {P}}\bigg [\frac{\max _{1\le m\le M}\Vert \textbf{u}({\mathfrak {t}}^R_{m})-\textbf{u}_{h,m}^R\Vert _{L^2_x}^2+\sum _{m=1}^M \tau _{m}^R\Vert \nabla \textbf{u}(\mathfrak t_{m}^R)-\nabla \textbf{u}_{h,m}^R\Vert _{L^2_x}^2}{h^{2\beta -2\varepsilon }+\tau ^{2\alpha -2\varepsilon }}>\xi \bigg ]\\&\qquad +{\mathbb {P}}\big [\{{\mathfrak {t}}_R<T\}\big ] \rightarrow 0 \end{aligned}$$

as \(h,\tau \rightarrow 0\) (recall that \({\mathfrak {t}}_R\rightarrow \infty \) \(\mathbb P\)-a.s. by Theorem 2.5 and that \(\mathfrak t_M^R<t_M\) implies \({\mathfrak {t}}_R<T\)). Relabeling \(\alpha \) and \(\beta \) we have proved the following result.

Theorem 4.4

Let \((\Omega ,\mathfrak {F},(\mathfrak {F}_t)_{t\ge 0},\mathbb {P})\) be a given stochastic basis with a complete right-continuous filtration and an \((\mathfrak {F}_t)\)-cylindrical Wiener process W. Let \(\textbf{u}_0\in L^8(\Omega ,W^{2,2}(\mathcal {O},\mathbb R^2))\cap L^{20}(\Omega ;W^{1,2}_{0,\textrm{div}}(\mathcal {O},\mathbb R^2))\) be \(\mathfrak F_0\)-measurable with \({\mathcal {A}}\textbf{u}_0-\mathcal P(\textbf{u}_0\cdot \nabla \textbf{u}_0)|_{\partial {\mathcal {O}}}=0\) \({\mathbb {P}}\)-a.s., and assume that \(\Phi \) satisfies (2.1)–(2.5). Let

$$\begin{aligned} (\textbf{u},(\mathfrak {t}_R)_{R\in \mathbb N},\mathfrak {t}) \end{aligned}$$

be the unique maximal global strong pathwise solution to (1.1) from Theorem 2.5. Then we have for any \(\xi >0\), \(\alpha <\frac{1}{2}\), \(\beta <1\)

$$\begin{aligned}&{\mathbb {P}}\bigg [\frac{\max _{1\le m\le M}\Vert \textbf{u}(t_m)-\textbf{u}_{h,m}\Vert _{L^2_x}^2+\sum _{m=1}^M \tau \Vert \nabla \textbf{u}(t_m)-\nabla \textbf{u}_{h,m}\Vert _{L^2_x}^2}{h^{2\beta }+\tau ^{2\alpha }}>\xi \bigg ]\rightarrow 0 \end{aligned}$$

as \(h,\tau \rightarrow 0\), where \((\textbf{u}_{h,m})_{m=1}^M\) is the solution to (4.2) with \(\textbf{u}_{h,0}=\Pi _h\textbf{u}_0\).
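The role of the choice of R made before Theorem 4.4 can be made explicit by a short computation. The following is only a sketch of the bookkeeping, assuming \(\min \{h,\tau \}<1\) so that the logarithm is negative and R is well defined. From \(R=c^{-1/4}\root 4 \of {-2\varepsilon \log \min \{h,\tau \}}\) we get

$$\begin{aligned} cR^4=-2\varepsilon \log \min \{h,\tau \},\qquad e^{cR^4}=\big (\min \{h,\tau \}\big )^{-2\varepsilon }. \end{aligned}$$

Hence the exponential factor in (4.6) grows only like a small negative power of \(\min \{h,\tau \}\), which is what the loss of \(2\varepsilon \) in the exponents of the denominator is meant to compensate. Moreover, \(R\rightarrow \infty \) as \(h,\tau \rightarrow 0\), so that \({\mathbb {P}}[\{{\mathfrak {t}}_R<T\}]\rightarrow 0\) by the almost sure divergence of \({\mathfrak {t}}_R\).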

In order to finish the proof of our main result stated in Theorem 4.4 above, we focus now on proving the error estimate from Theorem 4.2 concerning the auxiliary problem (4.5).

Proof of Theorem 4.2

Define the error \(\textbf{e}_{h,m}=\textbf{u}({\mathfrak {t}}_m^R)-\textbf{u}_{h,m}^R\). Subtracting (3.24) and (4.5) and recalling that functions from \(W^{1,2}_0(\mathcal {O},\mathbb R^2)\) are admissible in (3.24) we obtain

$$\begin{aligned}&\int _{\mathcal {O}}\textbf{e}_{h,m}\cdot {\varvec{\varphi }}\, \textrm{d}x+\int _{{\mathfrak {t}}_{m-1}^R}^{{\mathfrak {t}}_m^R}\int _{\mathcal {O}}\mu \nabla \textbf{u}(\sigma ):\nabla {\varvec{\phi }}\, \textrm{d}x\, \textrm{d}\sigma -\tau _{m}^R\int _{\mathcal {O}}\mu \nabla \textbf{u}^R_{h,m}:\nabla {\varvec{\phi }}\, \textrm{d}x\\&\quad =\int _{\mathcal {O}}\textbf{e}_{h,m-1}\cdot {\varvec{\varphi }}\, \textrm{d}x-\int _{{\mathfrak {t}}_{m-1}^R}^{{\mathfrak {t}}_m^R}\int _{\mathcal {O}}(\textbf{u}(\sigma )\cdot \nabla )\textbf{u}(\sigma )\cdot {\varvec{\phi }}\,\textrm{d}x\, \textrm{d}\sigma \\ {}&\qquad +\tau _{m}^R\int _{\mathcal {O}}\big ((\textbf{u}_{h,m-1}^R\cdot \nabla )\textbf{u}_{h,m}^R+(\textrm{div}\textbf{u}^R_{h,m-1})\textbf{u}_{h,m}^R\big )\cdot {\varvec{\phi }}\, \textrm{d}x\\&\qquad +\int _{\mathcal {O}}\int _{{\mathfrak {t}}_{m-1}^R}^{{\mathfrak {t}}_m^R}\Phi (\textbf{u}(\sigma ))\,\mathrm dW\cdot {\varvec{\varphi }}\, \textrm{d}x-\int _{\mathcal {O}}\int _{{\mathfrak {t}}_{m-1}^R}^{{\mathfrak {t}}_m^R}\Phi (\textbf{u}_{h,m-1}^R)\,\mathrm dW\cdot {\varvec{\varphi }}\, \textrm{d}x\\&\qquad +\int _{{\mathfrak {t}}_{m-1}^R}^{\mathfrak t_m^R}\int _{\mathcal {O}}\pi (\sigma )\,\textrm{div}{\varvec{\phi }}\, \textrm{d}x\, \textrm{d}\sigma \end{aligned}$$

for every \({\varvec{\phi }}\in V^{h,i}(\mathcal {O},\mathbb R^2)\), which is equivalent to

$$\begin{aligned}&\int _{\mathcal {O}}\textbf{e}_{h,m}\cdot {\varvec{\varphi }}\, \textrm{d}x+\tau _{m}^R\int _{\mathcal {O}}\mu \Big (\nabla \textbf{u}(\mathfrak t_{m}^R)-\nabla \textbf{u}^R_{h,m}\Big ):\nabla {\varvec{\phi }}\, \textrm{d}x\\&\quad =\int _{\mathcal {O}}\textbf{e}_{h,m-1}\cdot {\varvec{\varphi }}\, \textrm{d}x+\int _{{\mathfrak {t}}_{m-1}^R}^{\mathfrak t_m^R}\int _{\mathcal {O}}\mu \big (\nabla \textbf{u}(\mathfrak t_{m}^R)-\nabla \textbf{u}(\sigma )\big ):\nabla {\varvec{\phi }}\, \textrm{d}x\, \textrm{d}\sigma \\&\qquad +\int _{{\mathfrak {t}}_{m-1}^R}^{\mathfrak t_m^R}\int _{\mathcal {O}}\Big ((\textbf{u}(\mathfrak t_{m-1}^R)\cdot \nabla )\textbf{u}(\mathfrak t_{m}^R)-(\textbf{u}(\sigma )\cdot \nabla )\textbf{u}(\sigma )\Big )\cdot {\varvec{\phi }}\,\textrm{d}x\, \textrm{d}\sigma \\&\qquad -\tau _{m}^R\int _{\mathcal {O}}\Big ((\textbf{u}(\mathfrak t_{m-1}^R)\cdot \nabla )\textbf{u}(\mathfrak t_{m}^R)-\big ((\textbf{u}^R_{h,m-1}\cdot \nabla )\textbf{u}^R_{h,m}+(\textrm{div}\textbf{u}^R_{h,m-1})\textbf{u}^R_{h,m}\big )\Big )\cdot {\varvec{\phi }}\, \textrm{d}x\\&\qquad +\int _{\mathcal {O}}\int _{{\mathfrak {t}}_{m-1}^R}^{\mathfrak t_m^R}\big (\Phi (\textbf{u}(\sigma ))-\Phi (\textbf{u}^R_{h,m-1})\big )\,\mathrm dW\cdot {\varvec{\varphi }}\, \textrm{d}x+\int _{{\mathfrak {t}}_{m-1}^R}^{\mathfrak t_m^R}\int _{\mathcal {O}}\pi (\sigma )\,\textrm{div}{\varvec{\phi }}\,\textrm{d}x\, \textrm{d}\sigma . \end{aligned}$$

Setting \({\varvec{\phi }}=\Pi _h\textbf{e}_{h,m}\) and applying the identity \(\textbf{a}\cdot (\textbf{a}-\textbf{b})=\frac{1}{2}\big (|\textbf{a}|^2-|\textbf{b}|^2+|\textbf{a}-\textbf{b}|^2\big )\) (which holds for any \(\textbf{a},\textbf{b}\in {\mathbb {R}}^n\)) we gain

$$\begin{aligned}&\frac{1}{2}\big (\Vert \Pi _h\textbf{e}_{h,m}\Vert _{L^2_x}^2-\Vert \Pi _h\textbf{e}_{h,m-1}\Vert _{L^2_x}^2+\Vert \Pi _h\textbf{e}_{h,m}-\Pi _h\textbf{e}_{h,m-1}\Vert _{L^2_x}^2\big )+\tau _{m}^R\mu \Vert \nabla \textbf{e}_{h,m}\Vert ^2_{L^2_x}\\&\quad =\tau _{m}^R\int _{\mathcal {O}}\mu \nabla \textbf{e}_{h,m}:\nabla \big (\textbf{u}({\mathfrak {t}}_{m}^R)-\Pi _h\textbf{u}({\mathfrak {t}}_{m}^R)\big )\, \textrm{d}x\\&\qquad +\int _{{\mathfrak {t}}_{m-1}^R}^{\mathfrak t_{m}^R}\int _{\mathcal {O}}\mu \big (\nabla \textbf{u}(\mathfrak t_{m}^R)-\nabla \textbf{u}(\sigma )\big ):\nabla \Pi _h\textbf{e}_{h,m}\, \textrm{d}x\, \textrm{d}\sigma \\&\qquad +\int _{{\mathfrak {t}}_{m-1}^R}^{{\mathfrak {t}}_{m}^R}\int _{\mathcal {O}}\Big ((\textbf{u}({\mathfrak {t}}_{m-1}^R)\cdot \nabla )\textbf{u}({\mathfrak {t}}_{m}^R)-(\textbf{u}(\sigma )\cdot \nabla )\textbf{u}(\sigma )\Big )\cdot \Pi _h\textbf{e}_{h,m}\,\textrm{d}x\, \textrm{d}\sigma \\&\qquad -\tau _{m}^R\int _{\mathcal {O}}\Big ((\textbf{u}({\mathfrak {t}}_{m-1}^R)\cdot \nabla )\textbf{u}({\mathfrak {t}}_{m}^R)-\big ((\textbf{u}_{h,m-1}^R\cdot \nabla +\textrm{div}\textbf{u}^R_{h,m-1})\textbf{u}_{h,m}^R\big )\Big )\cdot \Pi _h\textbf{e}_{h,m}\, \textrm{d}x\\&\qquad +\int _{\mathcal {O}}\int _{{\mathfrak {t}}_{m-1}^R}^{{\mathfrak {t}}^R_m}\big (\Phi (\textbf{u}(\sigma ))-\Phi (\textbf{u}^R_{h,m-1})\big )\,\mathrm dW\cdot \Pi _h\textbf{e}_{h,m}\, \textrm{d}x\\&\qquad +\int _{{\mathfrak {t}}_{m-1}^R}^{{\mathfrak {t}}_{m}^R}\int _{\mathcal {O}}\pi (\sigma )\,\textrm{div}\Pi _h\textbf{e}_{h,m}\,\textrm{d}x\, \textrm{d}\sigma \\&\quad =:I_1(m)+\dots +I_6(m). \end{aligned}$$
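As a reminder, the algebraic identity invoked before the last display follows by expanding the square; the computation below uses nothing beyond the Euclidean inner product:

$$\begin{aligned} \tfrac{1}{2}\big (|\textbf{a}|^2-|\textbf{b}|^2+|\textbf{a}-\textbf{b}|^2\big )=\tfrac{1}{2}\big (2|\textbf{a}|^2-2\,\textbf{a}\cdot \textbf{b}\big )=\textbf{a}\cdot (\textbf{a}-\textbf{b}). \end{aligned}$$

In the display above it is applied pointwise and integrated over \(\mathcal {O}\), with \(\textbf{a}=\Pi _h\textbf{e}_{h,m}\) and \(\textbf{b}=\Pi _h\textbf{e}_{h,m-1}\), which produces the three squared \(L^2_x\)-norms on the left-hand side.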

Eventually, we will take the maximum with respect to \(m\in \{1,\dots ,M\}\) and apply expectations. Let us explain how to deal with \(\mathbb E\big [\max _m I_1(m)\big ],\dots ,\mathbb E\big [\max _mI_6(m)\big ]\) independently.

We clearly have for any \(\kappa >0\)

$$\begin{aligned} I_1(m)&\le \,\kappa \tau _{m}^R\Vert \nabla \textbf{e}_{h,m}\Vert _{L^2_x}^2+c(\kappa )\tau _{m}^R\Vert \nabla (\textbf{u}({\mathfrak {t}}_{m}^R)-\Pi _h\textbf{u}({\mathfrak {t}}_{m}^R))\Vert ^2_{L^2_x}\\&\le \,\kappa \tau _{m}^R\Vert \nabla \textbf{e}_{h,m}\Vert ^2_{L^2_x}+c(\kappa )\tau h^{2\beta }\Vert \textbf{u}({\mathfrak {t}}_{m}^R)\Vert ^2_{W^{1+\beta ,2}_x} \end{aligned}$$

due to the \(W^{1,2}_x\)-stability of \(\Pi _h\), cf. (2.12). Note that the expectation of the last term may be bounded with the help of Lemma 3.1 (c) using \(\mathfrak t_{m}^R\le {\mathfrak {t}}_R\). We continue with \(I_2(m)\), for which we obtain

$$\begin{aligned} I_2(m)&\le \,\kappa \tau _{m}^R\Vert \nabla \Pi _h\textbf{e}_{h,m}\Vert _{L^2_x}^2+c(\kappa )\int _{{\mathfrak {t}}_{m-1}^R}^{{\mathfrak {t}}_{m}^R}\Vert \nabla (\textbf{u}({\mathfrak {t}}_{m}^R)-\textbf{u}(\sigma ))\Vert _{L^2_x}^2\, \textrm{d}\sigma \\&\le \,\kappa \tau _{m}^R\Vert \nabla \textbf{e}_{h,m}\Vert ^2_{L^2_x}+\,\kappa \tau _{m}^R\Vert \nabla (\textbf{u}(\mathfrak t_m^R)-\Pi _h\textbf{u}(\mathfrak t_m^R))\Vert _{L^2_x}^2\\ {}&\qquad +c(\kappa )\tau ^{1+2\alpha }\Vert \nabla \textbf{u}\Vert _{C^\alpha ([\mathfrak t_{m-1}^R,{\mathfrak {t}}_{m}^R];L^2_x)}^2, \end{aligned}$$

where the last term can be controlled by Corollary 3.4 and the second last one by (2.12) and Lemma 3.1 (c) as for \(I_1(m)\). We proceed by

$$\begin{aligned} I_3(m)&=-\int _{{\mathfrak {t}}_{m-1}^R}^{{\mathfrak {t}}_{m}^R}\int _{\mathcal {O}}\Big (\textbf{u}({\mathfrak {t}}_{m}^R)\otimes \textbf{u}({\mathfrak {t}}_{m-1}^R)-\textbf{u}(\sigma )\otimes \textbf{u}(\sigma )\Big ):\nabla \Pi _h\textbf{e}_{h,m}\,\textrm{d}x\, \textrm{d}\sigma \\&\le \,\kappa \tau _{m}^R\Vert \nabla \Pi _h\textbf{e}_{h,m}\Vert ^2_{L^2_x}+c(\kappa )\int _{{\mathfrak {t}}_{m-1}^R}^{{\mathfrak {t}}_{m}^R}\Vert \textbf{u}({\mathfrak {t}}_{m}^R)\otimes \textbf{u}({\mathfrak {t}}_{m-1}^R)-\textbf{u}(\sigma )\otimes \textbf{u}(\sigma )\Vert ^2_{L^2_x}\, \textrm{d}\sigma \\&\le \,\kappa \tau _{m}^R\Vert \nabla \textbf{e}_{h,m}\Vert ^2_{L^2_x}+\,\kappa \tau _{m}^R\Vert \nabla (\textbf{u}({\mathfrak {t}}_m^R)-\Pi _h\textbf{u}({\mathfrak {t}}_m^R))\Vert ^2_{L^2_x}\\ {}&\qquad +c(\kappa )\tau ^{1+2\alpha }\Vert \textbf{u}\Vert ^2_{L^\infty (({\mathfrak {t}}_{m-1}^R,{\mathfrak {t}}_{m}^R)\times \mathcal {O})}\Vert \textbf{u}\Vert _{{C^\alpha ([{\mathfrak {t}}_{m-1}^R,{\mathfrak {t}}_{m}^R];L^2_x)}}^2\\&\le \,\kappa \tau _{m}^R\Vert \nabla \textbf{e}_{h,m}\Vert ^2_{L^2_x}+c(\kappa )\tau h^{2\beta }\Vert \textbf{u}(\mathfrak t_{m}^R)\Vert ^2_{W^{1+\beta ,2}_x}\\ {}&\qquad +c(\kappa )\tau ^{1+2\alpha }\Vert \textbf{u}\Vert ^2_{L^\infty (\mathfrak t_{m-1}^R,\mathfrak t_{m}^R;W^{2,2}_x)}\Vert \textbf{u}\Vert _{{C^\alpha ([\mathfrak t_{m-1}^R,{\mathfrak {t}}_{m}^R];L^2_x)}}^2, \end{aligned}$$

where we used Sobolev’s embedding \(W^{2,2}(\mathcal {O},\mathbb R^2)\hookrightarrow L^\infty (\mathcal {O},\mathbb R^2)\). We can apply again Lemma 3.1 (c) and Corollary 3.4 to the last term. The term \(I_4(m)\) can be decomposed as

$$\begin{aligned} I_4(m)&=I_4^1(m)+I_4^2(m)+I_4^3(m),\\ I_4^1(m)&=-\tau _{m}^R\int _{\mathcal {O}}(\textbf{u}({\mathfrak {t}}_{m-1}^R)\cdot \nabla )\textbf{e}_{h,m}\cdot \big (\textbf{u}({\mathfrak {t}}_{m}^R)-\Pi _h\textbf{u}({\mathfrak {t}}_{m}^R)\big )\, \textrm{d}x,\\ I_4^2(m)&=\tau _{m}^R\int _{\mathcal {O}}(\textbf{e}_{h,m-1}\cdot \nabla )\textbf{e}_{h,m}\cdot \big (\textbf{u}({\mathfrak {t}}_{m}^R)-\Pi _h\textbf{u}({\mathfrak {t}}_{m}^R)\big )\, \textrm{d}x\\&\qquad +\tau _{m}^R\int _{\mathcal {O}}(\textrm{div}\textbf{e}_{h,m-1})\textbf{e}_{h,m}\cdot \big (\textbf{u}({\mathfrak {t}}_{m}^R)-\Pi _h\textbf{u}({\mathfrak {t}}_{m}^R)\big )\, \textrm{d}x,\\ I_4^3(m)&=-\tau _{m}^R\int _{\mathcal {O}}(\textbf{e}_{h,m-1}\cdot \nabla )\Pi _h\textbf{e}_{h,m}\cdot \textbf{u}({\mathfrak {t}}_{m}^R)\, \textrm{d}x\\&\qquad -\tau _{m}^R\int _{\mathcal {O}}(\textrm{div}\textbf{e}_{h,m-1})\Pi _h\textbf{e}_{h,m}\cdot \textbf{u}(\mathfrak t_{m}^R)\, \textrm{d}x. \end{aligned}$$

We obtain for any \(\kappa >0\)

$$\begin{aligned} I_4^1(m)&\le \,\tau _{m}^R\Vert \nabla \textbf{e}_{h,m}\Vert _{L^2_x}\Vert \textbf{u}({\mathfrak {t}}_{m-1}^R)\Vert _{L^4_x}\Vert \textbf{u}({\mathfrak {t}}_{m}^R)-\Pi _h\textbf{u}({\mathfrak {t}}_{m}^R)\Vert _{L^4_x}\\&\le \,c\tau _{m}^R\Vert \nabla \textbf{e}_{h,m}\Vert _{L^2_x}\Vert \textbf{u}({\mathfrak {t}}_{m-1}^R)\Vert _{W^{1,2}_x}\Vert \textbf{u}({\mathfrak {t}}_{m}^R)-\Pi _h\textbf{u}({\mathfrak {t}}_{m}^R)\Vert _{L^2_x}^{\frac{1}{2}}\Vert \textbf{u}({\mathfrak {t}}_{m}^R)-\Pi _h\textbf{u}({\mathfrak {t}}_{m}^R)\Vert _{W^{1,2}_x}^{\frac{1}{2}}\\&\le \,c\tau _{m}^R\Vert \nabla \textbf{e}_{h,m}\Vert _{L^2_x}h^{1+\beta /2} R\Vert \textbf{u}({\mathfrak {t}}_{m}^R)\Vert _{W^{1+\beta ,2}_x}\\&\le \,\kappa \tau _{m}^R\Vert \nabla \textbf{e}_{h,m}\Vert ^2_{L^2_x}+c(\kappa ) h^{2+\beta } R^2\tau _{m}^R\Vert \textbf{u}(\mathfrak t_{m}^R)\Vert _{W^{1+\beta ,2}_x}^2 \end{aligned}$$

by the embedding \(W^{1,2}(\mathcal {O},\mathbb R^2)\hookrightarrow L^4(\mathcal {O},\mathbb R^2)\), Ladyzhenskaya’s inequality, the definition of \({\mathfrak {t}}_m^R\), and (2.12). The first term can be absorbed for \(\kappa \) small enough, whereas the second one (in summed form and expectation) is bounded by \(h^{2+\beta }R^{12}\) due to Lemma 3.1 (c). Similarly, we have

$$\begin{aligned} I_4^2(m)&\le \tau _{m}^R\Vert \nabla \textbf{e}_{h,m}\Vert _{L^2_x}\Vert \textbf{e}_{h,m-1}\Vert _{L^4_x}\Vert \textbf{u}({\mathfrak {t}}_{m}^R)-\Pi _h\textbf{u}({\mathfrak {t}}_{m}^R)\Vert _{L^4_x}\\&\quad +\tau _{m}^R\Vert \nabla \textbf{e}_{h,m-1}\Vert _{L^2_x}\Vert \textbf{e}_{h,m}\Vert _{L^4_x}\Vert \textbf{u}({\mathfrak {t}}_{m}^R)-\Pi _h\textbf{u}({\mathfrak {t}}_{m}^R)\Vert _{L^4_x}\\&\le \,c\tau _{m}^Rh^{1+\beta /2}\Vert \nabla \textbf{e}_{h,m}\Vert _{L^2_x}\Vert \textbf{e}_{h,m-1}\Vert _{L^2_x}^{\frac{1}{2}}\Vert \nabla \textbf{e}_{h,m-1}\Vert ^{\frac{1}{2}}_{L^2_x}\Vert \textbf{u}({\mathfrak {t}}_{m}^R)\Vert _{W^{1+\beta ,2}_x}\\&\quad +c\tau _{m}^Rh^{1+\beta /2}\Vert \nabla \textbf{e}_{h,m-1}\Vert _{L^2_x}\Vert \textbf{e}_{h,m}\Vert _{L^2_x}^{\frac{1}{2}}\Vert \nabla \textbf{e}_{h,m}\Vert _{L^2_x}^{\frac{1}{2}}\Vert \textbf{u}({\mathfrak {t}}_{m}^R)\Vert _{W^{1+\beta ,2}_x}\\&\le \,\kappa \tau _{m}^R\Big (\Vert \nabla \textbf{e}_{h,m-1}\Vert ^2_{L^2_x}+\Vert \nabla \textbf{e}_{h,m}\Vert ^2_{L^2_x}\Big )\\&\quad +c(\kappa )\,\tau _{m}^R \,h^{4+2\beta }\Big (\max _{1\le n\le m}\Vert \textbf{e}_{h,n}\Vert _{L^2_x}^2\Big )\Vert \textbf{u}(\mathfrak t_{m}^R)\Vert ^4_{W^{1+\beta ,2}_x}. \end{aligned}$$

The last term (in summed form and expectation) can be controlled by Lemma 3.1 (c) (with \(r=8\)) and Lemma 4.1 (with \(q=2\)). Note that we either have \(\textbf{u}_{h,m}=\textbf{u}_{h,m}^R\) or \(\tau _m^R=0\). Finally, by definition of \({\mathfrak {t}}_m^R\),

$$\begin{aligned} I_4^3(m)&\le \tau _{m}^R\Vert \nabla \Pi _h\textbf{e}_{h,m}\Vert _{L^2_x}\Vert \textbf{e}_{h,m-1}\Vert _{L^4_x}\Vert \textbf{u}({\mathfrak {t}}_{m}^R)\Vert _{L^4_x}\\&\quad + \tau _{m}^R\Vert \nabla \textbf{e}_{h,m-1}\Vert _{L^2_x}\Vert \Pi _h\textbf{e}_{h,m}\Vert _{L^4_x}\Vert \textbf{u}({\mathfrak {t}}_{m}^R)\Vert _{L^4_x}\\&\le \tau _{m}^R\Vert \nabla \textbf{e}_{h,m}\Vert _{L^2_x}\Vert \textbf{e}_{h,m-1}\Vert _{L^2_x}^{\frac{1}{2}}\Vert \nabla \textbf{e}_{h,m-1}\Vert ^{\frac{1}{2}}_{L^2_x}\Vert \textbf{u}({\mathfrak {t}}_{m}^R)\Vert _{W^{1,2}_x}\\&\quad + \tau _{m}^R\Vert \nabla \textbf{e}_{h,m-1}\Vert _{L^2_x}\Vert \Pi _h\textbf{e}_{h,m}\Vert _{L^2_x}^{\frac{1}{2}}\Vert \nabla \Pi _h\textbf{e}_{h,m}\Vert ^{\frac{1}{2}}_{L^2_x}\Vert \textbf{u}({\mathfrak {t}}_{m}^R)\Vert _{W^{1,2}_x}\\&\le \,\kappa \tau _{m}^R\Big (\Vert \nabla \textbf{e}_{h,m-1}\Vert ^2_{L^2_x}+\Vert \nabla \Pi _h\textbf{e}_{h,m}\Vert ^2_{L^2_x}\Big )+c_\kappa \,\tau _{m}^R R^4\big (\Vert \Pi _h\textbf{e}_{h,m}\Vert ^2_{L^2_x}+\Vert \textbf{e}_{h,m-1}\Vert ^2_{L^2_x}\big )\\&\le \,\kappa \tau _{m}^R\Big (\Vert \nabla \textbf{e}_{h,m-1}\Vert ^2_{L^2_x}+\Vert \nabla \textbf{e}_{h,m}\Vert ^2_{L^2_x}\Big )+c(\kappa )\,\tau _{m}^R R^4\big (\Vert \Pi _h\textbf{e}_{h,m}\Vert ^2_{L^2_x}+\Vert \Pi _h\textbf{e}_{h,m-1}\Vert ^2_{L^2_x}\big )\\&\quad +c(\kappa )\,\tau _m^R\Vert \nabla (\textbf{u}(\mathfrak t_{m}^R)-\Pi _h\textbf{u}(\mathfrak t_m^R))\Vert ^2_{L^{2}_x}+c(\kappa )\,\tau _{m}^R R^4\Vert \textbf{u}(\mathfrak t_{m-1}^R)-\Pi _h\textbf{u}({\mathfrak {t}}_{m-1}^R)\Vert ^2_{L^2_x}. \end{aligned}$$

The second last term will be dealt with by Gronwall’s lemma leading to a constant of the form \(c e^{cR^4}\). The final line is bounded by \(c(\kappa )\,\tau _{m}^R R^4\,h^{2\beta }\Vert \textbf{u}(\mathfrak t_{m-1}^R)\Vert ^2_{W^{1+\beta ,2}_x}\) using (2.12) and hence can be controlled by Lemma 3.1 (c).
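The origin of the factor \(R^4\) in the estimate of \(I_4^3(m)\) can be traced by applying Young's inequality twice. The following is a sketch for the first of the two products only, under the assumption (used in the same way in the estimate of \(I_4^1(m)\)) that \(\Vert \textbf{u}({\mathfrak {t}}_{m}^R)\Vert _{W^{1,2}_x}\le R\) up to a constant. Writing \(a=\Vert \nabla \textbf{e}_{h,m}\Vert _{L^2_x}\), \(b=\Vert \textbf{e}_{h,m-1}\Vert _{L^2_x}\) and \(d=\Vert \nabla \textbf{e}_{h,m-1}\Vert _{L^2_x}\), we have for any \(\kappa >0\)

$$\begin{aligned} \tau _{m}^R\,a\,b^{\frac{1}{2}}d^{\frac{1}{2}}R&\le \,\kappa \tau _{m}^R\,a^2+\frac{1}{4\kappa }\tau _{m}^R R^2\,b\,d\\&\le \,\kappa \tau _{m}^R\big (a^2+d^2\big )+\frac{1}{64\kappa ^3}\tau _{m}^R R^4\,b^2. \end{aligned}$$

Both steps use \(xy\le \kappa x^2+\frac{1}{4\kappa }y^2\); the second product in the estimate of \(I_4^3(m)\) can be treated in the same way with the roles of the indices m and \(m-1\) interchanged.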

In order to estimate the stochastic term we write

$$\begin{aligned} {\mathscr {M}}_{m}&=\sum _{n=1}^mI_5(n)= \sum _{n=1}^m\int _{{\mathcal {O}}}\int _{{\mathfrak {t}}_{n-1}^R}^{{\mathfrak {t}}_{n}^R}\big (\Phi (\textbf{u})-\Phi (\textbf{u}_{h,n-1}^R)\big )\,\mathrm dW\cdot \Pi _h\textbf{e}_{h,n}\, \textrm{d}x\nonumber \\&= \sum _{n=1}^m\int _{\mathcal {O}}\int _{{\mathfrak {t}}_{n-1}^R}^{{\mathfrak {t}}_{n}^R}\big (\Phi (\textbf{u})-\Phi (\textbf{u}_{h,n-1}^R)\big )\,\mathrm dW\cdot \Pi _h\textbf{e}_{h,n-1}\, \textrm{d}x\nonumber \\&\quad + \sum _{n=1}^m\int _{\mathcal {O}}\int _{{\mathfrak {t}}_{n-1}^R}^{{\mathfrak {t}}_{n}^R}\big (\Phi (\textbf{u})-\Phi (\textbf{u}_{h,n-1}^R)\big )\,\mathrm dW\cdot \Pi _h(\textbf{e}_{h,n}-\textbf{e}_{h,n-1})\, \textrm{d}x\nonumber \\&= \int _{0}^{{\mathfrak {t}}_{m}^R}\sum _{n=1}^M{\textbf{1}}_{[t_{n-1},t_n)}\int _{\mathcal {O}}\big (\Phi (\textbf{u})-\Phi (\textbf{u}_{h,n-1}^R)\big )\,\mathrm dW\cdot \Pi _h\textbf{e}_{h,n-1}\, \textrm{d}x\nonumber \\&\quad + \sum _{n=1}^m\int _{\mathcal {O}}\int _{{\mathfrak {t}}_{n-1}^R}^{{\mathfrak {t}}_{n}^R}\big (\Phi (\textbf{u})-\Phi (\textbf{u}_{h,n-1}^R)\big )\,\mathrm dW\cdot \Pi _h(\textbf{e}_{h,n}-\textbf{e}_{h,n-1})\, \textrm{d}x\nonumber \\&=:{\mathscr {M}}^1({\mathfrak {t}}_{m}^R)+{\mathscr {M}}_{m}^2. \end{aligned}$$
(4.7)

Since the process \(({\mathscr {M}}^1(t\wedge {\mathfrak {t}}_R))_{t\ge 0}\) is an \(({\mathfrak {F}}_t)\)-martingale we gain by the Burkholder-Davis-Gundy inequality (using that \({\mathfrak {t}}_M^R\le {\mathfrak {t}}_R\) by definition)

$$\begin{aligned}&\mathbb E\bigg [\max _{1\le m\le M}\big |{\mathscr {M}}^1({\mathfrak {t}}_m^R)\big |\bigg ]\le \mathbb E\bigg [\sup _{s\in [0,{\mathfrak {t}}_M^R]}\big |{\mathscr {M}}^1(s)\big |\bigg ]\le \mathbb E\bigg [\sup _{s\in [0,T]}\big |{\mathscr {M}}^1(s\wedge {\mathfrak {t}}_R)\big |\bigg ]\\&\quad \le \,c\,\mathbb E\bigg [\bigg (\int _{0}^{T \wedge {\mathfrak {t}}_{R}}\sum _{n=1}^M{\textbf{1}}_{[t_{n-1},t_n)}\Vert \Phi (\textbf{u})-\Phi (\textbf{u}_{h,n-1}^R)\Vert ^2_{L_2({\mathfrak {U}},L^2_x)}\Vert \Pi _h\textbf{e}_{h,n-1}\Vert ^2_{L^2_x}\, \textrm{d}t\bigg )^{\frac{1}{2}}\bigg ]\\&\quad \le \,c\,\mathbb E\bigg [\max _{1\le n\le M}\Vert \Pi _h\textbf{e}_{h,n}\Vert _{L^2_x}\bigg (\int _{0}^{T\wedge {\mathfrak {t}}_R}\sum _{n=1}^M{\textbf{1}}_{[t_{n-1},t_n)}\Vert \Phi (\textbf{u})-\Phi (\textbf{u}_{h,n-1}^R)\Vert ^2_{L_2({\mathfrak {U}},L^2_x)}\, \textrm{d}t\bigg )^{\frac{1}{2}}\bigg ]\\&\quad \le \,\kappa \,\mathbb E\bigg [\max _{1\le n\le M}\Vert \Pi _h\textbf{e}_{h,n}\Vert ^2_{L^2_x}\bigg ]+\,c_\kappa \,\mathbb E\bigg [\int _{0}^{T\wedge \mathfrak t_R}\sum _{n=1}^M{\textbf{1}}_{[t_{n-1},t_n)}\Vert \textbf{u}-\textbf{u}_{h,n-1}^R\Vert _{L^2_x}^2\, \textrm{d}t\bigg ]. \end{aligned}$$

Here, we also used (2.1) as well as Young’s inequality for \(\kappa >0\) arbitrary. Since \(\textbf{u}_{h,n-1}^R=\textbf{u}({\mathfrak {t}}_{n-1}^R)-\textbf{e}_{h,n-1}\) is \(V_{\textrm{div}}^{h,i}(\mathcal {O},\mathbb R^2)\)-valued, we further estimate

$$\begin{aligned}&\mathbb E\bigg [\max _{1\le m\le M}\big |{\mathscr {M}}^1({\mathfrak {t}}_m^R)\big |\bigg ]\\ {}&\le \,\kappa \,\mathbb E\bigg [\max _{1\le n\le M}\Vert \Pi _h\textbf{e}_{h,n}\Vert ^2_{L^2_x}\bigg ]+\,c(\kappa )\,\mathbb E\bigg [\int _{0}^{T\wedge {\mathfrak {t}}_R}\sum _{n=1}^M{\textbf{1}}_{[t_{n-1},t_n)}\Vert \textbf{u}-\textbf{u}({\mathfrak {t}}_{n-1}^R)\Vert _{L^2_x}^2\, \textrm{d}t\bigg ]\\&\quad +\,c(\kappa )\,\mathbb E\bigg [\int _{ 0}^{T\wedge \mathfrak t_R}\sum _{n=1}^M{\textbf{1}}_{[t_{n-1},t_n)}\Vert \textbf{u}(\mathfrak t_{n-1}^R)-\Pi _h\textbf{u}(\mathfrak t_{n-1}^R)\Vert _{L^2_x}^2\, \textrm{d}t\bigg ]\\ {}&\qquad +\,c(\kappa )\,\mathbb E\bigg [\int _{0}^{T\wedge \mathfrak t_R}\sum _{n=1}^M{\textbf{1}}_{[t_{n-1},t_n)}\Vert \Pi _h\textbf{e}_{h,n-1}\Vert _{L^2_x}^2\, \textrm{d}t\bigg ] \end{aligned}$$

We bound the last term by

$$\begin{aligned} \mathbb E\bigg [\int _{0}^{T\wedge {\mathfrak {t}}_R}\sum _{n=1}^M{\textbf{1}}_{[t_{n-1},t_n)}\Vert \Pi _h\textbf{e}_{h,n-1}\Vert _{L^2_x}^2\, \textrm{d}t\bigg ]&\le \mathbb E\bigg [\sum _{n=1}^{M+1}\tau _n^R\Vert \Pi _h\textbf{e}_{h,n-1}\Vert _{L^2_x}^2\bigg ]\\&\le \mathbb E\bigg [\sum _{n=0}^{M}\tau _n^R\Vert \Pi _h\textbf{e}_{h,n}\Vert _{L^2_x}^2\bigg ] \end{aligned}$$

using that \({\mathfrak {t}}_R\wedge t_M\le {\mathfrak {t}}^R_{M+1}\) and \(\tau _n^R\le \tau _{n-1}^R\) with \(\tau _0^R:=\tau \). Applying (2.12) as well as Lemma 3.1 (b) and Corollary 3.4 (b) we gain

$$\begin{aligned}&\mathbb E\bigg [\max _{1\le m\le M}\big |{\mathscr {M}}^1(\mathfrak t_m^R)\big |\bigg ]\\&\quad \le \,\kappa \,\mathbb E\bigg [\max _{1\le n\le M}\Vert \Pi _h\textbf{e}_{h,n}\Vert ^2_{L^2_x}\bigg ]+\,c(\kappa )\,\mathbb E\bigg [\sum _{n=0}^M \tau _{n}^R\Vert \Pi _h\textbf{e}_{h,n}\Vert _{L^2_x}^2\bigg ]\\&\qquad +c(\kappa )\tau ^{2\alpha }\mathbb E\big [\Vert \textbf{u}\Vert _{C^\alpha ([0,T\wedge \mathfrak t_R],L^2_x)}^2\big ]+c(\kappa ) h^{1+\beta }\mathbb E\bigg [\sup _{t\in [0,T]}\int _{\mathcal {O}}|\nabla \textbf{u}(t\wedge {\mathfrak {t}}_R)|^2\, \textrm{d}x\bigg ]\\&\quad \le \,\kappa \,\mathbb E\bigg [\max _{1\le n\le M}\Vert \Pi _h\textbf{e}_{h,n}\Vert ^2_{L^2_x}\bigg ]+\,c(\kappa )\,\mathbb E\bigg [\sum _{n=0}^M \tau \Vert \Pi _h\textbf{e}_{h,n}\Vert _{L^2_x}^2\bigg ]\\&\qquad +c(\kappa )\tau ^{2\alpha }R^{20}+c(\kappa ) h^{1+\beta }R^6. \end{aligned}$$

Similarly, using the Cauchy-Schwarz inequality, Young’s inequality, the Itô isometry and (2.1), we have for \(\kappa >0\)

$$\begin{aligned}&\mathbb E\bigg [\max _{1\le m\le M}|{\mathscr {M}}_{m}^2|\bigg ]\\&\quad \le \mathbb E\bigg [ \sum _{n=1}^M\bigg ( \kappa \Vert \Pi _h(\textbf{e}_{h,n}-\textbf{e}_{h,n-1})\Vert _{L^2_x}^2 +c(\kappa ) \left\| \int _{{\mathfrak {t}}_{n-1}^R}^{\mathfrak t_{n}^R}\big (\Phi (\textbf{u})-\Phi (\textbf{u}_{h,n-1}^R)\big )\,\mathrm dW \right\| _{L^2_x}^2\bigg ) \bigg ]\\&\quad \le \kappa \mathbb E\bigg [ \sum _{n=1}^M \Vert \Pi _h(\textbf{e}_{h,n}-\textbf{e}_{h,n-1})\Vert _{L^2_x}^2 \bigg ] + c(\kappa )\mathbb E\bigg [\sum _{n=1}^M\int _{ {\mathfrak {t}}_{n-1}^R}^{\mathfrak t_{n}^R}\Vert \textbf{u}-\textbf{u}_{h,n-1}^R\Vert _{L^2_x}^2\, \textrm{d}t\bigg ]\\&\quad \le \kappa \mathbb E\bigg [ \sum _{n=1}^M \Vert \Pi _h(\textbf{e}_{h,n}-\textbf{e}_{h,n-1})\Vert _{L^2_x}^2 \bigg ] +c(\kappa ) \,\mathbb E\bigg [\sum _{n=1}^M \int _{ {\mathfrak {t}}_{n-1}^R}^{\mathfrak t_{n}^R}\Vert \textbf{u}-\textbf{u}(\mathfrak t_{n-1}^R)\Vert _{L^2_x}^2\, \textrm{d}t\bigg ]\\ {}&\qquad +c(\kappa )\,\mathbb E\bigg [\sum _{n=1}^M \tau _{n}^R\Vert \textbf{u}(\mathfrak t_{n-1}^R)-\Pi _h\textbf{u}(\mathfrak t_{n-1}^R)\Vert _{L^2_x}^2\bigg ]+\,c(\kappa )\,\mathbb E\bigg [\sum _{n=1}^M \tau _{n}^R\Vert \Pi _h\textbf{e}_{h,n-1}\Vert _{L^2_x}^2\bigg ]\\&\quad \le \kappa \mathbb E\bigg [ \sum _{n=1}^M \Vert \Pi _h(\textbf{e}_{h,n}-\textbf{e}_{h,n-1})\Vert _{L^2_x}^2 \bigg ] +\,c(\kappa )\,\mathbb E\bigg [\sum _{n=1}^M \tau _{n}^R\Vert \Pi _h\textbf{e}_{h,n-1}\Vert _{L^2_x}^2\bigg ]\\&\qquad +c(\kappa )\tau ^{2\alpha }R^{20}+c(\kappa )h^2R^6 \end{aligned}$$

as a consequence of Lemma 3.1 (b) (using also (2.11)) and Corollary 3.4 (b).

Finally, we have by (2.13)

$$\begin{aligned} I_6(m)&=\int _{{\mathfrak {t}}_{m-1}^R}^{\mathfrak t_m^R}\int _{\mathcal {O}}\big (\pi -\Pi _h^\pi \pi \big )\textrm{div}\Pi _h\textbf{e}_{h,m}\, \textrm{d}x\, \textrm{d}\sigma \\&\le \,c(\kappa )\int _{{\mathfrak {t}}_{m-1}^R}^{{\mathfrak {t}}_m^R} \,\Vert \pi -\Pi _h^\pi \pi \Vert _{L^2_x}^2\, \textrm{d}\sigma +\,\kappa \tau \,\Vert \nabla \Pi _h\textbf{e}_{h,m}\Vert _{L^2_x}^2\\&\le \, c(\kappa ) h^2\int _{{\mathfrak {t}}_{m-1}^R}^{\mathfrak t_m^R}\,\Vert \nabla \pi \Vert _{L^2_x}^2\, \textrm{d}\sigma +\,\kappa \tau \,\Vert \nabla \textbf{e}_{h,m}\Vert _{L^2_x}^2, \end{aligned}$$

where \(\kappa >0\) is arbitrary. The first term is summable in expectation with bound \(c(\kappa ) h^2 R^{12}\) due to Lemma 3.3 (b) and the last one can be absorbed. Collecting all estimates, choosing \(\kappa \) small enough implies

$$\begin{aligned}&\mathbb E\bigg [\max _{1\le m\le M}\Vert \Pi _h\textbf{e}_{h,m}\Vert _{L^2_x}^2+\sum _{m=1}^M \tau _{m}^R\Vert \Pi _h(\textbf{e}_{h,m}-\textbf{e}_{h,m-1})\Vert _{L^2_x}^2+\sum _{m=1}^M \tau _{m}^R\Vert \nabla \textbf{e}_{h,m}\Vert _{L^2_x}^2\bigg ]\\&\quad \le \,cR^4\,\mathbb E\bigg [\sum _{m=1}^M \tau _{m}^R\max _{1\le n\le m}\Vert \textbf{e}_{h,n}\Vert _{L^2_x}^2\bigg ]+\,ce^{cR^4}\,\big (h^{2\beta }+\tau ^{2\alpha }\big ). \end{aligned}$$

Controlling the error between \(\textbf{e}_{h,m}\) and \(\Pi _h\textbf{e}_{h,m}\) by (2.11) as well as Lemma 3.1 (b) and applying Gronwall’s lemma yields the claim. \(\square \)
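For the reader's convenience we recall one standard form of the discrete Gronwall lemma. The precise variant used in the proof above (in particular the treatment of the term with index \(n=m\) on the right-hand side) is not spelled out there, so the following should be read as a sketch under that caveat. If \((a_m)_{m=0}^M\) and \((g_m)_{m=0}^M\) are nonnegative sequences and \(b\ge 0\) is a constant such that

$$\begin{aligned} a_m\le b+\sum _{n=0}^{m-1}g_n a_n\quad \text{for all } 0\le m\le M,\qquad \text{then}\qquad a_m\le b\,\exp \bigg (\sum _{n=0}^{m-1}g_n\bigg ). \end{aligned}$$

With weights of the form \(g_n=cR^4\tau \) one has \(\sum _{n=0}^{m-1}g_n\le cR^4T\), which produces an exponential factor of the type \(e^{cR^4}\) for fixed T, in line with the constant appearing in (4.6).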