1 Introduction

Let \(L,K \in {{\mathbb {N}}}{\setminus } \{0\}\), and \(T>0\). In this work, we study a new adaptive time-stepping strategy to efficiently approximate the \({{\mathbb {R}}}^L\)-valued solution \({\mathbf {X}} \equiv \{ {\mathbf {X}}_t;\, t \in [0,T]\}\) of the stochastic differential equation (SDE)

$$\begin{aligned} {\mathrm d}{{\mathbf {X}}}_t = \bigl ( -{{\mathscr {A}}}{{\mathbf {X}}}_t + {\mathbf{f}}({{\mathbf {X}}}_t)\bigr )\,{\mathrm d}t + \sum \limits _{k=1}^{K}\pmb {\sigma }_{k}({{\mathbf {X}}}_t)\,{\mathrm d}\beta _{k}(t) \quad \text {for all } \,\; t \in [0,T], \quad {{\mathbf {X}}}_0 = {{\mathbf {y}}} \in {{\mathbb {R}}}^L, \end{aligned}$$
(1.1)

where \(\{\beta _{k}(t);\, t\in [0,T]\}\), \(k=1,\ldots ,K\) are independent \({{\mathbb {R}}}\)-valued Wiener processes on the filtered probability space \((\Omega , {{\mathcal {F}}}, \{{{\mathcal {F}}}_t \}_{t \ge 0}, {{\mathbb {P}}})\), and \({{\mathscr {A}}}\in {{\mathbb {R}}}^{L \times L}\) is invertible and positive definite. We refer to Sect. 2, where proper settings for the data \({{\mathscr {A}}},{\mathbf {f}},\{\pmb {\sigma }_{k}\}_k\) are given. Problem (1.1) may be motivated by a spatial discretization of the semilinear stochastic partial differential equation (SPDE) on a bounded domain \(\pmb {{\mathcal {D}}} \subset {{\mathbb {R}}}^d\),

$$\begin{aligned} \mathrm{d}X_t&= \bigl ( \varepsilon \Delta X _t + [\pmb {\beta } \cdot \nabla ]X_t +{F}({X}_t)\bigr )\,\mathrm{d}t + \sum \limits _{k=1}^{K}\Sigma _{k} ({X}_t)\,\mathrm{d}\beta _{k}(t) \quad \text {for all } \,\; t \in [0,T], \nonumber \\ {X}_0&= {y} \in {\mathbb H}, \end{aligned}$$
(1.2)

for given \(\varepsilon >0\), \(\pmb {\beta }: \pmb {{\mathcal {D}}}\rightarrow {{\mathbb {R}}}^d\) constant for simplicity, and \({{\mathbb {H}}}\) a Hilbert space; see Sect. 5 for further details.

Our aim is an adaptive mesh strategy for the semi-implicit Euler method applied to (1.1), which, for every \(j \in {\mathbb N}_0\), automatically selects the new step size \(\tau ^{j+1} = t_{j+1} - t_j\), and then determines the \({{\mathbb {R}}}^L\)-valued random variable \({\mathbf {Y}}^{j+1}\) from

$$\begin{aligned} {\mathbf {Y}}^{j+1} = {\mathbf {Y}}^j + {\tau }^{j+1} \bigl (-{{\mathscr {A}}}{\mathbf {Y}}^{j+1} + {\mathbf {f}}({\mathbf {Y}}^j)\bigr ) + \sum \limits _{k=1}^{K}\pmb {\sigma }_{k}({\mathbf {Y}}^j) \Delta _{j+1} \beta _{k} , \quad {\mathbf {Y}}^0 = {\mathbf {y}}, \end{aligned}$$
(1.3)

for \(\Delta _{j+1} \beta _{k} := \beta _{k}(t_{j+1}) - \beta _{k}(t_{j})\), to approximate the solution \({\mathbf {X}}_{t_{j+1}}\) from (1.1) at time \(t_{j+1}\). Conceptually, we base this local step size selection strategy on a (computable) a posteriori weak error estimator \({\mathfrak {G}}\) in each step, i.e.,

$$\begin{aligned} {\max \limits _{0\le j \le J}\Big |{\mathbb {E}}\big [\phi ({\mathbf {X}}_{t_{j}})\big ] - {\mathbb {E}}\big [\phi ({\mathbf {Y}}^{j})\big ] \Big |\le \sum \limits _{j=0}^{J-1} \tau ^{j+1}{\mathfrak G}\bigl (\phi ;\tau ^{j+1}, {\mathbf {Y}}^{j}\bigr ) ,} \end{aligned}$$
(1.4)

for \( \phi \in C^{3}({{\mathbb {R}}}^L)\) with globally bounded first, second and third derivatives. A criterion may then be set up to select, in every step, a new step size \(\tau ^{j+1}\) as large as possible such that the right-hand side of (1.4) stays below a chosen tolerance \(\mathtt{Tol} > 0\). For the derivation of (1.4) we benefit from [28], where an expansion of the weak approximation error for uniform deterministic time steps (and originally for the explicit Euler method) was obtained via Kolmogorov’s backward equation:

$$\begin{aligned} \partial _t u(t, {\mathbf {x}}) + {{\mathcal {L}}}u(t,{\mathbf {x}})&= 0 \qquad \quad \text {for all }\, (t,{\mathbf {x}}) \in [0,T) \times {{\mathbb {R}}}^L, \nonumber \\ u(T,{\mathbf {x}})&= \phi ({\mathbf {x}}) \qquad \text {for all }\, {\mathbf {x}} \in {{\mathbb {R}}}^L, \end{aligned}$$
(1.5)

where \({{\mathcal {L}}} \equiv {{\mathcal {L}}}_{\mathbf {X}}\) is the generator of the Markovian semigroup from \({\mathbf {X}} \equiv \{ {\mathbf {X}}_t;\, t \in [0, T]\}\) in (1.1),

$$\begin{aligned} {{\mathcal {L}}}u(t,{\mathbf {x}})&= \bigl \langle -{{\mathscr {A}}}{\mathbf {x}} + {\mathbf {f}}({\mathbf {x}}), D_{\mathbf {x}} u(t, {\mathbf {x}}) \bigr \rangle _{{{\mathbb {R}}}^L} + \frac{1}{2} \sum \limits _{k=1}^{K} \mathrm{Tr} \Bigl (\pmb {\sigma }_{k}({\mathbf {x}}) \pmb {\sigma }_{k}^\top ({\mathbf {x}}) D_{\mathbf {x}}^2 u(t, {\mathbf {x}}) \Bigr )\\&=\bigl \langle -{{\mathscr {A}}}{\mathbf {x}} + {\mathbf {f}}({\mathbf {x}}), D_{\mathbf {x}} u(t, {\mathbf {x}}) \bigr \rangle _{{{\mathbb {R}}}^L} + \frac{1}{2} \mathrm{Tr} \Bigl (\pmb {\sigma }({\mathbf {x}}) \pmb {\sigma }^\top ({\mathbf {x}}) D_{\mathbf {x}}^2 u(t, {\mathbf {x}}) \Bigr ), \end{aligned}$$

with \(\pmb {\sigma }(\cdot )\equiv \big [ \pmb {\sigma }_{1}(\cdot ),\ldots ,\pmb {\sigma }_{K}(\cdot )\big ]\in {\mathbb {R}}^{L\times K}\). Under proper assumptions, such as those stated in Sect. 2, the function \(u(t,{\mathbf {x}}) = {{\mathbb {E}}} \bigl [\phi ({\mathbf {X}}^{t,{\mathbf {x}}}_T)\bigr ]\) is the unique solution of (1.5); see e.g. [15, p. 366ff.]. As usual we denote by \({\mathbf {X}}^{t,{\mathbf {x}}}\equiv \big \{ {\mathbf {X}}_{s}^{t,{\mathbf {x}}};\, s \in [t,T]\big \}\) the \({\mathbb {R}}^{L}\)-valued process which starts at time \(t\in [0,T]\) in \({\mathbf {x}}\in {\mathbb {R}}^{L}\).
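To make (1.5) concrete, consider (as an illustration of ours, not needed later) the scalar Ornstein–Uhlenbeck special case of (1.1) with \(L=K=1\), \({{\mathscr {A}}}=\lambda >0\), \({\mathbf {f}}\equiv 0\), \(\pmb {\sigma }_{1}\equiv \sigma \) constant, and \(\phi (x)=x\). Then

$$\begin{aligned} {\mathbf {X}}^{t,x}_{T} = e^{-\lambda (T-t)}x + \sigma \int _{t}^{T} e^{-\lambda (T-s)}\,\mathrm {d}\beta (s), \qquad u(t,x) = {{\mathbb {E}}}\bigl [{\mathbf {X}}^{t,x}_{T}\bigr ] = e^{-\lambda (T-t)}x, \end{aligned}$$

and indeed \(\partial _t u(t,x) + {{\mathcal {L}}}u(t,x) = \lambda e^{-\lambda (T-t)}x - \lambda x\, e^{-\lambda (T-t)} = 0\) with \(u(T,x)=x=\phi (x)\), so u solves (1.5).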

As already mentioned, we are motivated by (a spatial discretization of) SPDE (1.2), which is why we aim for adaptive methods that remain applicable to SDE (1.1) with \(L\gg 1\); in this respect, we prefer (quasi-)deterministic (rather than random) meshes \(\{ t_j\}_{j\ge 0} \subset [0,T]\) to avoid excessive storage requirements and time-consuming post-processing tasks needed to synchronize data, such as interpolation or projection. This approach lends itself to a vectorized implementation (see Algorithm 4.1) and is advantageous over procedurally generated meshes (such as those in [10, 16, 17]), for which an efficient vectorized implementation remains an open problem.
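To indicate how such a vectorized implementation may look, the following minimal sketch (our own illustration, not the implementation of Algorithm 4.1; the names `A`, `f` and `sigma` are placeholders for the data of (1.1)) advances M Monte-Carlo realizations of scheme (1.3) simultaneously; since the (quasi-)deterministic mesh is shared by all samples, the matrix \({\mathbb {I}}+\tau ^{j+1}{{\mathscr {A}}}\) is factorized once per step and reused for every sample.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def semi_implicit_euler_step(Y, tau, A, f, sigma, rng):
    """One step of scheme (1.3), vectorized over M Monte-Carlo paths.

    Y     : (L, M) array of current iterates Y^j, one column per path
    tau   : step size tau^{j+1}
    A     : (L, L) matrix, the operator 'script A' in (1.1)
    f     : drift, maps an (L, M) array to an (L, M) array columnwise
    sigma : list of K diffusion maps, each (L, M) -> (L, M)
    rng   : numpy Generator, e.g. np.random.default_rng(0)
    """
    L, M = Y.shape
    dW = rng.standard_normal((len(sigma), M)) * np.sqrt(tau)  # Delta_{j+1} beta_k
    rhs = Y + tau * f(Y)
    for k, sigma_k in enumerate(sigma):
        rhs += sigma_k(Y) * dW[k]                # sigma_k(Y^j) Delta_{j+1} beta_k
    lu = lu_factor(np.eye(L) + tau * A)          # factorize (I + tau A) once ...
    return lu_solve(lu, rhs)                     # ... and solve for all M columns
```

On per-path random meshes, by contrast, each realization would carry its own step sequence, so states would have to be synchronized (interpolated or projected) before any such batched linear solve.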

The following example illustrates local mesh refinement and coarsening by the adaptive Algorithm 4.1, which is detailed in Sects. 4 and 5.

Example 1.1

Let \(L = 25\). Consider SDE system (1.1) with \(K=5\), which results from a finite element discretization (with spatial mesh size \(\textit{h}=\tfrac{1}{L+1}\)) of SPDE (1.2) with \(\varepsilon =1\), \(\pmb {\beta } \equiv \pmb {0}\), and

$$\begin{aligned} F\bigl (X_{t}(x)\bigr )&=\tfrac{1}{5}\sin \bigl (\pi X_{t}(x)\bigr ),\quad \Sigma _{k}\bigl (X_{t}(x)\bigr )=\frac{1}{2k}\sin (\pi k x) X_{t}(x),\\ y(x)&=\sin (\pi x),\quad x\in (0,1); \end{aligned}$$

see Sect. 5.1 for details. For the test function \(\phi ({\mathbf {x}})=\sqrt{\textit{h}}\Vert {\mathbf {x}} \Vert _{{\mathbb {R}}^{L}}\) to approximate the \({\mathbb {L}}^{2}\)-norm, and an initial step of size 0.1, we observe an instantaneous refinement via Algorithm 4.1 to \(\tau ^{1}\approx 10^{-4}\); the step size then rapidly increases to values close to \(10^{-1}\) at times \(t \approx 0.5\), reflecting (spatial) smoothing dynamics; see Fig. 1b. Figure 1a shows a typical trajectory, where the buckling is caused by the driving noise. Figure 1c compares related errors for (1.3) on uniform vs. adaptive time meshes through Algorithm 4.1. Here, \({\mathbb {E}}_{\mathtt {M}}[\phi ({\mathbf {Y}}^{j})]:=\tfrac{1}{\mathtt {M}} \sum _{m=1}^{\mathtt {M}}\phi \bigl ({\mathbf {Y}}^{j}(\omega _{m})\bigr )\) denotes the empirical mean to approximate \({\mathbb {E}}[\phi ({\mathbf {Y}}^{j})]\), where we choose \(\mathtt {M}=10^{4}\) Monte-Carlo simulations. For the tolerance parameter \({\mathtt {Tol}}=0.1\), Algorithm 4.1 generates an adaptively refined mesh with \(J=501\) time steps to stay below the given error threshold \({\mathtt {Tol}}\). In contrast, a uniform mesh needs \(J=2000\) time steps to perform equally well. In Fig. 1d, the evolution of the a posteriori error estimator \(j \mapsto {\mathfrak {G}}^{(\mathtt {M})}\bigl (\phi ;\tau ^{j+1},{\mathbf {Y}}^{j}\bigr )\) is displayed, which takes values approximately between \(\frac{{\mathtt {Tol}}}{2}\) and \({\mathtt {Tol}}\); this is in accordance with the tolerance criterion of Algorithm 4.1, indicating an efficient selection of variable step sizes. See Sect. 5.1 for more details.
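For orientation, the data of Example 1.1 may be set up as in the following sketch. The paper uses a finite element discretization (see its Sect. 5.1), which we do not reproduce here; as a stand-in, the sketch assumes the standard second-order finite-difference matrix for \(-\varepsilon \Delta \) on (0, 1) with homogeneous Dirichlet boundary conditions, so the discretization and all names are our own illustrative assumptions.

```python
import numpy as np

L, K = 25, 5
h = 1.0 / (L + 1)                       # spatial mesh size h = 1/26
x = h * np.arange(1, L + 1)             # interior grid points of (0, 1)

# Stand-in for the discretized operator (the paper uses finite elements):
# second-difference approximation of -eps*Laplace with eps = 1
A = (2.0 * np.eye(L) - np.eye(L, k=1) - np.eye(L, k=-1)) / h**2

def f(Y):                               # F(X) = sin(pi X) / 5, columnwise
    return 0.2 * np.sin(np.pi * Y)

def make_sigma(k):                      # Sigma_k(X)(x) = sin(pi k x) X(x) / (2k)
    weight = np.sin(np.pi * k * x) / (2.0 * k)
    return lambda Y: weight[:, None] * Y

sigma = [make_sigma(k) for k in range(1, K + 1)]
y0 = np.sin(np.pi * x)                  # initial datum y(x) = sin(pi x)

def phi(Y):                             # phi(x) = sqrt(h) ||x||, columnwise
    return np.sqrt(h) * np.linalg.norm(Y, axis=0)

# Empirical mean E_M[phi(Y^0)] over M = 10^4 paths (Y has shape (L, M)):
M = 10_000
Y = np.tile(y0[:, None], (1, M))
print(phi(Y).mean())                    # at j = 0: ~ L^2-norm of y, about 0.707
```

Combined with the step routine sketched above, the empirical mean \({\mathbb {E}}_{\mathtt {M}}[\phi ({\mathbf {Y}}^{j})]\) is a single average over the columns of `Y` after each step.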

Fig. 1

(Example 1.1 for \({\mathtt {Tol}}=0.1\), \(\mathtt {M}= 10^{4}\), \(T=1\)) a Contour plot of the solution for a single realization \(\omega \) up to time \(t=0.25\). b Semi-log plot of the corresponding adaptive time step size. c Error for uniform (blue line) vs. adaptive (red line) time meshes via Algorithm 4.1. d Plot of the a posteriori weak error estimator \({\mathfrak {G}}^{(\mathtt {M})}\bigl (\phi ;\tau ^{j+1},{\mathbf {Y}}^{j}\bigr )\) (color figure online)

Different adaptive methods to solve SDE (1.1) may be found in the literature, addressing diverse numerical goals: in [20], an adaptive time-meshing concept is combined with the Euler–Maruyama method to foster discrete stability of the explicit time-stepping scheme in cases where the locally Lipschitz drift only satisfies a ‘one-sided Lipschitz condition’. Automatic mesh refinement (resp. coarsening) for each realization \(\omega \in \Omega \) is applied if rapid (resp. slow) changes in the drift at two subsequent states are observed, where a maximum mesh size \(\Delta _\mathtt{max}\) bounds the local (random) mesh sizes \(\{ \tau ^{j}(\omega )\}_{j\ge 0} \subset [0, \Delta _\mathtt{max}]\) to conclude asymptotic strong convergence. Adaptive random meshes are also used in [10] to strengthen the stability of the underlying explicit discretization of the above-mentioned class of SDEs (1.1): here, a ‘discrete one-sided Lipschitz condition’ is used to generate random mesh sizes \(\{ \tau _j(\omega )\}_{j\ge 0}\), which are then further constrained to lie in \([\Delta _{\min }, \Delta _{\max }]\). The main result in [10] is the derivation of an optimal convergence rate \({{\mathcal {O}}}(\Delta ^{1/2}_\mathtt{max})\) on variable random meshes of size between \(\Delta _{\min }\) and \(\Delta _\mathtt{max}\). Closest to the goals and tools of this work are [16, 17]; there, again, setting the parameters \(\Delta _{\min }\) and \(\Delta _{\max }\) requires some a priori knowledge, and the worst-case complexity of the method may depend on \(\Delta ^{-1}_{\min }\) and on the dimension L of the problem, due to the explicit character of the discretization, which affects relevant discrete solution bounds; see also [13] in this respect.

A different line of research derives a posteriori error estimates (such as (1.4)) to judge the quality of the current approximation, and then uses them as a ‘steering tool’ to initiate an automatic remeshing strategy. While this conceptual idea to design adaptive methods has long been known in the context of (certain) ODEs and PDEs, it was first introduced for SDEs in [23, 27], in the context of weak approximation of SDE solutions (again via Euler–Maruyama discretization). In these works, an (asymptotic weak) a posteriori error expansion

$$\begin{aligned} \Bigl \vert {{\mathbb {E}}}\bigl [\phi ({\mathbf {X}}_{T})\bigr ] -{\mathbb {E}}\bigl [ \phi ({\mathbf {Y}}_{T})\bigr ] \Bigr \vert ={\mathbb {E}}\left[ \sum \limits _{j=0}^{J-1} \rho _{j+1}\cdot \bigl (\tau ^{j+1}\bigr )^{2}\right] + \text {`}\text {higher order terms}\text {'}, \end{aligned}$$
(1.6)

with computable \( \{\rho _{j}\}_{j=1}^{J}\) was first obtained. Its derivation in [23, 27] rests on the weak error expansion of Talay and Tubaro [28] via Kolmogorov’s backward equation (1.5), and numerically approximates derivatives of the solution u of the PDE (1.5), which restricts simulations to small dimensions L. Then, (random) time meshes are generated automatically based on the computable part of the right-hand side of (1.6)—with no minimum or maximum mesh sizes to be set, but only the parameter \(\mathtt{Tol}\) (also serving as convergence parameter) to bound the leading error term on the right-hand side of (1.6). The iterative generation of an adapted time mesh requires the repeated computation of (approximations of) the global problem (1.5)—as opposed to determining local time steps \(\tau ^{j+1}\) based only on the solutions \(\{{\mathbf {Y}}^{\ell }\}_{\ell =0}^{j+1}\) computed so far. From an analytical viewpoint, the results in [23, 27] crucially rest on the assumed boundedness of the involved drift and diffusion functions to circumvent the deficiency of ‘discrete stability’ of the governing (explicit) Euler–Maruyama scheme; the advantage, however, is a theoretical backup for this weak adaptive algorithm in terms of termination at optimal rate, and the asymptotic weak a posteriori error estimate (1.6).

Conceptually, the derivation of the adaptive Algorithm 4.1 below is close to [23, 27] and uses a related weak error representation (see (1.8) below) for the semi-implicit Euler scheme (1.3) with the help of the solution u of (1.5)—but differs in some relevant aspects: the first is the use of the semi-implicit Euler scheme (1.3), which allows for \(L\)-independent (higher moment) stability bounds for its solution when the data satisfy (A1)–(A3) in Sect. 2.1; see Lemma 2.6. These stability bounds for (1.3) in Lemma 2.6 are the relevant property to show optimal order of weak convergence of the a posteriori weak error estimator proposed in Theorem 3.1 on given meshes; cf. Theorem 3.5.

A second difference to [23, 27] is that we bound the derivatives of the solution u of (1.5) that appear in the weak error representation (1.9) by a priori bounds (see (1.10) below) in terms of derivatives of \(\phi \), which removes the necessity to numerically approximate derivatives of the solution of (1.5)—and thus enables the applicability of Algorithm 4.1 to large SDE systems, such as those arising from SPDE (1.2) via spatial discretization (as in Example 1.1).

To further detail relevant steps in our program, we start with the continuified process \(\pmb {{\mathcal {Y}}} \equiv \{ \pmb {\mathcal Y}_t;\, t \in [0, T]\}\) associated with the sequence of random variables \(\{ {\mathbf {Y}}^j\}_{j\ge 0}\) which solves (1.3). We easily observe in Sect. 3 that

$$\begin{aligned} \pmb {{\mathcal {Y}}}_t&:= {\mathbf {Y}}^j + \Bigl ( \bigl ( {\mathbb {I}} + \tau ^{j+1} {{\mathscr {A}}}\bigr )^{-1} {\mathbf {f}}({\mathbf {Y}}^j) - {{\mathscr {A}}}\bigl ( {\mathbb {I}} + \tau ^{j+1} {{\mathscr {A}}}\bigr )^{-1} {\mathbf {Y}}^j\Bigr )(t-t_j) \nonumber \\&\quad + \sum \limits _{k=1}^{K}\Bigl ( {\mathbb {I}}+ \tau ^{j+1} {{\mathscr {A}}}\Bigr )^{-1} \pmb {\sigma }_{k}({\mathbf {Y}}^j) \bigl ( \beta _{k}(t) - \beta _{k}(t_j)\bigr ) \quad \text {for all }\,\; t \in [t_j, t_{j+1}] \end{aligned}$$
(1.7)

interpolates \(\{ {\mathbf {Y}}^{j}\}_{j\ge 0}\) at \(\{ t_j\}_{j \ge 0}\), and is \(\{{{\mathcal {F}}}_t \}_{t \ge 0}\)-adapted. Now assume \(0=t_{0}<t_{1}<\cdots <t_{J}=T\) and fix \(n=0,\ldots ,J-1\); considering (1.5) on \([0,t_{n+1}]\times {\mathbb {R}}^{L}\), a standard argument then leads to (see Lemma 3.2)

$$\begin{aligned} \Bigl \vert {{\mathbb {E}}}\bigl [ \phi ({\mathbf {X}}_{t_{n+1}})\bigr ] -{\mathbb {E}}\bigl [ \phi ({\mathbf {Y}}^{n+1})\bigr ]\Bigr \vert &= \Bigl \vert {{\mathbb {E}}}\bigl [ u(0, {\mathbf {y}}) - u(t_{n+1}, {\mathbf {Y}}^{n+1})\bigr ]\Bigr \vert \nonumber \\ &\le \sum _{j=0}^{n} \Bigl \vert {{\mathbb {E}}}\bigl [ u(t_{j+1}, {\mathbf {Y}}^{j+1}) - u(t_j, {\mathbf {Y}}^j)\bigr ]\Bigr \vert . \end{aligned}$$
(1.8)

We may now apply Itô’s formula to u from (1.5) along \(\pmb {{\mathcal {Y}}}\) on each time interval \([t_{j},t_{j+1}]\) to represent each increment \(u(t_{j+1}, {\mathbf {Y}}^{j+1}) - u(t_j, {\mathbf {Y}}^j)\) in the last sum, and employ (1.5) to deduce

$$\begin{aligned}&{{\mathbb {E}}}\bigl [ u(t_{j+1}, {\mathbf {Y}}^{j+1}) - u(t_j, {\mathbf {Y}}^j) \bigr ] \nonumber \\&\quad = \int _{t_j}^{t_{j+1}} {{\mathbb {E}}}\left[ \left\langle \underbrace{\bigl ( {\mathbb {I}} + \tau ^{j+1} {{\mathscr {A}}}\bigr )^{-1} {\mathbf {f}}({\mathbf {Y}}^j) - {{\mathscr {A}}}\bigl ( {\mathbb {I}} + \tau ^{j+1} {{\mathscr {A}}}\bigr )^{-1} {\mathbf {Y}}^j - {\mathbf {f}}(\pmb {{\mathcal {Y}}}_s) + {{\mathscr {A}}}\pmb {{\mathcal {Y}}}_s}_{\text {`}\text {error indicator (drift)}\text {'}}, \underbrace{D_{\mathbf {x}} u(s, \pmb {{\mathcal {Y}}}_s)}_{\text {`}\text {weight}\text {'}}\right\rangle _{{{\mathbb {R}}}^L}\right] \, \mathrm{d}s \nonumber \\&+ \frac{1}{2} \int _{t_j}^{t_{j+1}} {{\mathbb {E}}}\left[ \mathrm{Tr} \Bigg ( \underbrace{\bigl \{ \bigl ( {\mathbb {I}} + \tau ^{j+1} {{\mathscr {A}}}\bigr )^{-1}\pmb {\sigma }({\mathbf {Y}}^j) \big [\bigl ( {\mathbb {I}} + \tau ^{j+1} {{\mathscr {A}}}\bigr )^{-1}\pmb {\sigma }({\mathbf {Y}}^j)\big ]^\top - \pmb {\sigma }(\pmb {{\mathcal {Y}}}_s) \pmb {\sigma }^\top (\pmb {{\mathcal {Y}}}_s)\bigr \}}_{\text {`}\text {error indicator (diffusion)}\text {'}}\cdot \underbrace{D^2_{\mathbf {x}} u(s, \pmb {{\mathcal {Y}}}_s)}_{\text {`}\text {weight}\text {'}}\Bigg )\right] \, \mathrm{d}s; \end{aligned}$$
(1.9)

see Lemma 3.2 for the justification of this identity. Conceptually, the right-hand side of (1.9) combines the continuified process \(\pmb {{\mathcal {Y}}}\) built from the iterates \({\mathbf {Y}}^j\) and \({\mathbf {Y}}^{j+1}\) with first and second derivatives of the solution u from (1.5), evaluated along \(\{\pmb {{\mathcal {Y}}}_t;\, t \in [t_j, t_{j+1}]\}\). In the corresponding setting of [23, 27], (the right-hand side of) (1.9) is interpreted as a sum of products of (local) error indicators for the drift and diffusion, and weights \(D_{{\mathbf {x}}}u\), \(D_{{\mathbf {x}}}^{2}u\), which ‘encode’ the chosen test function \(\phi \).

In the next step, we use the first to third variation equations for (1.1), see (2.2)–(2.4) in Sect. 2.3, to deduce bounds

$$\begin{aligned} \sup _{(t, {\mathbf {x}}) \in [0,T] \times {{\mathbb {R}}}^L} \Vert D^{\ell }_{\mathbf {x}} u(t,{\mathbf {x}})\Vert _{{\mathcal {L}}^{\ell }} \le \sum _{i=1}^{\ell } C_{\ell ,i} \sup _{{\mathbf {x}} \in {{\mathbb {R}}}^L} \Vert D^{i} \phi ({\mathbf {x}})\Vert _{{\mathcal {L}}^{i}} \quad (\ell \in \{1,2,3\}), \end{aligned}$$
(1.10)

for derivatives of the solution u of (1.5), where the constants \(C_{\ell ,i}>0\) do not depend on the dimension L. Note that only derivatives of u are involved in (1.9), so their estimation with the help of (1.10) suggests choosing test functions \(\phi :{\mathbb {R}}^{L}\rightarrow {\mathbb {R}}\) in (1.4) whose derivatives are uniformly bounded on \({\mathbb {R}}^{L}\)—such as norms; see also Example 1.1.

While the derivation of estimate (1.10) is known for a general class of SDEs, see e.g. [3, Sec. 1.3], we calculate the constants \(\{ C_{\ell ,i};\, 1 \le i \le \ell , \ 1 \le \ell \le 3\}\) under the assumptions (A1)–(A3), which are needed in the a posteriori error estimate (1.4); see Lemma 2.5. A further tool to derive (1.4) is the use of common Malliavin calculus techniques, such as the Clark–Ocone formula, to ensure that the error estimator involves only the computable iterates \({\mathbf {Y}}^{j}\) from (1.3) rather than the interpolated process \(\pmb {{\mathcal {Y}}}\) in (1.7). Here, we benefitted from similar ideas and concepts, which were used in [6] in the context of a priori weak error analysis of SPDEs of form (1.2) with \(\pmb {\beta }\equiv {\mathbf {0}}\); see Remark 3.2 for further details. For a fixed \(\phi \in C^3({{\mathbb {R}}}^L)\), our first main result in this work then is the weak a posteriori error estimate (1.4) (see also Theorem 3.1), giving quantitative error bounds for iterates \(\{ {\mathbf {Y}}^j\}_{j\ge 0}\) solving (1.3) on a given mesh \(\{t_{j}\}_{j \ge 0}\) covering [0, T] with the help of computable (local) weak error estimators \(\{{{\mathfrak {G}}}\bigl (\phi ;\tau ^{j+1}, {\mathbf {Y}}^{j}\bigr )\}_{j\ge 0}\).

In Sect. 4, we use the a posteriori error estimate (1.4) for iterates \(\{{\mathbf {Y}}^{j}\}_{j\ge 0}\) of (1.3) to automatically steer the computation of local mesh sizes \(\tau ^{j+1}\) via the adaptive Algorithm 4.1, yielding tuples \(\big \{(\tau ^{j+1},{\mathbf {Y}}^{j+1})\big \}_{j\ge 0}\). Given \(j \ge 0\), the guiding criterion for admissibility of a new tuple \((\tau ^{j+1},{\mathbf {Y}}^{j+1})\) is that the evaluation of the local error estimator yields \({{\mathfrak {G}}}\bigl (\phi ;\tau ^{j+1}, {\mathbf {Y}}^{j}\bigr ) \le \frac{{\mathtt {Tol}}}{T}\), where the tolerance \({\mathtt {Tol}}>0\) is provided by the user. We generate such an admissible tuple by successively halving the previous time step, thus generating a sequence \(\{\tau ^{j+1,\ell }\}_{\ell \ge 0}\subset {\mathbb {R}}^{+}\) with \(\tau ^{j+1,\ell }=\frac{\tau ^{j+1,0}}{2^{\ell }}\) and \(\tau ^{j+1,0}\equiv \tau ^{j}\), until admissibility of a tuple for some \(\tau ^{j+1,\ell _{j+1}^*}\) is attained; see Fig. 2. This sequence of steps precedes a single potential step of coarsening; see Algorithm 4.1 for further details, and the sketch below. As a result, we obtain an adaptive method where only \({\mathtt {Tol}}>0\) needs to be set, and where admissible tuples \(\big \{(\tau ^{j+1},{\mathbf {Y}}^{j+1})\big \}_{j\ge 0}\) satisfy (1.4), with the right-hand side now bounded by \({\mathtt {Tol}}>0\). The second main result in this paper then is Theorem 4.2, which ensures computation of each new time step \(\tau ^{j+1}\) in Algorithm 4.1 after no more than \(\ell _{j+1}^{*} ={\mathcal {O}}\bigl (\log ({\mathtt {Tol}}^{-1})\bigr )\) many iterations (for each fixed j; local determination), and at most \(J={{\mathcal {O}}}(\mathtt{Tol}^{-1})\) many steps to reach T (global termination); its proof again rests on the stability bounds given in Lemma 2.6 and yields the existence of a lower bound for the step sizes generated via Algorithm 4.1; see also Fig. 2. We remark that this local construction of the new mesh size \(\tau ^{j+1}\) with the help of only \({\mathbf {Y}}^{j}\) differs from the strategy in [23, 27], where admissible meshes are obtained by iterative computation of global problems (‘approximate Kolmogorov equation’), and where again the assumed boundedness of drift and diffusion is crucial to conclude optimality of the attained meshes.
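As a schematic illustration of this step-size control (our own sketch, not the full Algorithm 4.1: the estimator \({\mathfrak {G}}\) from Theorem 3.1 is represented by a placeholder `G`, and the single coarsening step is omitted), the halving loop may be written as follows.

```python
def adapt_step(G, phi, Y, tau_prev, Tol, T, max_halvings=60):
    """Schematic halving loop of the step-size selection.

    G : placeholder for the computable local estimator of Theorem 3.1,
        called as G(phi, tau, Y) with the current iterate Y = Y^j.
    Returns the accepted step size tau^{j+1} = tau^{j+1, ell*}.
    """
    tau = tau_prev                         # tau^{j+1,0} := tau^j
    for ell in range(max_halvings):
        if G(phi, tau, Y) <= Tol / T:      # admissibility criterion
            return tau
        tau *= 0.5                         # tau^{j+1,ell+1} = tau^{j+1,ell} / 2
    raise RuntimeError("no admissible step size found")
```

Theorem 4.2 guarantees that this loop terminates after at most \({\mathcal {O}}\bigl (\log ({\mathtt {Tol}}^{-1})\bigr )\) halvings, so `max_halvings` acts as a safety cap rather than a tuning parameter.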

Fig. 2

Illustration of the local (for fixed j) and global termination argument given in Theorem 4.2, yielding a maximum of \(\ell _{j+1}^{*}={\mathcal {O}}\bigl (\log ({\mathtt {Tol}}^{-1})\bigr )\) many refinement steps (\(\curvearrowleft \)); see also (4.2), within the loop of \(\{\tau ^{j+1,\ell }\}_{\ell \ge 0}\), to accept the new step size \({\tau ^{j+1}}=\tau ^{j+1,\ell _{j+1}^{*}}\), and the existence of a lower bound of step sizes generated via Algorithm 4.1, such that either (2) (\(\pmb {*}_{1}\)) or (3) (\(\pmb {*}_{2}\)) is met. Note that due to the choice of the initial mesh size \(\tau ^{1}\) for the generation of \({\mathbf {Y}}^{1}\) and the setup of Algorithm 4.1, it is not possible that \(\tau ^{j+1,\ell }<\frac{{\mathtt {Tol}}}{2\pmb {{\tilde{C}}}T}\), \(\ell \ge 0\)

Section 5 then reports on computational studies for different SPDEs (1.2) after finite element discretization with the help of the adaptive Algorithm 4.1: we specify the corresponding a posteriori error estimator, and pinpoint those computable expressions \(\{ \pmb {\mathtt {E}_{\ell }}\}_{\ell \ge \mathtt{1}}\) in the error estimator of Example 1.1 which are mainly responsible for local mesh adjustments. For the different examples, including a convection-dominated one, the results evidence the efficiency (in comparison with uniform meshing) and the accuracy of the weak adaptive Algorithm 4.1.

The paper is organized as follows: Sect. 2 collects the assumptions needed for the data \({{\mathscr {A}}}, {\mathbf {f}}, \{\pmb {\sigma }_{k}\}_{k}\) of (1.1) and recalls relevant tools from Malliavin calculus; moreover, variation equations for (1.1) are recalled to verify the bounds (1.10) and stability bounds for iterates \(\{{\mathbf {Y}}^{j}\}_{j\ge 0}\) from (1.3) are presented. The a posteriori error analysis for (1.3) is given in Sect. 3. The related weak adaptive method is proposed and analyzed in Sect. 4, and corresponding computational studies are reported in Sect. 5.

2 Assumptions and tools

Section 2.1 lists basic requirements on the data \( {{\mathscr {A}}}, {\mathbf {f}}, \pmb {\sigma }\equiv \big [\pmb {\sigma }_{1},\ldots ,\pmb {\sigma }_{K}\big ], {\mathbf {y}}\) in (1.1) used throughout this work. Section 2.2 briefly recalls the needed tools from Malliavin calculus. In Sect. 2.3, we derive explicit bounds for \(\{ D^{\ell }_{\mathbf {x}} u\}_{\ell =1}^3\) from Kolmogorov’s backward equation (1.5) under Assumptions (A1)–(A2). Stability bounds for \(\{ {\mathbf {Y}}^j\}_{j\ge 0}\) from (1.3) are given in Sect. 2.4, provided (A1)–(A3) are valid.

2.1 Assumptions

Throughout this work, \((\Omega , {{\mathcal {F}}}, \{{{\mathcal {F}}}_t \}_{t \ge 0}, {{\mathbb {P}}})\) is a given filtered probability space with the natural filtration of the Wiener processes in (1.1). Below, we use positive constants \(C_{D^{\ell } {\mathbf {f}}}, C_{{\mathbf {f}}}^{(\ell -1)}, C_{D^{\ell } \pmb {\sigma }}, C_{\pmb {\sigma }}^{(\ell -1)}, C_{{\mathbf {y}}}^{(\ell -1)}\) (\(1 \le \ell \le 3\)), and \(\lambda _{{{\mathscr {A}}}}\) to specify dependence on the data \( {{\mathscr {A}}}, {\mathbf {f}}, \pmb {\sigma }\equiv \big [\pmb {\sigma }_{1},\ldots ,\pmb {\sigma }_{K}\big ]\) in (1.1); none of these constants depend on L. For a sufficiently smooth \({\mathbf {g}}\in C({{\mathbb {R}}}^L; {\mathbb R}^{n})\), the corresponding (matrix) operator norms are given as follows \((\, n,L\in {{\mathbb {N}}}{\setminus } \{0\}\), \({\mathbf {x}}\in {\mathbb {R}}^{L}\,)\):

$$\begin{aligned}&\Vert D^{\ell }{\mathbf {g}}({\mathbf {x}})\Vert _{{\mathcal {L}}\left( \underbrace{{\mathbb {R}}^{L}\times \cdots \times {\mathbb {R}}^{L}}_{\ell -\text {times}};{\mathbb {R}}^{n}\right) }:=\sup \limits _{\Vert {\mathbf {v}}_{i} \Vert _{{\mathbb {R}}^{L}}=1}\Vert D^{\ell }{\mathbf {g}}({\mathbf {x}})({\mathbf {v}}_{1},\ldots ,{\mathbf {v}}_{\ell })\Vert _{{\mathbb {R}}^{n}}\quad (\ell \in {{\mathbb {N}}}{\setminus } \{0\}), \end{aligned}$$

where \(\Vert \cdot \Vert _{{\mathbb {R}}^{n}}\) denotes the (Euclidean) vector norm of an \({\mathbb {R}}^{n}\)-valued vector. If \(n=L\), we write \({\mathcal {L}}^{\ell }\equiv {\mathcal {L}}\big ({\mathbb {R}}^{L}\times \cdots \times {\mathbb {R}}^{L};{\mathbb {R}}^{L}\big )\). If \(n=1\), \(D\equiv D_{{\mathbf {x}}}\) denotes the gradient and \(D^{2}\equiv D^{2}_{{\mathbf {x}}}\) the Hessian matrix of \({\mathbf {g}}\), and we also write \({\mathcal {L}}^{\ell }\equiv {\mathcal {L}}\big ({\mathbb {R}}^{L}\times \cdots \times {\mathbb {R}}^{L};{\mathbb {R}}\big )\). Moreover, \(\Vert D_{{\mathbf {x}}}{\mathbf {g}}({\mathbf {x}})\Vert _{{\mathcal {L}}^{1}}=\Vert D_{{\mathbf {x}}}{\mathbf {g}}({\mathbf {x}})\Vert _{{\mathbb {R}}^{L}}\) and \(\Vert D^{2}_{{\mathbf {x}}}{\mathbf {g}}({\mathbf {x}})\Vert _{{\mathcal {L}}^{2}}=\Vert D^{2}_{{\mathbf {x}}}{\mathbf {g}}({\mathbf {x}})\Vert _{{\mathbb {R}}^{L\times L}}\), where \(\Vert \cdot \Vert _{{\mathbb {R}}^{L\times L}}\) denotes the spectral (matrix) norm.

(A1):

(a) The matrix \({{\mathscr {A}}}\in {{\mathbb {R}}}^{L \times L}\) is invertible and positive definite, i.e., there exists a constant \(\lambda _{{{\mathscr {A}}}}>0\), s.t.

$$\begin{aligned} \langle {{\mathscr {A}}}{\mathbf {x}}, {\mathbf {x}}\rangle _{{{\mathbb {R}}}^L} \ge \lambda _{{{\mathscr {A}}}} \Vert {\mathbf {x}}\Vert ^2_{{{\mathbb {R}}}^L} \quad \text {for all }\,\; {\mathbf {x}} \in {{\mathbb {R}}}^L. \end{aligned}$$

(b) The map \({\mathbf {f}} \in C^{3}({{\mathbb {R}}}^L; {{\mathbb {R}}}^L)\), and there exist constants \(\{C_{D^{\ell }{\mathbf {f}}}\}_{\ell =1}^3\), s.t.

$$\begin{aligned} \sup \limits _{{\mathbf {x}}\in {\mathbb {R}}^{L}} \Vert D^{\ell } {\mathbf {f}}({\mathbf {x}})\Vert _{{\mathcal {L}}^{\ell }} \le C_{D^{\ell } {\mathbf {f}}} \quad (1 \le \ell \le 3); \end{aligned}$$

moreover, there exist constants \(\{C_{{\mathbf {f}}}^{(\ell )}\}_{\ell =0}^2\), s.t.

$$\begin{aligned} \Vert {{\mathscr {A}}}^{\ell } {\mathbf {f}}({\mathbf {x}})\Vert _{{{\mathbb {R}}}^L} \le C_{{\mathbf {f}}}^{(\ell )} \bigl (1+ \Vert {{\mathscr {A}}}^{\ell } {\mathbf {x}}\Vert _{{{\mathbb {R}}}^L} \bigr ) \quad \text {for all }\,\; {\mathbf {x}} \in {{\mathbb {R}}}^L\quad (0 \le \ell \le 2) . \end{aligned}$$
(A2):

The maps \(\pmb {\sigma }_{k} \in C^{3}({{\mathbb {R}}}^L; {{\mathbb {R}}}^{L})\) for every \(k=1,\ldots ,K\), and there exist constants \(\{C_{D^{\ell } \pmb {\sigma }}\}_{\ell =1}^3\), s.t.

$$\begin{aligned} \sum \limits _{k=1}^{K} \sup \limits _{{\mathbf {x}}\in {\mathbb {R}}^{L}} \Vert D^{\ell } \pmb {\sigma }_{k}({\mathbf {x}})\Vert _{{\mathcal {L}}^{\ell }} \le C_{D^{\ell } \pmb {\sigma }} \quad (1 \le \ell \le 3); \end{aligned}$$

moreover, there exist constants \(\{C_{\pmb {\sigma }}^{(\ell )}\}_{\ell =0}^2\), s.t. for every \(k=1,\ldots ,K\)

$$\begin{aligned} \Vert {{\mathscr {A}}}^{\ell } \pmb {\sigma }_{k}({\mathbf {x}})\Vert _{{{\mathbb {R}}}^{L}} \le C_{\pmb {\sigma }}^{(\ell )} \bigl (1+ \Vert {{\mathscr {A}}}^{\ell } {\mathbf {x}}\Vert _{{{\mathbb {R}}}^L} \bigr ) \quad \text {for all }\, {\mathbf {x}} \in {{\mathbb {R}}}^L\quad (0 \le \ell \le 2) . \end{aligned}$$
(A3):

For \(0 \le \ell \le 2\), there exists \(C_{{\mathbf {y}}}^{(\ell )}\), s.t. the initial datum \({\mathbf {y}}\in {\mathbb {R}}^{L}\) in (1.1) satisfies

$$\begin{aligned} \Vert {{\mathscr {A}}}^{\ell } {\mathbf {y}}\Vert _{{{\mathbb {R}}}^L} \le C_{{\mathbf {y}}}^{(\ell )}. \end{aligned}$$

Throughout this work, we admit test functions \(\phi \in C^{3}({{\mathbb {R}}}^L)\) with globally bounded first, second and third derivatives.

2.2 Malliavin calculus

We briefly recall the Malliavin derivative and its chain rule, and state the Clark–Ocone formula; for further details, we refer to [25]. We denote by \(C_{\mathtt {p}}^{\infty }({\mathbb {R}}^{L})\) the space of all smooth functions \(g:{\mathbb {R}}^{L}\rightarrow {\mathbb {R}}\) such that g and all of its partial derivatives have polynomial growth. Let \({\mathfrak {P}}\) be the set of \({\mathbb {R}}\)-valued random variables of the form

$$\begin{aligned} \mathrm {F}=g\bigl (W(h_{1}),\ldots ,W(h_{L})\bigr ) \end{aligned}$$

for some \(g\in C_{\mathtt {p}}^{\infty }({\mathbb {R}}^{L})\) and \(h_{1},\ldots ,h_{L}\in L^{2}(0,T)\). Here, \(W:L^{2}(0,T)\rightarrow L_{{\mathcal {F}}_{T}}^{2}(\Omega )\) is defined by

$$\begin{aligned} W(h)=\int _{0}^{T} h(t)\,\mathrm {d}\beta (t). \end{aligned}$$

We further define for any \(\mathrm {F}\in {\mathfrak {P}}\) its \({\mathbb {R}}\)-valued Malliavin derivative process \(\mathrm {D}\mathrm {F}:=\{\mathrm {D}_{t}\mathrm {F};\,0\le t\le T\}\) via

$$\begin{aligned} \mathrm {D}_{t}\mathrm {F}=\sum \limits _{i=1}^{L} \partial _{x_{i}}g\bigl (W(h_{1}),\ldots ,W(h_{L})\bigr )h_{i}(t). \end{aligned}$$
(2.1)
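As a small sanity check of (2.1) (our own illustration), take \(L=1\), \(g(x)=x^{2}\) and \(h_{1}\equiv 1\) on (0, T), so that \(\mathrm {F}=W(h_{1})^{2}=\beta (T)^{2}\); then

$$\begin{aligned} \mathrm {D}_{t}\mathrm {F}=g'\bigl (W(h_{1})\bigr )h_{1}(t)=2\beta (T) \quad \text {for all }\,\; t\in [0,T]. \end{aligned}$$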

For any \(p\ge 1\), let \({\mathbb {D}}^{1,p}\) denote the closure of the class \({\mathfrak {P}}\) of smooth random variables with respect to the norm

$$\begin{aligned} \Vert \mathrm {F}\Vert _{{\mathbb {D}}^{1,p}}=\Bigl ( {\mathbb {E}}[|\mathrm {F}|^{p}]+{\mathbb {E}}\Big [ \Vert \mathrm {D}\mathrm {F}\Vert _{L^{2}(0,T)}^{p}\Big ]\Bigr )^{\frac{1}{p}}. \end{aligned}$$

Next, we recall the chain rule for Malliavin derivatives; see [25, p. 28, Prop. 1.2.3].

Let \(\varphi :{\mathbb {R}}^{L}\rightarrow {\mathbb {R}}\) be a continuously differentiable function with bounded partial derivatives of order 1, and \(p\ge 1\) be fixed. Let further \(\pmb {\mathrm {F}}=\bigl ( \mathrm {F}^{1},\ldots ,\mathrm {F}^{L}\bigr )^\top \) be a random vector whose components belong to the space \({\mathbb {D}}^{1,p}\). Then \(\varphi (\pmb {\mathrm {F}})\in {\mathbb {D}}^{1,p}\), and

$$\begin{aligned} \mathrm {D}\bigl (\varphi (\pmb {\mathrm {F}})\bigr )=\sum \limits _{i=1}^{L}\partial _{x_{i}}\varphi (\pmb {\mathrm {F}}) \mathrm {D}\mathrm {F}^{i}. \end{aligned}$$

Finally, we recall the Clark–Ocone representation formula; see [25, p. 46, Prop. 1.3.14].

Lemma 2.1

Let \(\mathrm {F}\in {\mathbb {D}}^{1,2}\), and \(\beta \) be a one-dimensional Wiener process. Then

$$\begin{aligned} \mathrm {F}={\mathbb {E}}[\mathrm {F}]+\int _{0}^{T} {\mathbb {E}}\big [\mathrm {D}_{t}\mathrm {F}|{\mathcal {F}}_{t}\big ]\,\mathrm {d}\beta (t). \end{aligned}$$
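Continuing the example \(\mathrm {F}=\beta (T)^{2}\) from above (again our own illustration): \({\mathbb {E}}[\mathrm {F}]=T\) and \({\mathbb {E}}\big [\mathrm {D}_{t}\mathrm {F}|{\mathcal {F}}_{t}\big ]=2{\mathbb {E}}\big [\beta (T)|{\mathcal {F}}_{t}\big ]=2\beta (t)\), so Lemma 2.1 yields

$$\begin{aligned} \beta (T)^{2}=T+\int _{0}^{T} 2\beta (t)\,\mathrm {d}\beta (t), \end{aligned}$$

which is precisely Itô’s formula applied to \(t\mapsto \beta (t)^{2}\).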

2.3 Variation equations for (1.1) and a priori bounds for \(\{ D_{\mathbf {x}}^{\ell }u\}_{\ell =1}^3\) of (1.5)

Kolmogorov’s backward equation (1.5) has a unique solution \([0,T] \times {{\mathbb {R}}}^L \ni (t,{\mathbf {x}}) \mapsto u(t,{\mathbf {x}}) = {{\mathbb {E}}}[\phi ({\mathbf {X}}^{t,{\mathbf {x}}}_T)]\) whenever assumptions (A1)–(A2) hold; see e.g. [15, p. 366ff.]. The derivation in Sect. 3 requires explicit bounds for the derivatives of u which are, in particular, uniform in L. To this end, we use the variation equations corresponding to (1.1) and derive these results here. For a detailed verification of the upcoming results, we refer to [22].

For \({\mathbf {y}} \in {{\mathbb {R}}}^L\) fixed, we denote by \({\mathbf {X}} \equiv {\mathbf {X}}^{0,{\mathbf {y}}}\) the solution of (1.1). Let \({\mathbf {h}} \in {{\mathbb {R}}}^L\). Following [3, p. 37ff.], we recall the first variation equation corresponding to (1.1),

$$\begin{aligned} \mathrm{d}\pmb {\eta }^{\mathbf {h}}_t&= \bigl ( -{{\mathscr {A}}}+ D{\mathbf {f}}({\mathbf {X}}_t)\bigr )\cdot \pmb {\eta }^{\mathbf {h}}_t\mathrm{d}t + \sum \limits _{k=1}^{K} D \pmb {\sigma }_{k}({\mathbf {X}}_t) \cdot \pmb {\eta }^{\mathbf {h}}_t\mathrm{d}\beta _{k}(t) \quad \text {for all }\,\; t \in [0,T], \nonumber \\ \pmb {\eta }^{\mathbf {h}}_0&= {\mathbf {h}}. \end{aligned}$$
(2.2)

Since assumptions (A1)–(A2) are valid, there exists a unique solution \(\pmb {\eta }^{\mathbf {h}} \equiv \{ \pmb {\eta }^{\mathbf {h}}_t;\, t \in [0,T]\}\); it is equal to \(D_{\mathbf {y}} {\mathbf {X}}^{0, {\mathbf {y}}} \cdot {\mathbf {h}}\), the derivative w.r.t. the initial datum \({\mathbf {y}} \in {{\mathbb {R}}}^L\) of the map \({\mathbf {y}} \mapsto {\mathbf {X}}^{0, {\mathbf {y}}}\), along the direction \({\mathbf {h}}\in {\mathbb {R}}^{L}\).

Lemma 2.2

Assume (A1)–(A2) in (2.2). Then, for every \({\mathbf {h}}\in {\mathbb {R}}^{L}\) and \(p\ge 1\),

$$\begin{aligned} \sup \limits _{t\in [0,T]} {\mathbb {E}}\big [ \Vert \pmb {\eta }^{\mathbf {h}}_t \Vert _{{\mathbb {R}}^{L}}^{p}\big ]\le V_p^{(1)}\cdot \Vert {\mathbf {h}}\Vert _{{\mathbb {R}}^{L}}^{p}, \end{aligned}$$

where \(V_p^{(1)}:=e^{pT\max \bigg \{-\lambda _{{{\mathscr {A}}}}+C_{D{\mathbf {f}}}+\tfrac{p-1}{2}C^2_{D\pmb {\sigma }},0\bigg \}}\) \((p{>}1)\), and \(V_1^{(1)}:=e^{T\max \left\{ -\lambda _{{{\mathscr {A}}}}+C_{D{\mathbf {f}}}+\tfrac{1}{2}C^2_{D\pmb {\sigma }},0\right\} }\).

Proof

(a) \(p>1\): Let \(t\in [0,T]\), \({\mathbf {h}}\in {\mathbb {R}}^{L}\) and \(p>1\). By Itô’s formula,

$$\begin{aligned} \frac{1}{p} {\mathbb {E}}\big [\Vert \pmb {\eta }_{t}^{{\mathbf {h}}} \Vert _{{\mathbb {R}}^{L}}^{p}\big ]&\le \frac{\Vert {\mathbf {h}}\Vert _{{\mathbb {R}}^{L}}^{p}}{p}+{\mathbb {E}}\bigg [ \int _{0}^{t} \big \langle \bigl (-{{\mathscr {A}}}+ D{\mathbf {f}}({\mathbf {X}}_{s})\bigr )\cdot \pmb {\eta }_{s}^{{\mathbf {h}}},\pmb {\eta }_{s}^{{\mathbf {h}}}\big \rangle _{{\mathbb {R}}^{L}}\cdot \Vert \pmb {\eta }_{s}^{{\mathbf {h}}}\Vert _{{\mathbb {R}}^{L}}^{p-2} \\&\quad +\frac{1}{2}\sum \limits _{k=1}^{K}\mathrm {Tr}\Bigl (D\pmb {\sigma }_{k}({\mathbf {X}}_{s})\pmb {\eta }_{s}^{{\mathbf {h}}} [D\pmb {\sigma }_{k}({\mathbf {X}}_{s}) \pmb {\eta }_{s}^{{\mathbf {h}}}]^\top \Bigr )\cdot (p-1) \Vert \pmb {\eta }_{s}^{{\mathbf {h}}}\Vert _{{\mathbb {R}}^{L}}^{p-2} \,\mathrm {d}s \bigg ]. \end{aligned}$$

Using assumptions (A1)–(A2) leads to

$$\begin{aligned}&\frac{1}{p} {\mathbb {E}}\big [\Vert \pmb {\eta }_{t}^{{\mathbf {h}}} \Vert _{{\mathbb {R}}^{L}}^{p}\big ]\\&\quad \le \frac{\Vert {\mathbf {h}}\Vert _{{\mathbb {R}}^{L}}^{p}}{p} +{\mathbb {E}}\bigg [ \int _{0}^{t} \bigg \{ \big \langle \bigl (-\pmb {{\mathscr {A}}} + D{\mathbf {f}}({\mathbf {X}}_{s})\bigr )\cdot \pmb {\eta }_{s}^{{\mathbf {h}}},\pmb {\eta }_{s}^{{\mathbf {h}}}\big \rangle _{{\mathbb {R}}^{L}}\\&\qquad +\tfrac{p-1}{2} \sum \limits _{k=1}^{K}\Vert D\pmb {\sigma }_{k}({\mathbf {X}}_{s}) \pmb {\eta }_{s}^{{\mathbf {h}}}\Vert _{{\mathbb {R}}^{L}}^{2}\bigg \}\Vert \pmb {\eta }_{s}^{{\mathbf {h}}}\Vert _{{\mathbb {R}}^{L}}^{p-2} \,\mathrm {d}s \bigg ]\\&\quad \le \frac{\Vert {\mathbf {h}}\Vert _{{\mathbb {R}}^{L}}^{p}}{p} + \max \left\{ -\lambda _{{{\mathscr {A}}}}+C_{D\mathbf {f}}+\tfrac{p-1}{2}C_{D\pmb {\sigma }}^{2},0 \right\} \int _{0}^{t} {\mathbb {E}}\big [\Vert \pmb {\eta }_{s}^{{\mathbf {h}}} \Vert _{{\mathbb {R}}^{L}}^{p}\big ]\,\mathrm {d}s. \end{aligned}$$

Applying Gronwall’s inequality leads to the first result.

(b) \(p=1\): This follows by using Jensen’s inequality. \(\square \)

Next, following [3, p. 39ff.], we consider the second variation equation corresponding to (1.1), that is, for \({\mathbf {h}},{\mathbf {w}} \in {{\mathbb {R}}}^L\),

$$\begin{aligned} \mathrm{d}\pmb {\zeta }^{{\mathbf {h}},{\mathbf {w}}}_t&=\Bigl ( \bigl ( -{{\mathscr {A}}}+ D{\mathbf {f}}({\mathbf {X}}_t)\bigr )\cdot \pmb {\zeta }^{{\mathbf {h}},{\mathbf {w}}}_{t} +D^{2}{\mathbf {f}}({\mathbf {X}}_t)\cdot (\pmb {\eta }_{t}^{{\mathbf {h}}},\pmb {\eta }_{t}^{{\mathbf {w}}}) \Bigr )\mathrm{d}t \nonumber \\&\quad +\sum \limits _{k=1}^{K}\Bigl ( D \pmb {\sigma }_{k}({\mathbf {X}}_t) \cdot \pmb {\zeta }^{{\mathbf {h}},{\mathbf {w}}}_{t} + D^{2} \pmb {\sigma }_{k}({\mathbf {X}}_t)\cdot (\pmb {\eta }_{t}^{{\mathbf {h}}},\pmb {\eta }_{t}^{{\mathbf {w}}})\Bigr ) \mathrm{d}\beta _{k}(t) \quad \text {for all }\,\; t \in [0,T],\nonumber \\ \pmb {\zeta }^{{\mathbf {h}},{\mathbf {w}}}_0&= \mathbf{0}\in {\mathbb {R}}^{L}. \end{aligned}$$
(2.3)

Since assumptions (A1)–(A2) are valid, there exists a unique solution \(\pmb {\zeta }^{\mathbf {h},\mathbf {w}} \equiv \{ \pmb {\zeta }^{\mathbf {h},\mathbf {w}}_t;\, t \in [0,T]\}\); it is equal to \(D^{2}_{\mathbf {y}} {\mathbf {X}}^{0, {\mathbf {y}}} \cdot ({\mathbf {h}},{\mathbf {w}})\), the second derivative w.r.t. the initial datum \({\mathbf {y}} \in {{\mathbb {R}}}^L\) of the map \({\mathbf {y}} \mapsto {\mathbf {X}}^{0, {\mathbf {y}}}\) along the directions \({\mathbf {h}}, {\mathbf {w}} \in {\mathbb {R}}^{L}\).

Lemma 2.3

Assume (A1)–(A2) in (2.3). Then, for every \({\mathbf {h}},{\mathbf {w}}\in {\mathbb {R}}^{L}\), \(\varepsilon _{1},\varepsilon _{2}>0\) and \(p\ge 1\),

$$\begin{aligned} \sup \limits _{t\in [0,T]} {\mathbb {E}}\big [ \Vert \pmb {\zeta }^{{\mathbf {h}},{\mathbf {w}}}_{t} \Vert _{{\mathbb {R}}^{L}}^{p}\big ]\le V_{p,\varepsilon _{1},\varepsilon _{2}}^{(2)}\cdot \Vert {\mathbf {h}}\Vert _{{\mathbb {R}}^{L}}^{p} \Vert {\mathbf {w}}\Vert _{{\mathbb {R}}^{L}}^{p}, \end{aligned}$$

where \(V_{1,\varepsilon _{1},\varepsilon _{2}}^{(2)}:=\sqrt{V_{2,\varepsilon _{1},\varepsilon _{2}}^{(2)}}\) for \(p=1\), and

$$\begin{aligned} V_{p,\varepsilon _{1},\varepsilon _{2}}^{(2)}&:=T\left( \frac{1}{\varepsilon _{1}^{p-1}}C_{D^{2}{\mathbf {f}}}^{p}+\frac{2}{\varepsilon _{2}^{\nicefrac {(p-2)}{2}}}C_{D^{2}\pmb { \sigma }}^{p}(p-1)^{\tfrac{p}{2}}\right) V_{2p}^{(1)}\\&\quad \cdot e^{pT\max \left\{ -\lambda _{{{\mathscr {A}}}}+C_{D{\mathbf {f}}}+(p-1)C^2_{D\pmb {\sigma }}+\tfrac{(p-1)}{p}\varepsilon _{1}+\tfrac{(p-2)}{p}\varepsilon _{2},0\right\} } \quad (p \ge 2). \end{aligned}$$

Proof

The proof follows the steps outlined in the proof of Lemma 2.2: by Itô’s formula, we may control \(\frac{1}{p}{\mathbb {E}}\big [ \Vert \pmb {\zeta }^{{\mathbf {h}},{\mathbf {w}}}_{t} \Vert _{{\mathbb {R}}^{L}}^{p}\big ]\) in terms of the data, and \({\mathbf {h}},{\mathbf {w}}\) in (2.3); the assertion then follows with the help of the generalized Young and Gronwall inequalities. \(\square \)
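The generalized Young inequality invoked here reads: for \(a,b\ge 0\), \(\varepsilon >0\) and conjugate exponents \(q,q'>1\) with \(\frac{1}{q}+\frac{1}{q'}=1\),

$$\begin{aligned} ab \le \varepsilon a^{q} + \frac{1}{q'(\varepsilon q)^{q'/q}}\, b^{q'}; \end{aligned}$$

it is the source of the \(\varepsilon _{1}\)- and \(\varepsilon _{2}\)-dependent factors in \(V_{p,\varepsilon _{1},\varepsilon _{2}}^{(2)}\).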

Now, let \(\mathbf {h_{1}},\mathbf {h_{2}},\mathbf {h_{3}} \in {\mathbb R}^L\). Following [3, p. 43ff.], we recall the third variation equation corresponding to (1.1),

$$\begin{aligned} \mathrm{d}\pmb {\Theta }^{\mathbf {h_{1}},\mathbf {h_{2}},\mathbf {h_{3}}}_{t}&=\bigg ( \bigl ( -{{\mathscr {A}}}+ D{\mathbf {f}}({\mathbf {X}}_t)\bigr )\cdot \pmb {\Theta }^{\mathbf {h_{1}},\mathbf {h_{2}},\mathbf {h_{3}}}_{t}+ \frac{1}{4}\sum \limits _{\mathbf {\pi }\in \mathtt {S_{3}}} D^{2}{\mathbf {f}}({\mathbf {X}}_t)\cdot (\pmb {\eta }_{t}^{\mathbf {h_{\pi (1)}}},\pmb {\zeta }_{t}^{\mathbf {h_{\pi (2)}},\mathbf {h_{\pi (3)}}})\nonumber \\&\quad +D^{3}{\mathbf {f}}({\mathbf {X}}_t)\cdot (\pmb {\eta }_{t}^{\mathbf {h_{1}}},\pmb {\eta }_{t}^{\mathbf {h_{2}}},\pmb {\eta }_{t}^{\mathbf {h_{3}}}) \bigg )\mathrm{d}t\nonumber \\&\quad + \sum \limits _{k=1}^{K}\bigg ( D\pmb {\sigma }_{k}({\mathbf {X}}_{t})\cdot \pmb {\Theta }^{\mathbf {h_{1}},\mathbf {h_{2}},\mathbf {h_{3}}}_{t}+ \frac{1}{4}\sum \limits _{\mathbf {\pi }\in \mathtt {S_{3}}} D^{2}\pmb {\sigma }_{k}({\mathbf {X}}_t)\cdot (\pmb {\eta }_{t}^{\mathbf {h_{\pi (1)}}},\pmb {\zeta }_{t}^{\mathbf {h_{\pi (2)}},\mathbf {h_{\pi (3)}}})\nonumber \\&\quad +D^{3}\pmb {\sigma }_{k}({\mathbf {X}}_t)\cdot (\pmb {\eta }_{t}^{\mathbf {h_{1}}},\pmb {\eta }_{t}^{\mathbf {h_{2}}},\pmb {\eta }_{t}^{\mathbf {h_{3}}}) \bigg ) \mathrm{d}\beta _{k}(t) \quad \text {for all }\; t \in [0,T],\nonumber \\ \pmb {\Theta }^{\mathbf {h_{1}},\mathbf {h_{2}},\mathbf {h_{3}}}_{0}&= {\mathbf {0}}\in {\mathbb {R}}^{L}. \end{aligned}$$
(2.4)

Here, \(\mathtt {S_{3}}\) denotes the set of all permutations of a set of three elements. Since (A1)–(A2) apply, there exists a unique solution \(\pmb {\Theta }^{\mathbf {h_{1}},\mathbf {h_{2}},\mathbf {h_{3}}} \equiv \{ \pmb {\Theta }^{\mathbf {h_{1}},\mathbf {h_{2}},\mathbf {h_{3}}}_{t};\, t \in [0,T]\}\); it is equal to \(D^{3}_{\mathbf {y}} {\mathbf {X}}^{0, {\mathbf {y}}} \cdot ({\mathbf {h_{1}},\mathbf {h_{2}},\mathbf {h_{3}}})\), the third derivative w.r.t. the initial datum \({\mathbf {y}} \in {{\mathbb {R}}}^L\) of the map \({\mathbf {y}} \mapsto {\mathbf {X}}^{0, {\mathbf {y}}}\) along the directions \(\mathbf {h_{1}},\mathbf {h_{2}},\mathbf {h_{3}} \in {\mathbb {R}}^{L}\).

Moment bounds for the solution of (2.4) may be obtained as in Lemmata 2.2 and 2.3; see [22] for further details. In view of Lemma 2.5 below, it suffices to consider only second moment bounds in the following lemma.

Lemma 2.4

Assume (A1)–(A2) in (2.4). Then, for every \(\mathbf {h_{1}},\mathbf {h_{2}},\mathbf {h_{3}}\in {\mathbb {R}}^{L}\) and \(\varepsilon _{3}>0\),

$$\begin{aligned} \sup \limits _{t\in [0,T]} {\mathbb {E}}\big [ \Vert \pmb {\Theta }^{\mathbf {h_{1}},\mathbf {h_{2}},\mathbf {h_{3}}}_{t} \Vert _{{\mathbb {R}}^{L}}^{2}\big ]\le V_{2,\varepsilon _{3}}^{(3)}\cdot \Vert \mathbf {h_{1}}\Vert _{{\mathbb {R}}^{L}}^{2} \Vert \mathbf {h_{2}}\Vert _{{\mathbb {R}}^{L}}^{2} \Vert \mathbf {h_{3}}\Vert _{{\mathbb {R}}^{L}}^{2}, \end{aligned}$$

where

$$\begin{aligned} V_{2,\varepsilon _{3}}^{(3)}&:=T \sqrt{V_{4}^{(1)}} \biggl ( \sqrt{V_{4,\varepsilon _{1},\varepsilon _{2}}^{(2)}} \Bigl ( \tfrac{9}{4\varepsilon _{3}}C_{D^{2}{\mathbf {f}}}^{2}+\tfrac{27}{4}C_{D^{2}\pmb { \sigma }}^{2}\Bigr )+ \sqrt{V_{8}^{(1)}}\Bigl (\tfrac{1}{\varepsilon _{3}}C_{D^{3}{\mathbf {f}}}^{2}+3C_{D^{3}\pmb {\sigma }}^{2}\Bigr )\biggr )\\&\quad \cdot e^{2T\max \big \{-\lambda _{{{\mathscr {A}}}}+C_{D{\mathbf {f}}}+\tfrac{3}{2}C^{2}_{D\pmb {\sigma }}+\varepsilon _{3},0\big \}}. \end{aligned}$$

Moreover for \(V_{1,\varepsilon _{3}}^{(3)}:=\sqrt{V_{2,\varepsilon _{3}}^{(3)}}\),

$$\begin{aligned} \sup \limits _{t\in [0,T]} {\mathbb {E}}\big [ \Vert \pmb {\Theta }^{\mathbf {h_{1}},\mathbf {h_{2}},\mathbf {h_{3}}}_{t} \Vert _{{\mathbb {R}}^{L}}\big ]\le V_{1,\varepsilon _{3}}^{(3)}\cdot \Vert \mathbf {h_{1}}\Vert _{{\mathbb {R}}^{L}} \Vert \mathbf {h_{2}}\Vert _{{\mathbb {R}}^{L}} \Vert \mathbf {h_{3}}\Vert _{{\mathbb {R}}^{L}}. \end{aligned}$$

Let \(t\in [0,T]\) and \({\mathbf {x}}\in {\mathbb {R}}^{L}\). In order to obtain global bounds for the first and higher derivatives of u in (1.5), we use the following identities, which can be found e.g. in [3, p. 94] in connection with Kolmogorov’s forward equation; the latter arises from (1.5) by the time reversal \(t\mapsto T-t\) and by turning the terminal condition into an initial condition.

$$\begin{aligned} \big \langle D_{{\mathbf {x}}}u(t,{\mathbf {x}}),{\mathbf {h}}\big \rangle _{{\mathbb {R}}^{L}}={\mathbb {E}}\Big [\big \langle D_{{\mathbf {x}}}\phi ({\mathbf {X}}_{T}^{t,{\mathbf {x}}}), D_{{\mathbf {x}}}{\mathbf {X}}_{T}^{t,{\mathbf {x}}}\cdot {\mathbf {h}}\big \rangle _{{\mathbb {R}}^{L}}\Big ]\quad \text {for all }\,\; {\mathbf {h}}\in {\mathbb {R}}^{L}, \end{aligned}$$
(2.5)
$$\begin{aligned} \big \langle D^{2}_{{\mathbf {x}}}u(t,{\mathbf {x}}){\mathbf {h}},{\mathbf {w}}\big \rangle _{{\mathbb {R}}^{L}}&={\mathbb {E}}\Big [\big \langle D^{2}_{{\mathbf {x}}}\phi ({\mathbf {X}}_{T}^{t,{\mathbf {x}}}) D_{{\mathbf {x}}}{\mathbf {X}}_{T}^{t,{\mathbf {x}}}\cdot {\mathbf {h}},D_{{\mathbf {x}}}{\mathbf {X}}_{T}^{t,{\mathbf {x}}}\cdot {\mathbf {w}}\big \rangle _{{\mathbb {R}}^{L}}\Big ] \nonumber \\&\quad +{\mathbb {E}}\Big [\big \langle D_{{\mathbf {x}}}\phi ({\mathbf {X}}_{T}^{t,{\mathbf {x}}}),D^{2}_{{\mathbf {x}}}{\mathbf {X}}_{T}^{t,{\mathbf {x}}}\cdot ({\mathbf {h}},{\mathbf {w}})\big \rangle _{{\mathbb {R}}^{L}}\Big ]\quad \text {for all } \,\; {\mathbf {h}},{\mathbf {w}}\in {\mathbb {R}}^{L} \end{aligned}$$
(2.6)

and

$$\begin{aligned} \Big \langle D_{{\mathbf {x}}}\big \langle D^{2}_{{\mathbf {x}}} u(t,{\mathbf {x}}) \mathbf {h_{1}},\mathbf {h_{2}}\big \rangle _{{\mathbb {R}}^{L}},\mathbf {h_{3}}\Big \rangle _{{\mathbb {R}}^{L}}&={\mathbb {E}}\Big [ \big \langle D^{3}_{{\mathbf {x}}}\phi ({\mathbf {X}}_{T}^{t,{\mathbf {x}}}) D_{{\mathbf {x}}}{\mathbf {X}}_{T}^{t,{\mathbf {x}}}\cdot \mathbf {h_{1}} D_{{\mathbf {x}}}{\mathbf {X}}_{T}^{t,{\mathbf {x}}}\cdot \mathbf {h_{2}}, D_{{\mathbf {x}}}{\mathbf {X}}_{T}^{t,{\mathbf {x}}}\cdot \mathbf {h_{3}}\big \rangle _{{\mathbb {R}}^{L}}\Big ]\nonumber \\&\quad + {\mathbb {E}}\Big [ \big \langle D^{2}_{{\mathbf {x}}}\phi ({\mathbf {X}}_{T}^{t,{\mathbf {x}}}) D^{2}_{{\mathbf {x}}}{\mathbf {X}}_{T}^{t,{\mathbf {x}}}\cdot (\mathbf {h_{1}},\mathbf {h_{3}}), D_{{\mathbf {x}}}{\mathbf {X}}_{T}^{t,{\mathbf {x}}}\cdot \mathbf {h_{2}}\big \rangle _{{\mathbb {R}}^{L}}\Big ]\nonumber \\&\quad + {\mathbb {E}}\Big [ \big \langle D^{2}_{{\mathbf {x}}}\phi ({\mathbf {X}}_{T}^{t,{\mathbf {x}}}) D_{{\mathbf {x}}}{\mathbf {X}}_{T}^{t,{\mathbf {x}}}\cdot \mathbf {h_{1}}, D^{2}_{{\mathbf {x}}}{\mathbf {X}}_{T}^{t,{\mathbf {x}}}\cdot (\mathbf {h_{2}},\mathbf {h_{3}})\big \rangle _{{\mathbb {R}}^{L}}\Big ]\nonumber \\&\quad + {\mathbb {E}}\Big [ \big \langle D^{2}_{{\mathbf {x}}}\phi ({\mathbf {X}}_{T}^{t,{\mathbf {x}}}) D^{2}_{{\mathbf {x}}}{\mathbf {X}}_{T}^{t,{\mathbf {x}}}\cdot (\mathbf {h_{1}},\mathbf {h_{2}}), D_{{\mathbf {x}}}{\mathbf {X}}_{T}^{t,{\mathbf {x}}}\cdot \mathbf {h_{3}}\big \rangle _{{\mathbb {R}}^{L}}\Big ]\nonumber \\&\quad + {\mathbb {E}}\Big [ \big \langle D_{{\mathbf {x}}}\phi ({\mathbf {X}}_{T}^{t,{\mathbf {x}}}), D^{3}_{{\mathbf {x}}}{\mathbf {X}}_{T}^{t,{\mathbf {x}}}\cdot (\mathbf {h_{1}},\mathbf {h_{2}},\mathbf {h_{3}})\big \rangle _{{\mathbb {R}}^{L}}\Big ] \quad \text {for all } \,\; \mathbf {h_{1}},\mathbf {h_{2}},\mathbf {h_{3}}\in {\mathbb {R}}^{L}. \end{aligned}$$
(2.7)

In the following Lemma 2.5, based upon (2.5)–(2.7) and Lemmata 2.2, 2.3 and 2.4, we derive global bounds for the first, second and third derivatives of the solution u of (1.5) in terms of derivatives of \(\phi \), which are independent of the dimension L.

Lemma 2.5

Assume (A1)–(A2), and let \(\{ D^{\ell }_{\mathbf {x}} u\}_{\ell =1}^3\) be from (1.5). Then, for all \(\varepsilon _{1},\varepsilon _{2},\varepsilon _{3}>0\),

\(\mathbf {(i)}\):

\(\begin{aligned} \sup _{(t, {\mathbf {x}}) \in [0,T] \times {\mathbb R}^L} \Vert D_{\mathbf {x}} u(t,{\mathbf {x}})\Vert _{{{\mathbb {R}}}^{L}} \le V_{1}^{(1)}\cdot \sup \limits _{{\mathbf {x}}\in {\mathbb {R}}^{L}}\Vert D \phi ({\mathbf {x}})\Vert _{{\mathbb {R}}^{L}}, \end{aligned}\)

\(\mathbf {(ii)}\):

\(\begin{aligned} \sup _{(t, {\mathbf {x}}) \in [0,T] \times {\mathbb R}^L} \Vert D^{2}_{\mathbf {x}} u(t,{\mathbf {x}})\Vert _{{{\mathbb {R}}}^{L\times L}} \le V_{1,\varepsilon _{1},\varepsilon _{2}}^{(2)}\cdot \sup \limits _{{\mathbf {x}}\in {\mathbb {R}}^{L}}\Vert D \phi ({\mathbf {x}})\Vert _{{\mathbb {R}}^{L}}+V_{2}^{(1)}\cdot \sup \limits _{{\mathbf {x}}\in {\mathbb {R}}^{L}}\Vert D^{2} \phi ({\mathbf {x}})\Vert _{{\mathbb {R}}^{L\times L}}, \end{aligned}\)

\(\mathbf {(iii)}\):

\(\begin{aligned} \sup _{(t, {\mathbf {x}}) \in [0,T] \times {{\mathbb {R}}}^L} \Vert D^{3}_{\mathbf {x}} u(t,{\mathbf {x}})\Vert _{{\mathcal {L}}^{3}}&\le V_{1,\varepsilon _{3}}^{(3)}\cdot \sup \limits _{{\mathbf {x}}\in {\mathbb {R}}^{L}}\Vert D \phi ({\mathbf {x}})\Vert _{{\mathbb {R}}^{L}}+3V_{1}^{(1)}V_{1,\varepsilon _{1},\varepsilon _{2}}^{(2)}\cdot \sup \limits _{{\mathbf {x}}\in {\mathbb {R}}^{L}}\Vert D^{2} \phi ({\mathbf {x}})\Vert _{{\mathbb {R}}^{L\times L}} \\&\quad + V_{1}^{(1)}\sqrt{V_{4}^{(1)}} \cdot \sup \limits _{{\mathbf {x}}\in {\mathbb {R}}^{L}}\Vert D^{3} \phi ({\mathbf {x}})\Vert _{{\mathcal {L}}^{3}},\nonumber \end{aligned} \)

where \(V_{1}^{(1)},V_{2}^{(1)},V_{4}^{(1)},V_{1,\varepsilon _{1},\varepsilon _{2}}^{(2)},V_{1,\varepsilon _{3}}^{(3)}\) are given in Lemmata 2.2, 2.3 and 2.4.

Proof

(i) Let \({\mathbf {x}}\in {\mathbb {R}}^{L}\) and \({\mathbf {0}}\ne {\mathbf {h}}\in {\mathbb {R}}^{L}\). We apply the Cauchy–Schwarz inequality to the identity (2.5) and use Lemma 2.2 to get

$$\begin{aligned} \Bigl \vert \bigl \langle D_{\mathbf {x}} u(t, {\mathbf {x}}), {\mathbf {h}}\bigr \rangle _{\mathbb {R}^{L}}\Bigr \vert \le V_{1}^{(1)}\cdot \sup \limits _{{\mathbf {z}}\in {\mathbb {R}}^{L}}\Vert D\phi ({\mathbf {z}})\Vert _{{\mathbb {R}}^{L}} \cdot \Vert {\mathbf {h}}\Vert _{{\mathbb {R}}^{L}}. \end{aligned}$$

Thus, taking \({\mathbf {h}}=D_{\mathbf {x}} u(t, {\mathbf {x}})\) immediately yields the assertion.

(ii) Let \({\mathbf {x}}\in {\mathbb {R}}^{L}\), \({\mathbf {0}}\ne {\mathbf {h}},{\mathbf {w}}\in {\mathbb {R}}^{L}\) and \(\varepsilon _{1},\varepsilon _{2}>0\). Similar to (i) we obtain, using Lemmata 2.2 and 2.3,

$$\begin{aligned} \Bigl |\big \langle D^{2}_{{\mathbf {x}}}u(t,{\mathbf {x}}){\mathbf {h}},{\mathbf {w}}\big \rangle _{{\mathbb {R}}^{L}} \Bigr |&\le \Bigl (V_{1,\varepsilon _{1},\varepsilon _{2}}^{(2)}\cdot \sup \limits _{{\mathbf {z}}\in {\mathbb {R}}^{L}}\Vert D \phi ({\mathbf {z}})\Vert _{{\mathbb {R}}^{L}}+V_{2}^{(1)}\\&\quad \cdot \sup \limits _{{\mathbf {z}}\in {\mathbb {R}}^{L}}\Vert D^{2} \phi ({\mathbf {z}})\Vert _{{\mathbb {R}}^{L\times L}}\Bigr )\cdot \Vert {\mathbf {h}}\Vert _{{\mathbb {R}}^{L}} \Vert {\mathbf {w}}\Vert _{{\mathbb {R}}^{L}}. \end{aligned}$$

Taking \({\mathbf {w}}= D^{2}_{{\mathbf {x}}}u(t,{\mathbf {x}}){\mathbf {h}}\), we further obtain

$$\begin{aligned} \frac{\Vert D^{2}_{{\mathbf {x}}}u(t,{\mathbf {x}}){\mathbf {h}}\Vert _{{\mathbb {R}}^{L}}}{\Vert {\mathbf {h}}\Vert _{{\mathbb {R}}^{L}}}\le V_{1,\varepsilon _{1},\varepsilon _{2}}^{(2)}\cdot \sup \limits _{{\mathbf {z}}\in {\mathbb {R}}^{L}}\Vert D \phi ({\mathbf {z}})\Vert _{{\mathbb {R}}^{L}}+V_{2}^{(1)}\cdot \sup \limits _{{\mathbf {z}}\in {\mathbb {R}}^{L}}\Vert D^{2} \phi ({\mathbf {z}})\Vert _{{\mathbb {R}}^{L\times L}}. \end{aligned}$$

The assertion now follows since

$$\begin{aligned} \Vert D^{2}_{{\mathbf {x}}}u(t,{\mathbf {x}})\Vert _{{\mathbb {R}}^{L\times L}}:=\sup \limits _{\Vert {\mathbf {h}}\Vert _{{\mathbb {R}}^{L}}=1} \Vert D^{2}_{{\mathbf {x}}}u(t,{\mathbf {x}}){\mathbf {h}}\Vert _{{\mathbb {R}}^{L}}. \end{aligned}$$

(iii) Let \({\mathbf {x}}\in {\mathbb {R}}^{L}\) and \({\mathbf {0}}\ne \mathbf {h_{1}},\mathbf {h_{2}},\mathbf {h_{3}}\in {\mathbb {R}}^{L}\). Similar to (i) and (ii), the verification of assertion (iii) follows by means of identity (2.7), the Cauchy–Schwarz inequality and Lemmata 2.2, 2.3 and 2.4. \(\square \)

2.4 Stability bounds for iterates \(\{ {\mathbf {Y}}^j\}_{j\ge 0}\) from (1.3)

We derive L-independent stability bounds for iterates \(\{ {\mathbf {Y}}^j\}_{j\ge 0}\) from (1.3), provided (A1)–(A3) are valid.

Lemma 2.6

Assume (A1)–(A3). Consider a mesh \(\{t_{j}\}_{j=0}^{J}\subset [0,T]\). Let \(p\in {\mathbb {N}}{\setminus } \{0\}\) and let \(\{{\mathbf {Y}}^{j}\}_{j\ge 0}\) solve (1.3). Then, for \(\ell =0,1,2\), we have

$$\begin{aligned} \sup \limits _{j\ge 0} {\mathbb {E}}\big [\Vert {{\mathscr {A}}}^{\ell } {\mathbf {Y}}^{j}\Vert _{{\mathbb {R}}^{L}}^{2^{p}}\big ]\le \pmb {C}_{1,p}^{(\ell )}, \end{aligned}$$

where constants \(\pmb {C}_{1,p}^{(\ell )}>0\) are independent of L.

We give the proof for \(p=1,2\), and the proof for general \(p>2\) then follows inductively.

Proof

(a) \(p=1\): Fix \(j\ge 0\) and \(\ell =0,1,2\). Let \(\pmb {{\mathsf {Z}}}^{j}:={{\mathscr {A}}}^{\ell }{\mathbf {Y}}^{j}\). We multiply (1.3) by \(({{\mathscr {A}}}^{\ell })^{\top }\pmb {{\mathsf {Z}}}^{j+1}\) and use the binomial formula, as well as Young’s inequality \((\delta _{1},\delta _{2}\ge 1)\) to estimate

$$\begin{aligned}&\frac{1}{2}\Big ( \Vert \pmb {{\mathsf {Z}}}^{j+1}\Vert _{{\mathbb {R}}^{L}}^{2} - \Vert \pmb {{\mathsf {Z}}}^{j}\Vert _{{\mathbb {R}}^{L}}^{2}\Big )+ \left( \frac{1}{2} -\frac{1}{4\delta _{1}}- \frac{1}{4\delta _{2}}\right) \Vert \pmb {{\mathsf {Z}}}^{j+1}-\pmb {{\mathsf {Z}}}^{j}\Vert _{{\mathbb {R}}^{L}}^{2} + \tau ^{j+1}\big \langle {{\mathscr {A}}}\pmb {{\mathsf {Z}}}^{j+1},\pmb {{\mathsf {Z}}}^{j+1}\big \rangle _{{\mathbb {R}}^{L}}\nonumber \\&\quad \le \delta _{1}(\tau ^{j+1})^{2} \Vert {{\mathscr {A}}}^{\ell }{\mathbf {f}}({\mathbf {Y}}^{j})\Vert _{{\mathbb {R}}^{L}}^{2} + \tau ^{j+1}\Vert {{\mathscr {A}}}^{\ell }{\mathbf {f}}({\mathbf {Y}}^{j})\Vert _{{\mathbb {R}}^{L}} \Vert \pmb {{\mathsf {Z}}}^{j} \Vert _{{\mathbb {R}}^{L}} \nonumber \\&\qquad +\delta _{2} \sum \limits _{k=1}^{K}\Vert {{\mathscr {A}}}^{\ell }\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\Delta _{j+1} \beta _{k}\Vert _{{\mathbb {R}}^{L}}^{2}+ \sum \limits _{k=1}^{K}\big \langle {{\mathscr {A}}}^{\ell }\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\Delta _{j+1} \beta _{k},\pmb {{\mathsf {Z}}}^{j}\big \rangle _{{\mathbb {R}}^{L}}. \end{aligned}$$
(2.8)

Note that the last term vanishes if \({{\mathbb {E}}}[\cdot ]\) is applied. By (A1)–(A2), the tower property for expectations, and the identity \({\mathbb {E}}\big [|\Delta _{j+1}\beta _{k}|^{2}\big ]= \tau ^{j+1}\), we further conclude that

$$\begin{aligned} \frac{1}{2} {\mathbb {E}}\big [\Vert \pmb {{\mathsf {Z}}}^{j+1}\Vert _{{\mathbb {R}}^{L}}^{2}\big ]&\le \Big (2\delta _{1}\bigl (C_{{\mathbf {f}}}^{(\ell )}\bigr )^{2}T+2C_{{\mathbf {f}}}^{(\ell )}+2\delta _{2}K\bigl (C_{\pmb {\sigma }}^{(\ell )}\bigr )^{2}\Big )\cdot {\mathbb {E}}\big [ \Vert \pmb {{\mathsf {Z}}}^{j}\Vert _{{\mathbb {R}}^{L}}^{2}\big ]\cdot \tau ^{j+1}\\&\quad + \Big (2\delta _{1}\bigl (C_{{\mathbf {f}}}^{(\ell )}\bigr )^{2}T+2C_{{\mathbf {f}}}^{(\ell )}+2\delta _{2}K\bigl (C_{\pmb {\sigma }}^{(\ell )}\bigr )^{2}\Big )\cdot \tau ^{j+1}. \end{aligned}$$

We set

$$\begin{aligned} {\mathbf {C}}:=\Big (2\delta _{1}\bigl (C_{{\mathbf {f}}}^{(\ell )}\bigr )^{2}T+2C_{{\mathbf {f}}}^{(\ell )}+2\delta _{2}K\bigl (C_{\pmb {\sigma }}^{(\ell )}\bigr )^{2}\Big ). \end{aligned}$$

Summation over all iteration steps, and using (A3) then lead to

$$\begin{aligned} {\mathbb {E}}\big [\Vert \pmb {{\mathsf {Z}}}^{j^{*}}\Vert _{{\mathbb {R}}^{L}}^{2}\big ]\le \bigl (C_{{\mathbf {y}}}^{(\ell )}\bigr )^{2} + 2{\mathbf {C}} t_{j^{*}}+ 2 {\mathbf {C}} \sum \limits _{j=0}^{j^{*}-1} \tau ^{j+1} {\mathbb {E}}\big [ \Vert \pmb {{\mathsf {Z}}}^{j}\Vert _{{\mathbb {R}}^{L}}^{2}\big ] \quad (j^{*} \ge 1). \end{aligned}$$

Now, the discrete Gronwall inequality yields the assertion.

(b) \(p=2\): Multiply (2.8) by \(\Vert \pmb {{\mathsf {Z}}}^{j+1}\Vert _{{\mathbb {R}}^{L}}^{2}\) and use the binomial formula to get the estimate

$$\begin{aligned}&\frac{1}{4}\Big ( \Vert \pmb {{\mathsf {Z}}}^{j+1}\Vert _{{\mathbb {R}}^{L}}^{4} - \Vert \pmb {{\mathsf {Z}}}^{j}\Vert _{{\mathbb {R}}^{L}}^{4}\Big )+\frac{1}{4}\Bigl ( \Vert \pmb {{\mathsf {Z}}}^{j+1}\Vert _{{\mathbb {R}}^{L}}^{2} - \Vert \pmb {{\mathsf {Z}}}^{j}\Vert _{{\mathbb {R}}^{L}}^{2}\Bigr )^{2}+ \left( \frac{1}{2} -\frac{1}{4\delta _{1}}- \frac{1}{4\delta _{2}}\right) \\&\qquad \Vert \pmb {{\mathsf {Z}}}^{j+1}-\pmb {{\mathsf {Z}}}^{j}\Vert _{{\mathbb {R}}^{L}}^{2}\Vert \pmb {{\mathsf {Z}}}^{j+1}\Vert _{{\mathbb {R}}^{L}}^{2} + \tau ^{j+1}\big \langle {{\mathscr {A}}}\pmb {{\mathsf {Z}}}^{j+1},\pmb {{\mathsf {Z}}}^{j+1}\big \rangle _{{\mathbb {R}}^{L}} \Vert \pmb {{\mathsf {Z}}}^{j+1}\Vert _{{\mathbb {R}}^{L}}^{2}\\&\quad \le \delta _{1}(\tau ^{j+1})^{2} \Vert {{\mathscr {A}}}^{\ell }{\mathbf {f}}({\mathbf {Y}}^{j})\Vert _{{\mathbb {R}}^{L}}^{2}\Vert \pmb {{\mathsf {Z}}}^{j+1}\Vert _{{\mathbb {R}}^{L}}^{2} + \tau ^{j+1}\Vert {{\mathscr {A}}}^{\ell }{\mathbf {f}}({\mathbf {Y}}^{j})\Vert _{{\mathbb {R}}^{L}} \Vert \pmb {{\mathsf {Z}}}^{j} \Vert _{{\mathbb {R}}^{L}}\Vert \pmb {{\mathsf {Z}}}^{j+1}\Vert _{{\mathbb {R}}^{L}}^{2}\\&\qquad +\delta _{2} \sum \limits _{k=1}^{K}\Vert {{\mathscr {A}}}^{\ell }\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\Delta _{j+1} \beta _{k}\Vert _{{\mathbb {R}}^{L}}^{2}\Vert \pmb {{\mathsf {Z}}}^{j+1}\Vert _{{\mathbb {R}}^{L}}^{2} \\&\qquad + \sum \limits _{k=1}^{K}\big \langle {{\mathscr {A}}}^{\ell }\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\Delta _{j+1} \beta _{k},\pmb {{\mathsf {Z}}}^{j}\big \rangle _{{\mathbb {R}}^{L}}\Vert \pmb {{\mathsf {Z}}}^{j+1}\Vert _{{\mathbb {R}}^{L}}^{2}. \end{aligned}$$

We now add and subtract \(\Vert \pmb {{\mathsf {Z}}}^{j}\Vert _{{\mathbb {R}}^{L}}^{2}\) in the two terms on the right-hand side which involve random increments, to then absorb part of it into the second term on the left-hand side. For \(\tilde{\delta }_{1},\tilde{\delta }_{2},\tilde{\delta }_{3},\tilde{\delta }_{4}>0\), thanks to (A1)–(A2), taking expectations and using the tower property leads to

$$\begin{aligned}&\frac{1}{4}\Big ( {\mathbb {E}}\big [\Vert \pmb {{\mathsf {Z}}}^{j+1}\Vert _{{\mathbb {R}}^{L}}^{4}\big ] - {\mathbb {E}}\big [\Vert \pmb {{\mathsf {Z}}}^{j}\Vert _{{\mathbb {R}}^{L}}^{4}\big ]\Big )\\&\qquad +\left( \frac{1}{4} - \frac{1}{4\tilde{\delta }_{1}}- \frac{1}{4\tilde{\delta }_{2}}- \frac{1}{4\tilde{\delta }_{3}}- \frac{1}{4\tilde{\delta }_{4}}\right) {\mathbb {E}}\Big [ \big |\Vert \pmb {{\mathsf {Z}}}^{j+1}\Vert _{{\mathbb {R}}^{L}}^{2} - \Vert \pmb {{\mathsf {Z}}}^{j}\Vert _{{\mathbb {R}}^{L}}^{2}\big |^{2}\Big ]\\&\qquad + \left( \frac{1}{2} -\frac{1}{4\delta _{1}}- \frac{1}{4\delta _{2}}\right) {\mathbb {E}}\big [\Vert \pmb {{\mathsf {Z}}}^{j+1}-\pmb {{\mathsf {Z}}}^{j}\Vert _{{\mathbb {R}}^{L}}^{2}\Vert \pmb {{\mathsf {Z}}}^{j+1}\Vert _{{\mathbb {R}}^{L}}^{2}\big ]\\&\qquad + \tau ^{j+1} {\mathbb {E}}\big [\big \langle {{\mathscr {A}}}\pmb {{\mathsf {Z}}}^{j+1},\pmb {{\mathsf {Z}}}^{j+1}\big \rangle _{{\mathbb {R}}^{L}} \Vert \pmb {{\mathsf {Z}}}^{j+1}\Vert _{{\mathbb {R}}^{L}}^{2} \big ]\\&\quad \le \mathbf {{\widetilde{C}}}\Bigl ( 1 + {\mathbb {E}}\big [\Vert \pmb {{\mathsf {Z}}}^{j} \Vert _{{\mathbb {R}}^{L}}^{4}\big ]\Bigr )\cdot \tau ^{j+1}, \end{aligned}$$

with

$$\begin{aligned} \mathbf {{\widetilde{C}}}&:=8\tilde{\delta }_{1}\delta _{1}^{2}T^{3} \bigl (C_{{\mathbf {f}}}^{(\ell )}\bigr )^{4} + 4 T \delta _{1} \bigl (C_{{\mathbf {f}}}^{(\ell )}\bigr )^{2} +4 T\tilde{\delta }_{2} \bigl (C_{{\mathbf {f}}}^{(\ell )}\bigr )^{2}+ 8C_{{\mathbf {f}}}^{(\ell )} + 24T \tilde{\delta }_{3} K^{2} \delta _{2}^{2} \bigl (C_{\pmb {\sigma }}^{(\ell )}\bigr )^{4}\\&\quad + 4K\delta _{2} \bigl (C_{\pmb {\sigma }}^{(\ell )}\bigr )^{2}+ 4K^{2}\tilde{\delta }_{4} \bigl (C_{\pmb {\sigma }}^{(\ell )}\bigr )^{2}. \end{aligned}$$

Choosing \(\delta _{1},\delta _{2}\ge 1\), \(\tilde{\delta }_{1},\tilde{\delta }_{2},\tilde{\delta }_{3},\tilde{\delta }_{4}\ge 4\) and using (A1) then leads to

$$\begin{aligned} \frac{1}{4}\Big ( {\mathbb {E}}\big [\Vert \pmb {{\mathsf {Z}}}^{j+1}\Vert _{{\mathbb {R}}^{L}}^{4}\big ] - {\mathbb {E}}\big [\Vert \pmb {{\mathsf {Z}}}^{j}\Vert _{{\mathbb {R}}^{L}}^{4}\big ]\Big )\le \mathbf {{\widetilde{C}}}\cdot \tau ^{j+1} + \mathbf {{\widetilde{C}}} {\mathbb {E}}\big [\Vert \pmb {{\mathsf {Z}}}^{j} \Vert _{{\mathbb {R}}^{L}}^{4}\big ] \cdot \tau ^{j+1}. \end{aligned}$$

Now, similar arguments as in (a) yield the assertion. \(\square \)

3 A posteriori weak error estimates for the scheme (1.3)

In Theorem 3.1, we derive an a posteriori error estimate for iterates \(\{ {\mathbf {Y}}^j\}_{j=0}^J\) of scheme (1.3). It is shown in Theorem 3.5 that this error estimator converges with optimal order on uniform meshes, recovering the corresponding a priori weak error result in [6] for a related time discretization of (1.2); the relevant tools to verify Theorem 3.5 are the (discrete) stability properties of (1.3) in Lemma 2.6.

3.1 A posteriori weak error estimation: derivation and properties

We bound the weak approximation error \(\max \limits _{0\le j\le J}\Big |{\mathbb {E}}\big [ \phi ({\mathbf {X}}_{t_{j}})\big ]-{\mathbb {E}}\big [ \phi ({\mathbf {Y}}^{j})\big ] \Big |\) in a posteriori form in Theorem 3.1. For this purpose, we employ the (data-dependent) estimates in Lemma 2.5, and therefore define [cf. also (1.10)]

$$\begin{aligned} \pmb {C}_{D}(\phi )&:=\underbrace{V_{1}^{(1)}}_{C_{1,1}}\cdot \sup \limits _{{\mathbf {x}}\in {\mathbb {R}}^{L}}\Vert D \phi ({\mathbf {x}})\Vert _{{\mathbb {R}}^{L}},\\ \pmb {C}_{D^{2}}(\phi )&:=\underbrace{V_{1,\varepsilon _{1},\varepsilon _{2}}^{(2)}}_{C_{2,1}}\cdot \sup \limits _{{\mathbf {x}}\in {\mathbb {R}}^{L}}\Vert D \phi ({\mathbf {x}})\Vert _{{\mathbb {R}}^{L}}+\underbrace{V_{2}^{(1)}}_{C_{2,2}}\cdot \sup \limits _{{\mathbf {x}}\in {\mathbb {R}}^{L}}\Vert D^{2} \phi ({\mathbf {x}})\Vert _{{\mathbb {R}}^{L\times L}},\\ \pmb {C}_{D^{3}}(\phi )&:=\underbrace{V_{1,\varepsilon _{3}}^{(3)}}_{C_{3,1}}\cdot \sup \limits _{{\mathbf {x}}\in {\mathbb {R}}^{L}}\Vert D \phi ({\mathbf {x}})\Vert _{{\mathbb {R}}^{L}}+\underbrace{3V_{1}^{(1)}V_{1,\varepsilon _{1},\varepsilon _{2}}^{(2)}}_{C_{3,2}}\cdot \sup \limits _{{\mathbf {x}}\in {\mathbb {R}}^{L}}\Vert D^{2} \phi ({\mathbf {x}})\Vert _{{\mathbb {R}}^{L\times L}} \\&\quad + \underbrace{V_{1}^{(1)}\sqrt{V_{4}^{(1)}}}_{C_{3,3}} \cdot \sup \limits _{{\mathbf {x}}\in {\mathbb {R}}^{L}}\Vert D^{3} \phi ({\mathbf {x}})\Vert _{{\mathcal {L}}^{3}},\quad (\varepsilon _{1},\varepsilon _{2},\varepsilon _{3}>0). \end{aligned}$$

The following result estimates the weak error caused by \(\{ {\mathbf {Y}}^j\}_{j=0}^J\) from (1.3) on a mesh with local mesh sizes \(\{ {\tau }^{j+1}\}_{j=0}^{J-1}\) in terms of a computable a posteriori error estimator \({\mathfrak {G}}\equiv \{ {\mathfrak {G}}\bigl (\phi ;\tau ^{j+1},{\mathbf {Y}}^{j}\bigr )\}_{j=0}^{J-1}\).

Theorem 3.1

Assume (A1)–(A3). Let \(\{t_{j}\}_{j=0}^{J}\subset [0,T]\) be a mesh with local mesh sizes \(\{ {\tau }^{j+1}\}_{j=0}^{J-1}\). Let \(\{{\mathbf {Y}}^{j}\}_{j=0}^{J}\) solve (1.3). Then, we have

$$\begin{aligned} \max \limits _{0\le j \le J}\Big |{\mathbb {E}}\big [\phi ({\mathbf {X}}_{t_{j}})\big ] - {\mathbb {E}}\big [\phi ({\mathbf {Y}}^{j})\big ] \Big |\le \sum \limits _{j=0}^{J-1} \tau ^{j+1}{\mathfrak {G}}\bigl (\phi ;\tau ^{j+1},{\mathbf {Y}}^{j}\bigr ) , \end{aligned}$$
(3.1)

where the a posteriori error estimator \({\mathfrak {G}}\bigl (\phi ;\tau ^{j+1},{\mathbf {Y}}^{j}\bigr )\) is given by

$$\begin{aligned}&{\mathfrak {G}}\bigl (\phi ;\tau ^{j+1},{\mathbf {Y}}^{j}\bigr ):=\Bigg \{ \tfrac{3\pmb {C}_{D}(\phi )}{2} \cdot \pmb {\mathtt {E_{1}}}({\mathbf {Y}}^{j}) +\tfrac{\pmb {C}_{D^{2}}(\phi )}{2} \cdot \pmb {\mathtt {E_{2}}}({\mathbf {Y}}^{j}) + \tfrac{\pmb {C}_{D}(\phi )}{2}\cdot \pmb {\mathtt {E_{3}}}({\mathbf {Y}}^{j}) \\&\quad + \tfrac{\pmb {C}_{D}(\phi ) }{4} \cdot \pmb {\mathtt {E_{4}}}({\mathbf {Y}}^{j}) + \tfrac{ \pmb {C}_{D^{2}}(\phi ) }{2} \cdot \pmb {\mathtt {E_{5}}}({\mathbf {Y}}^{j}) + \tfrac{\pmb {C}_{D^{2}}(\phi ) }{2}\cdot \pmb {\mathtt {E_{6}}}({\mathbf {Y}}^{j}) + \tfrac{\pmb {C}_{D^{2}}(\phi ) }{4}\cdot \pmb {\mathtt {E_{7}}}({\mathbf {Y}}^{j}) \\&\quad + \tfrac{\pmb {C}_{D^{3}}(\phi ) }{2}\cdot \pmb {\mathtt {E_{8}}}({\mathbf {Y}}^{j}) +\pmb {C}_{D^{2}}(\phi ) \cdot \pmb {\mathtt {E_{9}}}({\mathbf {Y}}^{j}) + \tfrac{\pmb {C}_{D^{2}}(\phi ) C_{D\pmb {\sigma }}^{2} }{4} \cdot \pmb {\mathtt {E_{12}}}({\mathbf {Y}}^{j}) \Bigg \}\cdot \tau ^{j+1} \\&\quad + \Bigg \{\bigg \{ \pmb {C}_{D}(\phi ) C_{D^{2}{\mathbf {f}}}\cdot \sqrt{\pmb {\mathtt {E_{10}}}({\mathbf {Y}}^{j})} +\Big [\tfrac{\pmb {C}_{D}(\phi ) C_{D^{3}{\mathbf {f}}}}{2} + \pmb {C}_{D^{2}}(\phi ) C_{D^{2}{\mathbf {f}}}\Big ]\cdot \sqrt{\pmb {\mathtt {E_{11}}}({\mathbf {Y}}^{j})} \\&\quad +\pmb {C}_{D^{2}}(\phi ) C_{D^{2}\pmb {\sigma }}\cdot \sqrt{\pmb {\mathtt {E_{13}}}({\mathbf {Y}}^{j})} +\Big [\tfrac{\pmb {C}_{D^{2}}(\phi ) C_{D^{3}\pmb {\sigma }}}{2}+ \pmb {C}_{D^{3}}(\phi ) C_{D^{2}\pmb {\sigma }}\Big ] \\&\quad \cdot \sqrt{\pmb {\mathtt {E_{14}}}({\mathbf {Y}}^{j})}\bigg \}\cdot \sqrt{\tfrac{ \tau ^{j+1}}{15} \cdot \pmb {\mathtt {E_{10}}}({\mathbf {Y}}^{j}) + \tfrac{1}{8} \cdot \pmb {\mathtt {E_{12}}}({\mathbf {Y}}^{j})}\Bigg \}\cdot \big (\tau ^{j+1}\big )^{1.5} \\&\quad +\Bigg \{ \tfrac{\pmb {C}_{D^{2}}(\phi ) C_{D\pmb {\sigma }}^{2} }{6}\cdot \pmb {\mathtt {E_{10}}}({\mathbf {Y}}^{j}) + \tfrac{\pmb {C}_{D^{2}}(\phi ) }{2}\cdot \pmb {\mathtt {E_{15}}}({\mathbf {Y}}^{j})\Bigg \}\cdot \big (\tau ^{j+1}\big )^{2}, \end{aligned}$$

with \(\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}:=\bigl ({\mathbb {I}}+\tau ^{j+1}{{\mathscr {A}}}\bigr )^{-1}\), and computable terms

  1.

    \(\begin{aligned} \pmb {\mathtt {E_{1}}}({\mathbf {Y}}^{j}):={\mathbb {E}}\Big [ \big \Vert {{\mathscr {A}}}^{2} \pmb {{\bar{{{\mathscr {A}}}}}}^{j+1} {\mathbf {Y}}^{j}-{{\mathscr {A}}}\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1} {\mathbf {f}}({\mathbf {Y}}^{j}) \big \Vert _{{\mathbb {R}}^{L}} \Big ], \end{aligned}\)

  2.

    \(\begin{aligned} \pmb {\mathtt {E_{2}}}({\mathbf {Y}}^{j}):={\mathbb {E}}\left[ \sum \limits _{k=1}^{K}\big \Vert \pmb {{\bar{{{\mathscr {A}}}}}}^{j+1} \pmb {\sigma }_{k}({\mathbf {Y}}^{j})\big \Vert _{{\mathbb {R}}^{L}} \big \Vert {{\mathscr {A}}}\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1} \pmb {\sigma }_{k}({\mathbf {Y}}^{j})\big \Vert _{{\mathbb {R}}^{L}} \right] , \end{aligned}\)

  3.

    \(\begin{aligned} \pmb {\mathtt {E_{3}}}({\mathbf {Y}}^{j}):={\mathbb {E}}\Big [ \big \Vert {{\mathscr {A}}}\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1} {\mathbf {Y}}^{j}-\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1} {\mathbf {f}}({\mathbf {Y}}^{j}) \big \Vert _{{\mathbb {R}}^{L}} \Vert D {\mathbf {f}}({\mathbf {Y}}^{j})\Vert _{{\mathcal {L}}} \Big ], \end{aligned}\)

  4.

    \(\begin{aligned} \pmb {\mathtt {E_{4}}}({\mathbf {Y}}^{j}):={\mathbb {E}}\left[ \Vert D^{2} {\mathbf {f}}({\mathbf {Y}}^{j})\Vert _{{\mathcal {L}}^{2}} \cdot \sum \limits _{k=1}^{K} \big \Vert \pmb {{\bar{{{\mathscr {A}}}}}}^{j+1} \pmb {\sigma }_{k}({\mathbf {Y}}^{j}) \big \Vert _{{\mathbb {R}}^{L}}^{2} \right] , \end{aligned}\)

  5.

    \(\begin{aligned} \pmb {\mathtt {E_{5}}}({\mathbf {Y}}^{j}):={\mathbb {E}}\left[ \Vert D {\mathbf {f}}({\mathbf {Y}}^{j})\Vert _{{\mathcal {L}}} \cdot \sum \limits _{k=1}^{K}\big \Vert \pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\big \Vert _{{\mathbb {R}}^{L}}^{2} \right] , \end{aligned}\)

  6.

    \(\begin{aligned} \pmb {\mathtt {E_{6}}}({\mathbf {Y}}^{j}):\!=\!{\mathbb {E}}\left[ \big \Vert {{\mathscr {A}}}\pmb {{\bar{{{\mathscr {A}}}}}}^{j\!+\!1} {\mathbf {Y}}^{j}\!-\!\pmb {{\bar{{{\mathscr {A}}}}}}^{j\!+\!1} {\mathbf {f}}({\mathbf {Y}}^{j})\big \Vert _{{\mathbb {R}}^{L}} \cdot \sum \limits _{k=1}^{K}\Vert \pmb {\sigma }_{k}({\mathbf {Y}}^{j})\Vert _{{\mathbb {R}}^{L}} \Vert D \pmb {\sigma }_{k}({\mathbf {Y}}^{j})\Vert _{{\mathcal {L}}} \right] , \end{aligned}\)

  7.

    \(\begin{aligned} \pmb {\mathtt {E_{7}}}({\mathbf {Y}}^{j}):={\mathbb {E}}\left[ \sum \limits _{k=1}^{K}\Vert \pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{k}( {\mathbf {Y}}^{j})\Vert _{{\mathbb {R}}^{L}}^{2} \cdot \sum \limits _{k=1}^{K}\Vert \pmb {\sigma }_{k}({\mathbf {Y}}^{j})\Vert _{{\mathbb {R}}^{L}} \Vert D^{2}\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\Vert _{{\mathcal {L}}^{2}}\right] , \end{aligned}\)

  8.

    \(\begin{aligned} \pmb {\mathtt {E_{8}}}({\mathbf {Y}}^{j}):={\mathbb {E}}\left[ \sum \limits _{k=1}^{K}\Vert \pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\Vert _{{\mathbb {R}}^{L}}^{2} \cdot \sum \limits _{k=1}^{K} \Vert \pmb {\sigma }_{k}({\mathbf {Y}}^{j})\Vert _{{\mathbb {R}}^{L}} \Vert D \pmb {\sigma }_{k}({\mathbf {Y}}^{j})\Vert _{{\mathcal {L}}}\right] , \end{aligned}\)

  9.

    \(\begin{aligned} \pmb {\mathtt {E_{9}}}({\mathbf {Y}}^{j}):={\mathbb {E}}\left[ \sum \limits _{k=1}^{K}\big \Vert {{\mathscr {A}}}\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{k}({\mathbf {Y}}^{j}) \big \Vert _{{\mathbb {R}}^{L}}\Vert \pmb {\sigma }_{k}({\mathbf {Y}}^{j})\Vert _{{\mathbb {R}}^{L}}\right] , \end{aligned}\)

  10.

    \(\begin{aligned} \pmb {\mathtt {E_{10}}}({\mathbf {Y}}^{j}):={\mathbb {E}}\Big [ \big \Vert {{\mathscr {A}}}\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1} {\mathbf {Y}}^{j}-\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1} {\mathbf {f}}({\mathbf {Y}}^{j})\big \Vert _{{\mathbb {R}}^{L}}^{2} \Big ], \end{aligned}\)

  11.

    \(\begin{aligned} \pmb {\mathtt {E_{11}}}({\mathbf {Y}}^{j}):={\mathbb {E}}\left[ \Big |\sum \limits _{k=1}^{K} \big \Vert \pmb {{\bar{{{\mathscr {A}}}}}}^{j+1} \pmb {\sigma }_{k}({\mathbf {Y}}^{j})\big \Vert _{{\mathbb {R}}^{L}}^{2} \Big |^{2} \right] , \end{aligned}\)

  12.

    \(\begin{aligned} \pmb {\mathtt {E_{12}}}({\mathbf {Y}}^{j}):={\mathbb {E}}\left[ \sum \limits _{k=1}^{K} \big \Vert \pmb {{\bar{{{\mathscr {A}}}}}}^{j+1} \pmb {\sigma }_{k}({\mathbf {Y}}^{j})\big \Vert _{{\mathbb {R}}^{L}}^{2} \right] , \end{aligned}\)

  13.

    \(\begin{aligned} \pmb {\mathtt {E_{13}}}({\mathbf {Y}}^{j}):={\mathbb {E}}\left[ \big \Vert {{\mathscr {A}}}\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1} {\mathbf {Y}}^{j}-\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1} {\mathbf {f}}({\mathbf {Y}}^{j})\big \Vert _{{\mathbb {R}}^{L}}^{2} \Big |\sum \limits _{k=1}^{K}\Vert \pmb {\sigma }_{k}({\mathbf {Y}}^{j})\Vert _{{\mathbb {R}}^{L}}\Big |^{2} \right] , \end{aligned}\)

  14.

    \(\begin{aligned} \pmb {\mathtt {E_{14}}}({\mathbf {Y}}^{j}):={\mathbb {E}}\left[ \Big |\sum \limits _{k=1}^{K} \big \Vert \pmb {{\bar{{{\mathscr {A}}}}}}^{j+1} \pmb {\sigma }_{k}({\mathbf {Y}}^{j})\big \Vert _{{\mathbb {R}}^{L}}^{2}\Big |^{2} \Big |\sum \limits _{k=1}^{K} \Vert \pmb {\sigma }_{k}({\mathbf {Y}}^{j})\Vert _{{\mathbb {R}}^{L}} \Big |^{2} \right] , \end{aligned}\)

  15.

    \(\begin{aligned} \pmb {\mathtt {E_{15}}}({\mathbf {Y}}^{j}):={\mathbb {E}}\left[ \sum \limits _{k=1}^{K}\big \Vert {{\mathscr {A}}}\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1} \pmb {\sigma }_{k}({\mathbf {Y}}^{j})\big \Vert _{{\mathbb {R}}^{L}}^{2} \right] . \end{aligned}\)

Remark 3.1

\(\pmb {1.}\) For \({\mathbf {f}}\equiv 0\) and/or constant \(\pmb {\sigma }_{k}\) (\(k=1,\ldots ,K\)), the estimator \({\mathfrak {G}}\) simplifies considerably; also, Theorem 3.1 remains valid for ODE systems, i.e., for \(\pmb {\sigma }_{k}\equiv {\mathbf {0}}\) (\(k=1,\ldots ,K\)), where only the terms \(\pmb {\mathtt {E_{1}}}(\cdot )\), \(\pmb {\mathtt {E_{3}}}(\cdot )\) and \(\pmb {\mathtt {E_{10}}}(\cdot )\) constitute \({\mathfrak {G}}\).

For ODE systems (1.1) with \(\pmb {\sigma }_{k}\equiv {\mathbf {0}}\) (\(k=1,\ldots ,K\)), a different approach to derive a (residual-based) a posteriori estimate on a mesh \(\{ t_{j}\}_{j=0}^{J} \subset [0,T]\) for \(\Vert {\mathbf {X}}_T - {\mathbf {Y}}^{J}\Vert _{{{\mathbb {R}}}^L}\) is via duality methods [8], which exploit (strong stability properties of) the related adjoint equation; see also 2. below. A variational approach that avoids duality methods is given in [24], where an inherited ‘(discrete) energy dissipation’ property of the implicit discretization of (1.1) is used to bound \(\max _{1 \le j \le J} \Vert {\mathbf {X}}_{t_j} - {\mathbf {Y}}^{j}\Vert _{{{\mathbb {R}}}^L}\) for cases where the drift operator in (1.1) is the gradient of a convex functional. We also mention [29, Ch. 6], where (residual-based) a posteriori estimates are derived by variational methods for space-time discretizations of the more general (1.2) with \(\Sigma _{k} \equiv 0\) (\(k=1,\ldots ,K\)), where the drift operator need not be the gradient of a convex functional.

\(\pmb {2.}\) For finite element based discretizations of (linear elliptic, parabolic) PDEs \(A(u)=f\), residual-based a posteriori estimates are obtained in [8], where dual/adjoint problems are the relevant tool; their (global) stability properties may then be exploited to bound the error in terms of the residual \(\rho (u_{h})= A(u_{h}) - f\) of the computed solution \(u_{h}\), times a related stability constant. In later works, the dual problems involve functionals \(\phi \), and their solutions z are computed approximately and then enter as local weights \(\omega (z_{h})\) in the ‘duality-based weighted residual’ estimator of the form

$$\begin{aligned} \vert \phi (u) - \phi (u_{h})\vert \le \vert \bigl (\rho (u_{h}), \omega ({z_{h}})\bigr )\vert + \text {`}\text {higher order terms}\text {'} \end{aligned}$$

to sharpen computable error bounds; see the surveys [2, 11].

The derivation of a posteriori error estimate (3.1) for iterates of (1.3) uses the (backward) Kolmogorov equation (1.5) on \({[}0,t_{n+1}{]} \times {{\mathbb {R}}}^L\) for the transform \(u(t,{\mathbf {x}}) = {{\mathbb {E}}} \bigl [\phi ({\mathbf {X}}^{t,{\mathbf {x}}}_{t_{n+1}})\bigr ]\)—instead of an adjoint evolutionary problem on \((0, t_{n+1})\) that is motivated from optimal control: the works [23, 27] approximate derivatives of u to build local weights contained in \( \{\rho _{j}\}_{j=1}^{J}\) in (1.6), which is possible for small L; in this work, we use the global stability estimate (1.10) that leads to the a posteriori error estimator in (3.1) for iterates from (1.3), which is applicable to SDE (1.1) for arbitrary L.

\(\pmb {3.}\) In [23, 27], (asymptotic) a posteriori error expansions for the terminal time T are given for both random and (quasi-)deterministic meshes, while (3.1) bounds the error uniformly in time. The proof of Theorem 3.1 exploits that meshes \(\{ t_{j}\}_{j=0}^{J} \subset [0,T]\) are (quasi-)deterministic, e.g., by repeated use of Itô’s formula.

\(\pmb {4.}\) The weak Euler method replaces Wiener increments \(\{\Delta _{j+1} \beta _{k}\}_{j=0}^{J-1}\) in (1.3) by bounded, discrete random variables \(\{ {\widetilde{\xi }}^{j+1}_{k} \sqrt{\tau ^{j+1}}\}_{j=0}^{J-1}\) with approximate moments, see e.g. [18, p. 458]: for example, \({\mathbb P}\big [{\widetilde{\xi }}_k^{j+1} = \pm 1\big ] = \frac{1}{2}\) leads to iterates \(\{ \widetilde{\mathbf {Y}}^{j}\}_{j=0}^J\), and their ‘continuification’ \(\widetilde{\pmb {{\mathcal {Y}}}} \equiv \{ \widetilde{\pmb {{\mathcal {Y}}}}_t;\, t \in [0, T]\}\), given by

$$\begin{aligned} \widetilde{\pmb {{\mathcal {Y}}}}_{t}&= \widetilde{\mathbf {Y}}^{j} + \Bigl (\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}{\mathbf {f}}(\widetilde{\mathbf {Y}}^{j})-{{\mathscr {A}}}\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1} \widetilde{\mathbf {Y}}^{j} \Bigr )(t-t_{j}) \nonumber \\&\quad + \sum \limits _{k=1}^{K}\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{k}(\widetilde{\mathbf {Y}}^{j}) {\widetilde{\xi }}_k^{j+1} \sqrt{t-t_{j}} \quad \text {for all } \,\;t\in [t_{j},t_{j+1}]. \end{aligned}$$
(3.2)

The a posteriori weak error analysis then again starts from (1.8); however, Itô’s formula is no longer available in (1.9), and we proceed instead with the mean value theorem and Taylor’s formula,

$$\begin{aligned} {\mathbb {E}}\bigl [ u(t_{j+1}, \widetilde{\mathbf {Y}}^{j+1})- u(t_j, \widetilde{\mathbf {Y}}^{j}) \bigr ]&={\mathbb {E}}\bigl [ u(t_{j+1}, \widetilde{\mathbf {Y}}^{j+1})-u(t_{j}, \widetilde{\mathbf {Y}}^{j+1})\\&\quad +u(t_{j}, \widetilde{\mathbf {Y}}^{j+1}) - u(t_j, \widetilde{\mathbf {Y}}^{j}) \bigr ]\\&={\mathbb {E}}\Bigl [ \partial _{t} u(t^{*},\widetilde{\mathbf {Y}}^{j+1})\cdot \tau ^{j+1}+\big \langle D_{\mathbf {x}} u(t_{j}, \widetilde{\mathbf {Y}}^{j}), \widetilde{\mathbf {Y}}^{j+1} - \widetilde{\mathbf {Y}}^{j}\big \rangle _{{\mathbb {R}}^{L}}\\&\quad + \frac{1}{2}\mathrm {Tr}\Big (D_{{\mathbf {x}}}^{2}u(t_{j},\widetilde{\mathbf {Y}}^{*})\big (\widetilde{\mathbf {Y}}^{j+1}-\widetilde{\mathbf {Y}}^{j}\bigr )\bigl (\widetilde{\mathbf {Y}}^{j+1} - \widetilde{\mathbf {Y}}^{j}\bigr )^{\top }\Bigr )\Bigr ], \end{aligned}$$

where \(t^{*}\in (t_{j},t_{j+1})\) and \(\widetilde{\mathbf {Y}}^{*}:=\widetilde{\mathbf {Y}}^{j}+\Theta (\widetilde{\mathbf {Y}}^{j+1}-\widetilde{\mathbf {Y}}^{j})\) for some \(\Theta \in [0,1]\). A repeated use of (1.5) and (1.3) (in modified form) [cf. also (3.2)] then changes the proof of Theorem 3.1: no Malliavin calculus is needed for the weak Euler method (it is needed to handle the terms (3.9) and (3.29)); also, higher derivatives of u appear, and further computable terms constitute \(\{\widetilde{{\mathfrak {G}}}\bigl (\phi ;\tau ^{j+1},\widetilde{\mathbf {Y}}^{j},\widetilde{\mathbf {Y}}^{j+1}\bigr )\}_{j\ge 0}\) for (3.1).
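As a concrete illustration of this weak Euler variant, the following minimal sketch (ours, not code from the paper; the names `A`, `f`, `sigmas` and the generator `rng` are assumptions) advances one step of (1.3) with the Wiener increments replaced by the Rademacher variables \({\widetilde{\xi }}_{k}^{j+1}\):

```python
# Hedged sketch of one weak Euler step, cf. (3.2): the Wiener increment
# Delta beta_k is replaced by xi_k * sqrt(tau) with P[xi_k = +/-1] = 1/2.
# Assumed names: A is the L x L matrix, f the drift, sigmas a list of the
# K diffusion maps R^L -> R^L, rng a numpy random Generator.
import numpy as np

def weak_euler_step(Y, tau, A, f, sigmas, rng):
    L = Y.shape[0]
    xi = rng.choice([-1.0, 1.0], size=len(sigmas))           # discrete noise
    noise = np.sqrt(tau) * sum(x * s(Y) for x, s in zip(xi, sigmas))
    # semi-implicit step: solve (I + tau*A) Y_next = Y + tau*f(Y) + noise
    return np.linalg.solve(np.eye(L) + tau * A, Y + tau * f(Y) + noise)
```

Drawing Gaussian increments \(\Delta _{j+1}\beta _{k}\sim {\mathcal {N}}(0,\tau ^{j+1})\) in place of the Rademacher variables recovers scheme (1.3) itself.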

\(\pmb {5.}\) In practice, the terms \(\{\pmb {\mathtt {E}_{\ell }}(\cdot )\}_{\pmb {{\ell }}=1,\ldots ,15}\) may be approximated by a Monte Carlo method; see Sect. 5 for more details, and the sketch below.
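A minimal Monte-Carlo sketch (our illustration, not code from the paper) for the two simplest terms, \(\pmb {\mathtt {E_{1}}}\) and \(\pmb {\mathtt {E_{12}}}\); the ensemble `samples` of independent realizations of \({\mathbf {Y}}^{j}\), as well as the names `A`, `f` and `sigmas`, are assumptions, and the remaining terms are approximated analogously:

```python
# Hedged sketch: sample averages replace the expectations in E_1 and E_12
# of Theorem 3.1; `samples` has shape (M, L), one row per realization of Y^j.
import numpy as np

def estimate_E1_E12(samples, tau, A, f, sigmas):
    L = samples.shape[1]
    Abar = np.linalg.inv(np.eye(L) + tau * A)    # resolvent (I + tau*A)^{-1}
    E1 = np.mean([np.linalg.norm(A @ A @ (Abar @ Y) - A @ (Abar @ f(Y)))
                  for Y in samples])             # E[ ||A^2 Abar Y - A Abar f(Y)|| ]
    E12 = np.mean([sum(np.linalg.norm(Abar @ s(Y)) ** 2 for s in sigmas)
                   for Y in samples])            # E[ sum_k ||Abar sigma_k(Y)||^2 ]
    return E1, E12
```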

The representation of the a posteriori weak error estimator \({\mathfrak {G}}\equiv \{ {\mathfrak {G}}\bigl (\phi ;\tau ^{j+1},{\mathbf {Y}}^{j}\bigr )\}_{j=0}^{J-1}\) in (3.1) involves the (global in time and space) bounds \(\pmb {C}_{D^{\ell }}(\phi )\), \(\ell =1,2,3\) [cf. (1.10)], as well as the computable error terms \(\{\pmb {\mathtt {E}_{\ell }}({\mathbf {Y}}^{j})\}_{\pmb {{\ell }},\,j}\). The matrix \(\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\), which also arises in the representations of \(\{\pmb {\mathtt {E}_{\ell }}({\mathbf {Y}}^{j})\}_{\pmb {{\ell }}=1,\ldots ,15}\), results from the use of the semi-implicit Euler scheme (1.3).

The proof of Theorem 3.1 consists of several steps: Lemma 3.2 is based on the identity (1.8) in the introduction, where we represent the weak approximation error via (1.5). Lemma 3.3 then examines the first expectation on the right-hand side of (3.3), where only the drift term of (1.3) appears; in a similar manner, Lemma 3.4 examines the second expectation on the right-hand side of (3.3), where only the diffusion term is involved. The proof of Theorem 3.1 then follows by combining these lemmata. Note that similar concepts for the investigation of the weak approximation error with the explicit Euler scheme have been proposed in e.g. [28]; these are adapted here to the semi-implicit scheme (1.3). See also [6], where weak a priori error estimates are obtained for a time-implicit discretization of SPDE (1.2) with \(\pmb {\beta }\equiv {\mathbf {0}}\), and Remark 3.2.

Lemma 3.2

Assume (A1)–(A3). Let \(\{t_{j}\}_{j=0}^{J}\subset [0,T]\) be a mesh with local mesh sizes \(\{ {\tau }^{j+1}\}_{j=0}^{J-1}\), and let \(\{{\mathbf {Y}}^{j}\}_{j=0}^{J}\) solve (1.3). Then, for every \(n=0,\ldots ,J-1\), we have

$$\begin{aligned}&\Big |{\mathbb {E}}\big [\phi ({\mathbf {X}}_{t_{n+1}})\big ] - {\mathbb {E}}\big [\phi ({\mathbf {Y}}^{n+1})\big ] \Big |\nonumber \\&\quad \le \sum \limits _{j=0}^{n}\Bigg \{ \bigg |{\mathbb {E}}\bigg [\int _{t_{j}}^{t_{j+1}} \big \langle \pmb {{\bar{{{\mathscr {A}}}}}}^{j+1} {\mathbf {f}}({\mathbf {Y}}^{j})-{\mathbf {f}}(\pmb {{\mathcal {Y}}}_{s})+{{\mathscr {A}}}\pmb {{\mathcal {Y}}}_{s}\nonumber \\&\qquad - {{\mathscr {A}}}\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}{\mathbf {Y}}^{j} ,D_{{\mathbf {x}}}u(s,\pmb {{\mathcal {Y}}}_{s}) \big \rangle _{{\mathbb {R}}^{L}}\,\mathrm {d}s\bigg ] \bigg |\nonumber \\&\qquad +\frac{1}{2} \bigg |\sum \limits _{k=1}^{K}{\mathbb {E}}\bigg [ \int _{t_{j}}^{t_{j+1}} \mathrm {Tr}\Bigl ( \Big [ \pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\big [\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{k}({\mathbf {Y}}^{j}) \big ]^{\top } \nonumber \\&\qquad - \pmb {\sigma }_{k}(\pmb {{\mathcal {Y}}}_{s})\pmb {\sigma }_{k}^{\top }(\pmb {{\mathcal {Y}}}_{s}) \Big ] D^{2}_{{\mathbf {x}}}u(s,\pmb {{\mathcal {Y}}}_{s}) \Bigr ) \mathrm {d}s \bigg ]\bigg |\ \Bigg \}. \end{aligned}$$
(3.3)

Proof

Fix \(n=0,\ldots ,J-1\) and consider (1.5), where we replace T by \(t_{n+1}\). Note that under assumptions (A1)–(A3), the function \(u(t,{\mathbf {x}})={\mathbb {E}}\big [\phi ({\mathbf {X}}_{t_{n+1}}^{t,{\mathbf {x}}})\big ]\) belongs to \(C^{1,3}\bigl ( [0,t_{n+1}]\times {\mathbb {R}}^{L};{\mathbb {R}}\bigr )\), has bounded continuous derivatives w.r.t. the state variable and a continuous derivative w.r.t. time, and is the unique solution of (1.5); see e.g. [15, p. 366ff.]. Thus, putting \({\mathbf {x}}={\mathbf {Y}}^{n+1}(\omega )\) in the second equation in (1.5) on \([0,t_{n+1}]\times {\mathbb {R}}^{L}\), we immediately conclude that

$$\begin{aligned} {\mathbb {E}}\big [\phi ({\mathbf {X}}_{t_{n+1}})\big ]=u(0,{\mathbf {y}}) \quad \text {and} \quad {\mathbb {E}}\big [\phi ({\mathbf {Y}}^{n+1})\big ]={\mathbb {E}}\big [u(t_{n+1},{\mathbf {Y}}^{n+1})\big ]. \end{aligned}$$
(3.4)

Hence, applying (3.4), a first calculation yields (1.8). Since u is the unique solution of (1.5) on \([0,t_{n+1}]\times {\mathbb {R}}^{L}\), we apply Itô’s formula with u to the continuified process (1.7) in (1.8) to deduce

$$\begin{aligned}&{\mathbb {E}}\big [ u(t_{j+1},{\mathbf {Y}}^{j+1})-u(t_{j},{\mathbf {Y}}^{j})\big ]\nonumber \\&\quad = {\mathbb {E}}\bigg [ \int _{t_{j}}^{t_{j+1}}\partial _{s} u(s,\pmb {{\mathcal {Y}}}_{s}) + \big \langle -{{\mathscr {A}}}\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}{\mathbf {Y}}^{j}+ \pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}{\mathbf {f}}({\mathbf {Y}}^{j}),D_{{\mathbf {x}}}u(s,\pmb {{\mathcal {Y}}}_{s})\big \rangle _{{\mathbb {R}}^{L}}\nonumber \\&\qquad +\frac{1}{2}\sum \limits _{k=1}^{K}\mathrm {Tr}\Bigl (\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\big [\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\big ]^{\top } D_{{\mathbf {x}}}^{2}u(s,\pmb {{\mathcal {Y}}}_{s})\Bigr )\,\mathrm {d}s\bigg ]. \end{aligned}$$
(3.5)

Using (1.5) on \([0,t_{n+1}]\times {\mathbb {R}}^{L}\) to eliminate \(\partial _{s}u(s,\pmb {{\mathcal {Y}}}_{s})\) in (3.5) further leads to (1.9). Finally, combining (1.8) and (1.9) yields the assertion. \(\square \)

The next lemma examines the first expectation appearing on the right-hand side of (1.9).

Lemma 3.3

Suppose the setting in Lemma 3.2. Then we have

$$\begin{aligned}&\bigg |{\mathbb {E}}\bigg [ \int _{t_{j}}^{t_{j+1}} \big \langle \pmb {{\bar{{{\mathscr {A}}}}}}^{j+1} {\mathbf {f}}({\mathbf {Y}}^{j})-{\mathbf {f}}(\pmb {{\mathcal {Y}}}_{s})+{{\mathscr {A}}}\pmb {{\mathcal {Y}}}_{s}- {{\mathscr {A}}}\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}{\mathbf {Y}}^{j} ,D_{{\mathbf {x}}}u(s,\pmb {{\mathcal {Y}}}_{s}) \big \rangle _{{\mathbb {R}}^{L}}\,\mathrm {d}s\bigg ] \bigg |\nonumber \\&\quad \le \Bigg \{ \tfrac{3\pmb {C}_{D}(\phi )}{2} \cdot \pmb {\mathtt {E_{1}}}({\mathbf {Y}}^{j}) +\tfrac{\pmb {C}_{D^{2}}(\phi ) }{2} \cdot \pmb {\mathtt {E_{2}}}({\mathbf {Y}}^{j})+ \tfrac{\pmb {C}_{D}(\phi )}{2}\cdot \pmb {\mathtt {E_{3}}}({\mathbf {Y}}^{j}) + \tfrac{\pmb {C}_{D}(\phi ) }{4}\cdot \pmb {\mathtt {E_{4}}}({\mathbf {Y}}^{j}) \nonumber \\&\qquad + \tfrac{ \pmb {C}_{D^{2}}(\phi ) }{2} \cdot \pmb {\mathtt {E_{5}}}({\mathbf {Y}}^{j}) \Bigg \}\cdot \bigl (\tau ^{j+1}\bigr )^{2} \nonumber \\&\qquad + \Bigg \{\bigg \{ \pmb {C}_{D}(\phi ) C_{D^{2}{\mathbf {f}}}\cdot \sqrt{\pmb {\mathtt {E_{10}}}({\mathbf {Y}}^{j})} +\Big [\tfrac{\pmb {C}_{D}(\phi ) C_{D^{3}{\mathbf {f}}}}{2} + \pmb {C}_{D^{2}}(\phi ) C_{D^{2}{\mathbf {f}}}\Big ]\cdot \sqrt{\pmb {\mathtt {E_{11}}}({\mathbf {Y}}^{j})} \bigg \} \nonumber \\&\qquad \cdot \sqrt{\tfrac{\tau ^{j+1}}{15}\cdot \pmb {\mathtt {E_{10}}}({\mathbf {Y}}^{j}) + \tfrac{1}{8} \cdot \pmb {\mathtt {E_{12}}}({\mathbf {Y}}^{j})}\Bigg \}\cdot \bigl (\tau ^{j+1}\bigr )^{2.5}. \end{aligned}$$
(3.6)

In the proof of Lemma 3.3, we use Lemma 2.5 to (globally) bound the first and second derivative of u by \(\pmb {C}_{D}(\phi )\) and \(\pmb {C}_{D^{2}}(\phi )\), respectively. Besides some standard arguments, we also use Malliavin calculus techniques to verify the order \({{\mathcal {O}}}\bigl ((\tau ^{j+1})^{2}\bigr )\) of the leading term on the right-hand side of (3.6).

Proof

Using the fact that \(({\mathbb {I}}+\tau ^{j+1}{{\mathscr {A}}})\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}={\mathbb {I}}\), i.e., that for any \({\mathbf {v}}\in {\mathbb {R}}^{L}\) it holds that

$$\begin{aligned} \pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}{\mathbf {v}}={\mathbf {v}}-\tau ^{j+1}{{\mathscr {A}}}\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1} {\mathbf {v}}, \end{aligned}$$

we obtain in a first calculation (applying this identity to \({\mathbf {v}}={\mathbf {f}}({\mathbf {Y}}^{j})\) and \({\mathbf {v}}={\mathbf {Y}}^{j}\), and regrouping terms)

$$\begin{aligned}&\big \langle \pmb {{\bar{{{\mathscr {A}}}}}}^{j+1} {\mathbf {f}}({\mathbf {Y}}^{j})-{\mathbf {f}}(\pmb {{\mathcal {Y}}}_{s})+{{\mathscr {A}}}\pmb {{\mathcal {Y}}}_{s}- {{\mathscr {A}}}\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}{\mathbf {Y}}^{j} ,D_{{\mathbf {x}}}u(s,\pmb {{\mathcal {Y}}}_{s}) \big \rangle _{{\mathbb {R}}^{L}} \nonumber \\&\quad =\tau ^{j+1} \cdot \big \langle {{\mathscr {A}}}^{2}\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}{\mathbf {Y}}^{j}- {{\mathscr {A}}}\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}{\mathbf {f}}({\mathbf {Y}}^{j}) ,D_{{\mathbf {x}}}u(s,\pmb {{\mathcal {Y}}}_{s}) \big \rangle _{{\mathbb {R}}^{L}} \nonumber \\&\qquad + \big \langle {{\mathscr {A}}}\pmb {{\mathcal {Y}}}_{s}- {{\mathscr {A}}}{\mathbf {Y}}^{j} ,D_{{\mathbf {x}}}u(s,\pmb {{\mathcal {Y}}}_{s}) \big \rangle _{{\mathbb {R}}^{L}} \nonumber \\&\qquad + \big \langle {\mathbf {f}}({\mathbf {Y}}^{j})-{\mathbf {f}}(\pmb {{\mathcal {Y}}}_{s}),D_{{\mathbf {x}}}u(s,\pmb {{\mathcal {Y}}}_{s}) \big \rangle _{{\mathbb {R}}^{L}}. \end{aligned}$$
(3.7)

We use this identity and Lemma 2.5 to bound the left-hand side of (3.6),

$$\begin{aligned}&\bigg |{\mathbb {E}}\bigg [ \int _{t_{j}}^{t_{j+1}} \big \langle \pmb {{\bar{{{\mathscr {A}}}}}}^{j+1} {\mathbf {f}}({\mathbf {Y}}^{j})-{\mathbf {f}}(\pmb {{\mathcal {Y}}}_{s})+{{\mathscr {A}}}\pmb {{\mathcal {Y}}}_{s}- {{\mathscr {A}}}\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}{\mathbf {Y}}^{j} ,D_{{\mathbf {x}}}u(s,\pmb {{\mathcal {Y}}}_{s}) \big \rangle _{{\mathbb {R}}^{L}}\mathrm {d}s\bigg ] \bigg |\nonumber \\&\quad \le \pmb {C}_{D}(\phi )\cdot \pmb {\mathtt {E_{1}}}({\mathbf {Y}}^{j}) \cdot \bigl (\tau ^{j+1}\bigr )^{2} + \bigg |{\mathbb {E}}\bigg [ \int _{t_{j}}^{t_{j+1}} \big \langle {{\mathscr {A}}}\pmb {{\mathcal {Y}}}_{s}- {{\mathscr {A}}}{\mathbf {Y}}^{j} ,D_{{\mathbf {x}}}u(s,\pmb {{\mathcal {Y}}}_{s}) \big \rangle _{{\mathbb {R}}^{L}}\,\mathrm {d}s \bigg ] \bigg |\nonumber \\&\qquad + \bigg |{\mathbb {E}}\bigg [ \int _{t_{j}}^{t_{j+1}} \big \langle {\mathbf {f}}(\pmb {{\mathcal {Y}}}_{s})- {\mathbf {f}}({\mathbf {Y}}^{j}) ,D_{{\mathbf {x}}}u(s,\pmb {{\mathcal {Y}}}_{s}) \big \rangle _{{\mathbb {R}}^{L}}\mathrm {d}s \bigg ] \bigg |\nonumber \\&\quad =: {\varvec{I}}+{\varvec{{II}}}+{\varvec{{III}}}. \end{aligned}$$
(3.8)

We estimate the terms in (3.8) independently, starting with

Step 1: (Estimation of \({\varvec{III}}\)) (a) We apply Itô’s formula with \(\{f_{i};\, 1\le i \le L\}\) to (1.7) to get

$$\begin{aligned}&\bigg |{\mathbb {E}}\bigg [ \int _{t_{j}}^{t_{j+1}} \big \langle {\mathbf {f}}(\pmb {{\mathcal {Y}}}_{s})- {\mathbf {f}}({\mathbf {Y}}^{j}) ,D_{{\mathbf {x}}}u(s,\pmb {{\mathcal {Y}}}_{s}) \big \rangle _{{\mathbb {R}}^{L}}\mathrm {d}s \bigg ] \bigg |\\&\quad =\bigg |\sum \limits _{i=1}^{L}{\mathbb {E}}\bigg [\int _{t_{j}}^{t_{j+1}} \big \{f_{i}(\pmb {{\mathcal {Y}}}_{s})-f_{i}({\mathbf {Y}}^{j})\big \} \partial _{x_{i}}u(s,\pmb {{\mathcal {Y}}}_{s})\,\mathrm {d}s \bigg ] \bigg |\\&\quad \le \bigg |{\mathbb {E}}\bigg [ \int _{t_{j}}^{t_{j+1}} \int _{t_{j}}^{s} \big \langle D{\mathbf {f}}(\pmb {{\mathcal {Y}}}_{r})\bigl (-{{\mathscr {A}}}\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1} {\mathbf {Y}}^{j}+\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1} {\mathbf {f}}({\mathbf {Y}}^{j})\bigr ) ,D_{{\mathbf {x}}}u(s,\pmb {{\mathcal {Y}}}_{s})\big \rangle _{{\mathbb {R}}^{L}}\,\mathrm {d}r\,\mathrm {d}s \bigg ] \bigg |\\&\qquad + \frac{1}{2} \sum \limits _{k=1}^{K}\bigg |{\mathbb {E}}\bigg [ \int _{t_{j}}^{t_{j+1}} \int _{t_{j}}^{s} \big \langle D^{2}{\mathbf {f}}(\pmb {{\mathcal {Y}}}_{r})\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{k}({\mathbf {Y}}^{j}),\\&\quad \qquad D_{{\mathbf {x}}}u(s,\pmb {{\mathcal {Y}}}_{s})\big \rangle _{{\mathbb {R}}^{L}}\,\mathrm {d}r\,\mathrm {d}s \bigg ] \bigg |+ \pmb {M_{1}}, \end{aligned}$$

where

$$\begin{aligned} \pmb {M_{1}}&:= \bigg |\sum \limits _{i=1}^{L} \sum \limits _{k=1}^{K} {\mathbb {E}}\bigg [ \int _{t_{j}}^{t_{j+1}} \int _{t_{j}}^{s}\big \langle Df_{i}(\pmb {{\mathcal {Y}}}_{r}),\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\,\mathrm {d}\beta _{k}(r)\big \rangle _{{\mathbb {R}}^{L}}\,\mathrm {d}s\cdot \partial _{x_{i}}u(s,\pmb {{\mathcal {Y}}}_{s}) \bigg ] \bigg |\nonumber \\&=:\bigg |\sum \limits _{i=1}^{L} M_{1,i} \bigg |. \end{aligned}$$
(3.9)

We add and subtract \(D{\mathbf {f}}({\mathbf {Y}}^{j})\) and \(D^{2}{\mathbf {f}}({\mathbf {Y}}^{j})\) as integrands in order to arrive at computable terms of optimal order. This step leads to the additional terms

$$\begin{aligned} \pmb {K_{1}}&:=\bigg |{\mathbb {E}}\bigg [ \int _{t_{j}}^{t_{j+1}} \int _{t_{j}}^{s} \big \langle \big \{D{\mathbf {f}}(\pmb {{\mathcal {Y}}}_{r})-D{\mathbf {f}}({\mathbf {Y}}^{j}) \big \}\bigl (-{{\mathscr {A}}}\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1} {\mathbf {Y}}^{j}\\&\quad +\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1} {\mathbf {f}}({\mathbf {Y}}^{j})\bigr ) ,D_{{\mathbf {x}}}u(s,\pmb {{\mathcal {Y}}}_{s})\big \rangle _{{\mathbb {R}}^{L}}\,\mathrm {d}r\,\mathrm {d}s \bigg ] \bigg |\end{aligned}$$

and

$$\begin{aligned} \pmb {K_{2}}&:=\sum \limits _{k=1}^{K}\bigg |{\mathbb {E}}\bigg [ \int _{t_{j}}^{t_{j+1}} \int _{t_{j}}^{s} \big \langle \big \{D^{2}{\mathbf {f}}(\pmb {{\mathcal {Y}}}_{r})-D^{2}{\mathbf {f}}({\mathbf {Y}}^{j}) \big \}\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{k}({\mathbf {Y}}^{j}),\\&\quad D_{{\mathbf {x}}}u(s,\pmb {{\mathcal {Y}}}_{s})\big \rangle _{{\mathbb {R}}^{L}}\,\mathrm {d}r\,\mathrm {d}s \bigg ] \bigg |, \end{aligned}$$

which will be estimated in (c) below. Thus, we obtain

$$\begin{aligned}&\bigg |{\mathbb {E}}\bigg [ \int _{t_{j}}^{t_{j+1}} \big \langle {\mathbf {f}}(\pmb {{\mathcal {Y}}}_{s})- {\mathbf {f}}({\mathbf {Y}}^{j}) ,D_{{\mathbf {x}}}u(s,\pmb {{\mathcal {Y}}}_{s}) \big \rangle _{{\mathbb {R}}^{L}}\mathrm {d}s \bigg ] \bigg |\\&\quad \le \bigg |{\mathbb {E}}\bigg [ \int _{t_{j}}^{t_{j+1}} \int _{t_{j}}^{s} \big \langle D{\mathbf {f}}({\mathbf {Y}}^{j})\bigl (-{{\mathscr {A}}}\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1} {\mathbf {Y}}^{j}+\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1} {\mathbf {f}}({\mathbf {Y}}^{j})\bigr ) ,D_{{\mathbf {x}}}u(s,\pmb {{\mathcal {Y}}}_{s})\big \rangle _{{\mathbb {R}}^{L}}\,\mathrm {d}r\,\mathrm {d}s \bigg ] \bigg |\\&\qquad + \frac{1}{2} \sum \limits _{k=1}^{K}\bigg |{\mathbb {E}}\bigg [ \int _{t_{j}}^{t_{j+1}} \int _{t_{j}}^{s} \big \langle D^{2}{\mathbf {f}}({\mathbf {Y}}^{j})\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{k}({\mathbf {Y}}^{j}),\\&\qquad \qquad D_{{\mathbf {x}}}u(s,\pmb {{\mathcal {Y}}}_{s})\big \rangle _{{\mathbb {R}}^{L}}\,\mathrm {d}r\,\mathrm {d}s \bigg ] \bigg |+\pmb {M_{1}}+\pmb {K_{1}}+\frac{1}{2}\pmb {K_{2}}. \end{aligned}$$

We apply the Cauchy–Schwarz inequality, Lemma 2.5 (i), and some standard calculations to obtain

$$\begin{aligned}&\bigg |{\mathbb {E}}\bigg [ \int _{t_{j}}^{t_{j+1}} \big \langle {\mathbf {f}}(\pmb {{\mathcal {Y}}}_{s})- {\mathbf {f}}({\mathbf {Y}}^{j}) ,D_{{\mathbf {x}}}u(s,\pmb {{\mathcal {Y}}}_{s}) \big \rangle _{{\mathbb {R}}^{L}}\mathrm {d}s \bigg ] \bigg |\nonumber \\&\quad \le \Bigg \{ \tfrac{\pmb {C}_{D}(\phi )}{2}\cdot \pmb {\mathtt {E_{3}}}({\mathbf {Y}}^{j}) + \tfrac{\pmb {C}_{D}(\phi ) }{4}\cdot \pmb {\mathtt {E_{4}}}({\mathbf {Y}}^{j})\Bigg \}\cdot \bigl (\tau ^{j+1}\bigr )^{2}+ \pmb {M_{1}} +\pmb {K_{1}}+\frac{1}{2}\pmb {K_{2}}. \end{aligned}$$
(3.10)

We estimate the terms \(\pmb {M_{1}},\pmb {K_{1}},\pmb {K_{2}}\) independently in parts (b) and (c).

(b) We consider \(\pmb {M_{1}}\) in (3.9): for its successful treatment, we use tools from Malliavin calculus. For \(i=1,\ldots ,L\), we have

$$\begin{aligned} M_{1,i} = \sum \limits _{k=1}^{K} \sum \limits _{l=1}^{L}{\mathbb {E}}\left[ \int _{t_{j}}^{t_{j+1}} \partial _{x_{i}}u(s,\pmb {{\mathcal {Y}}}_{s}) \int _{t_{j}}^{s} \partial _{x_{l}} f_{i}(\pmb {{\mathcal {Y}}}_{r}) \bigl (\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\bigr )_{l} \mathrm {d}\beta _{k}(r) \,\mathrm {d}s\right] \end{aligned}$$
(3.11)

Since \(\partial _{x_{i}}u(s,\pmb {{\mathcal {Y}}}_{s}) \in {\mathbb {D}}^{1,2}\) (see e.g. [1, 25]), we apply the Clark–Ocone formula in Lemma 2.1 to \(\partial _{x_{i}}u(s,\pmb {{\mathcal {Y}}}_{s})\), to get

$$\begin{aligned} \partial _{x_{i}}u(s,\pmb {{\mathcal {Y}}}_{s})={\mathbb {E}}\big [ \partial _{x_{i}}u(s,\pmb {{\mathcal {Y}}}_{s})\big ] + \sum \limits _{k=1}^{K}\int _{t_{j}}^{s} {\mathbb {E}}\Big [\mathrm {D}_{r}^{(k)}\bigl (\partial _{x_{i}}u(s,\pmb {{\mathcal {Y}}}_{s})\bigr ) \big |{\mathcal {F}}_{r} \Big ]\,\mathrm {d}\beta _{k}(r), \end{aligned}$$
(3.12)

where \(\mathrm {D}_{r}^{(k)}\) denotes the Malliavin derivative w.r.t. \(\beta _{k}(r)\). Applying (2.1) to \(\mathrm {D}_{r}^{(k)}\big (\partial _{x_{i}}u(s,\pmb {{\mathcal {Y}}}_{s})\big )\) further leads to

$$\begin{aligned} \mathrm {D}_{r}^{(k)}\bigl (\partial _{x_{i}}u(s,\pmb {{\mathcal {Y}}}_{s})\bigr )=\sum \limits _{m=1}^{L} \partial _{x_{m}}\bigl (\partial _{x_{i}}u(s,\pmb {{\mathcal {Y}}}_{s})\bigr )\big (\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\bigr )_{m}. \end{aligned}$$
(3.13)

Now, inserting (3.12) into (3.11) and using the fact that the expectation of a stochastic integral w.r.t. the Wiener process is zero, we conclude

$$\begin{aligned} M_{1,i}&= \sum \limits _{k=1}^{K}\sum \limits _{l=1}^{L}{\mathbb {E}}\bigg [ \int _{t_{j}}^{t_{j+1}} \int _{t_{j}}^{s}{\mathbb {E}}\Big [\mathrm {D}_{r}^{(k)}\bigl (\partial _{x_{i}}u(s,\pmb {{\mathcal {Y}}}_{s})\bigr ) \big |{\mathcal {F}}_{r} \Big ]\,\mathrm {d}\beta _{k}(r)\nonumber \\&\quad \int _{t_{j}}^{s} \partial _{x_{l}} f_{i}(\pmb {{\mathcal {Y}}}_{r}) \bigl (\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\bigr )_{l}\, \mathrm {d}\beta _{k}(r) \,\mathrm {d}s\bigg ] \nonumber \\&=\sum \limits _{k=1}^{K} \sum \limits _{l=1}^{L} \int _{t_{j}}^{t_{j+1}} {\mathbb {E}}\bigg [\int _{t_{j}}^{s} {\mathbb {E}}\Big [\mathrm {D}_{r}^{(k)}\bigl (\partial _{x_{i}}u(s,\pmb {{\mathcal {Y}}}_{s})\bigr ) \big |{\mathcal {F}}_{r} \Big ]\partial _{x_{l}} f_{i}(\pmb {{\mathcal {Y}}}_{r}) \bigl (\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\bigr )_{l}\,\mathrm {d}r\bigg ]\,\mathrm {d}s , \end{aligned}$$
(3.14)

where in the last step we used a (generalized) Itô isometry argument; see e.g. [12, p. 135, Thm. 4.2.3]. An application of the tower property (law of total expectation) in (3.14), together with (3.13), then leads to

$$\begin{aligned} \pmb {M_{1}}&=\left|\sum \limits _{k=1}^{K}\sum \limits _{i,l=1}^{L}\int _{t_{j}}^{t_{j+1}}\int _{t_{j}}^{s} {\mathbb {E}}\left[ \left( \sum \limits _{m=1}^{L} \partial _{x_{m}}\bigl (\partial _{x_{i}}u(s,\pmb {{\mathcal {Y}}}_{s})\bigr )\bigl (\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\bigr )_{m}\right) \partial _{x_{l}} f_{i}(\pmb {{\mathcal {Y}}}_{r}) \bigl (\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\bigr )_{l}\right] \,\mathrm {d}r\,\mathrm {d}s\right|\\&=\left|\sum \limits _{k=1}^{K} {\mathbb {E}}\left[ \int _{t_{j}}^{t_{j+1}} \int _{t_{j}}^{s} \mathrm {Tr}\Bigl ( D^{2}_{{\mathbf {x}}}u(s,\pmb {{\mathcal {Y}}}_{s})D{\mathbf {f}}(\pmb {{\mathcal {Y}}}_{r}) \pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{k}({\mathbf {Y}}^{j}) \big [\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\big ]^{\top } \Bigr )\,\mathrm {d}r\,\mathrm {d}s\right] \right|. \end{aligned}$$

In the next step, we also add and subtract \(D{\mathbf {f}}({\mathbf {Y}}^{j})\) in the second argument, to then obtain

$$\begin{aligned} \pmb {M_{1}}&\le \bigg |\sum \limits _{k=1}^{K} {\mathbb {E}}\bigg [ \int _{t_{j}}^{t_{j+1}} \int _{t_{j}}^{s} \mathrm {Tr}\Bigl ( D^{2}_{{\mathbf {x}}}u(s,\pmb {{\mathcal {Y}}}_{s})D{\mathbf {f}}({\mathbf {Y}}^{j}) \pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{k}({\mathbf {Y}}^{j}) \big [\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\big ]^{\top } \Bigr )\,\mathrm {d}r\,\mathrm {d}s\bigg ] \bigg |\\&\quad +\pmb {K_{3}} \end{aligned}$$

where

$$\begin{aligned} \pmb {K_{3}}&:=\bigg |\sum \limits _{k=1}^{K} {\mathbb {E}}\bigg [ \int _{t_{j}}^{t_{j+1}} \int _{t_{j}}^{s} \mathrm {Tr}\Bigl ( D^{2}_{{\mathbf {x}}}u(s,\pmb {{\mathcal {Y}}}_{s})\big \{D{\mathbf {f}}(\pmb {{\mathcal {Y}}}_{r})\\&\quad - D{\mathbf {f}}({\mathbf {Y}}^{j})\big \} \pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{k}({\mathbf {Y}}^{j}) \big [\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\big ]^{\top } \Bigr )\,\mathrm {d}r\,\mathrm {d}s\bigg ] \bigg |. \end{aligned}$$

Using Lemma 2.5 (ii), and \(\mathrm {Tr}({\mathbf {B}}{\mathbf {v}}{\mathbf {w}}^{\top })\le \Vert {\mathbf {B}} \Vert _{{\mathbb {R}}^{L\times L}} \Vert {\mathbf {v}} \Vert _{{\mathbb {R}}^{L}} \Vert {\mathbf {w}} \Vert _{{\mathbb {R}}^{L}}\) for any \({\mathbf {B}}\in {\mathbb {R}}^{L\times L}\), \({\mathbf {v}},{\mathbf {w}}\in {\mathbb {R}}^{L}\), consequently leads to

$$\begin{aligned} \pmb {M_{1}}\le \tfrac{\pmb {C}_{D^{2}}(\phi ) }{2}\cdot \pmb {\mathtt {E_{5}}}({\mathbf {Y}}^{j})\cdot \bigl (\tau ^{j+1}\bigr )^{2} + \pmb {K_{3}}. \end{aligned}$$
(3.15)

(c) We show that the terms \(\pmb {K_{1}},\pmb {K_{2}},\pmb {K_{3}}\) are the higher order terms in (3.10) and (3.15), which account for the difference between \(\pmb {{\mathcal {Y}}}\) and \(\{{\mathbf {Y}}^{j}\}_{j\ge 0}\). Since the treatment of \(\pmb {K_{1}},\pmb {K_{2}}\) and \(\pmb {K_{3}}\) is similar, we only consider \(\pmb {K_{1}}\) in detail. We start with a standard estimate: for \(s\in [t_{j},t_{j+1}]\) and \(r\in [t_{j},s]\), the continuified process (1.7) satisfies

$$\begin{aligned} {\mathbb {E}}\Big [\Vert \pmb {{\mathcal {Y}}}_{r}-{\mathbf {Y}}^{j}\Vert _{{\mathbb {R}}^{L}}^{2}\Big ]&\le \pmb {\mathtt {E_{10}}}({\mathbf {Y}}^{j})\cdot (r-t_{j})^{2} +\pmb {\mathtt {E_{12}}}({\mathbf {Y}}^{j})\cdot (r-t_{j}). \end{aligned}$$
(3.16)
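Indeed, in analogy to (3.2), the continuification (1.7) of (1.3) gives, for \(r\in [t_{j},s]\),

$$\begin{aligned} \pmb {{\mathcal {Y}}}_{r}-{\mathbf {Y}}^{j} = \bigl (\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}{\mathbf {f}}({\mathbf {Y}}^{j})-{{\mathscr {A}}}\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}{\mathbf {Y}}^{j}\bigr )(r-t_{j}) + \sum \limits _{k=1}^{K}\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\bigl (\beta _{k}(r)-\beta _{k}(t_{j})\bigr ); \end{aligned}$$

since the Wiener increments are centered, mutually independent, and independent of \({\mathcal {F}}_{t_{j}}\), all mixed terms vanish in expectation, and squaring yields exactly the two contributions \(\pmb {\mathtt {E_{10}}}({\mathbf {Y}}^{j})\cdot (r-t_{j})^{2}\) and \(\pmb {\mathtt {E_{12}}}({\mathbf {Y}}^{j})\cdot (r-t_{j})\) in (3.16).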

Using

$$\begin{aligned} \Vert D{\mathbf {f}}(\pmb {{\mathcal {Y}}}_{r})-D{\mathbf {f}}({\mathbf {Y}}^{j})\Vert _{{\mathcal {L}}}\le C_{D^{2}{\mathbf {f}}}\cdot \Vert \pmb {{\mathcal {Y}}}_{r}-{\mathbf {Y}}^{j}\Vert _{{\mathbb {R}}^{L}}, \end{aligned}$$

the Cauchy–Schwarz inequality, and Lemma 2.5 (i), we get

$$\begin{aligned} \pmb {K_{1}} \le \pmb {C}_{D}(\phi ) C_{D^{2}{\mathbf {f}}}\cdot \sqrt{\pmb {\mathtt {E_{10}}}({\mathbf {Y}}^{j})}\cdot \Biggl ( \tau ^{j+1}\int _{t_{j}}^{t_{j+1}}(s-t_{j})\int _{t_{j}}^{s} {\mathbb {E}}\Big [\Vert \pmb {{\mathcal {Y}}}_{r}-{\mathbf {Y}}^{j}\Vert _{{\mathbb {R}}^{L}}^{2}\Big ]\,\mathrm {d}r\,\mathrm {d}s\Biggr )^{\tfrac{1}{2}}. \end{aligned}$$

Using (3.16) consequently leads to

$$\begin{aligned} \pmb {K_{1}}\le \pmb {C}_{D}(\phi ) C_{D^{2}{\mathbf {f}}}\cdot \sqrt{\pmb {\mathtt {E_{10}}}({\mathbf {Y}}^{j})}\cdot \sqrt{\tfrac{\tau ^{j+1}}{15}\cdot \pmb {\mathtt {E_{10}}}({\mathbf {Y}}^{j}) + \tfrac{1}{8} \cdot \pmb {\mathtt {E_{12}}}({\mathbf {Y}}^{j})}\cdot \bigl (\tau ^{j+1}\bigr )^{2.5}. \end{aligned}$$
(3.17)
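For the reader's convenience, the constants \(\tfrac{1}{15}\) and \(\tfrac{1}{8}\) stem from elementary integrals: inserting (3.16) gives

$$\begin{aligned} \int _{t_{j}}^{t_{j+1}}(s-t_{j})\int _{t_{j}}^{s} {\mathbb {E}}\Big [\Vert \pmb {{\mathcal {Y}}}_{r}-{\mathbf {Y}}^{j}\Vert _{{\mathbb {R}}^{L}}^{2}\Big ]\,\mathrm {d}r\,\mathrm {d}s \le \tfrac{\bigl (\tau ^{j+1}\bigr )^{5}}{15}\cdot \pmb {\mathtt {E_{10}}}({\mathbf {Y}}^{j}) + \tfrac{\bigl (\tau ^{j+1}\bigr )^{4}}{8}\cdot \pmb {\mathtt {E_{12}}}({\mathbf {Y}}^{j}), \end{aligned}$$

so that multiplying by \(\tau ^{j+1}\) and taking the square root produces the factor \(\sqrt{\tfrac{\tau ^{j+1}}{15}\cdot \pmb {\mathtt {E_{10}}}({\mathbf {Y}}^{j}) + \tfrac{1}{8} \cdot \pmb {\mathtt {E_{12}}}({\mathbf {Y}}^{j})}\cdot \bigl (\tau ^{j+1}\bigr )^{2.5}\) in (3.17).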

In a similar way, we obtain

$$\begin{aligned} \pmb {K_{2}}\le \pmb {C}_{D}(\phi ) C_{D^{3}{\mathbf {f}}}\cdot \sqrt{\pmb {\mathtt {E_{11}}}({\mathbf {Y}}^{j})}\cdot \sqrt{\tfrac{\tau ^{j+1}}{15}\cdot \pmb {\mathtt {E_{10}}}({\mathbf {Y}}^{j}) + \tfrac{1}{8} \cdot \pmb {\mathtt {E_{12}}}({\mathbf {Y}}^{j})}\cdot \bigl (\tau ^{j+1}\bigr )^{2.5} \end{aligned}$$
(3.18)

and

$$\begin{aligned} \pmb {K_{3}}\le \pmb {C}_{D^{2}}(\phi ) C_{D^{2}{\mathbf {f}}}\cdot \sqrt{\pmb {\mathtt {E_{11}}}({\mathbf {Y}}^{j})}\cdot \sqrt{\tfrac{\tau ^{j+1}}{15}\cdot \pmb {\mathtt {E_{10}}}({\mathbf {Y}}^{j}) + \tfrac{1}{8} \cdot \pmb {\mathtt {E_{12}}}({\mathbf {Y}}^{j})}\cdot \bigl (\tau ^{j+1}\bigr )^{2.5}. \end{aligned}$$
(3.19)

Step 2: (Estimation of \({\varvec{II}}\)) Similar arguments as used for \({\varvec{III}}\) in Step 1 give the bound

$$\begin{aligned} {\varvec{II}}&\le \bigg \{ \tfrac{\pmb {C}_{D}(\phi )}{2}\cdot \pmb {\mathtt {E_{1}}}({\mathbf {Y}}^{j}) +\tfrac{\pmb {C}_{D^{2}}(\phi )}{2} \cdot \pmb {\mathtt {E_{2}}}({\mathbf {Y}}^{j}) \bigg \}\cdot \bigl (\tau ^{j+1}\bigr )^{2}. \end{aligned}$$
(3.20)

Step 3: (Finishing the proof) We combine (3.17), (3.18), (3.19) and (3.15) with (3.10) and plug the resulting expression as well as (3.20) into (3.8), which proves the assertion. \(\square \)

We now bound the last sum on the right-hand side of (3.3).

Lemma 3.4

Suppose the setting in Lemma 3.2. Then we have

$$\begin{aligned}&\bigg |\sum \limits _{k=1}^{K}{\mathbb {E}}\bigg [ \int _{t_{j}}^{t_{j+1}} \mathrm {Tr}\Bigl ( \Big [ \pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\big [\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{k}({\mathbf {Y}}^{j}) \big ]^{\top } - \pmb {\sigma }_{k}(\pmb {{\mathcal {Y}}}_{s})\pmb {\sigma }_{k}^{\top }(\pmb {{\mathcal {Y}}}_{s}) \Big ] D^{2}_{{\mathbf {x}}}u(s,\pmb {{\mathcal {Y}}}_{s}) \Bigr ) \mathrm {d}s \bigg ]\bigg |\nonumber \\&\quad \le \Bigg \{ \pmb {C}_{D^{2}}(\phi ) \cdot \pmb {\mathtt {E_{6}}}({\mathbf {Y}}^{j}) + \tfrac{\pmb {C}_{D^{2}}(\phi ) }{2}\cdot \pmb {\mathtt {E_{7}}}({\mathbf {Y}}^{j}) + \pmb {C}_{D^{3}}(\phi ) \cdot \pmb {\mathtt {E_{8}}}({\mathbf {Y}}^{j}) \nonumber \\&\qquad +2\pmb {C}_{D^{2}}(\phi ) \cdot \pmb {\mathtt {E_{9}}}({\mathbf {Y}}^{j}) + \pmb {C}_{D^{2}}(\phi ) C_{D\pmb {\sigma }}^{2} \cdot \Big [ \tfrac{ \tau ^{j+1}}{3}\cdot \pmb {\mathtt {E_{10}}}({\mathbf {Y}}^{j}) + \tfrac{1}{2} \cdot \pmb {\mathtt {E_{12}}}({\mathbf {Y}}^{j})\Big ]\Bigg \}\cdot \bigl (\tau ^{j+1}\bigr )^{2}\nonumber \\&\qquad +\Bigg \{\bigg \{2 \pmb {C}_{D^{2}}(\phi ) C_{D^{2}\pmb {\sigma }}\cdot \sqrt{\pmb {\mathtt {E_{13}}}({\mathbf {Y}}^{j})} +\Big [ \pmb {C}_{D^{2}}(\phi ) C_{D^{3}\pmb {\sigma }} + 2\pmb {C}_{D^{3}}(\phi ) C_{D^{2}\pmb {\sigma }}\Big ]\cdot \sqrt{\pmb {\mathtt {E_{14}}}({\mathbf {Y}}^{j})} \bigg \} \nonumber \\&\quad \qquad \cdot \sqrt{\tfrac{ \tau ^{j+1}}{15}\cdot \pmb {\mathtt {E_{10}}}({\mathbf {Y}}^{j}) + \tfrac{1}{8} \cdot \pmb {\mathtt {E_{12}}}({\mathbf {Y}}^{j})}\Bigg \}\cdot \bigl (\tau ^{j+1}\bigr )^{2.5} \nonumber \\&\qquad + \pmb {C}_{D^{2}}(\phi ) \cdot \pmb {\mathtt {E_{15}}}({\mathbf {Y}}^{j})\cdot \bigl (\tau ^{j+1}\bigr )^{3}. \end{aligned}$$
(3.21)

Proof

Similarly to (3.7), we start with a straightforward calculation. For \(k=1,\ldots ,K\), we have

$$\begin{aligned}&\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\big [\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{k}({\mathbf {Y}}^{j}) \big ]^{\top } - \pmb {\sigma }_{k}(\pmb {{\mathcal {Y}}}_{s})\pmb {\sigma }_{k}^{\top }(\pmb {{\mathcal {Y}}}_{s})\\&\quad =\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\pmb {\sigma }_{k}^{\top }({\mathbf {Y}}^{j})-\tau ^{j+1}\cdot \pmb {\sigma }_{k}({\mathbf {Y}}^{j})\big [{{\mathscr {A}}}\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{k}({\mathbf {Y}}^{j}) \big ]^{\top }\\&\qquad - \tau ^{j+1}\cdot {{\mathscr {A}}}\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\pmb {\sigma }_{k}^{\top }({\mathbf {Y}}^{j}) \\&\qquad + \bigl (\tau ^{j+1}\bigr )^{2}\cdot {{\mathscr {A}}}\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\big [{{\mathscr {A}}}\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{k}({\mathbf {Y}}^{j}) \big ]^{\top } - \pmb {\sigma }_{k}(\pmb {{\mathcal {Y}}}_{s})\pmb {\sigma }_{k}^{\top }(\pmb {{\mathcal {Y}}}_{s}). \end{aligned}$$

This yields

$$\begin{aligned}&\mathrm {Tr}\biggl ( \Big [ \pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\big [\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{k}({\mathbf {Y}}^{j}) \big ]^{\top } - \pmb {\sigma }_{k}(\pmb {{\mathcal {Y}}}_{s})\pmb {\sigma }_{k}^{\top }(\pmb {{\mathcal {Y}}}_{s}) \Big ]D^{2}_{{\mathbf {x}}}u(s,\pmb {{\mathcal {Y}}}_{s}) \biggr ) \nonumber \\&\quad =\mathrm {Tr}\Bigl ( \big [\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\pmb {\sigma }_{k}^{\top }({\mathbf {Y}}^{j}) - \pmb {\sigma }_{k}(\pmb {{\mathcal {Y}}}_{s})\pmb {\sigma }_{k}^{\top }(\pmb {{\mathcal {Y}}}_{s}) \big ] D^{2}_{{\mathbf {x}}}u(s,\pmb {{\mathcal {Y}}}_{s}) \Bigr ) \nonumber \\&\qquad - \biggl ( \mathrm {Tr}\Bigl (\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\big [{{\mathscr {A}}}\pmb {\bar{{\mathscr {A}}}}^{j+1}\pmb {\sigma }_{k}({\mathbf {Y}}^{j}) \big ]^{\top } D^{2}_{{\mathbf {x}}}u(s,\pmb {{\mathcal {Y}}}_{s}) \Bigr ) \nonumber \\&\qquad + \mathrm {Tr}\Bigl ({{\mathscr {A}}}\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\pmb {\sigma }_{k}^{\top }({\mathbf {Y}}^{j}) D^{2}_{{\mathbf {x}}}u(s,\pmb {{\mathcal {Y}}}_{s}) \Bigr )\biggr )\cdot \tau ^{j+1} \nonumber \\&\qquad + \mathrm {Tr}\Bigl ({{\mathscr {A}}}\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\big [{{\mathscr {A}}}\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{k}({\mathbf {Y}}^{j}) \big ]^{\top } D^{2}_{{\mathbf {x}}}u(s,\pmb {{\mathcal {Y}}}_{s}) \Bigr ) \cdot \bigl (\tau ^{j+1}\bigr )^{2}. \end{aligned}$$
(3.22)

We set

$$\begin{aligned} \pmb {T_{1}}:=\bigg |\sum \limits _{k=1}^{K} {\mathbb {E}}\bigg [ \int _{t_{j}}^{t_{j+1}} \mathrm {Tr}\Bigl ( \big [\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\pmb {\sigma }_{k}^{\top }({\mathbf {Y}}^{j}) - \pmb {\sigma }_{k}(\pmb {{\mathcal {Y}}}_{s})\pmb {\sigma }_{k}^{\top }(\pmb {{\mathcal {Y}}}_{s}) \big ] D^{2}_{{\mathbf {x}}}u(s,\pmb {{\mathcal {Y}}}_{s}) \Bigr )\,\mathrm {d}s\bigg ] \bigg |\end{aligned}$$

and plug (3.22) into the left-hand side of (3.21) to deduce

$$\begin{aligned}&\bigg |\sum \limits _{k=1}^{K}{\mathbb {E}}\bigg [ \int _{t_{j}}^{t_{j+1}} \mathrm {Tr}\Bigl ( \Big [ \pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\big [\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{k}({\mathbf {Y}}^{j}) \big ]^{\top } - \pmb {\sigma }_{k}(\pmb {{\mathcal {Y}}}_{s})\pmb {\sigma }_{k}^{\top }(\pmb {{\mathcal {Y}}}_{s}) \Big ] D^{2}_{{\mathbf {x}}}u(s,\pmb {{\mathcal {Y}}}_{s}) \Bigr ) \mathrm {d}s \bigg ]\bigg |\\&\quad \le \bigg |\sum \limits _{k=1}^{K} {\mathbb {E}}\bigg [ \int _{t_{j}}^{t_{j+1}} \mathrm {Tr}\Bigl (\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\big [{{\mathscr {A}}}\pmb {\bar{{\mathscr {A}}}}^{j+1}\pmb {\sigma }_{k}({\mathbf {Y}}^{j}) \big ]^{\top } D^{2}_{{\mathbf {x}}}u(s,\pmb {{\mathcal {Y}}}_{s}) \\&\qquad + {{\mathscr {A}}}\pmb {\bar{{\mathscr {A}}}}^{j+1}\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\pmb {\sigma }_{k}^{\top }({\mathbf {Y}}^{j}) D^{2}_{{\mathbf {x}}}u(s,\pmb {{\mathcal {Y}}}_{s}) \Bigr )\,\mathrm {d}s\bigg ] \bigg |\cdot \tau ^{j+1}\\&\qquad + \bigg |\sum \limits _{k=1}^{K} {\mathbb {E}}\bigg [ \int _{t_{j}}^{t_{j+1}} \mathrm {Tr}\Bigl ({{\mathscr {A}}}\pmb {\bar{{\mathscr {A}}}}^{j+1}\pmb {\sigma }_{k}({\mathbf {Y}}^{j}) \big [{{\mathscr {A}}}\pmb {\bar{{\mathscr {A}}}}^{j+1}\pmb {\sigma }_{k}({\mathbf {Y}}^{j}) \big ]^{\top } D^{2}_{{\mathbf {x}}}u(s,\pmb {{\mathcal {Y}}}_{s}) \Bigr )\,\mathrm {d}s\bigg ]\bigg |\\&\qquad \cdot \bigl (\tau ^{j+1}\bigr )^{2} +\pmb {T_{1}}=: {\varvec{IV}}+{\varvec{V}}+\pmb {T_{1}}. \end{aligned}$$

Step 1: (Estimation of \({\varvec{IV}}\)) Some standard calculations and Lemma 2.5 (ii) lead to

$$\begin{aligned} {\varvec{IV}}\le 2\pmb {C}_{D^{2}}(\phi ) \cdot \pmb {\mathtt {E_{9}}}({\mathbf {Y}}^{j}) \cdot \bigl (\tau ^{j+1}\bigr )^{2}. \end{aligned}$$
(3.23)

Step 2: (Estimation of \({\varvec{V}}\)) Again, standard calculations and Lemma 2.5 (ii) lead to

$$\begin{aligned} {\varvec{V}}\le \pmb {C}_{D^{2}}(\phi ) \cdot \pmb {\mathtt {E_{15}}}({\mathbf {Y}}^{j}) \cdot \bigl (\tau ^{j+1}\bigr )^{3}. \end{aligned}$$
(3.24)

Step 3: (Estimation of \(\pmb {T_{1}}\)) (a) For \(k=1,\ldots ,K\), we add and subtract \(\pmb {\sigma }_{k}({\mathbf {Y}}^{j}) \pmb {\sigma }_{k}^{\top }(\pmb {{\mathcal {Y}}}_{s})\) and use \(\mathrm {Tr}({\mathbf {B}})=\mathrm {Tr}({\mathbf {B}}^{\top })\) for \({\mathbf {B}}\in {\mathbb {R}}^{L\times L}\) to obtain

$$\begin{aligned}&\mathrm {Tr}\Bigl ( \big [ \pmb {\sigma }_{k}({\mathbf {Y}}^{j})\pmb {\sigma }_{k}^{\top }({\mathbf {Y}}^{j}) - \pmb {\sigma }_{k}(\pmb {{\mathcal {Y}}}_{s})\pmb {\sigma }_{k}^{\top }(\pmb {{\mathcal {Y}}}_{s}) \big ] D^{2}_{{\mathbf {x}}}u(s,\pmb {{\mathcal {Y}}}_{s}) \Bigr )\nonumber \\&\quad = 2\cdot \mathrm {Tr}\Bigl (\big [\pmb {\sigma }_{k}({\mathbf {Y}}^{j}) -\pmb {\sigma }_{k}(\pmb {{\mathcal {Y}}}_{s}) \big ] \pmb {\sigma }_{k}^{\top }({\mathbf {Y}}^{j})D^{2}_{{\mathbf {x}}}u(s,\pmb {{\mathcal {Y}}}_{s}) \Bigr ) \nonumber \\&\qquad - \mathrm {Tr}\Bigl (\big [ \pmb {\sigma }_{k}(\pmb {{\mathcal {Y}}}_{s})-\pmb {\sigma }_{k}({\mathbf {Y}}^{j}) \big ] \big [\pmb {\sigma }_{k}(\pmb {{\mathcal {Y}}}_{s})-\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\big ]^{\top } D^{2}_{{\mathbf {x}}}u(s,\pmb {{\mathcal {Y}}}_{s}) \Bigr ). \end{aligned}$$
(3.25)

Plugging (3.25) into \(\pmb {T_{1}}\) immediately leads to

$$\begin{aligned} \pmb {T_{1}}\le 2\cdot \pmb {T_{1,a}} + \pmb {T_{1,b}}, \end{aligned}$$
(3.26)

where

$$\begin{aligned} \pmb {T_{1,a}}:=\left|\sum \limits _{k=1}^{K} {\mathbb {E}}\left[ \int _{t_{j}}^{t_{j+1}} \mathrm {Tr}\Bigl (\big [\pmb {\sigma }_{k}({\mathbf {Y}}^{j})-\pmb {\sigma }_{k}(\pmb {{\mathcal {Y}}}_{s})\big ] \pmb {\sigma }_{k}^{\top }({\mathbf {Y}}^{j})D^{2}_{{\mathbf {x}}}u(s,\pmb {{\mathcal {Y}}}_{s})\Bigr )\,\mathrm {d}s\right] \right|\end{aligned}$$

and

$$\begin{aligned} \pmb {T_{1,b}}:=\bigg |\sum \limits _{k=1}^{K}{\mathbb {E}}\bigg [ \int _{t_{j}}^{t_{j+1}} \mathrm {Tr}\Bigl (\big [\pmb {\sigma }_{k}(\pmb {{\mathcal {Y}}}_{s}) - \pmb {\sigma }_{k}({\mathbf {Y}}^{j})\big ] \big [\pmb {\sigma }_{k}(\pmb {{\mathcal {Y}}}_{s}) - \pmb {\sigma }_{k}({\mathbf {Y}}^{j})\big ]^{\top } D^{2}_{{\mathbf {x}}}u(s,\pmb {{\mathcal {Y}}}_{s})\Bigr )\,\mathrm {d}s\bigg ] \bigg |. \end{aligned}$$

(b) We estimate \(\pmb {T_{1,b}}\) with the help of Lemma 2.5 (ii),

$$\begin{aligned} \pmb {T_{1,b}}\le \pmb {C}_{D^{2}}(\phi )\cdot {\mathbb {E}}\left[ \int _{t_{j}}^{t_{j+1}} \sum \limits _{k=1}^{K}\Vert \pmb {\sigma }_{k}(\pmb {{\mathcal {Y}}}_{s})-\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\Vert _{{\mathbb {R}}^{L}}^{2}\,\mathrm {d}s\right] . \end{aligned}$$

Assumption (A2) and (3.16) then lead to

$$\begin{aligned} \pmb {T_{1,b}}\le \pmb {C}_{D^{2}}(\phi )C_{D\pmb {\sigma }}^{2} \cdot \bigg \{ \tfrac{\tau ^{j+1}}{3}\cdot \pmb {\mathtt {E_{10}}}({\mathbf {Y}}^{j})+\tfrac{1}{2}\cdot \pmb {\mathtt {E_{12}}}({\mathbf {Y}}^{j}) \bigg \}\cdot \bigl (\tau ^{j+1}\bigr )^{2}. \end{aligned}$$
(3.27)
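Here the factors \(\tfrac{1}{3}\) and \(\tfrac{1}{2}\) result from integrating (3.16) over \([t_{j},t_{j+1}]\):

$$\begin{aligned} \int _{t_{j}}^{t_{j+1}} {\mathbb {E}}\Big [\Vert \pmb {{\mathcal {Y}}}_{s}-{\mathbf {Y}}^{j}\Vert _{{\mathbb {R}}^{L}}^{2}\Big ]\,\mathrm {d}s \le \Big (\tfrac{\tau ^{j+1}}{3}\cdot \pmb {\mathtt {E_{10}}}({\mathbf {Y}}^{j}) + \tfrac{1}{2}\cdot \pmb {\mathtt {E_{12}}}({\mathbf {Y}}^{j})\Big )\cdot \bigl (\tau ^{j+1}\bigr )^{2}. \end{aligned}$$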

(c) Next, we consider the expression \(\pmb {T_{1,a}}\). We apply Itô’s formula with \(\{\sigma _{k}^{(i)};\,1\le k \le K,1\le i\le L\}\) to (1.7) and use standard arguments to get

$$\begin{aligned} \pmb {T_{1,a}}&=\bigg |\sum \limits _{k=1}^{K}{\mathbb {E}}\bigg [ \int _{t_{j}}^{t_{j+1}} \sum \limits _{i=1}^{L} \big \langle D_{{\mathbf {x}}}\partial _{x_{i}} u(s,\pmb {{\mathcal {Y}}}_{s}),\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\big \rangle _{{\mathbb {R}}^{L}}\big \{\sigma _{k}^{(i)}(\pmb {{\mathcal {Y}}}_{s})-\sigma _{k}^{(i)}({\mathbf {Y}}^{j})\big \}\,\mathrm {d}s\bigg ] \bigg |\nonumber \\&\le \pmb {T_{1,a,1}}+\pmb {T_{1,a,2}}+\pmb {M_{2}}, \end{aligned}$$
(3.28)

where

$$\begin{aligned} \pmb {T_{1,a,1}}&:=\bigg |\sum \limits _{k=1}^{K}{\mathbb {E}}\bigg [ \int _{t_{j}}^{t_{j+1}} \sum \limits _{i=1}^{L} \big \langle D_{{\mathbf {x}}}\partial _{x_{i}} u(s,\pmb {{\mathcal {Y}}}_{s}),\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\big \rangle _{{\mathbb {R}}^{L}} \\&\quad \cdot \int _{t_{j}}^{s}\big \langle \pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}{\mathbf {f}}({\mathbf {Y}}^{j})-{{\mathscr {A}}}\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}{\mathbf {Y}}^{j},D\sigma _{k}^{(i)}(\pmb {{\mathcal {Y}}}_{r})\big \rangle _{{\mathbb {R}}^{L}}\,\mathrm {d}r\,\mathrm {d}s\bigg ]\bigg |\\&= \bigg |\sum \limits _{k=1}^{K}{\mathbb {E}}\bigg [ \int _{t_{j}}^{t_{j+1}}\int _{t_{j}}^{s} \big \langle D^{2}_{{\mathbf {x}}}u(s,\pmb {{\mathcal {Y}}}_{s})\pmb {\sigma }_{k}({\mathbf {Y}}^{j}),\\&\quad D\pmb {\sigma }_{k}(\pmb {{\mathcal {Y}}}_{r})\bigl ({{\mathscr {A}}}\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}{\mathbf {Y}}^{j}-\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}{\mathbf {f}}({\mathbf {Y}}^{j})\bigr )\big \rangle _{{\mathbb {R}}^{L}}\,\mathrm {d}r\,\mathrm {d}s\bigg ] \bigg |, \end{aligned}$$
$$\begin{aligned} \pmb {T_{1,a,2}}&:=\frac{1}{2} \bigg |\sum \limits _{k,l=1}^{K} {\mathbb {E}}\bigg [ \int _{t_{j}}^{t_{j+1}} \sum \limits _{i=1}^{L} \big \langle D_{{\mathbf {x}}}\partial _{x_{i}} u(s,\pmb {{\mathcal {Y}}}_{s}),\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\big \rangle _{{\mathbb {R}}^{L}} \\&\quad \cdot \int _{t_{j}}^{s}\mathrm {Tr}\Bigl ( \pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{l}({\mathbf {Y}}^{j})\big [\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{l}({\mathbf {Y}}^{j}) \big ]^{\top } D^{2}\sigma _{k}^{(i)}(\pmb {{\mathcal {Y}}}_{r})\Bigr )\,\mathrm {d}r\,\mathrm {d}s\bigg ]\bigg |\\&=\frac{1}{2} \bigg |\sum \limits _{k,l=1}^{K} {\mathbb {E}}\bigg [ \int _{t_{j}}^{t_{j+1}} \int _{t_{j}}^{s} \big \langle D^{2}_{{\mathbf {x}}}u(s,\pmb {{\mathcal {Y}}}_{s})\pmb {\sigma }_{k}({\mathbf {Y}}^{j}),\\&\quad D^{2}\pmb {\sigma }_{k}(\pmb {{\mathcal {Y}}}_{r})\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{l}({\mathbf {Y}}^{j})\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{l}({\mathbf {Y}}^{j}) \big \rangle _{{\mathbb {R}}^{L}}\,\mathrm {d}r\,\mathrm {d}s \bigg ] \bigg |\end{aligned}$$

and

$$\begin{aligned} \pmb {M_{2}}&:=\bigg |\sum \limits _{k,l=1}^{K}{\mathbb {E}}\bigg [ \int _{t_{j}}^{t_{j+1}} \sum \limits _{i=1}^{L} \big \langle D_{{\mathbf {x}}}\partial _{x_{i}} u(s,\pmb {{\mathcal {Y}}}_{s}),\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\big \rangle _{{\mathbb {R}}^{L}}\nonumber \\&\quad \int _{t_{j}}^{s}\big \langle D\sigma _{k}^{(i)}(\pmb {{\mathcal {Y}}}_{r}),\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{l}({\mathbf {Y}}^{j})\,\mathrm {d}\beta _{l}(r)\big \rangle _{{\mathbb {R}}^{L}}\,\mathrm {d}s\bigg ]\bigg |\nonumber \\&= \bigg |\sum \limits _{k,l=1}^{K}{\mathbb {E}}\bigg [ \int _{t_{j}}^{t_{j+1}} \int _{t_{j}}^{s} \big \langle D^{2}_{{\mathbf {x}}}u(s,\pmb {{\mathcal {Y}}}_{s})\pmb {\sigma }_{k}({\mathbf {Y}}^{j}),\nonumber \\&\quad D\pmb {\sigma }_{k}(\pmb {{\mathcal {Y}}}_{r})\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{l}({\mathbf {Y}}^{j})\,\mathrm {d}\beta _{l}(r)\big \rangle _{{\mathbb {R}}^{L}}\,\mathrm {d}s\bigg ]\bigg |. \end{aligned}$$
(3.29)

Almost the same arguments as were used for the treatment of (3.10) in Lemma 3.3 (that is, generating additional higher order terms to move closer to computable terms) give

$$\begin{aligned} \pmb {T_{1,a,1}}\le \tfrac{\pmb {C}_{D^{2}}(\phi )}{2}\cdot \pmb {\mathtt {E_{6}}}({\mathbf {Y}}^{j})\cdot \bigl (\tau ^{j+1}\bigr )^{2}+\pmb {K_{4}} \end{aligned}$$
(3.30)

and

$$\begin{aligned} \pmb {T_{1,a,2}}\le \tfrac{\pmb {C}_{D^{2}}(\phi )}{4}\cdot \pmb {\mathtt {E_{7}}}({\mathbf {Y}}^{j})\cdot \bigl (\tau ^{j+1}\bigr )^{2}+\pmb {K_{5}}, \end{aligned}$$
(3.31)

where

$$\begin{aligned} \pmb {K_{4}}&:=\bigg |\sum \limits _{k=1}^{K}{\mathbb {E}}\bigg [ \int _{t_{j}}^{t_{j+1}}\int _{t_{j}}^{s} \big \langle D^{2}_{{\mathbf {x}}}u(s,\pmb {{\mathcal {Y}}}_{s})\pmb {\sigma }_{k}({\mathbf {Y}}^{j}),\\&\quad \bigl (D\pmb {\sigma }_{k}(\pmb {{\mathcal {Y}}}_{r})-D\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\bigr )\bigl ({{\mathscr {A}}}\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}{\mathbf {Y}}^{j}-\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}{\mathbf {f}}({\mathbf {Y}}^{j})\bigr ) \big \rangle _{{\mathbb {R}}^{L}}\,\mathrm {d}r\,\mathrm {d}s\bigg ] \bigg |\end{aligned}$$

and

$$\begin{aligned} \pmb {K_{5}}&: =\frac{1}{2}\bigg |\sum \limits _{k,l=1}^{K} {\mathbb {E}}\bigg [ \int _{t_{j}}^{t_{j+1}} \int _{t_{j}}^{s} \big \langle D^{2}_{{\mathbf {x}}}u(s,\pmb {{\mathcal {Y}}}_{s})\pmb {\sigma }_{k}({\mathbf {Y}}^{j}),\\&\quad \bigl (D^{2}\pmb {\sigma }_{k}(\pmb {{\mathcal {Y}}}_{r})- D^{2}\pmb {\sigma }_{k}({\mathbf {Y}}^{j}) \bigr )\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{l}({\mathbf {Y}}^{j})\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{l}({\mathbf {Y}}^{j}) \big \rangle _{{\mathbb {R}}^{L}}\,\mathrm {d}r\,\mathrm {d}s \bigg ] \bigg |\end{aligned}$$

are higher order terms.

(d) Next, we consider the expression \(\pmb {M_{2}}\). Our approach here closely parallels that in Lemma 3.3, where tools from Malliavin calculus were used for an appropriate treatment of \(\pmb {M_{1}}\); we therefore skip most of the details. We obtain

$$\begin{aligned} \pmb {M_{2}}&= \bigg |\sum \limits _{k,l=1}^{K} {\mathbb {E}}\bigg [ \int _{t_{j}}^{t_{j+1}} \int _{t_{j}}^{s} \mathrm {Tr}\Bigl (D^{3}_{{\mathbf {x}}}u(s,\pmb {{\mathcal {Y}}}_{s})\pmb {\sigma }_{k}({\mathbf {Y}}^{j})D\pmb {\sigma }_{k}(\pmb {{\mathcal {Y}}}_{r})\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{l}({\mathbf {Y}}^{j})\big [\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{l}({\mathbf {Y}}^{j})\big ]^\top \Bigr )\mathrm {d}r \mathrm {d}s \bigg ] \bigg |\\&\le \bigg |\sum \limits _{k,l=1}^{K} {\mathbb {E}}\bigg [ \int _{t_{j}}^{t_{j+1}} \int _{t_{j}}^{s} \mathrm {Tr}\Bigl (D^{3}_{{\mathbf {x}}}u(s,\pmb {{\mathcal {Y}}}_{s})\pmb {\sigma }_{k}({\mathbf {Y}}^{j})D\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{l}({\mathbf {Y}}^{j})\big [\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{l}({\mathbf {Y}}^{j})\big ]^\top \Bigr )\mathrm {d}r \mathrm {d}s \bigg ] \bigg |\nonumber \\&\quad +\pmb {K_{6}}, \end{aligned}$$

where

$$\begin{aligned} \pmb {K_{6}}&:=\bigg |\sum \limits _{k,l=1}^{K} {\mathbb {E}}\bigg [ \int _{t_{j}}^{t_{j+1}} \int _{t_{j}}^{s} \mathrm {Tr}\Bigl (D^{3}_{{\mathbf {x}}}u(s,\pmb {{\mathcal {Y}}}_{s})\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\{D\pmb {\sigma }_{k}(\pmb {{\mathcal {Y}}}_{r})\\&\quad -D\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\}\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{l}({\mathbf {Y}}^{j})\big [\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\pmb {\sigma }_{l}({\mathbf {Y}}^{j})\big ]^\top \Bigr )\mathrm {d}r \mathrm {d}s \bigg ] \bigg |\end{aligned}$$

is again an additional higher order term, which results from adding and subtracting \( D\pmb {\sigma }_{k}({\mathbf {Y}}^{j})\), \(k=1,\ldots ,K\), in order to obtain an almost fully computable leading order term.

Calculations similar to those used before, together with Lemma 2.5 (iii), yield

$$\begin{aligned} \pmb {M_{2}}\le \tfrac{\pmb {C}_{D^{3}}(\phi ) }{2}\cdot \pmb {\mathtt {E_{8}}}({\mathbf {Y}}^{j}) \cdot \bigl (\tau ^{j+1}\bigr )^{2} +\pmb {K_{6}}. \end{aligned}$$
(3.32)

(e) Hence, plugging (3.30), (3.31) and (3.32) into (3.28) yields

$$\begin{aligned} \pmb {T_{1,a}}&\le \Bigg \{\tfrac{\pmb {C}_{D^{2}}(\phi )}{2}\cdot \pmb {\mathtt {E_{6}}}({\mathbf {Y}}^{j}) + \tfrac{\pmb {C}_{D^{2}}(\phi )}{4}\cdot \pmb {\mathtt {E_{7}}}({\mathbf {Y}}^{j}) + \tfrac{\pmb {C}_{D^{3}}(\phi ) }{2}\cdot \pmb {\mathtt {E_{8}}}({\mathbf {Y}}^{j}) \Bigg \}\nonumber \\&\quad \cdot \bigl (\tau ^{j+1}\bigr )^{2} + \pmb {K_{4}}+\pmb {K_{5}}+\pmb {K_{6}}. \end{aligned}$$
(3.33)

(f) Plugging further (3.27) and (3.33) into (3.26) yields

$$\begin{aligned} \pmb {T_{1}}&\le \Bigg \{\pmb {C}_{D^{2}}(\phi )\cdot \pmb {\mathtt {E_{6}}}({\mathbf {Y}}^{j}) + \tfrac{\pmb {C}_{D^{2}}(\phi )}{2}\cdot \pmb {\mathtt {E_{7}}}({\mathbf {Y}}^{j}) + \pmb {C}_{D^{3}}(\phi ) \cdot \pmb {\mathtt {E_{8}}}({\mathbf {Y}}^{j}) \nonumber \\&\quad +\pmb {C}_{D^{2}}(\phi )C_{D\pmb {\sigma }}^{2}\cdot \bigg \{ \tfrac{\tau ^{j+1}}{3}\cdot \pmb {\mathtt {E_{10}}}({\mathbf {Y}}^{j}) +\tfrac{1}{2}\cdot \pmb {\mathtt {E_{12}}}({\mathbf {Y}}^{j}) \bigg \}\Bigg \} \cdot \bigl (\tau ^{j+1}\bigr )^{2}\nonumber \\&\quad +2\pmb {K_{4}}+2\pmb {K_{5}}+2\pmb {K_{6}}. \end{aligned}$$
(3.34)

(g) It remains to examine the terms \(\pmb {K_{4}}, \pmb {K_{5}}\) and \(\pmb {K_{6}}\), and to show that they are indeed higher order terms. A treatment similar to that of the higher order terms \(\pmb {K_{1}}, \pmb {K_{2}}, \pmb {K_{3}}\) in Lemma 3.3 yields

$$\begin{aligned} \pmb {K_{4}}&\le \pmb {C}_{D^{2}}(\phi ) C_{D^{2}\pmb {\sigma }}\cdot \sqrt{\pmb {\mathtt {E_{13}}}({\mathbf {Y}}^{j})}\cdot \sqrt{\tfrac{\tau ^{j+1}}{15}\cdot \pmb {\mathtt {E_{10}}}({\mathbf {Y}}^{j}) + \tfrac{1}{8} \cdot \pmb {\mathtt {E_{12}}}({\mathbf {Y}}^{j})}\cdot \bigl (\tau ^{j+1}\bigr )^{2.5}, \end{aligned}$$
(3.35)
$$\begin{aligned} \pmb {K_{5}}&\le \tfrac{\pmb {C}_{D^{2}}(\phi ) C_{D^{3}\pmb {\sigma }}}{2}\cdot \sqrt{\pmb {\mathtt {E_{14}}}({\mathbf {Y}}^{j})}\cdot \sqrt{\tfrac{\tau ^{j+1}}{15}\cdot \pmb {\mathtt {E_{10}}}({\mathbf {Y}}^{j}) + \tfrac{1}{8} \cdot \pmb {\mathtt {E_{12}}}({\mathbf {Y}}^{j})}\cdot \bigl (\tau ^{j+1}\bigr )^{2.5} \end{aligned}$$
(3.36)

and

$$\begin{aligned} \pmb {K_{6}}\le \pmb {C}_{D^{3}}(\phi ) C_{D^{2}\pmb {\sigma }}\cdot \sqrt{\pmb {\mathtt {E_{14}}}({\mathbf {Y}}^{j})}\cdot \sqrt{\tfrac{\tau ^{j+1}}{15}\cdot \pmb {\mathtt {E_{10}}}({\mathbf {Y}}^{j}) + \tfrac{1}{8} \cdot \pmb {\mathtt {E_{12}}}({\mathbf {Y}}^{j})}\cdot \bigl (\tau ^{j+1}\bigr )^{2.5}. \end{aligned}$$
(3.37)

Step 4: (Finishing the proof) We combine (3.35), (3.36) and (3.37) with (3.34), and then combine the resulting expression with (3.23) and (3.24), which proves the assertion. \(\square \)

Next, with the help of Lemma 2.6, we show convergence with optimal weak order \({\mathcal O}(\tau )\) for the a posteriori error estimator in (3.1) on a mesh with maximum mesh size \(\tau >0\); thanks to Theorem 3.1, this then also bounds the weak error of \(\{{\mathbf {Y}}^{j}\}_{j=0}^{J}\) from (1.3).

Theorem 3.5

Assume (A1)–(A3). Let \(\{{\mathbf {Y}}^{j}\}_{j=0}^{J}\) solve (1.3) on a mesh \(\{t_{j}\}_{j=0}^{J}\subset [0,T]\) with local mesh sizes \(\{ {\tau }^{j+1}\}_{j=0}^{J-1}\) and maximum mesh size \(\tau = \max _j \tau ^{j+1}\). Then, there exists \(\pmb {C}\equiv \pmb {C}(\phi )>0\) independent of L, such that

$$\begin{aligned} \sum \limits _{j=0}^{J-1} \tau ^{j+1}{\mathfrak {G}}\bigl (\phi ;\tau ^{j+1},{\mathbf {Y}}^{j}\bigr )\ \le \pmb {C}\cdot \tau . \end{aligned}$$

Remark 3.2

\(\pmb {1.}\) The work [7] derives a weak a priori error estimate for the linear stochastic heat equation with additive noise, where the analysis exploits the representation formula for the mild solution, and a transformation of it to another process which solves a further SPDE without drift term and additive noise. In contrast, the weak a priori error analysis in [6] for SPDE (1.2) with \(\pmb {\beta } = {\mathbf {0}}\) requires Malliavin calculus to efficiently estimate additionally appearing stochastic integral terms due to the nonlinearities F and \(\Sigma \)—which are of similar type as \(\pmb {M_{1}}\) in (3.9) and \(\pmb {M_{2}}\) in (3.29) appearing here. The a posteriori error analysis to verify Theorem 3.1 with estimators \(\{{\mathfrak {G}}\bigl (\phi ;\tau ^{j+1},{\mathbf {Y}}^{j}\bigr )\}_{j\ge 0}\) also exploits Malliavin calculus, and eventually enables Theorem 3.5. We remark that the tools from Malliavin calculus used here slightly differ from those in [6]: while [6] utilizes an integration by parts formula (cf. [6, Lemma 2.1]), we make use of the Clark–Ocone formula (cf. Lemma 2.1).

\(\pmb {2.}\) The work [6], in particular, uses less regular initial data and exploits the regularizing effect of the involved semigroup. In this work, we assume (A3) to verify Theorem 3.5; however, we believe a corresponding result to hold for less ‘regular’ initial data as well, by using a modification of Lemma 2.6 that involves temporal weights in the functional, thus mimicking the regularizing effect in the present context.

Proof

We independently bound \(\{\pmb {\mathtt {E}_{\ell }}({\mathbf {Y}}^{j})\}_{\pmb {{\ell }}=1,\ldots ,15,\,j=0,\ldots ,J-1}\) in \({\mathfrak {G}}\equiv \{{\mathfrak {G}}\bigl (\phi ;\tau ^{j+1},{\mathbf {Y}}^{j}\bigr )\}_{j=0}^{J-1}\) in (3.1) with the help of Lemma 2.6.

(a) Second moment bounds for \(\pmb {\mathtt {E}_{\ell }}({\mathbf {Y}}^{j})\), \(\pmb {{\ell }}=1,2,3,4,5,6,9,10,12,15\), \(j=0,\ldots ,J-1\): We show for \(\ell =0,1,2\) that

\(\mathbf {(i)}\):

\(\begin{aligned} \max \limits _{j=0,\ldots ,J-1} {\mathbb {E}}\big [\Vert {{\mathscr {A}}}^{\ell } \pmb {{\bar{{{\mathscr {A}}}}}}^{j+1} {\mathbf {Y}}^{j}\Vert _{{\mathbb {R}}^{L}}^{2}\big ]\le \pmb {C}_{1,1}^{(\ell )}, \end{aligned}\)

\(\mathbf {(ii)}\):

\(\begin{aligned} \max \limits _{j=0,\ldots ,J-1} {\mathbb {E}}\big [\Vert {{\mathscr {A}}}^{\ell } \pmb {{\bar{{{\mathscr {A}}}}}}^{j+1} {\mathbf {f}}({\mathbf {Y}}^{j})\Vert _{{\mathbb {R}}^{L}}^{2}\big ]\le 2\bigl (C_{{\mathbf {f}}}^{(\ell )} \bigr )^{2} \bigl (1+\pmb {C}_{1,1}^{(\ell )} \bigr ), \end{aligned}\)

\(\mathbf {(iii)}\):

\(\begin{aligned} \max \limits _{j=0,\ldots ,J-1} \sum \limits _{k=1}^{K}{\mathbb {E}}\big [\Vert {{\mathscr {A}}}^{\ell } \pmb {{\bar{{{\mathscr {A}}}}}}^{j+1} \pmb {\sigma }_{k}({\mathbf {Y}}^{j})\Vert _{{\mathbb {R}}^{L}}^{2}\big ]\le 2K\bigl (C_{\pmb {\sigma }}^{(\ell )}\bigr )^{2}\bigl (1+\pmb {C}_{1,1}^{(\ell )} \bigr ), \end{aligned} \)

\(\mathbf {(iv)}\):

\(\begin{aligned} \max \limits _{j=0,\ldots ,J-1} \sum \limits _{k=1}^{K}{\mathbb {E}}\big [\Vert \pmb {\sigma }_{k}({\mathbf {Y}}^{j})\Vert _{{\mathbb {R}}^{L}}^{2}\big ]\le 2K\bigl (C_{\pmb {\sigma }}^{(0)}\bigr )^{2}\bigl (1+\pmb {C}_{1,1}^{(0)} \bigr ), \end{aligned} \)

where \(\pmb {C}_{1,1}^{(\ell )}>0\), \(\ell =0,1,2\), are the constants from Lemma 2.6, and \(\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}=\bigl ({\mathbb {I}}+\tau ^{j+1}{{\mathscr {A}}}\bigr )^{-1}\).

Let \(\ell =0,1,2\) and \(j=0,\ldots ,J-1\). Since \(\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}=\bigl ({\mathbb {I}}+\tau ^{j+1}{{\mathscr {A}}}\bigr )^{-1}\) commutes with \({{\mathscr {A}}}^{\ell }\), and \(\Vert \pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\Vert _{{\mathbb {R}}^{L\times L}}\le 1\) for every \(\tau ^{j+1}>0\), we immediately obtain

$$\begin{aligned} \Vert {{\mathscr {A}}}^{\ell } \pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}{\mathbf {Y}}^{j} \Vert _{{\mathbb {R}}^{L}}^{2}\le \Vert \pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}\Vert _{{\mathbb {R}}^{L\times L}}^{2}\cdot \Vert {{\mathscr {A}}}^{\ell } {\mathbf {Y}}^{j}\Vert _{{\mathbb {R}}^{L}}^{2} \le \Vert {{\mathscr {A}}}^{\ell } {\mathbf {Y}}^{j}\Vert _{{\mathbb {R}}^{L}}^{2}. \end{aligned}$$

Hence, taking the expectation and using Lemma 2.6, assertion \(\mathbf {(i)}\) follows. In almost the same way, on using (A1) (b), we obtain

$$\begin{aligned} \Vert {{\mathscr {A}}}^{\ell } \pmb {{\bar{{{\mathscr {A}}}}}}^{j+1}{\mathbf {f}}({\mathbf {Y}}^{j} )\Vert _{{\mathbb {R}}^{L}}^{2}\le \Vert {{\mathscr {A}}}^{\ell }{\mathbf {f}}({\mathbf {Y}}^{j}) \Vert _{{\mathbb {R}}^{L}}^{2}\le 2\bigl (C_{{\mathbf {f}}}^{(\ell )} \bigr )^{2} \bigl (1+\Vert {{\mathscr {A}}}^{\ell } {\mathbf {Y}}^{j}\Vert _{{\mathbb {R}}^{L}}^{2} \bigr ). \end{aligned}$$

Again, taking the expectation and applying Lemma 2.6, assertion \(\mathbf {(ii)}\) follows. Statement \(\mathbf {(iii)}\) follows by the same argument, and \(\mathbf {(iv)}\) immediately follows from (A2) and Lemma 2.6.

The bounds \(\mathbf {(i)}\)–\(\mathbf {(iv)}\) now yield those for \(\pmb {\mathtt {E}_{\ell }}({\mathbf {Y}}^{j})\), \(\pmb {{\ell }}=1,2,3,4,5,6,9,10,12,15\), \(j=0,\ldots ,J-1\).

(b) Fourth moment bounds for \(\pmb {\mathtt {E}_{\ell }}({\mathbf {Y}}^{j})\), \(\pmb {{\ell }}=7,8,11,13,14\), \(j=0,\ldots ,J-1\): Similarly to (a), Lemma 2.6 yields

\(\mathbf {(v)}\):

\(\begin{aligned} \max \limits _{j=0,\ldots ,J-1} {\mathbb {E}}\big [\Vert {{\mathscr {A}}}\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1} {\mathbf {Y}}^{j}\Vert _{{\mathbb {R}}^{L}}^{4}\big ]\le \pmb {C}_{1,2}^{(1)}, \end{aligned}\)

\(\mathbf {(vi)}\):

\(\begin{aligned} \max \limits _{j=0,\ldots ,J-1} {\mathbb {E}}\big [\Vert {{\mathscr {A}}}\pmb {{\bar{{{\mathscr {A}}}}}}^{j+1} {\mathbf {f}}({\mathbf {Y}}^{j})\Vert _{{\mathbb {R}}^{L}}^{4}\big ]\le 8\bigl (C_{{\mathbf {f}}}^{(1)} \bigr )^{4} \bigl (1+\pmb {C}_{1,2}^{(1)} \bigr ), \end{aligned}\)

\(\mathbf {(vii)}\):

\(\begin{aligned} \max \limits _{j=0,\ldots ,J-1} \sum \limits _{k=1}^{K}{\mathbb {E}}\big [\Vert \pmb {{\bar{{{\mathscr {A}}}}}}^{j+1} \pmb {\sigma }_{k}({\mathbf {Y}}^{j})\Vert _{{\mathbb {R}}^{L}}^{4}\big ]\le 8K\bigl (C_{\pmb {\sigma }}^{(0)}\bigr )^{4}\bigl (1+\pmb {C}_{1,2}^{(0)} \bigr ), \end{aligned} \)

\(\mathbf {(viii)}\):

\(\begin{aligned} \max \limits _{j=0,\ldots ,J-1} \sum \limits _{k=1}^{K}{\mathbb {E}}\big [\Vert \pmb {\sigma }_{k}({\mathbf {Y}}^{j})\Vert _{{\mathbb {R}}^{L}}^{4}\big ]\le 8K\bigl (C_{\pmb {\sigma }}^{(0)}\bigr )^{4}\bigl (1+\pmb {C}_{1,2}^{(0)} \bigr ), \end{aligned} \)

where \(\pmb {C}_{1,2}^{(\ell )}>0\), \(\ell =0,1\), are the constants from Lemma 2.6. Again, the bounds \(\mathbf {(v)}\)–\(\mathbf {(viii)}\) yield those for \(\pmb {\mathtt {E}_{\ell }}({\mathbf {Y}}^{j})\), \(\pmb {{\ell }}=7,8,11,13,14\), \(j=0,\ldots ,J-1\).

(c) By means of (a) and (b), we can find a constant \(\pmb {{\tilde{C}}}\ge 1\) independent of L and j such that for all \(j\ge 0\)

$$\begin{aligned} {\mathfrak {G}}\bigl (\phi ;\tau ^{j+1},{\mathbf {Y}}^{j}\bigr )\le \pmb {{\tilde{C}}}\cdot \tau ^{j+1}. \end{aligned}$$
(3.38)

Hence, plugging (3.38) into (3.1), using \(\tau ^{j+1}\le \tau \) for every \(j\ge 0\), and setting \(\pmb {C}:=\pmb {{\tilde{C}}}\cdot T\) yields the assertion. \(\square \)

In the next section, we base an adaptive method on the a posteriori error estimate (3.1) to automatically select local step sizes. For every \(j\ge 0\), we show that the adaptive method selects a new time step \(\tau ^{j+1}\) within finitely many steps, and that the algorithm reaches the terminal time \(T>0\) after finitely many steps as well (global termination).

4 Weak adaptive approximation of (1.1): algorithm and convergence

By Theorem 3.1, the weak error caused by scheme (1.3) on a given partition \(\{ t_j\}_{j=0}^J \subset [0,T]\) is controllable via the a posteriori error estimate (1.4). In this section, we use this result for an adaptive method that automatically steers local mesh size selection. For this purpose, we check whether the criterion \({\mathfrak {G}}\bigl (\phi ;\tau ^{j+1},{\mathbf {Y}}^{j}\bigr ) \le \frac{{\mathtt {Tol}}}{T}\) is met: if so, \(\tau ^{j+1}\) is admissible, and the new local error is bounded in such a way that the overall error stays below \(\mathtt{Tol}\) by Theorem 3.1; if not, \(\tau ^{j+1}\) is replaced by the refined mesh size \({\widetilde{\tau }}^{j+1} := \frac{\tau ^{j+1}}{2}\), and the criterion is checked again. The following algorithm contains a refinement step (1) to generate \(\{\tau ^{j+1,\ell }\}_{\ell \ge 0}\) and—if \({\mathfrak {G}}\bigl (\phi ;\tau ^{j+1,\ell },{\mathbf {Y}}^{j}\bigr )\) is ‘too small’—a final coarsening step (3) within the loop that generates the subsequent iterate from (1.3), after accepting the possible underestimation of \({\mathfrak {G}}\bigl (\phi ;\tau ^{j+1,\ell },{\mathbf {Y}}^{j}\bigr )\); a sketch of one pass of this selection loop is given after the algorithm.

Algorithm 4.1

Fix \({\mathtt {Tol}}>0\) and \(\tau ^{1}\ge \tfrac{{\mathtt {Tol}}}{T}\). Let \((\tau ^j, \mathbf {Y}^{j})\) be given for some \(j\ge 1\), and define \(\tau ^{j+1,0}:=\tau ^{j}\). For \(\ell =0,1,2,\ldots \) compute \({\mathfrak G}\bigl (\phi ; \tau ^{j+1,\ell }, {\mathbf {Y}}^{j}\bigr )\) and decide:

(1):

If \({\mathfrak G}\bigl (\phi ; \tau ^{j+1,\ell }, {\mathbf {Y}}^{j}\bigr )>\frac{{\mathtt {Tol}}}{T}\), set \(\tau ^{j+1,\ell +1}:=\frac{\tau ^{j+1,\ell }}{2}\), and \(\ell \hookrightarrow \ell +1\).

(2):

If \(\frac{{\mathtt {Tol}}}{2T}\le {\mathfrak G}\bigl (\phi ; \tau ^{j+1,\ell }, {\mathbf {Y}}^{j}\bigr )\le \frac{{\mathtt {Tol}}}{T}\), set \(\tau ^{j+1}:=\tau ^{j+1,\ell }\), \(t_{j+1}:=t_{j}+\tau ^{j+1}\), compute \(\Delta _{j+1,\ell } \beta _{k}:=\beta _{k}\bigl (t_{j}+\tau ^{j+1,\ell }\bigr )-\beta _{k}(t_{j})\), for \(k=1,\ldots ,K\), then solve (1.3) for \(\mathbf {Y}^{j+1}\), and \(j \hookrightarrow j+1\).

(3):

If \({\mathfrak G}\bigl (\phi ; \tau ^{j+1,\ell }, {\mathbf {Y}}^{j}\bigr )< \frac{{\mathtt {Tol}}}{2T}\), set \(\tau ^{j+1}:=\tau ^{j+1,\ell }\), \(t_{j+1}:=t_{j}+\tau ^{j+1}\), compute \(\mathbf {Y}^{j+1}\) via (1.3) with \(\tau ^{j+1}\) and \(\{\Delta _{j+1,\ell } \beta _{k}, \, k=1,\ldots ,K\}\). Then set \(\tau ^{j+1}:=2\tau ^{j+1}\) and \(j \hookrightarrow j+1\).

Stop, if \(t_{j}\ge T\) for some j and set \(J:=j\).
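For orientation, a minimal MATLAB sketch of one pass of the selection loop in Algorithm 4.1 reads as follows; here estimatorG (an evaluation of \({\mathfrak G}\bigl (\phi ;\tau ,{\mathbf {Y}}^{j}\bigr )\) below (3.1)) and semiImplicitEulerStep (one solve of (1.3)) are hypothetical helper functions, named only for illustration and not part of Algorithm 4.1 itself.

```matlab
% One pass of Algorithm 4.1, mapping (tauPrev, Y, t) to (tauPrev, Ynew, t).
tau = tauPrev;                        % trial step tau^{j+1,0} := tau^j
while estimatorG(tau, Y) > Tol/T      % step (1): refine until admissible
    tau = tau/2;
end
G     = estimatorG(tau, Y);
dBeta = sqrt(tau) * randn(K, 1);      % Wiener increments for the accepted step
Ynew  = semiImplicitEulerStep(Y, tau, dBeta);   % solve (1.3) for Y^{j+1}
t     = t + tau;                      % t_{j+1} := t_j + tau^{j+1}
if G < Tol/(2*T)                      % step (3): double the next trial step
    tau = 2*tau;
end
tauPrev = tau;                        % used as tau^{j+2,0} in the next pass
```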

This sequence of refinement steps, possibly succeeded by a single coarsening step, prevents infinite loops of refinement and coarsening, and enables a flexible re-meshing to capture local dynamics. The following theorem validates termination of the adaptive method, consisting of (1.3) and Algorithm 4.1.

Theorem 4.2

Let \({\mathtt {Tol}}>0\). Suppose (A1)–(A3). Then, the adaptive method consisting of (1.3) and Algorithm 4.1 generates each of the local step sizes \(\{\tau ^{j+1}\}_{j\ge 0}\) after \({\mathcal {O}}\bigl (\log ({\mathtt {Tol}}^{-1})\bigr )\) many iterations, and the algorithm reaches the terminal time \(T>0\) within \(J={\mathcal {O}}\bigl ({\mathtt {Tol}}^{-1}\bigr )\) time steps. Furthermore, the admissible tuples \(\{ (\tau ^{j+1}, {\mathbf {Y}}^{j+1})\}_{j=0}^{J-1}\) satisfy

$$\begin{aligned} \max \limits _{0\le j\le J}\Big |{\mathbb {E}}\big [\phi ({\mathbf {X}}_{t_{j}})\big ] - {\mathbb {E}}\big [\phi ({\mathbf {Y}}^{j})\big ] \Big |\le {\mathtt {Tol}}. \end{aligned}$$
(4.1)

Remark 4.1

\(\pmb {1.}\) An adaptive method based on a strong a posteriori error estimate to steer automatic spatio-temporal remeshing for a discretization of (1.2) with additive noise is proposed in [21], for which termination of an iterative strategy to select new local mesh parameters for a fixed index \(j \in {{\mathbb {N}}}{\setminus } \{0\}\) is shown. The work [21] conceptually follows ideas in [4] for the heat equation (i.e., \(\Sigma _k \equiv 0\) (\(1 \le k \le K\)) in (1.2)), where a local approximation argument in step \(j \in {{\mathbb {N}}}{\setminus } \{0\}\) settles the existence of a value \(\tau ^{j+1}_* >0\), such that values \(\tau ^{j+1,\ell } \le \tau ^{j+1}_*\) meet the stopping criterion; this argument, however, does not exclude that the selected \(\tau ^{j+1}_*\) in [4, 21] crucially depend on j, leaving global termination open. This deficiency has been overcome in [19] for a modified version of the adaptive algorithm in [4], which, in particular, exploits a discrete stability property of the underlying discretization to herewith establish \(\inf _j \tau ^{j+1}_* \ge \tau _* >0\). We here proceed analogously to settle convergence of Algorithm 4.1 with the help of Lemma 2.6.

\(\pmb {2.}\) Automatic mesh refinement in [23] is based on the computable leading-order term in the weak a posteriori estimator (1.6) that has been derived in [27], provided that the involved drift and diffusion functions are bounded. For a sufficiently fine initial mesh \({{\mathcal {I}}}_{J^0} := \{ t_j\}_{j=0}^{J^0} \subset [0,T]\) and given \({{\mathcal {I}}}_{J^\ell }\), the new mesh \({\mathcal I}_{J^{\ell +1}} \supset {{\mathcal {I}}}_{J^{\ell }}\) refines those intervals \([t_{j^{\ell }}, t_{(j+1)^{\ell }}]\) where \(\rho _{(j+1)^\ell } \vert \tau ^{(j+1)^\ell }\vert ^{2}\) overshoots \(\frac{\mathtt{Tol}}{J^{\ell }}\). Note that this iterative strategy requires the global re-computation of \(\{ \rho _{j^\ell }\}_{j^\ell }\) for every \(\ell \ge 0\): termination after \(\ell ^* < \infty \) iterations, with \(J^{\ell ^*} = {{\mathcal {O}}}({\mathtt {Tol}}^{-1})\), is then shown in [23].

Proof

(a) Termination for each \(j \ge 0\): Fix \(j\ge 0\), and recall (3.38) in the proof of Theorem 3.5. Since the constant \(\pmb {{\tilde{C}}}\ge 1\) appearing there does not depend on j, we generate a finite sequence \(\{\tau ^{j+1,\ell }\}_{\ell =0}^{\ell _{j+1}^{*}}\) with \(\tau ^{j+1,\ell }=\frac{\tau ^{j+1,0}}{2^{\ell }}\), \(\ell =0,\ldots ,\ell _{j+1}^{*}\), according to the refinement mechanism (1) of Algorithm 4.1, until either (2) or (3) is met. In view of (3.38), we find that \(\ell =\Big \lceil \nicefrac {\log \Bigl (\tfrac{\tau ^{j+1,0} \pmb {{\tilde{C}}}T}{{\mathtt {Tol}}} \Bigr )}{\log (2)} \Big \rceil \) is the smallest natural number such that

$$\begin{aligned} {\mathfrak {G}}\bigl (\phi ;\tau ^{j+1,\ell },{\mathbf {Y}}^{j}\bigr )\le \pmb {{\tilde{C}}}\cdot \tau ^{j+1,\ell }=\pmb {{\tilde{C}}}\cdot \frac{\tau ^{j+1,0}}{2^{\ell }}\overset{!}{\le } \frac{{\mathtt {Tol}}}{T}. \end{aligned}$$

Consequently, we have

$$\begin{aligned} 0\le \ell _{j+1}^{*}\le \left\lceil \frac{\log \Bigl (\tfrac{\tau ^{j+1,0} \pmb {{\tilde{C}}}T}{{\mathtt {Tol}}} \Bigr )}{\log (2)} \right\rceil , \end{aligned}$$
(4.2)

which yields a maximum of \({\mathcal {O}}\bigl (\log ({\mathtt {Tol}}^{-1})\bigr )\) (refinement) steps to accept the local step size \(\tau ^{j+1}:=\tau ^{j+1,\ell ^{*}_{j+1}}=\frac{\tau ^{j+1,0}}{2^{\ell ^{*}_{j+1}}}\).
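For orientation: with, say, \(\pmb {{\tilde{C}}}=10\), \(T=1\), \({\mathtt {Tol}}=0.1\) and trial step \(\tau ^{j+1,0}=0.5\) (values chosen purely for illustration), the bound (4.2) gives \(\ell ^{*}_{j+1}\le \bigl \lceil \log _{2}\bigl (0.5\cdot 10\cdot 1/0.1\bigr )\bigr \rceil =\lceil \log _{2}50\rceil =6\), i.e., at most six halvings before the trial step size is accepted.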

(b) Global termination rate: We show by induction that

$$\begin{aligned} \tau ^{j+1}\ge \frac{{\mathtt {Tol}}}{2\pmb {{\tilde{C}}}T}\quad (j\ge 0), \end{aligned}$$

where \(\pmb {{\tilde{C}}}\ge 1\) is the constant in (3.38). In particular, this means that T is reached within \(J={\mathcal {O}}\bigl ({\mathtt {Tol}}^{-1}\bigr )\) many time steps.

The base case follows by the choice of the initial mesh size \(\tau ^{1}\ge \frac{{\mathtt {Tol}}}{T}\). Now suppose that we have generated \({\mathbf {Y}}^{j}\) with step size \(\tau ^{j}\ge \frac{{\mathtt {Tol}}}{2\pmb {{\tilde{C}}}T}\); see also Fig. 2. In order to successfully compute \({\mathbf {Y}}^{j+1}\), we set \(\tau ^{j+1,0}:=\tau ^{j}\) (if (2) occurred in the generation of \(\tau ^{j}\)), or \(\tau ^{j+1,0}:=2\tau ^{j}\) (if (3) occurred in the generation of \(\tau ^{j}\)). In both cases, \(\tau ^{j+1,0}\ge \frac{{\mathtt {Tol}}}{2\pmb {{\tilde{C}}}T}\). Via (a), we generate a finite sequence \(\{\tau ^{j+1,\ell }\}_{\ell =0}^{\ell _{j+1}^{*}}\) until either (2) or (3) is met, and then generate \({\mathbf {Y}}^{j+1}\) with step size \(\tau ^{j+1}:=\tau ^{j+1,\ell ^{*}_{j+1}}=\frac{\tau ^{j+1,0}}{2^{\ell ^{*}_{j+1}}}\). Since \(\lceil x \rceil <1+x\), \(x\in {\mathbb {R}}\), we conclude by means of (4.2)

$$\begin{aligned} \tau ^{j+1}:=\tau ^{j+1,\ell ^{*}_{j+1}}=\frac{\tau ^{j+1,0}}{2^{\ell ^{*}_{j+1}}}\ge \frac{\tau ^{j+1,0}}{2^{\,1+\log _{2}\bigl (\tau ^{j+1,0} \pmb {{\tilde{C}}}T/{\mathtt {Tol}}\bigr )}}=\frac{\tau ^{j+1,0}\,{\mathtt {Tol}}}{2\,\tau ^{j+1,0} \pmb {{\tilde{C}}}T}=\frac{{\mathtt {Tol}}}{2\pmb {{\tilde{C}}}T}. \end{aligned}$$

(c) Estimate (4.1) immediately follows from (3.1) and part (2) of Algorithm 4.1. \(\square \)

5 Computational experiments

The simulations of iterates \(\{{\mathbf {Y}}^{j}\}_{j\ge 0}\) from (1.3) use independent standard normally distributed pseudo-random numbers (‘randn’) in MATLAB (version: 2017a). By Kolmogorov’s extension theorem, see e.g. [26, p. 11, Thm. 2.1.5], a family of probability measures \(\{ \pmb {{{\mathcal {N}}}}_{{\mathbf {0}},t {\mathbb {I}}};\, 0 \le t \le T\}\) on \(({\mathbb {R}}^{K},{\mathcal {B}}({\mathbb {R}}^{K}))\) yields the existence of a (filtered) probability space \((\Omega , {\mathcal F}, \{{{\mathcal {F}}}_t \}_{t \ge 0}, {{\mathbb {P}}})\) and Wiener processes \(\{\beta _{k}(t);\, t\in [0,T]\}\), \(k=1,\ldots ,K\) on it; we consider them to be the ones with which (1.1) and (1.3) are defined.
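For instance, the Wiener increments \(\Delta _{j+1}\beta _{k}\sim \pmb {{{\mathcal {N}}}}_{0,\tau ^{j+1}}\) entering (1.3) are obtained by scaling standard normal samples; a minimal sketch, where the values of tau, K and M are illustrative:

```matlab
% Sample the K independent Wiener increments over one step of size tau,
% i.e. Delta_{j+1} beta_k ~ N(0,tau), for M Monte-Carlo realizations.
tau   = 1e-2;  K = 5;  M = 1e4;      % illustrative values
dBeta = sqrt(tau) * randn(K, M);     % dBeta(k,m): increment of beta_k, sample m
```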

Let \(\mathtt{Tol} >0\). We use (1.3) in combination with the adaptive Algorithm 4.1 for different examples, which result from a finite element spatial discretization of an SPDE (1.2). We show how the involved a posteriori error estimate (1.4) serves to estimate related weak errors, and that adaptive remeshing substantially reduces the number of steps needed to traverse the interval [0, T]. For the sake of computations, we use sufficiently large sample sizes \(\mathtt{M}\) to suppress the additional statistical errors in the Monte-Carlo method due to approximating the appearing expectations \({{\mathbb {E}}}[\cdot ]\) by \({{\mathbb {E}}}_\mathtt{M}[\cdot ]\)—as in (1.4), which then takes the form

$$\begin{aligned} \max \limits _{0\le j\le J}\Big |{\mathbb {E}}\big [\phi ({\mathbf {X}}_{t_{j}})\big ] - {\mathbb {E}}_{\mathtt {M}}\big [\phi ({\mathbf {Y}}^{j})\big ] \Big |\le 2 {\mathtt {Tol}} , \end{aligned}$$
(5.1)

and which now holds with high probability. In order to obtain (5.1), we first add and subtract \({\mathbb {E}}\big [ \phi ({\mathbf {Y}}^{j})\big ]\) and use Theorem 3.1, which yields

$$\begin{aligned} \max \limits _{0\le j\le J}\Big |{\mathbb {E}}\big [\phi ({\mathbf {X}}_{t_{j}})\big ] - {\mathbb {E}}_{\mathtt {M}}\big [\phi ({\mathbf {Y}}^{j})\big ] \Big |&\le \sum \limits _{j=0}^{J-1} \tau ^{j+1}{\mathfrak G}\bigl (\phi ;\tau ^{j+1}, {\mathbf {Y}}^{j}\bigr ) \nonumber \\&\quad + \max \limits _{0\le j\le J}\Big |{\mathbb {E}}\big [\phi ({\mathbf {Y}}^{j})\big ] - {\mathbb {E}}_{\mathtt {M}}\big [\phi ({\mathbf {Y}}^{j})\big ] \Big |. \end{aligned}$$

Replacing all arising expectations \(\pmb {\mathtt {E}}_{\pmb {{\ell }}}(\cdot )\), \(\pmb {{\ell }}=1,\ldots ,15\) in the representation of the error estimator \({\mathfrak {G}}\) by their corresponding empirical means \(\pmb {\mathtt {E}}^{(\mathtt {M})}_{\pmb {{\ell }}}(\cdot )\) and writing \({\mathfrak {G}}^{(\mathtt {M})}\) for the related (empirical) error estimator further leads to

$$\begin{aligned} \max \limits _{0\le j\le J}\Big |{\mathbb {E}}\big [\phi ({\mathbf {X}}_{t_{j}})\big ] - {\mathbb {E}}_{\mathtt {M}}\big [\phi ({\mathbf {Y}}^{j})\big ] \Big |&\le \sum \limits _{j=0}^{J-1} \tau ^{j+1}{\mathfrak G}^{(\mathtt {M})}\bigl (\phi ;\tau ^{j+1}, {\mathbf {Y}}^{j}\bigr )\nonumber \\&\quad + \mathtt {ERR}_{{\mathfrak {G}}}(\mathtt {M}) + \mathtt {ERR}_{\phi }(\mathtt {M}), \end{aligned}$$
(5.2)

where \(\mathtt {ERR}_{{\mathfrak {G}}}(\mathtt {M})\), \(\mathtt {ERR}_{\phi }(\mathtt {M})\) denote the statistical errors resulting from the approximation of the error estimator \({\mathfrak {G}}\) and of the expectation of the test function \(\phi \), respectively. Algorithm 4.1 controls the first expression on the right-hand side of (5.2). In order to control the remaining statistical errors, i.e., to ensure that \(\mathtt {ERR}_{{\mathfrak {G}}}(\mathtt {M})+\mathtt {ERR}_{\phi }(\mathtt {M})\le {\mathtt {Tol}}\) holds with high probability, and to conclude (5.1), one can (asymptotically) determine a number \(\mathtt {M}\equiv \mathtt {M}({\mathtt {Tol}})\in {\mathbb {N}}{\setminus } \{0\}\) of Monte-Carlo samples by means of concentration inequalities, the central limit theorem, or other (non-)asymptotic controls; we refer to [12] for more details in this direction. In the computational studies reported below, we mostly chose \(\mathtt {M}=10^4\), for which the Monte-Carlo simulations performed stably.
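A common asymptotic recipe, sketched below under the assumption that one controls \(\mathtt {ERR}_{\phi }(\mathtt {M})\) alone via the central limit theorem (the estimator part \(\mathtt {ERR}_{{\mathfrak {G}}}(\mathtt {M})\) would be treated analogously), is to enlarge \(\mathtt {M}\) until the normal-approximation confidence half-width of the sample mean drops below the statistical budget; phiY denotes the \(\mathtt {M}\) samples of \(\phi ({\mathbf {Y}}^{j})\), a name we introduce only for this sketch.

```matlab
% CLT-based check of the Monte-Carlo sample size: the half-width of an
% asymptotic 95% confidence interval for E[phi(Y^j)] should not exceed
% the statistical budget TolStat (an illustrative choice).
TolStat   = 0.05;                      % budget for ERR_phi(M)
z         = 1.96;                      % 95% standard normal quantile
M         = numel(phiY);               % phiY: M samples of phi(Y^j)
halfWidth = z * std(phiY) / sqrt(M);   % approximate CI half-width
if halfWidth > TolStat                 % enlarge the sample if needed
    Mrequired = ceil((z * std(phiY) / TolStat)^2);
end
```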

5.1 The one-dimensional stochastic heat equation in Example 1.1

Let \(T>0\), \(K\in {{\mathbb {N}}}{\setminus } \{0\}\) and \(\pmb {{\mathcal {D}}}=(0,1)\subset {\mathbb {R}}\). We consider the SPDE (1.2) with \(\varepsilon = 1\), \(\pmb {\beta } \equiv \pmb {0}\), and \(F,\,\Sigma \in C^{3}\bigl ({\mathbb {R}}\bigr )\cap {\mathbb {H}}^{1}_{0}(\pmb {{\mathcal {D}}})\), as well as homogeneous Dirichlet boundary conditions, and \(y_{0} \in {\mathbb {H}}^{1}_{0}(\pmb {{\mathcal {D}}})\). These assumptions ensure the existence of a unique strong solution of (1.2); see e.g. [5, p. 197ff.]. We use standard notation, as e.g. \({\mathbb {L}}^{2}\equiv {\mathbb {L}}^{2}(\pmb {{\mathcal {D}}})\) and \({\mathbb {H}}^{1}_{0}\equiv {\mathbb {H}}^{1}_{0}(\pmb {{\mathcal {D}}})\) below; see e.g. [9, p. 244 ff.].

Let \(L\in {\mathbb {N}}{\setminus } \{0\}\). We consider a (uniform) triangulation of the domain \(\pmb {{\mathcal {D}}}\) such that

$$\begin{aligned} 0={\mathsf {x}}_{0}{<}{\mathsf {x}}_{1}<\cdots<{\mathsf {x}}_{L}<{\mathsf {x}}_{L+1}{=}1,\quad \text {with}\quad \textit{h} {\equiv } {\mathsf {x}}_{\ell }-{\mathsf {x}}_{\ell -1}=\tfrac{1}{L+1} \quad (\ell {=}1,\ldots ,L). \end{aligned}$$

Following [14], we use a finite element method based on piecewise affine functions to spatially discretize SPDE (1.2) with \(\varepsilon = 1\) and \(\pmb {\beta } \equiv \pmb {0}\). In combination with ‘mass lumping’, we obtain the following \(L-\)dimensional SDE system:

$$\begin{aligned} {\left\{ \begin{array}{ll} {\mathrm d}{\mathbf {X}}_{t}^{\textit{h}} = \bigl ( -{{\mathscr {A}}}{\mathbf {X}}_{t}^{\textit{h}} +{\mathbf {f}}({\mathbf {X}}_{t}^{\textit{h}})\bigr ){\mathrm d}t + \sum _{k=1}^{K}\pmb {\sigma }_{k}({\mathbf {X}}_{t}^{\textit{h}}){\mathrm d}\beta _{k}(t) \quad \text {for all } \,\; t \in [0,T], \\ {\mathbf {X}}_{0}^{\textit{h}} = \bigl (y_{0}({\mathsf {x}}_{1}),\ldots ,y_{0}({\mathsf {x}}_{L})\bigr )^{\top } \in {{\mathbb {R}}}^L, \end{array}\right. } \end{aligned}$$
(5.3)

where

$$\begin{aligned} {\mathbf {X}}_{t}^{\textit{h}}&:=\bigl (X_{t}^{\textit{h}}({\mathsf {x}}_{1}),\ldots ,X_{t}^{\textit{h}}({\mathsf {x}}_{L}) \bigr )^{\top }\in {\mathbb {R}}^{L},\\ {\mathbf {f}}({\mathbf {X}}_{t}^{\textit{h}})&:= \bigl (F\bigl (X_{t}^{\textit{h}}({\mathsf {x}}_{1})\bigr ),\ldots ,F\bigl (X_{t}^{\textit{h}}({\mathsf {x}}_{L})\bigr )\bigr )^{\top }\in {\mathbb {R}}^{L}, \\ \pmb {\sigma }_{k}({\mathbf {X}}_{t}^{\textit{h}})&:=\big (\Sigma _{k}\bigl (X_{t}^{\textit{h}}({\mathsf {x}}_{1})\bigr ),\ldots ,\Sigma _{k}\bigl (X_{t}^{\textit{h}}({\mathsf {x}}_{L})\bigr )\bigr )^\top \in {\mathbb {R}}^{L},\\ {{\mathscr {A}}}&:=\frac{1}{\textit{h}^{2}} \mathrm {tridiag}[-1,2,-1]\in {\mathbb {R}}^{L\times L}. \end{aligned}$$
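A sketch of the assembly of \({{\mathscr {A}}}\) and of one step of scheme (1.3) applied to system (5.3) may read as follows; the concrete choices of F, Sigma (here with \(K=1\)), \(y_{0}\) and tau below are illustrative assumptions only:

```matlab
L = 99;  h = 1/(L+1);  x = (1:L)'*h;          % interior nodes of the uniform grid
e = ones(L,1);
A = spdiags([-e 2*e -e], -1:1, L, L) / h^2;   % (1/h^2)*tridiag[-1,2,-1]
F     = @(u) u./(1 + u.^2);                   % illustrative drift nonlinearity
Sigma = @(u) 0.1*sin(pi*u);                   % illustrative diffusion (K = 1)
Y   = sin(pi*x);                              % nodal values of y_0(x) = sin(pi*x)
tau = 1e-3;  dBeta = sqrt(tau)*randn;         % step size and one Wiener increment
% Semi-implicit Euler step (1.3):
%   (I + tau*A) Y^{j+1} = Y^j + tau*f(Y^j) + sigma_1(Y^j)*dBeta
Y = (speye(L) + tau*A) \ (Y + tau*F(Y) + Sigma(Y)*dBeta);
```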

Example 1.1 discusses computational studies for (5.3) via (1.3) in combination with adaptive Algorithm 4.1. We choose \(\phi ({\mathbf {x}})=\sqrt{\textit{h}}\Vert {\mathbf {x}}\Vert _{{\mathbb {R}}^{L}}\) to approximate the \({\mathbb L}^2-\)norm of (the finite element approximation of) \(\{X_t\), \(t \in [0,T]\}\) from SPDE (1.2). The related constants to compute \({\mathfrak {G}}\equiv \{{\mathfrak {G}}\bigl (\phi ;\tau ^{j+1},{\mathbf {Y}}^{j}\bigr )\}_{j \ge 0}\) below (3.1) are:

$$\begin{aligned} \lambda _{{{\mathscr {A}}}}&\approx \pi ^{2},\quad C_{D {\mathbf {f}}}=\tfrac{\pi }{5},\quad C_{D^{2}{\mathbf {f}}}=\tfrac{\pi ^2}{5},\quad C_{D^{3} {\mathbf {f}}}=\tfrac{\pi ^3}{5} ,\quad C_{D \pmb {\sigma }}=\tfrac{137}{120},\\ C_{D^{2} \pmb {\sigma }}&=0.158,\quad C_{D^{3}\pmb {\sigma }}=0.518, \\ \pmb {C}_{D}(\phi )&=\sqrt{\textit{h}},\quad \pmb {C}_{D^{2}}(\phi )=0,\quad \pmb {C}_{D^{3}}(\phi )=0 \quad \bigl (\varepsilon _{1}=\varepsilon _{2}=\varepsilon _{3}=6\bigr ). \end{aligned}$$
Fig. 3 (Error indicators of the weak a posteriori error estimator \({\mathfrak {G}}^{(\mathtt {M})}\) of Example 1.1 for \({\mathtt {Tol}}=0.1\), \(\mathtt {M}= 10^{4}\), \(T=1\)) a Semi-Log-Plot of the corresponding computable error indicators \(t_{j}\mapsto \pmb {\mathtt {E}}^{(\mathtt {M})}_{\pmb {{\ell }}}({\mathbf {Y}}^{j})\), \(\pmb {{\ell }}=1,3,10\), which only involve the drift term. b Semi-Log-Plot of the corresponding computable error indicators \(t_{j}\mapsto \pmb {\mathtt {E}}^{(\mathtt {M})}_{\pmb {{\ell }}}({\mathbf {Y}}^{j})\), \(\pmb {{\ell }}=2,8,9,11,12,15\), which only involve the diffusion term. c Semi-Log-Plot of the corresponding computable error indicators \(t_{j}\mapsto \pmb {\mathtt {E}}^{(\mathtt {M})}_{\pmb {{\ell }}}({\mathbf {Y}}^{j})\), \(\pmb {{\ell }}=4,5,6\), where both the drift and diffusion are involved

Fig. 4 (Adaptive vs. uniform: Influence of noise parameter K on the total amount of time steps within Example 1.1 for \({\mathtt {Tol}}=0.1\), \(\mathtt {M}= 5000\), \(T=1\)) Semi-Log-Plot of the total amount of time steps for uniform (blue line) vs. adaptive (red line) (color figure online)

Figure 3 below displays the contributions of the different \(\pmb {\mathtt {E}}^{(\mathtt {M})}_{\pmb {{\ell }}}(\cdot )\), \(\pmb {{\ell }}\in \{1,\ldots ,15\}\backslash \{7,13,14\}\) in the a posteriori error estimator \({\mathfrak {G}}\), which steers the step size selection of the adaptive Algorithm 4.1. Note that \({\mathfrak {G}}\) consists of leading order terms \(\pmb {\mathtt {E}_{\ell }}(\cdot )\), \(\pmb {{\ell }}\in \{1,2,3,4,5,6,8,9,12\}\), as well as ‘higher order terms’ \(\pmb {\mathtt {E}_{\ell }}(\cdot )\), \(\pmb {{\ell }}\in \{10,11,15\}\). Mainly responsible for mesh adjustments are the leading order terms—in particular \(\pmb {\mathtt {E_{1}}}(\cdot )\), which addresses higher derivatives (up to order 4) of the (approximated) solution \(\{X_{t}\), \(t\in [0,T]\}\) of (1.2), starting with \(x\mapsto \sin (\pi x)\) as initial function. Heuristically, it might therefore be justified to neglect ‘higher order terms’ in order to save computational effort, and only take into account leading order terms in \({\mathfrak {G}}\).

Table 1 Different Setups for Example 5.1

For different noise parameters \(K\in \{0,1,3,5,6,8,10\}\), Fig. 4 compares the total amount of time steps needed for adaptive and uniform meshes to perform equally well, indicating that, the larger K, the greater the expected improvement of the adaptive Algorithm 4.1 over scheme (1.3) with uniform time steps.

5.2 A convection-dominated (stochastic) problem

Example 5.1

Consider (1.2) on \(\pmb {{\mathcal {D}}}=(0,1)\), \(T>0\), with \(\varepsilon >0\), \(\pmb {\beta } \in {\mathbb {R}}\), \(F\equiv 0\) and homogeneous Dirichlet boundary conditions. After a finite element discretization, using ‘mass lumping’, and \(h = \frac{1}{L+1}\) for some \(L \in {\mathbb N}{\setminus } \{0\}\), we obtain

$$\begin{aligned} {\left\{ \begin{array}{ll} {\mathrm d}{\mathbf {X}}_{t}^{\textit{h}} = -{{\mathscr {A}}}{\mathbf {X}}_{t}^{\textit{h}}{\mathrm d}t + \sum _{k=1}^{K}\pmb {\sigma }_{k}({\mathbf {X}}_{t}^{\textit{h}}){\mathrm d}\beta _{k}(t) \quad \text {for all } \,\; t \in [0,T], \\ {\mathbf {X}}_{0}^{\textit{h}} = \bigl (y_{0}({\mathsf {x}}_{1}),\ldots ,y_{0}({\mathsf {x}}_{L})\bigr )^{\top } \in {{\mathbb {R}}}^L, \end{array}\right. } \end{aligned}$$

where \(\pmb {\sigma }_{k}({\mathbf {X}}_{t}^{\textit{h}})\) is as in (5.3), and

$$\begin{aligned} {{\mathscr {A}}}:=\frac{\varepsilon }{\textit{h}^{2}} \mathrm {tridiag}[-1,2,-1] - \frac{\pmb {\beta }}{2h} \mathrm {tridiag}[-1,0,1] \in {\mathbb {R}}^{L\times L}. \end{aligned}$$
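In the notation of the sketch in Sect. 5.1 (reusing L, h and e), this matrix may be assembled as follows; eps_ and beta_ stand for \(\varepsilon \) and \(\pmb {\beta }\) and are our illustrative names:

```matlab
A = (eps_/h^2)    * spdiags([-e 2*e -e], -1:1, L, L) ...   % diffusion part
  - (beta_/(2*h)) * spdiags([-e 0*e  e], -1:1, L, L);      % centered convection
```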
Fig. 5 (Adaptive vs. uniform: Influence of parameter \(\varepsilon \) on the total amount of time steps within Setup B for \({\mathtt {Tol}}=0.1\), \(\mathtt {M}= 5000\), \(\pmb {\beta }=1\), \(T=1\)) Semi-Log-Plot of the total amount of time steps for uniform (blue line) vs. adaptive (red line) (color figure online)

Fig. 6 (Setup A with \({\mathtt {Tol}}=0.1\), \(\pmb {\beta }=2\), \(\varepsilon =2\textit{h}\) and \(\phi ({\mathbf {x}})=\sqrt{\textit{h}}\Vert {\mathbf {x}}\Vert _{{\mathbb {R}}^{L}}\)) a Contour plot of the solution. b Plot of the corresponding adaptive time step size. c Error for uniform (blue line) vs. adaptive (red line) time meshes via Algorithm 4.1. d Plot of the a posteriori weak error estimator \({\mathfrak {G}}\bigl (\phi ;\tau ^{j+1},{\mathbf {Y}}^{j}\bigr )\) (color figure online)

Fig. 7 (Setup A with \({\mathtt {Tol}}=0.1\), \(\pmb {\beta }=2\), \(\varepsilon =2\textit{h}\) and \(\phi ({\mathbf {x}})=\Vert {\mathbf {x}}\Vert _{\infty }\)) a Plot of the corresponding adaptive time step size. b Error for uniform (blue line) vs. adaptive (red line) time meshes via Algorithm 4.1. c Plot of the a posteriori weak error estimator \({\mathfrak {G}}\bigl (\phi ;\tau ^{j+1},{\mathbf {Y}}^{j}\bigr )\) (color figure online)

We study three different cases given in Table 1, where we, among other things, discuss the influence of varying \(\varepsilon \) on the total amount of adaptive versus uniform time steps; see Fig. 5. Setup A deals with a purely deterministic version of Example 5.1, i.e., no diffusion is involved. In this context, for fixed \(\pmb {\beta }\) and \(\varepsilon \), we discuss the role of the chosen test function \(\phi \) in the adaptive method; see Figs. 6 and 7 below. Then, for a fixed test function, Setup B studies the impact of different choices of \(\pmb {\beta }\), \(\varepsilon \) on adaptive meshing; see Figs. 8, 9 and 10 below. Setup C investigates Example 5.1 for a different initial function and a different type of multiplicative noise, but with fixed \(\pmb {\beta }\), \(\varepsilon \) and \(\phi \); see Fig. 11 below.

Fig. 8 (Setup B with \({\mathtt {Tol}}=0.1\), \(\mathtt {M}=10^{4}\), \(\pmb {\beta }=1\), \(\varepsilon =2\textit{h}\) and \(\phi ({\mathbf {x}})=\sqrt{\textit{h}}\Vert {\mathbf {x}}\Vert _{{\mathbb {R}}^{L}}\)) a Contour plot of the solution for a single realization \(\omega \). b Plot of the corresponding adaptive time step size. c Error for uniform (blue line) vs. adaptive (red line) time meshes via Algorithm 4.1. d Plot of the a posteriori weak error estimator \({\mathfrak {G}}^{(\mathtt {M})}\bigl (\phi ;\tau ^{j+1},{\mathbf {Y}}^{j}\bigr )\) (color figure online)

Fig. 9 (Setup B with \({\mathtt {Tol}}=0.1\), \(\mathtt {M}=10^{4}\), \(\pmb {\beta }=1\), \(\varepsilon =5\textit{h}\) and \(\phi ({\mathbf {x}})=\sqrt{\textit{h}}\Vert {\mathbf {x}}\Vert _{{\mathbb {R}}^{L}}\)) a Contour plot of the solution for a single realization \(\omega \). b Plot of the corresponding adaptive time step size. c Error for uniform (blue line) vs. adaptive (red line) time meshes via Algorithm 4.1. d Plot of the a posteriori weak error estimator \({\mathfrak {G}}^{(\mathtt {M})}\bigl (\phi ;\tau ^{j+1},{\mathbf {Y}}^{j}\bigr )\) (color figure online)

Fig. 10 (Setup B with \({\mathtt {Tol}}=0.1\), \(\mathtt {M}=10^{4}\), \(\pmb {\beta }=-2\), \(\varepsilon =2\textit{h}\) and \(\phi ({\mathbf {x}})=\sqrt{\textit{h}}\Vert {\mathbf {x}}\Vert _{{\mathbb {R}}^{L}}\)) a Contour plot of the solution for a single realization \(\omega \). b Plot of the corresponding adaptive time step size. c Error for uniform (blue line) vs. adaptive (red line) time meshes via Algorithm 4.1. d Plot of the a posteriori weak error estimator \({\mathfrak {G}}^{(\mathtt {M})}\bigl (\phi ;\tau ^{j+1},{\mathbf {Y}}^{j}\bigr )\) (color figure online)

Fig. 11 (Setup C with \({\mathtt {Tol}}=0.1\), \(\mathtt {M}=10^{4}\), \(\pmb {\beta }=1\), \(\varepsilon =2\textit{h}\) and \(\phi ({\mathbf {x}})=\sqrt{\textit{h}}\Vert {\mathbf {x}}\Vert _{{\mathbb {R}}^{L}}\)) a Contour plot of the solution for a single realization \(\omega \). b Plot of the corresponding adaptive time step size. c Error for uniform (blue line) vs. adaptive (red line) time meshes via Algorithm 4.1. d Plot of the a posteriori weak error estimator \({\mathfrak {G}}^{(\mathtt {M})}\bigl (\phi ;\tau ^{j+1},{\mathbf {Y}}^{j}\bigr )\) (color figure online)

As can be seen in Figs. 6 and 7, different choices of test functions may lead to large changes in the number of time steps generated via Algorithm 4.1. In Figs. 6 and 7 we choose \(\phi ({\mathbf {x}})=\sqrt{\textit{h}}\Vert {\mathbf {x}}\Vert _{{\mathbb {R}}^{L}}\) (resp. \(\phi ({\mathbf {x}})=\Vert {\mathbf {x}} \Vert _{\infty }:=\max _{\ell =1,\ldots ,L}|x_{\ell } |\)) to approximate the \({{\mathbb {L}}}^2-\)norm (resp. the \({{\mathbb {L}}}^{\infty }-\)norm) of (the finite element approximation of) \(\{X_t\), \(t \in [0,T]\}\) from SPDE (1.2). Although both figures illustrate similar behaviours of the time step size, the error and the a posteriori error estimator, the number of time steps needed to stay below the given error threshold \({\mathtt {Tol}}\) in Fig. 7 is larger than in Fig. 6, which is due to the different scalings of the considered norms.

Different choices of \(\pmb {\beta }\) and \(\varepsilon \) affect the total number of steps generated via Algorithm 4.1. A larger size of \(\pmb {\beta }\) increases the convection effect and leads to more time steps; see Figs. 10 and 6. In turn, larger values of \(\varepsilon \) reduce the transport, which requires fewer time steps; see Fig. 9. For different parameters \(\varepsilon \in \{\textit{h},2\textit{h}, 5\textit{h},15\textit{h},30\textit{h},1\}\), Fig. 5 compares the total amount of time steps needed for adaptive and uniform meshes to perform equally well, indicating that the smaller \(\varepsilon \) is, the more savings are obtained via the adaptive Algorithm 4.1 compared to scheme (1.3) with uniform time steps.