Abstract
Partial differential equations (PDEs) are a fundamental tool in the modeling of many real-world phenomena. In a number of such real-world phenomena the PDEs under consideration contain gradient-dependent nonlinearities and are high-dimensional. Such high-dimensional nonlinear PDEs can in nearly all cases not be solved explicitly, and it is one of the most challenging tasks in applied mathematics to solve high-dimensional nonlinear PDEs approximately. It is especially very challenging to design approximation algorithms for nonlinear PDEs for which one can rigorously prove that they do overcome the so-called curse of dimensionality in the sense that the number of computational operations of the approximation algorithm needed to achieve an approximation precision of size \({\varepsilon }> 0\) grows at most polynomially in both the PDE dimension \(d \in \mathbb {N}\) and the reciprocal of the prescribed approximation accuracy \({\varepsilon }\). In particular, to the best of our knowledge there exists no approximation algorithm in the scientific literature which has been proven to overcome the curse of dimensionality in the case of a class of nonlinear PDEs with general time horizons and gradient-dependent nonlinearities. It is the key contribution of this article to overcome this difficulty. More specifically, it is the key contribution of this article (i) to propose a new full-history recursive multilevel Picard approximation algorithm for high-dimensional nonlinear heat equations with general time horizons and gradient-dependent nonlinearities and (ii) to rigorously prove that this full-history recursive multilevel Picard approximation algorithm does indeed overcome the curse of dimensionality in the case of such nonlinear heat equations with gradient-dependent nonlinearities.
1 Introduction
Partial differential equations (PDEs) play a prominent role in the modeling of many real-world phenomena. For instance, PDEs appear in financial engineering in models for the pricing of financial derivatives, PDEs such as the Schrödinger equation appear in quantum physics to describe the wave function of a quantum-mechanical system, PDEs are used in operations research to characterize the value function of control problems, PDEs provide solutions for backward stochastic differential equations (BSDEs) which themselves appear in several models from applications, and stochastic PDEs such as the Zakai equation or the Kushner equation appear in nonlinear filtering problems to describe the density of the state of a physical system with only partial information available.
The PDEs in the above-named models often contain nonlinearities and are typically high-dimensional, where, e.g., in the models from financial engineering the dimension of the PDE usually corresponds to the number of financial assets in the associated hedging or trading portfolio, where, e.g., in quantum physics the dimension of the PDE is, loosely speaking, three times the number of electrons in the considered physical system, where, e.g., in optimal control problems the dimension of the PDE is determined by the dimension of the state space of the control problem, and where, e.g., in nonlinear filtering problems the dimension of the PDE corresponds to the degrees of freedom in the considered physical system.
Such high-dimensional nonlinear PDEs can in nearly all cases not be solved explicitly and it is one of the most challenging tasks in applied mathematics to solve high-dimensional nonlinear PDEs approximately. In particular, it is very challenging to design approximation methods for nonlinear PDEs for which one can rigorously prove that they do overcome the so-called curse of dimensionality in the sense that the number of computational operations of the approximation method needed to achieve an approximation precision of size \({\varepsilon }> 0\) grows at most polynomially in both the PDE dimension \(d \in \mathbb {N}\) and the reciprocal of the prescribed approximation accuracy \({\varepsilon }\).
Recently, several new stochastic approximation methods for certain classes of high-dimensional nonlinear PDEs have been proposed and studied in the scientific literature. In particular, we refer, e.g., to [11, 12, 26, 29, 30, 53] for BSDE-based approximation methods for PDEs in which nested conditional expectations are discretized through suitable regression methods, we refer, e.g., to [10, 39, 41, 42] for branching diffusion approximation methods for PDEs, we refer, e.g., to [1,2,3, 6,7,8, 13, 14, 16, 17, 21, 24, 25, 31, 34,35,36, 40, 43, 48, 50, 52, 54,55,58, 60, 62, 63] for deep learning based approximation methods for PDEs, and we refer to [4, 5, 20, 28, 46, 47] for numerical simulations, approximation results, and extensions of the full-history recursive multilevel Picard approximation methods for PDEs recently introduced in [19, 45]. In the following we abbreviate full-history recursive multilevel Picard as MLP.
In the case of certain nonlinear PDEs, branching diffusion approximation methods are as efficient as plain vanilla Monte Carlo approximations are in the case of linear PDEs, but the error analysis only applies in the case where the time horizon \(T \in (0,\infty )\) and the initial condition, respectively, are sufficiently small, and branching diffusion approximation methods fail to converge in the case where the time horizon \(T \in (0,\infty )\) exceeds a certain threshold (cf., e.g., [41, Theorem 3.12]). For MLP approximation methods it has recently been shown in [4, 45, 46] that such algorithms do indeed overcome the curse of dimensionality for certain classes of gradient-independent PDEs. Numerical simulations for deep learning based approximation methods for nonlinear PDEs in high dimensions are very encouraging (see, e.g., the above-named references [1,2,3, 6,7,8, 13, 14, 16, 17, 21, 24, 25, 31, 34,35,36, 40, 43, 48, 50, 52, 54,55,58, 60, 62, 63]), but so far only a partial error analysis is available for such algorithms (which, in turn, is strongly based on the above-mentioned error analysis for the MLP approximation method; cf. [44] and, e.g., [9, 23, 32, 33, 36, 49, 51, 61, 62]). To sum up, to the best of our knowledge the MLP approximation method (see [45]) is to date the only approximation method in the scientific literature for which it has been shown that it overcomes the curse of dimensionality in the numerical approximation of semilinear PDEs with general time horizons.
The above-mentioned articles [4, 28, 45, 46] prove, however, only in the case of gradient-independent nonlinearities that MLP approximation methods overcome the curse of dimensionality and it remains an open problem to overcome the curse of dimensionality in the case of PDEs with general time horizons and gradient-dependent nonlinearities. This is precisely the subject of this article. More specifically, in this article we propose a new MLP approximation method for nonlinear heat equations with gradient-dependent nonlinearities and the main result of this article, Theorem 5.2 in Sect. 5 below, proves that the number of realizations of scalar random variables required by this MLP approximation method to achieve a precision of size \({\varepsilon }> 0\) grows at most polynomially in both the PDE dimension \(d \in \mathbb {N}\) and the reciprocal of the prescribed approximation accuracy \({\varepsilon }\). To illustrate the findings of the main result of this article in more detail, we now present in the following theorem a special case of Theorem 5.2.
Theorem 1.1
Let \(T,\delta ,\lambda \in (0,\infty )\), let \(u_d = ( u_d(t,x) )_{ (t,x) \in [0,T] \times \mathbb {R}^d }\in C^{1,2}([0,T]\times \mathbb {R}^d,\mathbb {R})\), \(d\in \mathbb {N}\), be at most polynomially growing functions, let \(f_d \in C( \mathbb {R}\times \mathbb {R}^d,\mathbb {R})\), \(d\in \mathbb {N}\), let \(g_d \in C( \mathbb {R}^d,\mathbb {R})\), \(d\in \mathbb {N}\), let \(L_{d,i}\in \mathbb {R}\), \(d,i \in \mathbb {N}\), assume for all \(d\in \mathbb {N}\), \(t\in [0,T)\), \(x=(x_1,x_2, \ldots ,x_d)\), \(\mathfrak x=(\mathfrak x_1,\mathfrak x_2, \ldots ,\mathfrak x_d)\), \(z=(z_1,z_2,\ldots ,z_d)\), \(\mathfrak {z}=(\mathfrak z_1, \mathfrak z_2, \ldots , \mathfrak z_d)\in \mathbb {R}^d\), \(y,\mathfrak {y} \in \mathbb {R}\) that
and \( d^{-\lambda }(|g_d(0)|+|f_d(0,0)|)+\sum _{i=1}^d L_{d,i}\le \lambda , \) let \( ( \Omega , \mathcal {F}, {\mathbb {P}}) \) be a probability space, let \( \Theta = \cup _{ n \in \mathbb {N}} \mathbb {Z}^n \), let \( Z^{d, \theta } :\Omega \rightarrow \mathbb {R}^d \), \(d\in \mathbb {N}\), \( \theta \in \Theta \), be i.i.d. standard normal random variables, let \(\mathfrak {r}^\theta :\Omega \rightarrow (0,1)\), \(\theta \in \Theta \), be i.i.d. random variables, assume for all \(b\in (0,1)\) that \({\mathbb {P}}(\mathfrak {r}^0\le b)=\sqrt{b}\), assume that \((Z^{d,\theta })_{(d, \theta ) \in \mathbb {N}\times \Theta }\) and \((\mathfrak {r}^\theta )_{ \theta \in \Theta }\) are independent, let \( \mathbf{U}_{ n,M}^{d,\theta } = ( \mathbf{U}_{ n,M}^{d,\theta , 0},\mathbf{U}_{ n,M}^{d,\theta , 1},\ldots ,\mathbf{U}_{ n,M}^{d,\theta , d} ) :(0,T]\times \mathbb {R}^d\times \Omega \rightarrow \mathbb {R}^{1+d} \), \(n\in \mathbb {Z}\), \(M,d\in \mathbb {N}\), \(\theta \in \Theta \), satisfy for all \( n,M,d \in \mathbb {N}\), \( \theta \in \Theta \), \( t\in (0,T]\), \(x \in \mathbb {R}^d\) that \( \mathbf{U}_{-1,M}^{d,\theta }(t,x)=\mathbf{U}_{0,M}^{d,\theta }(t,x)=0\) and
and for every \(d,M,n \in \mathbb {N}\) let \({\text {RV}}_{d,n,M}\in \mathbb {N}\) be the number of realizations of scalar random variables which are used to compute one realization of \( \mathbf{U}_{n,M}^{d,0}(T,0):\Omega \rightarrow \mathbb {R}\) (cf. (176) for a precise definition). Then there exist \(c\in \mathbb {R}\) and \(N=(N_{d,{\varepsilon }})_{(d, {\varepsilon }) \in \mathbb {N}\times (0,1]}:\mathbb {N}\times (0,1] \rightarrow \mathbb {N}\) such that for all \(d\in \mathbb {N}\), \({\varepsilon }\in (0,1]\) it holds that \( \sum _{n=1}^{N_{d,{\varepsilon }}}{\text {RV}}_{d,n,\lfloor n^{1/4} \rfloor } \le c d^c \varepsilon ^{-(2+\delta )}\) and
Theorem 1.1 is an immediate consequence of Corollary 5.4 in Sect. 5 below. Corollary 5.4, in turn, follows from Theorem 5.2 in Sect. 5, which is the main result of this article. In the following we add a few comments regarding some of the mathematical objects appearing in Theorem 1.1 above.
The real number \(T \in (0,\infty )\) in Theorem 1.1 above describes the time horizon of the PDE under consideration (see (2) in Theorem 1.1 above). Theorem 1.1 reveals under suitable Lipschitz assumptions that the MLP approximation method in (3) above overcomes the curse of dimensionality in the numerical approximation of the gradient-dependent semilinear PDEs in (2) above. Theorem 1.1 even proves that the computational effort of the MLP approximation method in (3) required to obtain a precision of size \({\varepsilon }\in (0,1]\) is bounded by \(c d^c {\varepsilon }^{ -( 2 + \delta ) }\) where \(c\in \mathbb {R}\) is a constant which is completely independent of the PDE dimension \(d\in \mathbb {N}\) and where \(\delta \in (0,\infty )\) is an arbitrarily small positive real number which describes the convergence order which we lose when compared to standard Monte Carlo approximations of linear heat equations. The real number \(\lambda \in (0,\infty )\) in Theorem 1.1 above is an arbitrarily large constant which we employ to formulate the Lipschitz and growth assumptions in Theorem 1.1 (see (1) and below (2) in Theorem 1.1 above).
The functions \(u_d :[0,T]\times \mathbb {R}^d \rightarrow \mathbb {R}\), \(d \in \mathbb {N}\), in Theorem 1.1 above are the solutions of the PDEs under consideration; see (2) in Theorem 1.1 above. Note that for every \(d \in \mathbb {N}\) we have that (2) is a PDE where the time variable \(t \in [0,T]\) takes values in the interval [0, T] and where the space variable \(x \in \mathbb {R}^d\) takes values in the d-dimensional Euclidean space \(\mathbb {R}^d\). The functions \(f_d :\mathbb {R}\times \mathbb {R}^d\rightarrow \mathbb {R}\), \(d \in \mathbb {N}\), describe the nonlinearities of the PDEs in (2) and the functions \(g_d:\mathbb {R}^d\rightarrow \mathbb {R}\), \(d\in \mathbb {N}\), describe the initial conditions of the PDEs in (2). The quantities \(\lfloor n^{ 1 / 4 } \rfloor \), \(n \in \mathbb {N},\) in (4) in Theorem 1.1 above describe evaluations of the standard floor function in the sense that for all \(n \in \mathbb {N}\) it holds that \(\lfloor n^{ 1 / 4 } \rfloor = \max ( [0,n^{ 1 / 4 }]\cap \mathbb {N})\). Note that in this work for every \(d\in \mathbb {N}\), \(a\in \mathbb {R}\), \(b=(b_1,b_2,\ldots , b_d) \in \mathbb {R}^d\) we sometimes write \((a,b)\in \mathbb {R}^{d+1}\) as an abbreviation for the vector \((a,b_1,b_2,\ldots ,b_d)\in \mathbb {R}^{d+1}\). In particular, observe that for all \(d\in \mathbb {N}\), \(x=(x_1,x_2,\ldots , x_d),z=(z_1,z_2,\ldots ,z_d)\in \mathbb {R}^d\), \(t\in (0,T]\) it holds that \((g_d(x+[2t]^{1/2}z)-g_d(x))(1,[2t]^{-1/2}z)=(g_d(x+z\sqrt{2t})-g_d(x))(1,z_1\sqrt{(2t)^{-1}},z_2\sqrt{(2t)^{-1}},\ldots , z_d\sqrt{(2t)^{-1}})\) (cf. the first line in (3) above).
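As an illustrative side remark (a standard Gaussian integration-by-parts identity which is not taken from the results of this article), we indicate why the weight vector \((1,[2t]^{-1/2}z)\) produces approximations of both \(u_d\) and \(\nabla _x u_d\): under suitable differentiability and integrability assumptions on \(g_d\) it holds for all \(d\in \mathbb {N}\), \(t\in (0,T]\), \(x\in \mathbb {R}^d\) that
$$\begin{aligned} \nabla _x\, {\mathbb {E}}\big [ g_d\big (x+[2t]^{1/2}Z^{d,0}\big )\big ] = {\mathbb {E}}\big [ g_d\big (x+[2t]^{1/2}Z^{d,0}\big )\, [2t]^{-1/2}Z^{d,0}\big ] = {\mathbb {E}}\big [ \big (g_d\big (x+[2t]^{1/2}Z^{d,0}\big )-g_d(x)\big )\, [2t]^{-1/2}Z^{d,0}\big ] , \end{aligned}$$
where the last equality uses that \({\mathbb {E}}[Z^{d,0}]=0\); subtracting \(g_d(x)\) does not change the expectation but typically reduces the variance of the associated Monte Carlo estimator.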
Theorem 1.1 proves under suitable assumptions (cf. (1) in Theorem 1.1 above for details) that the MLP approximation method in (3) above overcomes the curse of dimensionality in the numerical approximation of the gradient-dependent semilinear heat equations in (2) above. In order to give the reader a better understanding of why the approximation scheme in (3) above is capable of overcoming the curse of dimensionality in the numerical approximation of gradient-dependent semilinear PDEs, we now outline a brief derivation of the approximation scheme in (3) in the special case where \(f_d\in C(\mathbb {R}\times \mathbb {R}^d,\mathbb {R})\), \(d\in \mathbb {N}\), and \(g_d\in C(\mathbb {R}^d,\mathbb {R})\), \(d\in \mathbb {N}\), are globally bounded functions. For this observe that the Feynman–Kac formula and the assumptions of Theorem 1.1 ensure that the solutions \(u_d :[0,T]\times \mathbb {R}^d \rightarrow \mathbb {R}\), \(d \in \mathbb {N}\), of the PDEs in (2) satisfy that for all \(d\in \mathbb {N}\), \(t\in [0,T]\), \(x\in \mathbb {R}^d\) it holds that
Note that (5) are stochastic fixed-point equations with the solutions \(u_d :[0,T]\times \mathbb {R}^d \rightarrow \mathbb {R}\), \(d \in \mathbb {N}\), of the PDEs in (2) being the fixed points of the stochastic fixed-point equations. MLP approximation methods are powerful approximation techniques for stochastic fixed-point equations but the MLP approach cannot be directly applied to (5) as the gradients \((\nabla _x u_d)( t, x )\), \(t \in [0,T]\), \(x \in \mathbb {R}^d\), of the solutions \(u_d :[0,T]\times \mathbb {R}^d \rightarrow \mathbb {R}\), \(d \in \mathbb {N}\), of the PDEs in (2) appear on the right-hand side of (5) but not on the left-hand side of (5).
To get over this obstacle, we reformulate (5) by means of the Bismut–Elworthy–Li formula (cf., e.g., Da Prato and Zabczyk [15, Theorem 2.1]) to obtain other stochastic fixed-point equations with no derivatives appearing on the right-hand side of the stochastic fixed point equations so that the MLP machinery can be brought into play. More formally, note that the Bismut–Elworthy–Li formula ensures that the solutions \(u_d :[0,T]\times \mathbb {R}^d \rightarrow \mathbb {R}\), \(d \in \mathbb {N}\), of the PDEs in (2) satisfy that for all \(d\in \mathbb {N}\), \(t\in (0,T]\), \(x\in \mathbb {R}^d\) it holds that
Next let \(\mathbf{u}_d = ( \mathbf{u}_{ d, 1 }, \mathbf{u}_{ d, 2 }, \ldots , \mathbf{u}_{ d, d + 1 } ) :[0,T] \times \mathbb {R}^d \rightarrow \mathbb {R}^{ d + 1 },\) \(d \in \mathbb {N}\), satisfy for all \(d \in \mathbb {N}\), \(k \in \{ 2, 3, \ldots , d + 1 \}\), \(x = (x_1,x_2,\ldots ,x_d) \in \mathbb {R}^d\), \(t \in [0,T]\) that \(\mathbf{u}_{ d, 1 }( t, x ) = u_d(t,x)\) and \( \mathbf{u}_{ d, k }( t, x ) = (\frac{\partial }{\partial x_{k-1}})u_d(t,x)\) and observe that (5) and (6) reveal that for all \(d\in \mathbb {N}\), \(t\in (0,T]\), \(x\in \mathbb {R}^d\) it holds that
Observe that (7) are stochastic fixed-point equations with no derivatives appearing on the right-hand side of (7) so that we are now in a position to apply the MLP approach to (7). Next we briefly sketch how MLP approximations for (7) can be obtained and, thereby, we briefly outline the derivation of the MLP approximation schemes in (3) above (cf., e.g., E et al. [18, Section 1.2]). For this observe that (7) can also be written as the fixed-point equation \(u=\Phi (u)\) where \(\Phi \) is the self-mapping on the set of all bounded functions in \(C((0, T ] \times \mathbb {R}^d , \mathbb {R}^{d+1})\) which is described through the right-hand side of (7). We now define Picard iterates \( \mathfrak {u}_n \), \( n \in \mathbb {N}_0 \), by means of the recursion that for all \(n \in \mathbb {N}\) it holds that \( \mathfrak {u}_0 = 0 \) and \( \mathfrak {u}_n = \Phi ( \mathfrak {u}_{ n - 1 } ) \). Next we observe that a telescoping sum argument and the fact that for all \(k \in \mathbb {N}\) it holds that \(\mathfrak {u}_k = \Phi ( \mathfrak {u}_{ k - 1 } )\) demonstrate that for all \( n \in \mathbb {N}\) it holds that
Roughly speaking, the MLP approximations in (3) can then be derived by approximating the expectations and temporal Lebesgue integrals in (8) within the fixed point function \(\Phi \) through appropriate Monte Carlo approximations with different numbers of Monte Carlo samples (cf., e.g., Heinrich [37], Heinrich and Sindambiwe [38], and Giles [27] for related multilevel Monte Carlo approximations).
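To make the structure of the resulting approximations more concrete, the following minimal Python sketch implements an MLP recursion of the type outlined above for the fixed-point formulation in which the function \(g\) enters at the terminal time \(T\) (the formulation analyzed in Setting 3.1 in Sect. 3 below) and for the distribution \({\mathbb {P}}(\mathfrak {r}^0\le b)=\sqrt{b}\) appearing in Theorem 1.1. All function names (mlp, f, g) are ours, the concrete normalizations (in particular the centering of the \(g\)-term by \(g(x)\)) are assumptions made for illustration only, and the sketch is not meant to reproduce (3) line by line.

```python
import numpy as np

def mlp(n, M, t, x, T, f, g, rng):
    """Illustrative sketch of one realization of an MLP approximation.

    Returns a vector in R^{d+1} whose first entry approximates u(t, x) and
    whose remaining d entries approximate (nabla_x u)(t, x)."""
    d = x.size
    out = np.zeros(d + 1)
    if n <= 0:  # corresponds to U_{-1,M} = U_{0,M} = 0
        return out
    # g-term: Monte Carlo average with the Bismut-Elworthy-Li weight (1, dW/(T-t))
    out[0] = g(x)
    for _ in range(M ** n):
        dW = rng.normal(0.0, np.sqrt(T - t), size=d)
        weight = np.concatenate(([1.0], dW / (T - t)))
        out += (g(x + dW) - g(x)) * weight / (M ** n)
    # multilevel correction terms for the nonlinearity f = f(s, y, (u, grad u))
    for l in range(n):
        mc = M ** (n - l)
        for _ in range(mc):
            r = (1.0 - rng.random()) ** 2            # P(r <= b) = sqrt(b), cf. Theorem 1.1
            dens = 0.5 / (np.sqrt(r) * (T - t))      # density of the random time s at the sampled value
            s = t + (T - t) * r                      # random intermediate time
            dW = rng.normal(0.0, np.sqrt(s - t), size=d)
            y = x + dW
            weight = np.concatenate(([1.0], dW / (s - t)))
            val = f(s, y, mlp(l, M, s, y, T, f, g, rng))
            if l > 0:
                val -= f(s, y, mlp(l - 1, M, s, y, T, f, g, rng))
            out += val * weight / (dens * mc)
    return out
```

The two recursive calls at level \(l\) deliberately use fresh randomness, mirroring the fact that in the MLP approximations of this article the two Picard iterates entering each multilevel difference are evaluated with independent families of random variables.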
Theorem 1.1 above in this introductory section is a special case of the more general approximation results in Sect. 5 in this article, and these more general approximation results treat more general PDEs than (2) as well as more general MLP approximation methods than (3). More specifically, in (2) above we have for every \(d \in \mathbb {N}\) that the nonlinearity \(f_d\) depends only on the PDE solution \(u_d\) and the spatial gradient \(\nabla _x u_d\) of the PDE solution but not on \(t \in [0,T]\) and \(x \in \mathbb {R}^d\), while in Corollary 5.1 and Theorem 5.2 in Sect. 5 the nonlinearities of the PDEs may also depend on \(t \in [0,T]\) and \(x \in \mathbb {R}^d\). Corollary 5.1 and Theorem 5.2 also provide error analyses for a more general class of MLP approximation methods. In particular, in Theorem 1.1 above the family \(\mathfrak {r}^\theta :\Omega \rightarrow (0,1)\), \(\theta \in \Theta \), of i.i.d. random variables satisfies for all \(b\in (0,1)\) that \({\mathbb {P}}(\mathfrak {r}^0\le b)=\sqrt{b}\) and Corollary 5.1 and Theorem 5.2 are proved under the more general hypothesis that there exists \(\alpha \in (0,1)\) such that for all \(b\in (0,1)\) it holds that \({\mathbb {P}}(\mathfrak {r}^0\le b)=b^\alpha \) (see, e.g., (160) in Theorem 5.2). It is a crucial observation of this article that, roughly speaking, in the case of PDEs with gradient-dependent nonlinearities time integrals in the semigroup formulations of the PDEs under consideration cannot be approximated with continuous uniformly distributed random variables (corresponding to the case \(\alpha =1\)) as was done, e.g., in [4, 45, 46]; cf. Sect. 5 for a more detailed discussion of this observation. Furthermore, the more general approximation result in Corollary 5.1 in Sect. 5 also provides an explicit upper bound for the constant \(c \in \mathbb {R}\) in Theorem 1.1 above (see (144) in Corollary 5.1).
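For implementations it may be worth noting (a standard inverse-transform observation, added here only for the reader's convenience and with \(V\) denoting an auxiliary random variable which does not appear in the results of this article) that if \(V:\Omega \rightarrow (0,1)\) is a continuous uniformly distributed random variable and \(\alpha \in (0,1)\), then for all \(b\in (0,1)\) it holds that
$$\begin{aligned} {\mathbb {P}}\big (V^{1/\alpha }\le b\big )={\mathbb {P}}\big (V\le b^{\alpha }\big )=b^{\alpha } , \end{aligned}$$
so that \(V^{1/\alpha }\) has exactly the distribution required for \(\mathfrak {r}^0\) in Corollary 5.1 and Theorem 5.2; in the special case \(\alpha =\tfrac{1}{2}\) corresponding to Theorem 1.1 one may thus simply take \(\mathfrak {r}^0=V^2\).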
We also refer to the references [5] and [20] for numerical simulations for MLP approximation methods for semilinear high-dimensional PDEs. We would like to point out that there are two variants of MLP approximation methods in the scientific literature. First, MLP approximation methods with the temporal Lebesgue integrals being approximated by deterministic quadrature rules have been proposed in [19, 20] in 2016. In [20] this first variant of MLP approximation methods has also been tested numerically, also in the case of gradient-dependent nonlinearities (see [20, Section 3.2]), but the existing convergence results for this first variant of MLP approximation methods are only applicable under very strong regularity assumptions and even exclude many semilinear heat equations with globally Lipschitz continuous gradient-independent nonlinearities (cf. [19, Corollary 3.19] and [47, Corollary 4.8]). Thereafter, MLP approximation methods with the temporal Lebesgue integrals being approximated by Monte Carlo approximations have been proposed in [45] in 2018 and further studied and extended in [4, 28, 46] as well as in the present article. This second variant of MLP approximation methods has been shown to overcome the curse of dimensionality under only mild (local) Lipschitz continuity assumptions on the nonlinearity of the PDEs under consideration (see, e.g., [45, Theorem 1.1], [4, Theorem 1.1], [46, Theorem 1.1], and [28, Corollary 1.2]) and this second variant of MLP approximation methods has also been tested numerically in [5] but only in the case of gradient-independent nonlinearities and it remains a topic of future research to perform numerical simulations for the MLP approximation methods proposed in this article (belonging somehow to the ’second variant’ of MLP approximation methods) in the case of PDEs with gradient-dependent nonlinearities.
The remainder of this article is organized as follows. In Sect. 2 we establish a few identities and upper bounds for certain iterated deterministic integrals. The results of Sect. 2 are then used in Sect. 3 in which we introduce and analyze the considered MLP approximation methods. In Sect. 4 we establish suitable a priori bounds for exact solutions of PDEs of the form (2). In Sect. 5 we combine the findings of Sects. 3 and 4 to establish in Theorem 5.2 below the main approximation result of this article.
2 Analysis of Certain Deterministic Iterated Integrals
In this section we establish in Corollary 2.5 below an upper bound for products of certain independent random variables. Corollary 2.5 below is a central ingredient in our error analysis for MLP approximations in Sect. 3 below. Our proof of Corollary 2.5 employs the essentially well-known factorization lemma for certain conditional expectations in Lemma 2.4 below as well as a few elementary identities and estimates for certain deterministic iterated integrals which are provided in Lemmas 2.1, 2.2, and Corollary 2.3 below.
2.1 Identities for Certain Deterministic Iterated Integrals
Lemma 2.1
Let \(T,\beta ,\gamma \in (0,\infty )\), let \(\rho :(0,1)\rightarrow (0,\infty )\) be \({\mathcal {B}}((0,1))/{\mathcal {B}}((0,\infty ))\)-measurable, and let \(\varrho :[0,T]^2\rightarrow (0,\infty )\) satisfy for all \(t\in [0,T)\), \(s\in (t,T]\) that \(\varrho (t,s)=\tfrac{1}{T-t}\rho (\tfrac{s-t}{T-t})\). Then it holds for all \(j\in \mathbb {N}\), \(s_0\in [0,T)\) that
Proof of Lemma 2.1
We prove (9) by induction on \(j\in \mathbb {N}\). For the base case \(j=1\) note that integration by substitution yields that for all \(s_0\in [0,T)\) it holds that
This proves (9) in the base case \(j=1\). For the induction step \(\mathbb {N}\ni j\rightsquigarrow j+1\in \mathbb {N}\) note that the induction hypothesis and integration by substitution imply for all \(s_0\in [0,T)\) that
Induction thus proves (9). This completes the proof of Lemma 2.1. \(\square \)
Lemma 2.2
Let \(\alpha \in (0,1)\), \(T,\gamma \in (0,\infty )\), \(\beta \in (0,\alpha \gamma +1)\) and let \(\rho :(0,1)\rightarrow (0,\infty )\) and \(\varrho :[0,T]^2\rightarrow (0,\infty )\) satisfy for all \(r\in (0,1)\), \(t\in [0,T)\), \(s\in (t,T]\) that \(\rho (r)=\frac{1-\alpha }{r^\alpha }\) and \(\varrho (t,s)=\tfrac{1}{T-t}\rho (\tfrac{s-t}{T-t})\). Then it holds for all \(j\in \mathbb {N}\), \(s_0\in [0,T)\) that
Proof of Lemma 2.2
Throughout this proof let \(B :(0,\infty )\times (0,\infty )\rightarrow \mathbb {R}\) satisfy for all \(x,y\in (0,\infty )\) that
Note that (13) and the fact that for all \(x, y \in (0,\infty )\) it holds that \(B(x,y) = \frac{\Gamma (x)\Gamma (y)}{ \Gamma (x+y) }\) ensure that for all \(i\in \mathbb {N}_0\) it holds that
Lemma 2.1 hence implies that for all \(j\in \mathbb {N}\), \(s_0\in [0,T)\) it holds that
This establishes (12). The proof of Lemma 2.2 is thus completed. \(\square \)
2.2 Estimates for Certain Deterministic Iterated Integrals
Corollary 2.3
Let \(\alpha \in (0,1)\), \(T,\gamma \in (0,\infty )\), \(\beta \in [\alpha \gamma ,\alpha \gamma +1]\) and let \(\rho :(0,1)\rightarrow (0,\infty )\) and \(\varrho :[0,T]^2\rightarrow (0,\infty )\) satisfy for all \(r\in (0,1)\), \(t\in [0,T)\), \(s\in (t,T]\) that \(\rho (r)=\frac{1-\alpha }{r^\alpha }\) and \(\varrho (t,s)=\tfrac{1}{T-t}\rho (\tfrac{s-t}{T-t})\). Then it holds for all \(j\in \mathbb {N}\), \(s_0\in [0,T)\) that
Proof of Corollary 2.3
First, observe that Wendel’s inequality for the gamma function (see, e.g., Wendel [64] and Qi [59, Section 2.1]) ensures that for all \(x\in (0,\infty )\), \(s\in [0,1]\) it holds that
Moreover, note that the fact that for all \(x\in (0,\infty )\) it holds that \(\ln ^\prime (x)=x^{-1}\) demonstrates that for all \(j\in \mathbb {N}\), \(\lambda \in (0,\infty )\) it holds that
Combining this, (17), and the fact that \(\alpha \gamma -\beta +1\in [0,1]\) with the fact that for all \(x\in [0,\infty )\) it holds that \(1+x\le e^{x}\) proves that for all \(j\in \mathbb {N}\) it holds that
Lemma 2.2 hence implies that for all \(j\in \mathbb {N}\), \(s_0\in [0,T)\) it holds that
This establishes (16). The proof of Corollary 2.3 is thus completed. \(\square \)
2.3 Estimates for Products of Certain Independent Random Variables
Lemma 2.4
Let \(T\in (0,\infty )\), \(d\in \mathbb {N}\), \(F\in C((0,1)\times [0,T) \times \mathbb {R}^d, [0,\infty ))\), let \((\Omega ,\mathcal F,{\mathbb {P}})\) be a probability space, let \(\rho :\Omega \rightarrow (0,1)\) and \(\tau :\Omega \rightarrow (0,T)\) be random variables, let \(W:[0,T]\times \Omega \rightarrow \mathbb {R}^d\) be a standard Brownian motion with continuous sample paths, let \(f:[0,T) \rightarrow [0,\infty ]\) satisfy for all \(t\in [0,T)\) that \(f(t)={\mathbb {E}}[F(\rho , t, W_{t+(T-t)\rho }-W_t)]\), let \({\mathcal {G}}\subseteq \mathcal F\) be a sigma-algebra, let \(\mathcal H=\sigma ({\mathcal {G}} \cup \sigma (\tau ,(W_{\min \{s,\tau \}})_{s\in [0,T]}))\), and assume that \(\rho \), \(\tau \), W, and \({\mathcal {G}}\) are independent. Then it holds \({\mathbb {P}}\)-a.s. that
Proof of Lemma 2.4
First, note that independence of \(\rho \), \(\tau \), W, and \({\mathcal {G}}\) ensures that it holds \({\mathbb {P}}\)-a.s. that
Next observe that independence of \(\rho \), \(\tau \), and W and the tower property of conditional expectations (cf., e.g., Hutzenthaler et al. [45, Lemma 2.2] (applied with \({\mathcal {G}}=\sigma (\rho , (W_{s})_{s\in [0,T]})\), \(S=(0,T)\), \({\mathcal {S}}={\mathcal {B}}((0,T))\), \(U(t,\omega )={\mathbf {1}}_{A}(t){\mathbf {1}}_{B}((W_{\min \{s,t\}}(\omega ))_{s\in [0,T]})) F(\rho (\omega ),t, W_{t+(T-t)\rho (\omega )}(\omega )-W_t(\omega ))\), \(Y(\omega )=\tau (\omega )\) for \(t\in (0,T)\), \(\omega \in \Omega \) in the notation of [45, Lemma 2.2])) prove that for all \(A\in {\mathcal {B}}((0,T))\), \(B\in {\mathcal {B}}(C([0,T],\mathbb {R}^d))\) it holds that
The fact that for all \(t\in [0,T]\), \(r\in [t,T]\) it holds that \((W_{\min \{s,t\}})_{s\in [0,T]}\) and \(W_r-W_t\) are independent hence proves that for all \(A\in {\mathcal {B}}((0,T))\), \(B\in {\mathcal {B}}(C([0,T],\mathbb {R}^d))\) it holds that
This and, e.g., Hutzenthaler et al. [45, Lemma 2.2] (applied with \({\mathcal {G}}=\sigma (\rho , (W_{s})_{s\in [0,T]})\), \(S=(0,T)\), \({\mathcal {S}}={\mathcal {B}}((0,T))\), \(U(t,\omega )={\mathbf {1}}_{A}(t){\mathbf {1}}_{B}((W_{\min \{s,t\}}(\omega ))_{s\in [0,T]})) g(t)\), \(Y(\omega )=\tau (\omega )\) for \(t\in (0,T)\), \(\omega \in \Omega \) in the notation of [45, Lemma 2.2]) proves that for all \(A\in {\mathcal {B}}((0,T))\), \(B\in {\mathcal {B}}(C([0,T],\mathbb {R}^d))\) it holds that
Combining this with (22) establishes (21). The proof of Lemma 2.4 is thus completed. \(\square \)
Corollary 2.5
Let \(T\in (0,\infty )\), \(d\in \mathbb {N}\), \(j\in \mathbb {N}_0\), \(\mathbf {e}_1=(1,0,0\ldots ,0)\), \(\mathbf {e}_2=(0,1,0,\ldots ,0)\), ..., \(\mathbf {e}_{d+1}=(0,0,\ldots , 0,1)\in \mathbb {R}^{d+1}\), \(\nu _0,\nu _1, \ldots , \nu _j\in \{1,2,\ldots , d+1\}\), \(\alpha \in (0,1)\), \(p\in (1,\infty )\) satisfy \(\alpha (p-1)\le \frac{p}{2} \le \alpha (p-1)+1\), let \(\langle \cdot , \cdot \rangle :\mathbb {R}^{d+1}\times \mathbb {R}^{d+1} \rightarrow \mathbb {R}\) be the standard scalar product on \(\mathbb {R}^{ d + 1 }\), let \( ( \Omega , \mathcal {F}, {\mathbb {P}}) \) be a probability space, let \( W=(W^1,W^2,\ldots ,W^d) :[0,T] \times \Omega \rightarrow \mathbb {R}^d \) be a standard Brownian motion with continuous sample paths, let \(\mathfrak {r}^{(n)}:\Omega \rightarrow (0,1)\), \(n\in \mathbb {N}_0\), be i.i.d. random variables, assume that W and \((\mathfrak {r}^{(n)})_{n\in \mathbb {N}_0}\) are independent, let \(\rho :(0,1)\rightarrow (0,\infty )\) and \(\varrho :[0,T]^2\rightarrow (0,\infty )\) satisfy for all \(b\in (0,1)\), \(t\in [0,T)\), \(s\in (t,T]\) that \(\rho (b)=\frac{1-\alpha }{b^\alpha }\), \({\mathbb {P}}(\mathfrak {r}^{(0)}\le b)=\int _0^b \rho (u)\,du\), and \(\varrho (t,s)=\tfrac{1}{T-t}\rho (\tfrac{s-t}{T-t})\), let \(S:\mathbb {N}_0 \times [0,T)\times \Omega \rightarrow [0,T)\) satisfy for all \(n \in \mathbb {N}_0\), \(t\in [0,T)\) that \(S(0,t)=t\) and \(S(n+1,t)=S(n,t)+(T-S(n,t))\mathfrak {r}^{(n)}\), and let \(t\in [0,T)\). Then
Proof of Corollary 2.5
Throughout this proof let \(\mathbb F_n\subseteq \mathcal F\), \(n\in \mathbb {N}_0\), satisfy for all \(n\in \mathbb {N}\) that \(\mathbb F_0=\{\emptyset , \Omega \}\) and that \(\mathbb F_n=\sigma (\mathbb F_{n-1} \cup \sigma (S(n,t), (W_{\min \{s,S(n,t)\}})_{s\in [0,T]}))\) and let \(v = ( v_1, v_2, \ldots , v_d) \in \mathbb {R}^d\) satisfy \(v_1 = v_2 = \cdots = v_{ d } = 1\). Note that for all \(r\in [0,T)\), \(s\in [r,T]\), \(i\in \{1,2,\ldots ,d\}\), \(n\in \mathbb {N}\) it holds that \(S(n,r)>S(n-1,r)\) and
Next we claim that for all \(k\in \{1,2,\ldots ,j+1\}\) it holds \({\mathbb {P}}\)-a.s. that
We prove (28) by backward induction on \(k\in \{1,2,\ldots ,j+1\}\). For the base case \(k=j+1\) note that the fact that \(S(j+1,t)=S(j,t)+(T-S(j,t))\mathfrak {r}^{(j)}\), Lemma 2.4 (applied with \(F(r,s,x)=\big |\frac{1}{\varrho (s,s+(T-s)r)}\big \langle \mathbf {e}_{\nu _j},\big ( 1, \tfrac{ x }{(T-s)r } \big )\big \rangle \big |^p\), \(\rho =\mathfrak {r}^{(j)}\), \(\tau =S(j,t)\), \({\mathcal {G}}=\mathbb {F}_{j-1}\) for \(r\in (0,1)\), \(s\in [0,T)\), \(x\in \mathbb {R}^d\) in the notation of Lemma 2.4), e.g., Hutzenthaler et al. [45, Lemma 2.3], the hypothesis that W and \(\mathfrak {r}^{(j)}\) are independent, (27), and the fact that for all \(r\in [0,T)\), \(s\in (r,T]\) it holds that \(\varrho (r,s)=\frac{1}{T-r}\rho (\frac{s-r}{T-r})\) ensure that it holds \({\mathbb {P}}\)-a.s. that
This establishes (28) in the base case \(k=j+1\). For the induction step \(\{2,3,\ldots ,j+1\} \ni k+1\rightsquigarrow k \in \{1,2,\ldots ,j\}\) assume that there exists \(k\in \{1,2,\ldots ,j\}\) which satisfies that
Observe that the tower property, the fact that the random variable \(\frac{1}{\varrho (S(k-1,t),S(k,t))} \big ( 1, \frac{ W_{S(k,t)}- W_{S(k-1,t)} }{S(k,t)-S(k-1,t) } \big ) \) is \(\mathbb F_k/{\mathcal {B}}(\mathbb {R})\)-measurable, and (30) ensure that it holds \({\mathbb {P}}\)-a.s. that
The fact that \(S(k,t)=S(k-1,t)+(T-S(k-1,t))\mathfrak {r}^{(k-1)}\) and Lemma 2.4 (applied with \(\rho =\mathfrak {r}^{(k-1)}\), \(\tau =S(k-1,t)\), \({\mathcal {G}}=\mathbb {F}_{k-2}\) in the notation of Lemma 2.4) hence prove that it holds \({\mathbb {P}}\)-a.s. that
This, e.g., Hutzenthaler et al. [45, Lemma 2.3], the hypothesis that W and \(\mathfrak {r}^{(k-1)}\) are independent, (27), and the fact that for all \(r\in [0,T)\), \(s\in (r,T]\) it holds that \(\varrho (r,s)=\frac{1}{T-r}\rho (\frac{s-r}{T-r})\) assure that it holds \({\mathbb {P}}\)-a.s. that
Induction thus proves (28). Next observe that (28) implies that
Corollary 2.3 (applied with \(\beta =\frac{p}{2}\) and \(\gamma =p-1\) in the notation of Corollary 2.3), the fact that \(\frac{p}{2}\in [\alpha (p-1),\alpha (p-1)+1]\), and the fact that \((\frac{p}{2}-\alpha (p-1))(1-(\frac{p}{2}-\alpha (p-1)))\le \frac{1}{4}\) therefore show that
Combining this with the fact that \(\alpha (p-1)-\frac{p}{2}+1\le p-1-\frac{p}{2}+1=\frac{p}{2}\) establishes (26). The proof of Corollary 2.5 is thus completed. \(\square \)
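For readers who wish to experiment with the objects appearing in Corollary 2.5, the following short Python sketch (an illustration under our own naming conventions, not part of the proof) simulates the random time chain \(S(0,t),S(1,t),\ldots \) together with the reweighted vectors \(\frac{1}{\varrho (S(k-1,t),S(k,t))}\big (1,\frac{W_{S(k,t)}-W_{S(k-1,t)}}{S(k,t)-S(k-1,t)}\big )\) that occur in the proof above.

```python
import numpy as np

def rho(b, alpha):
    # density of r^(0) on (0,1): rho(b) = (1 - alpha) / b**alpha
    return (1.0 - alpha) / b ** alpha

def varrho(t, s, T, alpha):
    # varrho(t, s) = rho((s - t) / (T - t)) / (T - t)
    return rho((s - t) / (T - t), alpha) / (T - t)

def simulate_chain(j, t, T, d, alpha, rng):
    """Simulate S(0,t), ..., S(j+1,t) and the reweighted vectors of Corollary 2.5."""
    times = [t]
    weights = []
    for _ in range(j + 1):
        # inverse transform for P(r <= b) = b**(1 - alpha)
        r = (1.0 - rng.random()) ** (1.0 / (1.0 - alpha))
        s_new = times[-1] + (T - times[-1]) * r
        dW = rng.normal(0.0, np.sqrt(s_new - times[-1]), size=d)
        w = np.concatenate(([1.0], dW / (s_new - times[-1])))
        weights.append(w / varrho(times[-1], s_new, T, alpha))
        times.append(s_new)
    return np.array(times), weights

rng = np.random.default_rng(0)
times, weights = simulate_chain(j=3, t=0.0, T=1.0, d=5, alpha=0.5, rng=rng)
print(times)       # strictly increasing random times in [0, T)
print(weights[0])  # first reweighted vector (1, dW/ds) / varrho
```

Averaging suitable products of the components of these vectors over many independent simulations gives a Monte Carlo impression of the quantities that Corollary 2.5 bounds.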
3 Full-History Recursive Multilevel Picard Approximation Methods
In this section we introduce and analyze a class of new MLP approximation methods for nonlinear heat equations with gradient-dependent nonlinearities. In the main result of this section, Proposition 3.5 in Sect. 3.3 below, we provide a detailed error analysis for these new MLP approximation methods. We will employ Proposition 3.5 in our proofs of the approximation results in Sect. 5 below (cf. Corollary 5.1, Theorem 5.2, Corollary 5.4, and Corollary 5.5 in Sect. 5 below). Our proof of Proposition 3.5 employs suitable recursive error bounds for the proposed MLP approximations, which we establish in Lemma 3.4 below. Our proof of Lemma 3.4, in turn, uses appropriate measurability and distribution properties of the proposed MLP approximations, which we establish in the elementary auxiliary result in Lemma 3.2 below, and appropriate explicit representations for expectations of the proposed MLP approximations, which we establish in the auxiliary result in Lemma 3.3 below. In Setting 3.1 we formulate the mathematical framework which we employ to analyze the proposed MLP approximations. In particular, in (39) in Setting 3.1 we introduce the proposed MLP approximations.
3.1 Description of MLP Approximations
Setting 3.1
Let \(\left\| \cdot \right\| _1:(\cup _{n\in \mathbb {N}}\mathbb {R}^n)\rightarrow \mathbb {R}\) satisfy for all \(n\in \mathbb {N}\), \(x=(x_1,x_2,\ldots ,x_n)\in \mathbb {R}^n\) that \(\Vert x\Vert _1=\sum _{i=1}^n |x_i|\), let \( T \in (0,\infty ) \), \( d \in \mathbb {N}\), \( \Theta = \cup _{ n \in \mathbb {N}} \mathbb {Z}^n \), \(L=(L_1,L_2,\ldots ,L_{d+1})\in [0,\infty )^{d+1}\), \(K=(K_1,K_2,\ldots , K_d)\in [0,\infty )^d\), \(\mathbf {e}_1=(1,0,\ldots ,0)\), \(\mathbf {e}_2=(0,1,0,\ldots ,0)\), ..., \(\mathbf {e}_{d+1}=(0,0,\ldots ,1)\in \mathbb {R}^{d+1}\), \(\rho \in C((0,1),(0,\infty ))\), \(\mathbf{u}=(\mathbf{u}_1, \mathbf{u}_2, \ldots , \mathbf{u}_{d+1}) \in C([0,T)\times \mathbb {R}^d,\mathbb {R}^{1+d})\), let \(\langle \cdot , \cdot \rangle :\mathbb {R}^{d+1}\times \mathbb {R}^{d+1} \rightarrow \mathbb {R}\) satisfy for all \(v=(v_1,v_2,\ldots , v_{d+1})\), \(w=(w_1,w_2,\ldots , w_{d+1})\in \mathbb {R}^{d+1}\) that \(\langle v, w\rangle =\sum _{i=1}^{d+1}v_iw_i\), let \(\varrho :[0,T]^2\rightarrow \mathbb {R}\) satisfy for all \(t\in [0,T)\), \(s\in (t,T)\) that \(\varrho (t,s)=\frac{1}{T-t}\rho (\frac{s-t}{T-t})\), let \( ( \Omega , \mathcal {F}, {\mathbb {P}}) \) be a probability space, let \( W^{ \theta }=(W^{\theta ,1}, W^{\theta ,2}, \ldots , W^{\theta ,d}) :[0,T] \times \Omega \rightarrow \mathbb {R}^d \), \( \theta \in \Theta \), be i.i.d. standard Brownian motions with continuous sample paths, let \(\mathfrak {r}^\theta :\Omega \rightarrow (0,1)\), \(\theta \in \Theta \), be i.i.d. random variables, assume for all \(b\in (0,1)\) that \({\mathbb {P}}(\mathfrak {r}^0\le b)=\int _0^b \rho (s)\,ds\), assume that \((W^\theta )_{\theta \in \Theta }\) and \((\mathfrak {r}^\theta )_{\theta \in \Theta }\) are independent, let \(\mathcal {R}^{\theta }:[0,T)\times \Omega \rightarrow [0,T)\), \(n \in \mathbb {N}_0\), satisfy for all \(\theta \in \Theta \), \(t\in [0,T)\) that \(\mathcal {R}^{\theta } _t = t+ (T-t)\mathfrak {r}^{\theta }\), let \( f\in C([0,T]\times \mathbb {R}^d\times \mathbb {R}^{1+d},\mathbb {R})\), \( g\in C(\mathbb {R}^d, \mathbb {R}) \), \( F:C([0,T)\times \mathbb {R}^d,\mathbb {R}^{1+d}) \rightarrow C([0,T)\times \mathbb {R}^d,\mathbb {R}) \) satisfy for all \(t\in [0,T)\), \(x=(x_1,x_2,\ldots ,x_d)\), \(\mathfrak {x}=(\mathfrak {x}_1, \mathfrak {x}_2,\ldots ,\mathfrak {x}_d)\in \mathbb {R}^d\), \(u=(u_1,u_2,\ldots ,u_{d+1})\), \(\mathfrak {u}=(\mathfrak {u}_1,\mathfrak {u}_2,\ldots ,\mathfrak {u}_{d+1})\in \mathbb {R}^{1+d}\), \(\mathbf{v}\in C([0,T)\times \mathbb {R}^d,\mathbb {R}^{1+d})\) that
and \((F(\mathbf{v}))(t,x)=f(t,x,\mathbf{v}(t,x))\), and let \( \mathbf{U}_{ n,M}^{\theta }=(\mathbf{U}_{ n,M}^{\theta ,1}, \mathbf{U}_{ n,M}^{\theta ,2}, \ldots , \mathbf{U}_{ n,M}^{\theta ,d+1}) :[0,T)\times \mathbb {R}^d\times \Omega \rightarrow \mathbb {R}^{1+d} \), \(n,M\in \mathbb {Z}\), \(\theta \in \Theta \), satisfy for all \( n,M \in \mathbb {N}\), \( \theta \in \Theta \), \( t\in [0,T)\), \(x \in \mathbb {R}^d\) that \( \mathbf{U}_{-1,M}^{\theta }(t,x)=\mathbf{U}_{0,M}^{\theta }(t,x)=0\) and
3.2 Properties of MLP Approximations
Lemma 3.2
(Measurability properties) Assume Setting 3.1 and let \(M\in \mathbb {N}\). Then
(i) for all \(n \in \mathbb {N}_0\), \(\theta \in \Theta \) it holds that \( \mathbf{U}_{ n,M}^{\theta } :[0, T) \times \mathbb {R}^d \times \Omega \rightarrow \mathbb {R}^{d+1} \) is a continuous random field,
(ii) for all \(n \in \mathbb {N}_0\), \(\theta \in \Theta \) it holds that \( \sigma ( \mathbf{U}^\theta _{n, M} ) \subseteq \sigma ( (\mathfrak {r}^{(\theta , \vartheta )})_{\vartheta \in \Theta }, (W^{(\theta , \vartheta )})_{\vartheta \in \Theta }) \),
(iii) for all \(n, m \in \mathbb {N}_0\), \(i,j,k,l \in \mathbb {Z}\), \(\theta \in \Theta \) with \((i,j) \ne (k,l)\) it holds that \( \mathbf{U}^{(\theta ,i,j)}_{n,M} \) and \( \mathbf{U}^{(\theta ,k,l)}_{m,M} \) are independent,
(iv) for all \(n \in \mathbb {N}_0\), \(\theta \in \Theta \) it holds that \(\mathbf{U}_{ n,M}^{\theta }\), \(W^\theta \), and \(\mathfrak {r}^\theta \) are independent,
(v) for all \(n \in \mathbb {N}_0\) it holds that \( \mathbf{U}^\theta _{n, M} \), \( \theta \in \Theta \), are identically distributed, and
(vi) for all \(\theta \in \Theta \), \(l\in \mathbb {N}\), \(i\in \mathbb {N}\), \(t\in [0,T)\), \(x\in \mathbb {R}^d\) it holds that
$$\begin{aligned} \tfrac{ F( \mathbf{U}_{l-1,M}^{(\theta ,-l,i)}) (\mathcal {R}^{(\theta , l,i)}_t,x+W_{\mathcal {R}^{(\theta , l,i)}_t}^{(\theta ,l,i)}-W_t^{(\theta ,l,i)}) }{\varrho (t,\mathcal {R}^{(\theta , l,i)}_t)} \left( 1 , \tfrac{ W_{\mathcal {R}^{(\theta , l,i)}_t}^{(\theta ,l,i)}- W^{(\theta , l, i)}_{t} }{ \mathcal {R}^{(\theta , l,i)}_t-t} \right) \end{aligned}$$ (40)
and
$$\begin{aligned} \tfrac{ F( \mathbf{U}_{l-1,M}^{(\theta ,l,i)}) (\mathcal {R}^{(\theta , l,i)}_t,x+W_{\mathcal {R}^{(\theta , l,i)}_t}^{(\theta ,l,i)}-W_t^{(\theta ,l,i)}) }{\varrho (t,\mathcal {R}^{(\theta , l,i)}_t)} \left( 1 , \tfrac{ W_{\mathcal {R}^{(\theta , l,i)}_t}^{(\theta ,l,i)}- W^{(\theta , l, i)}_{t} }{ \mathcal {R}^{(\theta , l,i)}_t-t} \right) \end{aligned}$$ (41)
are identically distributed.
Proof of Lemma 3.2
First, observe that (39), the hypothesis that for all \(M \in \mathbb {N}\), \(\theta \in \Theta \) it holds that \(\mathbf{U}^\theta _{0, M} = 0\), the fact that for all \(\theta \in \Theta \) it holds that \(W^\theta \) and \(\mathcal {R}^\theta \) are continuous random fields, the hypothesis that \(f\in C([0,T]\times \mathbb {R}^d \times \mathbb {R}\times \mathbb {R}^d, \mathbb {R})\), the hypothesis that \(g\in C(\mathbb {R}^d, \mathbb {R})\), the fact that \(\varrho |_{\{(s,t)\in [0,T)^2:s<t\}}\in C( \{(s,t)\in [0,T)^2:s<t\}, \mathbb {R}) \), and induction on \(\mathbb {N}_0\) establish Item (i). Next note that Item (i), the hypothesis that \(f\in C([0,T]\times \mathbb {R}^d \times \mathbb {R}\times \mathbb {R}^d, \mathbb {R})\), and, e.g., Beck et al. [2, Lemma 2.4] assure that for all \(n \in \mathbb {N}_0\), \(\theta \in \Theta \) it holds that \(F(\mathbf{U}^\theta _{n, M})\) is \( ( \mathcal {B}((0, T) \times \mathbb {R}^d ) \otimes \sigma (\mathbf{U}^\theta _{n, M}) )/ \mathcal {B}(\mathbb {R})\)-measurable. The hypothesis that for all \(M \in \mathbb {N}\), \(\theta \in \Theta \) it holds that \(\mathbf{U}^\theta _{0, M} = 0\), (39), the fact that for all \(\theta \in \Theta \) it holds that \(W^\theta \) is \( ( \mathcal {B}([0, T]) \otimes \sigma (W^\theta ) )/ \mathcal {B}(\mathbb {R})\)-measurable, the fact that for all \(\theta \in \Theta \) it holds that \(\mathcal {R}^\theta \) is \( ( \mathcal {B}([0, T)) \otimes \sigma (\mathfrak {r}^\theta ) )/ \mathcal {B}([0, T) )\)-measurable, and induction on \(\mathbb {N}_0\) hence prove Item (ii). In addition, note that Item (ii) and the fact that for all \(i,j,k,l, \in \mathbb {Z}\), \(\theta \in \Theta \) with \((i,j) \ne (k,l)\) it holds that \(((\mathfrak {r}^{(\theta ,i,j, \vartheta )}, W^{(\theta ,i,j, \vartheta )}))_{\vartheta \in \Theta }\) and \(((\mathfrak {r}^{(\theta ,k,l, \vartheta )}, W^{(\theta ,k,l, \vartheta )}))_{\vartheta \in \Theta }\) are independent prove Item (iii). Furthermore, observe that Item (ii) and the fact that for all \(\theta \in \Theta \) it holds that \((\mathfrak {r}^{(\theta , \vartheta )})_{\vartheta \in \Theta }\), \((W^{(\theta , \vartheta )})_{\vartheta \in \Theta }\), \(W^\theta \), and \(\mathfrak {r}^\theta \) are independent establish Item (iv). Next observe that the hypothesis that for all \(\theta \in \Theta \) it holds that \(\mathbf{U}^\theta _{0, M} = 0\), the hypothesis that \((W^\theta )_{\theta \in \Theta }\) are i.i.d., the hypothesis that \((\mathcal {R}^\theta )_{\theta \in \Theta }\) are i.i.d., Items (i)–(iii), Hutzenthaler et al. [45, Corollary 2.5], and induction on \(\mathbb {N}_0\) establish Item (v). Furthermore, observe that Item (ii), the fact that for all \(\theta \in \Theta \), \(l\in \mathbb {N}\), \(i\in \mathbb {N}\) it holds that \((\mathfrak {r}^{(\theta ,-l,i, \vartheta )})_{\vartheta \in \Theta }\), \((W^{(\theta ,-l,i, \vartheta )})_{\vartheta \in \Theta }\), \(W^{\theta ,l,i}\), and \(\mathfrak {r}^{\theta ,l,i}\) are independent, and, e.g., Hutzenthaler et al. [45, Lemma 2.3] imply that for every \(\theta \in \Theta \), \(l,i\in \mathbb {N}\), \(t\in [0,T)\), \(x\in \mathbb {R}^d\) and every bounded \({\mathcal {B}}(\mathbb {R}^{d+1})/{\mathcal {B}}(\mathbb {R})\)-measurable \(\psi :\mathbb {R}^{d+1}\rightarrow \mathbb {R}\) it holds that
This, Item (v), Item (ii), the fact that for all \(\theta \in \Theta \), \(l, i\in \mathbb {N}\) it holds that \((\mathfrak {r}^{(\theta ,l,i, \vartheta )})_{\vartheta \in \Theta }\), \((W^{(\theta ,l,i, \vartheta )})_{\vartheta \in \Theta }\), \(W^{\theta ,l,i}\), and \(\mathfrak {r}^{\theta ,l,i}\) are independent, and, e.g., Hutzenthaler et al. [45, Lemma 2.3] imply that for every \(\theta \in \Theta \), \(l,i\in \mathbb {N}\), \(t\in [0,T)\), \(x\in \mathbb {R}^d\) and every bounded \({\mathcal {B}}(\mathbb {R}^{d+1})/{\mathcal {B}}(\mathbb {R})\)-measurable \(\psi :\mathbb {R}^{d+1}\rightarrow \mathbb {R}\) it holds that
This establishes Item (vi). The proof of Lemma 3.2 is thus completed. \(\square \)
Lemma 3.3
(Approximations are integrable) Assume Setting 3.1, let \(p\in (1,\infty )\), \(M \in \mathbb {N}\), \(x \in \mathbb {R}^d\), and assume for all \(q\in [1,p)\), \(t\in [0,T)\) that
Then
(i) it holds for all \(\theta \in \Theta \), \(q\in [1,\infty )\), \(\nu \in \{1,2,\ldots , d+1\}\) that
$$\begin{aligned} \sup _{u\in (0,T]}\sup _{t\in [0,u)}\sup _{y\in \mathbb {R}^d} {\mathbb {E}}\!\left[ \big |\big (g(y+W^{\theta }_u-W^{\theta }_t)-g(y)\big ) \big \langle \mathbf {e}_\nu , \big ( 1 , \tfrac{ W^{\theta }_{u}- W^{\theta }_{t} }{ u - t } \big ) \big \rangle \big |^q\right] <\infty , \end{aligned}$$ (45)
(ii) it holds for all \(n\in \mathbb {N}_0\), \(\theta \in \Theta \), \(q\in [1,p)\), \(t\in [0,T)\), \(\nu \in \{1,2,\ldots , d+1\}\) that
$$\begin{aligned}&\sup _{s\in [t,T)} \left| {\mathbb {E}}\!\left[ \left| (\mathbf{U}_{n,M}^{\theta }(s,x+W_{s}^\theta -W_{t}^\theta ))_\nu \right| ^q \right] \right. \nonumber \\&\left. \quad + {\mathbb {E}}\!\left[ \tfrac{ \big | \big (F(\mathbf{U}_{n,M}^{\theta })\big ) (\mathcal {R}^{\theta }_s,x+W_{\mathcal {R}^{\theta }_s}^{\theta }-W_t^{\theta }) \big |^q }{\left[ \varrho (s,\mathcal {R}^{\theta }_s)\right] ^q} \left| \big \langle \mathbf {e}_\nu , \big ( 1 , \tfrac{ W_{\mathcal {R}^{\theta }_s}^{\theta }- W^{\theta }_{s} }{ \mathcal {R}^{\theta }_s-s} \big ) \big \rangle \right| ^q \right] \right| <\infty , \end{aligned}$$ (46)
and
(iii) it holds for all \(n\in \mathbb {N}\), \(\theta \in \Theta \), \(t\in [0,T)\) that
$$\begin{aligned} {\mathbb {E}}\!\left[ \mathbf{U}_{n,M}^{\theta }(t,x)\right]&={\mathbb {E}}\!\left[ g(x+W_T^{\theta }-W_t^{\theta }) \Big ( 1 , \tfrac{ W^{\theta }_{T}- W^{\theta }_{t} }{ T - t } \Big ) \right] \nonumber \\&\quad + {\mathbb {E}}\!\left[ \left( F( \mathbf{U}_{n-1,M}^{\theta })\right) \!(\mathcal {R}^{\theta }_t,x+W_{\mathcal {R}^{\theta }_t}^{\theta }-W_t^{\theta }) \Big ( 1 , \tfrac{ W^{\theta }_{\mathcal {R}^{\theta }_t}- W^{\theta }_{t} }{ {\mathcal {R}^{\theta }_t - t} } \Big ) \right] . \end{aligned}$$ (47)
Proof of Lemma 3.3
Observe that the Cauchy–Schwarz inequality, (36), Jensen’s inequality, and the fact that for all \(u\in (0,T]\), \(t\in [0,u)\), \(i\in \{1,2,\ldots ,d\}\), \(\theta \in \Theta \) it holds that \(W^{\theta ,i}_u-W^{\theta ,i}_t\) and \((u-t)^{1/2}T^{-1/2}W^{0,1}_T\) are identically distributed yield that for all \(\theta \in \Theta \), \(q\in [1,\infty )\), \(\nu \in \{1,2,\ldots , d+1\}\) it holds that

This proves Item (i). Next observe that Lemma 3.2 ensures that for all \(n\in \mathbb {N}_0\), \(\theta \in \Theta \) it holds that \(W^{\theta }\), \(\mathcal {R}^{\theta }\), and \(\mathbf{U}_{n,M}^{\theta }\) are independent continuous random fields. Combining this and, e.g., Hutzenthaler et al. [45, Lemma 2.3] with Hölder’s inequality (applied with \(p=\frac{p+q}{2q}\), \(q=\frac{p+q}{p-q}\) in the notation of Hölder’s inequality) and the fact that for all \(s\in [0,T]\), \(h\in (0,\infty )\), \(\theta \in \Theta \) it holds that \(W^{\theta ,1}_{s+h}-W^{\theta ,1}_s\) and \(h^{1/2}T^{-1/2}W^{0,1}_T\) are identically distributed demonstrates that for all \(t\in [0,T)\), \(s\in [t,T)\), \(n \in \mathbb {N}_0\), \(\theta \in \Theta \), \(q\in [1,p)\), \(\nu \in \{1,2,\ldots ,d+1\}\) it holds that

In the next step we prove (46) by induction on \(n\in \mathbb {N}_0\). For the base case \(n=0\) observe that (49), (44), and the fact that for all \(\theta \in \Theta \) it holds that \(U^\theta _{0,M} = 0\) ensure that for all \(\theta \in \Theta \), \(q\in [1,p)\), \(t\in [0,T)\), \(\nu \in \{1,2,\ldots , d+1\}\) it holds that

This establishes (46) in the base case \(n=0\). For the induction step \(\mathbb {N}_0 \ni n-1 \rightsquigarrow n \in \mathbb {N}\) assume that there exists \(n\in \mathbb {N}\) which satisfies for all \(k \in \mathbb {N}_0 \cap [0, n)\), \(\theta \in \Theta \), \(q\in [1,p)\), \(t\in [0,T)\), \(\nu \in \{1,2,\ldots , d+1\}\) that

Observe that (39) and Jensen’s inequality ensure that for all \(\theta \in \Theta \), \(q\in [1,p)\), \(t\in [0,T)\), \(s \in [t, T)\), \(\nu \in \{1,2,\ldots , d+1\}\) it holds that

This, e.g., Hutzenthaler et al. [45, Corollary 2.5], Lemma 3.2, Item (i), and (51) yield that for all \(\theta \in \Theta \), \(q\in [1,p)\), \(t\in [0,T)\), \(\nu \in \{1,2,\ldots , d+1\}\) it holds that
Jensen’s inequality and (44) hence yield that for all \(\theta \in \Theta \), \(q\in [1,p)\), \(t\in [0,T)\) it holds that
This, (49), and (44) imply that for all \(\theta \in \Theta \), \(q\in [1,p)\), \(t\in [0,T)\), \(\nu \in \{1,2,\ldots , d+1\}\) it holds that
Combining this with (53) demonstrates that for all \(\theta \in \Theta \), \(q\in [1,p)\), \(t\in [0,T)\), \(\nu \in \{1,2,\ldots , d+1\}\) it holds that
Induction thus proves Item (ii). Next we prove Item (iii). Note that Item (ii), e.g., Hutzenthaler et al. [45, Corollary 2.5], Item (v) of Lemma 3.2, the fact that \((\mathcal {R}^{\theta },W^{\theta })\), \(\theta \in \Theta \), are identically distributed, e.g., Hutzenthaler et al. [45, Lemma 2.3], and the fact that for every \(t\in [0,T)\), \(s\in [t,T]\), \(\theta \in \Theta \) it holds that \({\mathbb {P}}(\mathcal {R}^{\theta }_t\le s)=\int _t^s\varrho (t,r)\,dr\) yield that for all \(n\in \mathbb {N}\), \(\theta \in \Theta \), \(t\in [0,T)\) it holds that
This establishes Item (iii). The proof of Lemma 3.3 is thus completed. \(\square \)
3.3 Error Analysis for MLP Approximations
Lemma 3.4
(Recursive bound for global error) Assume Setting 3.1, let \(p\in (1,\infty )\), \(M \in \mathbb {N}\), let \(S:\mathbb {N}_0 \times [0,T)\times \Omega \rightarrow [0,T)\) satisfy for all \(n \in \mathbb {N}_0\), \(t\in [0,T)\) that \(S(0,t)=t\) and \(S(n+1,t)=\mathcal {R}^{(n)}_{S(n,t)}\), and assume for all \(q\in [1,p)\), \(t\in [0,T)\), \(x\in \mathbb {R}^d\) that
Then it holds for all \(n\in \mathbb {N}\), \(t\in [0,T)\), \(x\in \mathbb {R}^d\), \(\nu _0 \in \{1,2,\ldots , d+1\}\) that
Proof of Lemma 3.4
First, we analyze the Monte Carlo error. Item (i) of Lemma 3.3, Item (ii) of Lemma 3.3, and Item (vi) of Lemma 3.2 ensure that for all \(l\in \mathbb {N}\), \(x\in \mathbb {R}^d\), \(t\in [0,T)\), \(\nu \in \{1,2,\ldots ,d+1\}\) it holds that
Moreover, observe that Lemma 3.2 yields that
(a) it holds for all \(x\in \mathbb {R}^d\), \(t\in [0,T)\), \(\nu \in \{1,2,\ldots ,d+1\}\) that
$$\begin{aligned} \big (g(x+W_T^{(0,0,-i)}-W_t^{(0,0,-i)})-g(x)\big ) \big \langle \mathbf {e}_\nu , \big ( 1 , \tfrac{ W^{(0,0,-i)}_{T}- W^{(0,0,-i)}_{t} }{ T - t } \big )\big \rangle ,\quad i\in \mathbb {N}_0, \end{aligned}$$ (61)
are independent random variables,
(b) it holds for all \(x\in \mathbb {R}^d\), \(t\in [0,T)\), \(\nu \in \{1,2,\ldots ,d+1\}\) that
$$\begin{aligned} \tfrac{ (F( \mathbf{U}_{l,M}^{(0,l,i)})-\mathbb {1}_{\mathbb {N}}(l)F( \mathbf{U}_{l-1,M}^{(0,-l,i)})) (\mathcal {R}_t^{(0,l,i)},x+W_{\mathcal {R}_t^{(0,l,i)}}^{(0,l,i)}-W_t^{(0,l,i)}) }{\varrho (t,\mathcal {R}^{(0, l,i)}_t)} \Big \langle \mathbf {e}_\nu , \Big ( 1 , \tfrac{ W_{\mathcal {R}_t^{(0,l,i)}}^{(0,l,i)}-W_t^{(0,l,i)} }{ {\mathcal {R}_t^{(0,l,i)} - t} } \Big )\Big \rangle , \quad l,i\in \mathbb {N}_0, \end{aligned}$$ (62)
are independent random variables, and
(c) it holds for all \(x\in \mathbb {R}^d\), \(t\in [0,T)\), \(\nu \in \{1,2,\ldots ,d+1\}\) that
$$\begin{aligned} \big (g(x+W_T^{(0,0,-i)}-W_t^{(0,0,-i)})-g(x)\big ) \big \langle \mathbf {e}_\nu , \big ( 1 , \tfrac{ W^{(0,0,-i)}_{T}- W^{(0,0,-i)}_{t} }{ T - t } \big )\big \rangle ,\quad i\in \mathbb {N}_0, \end{aligned}$$ (63)
and
$$\begin{aligned} \tfrac{ \left( F( \mathbf{U}_{l,M}^{(0,l,i)})-\mathbb {1}_{\mathbb {N}}(l)F( \mathbf{U}_{l-1,M}^{(0,-l,i)})\right) \! (\mathcal {R}_t^{(0,l,i)},x+W_{\mathcal {R}_t^{(0,l,i)}}^{(0,l,i)}-W_t^{(0,l,i)}) }{\varrho (t,\mathcal {R}^{(0, l,i)}_t)} \Big \langle \mathbf {e}_\nu , \Big ( 1 , \tfrac{ W_{\mathcal {R}_t^{(0,l,i)}}^{(0,l,i)}-W_t^{(0,l,i)} }{ {\mathcal {R}_t^{(0,l,i)} - t} } \Big )\Big \rangle , \quad l,i\in \mathbb {N}_0, \end{aligned}$$ (64)
are independent.
Combining this, (60), and (39) with the fact that for all \(N\in \mathbb {N}\), \(X_1, X_2, \ldots , X_N \in L^1({\mathbb {P}};\mathbb {R})\) with \(\forall \, i\in \{1,2,\ldots ,N\}\), \(j\in \{1,2,\ldots ,N\}\setminus \{i\}\), \(A,B\in {\mathcal {B}}(\mathbb {R}) :{\mathbb {P}}(\{X_i \in A\}\cap \{X_j\in B\})={\mathbb {P}}(X_i \in A){\mathbb {P}}(X_j\in B)\) it holds that \({{\text {Var}}}(\sum _{i=1}^NX_i)=\sum _{i=1}^N{{\text {Var}}}(X_i)\) implies that for all \(m\in \mathbb {N}\), \(x\in \mathbb {R}^d\), \(t\in [0,T)\), \(\nu \in \{1,2,\ldots ,d+1\}\) it holds that
The triangle inequality and (36) hence yield that for all \(m\in \mathbb {N}\), \(x\in \mathbb {R}^d\), \(t\in [0,T)\), \(\nu \in \{1,2,\ldots ,d+1\}\) it holds that
This and the triangle inequality ensure that for all \(m\in \mathbb {N}\), \(x\in \mathbb {R}^d\), \(t\in [0,T)\), \(\nu \in \{1,2,\ldots ,d+1\}\) it holds that
Next we analyze the time discretization error. Item (ii) of Lemma 3.3 ensures that for all \(m\in \mathbb {N}\), \(t\in [0,T)\), \(x\in \mathbb {R}^d\) it holds that
Combining this, (38), (37), Item (i) of Lemma 3.3, and Jensen’s inequality shows that for all \(m\in \mathbb {N}\), \(t\in [0,T)\), \(x\in \mathbb {R}^d\), \(\nu \in \{1,2,\ldots ,d+1\}\) it holds that
In the next step we combine the established bounds for the Monte Carlo error (see (67) above) and the time discretization error (see (69) above) to obtain a suitable bound for the overall approximation error. More formally, observe that (67) and (69) ensure that for all \(m\in \mathbb {N}\), \(t\in [0,T)\), \(x\in \mathbb {R}^d\), \(\nu \in \{1,2,\ldots , d+1\}\) it holds that
This shows that for all \(m\in \mathbb {N}\), \(t\in [0,T)\), \(x\in \mathbb {R}^d\), \(\nu \in \{1,2,\ldots , d+1\}\) it holds that
In the next step we iterate (71). More formally, we claim that for all \(n,k\in \mathbb {N}\), \(t\in [0,T)\), \(x\in \mathbb {R}^d\), \(\nu _0 \in \{1,2,\ldots , d+1\}\) it holds that
We now prove (72) by induction on \(k\in \mathbb {N}\). Observe that (71) establishes (72) in the base case \(k=1\). For the induction step \(\mathbb {N}\ni k\rightsquigarrow k+1\in \mathbb {N}\) assume that there exists \(k\in \mathbb {N}\) which satisfies that (72) holds for k. Observe that Item (iv) of Lemma 3.2 shows that for all \(m\in \mathbb {N}\) it holds that \(\mathbf{U}^0_{m,M}\), \(W^0\), and \((\mathfrak {r}^{(k)})_{k\in \mathbb {N}_0}\) are independent. Combining this and (71) yields that for all \(l_1\in \mathbb {N}\), \(x\in \mathbb {R}^d\), \(t\in [0,T)\), \(\nu _0,\nu _1,\ldots , \nu _{k} \in \{1,2,\ldots , d+1\}\) it holds that
This and the induction hypothesis establish (72) in the case \(k+1\). Induction thus proves (72). Next note that (72) (applied with \(k=n\) in the notation of (72)) demonstrates that for all \(n\in \mathbb {N}\), \(t\in [0,T)\), \(x\in \mathbb {R}^d\), \(\nu _0 \in \{1,2,\ldots , d+1\}\) it holds that
Combining this with the fact that for all \(n\in \mathbb {N}\), \(j\in \{0,1,\ldots , n-1\}\) it holds that
establishes (59). The proof of Lemma 3.4 is thus completed. \(\square \)
Proposition 3.5
(Global approximation error) Assume Setting 3.1, let \(t \in [0,T)\), \(x\in \mathbb {R}^d\), \(\nu _0\in \{1,2,\ldots ,d+1\}\), \(M,n\in \mathbb {N}\), \(p\in [2,\infty )\), \(\alpha \in (\frac{p-2}{2(p-1)},\frac{p}{2(p-1)})\), \(\beta =\frac{\alpha }{2}-\frac{(1-\alpha )(p-2)}{2p}\), \(C\in \mathbb {R}\) satisfy that
and assume for all \(s\in (0,1)\) that \(\rho (s)=\frac{1-\alpha }{s^\alpha }\). Then
Proof of Proposition 3.5
Throughout this proof assume without loss of generality that
(otherwise (77) is clear), let \(S:\mathbb {N}_0 \times [0,T)\times \Omega \rightarrow [0,T)\) satisfy for all \(n \in \mathbb {N}_0\), \(t\in [0,T)\) that \(S(0,t)=t\) and \(S(n+1,t)=\mathcal {R}^{(n)}_{S(n,t)}\), and let \(\mathfrak C\in \mathbb {R}\) satisfy
Observe that the triangle inequality ensures that for all \(\nu \in \{1,2,\ldots ,d+1\}\), \(s\in [0,T)\) it holds that
Moreover, observe that (36) assures that for all \(j\in \{0,1,\ldots ,n-1\}\), \(\nu _1,\nu _2,\ldots ,\nu _j \in \{1,2,\ldots ,d+1\}\) it holds that
Next note that the hypothesis that \(p\in [2,\infty )\) and the fact that \(\alpha \in (\frac{p-2}{2(p-1)},\frac{p}{2(p-1)})\) imply that \(\frac{p}{2}\in [\alpha (p-1),\alpha (p-1)+1]\) and \(\alpha \in (0,1)\). This, (81), (80), and Corollary 2.5 (applied with \(p=2\) in the notation of Corollary 2.5) show that for all \(j\in \{0,1,\ldots ,n-1\}\), \(\nu _1,\nu _2,\ldots ,\nu _j \in \{1,2,\ldots ,d+1\}\) it holds that
The fact that \(\Gamma (\frac{p+1}{2})\ge \Gamma ( \frac{3}{2})=\frac{\sqrt{\pi }}{2}\), the fact that \(\frac{2(p-1)}{p}\ge 1\), and the fact that \(\alpha < 1\) prove that
Moreover, note that the fact that \(p\ge 2\) ensures that for all \(j\in \mathbb {N}_0\) it holds that \((ej)^{1/8}\le \left[ e\left( \tfrac{pj}{2}+1\right) \right] ^{1/8}\) and \(\Gamma (j+1)^{\alpha /2}\ge \Gamma (j+1)^{\beta }\). Combining this with (82) and (83) proves that for all \(j\in \{0,1,\ldots ,n-1\}\), \(\nu _1,\nu _2,\ldots ,\nu _j \in \{1,2,\ldots ,d+1\}\) it holds that
Corollary 2.5, the fact that \(p\ge 2\), the fact that \(\alpha >\frac{p-2}{2(p-1)}\), and the fact that \(\frac{2}{p}\Gamma (\frac{2}{p})\le 1\) prove that for all \(j\in \{0,1,\ldots ,n-1\}\), \(\nu _1,\nu _2,\ldots ,\nu _j \in \{1,2,\ldots ,d+1\}\) it holds that
Combining this with Hölder’s inequality and the fact that the random variables \(W^0\) and \(\mathfrak {r}^{(n)}\), \(n\in \mathbb {N}\), are independent proves that for all \(j\in \{0,1,\ldots ,n-1\}\), \(\nu _1,\nu _2,\ldots ,\nu _{j+1} \in \{1,2,\ldots ,d+1\}\) it holds that
Moreover, observe that for all \(j\in \{0,1,\ldots ,n-1\}\), \(\nu _1,\nu _2,\ldots ,\nu _{j+1} \in \{1,2,\ldots ,d+1\}\) it holds that
In the next step we intend to apply Lemma 3.4. For this we now verify the hypotheses of Lemma 3.4. More formally, observe that the fact that \(\sup _{s\in [t,T)} \Vert (F(0))(s,x+W^0_{s}-W^0_t) \Vert _{L^{2p/(p-2)}({\mathbb {P}};\mathbb {R})}<\infty \) and the fact that \(\tfrac{2p}{p-2}\ge 2\) ensure that for all \(r\in [1,2)\) it holds that
Moreover, observe that for all \(r\in [1,2)\) it holds that
Combining this, (88), and Lemma 3.4 with (84), (86), and (87) proves that
Moreover, observe that for all \(j\in \mathbb {N}_0\) it holds that
Combining this with (90) proves that
This, the fact that \(2\sqrt{\mathfrak C}\Vert L\Vert _1\le 2\sqrt{\mathfrak C}\max \{1,\Vert L\Vert _1\}\le C\), the fact that \(p\ge 2\), and the fact that \(C\ge 1\) imply that
Next note that for all \(j\in \mathbb {N}_0\) it holds that \(\sum _{i=0}^{n-1} \genfrac(){0.0pt}1{n-1}{i} = 2^{n-1}\) and
Combining this with (92) assures that
This establishes (77). The proof of Proposition 3.5 is thus completed. \(\square \)
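We briefly record how the function \(\rho \) in Proposition 3.5 relates to the distribution of the time-sampling random variables appearing in Sect. 5 below. Since \(\alpha \in (0,1)\), the function \(\rho \) is a probability density on (0, 1) whose distribution function satisfies for all \(b\in (0,1]\) that
$$\begin{aligned} \int _0^b \rho (s)\, ds = \int _0^b \frac{1-\alpha }{s^{\alpha }}\, ds = b^{1-\alpha }. \end{aligned}$$
In particular, \(\int _0^1 \rho (s)\, ds =1\) and the condition \({\mathbb {P}}(\mathfrak {r}^0\le b)=b^{1-\alpha }\) imposed in Corollary 5.1 below states precisely that the random variables \(\mathfrak {r}^\theta \), \(\theta \in \Theta \), there have \(\rho \) as their density.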
4 Regularity Analysis for Solutions of Certain Differential Equations
The error analysis in Sect. 3.3 above provides upper bounds for the approximation errors of the MLP approximations in (39). The established upper bounds contain certain norms of the unknown exact solutions of the PDEs which we intend to approximate; see, e.g., the right-hand side of (77) in Proposition 3.5 in Sect. 3.3 above for details. In Lemma 4.2 below we establish suitable upper bounds for these norms of the unknown exact solutions of the PDEs which we intend to approximate. In our proof of Lemma 4.2 we employ certain a priori estimates for solutions of BSDEs which we establish in the essentially well-known result in Lemma 4.1 below (see, e.g., El Karoui et al. [22, Proposition 2.1 and Equation (2.12)] for results related to Lemma 4.1).
4.1 Regularity Analysis for Solutions of Backward Stochastic Differential Equations (BSDEs)
Lemma 4.1
Let \(T\in (0,\infty )\), \(t\in [0,T)\), \(d\in \mathbb {N}\), \(L,\mathfrak {L}\in [0,\infty )\), let \(\left\| \cdot \right\| :\mathbb {R}^d \rightarrow [0,\infty )\) be the standard norm on \(\mathbb {R}^d\), let \( \langle \cdot , \cdot \rangle :\mathbb {R}^d \times \mathbb {R}^d \rightarrow [0,\infty ) \) be the standard scalar product on \(\mathbb {R}^d\), let \((\Omega ,\mathcal {F},{\mathbb {P}})\) be a probability space with a normal filtration \((\mathbb {F}_s)_{s\in [t,T]}\), let \(f_1, f_2 :[t,T] \times \mathbb {R}\times \mathbb {R}^d \times \Omega \rightarrow \mathbb {R}\) be functions, assume for all \(s\in [t,T]\), \(i\in \{1,2\}\) that the function \([t,s] \times \mathbb {R}\times \mathbb {R}^d \times \Omega \ni (u,y,z, \omega ) \mapsto f_i(u,y,z,\omega )\in \mathbb {R}\) is \(({\mathcal {B}}([t,s]) \otimes \mathcal {B}(\mathbb {R}) \otimes \mathcal {B}(\mathbb {R}^d)\otimes \mathbb {F}_s)/\mathcal {B}(\mathbb {R})\)-measurable, assume that for all \(s\in [t,T]\), \(y,\mathfrak {y}\in \mathbb {R}\), \(z,\mathfrak {z}\in \mathbb {R}^d\) it holds \({\mathbb {P}}\)-a.s. that
let \(Y^1,Y^2:[t,T]\times \Omega \rightarrow \mathbb {R}\) and \(W:[t,T]\times \Omega \rightarrow \mathbb {R}^d\) be \((\mathbb {F}_s)_{s\in [t,T]}\)-adapted stochastic processes with continuous sample paths, assume that \((W_{s+t}-W_{t})_{s\in [0,T-t]}\) is a standard Brownian motion, let \(Z^k=(Z^{k,1},Z^{k,2}, \ldots , Z^{k,d}) :[t,T]\times \Omega \rightarrow \mathbb {R}^d\), \(k\in \{1,2\}\), be \((\mathbb {F}_s)_{s\in [t,T]}\)-adapted \((\mathcal {B}([t,T])\otimes \mathcal {F})\)/\(\mathcal {B}(\mathbb {R}^d)\)-measurable stochastic processes, assume that \(\sum _{i=1}^2{\mathbb {E}}\big [\sup _{s\in [t,T]}|Y^i_{s}|^2\big ]<\infty \) and \({\mathbb {P}}(\sum _{i=1}^2\int _t^T |f_i(s,Y^i_s,Z^i_s)|+\Vert Z^i_s\Vert ^2\,ds<\infty )=1\), and assume that for all \(s\in [t,T]\), \(i\in \{1,2\}\) it holds \({\mathbb {P}}\)-a.s. that
Then it holds \({\mathbb {P}}\)-a.s. that
Proof of Lemma 4.1
Throughout this proof assume without loss of generality that \(\sup _{s\in [t,T], y\in \mathbb {R}, z\in \mathbb {R}^d}\Vert f_1(s,y,z)-f_2(s,y,z)\Vert _{L^\infty ({\mathbb {P}};\mathbb {R})}<\infty \), let \(A:[t,T]\times \Omega \rightarrow \mathbb {R}\) and \(B=(B^1, B^2, \ldots , B^d):[t,T]\times \Omega \rightarrow \mathbb {R}^d\) satisfy for all \(s\in [t,T]\), \(j\in \{1,2,\ldots ,d\}\) that
and
and let \(\Gamma :[t,T]\times \Omega \rightarrow \mathbb {R}\) be a stochastic process with continuous sample paths which satisfies that for all \(s\in [t,T]\) it holds \({\mathbb {P}}\)-a.s. that \(\Gamma _s=e^{\int _t^s A_r-\frac{\Vert B_r\Vert ^2}{2} dr +\int _t^s\langle B_r,dW_r\rangle }\). Note that Itô’s formula implies that for all \(s\in [t,T]\) it holds \({\mathbb {P}}\)-a.s. that
Next observe that (97) ensures that for all \(u\in [t,T]\), \(i\in \{1,2\}\) it holds \({\mathbb {P}}\)-a.s. that \(Y^i_u = Y^i_t - \int _t^u f_i(r,Y^i_r,Z^i_r)\, dr + \int _t^u \langle Z^i_r, dW_r\rangle \). Combining this and (101) with Itô’s formula yields that for all \(u\in [t,T)\) it holds \({\mathbb {P}}\)-a.s. that
and
Moreover, note that (99) and (100) imply that for all \(s\in [t,T]\) it holds that
This, (102), and (103) assure that for all \(u\in [t,T]\) it holds \({\mathbb {P}}\)-a.s. that
Next let \(\tau _n:\Omega \rightarrow [t,T)\), \(n\in \mathbb {N}\), be the \((\mathbb F_s)_{s\in [t,T]}\)-stopping times which satisfy for all \(n\in \mathbb {N}\) that
Note that (105) and (106) assure that for all \(n\in \mathbb {N}\) it holds \({\mathbb {P}}\)-a.s. that
Moreover, observe that (96) ensures that \(\sup _{s\in [t,T]}\Vert B_s\Vert ^2\le \mathfrak {L}^2d\). The fact that \({\mathbb {E}}[ e^{-\int _t^T \frac{\Vert 2B_s\Vert ^2}{2} \,ds+\int _t^T\langle 2B_s,dW_s\rangle }]=1\) hence implies that
In addition, observe that (96) demonstrates that it holds \({\mathbb {P}}\)-a.s. that \(\sup _{s\in [t,T]}|A_s|\le L\). This, Doob’s martingale inequality, and (108) show that
Combining this with the Cauchy–Schwarz inequality and the assumption that \(\sum _{i=1}^2{\mathbb {E}}\big [\sup _{s\in [t,T]}|Y^i_{s}|^2\big ]<\infty \) proves that
Next observe that the fact that for all \(s\in [t,T]\) it holds that \(\Gamma _s\ge 0\) demonstrates that
This, (110), Lebesgue’s dominated convergence theorem, (107), the fact that \(Y^1\), \(Y^2\), and \(\Gamma \) are stochastic processes with continuous sample paths, and the fact that it holds \({\mathbb {P}}\)-a.s. that \(\lim _{n\rightarrow \infty }\tau _n=T\) ensure that it holds \({\mathbb {P}}\)-a.s. that
Next observe that the fact that \(\sup _{s\in [t,T]}|A_s|\le L\) demonstrates that for all \(s\in [t,T]\) it holds that
Combining this, (112), and the triangle inequality shows that it holds \({\mathbb {P}}\)-a.s. that
This proves (98). The proof of Lemma 4.1 is thus completed. \(\square \)
4.2 Regularity Analysis for Solutions of Partial Differential Equations (PDEs)
Lemma 4.2
Let \(T\in (0,\infty )\), \(d\in \mathbb {N}\), \(\eta , L_0,L_1,\ldots ,L_{d},\mathfrak {L}_1, \mathfrak {L}_2, \ldots , \mathfrak {L}_d,K_1,K_2,\ldots ,K_d\in \mathbb {R}\), \(f\in C( [0,T]\times \mathbb {R}^d\times \mathbb {R}\times \mathbb {R}^d, \mathbb {R})\), \(g\in C(\mathbb {R}^d, \mathbb {R})\), \(u = ( u(t,x) )_{ (t,x) \in [0,T] \times \mathbb {R}^d }\in C^{1,2}([0,T]\times \mathbb {R}^d,\mathbb {R})\), let \({\left| \left| \left| \cdot \right| \right| \right| }:\mathbb {R}^{d+1}\rightarrow [0,\infty )\) be a norm, assume for all \(t\in [0,T]\), \(x=(x_1,x_2,\ldots , x_d)\), \(\mathfrak {x}=(\mathfrak {x}_1,\mathfrak {x}_2,\ldots ,\mathfrak {x}_d)\), \(z=(z_1,z_2,\ldots , z_d)\), \(\mathfrak {z}=(\mathfrak {z}_1,\mathfrak {z}_2,\ldots ,\mathfrak {z}_d) \in \mathbb {R}^d\), \(y,\mathfrak {y} \in \mathbb {R}\) that
let \((\Omega ,\mathcal {F},{\mathbb {P}})\) be a probability space, and let \(W:[0,T]\times \Omega \rightarrow \mathbb {R}^d\) be a standard Brownian motion. Then
(i) it holds for all \(s\in [0,T)\), \(x\in \mathbb {R}^d\) that
$$\begin{aligned}&{\mathbb {E}}\!\left[ {\big \vert \big \vert \big \vert g(x+W_{T-s})\big (1,\tfrac{W_{T-s}}{T-s} \big ) \big \vert \big \vert \big \vert } \right] \nonumber \\&\quad +{\mathbb {E}}\!\left[ \int _s^{T}{\big \vert \big \vert \big \vert \big [ f\big (t,x+W_{t-s},u(t,x+W_{t-s}),(\nabla _xu)(t,x+W_{t-s})\big )\big ] \big (1,\tfrac{W_{t-s}}{t-s} \big ) \big \vert \big \vert \big \vert } \,dt \right] <\infty , \end{aligned}$$ (118)
(ii) it holds for all \(s\in [0,T)\), \(x\in \mathbb {R}^d\) that
$$\begin{aligned}&(u(s,x),(\nabla _x u)(s,x))={\mathbb {E}}\!\left[ g(x+W_{T-s})\big (1,\tfrac{W_{T-s}}{T-s} \big )\right] \nonumber \\&\quad +{\mathbb {E}}\!\left[ \int _s^{T}\left[ f\big (t,x+W_{t-s},u(t,x+W_{t-s}),(\nabla _xu)(t,x+W_{t-s})\big )\right] \big (1,\tfrac{W_{t-s}}{t-s} \big ) \,dt\right] , \end{aligned}$$ (119)
(iii) it holds for all \(t\in [0,T]\), \(x=(x_1,x_2,\ldots , x_d)\), \(\mathfrak x=(\mathfrak {x}_1,\mathfrak {x}_2,\ldots , \mathfrak {x}_d)\in \mathbb {R}^d\) that
$$\begin{aligned} \left| u(t,x)-u(t,\mathfrak {x})\right| \le e^{L_0(T-t)}\Big (\textstyle \sum _{j=1}^d\displaystyle (K_j+(T-t)\mathfrak L_{j})|x_j-\mathfrak x_j|\Big ), \end{aligned}$$ (120)
(iv) it holds for all \(t\in [0,T)\), \(x\in \mathbb {R}^d\), \(i\in \{1,2,\ldots ,d\}\) that
$$\begin{aligned} \big | (\tfrac{\partial }{\partial x_i}u)(t,x)\big |\le e^{L_0(T-t)} (K_i+(T-t)\mathfrak {L}_{i}), \end{aligned}$$ (121)
and
(v) it holds for all \(x\in \mathbb {R}^d\), \(p\in [1,\infty )\) that
$$\begin{aligned}&\sup _{s\in [0,T]} \sup _{t\in [s,T]}\left( {\mathbb {E}}\!\left[ \left| u(t,x +W_t-W_s)\right| ^p\right] \right) ^{\!\nicefrac {1}{p}} \nonumber \\ {}&\quad \le e^{L_0T} \left[ \sup _{s\in [0,T]} ({\mathbb {E}}[|g(x+W_s)|^p])^{\!\nicefrac {1}{p}} +T\sup _{s,t\in [0,T]} ({\mathbb {E}}[|f(t,x+W_s,0,0)|^p])^{\!\nicefrac {1}{p}}\right. \nonumber \\&\qquad \left. +Te^{L_0T}\textstyle \sum _{j=1}^d \displaystyle L_{j} (K_j +T\mathfrak {L_j}) \right] . \end{aligned}$$ (122)
Proof of Lemma 4.2
Throughout this proof let \(t\in [0,T)\), \(x=(x_1,x_2,\ldots , x_d)\), \(\mathfrak {x}=(\mathfrak {x}_1,\mathfrak {x}_2,\ldots ,\mathfrak {x}_d) \in \mathbb {R}^d\), let \(\left\| \cdot \right\| :\mathbb {R}^d \rightarrow [0,\infty )\) be the standard norm on \(\mathbb {R}^d\), let \( \langle \cdot , \cdot \rangle :\mathbb {R}^d \times \mathbb {R}^d \rightarrow [0,\infty ) \) be the standard scalar product on \(\mathbb {R}^d\), let \( F:C([0,T)\times \mathbb {R}^d,\mathbb {R}^{1+d}) \rightarrow C([0,T)\times \mathbb {R}^d,\mathbb {R}) \) and \(\mathbf{u}:[0,T)\times \mathbb {R}^d \rightarrow \mathbb {R}^{d+1}\) satisfy for all \(s\in [0,T)\), \(y\in \mathbb {R}^d\), \(\mathbf{v}\in C([0,T)\times \mathbb {R}^d,\mathbb {R}^{1+d})\) that \((F(\mathbf{v}))(s,y)=f(s,y,\mathbf{v}(s,y))\) and \(\mathbf{u}(s,y)=(u(s,y), (\nabla _y u)(s,y))\), and let \(Y,\mathfrak {Y}:[t,T]\times \Omega \rightarrow \mathbb {R}\) and \(Z,\mathfrak {Z}:[t,T]\times \Omega \rightarrow \mathbb {R}^d\) be stochastic processes which satisfy for all \(s\in [t,T]\) that
and \(\mathfrak {Z}_s=(\nabla _x u)(s,\mathfrak x +W_s-W_t)\). Observe that [47, Lemma 4.2] establishes Item (i) and Item (ii). Next we prove Item (iii). Note that Itô’s lemma yields that for all \(s\in [t,T]\) it holds \({\mathbb {P}}\)-a.s. that
and
Next note that (116) implies that there exists \(\lambda \in (\frac{1}{2},\infty )\) which satisfies that \(\sup _{s\in [0,T],\xi \in \mathbb {R}^d}\tfrac{|u(s,\xi )|}{1+\Vert \xi \Vert ^\lambda }<\infty \). Observe that Doob’s inequality implies that
and
Moreover, note that (115) implies that for all \(s\in [t,T]\), \(y,\mathfrak {y} \in \mathbb {R}\), \(z=(z_1,z_2,\ldots , z_d)\), \(\mathfrak {z}=(\mathfrak {z}_1,\mathfrak {z}_2,\ldots ,\mathfrak {z}_d) \in \mathbb {R}^d\) it holds \({\mathbb {P}}\)-a.s. that
and
Combining this with (124), (125), (126), (127), and Lemma 4.1 proves that
This establishes Item (iii). Next note that Item (iii) demonstrates that for all \(i\in \{1,2,\ldots ,d\}\) it holds that
This establishes Item (iv). In the next step we prove Item (v). Observe that Item (ii), Tonelli’s theorem, and the triangle inequality prove that for all \(r\in [0,T]\), \(s\in [r,T]\), \(p\in [1,\infty )\) it holds that
This and Jensen’s inequality show that for all \(r\in [0,T]\), \(s\in [r,T]\), \(p\in [1,\infty )\) it holds that
This, the triangle inequality, (115), and Item (iv) show that for all \(r\in [0,T]\), \(p\in [1,\infty )\) it holds that
In addition, note that the fact that \(\sup _{s\in [0,T],\xi \in \mathbb {R}^d}\tfrac{|u(s,\xi )|}{1+\Vert \xi \Vert ^\lambda }<\infty \) ensures that for all \(p\in [1,\infty )\) it holds that
This, (134), and Gronwall’s inequality yield that for all \(p\in [1,\infty )\) it holds that
The proof of Lemma 4.2 is thus completed. \(\square \)
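To make the stochastic representation in Item (ii) of Lemma 4.2 more tangible, the following Python sketch estimates the right-hand side of (119) by plain Monte Carlo in the special case \(f\equiv 0\), in which (119) reduces to \((u(s,x),(\nabla _x u)(s,x))={\mathbb {E}}[ g(x+W_{T-s})(1,\tfrac{W_{T-s}}{T-s})]\). The function names and the test function \(g(x)=\sum _{j=1}^d x_j\) are illustrative choices of ours (not taken from the article); for this \(g\) the exact expectation of the estimator is \((\sum _{j=1}^d x_j,1,\ldots ,1)\), which the sample averages should reproduce up to Monte Carlo error.

```python
import numpy as np

def fk_value_and_gradient(g, x, s, T, M, rng):
    # Monte Carlo estimate of E[ g(x + W_{T-s}) * (1, W_{T-s}/(T-s)) ],
    # i.e. of the right-hand side of (119) in the special case f = 0.
    tau = T - s
    W = np.sqrt(tau) * rng.standard_normal((M, x.size))  # samples of W_{T-s}
    gx = g(x + W)                                        # shape (M,)
    value = np.mean(gx)
    gradient = np.mean(gx[:, None] * W / tau, axis=0)    # weight W_{T-s}/(T-s)
    return value, gradient

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, T, s = 5, 1.0, 0.25
    x = rng.standard_normal(d)
    g = lambda y: y.sum(axis=-1)   # illustrative test function only
    val, grad = fk_value_and_gradient(g, x, s, T, 10**5, rng)
    print(val, x.sum())            # value estimate vs. its exact expectation
    print(grad)                    # gradient estimate; exact expectation is (1, ..., 1)
```

The same weight \(\tfrac{W_{T-s}}{T-s}\) is what produces the gradient component in the MLP approximations of Sect. 3, which is why bounds on both \(u\) and \(\nabla _x u\), as provided by Items (iii)–(v), enter the error analysis.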
5 Overall Complexity Analysis for MLP Approximation Methods
In this section we combine the findings of Sects. 3 and 4 to establish in Theorem 5.2 below the main approximation result of this article; see also Corollary 5.1 and Corollary 5.4 below. The i.i.d. random variables \(\mathfrak {r}^\theta :\Omega \rightarrow (0,1)\), \(\theta \in \Theta \), appearing in the MLP approximation methods in Corollary 5.1 (see (140) in Corollary 5.1), Theorem 5.2 (see (160) in Theorem 5.2), and Corollary 5.4 (see (175) in Corollary 5.4) are employed to approximate the time integrals in the semigroup formulations of the PDEs under consideration. One of the key ingredients of the MLP approximation methods, which we propose and analyze in this article, is the fact that the density of these i.i.d. random variables \(\mathfrak {r}^\theta :\Omega \rightarrow (0,1)\), \(\theta \in \Theta \), is equal to the function \((0,1) \ni s \mapsto \alpha s^{\alpha -1}\in \mathbb {R}\) for some \(\alpha \in ( 0, 1 )\), or equivalently, that these i.i.d. random variables satisfy for all \(\theta \in \Theta \), \(b \in (0,1)\) that \({\mathbb {P}}( \mathfrak {r}^{ \theta } \le b ) = b^\alpha \) for some \(\alpha \in (0,1)\). In particular, in contrast to previous MLP approximation methods studied in the scientific literature (see, e.g., [4, 45, 46]) it is crucial in this article to exclude the case where the random variables \(\mathfrak {r}^\theta :\Omega \rightarrow (0,1)\), \(\theta \in \Theta \), are continuous uniformly distributed on (0, 1) (corresponding to the case \(\alpha = 1\)). To make this aspect more clear to the reader, we provide in Lemma 5.3 below an explanation why it is essential to exclude the continuous uniform distribution case \(\alpha = 1\). Note that the random variable \(\mathbf{U}:\Omega \rightarrow \mathbb {R}^{d+1}\) in Lemma 5.3 coincides with a special case of the random fields in (160) in Theorem 5.2 (applied with \(g_d(x)=0\), \(f_d(s,x,y,z)=1\), \(M=1\), \(t=0\) for \(s\in [0,T)\), \(x,z \in \mathbb {R}^d\), \(y\in \mathbb {R}\), \(d\in \mathbb {N}\) in the notation of Theorem 5.2).
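As a concrete illustration of the role of these random variables, the following Python sketch generates samples with \({\mathbb {P}}( \mathfrak {r} \le b ) = b^\alpha \) by inverse-transform sampling (if \(V\) is uniformly distributed on (0, 1), then \(V^{1/\alpha }\) has this distribution) and uses them as an importance-sampled Monte Carlo quadrature based on the identity \(\int _0^T h(s)\, ds = {\mathbb {E}}[ T \alpha ^{-1} \mathfrak {r}^{1-\alpha } h(T \mathfrak {r})]\); the weight \(T\alpha ^{-1}\mathfrak {r}^{1-\alpha }\) is the one appearing in Lemma 5.3 below. The integrand \(h\) and the function names are illustrative choices of ours; the sketch is not the scheme (160) itself, which replaces \(h\) by recursively defined random fields.

```python
import numpy as np

def sample_r(alpha, size, rng):
    # Inverse-transform sampling: if V is uniform on (0,1), then
    # P(V**(1/alpha) <= b) = P(V <= b**alpha) = b**alpha for b in (0,1).
    return rng.random(size) ** (1.0 / alpha)

def time_integral_mc(h, T, alpha, M, rng):
    # Unbiased Monte Carlo quadrature of int_0^T h(s) ds based on
    # E[ (T/alpha) * r**(1-alpha) * h(T*r) ] = int_0^T h(s) ds
    # for r with density alpha * s**(alpha-1) on (0,1).
    r = sample_r(alpha, M, rng)
    return np.mean((T / alpha) * r ** (1.0 - alpha) * h(T * r))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T, alpha = 2.0, 0.5
    h = lambda s: np.exp(-s)            # illustrative integrand only
    exact = 1.0 - np.exp(-T)            # int_0^T exp(-s) ds
    print(time_integral_mc(h, T, alpha, 10**6, rng), exact)
```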
5.1 Quantitative Complexity Analysis for MLP Approximation Methods
Corollary 5.1
Let \(\left\| \cdot \right\| _1:(\cup _{n\in \mathbb {N}}\mathbb {R}^n)\rightarrow \mathbb {R}\) and \(\left\| \cdot \right\| _\infty :(\cup _{n\in \mathbb {N}}\mathbb {R}^n)\rightarrow \mathbb {R}\) satisfy for all \(n\in \mathbb {N}\), \(x=(x_1,x_2,\ldots ,x_n)\in \mathbb {R}^n\) that \(\Vert x\Vert _1=\sum _{i=1}^n |x_i|\) and \(\Vert x\Vert _\infty =\max _{i\in \{1,2,\ldots ,n\}}|x_i|\), let \(T,\delta \in (0,\infty )\), \({\varepsilon }\in (0,1]\), \(d \in \mathbb {N}\), \(L=(L_0,L_1,\ldots ,L_{d}) \in \mathbb {R}^{d+1}\), \(K=(K_1,K_2,\ldots ,K_d)\), \(\mathfrak L=(\mathfrak L_1,\mathfrak L_2\ldots , \mathfrak L_d)\), \(\xi \in \mathbb {R}^d\), \(p\in (2,\infty )\), \(\alpha \in (\frac{p-2}{2(p-1)},\frac{p}{2(p-1)})\), \(\beta =\frac{\alpha }{2}-\frac{(1-\alpha )(p-2)}{2p}\), \(f\in C( [0,T]\times \mathbb {R}^d\times \mathbb {R}\times \mathbb {R}^d, \mathbb {R})\), \(g\in C(\mathbb {R}^d, \mathbb {R})\), let \(u = ( u(t,x) )_{ (t,x) \in [0,T] \times \mathbb {R}^d }\in C^{1,2}([0,T]\times \mathbb {R}^d,\mathbb {R})\) be an at most polynomially growing function, assume for all \(t\in (0,T)\), \(x=(x_1,x_2, \ldots ,x_d)\), \(\mathfrak x=(\mathfrak x_1,\mathfrak x_2, \ldots ,\mathfrak x_d)\), \(z=(z_1,z_2,\ldots ,z_d)\), \(\mathfrak {z}=(\mathfrak z_1, \mathfrak z_2, \ldots , \mathfrak z_d)\in \mathbb {R}^d\), \(y,\mathfrak {y} \in \mathbb {R}\) that
let \( F:C([0,T)\times \mathbb {R}^d,\mathbb {R}^{1+d}) \rightarrow C([0,T)\times \mathbb {R}^d,\mathbb {R}) \) satisfy for all \(t\in [0,T)\), \(x\in \mathbb {R}^d\), \(\mathbf{v}\in C([0,T)\times \mathbb {R}^d,\mathbb {R}^{1+d})\) that \((F(\mathbf{v}))(t,x)=f(t,x,\mathbf{v}(t,x))\), let \( ( \Omega , \mathcal {F}, {\mathbb {P}}) \) be a probability space, let \( \Theta = \cup _{ n \in \mathbb {N}} \mathbb {Z}^n \), let \( Z^{ \theta } :\Omega \rightarrow \mathbb {R}^d \), \( \theta \in \Theta \), be i.i.d. standard normal random variables, let \(\mathfrak {r}^\theta :\Omega \rightarrow (0,1)\), \(\theta \in \Theta \), be i.i.d. random variables, assume for all \(b\in (0,1)\) that \({\mathbb {P}}(\mathfrak {r}^0\le b)=b^{1-\alpha }\), assume that \((Z^\theta )_{\theta \in \Theta }\) and \((\mathfrak {r}^\theta )_{ \theta \in \Theta }\) are independent, let \( \mathbf{U}_{ n,M}^{\theta } = ( \mathbf{U}_{ n,M}^{\theta , 0},\mathbf{U}_{ n,M}^{\theta , 1},\ldots ,\mathbf{U}_{ n,M}^{\theta , d} ) :[0,T)\times \mathbb {R}^d\times \Omega \rightarrow \mathbb {R}^{1+d} \), \(n,M\in \mathbb {Z}\), \(\theta \in \Theta \), satisfy for all \( n,M \in \mathbb {N}\), \( \theta \in \Theta \), \( t\in [0,T)\), \(x \in \mathbb {R}^d\) that \( \mathbf{U}_{-1,M}^{\theta }(t,x)=\mathbf{U}_{0,M}^{\theta }(t,x)=0\) and
let \(({\text {RV}}_{n,M})_{(n,M)\in \mathbb {Z}^2}\subseteq \mathbb {Z}\) satisfy for all \(n,M \in \mathbb {N}\) that \({\text {RV}}_{0,M}=0\) and
and let \(C\in (0,\infty )\) satisfy that
Then there exists \(N\in \mathbb {N}\cap [2,\infty )\) such that
and
Proof of Corollary 5.1
Throughout this proof let \((\eta _{n,M})_{(n,M)\in \mathbb {N}^2}\subseteq \mathbb {R}\) satisfy for all \(n,M\in \mathbb {N}\) that
Note that Proposition 3.5 and Item (i) of Lemma 4.2 ensure that for all \(M,n \in \mathbb {N}\) it holds that
Combining this with Lemma 4.2 implies for all \(M,n \in \mathbb {N}\) that
Next observe that (137) and (138) demonstrate that
Combining this with (147) proves that \( \limsup _{n\rightarrow \infty }\eta _{n, \lfloor n^{2\beta } \rfloor } =0 \). Next let \(N\in \mathbb {N}\) satisfy that
and let \(\mathfrak {C}\in \mathbb {R}\) satisfy that
Note that the fact that \(C\ge \frac{1}{2}\) and (150) ensure that
Furthermore, observe that (147) and (150) demonstrate that
Combining this with (149) and (151) proves that
Moreover, observe that [45, Lemma 3.6] assures that for all \(n\in \mathbb {N}\) it holds that \({\text {RV}}_{n,\lfloor n^{2\beta } \rfloor }\le d(5\lfloor n^{2\beta } \rfloor )^n\). Hence, we obtain that
Note that the fact that \(2\beta \in (0,1)\) ensures that
Combining this and (153) with (154) proves that
This establishes (144). The proof of Corollary 5.1 is thus completed. \(\square \)
5.2 Qualitative Complexity Analysis for MLP Approximation Methods
Theorem 5.2
Let \(T,\delta ,\lambda \in (0,\infty )\), \(\alpha \in (0,1)\), \(\beta \in (\max \{\frac{1-2\alpha }{1-\alpha },0\},1-\alpha )\), let \(f_d \in C( [0,T]\times \mathbb {R}^d\times \mathbb {R}\times \mathbb {R}^d,\mathbb {R})\), \(d\in \mathbb {N}\), let \(g_d \in C( \mathbb {R}^d,\mathbb {R})\), \(d\in \mathbb {N}\), let \(\xi _d=(\xi _{d,1}, \xi _{d,2}, \ldots , \xi _{d,d}) \in \mathbb {R}^d\), \(d\in \mathbb {N}\), let \(L_{d,i}\in \mathbb {R}\), \(d,i \in \mathbb {N}\), let \(u_d = ( u_d(t,x) )_{ (t,x) \in [0,T] \times \mathbb {R}^d }\in C^{1,2}([0,T]\times \mathbb {R}^d,\mathbb {R})\), \(d\in \mathbb {N}\), be at most polynomially growing functions, let \( F_d :C([0,T)\times \mathbb {R}^d,\mathbb {R}^{1+d}) \rightarrow C([0,T)\times \mathbb {R}^d,\mathbb {R}) \), \(d\in \mathbb {N}\), be functions, assume for all \(d\in \mathbb {N}\), \(t\in [0,T)\), \(x=(x_1,x_2, \ldots ,x_d)\), \(\mathfrak x=(\mathfrak x_1,\mathfrak x_2, \ldots ,\mathfrak x_d)\), \(z=(z_1,z_2,\ldots ,z_d)\), \(\mathfrak {z}=(\mathfrak z_1, \mathfrak z_2, \ldots , \mathfrak z_d)\in \mathbb {R}^d\), \(y,\mathfrak {y} \in \mathbb {R}\), \(\mathbf{v}\in C([0,T)\times \mathbb {R}^d,\mathbb {R}^{1+d})\) that
let \( ( \Omega , \mathcal {F}, {\mathbb {P}}) \) be a probability space, let \( \Theta = \cup _{ n \in \mathbb {N}} \mathbb {Z}^n \), let \( Z^{d, \theta } :\Omega \rightarrow \mathbb {R}^d \), \(d\in \mathbb {N}\), \( \theta \in \Theta \), be i.i.d. standard normal random variables, let \(\mathfrak {r}^\theta :\Omega \rightarrow (0,1)\), \(\theta \in \Theta \), be i.i.d. random variables, assume for all \(b\in (0,1)\) that \({\mathbb {P}}(\mathfrak {r}^0\le b)=b^{\alpha }\), assume that \((Z^{d,\theta })_{(d, \theta ) \in \mathbb {N}\times \Theta }\) and \((\mathfrak {r}^\theta )_{ \theta \in \Theta }\) are independent, let \( \mathbf{U}_{ n,M}^{d,\theta } = ( \mathbf{U}_{ n,M}^{d,\theta , 0},\mathbf{U}_{ n,M}^{d,\theta , 1},\ldots ,\mathbf{U}_{ n,M}^{d,\theta , d} ) :[0,T)\times \mathbb {R}^d\times \Omega \rightarrow \mathbb {R}^{1+d} \), \(n,M,d\in \mathbb {Z}\), \(\theta \in \Theta \), satisfy for all \( n,M,d \in \mathbb {N}\), \( \theta \in \Theta \), \( t\in [0,T)\), \(x \in \mathbb {R}^d\) that \( \mathbf{U}_{-1,M}^{d,\theta }(t,x)=\mathbf{U}_{0,M}^{d,\theta }(t,x)=0\) and
and let \({\text {RV}}_{d,n,M}\in \mathbb {Z}\), \(d,n,M\in \mathbb {Z}\), satisfy for all \(d,n,M \in \mathbb {N}\) that \({\text {RV}}_{d,0,M}=0\) and
Then there exist \(c\in \mathbb {R}\) and \(N=(N_{d,{\varepsilon }})_{(d, {\varepsilon }) \in \mathbb {N}\times (0,1]}:\mathbb {N}\times (0,1] \rightarrow \mathbb {N}\) such that for all \(d\in \mathbb {N}\), \({\varepsilon }\in (0,1]\) it holds that \( \sum _{n=1}^{N_{d,{\varepsilon }}}{\text {RV}}_{d,n,\lfloor n^{\beta } \rfloor } \le c d^c \varepsilon ^{-(2+\delta )}\) and
Proof of Theorem 5.2
Throughout this proof let \(p=\frac{2\alpha }{\beta +2\alpha -1}\). Note that the hypothesis that \(\alpha \in (0,1)\) and the hypothesis that \(\beta \in (\max \{\frac{1-2\alpha }{1-\alpha },0\},1-\alpha )\) ensure that \(p>\frac{2\alpha }{1-\alpha +2\alpha -1}=2\). Moreover, observe that the fact that \(p=\frac{2\alpha }{\beta +2\alpha -1}\) demonstrates that
Next note that the hypothesis that \(\alpha \in (0,1)\) and the hypothesis that \(\beta \in (\max \{\frac{1-2\alpha }{1-\alpha },0\},1-\alpha )\) ensure that
Moreover, observe that the hypothesis that \(\alpha \in (0,1)\) and the hypothesis that \(\beta \in (\max \{\frac{1-2\alpha }{1-\alpha },0\},1-\alpha )\) prove that
Combining this with (164) ensures that
Next note that (157) ensures for all \(d\in \mathbb {N}\) that
In addition, observe that (157) proves that for all \(d\in \mathbb {N}\) it holds that
Furthermore, note that the fact that for all \( d \in \mathbb {N}\) it holds that \(( \sum _{ j = 1 }^d | L_{d,j} |^2 )^{ 1 / 2 } \le \sum _{ j=1}^d L_{ d,j}\) ensures that for all \(d \in \mathbb {N}\) it holds that
This, (163), (166), the fact that \(p>2\), (167), (168), Corollary 5.1 (applied with \(\alpha =1-\alpha \), \(\beta =\frac{\beta }{2}\), \(p=p\), \(L_0=\sum _{j=1}^dL_{d,j}\), \(L_j=L_{d,j}\), \(K_j=d^\lambda L_{d,j}\), \(\mathfrak {L}_j=d^\lambda L_{d,j}\) for \(j\in \{1,\ldots ,d\}\), \(d\in \mathbb {N}\) in the notation of Corollary 5.1), and the fact that
establish (162). The proof of Theorem 5.2 is thus completed. \(\square \)
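For the reader's convenience we record the elementary computation behind the parameter choice \(\beta =\frac{\beta }{2}\) in the application of Corollary 5.1 above: the fact that \(p=\frac{2\alpha }{\beta +2\alpha -1}\) implies that \(\frac{p-2}{p}=1-\frac{\beta +2\alpha -1}{\alpha }=\frac{1-\alpha -\beta }{\alpha }\) and therefore
$$\begin{aligned} \frac{1-\alpha }{2}-\frac{\alpha (p-2)}{2p} = \frac{1-\alpha }{2}-\frac{1-\alpha -\beta }{2} = \frac{\beta }{2}, \end{aligned}$$
which is precisely the value prescribed for the parameter \(\beta \) in the notation of Corollary 5.1 when Corollary 5.1 is applied with \(\alpha =1-\alpha \) and \(p=p\) as above.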
Lemma 5.3
Let \(T\in (0,\infty )\), \(d\in \mathbb {N}\), \(\alpha \in (0,1]\), let \((\Omega , \mathcal F, {\mathbb {P}})\) be a probability space, let \(Z=(Z^1,Z^2,\ldots , Z^d):\Omega \rightarrow \mathbb {R}^d\) be a standard normal random variable, let \(\mathfrak {r}:\Omega \rightarrow (0,1)\) satisfy for all \(b\in (0,1)\) that \({\mathbb {P}}(\mathfrak {r}\le b)=b^\alpha \), assume that Z and \(\mathfrak {r}\) are independent, and let \(\mathbf{U}=(\mathbf{U}^{0},\mathbf{U}^{1}, \ldots , \mathbf{U}^{d}):\Omega \rightarrow \mathbb {R}^{d+1}\) satisfy that \( \mathbf{U} = T\alpha ^{ - 1 }\mathfrak {r}^{1-\alpha } \big ( 1 , [T\mathfrak {r}]^{-1/2} Z \big ). \) Then
(i) it holds for all \(i\in \{1,2,\ldots ,d\}\) that
$$\begin{aligned} {\mathbb {E}}\big [|\mathbf{U}^i|^2\big ] =\frac{T}{\alpha }\int _0^1s^{-\alpha }\, ds = {\left\{ \begin{array}{ll} \frac{T}{\alpha (1-\alpha )} &{} :\alpha < 1\\ \infty &{} :\alpha = 1 \end{array}\right. } \end{aligned}$$ (171)
and
(ii) it holds that \(\mathbf{U}\in L^2({\mathbb {P}};\mathbb {R}^{d+1})\) if and only if \(\alpha \in (0,1)\).
Proof of Lemma 5.3
Note that for all \(i\in \{1,2,\ldots ,d\}\) it holds that
This proves Item (i). Moreover, observe that Item (i) establishes Item (ii). The proof of Lemma 5.3 is thus completed. \(\square \)
Corollary 5.4
Let \(T,\delta ,\lambda \in (0,\infty )\), let \(f_d \in C( \mathbb {R}\times \mathbb {R}^d,\mathbb {R})\), \(d\in \mathbb {N}\), let \(g_d \in C( \mathbb {R}^d,\mathbb {R})\), \(d\in \mathbb {N}\), let \(\xi _d=(\xi _{d,1}, \xi _{d,2}, \ldots , \xi _{d,d}) \in \mathbb {R}^d\), \(d\in \mathbb {N}\), let \(L_{d,i}\in \mathbb {R}\), \(d,i \in \mathbb {N}\), let \(u_d = ( u_d(t,x) )_{ (t,x) \in [0,T] \times \mathbb {R}^d }\in C^{1,2}([0,T]\times \mathbb {R}^d,\mathbb {R})\), \(d\in \mathbb {N}\), be at most polynomially growing functions, assume for all \(d\in \mathbb {N}\), \(t\in [0,T)\), \(x=(x_1,x_2, \ldots ,x_d)\), \(\mathfrak x=(\mathfrak x_1,\mathfrak x_2, \ldots ,\mathfrak x_d)\), \(z=(z_1,z_2,\ldots ,z_d)\), \(\mathfrak {z}=(\mathfrak z_1, \mathfrak z_2, \ldots , \mathfrak z_d)\in \mathbb {R}^d\), \(y,\mathfrak {y} \in \mathbb {R}\) that
and \( d^{-\lambda }(|g_d(0)|+|f_d(0,0)|+\max _{i\in \{1,2,\ldots ,d\}} |\xi _{d,i}|)+\sum _{i=1}^d L_{d,i}\le \lambda , \) let \( ( \Omega , \mathcal {F}, {\mathbb {P}}) \) be a probability space, let \( \Theta = \cup _{ n \in \mathbb {N}} \mathbb {Z}^n \), let \( Z^{d, \theta } :\Omega \rightarrow \mathbb {R}^d \), \(d\in \mathbb {N}\), \( \theta \in \Theta \), be i.i.d. standard normal random variables, let \(\mathfrak {r}^\theta :\Omega \rightarrow (0,1)\), \(\theta \in \Theta \), be i.i.d. random variables, assume for all \(b\in (0,1)\) that \({\mathbb {P}}(\mathfrak {r}^0\le b)=\sqrt{b}\), assume that \((Z^{d,\theta })_{(d, \theta ) \in \mathbb {N}\times \Theta }\) and \((\mathfrak {r}^\theta )_{ \theta \in \Theta }\) are independent, let \( \mathbf{U}_{ n,M}^{d,\theta } = ( \mathbf{U}_{ n,M}^{d,\theta , 0},\mathbf{U}_{ n,M}^{d,\theta , 1},\ldots ,\mathbf{U}_{ n,M}^{d,\theta , d} ) :(0,T]\times \mathbb {R}^d\times \Omega \rightarrow \mathbb {R}^{1+d} \), \(n\in \mathbb {Z}\), \(M,d\in \mathbb {N}\), \(\theta \in \Theta \), satisfy for all \( n,M,d \in \mathbb {N}\), \( \theta \in \Theta \), \( t\in (0,T]\), \(x \in \mathbb {R}^d\) that \( \mathbf{U}_{-1,M}^{d,\theta }(t,x)=\mathbf{U}_{0,M}^{d,\theta }(t,x)=0\) and
and let \({\text {RV}}_{d,n,M}\in \mathbb {Z}\), \(d,n,M\in \mathbb {Z}\), satisfy for all \(d,n,M \in \mathbb {N}\) that \({\text {RV}}_{d,0,M}=0\) and
Then there exist \(c\in \mathbb {R}\) and \(N=(N_{d,{\varepsilon }})_{(d, {\varepsilon }) \in \mathbb {N}\times (0,1]}:\mathbb {N}\times (0,1] \rightarrow \mathbb {N}\) such that for all \(d\in \mathbb {N}\), \({\varepsilon }\in (0,1]\) it holds that \( \sum _{n=1}^{N_{d,{\varepsilon }}}{\text {RV}}_{d,n,\lfloor n^{1/4} \rfloor } \le c d^c \varepsilon ^{-(2+\delta )}\) and
Proof of Corollary 5.4
Observe that Theorem 5.2 (applied with \(\alpha =\frac{1}{2}\), \(\beta =\frac{1}{4}\), \(T=2T\), \(u_d(t,x)=u_d(T-\frac{t}{2},x)\), \(f_d(t,x,y,z)=f_d(y,z)/2\) for \(t\in [0,2T]\), \(x,z\in \mathbb {R}^d\), \(y\in \mathbb {R}\), \(d\in \mathbb {N}\) in the notation of Theorem 5.2) establishes (177). This completes the proof of Corollary 5.4. \(\square \)
Corollary 5.5
Let \(T,\delta ,\lambda \in (0,\infty )\), let \(g_d \in C( \mathbb {R}^d,\mathbb {R})\), \(d\in \mathbb {N}\), let \(\xi _d=(\xi _{d,1}, \xi _{d,2}, \ldots , \xi _{d,d}) \in \mathbb {R}^d\), \(d\in \mathbb {N}\), let \(u_d = ( u_d(t,x) )_{ (t,x) \in [0,T] \times \mathbb {R}^d }\in C^{1,2}([0,T]\times \mathbb {R}^d,\mathbb {R})\), \(d\in \mathbb {N}\), be at most polynomially growing functions, assume for all \(d\in \mathbb {N}\), \(t\in [0,T)\), \(x=(x_1,x_2, \ldots ,x_d)\), \(\mathfrak x=(\mathfrak x_1,\mathfrak x_2, \ldots ,\mathfrak x_d)\) that \( |g_d(0)|+\max _{i\in \{1,2,\ldots ,d\}} |\xi _{d,i}|\le \lambda d^\lambda \), \( |g_d(x)-g_d(\mathfrak x)|\le \lambda d^\lambda (\textstyle { \sum _{j=1}^d} |x_j-\mathfrak {x}_j|)\), \( u_d(0,x) = g_d(x)\), and
let \( ( \Omega , \mathcal {F}, {\mathbb {P}}) \) be a probability space, let \( \Theta = \cup _{ n \in \mathbb {N}} \mathbb {Z}^n \), let \( Z^{d, \theta } :\Omega \rightarrow \mathbb {R}^d \), \(d\in \mathbb {N}\), \( \theta \in \Theta \), be i.i.d. standard normal random variables, let \(\mathfrak {r}^\theta :\Omega \rightarrow (0,1)\), \(\theta \in \Theta \), be i.i.d. random variables, assume for all \(b\in (0,1)\) that \({\mathbb {P}}(\mathfrak {r}^0\le b)=\sqrt{b}\), assume that \((Z^{d,\theta })_{(d, \theta ) \in \mathbb {N}\times \Theta }\) and \((\mathfrak {r}^\theta )_{ \theta \in \Theta }\) are independent, let \( \mathbf{U}_{ n,M}^{d,\theta } = ( \mathbf{U}_{ n,M}^{d,\theta , 0},\mathbf{U}_{ n,M}^{d,\theta , 1},\ldots ,\mathbf{U}_{ n,M}^{d,\theta , d} ) :(0,T]\times \mathbb {R}^d\times \Omega \rightarrow \mathbb {R}^{1+d} \), \(n\in \mathbb {Z}\), \(M,d\in \mathbb {N}\), \(\theta \in \Theta \), satisfy for all \( n,M,d \in \mathbb {N}\), \( \theta \in \Theta \), \( t\in (0,T]\), \(x \in \mathbb {R}^d\) that \( \mathbf{U}_{-1,M}^{d,\theta }(t,x)=\mathbf{U}_{0,M}^{d,\theta }(t,x)=0\) and
and let \({\text {RV}}_{d,n,M}\in \mathbb {Z}\), \(d,n,M\in \mathbb {Z}\), satisfy for all \(d,n,M \in \mathbb {N}\) that \({\text {RV}}_{d,0,M}=0\) and
Then there exist \(c\in \mathbb {R}\) and \(N=(N_{d,{\varepsilon }})_{(d, {\varepsilon }) \in \mathbb {N}\times (0,1]}:\mathbb {N}\times (0,1] \rightarrow \mathbb {N}\) such that for all \(d\in \mathbb {N}\), \({\varepsilon }\in (0,1]\) it holds that \( \sum _{n=1}^{N_{d,{\varepsilon }}}{\text {RV}}_{d,n,\lfloor n^{1/4} \rfloor } \le c d^c \varepsilon ^{-(2+\delta )}\) and
Proof of Corollary 5.5
Throughout this proof let \(\kappa =2 \lambda +\sum _{j=1}^\infty \frac{1}{j^2}\). Observe that the fact that for all \(z, \mathfrak {z} \in \mathbb {R}\) it holds that \(|\sin (z)-\sin (\mathfrak {z})|\le |z-\mathfrak {z}|\) and the assumption that for all \(d\in \mathbb {N}\), \(x=(x_1,x_2, \ldots ,x_d)\), \(\mathfrak x=(\mathfrak x_1,\mathfrak x_2, \ldots ,\mathfrak x_d) \in \mathbb {R}^d\) it holds that \( |g_d(x)-g_d(\mathfrak x)|\le \lambda d^\lambda (\textstyle { \sum _{j=1}^d} |x_j-\mathfrak {x}_j|)\) imply that for all \(d\in \mathbb {N}\), \(x=(x_1,x_2, \ldots ,x_d)\), \(\mathfrak x=(\mathfrak x_1,\mathfrak x_2, \ldots ,\mathfrak x_d)\), \(z=(z_1,z_2,\ldots ,z_d)\), \(\mathfrak {z}=(\mathfrak z_1, \mathfrak z_2, \ldots , \mathfrak z_d)\in \mathbb {R}^d\) it holds that
Moreover, the assumption that for all \(d\in \mathbb {N}\) it holds that \( |g_d(0)|+\max _{i\in \{1,2,\ldots ,d\}} |\xi _{d,i}|\le \lambda d^\lambda \) implies that
This, (182), and Corollary 5.4 establish (181). This completes the proof of Corollary 5.5. \(\square \)
References
Beck, C., Becker, S., Cheridito, P., Jentzen, A., and Neufeld, A. Deep splitting method for parabolic PDEs. arXiv:1907.03452 (2019). Revision requested from SIAM Journal of Scientific Computing.
Beck, C., Becker, S., Grohs, P., Jaafari, N., and Jentzen, A. Solving the Kolmogorov PDE by means of deep learning. arXiv:1806.00421 (2018). Accepted in Journal of Scientific Computing.
Beck, C., E, W., and Jentzen, A. Machine learning approximation algorithms for high-dimensional fully nonlinear partial differential equations and second-order backward stochastic differential equations. Journal of Nonlinear Science 29, 4 (2019), 1563–1619.
Beck, C., Hornung, F., Hutzenthaler, M., Jentzen, A., and Kruse, T. Overcoming the curse of dimensionality in the numerical approximation of Allen-Cahn partial differential equations via truncated full-history recursive multilevel Picard approximations. Journal of Numerical Mathematics 28, 4 (2020), 197–222.
Becker, S., Braunwarth, R., Hutzenthaler, M., Jentzen, A., and von Wurstemberger, P. Numerical simulations for full history recursive multilevel Picard approximations for systems of high-dimensional partial differential equations. Commun. Comput. Phys. 28, 5 (2020), 2109–2138.
Becker, S., Cheridito, P., and Jentzen, A. Deep optimal stopping. Journal of Machine Learning Research 20, 74 (2019), 1–25.
Becker, S., Cheridito, P., Jentzen, A., and Welti, T. Solving high-dimensional optimal stopping problems using deep learning. European Journal of Applied Mathematics (2021). https://www.cambridge.org/core/journals/european-journal-of-applied-mathematics/article/solving-highdimensional-optimastopping-problems-using-deep-learning/A632772461C859353E6F8A7DAB8A1769
Berg, J., and Nyström, K. A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317 (2018), 28–41.
Berner, J., Grohs, P., and Jentzen, A. Analysis of the generalization error: Empirical risk minimization over deep artificial neural networks overcomes the curse of dimensionality in the numerical approximation of Black-Scholes partial differential equations. SIAM Journal on Mathematics of Data Science 2, 3 (2020), 631–657.
Bouchard, B., Tan, X., Warin, X., and Zou, Y. Numerical approximation of BSDEs using local polynomial drivers and branching processes. Monte Carlo Methods and Applications 23, 4 (2017), 241–263.
Bouchard, B., and Touzi, N. Discrete-time approximation and Monte-Carlo simulation of backward stochastic differential equations. Stochastic Processes and their Applications 111, 2 (2004), 175–206.
Briand, P., and Labart, C. Simulation of BSDEs by Wiener chaos expansion. The Annals of Applied Probability 24, 3 (2014), 1129–1171.
Chan-Wai-Nam, Q., Mikael, J., and Warin, X. Machine learning for semi linear PDEs. J. Sci. Comput. 79, 3 (2019), 1667–1712.
Chen, Y., and Wan, J. W. Deep neural network framework based on backward stochastic differential equations for pricing and hedging american options in high dimensions. arXiv:1909.11532 (2019).
Da Prato, G., and Zabczyk, J. Differentiability of the Feynman-Kac semigroup and a control application. Atti Accad. Naz. Lincei Cl. Sci. Fis. Mat. Natur. Rend. Lincei (9) Mat. Appl. 8, 3 (1997), 183–188.
Dockhorn, T. A discussion on solving partial differential equations using neural networks. arXiv:1904.07200 (2019).
E, W., Han, J., and Jentzen, A. Deep learning-based numerical methods for high-dimensional parabolic partial differential equations and backward stochastic differential equations. Commun. Math. Stat. 5, 4 (2017), 349–380.
E, W., Han, J., and Jentzen, A. Algorithms for Solving High Dimensional PDEs: From Nonlinear Monte Carlo to Machine Learning. arXiv:2008.13333 (2020).
E, W., Hutzenthaler, M., Jentzen, A., and Kruse, T. Multilevel Picard iterations for solving smooth semilinear parabolic heat equations. arXiv:1607.03295 (2016). Springer Nature Partial Differential Equations and Applications (in press).
E, W., Hutzenthaler, M., Jentzen, A., and Kruse, T. On multilevel Picard numerical approximations for high-dimensional nonlinear parabolic partial differential equations and high-dimensional nonlinear backward stochastic differential equations. Journal of Scientific Computing 79, 3 (2019), 1534–1571.
E, W., and Yu, B. The Deep Ritz method: A deep learning-based numerical algorithm for solving variational problems. Communications in Mathematics and Statistics 6, 1 (2018), 1–12.
El Karoui, N., Peng, S., and Quenez, M. C. Backward stochastic differential equations in finance. Mathematical Finance 7, 1 (1997), 1–71.
Elbrächter, D., Grohs, P., Jentzen, A. and Schwab, C. DNN Expression Rate Analysis of High-Dimensional PDEs: Application to Option Pricing. Constr. Approx. (2021). https://doi.org/10.1007/s00365-021-09541-6
Farahmand, A.-m., Nabi, S., and Nikovski, D. Deep reinforcement learning for partial differential equation control. 2017 American Control Conference (ACC) (2017), 3120–3127.
Fujii, M., Takahashi, A., and Takahashi, M. Asymptotic Expansion as Prior Knowledge in Deep Learning Method for high dimensional BSDEs. arXiv:1710.07030 (2017).
Geiss, C., and Labart, C. Simulation of BSDEs with jumps by Wiener chaos expansion. Stochastic Processes and their Applications 126, 7 (2016), 2123–2162.
Giles, M. B. Multilevel Monte Carlo path simulation. Oper. Res. 56, 3 (2008), 607–617.
Giles, M. B., Jentzen, A., and Welti, T. Generalised multilevel Picard approximations. arXiv:1911.03188 (2019). Revision requested from IMA J. Num. Anal.
Gobet, E., Lemor, J.-P., and Warin, X. A regression-based Monte Carlo method to solve backward stochastic differential equations. The Annals of Applied Probability 15, 3 (2005), 2172–2202.
Gobet, E., Turkedjiev, P., et al. Approximation of backward stochastic differential equations using Malliavin weights and least-squares regression. Bernoulli 22, 1 (2016), 530–562.
Goudenège, L., Molent, A., and Zanette, A. Machine Learning for Pricing American Options in High Dimension. arXiv:1903.11275 (2019), 11 pages.
Grohs, P., Hornung, F., Jentzen, A., and von Wurstemberger, P. A proof that artificial neural networks overcome the curse of dimensionality in the numerical approximation of Black-Scholes partial differential equations. arXiv:1809.02362 (2019). Accepted in Mem. Amer. Math. Soc.
Grohs, P., Hornung, F., Jentzen, A., and Zimmermann, P. Space-time error estimates for deep neural network approximations for differential equations. arXiv:1908.03833 (2019).
Grohs, P., Jentzen, A., and Salimova, D. Deep neural network approximations for Monte Carlo algorithms. arXiv:1908.10828 (2019). Accepted in Springer Nature Partial Differential Equations and Applications.
Han, J., Jentzen, A., and E, W. Solving high-dimensional partial differential equations using deep learning. Proceedings of the National Academy of Sciences 115, 34 (2018), 8505–8510.
Han, J., and Long, J. Convergence of the Deep BSDE Method for Coupled FBSDEs. arXiv:1811.01165 (2018).
Heinrich, S. Monte Carlo complexity of global solution of integral equations. J. Complexity 14, 2 (1998), 151–175.
Heinrich, S., and Sindambiwe, E. Monte Carlo complexity of parametric integration. J. Complexity 15, 3 (1999), 317–341. Dagstuhl Seminar on Algorithms and Complexity for Continuous Problems (1998).
Henry-Labordère, P. Counterparty risk valuation: a marked branching diffusion approach. arXiv:1203.2369 (2012).
Henry-Labordère, P. Deep Primal-Dual Algorithm for BSDEs: Applications of Machine Learning to CVA and IM. Available at SSRN:https://doi.org/10.2139/ssrn.3071506 (2017).
Henry-Labordère, P., Oudjane, N., Tan, X., Touzi, N., and Warin, X. Branching diffusion representation of semilinear PDEs and Monte Carlo approximation. Annales de l’Institut Henri Poincaré, Probabilités et Statistiques 55, 1 (2019), 184–210.
Henry-Labordère, P., Tan, X., and Touzi, N. A numerical algorithm for a class of BSDEs via the branching process. Stochastic Process. Appl. 124, 2 (2014), 1112–1140.
Huré, C., Pham, H., and Warin, X. Some machine learning schemes for high-dimensional nonlinear PDEs. arXiv:1902.01599 (2019).
Hutzenthaler, M., Jentzen, A., Kruse, T., and Nguyen, T. A. A proof that rectified deep neural networks overcome the curse of dimensionality in the numerical approximation of semilinear heat equations. SN Partial Differential Equations and Applications 1 (2020), 1–34.
Hutzenthaler, M., Jentzen, A., Kruse, T., Nguyen, T. A., and von Wurstemberger, P. Overcoming the curse of dimensionality in the numerical approximation of semilinear parabolic partial differential equations. Proceedings of the Royal Society A 476, 2244 (2020), 20190630.
Hutzenthaler, M., Jentzen, A., and von Wurstemberger, P. Overcoming the curse of dimensionality in the approximative pricing of financial derivatives with default risks. Electronic Journal of Probability 25 (2020).
Hutzenthaler, M., and Kruse, T. Multilevel Picard approximations of high-dimensional semilinear parabolic differential equations with gradient-dependent nonlinearities. SIAM Journal on Numerical Analysis 58, 2 (2020), 929–961.
Jacquier, A., and Oumgari, M. Deep PPDEs for rough local stochastic volatility. arXiv:1906.02551 (2019).
Jentzen, A., Salimova, D., and Welti, T. A proof that deep artificial neural networks overcome the curse of dimensionality in the numerical approximation of Kolmogorov partial differential equations with constant diffusion and nonlinear drift coefficients. arXiv:1809.07321 (2018). Accepted in Communications in Mathematical Sciences.
Jianyu, L., Siwei, L., Yingjian, Q., and Yaping, H. Numerical solution of elliptic partial differential equation using radial basis function neural networks. Neural Networks 16, 5 (2003), 729 – 734.
Kutyniok, G., Petersen, P., Raslan, M., and Schneider, R. A theoretical analysis of deep neural networks and parametric PDEs. arXiv:1904.00377 (2019).
Lagaris, I. E., Likas, A., and Fotiadis, D. I. Artificial neural networks for solving ordinary and partial differential equations. IEEE Transactions on Neural Networks 9, 5 (1998), 987–1000.
Lemor, J.-P., Gobet, E., and Warin, X. Rate of convergence of an empirical regression method for solving generalized backward stochastic differential equations. Bernoulli 12, 5 (2006), 889–916.
Long, Z., Lu, Y., Ma, X., and Dong, B. PDE-Net: Learning PDEs from Data. In Proceedings of the 35th International Conference on Machine Learning (2018), pp. 3208–3216.
Lye, K. O., Mishra, S., and Ray, D. Deep learning observables in computational fluid dynamics. arXiv:1903.03040 (2019).
Magill, M., Qureshi, F., and de Haan, H. W. Neural networks trained to solve differential equations learn general representations. In Advances in Neural Information Processing Systems (2018), pp. 4071–4081.
Meade, Jr., A. J., and Fernández, A. A. The numerical solution of linear ordinary differential equations by feedforward neural networks. Math. Comput. Modelling 19, 12 (1994), 1–25.
Pham, H., and Warin, X. Neural networks-based backward scheme for fully nonlinear PDEs. arXiv:1908.00412 (2019).
Qi, F. Bounds for the ratio of two gamma functions. Journal of Inequalities and Applications 2010, 1 (2010), 493058.
Raissi, M. Forward-Backward Stochastic Neural Networks: Deep Learning of High-dimensional Partial Differential Equations. arXiv:1804.07010 (2018).
Reisinger, C., and Zhang, Y. Rectified deep neural networks overcome the curse of dimensionality for nonsmooth value functions in zero-sum games of nonlinear stiff systems. arXiv:1903.06652 (2019).
Sirignano, J., and Spiliopoulos, K. DGM: A deep learning algorithm for solving partial differential equations. arXiv:1708.07469 (2017).
Uchiyama, T., and Sonehara, N. Solving inverse problems in nonlinear PDEs by recurrent neural networks. In IEEE International Conference on Neural Networks (1993), IEEE, pp. 99–102.
Wendel, J. Note on the gamma function. The American Mathematical Monthly 55, 9 (1948), 563–564.
Acknowledgements
We thank two anonymous reviewers and the handling editor for very helpful comments and suggestions. The first author acknowledges funding by the Deutsche Forschungsgemeinschaft (DFG) via RTG 2131 High-dimensional Phenomena in Probability – Fluctuations and Discontinuity and via research grant HU 1889/6-1. The second author acknowledges funding by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy EXC 2044-390685587, Mathematics Muenster: Dynamics-Geometry-Structure.
Funding
Open Access funding enabled and organized by Projekt DEAL.
Communicated by Endre Süli.
Keywords
- Curse of dimensionality
- Partial differential equation
- PDE
- Backward stochastic differential equation
- BSDE
- Multilevel Picard
- Multilevel Monte Carlo
- Gradient-dependent nonlinearity
Mathematics Subject Classification
- 65M75