1 Introduction

In the past few years deep artificial neural networks (DNNs) have been successfully employed in a large number of computational problems including, e.g., language processing (cf., e.g., [13, 23, 29, 31, 38, 57]), image recognition (cf., e.g., [32, 40, 52, 54, 56]), fraud detection (cf., e.g., [12, 51]), and computational advertisement (cf., e.g., [55, 58]).

Recently, it has also been proposed to reformulate high-dimensional partial differential equations (PDEs) as stochastic learning problems and to employ DNNs together with stochastic gradient descent methods to approximate the solutions of such high-dimensional PDEs [3, 16, 17, 20, 26, 39, 53] (cf., e.g., also [14, 37, 42]). We refer, e.g., to [1, 2, 4, 5, 6, 9, 11, 15, 19, 22, 27, 28, 33, 35, 43, 44, 45, 48, 49] and the references mentioned therein for further developments and extensions of such deep learning based numerical approximation methods for PDEs. In particular, the references [2, 9, 17, 35, 45] deal with linear PDEs (and the stochastic differential equations (SDEs) related to them, respectively), the references [1, 11, 15, 19, 20, 28, 33] deal with semilinear PDEs (and the backward stochastic differential equations (BSDEs) related to them, respectively), the references [3, 43, 48, 49] deal with fully nonlinear PDEs (and the second-order backward stochastic differential equations (2BSDEs) related to them, respectively), the references [27, 44, 53] deal with certain specific subclasses of fully nonlinear PDEs (and the 2BSDEs related to them, respectively), and the references [5, 6, 22, 53] deal with free boundary PDEs (and the optimal stopping/option pricing problems related to them (see, e.g., [8, Chapter 1]), respectively).

In the scientific literature there are also a few rigorous mathematical convergence results for DNN based approximation methods for PDEs. For example, the references [27, 53] provide mathematical convergence results for such DNN based approximation methods for PDEs without any information on the convergence speed and, for instance, the references [10, 18, 21, 24, 25, 30, 34, 36, 41, 50] provide mathematical convergence results for such DNN based approximation methods for PDEs with dimension-independent convergence rates and error constants which depend only polynomially on the dimension. In particular, the latter references show that DNNs can approximate solutions of certain PDEs without the curse of dimensionality (cf. [7]) in the sense that the number of real parameters employed to describe the DNN grows at most polynomially both in the PDE dimension \(d \in {\mathbb {N}}\) and in the reciprocal of the prescribed approximation accuracy \(\varepsilon > 0\) (cf., e.g., [46, Chapter 1] and [47, Chapter 9]).

One key argument in most of these articles is, first, to employ a Monte Carlo approximation scheme which can approximate the solution of the PDE under consideration at a fixed space-time point without the curse of dimensionality and, thereafter, to prove that DNNs are flexible enough to mimic the behaviour of the employed approximation scheme (cf., e.g., [36, Section 2 and (i)–(iii) in Section 1] and [24]). With this in mind, one could aim for a general abstract result which shows, under suitable assumptions, that if a certain function can be approximated by any kind of (Monte Carlo) approximation scheme without the curse of dimensionality, then the function can also be approximated with DNNs without the curse of dimensionality.

It is the subject of this article to take a first step in this direction. In particular, the main result of this paper, Theorem 2.3 below, roughly speaking, shows that if a function can be approximated by means of some suitable discrete approximation scheme without the curse of dimensionality (cf. (2.9) in Theorem 2.3 below) and if there exist DNNs which satisfy certain regularity properties and which approximate this discrete approximation scheme without the curse of dimensionality, then the function itself can also be approximated with DNNs without the curse of dimensionality. Moreover, for the number of real parameters used to describe such approximating DNNs we provide in Theorem 2.3 below an explicit upper bound for the optimal exponent of the dimension \(d \in {\mathbb {N}}\) of the function under consideration as well as an explicit lower bound for the optimal exponent of the prescribed approximation accuracy \(\varepsilon >0\) (see (2.16) in Theorem 2.3 below).

In our applications we employ Theorem 2.3 to study DNN approximations for PDEs in Theorem 4.5 below. Theorem 4.5 can be considered as a special case of Theorem 2.3 with the function to be approximated being the solution of a suitable Kolmogorov PDE (cf. (4.42) below) at the final time \(T \in (0, \infty )\) and the approximation scheme being the Monte Carlo Euler scheme. In particular, Theorem 4.5 shows that solutions of suitable Kolmogorov PDEs can be approximated with DNNs without the curse of dimensionality. For the number of real parameters used to describe such approximating DNNs, Theorem 4.5 also provides an explicit upper bound for the optimal exponent of the dimension \(d \in {\mathbb {N}}\) of the PDE under consideration as well as an explicit lower bound for the optimal exponent of the prescribed approximation accuracy \(\varepsilon >0\) (see (4.43) below). In order to illustrate the findings of Theorem 4.5, we now present in Theorem 1.1 below a special case of Theorem 4.5.

Theorem 1.1

Let \( \varphi _{0,d} \in C({\mathbb {R}}^d, {\mathbb {R}}) \), \( d \in {\mathbb {N}}\), and \( \varphi _{ 1, d } \in C({\mathbb {R}}^d, {\mathbb {R}}^d) \), \( d \in {\mathbb {N}}\), let \(\left\| \cdot \right\| \! :(\cup _{d \in {\mathbb {N}}} {\mathbb {R}}^d) \rightarrow [0, \infty )\) and \({\mathfrak {R}}:(\cup _{d \in {\mathbb {N}}} {\mathbb {R}}^d) \rightarrow (\cup _{d \in {\mathbb {N}}} {\mathbb {R}}^d)\) satisfy for all \(d \in {\mathbb {N}}\), \(x = (x_1, \ldots , x_d) \in {\mathbb {R}}^d\) that

$$\begin{aligned} \Vert x\Vert = \big ( \textstyle \sum _{i=1}^d |x_i|^2\big )^{1/2} \qquad \text {and} \qquad {\mathfrak {R}}(x) = (\max \{x_1, 0\}, \ldots , \max \{x_d, 0\}), \end{aligned}$$
(1.1)

let \({\mathbf {N}}= \cup _{L \in {\mathbb {N}}} \cup _{ l_0,l_1,\ldots , l_L\in {\mathbb {N}}} ( \times _{k = 1}^L ({\mathbb {R}}^{l_k \times l_{k-1}} \times {\mathbb {R}}^{l_k}) )\), let \({\mathcal {P}}:{\mathbf {N}}\rightarrow {\mathbb {N}}\) and \({\mathcal {R}}:{\mathbf {N}}\rightarrow (\cup _{k,l\in {\mathbb {N}}} C({\mathbb {R}}^k,{\mathbb {R}}^l))\) satisfy for all \( L\in {\mathbb {N}}\), \(l_0,l_1,\ldots , l_L \in {\mathbb {N}}\), \( \Phi = ((W_1, B_1),\ldots , (W_L,B_L)) \in ( \times _{k = 1}^L({\mathbb {R}}^{l_k \times l_{k-1}} \times {\mathbb {R}}^{l_k})) \), \(x_0 \in {\mathbb {R}}^{l_0}, x_1 \in {\mathbb {R}}^{l_1}, \ldots , x_{L} \in {\mathbb {R}}^{l_{L}}\) with \(\forall \, k \in {\mathbb {N}}\cap (0,L) :x_k = {\mathfrak {R}}(W_k x_{k-1} + B_k)\) that \({\mathcal {P}}(\Phi ) = \sum _{k = 1}^L l_k(l_{k-1} + 1) \), \({\mathcal {R}}(\Phi ) \in C({\mathbb {R}}^{l_0},{\mathbb {R}}^{l_L})\), and

$$\begin{aligned} ({\mathcal {R}}(\Phi )) (x_0) = W_L x_{L-1} + B_L, \end{aligned}$$
(1.2)

let \( T, \kappa , {\mathfrak {e}}\in (0, \infty )\), \({\mathfrak {d}}\in [4, \infty )\), \(\theta \in [1, \infty )\), \( ( \phi ^{ m, d }_{ \varepsilon } )_{ (m, d, \varepsilon ) \in \{ 0, 1 \} \times {\mathbb {N}}\times (0,1] } \subseteq {\mathbf {N}}\), assume for all \( d \in {\mathbb {N}}\), \( \varepsilon \in (0,1] \), \( m \in \{0, 1\}\), \( x, y \in {\mathbb {R}}^d \) that

$$\begin{aligned}&{\mathcal {R}}( \phi ^{ 0, d }_{ \varepsilon } ) \in C( {\mathbb {R}}^d, {\mathbb {R}}), \quad {\mathcal {R}}( \phi ^{ 1, d }_{ \varepsilon } ) \in C( {\mathbb {R}}^d, {\mathbb {R}}^d ), \quad {\mathcal {P}}( \phi ^{ m, d }_{ \varepsilon } ) \le \kappa d^{ 2^{(-m)} {\mathfrak {d}}} \varepsilon ^{ - 2^{(-m)} {\mathfrak {e}}}, \end{aligned}$$
(1.3)
$$\begin{aligned}&|( {\mathcal {R}}(\phi ^{ 0, d }_{ \varepsilon }) )(x) - ( {\mathcal {R}}(\phi ^{ 0, d }_{ \varepsilon }) )(y)| \le \kappa d^{{\mathfrak {d}}} (1 + \Vert x\Vert ^{\theta } + \Vert y \Vert ^{\theta })\Vert x-y\Vert , \end{aligned}$$
(1.4)
$$\begin{aligned}&\Vert ( {\mathcal {R}}(\phi ^{ 1, d }_{ \varepsilon }) )(x) \Vert \le \kappa ( d^{ {\mathfrak {d}}} + \Vert x \Vert ), \qquad | \varphi _{ 0, d }( x )| \le \kappa d^{ {\mathfrak {d}}} ( d^{ \theta {\mathfrak {d}}} + \Vert x \Vert ^{ \theta } ), \end{aligned}$$
(1.5)
$$\begin{aligned}&\Vert \varphi _{ m, d }(x) - ( {\mathcal {R}}(\phi ^{ m, d }_{ \varepsilon }) )(x) \Vert \le \varepsilon \kappa d^{{\mathfrak {d}}} (d^{\theta {\mathfrak {d}}}+ \Vert x\Vert ^{\theta }), \end{aligned}$$
(1.6)

and \( \Vert \varphi _{ 1, d }( x ) - \varphi _{ 1, d }( y ) \Vert \le \kappa \Vert x - y \Vert \), and for every \( d \in {\mathbb {N}}\) let \( u_d :[0,T] \times {\mathbb {R}}^{d} \rightarrow {\mathbb {R}}\) be an at most polynomially growing viscosity solution of

$$\begin{aligned} \begin{aligned} \left( \tfrac{ \partial }{\partial t} u_d \right) ( t, x )&= \left( \tfrac{ \partial }{\partial x} u_d \right) ( t, x ) \, \varphi _{ 1, d }( x ) + \textstyle \sum \limits _{ i = 1 }^d \displaystyle \left( \tfrac{ \partial ^2 }{ \partial x_i^2 } u_d \right) ( t, x ) \end{aligned} \end{aligned}$$
(1.7)

with \( u_d( 0, x ) = \varphi _{ 0, d }( x ) \) for \( ( t, x ) \in (0,T) \times {\mathbb {R}}^d \). Then for every \(p \in (0, \infty )\) there exist \( c \in {\mathbb {R}}\) and \( ( \Psi _{ d, \varepsilon } )_{ (d , \varepsilon ) \in {\mathbb {N}}\times (0,1] } \subseteq {\mathbf {N}}\) such that for all \( d \in {\mathbb {N}}\), \( \varepsilon \in (0,1] \) it holds that \( {\mathcal {R}}( \Psi _{ d, \varepsilon } ) \in C( {\mathbb {R}}^{ d }, {\mathbb {R}}) \), \([ \int _{ [0, 1]^d } | u_d(T, x) - ( {\mathcal {R}}(\Psi _{ d, \varepsilon }) )( x ) |^p \, dx ]^{ \nicefrac { 1 }{ p } } \le \varepsilon \), and

$$\begin{aligned} {\mathcal {P}}( \Psi _{ d, \varepsilon } ) \le c \varepsilon ^{-({\mathfrak {e}}+6)} d^{{\mathfrak {d}}[6 \theta + 13 + 2 {\mathfrak {e}}(\theta +1)]}. \end{aligned}$$
(1.8)

Theorem 1.1 is an immediate consequence of Corollary 4.6 in Sect. 4 below. Corollary 4.6, in turn, is a special case of Theorem 4.5. Let us add some comments regarding the mathematical objects appearing in Theorem 1.1.

The set \( {\mathbf {N}}\) in Theorem 1.1 above is a set of tuples of pairs of real matrices and real vectors; it represents the set of all DNNs (see also Definition 3.1 below). The function \({\mathfrak {R}}:(\cup _{d \in {\mathbb {N}}} {\mathbb {R}}^d) \rightarrow (\cup _{d \in {\mathbb {N}}} {\mathbb {R}}^d)\) in Theorem 1.1 represents the multidimensional rectifier functions. Theorem 1.1 is thus an approximation result for rectified DNNs.

Moreover, for every DNN \( \Phi \in {\mathbf {N}}\) in Theorem 1.1 above, \( {\mathcal {P}}( \Phi ) \in {\mathbb {N}}\) represents the number of real parameters which are used to describe the DNN \( \Phi \) (see also Definition 3.1 below). In particular, for every DNN \( \Phi \in {\mathbf {N}}\) in Theorem 1.1, one can think of \( {\mathcal {P}}( \Phi ) \in {\mathbb {N}}\) as a number proportional to the amount of memory storage needed to store the DNN \(\Phi \). Furthermore, the function \( {\mathcal {R}}:{\mathbf {N}}\rightarrow (\cup _{ k, l \in {\mathbb {N}}} C( {\mathbb {R}}^k, {\mathbb {R}}^l )) \) from the set \( {\mathbf {N}}\) of “all DNNs” to the union \( \cup _{ k, l \in {\mathbb {N}}} C( {\mathbb {R}}^k, {\mathbb {R}}^l ) \) of continuous functions describes the realization functions associated to the DNNs (see also Definition 3.3 below).
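
To make the formalism concrete, the parameter count \({\mathcal {P}}(\Phi )\) and the realization \({\mathcal {R}}(\Phi )\) can be sketched in a few lines of code. This is a minimal editorial illustration (not part of the paper's formal development); a network is stored as the list of matrix-vector pairs \(((W_1, B_1), \ldots , (W_L, B_L))\) from the definition of \({\mathbf {N}}\) above.

```python
import numpy as np

def num_params(phi):
    """P(Phi) = sum_k l_k (l_{k-1} + 1): entries of all W_k plus all B_k."""
    return sum(W.size + B.size for W, B in phi)

def realization(phi, x):
    """R(Phi): affine maps with the rectifier applied after every layer
    except the last, as in (1.2)."""
    for k, (W, B) in enumerate(phi):
        x = W @ x + B
        if k < len(phi) - 1:            # no activation on the output layer
            x = np.maximum(x, 0.0)      # multidimensional rectifier
    return x

# A network with layer dimensions (l_0, l_1, l_2) = (3, 4, 1):
rng = np.random.default_rng(1)
phi = [(rng.standard_normal((4, 3)), rng.standard_normal(4)),
       (rng.standard_normal((1, 4)), rng.standard_normal(1))]
print(num_params(phi))  # 4 * (3 + 1) + 1 * (4 + 1) = 21
```

For instance, the network above maps \({\mathbb {R}}^3\) to \({\mathbb {R}}\) and uses \(21\) real parameters.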

The real number \( T > 0 \) in Theorem 1.1 describes the time horizon under consideration and the real numbers \( \kappa , {\mathfrak {e}}, {\mathfrak {d}}, \theta \in {\mathbb {R}}\) in Theorem 1.1 are constants used to formulate the assumptions in Theorem 1.1. The key assumption in Theorem 1.1 is the hypothesis that both the possibly nonlinear initial value functions \( \varphi _{ 0, d } :{\mathbb {R}}^d \rightarrow {\mathbb {R}}\), \( d \in {\mathbb {N}}\), and the possibly nonlinear drift coefficient functions \( \varphi _{ 1, d } :{\mathbb {R}}^d \rightarrow {\mathbb {R}}^d \), \( d \in {\mathbb {N}}\), of the PDEs in (1.7) can be approximated by means of DNNs without the curse of dimensionality (see (1.3)–(1.6) above for details).

Results related to Theorem 4.5 have been established in [24, Theorem 3.14], [36, Theorem 1.1], [34, Theorem 4.1], and [50, Corollary 2.2]. Theorem 3.14 in [24] proves a statement similar to (1.8) for a different class of PDEs than (1.7), that is, Theorem 3.14 in [24] deals with Black-Scholes PDEs with affine linear coefficient functions while in (1.7) the diffusion coefficient is constant and the drift coefficient may be nonlinear. Theorem 1.1 in [36] shows the existence of constants and exponents of \(d \in {\mathbb {N}}\) and \(\varepsilon >0\) such that (1.8) holds but does not provide any explicit form for these exponents. Theorem 4.1 in [34] studies a different class of PDEs than (1.7) (the diffusion coefficient is chosen so that the second order term is the Laplacian and the drift coefficient is chosen to be zero but there is a nonlinearity depending on the PDE solution in the PDE in Theorem 4.1 in [34]) and provides an explicit exponent for \(\varepsilon >0\) and the existence of constants and exponents of \(d \in {\mathbb {N}}\) such that (1.8) holds. Corollary 2.2 in [50] studies a more general class of Kolmogorov PDEs than (1.7) and shows the existence of constants and exponents of \(d \in {\mathbb {N}}\) and \(\varepsilon >0\) such that (1.8) holds. Theorem 4.5 above extends these results by providing, in terms of the assumptions used, explicit exponents for \(d \in {\mathbb {N}}\) and \(\varepsilon > 0\) such that (1.8) holds and, in addition, Theorem 4.5 can be considered as a special case of the general DNN approximation result in Theorem 2.3 with the functions to be approximated being the solutions of the PDEs in (1.7) at the final time \(T \in (0, \infty )\) and the approximation scheme being the Monte Carlo Euler scheme.
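
For orientation, behind this approach lies the classical Feynman-Kac representation \(u_d(T, x) = {\mathbb {E}}[ \varphi _{0, d}( X_T ) ]\), where \(dX_t = \varphi _{1, d}( X_t ) \, dt + \sqrt{2} \, dW_t\) and \(X_0 = x\) (the factor \(\sqrt{2}\) matches the Laplacian in (1.7)). The following code is a minimal sketch of the Monte Carlo Euler scheme for this representation; the concrete choices of \(\varphi _{0, d}\), \(\varphi _{1, d}\), and all numerical parameters are illustrative and not taken from the paper.

```python
import numpy as np

def monte_carlo_euler(phi0, phi1, x, T, N, M, rng):
    """Approximate u_d(T, x) = E[phi0(X_T)] for the Kolmogorov PDE (1.7)
    via M Euler paths with N steps for dX = phi1(X) dt + sqrt(2) dW."""
    d = x.shape[0]
    h = T / N
    X = np.tile(x, (M, 1))                       # M independent paths at x
    for _ in range(N):
        dW = rng.standard_normal((M, d)) * np.sqrt(h)
        X = X + h * phi1(X) + np.sqrt(2.0) * dW
    return float(np.mean(phi0(X)))

# Sanity check with zero drift (heat equation): for phi0(x) = ||x||^2 one
# has u_d(T, 0) = E[||sqrt(2) W_T||^2] = 2 T d.
rng = np.random.default_rng(0)
d, T = 5, 1.0
approx = monte_carlo_euler(lambda y: np.sum(y ** 2, axis=1),
                           lambda y: np.zeros_like(y),
                           np.zeros(d), T, N=20, M=200_000, rng=rng)
print(approx)  # close to 2 * T * d = 10
```

The Monte Carlo error decays like \(M^{-\nicefrac {1}{2}}\) with a constant that grows only polynomially in \(d\), which is the mechanism behind the dimension-robust error bounds above.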

The remainder of this article is organized as follows. In Sect. 2 we present Theorem 2.3, which is the main result of this paper. The proof of Theorem 2.3 employs the elementary result in Lemma 2.2. Lemma 2.2 establishes suitable a priori bounds for random variables and follows from the well-known discrete Gronwall-type inequality in Lemma 2.1 below. In Sect. 3 we develop in Lemmas 3.29 and 3.30 a few elementary results on representation flexibilities of DNNs. The proofs of Lemmas 3.29 and 3.30 use results on a certain artificial neural network (ANN) calculus which we recall and extend in Sects. 3.1–3.7. In Sect. 4 in Theorem 4.5 we employ Lemmas 3.29 and 3.30 to establish the existence of DNNs which approximate solutions of suitable Kolmogorov PDEs without the curse of dimensionality. In our proof of Theorem 4.5 we also employ error estimates for the Monte Carlo Euler method which we present in Proposition 4.4 in Sect. 4. The proof of Proposition 4.4, in turn, makes use of the elementary error estimate results in Lemmas 4.1–4.3 below.

2 Deep artificial neural network (DNN) approximations

In this section we show in Theorem 2.3 below that, roughly speaking, if a function can be approximated by means of some suitable discrete approximation scheme without the curse of dimensionality and if there exist DNNs which satisfy certain regularity properties and which approximate this discrete approximation scheme without the curse of dimensionality, then the function itself can also be approximated with DNNs without the curse of dimensionality.

In our proof of Theorem 2.3 we employ the elementary a priori estimates for expectations of certain random variables in Lemma 2.2 below. Lemma 2.2, in turn, follows from the well-known discrete Gronwall-type inequality in Lemma 2.1 below. For the sake of completeness we include in this section a detailed proof of Lemma 2.1.

2.1 A priori bounds for random variables

Lemma 2.1

Let \(\alpha \in [0, \infty )\), \( \beta \in [0, \infty ]\) and let \( x :{\mathbb {N}}_0 \rightarrow {\mathbb {R}}\) satisfy for all \(n \in {\mathbb {N}}\) that \(x_n \le \alpha x_{n-1} + \beta \). Then it holds for all \(n \in {\mathbb {N}}\) that

$$\begin{aligned} x_n \le \alpha ^n x_0 + \beta (1 + \alpha + \ldots + \alpha ^{n-1}) \le \alpha ^n x_0 + \beta e^{\alpha }. \end{aligned}$$
(2.1)

Proof of Lemma 2.1

We prove (2.1) by induction on \(n \in {\mathbb {N}}\). For the base case \(n=1\) note that the hypothesis that \(\forall \, k \in {\mathbb {N}}:x_k \le \alpha x_{k-1} + \beta \) ensures that

$$\begin{aligned} x_1 \le \alpha x_0 +\beta = \alpha ^1 x_0 + \beta \le \alpha ^1 x_0 + \beta e^\alpha . \end{aligned}$$
(2.2)

This establishes (2.1) in the base case \(n=1\). For the induction step \({\mathbb {N}}\ni (n-1) \rightarrow n \in {\mathbb {N}}\cap [2, \infty )\) observe that the hypothesis that \(\forall \, k \in {\mathbb {N}}:x_k \le \alpha x_{k-1} + \beta \) implies that for all \(n \in {\mathbb {N}}\cap [2, \infty )\) with \(x_{n-1} \le \alpha ^{n-1} x_0 + \beta (1 + \alpha + \ldots + \alpha ^{n-2})\) it holds that

$$\begin{aligned} \begin{aligned} x_{n}&\le \alpha x_{n-1} + \beta \le \alpha ^{n} x_0 + \alpha \beta (1 + \alpha + \ldots + \alpha ^{n-2}) + \beta \\&= \alpha ^{n} x_0 + \beta (1 + \alpha + \ldots + \alpha ^{n-1}) \le \alpha ^{n} x_0 + \beta e^{\alpha }. \end{aligned} \end{aligned}$$
(2.3)

Induction thus establishes (2.1). This completes the proof of Lemma 2.1. \(\square \)
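
As a quick numerical sanity check (an editorial illustration, not part of the proof), the first inequality in (2.1) is attained with equality when the hypothesis holds with equality, i.e. when \(x_n = \alpha x_{n-1} + \beta \) for all \(n \in {\mathbb {N}}\):

```python
# Run the extremal recursion x_n = alpha * x_{n-1} + beta and compare it
# with the closed-form bound alpha^n x_0 + beta (1 + alpha + ... + alpha^{n-1}).
alpha, beta, x0 = 1.3, 0.7, 2.0
x = x0
for n in range(1, 21):
    x = alpha * x + beta
    bound = alpha ** n * x0 + beta * sum(alpha ** k for k in range(n))
    assert x <= bound * (1.0 + 1e-12)  # equality up to rounding
print("first inequality in (2.1) verified for n <= 20")
```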

Lemma 2.2

Let \(N \in {\mathbb {N}}\), \(p \in [1, \infty )\), \(\alpha , \beta , \gamma \in [0, \infty )\) and let \(X_n :\Omega \rightarrow {\mathbb {R}}\), \(n \in \{0, 1, \ldots , N\}\), and \(Z_n :\Omega \rightarrow {\mathbb {R}}\), \(n \in \{0, 1, \ldots , N-1\}\), be random variables which satisfy for all \(n \in \{1, 2, \ldots , N\}\) that

$$\begin{aligned} |X_n | \le \alpha |X_{n-1}| + \beta \big [\gamma + |Z_{n-1}| \big ]. \end{aligned}$$
(2.4)

Then it holds that

$$\begin{aligned} \begin{aligned} \left( {\mathbb {E}}\! \left[ |X_N |^p \right] \right) ^{\nicefrac {1}{p}} \le \alpha ^N \! \left( {\mathbb {E}}\! \left[ |X_0 |^p \right] \right) ^{\nicefrac {1}{p}} + e^{\alpha } \beta \! \left[ \gamma + \sup \nolimits _{i \in \{0, 1, \ldots , N -1 \}} \left( {\mathbb {E}}\! \left[ |Z_{i} |^p \right] \right) ^{\nicefrac {1}{p}}\right] . \end{aligned} \end{aligned}$$
(2.5)

Proof of Lemma 2.2

First, note that (2.4) implies for all \(n \in \{1, 2, \ldots , N\}\) that

$$\begin{aligned} \begin{aligned} \left( {\mathbb {E}}\! \left[ |X_n |^p \right] \right) ^{\nicefrac {1}{p}}&\le \alpha \! \left( {\mathbb {E}}\! \left[ |X_{n-1} |^p \right] \right) ^{\nicefrac {1}{p}} +\beta \! \left[ \gamma + \left( {\mathbb {E}}\! \left[ |Z_{n-1} |^p \right] \right) ^{\nicefrac {1}{p}} \right] \\&\le \alpha \! \left( {\mathbb {E}}\! \left[ |X_{n-1} |^p \right] \right) ^{\nicefrac {1}{p}} +\beta \! \left[ \gamma + \sup \nolimits _{i \in \{0, 1, \ldots , N -1 \}} \left( {\mathbb {E}}\! \left[ |Z_{i} |^p \right] \right) ^{\nicefrac {1}{p}} \right] . \end{aligned} \end{aligned}$$
(2.6)

Lemma 2.1 (with \(\alpha = \alpha \), \(\beta = \beta \, [ \gamma + \sup \nolimits _{i \in \{0, 1, \ldots , N -1 \}} ( {\mathbb {E}}[ |Z_{i} |^p ] )^{\nicefrac {1}{p}} ]\) in the notation of Lemma 2.1) hence establishes for all \(n \in \{1, 2, \ldots , N\}\) that

$$\begin{aligned} \begin{aligned} \left( {\mathbb {E}}\! \left[ |X_n |^p \right] \right) ^{\nicefrac {1}{p}} \le \alpha ^n \! \left( {\mathbb {E}}\! \left[ |X_0 |^p \right] \right) ^{\nicefrac {1}{p}} + e^{\alpha } \beta \! \left[ \gamma + \sup \nolimits _{i \in \{0, 1, \ldots , N -1 \}} \left( {\mathbb {E}}\! \left[ |Z_{i} |^p \right] \right) ^{\nicefrac {1}{p}}\right] . \end{aligned} \end{aligned}$$
(2.7)

The proof of Lemma 2.2 is thus completed. \(\square \)

2.2 A DNN approximation result for Monte Carlo algorithms

Theorem 2.3

Let \((\Omega , {\mathcal {F}}, {\mathbb {P}})\) be a probability space, let \( {\mathfrak {n}}_0 \in (0, \infty )\), \({\mathfrak {n}}_1, {\mathfrak {n}}_2, {\mathfrak {e}}, {\mathfrak {d}}_0, {\mathfrak {d}}_1, \ldots , {\mathfrak {d}}_6 \in [0, \infty )\), \({\mathfrak {C}}, p, \theta \in [1, \infty )\), \((M_{N})_{N \in {\mathbb {N}}} \subseteq {\mathbb {N}}\), let \(Z^{N, d, m}_n :\Omega \rightarrow {\mathbb {R}}^{d} \), \(n \in \{0, 1, \ldots , N-1\}\), \(m \in \{1, 2, \ldots , M_{N}\}\), \(d, N \in {\mathbb {N}}\), be random variables, let \(f_{N, d} \in C( {\mathbb {R}}^{d} \times {\mathbb {R}}^{d}, {\mathbb {R}}^{d})\), \(d, N \in {\mathbb {N}}\), and \(Y^{N, d, x}_n = (Y^{N, d, m, x}_n)_{m \in \{1, 2, \ldots , M_{N}\}} :\Omega \rightarrow {\mathbb {R}}^{M_N d}\), \(n \in \{0, 1, \ldots , N\}\), \(x \in {\mathbb {R}}^d\), \(d, N \in {\mathbb {N}}\), satisfy for all \(N, d \in {\mathbb {N}}\), \(m \in \{1, 2, \ldots , M_{N}\}\), \(x \in {\mathbb {R}}^d\), \(n \in \{1, 2, \ldots , N\}\), \(\omega \in \Omega \) that \(Y^{N, d, m, x}_{0}(\omega ) = x\) and

$$\begin{aligned} \begin{aligned} Y^{N, d, m, x}_{n}(\omega )&= f_{N, d} \big (Z^{N, d, m}_{n-1}(\omega ), Y^{N, d, m, x}_{n-1}(\omega )\big ), \end{aligned} \end{aligned}$$
(2.8)

let \(\left\| \cdot \right\| \!:(\cup _{d \in {\mathbb {N}}} {\mathbb {R}}^d) \rightarrow [0, \infty )\) satisfy for all \(d \in {\mathbb {N}}\), \(x = (x_1, x_2, \ldots , x_d) \in {\mathbb {R}}^d\) that \(\Vert x\Vert = ( \textstyle \sum _{i=1}^d |x_i|^2)^{\nicefrac {1}{2}}\), for every \(d \in {\mathbb {N}}\) let \( \nu _d :{\mathcal {B}}({\mathbb {R}}^d) \rightarrow [0,1]\) be a probability measure on \({\mathbb {R}}^d\), let \(g_{N, d} \in C( {\mathbb {R}}^{Nd}, {\mathbb {R}})\), \( d, N \in {\mathbb {N}}\), and \(u_d \in C({\mathbb {R}}^d, {\mathbb {R}})\), \(d \in {\mathbb {N}}\), satisfy for all \( N, d \in {\mathbb {N}}\), \(m \in \{1, 2, \ldots , M_N\}\), \(n \in \{0, 1, \ldots , N-1\}\) that

$$\begin{aligned}&\left( {\mathbb {E}}\! \left[ \int _{{\mathbb {R}}^d} \big |u_d(x) - g_{M_N,d} (Y^{N, d, x}_N) \big |^p \, \nu _d (dx) \right] \right) ^{\!\nicefrac {1}{p}} \le {\mathfrak {C}}d^{{\mathfrak {d}}_0} N^{-{\mathfrak {n}}_0}, \end{aligned}$$
(2.9)
$$\begin{aligned}&\left( {\mathbb {E}}\! \left[ \Vert Z^{N, d, m}_{n} \Vert ^{2 p \theta } \right] \right) ^{\nicefrac {1}{(2 p \theta )}} \le {\mathfrak {C}}d^{{\mathfrak {d}}_1}, \quad \text {and} \quad \left[ \int _{{\mathbb {R}}^d} \Vert x\Vert ^{2p \theta } \, \nu _d (dx) \right] ^{\nicefrac {1}{(2 p \theta )}} \le {\mathfrak {C}}d^{{\mathfrak {d}}_1 + {\mathfrak {d}}_2},\nonumber \\ \end{aligned}$$
(2.10)

let \({\mathbf {N}}\) be a set, let \( {\mathcal {P}}:{\mathbf {N}}\rightarrow {\mathbb {N}}\), \({\mathcal {D}} :{\mathbf {N}}\rightarrow ( \cup _{L \in {\mathbb {N}}} {\mathbb {N}}^{L})\), and \( {\mathcal {R}}:{\mathbf {N}}\rightarrow (\cup _{ k, l \in {\mathbb {N}}} C( {\mathbb {R}}^k, {\mathbb {R}}^l )) \) be functions, let \({\mathfrak {N}}_{d, \varepsilon } \subseteq {\mathbf {N}}\), \( \varepsilon \in (0, 1]\), \(d \in {\mathbb {N}}\), let \(({\mathbf {f}}^{N, d}_{\varepsilon , z})_{(N, d, \varepsilon , z) \in {\mathbb {N}}^2 \times (0, 1] \times {\mathbb {R}}^d } \subseteq {\mathbf {N}}\), \(({\mathbf {g}}^{N, d}_{\varepsilon })_{(N, d, \varepsilon ) \in {\mathbb {N}}^2 \times (0, 1] } \subseteq {\mathbf {N}}\), \(({\mathfrak {I}}_{d})_{d \in {\mathbb {N}}} \subseteq {\mathbf {N}}\), assume for all \(N, d \in {\mathbb {N}}\), \(\varepsilon \in (0, 1]\), \(x, y, z \in {\mathbb {R}}^d\) that \( {\mathfrak {N}}_{d, \varepsilon } \subseteq \{\Phi \in {\mathbf {N}}:{\mathcal {R}}( \Phi ) \in C({\mathbb {R}}^d, {\mathbb {R}}^d) \}\), \({\mathfrak {I}}_d \in {\mathfrak {N}}_{d, \varepsilon }\), \(({\mathcal {R}}({\mathfrak {I}}_d))(x) = x\), \({\mathcal {P}}({\mathfrak {I}}_d) \le {\mathfrak {C}}d^{{\mathfrak {d}}_3} \), \( {\mathcal {R}}( {\mathbf {f}}^{N, d}_{\varepsilon , z}) \in C({\mathbb {R}}^d, {\mathbb {R}}^d)\), \(({\mathbb {R}}^d \ni {\mathfrak {z}} \mapsto ( {\mathcal {R}}({\mathbf {f}}^{N, d}_{\varepsilon , {\mathfrak {z}}}))(x) \in {\mathbb {R}}^d)\) is \({\mathcal {B}}({\mathbb {R}}^d) /{\mathcal {B}}({\mathbb {R}}^d)\)-measurable, and

$$\begin{aligned}&\Vert f_{N, d}(z, x) - ( {\mathcal {R}}({\mathbf {f}}^{N, d}_{\varepsilon , z}))(x) \Vert \le \varepsilon {\mathfrak {C}}d^{{\mathfrak {d}}_4} (d^{\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)}+ \Vert x\Vert ^{\theta }), \end{aligned}$$
(2.11)
$$\begin{aligned}&\Vert ( {\mathcal {R}}({\mathbf {f}}^{N, d}_{\varepsilon , z}))(x) \Vert \le \big (1 + \tfrac{{\mathfrak {C}}}{N}\big ) \Vert x\Vert + {\mathfrak {C}}d^{{\mathfrak {d}}_2}( d^{{\mathfrak {d}}_1} + \Vert z\Vert ), \end{aligned}$$
(2.12)
$$\begin{aligned}&\Vert f_{N, d}(z, x) - f_{N, d}(z, y) \Vert \le {\mathfrak {C}}\Vert x - y \Vert , \end{aligned}$$
(2.13)

assume for every \(N, d \in {\mathbb {N}}\), \(\varepsilon \in (0, 1]\), \(\Phi \in {\mathfrak {N}}_{d, \varepsilon }\) that there exist \((\phi _z)_{z \in {\mathbb {R}}^d} \subseteq {\mathfrak {N}}_{d, \varepsilon }\) such that for all \(x, z, {\mathfrak {z}} \in {\mathbb {R}}^d\) it holds that \( ({\mathcal {R}}(\phi _z)) (x) = ( {\mathcal {R}}({\mathbf {f}}^{N, d}_{\varepsilon , z}))(({\mathcal {R}}(\Phi ))(x)) \), \({\mathcal {P}}(\phi _z) \le {\mathcal {P}}(\Phi ) + {\mathfrak {C}}N^{{\mathfrak {n}}_1} d^{{\mathfrak {d}}_3} \varepsilon ^{-{\mathfrak {e}}}\), and \( {\mathcal {D}} (\phi _z) = {\mathcal {D}} (\phi _{{\mathfrak {z}}})\), assume for all \(N, d \in {\mathbb {N}}\), \(\varepsilon \in (0, 1]\), \(x = (x_i)_{i \in \{1, 2, \ldots , N\}} \in {\mathbb {R}}^{Nd}\), \(y = (y_i)_{i \in \{1, 2, \ldots , N\}} \in {\mathbb {R}}^{Nd}\) that \( {\mathcal {R}}({\mathbf {g}}^{N, d}_{\varepsilon }) \in C({\mathbb {R}}^{Nd}, {\mathbb {R}})\) and

$$\begin{aligned}&|g_{N,d}(x) - ( {\mathcal {R}}({\mathbf {g}}^{N, d}_{\varepsilon }) )(x) | \le \varepsilon {\mathfrak {C}}d^{{\mathfrak {d}}_5} \left[ d^{\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)}+ \tfrac{1}{N} \textstyle \sum \limits _{i=1}^N \displaystyle \Vert x_i \Vert ^{\theta } \right] , \end{aligned}$$
(2.14)
$$\begin{aligned}&|( {\mathcal {R}}({\mathbf {g}}^{N, d}_{\varepsilon }) )(x) - ( {\mathcal {R}}({\mathbf {g}}^{N, d}_{\varepsilon }) )(y)| \nonumber \\&\quad \le \frac{{\mathfrak {C}}d^{{\mathfrak {d}}_6}}{N} \left[ \textstyle \sum \limits _{i=1}^N \displaystyle (d^{\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)} + \Vert x_i\Vert ^{\theta } + \Vert y_i \Vert ^{\theta })\Vert x_i- y_i\Vert \right] , \end{aligned}$$
(2.15)

and assume for every \(N, d \in {\mathbb {N}}\), \(\varepsilon \in (0, 1]\), \(\Phi _1, \Phi _2, \ldots , \Phi _{M_N} \in {\mathfrak {N}}_{d, \varepsilon }\) with \({\mathcal {D}}(\Phi _1) = {\mathcal {D}}(\Phi _2) = \cdots = {\mathcal {D}}(\Phi _{M_N})\) that there exists \(\varphi \in {\mathbf {N}}\) such that for all \(x \in {\mathbb {R}}^d\) it holds that \( {\mathcal {R}}(\varphi ) \in C({\mathbb {R}}^d, {\mathbb {R}})\), \(( {\mathcal {R}}(\varphi ))(x) = ( {\mathcal {R}}({\mathbf {g}}^{ M_N, d }_{ \varepsilon }) )( ({\mathcal {R}}(\Phi _1))(x), ({\mathcal {R}}(\Phi _2))(x),\) \(\ldots , ({\mathcal {R}}(\Phi _{M_N}))(x))\), and \({\mathcal {P}}(\varphi ) \le {\mathfrak {C}}N^{{\mathfrak {n}}_2} ( N^{{\mathfrak {n}}_1 +1} d^{{\mathfrak {d}}_3} \varepsilon ^{-{\mathfrak {e}}} + {\mathcal {P}}(\Phi _1))\). Then there exist \( c \in {\mathbb {R}}\) and \( ( \Psi _{ d, \varepsilon } )_{ (d , \varepsilon ) \in {\mathbb {N}}\times (0,1] } \subseteq {\mathbf {N}}\) such that for all \( d \in {\mathbb {N}}\), \( \varepsilon \in (0,1] \) it holds that \( {\mathcal {R}}( \Psi _{ d, \varepsilon } ) \in C( {\mathbb {R}}^{ d }, {\mathbb {R}}) \), \([ \int _{ {\mathbb {R}}^d } | u_d(x) - ( {\mathcal {R}}(\Psi _{ d, \varepsilon }) )( x ) |^p \, \nu _d(dx) ]^{ \nicefrac { 1 }{ p } } \le \varepsilon \), and

$$\begin{aligned} {\mathcal {P}}( \Psi _{ d, \varepsilon } ) \le c d^{\frac{{\mathfrak {d}}_0( {\mathfrak {n}}_1+{\mathfrak {n}}_2 +1)}{{\mathfrak {n}}_0} + {\mathfrak {d}}_3 + {\mathfrak {e}}\max \{{\mathfrak {d}}_5 + \theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2), {\mathfrak {d}}_4 + {\mathfrak {d}}_6 + 2\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)\}} \varepsilon ^{-\frac{( {\mathfrak {n}}_1+ {\mathfrak {n}}_2 +1)}{{\mathfrak {n}}_0} -{\mathfrak {e}}}. \end{aligned}$$
(2.16)

Theorem 2.3, roughly speaking, shows that if a function can be approximated by means of some suitable discrete (Monte Carlo) approximation scheme without the curse of dimensionality (cf. (2.9) above) and if there exist DNNs which satisfy certain regularity properties and which approximate this discrete approximation scheme without the curse of dimensionality, then the function itself can also be approximated with DNNs without the curse of dimensionality.

The proof of Theorem 2.3 is given below. In the following we provide some comments on the mathematical objects appearing in Theorem 2.3 above.

The triple \((\Omega , {\mathcal {F}}, {\mathbb {P}})\) denotes the probability space on which we consider the discrete (Monte Carlo) approximation scheme. For every \(N, d \in {\mathbb {N}}\) the random variables \(Z^{N, d, m}_n :\Omega \rightarrow {\mathbb {R}}^{d} \), \(n \in \{0, 1, \ldots , N-1\}\), \(m \in \{1, 2, \ldots , M_{N}\}\), and the Lipschitz continuous function \(f_{N, d} \in C( {\mathbb {R}}^{d} \times {\mathbb {R}}^{d}, {\mathbb {R}}^{d})\) (cf. (2.13) above) are employed in the iterative construction of the discrete approximations \(Y^{N, d, x}_n = (Y^{N, d, m, x}_n)_{m \in \{1, 2, \ldots , M_{N}\}} :\Omega \rightarrow {\mathbb {R}}^{M_N d}\), \(n \in \{0, 1, \ldots , N\}\), \(x \in {\mathbb {R}}^d\) (cf. (2.8) above). We assume that these approximations composed with the functions \(g_{N, d} \in C( {\mathbb {R}}^{Nd}, {\mathbb {R}})\), \( d, N \in {\mathbb {N}}\), approximate the functions \(u_d \in C({\mathbb {R}}^d, {\mathbb {R}})\), \(d \in {\mathbb {N}}\), without the curse of dimensionality in the strong \(L^p\)-sense with respect to the probability measures \( \nu _d :{\mathcal {B}}({\mathbb {R}}^d) \rightarrow [0,1]\), \(d \in {\mathbb {N}}\) (cf. (2.9) above). We assume suitable moment bounds for the random variables \(Z^{N, d, m}_n :\Omega \rightarrow {\mathbb {R}}^{d} \), \(n \in \{0, 1, \ldots , N-1\}\), \(m \in \{1, 2, \ldots , M_{N}\}\), \(d, N \in {\mathbb {N}}\), and the probability measures \( \nu _d :{\mathcal {B}}({\mathbb {R}}^d) \rightarrow [0,1]\), \(d \in {\mathbb {N}}\) (cf. (2.10) above).
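
The iterative construction (2.8) together with the composition in (2.9) can be summarized in code. The instance below is purely illustrative (an Euler step for a hypothetical Ornstein-Uhlenbeck dynamic as \(f\) and an empirical average of a terminal payoff as \(g\); neither choice is prescribed by Theorem 2.3):

```python
import numpy as np

def scheme(f, g, Z, x):
    """Iterate Y_n^m = f(Z_{n-1}^m, Y_{n-1}^m) as in (2.8) for all M paths
    simultaneously and apply g to the terminal values as in (2.9).
    Z has shape (N, M, d); x has shape (d,)."""
    N, M, d = Z.shape
    Y = np.tile(x, (M, 1))
    for n in range(N):
        Y = f(Z[n], Y)
    return g(Y)

# Illustrative instance: Euler step for dX = -X dt + sqrt(2) dW on [0, T]
# and g = empirical mean of the terminal payoff ||y||^2.  For X_0 = (1, ..., 1)
# one has E[||X_T||^2] = d: per coordinate, the squared mean e^{-2T} plus
# the variance (1 - e^{-2T}) equals 1.
T, N, M, d = 1.0, 50, 20_000, 3
rng = np.random.default_rng(2)
Z = rng.standard_normal((N, M, d)) * np.sqrt(T / N)
f = lambda z, y: y + (T / N) * (-y) + np.sqrt(2.0) * z
g = lambda Y: float(np.mean(np.sum(Y ** 2, axis=1)))
u_approx = scheme(f, g, Z, np.ones(d))
```

The abstract hypotheses (2.11)-(2.15) then ask that both the one-step map \(f\) and the terminal functional \(g\) admit DNN approximations of this kind without the curse of dimensionality.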

We think of \({\mathbf {N}}\) in Theorem 2.3 above as a set of DNNs (see also Definition 3.1 below) and for every \(\Phi \in {\mathbf {N}}\) we think of \({\mathcal {P}}(\Phi ) \in {\mathbb {N}}\) as the number of parameters which are used to describe \(\Phi \) (see also Definition 3.1 below). For every \(\Phi \in {\mathbf {N}}\) we think of \({\mathcal {D}}(\Phi ) \in (\cup _{L\in {\mathbb {N}}} {\mathbb {N}}^{L})\) as the vector consisting of the dimensions of all layers of \(\Phi \) and we think of \({\mathcal {R}}(\Phi )\) as the realization function associated to \(\Phi \) (see also Definition 3.3 below).

For every \(d \in {\mathbb {N}}\), \( \varepsilon \in (0, 1]\) we think of \({\mathfrak {N}}_{d, \varepsilon } \subseteq {\mathbf {N}}\) as a set of DNNs with suitable regularity properties. For every \(N, d \in {\mathbb {N}}\), \(z \in {\mathbb {R}}^d\) we think of \(({\mathbf {f}}^{N, d}_{\varepsilon , z})_{\varepsilon \in (0, 1] } \subseteq {\mathbf {N}}\) as neural networks which approximate the function \({\mathbb {R}}^d \ni x \mapsto f_{N, d} (z, x) \in {\mathbb {R}}^{d}\) without the curse of dimensionality (cf. (2.11) above) and which satisfy a suitable linear growth condition (cf. (2.12) above). For every \(N, d \in {\mathbb {N}}\) we think of \(({\mathbf {g}}^{N, d}_{\varepsilon })_{\varepsilon \in (0, 1] } \subseteq {\mathbf {N}}\) as neural networks which approximate the function \(g_{N, d} :{\mathbb {R}}^{Nd}\rightarrow {\mathbb {R}}\) without the curse of dimensionality (cf. (2.14) above) and which satisfy a suitable Lipschitz-type condition (cf. (2.15) above). For every \(d \in {\mathbb {N}}\) we think of \({\mathfrak {I}}_d \in {\mathbf {N}}\) as a neural network representing the identity function on \({\mathbb {R}}^d\) in the sense that for all \(x \in {\mathbb {R}}^d\) it holds that \(({\mathcal {R}}({\mathfrak {I}}_d))(x) = x\) (see also Definition 3.15 below).
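
For intuition on the identity networks \({\mathfrak {I}}_d\), recall that the rectifier satisfies \(x = \max \{x, 0\} - \max \{-x, 0\}\), so a two-layer ReLU network of width \(2d\) realizes the identity on \({\mathbb {R}}^d\) with a parameter count that is polynomial in \(d\). The following sketch is an editorial illustration of this standard construction (the paper's precise construction is the subject of Definition 3.15 below):

```python
import numpy as np

def identity_relu_net(d):
    """Two-layer ReLU network whose realization is the identity on R^d,
    based on x = max(x, 0) - max(-x, 0)."""
    W1 = np.vstack([np.eye(d), -np.eye(d)])   # R^d -> R^{2d}
    W2 = np.hstack([np.eye(d), -np.eye(d)])   # R^{2d} -> R^d
    return [(W1, np.zeros(2 * d)), (W2, np.zeros(d))]

def realize(phi, x):
    for k, (W, B) in enumerate(phi):
        x = W @ x + B
        if k < len(phi) - 1:
            x = np.maximum(x, 0.0)            # rectifier on hidden layer
    return x

d = 4
phi = identity_relu_net(d)
x = np.array([1.5, -2.0, 0.0, 3.0])
assert np.allclose(realize(phi, x), x)
# Parameter count 2d(d + 1) + d(2d + 1) = O(d^2), consistent with a
# polynomial bound of the form P(I_d) <= C d^{d_3}.
print(sum(W.size + B.size for W, B in phi))  # 76 for d = 4
```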

Proof of Theorem 2.3

Throughout this proof let \(\gamma = 46 e^{{\mathfrak {C}}} {\mathfrak {C}}^2 ( 4 e^{{\mathfrak {C}}+1} {\mathfrak {C}}^3 )^{ 2\theta }\), let \(\delta = \max \{{\mathfrak {d}}_5 + \theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2), {\mathfrak {d}}_4 + {\mathfrak {d}}_6 + 2\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)\}\), let \(X^{N, d, x, \varepsilon }_n = (X^{N, d, m, x, \varepsilon }_n)_{m \in \{1, 2, \ldots , M_{N}\}} :\Omega \rightarrow {\mathbb {R}}^{M_N d}\), \(n \in \{0, 1, \ldots , N\}\), \(\varepsilon \in (0, 1]\), \(x \in {\mathbb {R}}^d\), \(d, N \in {\mathbb {N}}\), be the random variables which satisfy for all \(N, d \in {\mathbb {N}}\), \(m \in \{1, 2, \ldots , M_{N}\}\), \(x \in {\mathbb {R}}^d\), \(\varepsilon \in (0, 1]\), \(n \in \{1, 2, \ldots , N\}\), \(\omega \in \Omega \) that \(X^{N, d, m, x, \varepsilon }_{0}(\omega ) = x\) and

$$\begin{aligned} \begin{aligned} X^{N, d, m, x, \varepsilon }_{n}(\omega )&= \Big ( {\mathcal {R}}\Big ({\mathbf {f}}^{N, d}_{\varepsilon , Z^{N, d, m}_{n-1}(\omega )}\Big ) \Big ) \big ( X^{N, d, m, x, \varepsilon }_{n-1}(\omega )\big ), \end{aligned} \end{aligned}$$
(2.17)

and let \(({\mathcal {N}}_{d, \varepsilon })_{(d, \varepsilon ) \in {\mathbb {N}}\times (0, 1]} \subseteq {\mathbb {N}}\) and \(({\mathcal {E}}_{d, \varepsilon })_{(d, \varepsilon ) \in {\mathbb {N}}\times (0, 1]} \subseteq (0, 1]\) satisfy for all \(\varepsilon \in (0, 1]\), \(d \in {\mathbb {N}}\) that

$$\begin{aligned} {\mathcal {N}}_{d, \varepsilon } = \min \! \left( {\mathbb {N}}\cap \big [\big (\tfrac{2{\mathfrak {C}}d^{{\mathfrak {d}}_0}}{\varepsilon }\big )^{\nicefrac {1}{{\mathfrak {n}}_0}}, \infty \big ) \right) \qquad \text {and}\qquad {\mathcal {E}}_{d, \varepsilon } = \tfrac{\varepsilon }{\gamma d^{ \delta }} . \end{aligned}$$
(2.18)
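For illustration, the selection rule (2.18) can be written out directly. The following sketch uses hypothetical values for the constants \({\mathfrak {C}}\), \({\mathfrak {d}}_0\), \({\mathfrak {n}}_0\), \(\gamma \), and \(\delta \) (which the proof treats as given); it picks \({\mathcal {N}}_{d, \varepsilon }\) as the smallest positive integer at least \((2{\mathfrak {C}}d^{{\mathfrak {d}}_0}/\varepsilon )^{\nicefrac {1}{{\mathfrak {n}}_0}}\) and sets \({\mathcal {E}}_{d, \varepsilon } = \varepsilon /(\gamma d^{\delta })\).

```python
import math

def select_discretization(d, eps, C=2.0, d0=1.0, n0=0.5, gamma=100.0, delta=3.0):
    """Hypothetical constants C (= frak C), d0, n0, gamma, delta stand in for
    the constants of Theorem 2.3.  Returns (N_{d,eps}, E_{d,eps}) as in (2.18)."""
    # N_{d,eps} = min( N cap [ (2 C d^{d0} / eps)^{1/n0}, infinity ) ),
    # i.e. the smallest positive integer >= (2 C d^{d0} / eps)^{1/n0}.
    threshold = (2.0 * C * d**d0 / eps) ** (1.0 / n0)
    N = max(1, math.ceil(threshold))
    # E_{d,eps} = eps / (gamma * d^delta): the accuracy fed to the subnetworks.
    E = eps / (gamma * d**delta)
    return N, E
```

Since \({\mathcal {N}}_{d, \varepsilon }\) grows only polynomially in \(d\) and \(\varepsilon ^{-1}\), the final parameter bound (2.46) is polynomial in these quantities as well.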

Note that for all \(N, d \in {\mathbb {N}}\), \(\varepsilon \in (0, 1]\), \(n \in \{0, 1, 2, \ldots , N\}\) it holds that

$$\begin{aligned} \big ({\mathbb {R}}^d \ni x \mapsto X^{N, d, x, \varepsilon }_n \in {\mathbb {R}}^{M_N d}\big ) \in C({\mathbb {R}}^d, {\mathbb {R}}^{M_N d}). \end{aligned}$$
(2.19)

This implies that for all \(N, d \in {\mathbb {N}}\), \(\varepsilon \in (0, 1]\) it holds that

$$\begin{aligned} \begin{aligned}&\left( {\mathbb {E}}\! \left[ \int _{{\mathbb {R}}^d} \big |u_d(x) - ( {\mathcal {R}}({\mathbf {g}}^{M_N, d}_{\varepsilon }) )(X^{N, d, x, \varepsilon }_N) \big |^p \, \nu _d (dx) \right] \right) ^{\!\nicefrac {1}{p}} \\&\quad \le \left( {\mathbb {E}}\! \left[ \int _{{\mathbb {R}}^d} \big |u_d(x) - g_{M_N,d} (Y^{N, d, x}_N) \big |^p \, \nu _d (dx) \right] \right) ^{\!\nicefrac {1}{p}} \\&\qquad + \left( {\mathbb {E}}\! \left[ \int _{{\mathbb {R}}^d} \big | g_{M_N,d} (Y^{N, d, x}_N) - ({\mathcal {R}}({\mathbf {g}}^{M_N, d}_{\varepsilon } ))(Y^{N, d, x}_N) \big |^p \, \nu _d (dx) \right] \right) ^{\!\nicefrac {1}{p}}\\&\qquad + \left( {\mathbb {E}}\! \left[ \int _{{\mathbb {R}}^d} \big | ({\mathcal {R}}({\mathbf {g}}^{M_N, d}_{\varepsilon }) )(Y^{N, d, x}_N) - ( {\mathcal {R}}({\mathbf {g}}^{M_N, d}_{\varepsilon }) )(X^{N, d, x, \varepsilon }_N) \big |^p \, \nu _d (dx) \right] \right) ^{\!\nicefrac {1}{p}}. \end{aligned} \end{aligned}$$
(2.20)

Next observe that (2.14) ensures for all \(N, d \in {\mathbb {N}}\), \(\varepsilon \in (0, 1]\) that

$$\begin{aligned} \begin{aligned}&\left( {\mathbb {E}}\! \left[ \int _{{\mathbb {R}}^d} \big | g_{M_N,d} (Y^{N, d, x}_N) - ({\mathcal {R}}({\mathbf {g}}^{M_N, d}_{\varepsilon }) )(Y^{N, d, x}_N) \big |^p \, \nu _d (dx) \right] \right) ^{\!\nicefrac {1}{p}} \\&\quad \le \varepsilon {\mathfrak {C}}d^{{\mathfrak {d}}_5} \left( {\mathbb {E}}\! \left[ \int _{{\mathbb {R}}^d} \Big | d^{\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)} + \tfrac{1}{M_N} \textstyle \sum \limits _{m=1}^{M_N} \displaystyle \Vert Y^{N, d, m, x}_N \Vert ^{\theta } \Big |^p \, \nu _d (dx) \right] \right) ^{\!\nicefrac {1}{p}}\\&\quad \le \varepsilon {\mathfrak {C}}d^{{\mathfrak {d}}_5} \left[ d^{\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)}+ \tfrac{1}{M_N} \textstyle \sum \limits _{m=1}^{M_N} \displaystyle \left( {\mathbb {E}}\! \left[ \int _{{\mathbb {R}}^d} \Vert Y^{N, d, m, x}_N \Vert ^{p\theta } \, \nu _d (dx) \right] \right) ^{\!\nicefrac {1}{p}} \right] . \end{aligned} \end{aligned}$$
(2.21)

In addition, note that (2.11) and (2.12) assure that for all \(N, d \in {\mathbb {N}}\), \(\varepsilon \in (0, 1]\), \(x, z \in {\mathbb {R}}^d\) it holds that

$$\begin{aligned} \begin{aligned} \Vert f_{N, d}(z, x) \Vert&\le \Vert f_{N, d}(z, x) - ( {\mathcal {R}}({\mathbf {f}}^{N, d}_{\varepsilon , z}))(x) \Vert + \Vert ( {\mathcal {R}}({\mathbf {f}}^{N, d}_{\varepsilon , z}))(x) \Vert \\&\le \varepsilon {\mathfrak {C}}d^{{\mathfrak {d}}_4} (d^{\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)}+ \Vert x\Vert ^{\theta }) + \big (1 + \tfrac{{\mathfrak {C}}}{N}\big ) \Vert x\Vert + {\mathfrak {C}}d^{{\mathfrak {d}}_2}( d^{{\mathfrak {d}}_1} + \Vert z\Vert ). \end{aligned} \end{aligned}$$
(2.22)

Since (2.22) holds for every \(\varepsilon \in (0, 1]\), letting \(\varepsilon \searrow 0\) proves that for all \(N, d \in {\mathbb {N}}\), \(x, z \in {\mathbb {R}}^d\) it holds that

$$\begin{aligned} \Vert f_{N, d}(z, x) \Vert \le \big (1 + \tfrac{{\mathfrak {C}}}{N}\big ) \Vert x\Vert + {\mathfrak {C}}d^{{\mathfrak {d}}_2} ( d^{{\mathfrak {d}}_1} + \Vert z\Vert ). \end{aligned}$$
(2.23)

Hence, we obtain that for all \(N, d \in {\mathbb {N}}\), \(m \in \{1, 2, \ldots , M_N\}\), \(x \in {\mathbb {R}}^d\), \( n \in \{1, 2, \ldots , N\}\) it holds that

$$\begin{aligned} \begin{aligned} \Vert Y^{N, d, m, x}_n \Vert&= \big \Vert f_{N, d} \big (Z^{N, d, m}_{n-1}, Y^{N, d, m, x}_{n-1}\big ) \big \Vert \\&\le \big (1 + \tfrac{{\mathfrak {C}}}{N}\big ) \Vert Y^{N, d, m, x}_{n-1} \Vert + {\mathfrak {C}}d^{{\mathfrak {d}}_2} \big [ d^{{\mathfrak {d}}_1} + \Vert Z^{N, d, m}_{n-1}\Vert \big ]. \end{aligned} \end{aligned}$$
(2.24)

Moreover, note that (2.12) assures that for all \(N, d \in {\mathbb {N}}\), \(m \in \{1, 2, \ldots , M_N\}\), \(x \in {\mathbb {R}}^d\), \(\varepsilon \in (0, 1]\), \( n \in \{1, 2, \ldots , N\}\) it holds that

$$\begin{aligned} \begin{aligned} \Vert X^{N, d, m, x, \varepsilon }_{n} \Vert&= \Big \Vert \Big ( {\mathcal {R}}\Big ({\mathbf {f}}^{N, d}_{\varepsilon , Z^{N, d, m}_{n-1}} \Big ) \Big ) \big ( X^{N, d, m, x, \varepsilon }_{n-1}\big ) \Big \Vert \\&\le \big (1 + \tfrac{{\mathfrak {C}}}{N}\big )\Vert X^{N, d, m, x, \varepsilon }_{n-1} \Vert + {\mathfrak {C}}d^{{\mathfrak {d}}_2} \big [ d^{{\mathfrak {d}}_1} + \Vert Z^{N, d, m}_{n-1}\Vert \big ]. \end{aligned} \end{aligned}$$
(2.25)

Lemma 2.2 (with \(N = n\), \(p = 2p\theta \), \(\alpha = 1 + \frac{{\mathfrak {C}}}{N}\), \(\beta = {\mathfrak {C}}d^{{\mathfrak {d}}_2}\), \(\gamma = d^{{\mathfrak {d}}_1}\), \(Z_i = \Vert Z^{N, d, m}_{i} \Vert \) for \(N, d \in {\mathbb {N}}\), \( n \in \{1, 2, \ldots , N\}\), \(m \in \{1, 2, \ldots , M_N\}\), \(i \in \{0, 1, \ldots , n-1\}\) in the notation of Lemma 2.2), (2.24), and (2.10) therefore demonstrate that for all \(N, d \in {\mathbb {N}}\), \(m \in \{1, 2, \ldots , M_N\}\), \(x \in {\mathbb {R}}^d\), \(\varepsilon \in (0, 1]\), \( n \in \{1, 2, \ldots , N\}\) it holds that

$$\begin{aligned} \begin{aligned}&\max \left\{ \left( {\mathbb {E}}\big [ \Vert Y^{N, d, m, x}_n \Vert ^{2 p \theta } \big ] \right) ^{\nicefrac {1}{(2 p \theta )}}, \left( {\mathbb {E}}\big [ \Vert X^{N, d, m, x, \varepsilon }_{n} \Vert ^{2 p \theta } \big ] \right) ^{\nicefrac {1}{(2 p \theta )}} \right\} \\&\quad \le \big (1 + \tfrac{{\mathfrak {C}}}{N}\big )^n \Vert x\Vert + e^{(1 + \frac{{\mathfrak {C}}}{N})} {\mathfrak {C}}d^{{\mathfrak {d}}_2} \left[ d^{{\mathfrak {d}}_1}+ \sup \nolimits _{i \in \{0, 1, \ldots , n -1 \}} \left( {\mathbb {E}}\! \left[ \Vert Z^{N, d, m}_{i} \Vert ^{2 p \theta } \right] \right) ^{\!\nicefrac {1}{(2 p \theta )}} \right] \\&\quad \le e^{{\mathfrak {C}}} \Vert x\Vert + e^{{\mathfrak {C}}+1} {\mathfrak {C}}d^{{\mathfrak {d}}_2} \big [ d^{{\mathfrak {d}}_1} + {\mathfrak {C}}d^{{\mathfrak {d}}_1} \big ] \le e^{{\mathfrak {C}}} \Vert x\Vert + 2 e^{{\mathfrak {C}}+1} {\mathfrak {C}}^2 d^{{\mathfrak {d}}_1 + {\mathfrak {d}}_2}\\&\quad \le 2 e^{{\mathfrak {C}}+1} {\mathfrak {C}}^2 \big [\Vert x\Vert + d^{{\mathfrak {d}}_1 + {\mathfrak {d}}_2} \big ]. \end{aligned} \end{aligned}$$
(2.26)

This and the fact that \(\forall \, a, b \in {\mathbb {R}}:|a+b|^{\theta } \le 2^{\theta -1}(|a|^{\theta }+|b|^{\theta })\) prove for all \(N, d \in {\mathbb {N}}\), \(m \in \{1, 2, \ldots , M_N\}\), \(x \in {\mathbb {R}}^d\), \(\varepsilon \in (0, 1]\), \( n \in \{1, 2, \ldots , N\}\) that

$$\begin{aligned} \begin{aligned}&\max \left\{ {\mathbb {E}}\big [ \Vert Y^{N, d, m, x}_n \Vert ^{2 p \theta } \big ], {\mathbb {E}}\big [ \Vert X^{N, d, m, x, \varepsilon }_{n} \Vert ^{2 p \theta } \big ] \right\} \\&\quad \le \left( 2 e^{{\mathfrak {C}}+1} {\mathfrak {C}}^2 \big [\Vert x\Vert + d^{{\mathfrak {d}}_1 + {\mathfrak {d}}_2} \big ] \right) ^{2 p \theta } = \left( 2 e^{{\mathfrak {C}}+1} {\mathfrak {C}}^2 \right) ^{2 p \theta } \big [\Vert x\Vert + d^{{\mathfrak {d}}_1 + {\mathfrak {d}}_2} \big ]^{2 p \theta }\\&\quad \le 2^{2 p (\theta -1) } \left( 2 e^{{\mathfrak {C}}+1} {\mathfrak {C}}^2 \right) ^{2 p \theta }\big [\Vert x\Vert ^{\theta } + d^{\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)} \big ]^{2p}\\&\quad \le ( 4 e^{{\mathfrak {C}}+1} {\mathfrak {C}}^2 )^{2 p \theta } \big [\Vert x\Vert ^{\theta } + d^{\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)} \big ]^{2p}. \end{aligned} \end{aligned}$$
(2.27)

This and (2.10) establish that for all \(N, d \in {\mathbb {N}}\), \(m \in \{1, 2, \ldots , M_N\}\), \(\varepsilon \in (0, 1]\) it holds that

$$\begin{aligned}&\max \left\{ \left( {\mathbb {E}}\! \left[ \int _{{\mathbb {R}}^d} \Vert Y^{N, d, m, x}_N \Vert ^{2p\theta } \, \nu _d (dx) \right] \right) ^{\!\nicefrac {1}{(2p)}}, \left( {\mathbb {E}}\! \left[ \int _{{\mathbb {R}}^d} \Vert X^{N, d, m, x, \varepsilon }_{N} \Vert ^{2p\theta } \, \nu _d (dx) \right] \right) ^{\!\nicefrac {1}{(2p)}} \right\} \nonumber \\&\quad \le ( 4 e^{{\mathfrak {C}}+1} {\mathfrak {C}}^2 )^{ \theta } \left( \int _{{\mathbb {R}}^d} \big [\Vert x\Vert ^{\theta } + d^{\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)} \big ]^{2p}\, \nu _d (dx) \right) ^{\!\nicefrac {1}{(2p)}}\nonumber \\&\quad \le ( 4 e^{{\mathfrak {C}}+1} {\mathfrak {C}}^2 )^{ \theta } \left[ \left( \int _{{\mathbb {R}}^d} \Vert x\Vert ^{ 2p \theta } \, \nu _d (dx) \right) ^{\!\nicefrac {1}{(2p)}} + d^{\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)} \right] \nonumber \\&\quad \le ( 4 e^{{\mathfrak {C}}+1} {\mathfrak {C}}^2 )^{ \theta } \big [ {\mathfrak {C}}^{\theta } d^{\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)} + d^{\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)}\big ] \le 2 ( 4 e^{{\mathfrak {C}}+1} {\mathfrak {C}}^3 )^{ \theta } d^{\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)} . \end{aligned}$$
(2.28)

Hence, we obtain that for all \(N, d \in {\mathbb {N}}\), \(m \in \{1, 2, \ldots , M_N\}\) it holds that

$$\begin{aligned} \begin{aligned} \left( {\mathbb {E}}\! \left[ \int _{{\mathbb {R}}^d} \Vert Y^{N, d, m, x}_N \Vert ^{p\theta } \, \nu _d (dx) \right] \right) ^{\!\nicefrac {1}{p}}&\le \left( {\mathbb {E}}\! \left[ \int _{{\mathbb {R}}^d} \Vert Y^{N, d, m, x}_N \Vert ^{2p\theta } \, \nu _d (dx) \right] \right) ^{\!\nicefrac {1}{(2p)}}\\&\le 2 ( 4 e^{{\mathfrak {C}}+1} {\mathfrak {C}}^3 )^{ \theta } d^{\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)}. \end{aligned} \end{aligned}$$
(2.29)

Combining this and (2.21) demonstrates that for all \(N, d \in {\mathbb {N}}\), \(\varepsilon \in (0, 1]\) it holds that

$$\begin{aligned} \begin{aligned}&\left( {\mathbb {E}}\! \left[ \int _{{\mathbb {R}}^d} \big | g_{M_N,d} (Y^{N, d, x}_N) - ({\mathcal {R}}({\mathbf {g}}^{M_N, d}_{\varepsilon }) )(Y^{N, d, x}_N) \big |^p \, \nu _d (dx) \right] \right) ^{\!\nicefrac {1}{p}} \\&\quad \le \varepsilon {\mathfrak {C}}d^{{\mathfrak {d}}_5} \big [ d^{\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)} +2 ( 4 e^{{\mathfrak {C}}+1} {\mathfrak {C}}^3 )^{ \theta } d^{\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)} \big ] \le 3 \varepsilon {\mathfrak {C}}( 4 e^{{\mathfrak {C}}+1} {\mathfrak {C}}^3 )^{ \theta } d^{{\mathfrak {d}}_5 + \theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)}. \end{aligned} \end{aligned}$$
(2.30)
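The elementary estimate \(\forall \, a, b \in {\mathbb {R}}:|a+b|^{\theta } \le 2^{\theta -1}(|a|^{\theta }+|b|^{\theta })\) invoked in (2.27) above is a consequence of the convexity of \([0, \infty ) \ni t \mapsto t^{\theta }\) for \(\theta \ge 1\); it can be probed numerically with the following sketch (random sample inputs, small floating-point tolerance).

```python
import random

def power_mean_gap(a, b, theta):
    """Return 2^(theta - 1) * (|a|^theta + |b|^theta) - |a + b|^theta,
    which is nonnegative for every theta >= 1 by convexity of t -> t^theta."""
    return 2.0 ** (theta - 1.0) * (abs(a) ** theta + abs(b) ** theta) - abs(a + b) ** theta

random.seed(0)
gaps = [power_mean_gap(random.uniform(-5.0, 5.0), random.uniform(-5.0, 5.0), theta)
        for theta in (1.0, 1.5, 2.0, 3.0) for _ in range(1000)]
all_nonnegative = all(g >= -1e-9 for g in gaps)
```

For \(\theta = 1\) the inequality reduces to the triangle inequality, and for \(a = b\) it holds with equality.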

In addition, observe that (2.15) ensures that for all \(N, d \in {\mathbb {N}}\), \(x \in {\mathbb {R}}^d\), \(\varepsilon \in (0, 1]\) it holds that

$$\begin{aligned}&\big | ({\mathcal {R}}({\mathbf {g}}^{M_N, d}_{\varepsilon }) )(Y^{N, d, x}_N) - ( {\mathcal {R}}({\mathbf {g}}^{M_N, d}_{\varepsilon }) )(X^{N, d, x, \varepsilon }_N) \big | \nonumber \\&\quad \le \frac{{\mathfrak {C}}d^{{\mathfrak {d}}_6}}{M_N} \bigg [ \textstyle \sum \limits _{m=1}^{M_N} \displaystyle \big (d^{\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)} + \Vert Y^{N, d, m, x}_N\Vert ^{\theta } + \Vert X^{N, d, m, x, \varepsilon }_N \Vert ^{\theta }\big ) \Vert Y^{N, d, m, x}_N - X^{N, d, m, x, \varepsilon }_N\Vert \bigg ]. \end{aligned}$$
(2.31)

This ensures for all \(N, d \in {\mathbb {N}}\), \(\varepsilon \in (0, 1]\) that

$$\begin{aligned} \begin{aligned}&\left( {\mathbb {E}}\! \left[ \int _{{\mathbb {R}}^d} \big | ({\mathcal {R}}({\mathbf {g}}^{M_N, d}_{\varepsilon }) )(Y^{N, d, x}_N) - ( {\mathcal {R}}({\mathbf {g}}^{M_N, d}_{\varepsilon }) )(X^{N, d, x, \varepsilon }_N) \big |^p \, \nu _d (dx) \right] \right) ^{\!\nicefrac {1}{p}}\\&\quad \le \frac{{\mathfrak {C}}d^{{\mathfrak {d}}_6}}{ M_N } \sum _{m=1}^{M_N} \bigg ({\mathbb {E}}\bigg [ \int _{{\mathbb {R}}^d} \big (d^{\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)} + \Vert Y^{N, d, m, x}_N\Vert ^{\theta } + \Vert X^{N, d, m, x, \varepsilon }_N \Vert ^{\theta }\big )^p \\&\qquad \cdot \Vert Y^{N, d, m, x}_N -X^{N, d, m, x, \varepsilon }_N\Vert ^p \, \nu _d (dx) \bigg ] \bigg )^{\!\nicefrac {1}{p}}. \end{aligned} \end{aligned}$$
(2.32)

Hölder’s inequality hence assures for all \(N, d \in {\mathbb {N}}\), \(\varepsilon \in (0, 1]\) that

$$\begin{aligned}&\left( {\mathbb {E}}\! \left[ \int _{{\mathbb {R}}^d} \big | ({\mathcal {R}}({\mathbf {g}}^{M_N, d}_{\varepsilon }) )(Y^{N, d, x}_N) - ( {\mathcal {R}}( {\mathbf {g}}^{M_N, d}_{\varepsilon }) )(X^{N, d, x, \varepsilon }_N) \big |^p \, \nu _d (dx) \right] \right) ^{\!\nicefrac {1}{p}}\nonumber \\&\quad \le \frac{{\mathfrak {C}}d^{{\mathfrak {d}}_6}}{ M_N } \sum _{m=1}^{M_N} \bigg ({\mathbb {E}}\bigg [ \int _{{\mathbb {R}}^d} \big (d^{\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)} + \Vert Y^{N, d, m, x}_N\Vert ^{\theta } + \Vert X^{N, d, m, x, \varepsilon }_N \Vert ^{\theta }\big )^{2p} \, \nu _d (dx) \bigg ] \bigg )^{\!\nicefrac {1}{(2p)}} \nonumber \\&\qquad \cdot \bigg ({\mathbb {E}}\bigg [ \int _{{\mathbb {R}}^d} \Vert Y^{N, d, m, x}_N - X^{N, d, m, x, \varepsilon }_N\Vert ^{2p} \, \nu _d (dx) \bigg ] \bigg )^{\!\nicefrac {1}{(2p)}}. \end{aligned}$$
(2.33)

Moreover, note that (2.28) implies that for all \(N, d \in {\mathbb {N}}\), \(m \in \{1, 2, \ldots , M_N\}\), \(\varepsilon \in (0, 1]\) it holds that

$$\begin{aligned} \begin{aligned}&\left( {\mathbb {E}}\bigg [ \int _{{\mathbb {R}}^d} \big (d^{\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)} + \Vert Y^{N, d, m, x}_N\Vert ^{\theta } + \Vert X^{N, d, m, x, \varepsilon }_N \Vert ^{\theta }\big )^{2p} \, \nu _d (dx) \bigg ] \right) ^{\!\nicefrac {1}{(2p)}}\\&\quad \le d^{\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)}+ \bigg ({\mathbb {E}}\bigg [ \int _{{\mathbb {R}}^d} \Vert Y^{N, d, m, x}_N\Vert ^{2p\theta } \, \nu _d (dx) \bigg ] \bigg )^{\!\nicefrac {1}{(2p)}} \\&\qquad + \bigg ({\mathbb {E}}\bigg [ \int _{{\mathbb {R}}^d} \Vert X^{N, d, m, x, \varepsilon }_N \Vert ^{2p \theta } \, \nu _d (dx) \bigg ] \bigg )^{\!\nicefrac {1}{(2p)}}\\&\quad \le d^{\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)} + 4 ( 4 e^{{\mathfrak {C}}+1} {\mathfrak {C}}^3 )^{ \theta } d^{\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)} \le 5 ( 4 e^{{\mathfrak {C}}+1} {\mathfrak {C}}^3 )^{ \theta } d^{\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)}. \end{aligned} \end{aligned}$$
(2.34)

Next observe that (2.13) and (2.11) prove that for all \(N, d \in {\mathbb {N}}\), \(m \in \{1, 2, \ldots , M_N\}\), \(x \in {\mathbb {R}}^d\), \(\varepsilon \in (0, 1]\), \(n \in \{1, 2, \ldots , N\}\) it holds that

$$\begin{aligned} \begin{aligned}&\Vert Y^{N, d, m, x}_n - X^{N, d, m, x, \varepsilon }_n\Vert \\&\quad = \left\| f_{N, d} \big (Z^{N, d, m}_{n-1}, Y^{N, d, m, x}_{n-1}\big ) - \Big ( {\mathcal {R}}\Big ({\mathbf {f}}^{N, d}_{\varepsilon , Z^{N, d, m}_{n-1}} \Big ) \Big ) \big ( X^{N, d, m, x, \varepsilon }_{n-1}\big ) \right\| \\&\quad \le \left\| f_{N, d} \big (Z^{N, d, m}_{n-1}, Y^{N, d, m, x}_{n-1}\big ) - f_{N, d} \big (Z^{N, d, m}_{n-1}, X^{N, d, m, x, \varepsilon }_{n-1} \big ) \right\| \\&\qquad + \left\| f_{N, d} \big (Z^{N, d, m}_{n-1}, X^{N, d, m, x, \varepsilon }_{n-1} \big ) - \Big ( {\mathcal {R}}\Big ({\mathbf {f}}^{N, d}_{\varepsilon , Z^{N, d, m}_{n-1}} \Big ) \Big ) \big ( X^{N, d, m, x, \varepsilon }_{n-1}\big ) \right\| \\&\quad \le {\mathfrak {C}}\Vert Y^{N, d, m, x}_{n-1} - X^{N, d, m, x, \varepsilon }_{n-1}\Vert + \varepsilon {\mathfrak {C}}d^{{\mathfrak {d}}_4} \left( d^{\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)} + \Vert X^{N, d, m, x, \varepsilon }_{n-1} \Vert ^{\theta } \right) . \end{aligned} \end{aligned}$$
(2.35)

Lemma 2.2 (with \(N = N\), \(p = 2p\), \(\alpha = {\mathfrak {C}}\), \(\beta = \varepsilon {\mathfrak {C}}d^{{\mathfrak {d}}_4}\), \(\gamma = d^{\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)}\), \(Z_n = \Vert X^{N, d, m, x, \varepsilon }_{n} \Vert ^{\theta } \), \(X_n = \Vert Y^{N, d, m, x}_n - X^{N, d, m, x, \varepsilon }_n \Vert \) for \(N, d \in {\mathbb {N}}\), \(m \in \{1, 2, \ldots , M_N\}\), \(x \in {\mathbb {R}}^d\), \(\varepsilon \in (0, 1]\), \(n \in \{1, 2, \ldots , N\}\) in the notation of Lemma 2.2) and (2.27) hence ensure for all \(N, d \in {\mathbb {N}}\), \(m \in \{1, 2, \ldots , M_N\}\), \(x \in {\mathbb {R}}^d\), \(\varepsilon \in (0, 1]\) that

$$\begin{aligned} \begin{aligned}&\left( {\mathbb {E}}\big [ \Vert Y^{N, d, m, x}_N - X^{N, d, m, x, \varepsilon }_N\Vert ^{2p} \big ]\right) ^{\!\nicefrac {1}{(2p)}}\\&\quad \le e^{{\mathfrak {C}}} \varepsilon {\mathfrak {C}}d^{{\mathfrak {d}}_4} \left[ d^{\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)} + \sup \nolimits _{i \in \{0, 1, \ldots , N-1\}} \left( {\mathbb {E}}\big [ \Vert X^{N, d, m, x, \varepsilon }_{i}\Vert ^{2p \theta } \big ]\right) ^{\!\nicefrac {1}{(2p)}} \right] \\&\quad \le \varepsilon e^{{\mathfrak {C}}} {\mathfrak {C}}d^{{\mathfrak {d}}_4} \left[ d^{\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)} + \left( ( 4 e^{{\mathfrak {C}}+1} {\mathfrak {C}}^2 )^{2 p \theta } \big [\Vert x\Vert ^{\theta } + d^{\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)} \big ]^{2p} \right) ^{\!\nicefrac {1}{(2p)}} \right] \\&\quad = \varepsilon e^{{\mathfrak {C}}} {\mathfrak {C}}d^{{\mathfrak {d}}_4} \left[ d^{\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)} + ( 4 e^{{\mathfrak {C}}+1} {\mathfrak {C}}^2 )^{ \theta } \big [\Vert x\Vert ^{\theta } + d^{\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)} \big ] \right] \\&\quad \le 2 \varepsilon e^{{\mathfrak {C}}} {\mathfrak {C}}d^{{\mathfrak {d}}_4} ( 4 e^{{\mathfrak {C}}+1} {\mathfrak {C}}^2 )^{ \theta } \big [\Vert x\Vert ^{\theta } + d^{\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)} \big ]. \end{aligned} \end{aligned}$$
(2.36)

This and (2.10) demonstrate that for all \(N, d \in {\mathbb {N}}\), \(m \in \{1, 2, \ldots , M_N\}\), \(\varepsilon \in (0, 1]\) it holds that

$$\begin{aligned} \begin{aligned}&\bigg ({\mathbb {E}}\bigg [ \int _{{\mathbb {R}}^d} \Vert Y^{N, d, m, x}_N - X^{N, d, m, x, \varepsilon }_N\Vert ^{2p} \, \nu _d (dx) \bigg ] \bigg )^{\!\nicefrac {1}{(2p)}}\\&\quad \le 2 \varepsilon e^{{\mathfrak {C}}} {\mathfrak {C}}d^{{\mathfrak {d}}_4} ( 4 e^{{\mathfrak {C}}+1} {\mathfrak {C}}^2 )^{ \theta } \left[ \int _{{\mathbb {R}}^d} \big ( \Vert x\Vert ^{\theta } +d^{\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)} \big )^{2p}\, \nu _d (dx) \right] ^{\nicefrac {1}{(2p)}}\\&\quad \le 2 \varepsilon e^{{\mathfrak {C}}} {\mathfrak {C}}d^{{\mathfrak {d}}_4} ( 4 e^{{\mathfrak {C}}+1} {\mathfrak {C}}^2 )^{ \theta } \left( \left[ \int _{{\mathbb {R}}^d} \Vert x\Vert ^{2p \theta } \, \nu _d (dx) \right] ^{\nicefrac {1}{(2p)}} + d^{\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)}\right) \\&\quad \le 2 \varepsilon e^{{\mathfrak {C}}} {\mathfrak {C}}d^{{\mathfrak {d}}_4} ( 4 e^{{\mathfrak {C}}+1} {\mathfrak {C}}^2 )^{ \theta } \big ( {\mathfrak {C}}^{\theta } d^{\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)} + d^{\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)} \big )\\&\quad \le 4 \varepsilon e^{{\mathfrak {C}}} {\mathfrak {C}}( 4 e^{{\mathfrak {C}}+1} {\mathfrak {C}}^3 )^{ \theta } d^{{\mathfrak {d}}_4 + \theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)}. \end{aligned} \end{aligned}$$
(2.37)

Combining this with (2.33) and (2.34) establishes that for all \(N, d \in {\mathbb {N}}\), \(\varepsilon \in (0, 1]\) it holds that

$$\begin{aligned} \begin{aligned}&\left( {\mathbb {E}}\! \left[ \int _{{\mathbb {R}}^d} \big | ({\mathcal {R}}({\mathbf {g}}^{M_N, d}_{\varepsilon }) )(Y^{N, d, x}_N) - ( {\mathcal {R}}({\mathbf {g}}^{M_N, d}_{\varepsilon }) )(X^{N, d, x, \varepsilon }_N) \big |^p \, \nu _d (dx) \right] \right) ^{\!\nicefrac {1}{p}} \\&\quad \le {\mathfrak {C}}d^{{\mathfrak {d}}_6} \cdot 5 ( 4 e^{{\mathfrak {C}}+1} {\mathfrak {C}}^3 )^{ \theta } d^{\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)} \cdot 4 \varepsilon e^{{\mathfrak {C}}} {\mathfrak {C}}( 4 e^{{\mathfrak {C}}+1} {\mathfrak {C}}^3 )^{ \theta } d^{{\mathfrak {d}}_4 + \theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)}\\&\quad \le 20 \varepsilon e^{{\mathfrak {C}}} {\mathfrak {C}}^2 ( 4 e^{{\mathfrak {C}}+1} {\mathfrak {C}}^3 )^{ 2\theta } d^{{\mathfrak {d}}_4 + {\mathfrak {d}}_6 + 2\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)}. \end{aligned} \end{aligned}$$
(2.38)

This, (2.9), (2.20), and (2.30) prove for all \(N, d \in {\mathbb {N}}\), \(\varepsilon \in (0, 1]\) that

$$\begin{aligned}&\left( {\mathbb {E}}\! \left[ \int _{{\mathbb {R}}^d} \big |u_d(x) - ( {\mathcal {R}}({\mathbf {g}}^{M_N, d}_{\varepsilon }) )(X^{N, d, x, \varepsilon }_N) \big |^p \, \nu _d (dx) \right] \right) ^{\!\nicefrac {1}{p}} \nonumber \\&\quad \le {\mathfrak {C}}d^{{\mathfrak {d}}_0} N^{-{\mathfrak {n}}_0} + 3 \varepsilon {\mathfrak {C}}( 4 e^{{\mathfrak {C}}+1} {\mathfrak {C}}^3 )^{ \theta } d^{{\mathfrak {d}}_5 + \theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)} + 20 \varepsilon e^{{\mathfrak {C}}} {\mathfrak {C}}^2 ( 4 e^{{\mathfrak {C}}+1} {\mathfrak {C}}^3 )^{ 2\theta } d^{{\mathfrak {d}}_4 + {\mathfrak {d}}_6+ 2\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)}\nonumber \\&\quad \le {\mathfrak {C}}d^{{\mathfrak {d}}_0} N^{-{\mathfrak {n}}_0} + 23 \varepsilon e^{{\mathfrak {C}}} {\mathfrak {C}}^2 ( 4 e^{{\mathfrak {C}}+1} {\mathfrak {C}}^3 )^{ 2\theta } d^{\delta }\nonumber \\&\quad = {\mathfrak {C}}d^{{\mathfrak {d}}_0} N^{-{\mathfrak {n}}_0} + \tfrac{\varepsilon \gamma d^{\delta }}{2} . \end{aligned}$$
(2.39)

Combining this and (2.18) assures that for all \(d \in {\mathbb {N}}\), \(\varepsilon \in (0, 1]\) it holds that

$$\begin{aligned} \begin{aligned}&\left( {\mathbb {E}}\! \left[ \int _{{\mathbb {R}}^d} \Big |u_d(x) - \Big ( {\mathcal {R}}\Big ({\mathbf {g}}^{ M_{{\mathcal {N}}_{d, \varepsilon }}, d }_{ {\mathcal {E}}_{d, \varepsilon } }\Big ) \Big ) \Big (X^{{\mathcal {N}}_{d, \varepsilon }, d, x, {\mathcal {E}}_{d, \varepsilon }}_{{\mathcal {N}}_{d, \varepsilon }} \Big ) \Big |^p \, \nu _d (dx) \right] \right) ^{\!\nicefrac {1}{p}} \le \frac{\varepsilon }{2} + \frac{\varepsilon }{2} = \varepsilon . \end{aligned} \end{aligned}$$
(2.40)

This and, e.g., [36, Lemma 2.1] establish that there exists \({\mathfrak {w}} = ({\mathfrak {w}}_{d, \varepsilon })_{(d, \varepsilon ) \in {\mathbb {N}}\times (0, 1]} :{\mathbb {N}}\times (0, 1] \rightarrow \Omega \) which satisfies for all \(d \in {\mathbb {N}}\), \(\varepsilon \in (0, 1]\) that

$$\begin{aligned} \begin{aligned}&\left[ \int _{{\mathbb {R}}^d} \Big |u_d(x) - \Big ( {\mathcal {R}}\Big ( {\mathbf {g}}^{ M_{{\mathcal {N}}_{d, \varepsilon }}, d }_{ {\mathcal {E}}_{d, \varepsilon } } \Big ) \Big )\Big (X^{{\mathcal {N}}_{d, \varepsilon }, d, x, {\mathcal {E}}_{d, \varepsilon }}_{{\mathcal {N}}_{d, \varepsilon }}({\mathfrak {w}}_{d, \varepsilon })\Big ) \Big |^p \, \nu _d (dx) \right] ^{\nicefrac {1}{p}} \le \varepsilon . \end{aligned} \end{aligned}$$
(2.41)
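The selection of \({\mathfrak {w}}_{d, \varepsilon }\) rests on the elementary fact that a nonnegative random variable cannot everywhere exceed its expectation (cf. [36, Lemma 2.1]). In a finite-sample setting this is simply the observation that the minimum of a family is at most its mean, as the following sketch (with hypothetical sample data) illustrates.

```python
import random

def min_le_mean(errors):
    """The smallest member of a finite family of real numbers is at most the
    family's average; hence, if the averaged error in (2.40) is <= eps, some
    realization attains a pathwise error <= eps, which yields (2.41)."""
    return min(errors) <= sum(errors) / len(errors)

random.seed(1)
samples = [random.uniform(0.0, 1.0) for _ in range(100)]
selection_possible = min_le_mean(samples)
```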

Next note that for all \(N, d \in {\mathbb {N}}\), \(m \in \{1, 2, \ldots , M_{N}\}\), \(x \in {\mathbb {R}}^d\), \(\varepsilon \in (0, 1]\), \(\omega \in \Omega \) it holds that \(X^{N, d, m, x, \varepsilon }_{0}(\omega ) = ({\mathcal {R}}( {\mathfrak {I}}_d))(x)\). The assumption that for all \( d\in {\mathbb {N}}\), \(\varepsilon \in (0, 1]\) it holds that \({\mathfrak {I}}_d \in {\mathfrak {N}}_{d, \varepsilon }\) and (2.17) hence ensure that there exist \((\Phi ^{N, d, m, \varepsilon , \omega }_{n})_{m \in \{1, 2, \ldots , M_{N}\}} \subseteq {\mathfrak {N}}_{d, \varepsilon }\), \(\omega \in \Omega \), \(n \in \{0, 1, 2, \ldots , N\}\), \(\varepsilon \in (0, 1]\), \(d, N \in {\mathbb {N}}\), which satisfy for all \(N, d \in {\mathbb {N}}\), \(\varepsilon \in (0, 1]\), \(n \in \{0, 1, 2, \ldots , N\}\), \(\omega \in \Omega \), \(m \in \{1, 2, \ldots , M_{N}\}\), \(x \in {\mathbb {R}}^d\) that \({\mathcal {P}}(\Phi ^{N, d, m, \varepsilon , \omega }_{n}) \le {\mathcal {P}}({\mathfrak {I}}_d)+ n {\mathfrak {C}}N^{{\mathfrak {n}}_1} d^{{\mathfrak {d}}_3} \varepsilon ^{-{\mathfrak {e}}}\), \({\mathcal {D}}(\Phi ^{N, d, m, \varepsilon , \omega }_{n}) = {\mathcal {D}}(\Phi ^{N, d, 1, \varepsilon , \omega }_{n})\), and

$$\begin{aligned} ({\mathcal {R}}( \Phi ^{N, d, m, \varepsilon , \omega }_{n}))(x) = \Big ( {\mathcal {R}}\Big ({\mathbf {f}}^{N, d}_{\varepsilon , Z^{N, d, m}_{n-1}(\omega )} \Big ) \Big ) \big ( X^{N, d, m, x, \varepsilon }_{n-1}(\omega )\big )= X^{N, d, m, x, \varepsilon }_{n}(\omega ). \end{aligned}$$
(2.42)

The assumption that for all \(d \in {\mathbb {N}}\) it holds that \({\mathcal {P}}({\mathfrak {I}}_d) \le {\mathfrak {C}}d^{{\mathfrak {d}}_3} \) therefore implies for all \(N, d \in {\mathbb {N}}\), \(m \in \{1, 2, \ldots , M_{N}\}\), \(\varepsilon \in (0, 1]\), \(\omega \in \Omega \) that

$$\begin{aligned} {\mathcal {P}}(\Phi ^{N, d, m, \varepsilon , \omega }_{N}) \le {\mathfrak {C}}d^{{\mathfrak {d}}_3}+ N {\mathfrak {C}}N^{{\mathfrak {n}}_1} d^{{\mathfrak {d}}_3} \varepsilon ^{-{\mathfrak {e}}} \le {\mathfrak {C}}d^{{\mathfrak {d}}_3}+ {\mathfrak {C}}N^{{\mathfrak {n}}_1+1} d^{{\mathfrak {d}}_3} \varepsilon ^{-{\mathfrak {e}}}. \end{aligned}$$
(2.43)

Therefore, we obtain that there exist \(\Psi ^{N, d, \varepsilon , \omega } \in {\mathbf {N}}\), \(\omega \in \Omega \), \(\varepsilon \in (0, 1]\), \(d, N \in {\mathbb {N}}\), which satisfy for all \(N, d \in {\mathbb {N}}\), \(\varepsilon \in (0, 1]\), \(\omega \in \Omega \), \(x \in {\mathbb {R}}^d\) that \({\mathcal {R}}(\Psi ^{N, d, \varepsilon , \omega }) \in C({\mathbb {R}}^d, {\mathbb {R}})\), \({\mathcal {P}}(\Psi ^{N, d, \varepsilon , \omega }) \le {\mathfrak {C}}N^{{\mathfrak {n}}_2} (N^{{\mathfrak {n}}_1+1} d^{{\mathfrak {d}}_3} \varepsilon ^{-{\mathfrak {e}}} + {\mathfrak {C}}d^{{\mathfrak {d}}_3}+ {\mathfrak {C}}N^{{\mathfrak {n}}_1+1} d^{{\mathfrak {d}}_3} \varepsilon ^{-{\mathfrak {e}}} ) \), and

$$\begin{aligned}&({\mathcal {R}}(\Psi ^{N, d, \varepsilon , \omega }))(x)\nonumber \\&\quad = ({\mathcal {R}}({\mathbf {g}}^{M_N, d}_{\varepsilon })\big (({\mathcal {R}}(\Phi ^{N, d, 1, \varepsilon , \omega }_{N}))(x), ({\mathcal {R}}(\Phi ^{N, d, 2, \varepsilon , \omega }_{N}))(x), \ldots , ({\mathcal {R}}(\Phi ^{N, d, M_N, \varepsilon , \omega }_{N}))(x)\big )\nonumber \\&\quad = ({\mathcal {R}}({\mathbf {g}}^{M_N, d}_{\varepsilon })) \big ( X^{N, d, 1, x, \varepsilon }_{N}(\omega ), X^{N, d, 2, x, \varepsilon }_{N}(\omega ), \ldots , X^{N, d, M_N, x, \varepsilon }_{N}(\omega )\big )\nonumber \\&\quad = ({\mathcal {R}}({\mathbf {g}}^{M_N, d}_{\varepsilon }))(X^{N, d, x, \varepsilon }_{N}(\omega )). \end{aligned}$$
(2.44)

Hence, we obtain that for all \(N, d \in {\mathbb {N}}\), \(\varepsilon \in (0, 1]\), \(\omega \in \Omega \) it holds that

$$\begin{aligned} \begin{aligned} {\mathcal {P}}(\Psi ^{N, d, \varepsilon , \omega })&\le {\mathfrak {C}}^2 N^{{\mathfrak {n}}_2} d^{{\mathfrak {d}}_3} \varepsilon ^{-{\mathfrak {e}}}(2N^{{\mathfrak {n}}_1+1} +1 )\\&\le 3 {\mathfrak {C}}^2 N^{{\mathfrak {n}}_1+ {\mathfrak {n}}_2+1} d^{{\mathfrak {d}}_3} \varepsilon ^{-{\mathfrak {e}}}. \end{aligned} \end{aligned}$$
(2.45)

This and (2.18) demonstrate that for all \(d \in {\mathbb {N}}\), \(\varepsilon \in (0, 1]\) it holds that

$$\begin{aligned} \begin{aligned}&{\mathcal {P}}(\Psi ^{{\mathcal {N}}_{d, \varepsilon }, d, {\mathcal {E}}_{d, \varepsilon }, {\mathfrak {w}}_{d, \varepsilon }})\\&\quad \le 3 {\mathfrak {C}}^2 2^{ {\mathfrak {n}}_1+{\mathfrak {n}}_2 +1} \big (\tfrac{2{\mathfrak {C}}d^{{\mathfrak {d}}_0}}{\varepsilon }\big )^{\frac{( {\mathfrak {n}}_1+{\mathfrak {n}}_2 +1)}{{\mathfrak {n}}_0}} d^{ {\mathfrak {d}}_3} \varepsilon ^{-{\mathfrak {e}}} \gamma ^{ {\mathfrak {e}}} d^{ {\mathfrak {e}}\delta }\\&\quad \le 3 {\mathfrak {C}}^2 2^{{\mathfrak {n}}_1+{\mathfrak {n}}_2 + 1} (2 {\mathfrak {C}})^{\frac{ {\mathfrak {n}}_1+{\mathfrak {n}}_2 +1}{{\mathfrak {n}}_0}} \gamma ^{ {\mathfrak {e}}} d^{\frac{{\mathfrak {d}}_0( {\mathfrak {n}}_1+{\mathfrak {n}}_2 +1)}{{\mathfrak {n}}_0} + {\mathfrak {d}}_3 + {\mathfrak {e}}\delta } \varepsilon ^{-\frac{( {\mathfrak {n}}_1+ {\mathfrak {n}}_2 +1)}{{\mathfrak {n}}_0} -{\mathfrak {e}}}. \end{aligned} \end{aligned}$$
(2.46)

Combining this, (2.41), and (2.44) establishes (2.16). The proof of Theorem 2.3 is thus completed. \(\square \)
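As a sanity check on the bookkeeping in (2.46): the parameter count of the selected network grows at most like \(d^{q_d} \varepsilon ^{-q_{\varepsilon }}\) with finite exponents, which is precisely the claimed absence of the curse of dimensionality. The sketch below computes these exponents for hypothetical values of the constants of Theorem 2.3.

```python
def complexity_exponents(d0, d3, n0, n1, n2, e, delta):
    """Exponents of d and 1/eps in the final parameter bound (2.46):
    q_d   = d0 * (n1 + n2 + 1) / n0 + d3 + e * delta,
    q_eps = (n1 + n2 + 1) / n0 + e.
    Both are finite for every admissible choice of the constants."""
    q_d = d0 * (n1 + n2 + 1) / n0 + d3 + e * delta
    q_eps = (n1 + n2 + 1) / n0 + e
    return q_d, q_eps
```

For instance, the (hypothetical) choice \({\mathfrak {d}}_0 = {\mathfrak {d}}_3 = {\mathfrak {n}}_0 = {\mathfrak {n}}_1 = {\mathfrak {n}}_2 = {\mathfrak {e}} = 1\), \(\delta = 3\) yields \(q_d = 7\) and \(q_{\varepsilon } = 4\).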

3 Artificial neural network (ANN) calculus

In this section we establish in Lemma 3.29 and Lemma 3.30 below a few elementary results on representation flexibilities of ANNs. In our proofs of Lemma 3.29 and Lemma 3.30 we use results from a certain ANN calculus which we recall from the scientific literature and extend in Sects. 3.1–3.7 below.

In particular, Definition 3.1 below is [25, Definition 2.1], Definition 3.2 below is [25, Definition 2.2], Definition 3.3 below is [25, Definition 2.3], Definition 3.4 below is [25, Definition 2.5], and Definition 3.5 below is [25, Definition 2.17]. Moreover, all the results in Sect. 3.5 below are well-known and/or elementary and the proofs of these results are therefore omitted.

3.1 ANNs

Definition 3.1

(ANNs) We denote by \({\mathbf {N}}\) the set given by

$$\begin{aligned} \begin{aligned} {\mathbf {N}}&= \cup _{L \in {\mathbb {N}}} \cup _{ l_0,l_1,\ldots , l_L \in {\mathbb {N}}} \! \left( \times _{k = 1}^L ({\mathbb {R}}^{l_k \times l_{k-1}} \times {\mathbb {R}}^{l_k}) \right) \end{aligned} \end{aligned}$$
(3.1)

and we denote by \( {\mathcal {P}}:{\mathbf {N}}\rightarrow {\mathbb {N}}\), \({\mathcal {L}}:{\mathbf {N}}\rightarrow {\mathbb {N}}\), \({\mathcal {I}}:{\mathbf {N}}\rightarrow {\mathbb {N}}\), \({\mathcal {O}}:{\mathbf {N}}\rightarrow {\mathbb {N}}\), \({\mathcal {H}}:{\mathbf {N}}\rightarrow {\mathbb {N}}_0\), and \({\mathcal {D}}:{\mathbf {N}}\rightarrow ( \cup _{L\in {\mathbb {N}}}{\mathbb {N}}^{L})\) the functions which satisfy for all \( L\in {\mathbb {N}}\), \(l_0,l_1,\ldots , l_L \in {\mathbb {N}}\), \( \Phi \in ( \times _{k = 1}^L({\mathbb {R}}^{l_k \times l_{k-1}} \times {\mathbb {R}}^{l_k}))\) that \({\mathcal {P}}(\Phi ) = \sum _{k = 1}^L l_k(l_{k-1} + 1) \), \({\mathcal {L}}(\Phi )=L\), \({\mathcal {I}}(\Phi )=l_0\), \({\mathcal {O}}(\Phi )=l_L\), \({\mathcal {H}}(\Phi )=L-1\), and \({\mathcal {D}}(\Phi )= (l_0,l_1,\ldots , l_L)\).
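In the notation of Definition 3.1, all of the bookkeeping functions \({\mathcal {P}}\), \({\mathcal {L}}\), \({\mathcal {I}}\), \({\mathcal {O}}\), \({\mathcal {H}}\), and \({\mathcal {D}}\) are determined by the layer-dimension vector \((l_0, l_1, \ldots , l_L)\) alone. A minimal sketch (representing an ANN only through this vector, which suffices for these quantities):

```python
def ann_stats(dims):
    """Given the layer-dimension vector (l_0, l_1, ..., l_L) of an ANN as in
    Definition 3.1, return the number of parameters P = sum_k l_k (l_{k-1} + 1)
    together with the depth L, the input dimension l_0, the output dimension
    l_L, and the number of hidden layers L - 1."""
    L = len(dims) - 1  # number of affine layers
    P = sum(dims[k] * (dims[k - 1] + 1) for k in range(1, L + 1))
    return {"P": P, "L": L, "I": dims[0], "O": dims[-1], "H": L - 1, "D": tuple(dims)}
```

For example, for \({\mathcal {D}}(\Phi ) = (3, 5, 5, 1)\) one obtains \({\mathcal {P}}(\Phi ) = 5 \cdot 4 + 5 \cdot 6 + 1 \cdot 6 = 56\).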

3.2 Realizations of ANNs

Definition 3.2

(Multidimensional versions) Let \(d \in {\mathbb {N}}\) and let \(\psi :{\mathbb {R}}\rightarrow {\mathbb {R}}\) be a function. Then we denote by \({\mathfrak {M}}_{\psi , d} :{\mathbb {R}}^d \rightarrow {\mathbb {R}}^d\) the function which satisfies for all \( x = ( x_1, \dots , x_{d} ) \in {\mathbb {R}}^{d} \) that

$$\begin{aligned} {\mathfrak {M}}_{\psi , d}( x ) = \left( \psi (x_1) , \ldots , \psi (x_d) \right) . \end{aligned}$$
(3.2)

Definition 3.3

(Realizations associated to ANNs) Let \(a\in C({\mathbb {R}},{\mathbb {R}})\). Then we denote by \( {\mathcal {R}}_{a}:{\mathbf {N}}\rightarrow ( \cup _{k,l\in {\mathbb {N}}}C({\mathbb {R}}^k,{\mathbb {R}}^l)) \) the function which satisfies for all \( L\in {\mathbb {N}}\), \(l_0,l_1,\ldots , l_L \in {\mathbb {N}}\), \( \Phi = ((W_1, B_1),(W_2, B_2),\ldots , (W_L,B_L)) \in ( \times _{k = 1}^L({\mathbb {R}}^{l_k \times l_{k-1}} \times {\mathbb {R}}^{l_k})) \), \(x_0 \in {\mathbb {R}}^{l_0}, x_1 \in {\mathbb {R}}^{l_1}, \ldots , x_{L-1} \in {\mathbb {R}}^{l_{L-1}}\) with \(\forall \, k \in {\mathbb {N}}\cap (0,L) :x_k ={\mathfrak {M}}_{a,l_k}(W_k x_{k-1} + B_k)\) that

$$\begin{aligned} {\mathcal {R}}_{a}(\Phi ) \in C({\mathbb {R}}^{l_0},{\mathbb {R}}^{l_L})\qquad \text {and}\qquad ( {\mathcal {R}}_{a}(\Phi ) ) (x_0) = W_L x_{L-1} + B_L \end{aligned}$$
(3.3)

(cf. Definitions 3.1 and 3.2).
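The realization operator \({\mathcal {R}}_a\) of Definition 3.3 is the usual feed-forward evaluation: every hidden layer applies its affine map followed by the componentwise activation of Definition 3.2, while the output layer is affine only. An informal NumPy sketch (our notation, not part of the formal development):

```python
import numpy as np

def realization(phi, a, x):
    """Evaluate (R_a(Phi))(x): hidden layers apply the affine map and
    then the multidimensional activation M_{a, l_k} (Definition 3.2);
    the final layer applies W_L x + B_L with no activation."""
    x = np.asarray(x, dtype=float)
    for W, B in phi[:-1]:
        x = a(W @ x + B)
    W, B = phi[-1]
    return W @ x + B
```

Note that for \(L = 1\) the activation is never applied, so a one-layer network realizes exactly an affine map.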

3.3 Compositions of ANNs

Definition 3.4

(Compositions of ANNs) We denote by \({(\cdot ) \bullet (\cdot )}:\{(\Phi _1,\Phi _2)\in {\mathbf {N}}\times {\mathbf {N}}:{\mathcal {I}}(\Phi _1)={\mathcal {O}}(\Phi _2)\}\rightarrow {\mathbf {N}}\) the function which satisfies for all \( L,{\mathscr {L}}\in {\mathbb {N}}\), \(l_0,l_1,\ldots , l_L,{\mathfrak {l}}_0,{\mathfrak {l}}_1,\ldots , {\mathfrak {l}}_{\mathscr {L}} \in {\mathbb {N}}\), \( \Phi _1 = ((W_1, B_1),(W_2, B_2),\ldots , (W_L,B_L)) \in ( \times _{k = 1}^L({\mathbb {R}}^{l_k \times l_{k-1}} \times {\mathbb {R}}^{l_k})) \), \( \Phi _2 = (({\mathscr {W}}_1, {\mathscr {B}}_1),({\mathscr {W}}_2, {\mathscr {B}}_2),\ldots , ({\mathscr {W}}_{\mathscr {L}},{\mathscr {B}}_{\mathscr {L}})) \in ( \times _{k = 1}^{\mathscr {L}}({\mathbb {R}}^{{\mathfrak {l}}_k \times {\mathfrak {l}}_{k-1}} \times {\mathbb {R}}^{{\mathfrak {l}}_k})) \) with \(l_0={\mathcal {I}}(\Phi _1)={\mathcal {O}}(\Phi _2)={\mathfrak {l}}_{{\mathscr {L}}}\) that

$$\begin{aligned} \begin{aligned}&{\Phi _1 \bullet \Phi _2}\\&\quad = {\left\{ \begin{array}{ll} \begin{array}{r} \big (({\mathscr {W}}_1, {\mathscr {B}}_1),({\mathscr {W}}_2, {\mathscr {B}}_2),\ldots , ({\mathscr {W}}_{{\mathscr {L}}-1},{\mathscr {B}}_{{\mathscr {L}}-1}), (W_1 {\mathscr {W}}_{{\mathscr {L}}}, W_1 {\mathscr {B}}_{{\mathscr {L}}}+B_{1}),\\ (W_2, B_2), (W_3, B_3),\ldots ,(W_{L},B_{L})\big ) \end{array} &{}: L>1<{\mathscr {L}} \\ \big ( (W_1 {\mathscr {W}}_{1}, W_1 {\mathscr {B}}_1+B_{1}), (W_2, B_2), (W_3, B_3),\ldots ,(W_{L},B_{L}) \big ) &{}: L>1={\mathscr {L}}\\ \big (({\mathscr {W}}_1, {\mathscr {B}}_1),({\mathscr {W}}_2, {\mathscr {B}}_2),\ldots , ({\mathscr {W}}_{{\mathscr {L}}-1},{\mathscr {B}}_{{\mathscr {L}}-1}),(W_1 {\mathscr {W}}_{{\mathscr {L}}}, W_1 {\mathscr {B}}_{{\mathscr {L}}}+B_{1}) \big ) &{}: L=1<{\mathscr {L}} \\ (W_1 {\mathscr {W}}_{1}, W_1 {\mathscr {B}}_1+B_{1}) &{}: L=1={\mathscr {L}} \end{array}\right. } \end{aligned}\nonumber \\ \end{aligned}$$
(3.4)

(cf. Definition 3.1).
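The four cases in (3.4) all describe the same operation: keep every layer of \(\Phi _2\) except its last, keep every layer of \(\Phi _1\) except its first, and fuse the two adjacent affine maps into a single layer. Informally, in NumPy (a sketch in our own notation):

```python
import numpy as np

def compose(phi1, phi2):
    """Phi1 • Phi2 (Definition 3.4): splice phi2 (without its output
    layer) onto phi1 (without its input layer), fusing the two affine
    maps into (W_1 @ W_scrL, W_1 @ B_scrL + B_1)."""
    (W1, B1), (WL, BL) = phi1[0], phi2[-1]
    return phi2[:-1] + [(W1 @ WL, W1 @ BL + B1)] + phi1[1:]
```

In particular, \({\mathcal {L}}(\Phi _1 \bullet \Phi _2) = {\mathcal {L}}(\Phi _1) + {\mathcal {L}}(\Phi _2) - 1\).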

3.4 Parallelizations of ANNs with the same length

Definition 3.5

(Parallelizations of ANNs with the same length) Let \(n\in {\mathbb {N}}\). Then we denote by

$$\begin{aligned} {\mathbf {P}}_{n}:\big \{(\Phi _1,\Phi _2,\dots , \Phi _n)\in {\mathbf {N}}^n:{\mathcal {L}}(\Phi _1)= {\mathcal {L}}(\Phi _2)=\cdots ={\mathcal {L}}(\Phi _n) \big \}\rightarrow {\mathbf {N}}\end{aligned}$$
(3.5)

the function which satisfies for all \(L\in {\mathbb {N}}\), \(l_{1,0},l_{1,1},\dots , l_{1,L}, l_{2,0},l_{2,1},\dots , l_{2,L},\dots ,l_{n,0},l_{n,1},\dots , l_{n,L}\in {\mathbb {N}}\), \(\Phi _1=((W_{1,1}, B_{1,1}),(W_{1,2}, B_{1,2}),\ldots , (W_{1,L},B_{1,L}))\in ( \times _{k = 1}^L({\mathbb {R}}^{l_{1,k} \times l_{1,k-1}} \times {\mathbb {R}}^{l_{1,k}}))\), \(\Phi _2=((W_{2,1}, B_{2,1}),(W_{2,2}, B_{2,2}),\ldots , (W_{2,L},B_{2,L}))\in ( \times _{k = 1}^L({\mathbb {R}}^{l_{2,k} \times l_{2,k-1}} \times {\mathbb {R}}^{l_{2,k}}))\), ..., \(\Phi _n=((W_{n,1}, B_{n,1}),(W_{n,2}, B_{n,2}),\ldots , (W_{n,L},B_{n,L}))\in ( \times _{k = 1}^L({\mathbb {R}}^{l_{n,k} \times l_{n,k-1}} \times {\mathbb {R}}^{l_{n,k}}))\) that

$$\begin{aligned} \begin{aligned} {\mathbf {P}}_{n}(\Phi _1,\Phi _2,\dots ,\Phi _n)&= \left( \left( {\begin{pmatrix} W_{1,1}&{} 0&{} 0&{} \cdots &{} 0\\ 0&{} W_{2,1}&{} 0&{}\cdots &{} 0\\ 0&{} 0&{} W_{3,1}&{}\cdots &{} 0\\ \vdots &{} \vdots &{}\vdots &{} \ddots &{} \vdots \\ 0&{} 0&{} 0&{}\cdots &{} W_{n,1} \end{pmatrix} ,\begin{pmatrix}B_{1,1}\\ B_{2,1}\\ B_{3,1}\\ \vdots \\ B_{n,1}\end{pmatrix}}\right) ,\right. \\ {}&\quad \left( {\begin{pmatrix} W_{1,2}&{} 0&{} 0&{} \cdots &{} 0\\ 0&{} W_{2,2}&{} 0&{}\cdots &{} 0\\ 0&{} 0&{} W_{3,2}&{}\cdots &{} 0\\ \vdots &{} \vdots &{}\vdots &{} \ddots &{} \vdots \\ 0&{} 0&{} 0&{}\cdots &{} W_{n,2} \end{pmatrix} ,\begin{pmatrix}B_{1,2}\\ B_{2,2}\\ B_{3,2}\\ \vdots \\ B_{n,2}\end{pmatrix}}\right) ,\dots , \\ {}&\quad \left. \left( {\begin{pmatrix} W_{1,L}&{} 0&{} 0&{} \cdots &{} 0\\ 0&{} W_{2,L}&{} 0&{}\cdots &{} 0\\ 0&{} 0&{} W_{3,L}&{}\cdots &{} 0\\ \vdots &{} \vdots &{}\vdots &{} \ddots &{} \vdots \\ 0&{} 0&{} 0&{}\cdots &{} W_{n,L} \end{pmatrix} ,\begin{pmatrix}B_{1,L}\\ B_{2,L}\\ B_{3,L}\\ \vdots \\ B_{n,L}\end{pmatrix}}\right) \right) \end{aligned} \end{aligned}$$
(3.6)

(cf. Definition 3.1).
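In words, the parallelization \({\mathbf {P}}_n\) of Definition 3.5 places the weight matrices of layer \(k\) on the diagonal of a block matrix and stacks the corresponding biases. An informal NumPy sketch (function names are ours):

```python
import numpy as np

def block_diag(mats):
    """Block-diagonal matrix built from a list of 2-d arrays."""
    out = np.zeros((sum(m.shape[0] for m in mats),
                    sum(m.shape[1] for m in mats)))
    r = c = 0
    for m in mats:
        out[r:r + m.shape[0], c:c + m.shape[1]] = m
        r += m.shape[0]
        c += m.shape[1]
    return out

def parallelize(phis):
    """P_n(Phi_1, ..., Phi_n) (Definition 3.5): all networks must share
    one depth L; layer k of the result is the block-diagonal of the
    matrices W_{j,k} together with the stacked biases B_{j,k}."""
    L = len(phis[0])
    assert all(len(p) == L for p in phis)
    return [(block_diag([p[k][0] for p in phis]),
             np.concatenate([p[k][1] for p in phis])) for k in range(L)]
```

The layer dimensions of the result are the sums of the layer dimensions of the inputs, consistent with [25, Lemma 2.18] as used in the proof of Lemma 3.16 below.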

3.5 Linear transformations of ANNs

Definition 3.6

(Identity matrix) Let \(n\in {\mathbb {N}}\). Then we denote by \({\text {I}}_{n}\in {\mathbb {R}}^{n\times n}\) the identity matrix in \({\mathbb {R}}^{n\times n}\).

Definition 3.7

(ANNs with a vector input) Let \(n \in {\mathbb {N}}\), \(B \in {\mathbb {R}}^n\). Then we denote by \({\mathfrak {B}}_B \in ({\mathbb {R}}^{n \times n} \times {\mathbb {R}}^n)\) the pair given by \({\mathfrak {B}}_B = ({\text {I}}_n, B)\) (cf. Definition 3.6).

Lemma 3.8

Let \(n \in {\mathbb {N}}\), \(B \in {\mathbb {R}}^n\). Then

  1. (i)

    it holds that \({\mathfrak {B}}_B \in {\mathbf {N}}\),

  2. (ii)

    it holds that \({\mathcal {D}}({\mathfrak {B}}_{B}) = (n, n) \in {\mathbb {N}}^2\),

  3. (iii)

    it holds for all \(a \in C({\mathbb {R}}, {\mathbb {R}})\) that \({\mathcal {R}}_{a}({\mathfrak {B}}_{B}) \in C({\mathbb {R}}^n, {\mathbb {R}}^n)\), and

  4. (iv)

    it holds for all \(a \in C({\mathbb {R}}, {\mathbb {R}})\), \(x \in {\mathbb {R}}^n\) that \(({\mathcal {R}}_{a}({\mathfrak {B}}_{B})) (x) = x + B\)

(cf. Definitions 3.1, 3.3, and 3.7).

Lemma 3.9

Let \(\Phi \in {\mathbf {N}}\) (cf. Definition 3.1). Then

  1. (i)

    it holds for all \(B \in {\mathbb {R}}^{{\mathcal {O}}(\Phi )} \) that \({\mathcal {D}}({{\mathfrak {B}}_B \bullet \Phi }) = {\mathcal {D}}(\Phi )\),

  2. (ii)

    it holds for all \(B \in {\mathbb {R}}^{{\mathcal {O}}(\Phi )}\), \(a \in C({\mathbb {R}}, {\mathbb {R}})\) that \({\mathcal {R}}_{a}({{\mathfrak {B}}_B \bullet \Phi }) \in C({\mathbb {R}}^{{\mathcal {I}}(\Phi )}, {\mathbb {R}}^{{\mathcal {O}}(\Phi )}) \),

  3. (iii)

    it holds for all \(B \in {\mathbb {R}}^{{\mathcal {O}}(\Phi )} \), \(a \in C({\mathbb {R}}, {\mathbb {R}})\), \(x \in {\mathbb {R}}^{{\mathcal {I}}(\Phi )}\) that

    $$\begin{aligned} ({\mathcal {R}}_{a}({{\mathfrak {B}}_B \bullet \Phi }))(x) = ({\mathcal {R}}_{a}(\Phi ))(x) + B, \end{aligned}$$
    (3.7)
  4. (iv)

    it holds for all \(B \in {\mathbb {R}}^{{\mathcal {I}}(\Phi )} \) that \({\mathcal {D}}({\Phi \bullet {\mathfrak {B}}_B}) = {\mathcal {D}}(\Phi )\),

  5. (v)

    it holds for all \(B \in {\mathbb {R}}^{{\mathcal {I}}(\Phi )} \), \(a \in C({\mathbb {R}}, {\mathbb {R}})\) that \({\mathcal {R}}_{a}({\Phi \bullet {\mathfrak {B}}_B}) \in C({\mathbb {R}}^{{\mathcal {I}}(\Phi )}, {\mathbb {R}}^{{\mathcal {O}}(\Phi )}) \), and

  6. (vi)

    it holds for all \(B \in {\mathbb {R}}^{{\mathcal {I}}(\Phi )} \), \(a \in C({\mathbb {R}}, {\mathbb {R}})\), \(x \in {\mathbb {R}}^{{\mathcal {I}}(\Phi )}\) that

    $$\begin{aligned} ({\mathcal {R}}_{a}({\Phi \bullet {\mathfrak {B}}_B}))(x) = ({\mathcal {R}}_{a}(\Phi ))(x+B) \end{aligned}$$
    (3.8)

(cf. Definitions 3.3, 3.4, and 3.7).

Definition 3.10

(ANNs with a matrix input) Let \(m, n \in {\mathbb {N}}\), \(W \in {\mathbb {R}}^{m \times n}\). Then we denote by \({\mathfrak {W}}_{W} \in ({\mathbb {R}}^{m \times n} \times {\mathbb {R}}^{m})\) the pair given by \({\mathfrak {W}}_{W} = (W, 0)\).

Lemma 3.11

Let \(m, n \in {\mathbb {N}}\), \(W \in {\mathbb {R}}^{m \times n}\). Then

  1. (i)

    it holds that \({\mathfrak {W}}_W \in {\mathbf {N}}\),

  2. (ii)

    it holds that \({\mathcal {D}}({\mathfrak {W}}_{W}) = (n, m) \in {\mathbb {N}}^2\),

  3. (iii)

    it holds for all \(a \in C({\mathbb {R}}, {\mathbb {R}})\) that \({\mathcal {R}}_{a}({\mathfrak {W}}_{W}) \in C({\mathbb {R}}^n, {\mathbb {R}}^m)\), and

  4. (iv)

    it holds for all \(a \in C({\mathbb {R}}, {\mathbb {R}})\), \(x \in {\mathbb {R}}^n\) that \( ({\mathcal {R}}_{a}({\mathfrak {W}}_{W})) (x) = Wx \)

(cf. Definitions 3.1, 3.3, and 3.10).

Lemma 3.12

Let \(a \in C({\mathbb {R}}, {\mathbb {R}})\), \(\Phi \in {\mathbf {N}}\) (cf. Definition 3.1). Then

  1. (i)

    it holds for all \(m \in {\mathbb {N}}\), \(W \in {\mathbb {R}}^{m \times {\mathcal {O}}(\Phi )}\) that \({\mathcal {R}}_{a}({{\mathfrak {W}}_W \bullet \Phi }) \in C({\mathbb {R}}^{{\mathcal {I}}(\Phi )}, {\mathbb {R}}^m) \),

  2. (ii)

    it holds for all \(m \in {\mathbb {N}}\), \(W \in {\mathbb {R}}^{m \times {\mathcal {O}}(\Phi )}\), \(x \in {\mathbb {R}}^{{\mathcal {I}}(\Phi )}\) that

    $$\begin{aligned} ({\mathcal {R}}_{a}({{\mathfrak {W}}_W \bullet \Phi }))(x) = W \big (({\mathcal {R}}_{a}(\Phi ))(x)\big ), \end{aligned}$$
    (3.9)
  3. (iii)

    it holds for all \(n \in {\mathbb {N}}\), \(W \in {\mathbb {R}}^{ {\mathcal {I}}(\Phi ) \times n}\) that \({\mathcal {R}}_{a}({\Phi \bullet {\mathfrak {W}}_W}) \in C({\mathbb {R}}^n, {\mathbb {R}}^{{\mathcal {O}}(\Phi )}) \), and

  4. (iv)

    it holds for all \(n \in {\mathbb {N}}\), \(W \in {\mathbb {R}}^{ {\mathcal {I}}(\Phi ) \times n}\), \(x \in {\mathbb {R}}^n\) that

    $$\begin{aligned} ({\mathcal {R}}_{a}({\Phi \bullet {\mathfrak {W}}_W}))(x) = ({\mathcal {R}}_{a}(\Phi ))(Wx) \end{aligned}$$
    (3.10)

(cf. Definitions 3.3, 3.4, and 3.10).

Definition 3.13

(Scalar multiplications of ANNs) We denote by \(\left( \cdot \right) \circledast \left( \cdot \right) :{\mathbb {R}}\times {\mathbf {N}}\rightarrow {\mathbf {N}}\) the function which satisfies for all \(\lambda \in {\mathbb {R}}\), \(\Phi \in {\mathbf {N}}\) that

$$\begin{aligned} \lambda \circledast \Phi = {{\mathfrak {W}}_{\lambda {\text {I}}_{{\mathcal {O}}(\Phi )}} \bullet \Phi } \end{aligned}$$
(3.11)

(cf. Definitions 3.1, 3.4, 3.6, and 3.10).

Lemma 3.14

Let \(\lambda \in {\mathbb {R}}\), \(\Phi \in {\mathbf {N}}\) (cf. Definition 3.1). Then

  1. (i)

    it holds that \({\mathcal {D}}(\lambda \circledast \Phi ) = {\mathcal {D}}(\Phi )\),

  2. (ii)

    it holds for all \(a \in C({\mathbb {R}}, {\mathbb {R}})\) that \({\mathcal {R}}_{a}(\lambda \circledast \Phi ) \in C({\mathbb {R}}^{{\mathcal {I}}(\Phi )}, {\mathbb {R}}^{{\mathcal {O}}(\Phi )})\), and

  3. (iii)

    it holds for all \(a \in C({\mathbb {R}}, {\mathbb {R}})\), \(x \in {\mathbb {R}}^{{\mathcal {I}}(\Phi )}\) that

    $$\begin{aligned} ({\mathcal {R}}_{a}(\lambda \circledast \Phi ))(x) = \lambda \big ( ({\mathcal {R}}_{a}(\Phi ))(x) \big ) \end{aligned}$$
    (3.12)

(cf. Definitions 3.3 and 3.13).
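Since \({\mathfrak {W}}_{\lambda {\text {I}}_{{\mathcal {O}}(\Phi )}}\) is the single affine layer \(x \mapsto \lambda x\), the composition in (3.11) only rescales the output layer of \(\Phi \), which is why \({\mathcal {D}}(\lambda \circledast \Phi ) = {\mathcal {D}}(\Phi )\) in item (i) of Lemma 3.14. Informally (a sketch in our own notation):

```python
import numpy as np

def scalar_mult(lam, phi):
    """lam ⊛ Phi (Definition 3.13): composing W_{lam I} • Phi via
    Definition 3.4 fuses x -> lam x into the output layer, so only
    (W_L, B_L) changes and the architecture is preserved."""
    W, B = phi[-1]
    return phi[:-1] + [(lam * W, lam * B)]
```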

3.6 Representations of the identities with rectifier functions

Definition 3.15

We denote by \({\mathfrak {I}}= ({\mathfrak {I}}_d)_{d \in {\mathbb {N}}} :{\mathbb {N}}\rightarrow {\mathbf {N}}\) the function which satisfies for all \(d \in {\mathbb {N}}\) that

$$\begin{aligned} {\mathfrak {I}}_1 = \left( \left( \begin{pmatrix} 1\\ -1 \end{pmatrix}, \begin{pmatrix} 0\\ 0 \end{pmatrix} \right) , \Big ( \begin{pmatrix} 1&-1 \end{pmatrix}, 0 \Big ) \right) \in \big (({\mathbb {R}}^{2 \times 1} \times {\mathbb {R}}^{2}) \times ({\mathbb {R}}^{1 \times 2} \times {\mathbb {R}}^1) \big ) \end{aligned}$$
(3.13)

and

$$\begin{aligned} {\mathfrak {I}}_d = {\mathbf {P}}_{d} ({\mathfrak {I}}_1, {\mathfrak {I}}_1, \ldots , {\mathfrak {I}}_1) \end{aligned}$$
(3.14)

(cf. Definitions 3.1 and 3.5).

Lemma 3.16

Let \(d \in {\mathbb {N}}\), \(a \in C({\mathbb {R}}, {\mathbb {R}})\) satisfy for all \(x \in {\mathbb {R}}\) that \(a(x) = \max \{x, 0\}\). Then

  1. (i)

    it holds that \({\mathcal {D}}({\mathfrak {I}}_d) = (d, 2d, d) \in {\mathbb {N}}^3\),

  2. (ii)

    it holds that \( {\mathcal {R}}_{a}({\mathfrak {I}}_d) \in C({\mathbb {R}}^d, {\mathbb {R}}^d)\), and

  3. (iii)

    it holds for all \(x \in {\mathbb {R}}^d\) that \(({\mathcal {R}}_{a}({\mathfrak {I}}_d))(x) = x\)

(cf. Definitions 3.1, 3.3, and 3.15).

Proof of Lemma 3.16

Throughout this proof let \(L =2\), \(l_0 = 1\), \(l_1 = 2\), \(l_2 =1\). Note that (3.13) ensures that

$$\begin{aligned} {\mathcal {D}}({\mathfrak {I}}_1) = (1, 2, 1) = (l_0, l_1, l_2). \end{aligned}$$
(3.15)

This and, e.g., [25, Lemma 2.18] prove that

$$\begin{aligned} \begin{aligned}&{\mathbf {P}}_{d} ({\mathfrak {I}}_1, {\mathfrak {I}}_1, \ldots , {\mathfrak {I}}_1) \\&\in \big (\! \times _{k = 1}^L \big ( {\mathbb {R}}^{(d l_k) \times (d l_{k-1})} \times {\mathbb {R}}^{(d l_k)}\big ) \big ) = \big (\big ({\mathbb {R}}^{(2d) \times d} \times {\mathbb {R}}^{2d}\big ) \times \big ({\mathbb {R}}^{d \times (2d)} \times {\mathbb {R}}^d\big ) \big ) \end{aligned} \end{aligned}$$
(3.16)

(cf. Definition 3.5). Hence, we obtain that \({\mathcal {D}}({\mathfrak {I}}_d) = (d, 2d, d) \in {\mathbb {N}}^3\). This establishes item (i). Next note that (3.13) assures that for all \(x \in {\mathbb {R}}\) it holds that

$$\begin{aligned} ({\mathcal {R}}_{a}({\mathfrak {I}}_1))(x) = a(x) - a(-x) = \max \{x, 0\} - \max \{-x, 0\} = x. \end{aligned}$$
(3.17)

Combining this and, e.g., [25, Proposition 2.19] demonstrates that for all \( x = (x_1, x_2, \ldots , x_d) \in {\mathbb {R}}^d\) it holds that \({\mathcal {R}}_{a}({\mathfrak {I}}_d) \in C({\mathbb {R}}^d, {\mathbb {R}}^d)\) and

$$\begin{aligned} \begin{aligned} ({\mathcal {R}}_{a}({\mathfrak {I}}_d))(x)&= \big ( {\mathcal {R}}_{a}\big ({\mathbf {P}}_{d} ({\mathfrak {I}}_1, {\mathfrak {I}}_1, \ldots , {\mathfrak {I}}_1)\big )\big )(x_1, x_2, \ldots , x_d)\\&= \big ( ({\mathcal {R}}_{a}({\mathfrak {I}}_1))(x_1), ({\mathcal {R}}_{a}({\mathfrak {I}}_1))(x_2), \ldots , ({\mathcal {R}}_{a}({\mathfrak {I}}_1))(x_d)\big )\\&= (x_1, x_2, \ldots , x_d) = x. \end{aligned} \end{aligned}$$
(3.18)

This establishes items (ii)–(iii). The proof of Lemma 3.16 is thus completed. \(\square \)
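The one-dimensional identity network \({\mathfrak {I}}_1\) from (3.13) can be checked numerically; the following informal sketch (not part of the formal development) implements the two layers of (3.13) with the rectifier activation and recovers the identity \(x = \max \{x,0\} - \max \{-x,0\}\) from (3.17).

```python
import numpy as np

def relu(v):
    return np.maximum(v, 0.0)

# I_1 from (3.13): hidden layer x -> (relu(x), relu(-x)),
# output layer (h1, h2) -> h1 - h2, so the realization is x -> x.
W1, B1 = np.array([[1.0], [-1.0]]), np.zeros(2)
W2, B2 = np.array([[1.0, -1.0]]), np.zeros(1)

def identity_net(x):
    h = relu(W1 @ np.atleast_1d(x) + B1)
    return W2 @ h + B2
```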

3.7 Sums of ANNs with the same length

Definition 3.17

Let \(m, n \in {\mathbb {N}}\). Then we denote by \({\mathfrak {S}}_{m, n} \in ({\mathbb {R}}^{m \times (nm)} \times {\mathbb {R}}^m)\) the pair given by

$$\begin{aligned} {\mathfrak {S}}_{m, n} = {\mathfrak {W}}_{({\text {I}}_m \,\,\, {\text {I}}_m \,\,\, \ldots \,\,\, {\text {I}}_m) } \end{aligned}$$
(3.19)

(cf. Definitions 3.6 and 3.10).

Lemma 3.18

Let \(m, n \in {\mathbb {N}}\). Then

  1. (i)

    it holds that \({\mathfrak {S}}_{m, n} \in {\mathbf {N}}\),

  2. (ii)

    it holds that \({\mathcal {D}}({\mathfrak {S}}_{m, n}) = (nm, m) \in {\mathbb {N}}^2\),

  3. (iii)

    it holds for all \(a \in C({\mathbb {R}}, {\mathbb {R}})\) that \({\mathcal {R}}_{a}({\mathfrak {S}}_{m, n}) \in C({\mathbb {R}}^{nm}, {\mathbb {R}}^m)\), and

  4. (iv)

    it holds for all \(a \in C({\mathbb {R}}, {\mathbb {R}})\), \(x_1, x_2, \ldots , x_n \in {\mathbb {R}}^{m}\) that

    $$\begin{aligned} ({\mathcal {R}}_{a}({\mathfrak {S}}_{m, n})) (x_1, x_2, \ldots , x_n) = \textstyle \sum _{k=1}^n x_k \end{aligned}$$
    (3.20)

(cf. Definitions 3.1, 3.3, and 3.17).

Proof of Lemma 3.18

Note that the fact that \({\mathfrak {S}}_{m, n} \in ({\mathbb {R}}^{m \times (nm)} \times {\mathbb {R}}^m)\) ensures that \({\mathfrak {S}}_{m, n} \in {\mathbf {N}}\) and \({\mathcal {D}}({\mathfrak {S}}_{m, n}) = (nm, m) \in {\mathbb {N}}^2\). This establishes items (i)–(ii). Next observe that items (iii)–(iv) in Lemma 3.11 prove that for all \(a \in C({\mathbb {R}}, {\mathbb {R}})\), \(x_1, x_2, \ldots , x_n \in {\mathbb {R}}^{m}\) it holds that \({\mathcal {R}}_{a}({\mathfrak {S}}_{m, n}) \in C({\mathbb {R}}^{nm}, {\mathbb {R}}^m)\) and

$$\begin{aligned} \begin{aligned} ({\mathcal {R}}_{a}({\mathfrak {S}}_{m, n})) (x_1, x_2, \ldots , x_n)&= \big ( {\mathcal {R}}_{a}\big ( {\mathfrak {W}}_{({\text {I}}_m \,\,\, {\text {I}}_m \,\,\, \ldots \,\,\, {\text {I}}_m) } \big )\big )(x_1, x_2, \ldots , x_n)\\&= ({\text {I}}_m \,\,\, {\text {I}}_m \,\,\, \ldots \,\,\, {\text {I}}_m) (x_1, x_2, \ldots , x_n) = \textstyle \sum _{k=1}^n x_k \end{aligned} \end{aligned}$$
(3.21)

(cf. Definitions 3.6 and 3.10). This establishes items (iii)–(iv). The proof of Lemma 3.18 is thus completed. \(\square \)
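The weight matrix of \({\mathfrak {S}}_{m,n}\) in (3.19) is simply \(n\) copies of \({\text {I}}_m\) placed side by side, so that applying it to a concatenated vector \((x_1, \ldots , x_n)\) returns the sum \(\sum _{k=1}^n x_k\) as in (3.20). Informally (our notation):

```python
import numpy as np

def summation_weight(m, n):
    """The weight of S_{m,n} = W_{(I_m I_m ... I_m)} (Definition 3.17):
    an m x (nm) matrix sending (x_1, ..., x_n) to x_1 + ... + x_n."""
    return np.hstack([np.eye(m)] * n)
```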

Lemma 3.19

Let \(m, n \in {\mathbb {N}}\), \(a \in C({\mathbb {R}}, {\mathbb {R}})\), \(\Phi \in \{\Psi \in {\mathbf {N}}:{\mathcal {O}}(\Psi ) = nm\}\) (cf. Definition 3.1). Then

  1. (i)

    it holds that \({\mathcal {R}}_{a}({{\mathfrak {S}}_{m, n} \bullet \Phi }) \in C({\mathbb {R}}^{{\mathcal {I}}(\Phi )}, {\mathbb {R}}^m) \) and

  2. (ii)

    it holds for all \(x \in {\mathbb {R}}^{{\mathcal {I}}(\Phi )}\), \(y_1, y_2, \ldots , y_n \in {\mathbb {R}}^{m}\) with \(({\mathcal {R}}_{a}(\Phi ))(x) = (y_1, y_2, \ldots , y_n)\) that

    $$\begin{aligned} \big ( {\mathcal {R}}_{a}({{\mathfrak {S}}_{m, n} \bullet \Phi }) \big )(x) = \textstyle \sum _{k=1}^n y_k \end{aligned}$$
    (3.22)

(cf. Definitions 3.3, 3.4, and 3.17).

Proof of Lemma 3.19

Note that Lemma 3.18 ensures that for all \(x_1, x_2, \ldots , x_n \in {\mathbb {R}}^{m}\) it holds that \({\mathcal {R}}_{a}({\mathfrak {S}}_{m, n}) \in C({\mathbb {R}}^{nm}, {\mathbb {R}}^m)\) and

$$\begin{aligned} \begin{aligned} ({\mathcal {R}}_{a}({\mathfrak {S}}_{m, n})) (x_1, x_2, \ldots , x_n) = \textstyle \sum _{k=1}^n x_k. \end{aligned} \end{aligned}$$
(3.23)

Combining this and, e.g., [25, item (v) in Proposition 2.6] establishes items (i)–(ii). The proof of Lemma 3.19 is thus completed. \(\square \)

Lemma 3.20

Let \(n \in {\mathbb {N}}\), \(a \in C({\mathbb {R}}, {\mathbb {R}})\), \(\Phi \in {\mathbf {N}}\) (cf. Definition 3.1). Then

  1. (i)

    it holds that \({\mathcal {R}}_{a}({\Phi \bullet {\mathfrak {S}}_{{\mathcal {I}}(\Phi ), n}}) \in C({\mathbb {R}}^{n {\mathcal {I}}(\Phi )}, {\mathbb {R}}^{{\mathcal {O}}(\Phi )}) \) and

  2. (ii)

    it holds for all \(x_1, x_2, \ldots , x_n \in {\mathbb {R}}^{{\mathcal {I}}(\Phi )}\) that

    $$\begin{aligned} \big ({\mathcal {R}}_{a}({\Phi \bullet {\mathfrak {S}}_{{\mathcal {I}}(\Phi ), n}}) \big )(x_1, x_2, \ldots , x_n) = ({\mathcal {R}}_{a}(\Phi ))(\textstyle \sum _{k=1}^n x_k) \end{aligned}$$
    (3.24)

(cf. Definitions 3.3, 3.4, and 3.17).

Proof of Lemma 3.20

Note that Lemma 3.18 demonstrates that for all \(m \in {\mathbb {N}}\), \(x_1, x_2, \ldots , x_n \in {\mathbb {R}}^{m}\) it holds that \({\mathcal {R}}_{a}({\mathfrak {S}}_{m, n}) \in C({\mathbb {R}}^{nm}, {\mathbb {R}}^m)\) and

$$\begin{aligned} \begin{aligned} ({\mathcal {R}}_{a}({\mathfrak {S}}_{m, n})) (x_1, x_2, \ldots , x_n) = \textstyle \sum _{k=1}^n x_k. \end{aligned} \end{aligned}$$
(3.25)

Combining this and, e.g., [25, item (v) in Proposition 2.6] establishes items (i)–(ii). The proof of Lemma 3.20 is thus completed. \(\square \)

Definition 3.21

Let \(m, n \in {\mathbb {N}}\), \(A \in {\mathbb {R}}^{m \times n}\). Then we denote by \(A^* \in {\mathbb {R}}^{n \times m}\) the transpose of A.

Definition 3.22

Let \(m, n \in {\mathbb {N}}\). Then we denote by \({\mathfrak {T}}_{m, n} \in ({\mathbb {R}}^{(nm) \times m} \times {\mathbb {R}}^{nm})\) the pair given by

$$\begin{aligned} {\mathfrak {T}}_{m, n} = {\mathfrak {W}}_{({\text {I}}_m \,\,\, {\text {I}}_m \,\,\, \ldots \,\,\, {\text {I}}_m)^* } \end{aligned}$$
(3.26)

(cf. Definitions 3.6, 3.10, and 3.21).

Lemma 3.23

Let \(m, n \in {\mathbb {N}}\). Then

  1. (i)

    it holds that \({\mathfrak {T}}_{m, n} \in {\mathbf {N}}\),

  2. (ii)

    it holds that \( {\mathcal {D}}({\mathfrak {T}}_{m, n}) = (m, nm) \in {\mathbb {N}}^2\),

  3. (iii)

    it holds for all \(a \in C({\mathbb {R}}, {\mathbb {R}})\) that \({\mathcal {R}}_{a}({\mathfrak {T}}_{m, n}) \in C({\mathbb {R}}^{m}, {\mathbb {R}}^{nm})\), and

  4. (iv)

    it holds for all \(a \in C({\mathbb {R}}, {\mathbb {R}})\), \(x \in {\mathbb {R}}^m\) that

    $$\begin{aligned} ({\mathcal {R}}_{a}({\mathfrak {T}}_{m, n})) (x) = (x, x, \ldots , x) \end{aligned}$$
    (3.27)

(cf. Definitions 3.1, 3.3, and 3.22).

Proof of Lemma 3.23

Note that the fact that \({\mathfrak {T}}_{m, n} \in ({\mathbb {R}}^{(nm) \times m} \times {\mathbb {R}}^{nm})\) ensures that \({\mathfrak {T}}_{m, n} \in {\mathbf {N}}\) and \({\mathcal {D}}({\mathfrak {T}}_{m, n}) = (m, nm) \in {\mathbb {N}}^2\). This establishes items (i)–(ii). Next observe that items (iii)–(iv) in Lemma 3.11 prove that for all \(a \in C({\mathbb {R}}, {\mathbb {R}})\), \(x \in {\mathbb {R}}^m\) it holds that \({\mathcal {R}}_{a}({\mathfrak {T}}_{m, n}) \in C({\mathbb {R}}^{m}, {\mathbb {R}}^{nm})\) and

$$\begin{aligned} \begin{aligned} ({\mathcal {R}}_{a}({\mathfrak {T}}_{m, n})) (x)&= \big ({\mathcal {R}}_{a}\big ( {\mathfrak {W}}_{({\text {I}}_m \,\,\, {\text {I}}_m \,\,\, \ldots \,\,\, {\text {I}}_m)^* } \big ) \big )(x)\\&= ({\text {I}}_m \,\,\, {\text {I}}_m \,\,\, \ldots \,\,\, {\text {I}}_m)^{*} x = (x, x, \ldots , x) \end{aligned} \end{aligned}$$
(3.28)

(cf. Definitions 3.6 and 3.10). This establishes items (iii)–(iv). The proof of Lemma 3.23 is thus completed. \(\square \)
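Dually to \({\mathfrak {S}}_{m,n}\), the weight matrix of \({\mathfrak {T}}_{m,n}\) in (3.26) stacks \(n\) copies of \({\text {I}}_m\) vertically, so its realization copies the input \(n\) times as in (3.27). Informally (our notation):

```python
import numpy as np

def transfer_weight(m, n):
    """The weight of T_{m,n} = W_{(I_m I_m ... I_m)^*} (Definition
    3.22): an (nm) x m matrix sending x to the n-fold copy
    (x, x, ..., x)."""
    return np.vstack([np.eye(m)] * n)
```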

Lemma 3.24

Let \(n \in {\mathbb {N}}\), \(a \in C({\mathbb {R}}, {\mathbb {R}})\), \(\Phi \in {\mathbf {N}}\) (cf. Definition 3.1). Then

  1. (i)

    it holds that \({\mathcal {R}}_{a}({{\mathfrak {T}}_{{\mathcal {O}}(\Phi ), n} \bullet \Phi }) \in C({\mathbb {R}}^{{\mathcal {I}}(\Phi )}, {\mathbb {R}}^{n {\mathcal {O}}(\Phi )}) \) and

  2. (ii)

    it holds for all \(x \in {\mathbb {R}}^{{\mathcal {I}}(\Phi )}\) that

    $$\begin{aligned} \big ( {\mathcal {R}}_{a}({{\mathfrak {T}}_{{\mathcal {O}}(\Phi ), n} \bullet \Phi }) \big )(x) = \big (({\mathcal {R}}_{a}(\Phi ))(x), ({\mathcal {R}}_{a}(\Phi ))(x), \ldots , ({\mathcal {R}}_{a}(\Phi ))(x) \big ) \end{aligned}$$
    (3.29)

(cf. Definitions 3.3, 3.4, and 3.22).

Proof of Lemma 3.24

Note that Lemma 3.23 ensures that for all \(m \in {\mathbb {N}}\), \(x \in {\mathbb {R}}^m\) it holds that \({\mathcal {R}}_{a}({\mathfrak {T}}_{m, n}) \in C({\mathbb {R}}^{m}, {\mathbb {R}}^{nm})\) and

$$\begin{aligned} \begin{aligned} ({\mathcal {R}}_{a}({\mathfrak {T}}_{m, n})) (x) = (x, x, \ldots , x). \end{aligned} \end{aligned}$$
(3.30)

Combining this and, e.g., [25, item (v) in Proposition 2.6] establishes items (i)–(ii). The proof of Lemma 3.24 is thus completed. \(\square \)

Lemma 3.25

Let \(m, n \in {\mathbb {N}}\), \(a \in C({\mathbb {R}}, {\mathbb {R}})\), \(\Phi \in \{\Psi \in {\mathbf {N}}:{\mathcal {I}}(\Psi ) = nm\}\) (cf. Definition 3.1). Then

  1. (i)

    it holds that \({\mathcal {R}}_{a}({\Phi \bullet {\mathfrak {T}}_{m, n}}) \in C({\mathbb {R}}^{m}, {\mathbb {R}}^{{\mathcal {O}}(\Phi )}) \) and

  2. (ii)

    it holds for all \(x \in {\mathbb {R}}^{m}\) that

    $$\begin{aligned} \big ({\mathcal {R}}_{a}({\Phi \bullet {\mathfrak {T}}_{m, n}}) \big )(x) = ({\mathcal {R}}_{a}(\Phi ))(x, x, \ldots , x) \end{aligned}$$
    (3.31)

(cf. Definitions 3.3, 3.4, and 3.22).

Proof of Lemma 3.25

Observe that Lemma 3.23 demonstrates that for all \(x \in {\mathbb {R}}^m\) it holds that \({\mathcal {R}}_{a}({\mathfrak {T}}_{m, n}) \in C({\mathbb {R}}^{m}, {\mathbb {R}}^{nm})\) and

$$\begin{aligned} \begin{aligned} ({\mathcal {R}}_{a}({\mathfrak {T}}_{m, n})) (x) = (x, x, \ldots , x). \end{aligned} \end{aligned}$$
(3.32)

Combining this and, e.g., [25, item (v) in Proposition 2.6] establishes items (i)–(ii). The proof of Lemma 3.25 is thus completed. \(\square \)

Definition 3.26

(Sums of ANNs with the same length) Let \(n \in {\mathbb {N}}\), \(\Phi _1, \Phi _2, \ldots , \Phi _n \in {\mathbf {N}}\) satisfy for all \(k \in \{1, 2, \ldots , n\}\) that \({\mathcal {L}}(\Phi _k) = {\mathcal {L}}(\Phi _1)\), \({\mathcal {I}}(\Phi _k) = {\mathcal {I}}(\Phi _1)\), and \({\mathcal {O}}(\Phi _k) = {\mathcal {O}}(\Phi _1)\). Then we denote by \(\oplus _{k \in \{1, 2, \ldots , n\}} \Phi _k\) (we denote by \(\Phi _1 \oplus \Phi _2 \oplus \ldots \oplus \Phi _n\)) the tuple given by

$$\begin{aligned} \oplus _{k \in \{1, 2, \ldots , n\}} \Phi _k = \big ( {{\mathfrak {S}}_{{\mathcal {O}}(\Phi _1), n} \bullet {{\big [{\mathbf {P}}_n(\Phi _1,\Phi _2,\dots , \Phi _n)\big ] \bullet {\mathfrak {T}}_{{\mathcal {I}}(\Phi _1), n}}}} \big ) \in {\mathbf {N}}\end{aligned}$$
(3.33)

(cf. Definitions 3.1, 3.4, 3.5, 3.17, and 3.22).
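The sum construction (3.33) first copies the input with \({\mathfrak {T}}_{{\mathcal {I}}(\Phi _1), n}\), feeds the copies through the parallelization, and then adds the outputs with \({\mathfrak {S}}_{{\mathcal {O}}(\Phi _1), n}\). The following informal NumPy sketch (our notation; the activation is fixed to the rectifier for concreteness) assembles this pipeline and lets one verify the identity \(({\mathcal {R}}_a(\oplus _k \Phi _k))(x) = \sum _k ({\mathcal {R}}_a(\Phi _k))(x)\) of Lemma 3.28 numerically.

```python
import numpy as np

def relu(v):
    return np.maximum(v, 0.0)

def run(phi, x):
    """Realization R_relu(phi) at x (Definition 3.3 with a = ReLU)."""
    x = np.asarray(x, dtype=float)
    for W, B in phi[:-1]:
        x = relu(W @ x + B)
    W, B = phi[-1]
    return W @ x + B

def compose(p1, p2):
    """p1 • p2 (Definition 3.4)."""
    (W1, B1), (WL, BL) = p1[0], p2[-1]
    return p2[:-1] + [(W1 @ WL, W1 @ BL + B1)] + p1[1:]

def block_diag(ms):
    out = np.zeros((sum(m.shape[0] for m in ms),
                    sum(m.shape[1] for m in ms)))
    r = c = 0
    for m in ms:
        out[r:r + m.shape[0], c:c + m.shape[1]] = m
        r, c = r + m.shape[0], c + m.shape[1]
    return out

def parallelize(phis):
    """P_n(phis) (Definition 3.5); all phis must share one depth."""
    return [(block_diag([p[k][0] for p in phis]),
             np.concatenate([p[k][1] for p in phis]))
            for k in range(len(phis[0]))]

def oplus(phis):
    """oplus_k phis[k] = S_{O,n} • [P_n(phis)] • T_{I,n} (Def. 3.26)."""
    m_in = phis[0][0][0].shape[1]     # I(Phi_1)
    m_out = phis[0][-1][0].shape[0]   # O(Phi_1)
    n = len(phis)
    S = [(np.hstack([np.eye(m_out)] * n), np.zeros(m_out))]
    T = [(np.vstack([np.eye(m_in)] * n), np.zeros(n * m_in))]
    return compose(S, compose(parallelize(phis), T))
```

With \(p\) realizing \(x \mapsto x\) and \(q\) realizing \(x \mapsto 2x\) (both of architecture \((1,2,1)\) built from \({\mathfrak {I}}_1\)), the network \(p \oplus q\) has architecture \((1, 4, 1)\), matching items (i)–(ii) of Lemma 3.28, and realizes \(x \mapsto 3x\).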

Definition 3.27

(Dimensions of ANNs) Let \(n \in {\mathbb {N}}_0\). Then we denote by \({\mathbb {D}}_n :{\mathbf {N}}\rightarrow {\mathbb {N}}_0\) the function which satisfies for all \( L\in {\mathbb {N}}\), \(l_0,l_1,\ldots , l_L \in {\mathbb {N}}\), \( \Phi \in ( \times _{k = 1}^L({\mathbb {R}}^{l_k \times l_{k-1}} \times {\mathbb {R}}^{l_k}))\) that

$$\begin{aligned} \begin{aligned} {\mathbb {D}}_n (\Phi ) = {\left\{ \begin{array}{ll} l_n &{}:n \le L \\ 0 &{}:n > L \end{array}\right. } \end{aligned} \end{aligned}$$
(3.34)

(cf. Definition 3.1).

Lemma 3.28

Let \(n \in {\mathbb {N}}\), \(\Phi _1, \Phi _2, \ldots , \Phi _n \in {\mathbf {N}}\) satisfy for all \(k \in \{1, 2, \ldots , n\}\) that \({\mathcal {L}}(\Phi _k) = {\mathcal {L}}(\Phi _1)\), \({\mathcal {I}}(\Phi _k) = {\mathcal {I}}(\Phi _1)\), and \({\mathcal {O}}(\Phi _k) = {\mathcal {O}}(\Phi _1)\) (cf. Definition 3.1). Then

  1. (i)

    it holds that \( {\mathcal {L}}(\oplus _{k \in \{1, 2, \ldots , n\}} \Phi _k) = {\mathcal {L}}(\Phi _1)\),

  2. (ii)

    it holds that

    $$\begin{aligned}&{\mathcal {D}}(\oplus _{k \in \{1, 2, \ldots , n\}} \Phi _k) \nonumber \\&\quad = \big ({\mathcal {I}}(\Phi _1), \textstyle \sum _{k=1}^n {\mathbb {D}}_1(\Phi _k), \textstyle \sum _{k = 1}^n {\mathbb {D}}_2(\Phi _k), \ldots , \textstyle \sum _{k=1}^n {\mathbb {D}}_{{\mathcal {L}}(\Phi _1)-1}(\Phi _k), {\mathcal {O}}(\Phi _1)\big ), \end{aligned}$$
    (3.35)
  3. (iii)

    it holds for all \(a \in C({\mathbb {R}}, {\mathbb {R}})\) that \({\mathcal {R}}_{a}(\oplus _{k \in \{1, 2, \ldots , n\}} \Phi _k) \in C({\mathbb {R}}^{{\mathcal {I}}(\Phi _1)}, {\mathbb {R}}^{{\mathcal {O}}(\Phi _1)})\), and

  4. (iv)

    it holds for all \(a \in C({\mathbb {R}}, {\mathbb {R}})\), \(x \in {\mathbb {R}}^{{\mathcal {I}}(\Phi _1)}\) that

    $$\begin{aligned} \big ({\mathcal {R}}_{a} (\oplus _{k \in \{1, 2, \ldots , n\}} \Phi _k ) \big ) (x) = \sum _{k=1}^n ({\mathcal {R}}_a(\Phi _k))(x) \end{aligned}$$
    (3.36)

(cf. Definitions 3.3, 3.26, and 3.27).

Proof of Lemma 3.28

First, note that, e.g., [25, Lemma 2.18] proves that

$$\begin{aligned} \begin{aligned}&{\mathcal {D}}\big ( {\mathbf {P}}_n(\Phi _1,\Phi _2,\dots , \Phi _n) \big )\\&\quad = \big ( \textstyle \sum _{k=1}^n {\mathbb {D}}_0(\Phi _k), \textstyle \sum _{k = 1}^n {\mathbb {D}}_1(\Phi _k), \ldots , \textstyle \sum _{k=1}^n {\mathbb {D}}_{{\mathcal {L}}(\Phi _1)-1}(\Phi _k), \textstyle \sum _{k=1}^n {\mathbb {D}}_{{\mathcal {L}}(\Phi _1)}(\Phi _k)\big )\\&\quad = \big (n {\mathcal {I}}(\Phi _1), \textstyle \sum _{k=1}^n {\mathbb {D}}_1(\Phi _k), \textstyle \sum _{k = 1}^n {\mathbb {D}}_2(\Phi _k), \ldots , \textstyle \sum _{k=1}^n {\mathbb {D}}_{{\mathcal {L}}(\Phi _1)-1}(\Phi _k), n {\mathcal {O}}(\Phi _1)\big ) \end{aligned} \end{aligned}$$
(3.37)

(cf. Definition 3.5). Moreover, observe that item (ii) in Lemma 3.18 ensures that

$$\begin{aligned} {\mathcal {D}}\big ({\mathfrak {S}}_{{\mathcal {O}}(\Phi _1), n} \big ) = (n{\mathcal {O}}(\Phi _1), {\mathcal {O}}(\Phi _1)) \end{aligned}$$
(3.38)

(cf. Definition 3.17). This, (3.37), and, e.g., [25, item (i) in Proposition 2.6] demonstrate that

$$\begin{aligned} \begin{aligned}&{\mathcal {D}}\big ({{\mathfrak {S}}_{{\mathcal {O}}(\Phi _1), n} \bullet \big [{\mathbf {P}}_n(\Phi _1,\Phi _2,\dots , \Phi _n)\big ]} \big )\\&\quad = \big (n {\mathcal {I}}(\Phi _1), \textstyle \sum _{k=1}^n {\mathbb {D}}_1(\Phi _k), \textstyle \sum _{k = 1}^n {\mathbb {D}}_2(\Phi _k), \ldots , \textstyle \sum _{k=1}^n {\mathbb {D}}_{{\mathcal {L}}(\Phi _1)-1}(\Phi _k), {\mathcal {O}}(\Phi _1)\big ). \end{aligned} \end{aligned}$$
(3.39)

Next note that item (ii) in Lemma 3.23 assures that

$$\begin{aligned} {\mathcal {D}}\big ( {\mathfrak {T}}_{{\mathcal {I}}(\Phi _1), n}\big ) = ({\mathcal {I}}(\Phi _1), n {\mathcal {I}}(\Phi _1)) \end{aligned}$$
(3.40)

(cf. Definition 3.22). Combining this, (3.39), and, e.g., [25, item (i) in Proposition 2.6] proves that

$$\begin{aligned} \begin{aligned}&{\mathcal {D}}(\oplus _{k \in \{1, 2, \ldots , n\}} \Phi _k) \\&\quad = {\mathcal {D}}\big ( {{\mathfrak {S}}_{{\mathcal {O}}(\Phi _1), n} \bullet {{\big [{\mathbf {P}}_n(\Phi _1,\Phi _2,\dots , \Phi _n)\big ] \bullet {\mathfrak {T}}_{{\mathcal {I}}(\Phi _1), n}}}}\big )\\&\quad = \big ({\mathcal {I}}(\Phi _1), \textstyle \sum _{k=1}^n {\mathbb {D}}_1(\Phi _k), \textstyle \sum _{k = 1}^n {\mathbb {D}}_2(\Phi _k), \ldots , \textstyle \sum _{k=1}^n {\mathbb {D}}_{{\mathcal {L}}(\Phi _1)-1}(\Phi _k), {\mathcal {O}}(\Phi _1)\big ). \end{aligned} \end{aligned}$$
(3.41)

This establishes items (i)–(ii). Next observe that Lemma 3.25 and (3.37) ensure that for all \(a \in C({\mathbb {R}}, {\mathbb {R}})\), \(x \in {\mathbb {R}}^{{\mathcal {I}}(\Phi _1)}\) it holds that \({\mathcal {R}}_{a}({[{\mathbf {P}}_n(\Phi _1,\Phi _2,\dots , \Phi _n)] \bullet {\mathfrak {T}}_{{\mathcal {I}}(\Phi _1), n}}) \in C({\mathbb {R}}^{{\mathcal {I}}(\Phi _1)}, {\mathbb {R}}^{n {\mathcal {O}}(\Phi _1)}) \) and

$$\begin{aligned} \begin{aligned}&\big ({\mathcal {R}}_{a}\big ({[{\mathbf {P}}_n(\Phi _1,\Phi _2,\dots , \Phi _n)] \bullet {\mathfrak {T}}_{{\mathcal {I}}(\Phi _1), n}}\big )\big ) (x)\\&\quad = \big ({\mathcal {R}}_{a}\big ({\mathbf {P}}_n(\Phi _1,\Phi _2,\dots , \Phi _n)\big )\big )(x, x, \ldots , x). \end{aligned} \end{aligned}$$
(3.42)

Combining this with, e.g., [25, item (ii) in Proposition 2.19] proves that for all \(a \in C({\mathbb {R}}, {\mathbb {R}})\), \(x \in {\mathbb {R}}^{{\mathcal {I}}(\Phi _1)}\) it holds that

$$\begin{aligned} \begin{aligned}&\big ({\mathcal {R}}_{a}\big ({[{\mathbf {P}}_n(\Phi _1,\Phi _2,\dots , \Phi _n)] \bullet {\mathfrak {T}}_{{\mathcal {I}}(\Phi _1), n}}\big )\big ) (x)\\&\quad = \big ( ({\mathcal {R}}_{a}(\Phi _1))(x), ({\mathcal {R}}_{a}(\Phi _2))(x), \ldots , ({\mathcal {R}}_{a}(\Phi _n))(x) \big ) \in {\mathbb {R}}^{n {\mathcal {O}}(\Phi _1)}. \end{aligned} \end{aligned}$$
(3.43)

Lemma 3.19, (3.38), and, e.g., [25, Lemma 2.8] therefore demonstrate that for all \(a \in C({\mathbb {R}}, {\mathbb {R}})\), \(x \in {\mathbb {R}}^{{\mathcal {I}}(\Phi _1)}\) it holds that \({\mathcal {R}}_{a}(\oplus _{k \in \{1, 2, \ldots , n\}} \Phi _k) \in C({\mathbb {R}}^{{\mathcal {I}}(\Phi _1)}, {\mathbb {R}}^{{\mathcal {O}}(\Phi _1)})\) and

$$\begin{aligned} \begin{aligned}&\big ({\mathcal {R}}_{a} (\oplus _{k \in \{1, 2, \ldots , n\}} \Phi _k ) \big ) (x)\\&\quad = \big ({\mathcal {R}}_{a}\big ({{\mathfrak {S}}_{{\mathcal {O}}(\Phi _1), n} \bullet {{[{\mathbf {P}}_n(\Phi _1,\Phi _2,\dots , \Phi _n)] \bullet {\mathfrak {T}}_{{\mathcal {I}}(\Phi _1), n}}}}\big )\big ) (x) = \sum _{k=1}^n( {\mathcal {R}}_a(\Phi _k)) (x). \end{aligned}\nonumber \\ \end{aligned}$$
(3.44)

This establishes items (iii)–(iv). The proof of Lemma 3.28 is thus completed. \(\square \)

3.8 ANN representation results

Lemma 3.29

Let \( n \in {\mathbb {N}}\), \(h_1, h_2, \ldots , h_n \in {\mathbb {R}}\), \( \Phi _1, \Phi _2, \ldots , \Phi _n \in {\mathbf {N}}\) satisfy that \({\mathcal {D}}(\Phi _1) = {\mathcal {D}}(\Phi _2) = \ldots = {\mathcal {D}}(\Phi _n)\), let \(A_k \in {\mathbb {R}}^{{\mathcal {I}}(\Phi _1) \times (n {\mathcal {I}}(\Phi _1))}\), \(k \in \{1, 2, \ldots , n\}\), satisfy for all \(k \in \{1, 2, \ldots , n\}\), \(x = (x_i)_{i \in \{1, 2, \ldots , n\}} \in {\mathbb {R}}^{n {\mathcal {I}}(\Phi _1)}\) that \(A_k x = x_k\), and let \(\Psi \in {\mathbf {N}}\) satisfy that

$$\begin{aligned} \Psi = \oplus _{k \in \{1, 2, \ldots , n\}} ( h_k \circledast ( {\Phi _k \bullet {\mathfrak {W}}_{A_k}})) \end{aligned}$$
(3.45)

(cf. Definitions 3.1, 3.10, 3.13, and 3.26). Then

  1. (i)

    it holds that

    $$\begin{aligned} {\mathcal {D}}(\Psi ) = (n {\mathcal {I}}(\Phi _1), n{\mathbb {D}}_1(\Phi _1), n{\mathbb {D}}_2(\Phi _1), \ldots , n{\mathbb {D}}_{{\mathcal {L}}(\Phi _1)-1}(\Phi _1), {\mathcal {O}}(\Phi _1)), \end{aligned}$$
    (3.46)
  2. (ii)

    it holds that \({\mathcal {P}}(\Psi ) \le n^2 {\mathcal {P}}(\Phi _1)\),

  3. (iii)

    it holds for all \( a \in C({\mathbb {R}}, {\mathbb {R}})\) that \({\mathcal {R}}_{a}(\Psi ) \in C({\mathbb {R}}^{n {\mathcal {I}}(\Phi _1)}, {\mathbb {R}}^{{\mathcal {O}}(\Phi _1)})\), and

  4. (iv)

    it holds for all \( a \in C({\mathbb {R}}, {\mathbb {R}})\), \(x = (x_k)_{k \in \{1, 2, \ldots , n\}} \in {\mathbb {R}}^{n {\mathcal {I}}(\Phi _1)}\) that

    $$\begin{aligned} ({\mathcal {R}}_{a}(\Psi ))(x) = \sum _{k=1}^n h_k ({\mathcal {R}}_{a}(\Phi _k))(x_k) \end{aligned}$$
    (3.47)

(cf. Definitions 3.3 and 3.27).

Proof of Lemma 3.29

First, note that item (ii) in Lemma 3.11 ensures for all \(k \in \{1, 2, \ldots , n\}\) that

$$\begin{aligned} {\mathcal {D}}({\mathfrak {W}}_{A_k}) = (n {\mathcal {I}}(\Phi _1), {\mathcal {I}}(\Phi _1)) \in {\mathbb {N}}^2. \end{aligned}$$
(3.48)

This and, e.g., [25, item (i) in Proposition 2.6] prove for all \(k \in \{1, 2, \ldots , n\}\) that

$$\begin{aligned} {\mathcal {D}}( {\Phi _k \bullet {\mathfrak {W}}_{A_k}}) = (n {\mathcal {I}}(\Phi _1), {\mathbb {D}}_1(\Phi _k), {\mathbb {D}}_2(\Phi _k), \ldots , {\mathbb {D}}_{{\mathcal {L}}(\Phi _k)}(\Phi _k)). \end{aligned}$$
(3.49)

Item (i) in Lemma 3.14 therefore demonstrates for all \(k \in \{1, 2, \ldots , n\}\) that

$$\begin{aligned} \begin{aligned} {\mathcal {D}}( h_k \circledast ( {\Phi _k \bullet {\mathfrak {W}}_{A_k}}))&= {\mathcal {D}}({\Phi _k \bullet {\mathfrak {W}}_{A_k}}) \\&= (n {\mathcal {I}}(\Phi _1), {\mathbb {D}}_1(\Phi _k), {\mathbb {D}}_2(\Phi _k), \ldots , {\mathbb {D}}_{{\mathcal {L}}(\Phi _k)-1}(\Phi _k), {\mathcal {O}}(\Phi _k))\\&= (n {\mathcal {I}}(\Phi _1), {\mathbb {D}}_1(\Phi _1), {\mathbb {D}}_2(\Phi _1), \ldots , {\mathbb {D}}_{{\mathcal {L}}(\Phi _1)-1}(\Phi _1), {\mathcal {O}}(\Phi _1)). \end{aligned}\nonumber \\ \end{aligned}$$
(3.50)

Combining this with item (ii) in Lemma 3.28 ensures that

$$\begin{aligned} \begin{aligned} {\mathcal {D}}(\Psi )&= {\mathcal {D}}\big (\! \oplus _{k \in \{1, 2, \ldots , n\}} ( h_k \circledast ( {\Phi _k \bullet {\mathfrak {W}}_{A_k}}))\big )\\&= (n {\mathcal {I}}(\Phi _1), n{\mathbb {D}}_1(\Phi _1), n{\mathbb {D}}_2(\Phi _1), \ldots , n{\mathbb {D}}_{{\mathcal {L}}(\Phi _1)-1}(\Phi _1), {\mathcal {O}}(\Phi _1)). \end{aligned} \end{aligned}$$
(3.51)

This establishes item (i). Since item (i) ensures that every layer of \(\Psi \) is at most \(n\) times as wide as the corresponding layer of \(\Phi _1\), we hence obtain that

$$\begin{aligned} {\mathcal {P}}(\Psi ) \le n^2 {\mathcal {P}}(\Phi _1). \end{aligned}$$
(3.52)

This establishes item (ii). Moreover, observe that items (iii)–(iv) in Lemma 3.12 assure for all \(k \in \{1, 2, \ldots , n\}\), \(a \in C({\mathbb {R}}, {\mathbb {R}})\), \(x = (x_i)_{i \in \{1, 2, \ldots , n\}} \in {\mathbb {R}}^{n {\mathcal {I}}(\Phi _1)}\) that \({\mathcal {R}}_{a}( {\Phi _k \bullet {\mathfrak {W}}_{A_k}}) \in C({\mathbb {R}}^{n {\mathcal {I}}(\Phi _1)}, {\mathbb {R}}^{{\mathcal {O}}(\Phi _k)})\) and

$$\begin{aligned} \big ({\mathcal {R}}_{a}( {\Phi _k \bullet {\mathfrak {W}}_{A_k}})\big ) (x) = ({\mathcal {R}}_{a}(\Phi _k))(A_k x) = ({\mathcal {R}}_{a}(\Phi _k)) (x_k). \end{aligned}$$
(3.53)

Combining this with items (ii)–(iii) in Lemma 3.14 proves for all \(k \in \{1, 2, \ldots , n\}\), \(a \in C({\mathbb {R}}, {\mathbb {R}})\), \(x = (x_i)_{i \in \{1, 2, \ldots , n\}} \in {\mathbb {R}}^{n {\mathcal {I}}(\Phi _1)}\) that \({\mathcal {R}}_{a}( h_k \circledast ( {\Phi _k \bullet {\mathfrak {W}}_{A_k}})) \in C({\mathbb {R}}^{n {\mathcal {I}}(\Phi _1)}, {\mathbb {R}}^{{\mathcal {O}}(\Phi _1)})\) and

$$\begin{aligned} \big ({\mathcal {R}}_{a}( h_k \circledast ( {\Phi _k \bullet {\mathfrak {W}}_{A_k}}))\big ) (x) = h_k ({\mathcal {R}}_{a}(\Phi _k)) (x_k). \end{aligned}$$
(3.54)

Items (iii)–(iv) in Lemma 3.28 and (3.50) hence ensure for all \(a \in C({\mathbb {R}}, {\mathbb {R}})\), \(x = (x_i)_{i \in \{1, 2, \ldots , n\}} \in {\mathbb {R}}^{n {\mathcal {I}}(\Phi _1)}\) that \({\mathcal {R}}_{a}(\Psi ) \in C({\mathbb {R}}^{n {\mathcal {I}}(\Phi _1)}, {\mathbb {R}}^{{\mathcal {O}}(\Phi _1)})\) and

$$\begin{aligned} \begin{aligned} ({\mathcal {R}}_{a}(\Psi ))(x)&= \big ( {\mathcal {R}}_{a}\big ( \! \oplus _{k \in \{1, 2, \ldots , n\}} ( h_k \circledast ( {\Phi _k \bullet {\mathfrak {W}}_{A_k}})) \big ) \big )(x)\\&= \sum _{k = 1}^n \big ({\mathcal {R}}_{a}( h_k \circledast ( {\Phi _k \bullet {\mathfrak {W}}_{A_k}}))\big ) (x) = \sum _{k=1}^n h_k ({\mathcal {R}}_{a}(\Phi _k))(x_k). \end{aligned} \end{aligned}$$
(3.55)

This establishes items (iii)–(iv). The proof of Lemma 3.29 is thus completed. \(\square \)
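Items (i) and (iv) of Lemma 3.29 can likewise be illustrated numerically. In the informal NumPy sketch below (ReLU activation; the helper name `scaled_block_sum` is ours) the projections \(A_k\) collapse into a block-diagonal first layer acting on the input blocks \(x_1, x_2, \ldots , x_n\), and the scalings \(h_k\) are folded into a summed output layer.

```python
import numpy as np

relu = lambda z: np.maximum(z, 0.0)

def realize(phi, x):
    # Realization of a network given as a list of (W, b) layers.
    for W, b in phi[:-1]:
        x = relu(W @ x + b)
    W, b = phi[-1]
    return W @ x + b

def block_diag(mats):
    rows = sum(m.shape[0] for m in mats)
    cols = sum(m.shape[1] for m in mats)
    out = np.zeros((rows, cols))
    i = j = 0
    for m in mats:
        out[i:i + m.shape[0], j:j + m.shape[1]] = m
        i += m.shape[0]
        j += m.shape[1]
    return out

def scaled_block_sum(nets, h):
    # Psi from Lemma 3.29: block-diagonal stacking routes the k-th input
    # block x_k to Phi_k (the effect of the projections A_k), and the output
    # layer scales the k-th output block by h_k before summing.
    d_out = nets[0][-1][0].shape[0]
    layers = []
    for layer in zip(*nets):  # requires identical architectures
        Ws, bs = zip(*layer)
        layers.append([block_diag(Ws), np.concatenate(bs)])
    S = np.hstack([hk * np.eye(d_out) for hk in h])
    layers[-1][0] = S @ layers[-1][0]
    layers[-1][1] = S @ layers[-1][1]
    return [tuple(l) for l in layers]

rng = np.random.default_rng(1)
def rand_net(dims):
    return [(rng.standard_normal((o, i)), rng.standard_normal(o))
            for i, o in zip(dims[:-1], dims[1:])]

n, d = 3, 4
nets = [rand_net([d, 6, 2]) for _ in range(n)]
h = [0.5, -2.0, 3.0]
x = rng.standard_normal(n * d)
psi = scaled_block_sum(nets, h)
# item (iv): (R_a(Psi))(x) = sum_k h_k (R_a(Phi_k))(x_k)
assert np.allclose(realize(psi, x),
                   sum(hk * realize(phi, xk)
                       for hk, phi, xk in zip(h, nets, x.reshape(n, d))))
# item (i): hidden widths are n-fold, the output width is that of Phi_1
assert [W.shape[0] for W, _ in psi] == [n * 6, 2]
```

The width pattern checked in the last line is the dimension statement (3.46) with \(n = 3\), \({\mathbb {D}}_1(\Phi _1) = 6\), and \({\mathcal {O}}(\Phi _1) = 2\).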

Lemma 3.30

Let \(a\in C({\mathbb {R}},{\mathbb {R}})\), \(L_1, L_2\in {\mathbb {N}}\), \({\mathbb {I}}, \Phi _1,\Phi _2\in {\mathbf {N}}\), \(d,{\mathfrak {i}}, l_{1,0},l_{1,1},\dots ,l_{1,L_1},l_{2,0}, l_{2,1},\dots ,l_{2,L_2}\in {\mathbb {N}}\) satisfy for all \(k\in \{1,2\}\), \(x\in {\mathbb {R}}^{d}\) that \(2\le {\mathfrak {i}}\le 2d\), \(l_{2,L_2-1}\le l_{1,L_1-1}+{\mathfrak {i}}\), \({\mathcal {D}}({\mathbb {I}}) = (d,{\mathfrak {i}},d)\), \(({\mathcal {R}}_{a}({\mathbb {I}}))(x)=x\), \({\mathcal {I}}(\Phi _k)={\mathcal {O}}(\Phi _k)=d\), and \({\mathcal {D}}(\Phi _k)=(l_{k,0},l_{k,1},\dots , l_{k,L_k})\) (cf. Definitions 3.1 and 3.3). Then there exists \(\Psi \in {\mathbf {N}}\) such that

  1. (i)

    it holds that \({\mathcal {R}}_{a}(\Psi )\in C({\mathbb {R}}^d,{\mathbb {R}}^d)\),

  2. (ii)

    it holds for all \(x\in {\mathbb {R}}^d\) that

    $$\begin{aligned} ({\mathcal {R}}_{a}(\Psi ))(x)=({\mathcal {R}}_{a}(\Phi _2))(x)+\big (({\mathcal {R}}_{a}(\Phi _1))\circ ({\mathcal {R}}_{a}(\Phi _2))\big )(x), \end{aligned}$$
    (3.56)
  3. (iii)

    it holds that

    $$\begin{aligned} {\mathbb {D}}_{{\mathcal {L}}(\Psi ) -1} (\Psi ) \le l_{1, L_1 -1} + {\mathfrak {i}}, \end{aligned}$$
    (3.57)

    and

  4. (iv)

    it holds that \({\mathcal {P}}(\Psi ) \le {\mathcal {P}}(\Phi _2)+\big [\tfrac{1}{2}{\mathcal {P}}({\mathbb {I}})+{\mathcal {P}}(\Phi _1)\big ]^{\!2}\)

(cf. Definitions 3.4 and 3.27).

Proof of Lemma 3.30

To prove items (i)–(iv) we distinguish between the case \(L_1=1\) and the case \(L_1 \in {\mathbb {N}}\cap [2, \infty )\). We first prove items (i)–(iv) in the case \(L_1=1\). Note that, e.g., [25, Proposition 2.30] (with \(a=a\), \(d=d\), \({\mathfrak {L}} = L_2\), \((\ell _0, \ell _1, \ldots , \ell _{{\mathfrak {L}}}) = (l_{2,0},l_{2,1}, \ldots , l_{2,L_2})\), \(\psi = \Phi _2\), \(\phi _n = \Phi _1\) for \(n \in {\mathbb {N}}_0\) in the notation of [25, Proposition 2.30]) implies that there exists \(\Psi \in {\mathbf {N}}\) such that

  1. (I)

    it holds that \({\mathcal {R}}_{a}(\Psi ) \in C({\mathbb {R}}^d, {\mathbb {R}}^d)\),

  2. (II)

    it holds for all \(x \in {\mathbb {R}}^d\) that

    $$\begin{aligned} ({\mathcal {R}}_{a}(\Psi ))(x)=({\mathcal {R}}_{a}(\Phi _2))(x)+\big (({\mathcal {R}}_{a}(\Phi _1))\circ ({\mathcal {R}}_{a}(\Phi _2))\big )(x), \end{aligned}$$
    (3.58)

    and

  3. (III)

    it holds that \({\mathcal {D}}(\Psi ) = {\mathcal {D}}(\Phi _2)\).

The hypothesis that \(l_{2,L_2-1}\le l_{1,L_1-1}+{\mathfrak {i}}\) hence ensures that

$$\begin{aligned} {\mathbb {D}}_{{\mathcal {L}}(\Psi ) -1} (\Psi ) = {\mathbb {D}}_{{\mathcal {L}}(\Phi _2) -1} (\Phi _2) = l_{2,L_2-1} \le l_{1, L_1 -1} + {\mathfrak {i}}. \end{aligned}$$
(3.59)

Moreover, note that (III) assures that

$$\begin{aligned} {\mathcal {P}}(\Psi ) = {\mathcal {P}}(\Phi _2) \le {\mathcal {P}}(\Phi _2)+\big [\tfrac{1}{2}{\mathcal {P}}({\mathbb {I}})+{\mathcal {P}}(\Phi _1)\big ]^{\!2}. \end{aligned}$$
(3.60)

Combining this with (I) and (3.59) establishes items (i)–(iv) in the case \(L_1=1\). We now prove items (i)–(iv) in the case \(L_1 \in {\mathbb {N}}\cap [2, \infty )\). Observe that, e.g., [25, Proposition 2.28] (with \(a=a\), \(L_1 = L_1\), \(L_2 = L_2\), \({\mathbb {I}} = {\mathbb {I}}\), \(\Phi _1 = \Phi _1\), \(\Phi _2 = \Phi _2\), \(d=d\), \({\mathfrak {i}} = {\mathfrak {i}}\), \((l_{1, 0}, l_{1, 1}, \ldots , l_{1, L_1}) = (l_{1, 0}, l_{1, 1}, \ldots , l_{1, L_1})\), \((l_{2, 0}, l_{2, 1}, \ldots , l_{2, L_2}) = (l_{2, 0}, l_{2, 1}, \ldots , l_{2, L_2})\) in the notation of [25, Proposition 2.28]) proves that there exists \(\Psi \in {\mathbf {N}}\) such that

  1. (a)

    it holds that \({\mathcal {R}}_{a}(\Psi ) \in C({\mathbb {R}}^d, {\mathbb {R}}^d)\),

  2. (b)

    it holds for all \(x\in {\mathbb {R}}^d\) that

    $$\begin{aligned} ({\mathcal {R}}_{a}(\Psi ))(x)=({\mathcal {R}}_{a}(\Phi _2))(x)+\big (({\mathcal {R}}_{a}(\Phi _1))\circ ({\mathcal {R}}_{a}(\Phi _2))\big )(x), \end{aligned}$$
    (3.61)
  3. (c)

    it holds that

    $$\begin{aligned} {\mathcal {D}}(\Psi )=(l_{2,0},l_{2,1},\dots , l_{2,L_2-1},l_{1,1}+{\mathfrak {i}},l_{1,2}+{\mathfrak {i}},\dots ,l_{1,L_1-1}+{\mathfrak {i}}, l_{1, L_1}), \end{aligned}$$
    (3.62)

    and

  4. (d)

    it holds that \({\mathcal {P}}(\Psi ) \le {\mathcal {P}}(\Phi _2)+\big [\tfrac{1}{2}{\mathcal {P}}({\mathbb {I}})+{\mathcal {P}}(\Phi _1)\big ]^{\!2}\).

This establishes items (i)–(iv) in the case \(L_1 \in {\mathbb {N}}\cap [2, \infty )\). The proof of Lemma 3.30 is thus completed. \(\square \)
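For ReLU activation the identity network \({\mathbb {I}}\) with \({\mathcal {D}}({\mathbb {I}}) = (d, {\mathfrak {i}}, d)\) can be realized with \({\mathfrak {i}} = 2d\) via \(y = \max \{y, 0\} - \max \{-y, 0\}\), and the construction behind Lemma 3.30 can then be sketched explicitly. The following NumPy snippet is an informal sanity check, assuming ReLU activation and \(L_1 \ge 2\) (the helper names are ours): the value \(({\mathcal {R}}_{a}(\Phi _2))(x)\) is carried past the hidden layers of \(\Phi _1\) by the width-\(2d\) identity and added back at the output, in line with items (ii) and (iii).

```python
import numpy as np

relu = lambda z: np.maximum(z, 0.0)

def realize(phi, x):
    # Realization of a network given as a list of (W, b) layers.
    for W, b in phi[:-1]:
        x = relu(W @ x + b)
    W, b = phi[-1]
    return W @ x + b

def block_diag2(A, B):
    out = np.zeros((A.shape[0] + B.shape[0], A.shape[1] + B.shape[1]))
    out[:A.shape[0], :A.shape[1]] = A
    out[A.shape[0]:, A.shape[1]:] = B
    return out

def shortcut(phi1, phi2, d):
    # Psi with R(Psi)(x) = R(phi2)(x) + R(phi1)(R(phi2)(x)), assuming ReLU
    # activation and at least one hidden layer in phi1 (L_1 >= 2).  The value
    # y = R(phi2)(x) is carried past the hidden layers of phi1 by the
    # width-2d ReLU identity y = relu(y) - relu(-y).
    V = np.vstack([np.eye(d), -np.eye(d)])   # y -> (y, -y), then ReLU
    U = np.hstack([np.eye(d), -np.eye(d)])   # (relu(y), relu(-y)) -> y
    layers = list(phi2[:-1])                 # hidden layers of phi2
    W2, b2 = phi2[-1]                        # output affine map of phi2
    W1, b1 = phi1[0]                         # first hidden layer of phi1
    layers.append((np.vstack([W1 @ W2, V @ W2]),
                   np.concatenate([W1 @ b2 + b1, V @ b2])))
    for W, b in phi1[1:-1]:                  # widen phi1's hidden layers by 2d
        layers.append((block_diag2(W, V @ U),
                       np.concatenate([b, np.zeros(2 * d)])))
    WL, bL = phi1[-1]
    layers.append((np.hstack([WL, U]), bL))  # add back the carried value y
    return layers

rng = np.random.default_rng(2)
def rand_net(dims):
    return [(rng.standard_normal((o, i)), rng.standard_normal(o))
            for i, o in zip(dims[:-1], dims[1:])]

d = 3
phi1 = rand_net([d, 5, 6, d])
phi2 = rand_net([d, 4, d])
psi = shortcut(phi1, phi2, d)
x = rng.standard_normal(d)
y = realize(phi2, x)
assert np.allclose(realize(psi, x), y + realize(phi1, y))
# last hidden width is l_{1, L_1 - 1} + 2d, matching item (iii) for i = 2d
assert psi[-2][0].shape[0] == 6 + 2 * d
```

The hidden widths of `psi` are \((l_{2,1}, l_{1,1} + 2d, l_{1,2} + 2d)\), which matches the dimension statement in item (c) for \({\mathfrak {i}} = 2d\).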

4 Kolmogorov partial differential equations (PDEs)

In this section we establish in Theorem 4.5 below the existence of DNNs which approximate solutions of suitable Kolmogorov PDEs without the curse of dimensionality. Moreover, in Corollary 4.6 below we specialize Theorem 4.5 to the case where for every \( d \in {\mathbb {N}}\) we have that the probability measure \( \nu _d \) appearing in Theorem 4.5 is the uniform distribution on the d-dimensional unit cube \( [0,1]^d \). In addition, in Corollary 4.7 below we specialize Theorem 4.5, roughly speaking, to the case where the constants \(\kappa \in (0, \infty )\), \({\mathfrak {e}}, {\mathfrak {d}}_1, {\mathfrak {d}}_2, \ldots , {\mathfrak {d}}_6 \in [0, \infty ) \), which we use to specify the regularity hypotheses in Theorem 4.5, are all equal in the sense that \(\kappa = {\mathfrak {e}}= {\mathfrak {d}}_1 = {\mathfrak {d}}_2= \ldots = {\mathfrak {d}}_6\).

Corollary 4.7 follows immediately from Theorem 4.5; moreover, Theorem 4.5 and Corollary 4.7 are slight generalizations of [36, Theorem 6.1] and [36, Theorem 1.1], respectively. In our proof of Theorem 4.5 we employ the DNN representation results in Lemmas 3.29 and 3.30 from Sect. 3 above as well as essentially well-known error estimates for the Monte Carlo Euler method, which we establish in Proposition 4.4 below. The proof of Proposition 4.4, in turn, employs the elementary error estimate results in Lemmas 4.1–4.3 below.

4.1 Error analysis for the Monte Carlo Euler method

Lemma 4.1

Let \(d, m \in {\mathbb {N}}\), \(\xi \in {\mathbb {R}}^d\), \(T \in (0, \infty )\), \(L_0, L_1, l \in [0, \infty )\), \(h \in (0, T]\), \(B \in {\mathbb {R}}^{d \times m}\), let \( \left\| \cdot \right\| \! :{\mathbb {R}}^d \rightarrow [0,\infty ) \) be the d-dimensional Euclidean norm, let \((\Omega , {\mathcal {F}}, {\mathbb {P}})\) be a probability space, let \(W :[0, T] \times \Omega \rightarrow {\mathbb {R}}^m\) be a standard Brownian motion, let \(f_0 :{\mathbb {R}}^d \rightarrow {\mathbb {R}}\) and \(f_1 :{\mathbb {R}}^d \rightarrow {\mathbb {R}}^d\) be functions, let \(\chi :[0, T] \rightarrow [0, T]\) be a function, assume for all \(t \in [0, T]\), \(x, y \in {\mathbb {R}}^d\) that

$$\begin{aligned}&|f_0(x) - f_0(y)| \le L_0 \!\left( 1+ \int _0^1 [r \Vert x\Vert + (1-r) \Vert y\Vert ]^l \, dr \right) \! \Vert x-y\Vert , \end{aligned}$$
(4.1)
$$\begin{aligned}&\Vert f_1(x) - f_1(y) \Vert \le L_1 \Vert x-y\Vert , \end{aligned}$$
(4.2)

and \(\chi (t) = \max (\{0, h, 2h, \ldots \} \cap [0, t])\), and let \( X, Y :[0,T] \times \Omega \rightarrow {\mathbb {R}}^d \) be stochastic processes with continuous sample paths which satisfy for all \( t \in [0,T] \) that \( Y_t = \xi + \int _0^t f_1\big ( Y_{ \chi ( s ) } \big ) \, ds + B W_t \) and

$$\begin{aligned} X_t = \xi + \int _0^t f_1( X_s ) \, ds + B W_t . \end{aligned}$$
(4.3)

Then it holds that

$$\begin{aligned}&\big |{\mathbb {E}}[f_0(X_T)] - {\mathbb {E}}[f_0(Y_T)] \big | \le (h/T)^{\nicefrac {1}{2}} e^{(l+3+2L_1+[l L_1 + 2 L_1 +2]T)} \max \{1, L_0\} \nonumber \\&\quad \cdot \Big [ \Vert \xi \Vert + 2 + \max \{1, \Vert f_1(0)\Vert \} \max \{1, T\} + \sqrt{(2\max \{l, 1\} -1) {\text {Trace}}(B^*B) T} \Big ]^{1+l} . \end{aligned}$$
(4.4)

Proof of Lemma 4.1

First, note that (4.2) proves that for all \(x \in {\mathbb {R}}^d\) it holds that

$$\begin{aligned} \Vert f_1(x) \Vert \le \Vert f_1(x) - f_1(0) \Vert + \Vert f_1(0)\Vert \le L_1 \Vert x\Vert + \Vert f_1(0)\Vert . \end{aligned}$$
(4.5)

This, (4.1), (4.2), and, e.g., [36, Proposition 4.2] (with \(d = d\), \(m = m\), \(\xi = \xi \), \(T = T\), \(c = L_1\), \(C = \Vert f_1(0)\Vert \), \(\varepsilon _0 = 0\), \(\varepsilon _1 = 0\), \(\varepsilon _2 = 0\), \(\varsigma _0 = 0\), \(\varsigma _1 = 0\), \(\varsigma _2 = 0\), \(L_0 = L_0\), \(L_1 = L_1\), \(l = l\), \(h = h\), \(B = B\), \(p = 2\), \(q = 2\), \(\left\| \cdot \right\| = \left\| \cdot \right\| \), \((\Omega , {\mathcal {F}}, {\mathbb {P}}) = (\Omega , {\mathcal {F}}, {\mathbb {P}})\), \(W = W\), \(\phi _0 = f_0\), \(f_1 = f_1\), \(\phi _2 = ({\mathbb {R}}^d \ni x \mapsto x \in {\mathbb {R}}^d)\), \(\chi = \chi \), \(f_0 = f_0\), \(\phi _1 = \phi _1\), \(\varpi _r = ( {\mathbb {E}}[ \Vert B W_T \Vert ^r])^{ \nicefrac {1}{r} }\), \(X = X\), \(Y = Y\) for \(r \in (0, \infty )\) in the notation of [36, Proposition 4.2]) establish that

$$\begin{aligned} \begin{aligned}&\big |{\mathbb {E}}[f_0(X_T)] - {\mathbb {E}}[f_0(Y_T)] \big | \le (h/T)^{\nicefrac {1}{2}} e^{(l+3+2L_1+[l L_1 + L_1 + L_1 +2]T)} \max \{1, L_0\} \\&\quad \cdot \Big [ \Vert \xi \Vert + 2 + \max \{1, \Vert f_1(0)\Vert \} \max \{1, T\} + \big ( {\mathbb {E}}\big [ \Vert B W_T \Vert ^{\max \{2, 2l\}} \big ] \big )^{ \nicefrac {1}{\max \{2, 2l\}} } \Big ]^{1+l}. \end{aligned} \end{aligned}$$
(4.6)

Combining this with, e.g., [36, Lemma 4.2] (with \(d = d\), \(m = m\), \(T = T\), \(p = \max \{2, 2l\}\), \(B = B\), \(\left\| \cdot \right\| = \left\| \cdot \right\| \), \((\Omega , {\mathcal {F}}, {\mathbb {P}}) = (\Omega , {\mathcal {F}}, {\mathbb {P}})\), \(W = W\) in the notation of [36, Lemma 4.2]) ensures that

$$\begin{aligned}&\big |{\mathbb {E}}[f_0(X_T)] - {\mathbb {E}}[f_0(Y_T)] \big | \le (h/T)^{\nicefrac {1}{2}} e^{(l+3+2L_1+[l L_1 + 2 L_1 +2]T)} \max \{1, L_0\} \nonumber \\&\quad \cdot \Big [ \Vert \xi \Vert + 2 + \max \{1, \Vert f_1(0)\Vert \} \max \{1, T\} + \sqrt{(2\max \{l, 1\} -1) {\text {Trace}}(B^*B) T} \Big ]^{1+l} . \end{aligned}$$
(4.7)

The proof of Lemma 4.1 is thus completed. \(\square \)
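The time-discrete process \(Y\) in Lemma 4.1 is the Euler–Maruyama scheme with step size \(h\): the function \(\chi \) freezes the drift argument at the most recent grid point in \(\{0, h, 2h, \ldots \}\). The following minimal Python sketch implements only this discretization, not the error analysis; by (4.4) the weak error against the exact process \(X\) is of order \(h^{\nicefrac {1}{2}}\).

```python
import numpy as np

def euler_maruyama(f1, B, xi, T, h, dW):
    # The time-discrete process from Lemma 4.1: between grid points the drift
    # is frozen at the most recent grid value (the role of chi), so
    # Y_{t_{k+1}} = Y_{t_k} + f1(Y_{t_k}) * dt + B @ (Brownian increment).
    # dW must contain ceil(T / h) increments; the last step may be shorter.
    t = 0.0
    Y = np.asarray(xi, dtype=float)
    for k in range(len(dW)):
        dt = min(h, T - t)
        Y = Y + f1(Y) * dt + B @ dW[k]
        t += dt
    return Y

# deterministic sanity check: zero noise and the linear drift f1(y) = c * y
# reduce the scheme to the classical explicit Euler recursion
c, T, h, N = -0.5, 1.0, 0.25, 4
B = np.zeros((1, 2))
dW = np.zeros((N, 2))
Y = euler_maruyama(lambda y: c * y, B, np.array([1.0]), T, h, dW)
assert np.allclose(Y, (1 + c * h) ** N)
```

In an actual simulation the rows of `dW` would be independent \({\mathcal {N}}(0, \Delta t \cdot \mathrm {Id})\) increments of the driving Brownian motion.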

Lemma 4.2

Let \(d, m \in {\mathbb {N}}\), \(T, \kappa \in (0, \infty )\), \(\theta , {\mathfrak {d}}_0, {\mathfrak {d}}_1 \in [0, \infty )\), \(h \in (0, T]\), \(B \in {\mathbb {R}}^{d \times m}\), \(p \in [1, \infty )\), let \(\nu :{\mathcal {B}}({\mathbb {R}}^d) \rightarrow [0,1] \) be a probability measure on \({\mathbb {R}}^d\), let \( \left\| \cdot \right\| \! :{\mathbb {R}}^d \rightarrow [0,\infty ) \) be the d-dimensional Euclidean norm, let \((\Omega , {\mathcal {F}}, {\mathbb {P}})\) be a probability space, let \(W :[0, T] \times \Omega \rightarrow {\mathbb {R}}^m\) be a standard Brownian motion, let \(f_0 :{\mathbb {R}}^d \rightarrow {\mathbb {R}}\) and \(f_1 :{\mathbb {R}}^d \rightarrow {\mathbb {R}}^d\) be functions, let \(\chi :[0, T] \rightarrow [0, T]\) be a function, assume for all \(t \in [0, T]\), \(x, y \in {\mathbb {R}}^d\) that

$$\begin{aligned}&|f_0(x) - f_0(y)| \le \kappa d^{{\mathfrak {d}}_0} (1 + \Vert x\Vert ^{\theta } + \Vert y\Vert ^{\theta }) \Vert x-y\Vert , \end{aligned}$$
(4.8)
$$\begin{aligned}&\Vert f_1(x) - f_1(y)\Vert \le \kappa \Vert x -y\Vert , \qquad {\text {Trace}}(B^* B) \le \kappa d^{2 {\mathfrak {d}}_1}, \end{aligned}$$
(4.9)
$$\begin{aligned}&\Vert f_1(0)\Vert \le \kappa d^{{\mathfrak {d}}_1}, \qquad \left[ \int _{{\mathbb {R}}^d} \Vert z\Vert ^{p(1+\theta )} \, \nu (dz) \right] ^{\nicefrac {1}{(p (1+\theta ))}} \le \kappa d^{{\mathfrak {d}}_1}, \end{aligned}$$
(4.10)

and \(\chi (t) = \max (\{0, h, 2h, \ldots \} \cap [0, t])\), and let \( X^x:[0,T] \times \Omega \rightarrow {\mathbb {R}}^d \), \(x \in {\mathbb {R}}^d\), and \(Y^x :[0,T] \times \Omega \rightarrow {\mathbb {R}}^d \), \(x \in {\mathbb {R}}^d\), be stochastic processes with continuous sample paths which satisfy for all \(x \in {\mathbb {R}}^d\), \( t \in [0,T] \) that \( Y_t^x = x + \int _0^t f_1\big ( Y^x_{ \chi ( s ) } \big ) \, ds + B W_t \) and

$$\begin{aligned} X^x_t = x + \int _0^t f_1( X^x_s ) \, ds + B W_t . \end{aligned}$$
(4.11)

Then it holds that

$$\begin{aligned} \begin{aligned}&\left[ \int _{{\mathbb {R}}^d} \big |{\mathbb {E}}[f_0(X^x_T)] - {\mathbb {E}}[f_0(Y^x_T)] \big |^p \, \nu (dx) \right] ^{\nicefrac {1}{p}} \le 2^{4\theta +5} | \!\max \{1, T\} |^{\theta +1}\\&\qquad \cdot |\!\max \{ \kappa , \theta , 1 \}|^{\theta +3} e^{(6\max \{ \kappa , \theta , 1 \}+5|\!\max \{ \kappa , \theta , 1 \}|^2 T)} d^{{\mathfrak {d}}_0 + {\mathfrak {d}}_1(\theta +1)} (h/T)^{\nicefrac {1}{2}}. \end{aligned} \end{aligned}$$
(4.12)

Proof of Lemma 4.2

Throughout this proof let \( \iota = \max \{ \kappa , \theta , 1 \} \). Note that (4.8) proves that for all \(x, y \in {\mathbb {R}}^d\) it holds that

$$\begin{aligned} \begin{aligned}&|f_0(x) - f_0(y)| \le \kappa d^{{\mathfrak {d}}_0} (1 + \Vert x\Vert ^{\theta } + \Vert y \Vert ^{\theta } )\Vert x-y\Vert \\&\quad \le \kappa d^{{\mathfrak {d}}_0} \! \left( 1 + 2 (\theta +1) \int _0^1 \big [r \Vert x\Vert + (1-r) \Vert y\Vert \big ]^{\theta } \, dr \right) \! \Vert x-y\Vert \\&\quad \le 2 \kappa (\theta +1 ) d^{{\mathfrak {d}}_0} \! \left( 1 + \int _0^1 \big [r \Vert x\Vert + (1-r) \Vert y\Vert \big ]^{\theta } \, dr \right) \! \Vert x-y\Vert . \end{aligned} \end{aligned}$$
(4.13)

Lemma 4.1 (with \(d = d\), \(m = m\), \(\xi = x\), \(T =T\), \(L_0 = 2\kappa (\theta +1) d^{{\mathfrak {d}}_0}\), \(L_1 = \kappa \), \(l = \theta \), \(h = h\), \(B = B\), \(\left\| \cdot \right\| = \left\| \cdot \right\| \), \((\Omega , {\mathcal {F}}, {\mathbb {P}}) = (\Omega , {\mathcal {F}}, {\mathbb {P}})\), \(W = W\), \(f_0 = f_0\), \(f_1 = f_1\), \(\chi = \chi \), \(X = X^x\), \(Y = Y^x\) for \(x \in {\mathbb {R}}^d\) in the notation of Lemma 4.1), (4.10), and (4.9) hence ensure that for all \(x \in {\mathbb {R}}^d\) it holds that

$$\begin{aligned}&\big |{\mathbb {E}}[f_0(X^x_T)] - {\mathbb {E}}[f_0(Y^x_T)] \big | \le (h/T)^{\nicefrac {1}{2}} e^{(\theta +3+2\kappa +[\theta \kappa + 2 \kappa +2]T)} \max \{1, 2 \kappa (\theta +1) d^{{\mathfrak {d}}_0}\}\nonumber \\&\qquad \cdot \Big [ \Vert x\Vert + 2 + \max \{1, \Vert f_1(0)\Vert \} \max \{1, T\} + \sqrt{(2\max \{\theta , 1\} -1) {\text {Trace}}(B^*B) T} \Big ]^{1+\theta }\nonumber \\&\quad \le (h/T)^{\nicefrac {1}{2}} e^{(\theta +3+2\kappa +[\theta \kappa + 2 \kappa +2]T)} \max \{1, 2 \kappa (\theta +1) d^{{\mathfrak {d}}_0}\} \nonumber \\&\qquad \cdot \Big [ \Vert x\Vert + 2 + \max \{1, \kappa d^{{\mathfrak {d}}_1} \} \max \{1, T\} + \sqrt{(2\max \{\theta , 1\} -1) \kappa d^{2 {\mathfrak {d}}_1} T} \Big ]^{1+\theta }. \end{aligned}$$
(4.14)

Therefore, we obtain that for all \(x \in {\mathbb {R}}^d\) it holds that

$$\begin{aligned}&\big |{\mathbb {E}}[f_0(X^x_T)] - {\mathbb {E}}[f_0(Y^x_T)] \big |\nonumber \\&\quad \le 2 \iota (\iota +1) d^{{\mathfrak {d}}_0} (h/T)^{\nicefrac {1}{2}} e^{(6\iota +5\iota ^2 T)} \big [ \Vert x\Vert + 2 + \iota d^{{\mathfrak {d}}_1}\max \{1, T\} + \sqrt{(2\iota -1) \kappa d^{2 {\mathfrak {d}}_1} T} \big ]^{1+\theta } \nonumber \\&\quad \le 4 \iota ^2 d^{{\mathfrak {d}}_0} (h/T)^{\nicefrac {1}{2}} e^{(6\iota +5\iota ^2 T)} \big [ \Vert x\Vert + 2 + \iota d^{{\mathfrak {d}}_1}\max \{1, T\} + \sqrt{2\iota \kappa d^{2 {\mathfrak {d}}_1} T} \big ]^{1+\theta }\nonumber \\&\quad \le 4 \iota ^2 d^{{\mathfrak {d}}_0} (h/T)^{\nicefrac {1}{2}} e^{(6\iota +5\iota ^2 T)} \big [ \Vert x\Vert + 2 + 3 \iota d^{{\mathfrak {d}}_1}\max \{1, T\} \big ]^{1+\theta }\nonumber \\&\quad \le 4 \iota ^2 d^{{\mathfrak {d}}_0} (h/T)^{\nicefrac {1}{2}} e^{(6\iota +5\iota ^2 T)} \big [ \Vert x\Vert + 5 \iota d^{{\mathfrak {d}}_1}\max \{1, T\} \big ]^{1+\theta }. \end{aligned}$$
(4.15)

This establishes that

$$\begin{aligned}&\left[ \int _{{\mathbb {R}}^d} \big |{\mathbb {E}}[f_0(X^x_T)] - {\mathbb {E}}[f_0(Y^x_T)] \big |^p \, \nu (dx) \right] ^{\nicefrac {1}{p}} \nonumber \\&\quad \le 4 \iota ^2 d^{{\mathfrak {d}}_0} (h/T)^{\nicefrac {1}{2}} e^{(6\iota +5\iota ^2 T)} \left[ \int _{{\mathbb {R}}^d} \big [ \Vert x\Vert + 5 \iota d^{{\mathfrak {d}}_1}\max \{1, T\} \big ]^{p(1+\theta )} \, \nu (dx) \right] ^{\nicefrac {1}{p}}\nonumber \\&\quad \le 4 \iota ^2 d^{{\mathfrak {d}}_0} (h/T)^{\nicefrac {1}{2}} e^{(6\iota +5\iota ^2 T)} \left[ \int _{{\mathbb {R}}^d} \big [ 2^{\theta } \Vert x\Vert ^{1+\theta } + 2^{\theta } ( 5 \iota d^{{\mathfrak {d}}_1} \max \{1, T\} )^{1+\theta } \big ]^{p} \, \nu (dx) \right] ^{\nicefrac {1}{p}}\nonumber \\&\quad \le 2^{\theta +2} \iota ^2 d^{{\mathfrak {d}}_0} (h/T)^{\nicefrac {1}{2}} e^{(6\iota +5\iota ^2 T)} \left[ \left[ \int _{{\mathbb {R}}^d} \Vert x\Vert ^{p(1+\theta )} \, \nu (dx) \right] ^{\nicefrac {1}{p}} + ( 5 \iota d^{{\mathfrak {d}}_1} \max \{1, T\} )^{1+\theta } \right] .\nonumber \\ \end{aligned}$$
(4.16)

Combining this and (4.10) assures that

$$\begin{aligned} \begin{aligned}&\left[ \int _{{\mathbb {R}}^d} \big |{\mathbb {E}}[f_0(X^x_T)] - {\mathbb {E}}[f_0(Y^x_T)] \big |^p \, \nu (dx) \right] ^{\nicefrac {1}{p}} \\&\quad \le 2^{\theta +2} \iota ^2 d^{{\mathfrak {d}}_0} (h/T)^{\nicefrac {1}{2}} e^{(6\iota +5\iota ^2 T)} \big [ \kappa ^{1+\theta } d^{{\mathfrak {d}}_1(1+\theta )} + ( 5 \iota d^{{\mathfrak {d}}_1} \max \{1, T\} )^{1+\theta } \big ]\\&\quad \le 2^{\theta +2} ( 6 \iota \max \{1, T\} )^{\theta +1} \iota ^2 d^{{\mathfrak {d}}_0 + {\mathfrak {d}}_1(\theta +1)} (h/T)^{\nicefrac {1}{2}} e^{(6\iota +5\iota ^2 T)} \\&\quad \le 2^{\theta +2 + 3(\theta +1)} |\! \max \{1, T\} |^{\theta +1} \iota ^{2 +\theta +1} d^{{\mathfrak {d}}_0 + {\mathfrak {d}}_1(\theta +1)} (h/T)^{\nicefrac {1}{2}} e^{(6\iota +5\iota ^2 T)} \\&\quad \le 2^{4\theta +5} |\! \max \{1, T\} |^{\theta +1} \iota ^{\theta +3} e^{(6\iota +5\iota ^2 T)} d^{{\mathfrak {d}}_0 + {\mathfrak {d}}_1(\theta +1)} (h/T)^{\nicefrac {1}{2}}. \end{aligned} \end{aligned}$$
(4.17)

The proof of Lemma 4.2 is thus completed. \(\square \)

Lemma 4.3

Let \(d, M, n \in {\mathbb {N}}\), \(T, \kappa , \theta \in (0, \infty )\), \({\mathfrak {d}}_0, {\mathfrak {d}}_1 \in [0, \infty )\), \(B \in {\mathbb {R}}^{d \times n}\), \(p \in [2, \infty )\), let \(\nu :{\mathcal {B}}({\mathbb {R}}^d) \rightarrow [0,1] \) be a probability measure on \({\mathbb {R}}^d\), let \( \left\| \cdot \right\| \! :{\mathbb {R}}^d \rightarrow [0,\infty ) \) be the d-dimensional Euclidean norm, let \((\Omega , {\mathcal {F}}, {\mathbb {P}})\) be a probability space, let \(W^m :[0, T] \times \Omega \rightarrow {\mathbb {R}}^n\), \(m \in \{1, 2, \ldots , M\}\), be independent standard Brownian motions, let \(f_0 :{\mathbb {R}}^d \rightarrow {\mathbb {R}}\) be \({\mathcal {B}}({\mathbb {R}}^d) /{\mathcal {B}}({\mathbb {R}})\)-measurable, let \(f_1 :{\mathbb {R}}^d \rightarrow {\mathbb {R}}^d\) be \({\mathcal {B}}({\mathbb {R}}^d) /{\mathcal {B}}({\mathbb {R}}^d)\)-measurable, let \(\chi :[0, T] \rightarrow [0, T]\) be \({\mathcal {B}}([0,T]) /{\mathcal {B}}([0, T])\)-measurable, assume for all \(t \in [0, T]\), \(x \in {\mathbb {R}}^d\) that

$$\begin{aligned} |f_0(x)| \le \kappa d^{{\mathfrak {d}}_0} (d^{{\mathfrak {d}}_1 \theta } + \Vert x\Vert ^{\theta }), \qquad \Vert f_1(x)\Vert \le \kappa (d^{{\mathfrak {d}}_1} + \Vert x\Vert ), \end{aligned}$$
(4.18)
$$\begin{aligned} {\text {Trace}}(B^* B) \le \kappa d^{2 {\mathfrak {d}}_1}, \qquad \left[ \int _{{\mathbb {R}}^d} \Vert z\Vert ^{p\theta } \, \nu (dz) \right] ^{\nicefrac {1}{(p \theta )}} \le \kappa d^{{\mathfrak {d}}_1}, \end{aligned}$$
(4.19)

and \(\chi (t) \le t\), and let \( Y^{m, x} :[0,T] \times \Omega \rightarrow {\mathbb {R}}^d \), \(m \in \{1, 2, \ldots , M\}\), \(x \in {\mathbb {R}}^d\), be stochastic processes with continuous sample paths which satisfy for all \(x \in {\mathbb {R}}^d\), \(m \in \{1, 2, \ldots , M\}\), \( t \in [0,T] \) that

$$\begin{aligned} Y_t^{m, x} = x + \int _0^t f_1\big ( Y^{m, x}_{ \chi ( s ) } \big ) \, ds + B W_t^m. \end{aligned}$$
(4.20)

Then it holds that

$$\begin{aligned} \begin{aligned}&\left( {\mathbb {E}}\!\left[ \int _{{\mathbb {R}}^d} \Big | {\mathbb {E}}[f_0(Y^{1, x}_T)] - \tfrac{1}{M} \Big [ \textstyle \sum \nolimits _{m=1}^M \displaystyle f_0(Y^{m, x}_T) \Big ] \Big |^p \, \nu (dx) \right] \right) ^{\!\nicefrac {1}{p}}\\&\quad \le 2^{\theta +2} p \kappa (p \theta +p +1)^{\theta } (\kappa T +1)^{\theta } e^{\kappa \theta T} (\kappa ^{\theta } +1) d^{{\mathfrak {d}}_0 + {\mathfrak {d}}_1 \theta } M^{-\nicefrac {1}{2}}. \end{aligned} \end{aligned}$$
(4.21)

Proof of Lemma 4.3

Throughout this proof let \( \iota = \max \{ \theta , 1 \} \). Note that (4.18) and, e.g., [36, Lemma 4.1] (with \(d =d\), \(m =n\), \(\xi = x\), \(p =q\), \(c = \kappa \), \(C= \kappa d^{{\mathfrak {d}}_1}\), \(T =T\), \(B = B\), \(\left\| \cdot \right\| = \left\| \cdot \right\| \), \((\Omega , {\mathcal {F}}, {\mathbb {P}}) = (\Omega , {\mathcal {F}}, {\mathbb {P}})\), \(W = W^1\), \(\mu = f_1\), \(\chi = \chi \), \(X = Y^{1, x}\) for \(q \in [1, \infty )\), \(x \in {\mathbb {R}}^d\) in the notation of [36, Lemma 4.1]) prove that for all \(q \in [1, \infty )\), \(x \in {\mathbb {R}}^d\) it holds that

$$\begin{aligned} \begin{aligned} \big ({\mathbb {E}}\big [ \Vert Y^{1, x}_T \Vert ^q \big ]\big )^{\nicefrac {1}{q}} \le \Big ( \Vert x\Vert + \kappa d^{{\mathfrak {d}}_1}T + \big ({\mathbb {E}}\big [ \Vert B W^1_T \Vert ^q \big ]\big )^{\nicefrac {1}{q}} \Big ) e^{\kappa T}. \end{aligned} \end{aligned}$$
(4.22)

This, (4.19), and, e.g., [36, Lemma 4.2] (with \(d = d\), \(m = n\), \(T = T\), \(p = q\), \(B = B\), \(\left\| \cdot \right\| = \left\| \cdot \right\| \), \((\Omega , {\mathcal {F}}, {\mathbb {P}}) = (\Omega , {\mathcal {F}}, {\mathbb {P}})\), \(W = W^1\) for \(q \in [1, \infty )\) in the notation of [36, Lemma 4.2]) ensure that for all \(q \in [1, \infty )\), \(x \in {\mathbb {R}}^d\) it holds that

$$\begin{aligned} \begin{aligned} \Big ({\mathbb {E}}\big [ \Vert Y^{1, x}_T \Vert ^q \big ]\Big )^{\!\nicefrac {1}{q}}&\le \Big ( \Vert x\Vert + \kappa d^{{\mathfrak {d}}_1}T +\sqrt{\max \{1, q-1\} {\text {Trace}}(B^* B)T} \Big ) e^{\kappa T}\\&\le \Big ( \Vert x\Vert + \kappa d^{{\mathfrak {d}}_1}T +\sqrt{\max \{1, q-1\} \kappa d^{2 {\mathfrak {d}}_1} T} \Big ) e^{\kappa T}\\&\le \big ( \Vert x\Vert + \kappa d^{{\mathfrak {d}}_1}T +q \max \{\kappa T, 1\} d^{{\mathfrak {d}}_1} \big ) e^{\kappa T}\\&\le \big ( \Vert x\Vert + (q+1) \max \{\kappa T, 1\} d^{{\mathfrak {d}}_1} \big ) e^{\kappa T}. \end{aligned} \end{aligned}$$
(4.23)

Combining this with (4.18) and Hölder’s inequality establishes for all \(x \in {\mathbb {R}}^d\) that

$$\begin{aligned} \begin{aligned} \Big ({\mathbb {E}}\big [ |f_{0} (Y^{1, x}_T ) |^p \big ]\Big )^{\!\nicefrac {1}{p}}&\le \kappa d^{{\mathfrak {d}}_0 + {\mathfrak {d}}_1 \theta } + \kappa d^{{\mathfrak {d}}_0} \Big ({\mathbb {E}}\big [ \Vert Y^{1, x}_T \Vert ^{p \theta } \big ]\Big )^{\!\nicefrac {1}{p}}\\&\le \kappa d^{{\mathfrak {d}}_0 + {\mathfrak {d}}_1 \theta } + \kappa d^{{\mathfrak {d}}_0} \Big ({\mathbb {E}}\big [ \Vert Y^{1, x}_T \Vert ^{p \iota } \big ]\Big )^{\!\nicefrac {\theta }{(p \iota )}}\\&\le \kappa d^{{\mathfrak {d}}_0 + {\mathfrak {d}}_1 \theta } + \kappa d^{{\mathfrak {d}}_0} \big ( \Vert x\Vert + (p \iota +1) \max \{\kappa T, 1\} d^{{\mathfrak {d}}_1} \big )^{\theta } e^{\kappa \theta T}\\&\le \kappa d^{{\mathfrak {d}}_0 + {\mathfrak {d}}_1 \theta } + \kappa d^{{\mathfrak {d}}_0} \big ( \Vert x\Vert + (p \iota +1) (\kappa T +1) d^{{\mathfrak {d}}_1} \big )^{\theta } e^{\kappa \theta T}. \end{aligned} \end{aligned}$$
(4.24)

The fact that \( \forall \, y, z \in {\mathbb {R}}, \alpha \in [0, \infty ) :|y + z|^{\alpha } \le 2^{\alpha }(|y|^{\alpha } + |z|^{\alpha })\) hence proves for all \(x \in {\mathbb {R}}^d\) that

$$\begin{aligned} \begin{aligned} \Big ({\mathbb {E}}\big [ |f_{0} (Y^{1, x}_T ) |^p \big ]\Big )^{\!\nicefrac {1}{p}}&\le \kappa d^{{\mathfrak {d}}_0 + {\mathfrak {d}}_1 \theta } + 2^{\theta } \kappa d^{{\mathfrak {d}}_0} \big ( \Vert x\Vert ^{\theta } + (p \iota +1)^{\theta } (\kappa T +1)^{\theta } d^{{\mathfrak {d}}_1 \theta } \big ) e^{\kappa \theta T}\\&\le 2^{\theta } \kappa (p \iota +1)^{\theta } (\kappa T +1)^{\theta } d^{{\mathfrak {d}}_0} e^{\kappa \theta T} \big ( \Vert x\Vert ^{\theta } + 2 d^{{\mathfrak {d}}_1 \theta } \big )\\&\le 2^{\theta +1} \kappa (p \iota +1)^{\theta } (\kappa T +1)^{\theta } d^{{\mathfrak {d}}_0} e^{\kappa \theta T} \big ( \Vert x\Vert ^{\theta } + d^{{\mathfrak {d}}_1 \theta } \big ). \end{aligned} \end{aligned}$$
(4.25)

This implies that for all \(x \in {\mathbb {R}}^d\) it holds that

$$\begin{aligned} \begin{aligned} {\mathbb {E}}\big [ |f_{0} (Y^{1, x}_T ) | \big ] \le \Big ({\mathbb {E}}\big [ |f_{0} (Y^{1, x}_T ) |^p \big ]\Big )^{\!\nicefrac {1}{p}} < \infty . \end{aligned} \end{aligned}$$
(4.26)

Combining this with, e.g., [24, Corollary 2.5] (with \(p=p\), \(d=1\), \(n=M\), \(\left\| \cdot \right\| = \left\| \cdot \right\| \), \((\Omega , {\mathcal {F}}, {\mathbb {P}}) = (\Omega , {\mathcal {F}}, {\mathbb {P}})\), \(X_i = f_0(Y^{i, x})\) for \(i \in \{1, 2, \ldots , M\}\), \(x \in {\mathbb {R}}^d\) in the notation of [24, Corollary 2.5]) and (4.25) assures for all \(x \in {\mathbb {R}}^d\) that

$$\begin{aligned} \begin{aligned}&\left( {\mathbb {E}}\! \left[ \Big | {\mathbb {E}}[f_0(Y^{1, x}_T)] - \tfrac{1}{M} \Big [ \textstyle \sum \nolimits _{m=1}^M \displaystyle f_0(Y^{m, x}_T) \Big ] \Big |^p \right] \right) ^{\!\nicefrac {1}{p}}\\&\quad \le 2 M^{-\nicefrac {1}{2}} \sqrt{(p-1)} \Big ({\mathbb {E}}\Big [\big |f_{0} (Y^{1, x}_T) - {\mathbb {E}}[ f_{0} (Y^{1, x}_T)]\big |^p\Big ] \Big )^{\!\nicefrac {1}{p}} \\&\quad \le 4 M^{-\nicefrac {1}{2}} \sqrt{(p-1)} \Big ({\mathbb {E}}\big [ |f_{0} (Y^{1, x}_T ) |^p \big ]\Big )^{\!\nicefrac {1}{p}} \\&\quad \le 2^{\theta +3} M^{-\nicefrac {1}{2}} \sqrt{(p-1)} \kappa (p \iota +1)^{\theta } (\kappa T +1)^{\theta } d^{{\mathfrak {d}}_0} e^{\kappa \theta T} \big ( \Vert x\Vert ^{\theta } + d^{{\mathfrak {d}}_1 \theta } \big ) . \end{aligned} \end{aligned}$$
(4.27)

This and the fact that \(\sqrt{p -1} \le \nicefrac {p}{2}\) establish that

$$\begin{aligned}&\left( {\mathbb {E}}\! \left[ \int _{{\mathbb {R}}^d} \Big | {\mathbb {E}}[f_0(Y^{1, x}_T)] - \tfrac{1}{M} \Big [ \textstyle \sum \nolimits _{m=1}^M \displaystyle f_0(Y^{m, x}_T) \Big ] \Big |^p \, \nu (dx) \right] \right) ^{\!\nicefrac {1}{p}} \nonumber \\&\quad \le 2^{\theta +2} M^{-\nicefrac {1}{2}} p \kappa (p \iota +1)^{\theta } (\kappa T +1)^{\theta } d^{{\mathfrak {d}}_0} e^{\kappa \theta T} \left( \int _{ {\mathbb {R}}^d } \big ( \Vert x\Vert ^{\theta } + d^{{\mathfrak {d}}_1 \theta } \big )^p \, \nu (dx) \right) ^{\!\nicefrac {1}{p}}\nonumber \\&\quad \le 2^{\theta +2} M^{-\nicefrac {1}{2}} p \kappa (p \iota +1)^{\theta } (\kappa T +1)^{\theta } d^{{\mathfrak {d}}_0} e^{\kappa \theta T} \left( d^{{\mathfrak {d}}_1 \theta } + \left[ \int _{{\mathbb {R}}^d} \Vert x\Vert ^{p \theta } \, \nu (dx) \right] ^{\nicefrac {1}{p}} \right) .\nonumber \\ \end{aligned}$$
(4.28)

Combining this and (4.19) demonstrates that

$$\begin{aligned} \begin{aligned}&\left( {\mathbb {E}}\! \left[ \int _{{\mathbb {R}}^d} \Big | {\mathbb {E}}[f_0(Y^{1, x}_T)] - \tfrac{1}{M} \Big [ \textstyle \sum \nolimits _{m=1}^M \displaystyle f_0(Y^{m, x}_T) \Big ] \Big |^p \, \nu (dx) \right] \right) ^{\!\nicefrac {1}{p}}\\&\quad \le 2^{\theta +2} M^{-\nicefrac {1}{2}} p \kappa (p \iota +1)^{\theta } (\kappa T +1)^{\theta } d^{{\mathfrak {d}}_0} e^{\kappa \theta T} \left[ d^{{\mathfrak {d}}_1 \theta } + \kappa ^{\theta } d^{{\mathfrak {d}}_1 \theta } \right] \\&\quad \le 2^{\theta +2} p \kappa (p \iota +1)^{\theta } (\kappa T +1)^{\theta } e^{\kappa \theta T} (\kappa ^{\theta } +1) d^{{\mathfrak {d}}_0 + {\mathfrak {d}}_1 \theta } M^{-\nicefrac {1}{2}}\\&\quad \le 2^{\theta +2} p \kappa (p \theta +p +1)^{\theta } (\kappa T +1)^{\theta } e^{\kappa \theta T} (\kappa ^{\theta } +1) d^{{\mathfrak {d}}_0 + {\mathfrak {d}}_1 \theta } M^{-\nicefrac {1}{2}}. \end{aligned} \end{aligned}$$
(4.29)

The proof of Lemma 4.3 is thus completed. \(\square \)
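As a numerical aside (not part of the proof), the \(M^{-\nicefrac {1}{2}}\) Monte Carlo rate appearing in (4.29) can be observed empirically. The sketch below uses the hypothetical toy choice \(d = 1\), \(f_0(x) = x\), \(Y^{m, x}_T = x + W^m_T\), for which \({\mathbb {E}}[f_0(Y^{1,x}_T)] = x\) and the exact \(L^2\)-error of the Monte Carlo average is \(\sqrt{T/M}\):

```python
# Numerical sketch: the L^p Monte Carlo error decays like M^{-1/2}.
# Toy case (assumption, not from the lemma): f_0(x) = x, Y_T = x + W_T, d = 1.
import numpy as np

rng = np.random.default_rng(0)
T, x, p, reps = 1.0, 0.5, 2, 20000

def lp_error(M):
    # estimate (E| mean of M samples - E[f_0(Y_T)] |^p)^(1/p) by repetition
    samples = x + rng.normal(0.0, np.sqrt(T), size=(reps, M))
    err = np.abs(samples.mean(axis=1) - x) ** p
    return err.mean() ** (1.0 / p)

e1, e2 = lp_error(64), lp_error(256)
print(e1 / e2)  # ratio close to 2 = sqrt(256/64)
```

Quadrupling \(M\) roughly halves the error, matching the \(M^{-\nicefrac {1}{2}}\) factor on the right-hand side of (4.29).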

Proposition 4.4

Let \(d, M, n \in {\mathbb {N}}\), \(T, \kappa , \theta \in (0, \infty )\), \({\mathfrak {d}}_0, {\mathfrak {d}}_1 \in [0, \infty )\), \(h \in (0, T]\), \(B \in {\mathbb {R}}^{d \times n}\), \(p \in [2, \infty )\), let \(\nu :{\mathcal {B}}({\mathbb {R}}^d) \rightarrow [0,1] \) be a probability measure on \({\mathbb {R}}^d\), let \( \left\| \cdot \right\| \! :{\mathbb {R}}^d \rightarrow [0,\infty ) \) be the d-dimensional Euclidean norm, let \((\Omega , {\mathcal {F}}, {\mathbb {P}})\) be a probability space, let \(W^m :[0, T] \times \Omega \rightarrow {\mathbb {R}}^n\), \(m \in \{1, 2, \ldots , M\}\), be independent standard Brownian motions, let \(f_0 :{\mathbb {R}}^d \rightarrow {\mathbb {R}}\) and \(f_1 :{\mathbb {R}}^d \rightarrow {\mathbb {R}}^d\) be functions, let \(\chi :[0, T] \rightarrow [0, T]\) be a function, assume for all \(t \in [0, T]\), \(x, y \in {\mathbb {R}}^d\) that

$$\begin{aligned}&|f_0(x) - f_0(y)| \le \kappa d^{{\mathfrak {d}}_0} (1 + \Vert x\Vert ^{\theta } + \Vert y\Vert ^{\theta }) \Vert x-y\Vert , \end{aligned}$$
(4.30)
$$\begin{aligned}&|f_0(x)| \le \kappa d^{{\mathfrak {d}}_0} ( d^{{\mathfrak {d}}_1 \theta } + \Vert x\Vert ^{\theta }), \qquad {\text {Trace}}(B^* B) \le \kappa d^{2{\mathfrak {d}}_1}, \end{aligned}$$
(4.31)
$$\begin{aligned}&\Vert f_1(x) - f_1(y)\Vert \le \kappa \Vert x -y\Vert , \qquad \Vert f_1(x)\Vert \le \kappa (d^{{\mathfrak {d}}_1} + \Vert x\Vert ), \end{aligned}$$
(4.32)
$$\begin{aligned}&\left[ \int _{{\mathbb {R}}^d} \Vert z\Vert ^{p(1+\theta )} \, \nu (dz) \right] ^{\nicefrac {1}{(p(1+\theta ))}} \le \kappa d^{{\mathfrak {d}}_1}, \end{aligned}$$
(4.33)

and \(\chi (t) = \max (\{0, h, 2h, \ldots \} \cap [0, t])\), and let \( X^x :[0,T] \times \Omega \rightarrow {\mathbb {R}}^d \), \(x \in {\mathbb {R}}^d\), and \( Y^{m, x} :[0,T] \times \Omega \rightarrow {\mathbb {R}}^d \), \(m \in \{1, 2, \ldots , M\}\), \(x \in {\mathbb {R}}^d\), be stochastic processes with continuous sample paths which satisfy for all \(x \in {\mathbb {R}}^d\), \(m \in \{1, 2, \ldots , M\}\), \( t \in [0,T] \) that \( X^x_t = x + \int _0^t f_1( X^x_s ) \, ds + B W_t^1 \) and

$$\begin{aligned} Y_t^{m, x} = x + \int _0^t f_1\big ( Y^{m, x}_{ \chi ( s ) } \big ) \, ds + B W_t^m . \end{aligned}$$
(4.34)

Then it holds that

$$\begin{aligned} \begin{aligned}&\left( {\mathbb {E}}\! \left[ \int _{{\mathbb {R}}^d} \Big | {\mathbb {E}}[f_0(X^{x}_T)] - \tfrac{1}{M} \Big [ \textstyle \sum \nolimits _{m=1}^M \displaystyle f_0(Y^{m, x}_T) \Big ] \Big |^p \, \nu (dx) \right] \right) ^{\!\nicefrac {1}{p}}\\&\quad \le 2^{4\theta +5} |\! \max \{1, T\} |^{\theta +1} |\!\max \{ \kappa , \theta , 1 \}|^{2\theta +3} e^{(6\max \{ \kappa , \theta , 1 \}+5|\!\max \{ \kappa , \theta , 1 \}|^2 T)} \\&\qquad \cdot p (p \theta + p +1)^{\theta } d^{{\mathfrak {d}}_0 + {\mathfrak {d}}_1(\theta +1)} \big ((h/T)^{\nicefrac {1}{2}} + M^{-\nicefrac {1}{2}}\big ). \end{aligned} \end{aligned}$$
(4.35)

Proof of Proposition 4.4

Throughout this proof let \( \iota = \max \{ \kappa , \theta , 1 \}\). Note that the triangle inequality proves that

$$\begin{aligned} \begin{aligned}&\left( {\mathbb {E}}\! \left[ \int _{{\mathbb {R}}^d} \Big | {\mathbb {E}}[f_0(X^{x}_T)] - \tfrac{1}{M} \Big [ \textstyle \sum \nolimits _{m=1}^M \displaystyle f_0(Y^{m, x}_T) \Big ] \Big |^p \, \nu (dx) \right] \right) ^{\!\nicefrac {1}{p}}\\&\quad \le \left( \int _{{\mathbb {R}}^d} \big |{\mathbb {E}}[f_0(X^x_T)] - {\mathbb {E}}[f_0(Y^{1, x}_T)] \big |^p \, \nu (dx) \right) ^{\!\nicefrac {1}{p}}\\&\qquad + \left( {\mathbb {E}}\! \left[ \int _{{\mathbb {R}}^d} \Big | {\mathbb {E}}[f_0(Y^{1, x}_T)] - \tfrac{1}{M} \Big [ \textstyle \sum \nolimits _{m=1}^M \displaystyle f_0(Y^{m, x}_T) \Big ] \Big |^p \, \nu (dx) \right] \right) ^{\!\nicefrac {1}{p}}. \end{aligned} \end{aligned}$$
(4.36)

Next note that (4.31)–(4.33) and Lemma 4.2 (with \(d=d\), \(m=n\), \(T=T\), \(\kappa =\kappa \), \(\theta =\theta \), \({\mathfrak {d}}_0 = {\mathfrak {d}}_0\), \({\mathfrak {d}}_1 = {\mathfrak {d}}_1\), \(h=h\), \(B=B\), \(p=p\), \(\nu =\nu \), \(\left\| \cdot \right\| = \left\| \cdot \right\| \), \((\Omega , {\mathcal {F}}, {\mathbb {P}}) = (\Omega , {\mathcal {F}}, {\mathbb {P}})\), \(W = W^1\), \(f_0 = f_0\), \(f_1 = f_1\), \(\chi =\chi \), \(X^x = X^x\), \(Y^x= Y^{1, x}\) for \(x \in {\mathbb {R}}^d\) in the notation of Lemma 4.2) demonstrate that

$$\begin{aligned}&\left( \int _{{\mathbb {R}}^d} \big |{\mathbb {E}}[f_0(X^x_T)] - {\mathbb {E}}[f_0(Y^{1, x}_T)] \big |^p \, \nu (dx) \right) ^{\!\nicefrac {1}{p}} \nonumber \\&\quad \le 2^{4\theta +5} |\! \max \{1, T\} |^{\theta +1} |\!\max \{ \kappa , \theta , 1 \}|^{\theta +3} e^{(6\max \{ \kappa , \theta , 1 \}+5|\!\max \{ \kappa , \theta , 1 \}|^2 T)} d^{{\mathfrak {d}}_0 + {\mathfrak {d}}_1(\theta +1)} (h/T)^{\nicefrac {1}{2}}\nonumber \\&\quad = 2^{4\theta +5} | \!\max \{1, T\} |^{\theta +1} \iota ^{\theta +3} e^{(6\iota +5\iota ^2 T)} d^{{\mathfrak {d}}_0 + {\mathfrak {d}}_1(\theta +1)} (h/T)^{\nicefrac {1}{2}}. \end{aligned}$$
(4.37)

Moreover, observe that Hölder’s inequality and (4.33) imply that

$$\begin{aligned} \begin{aligned} \left[ \int _{{\mathbb {R}}^d} \Vert z\Vert ^{(p\theta )} \, \nu (dz) \right] ^{\nicefrac {1}{p \theta }} \le \left[ \int _{{\mathbb {R}}^d} \Vert z\Vert ^{p(1+\theta )} \, \nu (dz) \right] ^{\nicefrac {1}{(p(1+\theta ))}} \le \kappa d^{{\mathfrak {d}}_1}. \end{aligned} \end{aligned}$$
(4.38)

Lemma 4.3 (with \(d=d\), \(M=M\), \(n=n\), \(T=T\), \(\kappa =\kappa \), \(\theta =\theta \), \({\mathfrak {d}}_0 = {\mathfrak {d}}_0\), \({\mathfrak {d}}_1 = {\mathfrak {d}}_1\), \(B=B\), \(p=p\), \(\nu = \nu \), \(\left\| \cdot \right\| = \left\| \cdot \right\| \), \((\Omega , {\mathcal {F}}, {\mathbb {P}}) = (\Omega , {\mathcal {F}}, {\mathbb {P}})\), \(W^m=W^m\), \(f_0=f_0\), \(f_1=f_1\), \(\chi =\chi \), \(Y^{m,x} = Y^{m,x}\) for \(m \in \{1, 2, \ldots , M\}\), \(x \in {\mathbb {R}}^d\) in the notation of Lemma 4.3), (4.31), and (4.32) hence establish that

$$\begin{aligned} \begin{aligned}&\left( {\mathbb {E}}\! \left[ \int _{{\mathbb {R}}^d} \Big | {\mathbb {E}}[f_0(Y^{1, x}_T)] - \tfrac{1}{M} \Big [ \textstyle \sum \nolimits _{m=1}^M \displaystyle f_0(Y^{m, x}_T) \Big ] \Big |^p \, \nu (dx) \right] \right) ^{\!\nicefrac {1}{p}}\\&\quad \le 2^{\theta +2} p \kappa (p \theta +p +1)^{\theta } (\kappa T +1)^{\theta } e^{\kappa \theta T} (\kappa ^{\theta } +1) d^{{\mathfrak {d}}_0 + {\mathfrak {d}}_1 \theta } M^{-\nicefrac {1}{2}}\\&\quad \le 2^{\theta +2} p \kappa (p \theta +p +1)^{\theta } | \!\max \{1, T\} |^{\theta } (\kappa +1)^{\theta } e^{\kappa \theta T} (\kappa ^{\theta } +1) d^{{\mathfrak {d}}_0 + {\mathfrak {d}}_1 \theta } M^{-\nicefrac {1}{2}}\\&\quad \le 2^{\theta +2} p \iota (p \theta +p +1)^{\theta } | \!\max \{1, T\} |^{\theta } (2 \iota )^{\theta } e^{\iota \theta T} ({\iota }^{\theta } +1) d^{{\mathfrak {d}}_0 + {\mathfrak {d}}_1 \theta } M^{-\nicefrac {1}{2}}\\&\quad \le 2^{2\theta +3} p \iota ^{2\theta +1} (p \theta +p +1)^{\theta } |\! \max \{1, T\} |^{\theta } e^{\iota \theta T} d^{{\mathfrak {d}}_0 + {\mathfrak {d}}_1 \theta } M^{-\nicefrac {1}{2}}. \end{aligned} \end{aligned}$$
(4.39)

This, (4.36), and (4.37) assure that

$$\begin{aligned} \begin{aligned}&\left( {\mathbb {E}}\! \left[ \int _{{\mathbb {R}}^d} \Big | {\mathbb {E}}[f_0(X^{x}_T)] - \tfrac{1}{M} \Big [ \textstyle \sum \nolimits _{m=1}^M \displaystyle f_0(Y^{m, x}_T) \Big ] \Big |^p \, \nu (dx) \right] \right) ^{\!\nicefrac {1}{p}} \\&\quad \le 2^{4\theta +5} | \!\max \{1, T\} |^{\theta +1} \iota ^{\theta +3} e^{(6\iota +5\iota ^2 T)} d^{{\mathfrak {d}}_0 + {\mathfrak {d}}_1(\theta +1)} (h/T)^{\nicefrac {1}{2}} \\&\qquad +2^{2\theta +3} p \iota ^{2\theta +1} (p \theta +p +1)^{\theta } |\! \max \{1, T\} |^{\theta } e^{\iota \theta T} d^{{\mathfrak {d}}_0 + {\mathfrak {d}}_1 \theta } M^{-\nicefrac {1}{2}}\\&\quad \le 2^{4\theta +5} | \!\max \{1, T\} |^{\theta +1} \iota ^{2\theta +3} e^{(6\iota +5\iota ^2 T)} d^{{\mathfrak {d}}_0 + {\mathfrak {d}}_1(\theta +1)} \\&\qquad \cdot \Big ( (h/T)^{\nicefrac {1}{2}}+ p (p \theta + p +1)^{\theta } M^{-\nicefrac {1}{2}} \Big )\\&\quad \le 2^{4\theta +5} | \!\max \{1, T\} |^{\theta +1} \iota ^{2\theta +3} e^{(6\iota +5\iota ^2 T)} d^{{\mathfrak {d}}_0 + {\mathfrak {d}}_1(\theta +1)} p (p \theta + p +1)^{\theta } \\&\qquad \cdot \big ((h/T)^{\nicefrac {1}{2}} + M^{-\nicefrac {1}{2}}\big ). \end{aligned} \end{aligned}$$
(4.40)

The proof of Proposition 4.4 is thus completed. \(\square \)
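As a numerical aside (not part of the proof), the approximation analyzed in Proposition 4.4 is the Monte Carlo average of \(f_0\) over independent Euler paths (4.34), where the drift is frozen at the last grid point \(\chi (s)\). The sketch below implements one such estimator; the drift \(f_1(y) = -y\) and the test data are hypothetical choices, and the deterministic check sets \(B = 0\), in which case every path satisfies \(Y_T = (1-h)^{T/h} x\) exactly:

```python
# Minimal sketch of the Monte Carlo Euler estimator from Proposition 4.4.
# Assumptions: hypothetical Lipschitz drift f1(y) = -y, hypothetical test data.
import numpy as np

def euler_mc(f0, f1, B, x, T, steps, M, rng):
    # average f0 over M independent Euler paths (4.34); between grid points
    # the drift is frozen at the last grid value chi(s)
    h = T / steps
    y = np.tile(x, (M, 1))                   # M paths, each started at x
    for _ in range(steps):
        dw = rng.normal(0.0, np.sqrt(h), size=(M, B.shape[1]))
        y = y + h * f1(y) + dw @ B.T         # one Euler step
    return f0(y).mean()

rng = np.random.default_rng(1)
d, T, steps = 3, 1.0, 10
x = np.ones(d)
# With B = 0 the scheme is deterministic: Y_T = (1 - h)^steps * x.
val = euler_mc(lambda y: np.linalg.norm(y, axis=1),
               lambda y: -y, np.zeros((d, d)), x, T, steps, M=4, rng=rng)
print(val)  # equals sqrt(3) * (1 - 0.1)**10
```

Proposition 4.4 bounds the \(L^p\)-error of exactly this estimator by a sum of a discretization term \((h/T)^{\nicefrac {1}{2}}\) and a sampling term \(M^{-\nicefrac {1}{2}}\).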

4.2 DNN approximations for Kolmogorov PDEs

Theorem 4.5

Let \( A_d = (A_{d, i, j})_{(i, j) \in \{1, \ldots , d\}^2} \in {\mathbb {R}}^{ d \times d } \), \( d \in {\mathbb {N}}\), be symmetric positive semidefinite matrices, let \(\left\| \cdot \right\| \! :(\cup _{d \in {\mathbb {N}}} {\mathbb {R}}^d) \rightarrow [0, \infty )\) satisfy for all \(d \in {\mathbb {N}}\), \(x = (x_1, x_2, \ldots , x_d) \in {\mathbb {R}}^d\) that \(\Vert x\Vert = ( \textstyle \sum _{i=1}^d |x_i|^2)^{\nicefrac {1}{2}}\), for every \( d \in {\mathbb {N}}\) let \( \nu _d :{\mathcal {B}}({\mathbb {R}}^d) \rightarrow [0,1]\) be a probability measure on \({\mathbb {R}}^d\), let \( \varphi _{0,d} :{\mathbb {R}}^d \rightarrow {\mathbb {R}}\), \( d \in {\mathbb {N}}\), and \( \varphi _{ 1, d } :{\mathbb {R}}^d \rightarrow {\mathbb {R}}^d \), \( d \in {\mathbb {N}}\), be functions, let \( T, \kappa \in (0, \infty )\), \({\mathfrak {e}}, {\mathfrak {d}}_1, {\mathfrak {d}}_2, \ldots , {\mathfrak {d}}_6 \in [0, \infty )\), \(\theta \in [1, \infty )\), \(p \in [2, \infty )\), \( ( \phi ^{ m, d }_{ \varepsilon } )_{ (m, d, \varepsilon ) \in \{ 0, 1 \} \times {\mathbb {N}}\times (0,1] } \subseteq {\mathbf {N}}\), \(a\in C({\mathbb {R}}, {\mathbb {R}})\) satisfy for all \(x \in {\mathbb {R}}\) that \(a(x) = \max \{x, 0\}\), assume for all \( d \in {\mathbb {N}}\), \( \varepsilon \in (0,1] \), \( m \in \{0, 1\}\), \( x, y \in {\mathbb {R}}^d \) that \( {\mathcal {R}}_{a}( \phi ^{ 0, d }_{ \varepsilon } ) \in C( {\mathbb {R}}^d, {\mathbb {R}}) \), \( {\mathcal {R}}_{a}( \phi ^{ 1, d }_{ \varepsilon } ) \in C( {\mathbb {R}}^d, {\mathbb {R}}^d ) \), \( {\text {Trace}}(A_d) \le \kappa d^{ 2 {\mathfrak {d}}_1 } \), \([ \int _{{\mathbb {R}}^d} \Vert x\Vert ^{2p \theta } \, \nu _d (dx) ]^{\nicefrac {1}{(2p \theta )}} \le \kappa d^{{\mathfrak {d}}_1 + {\mathfrak {d}}_2}\), \( {\mathcal {P}}( \phi ^{ m, d }_{ \varepsilon } ) \le \kappa d^{ 2^{(-m)} {\mathfrak {d}}_3 } \varepsilon ^{ - 2^{(-m)} {\mathfrak {e}}}\), \( |( {\mathcal {R}}_{a}(\phi ^{ 0, d }_{ \varepsilon 
}) )(x) - ( {\mathcal {R}}_{a}(\phi ^{ 0, d }_{ \varepsilon }) )(y)| \le \kappa d^{{\mathfrak {d}}_6} (1 + \Vert x\Vert ^{\theta } + \Vert y \Vert ^{\theta })\Vert x-y\Vert \), \( \Vert ( {\mathcal {R}}_{a}(\phi ^{ 1, d }_{ \varepsilon }) )(x) \Vert \le \kappa ( d^{ {\mathfrak {d}}_1 + {\mathfrak {d}}_2 } + \Vert x \Vert ) \), \(| \varphi _{ 0, d }( x )| \le \kappa d^{ {\mathfrak {d}}_6 } ( d^{ \theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2) } + \Vert x \Vert ^{ \theta } )\), \( \Vert \varphi _{ 1, d }( x ) - \varphi _{ 1, d }( y ) \Vert \le \kappa \Vert x - y \Vert \), and

$$\begin{aligned} \Vert \varphi _{ m, d }(x) - ( {\mathcal {R}}_{a}(\phi ^{ m, d }_{ \varepsilon }) )(x) \Vert \le \varepsilon \kappa d^{{\mathfrak {d}}_{(5 -m)}} (d^{\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)}+ \Vert x\Vert ^{\theta }) , \end{aligned}$$
(4.41)

and for every \( d \in {\mathbb {N}}\) let \( u_d :[0,T] \times {\mathbb {R}}^{d} \rightarrow {\mathbb {R}}\) be an at most polynomially growing viscosity solution of

$$\begin{aligned} \begin{aligned} \left( \tfrac{ \partial }{\partial t} u_d \right) ( t, x )&= \left( \tfrac{ \partial }{\partial x} u_d \right) ( t, x ) \, \varphi _{ 1, d }( x ) + \textstyle \sum \limits _{ i, j = 1 }^d \displaystyle A_{ d, i, j } \, \left( \tfrac{ \partial ^2 }{ \partial x_i \partial x_j } u_d \right) ( t, x ) \end{aligned} \end{aligned}$$
(4.42)

with \( u_d( 0, x ) = \varphi _{ 0, d }( x ) \) for \( ( t, x ) \in (0,T) \times {\mathbb {R}}^d \) (cf. Definitions 3.1 and 3.3). Then there exist \( c \in {\mathbb {R}}\) and \( ( \Psi _{ d, \varepsilon } )_{ (d , \varepsilon ) \in {\mathbb {N}}\times (0,1] } \subseteq {\mathbf {N}}\) such that for all \( d \in {\mathbb {N}}\), \( \varepsilon \in (0,1] \) it holds that \( {\mathcal {R}}_{a}( \Psi _{ d, \varepsilon } ) \in C( {\mathbb {R}}^{ d }, {\mathbb {R}}) \), \([ \int _{ {\mathbb {R}}^d } | u_d(T, x) - ( {\mathcal {R}}_{a} (\Psi _{ d, \varepsilon }) )( x ) |^p \, \nu _d(dx) ]^{ \nicefrac { 1 }{ p } } \le \varepsilon \), and

$$\begin{aligned} {\mathcal {P}}( \Psi _{ d, \varepsilon } ) \le c d^{6[{\mathfrak {d}}_6 + ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)(\theta +1)] + \max \{4, {\mathfrak {d}}_3\} + {\mathfrak {e}}\max \{{\mathfrak {d}}_5 + \theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2), {\mathfrak {d}}_4 + {\mathfrak {d}}_6 + 2\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)\}} \varepsilon ^{-({\mathfrak {e}}+6)}. \end{aligned}$$
(4.43)

Proof of Theorem 4.5

Throughout this proof let \( {\mathcal {A}}_d \in {\mathbb {R}}^{ d \times d } \), \( d \in {\mathbb {N}}\), satisfy for all \( d \in {\mathbb {N}}\) that \( {\mathcal {A}}_d = \sqrt{ 2 A_d } \), let \((\Omega , {\mathcal {F}}, {\mathbb {P}})\) be a probability space, let \( W^{ d, m } :[0,T] \times \Omega \rightarrow {\mathbb {R}}^d \), \( d, m \in {\mathbb {N}}\), be independent standard Brownian motions, let \(Z^{N, d, m}_n :\Omega \rightarrow {\mathbb {R}}^{d} \), \(n \in \{0, 1, \ldots , N-1\}\), \(m \in \{1, 2, \ldots , N\}\), \(d, N \in {\mathbb {N}}\), be the random variables which satisfy for all \(N, d \in {\mathbb {N}}\), \(m \in \{1, 2, \ldots , N\}\), \(n \in \{0, 1, \ldots , N-1\}\) that

$$\begin{aligned} Z^{N, d, m}_n = {\mathcal {A}}_d W^{d, m}_{\frac{(n+1)T}{N}} - {\mathcal {A}}_d W^{d, m}_{\frac{nT}{N}}, \end{aligned}$$
(4.44)

let \(f_{N, d} :{\mathbb {R}}^{d} \times {\mathbb {R}}^{d} \rightarrow {\mathbb {R}}^{d}\), \(d, N \in {\mathbb {N}}\), satisfy for all \(N, d \in {\mathbb {N}}\), \(x, y \in {\mathbb {R}}^d\) that

$$\begin{aligned} f_{N, d}(x, y) = x+ y + \tfrac{T}{N} \varphi _{1, d}(y), \end{aligned}$$
(4.45)

let \( X^{ d, x } :[0,T] \times \Omega \rightarrow {\mathbb {R}}^d \), \( x \in {\mathbb {R}}^d \), \( d \in {\mathbb {N}}\), be stochastic processes with continuous sample paths which satisfy for all \( d \in {\mathbb {N}}\), \( x \in {\mathbb {R}}^d \), \( t \in [0,T] \) that

$$\begin{aligned} X^{ d, x }_t = x + \int _0^t \varphi _{ 1, d }( X^{ d, x }_s ) \, ds + {\mathcal {A}}_d W^{ d, 1 }_t \end{aligned}$$
(4.46)

(cf., e.g., [36, item (i) in Theorem 3.1] (with \((\Omega , {\mathcal {F}}, {\mathbb {P}}) = (\Omega , {\mathcal {F}}, {\mathbb {P}})\), \(T=T\), \(d=d\), \(m=d\), \(B= {\mathcal {A}}_d\), \(\mu = \varphi _{1,d}\) for \(d \in {\mathbb {N}}\) in the notation of [36, Theorem 3.1])), let \(Y^{N, d, x}_n = (Y^{N, d, m, x}_n)_{m \in \{1, 2, \ldots , N\}} :\Omega \rightarrow {\mathbb {R}}^{N d}\), \(n \in \{0, 1, \ldots , N\}\), \(x \in {\mathbb {R}}^d\), \(d, N \in {\mathbb {N}}\), satisfy for all \(N, d \in {\mathbb {N}}\), \(m \in \{1, 2, \ldots , N\}\), \(x \in {\mathbb {R}}^d\), \(n \in \{1, 2, \ldots , N\}\) that \(Y^{N, d, m, x}_{0} = x\) and

$$\begin{aligned} Y^{N, d, m, x}_{n}&= f_{N, d} \big (Z^{N, d, m}_{n-1}, Y^{N, d, m, x}_{n-1}\big ), \end{aligned}$$
(4.47)

let \(g_{N, d} :{\mathbb {R}}^{Nd} \rightarrow {\mathbb {R}}\), \(d, N \in {\mathbb {N}}\), satisfy for all \(N, d \in {\mathbb {N}}\), \(x = (x_i)_{i \in \{1, 2, \ldots , N\}} \in {\mathbb {R}}^{Nd}\) that

$$\begin{aligned} g_{N, d}(x) = \frac{1}{N} \sum _{i=1}^N \varphi _{0,d} (x_i), \end{aligned}$$
(4.48)

and let \({\mathfrak {N}}_{d, \varepsilon } \subseteq {\mathbf {N}}\), \( \varepsilon \in (0, 1]\), \(d \in {\mathbb {N}}\), satisfy for all \(d \in {\mathbb {N}}\), \(\varepsilon \in (0, 1]\) that

$$\begin{aligned}&{\mathfrak {N}}_{d, \varepsilon } \nonumber \\&\quad = \Big \{ \Phi \in {\mathbf {N}}:\! \big [ \big ( {\mathcal {R}}_{a}( \Phi ) \in C({\mathbb {R}}^d, {\mathbb {R}}^d) \big ) \wedge \big ({\mathbb {D}}_{{\mathcal {L}}(\Phi ) -1}(\Phi ) \le {\mathbb {D}}_{{\mathcal {L}}(\phi ^{1,d}_{\varepsilon }) -1}(\phi ^{1,d}_{\varepsilon }) + 2d \big ) \big ] \Big \} \end{aligned}$$
(4.49)

(cf. Definition 3.27). Note that (4.44) and, e.g., [36, Lemma 4.2] (with \(d = d\), \(m = d\), \(T = T\), \(p = 2p \theta \), \(B = {\mathcal {A}}_d\), \((\Omega , {\mathcal {F}}, {\mathbb {P}}) = (\Omega , {\mathcal {F}}, {\mathbb {P}})\), \(W = W^{d,m}\) for \(d, m \in {\mathbb {N}}\) in the notation of [36, Lemma 4.2]) ensure that for all \(N, d \in {\mathbb {N}}\), \(m \in \{1, 2, \ldots , N\}\), \(n \in \{0, 1, \ldots , N-1\}\) it holds that

$$\begin{aligned} \begin{aligned}&\left( {\mathbb {E}}\! \left[ \Vert Z^{N, d, m}_{n} \Vert ^{2 p \theta } \right] \right) ^{\nicefrac {1}{(2 p \theta )}} = \left( {\mathbb {E}}\! \left[ \Big \Vert {\mathcal {A}}_d W^{d, m}_{\frac{(n+1)T}{N}} - {\mathcal {A}}_d W^{d, m}_{\frac{nT}{N}}\Big \Vert ^{2 p \theta } \right] \right) ^{\!\nicefrac {1}{(2 p \theta )}}\\&\quad \le \left( {\mathbb {E}}\! \left[ \Big \Vert {\mathcal {A}}_d W^{d, m}_{\frac{(n+1)T}{N}} \Big \Vert ^{2 p \theta } \right] \right) ^{\!\nicefrac {1}{(2 p \theta )}} + \left( {\mathbb {E}}\! \left[ \Big \Vert {\mathcal {A}}_d W^{d, m}_{\frac{nT}{N}}\Big \Vert ^{2 p \theta } \right] \right) ^{\!\nicefrac {1}{(2 p \theta )}}\\&\quad \le 2 \sqrt{(2p\theta -1) {\text {Trace}}({\mathcal {A}}_d^* {\mathcal {A}}_d) T} = 2 \sqrt{2(2p\theta -1) {\text {Trace}}(A_d) T}. \end{aligned} \end{aligned}$$
(4.50)

This and the assumption that \( \forall \, d \in {\mathbb {N}}:{\text {Trace}}(A_d) \le \kappa d^{ 2 {\mathfrak {d}}_1 } \) assure for all \(N, d \in {\mathbb {N}}\), \(m \in \{1, 2, \ldots , N\}\), \(n \in \{0, 1, \ldots , N-1\}\) that

$$\begin{aligned} \begin{aligned} \left( {\mathbb {E}}\! \left[ \Vert Z^{N, d, m}_{n} \Vert ^{2 p \theta } \right] \right) ^{\nicefrac {1}{(2 p \theta )}} \le 4 p \theta \sqrt{\kappa T} d^{{\mathfrak {d}}_1}. \end{aligned} \end{aligned}$$
(4.51)
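As a numerical aside (not part of the proof), the Brownian increments (4.44) are driven by the matrix \({\mathcal {A}}_d = \sqrt{2 A_d}\) fixed at the start of this proof. For a symmetric positive semidefinite \(A_d\) this square root can be computed from an eigendecomposition; the sketch below uses a hypothetical \(2 \times 2\) test matrix:

```python
# Sketch: computing curly-A_d = sqrt(2 A_d) for symmetric PSD A_d.
import numpy as np

def psd_sqrt(A):
    # symmetric square root via eigendecomposition; clip tiny negative
    # eigenvalues caused by round-off
    w, V = np.linalg.eigh(A)
    return V @ np.diag(np.sqrt(np.maximum(w, 0.0))) @ V.T

A = np.array([[2.0, 1.0], [1.0, 2.0]])  # hypothetical A_d (symmetric PSD)
S = psd_sqrt(2.0 * A)                   # curly-A_d
print(np.allclose(S @ S.T, 2.0 * A))    # True
```

In particular \({\mathcal {A}}_d^* {\mathcal {A}}_d = 2 A_d\), which is exactly the identity \({\text {Trace}}({\mathcal {A}}_d^* {\mathcal {A}}_d) = 2 \, {\text {Trace}}(A_d)\) used in (4.50).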

Moreover, observe that Lemma 3.16 (with \(d=d\), \(a=a\) for \(d \in {\mathbb {N}}\) in the notation of Lemma 3.16) ensures that there exist \({\mathfrak {I}}_d \in {\mathbf {N}}\), \(d \in {\mathbb {N}}\), such that for all \(d \in {\mathbb {N}}\), \(x \in {\mathbb {R}}^d\) it holds that \({\mathcal {D}}({\mathfrak {I}}_d) = (d, 2d, d)\), \({\mathcal {R}}_{a}( {\mathfrak {I}}_{d}) \in C({\mathbb {R}}^d, {\mathbb {R}}^d)\), and \(({\mathcal {R}}_{a}({\mathfrak {I}}_d))(x) = x\). This and (4.49) assure for all \(d \in {\mathbb {N}}\), \(\varepsilon \in (0, 1]\) that \({\mathfrak {I}}_d \in {\mathfrak {N}}_{d, \varepsilon }\) and

$$\begin{aligned} {\mathcal {P}}({\mathfrak {I}}_d) = 2d(d+1) + d(2d+1) = 2d^2 + 2d+2d^2 +d = 4d^2 + 3d \le 7d^2. \end{aligned}$$
(4.52)
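As a numerical aside (not part of the proof), the identity network \({\mathfrak {I}}_d\) from Lemma 3.16 with layer dimensions \((d, 2d, d)\) rests on the elementary identity \(x = \max \{x, 0\} - \max \{-x, 0\}\). The sketch below builds such a network explicitly and confirms the parameter count \(2d(d+1) + d(2d+1) = 4d^2 + 3d\) from (4.52):

```python
# Sketch of the ReLU identity network I_d with architecture (d, 2d, d):
# it realizes x = relu(x) - relu(-x), and its parameter count is 4d^2 + 3d.
import numpy as np

def identity_network(d):
    relu = lambda v: np.maximum(v, 0.0)
    W1 = np.vstack([np.eye(d), -np.eye(d)])  # (2d, d)
    b1 = np.zeros(2 * d)
    W2 = np.hstack([np.eye(d), -np.eye(d)])  # (d, 2d)
    b2 = np.zeros(d)
    params = W1.size + b1.size + W2.size + b2.size  # 2d(d+1) + d(2d+1)
    realize = lambda x: W2 @ relu(W1 @ x + b1) + b2
    return realize, params

f, n_params = identity_network(5)
x = np.array([1.0, -2.0, 0.0, 3.5, -0.5])
print(np.allclose(f(x), x), n_params)  # True 115 = 4*25 + 3*5
```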

Next note that Lemma 3.14 demonstrates that for all \(N, d \in {\mathbb {N}}\), \(\varepsilon \in (0, 1]\) it holds that \({\mathcal {D}}(\frac{T}{N} \circledast \phi ^{1, d}_{\varepsilon }) = {\mathcal {D}}(\phi ^{1, d}_{\varepsilon })\), \({\mathcal {R}}_{a}(\frac{T}{N} \circledast \phi ^{1, d}_{\varepsilon } ) \in C({\mathbb {R}}^d, {\mathbb {R}}^d)\), and

$$\begin{aligned} {\mathcal {R}}_{a}\big (\tfrac{T}{N} \circledast \phi ^{1, d}_{\varepsilon }\big ) = \tfrac{T}{N} {\mathcal {R}}_{a}(\phi ^{1, d}_{\varepsilon }) \end{aligned}$$
(4.53)

(cf. Definition 3.13). This, the fact that \({\mathcal {D}}({\mathfrak {I}}_d) = (d, 2d, d)\), and Lemma 3.30 (with \(a=a\), \(L_1 = {\mathcal {L}}(\tfrac{T}{N} \circledast \phi ^{1, d}_{\varepsilon }) \), \(L_2 = 2\), \({\mathbb {I}} = {\mathfrak {I}}_d\), \(\Phi _1 = \tfrac{T}{N} \circledast \phi ^{1, d}_{\varepsilon }\), \(\Phi _2 = {\mathfrak {I}}_d\), \(d=d\), \({\mathfrak {i}} = 2d\), \((l_{1, 0}, l_{1, 1}, \ldots , l_{1, L_1}) = {\mathcal {D}}(\tfrac{T}{N} \circledast \phi ^{1, d}_{\varepsilon })\), \((l_{2, 0}, l_{2, 1}, l_{2, L_2}) = (d, 2d, d)\) for \(d, N \in {\mathbb {N}}\), \(\varepsilon \in (0, 1]\) in the notation of Lemma 3.30) establish that there exist \({\mathbf {f}}^{N, d}_{\varepsilon } \in {\mathbf {N}}\), \(\varepsilon \in (0, 1]\), \(d, N \in {\mathbb {N}}\), such that for all \(N, d \in {\mathbb {N}}\), \(\varepsilon \in (0, 1]\), \(x \in {\mathbb {R}}^d\) it holds that \({\mathcal {R}}_{a}({\mathbf {f}}^{N, d}_{\varepsilon }) \in C({\mathbb {R}}^d, {\mathbb {R}}^d)\) and

$$\begin{aligned} ( {\mathcal {R}}_{a}({\mathbf {f}}^{N, d}_{\varepsilon })) (x) = x + \big ( {\mathcal {R}}_{a}\big (\tfrac{T}{N} \circledast \phi ^{1, d}_{\varepsilon }\big )\big )(x) = x + \tfrac{T}{N} ( {\mathcal {R}}_{a}(\phi ^{1, d}_{\varepsilon }))(x). \end{aligned}$$
(4.54)

Items (ii)–(iii) in Lemma 3.9 hence ensure that there exist \({\mathbf {f}}^{N, d}_{\varepsilon , z} \in {\mathbf {N}}\), \(z \in {\mathbb {R}}^d\), \(\varepsilon \in (0, 1]\), \(d, N \in {\mathbb {N}}\), which satisfy for all \(N, d \in {\mathbb {N}}\), \(\varepsilon \in (0, 1]\), \(z, x \in {\mathbb {R}}^d\) that \({\mathcal {R}}_{a}({\mathbf {f}}^{N, d}_{\varepsilon , z}) \in C({\mathbb {R}}^d, {\mathbb {R}}^d)\) and

$$\begin{aligned} ( {\mathcal {R}}_{a}({\mathbf {f}}^{N, d}_{\varepsilon , z})) (x) = ( {\mathcal {R}}_{a}({\mathbf {f}}^{N, d}_{\varepsilon })) (x) + z = z+x+ \tfrac{T}{N} ( {\mathcal {R}}_{a}( \phi ^{1, d}_{\varepsilon }))(x) . \end{aligned}$$
(4.55)

This, (4.45), and (4.41) imply for all \(N, d \in {\mathbb {N}}\), \(\varepsilon \in (0, 1]\), \(x, z \in {\mathbb {R}}^d\) that \(({\mathbb {R}}^d \ni {\mathfrak {z}} \mapsto ( {\mathcal {R}}_{a}({\mathbf {f}}^{N, d}_{\varepsilon , {\mathfrak {z}}}))(x) \in {\mathbb {R}}^d)\) is \({\mathcal {B}}({\mathbb {R}}^d) /{\mathcal {B}}({\mathbb {R}}^d)\)-measurable and

$$\begin{aligned} \begin{aligned} \Vert f_{N, d}(z, x) - ( {\mathcal {R}}_{a}({\mathbf {f}}^{N, d}_{\varepsilon , z}))(x) \Vert&= \tfrac{T}{N} \Vert \varphi _{1, d} (x) - ( {\mathcal {R}}_{a}(\phi ^{1, d}_{\varepsilon }))(x) \Vert \\&\le \tfrac{T \varepsilon \kappa d^{{\mathfrak {d}}_4}}{N} ( d^{\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)} + \Vert x \Vert ^{ \theta } )\\&\le \varepsilon T \kappa d^{{\mathfrak {d}}_4} ( d^{\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)} + \Vert x \Vert ^{ \theta } ). \end{aligned} \end{aligned}$$
(4.56)
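As a numerical aside (not part of the proof), the network manipulations used throughout this proof — e.g., in (4.54) and later in (4.59) — compose ReLU networks while keeping their realizations under control. The proof composes via the identity network \({\mathfrak {I}}_d\) so that layer dimensions remain explicit; the sketch below (with hypothetical random test networks) instead uses the simpler direct affine merge of adjacent layers to illustrate that stacking layers realizes the composition of realizations:

```python
# Sketch: composing two ReLU networks by merging the inner network's affine
# output layer into the outer network's first layer (a direct variant of the
# identity-network composition used in the proof; test networks are random).
import numpy as np

relu = lambda v: np.maximum(v, 0.0)

def realize(layers, x):
    # layers: list of (W, b); ReLU after every layer except the last
    for W, b in layers[:-1]:
        x = relu(W @ x + b)
    W, b = layers[-1]
    return W @ x + b

def compose(outer, inner):
    # merge inner's output layer (Wi, bi) with outer's first layer (Wo, bo)
    (Wo, bo), (Wi, bi) = outer[0], inner[-1]
    return inner[:-1] + [(Wo @ Wi, Wo @ bi + bo)] + outer[1:]

rng = np.random.default_rng(3)
d = 3
inner = [(rng.standard_normal((4, d)), rng.standard_normal(4)),
         (rng.standard_normal((d, 4)), rng.standard_normal(d))]  # R^d -> R^d
outer = [(rng.standard_normal((5, d)), rng.standard_normal(5)),
         (rng.standard_normal((1, 5)), rng.standard_normal(1))]  # R^d -> R
x = rng.standard_normal(d)
lhs = realize(compose(outer, inner), x)
rhs = realize(outer, realize(inner, x))
print(np.allclose(lhs, rhs))  # True
```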

Next note that (4.55) and the assumption that \( \forall \, \varepsilon \in (0, 1], d \in {\mathbb {N}}, x \in {\mathbb {R}}^d :\Vert ( {\mathcal {R}}_{a}(\phi ^{ 1, d }_{ \varepsilon }) )(x) \Vert \le \kappa ( d^{ {\mathfrak {d}}_1 + {\mathfrak {d}}_2 } + \Vert x \Vert ) \) prove for all \(N, d \in {\mathbb {N}}\), \(\varepsilon \in (0, 1]\), \(x, z \in {\mathbb {R}}^d\) that

$$\begin{aligned} \begin{aligned} \Vert ( {\mathcal {R}}_{a}({\mathbf {f}}^{N, d}_{\varepsilon , z}))(x) \Vert&\le \Vert z\Vert + \Vert x\Vert + \tfrac{T}{N} \Vert ( {\mathcal {R}}_{a}(\phi ^{ 1, d }_{ \varepsilon }) )(x) \Vert \\&\le \Vert z\Vert + \Vert x\Vert + \tfrac{T \kappa }{N} ( d^{ {\mathfrak {d}}_1 + {\mathfrak {d}}_2 } + \Vert x \Vert ) \\&= \big (1 + \tfrac{T \kappa }{N} \big ) \Vert x\Vert + \tfrac{T \kappa d^{{\mathfrak {d}}_1 + {\mathfrak {d}}_2}}{N} + \Vert z\Vert \\&\le \big (1 + \tfrac{T \kappa }{N} \big ) \Vert x\Vert + (T\kappa +1) (d^{{\mathfrak {d}}_1 + {\mathfrak {d}}_2} + \Vert z\Vert )\\&\le \big (1 + \tfrac{T \kappa }{N} \big ) \Vert x\Vert + (T\kappa +1) d^{{\mathfrak {d}}_2}(d^{{\mathfrak {d}}_1} + \Vert z\Vert ). \end{aligned} \end{aligned}$$
(4.57)

In addition, observe that (4.45) and the assumption that \(\forall \, d \in {\mathbb {N}}, x, y \in {\mathbb {R}}^d :\Vert \varphi _{ 1, d }( x ) - \varphi _{ 1, d }( y ) \Vert \le \kappa \Vert x - y \Vert \) imply that for all \(N, d \in {\mathbb {N}}\), \(x, y, z \in {\mathbb {R}}^d\) it holds that

$$\begin{aligned} \begin{aligned} \Vert f_{N, d}(z, x) - f_{N, d}(z, y) \Vert&= \Vert x + \tfrac{T}{N} \varphi _{1,d}(x) - y - \tfrac{T}{N} \varphi _{1, d}(y) \Vert \\&\le \Vert x-y\Vert + \tfrac{T}{N} \Vert \varphi _{ 1, d }( x ) - \varphi _{ 1, d }( y ) \Vert \\&\le \big (1 + \tfrac{T \kappa }{N} \big ) \Vert x -y\Vert \le (1+ T \kappa )\Vert x -y\Vert . \end{aligned} \end{aligned}$$
(4.58)

Moreover, note that (4.53), the fact that \({\mathcal {D}} ({\mathfrak {I}}_d) = (d, 2d, d)\), and Lemma 3.30 (with \(a=a\), \(L_1 = {\mathcal {L}}(\tfrac{T}{N} \circledast \phi ^{1, d}_{\varepsilon }) \), \(L_2 = {\mathcal {L}}(\Phi )\), \({\mathbb {I}} = {\mathfrak {I}}_d\), \(\Phi _1 = \tfrac{T}{N} \circledast \phi ^{1, d}_{\varepsilon }\), \(\Phi _2 = \Phi \), \(d=d\), \({\mathfrak {i}} = 2d\), \((l_{1, 0}, l_{1, 1}, \ldots , l_{1, L_1}) = {\mathcal {D}}(\tfrac{T}{N} \circledast \phi ^{1, d}_{\varepsilon }) = {\mathcal {D}}(\phi ^{1, d}_{\varepsilon })\), \((l_{2, 0}, l_{2, 1}, \ldots , l_{2, L_2}) = {\mathcal {D}}(\Phi )\) for \(N, d \in {\mathbb {N}}\), \(\varepsilon \in (0, 1]\), \(\Phi \in {\mathfrak {N}}_{d, \varepsilon }\) in the notation of Lemma 3.30) prove that for every \(N, d \in {\mathbb {N}}\), \(\varepsilon \in (0, 1]\), \(\Phi \in {\mathfrak {N}}_{d, \varepsilon }\) there exists \({\hat{\Phi }} \in {\mathbf {N}}\) such that for all \(x \in {\mathbb {R}}^d\) it holds that \({\mathcal {R}}_{a}({\hat{\Phi }}) \in C({\mathbb {R}}^d, {\mathbb {R}}^d)\), \({\mathbb {D}}_{{\mathcal {L}}({\hat{\Phi }}) -1}({\hat{\Phi }}) \le {\mathbb {D}}_{{\mathcal {L}}(\phi ^{1,d}_{\varepsilon }) -1}(\phi ^{1,d}_{\varepsilon }) + 2d \), \({\mathcal {P}}({\hat{\Phi }}) \le {\mathcal {P}}(\Phi ) + [\frac{1}{2} {\mathcal {P}}({\mathfrak {I}}_d) + {\mathcal {P}}(\phi ^{1, d}_{\varepsilon })]^2\), and

$$\begin{aligned} \begin{aligned} ({\mathcal {R}}_{a}({\hat{\Phi }})) (x)&= ({\mathcal {R}}_{a}(\Phi ))(x) + \big ( \big ({\mathcal {R}}_{a}\big (\tfrac{T}{N} \circledast \phi ^{1, d}_{\varepsilon }\big )\big )\circ ({\mathcal {R}}_{a}(\Phi ))\big )(x)\\&= ({\mathcal {R}}_{a}(\Phi ))(x) + \tfrac{T}{N} \big (({\mathcal {R}}_{a}( \phi ^{1, d}_{\varepsilon }))\circ ({\mathcal {R}}_{a}(\Phi ))\big )(x). \end{aligned} \end{aligned}$$
(4.59)

This, (4.49), (4.52), and the fact that \(\forall \, d \in {\mathbb {N}}, \varepsilon \in (0, 1] :{\mathcal {P}}( \phi ^{ 1, d }_{ \varepsilon } ) \le \kappa d^{ 2^{(-1)} {\mathfrak {d}}_3 } \varepsilon ^{ - 2^{(-1)} {\mathfrak {e}}}\) demonstrate that for every \(N, d \in {\mathbb {N}}\), \(\varepsilon \in (0, 1]\), \(\Phi \in {\mathfrak {N}}_{d, \varepsilon }\) there exists \({\hat{\Phi }} \in {\mathfrak {N}}_{d, \varepsilon }\) such that for all \(x \in {\mathbb {R}}^d\) it holds that

$$\begin{aligned} {\mathcal {P}}({\hat{\Phi }}) \le {\mathcal {P}}(\Phi ) + (4d^2 +\kappa d^{ 2^{(-1)} {\mathfrak {d}}_3 } \varepsilon ^{ - 2^{(-1)} {\mathfrak {e}}})^2 \le {\mathcal {P}}(\Phi ) + (\kappa +4)^2 d^{ \max \{4, {\mathfrak {d}}_3\}} \varepsilon ^{-{\mathfrak {e}}} \end{aligned}$$
(4.60)

and

$$\begin{aligned} \begin{aligned} ({\mathcal {R}}_{a}({\hat{\Phi }})) (x) = ({\mathcal {R}}_{a}(\Phi ))(x) + \tfrac{T}{N} \big (({\mathcal {R}}_{a}( \phi ^{1, d}_{\varepsilon }))\circ ({\mathcal {R}}_{a}(\Phi ))\big )(x). \end{aligned} \end{aligned}$$
(4.61)

Items (i)–(iii) in Lemma 3.9 and (4.55) hence ensure that for every \(N, d \in {\mathbb {N}}\), \(\varepsilon \in (0, 1]\), \(\Phi \in {\mathfrak {N}}_{d, \varepsilon }\) there exist \(({\hat{\Phi }}_{z})_{z \in {\mathbb {R}}^d} \subseteq {\mathfrak {N}}_{d, \varepsilon }\) such that for all \(x, z, {\mathfrak {z}} \in {\mathbb {R}}^d\) it holds that

$$\begin{aligned} ({\mathcal {R}}_{a}({\hat{\Phi }}_z))(x)= & {} z+ ({\mathcal {R}}_{a}(\Phi ))(x)+ \tfrac{T}{N} \big (({\mathcal {R}}_{a}( \phi ^{1, d}_{\varepsilon }))\circ ({\mathcal {R}}_{a}(\Phi ))\big )(x) \nonumber \\= & {} ( {\mathcal {R}}_{a}({\mathbf {f}}^{N, d}_{\varepsilon , z}))\big ( ({\mathcal {R}}_{a}(\Phi ))(x)\big ), \end{aligned}$$
(4.62)
$$\begin{aligned} {\mathcal {P}}({\hat{\Phi }}_z)\le & {} {\mathcal {P}}(\Phi ) + (\kappa +4)^2 d^{ \max \{4, {\mathfrak {d}}_3\}} \varepsilon ^{-{\mathfrak {e}}}, \end{aligned}$$
(4.63)

and \({\mathcal {D}} ({\hat{\Phi }}_{z}) = {\mathcal {D}} ({\hat{\Phi }}_{{\mathfrak {z}}})\). In the next step we observe that Lemma 3.29 (with \(n = N\), \(h_m = \nicefrac {1}{N}\), \(\phi _m = \phi ^{0, d}_{\varepsilon }\), \(a= a\) for \(N, d \in {\mathbb {N}}\), \(\varepsilon \in (0, 1]\), \(m \in \{1, 2, \ldots , N\}\) in the notation of Lemma 3.29) demonstrates that there exist \({\mathbf {g}}^{N, d}_{\varepsilon } \in {\mathbf {N}}\), \(\varepsilon \in (0, 1]\), \(d, N \in {\mathbb {N}}\), which satisfy for all \(N, d \in {\mathbb {N}}\), \(\varepsilon \in (0, 1]\), \(x = (x_i)_{i \in \{1, 2, \ldots , N\}} \in {\mathbb {R}}^{Nd}\) that \({\mathcal {R}}_{a}({\mathbf {g}}^{N, d}_{\varepsilon }) \in C({\mathbb {R}}^{N d}, {\mathbb {R}})\) and

$$\begin{aligned} ({\mathcal {R}}_{a}({\mathbf {g}}^{N, d}_{\varepsilon })) (x) = \frac{1}{N} \sum _{i = 1}^N ( {\mathcal {R}}_{a}(\phi ^{0, d}_{\varepsilon }))(x_i). \end{aligned}$$
(4.64)

This, (4.48), and (4.41) ensure that for all \(N, d \in {\mathbb {N}}\), \(\varepsilon \in (0, 1]\), \(x = (x_i)_{i \in \{1, 2, \ldots , N\}} \in {\mathbb {R}}^{Nd}\) it holds that

$$\begin{aligned} \begin{aligned}&|g_{N,d}(x) - ( {\mathcal {R}}_{a}({\mathbf {g}}^{N, d}_{\varepsilon }) )(x) | \le \frac{1}{N} \sum _{i=1}^N |\varphi _{0,d}(x_i) - ( {\mathcal {R}}_{a}( \phi ^{0, d}_{\varepsilon }))(x_i)|\\&\quad \le \frac{\varepsilon \kappa d^{{\mathfrak {d}}_5}}{N} \sum _{i=1}^N (d^{\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)} + \Vert x_i\Vert ^{\theta }) = \varepsilon \kappa d^{{\mathfrak {d}}_5} \left[ d^{\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)} + \tfrac{1}{N} \textstyle \sum \nolimits _{i=1}^N \displaystyle \Vert x_i \Vert ^{\theta } \right] . \end{aligned} \end{aligned}$$
(4.65)

Moreover, note that (4.64) and the assumption that \(\forall \, \varepsilon \in (0,1], d \in {\mathbb {N}}, x, y \in {\mathbb {R}}^d :|( {\mathcal {R}}_{a}(\phi ^{ 0, d }_{ \varepsilon }) )(x) - ( {\mathcal {R}}_{a}(\phi ^{ 0, d }_{ \varepsilon }) )(y)| \le \kappa d^{{\mathfrak {d}}_6} (1 + \Vert x\Vert ^{\theta } + \Vert y \Vert ^{\theta })\Vert x-y\Vert \) imply that for all \(N, d \in {\mathbb {N}}\), \(\varepsilon \in (0, 1]\), \(x = (x_i)_{i \in \{1, 2, \ldots , N\}} \in {\mathbb {R}}^{Nd}\), \(y = (y_i)_{i \in \{1, 2, \ldots , N\}} \in {\mathbb {R}}^{Nd}\) it holds that

$$\begin{aligned} \begin{aligned}&|( {\mathcal {R}}_{a}({\mathbf {g}}^{N, d}_{\varepsilon }) )(x) - ( {\mathcal {R}}_{a}({\mathbf {g}}^{N, d}_{\varepsilon }) )(y)|\\&\quad \le \frac{1}{N} \sum _{i=1}^N | ( {\mathcal {R}}_{a}( \phi ^{0, d}_{\varepsilon }))(x_i) - ( {\mathcal {R}}_{a}( \phi ^{0, d}_{\varepsilon }))(y_i)|\\&\quad \le \frac{\kappa d^{{\mathfrak {d}}_6}}{N} \left[ \textstyle \sum \nolimits _{i=1}^N \displaystyle (1 + \Vert x_i\Vert ^{\theta } + \Vert y_i \Vert ^{\theta })\Vert x_i- y_i\Vert \right] . \end{aligned} \end{aligned}$$
(4.66)

Next observe that the fact that \({\mathcal {D}}({\mathfrak {I}}_d) = (d, 2d, d)\) and, e.g., [25, Proposition 2.16] (with \(\Psi = {\mathfrak {I}}_d\), \(\Phi _1 = \phi ^{0, d}_{\varepsilon }\), \(\Phi _2 \in \{\Phi \in {\mathbf {N}}:{\mathcal {I}}(\Phi ) = {\mathcal {O}}(\Phi ) =d \}\), \({\mathfrak {i}} = 2d\) in the notation of [25, Proposition 2.16]) prove that for every \(N, d \in {\mathbb {N}}\), \(\varepsilon \in (0, 1]\), \(\Phi _1, \Phi _2, \ldots , \Phi _{N} \in \{\Phi \in {\mathbf {N}}:{\mathcal {I}}(\Phi ) = {\mathcal {O}}(\Phi ) =d \}\) with \({\mathcal {D}}(\Phi _1) = {\mathcal {D}}(\Phi _2) = \ldots = {\mathcal {D}}(\Phi _{N})\) there exist \(\Psi _1, \Psi _2, \ldots , \Psi _{N} \in {\mathbf {N}}\) such that for all \(i \in \{1, 2, \ldots , N\}\) it holds that \( {\mathcal {R}}_{a}(\Psi _i) \in C({\mathbb {R}}^d, {\mathbb {R}})\), \({\mathcal {D}}(\Psi _i) = {\mathcal {D}}(\Psi _1)\), \({\mathcal {P}}(\Psi _i) \le 2 ({\mathcal {P}}(\phi ^{0, d}_{\varepsilon }) + {\mathcal {P}}(\Phi _i))\), and

$$\begin{aligned} {\mathcal {R}}_{a}(\Psi _i) = [ {\mathcal {R}}_{a}(\phi ^{0, d}_{\varepsilon })] \circ [{\mathcal {R}}_{a}({\mathfrak {I}}_d)] \circ [{\mathcal {R}}_{a}(\Phi _i)] = [ {\mathcal {R}}_{a}(\phi ^{0, d}_{\varepsilon })] \circ [{\mathcal {R}}_{a}(\Phi _i)]. \end{aligned}$$
(4.67)

This, (4.64), and Lemma 3.28 assure that for every \(N, d \in {\mathbb {N}}\), \(\varepsilon \in (0, 1]\), \(\Phi _1, \Phi _2, \ldots , \Phi _{N} \in \{\Phi \in {\mathbf {N}}:{\mathcal {I}}(\Phi ) = {\mathcal {O}}(\Phi ) =d \}\) with \({\mathcal {D}}(\Phi _1) = {\mathcal {D}}(\Phi _2) = \ldots = {\mathcal {D}}(\Phi _{N})\) there exists \(\Psi \in {\mathbf {N}}\) such that for all \(x \in {\mathbb {R}}^d\) it holds that \( {\mathcal {R}}_{a}(\Psi ) \in C({\mathbb {R}}^d, {\mathbb {R}})\), \({\mathcal {P}}(\Psi ) \le 2 N^2 ({\mathcal {P}}(\phi ^{0, d}_{\varepsilon }) + {\mathcal {P}}(\Phi _1))\), and

$$\begin{aligned} \begin{aligned} ({\mathcal {R}}_{a}(\Psi )) (x)&= \frac{1}{N} \sum _{i=1}^{N} ( {\mathcal {R}}_{a}(\phi ^{0, d}_{\varepsilon }))\big ( ({\mathcal {R}}_{a}(\Phi _i)) (x)\big )\\&= ( {\mathcal {R}}_{a}({\mathbf {g}}^{N, d}_{\varepsilon })) \big ( ({\mathcal {R}}_{a}(\Phi _1)) (x), ({\mathcal {R}}_{a}(\Phi _2)) (x), \ldots , ({\mathcal {R}}_{a}(\Phi _N)) (x)\big ). \end{aligned} \end{aligned}$$
(4.68)

The assumption that \(\forall \, d \in {\mathbb {N}}, \varepsilon \in (0, 1] :{\mathcal {P}}( \phi ^{ 0, d }_{ \varepsilon } ) \le \kappa d^{ {\mathfrak {d}}_3 } \varepsilon ^{ - {\mathfrak {e}}}\) hence ensures that for every \(N, d \in {\mathbb {N}}\), \(\varepsilon \in (0, 1]\), \(\Phi _1, \Phi _2, \ldots , \Phi _{N} \in \{\Phi \in {\mathbf {N}}:{\mathcal {I}}(\Phi ) = {\mathcal {O}}(\Phi ) =d \}\) with \({\mathcal {D}}(\Phi _1) = {\mathcal {D}}(\Phi _2) = \ldots = {\mathcal {D}}(\Phi _{N})\) there exists \(\Psi \in {\mathbf {N}}\) such that for all \(x \in {\mathbb {R}}^d\) it holds that \( {\mathcal {R}}_{a}(\Psi ) \in C({\mathbb {R}}^d, {\mathbb {R}})\), \(({\mathcal {R}}_{a}(\Psi )) (x) = ( {\mathcal {R}}_{a}({\mathbf {g}}^{N, d}_{\varepsilon })) ( ({\mathcal {R}}_{a}(\Phi _1)) (x), ({\mathcal {R}}_{a}(\Phi _2)) (x),\) \( \ldots , ({\mathcal {R}}_{a}(\Phi _N)) (x))\), and

$$\begin{aligned} \begin{aligned} {\mathcal {P}}(\Psi ) \le 2 N^2 ( \kappa d^{ {\mathfrak {d}}_3 } \varepsilon ^{ - {\mathfrak {e}}} + {\mathcal {P}}(\Phi _1)) \le 2 \max \{\kappa , 1\} N^2 ( d^{ {\mathfrak {d}}_3 } \varepsilon ^{ - {\mathfrak {e}}} + {\mathcal {P}}(\Phi _1)). \end{aligned} \end{aligned}$$
(4.69)
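To make the composition step behind (4.67)–(4.69) concrete: inserting the ReLU identity network \({\mathfrak {I}}_d\) (with layer dimensions \((d, 2d, d)\), realizing \(x = \max \{x, 0\} - \max \{-x, 0\}\)) between two networks composes their realizations while keeping the parameter count controlled. The following is a minimal numerical sketch of this mechanism only, not an implementation of [25, Proposition 2.16]; the representation of networks as lists of weight–bias pairs and the helper names `realize`, `compose`, and `num_params` are our own illustrative choices.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def realize(layers, x):
    # feedforward ReLU net given as a list of (W, b);
    # ReLU is applied after every layer except the last
    for W, b in layers[:-1]:
        x = relu(W @ x + b)
    W, b = layers[-1]
    return W @ x + b

def num_params(layers):
    return sum(W.size + b.size for W, b in layers)

def compose(outer, inner, d):
    # realize outer ∘ inner by routing inner's output through the
    # 2d-wide ReLU identity block x = ReLU(x) - ReLU(-x), fusing the
    # block's affine maps with the adjacent layers of inner and outer
    E = np.eye(d)
    W1, b1 = np.vstack([E, -E]), np.zeros(2 * d)   # into the block
    W2, b2 = np.hstack([E, -E]), np.zeros(d)       # out of the block
    Wl, bl = inner[-1]          # last affine layer of inner
    Wf, bf = outer[0]           # first layer of outer
    return (inner[:-1]
            + [(W1 @ Wl, W1 @ bl + b1)]
            + [(Wf @ W2, Wf @ b2 + bf)]
            + outer[1:])

rng = np.random.default_rng(0)
d = 3
inner = [(rng.standard_normal((5, d)), rng.standard_normal(5)),
         (rng.standard_normal((d, 5)), rng.standard_normal(d))]
outer = [(rng.standard_normal((4, d)), rng.standard_normal(4)),
         (rng.standard_normal((1, 4)), rng.standard_normal(1))]

x = rng.standard_normal(d)
psi = compose(outer, inner, d)
assert np.allclose(realize(psi, x), realize(outer, realize(inner, x)))
assert num_params(psi) <= 2 * (num_params(outer) + num_params(inner))
```

Fusing the affine maps of \({\mathfrak {I}}_d\) with the adjacent layers, as above, is what keeps the composed network within a parameter bound of the form \({\mathcal {P}}(\Psi _i) \le 2 ({\mathcal {P}}(\phi ^{0, d}_{\varepsilon }) + {\mathcal {P}}(\Phi _i))\).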

Furthermore, note that (4.41) and the assumption that \(\forall \, d\in {\mathbb {N}}, \varepsilon \in (0, 1], x, y \in {\mathbb {R}}^d :|( {\mathcal {R}}_{a}(\phi ^{ 0, d }_{ \varepsilon }) )(x) - ( {\mathcal {R}}_{a}(\phi ^{ 0, d }_{ \varepsilon }) )(y)| \le \kappa d^{ {\mathfrak {d}}_6} (1 + \Vert x\Vert ^{\theta } + \Vert y \Vert ^{\theta })\Vert x-y\Vert \) demonstrate for all \(d \in {\mathbb {N}}\), \(\varepsilon \in (0, 1]\), \(x, y \in {\mathbb {R}}^d\) that

$$\begin{aligned}&|\varphi _{0,d}(x) - \varphi _{0, d}(y)| \nonumber \\&\quad \le | \varphi _{ 0, d }(x) - ( {\mathcal {R}}_{a}(\phi ^{ 0, d }_{ \varepsilon }) )(x) | + | \varphi _{ 0, d }(y) - ( {\mathcal {R}}_{a}(\phi ^{ 0, d }_{ \varepsilon }) )(y) | \nonumber \\&\qquad + |( {\mathcal {R}}_{a}(\phi ^{ 0, d }_{ \varepsilon }) )(x) - ( {\mathcal {R}}_{a}(\phi ^{ 0, d }_{ \varepsilon } ))(y)|\nonumber \\&\quad \le \varepsilon \kappa d^{ {\mathfrak {d}}_5 } ( d^{\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)} + \Vert x \Vert ^{ \theta } ) + \varepsilon \kappa d^{ {\mathfrak {d}}_5 } ( d^{\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)} + \Vert y \Vert ^{ \theta } ) \nonumber \\&\qquad + \kappa d^{ {\mathfrak {d}}_6} (1 + \Vert x\Vert ^{\theta } + \Vert y \Vert ^{\theta })\Vert x-y\Vert . \end{aligned}$$
(4.70)

Letting \(\varepsilon \searrow 0\) in (4.70) hence establishes that for all \(d \in {\mathbb {N}}\), \(x, y \in {\mathbb {R}}^d\) it holds that

$$\begin{aligned} \begin{aligned} |\varphi _{0,d}(x) - \varphi _{0, d}(y)|&\le \kappa d^{ {\mathfrak {d}}_6} (1 + \Vert x\Vert ^{\theta } + \Vert y \Vert ^{\theta })\Vert x-y\Vert . \end{aligned} \end{aligned}$$
(4.71)

Next observe that the assumption that \(\forall \, d\in {\mathbb {N}}, \varepsilon \in (0, 1], x \in {\mathbb {R}}^d :\Vert ( {\mathcal {R}}_{a}(\phi ^{ 1, d }_{ \varepsilon }) )(x) \Vert \le \kappa ( d^{ {\mathfrak {d}}_1 + {\mathfrak {d}}_2 } + \Vert x \Vert ) \) and (4.41) ensure for all \(d \in {\mathbb {N}}\), \(\varepsilon \in (0, 1]\), \(x \in {\mathbb {R}}^d\) that

$$\begin{aligned} \begin{aligned} \Vert \varphi _{1, d} (x) \Vert&\le \Vert \varphi _{ 1, d }(x) - ( {\mathcal {R}}_{a}(\phi ^{ 1, d }_{ \varepsilon }) )(x) \Vert + \Vert ( {\mathcal {R}}_{a}(\phi ^{ 1, d }_{ \varepsilon }) )(x) \Vert \\&\le \varepsilon \kappa d^{{\mathfrak {d}}_4} (d^{\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)}+ \Vert x\Vert ^{\theta }) + \kappa ( d^{ {\mathfrak {d}}_1 + {\mathfrak {d}}_2 } + \Vert x \Vert ). \end{aligned} \end{aligned}$$
(4.72)

Letting \(\varepsilon \searrow 0\) in (4.72) hence proves that for all \(d \in {\mathbb {N}}\), \(x \in {\mathbb {R}}^d\) it holds that

$$\begin{aligned} \Vert \varphi _{1, d} (x) \Vert \le \kappa ( d^{ {\mathfrak {d}}_1 + {\mathfrak {d}}_2 } + \Vert x \Vert ). \end{aligned}$$
(4.73)

In the next step we note that Hölder’s inequality, the assumption that \(\forall \, d \in {\mathbb {N}}:[ \int _{{\mathbb {R}}^d} \Vert x\Vert ^{2p \theta } \, \nu _d (dx) ]^{\nicefrac {1}{(2p\theta )}} \le \kappa d^{{\mathfrak {d}}_1 + {\mathfrak {d}}_2}\), and the assumption that \(\theta \in [1, \infty )\) assure that for all \(d \in {\mathbb {N}}\) it holds that

$$\begin{aligned} \begin{aligned} \left[ \int _{{\mathbb {R}}^d} \Vert x\Vert ^{p (1+\theta )} \, \nu _d (dx) \right] ^{\nicefrac {1}{(p(1+\theta ))}}&\le \left[ \int _{{\mathbb {R}}^d} \Vert x\Vert ^{2p\theta } \, \nu _d (dx) \right] ^{\nicefrac {1}{(2p\theta )}} \le \kappa d^{{\mathfrak {d}}_1 + {\mathfrak {d}}_2}. \end{aligned} \end{aligned}$$
(4.74)
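Since each \(\nu _d\) is a probability measure and \(p(1+\theta ) \le 2p\theta \) for \(\theta \ge 1\), (4.74) is an instance of the monotonicity of \(L^q\)-norms on probability spaces. A quick numerical sanity check (with illustrative values of \(p\) and \(\theta \); since the empirical measure of the sample is itself a probability measure, the inequality holds exactly for the sample averages):

```python
import numpy as np

rng = np.random.default_rng(1)
d, p, theta = 10, 2.0, 1.5
q_small = p * (1 + theta)    # = 5.0
q_large = 2 * p * theta      # = 6.0; q_small <= q_large since theta >= 1

# uniform samples on [0,1]^d stand in for nu_d
x = rng.uniform(size=(100_000, d))
norms = np.linalg.norm(x, axis=1)
lhs = np.mean(norms ** q_small) ** (1 / q_small)
rhs = np.mean(norms ** q_large) ** (1 / q_large)
# L^q-norm monotonicity, plus ||x|| <= sqrt(d) on [0,1]^d
assert lhs <= rhs <= np.sqrt(d)
```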

Next note that (4.47), (4.45), and (4.44) imply that for all \(N, d \in {\mathbb {N}}\), \(m \in \{1, 2, \ldots , N\}\), \(x \in {\mathbb {R}}^d\), \(n \in \{1, 2, \ldots , N\}\) it holds that

$$\begin{aligned} \begin{aligned} Y^{N, d, m, x}_{n}&= Z^{N, d, m}_{n-1} + Y^{N, d, m, x}_{n-1} + \tfrac{T}{N} \varphi _{1,d}(Y^{N, d, m, x}_{n-1}) \\&= Y^{N, d, m, x}_{n-1} + \tfrac{T}{N} \varphi _{1,d}(Y^{N, d, m, x}_{n-1}) + {\mathcal {A}}_d W^{d, m}_{\frac{nT}{N}} - {\mathcal {A}}_d W^{d, m}_{\frac{(n-1)T}{N}}. \end{aligned} \end{aligned}$$
(4.75)

The assumption that \(\forall \, d \in {\mathbb {N}}, x \in {\mathbb {R}}^d :| \varphi _{ 0, d }( x )| \le \kappa d^{ {\mathfrak {d}}_6 } ( d^{ \theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2) } + \Vert x \Vert ^{ \theta } )\), the assumption that \( \forall \, d \in {\mathbb {N}}:{\text {Trace}}(A_d) \le \kappa d^{ 2 {\mathfrak {d}}_1 } \), the assumption that \(\forall \, d \in {\mathbb {N}}, x, y \in {\mathbb {R}}^d :\Vert \varphi _{ 1, d }( x ) - \varphi _{ 1, d }( y ) \Vert \le \kappa \Vert x - y \Vert \), (4.71), (4.73), (4.74), (4.46), and Proposition 4.4 (with \(d=d\), \(M=N\), \(n=d\), \(T=T\), \(\kappa = \kappa \), \(\theta = \theta \), \({\mathfrak {d}}_0 = {\mathfrak {d}}_6\), \({\mathfrak {d}}_1 = {\mathfrak {d}}_1 + {\mathfrak {d}}_2\), \(h = \nicefrac {T}{N}\), \(B = {\mathcal {A}}_d\), \(p=p\), \(\nu = \nu _d\), \((\Omega , {\mathcal {F}}, {\mathbb {P}}) = (\Omega , {\mathcal {F}}, {\mathbb {P}})\), \(W^m = W^{d, m}\), \(f_0 = \varphi _{0,d}\), \(f_1 = \varphi _{1, d}\) for \(N, d \in {\mathbb {N}}\) in the notation of Proposition 4.4) hence establish that for all \(N, d \in {\mathbb {N}}\) it holds that

$$\begin{aligned}&\left( {\mathbb {E}}\! \left[ \int _{{\mathbb {R}}^d} \Big |{\mathbb {E}}[\varphi _{0,d} (X_T^{d, x})] - \tfrac{1}{N} \Big [ \textstyle \sum \nolimits _{i=1}^N \displaystyle \varphi _{0,d} (Y^{N, d, i, x}_{N}) \Big ] \Big |^p \, \nu _d (dx) \right] \right) ^{\!\nicefrac {1}{p}}\nonumber \\&\quad \le 2^{4\theta +5} | \!\max \{1, T\} |^{\theta +1} |\!\max \{ \kappa , \theta , 1 \}|^{2\theta +3} e^{(6\max \{ \kappa , \theta , 1 \}+5|\!\max \{ \kappa , \theta , 1 \}|^2 T)} \nonumber \\&\qquad \cdot p (p \theta + p +1)^{\theta } d^{{\mathfrak {d}}_6 + ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)(\theta +1)} \big ( N^{-\nicefrac {1}{2}} + N^{-\nicefrac {1}{2}}\big )\nonumber \\&\quad = 2^{4\theta +6} |\! \max \{1, T\} |^{\theta +1} |\!\max \{ \kappa , \theta \}|^{2\theta +3} e^{(6\max \{ \kappa , \theta \}+5|\!\max \{ \kappa , \theta \}|^2 T)} \nonumber \\&\qquad \cdot p (p \theta + p +1)^{\theta } d^{{\mathfrak {d}}_6 + ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)(\theta +1)} N^{-\nicefrac {1}{2}} . \end{aligned}$$
(4.76)
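The quantity estimated in (4.76) can be illustrated with a short simulation of the Monte Carlo Euler scheme (4.75). The sketch below is ours: it uses a generic diffusion matrix `B` in place of \({\mathcal {A}}_d\), decouples the number of sample paths `M` from the number of time steps `N` (whereas the estimate above couples both to \(N\)), and tests against a toy example (\(\varphi _{1,d} \equiv 0\), \(B = {\text {I}}_d\), \(\varphi _{0,d}(x) = \Vert x \Vert ^2\)) in which \({\mathbb {E}}[\varphi _{0,d}(X_T^{d, x})] = \Vert x\Vert ^2 + dT\) is known in closed form.

```python
import numpy as np

def mc_euler(phi0, phi1, B, x, T, N, M, rng):
    # Monte Carlo Euler scheme: average phi0 over M independent paths
    # Y_n = Y_{n-1} + (T/N) phi1(Y_{n-1}) + B (W_{nT/N} - W_{(n-1)T/N})
    d = x.shape[0]
    h = T / N
    Y = np.tile(x, (M, 1))
    for _ in range(N):
        dW = np.sqrt(h) * rng.standard_normal((M, d))
        Y = Y + h * phi1(Y) + dW @ B.T
    return float(np.mean(phi0(Y)))

# toy check (illustrative, not from the result above): phi1 = 0 and
# B = I_d give X_T = x + W_T, so E[||X_T||^2] = ||x||^2 + d*T exactly
rng = np.random.default_rng(3)
d, T = 2, 1.0
x = np.ones(d)
est = mc_euler(lambda y: np.sum(y ** 2, axis=1),
               lambda y: np.zeros_like(y),
               np.eye(d), x, T, N=50, M=20_000, rng=rng)
exact = float(x @ x + d * T)
```

With \(M = N\) samples, the Monte Carlo error of such an average decays at the rate \(N^{-\nicefrac {1}{2}}\) appearing on the right-hand side of (4.76).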

This, the fact that \(\forall \, d\in {\mathbb {N}}, x \in {\mathbb {R}}^d :| \varphi _{ 0, d }( x ) | \le \kappa d^{ {\mathfrak {d}}_6 } ( d^{\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2) } + \Vert x \Vert ^{ \theta } )\), (4.73), (4.48), and, e.g., [36, Theorem 3.1] (with \((\Omega , {\mathcal {F}}, {\mathbb {P}}) = (\Omega , {\mathcal {F}}, {\mathbb {P}})\), \(T=T\), \(d=d\), \(m=d\), \(B= {\mathcal {A}}_d\), \(\varphi = \varphi _{0,d}\), \(\mu = \varphi _{1,d}\) for \(d \in {\mathbb {N}}\) in the notation of [36, Theorem 3.1]) prove for all \(N, d \in {\mathbb {N}}\) that

$$\begin{aligned} \begin{aligned}&\left( {\mathbb {E}}\! \left[ \int _{{\mathbb {R}}^d} \big |u_d(T, x) - g_{N,d} (Y^{N, d, x}_N) \big |^p \, \nu _d (dx) \right] \right) ^{\!\nicefrac {1}{p}}\\&\quad = \left( {\mathbb {E}}\! \left[ \int _{{\mathbb {R}}^d} \big |{\mathbb {E}}[\varphi _{0,d} (X_T^{d, x})] - g_{N,d} (Y^{N, d, x}_N) \big |^p \, \nu _d (dx) \right] \right) ^{\!\nicefrac {1}{p}}\\&\quad = \left( {\mathbb {E}}\! \left[ \int _{{\mathbb {R}}^d} \Big |{\mathbb {E}}[\varphi _{0,d} (X_T^{d, x})] - \tfrac{1}{N} \Big [ \textstyle \sum \nolimits _{i=1}^N \displaystyle \varphi _{0,d} (Y^{N, d, i, x}_{N}) \Big ] \Big |^p \, \nu _d (dx) \right] \right) ^{\!\nicefrac {1}{p}}\\&\quad \le 2^{4\theta +6} | \!\max \{1, T\} |^{\theta +1} |\!\max \{ \kappa , \theta \}|^{2\theta +3} e^{(6\max \{ \kappa , \theta \}+5|\!\max \{ \kappa , \theta \}|^2 T)} \\&\qquad \cdot p (p \theta + p +1)^{\theta } d^{{\mathfrak {d}}_6 + ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)(\theta +1)} N^{-\nicefrac {1}{2}}. \end{aligned} \end{aligned}$$
(4.77)

Combining this, (4.47), (4.51), (4.52), (4.56), (4.57), (4.58), (4.62), (4.63), (4.65), (4.66), (4.69), and Theorem 2.3 (with \((\Omega , {\mathcal {F}}, {\mathbb {P}}) = (\Omega , {\mathcal {F}}, {\mathbb {P}})\), \({\mathfrak {n}}_0= \nicefrac {1}{2}\), \({\mathfrak {n}}_1 = 0\), \({\mathfrak {n}}_2 =2\), \({\mathfrak {e}}= {\mathfrak {e}}\), \({\mathfrak {d}}_0 = {\mathfrak {d}}_6 + ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)(\theta +1)\), \({\mathfrak {d}}_1 = {\mathfrak {d}}_1\), \({\mathfrak {d}}_2 = {\mathfrak {d}}_2\), \({\mathfrak {d}}_3 = \max \{4, {\mathfrak {d}}_3\}\), \({\mathfrak {d}}_4 = {\mathfrak {d}}_4\), \({\mathfrak {d}}_5 = {\mathfrak {d}}_5\), \({\mathfrak {d}}_6 = {\mathfrak {d}}_6\), \( {\mathfrak {C}}=2^{4\theta +6} | \!\max \{1, T\} |^{\theta +1} |\!\max \{ \kappa , \theta \}|^{2\theta +3} e^{(6\max \{ \kappa , \theta \}+5|\!\max \{ \kappa , \theta \}|^2 T)} p (p \theta + p +1)^{\theta }\), \(p=p\), \(\theta = \theta \), \(M_N= N\), \(Z^{N, d, m}_n = Z^{N, d, m}_n\), \(f_{N, d} = f_{N, d}\), \(Y^{N, d, x}_l = Y^{N, d, x}_l\), \(\left\| \cdot \right\| = \left\| \cdot \right\| \), \(\nu _d = \nu _d\), \(g_{ N, d } = g_{ N, d }\), \(u_d(x) = u_d(T,x)\), \({\mathbf {N}}= {\mathbf {N}}\), \({\mathcal {P}}= {\mathcal {P}}\), \({\mathcal {D}}= {\mathcal {D}}\), \({\mathcal {R}}= {\mathcal {R}}_{a}\), \({\mathfrak {N}}_{d, \varepsilon } = {\mathfrak {N}}_{d, \varepsilon }\), \({\mathbf {f}}^{N, d}_{\varepsilon , z} = {\mathbf {f}}^{N, d}_{\varepsilon , z}\), \({\mathbf {g}}^{N, d}_{\varepsilon } = {\mathbf {g}}^{N, d}_{\varepsilon }\), \({\mathfrak {I}}_d = {\mathfrak {I}}_d\) for \(N, d \in {\mathbb {N}}\), \(m \in \{1, 2, \ldots , N\}\), \(n \in \{0, 1, \ldots , N-1\}\), \(l \in \{0, 1, \ldots , N\}\), \(\varepsilon \in (0, 1]\), \(x, z \in {\mathbb {R}}^d\) in the notation of Theorem 2.3) establish (4.43). The proof of Theorem 4.5 is thus completed. \(\square \)

Corollary 4.6

Let \( \varphi _{0,d} :{\mathbb {R}}^d \rightarrow {\mathbb {R}}\), \( d \in {\mathbb {N}}\), and \( \varphi _{ 1, d } :{\mathbb {R}}^d \rightarrow {\mathbb {R}}^d \), \( d \in {\mathbb {N}}\), be functions, let \(\left\| \cdot \right\| \! :(\cup _{d \in {\mathbb {N}}} {\mathbb {R}}^d) \rightarrow [0, \infty )\) satisfy for all \(d \in {\mathbb {N}}\), \(x = (x_1, x_2, \ldots , x_d) \in {\mathbb {R}}^d\) that \(\Vert x\Vert = ( \textstyle \sum _{i=1}^d |x_i|^2)^{\nicefrac {1}{2}}\), let \( T, \kappa \in (0, \infty )\), \({\mathfrak {e}}, {\mathfrak {d}}_1, {\mathfrak {d}}_2, \ldots , {\mathfrak {d}}_6 \in [0, \infty )\), \(\theta \in [1, \infty )\), \(p \in [2, \infty )\), \( ( \phi ^{ m, d }_{ \varepsilon } )_{ (m, d, \varepsilon ) \in \{ 0, 1 \} \times {\mathbb {N}}\times (0,1] } \subseteq {\mathbf {N}}\), \(a\in C({\mathbb {R}}, {\mathbb {R}})\) satisfy for all \(x \in {\mathbb {R}}\) that \(a(x) = \max \{x, 0\}\), assume for all \( d \in {\mathbb {N}}\), \( \varepsilon \in (0,1] \), \( m \in \{0, 1\}\), \( x, y \in {\mathbb {R}}^d \) that \( {\mathcal {R}}_{a}( \phi ^{ 0, d }_{ \varepsilon } ) \in C( {\mathbb {R}}^d, {\mathbb {R}}) \), \( {\mathcal {R}}_{a}( \phi ^{ 1, d }_{ \varepsilon } ) \in C( {\mathbb {R}}^d, {\mathbb {R}}^d ) \), \( {\mathcal {P}}( \phi ^{ m, d }_{ \varepsilon } ) \le \kappa d^{ 2^{(-m)} {\mathfrak {d}}_3 } \varepsilon ^{ - 2^{(-m)} {\mathfrak {e}}}\), \( |( {\mathcal {R}}_{a}(\phi ^{ 0, d }_{ \varepsilon }) )(x) - ( {\mathcal {R}}_{a}(\phi ^{ 0, d }_{ \varepsilon }) )(y)| \le \kappa d^{{\mathfrak {d}}_6} (1 + \Vert x\Vert ^{\theta } + \Vert y \Vert ^{\theta })\Vert x-y\Vert \), \( \Vert ( {\mathcal {R}}_{a}(\phi ^{ 1, d }_{ \varepsilon }) )(x) \Vert \le \kappa ( d^{ {\mathfrak {d}}_1 + {\mathfrak {d}}_2 } + \Vert x \Vert ) \), \(| \varphi _{ 0, d }( x )| \le \kappa d^{ {\mathfrak {d}}_6 } ( d^{ \theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2) } + \Vert x \Vert ^{ \theta } )\), \( \Vert \varphi _{ 1, d }( x ) - \varphi _{ 1, d }( y ) \Vert 
\le \kappa \Vert x - y \Vert \), and

$$\begin{aligned} \Vert \varphi _{ m, d }(x) - ( {\mathcal {R}}_{a}(\phi ^{ m, d }_{ \varepsilon }) )(x) \Vert \le \varepsilon \kappa d^{{\mathfrak {d}}_{(5 -m)}} (d^{\theta ({\mathfrak {d}}_1 + {\mathfrak {d}}_2)}+ \Vert x\Vert ^{\theta }) , \end{aligned}$$
(4.78)

and for every \( d \in {\mathbb {N}}\) let \( u_d :[0,T] \times {\mathbb {R}}^{d} \rightarrow {\mathbb {R}}\) be an at most polynomially growing viscosity solution of

$$\begin{aligned} \begin{aligned} \left( \tfrac{ \partial }{\partial t} u_d \right) ( t, x )&= \left( \tfrac{ \partial }{\partial x} u_d \right) ( t, x ) \, \varphi _{ 1, d }( x ) + \textstyle \sum \limits _{ i = 1 }^d \displaystyle \left( \tfrac{ \partial ^2 }{ \partial x_i^2 } u_d \right) ( t, x ) \end{aligned} \end{aligned}$$
(4.79)

with \( u_d( 0, x ) = \varphi _{ 0, d }( x ) \) for \( ( t, x ) \in (0,T) \times {\mathbb {R}}^d \) (cf. Definitions 3.1 and 3.3). Then there exist \( c \in {\mathbb {R}}\) and \( ( \Psi _{ d, \varepsilon } )_{ (d , \varepsilon ) \in {\mathbb {N}}\times (0,1] } \subseteq {\mathbf {N}}\) such that for all \( d \in {\mathbb {N}}\), \( \varepsilon \in (0,1] \) it holds that \( {\mathcal {R}}_{a}( \Psi _{ d, \varepsilon } ) \in C( {\mathbb {R}}^{ d }, {\mathbb {R}}) \), \([ \int _{ [0, 1]^d } | u_d(T, x) - ( {\mathcal {R}}_{a} (\Psi _{ d, \varepsilon }) )( x ) |^p \, dx ]^{ \nicefrac { 1 }{ p } } \le \varepsilon \), and

$$\begin{aligned}&{\mathcal {P}}( \Psi _{ d, \varepsilon } ) \le c \varepsilon ^{-({\mathfrak {e}}+6)} \nonumber \\&\quad \cdot d^{6[{\mathfrak {d}}_6 + ( \max \{{\mathfrak {d}}_1, \nicefrac {1}{2}\} + {\mathfrak {d}}_2)(\theta +1)] + \max \{4, {\mathfrak {d}}_3\} + {\mathfrak {e}}\max \{{\mathfrak {d}}_5 + \theta ( \max \{{\mathfrak {d}}_1, \nicefrac {1}{2}\} + {\mathfrak {d}}_2), {\mathfrak {d}}_4 + {\mathfrak {d}}_6 + 2\theta ( \max \{{\mathfrak {d}}_1, \nicefrac {1}{2}\} + {\mathfrak {d}}_2)\}} . \end{aligned}$$
(4.80)

Proof of Corollary 4.6

Throughout this proof for every \( d \in {\mathbb {N}}\) let \( \lambda _d :{\mathcal {B}}({\mathbb {R}}^d) \rightarrow [0, \infty ]\) be the Lebesgue-Borel measure on \({\mathbb {R}}^d\) and let \(\nu _d :{\mathcal {B}}({\mathbb {R}}^d) \rightarrow [0,1]\) be the function which satisfies for all \(B \in {\mathcal {B}}({\mathbb {R}}^d)\) that

$$\begin{aligned} \nu _d(B) = \lambda _{d}(B \cap [0, 1]^d). \end{aligned}$$
(4.81)

Observe that (4.81) implies that for all \(d \in {\mathbb {N}}\) it holds that \(\nu _d\) is a probability measure on \({\mathbb {R}}^d\). This and (4.81) ensure that for all \(d \in {\mathbb {N}}\), \(g \in C({\mathbb {R}}^d, {\mathbb {R}})\) it holds that

$$\begin{aligned} \int _{{\mathbb {R}}^d} |g(x)| \, \nu _d(dx) = \int _{ [0,1]^d } |g(x)| \, dx. \end{aligned}$$
(4.82)

Combining this with, e.g., [24, Lemma 3.15] demonstrates that for all \(d \in {\mathbb {N}}\) it holds that

$$\begin{aligned} \begin{aligned} \int _{{\mathbb {R}}^d} \Vert x\Vert ^{2p \theta } \, \nu _d (dx)&= \int _{[0,1]^d} \Vert x\Vert ^{2p \theta } \, dx \le d^{p \theta }. \end{aligned} \end{aligned}$$
(4.83)
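The bound (4.83) simply reflects that \(\Vert x \Vert \le \sqrt{d}\) for \(x \in [0,1]^d\), whence \(\Vert x\Vert ^{2p\theta } \le d^{p\theta }\) pointwise. A quick numerical sanity check with the illustrative values \(p = 2\), \(\theta = \nicefrac {3}{2}\) (the bound holds deterministically for every sampled point):

```python
import numpy as np

rng = np.random.default_rng(2)
p, theta = 2.0, 1.5          # illustrative values with theta >= 1
ok = True
for d in (1, 5, 20):
    x = rng.uniform(size=(200_000, d))
    # ||x||^2 <= d on [0,1]^d, hence ||x||^{2 p theta} <= d^{p theta}
    moment = np.mean(np.linalg.norm(x, axis=1) ** (2 * p * theta))
    ok = ok and moment <= d ** (p * theta)
assert ok
```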

This assures for all \(d \in {\mathbb {N}}\) that

$$\begin{aligned} \left[ \int _{{\mathbb {R}}^d} \Vert x\Vert ^{2p \theta } \, \nu _d (dx)\right] ^{\nicefrac {1}{(2p \theta )}} \le d^{\nicefrac {1}{2}} \le \max \{\kappa , 1\} d^{ \max \{{\mathfrak {d}}_1, \nicefrac {1}{2}\} + {\mathfrak {d}}_2}. \end{aligned}$$
(4.84)

Moreover, note that for all \(d \in {\mathbb {N}}\) it holds that

$$\begin{aligned} {\text {Trace}}({\text {I}}_d) \le d \le \max \{\kappa , 1\} d^{2 \max \{{\mathfrak {d}}_1, \nicefrac {1}{2}\}} \end{aligned}$$
(4.85)

(cf. Definition 3.6). This, (4.84), and Theorem 4.5 (with \(A_d = {\text {I}}_d\), \(\left\| \cdot \right\| = \left\| \cdot \right\| \), \(\nu _d = \nu _d\), \(\varphi _{ 0, d } = \varphi _{ 0, d }\), \(\varphi _{ 1, d } = \varphi _{ 1, d }\), \(T =T\), \(\kappa = \max \{\kappa , 1\}\), \({\mathfrak {e}}= {\mathfrak {e}}\), \({\mathfrak {d}}_1 = \max \{{\mathfrak {d}}_1, \nicefrac {1}{2}\}\), \({\mathfrak {d}}_2 = {\mathfrak {d}}_2\), \({\mathfrak {d}}_3 = {\mathfrak {d}}_3\), \({\mathfrak {d}}_4 = {\mathfrak {d}}_4\), \({\mathfrak {d}}_5 = {\mathfrak {d}}_5\), \({\mathfrak {d}}_6 = {\mathfrak {d}}_6\), \(\theta = \theta \), \(p = p\), \(\phi ^{0, d}_{\varepsilon } = \phi ^{0, d}_{\varepsilon }\), \(\phi ^{1, d}_{\varepsilon } = \phi ^{1, d}_{\varepsilon }\), \(a = a\), \(u_d = u_d\) for \(d \in {\mathbb {N}}\) in the notation of Theorem 4.5) establish (4.80). The proof of Corollary 4.6 is thus completed. \(\square \)

Corollary 4.7

Let \( A_d = ( A_{ d, i, j } )_{ (i, j) \in \{ 1, \dots , d \}^2 } \in {\mathbb {R}}^{ d \times d } \), \( d \in {\mathbb {N}}\), be symmetric positive semidefinite matrices, let \(\left\| \cdot \right\| \! :(\cup _{d \in {\mathbb {N}}} {\mathbb {R}}^d) \rightarrow [0, \infty )\) satisfy for all \(d \in {\mathbb {N}}\), \(x = (x_1, x_2, \ldots , x_d) \in {\mathbb {R}}^d\) that \(\Vert x\Vert = ( \textstyle \sum _{i=1}^d |x_i|^2)^{\nicefrac {1}{2}}\), for every \( d \in {\mathbb {N}}\) let \( \nu _d :{\mathcal {B}}({\mathbb {R}}^d) \rightarrow [0,1]\) be a probability measure on \({\mathbb {R}}^d\), let \( \varphi _{0,d} :{\mathbb {R}}^d \rightarrow {\mathbb {R}}\), \( d \in {\mathbb {N}}\), and \( \varphi _{ 1, d } :{\mathbb {R}}^d \rightarrow {\mathbb {R}}^d \), \( d \in {\mathbb {N}}\), be functions, let \( T, \kappa , p \in (0, \infty )\), \(\theta \in [1, \infty )\), \( ( \phi ^{ m, d }_{ \varepsilon } )_{ (m, d, \varepsilon ) \in \{ 0, 1 \} \times {\mathbb {N}}\times (0,1] } \subseteq {\mathbf {N}}\), \(a\in C({\mathbb {R}}, {\mathbb {R}})\) satisfy for all \(x \in {\mathbb {R}}\) that \(a(x) = \max \{x, 0\}\), assume for all \( d \in {\mathbb {N}}\), \( \varepsilon \in (0,1] \), \(m \in \{0, 1\}\), \( x, y \in {\mathbb {R}}^d \) that \( {\mathcal {R}}_{a}( \phi ^{ 0, d }_{ \varepsilon } ) \in C( {\mathbb {R}}^d, {\mathbb {R}}) \), \( {\mathcal {R}}_{a}( \phi ^{ 1, d }_{ \varepsilon } ) \in C( {\mathbb {R}}^d, {\mathbb {R}}^d ) \), \( | \varphi _{ 0, d }( x ) | + {\text {Trace}}(A_d) \le \kappa d^{ \kappa } ( 1 + \Vert x \Vert ^{ \theta }) \), \([ \int _{{\mathbb {R}}^d} \Vert x\Vert ^{2 \max \{p, 2\} \theta } \, \nu _d (dx) ]^{\nicefrac {1}{(2 \max \{p, 2\} \theta )}} \le \kappa d^{\kappa }\), \( {\mathcal {P}}( \phi ^{ m, d }_{ \varepsilon } ) \le \kappa d^{ \kappa } \varepsilon ^{ - \kappa } \), \( |( {\mathcal {R}}_{a}(\phi ^{ 0, d }_{ \varepsilon }) )(x) - ( {\mathcal {R}}_{a}(\phi ^{ 0, d }_{ \varepsilon }) )(y)| \le \kappa d^{\kappa } (1 + \Vert x\Vert 
^{\theta } + \Vert y \Vert ^{\theta })\Vert x-y\Vert \), \( \Vert ( {\mathcal {R}}_{a}(\phi ^{ 1, d }_{ \varepsilon }) )(x) \Vert \le \kappa ( d^{ \kappa } + \Vert x \Vert ) \), \( \Vert \varphi _{ 1, d }( x ) - \varphi _{ 1, d }( y ) \Vert \le \kappa \Vert x - y \Vert \), and

$$\begin{aligned} \Vert \varphi _{ m, d }(x) - ( {\mathcal {R}}_{a}(\phi ^{ m, d }_{ \varepsilon }) )(x) \Vert \le \varepsilon \kappa d^{ \kappa } ( 1 + \Vert x \Vert ^{ \theta } ) , \end{aligned}$$
(4.86)

and for every \( d \in {\mathbb {N}}\) let \( u_d :[0,T] \times {\mathbb {R}}^{d} \rightarrow {\mathbb {R}}\) be an at most polynomially growing viscosity solution of

$$\begin{aligned} \begin{aligned} \left( \tfrac{ \partial }{\partial t} u_d \right) ( t, x )&= \left( \tfrac{ \partial }{\partial x} u_d \right) ( t, x ) \, \varphi _{ 1, d }( x ) + \textstyle \sum \limits _{ i, j = 1 }^d \displaystyle A_{ d, i, j } \, \left( \tfrac{ \partial ^2 }{ \partial x_i \partial x_j } u_d \right) ( t, x ) \end{aligned} \end{aligned}$$
(4.87)

with \( u_d( 0, x ) = \varphi _{ 0, d }( x ) \) for \( ( t, x ) \in (0,T) \times {\mathbb {R}}^d \) (cf. Definitions 3.1 and 3.3). Then there exist \( c \in {\mathbb {R}}\) and \( ( \Psi _{ d, \varepsilon } )_{ (d , \varepsilon ) \in {\mathbb {N}}\times (0,1] } \subseteq {\mathbf {N}}\) such that for all \( d \in {\mathbb {N}}\), \( \varepsilon \in (0,1] \) it holds that \( {\mathcal {P}}( \Psi _{ d, \varepsilon } ) \le c \, d^c \varepsilon ^{ - c } \), \( {\mathcal {R}}_{a}( \Psi _{ d, \varepsilon } ) \in C( {\mathbb {R}}^{ d }, {\mathbb {R}}) \), and

$$\begin{aligned} \left[ \int _{ {\mathbb {R}}^d } | u_d(T,x) - ( {\mathcal {R}}_{a}(\Psi _{ d, \varepsilon }) )( x ) |^p \, \nu _d(dx) \right] ^{ \nicefrac { 1 }{ p } } \le \varepsilon . \end{aligned}$$
(4.88)