Deep ReLU Network Expression Rates for Option Prices in High-Dimensional, Exponential Lévy Models

We study the expression rates of deep neural networks (DNNs for short) for option prices written on baskets of $d$ risky assets, whose log-returns are modelled by a multivariate Lévy process with general correlation structure of jumps. We establish sufficient conditions on the characteristic triplet of the Lévy process $X$ that ensure expression error $\varepsilon$ of DNN-expressed option prices with DNNs whose size grows polynomially with respect to $\varepsilon^{-1}$, and with constants implied in $\mathcal{O}(\cdot)$ which grow polynomially with respect to $d$, thereby overcoming the curse of dimensionality and justifying the use of DNNs in financial modelling of large baskets in markets with jumps. In addition, we exploit parabolic smoothing of Kolmogorov partial integrodifferential equations for certain multivariate Lévy processes to present alternative architectures of ReLU DNNs that provide $\varepsilon$ expression error with DNN size $\mathcal{O}(|\log(\varepsilon)|^a)$ with exponent $a \sim d$, however with constants implied in $\mathcal{O}(\cdot)$ growing exponentially with respect to $d$. Under stronger, dimension-uniform non-degeneracy conditions on the Lévy symbol, we obtain algebraic expression rates of option prices in exponential Lévy models which are free from the curse of dimensionality. In this case the ReLU DNN expression rates of prices depend on certain sparsity conditions on the characteristic Lévy triplet. We indicate several consequences and possible extensions of the present results.

Recent years have seen a dynamic development in applications of deep neural networks (DNNs for short) for expressing high-dimensional input-output relations. This development was driven mainly by the need for quantitative modelling of input-output relationships subject to large sets of observation data. Rather naturally, therefore, DNNs have found a large number of applications in computational finance and in financial engineering. We refer to the survey Ruf and Wang [RW20] and to the references there. Without going into details, we only note that the majority of this activity addresses techniques to employ DNNs in demanding tasks in computational finance. The often striking computational efficiency of DNN-based algorithms naturally raises the question of a theoretical, in particular mathematical, underpinning of successful algorithms. Recent years have seen progress, in particular in the context of option pricing for Black-Scholes type models, for DNN-based numerical approximation of diffusion models on possibly large baskets (see, e.g., Berner et al. [BGJ20b], Elbrächter et al. [EGJS21], Ito et al. [IRZ21], and Reisinger and Zhang [RZ20] for game-type options). These references prove that DNN-based approximations of option prices on possibly large baskets of risky assets can overcome the so-called curse of dimensionality in the context of affine diffusion models for the dynamics of the (log-)prices of the underlying risky assets. These results can also be viewed as particular instances of DNN expression rate bounds for certain PDEs on high-dimensional state spaces, and indeed corresponding DNN expressive power results have been shown for their solution sets in Grohs et al. [GHJvW18], Gonon et al. [GGJ+19] and the references there.
Since the turn of the century, models beyond the classical diffusion setting have been employed increasingly in financial engineering. In particular, Lévy processes and their non-stationary generalizations such as Feller-Lévy processes (see, e.g., Böttcher et al. [BSW13] and the references there) have received wide attention. This can in part be explained by their ability to account for heavy tails of financial data and by Lévy-based models constituting hierarchies of models, comprising in particular classical diffusion ("Black-Scholes") models with constant volatility that are still widely used in computational finance as a benchmark. Therefore, all results for geometric Lévy processes in the present paper apply in particular to the Black-Scholes model. The "Feynman-Kac correspondence", which relates conditional expectations of sufficiently regular functionals over diffusions to (viscosity) solutions of corresponding Kolmogorov PDEs, extends to multivariate Lévy processes. We mention only Nualart and Schoutens [NS01], Cont and Tankov [CT04], Cont and Voltchkova [CV05b], Glau [Gla16], Eberlein and Kallsen [EK19, Chapter 5.4] and the references there. The Kolmogorov PDE ("Black-Scholes equation") in the diffusion case is then replaced by a so-called partial integrodifferential equation (PIDE), where the fractional integrodifferential operator accounting for the jumps is related in a one-to-one fashion with the Lévy measure $\nu^d$ of the $\mathbb{R}^d$-valued LP $X^d$. In particular, Lévy type models for (log-)returns of risky assets result in nonlocal partial integrodifferential equations for the option price, which generalize the linear parabolic differential equations which arise in classical diffusion models. We refer to Bertoin [Ber96], Sato [Sat99] for fundamentals on Lévy processes and to Böttcher et al. [BSW13] for extensions to certain non-stationary settings. For the use of Lévy processes in financial modelling we refer to Cont and Tankov [CT04], Eberlein and Kallsen [EK19] and to the references there. We refer to Cont and Voltchkova [CV05b, CV05a], Matache et al. [MvPS04], Hilber et al. [HRSW13] for a presentation and for numerical methods for option pricing in Lévy models.
The results on DNNs in the context of option pricing mentioned above are exclusively concerned with models with continuous price processes. This naturally raises the question whether DNN-based approximations are still capable of overcoming the curse of dimensionality in high-dimensional financial models with jumps, which have a much richer mathematical structure. This question is precisely the subject of this article. We study the expression rates of DNNs for prices of options (and the associated PIDEs) written on possibly large baskets of risky assets, whose log-returns are modelled by a multivariate Lévy process with general correlation structure of jumps. In particular, we establish sufficient conditions on the characteristic triplet of the Lévy process $X^d$ that ensure expression error $\varepsilon$ of DNN-expressed option prices with DNNs of size $\mathcal{O}(\varepsilon^{-2})$, and with constants implied in $\mathcal{O}(\cdot)$ which grow polynomially with respect to $d$. This shows that DNNs are capable of overcoming the curse of dimensionality also for general exponential Lévy models. Let us outline the scope of our results. The DNN expression rate results proved here give a theoretical justification for neural network based non-parametric option pricing methods. These have become very popular recently; see for instance the recent survey Ruf and Wang [RW20]. Our results show that if option prices result from an exponential Lévy model, as described e.g. in [EK19, Chapter 3.7], then under mild conditions on the Lévy triplets these prices can be expressed efficiently by (ReLU) neural networks, also in high dimensions. The result covers, in particular, rather general, multivariate correlation structure in the jump part of the Lévy process, for example parametrized by a so-called Lévy copula; see Kallsen and Tankov [KT06], Farkas et al. [FRS07], [EK19, Chapter 8.1] and the references there. This extends, at least to some extent, the theoretical foundation of the widely used neural network based non-parametric option pricing methodologies to market models with jumps. We prove two types of results on DNN expression rate bounds for European options in exponential Lévy models, with one probabilistic and one "deterministic" proof. The former is based on concepts from statistical learning theory and provides, for relevant payoffs (baskets, call on max, ...), an expression error $\varepsilon$ with DNN sizes of $\mathcal{O}(\varepsilon^{-2})$ and with constants implied in $\mathcal{O}(\cdot)$ which grow polynomially in $d$, thereby overcoming the curse of dimensionality. The latter is based on parabolic smoothing of the Kolmogorov equation and allows us to prove exponential expressivity of prices for positive maturities, i.e., an expression error $\varepsilon$ with DNN sizes of $\mathcal{O}(|\log \varepsilon|^a)$ for some $a > 0$, albeit with constants implied in $\mathcal{O}(\cdot)$ possibly growing exponentially in $d$.
The latter approach requires certain non-degeneracy conditions on the symbol of the underlying Lévy process. The probabilistic proof of DNN approximation rate results, on the other hand, does not require any such assumptions. It relies only on the additive structure of the semigroup associated to the Lévy process and on the existence of moments. Thus, the results proved here are specifically tailored to the class of option pricing functions (or, more generally, expectations of exponential Lévy processes) under European style, plain vanilla payoffs. The structure of this paper is as follows. In Section 2 we review terminology, basic results, and financial modelling with exponential Lévy processes. In particular, we also recapitulate the corresponding fractional, partial integrodifferential Kolmogorov equations which generalize the classical Black-Scholes equations to Lévy models. Section 3 recapitulates notation and basic terminology for deep neural networks to the extent required in the ensuing expression rate analysis. We focus mainly on so-called ReLU DNNs, but note that corresponding definitions and results also hold for more general activation functions. In Section 4 we present a first set of DNN expression rate results, still in the univariate case. This is done, on the one hand, for presentation purposes, as this setting allows for lighter notation, and, on the other hand, to introduce mathematical concepts which will subsequently also be used for contracts on possibly large baskets of Lévy-driven risky assets. We also present an application of the results to neural network based call option pricing. Section 5 then contains the main results of the present paper: expression rate bounds for ReLU DNNs for multivariate, exponential Lévy models. We identify sufficient conditions to obtain expression rates which are free from the curse of dimensionality via mathematical tools from statistical learning theory. We also develop a second argument based on parabolic Gevrey regularity with quantified derivative bounds, which even yields exponential expressivity of ReLU DNNs, albeit with constants that generally depend on the basket size in a possibly exponential way. Finally, we develop an argument based on quantified sparsity in polynomial chaos expansions and corresponding ReLU expression rates from Schwab and Zech [SZ19] to prove high algebraic expression rates for ReLU DNNs, with constants that are independent of the basket size. We also provide a brief discussion of recent, related results. We conclude in Section 6 and indicate several possible generalizations of the present results.
2. Exponential Lévy models and PIDEs

2.1. Lévy processes. Fix a complete probability space $(\Omega, \mathcal{F}, \mathbb{P})$ on which all random elements are defined. We start with the univariate case. We recall that an $\mathbb{R}$-valued continuous-time process $(X_t)_{t \ge 0}$ is called a Lévy process if it is stochastically continuous, has almost surely RCLL sample paths, satisfies $X_0 = 0$ almost surely, and has stationary and independent increments. See, e.g., Bertoin [Ber96], Sato [Sat99] for a discussion and for detailed statements of definitions. It is shown in these references that a Lévy process (LP for short) $X$ is characterized by its so-called Lévy triplet $(\sigma^2, \gamma, \nu)$, where $\sigma \ge 0$, $\gamma \in \mathbb{R}$ and where $\nu$ is a measure on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$ with $\nu(\{0\}) = 0$, the so-called jump measure, or Lévy measure, of the LP $X$, which satisfies $\int_{\mathbb{R}} (x^2 \wedge 1)\, \nu(dx) < \infty$. For more details on both univariate LPs and the multivariate situation we refer to [Sat99]. As in the univariate case, multivariate ($\mathbb{R}^d$-valued) LPs $X^d$ are completely described by their characteristic triplet $(A^d, \gamma^d, \nu^d)$. Conditions (2.1) and (2.2) ensure that the functions defined in (2.3) and (5.1) below represent option prices. However, these conditions are not needed for the proofs of the results below, so we do not impose (2.1) or (2.2) in any of the results proved in this article. We will, however, impose certain moment or regularity conditions. For more details on exponential Lévy models, with particular attention to their use in financial modelling, we refer to Cont and Tankov [CT04]. In an exponential Lévy model the price of the risky asset is $S_t = S_0 e^{X_t}$, and the price at time $t \in [0, T]$ of a European style contract with payoff $\varphi$ and maturity $T$ is $C_t = \mathbb{E}[\varphi(S_T) \mid \mathcal{F}_t]$ (we work with interest rate $r = 0$; see the end of this subsection). By the Markov property, $C_t = C(t, S_t)$ and so, switching to time-to-maturity $\tau = T - t$ and setting $u(\tau, s) = C(T - \tau, s)$, we can rewrite the option price as follows: for $\tau \in [0, T]$, $s \in (0, \infty)$,

(2.3) $u(\tau, s) = \mathbb{E}\left[\varphi\left(s\, e^{X_T - X_t}\right)\right] = \mathbb{E}\left[\varphi\left(s\, e^{X_\tau}\right)\right], \qquad t = T - \tau,$

where the second step uses that $X_T - X_t$ is independent of $X_t$ and has the same distribution as $X_{T-t} = X_\tau$. If the payoff function $\varphi$ is Lipschitz continuous on $\mathbb{R}$ and the Lévy process fulfils either $\sigma > 0$ or a certain non-degeneracy condition on $\nu$, then $u$ is continuous on $[0, T) \times (0, \infty)$, is $C^{1,2}$ on $(0, T) \times (0, \infty)$, and satisfies the linear, parabolic partial integrodifferential equation (PIDE for short) (2.4) with initial condition $u(0, \cdot) = \varphi$; see for instance Proposition 2 in Cont and Voltchkova [CV05b]. If the non-degeneracy condition on $\nu$ is dropped, one can still characterize $u$ (transformed to log-price variables) as the unique viscosity solution of the PIDE above. This is established, e.g., in [CV05b] (see also Proposition 3.3 in [CV05a]). For our purposes the representation (2.3) is more suitable. However, by using this characterization (also called the Feynman-Kac representation for viscosity solutions of PIDEs, see Barles et al. [BBP97]), the results formulated below also provide DNN approximations for PIDEs. Finally, note that the interest rate $r$ may also be directly modelled as a part of $X$ by modifying $\gamma$. To simplify the notation we set $r = 0$ in what follows. We also remark that all expression rate results hold verbatim for assets with a constant dividend payment (see, e.g., [LM08, Eqn. (3.1)] for the functional form of the exponential Lévy model in that case).
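The representation (2.3) lends itself directly to Monte Carlo evaluation. The following minimal Python sketch (ours) illustrates it for a Merton-type jump-diffusion; the triplet parameters, payoff and sample size are illustrative assumptions, not values from the text.

```python
import numpy as np

# Minimal Monte Carlo sketch of (2.3), u(tau, s) = E[phi(s * exp(X_tau))],
# for an illustrative Merton jump-diffusion X: Brownian part plus a
# compound Poisson sum of normal jumps. All parameter values are ad hoc.

rng = np.random.default_rng(0)

def sample_X(tau, n, sigma=0.2, gamma=-0.05, lam=1.0, mu_j=-0.1, sig_j=0.15):
    """Sample n i.i.d. copies of X_tau: drift gamma*tau, diffusion
    sigma*W_tau, and N ~ Poisson(lam*tau) normally distributed jumps."""
    diffusion = gamma * tau + sigma * np.sqrt(tau) * rng.standard_normal(n)
    n_jumps = rng.poisson(lam * tau, size=n)
    jumps = mu_j * n_jumps + sig_j * np.sqrt(n_jumps) * rng.standard_normal(n)
    return diffusion + jumps

def price(tau, s, phi, n=100_000):
    """Monte Carlo estimate of u(tau, s) = E[phi(s * exp(X_tau))]."""
    X = sample_X(tau, n)
    return float(np.mean(phi(s * np.exp(X))))

# European call with strike K = 1.0, spot s = 1.0, maturity tau = 1.0:
print(price(tau=1.0, s=1.0, phi=lambda s: np.maximum(s - 1.0, 0.0)))
```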

3. Deep neural networks (DNNs)
This article is concerned with establishing expression rate bounds of deep neural networks (DNNs) for prices of options (and the associated PIDEs) written on possibly large baskets of risky assets, whose log-returns are modelled by a multivariate Lévy process with general correlation structure of jumps. The term "expression rate" denotes the rate of convergence to $0$ of the error between the option price and its DNN approximation. This rate can be directly translated into the DNN size required to achieve a given approximation accuracy. For instance, in Theorem 5.1 below an expression rate of $q^{-1}$ is established, and one may even choose $q = 2$ in many relevant cases. We now give a brief introduction to DNNs. Here we follow current practice and refer to the collection of parameters $\Phi$ as "the neural network" and denote by $R(\Phi)$ its realization, that is, the function defined by these parameters. More specifically, we use the following terminology (see for example Section 2 in Opschoor et al. [OPS20]): firstly, we fix a function $\varrho : \mathbb{R} \to \mathbb{R}$ (referred to as the activation function), which is applied componentwise to vector-valued inputs.
Secondly, a neural network is a tuple $\Phi = ((A_1, b_1), \ldots, (A_L, b_L))$, where $L \in \mathbb{N}$ and, for layer dimensions $N_0, \ldots, N_L \in \mathbb{N}$, we have $A_i \in \mathbb{R}^{N_i \times N_{i-1}}$ and $b_i \in \mathbb{R}^{N_i}$ for $i = 1, \ldots, L$; the pairs $(A_i, b_i)$ are referred to as the weights of the $i$-th layer of the NN. The associated realization of $\Phi$ is the mapping $R(\Phi) : \mathbb{R}^{N_0} \to \mathbb{R}^{N_L}$, $R(\Phi)(x) = A_L x_{L-1} + b_L$, where $x_{L-1}$ is given recursively as $x_0 = x$ and $x_l = \varrho(A_l x_{l-1} + b_l)$ for $l = 1, \ldots, L-1$. We call $M_j(\Phi) = \|A_j\|_0 + \|b_j\|_0$ the number of (non-zero) weights in the $j$-th layer (here $\|\cdot\|_0$ counts non-zero entries) and $M(\Phi) = \sum_{j=1}^{L} M_j(\Phi)$ the number of weights of the neural network $\Phi$. We also refer to $M(\Phi)$ as the size of the neural network, write $L(\Phi) = L$ for the number of layers of $\Phi$, and refer to $N_o(\Phi) = N_L$ as the output dimension.
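To fix ideas, here is a small Python sketch (ours) of this formalism: `phi` is the weight list, `realize` implements $R(\Phi)$, and `size` computes $M(\Phi)$; the example is the two-layer network expressing a call payoff $(s - K)_+$, which reappears in Remark 4.5 below.

```python
import numpy as np

# Sketch (ours) of the NN formalism above: Phi is a list of (A_i, b_i)
# pairs, R(Phi) the realization with componentwise activation on all but
# the last layer, and M(Phi) the count of non-zero weights.

def relu(x):
    return np.maximum(x, 0.0)

def realize(phi, x, act=relu):
    """Evaluate R(Phi)(x)."""
    for A, b in phi[:-1]:
        x = act(A @ x + b)
    A, b = phi[-1]
    return A @ x + b

def size(phi):
    """M(Phi): total number of non-zero entries in all A_i and b_i."""
    return sum(np.count_nonzero(A) + np.count_nonzero(b) for A, b in phi)

# Two-layer ReLU network phi_0 = ((1, -K), (1, 0)) for the call payoff,
# with K = 1: R(phi_0)(s) = (s - 1)_+.
phi0 = [(np.array([[1.0]]), np.array([-1.0])),
        (np.array([[1.0]]), np.array([0.0]))]
print(realize(phi0, np.array([1.5])), size(phi0))  # -> [0.5] 3
```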
We refer to Section 2 in Opschoor et al. [OPS20] for further details. The following lemma shows that composing $n$ distinct neural networks with affine transformations and taking a weighted average of the results can itself be represented as a neural network. The number of non-zero weights of the resulting neural network can be controlled by the numbers of non-zero weights of the original neural networks. The proof of the lemma is based on a simple extension of the full parallelization operation for neural networks (see [OPS20, Proposition 2.5]) and refines Grohs et al. [GHJvW18, Lemma 3.8].
Lemma 3.2. Let $n, d \in \mathbb{N}$, let $\phi_1, \ldots, \phi_n$ be neural networks with input dimension $d$, output dimension $1$ and a common number of layers $L$, let $w_1, \ldots, w_n \in \mathbb{R}$, and for $i = 1, \ldots, n$ let $x \mapsto D_i x + c_i$ be affine mappings on $\mathbb{R}^d$. Then there exists a neural network $\psi$ such that

(3.1) $R(\psi)(x) = \sum_{i=1}^{n} w_i\, R(\phi_i)(D_i x + c_i), \qquad x \in \mathbb{R}^d.$

If, in addition, $D_1, \ldots, D_n$ are diagonal matrices and $c_1 = \cdots = c_n = 0$, then $M(\psi) \le \sum_{i=1}^{n} M(\phi_i)$.

Proof. Write $\phi_i = ((A_1^i, b_1^i), \ldots, (A_L^i, b_L^i))$ and define $\psi$ as the network whose first layer stacks the pairs $(A_1^i D_i,\, A_1^i c_i + b_1^i)$, whose hidden layers carry the block-diagonal weights $\mathrm{diag}(A_l^1, \ldots, A_l^n)$ with offsets $(b_l^1, \ldots, b_l^n)$, and whose output layer is $((w_1 A_L^1, \ldots, w_n A_L^n),\, \sum_{i=1}^n w_i b_L^i)$. Then, for $l \in \{1, \ldots, L-1\}$ and $x \in \mathbb{R}^d$, it is straightforward to verify that $x_l$ has a block structure (with subscripts indicating the layers and superscripts indicating the blocks) $x_l = (x_l^1, \ldots, x_l^n)$, where $x_l^i$ is the $l$-th layer output of $\phi_i$ evaluated at $D_i x + c_i$. Hence, (3.1) is satisfied, and $M(\psi)$ is bounded by $\sum_{i=1}^n M(\phi_i)$ plus the additional non-zero weights created in the first layer by the affine transformations. If, in addition, $D_1, \ldots, D_n$ are diagonal matrices and $c_1 = \cdots = c_n = 0$, then $A_1^i D_i$ has at most as many non-zero entries as $A_1^i$, so that $M(\psi) \le \sum_{i=1}^n M(\phi_i)$. □
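The block construction in the proof can be made concrete; the following Python sketch (ours) carries it out for the special case of two-layer networks, with our own helper names (`average_networks`, `ws`, `Ds`, `cs`):

```python
import numpy as np

# Sketch (ours) of the construction behind Lemma 3.2 for 2-layer ReLU
# networks phi_i = ((A_i, b_i), (a_i, c_i)): block-stacking the first
# layers and concatenating weighted output rows yields one network psi
# with R(psi)(x) = sum_i w_i * R(phi_i)(D_i x + c_i).

def average_networks(phis, ws, Ds, cs):
    """phis: list of ((A, b), (a, c)) 2-layer nets; returns psi in same format."""
    A_blocks, b_blocks, out_rows, out_const = [], [], [], 0.0
    for ((A, b), (a, c)), w, D, t in zip(phis, ws, Ds, cs):
        A_blocks.append(A @ D)             # absorb the affine map x -> D x + t
        b_blocks.append(A @ t + b)
        out_rows.append(w * a)             # rescale the output layer by w_i
        out_const += w * c
    A1 = np.vstack(A_blocks)               # block-stacked first layer
    b1 = np.concatenate(b_blocks)
    a1 = np.concatenate(out_rows, axis=1)  # single output row
    return (A1, b1), (a1, np.array([out_const]))

# Check on two call networks (s - K)_+ with K = 1 and K = 2:
phi = lambda K: ((np.array([[1.0]]), np.array([-K])), (np.array([[1.0]]), 0.0))
psi1, psi2 = average_networks([phi(1.0), phi(2.0)], [0.5, 0.5],
                              [np.eye(1), np.eye(1)], [np.zeros(1), np.zeros(1)])
relu = lambda z: np.maximum(z, 0.0)
x = np.array([3.0])
print(psi2[0] @ relu(psi1[0] @ x + psi1[1]) + psi2[1])  # -> [1.5]
```

The first layer absorbs the affine maps $x \mapsto D_i x + c_i$, so only the output weights need to be rescaled by $w_i$; this is exactly why diagonal $D_i$ do not increase the weight count.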

4. DNN approximations for univariate Lévy models
We study DNN expression rates for option prices under (geometric) Lévy models for asset prices, initially here in one spatial dimension. We present two expression rate estimates for ReLU DNNs, which are based on distinct mathematical arguments: the first, probabilistic argument builds on ideas used in the recent work Gonon et al. [GGJ+19]. The probabilistic argument results in, essentially, an $\varepsilon$-complexity of DNN expression of order $\varepsilon^{-2}$. The second argument draws on parabolic (analytic) regularity furnished by the corresponding Kolmogorov equations, and results in far stronger, exponential expression rates, i.e., with an $\varepsilon$-complexity of DNN expression scaling, essentially, polylogarithmically with respect to $0 < \varepsilon < 1$. As we shall see in the next section, however, the latter argument is in general subject to the curse of dimensionality.
4.1. DNN expression rates: probabilistic argument. We fix $0 < a < b < \infty$ and measure the approximation error in the uniform norm on $[a, b]$. Recall that $M(\Phi)$ denotes the number of (non-zero) weights of a neural network $\Phi$ and that $R(\Phi)$ is the realization of $\Phi$. Consider the following exponential integrability condition on the Lévy measure $\nu$: for some $p \ge 2$,

(4.1) $\int_{\{|y| > 1\}} e^{p y}\, \nu(dy) < \infty.$

Furthermore, for any Lipschitz continuous function $g$ we denote by $\mathrm{Lip}(g)$ the smallest Lipschitz constant of $g$.
Proposition 4.1. Suppose the moment condition (4.1) holds. Suppose further that the payoff $\varphi$ can be approximated by neural networks, that is, given a payoff function $s \mapsto \varphi(s)$, there exist constants $c > 0$, $q \ge 0$ such that for any $\varepsilon \in (0, 1]$ there exists a neural network $\phi_\varepsilon$ with

(4.2) $|\varphi(s) - R(\phi_\varepsilon)(s)| \le c\,\varepsilon\,(1 + s)$ for all $s \in (0, \infty)$,
(4.3) $M(\phi_\varepsilon) \le c\,\varepsilon^{-q}$,
(4.4) $\mathrm{Lip}(R(\phi_\varepsilon)) \le c$.

Then there exist $\kappa \in [c, \infty)$ (depending on the interval $[a, b]$) and neural networks $\psi_\varepsilon$, $\varepsilon \in (0, 1]$, such that for any target accuracy $\varepsilon \in (0, 1]$ the number of weights is bounded by $M(\psi_\varepsilon) \le \kappa\, \varepsilon^{-2-q}$ and the approximation error between the neural network $\psi_\varepsilon$ and the option price is at most $\varepsilon$, that is,

$\sup_{s \in [a, b]} \left| \mathbb{E}\left[\varphi\left(s\, e^{X_T}\right)\right] - R(\psi_\varepsilon)(s) \right| \le \varepsilon.$
Remark 4.3. In Proposition 4.1 the time horizon T > 0 is finite and fixed. As evident from the proof, the constant κ depends on T .
Proof. Let $\varepsilon \in (0, 1]$ be the given target accuracy and fix $\tilde\varepsilon \in (0, 1]$ (to be specified later). Denote $\phi = \phi_{\tilde\varepsilon}$. First, (4.2) and (4.4) show for any $s \in (0, \infty)$ that

$|\varphi(s)| \le |R(\phi)(s)| + c\,\tilde\varepsilon\,(1 + s) \le |R(\phi)(0)| + c\,s + c\,(1 + s).$

Thus, $\varphi$ is at most linearly growing at $\infty$. Hence we obtain $\mathbb{E}[\varphi(s e^{X_T})] < \infty$, since even the second exponential moment is finite, i.e.,

(4.5) $\mathbb{E}\left[e^{2 X_T}\right] < \infty,$

due to the assumed integrability (4.1) of the Lévy measure and Sato [Sat99, Theorem 25.17]. Now recall that, by (2.3), the option price is $u(T, s) = \mathbb{E}[\varphi(s e^{X_T})]$. Combining this with assumption (4.2) yields, for all $s \in [a, b]$,

(4.6) $\left| \mathbb{E}\left[\varphi\left(s e^{X_T}\right)\right] - \mathbb{E}\left[R(\phi)\left(s e^{X_T}\right)\right] \right| \le c\,\tilde\varepsilon \left(1 + s\, \mathbb{E}\left[e^{X_T}\right]\right) \le c_1\, \tilde\varepsilon,$

with the constant $c_1 = c(1 + b\,\mathbb{E}[e^{X_T}])$ being finite due to (4.5). In the second step, let $X^1, \ldots, X^n$ denote $n$ i.i.d. copies of $X$ and introduce an independent collection of Rademacher random variables $\varepsilon_1, \ldots, \varepsilon_n$. Write

(4.7) $E_n = \sup_{s \in [a, b]} \left| \frac{1}{n} \sum_{k=1}^{n} R(\phi)\left(s e^{X_T^k}\right) - \mathbb{E}\left[R(\phi)\left(s e^{X_T}\right)\right] \right|.$

Elementary properties of conditional expectations in the first step and Theorem 4.12 in Ledoux and Talagrand [LT11] (with $T$ in that result chosen as $T_{x_1,\ldots,x_n} = \{t \in \mathbb{R}^n : t_1 = s e^{x_1}, \ldots, t_n = s e^{x_n} \text{ for some } s \in [a, b]\}$) in the second step show that

(4.8) $\mathbb{E}[E_n] \le \frac{2}{n}\, \mathbb{E}\left[\sup_{s \in [a, b]} \left| \sum_{k=1}^{n} \varepsilon_k\, R(\phi)\left(s e^{X_T^k}\right) \right|\right] \le \frac{4\, \mathrm{Lip}(R(\phi))}{n}\, \mathbb{E}\left[\sup_{s \in [a, b]} \left| \sum_{k=1}^{n} \varepsilon_k\, s\, e^{X_T^k} \right|\right].$

On the other hand, one may apply Jensen's inequality, independence and $\mathbb{E}[\varepsilon_k \varepsilon_l] = \delta_{k,l}$ to estimate

$\mathbb{E}\left[\sup_{s \in [a, b]} \left| \sum_{k=1}^{n} \varepsilon_k\, s\, e^{X_T^k} \right|\right] \le b\, \mathbb{E}\left[\left| \sum_{k=1}^{n} \varepsilon_k\, e^{X_T^k} \right|^2\right]^{1/2} = b \left(n\, \mathbb{E}\left[e^{2 X_T}\right]\right)^{1/2}.$

Combining this with the previous estimates (4.7)-(4.8) and the hypothesis (4.4) on the Lipschitz constant of the neural network, we obtain that

(4.9) $\mathbb{E}[E_n] \le c_2\, n^{-1/2}, \qquad c_2 = 4\,c\,b\, \mathbb{E}\left[e^{2 X_T}\right]^{1/2},$

which is finite again due to the existence of exponential moments (4.5).
In a third step we can now apply Markov's inequality for the first estimate and then insert (4.9) to estimate

(4.10) $\mathbb{P}\left[E_n > 2 c_2 n^{-1/2}\right] \le \frac{\mathbb{E}[E_n]}{2 c_2 n^{-1/2}} \le \frac{1}{2}.$

This proves in particular that $\mathbb{P}[E_n \le 2 c_2 n^{-1/2}] > 0$. Therefore (as $A \in \mathcal{F}$ with $\mathbb{P}[A] > 0$ necessarily satisfies $A \neq \emptyset$) there exists $\omega \in \Omega$ with

(4.11) $\sup_{s \in [a, b]} \left| \frac{1}{n} \sum_{k=1}^{n} R(\phi)\left(s e^{X_T^k(\omega)}\right) - \mathbb{E}\left[R(\phi)\left(s e^{X_T}\right)\right] \right| \le 2 c_2 n^{-1/2}.$

Lemma 3.2 proves that $s \mapsto \frac{1}{n} \sum_{k=1}^{n} R(\phi)(s e^{X_T^k(\omega)})$ is itself the realization of a neural network $\tilde\psi$ with $M(\tilde\psi) \le n M(\phi)$, and hence we have proved the existence of a neural network $\tilde\psi$ with

(4.12) $\sup_{s \in [a, b]} \left| \mathbb{E}\left[\varphi\left(s e^{X_T}\right)\right] - R(\tilde\psi)(s) \right| \le c_1 \tilde\varepsilon + 2 c_2 n^{-1/2}.$

Choosing $\tilde\varepsilon = \varepsilon/(2 c_1)$ and $n = \lceil (4 c_2 / \varepsilon)^2 \rceil$ makes the right-hand side of (4.12) at most $\varepsilon$ and yields $M(\tilde\psi) \le n\, M(\phi_{\tilde\varepsilon}) \le \kappa\, \varepsilon^{-2-q}$ for a suitable constant $\kappa$, which completes the proof. □

Proposition 4.4. Consider the setting of Proposition 4.1, but instead of (4.4) assume that $R(\phi_\varepsilon)$ is $C^1$ and that there is a constant $c > 0$ such that for every $\varepsilon \in (0, 1]$ and $s \in (0, \infty)$ it holds that

(4.13) $\left| \left(R(\phi_\varepsilon)\right)'(s) \right| \le c.$

Then the assertion of Proposition 4.1 remains valid.
Proof. This result is a corollary of Proposition 4.1. For the ease of the reader we provide an alternative proof. First, let us verify that (4.13) and (4.2) yield a linear growth condition for $R(\phi_\varepsilon)$. Indeed, we may use the triangle inequality to estimate, for any $\varepsilon \in (0, 1]$, $s \in (0, \infty)$,

(4.14) $|R(\phi_\varepsilon)(s)| \le |\varphi(s)| + c\,\varepsilon\,(1 + s) \le |\varphi(s)| + c\,(1 + s),$

and $\varphi$ grows at most linearly, as shown in the proof of Proposition 4.1. Now the same proof as for Proposition 4.1 applies; only the second step needs to be adapted. In other words, we prove the estimate (4.9), with a different constant $c_2$, by using a different technique. To do this, again let $X^1, \ldots, X^n$ denote $n$ i.i.d. copies of $X$. Applying Lemma 2.16 in [GGJ+19] (with the random fields $\xi_k(s, \omega) = R(\phi)(s e^{X_T^k(\omega)})$, $k = 1, \ldots, n$, which satisfy the hypotheses of Lemma 2.16 in [GGJ+19] thanks to (4.5) and (4.13)) in the first inequality, and using (4.13) and (4.14) for the second inequality, then proves a bound as in (4.9) with a different constant $c_2$. □

Remark 4.5. The architectures of the neural network approximations constructed using probabilistic arguments in Proposition 4.1, Proposition 4.4 and also Theorem 5.1 ahead differ from the architectures obtained by analytic arguments, see Proposition 4.8 and Theorem 5.4 ahead. While the neural networks in the latter results are deep in any situation, the architecture of the neural networks in the former situation depends heavily on the architecture of the neural network $\phi_\varepsilon$ used to approximate the payoff function $\varphi$. Therefore, in certain simple situations, the approximating neural network $\psi_\varepsilon$ may be a shallow neural network, that is, a neural network with only $L = 2$ layers. E.g., by (4.6) or (2.3), the payoff $\varphi$ is specified in the price variable $s > 0$, and not in the log-return variable $x$. This implies, e.g., for a plain-vanilla European call, that $\varphi(s) = (s - K)_+$ must be emulated by a ReLU NN, which can be done exactly by the simple 2-layer neural network $\phi_0 = ((1, -K), (1, 0))$, that is, $R(\phi_0) = \varphi$.
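The constructive step common to these proofs can be demonstrated directly; the following Python sketch (ours) averages the call-payoff network of Remark 4.5 over sampled terminal values and evaluates the realization of the resulting network $\tilde\psi$. For brevity, the samples are drawn from a Brownian special case ($\nu = 0$); all parameters are illustrative.

```python
import numpy as np

# Sketch (ours) of the construction behind Proposition 4.1: the Monte
# Carlo average s -> (1/n) sum_k R(phi)(s * exp(X_T^k)) is, by Lemma 3.2,
# itself the realization of a network psi with M(psi) <= n * M(phi).

rng = np.random.default_rng(0)
K, n, T, sigma = 1.0, 10_000, 1.0, 0.2
# illustrative Brownian instance (nu = 0), martingale drift:
X = sigma * np.sqrt(T) * rng.standard_normal(n) - 0.5 * sigma**2 * T

def R_phi(s):
    """Realization of phi_0 = ((1, -K), (1, 0)), i.e. the call payoff."""
    return np.maximum(s - K, 0.0)

def R_psi(s):
    """Realization of the averaged network (first layer scaled by e^{X_k})."""
    return float(np.mean(R_phi(s * np.exp(X))))

for s in (0.8, 1.0, 1.2):          # uniform accuracy is assessed on [a, b]
    print(s, R_psi(s))
```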

4.2. DNN expression of European calls.
In this section we illustrate how the results of Proposition 4.1 can be used to bound DNN expression rates for call options in exponential Lévy models. Suppose we observe call option prices for a fixed maturity $T$ and $N$ different strikes $K_1, \ldots, K_N > 0$. Denote these prices by $\hat C(T, K_1), \ldots, \hat C(T, K_N)$. A task frequently encountered in practice is to extrapolate from these prices to prices corresponding to unobserved maturities, or to learn a nonparametric option pricing function. A widely used approach is to solve

(4.15) $\min_{H \in \mathcal{H}}\; \sum_{j=1}^{N} \left( H(K_j) - \hat C(T, K_j) \right)^2.$

Here $\mathcal{H}$ is a suitable collection of (realizations of) neural networks, for example all networks with an a-priori fixed architecture. In fact, many of the papers listed in the recent review Ruf and Wang [RW20] use this approach or a variation of it, in which, for example, an absolute value is inserted instead of a square, or $\hat C(T, K_j)$ enters the objective in a modified form. In this section we assume that the observed call prices are generated from an (assumed unknown) exponential Lévy model and that $\mathcal{H}$ consists of ReLU networks. We then show that the error in (4.15) can be controlled, and we can give bounds on the number of non-zero parameters of the minimizing neural network. The following result is a direct consequence of Proposition 4.1. It shows that $\mathcal{O}(\varepsilon^{-1})$ weights suffice to achieve an error of at most $\varepsilon$ in (4.15).
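A dependency-free Python sketch (ours) of the fit (4.15) with synthetic data: to keep the example short we fix random hidden-layer weights and solve only for the output layer by linear least squares, a simplification of the full minimization over $\mathcal{H}$.

```python
import numpy as np

# Sketch (ours) of the least-squares fit (4.15): given observed call
# prices C_hat(T, K_j) for strikes K_j, fit a one-hidden-layer ReLU
# network in the strike K. All data below are synthetic placeholders.

rng = np.random.default_rng(0)
K = np.linspace(0.8, 1.2, 9)                      # strikes K_1, ..., K_N
C_hat = np.maximum(1.0 - K, 0.0) + 0.05           # synthetic "observed" prices

m = 16                                            # hidden width
W, b = rng.standard_normal(m), rng.standard_normal(m)
features = np.maximum(np.outer(K, W) + b, 0.0)    # ReLU features (N x m)
a, *_ = np.linalg.lstsq(features, C_hat, rcond=None)  # output weights

H = lambda k: np.maximum(np.outer(np.atleast_1d(k), W) + b, 0.0) @ a
print(float(np.mean((H(K) - C_hat) ** 2)))        # fitted value of (4.15)
```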
Remark 4.7. The proof shows that $\kappa$ is independent of $N$. This can also be seen by observing that the result directly generalizes to an infinite number of call options with strikes in a compact interval $\mathcal{K} = [\underline{K}, \overline{K}]$ with $\underline{K} > 0$, $\overline{K} < \infty$. Indeed, let $\mu$ be a probability measure on $(\mathcal{K}, \mathcal{B}(\mathcal{K}))$; then choosing $\psi_\delta$, $\delta = \varepsilon^{1/2}$, as in the proof of Proposition 4.6, and $a = S_0/\overline{K}$, $b = S_0/\underline{K}$, yields $R(\psi_\delta) \in \mathcal{H}_{\kappa,\varepsilon}$ and

$\int_{\mathcal{K}} \left( H(K) - \hat C(T, K) \right)^2 \mu(dK) \le \varepsilon \qquad \text{for } H = R(\psi_\delta).$

4.3. ReLU DNN exponential expressivity. We now develop a second argument for bounding the expressivity of ReLU DNNs for the option price $u(\tau, s)$ solving (2.4), subject to the initial condition $u(0, s) = \varphi(s)$. In particular, in this subsection we choose the activation function $\varrho(x) = \max\{x, 0\}$. As in the preceding, probabilistic argument, we consider the DNN expression error on a bounded interval $[a, b]$ with $0 < a < s < b < \infty$. The second argument is based on parabolic smoothing of the linear, parabolic PIDE (2.4). This, in turn, ensures smoothness of $s \mapsto u(\tau, s)$ at positive times $\tau > 0$, i.e., smoothness in the "spatial" variable $s \in [a, b]$, resp. in the log-return variable $x = \log(s) \in [\log(a), \log(b)]$, even for non-smooth payoff functions $\varphi$ (so, in particular, binary options with discontinuous payoffs $\varphi$ are admissible, albeit at the cost of non-uniformity of the derivative bounds as $\tau \downarrow 0$). It is a classical result that this smoothness implies spectral, possibly exponential, convergence of polynomial approximations; by [OSZ21, Section 3.2], this exponential polynomial convergence rate also implies exponential expressivity of ReLU DNNs. To ensure smoothing properties of the solution operator of the PIDE, we require additional assumptions (see (4.17) below) on the Lévy triplet $(\sigma^2, \gamma, \nu)$. To formulate these, we recall the Lévy symbol $\psi$ of the $\mathbb{R}$-valued LP $X$,

(4.16) $\psi(\xi) = \frac{\sigma^2}{2}\, \xi^2 - i \gamma \xi + \int_{\mathbb{R}} \left( 1 - e^{i \xi z} + i \xi z\, \mathbb{1}_{\{|z| \le 1\}} \right) \nu(dz), \qquad \xi \in \mathbb{R}.$

Proposition 4.8. Suppose that the symbol $\psi$ of the LP $X$ is such that there exist $\rho \in (0, 1]$ and constants $C_i > 0$, $i = 1, 2, 3$, such that for all $\xi \in \mathbb{R}$ it holds that

(4.17) $\operatorname{Re}\psi(\xi) \ge C_1 |\xi|^{2\rho} \qquad \text{and} \qquad |\psi(\xi)| \le C_2 |\xi|^{2\rho} + C_3.$

Then, for every $v_0$ such that $v_0 = \varphi \circ \exp \in L^2(\mathbb{R})$, for every $0 < \tau \le T < \infty$, for every $0 < a < b < \infty$, and for every $0 < \varepsilon < 1/2$, there exist neural networks $\psi^u_\varepsilon$ which express the solution $u(\tau, \cdot)|_{[a,b]}$ to accuracy $\varepsilon$, i.e.,

$\sup_{s \in [a, b]} \left| u(\tau, s) - R(\psi^u_\varepsilon)(s) \right| \le \varepsilon.$

A sufficient condition on the Lévy triplet which ensures (4.17) is as follows. Let $X$ be a Lévy process with characteristic triplet $(\sigma^2, \gamma, \nu)$ whose Lévy density $k(z)$, where $\nu(dz) = k(z)\, dz$, satisfies:
(1) There are constants $\beta_- > 0$, $\beta_+ > 1$ and $C > 0$ such that $k(z) \le C e^{-\beta_- |z|}$ for $z < -1$ and $k(z) \le C e^{-\beta_+ z}$ for $z > 1$.
(2) Furthermore, there exist constants $0 < \alpha < 2$ and $C_+ > 0$ such that $k(z) \le C_+\, |z|^{-1-\alpha}$ for $0 < |z| \le 1$.
(3) If $\sigma = 0$, we assume additionally that there is a $C_- > 0$ such that $k(z) + k(-z) \ge C_-\, |z|^{-1-\alpha}$ for $0 < |z| \le 1$.
Then (4.17) is satisfied (see [HRSW13, Lemma 10.4.2]). Here, $\rho = 1$ if $\sigma > 0$ and otherwise $\rho = \alpha/2$.
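Before turning to the proof, as a concrete illustration (ours, not from the text): the tempered stable (CGMY) Lévy density in its standard parametrization is readily checked against conditions (1)-(3). Assume

\[
k(z) = C\,\frac{e^{-G|z|}}{|z|^{1+Y}}\ \ (z < 0), \qquad k(z) = C\,\frac{e^{-Mz}}{z^{1+Y}}\ \ (z > 0), \qquad C, G > 0,\ M > 1,\ Y \in (0, 2).
\]

For $|z| > 1$ the power factor is at most $1$, so (1) holds with $\beta_- = G > 0$ and $\beta_+ = M > 1$. For $0 < |z| \le 1$ the exponential factor lies between $e^{-\max\{G, M\}}$ and $1$, so (2) and (3) hold with $\alpha = Y$, $C_+ = C$ and $C_- = 2 C e^{-\max\{G, M\}}$. Hence, for $\sigma = 0$, (4.17) holds with $\rho = Y/2$.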
Proof. The proof proceeds in several steps: first, we apply the change of variables $x = \log(s) \in \mathbb{R}$ in order to leverage the stationarity of the LP $X$ to obtain a constant-coefficient Kolmogorov PIDE. The assumptions (4.17) then ensure well-posedness of the PIDE in a suitable variational framework. We then exploit that the stationarity of the LP $X$ facilitates the use of the Fourier transform; the lower bound on $\psi$ in (4.17) will allow us to derive sharp, explicit bounds on high spatial derivatives of (variational) solutions of the PIDE, which imply Gevrey regularity of these solutions on bounded intervals. In log-price variables, $v(\tau, x) = u(\tau, e^x)$ satisfies

(4.18) $\partial_\tau v + \mathcal{A}[v] = 0 \quad \text{in } (0, T] \times \mathbb{R},$

where $\mathcal{A}$ denotes the integrodifferential operator (4.22) below, together with the initial condition

(4.19) $v(0, x) = \varphi(e^x), \qquad x \in \mathbb{R}.$

Then $v(\tau, x) = C(T - \tau, e^x)$, where

(4.20) $C(t, s) = \mathbb{E}\left[\varphi\left(s\, e^{X_{T-t}}\right)\right]$

is the option price from (2.3). Conversely, if $C(t, s)$ in (4.20) is sufficiently regular, then $v(\tau, x) = C(T - \tau, e^x)$ is a solution of (4.18), (4.19) (recall that we assume $r = 0$ for notational simplicity). The Lévy-Khintchine formula describes the $\mathbb{R}$-valued LP $X$ by the log-characteristic function $\psi$ of the RV $X_1$. From the time-homogeneity of the LP $X$,

(4.21) $\mathbb{E}\left[e^{i \xi X_t}\right] = e^{-t \psi(\xi)}, \qquad \xi \in \mathbb{R},\ t \ge 0.$

The Lévy exponent $\psi$ of the LP $X$ admits the explicit representation (4.16).
The Lévy exponent $\psi$ is the symbol of the pseudo-differential operator $-\mathcal{L}$, where $\mathcal{L}$ is the infinitesimal generator of the semigroup of the LP $X$. Thus $\mathcal{A} = -\mathcal{L}$ is the spatial operator in (4.18), given by

(4.22) $(\mathcal{A}f)(x) = -\frac{\sigma^2}{2}\, f''(x) - \gamma\, f'(x) - \int_{\mathbb{R}} \left( f(x + z) - f(x) - z\, f'(x)\, \mathbb{1}_{\{|z| \le 1\}} \right) \nu(dz).$

For $f, g \in C_0^\infty(\mathbb{R})$ we associate with the operator $\mathcal{A}$ the bilinear form

$a(f, g) = \int_{\mathbb{R}} (\mathcal{A}f)(x)\, g(x)\, dx.$

The translation invariance of the operator $\mathcal{A}$ in (4.22) (implied by the stationarity of the LP $X$) and Parseval's equality (see [HRSW13, Remark 10.4.1]) imply that $\psi$ is the symbol of $\mathcal{A}$, i.e.,

$\widehat{\mathcal{A}f}(\xi) = \psi(\xi)\, \hat f(\xi), \qquad \xi \in \mathbb{R},\ f \in C_0^\infty(\mathbb{R}).$

Combining (4.21), the lower bound in (4.17) and the Plancherel identity with the elementary estimate

(4.23) $\sup_{\zeta \ge 0} \zeta^m\, e^{-\kappa \zeta^\mu} = \left( \frac{m}{\kappa \mu e} \right)^{m/\mu}, \qquad m \ge 0,\ \kappa, \mu > 0,$

one obtains, for every $\tau > 0$ and every $k \in \mathbb{N}_0$, derivative bounds of Gevrey type for $v(\tau, \cdot)$ on bounded intervals $[x_-, x_+]$,

(4.24) $\sup_{x \in [x_-, x_+]} \left| \partial_x^k v(\tau, x) \right| \le C \left( (2\rho\tau C_1)^{-1/(2\rho)} \right)^k (k!)^{1/(2\rho)}\, \|v_0\|_{L^2(\mathbb{R})},$

with a constant $C > 0$ independent of $k$.
To construct the DNNs $\psi^u_\varepsilon$ in the claim, we proceed in several steps: we first use the (analytic, on the bounded interval $I = [x_-, x_+] \subset \mathbb{R}$) change of variables $s = \exp(x)$ and the fact that Gevrey regularity is preserved under analytic changes of variables to infer Gevrey-$\delta$ regularity in $[a, b] \subset \mathbb{R}_{>0}$ of $s \mapsto u(\tau, s)$, for every fixed $\tau > 0$. This, in turn, implies the existence of a sequence $\{u_p(s)\}_{p \ge 1}$ of polynomials of degree $p \in \mathbb{N}$ in $[a, b]$ converging in $W^{1,\infty}([a, b])$ to $u(\tau, \cdot)$ for $\tau > 0$ at rate $\exp(-b' p^{1/\delta})$, for some constant $b' > 0$ depending on $a$, $b$ and on $\delta \ge 1$, but independent of $p$. The asserted DNNs are then obtained by approximately expressing the $u_p$ through ReLU DNNs, again at exponential rates, using the results of Opschoor et al. [OSZ21]. The details are as follows.
These bounds imply that for every $0 < \varepsilon < 1/2$, a pointwise error of $\mathcal{O}(\varepsilon)$ in $[a, b]$ can be achieved by some ReLU NN $\psi^u_\varepsilon$ of depth $\mathcal{O}(|\log(\varepsilon)|^\delta\, |\log(|\log(\varepsilon)|)|)$ and of size $\mathcal{O}(|\log(\varepsilon)|^{2\delta})$. This completes the proof. □

4.4. Summary and Discussion. For prices of derivative contracts on one risky asset, whose log-returns are modelled by a LP $X$, we have analyzed the expression rates of deep ReLU NNs. We provided two mathematically distinct approaches to the analysis of the expressive power of deep ReLU NNs. The first, probabilistic approach furnished algebraic expression rates, i.e., pointwise accuracy $\varepsilon > 0$ on a bounded interval $[a, b]$ was achieved with DNNs of size $\mathcal{O}(\varepsilon^{-q})$ for suitable $q \ge 0$. The argument is based on approximating the option price by Monte Carlo sampling, estimating the uniform error on $[a, b]$, and then emulating the resulting average by a DNN. The second, "analytic" approach leveraged regularity of (variational) solutions of the corresponding Kolmogorov partial integrodifferential equations and furnished exponential rates of DNN expression. That is, expression error $\varepsilon > 0$ is achieved with DNNs of size $\mathcal{O}(|\log(\varepsilon)|^a)$ for suitable $a > 0$. Key in the second approach were the stronger conditions (4.17) on the characteristic exponent of the LP $X$, which imply, as we showed, Gevrey-$\delta$ regularity of the map $s \mapsto u(\tau, s)$ for $\tau > 0$. This regularity implies, in turn, exponential rates of polynomial approximation (in the uniform norm on $[a, b]$) of $s \mapsto u(\tau, s)$, which is a result of independent interest, and, subsequently, by emulation of polynomials with deep ReLU NNs, the corresponding exponential expression rates. We remark that in the particular case $\delta = 1$, the derivative bounds (4.24) imply analyticity of the map $s \mapsto u(\tau, s)$ for $s \in [a, b]$, which implies the assertion also with the exponential expression rate bound for analytic functions in Opschoor et al. [OSZ21].
We also remark that the smoothing of the solution operator in Proposition 4.8 accommodates payoff functions which belong merely to $L^2$, as arise, e.g., in binary contracts. This is a consequence of the assumption (4.17), which, on the other hand, excludes Lévy processes with one-sided jumps. Such processes are covered by Proposition 4.1.

5. DNN approximation rates for multivariate Lévy models
We now turn to DNN expression rates for multivariate geometric Lévy models. This is a typical situation when option prices on baskets of $d$ risky assets are of interest, whose log-returns are modelled by multivariate Lévy processes. We admit rather general jump measures with, in particular, fully correlated jumps in the marginals, as provided, for example, by so-called Lévy copula constructions in Kallsen and Tankov [KT06]. As in the univariate case, we prove two results on ReLU DNN expression rates of option prices for European style contracts. The first argument is developed in Section 5.1 below and overcomes, in particular, the curse of dimensionality. Its proof is again based on probabilistic arguments from statistical learning theory. As exponential LPs $X^d$ generalize geometric Brownian motions, Theorem 5.1 generalizes several results from the classical Black-Scholes setting, and we comment on the relation of Theorem 5.1 to these recent results in Section 5.2. Owing to the method of proof, the DNN expression rate in Theorem 5.1 will deliver an $\varepsilon$-complexity of $\mathcal{O}(\varepsilon^{-2})$, achieved with potentially shallow DNNs, see Remark 4.5. The second argument is based on parabolic regularity of the deterministic Kolmogorov PIDE associated to the LP $X^d$. We show in Theorem 5.4 that polylogarithmic-in-$\varepsilon$ expression rate bounds can be achieved by allowing the DNN depth to increase essentially as $\mathcal{O}(|\log \varepsilon|)$. The result in Theorem 5.4 is, however, prone to the curse of dimensionality: constants implied in the $\mathcal{O}(\cdot)$ bounds may (and, generally, will) depend exponentially on $d$. We also show that, under a hypothesis of sufficiently large time $t > 0$, parabolic smoothing allows to overcome the curse of dimensionality, with dimension-independent expression rate bounds which are possibly larger than the rates furnished by the probabilistic argument (which is, however, valid uniformly for all $t > 0$).

5.1. DNN expression rate bounds via probabilistic argument. We start by remarking that in this subsection there is no need to assume ReLU activation. The following result proves that neural networks are capable of approximating option prices in multivariate exponential Lévy models without the curse of dimensionality, given that the corresponding Lévy triplets $(A^d, \gamma^d, \nu^d)$ are bounded uniformly with respect to the dimension $d$. For any dimension $d \in \mathbb{N}$ we assume given a payoff $\varphi_d : \mathbb{R}^d \to \mathbb{R}$ and a $d$-variate LP $X^d$, and we denote the option price in time-to-maturity $\tau \in [0, T]$ by

(5.1) $u_d(\tau, s) = \mathbb{E}\left[\varphi_d\left(s\, e^{X^d_\tau}\right)\right], \qquad s \in (0, \infty)^d.$

We refer to Sato [Sat99] for more details on multivariate Lévy processes and to Cont and Tankov [CT04], Eberlein and Kallsen [EK19] for more details on multivariate geometric Lévy models in finance. The next theorem is a main result of the present paper. It states that DNNs can efficiently express prices on possibly large baskets of risky assets whose dynamics are driven by multivariate Lévy processes with general jump correlation structure. The expression rate bounds are polynomial in the number $d$ of assets and, therefore, not prone to the curse of dimensionality. This result partially generalizes earlier work on DNN expression rates for diffusion models in Elbrächter et al. [EGJS21], Grohs et al. [GHJvW18].
Theorem 5.1. Assume that for any $d \in \mathbb{N}$ the payoff $\varphi_d : \mathbb{R}^d \to \mathbb{R}$ can be approximated well by neural networks, that is, there exist constants $c > 0$, $p \ge 2$, $\tilde q, q \ge 0$ and, for all $\varepsilon \in (0, 1]$, $d \in \mathbb{N}$, there exists a neural network $\phi_{\varepsilon,d}$ with

(5.2) $|\varphi_d(s) - R(\phi_{\varepsilon,d})(s)| \le c\, d^{\tilde q}\, \varepsilon\, (1 + \|s\|)$ for all $s \in (0, \infty)^d$,
(5.3) $M(\phi_{\varepsilon,d}) \le c\, d^{\tilde q}\, \varepsilon^{-q}$,
(5.4) $\mathrm{Lip}(R(\phi_{\varepsilon,d})) \le c\, d^{\tilde q}$.

In addition, assume that the Lévy triplets $(A^d, \gamma^d, \nu^d)$ of $X^d$ are bounded in the dimension, that is, there exists a constant $B > 0$ such that for each $d \in \mathbb{N}$, $i, j = 1, \ldots, d$,

(5.5) $|A^d_{i,j}| \le B, \qquad |\gamma^d_i| \le B, \qquad \int_{\mathbb{R}^d} \left( |y_i|^2 \wedge 1 \right) \nu^d(dy) \le B, \qquad \int_{\{|y_i| > 1\}} e^{p y_i}\, \nu^d(dy) \le B.$

Then there exist constants $\kappa, p, q \in [0, \infty)$ (different from the constants in (5.2)-(5.5)) and neural networks $\psi_{\varepsilon,d}$, $\varepsilon \in (0, 1]$, $d \in \mathbb{N}$, such that for any target accuracy $\varepsilon \in (0, 1]$ and for any $d \in \mathbb{N}$ the number of weights grows only polynomially, $M(\psi_{\varepsilon,d}) \le \kappa\, d^p\, \varepsilon^{-q}$, and the approximation error between the neural network $\psi_{\varepsilon,d}$ and the option price is at most $\varepsilon$, that is,

(5.6) $\sup_{s \in [a, b]^d} \left| u_d(T, s) - R(\psi_{\varepsilon,d})(s) \right| \le \varepsilon.$

Remark 5.2. The statement of Theorem 5.1 remains valid if we admit logarithmic growth of $B$ with $d$ in (5.5).

Proof. Let $\varepsilon \in (0, 1]$ be the given target accuracy and consider $\tilde\varepsilon \in (0, 1]$ (to be selected later). To simplify notation we write, for $s \in [a, b]^d$,

$s\, e^{X^d_T} = \left( s_1 \exp\left(X^d_{T,1}\right), \ldots, s_d \exp\left(X^d_{T,d}\right) \right).$
The proof consists of four steps:
• Step 1 bounds the error that arises when the payoff $\varphi_d$ is replaced by the neural network approximation $\phi_{\tilde\varepsilon,d}$. As a part of Step 1 we also prove that the $p$-th exponential moments of the components $X^d_{T,i}$ of the Lévy process are bounded uniformly in the dimension $d$.
• Step 2 is a technical step that is required for Step 3; it bounds the error that arises when the Lévy process is capped at a threshold $D > 0$. If we assumed in addition that the output of the neural network $\phi_{\tilde\varepsilon,d}$ were bounded (this is for example the case if the activation function $\varrho$ is bounded), then Step 2 could be omitted.
• Step 3 is the key step in the proof. We introduce $n$ i.i.d. copies of (the capped version of) $X^d_T$ and use statistical learning techniques (symmetrization, Gaussian and Rademacher complexities) to estimate the expected maximum difference between the option price (with neural network payoff) and its sample average. This is then used to construct the approximating neural networks.
• Step 4 combines the estimates from Steps 1-3 and concludes the proof.
Step 1: Assumption (5.2) and Hölder's inequality yield, for all $s \in [a, b]^d$,

$\left| \mathbb{E}\left[\varphi_d\left(s e^{X^d_T}\right)\right] - \mathbb{E}\left[R(\phi_{\tilde\varepsilon,d})\left(s e^{X^d_T}\right)\right] \right| \le c\, d^{\tilde q}\, \tilde\varepsilon \left( 1 + \mathbb{E}\left[ \left\| s\, e^{X^d_T} \right\|_1 \right] \right) \le c_1\, d^{\tilde q + 1}\, \tilde\varepsilon,$

where $c_1 = c\,(1 + b \max_{i} \mathbb{E}[e^{X^d_{T,i}}])$ and we used that $\|\cdot\| \le \|\cdot\|_1$ in the last step. To see that $c_1$ is indeed finite, note that (5.5) and [Sat99, Theorem 25.17] (with the vector $w \in \mathbb{R}^d$ in that result being $p e_i$) imply that for any $d \in \mathbb{N}$, $i = 1, \ldots, d$, the exponential moment can be bounded as

(5.7) $\mathbb{E}\left[e^{p X^d_{T,i}}\right] \le \exp\left( 5 T p B + 2 T e^p p B \right),$

where in the second inequality we used that $|e^z - 1 - z| \le z^2 e^p$ for all $z \in [-p, p]$, which can be seen, e.g., from the (mean value form of the) Taylor remainder formula.
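For orientation, here is our sketch of the computation behind (5.7), using the moment formula of [Sat99, Theorem 25.17]; the final bound is then obtained by estimating each term via (5.5):

\[
\mathbb{E}\left[e^{p X^d_{T,i}}\right]
= \exp\!\left( T \left( \tfrac{p^2}{2}\, A^d_{i,i} + p\,\gamma^d_i
   + \int_{\mathbb{R}^d} \left( e^{p y_i} - 1 - p\, y_i\, \mathbb{1}_{\{|y_i| \le 1\}} \right) \nu^d(dy) \right) \right),
\]

where the small-jump part of the integral is controlled by $p^2 e^p \int_{\mathbb{R}^d} (|y_i|^2 \wedge 1)\, \nu^d(dy) \le p^2 e^p B$ via $|e^z - 1 - z| \le z^2 e^p$ on $[-p, p]$, and the large-jump part by $\int_{\{|y_i| > 1\}} (e^{p y_i} + 1)\, \nu^d(dy) \le 2B$.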
Step 2: Before proceeding with the key step of the proof, we need to introduce a cut-off in order to ensure that the neural network output is bounded. Let $D > 0$ and consider the random variable $X^{d,D}_T = \min(X^d_T, D)$, where the minimum is understood componentwise. Then the Lipschitz property (5.4) implies

(5.8) $\mathbb{E}\left[ \left| R(\phi_{\tilde\varepsilon,d})\left(s\, e^{X^d_T}\right) - R(\phi_{\tilde\varepsilon,d})\left(s\, e^{X^{d,D}_T}\right) \right| \right] \le \tilde c_1\, e^{-D}\, d^{\tilde q + 1},$

where $\tilde c_1 = 2 b c \exp(5 T p B + 2 T e^p p B)$ and we used $\|\cdot\| \le \|\cdot\|_1$, Hölder's inequality, Chernoff's bound and, finally, again Hölder's inequality and (5.7).
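The Chernoff step in (5.8) is, in our reading, the exponential Markov bound applied componentwise:

\[
\mathbb{P}\left[ X^d_{T,i} > D \right]
 = \mathbb{P}\left[ e^{p X^d_{T,i}} > e^{p D} \right]
 \le e^{-p D}\, \mathbb{E}\left[ e^{p X^d_{T,i}} \right],
\]

which, combined with the uniform moment bound (5.7) and an application of Hölder's inequality, produces the factor $e^{-D}$ in (5.8).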
5.2. Discussion of related results. As there have recently been several results on DNN expression rates in high-dimensional diffusion models, a discussion of the relation of the multivariate DNN expression rate result, Theorem 5.1, to other recent mathematical results on DNN expression rate bounds is in order. Given that geometric diffusion models are particular cases of the presently considered models (corresponding to $\nu^d = 0$ in the Lévy triplet), it is of interest to consider to which extent the DNN expression error bound of Theorem 5.1 relates to these results. Firstly, we note that, with the exception of Gonon et al. [GGJ+19], these works (see also Reisinger and Zhang [RZ20] and the references therein) study approximation with respect to the $L^p$-norm ($p < \infty$), whereas in Theorem 5.1 we study approximation with respect to the $L^\infty$-norm, which requires entirely different techniques. While the results in [EGJS21] rely on a specific structure of the payoff, the proof of the expression rates in [GGJ+19] has some similarities with the proof of Theorem 5.1. However, a novelty in the proof of Theorem 5.1 is the use of statistical learning techniques (symmetrization, Gaussian and Rademacher complexities), which allow for weaker assumptions on the activation function than in [GGJ+19]. In addition, the class of PDEs considered in [GGJ+19] (heat equation and related) is different from the one considered in Theorem 5.1 (Black-Scholes PDE and Lévy PIDE). Secondly, Theorem 5.1 is the first result on ReLU DNN expression rates for option prices in models with jumps or, equivalently, for partial integrodifferential equations in non-divergence form,

(5.13) $\partial_\tau v_d(\tau, x) = \frac{1}{2} \sum_{i,j=1}^{d} A^d_{i,j}\, \partial_{x_i}\partial_{x_j} v_d(\tau, x) + \sum_{i=1}^{d} \gamma^d_i\, \partial_{x_i} v_d(\tau, x) + \int_{\mathbb{R}^d} \Big( v_d(\tau, x+y) - v_d(\tau, x) - \sum_{i=1}^{d} y_i\, \mathbb{1}_{\{\|y\| \le 1\}}\, \partial_{x_i} v_d(\tau, x) \Big)\, \nu^d(dy)$

for $x \in \mathbb{R}^d$, $\tau > 0$, or, when transformed from log-price variables $x_i$ to actual price variables $s_i$ via $(s_1, \ldots, s_d) = (\exp(x_1), \ldots, \exp(x_d))$ (and with the convention $s e^y = (s_1 e^{y_1}, \ldots, s_d e^{y_d})$),

(5.14) $\partial_\tau u_d(\tau, s) = \frac{1}{2} \sum_{i,j=1}^{d} A^d_{i,j}\, s_i s_j\, \partial_{s_i}\partial_{s_j} u_d(\tau, s) + \sum_{i=1}^{d} \tilde\gamma^d_i\, s_i\, \partial_{s_i} u_d(\tau, s) + \int_{\mathbb{R}^d} \Big( u_d(\tau, s e^y) - u_d(\tau, s) - \sum_{i=1}^{d} s_i\, (e^{y_i} - 1)\, \mathbb{1}_{\{\|y\| \le 1\}}\, \partial_{s_i} u_d(\tau, s) \Big)\, \nu^d(dy),$

with suitably modified drift coefficients $\tilde\gamma^d_i$. Thirdly, we compare Theorem 5.1 with [GHJvW18]. The results in the latter article are specialized to the Black-Scholes case in Section 4 of [GHJvW18], where Setting 4.1 specifies the coefficients: $(A^d)_{i,j} = \Sigma_{i,j}$ (in our notation) with $\Sigma$ symmetric, positive definite and normalized so that $\Sigma_{i,i} = 1$; by the Cauchy-Schwarz inequality we obtain $\Sigma_{i,j} \le \sqrt{\Sigma_{i,i}\, \Sigma_{j,j}} = 1$, and hence these assumptions imply that (5.5) is satisfied. Therefore, the DNN expression rate results from Section 4 of [GHJvW18] can also be deduced from Theorem 5.1, in the case when the probability measure used to quantify the $L^p$-error in [GHJvW18] is compactly supported, as in that case the $L^\infty$-bounds proved here imply the $L^p$-bounds proved in [GHJvW18].

5.3. Exponential ReLU DNN expression rates via PIDE. We now extend the univariate case discussed in Section 4.3 and prove an exponential expression rate bound, similar to Proposition 4.8, for baskets of $d \ge 2$ Lévy-driven assets. In this subsection we assume the ReLU activation function $\varrho(x) = \max\{x, 0\}$. As in Section 5.1, we admit a general correlation structure of the marginal processes' jumps. To prove DNN expression rate bounds, we exploit once more the fact that the stationarity and homogeneity of the $\mathbb{R}^d$-valued LP $X^d$ imply that the Kolmogorov equation (5.13) has constant coefficients.
Under the provision that in (5.13) it holds that $v_d(0, \cdot) \in L^2(\mathbb{R}^d)$, this allows to write, for every $\tau > 0$, the Fourier transform $\mathcal{F}_{x \to \xi}\, v_d(\tau, \cdot) = \hat v_d(\tau, \xi)$ as

(5.15) $\hat v_d(\tau, \xi) = e^{-\tau \psi(\xi)}\, \hat v_d(0, \xi), \qquad \xi \in \mathbb{R}^d.$

Here, for $\xi \in \mathbb{R}^d$, the symbol is $\psi(\xi) = \exp(-i x^\top \xi)\, A(\partial_x) \exp(i x^\top \xi)$, with $A(\partial_x)$ denoting the constant-coefficient spatial integrodifferential operator in (5.13), by Courrège's second theorem (see, e.g., Applebaum [App09, Theorem 3.5.5]), and (4.21) becomes

(5.16) $\mathbb{E}\left[e^{i \xi^\top X^d_t}\right] = e^{-t \psi(\xi)}, \qquad \xi \in \mathbb{R}^d,\ t \ge 0.$

In fact, $\psi$ can be expressed in terms of the characteristic triplet $(A^d, \gamma^d, \nu^d)$ of the LP $X^d$ as

(5.17) $\psi(\xi) = \frac{1}{2}\, \xi^\top A^d \xi - i\, (\gamma^d)^\top \xi + \int_{\mathbb{R}^d} \left( 1 - e^{i \xi^\top y} + i\, \xi^\top y\, \mathbb{1}_{\{\|y\| \le 1\}} \right) \nu^d(dy).$

We impose again the strong ellipticity assumption (4.17), however now with $|\xi|$ understood as $|\xi|^2 = \xi^\top \xi$ for $\xi \in \mathbb{R}^d$. Then, reasoning exactly as in the proof of Proposition 4.8, we obtain, with $C_1 > 0$ as in (4.17), for every $\tau > 0$, for the variational solution $v_d$ of (5.13), the bound

(5.18) $\left\| D^k_x v_d(\tau, \cdot) \right\|^2_{L^2(\mathbb{R}^d)} \le \sup_{\zeta \ge 0} \left( \zeta^{2k}\, e^{-2 \tau C_1 \zeta^{2\rho}} \right) \left\| v_d(0, \cdot) \right\|^2_{L^2(\mathbb{R}^d)} = \left( \frac{k}{2 \tau C_1 \rho\, e} \right)^{k/\rho} \left\| v_d(0, \cdot) \right\|^2_{L^2(\mathbb{R}^d)}.$

Here, $D^k_x$ denotes any weak derivative of total order $k \in \mathbb{N}_0$ with respect to $x \in \mathbb{R}^d$. With the Sobolev embedding theorem we again obtain, for any bounded cube $I^d = [x_-, x_+]^d$ and for every fixed $\tau > 0$, that there exist constants $C(d) > 0$ and $A(\tau, \rho) > 0$ such that

(5.19) $\sup_{x \in I^d} \left| D^k_x v_d(\tau, x) \right| \le C(d)\, A(\tau, \rho)^k\, (k!)^{1/(2\rho)}\, \left\| v_d(0, \cdot) \right\|_{L^2(\mathbb{R}^d)}, \qquad k \in \mathbb{N}_0.$

The constant $C(d)$ is independent of $x_-, x_+$, but depends in general exponentially on the basket size (respectively the dimension) $d \ge 2$, and the constant $A(\tau, \rho) = (2\tau C_1 \rho)^{-1/(2\rho)}$ is the constant from (5.18) and Stirling's bound. If $\rho = 1$ (which corresponds to the case of non-degenerate diffusion) and if $\tau > 0$ is sufficiently large (so that $(2\tau C_1)^{1/(2\rho)} \ge 1$), then this constant is bounded uniformly with respect to the dimension $d$. The derivative bound (5.19) implies that $v_d(\tau, \cdot)|_{I^d}$ is Gevrey-$\delta$-regular with $\delta = 1/\min\{1, 2\rho\}$. In particular, $\delta = 1$ when $\rho \ge 1/2$, for every fixed $\tau > 0$; this is the case we consider first.
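The representation (5.15) also suggests a simple numerical scheme; the following Python sketch (ours) applies the Fourier multiplier $e^{-\tau \psi(\xi)}$ on a periodic grid by FFT, for $d = 1$ and an illustrative symmetric symbol (drift omitted); grid and model parameters are ad hoc assumptions.

```python
import numpy as np

# Sketch (ours) of (5.15): v_hat(tau, xi) = exp(-tau*psi(xi)) * v_hat(0, xi),
# solved on a periodic grid by FFT for d = 1, with an illustrative symmetric
# symbol psi(xi) = 0.5*sigma^2*xi^2 + c*|xi|^alpha (drift omitted).

sigma, c, alpha, tau = 0.2, 0.5, 1.5, 1.0
L_half, n = 20.0, 2**12                               # domain [-L_half, L_half)
x = np.linspace(-L_half, L_half, n, endpoint=False)
xi = 2.0 * np.pi * np.fft.fftfreq(n, d=x[1] - x[0])   # angular frequencies

v0 = np.maximum(np.exp(x) - 1.0, 0.0) * (x < 2.0)     # truncated call payoff
psi = 0.5 * sigma**2 * xi**2 + c * np.abs(xi)**alpha  # Levy symbol (real)

v_tau = np.fft.ifft(np.exp(-tau * psi) * np.fft.fft(v0)).real
print(v_tau[n // 2])   # approximate v(tau, x = 0), i.e. at-the-money
```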
For every $N \in \mathbb{N}$ the set $\tilde I^d = R(\widetilde{\log}_N)(J^d) \cup \log(J^d) \subset (-\infty, \infty)^d$ is compact due to (5.23). For given $\varepsilon \in (0, 1/2]$, we choose $N \in \mathbb{N}$ as before. Using that $N^{\min\{\frac{1}{2\delta}, \frac{1}{\delta d + 1}\}} \le N^{\frac{1}{d+1}}$, this choice guarantees that in (5.23) it holds that $C \exp(-\beta' N^{\frac{1}{d+1}}) \le \varepsilon$. Then we define $\tilde u_d(\tau, \cdot) = R(\tilde v_{d,\varepsilon})(\cdot) \circ R(\widetilde{\log}_N)(\cdot)$ and estimate the error by combining the approximation properties of $\tilde v_{d,\varepsilon}$ and of $\widetilde{\log}_N$. Since the DNN size and DNN depth are additive under composition of ReLU DNNs, the assertion for $\tilde u_{d,\varepsilon}$ follows (possibly adjusting the value of the constant $C$). □

In the present section we further address alternative mathematical arguments showing how DNNs can overcome the CoD in the presently considered jump-diffusion models. Specifically, two mathematical arguments in addition to the probabilistic arguments in Section 5.1 are presented. Both exploit the stationarity of the LP $X^d$, which implies (5.15), (5.16), to obtain DNN expression rates free from the curse of dimensionality.
5.4.1. Barron Space Analysis. The first alternative approach to Theorem 5.1 is based on verifying, using (5.15), (5.16), regularity of option prices in the so-called Barron space introduced in the fundamental work Barron [Bar93]. It provides DNN expression error bounds with explicit values for $p$ and $q$, however, in [Bar93], only for DNNs with sigmoidal activation functions $\varrho$; similar results for ReLU activations are asserted in E and Wojtowytsch [EW21]. For simplicity, we consider here a subset $\mathcal{B}$ of Barron space: an integrable function $f : \mathbb{R}^d \to \mathbb{R}$ belongs to $\mathcal{B}$ if

(5.24) $\|f\|_{\mathcal{B}} = \int_{\mathbb{R}^d} |\xi|\, |\hat f(\xi)|\, d\xi < \infty.$

The explicit appearance of the Fourier transform $\hat f$ renders the norm $\|\cdot\|_{\mathcal{B}}$ in (5.24) particularly suitable for our purposes due to (5.15)-(5.17). As was pointed out in [Bar93, EW21], the relevance of the Barron norm $\|\cdot\|_{\mathcal{B}}$ stems from it being sufficient for dimension-robust DNN approximation rates. For $m \in \mathbb{N}$, consider the two-layer neural networks $f_m$ which are given by

(5.25) $f_m(x) = \sum_{i=1}^{m} a_i\, \varrho\left( w_i^\top x + b_i \right) + c, \qquad a_i, b_i, c \in \mathbb{R},\ w_i \in \mathbb{R}^d.$

Their relevance stems from the following result: assume that $\varrho$ is sigmoidal, i.e., bounded, measurable and $\varrho(z) \to 1$ as $z \to \infty$, $\varrho(z) \to 0$ as $z \to -\infty$.
Then, for $f \in \mathcal{B}$ and for every $R > 0$, $d \in \mathbb{N}$, and for every $m \in \mathbb{N}$, there exist parameters $\{(a_i, w_i, b_i)\}_{i=1}^{m}$ and $c$ such that for the corresponding DNN $f_m$ as in (5.25) it holds that

(5.26) $\left( \int_{[-R, R]^d} \left| f(x) - f_m(x) \right|^2 \pi(dx) \right)^{1/2} \le \frac{2 R\, \|f\|_{\mathcal{B}}}{\sqrt{m}}.$
Here, $\pi$ denotes a probability measure on $[-R, R]^d$.
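In code, the approximant class (5.25) is just a width-$m$ two-layer network; here is a small Python sketch (ours) with placeholder parameters (Barron's theorem asserts the existence of good parameters, it does not construct them):

```python
import numpy as np

# Sketch (ours) of the two-layer networks f_m in (5.25) with sigmoidal
# activation. Parameter values below are random placeholders.

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def f_m(x, a, W, b, c0=0.0):
    """f_m(x) = sum_i a_i * sigmoid(w_i . x + b_i) + c0, for batched x."""
    return sigmoid(x @ W.T + b) @ a + c0

d, m = 4, 32
rng = np.random.default_rng(1)
a, W, b = rng.standard_normal(m), rng.standard_normal((m, d)), rng.standard_normal(m)
x = rng.uniform(-1.0, 1.0, size=(10, d))    # 10 points in [-R, R]^d, R = 1
print(f_m(x, a, W, b).shape)                 # -> (10,)
```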

5.4.2. Parabolic Smoothing and Sparsity of Chaos Expansions.
The second non-probabilistic approach to Theorem 5.1 towards DNN expression error rates not subject to the CoD is based on dimension-explicit derivative bounds for option prices, which in turn allow us to establish summability bounds for generalized polynomial chaos (gpc for short) expansions of these prices. Good summability of gpc coefficient sequences is well known to imply high, dimension-independent rates of approximation by sparse, multivariate polynomials. This, in turn, implies corresponding expression rates by suitable DNNs; see Schwab and Zech [SZ19, Theorem 3.9]. Key in this approach is to exploit the parabolic smoothing of the Kolmogorov PDE. The corresponding dimension-independent expression rates will generally be higher than those based on probabilistic or Barron space analysis, but will hold only for sufficiently large $\tau > 0$. We start by discussing more precisely the dependence of the constants in the proof of Theorem 5.4 on the dimension $d$. For $\alpha \in \mathbb{N}_0^d$ with $|\alpha| = \sum_{i=1}^d \alpha_i = k$, we find, with the Cauchy-Schwarz inequality and the lower bound in (4.17),

$\left| D^\alpha_x v_d(\tau, x) \right| \le (2\pi)^{-d} \int_{\mathbb{R}^d} |\xi|^k\, e^{-\tau \operatorname{Re} \psi(\xi)}\, \left| \hat v_d(0, \xi) \right| d\xi \le (2\pi)^{-d} \left( \int_{\mathbb{R}^d} e^{-\tau C_1 |\xi|^{2\rho}}\, d\xi \right)^{1/2} \left( \int_{\mathbb{R}^d} |\xi|^{2k}\, e^{-\tau C_1 |\xi|^{2\rho}}\, |\hat v_d(0, \xi)|^2\, d\xi \right)^{1/2}.$

The last factor can be bounded by the square root of the right-hand side of (5.18) (by using (4.23)). Using $k^k \le k!\, e^k$ we obtain the bound (5.19) as

(5.27) $\left| D^\alpha_x v_d(\tau, x) \right| \le C(d, \tau)\, (k!)^{1/(2\rho)}\, A(\tau, \rho)^k\, \left\| v_d(0, \cdot) \right\|_{L^2(\mathbb{R}^d)},$

with constant $A(\tau, \rho) = (2\tau C_1 \rho)^{-1/(2\rho)}$ and the explicit constant

(5.28) $C(d, \tau) = (2\pi)^{-d} \left( \int_{\mathbb{R}^d} e^{-\tau C_1 |\xi|^{2\rho}}\, d\xi \right)^{1/2} = (2\pi)^{-d} \left( d\, \omega_d \int_0^\infty r^{d-1}\, e^{-\tau C_1 r^{2\rho}}\, dr \right)^{1/2},$

where $\omega_d$ denotes the volume of the unit ball in $\mathbb{R}^d$. Inspecting the constant $C(d, \tau)$ in (5.28), we observe that, e.g., for $\rho = 1$ and $\tau_0 = \tau_0(C_1) = 1/(8\pi C_1)$, taking $\tau \ge \tau_0 > 0$ sufficiently large implies that the constant $C(d, \tau)$ is bounded independently of $\tau$ and $d$.
Remark 5.8. In certain cases, the parabolic smoothing implied by the ellipticity assumption (4.17) on the generator $\mathcal{A}$ entails that the constant $C$ in the regularity estimates (5.19) grows only polynomially with respect to $d$. For instance, in Remark 5.7 we provided sufficient conditions which ensure that the constant $C$ in the regularity estimates (5.19) is even bounded with respect to $d$. This allows to derive an explicit and dimension-independent bound on the series of Taylor coefficients. This, in turn, allows to obtain bounds on the constant in (5.22) which scale polynomially with respect to $d$. Consider, for example, $\rho = 1$ (i.e., non-degenerate diffusion) and assume that $\tau > 0$ is sufficiently large: specifically, $(2\tau C_1)^{1/(2\rho)} \ge 1$ and $d\, A(\tau, \rho) < 1$, where $A(\tau, \rho) = (2\rho\tau C_1)^{-1/(2\rho)}$ denotes the constant in large parentheses in (4.24). This holds if

(5.29) $\tau > \frac{d^{2\rho}}{2\rho\, C_1}.$
With (5.29), and using $\sum_{\alpha \in \mathbb{N}_0^d,\, |\alpha| = k} \binom{k}{\alpha} = d^k$, we may estimate the series of Taylor coefficients with the multinomial theorem: by (5.27), the coefficients of total order $k$ contribute at most a multiple of $(d\, A(\tau, \rho))^k$, so that under (5.29) the resulting geometric series converges. We now verify the applicability of the ReLU DNN expression rate bounds of [SZ19] in the present context. We impose the following hypothesis, which takes the place of the lower bound in (4.17). We still impose $|\psi(\xi)| \le C_2 |\xi|^{2\rho} + C_3$, i.e., the second condition in (4.17) holds for each $d \in \mathbb{N}$ (but $C_2$, $C_3$ and $\rho$ in that condition are allowed to depend on $d$).
Assumption 1. There exist a constant $C_1 > 0$ and exponents $(\rho_j)_{j \in \mathbb{N}}$ with $\frac{1}{2} < \rho_j \le 1$ such that, for each $d \in \mathbb{N}$, the symbol $\psi_{X^d}$ of the LP $X^d$ satisfies

(5.30) $\operatorname{Re} \psi_{X^d}(\xi) \ge C_1 \sum_{j=1}^{d} |\xi_j|^{2\rho_j} \qquad \text{for all } \xi \in \mathbb{R}^d.$

In comparison to the lower bound in (4.17), the condition (5.30) is restricted to the case $\rho_j > \frac{1}{2}$. On the other hand, different exponents $\rho_j$ are allowed along the components. Furthermore, note that Assumption 1 imposes that $C_1$ does not depend on the dimension $d$.
Remark 5.9. Consider the pure diffusion case, i.e., when the characteristic triplet is $(A^d, 0, 0)$ with a symmetric, positive definite diffusion matrix $A^d$ and Lévy symbol $\psi_{X^d}(\xi) = \frac{1}{2}\, \xi^\top A^d \xi$. A sufficient condition for assumption (5.30) to hold is that the eigenvalues $(\lambda^d_i)_{i=1,\ldots,d}$ of $A^d$ be bounded from below away from zero, uniformly in $d$:

(5.31) $\lambda^d_i \ge 2 C_1 > 0 \qquad \text{for all } d \in \mathbb{N},\ i = 1, \ldots, d.$

To see this, write $Q^\top A^d Q = D$ for a diagonal matrix $D$ containing the eigenvalues of $A^d$ and an orthogonal matrix $Q$. Then we obtain for arbitrary $\xi \in \mathbb{R}^d$

$\xi^\top A^d \xi = (Q^\top \xi)^\top D\, (Q^\top \xi) \ge 2 C_1\, \|Q^\top \xi\|^2 = 2 C_1\, \|\xi\|^2 = 2 C_1 \sum_{j=1}^{d} |\xi_j|^2.$

Therefore, condition (5.30) is satisfied with $C_1$ as in (5.31) and $\rho_j = 1$ for all $j \in \mathbb{N}$. This condition imposes, in applications, that different assets (modelled by different components of the LP $X^d$) should not become asymptotically (perfectly) dependent as the dimension grows.
More generally, consider a characteristic triplet $(A^d, \gamma^d, \nu^d)$ whose diffusion matrix $A^d$ satisfies the eigenvalue bound of Remark 5.9. Since the real part of the jump part of the symbol is non-negative, we obtain $\operatorname{Re} \psi_{X^d}(\xi) \ge \frac{1}{2}\, \xi^\top A^d \xi \ge C_1 \sum_{j=1}^{d} |\xi_j|^2$, with $C_1$ as in (5.31). Hence, Assumption 1 is satisfied also in this more general situation. Further examples of LPs satisfying Assumption 1 are based on stable-like processes and copula-based constructions as, e.g., in Farkas et al. [FRS07].
As we shall see below, Assumption 1 ensures good "separation" and "anisotropy" properties of the symbol (5.17) of the corresponding Lévy process $X^d$. For $\tau > 0$ satisfying (5.29), we analyze the regularity of $x \mapsto v_d(\tau, x)$. From Assumption 1 we find that, for every $\tau > 0$, $x \mapsto v_d(\tau, x) \in L^2(\mathbb{R}^d)$ and that its Fourier transform has the explicit form

(5.32) $\hat v_d(\tau, \xi) = e^{-\tau \psi_{X^d}(\xi)}\, \hat v_d(0, \xi), \qquad \xi \in \mathbb{R}^d.$

For a multi-index $\nu = (\nu_1, \ldots, \nu_d) \in \mathbb{N}_0^d$, denote by $\partial^\nu_x$ the mixed partial derivative of total order $|\nu| = \nu_1 + \cdots + \nu_d$ with respect to $x \in \mathbb{R}^d$. Formula (5.32) and Assumption 1 can be used to show that, for every $\tau > 0$, $x \mapsto v_d(\tau, x)$ is analytic at any $x \in \mathbb{R}^d$. This is of course the well-known smoothing property of the generator of certain non-degenerate Lévy processes. To address the curse of dimensionality, we quantify the smoothing effect in a $d$-explicit fashion. To this end, with Assumption 1 we calculate, for any $\nu \in \mathbb{N}_0^d$, at $x = 0$ (by stationarity, the same bounds hold for the Taylor coefficients at any $x \in \mathbb{R}^d$),

$\left| \partial^\nu_x v_d(\tau, 0) \right| \le (2\pi)^{-d} \int_{\mathbb{R}^d} \prod_{j=1}^{d} |\xi_j|^{\nu_j}\, e^{-\tau C_1 |\xi_j|^{2\rho_j}}\, |\hat v_d(0, \xi)|\, d\xi.$

We use (4.23) with $m \leftarrow \nu_j$, $\kappa \leftarrow C_1 \tau$, $\mu \leftarrow 2\rho_j$ to bound the product as

$\prod_{j=1}^{d} \sup_{\zeta \ge 0} \left( \zeta^{\nu_j}\, e^{-C_1 \tau \zeta^{2\rho_j}} \right) \le \prod_{j=1}^{d} \left( \frac{\nu_j}{2\rho_j C_1 \tau\, e} \right)^{\nu_j/(2\rho_j)}.$

We arrive at the following bound for the Taylor coefficient $t_\nu = \partial^\nu_x v_d(\tau, 0)/\nu!$ of order $\nu \in \mathbb{N}_0^d$ of $v_d(\tau, \cdot)$ at $x = 0$:

(5.33) $|t_\nu| \le \frac{C(d, \tau)}{\nu!}\, \prod_{j=1}^{d} \left( \frac{\nu_j}{2\rho_j C_1 \tau\, e} \right)^{\nu_j/(2\rho_j)}.$

Stirling's inequality $\forall n \in \mathbb{N}: n! \ge n^n e^{-n} \sqrt{2\pi n} \ge n^n e^{-n}$ implies in (5.33) the bound

(5.34) $|t_\nu| \le C(d, \tau) \left( \frac{b^\nu}{\nu!} \right)^{\rho'}.$

Here, $\rho' = 1 - \frac{1}{2\rho} > 0$ (with $\rho = \min_j \rho_j > \frac12$) and the positive weight sequence $b = (b_j)_{j \ge 1}$ is given by $b_j = (2\rho_j \tau C_1)^{-1/(2\rho_j \rho')}$, $j = 1, 2, \ldots$, and multi-index notation is employed: $b^\nu = b_1^{\nu_1} b_2^{\nu_2} \cdots$ and $\nu! = \nu_1!\, \nu_2! \cdots$, with the conventions $0! = 1$ and $0^0 = 1$. We raise (5.34) to a power $q > 0$, with $q < 1/\rho'$, and sum the resulting inequality over all $\nu \in \mathbb{N}_0^d$ to estimate (generously)

(5.35) $\sum_{\nu \in \mathbb{N}_0^d} |t_\nu|^q \le C(d, \tau)^q\, \prod_{j=1}^{d}\, \sum_{n=0}^{\infty} \left( \frac{b_j^n}{n!} \right)^{\rho' q} < \infty.$

To obtain the estimate (5.34), one could also use the $L^2$-bound with explicit constant derived in (5.27), (5.28). Under hypothesis (5.30) and for $\tau > 0$ satisfying (5.29), $q$-summability of the Taylor coefficients follows.
We now refer to [SZ19, Theorem 3.9] (with $q$ in place of $p$ in the statement of that result) and, observing that in the proof of that theorem only the $p$-summability of the Taylor coefficient sequence $\{t_\nu\}$ is used, we conclude that for $\tau > 0$ satisfying (5.35) there exists a constant $C > 0$, independent of $d$, such that for every $n \in \mathbb{N}$ there exists a ReLU DNN $\tilde v^n_d$ with input dimension $d$ and $M(\tilde v^n_d) \le n$ such that

(5.38) $\sup_{x \in [-1, 1]^d} \left| v_d(\tau, x) - R(\tilde v^n_d)(x) \right| \le C\, n^{-\left(\frac{1}{q} - 1\right)}.$

6. Conclusion and Generalizations
We proved that prices of European style derivative contracts on baskets of $d \ge 1$ assets in exponential Lévy models can be expressed by ReLU DNNs to accuracy $\varepsilon > 0$ with DNN size growing polynomially in $\varepsilon^{-1}$ and $d$, thereby overcoming the curse of dimensionality. The technique of proof was based on probabilistic arguments and provides expression rate bounds that scale algebraically in terms of the DNN size. We then also provided an alternative, analytic argument that allows to prove exponential expressivity of ReLU DNNs for the option price, i.e., for the map $s \mapsto u(t, s)$ at any fixed time $0 < t < T$, with DNN size growing polynomially with respect to $|\log(\varepsilon)|$ to achieve accuracy $\varepsilon > 0$. For sufficiently large $t > 0$, based on analytic arguments involving parabolic smoothing and sparsity of generalized polynomial chaos expansions, we established in (5.38) a second, algebraic expression rate bound for ReLU DNNs that is free from the curse of dimensionality. In a forthcoming work, Gonon and Schwab [GS21], we address PIDEs (5.13) with non-constant coefficients. In addition, the main result of the present paper, Theorem 5.1, could be extended in the following directions. First, the expression rates are, almost certainly, not optimal in general; for high-dimensional diffusions, which are a particular case with $A^d = I$ and $\nu^d = 0$, in Elbrächter et al. [EGJS21] we established for particular payoff functions a spectral expression rate in terms of the DNN size, free from the curse of dimensionality. Solving Hamilton-Jacobi partial integrodifferential equations (HJPIDEs for short) by DNNs: it is classical that the Kolmogorov equation for the exponential LP $X^d$ in Section 2.2 is, in fact, a special case of a HJPIDE (e.g., Barles et al. [BBP97], Barles and Imbert [BI08]). In the forthcoming work [GS21] we aim at proving that the expression rate bounds obtained in Section 5 imply corresponding expression rate bounds for ReLU DNNs, free from the curse of dimensionality, for viscosity solutions of general HJPIDEs associated to the LP $X^d$ and its exponential counterparts. Barriers: we considered payoff functions corresponding to European style contracts. Here, the stationarity of the LP $X^d$ and exponential Lévy modelling allowed us to reduce our analysis to Cauchy problems for the Kolmogorov equations of $X^d$ in $\mathbb{R}^d$. In Lévy models in the presence of barriers, option prices generally exhibit singularities at the barriers. More involved versions of the Fourier transform based representations are available (involving a so-called Wiener-Hopf factorization of the Fourier symbol, see, e.g., Boyarchenko and Levendorskiǐ [BL02]). For LPs $X^d$ with bounded exponential moments, the present regularity analysis may be localized to compact subsets well separated from the barriers, subject to an exponentially small localization error term; see Hilber et al. [HRSW13, Chapter 10.5]. Here, the semiheavy tails of the LPs $X^d$ enter crucially in the analysis. We therefore expect the present DNN expression rate bounds to remain valid also for barrier contracts, at least far from the barriers, for the LPs $X^d$ considered here.
Dividends: We assumed throughout that the underlying assets do not pay dividends; however, including a dividend stream (with a rate that is constant over $(0, T]$) on the underlying does not change the mathematical arguments; we refer to Lamberton and Mikou [LM08, Section 3.1] for a complete statement of exponential Lévy models with constant dividend payment rate $\delta > 0$, and for the corresponding pricing of European and American style contracts for such models. American style contracts: Deep learning based algorithms for the numerical solution of optimal stopping problems for Markovian models have recently been proposed in Becker et al. [BCJ19].
For the particular case of American style contracts in exponential Lévy models, [LM08] provide an analysis in the univariate case and establish qualitative properties of the exercise boundary $\{(b(t), t) : 0 < t < T\}$.
Here, for geometric Lévy models, in certain situations ($d = 1$, i.e., a single risky asset, and a monotonic, piecewise analytic payoff function), the option price, as a function of $x \in \mathbb{R}$ at fixed $0 < t < T$, is shown in [LM08] to be a piecewise analytic function which is, globally, Hölder continuous with a possibly algebraic singularity at the exercise boundary $b(t)$. The same holds for the price expressed in the logarithmic coordinate $x = \log(s)$. The ReLU DNN expression rates of such functions have been analyzed in Opschoor et al. [OPS20, Section 5.4]. In higher dimensions $d > 1$, higher Hölder regularity of the price in symmetric, stable Lévy models has recently been obtained for smooth payoffs in Barrios et al. [BFRO18].