Simulation of the drawdown and its duration in Lévy models via stick-breaking Gaussian approximation

We develop a computational method for expected functionals of the drawdown and its duration in exponential Lévy models. It is based on a novel simulation algorithm for the joint law of the state, the supremum and the time the supremum is attained of the Gaussian approximation of a general Lévy process. We bound the bias for various locally Lipschitz and discontinuous payoffs arising in applications and analyse the computational complexities of the corresponding Monte Carlo and multilevel Monte Carlo estimators. Monte Carlo methods for Lévy processes (using Gaussian approximation) have previously been analysed for Lipschitz payoffs, in which case the computational complexity of our algorithm is up to two orders of magnitude smaller when the jump activity is high. At the core of our approach are bounds on certain Wasserstein distances, obtained via the novel stick-breaking Gaussian (SBG) coupling between a Lévy process and its Gaussian approximation. Numerical performance, based on the implementation in Cázares and Mijatović (SBG approximation. GitHub repository. Available online at https://github.com/jorgeignaciogc/SBG.jl (2020)), exhibits good agreement with our theoretical bounds. Numerical evidence suggests that our algorithm remains stable and accurate when estimating Greeks for barrier options and outperforms the “obvious” algorithm for finite-jump-activity Lévy processes.

1. Introduction

1.1. Setting and Motivation. Lévy processes are increasingly popular for the modelling of the market prices of risky assets. They naturally address the shortcomings of diffusion models by allowing the large (often heavy-tailed) sudden movements of the asset price observed in the markets [Sch03, Kou08, CT15]. For risk management, it is therefore crucial to quantify the probabilities of rare and/or extreme events in Lévy models. Of particular interest in this context are the distributions of the drawdown (the current decline from a historical peak) and its duration (the time elapsed since the historical peak), see e.g. [Sor, Vec06, CZH11, BPP17, LLZ17]. Together with the hedges for barrier options [ACU02, Sch06, KL09, GX17] and ruin probabilities in insurance [Mor03, KKM04, LZZ15], the expected drawdown and its duration constitute risk measures dependent on the following random vector, a statistic of the path of a Lévy process X: the historical maximum X̄_T over a time horizon T, the time τ_T(X) at which this maximum is attained and the value X_T of the process at T. Since neither the distribution of the drawdown 1 − exp(X_T − X̄_T) nor that of its duration T − τ_T(X) is analytically tractable for a general X, simulation provides a natural alternative. The main objective of the present paper is to develop and analyse a novel practical simulation algorithm for the joint law of (X_T, X̄_T, τ_T(X)), applicable to a general Lévy process X.
Exact simulation of the drawdown of a Lévy process is currently out of reach except in the stable [GCMUB19] and jump-diffusion cases. However, even in the stable case it is not known how to jointly simulate any two components of (X_T, X̄_T, τ_T(X)). Among the approximate simulation algorithms, the recently developed stick-breaking approximation [GCMUB21] is the fastest in terms of computational complexity, as it samples from the law of (X_T, X̄_T, τ_T(X)) with a geometrically decaying bias. However, like most approximate simulation algorithms for a statistic of the entire trajectory, it is only valid for Lévy processes whose increments can be sampled. This requirement fails for large classes of widely used Lévy processes, including the general CGMY (aka KoBoL) model [CGMY03]. Moreover, nonparametric estimation of Lévy processes typically yields Lévy measures whose corresponding increments cannot be sampled [NR09, CDH10, CGC11, CGY18, QT19], again making a direct application of the algorithm in [GCMUB21] infeasible.
If the increments of X cannot be sampled, a general approach is to use the Gaussian approximation [AR01], which substitutes the small-jump component of the Lévy process with a Brownian motion. The Gaussian approximation is thus a jump diffusion, and an exact sample of the random vector (consisting of the state of the process, the supremum and the time the supremum is attained) can be obtained by applying [Dev10, Alg. MAXLOCATION] between consecutive jumps. However, little is known about how close these quantities are to the vector (X_T, X̄_T, τ_T(X)) being approximated, in either the Wasserstein or the Kolmogorov distance. Indeed, bounds on the distances between the marginal of the Gaussian approximation and X_T have been considered in [Dia13] and recently improved in [MR18, CDM20]. A Wasserstein bound on the supremum is given in [Dia13], but so far no improvement analogous to the marginal case has been established. Moreover, to the best of our knowledge, there are no corresponding results for either the joint law of (X_T, X̄_T) or the time τ_T(X). Furthermore, as explained in Subsection 4.1.2 below, the exact simulation algorithm for the supremum and the time of the supremum of a Gaussian approximation based on [Dev10, Alg. MAXLOCATION] is unsuitable for multilevel Monte Carlo estimation.
The main motivation for the present work is to provide an operational framework for Lévy processes, which allows us to settle the issues raised in the previous paragraph, develop a general simulation algorithm for (X T , X T , τ T (X)) and analyse the computational complexity of its Monte Carlo (MC) and multilevel Monte Carlo (MLMC) estimators.
1.2. Contributions. The main contributions of this paper are twofold. (I) We establish bounds on the Wasserstein and Kolmogorov distances between the vector χ_T = (X_T, X̄_T, τ_T(X)) and its Gaussian approximation χ^(κ)_T = (X^(κ)_T, X̄^(κ)_T, τ_T(X^(κ))), where X^(κ) is a jump diffusion equal to the Lévy process X with all the jumps smaller than κ ∈ (0, 1] substituted by a Brownian motion (see definition (2.5) below), and X̄^(κ)_T (resp. τ_T(X^(κ))) is the supremum of X^(κ) (resp. the time at which X^(κ) attains its supremum) over the time interval [0, T]. (II) We introduce a simple and fast algorithm, SBG-Alg, which samples exactly the vector of interest for the Gaussian approximation of any Lévy process X, develop an MLMC estimator based on SBG-Alg (see [GCM] for an implementation in Julia) and analyse its complexity for the discontinuous and locally Lipschitz payoffs arising in applications. We now briefly discuss each of the two contributions.
(I) In Theorem 3 (see also Corollary 4) we establish bounds on the Wasserstein distance between χ_T and χ^(κ)_T (as κ tends to 0) under weak assumptions, typically satisfied by the models used in applications. The proof of Theorem 3 has two main ingredients. First, in Subsection 6.2 below, we construct a novel SBG coupling between χ_T and χ^(κ)_T, based on the stick-breaking (SB) representation of χ_T in (2.1) and the minimal transport coupling between the increments of X and its approximation X^(κ). The second ingredient consists of new bounds on the Wasserstein and Kolmogorov distances, given in Theorems 1 and 2 respectively, between the laws of X_t and X^(κ)_t for any t > 0.
Theorem 3 is our main tool for controlling the distance between χ_T and χ^(κ)_T. The SBG coupling underlying it cannot be simulated, but it provides a bound on the bias of SBG-Alg. Dominating the bias of the time τ_T(X), which is a non-Lipschitz functional of the path of X, requires (via the SB representation (2.1)) the bound in Theorem 2 on the Kolmogorov distance between the marginals. Applications related to the duration of drawdown and the risk management of barrier options require bounding the bias of certain discontinuous functions of χ_T. In Subsection 3.2 we develop such bounds. Their proofs are based on Theorem 3 and Lemma 14 of Subsection 6.3, which essentially converts the Wasserstein distance into the Kolmogorov distance for sufficiently regular distributions. We give explicit general sufficient conditions on the characteristic triplet of the Lévy process X (see Proposition 9 below), which guarantee the applicability of the results of Subsection 3.2 to models typically used in practice. Moreover, we obtain bounds on the Kolmogorov distance between the components of (X̄_T, τ_T(X)) and (X̄^(κ)_T, τ_T(X^(κ))) (see Corollary 8 below), which we hope are of independent interest.
(II) Our main simulation algorithm, SBG-Alg, samples jointly coupled Gaussian approximations of χ_T at distinct approximation levels. The coupling in SBG-Alg exploits the following simple observations:
• Any Gaussian approximation χ^(κ)_T has an SB representation as in (2.2), where the law of Y in (2.2) must equal that of X^(κ).
• For any two Gaussian approximations, the stick-breaking process in (2.2) can be shared.
• The increments in (2.2) over the shared sticks can be coupled using definition (2.5) of the Gaussian approximation X^(κ).
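The observations above can be sketched in code. The following Python sketch is an illustration only, not the algorithm of Section 4: it couples two Gaussian approximations at cutoffs κ_f < κ_c by sharing one stick-breaking process, a common Brownian driver per stick and the jumps above the coarser cutoff. An α-stable-type Lévy measure ν(dx) = |x|^(−1−α) dx/2 is assumed for concreteness, and drift/compensator terms are deliberately omitted.

```python
import math
import random

ALPHA = 1.2  # assumed stability index; nu(dx) = |x|^(-1-ALPHA)/2 dx

def tail_mass(kappa):
    """nu(kappa): total mass of the jumps with magnitude >= kappa."""
    return kappa ** (-ALPHA) / ALPHA

def sigma_kappa(kappa):
    """Standard deviation sigma_kappa of the compensated jumps below kappa."""
    return math.sqrt(kappa ** (2.0 - ALPHA) / (2.0 - ALPHA))

def poisson(lam, rng):
    """Knuth's Poisson sampler (adequate for the small means used here)."""
    threshold, k, p = math.exp(-lam), 0, 1.0
    while p > threshold:
        p *= rng.random()
        k += 1
    return k - 1

def sample_tail(kappa, rng):
    """Jump drawn from the normalised tail of nu above kappa (Pareto inverse CDF)."""
    magnitude = kappa * rng.random() ** (-1.0 / ALPHA)
    return magnitude if rng.random() < 0.5 else -magnitude

def stick_breaking(t, n, rng):
    """First n lengths of a stick-breaking process on [0, t]."""
    lengths, remaining = [], t
    for _ in range(n):
        nxt = remaining * rng.random()
        lengths.append(remaining - nxt)
        remaining = nxt
    return lengths

def coupled_increments(length, kap_f, kap_c, rng):
    """Increments at cutoffs kap_f < kap_c over one shared stick: the Brownian
    driver z and the jumps above kap_c are common to both levels; the finer
    level adds the jumps in [kap_f, kap_c).  Drift/compensator terms omitted."""
    z = rng.gauss(0.0, 1.0)
    shared = sum(sample_tail(kap_c, rng)
                 for _ in range(poisson(tail_mass(kap_c) * length, rng)))
    mid = 0.0
    for _ in range(poisson((tail_mass(kap_f) - tail_mass(kap_c)) * length, rng)):
        x = sample_tail(kap_f, rng)
        while abs(x) >= kap_c:        # rejection: condition on [kap_f, kap_c)
            x = sample_tail(kap_f, rng)
        mid += x
    fine = sigma_kappa(kap_f) * math.sqrt(length) * z + shared + mid
    coarse = sigma_kappa(kap_c) * math.sqrt(length) * z + shared
    return fine, coarse

rng = random.Random(1)
sticks = stick_breaking(1.0, 8, rng)
pairs = [coupled_increments(s, 0.05, 0.2, rng) for s in sticks]
```

Sharing z and the large jumps is what makes the fine and coarse increments strongly correlated, which is precisely what drives the level variances of the MLMC estimator down.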
We analyse the computational complexity of the MLMC estimator based on SBG-Alg for a variety of payoff functions arising in applications. Figure 1.1 shows the leading power of the resulting MC and MLMC complexities, summarised in Tables 2 and 3 below (see Theorem 22 for full details), for locally Lipschitz and discontinuous payoffs used in practice. To the best of our knowledge, neither locally Lipschitz nor discontinuous payoffs had previously been considered in the context of MLMC estimation under the Gaussian approximation.
A key component of the analysis of the complexity of an MLMC estimator is the rate of decay of the level variances (see Appendix A.2 for details). In the case of SBG-Alg, this rate is given in Theorem 17 below for the locally Lipschitz and discontinuous payoffs of interest. Moreover, the proof of Theorem 17 shows that the decay of the level variances for Lipschitz payoffs under SBG-Alg is asymptotically equal to that of Algorithm 1, which samples jointly the increments at two distinct levels only. Furthermore, an improved coupling in Algorithm 1 for the increments of the Gaussian approximations (cf. the last bullet on the list above) would reduce the computational complexity of the MLMC estimator for all payoffs considered in this paper (including the discontinuous ones). To the best of our knowledge, SBG-Alg is the first exact simulation algorithm for coupled Gaussian approximations of χ_T with vanishing level variances when X has a Gaussian component, see also Subsection 4.1.2 below.
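For readers unfamiliar with the estimator, the telescoping structure of MLMC (recalled in Appendix A) can be sketched in a few lines of Python. The hierarchy below is a toy stand-in, not SBG-Alg, but it illustrates how a good coupling collapses the level variances:

```python
import random

def mlmc(sample_pair, num_levels, samples_per_level, rng):
    """Generic multilevel Monte Carlo estimator: sum over levels of the
    Monte Carlo averages of the coupled level corrections fine - coarse."""
    total = 0.0
    for level in range(num_levels):
        n = samples_per_level[level]
        total += sum(f - c for f, c in
                     (sample_pair(level, rng) for _ in range(n))) / n
    return total

def toy_pair(level, rng):
    """Toy coupled hierarchy: level l returns X^2 + 2^{-l} for a shared
    X ~ N(0,1); because X is shared, the corrections at levels l >= 1 are
    deterministic, i.e. the level variance vanishes."""
    x = rng.gauss(0.0, 1.0)
    fine = x * x + 2.0 ** (-level)
    coarse = 0.0 if level == 0 else x * x + 2.0 ** (-(level - 1))
    return fine, coarse

rng = random.Random(7)
# Telescoping: the estimator targets E[X^2 + 2^{-5}] = 1 + 1/32.
estimate = mlmc(toy_pair, 6, [4000, 2000, 1000, 500, 250, 125], rng)
```

The geometric sample sizes across levels mirror the standard MLMC cost allocation: most samples are spent on the cheap coarse levels, while the tightly coupled corrections need only a few.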
In Section 5, using the code in the repository [GCM], we test our theoretical findings against numerical results. We run SBG-Alg for models in the tempered stable and Watanabe classes. The former is a widely used class of processes whose increments cannot be sampled for all parameter values and the latter is a well-known class of processes with infinite activity but singular continuous increments.
In both cases we find a reasonable agreement between the theoretical prediction and the estimated decays of the bias and level variance, see Figures 5.1 & 5.2 below.
In the context of MC estimation, a direct simulation algorithm based on [Dev10, Alg. MAXLOCATION] (Algorithm 2 below) can be used instead of SBG-Alg. In Subsection 5.2 we compare numerically its cost with that of SBG-Alg.
In the examples we considered, the speedup of SBG-Alg over Algorithm 2 is about 50, see Figure 5.3, and remains significant even for processes with small jump activity, see Figure 5.4.

1.3. Comparison with the literature. Approximations of the pair (X_T, X̄_T) abound. They include the random walk approximation, a Wiener–Hopf based approximation [KKPvS11, FCKSS14], the jump-adapted Gaussian (JAG) approximation [DH11, Der11] and, more recently, the SB approximation [GCMUB21]. The SB approximation converges the fastest, as its bias decays geometrically in its computational cost. However, the JAG approximation is the only method known to us that does not require the ability to simulate the increments of the Lévy process X. Indeed, the JAG approximation simulates all jumps above a cutoff level, together with their jump times, and then samples the transitions of the Brownian motion from the Gaussian approximation on a random grid containing all the jump times. In contrast, in the present paper we approximate χ_T = (X_T, X̄_T, τ_T(X)) with an exact sample from the law of the Gaussian approximation χ^(κ)_T = (X^(κ)_T, X̄^(κ)_T, τ_T(X^(κ))). The JAG approximation has been analysed for Lipschitz payoffs of the pair (X_T, X̄_T) in [DH11, Der11]. The discontinuous and locally Lipschitz payoffs arising in applications considered in this paper (see Figure 1.1) have, to the best of our knowledge, not been analysed for the JAG approximation; nor have the payoffs involving the time τ_T(X) at which the supremum is attained. Within the class of Lipschitz payoffs of (X_T, X̄_T), the computational complexities of the MC and MLMC estimators based on SBG-Alg are asymptotically dominated by those based on the JAG approximation, see Figure 1.2. In fact, SBG-Alg applied to discontinuous payoffs outperforms the JAG approximation applied to Lipschitz payoffs by up to an order of magnitude in computational complexity, cf. Figure 1.2. In order to understand where the differences in Figure 1.2 come from, in Table 1 we summarise the bias and level
variance for SBG-Alg and the JAG approximation as a function of the cutoff level κ ∈ (0, 1] in the Gaussian approximation (cf. (2.5) below).

[Table 1: bias and level variance of SBG-Alg and the JAG approximation as functions of κ, with and without a Gaussian component.]
Table 1 shows that both the bias and the level variance decay at least as fast (and typically faster) for SBG-Alg than for the JAG approximation. The large improvement in the computational complexity of the MC estimator in Figure 1.2 is due to the faster decay of the bias under SBG-Alg. Put differently, the SBG coupling constructed in this paper controls the Wasserstein distance much better than the KMT-based coupling in [Der11]. For BG index β > 1, the improvement in the computational complexity of the MLMC estimator is mostly due to a faster bias decay. For β < 1, Figure 1.2(A) suggests that the computational complexity of the MLMC estimator under both algorithms is optimal. However, in this case, Table 1 and the equality in (A.3) imply that the MLMC estimator based on the JAG approximation has a computational complexity proportional to ε^{−2} log^3(1/ε), while that of SBG-Alg is proportional to ε^{−2}. This improvement is due solely to the faster decay of the level variance under SBG-Alg. The numerical experiments in Subsection 5.1 suggest that our bounds for Lipschitz and locally Lipschitz functions are sharp. To the best of our knowledge, the literature contains no results directly comparable to either Theorem 3 or Proposition 7. Partial results in the direction of Theorem 3 are given in [Dia13, MR18, CDM20]. We now briefly comment on these results. Distance between the marginals X_t and X^(κ)_t: Theorem 1 below, a key step in the proof of Theorem 3, improves the bounds in [MR18, Thm 9] on the Wasserstein distance. Theorem 2 below, a further key ingredient in the proof of Theorem 3, bounds the Kolmogorov distance with better rates than those of [Dia13, Prop.
10 (part 1)] (as κ → 0). The papers [MR18, CDM20] obtain bounds on the total variation distance between X_t and X^(κ)_t, which dominates the Kolmogorov distance. However, Theorem 2 again yields faster decay. For more details on these comparisons see Subsection 3.1 below. Distance between the suprema X̄_t and X̄^(κ)_t: the rate of the bound in [Dia13, Thm 2] on the Wasserstein distance is worse than that implied by the bound in Corollary 4 below on the Wasserstein distance between (X_t, X̄_t) and (X^(κ)_t, X̄^(κ)_t). Proposition 5 below bounds the bias of locally Lipschitz functions, generalising [Dia13, Prop. 9] and providing a faster decay rate. Proposition 6 and Corollary 8(a) below cover a class of discontinuous payoffs, including the up-and-in digital option considered in [Dia13, Prop. 10 (part 3)], and provide a faster rate of decay as κ → 0 if either X has a Gaussian component or the BG index satisfies β > 2/3.

1.4. Organisation. The remainder of the paper is organised as follows. Section 2 recalls the SB representation (2.1)–(2.2) and the Gaussian approximation (2.5), developed in [GCMUB21] and [AR01], respectively. Section 3 presents bounds on the Wasserstein and Kolmogorov distances between χ_T and its Gaussian approximation χ^(κ)_T, as well as on the biases of certain payoffs arising in applications. Section 3 also provides simple sufficient conditions, in terms of the Lévy triplet, under which these bounds hold. Section 4 constructs our main algorithm, SBG-Alg, and presents the computational complexity of the corresponding MC and MLMC estimators for all payoffs considered in this paper. In Section 5 we illustrate numerically our results for a widely used class of Lévy models. The proofs and technical results are found in Section 6. Appendix A.1 gives a brief account of the complexity analysis of MC and MLMC (introduced in [Hei01, Gil08]) estimators.

2. The stick-breaking representation and the Gaussian approximation
Let f : [0, ∞) → R be a right-continuous function with left limits. For any t ∈ (0, ∞), define f̲_t := inf_{s∈[0,t]} f(s), f̄_t := sup_{s∈[0,t]} f(s) and let τ̲_t(f) (resp. τ_t(f)) be the last time before t that the infimum f̲_t (resp. supremum f̄_t) is attained. Throughout, X = (X_t)_{t≥0} denotes a Lévy process, i.e. a stochastic process started at the origin with independent, stationary increments and right-continuous paths with left limits, see [Ber96, Kyp06, Sat13] for background on Lévy processes. In mathematical finance, the risky asset price S = (S_t)_{t≥0} under an exponential Lévy model is given by S_t := S_0 e^{X_t}. The price S_t, its drawdown 1 − S_t/S̄_t (resp. drawup 1 − S̲_t/S_t) and duration t − τ_t(S) (resp. t − τ̲_t(S)) at time t can be recovered from the vector χ_t := (X_t, X̄_t, τ_t(X)) (resp. χ̲_t := (X_t, X̲_t, τ̲_t(X))). Since Z := −X is a Lévy process and χ̲_t = (−Z_t, −Z̄_t, τ_t(Z)), it is sufficient to analyse the vector χ_t.
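On a discretised skeleton of the path these quantities are straightforward to evaluate. The following minimal Python sketch (an illustration only, with an assumed five-point skeleton) computes the triplet (X_t, X̄_t, τ_t(X)) and the resulting drawdown and duration:

```python
import math

def chi(times, path):
    """(X_t, running supremum, last time the supremum is attained)."""
    sup, tau = path[0], times[0]
    for s, x in zip(times, path):
        if x >= sup:          # ">=" selects the *last* time the sup is attained
            sup, tau = x, s
    return path[-1], sup, tau

def drawdown_and_duration(times, log_path):
    """Drawdown 1 - S_t / sup S and its duration t - tau for S = S_0 exp(X)."""
    x_t, x_sup, tau = chi(times, log_path)
    return 1.0 - math.exp(x_t - x_sup), times[-1] - tau

# Assumed toy skeleton of X on [0, 1]:
times = [0.0, 0.25, 0.5, 0.75, 1.0]
log_path = [0.0, 0.3, 0.1, 0.3, -0.2]
dd, dur = drawdown_and_duration(times, log_path)
```

Note that the drawdown 1 − S_t/S̄_t depends on X only through the difference X_t − X̄_t, which is why the vector χ_t carries all the information needed.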
2.1. The stick-breaking (SB) representation. We begin by recalling [GCMUB21, Thm 1], which is at the core of the bounds and algorithms developed in this paper. Given a Lévy process X and a time horizon t > 0, there exists a coupling (X, Y), where Y =_d X (throughout the paper =_d denotes equality in law), and a stick-breaking process ℓ = (ℓ_n)_{n∈N} on [0, t] based on the uniform law U(0, 1) (i.e. L_0 := t, L_n := L_{n−1} U_n, ℓ_n := L_{n−1} − L_n for n ∈ N, where (U_n)_{n∈N} is an iid sequence with U_n ∼ U(0, 1)), such that a.s.

(2.1) χ_t = Σ_{k=1}^∞ (ξ_k, ξ_k⁺, ℓ_k 𝟙{ξ_k > 0}), where ξ_k := Y_{L_{k−1}} − Y_{L_k}.

Since, given L_n, (ℓ_k)_{k>n} is a stick-breaking process on [0, L_n], for any n ∈ N, (2.1) implies

(2.2) χ_t = (Y_{L_n}, Ȳ_{L_n}, τ_{L_n}(Y)) + Σ_{k=1}^n (ξ_k, ξ_k⁺, ℓ_k 𝟙{ξ_k > 0}).

Observe that the vector (Y_{L_n}, Ȳ_{L_n}, τ_{L_n}(Y)) and the sum on the right-hand side of the identity in (2.2) are conditionally independent given L_n: the former (resp. latter) is a function of (Y_s)_{s∈[0,L_n]} (resp. (Y_s − Y_{L_n})_{s∈[L_n,t]}), cf. Figure 2.1. The vector of interest χ_t is thus represented by the corresponding vector (Y_{L_n}, Ȳ_{L_n}, τ_{L_n}(Y)) over an exponentially small interval (since E[L_n] = 2^{−n} t) and n independent increments of the Lévy process over random intervals independent of Y. In (2.2) and throughout, 𝟙_A denotes the indicator of a set A. We stress that (2.1) and (2.2) reduce the analysis of the path functional χ_t to that of the increments of X, since the “error term” (Y_{L_n}, Ȳ_{L_n}, τ_{L_n}(Y)) in (2.2) is typically exponentially small in n. More generally, for another Lévy process X′, the vectors χ_t and (X′_t, X̄′_t, τ_t(X′)) will be close if the increments of Y and Y′ over the intervals [L_k, L_{k−1}] are close: apply (2.2) with a single stick-breaking process ℓ, independent of both Lévy processes Y =_d X and Y′ =_d X′, respectively. This observation constitutes a key step in the construction of the coupling used in the proof of Theorem 3 below, which in turn plays a crucial role in controlling the bias (see the subsequent results of Section 3) of our main simulation algorithm SBG-Alg described in Section 4 below. SBG-Alg is based on (2.2) with X′ being the Gaussian approximation of a general Lévy process X, introduced in [AR01] and recalled briefly in the next subsection.
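The SB representation is easy to put to work numerically whenever the increments of Y can be sampled. The sketch below (an illustration only) truncates the sum in (2.2) after n sticks, discarding the exponentially small remainder, and takes Y to be a standard Brownian motion precisely because its increments are exactly sampleable:

```python
import math
import random

def stick_breaking(t, n, rng):
    """First n lengths of a stick-breaking process on [0, t]:
    L_0 = t, L_k = L_{k-1} * U_k, ell_k = L_{k-1} - L_k."""
    lengths, remaining = [], t
    for _ in range(n):
        nxt = remaining * rng.random()
        lengths.append(remaining - nxt)
        remaining = nxt
    return lengths

def sb_chi(t, n, sample_increment, rng):
    """Truncated SB representation: approximate chi_t = (X_t, sup X, tau)
    by summing (xi_k, max(xi_k, 0), ell_k * 1{xi_k > 0}) over n sticks."""
    x = sup = tau = 0.0
    for ell in stick_breaking(t, n, rng):
        xi = sample_increment(ell, rng)
        x += xi
        sup += max(xi, 0.0)
        if xi > 0.0:
            tau += ell
    return x, sup, tau

def brownian_increment(ell, rng):
    # Brownian increments can be sampled exactly, hence this choice of Y.
    return rng.gauss(0.0, math.sqrt(ell))

rng = random.Random(3)
samples = [sb_chi(1.0, 24, brownian_increment, rng) for _ in range(2000)]
# For standard Brownian motion, E[sup over [0,1]] = sqrt(2/pi) ~ 0.798.
mean_sup = sum(s[1] for s in samples) / len(samples)
```

By construction each sample satisfies X̄_t ≥ max{X_t, 0} and 0 ≤ τ_t ≤ t, and the Monte Carlo average of the supremum lands near the known value √(2/π) for Brownian motion.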
Since J_{2,κ} has on average ν(κ)t jumps on [0, t], the expected complexity of simulating the increment X^(κ)_t is a constant multiple of 1 + ν(κ)t (see Algorithm 1 below). Moreover, the user need only be able to sample from the normalised tails of ν, which can typically be achieved in multiple ways (see e.g. [Ros01]). The behaviour of ν(κ) and σ_κ as κ ↓ 0, key in the analysis of the MC/MLMC complexity, can be described in terms of the Blumenthal–Getoor (BG) index [BG61] β, defined as

(2.6) β := inf{p > 0 : I^p_0 < ∞}, where I^p_0 := ∫_{(−1,1)} |x|^p ν(dx) for any p ≥ 0.
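To illustrate, the quantities ν(κ) and σ_κ can be computed numerically for a concrete measure. The tempered-stable-type density below is an assumption for illustration only (its BG index is β = α), and the quadrature is deliberately naive:

```python
import math

ALPHA = 0.8  # BG index beta = ALPHA for this measure

def levy_density(x):
    """Assumed tempered-stable-type density: nu(dx) = e^{-|x|}|x|^{-1-ALPHA} dx."""
    return math.exp(-abs(x)) * abs(x) ** (-1.0 - ALPHA)

def integrate(f, a, b, n=20000):
    """Naive midpoint rule; adequate away from the singularity at zero."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

def nu_tail(kappa):
    """nu(kappa): mass of the jumps with magnitude >= kappa (both tails)."""
    return 2.0 * integrate(levy_density, kappa, 50.0)

def sigma2_kappa(kappa):
    """sigma_kappa^2: integral of x^2 nu(dx) over 0 < |x| < kappa."""
    return 2.0 * integrate(lambda x: x * x * levy_density(x), 1e-12, kappa)

def expected_cost(kappa, t):
    """Expected cost of one increment of X^(kappa): O(1 + nu(kappa) * t)."""
    return 1.0 + nu_tail(kappa) * t

# nu(kappa) blows up like kappa^{-ALPHA} as kappa -> 0, so halving kappa
# multiplies the expected cost by roughly 2^ALPHA:
ratio = nu_tail(0.01) / nu_tail(0.02)
```

The computed ratio is close to 2^α ≈ 1.74, matching the κ^{−β} blow-up of ν(κ) that governs the cost/bias trade-off in the complexity analysis.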
3. Distance between the laws of χ_t and its Gaussian approximation χ^(κ)_t

In this section we present bounds on the distance between the law of the vector χ_t, defined in Section 2 above, and that of its Gaussian approximation χ^(κ)_t = (X^(κ)_t, X̄^(κ)_t, τ_t(X^(κ))), based on the Lévy process X^(κ) in (2.5). Our bounds on the Wasserstein distance (see Theorem 3 and Corollary 4 in Subsection 3.1) are based on a coupling constructed in Subsection 6.2 below, which in turn draws on the coupling in (2.1). Theorem 3 is then applied to control the bias of certain discontinuous and non-Lipschitz functions of χ_t arising in applications (Subsection 3.2 below) as well as the Kolmogorov distances between the components of (X̄_t, τ_t(X)) and (X̄^(κ)_t, τ_t(X^(κ))) (see Subsection 3.3 below).
3.1. Bounds on the Wasserstein and Kolmogorov distances. In order to study the Wasserstein distance between χ_t and χ^(κ)_t via (2.1)–(2.2), we have to quantify the Wasserstein and Kolmogorov distances between the increments X_s and X^(κ)_s for any time s > 0. With this in mind, we start with Theorems 1 and 2, which play a key role in the proofs of the main results of the subsection, Theorem 3 and Corollary 4 below, and are of independent interest.

Theorem 1. There exist universal constants K_1 := 1/2 and K_p > 0, p ∈ (1, 2], independent of (σ², ν, b), such that for any t > 0 and κ ∈ (0, 1] there exists a coupling (X_t, X^(κ)_t) satisfying the inequality in (3.1).

Theorem 1 bounds the L_p-Wasserstein distance (see (3.10) below for its definition) between X_t and X^(κ)_t, improving on [MR18, Thm 9]: the factor φ_κ^{2/p} ∈ [0, 1] tends to zero (with κ → 0) as a constant multiple of σ_κ^{2/p} if the Brownian component is present (i.e. σ > 0) and is equal to 1 when σ = 0. The bound in (3.1) cannot be improved in general, in the sense that there exists a Lévy process for which, up to constants, the reverse inequality holds (see [MR18, Rem. 3] and [Fou11, Sec. 4]).
The proof of Theorem 1, given in Subsection 6.1 below, decomposes the increment M^(κ)_t of the Lévy martingale M^(κ) := σB + J_{1,κ} into a sum of m iid copies of M^(κ)_{t/m} and applies a Berry–Esseen-type bound for the Wasserstein distance [Rio09] in the context of a central limit theorem (CLT) as m → ∞. The small-time moment asymptotics of M^(κ)_{t/m} in [FL08] imply that M^(κ)_t is much closer to the Gaussian limit in the CLT if the Brownian component is present than if σ = 0. This explains the vastly superior rate in (3.1) in the case σ² > 0.
Bounds on the Kolmogorov distance may require the following generalisation of Orey's condition, which makes the distribution of X_t sufficiently regular (see [Sat13, Prop. 28.3]).
Theorem 2. (a) There exists a constant C_BE ∈ (0, 1/2) such that the inequality in (3.2) holds for any κ ∈ (0, 1] and t > 0. (b) Let Assumption (O-δ) hold. Then for every T > 0 there exists a constant C > 0, depending only on (T, δ, σ, ν), such that the inequality in (3.3) holds for any κ ∈ (0, 1] and t ∈ (0, T].

The proof of Theorem 2 is in Subsection 6.1 below. Part (a) follows the same strategy as the proof of Theorem 1, applying the Berry–Esseen theorem (instead of [Rio09, Thm 4.1]) to bound the Kolmogorov distance. For the same reason as in (3.1), the rate in (3.2) is far better if σ² > 0. The proof of Theorem 2(b) bounds the density of X_t using results in [Pic97] and applies (3.1).
Note that no assumption is made on the Lévy process X in Theorem 2(a). In particular, Assumption (O-δ) is not required in part (a); however, if (O-δ) is not satisfied, implying in particular that σ = 0, it is possible for the bound in (3.2) not to vanish as κ → 0 even if the Lévy process has infinite activity, i.e. ν(R \ {0}) = ∞. In fact, if σ = 0, the bound in (3.2) vanishes (as κ → 0) if and only if σ_κ/κ → ∞, which is also a necessary and sufficient condition for the weak limit σ_κ^{−1} J_{1,κ} →_d W to hold whenever ν has no atoms in a neighbourhood of 0 (see [AR01, Prop. 2.2]).
If X has a Brownian component (i.e. σ ≠ 0), the bound on the total variation distance between the laws of X_t and X^(κ)_t established in [MR18, Prop. 8] implies an upper bound on the Kolmogorov distance. This inequality is both generalised and sharpened (as κ → 0) by the bound in (3.2). Further improvements to the bound on the total variation were made in [CDM20], but the implied rates for the Kolmogorov distance are worse than those in Theorem 2 and require model restrictions when σ = 0 (beyond those of Theorem 2(b)) that can be hard to verify (see [CDM20, Subsec. 2.1.1]).
We stress that the dependence on t in the bounds of Theorem 2 is explicit. This is crucial in the proof of Theorem 3, as we need to apply (3.2)–(3.3) over intervals of small random lengths. A related result, [Dia13, Prop. 10], contains similar bounds, which are non-explicit in t and suboptimal in κ.
If Assumption (O-δ) is satisfied, the parameter δ in part (b) of Theorem 2 should be taken as large as possible to obtain the sharpest inequality in (3.3). If σ ≠ 0 (equivalently δ = 2), the bound in part (a) has a faster decay in κ than the bound in part (b). If σ = 0 (equivalently 0 < δ < 2), it is possible for the bound in part (a) to be sharper than the one in part (b) or vice versa. Indeed, it is easy to construct a Lévy measure ν for which some δ ∈ (0, 2) satisfies the assumption of Theorem 2(b). Then the bound in (3.2) is a multiple of t^{−1/2} κ^{δ/2} as t, κ → 0, while the one in (3.3) behaves as t^{−2/(3δ)} κ^{2/3} min{1, t^{1/3} κ^{−δ/3}}. Hence one bound may be sharper than the other depending on the value of δ, as t and/or κ tend to zero. In fact, we will use the bound in part (b) only when the maximal δ satisfying the assumption of Theorem 2(b) is smaller than 4/3, bounding the activity of the Lévy measure around 0 away from the maximal possible activity.
Denote x⁺ := max{x, 0} for x ∈ R. The next result quantifies the Wasserstein distance between the laws of the vectors χ_t and χ^(κ)_t.
Theorem 3. For any κ ∈ (0, 1] and t > 0, there exists a coupling between X and X^(κ) on the interval [0, t] such that the inequality in (3.4) holds for p ∈ {1, 2}, with µ_p(κ, t) defined in (3.5), φ_κ = σ_κ/√(σ_κ² + σ²) and the universal constant K_2 from Theorem 1. Furthermore, the bound in (3.6) on E|τ_t(X) − τ_t(X^(κ))| holds. Moreover, if Assumption (O-δ) holds, then for every T > 0 there exists a constant C > 0, dependent only on (T, δ, σ, ν), such that for all t ∈ [0, T] and κ ∈ (0, 1] the bounds in (3.7)–(3.8) hold, where ψ_κ := Cκφ_κ.

The proof of Theorem 3, given in Subsection 6.2 below, constructs the SBG coupling (X, X^(κ)), satisfying the above inequalities, in terms of the distribution functions of the marginals X_s and X^(κ)_s (for s > 0) and the coupling used in (2.1), see [GCMUB21] for the latter. The key idea is to couple χ_t and χ^(κ)_t so that they share the stick-breaking process in their respective SB representations (2.1), while the increments of the associated Lévy processes over each interval [L_n, L_{n−1}] are coupled so as to minimise the appropriate Wasserstein distances. This coupling produces a bound on the distance between χ_t and χ^(κ)_t that depends only on the distances between the marginals of X_s and X^(κ)_s, s > 0, so that Theorems 1 and 2 above can be applied. We stress that the bound in (3.4) cannot be obtained from Doob's L²-maximal inequality (see, e.g. [Kal02, Prop. 7.16]) and Theorem 1: if the processes X and X^(κ) are coupled in such a way that X_t − X^(κ)_t satisfies the inequality in (3.1), the difference process (X_s − X^(κ)_s)_{s∈[0,t]} need not be a martingale. Inequality (3.4) holds without assumptions on X and is at most a logarithmic factor worse than the marginal inequality (3.1) for p ∈ {1, 2}, with the upper bound satisfying µ_p(κ, t) ≤ 2κ log(1/κ) for all sufficiently small κ. Moreover, by Jensen's inequality, the SBG coupling satisfies an analogous inequality for all 1 < p < 2. In the absence of a Brownian component (i.e. σ = 0) we have φ_κ = 1, making the upper bound µ_2(κ, t) proportional to µ_1(κ, t) as κ → 0. If σ > 0, then µ_1(κ, t) ≤ 2κσ_κ² log(1/(κσ_κ))/σ² for all small κ and, typically, µ_2(κ, t) is proportional to κσ_κ log(1/(κσ_κ)) as κ → 0, which dominates µ_1(κ, t).
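The minimal transport coupling of two one-dimensional laws, used here over each stick, is simple to realise: pass a common uniform through both quantile functions. A sketch with Gaussian marginals (chosen only because stdlib quantile functions are available; for equal variances the coupled difference is constant, so the empirical mean recovers the L¹-Wasserstein distance exactly):

```python
import random
from statistics import NormalDist

def quantile_coupling(dist_f, dist_g, n, rng):
    """Minimal transport (comonotone) coupling of two one-dimensional laws:
    a common uniform U is passed through both quantile functions, so that
    E|F^{-1}(U) - G^{-1}(U)| attains the L1-Wasserstein distance."""
    pairs = []
    for _ in range(n):
        u = rng.uniform(1e-12, 1.0 - 1e-12)   # avoid the infinite endpoints
        pairs.append((dist_f.inv_cdf(u), dist_g.inv_cdf(u)))
    return pairs

rng = random.Random(5)
f, g = NormalDist(0.0, 1.0), NormalDist(0.3, 1.0)
coupled = quantile_coupling(f, g, 5000, rng)
# Equal variances: the coupled difference is the constant 0.3, so the
# empirical mean below recovers the L1-Wasserstein distance.
w1_est = sum(abs(x - y) for x, y in coupled) / len(coupled)
```

An independent coupling of the same two laws would give a strictly larger mean absolute difference, which is precisely why the comonotone choice is used when one wants the tightest possible Wasserstein control.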
The bound in (3.6) holds without assumptions on the Lévy process X, while (3.7) requires Assumption (O-δ) and is sharper the larger the value of δ ∈ (0, 2] satisfying (O-δ) is. Note that, if σ ≠ 0, (O-δ) holds with δ = 2. If σ = 0 and δ satisfies (O-δ), we must have β ≥ δ, where β is the Blumenthal–Getoor (BG) index defined in (2.6) above. In fact, models typically used in applications either have σ ≠ 0 or (O-δ) holds with δ = β (however, it is possible for (O-δ) to hold only for some δ < β). The quantity defined in (3.9) is the smallest of the upper bounds in (3.6) and (3.7) and, under Assumption (O-δ), satisfies the displayed inequality for some constant c_t > 0 and all κ ∈ (0, 1]. For any a ∈ R^d, let |a| := Σ_{i=1}^d |a_i| denote its ℓ¹-norm. Recall that for p ≥ 1, the L_p-Wasserstein distance [Vil09, Def. 6.1] between the laws of random vectors ξ and ζ in R^d can be defined as in (3.10). Theorem 3 implies a bound on the L_p-Wasserstein distance between the vectors χ_t and χ^(κ)_t, extending the bound on the distance between the laws of the marginals X_t and X^(κ)_t. Given the bounds in Corollary 4 and Theorem 2, it is natural to inquire about the convergence in the Kolmogorov distance of the components of (X̄^(κ)_t, τ_t(X^(κ))) to (X̄_t, τ_t(X)) as κ → 0. This question is addressed by Corollary 8 of Subsection 3.3 below.
The famous Komlós–Major–Tusnády (KMT) coupling is used in [Der11, Thm 6.1] to bound the L²-Wasserstein distance between the paths of X and its Gaussian approximation. By contrast, under the SBG coupling, µ_2(κ, t) is bounded by a multiple of κσ_κ log(1/(κσ_κ)) for small κ and is thus smaller than a multiple of κ^{2−q/2} for any q ∈ (β, 2) (where β is the BG index defined in (2.6) above). As mentioned above, µ_2(κ, t) is bounded by a multiple of κ log(1/κ) for small κ in the case σ = 0. Unlike the SBG coupling, which underpins Theorem 3, the KMT coupling does not imply a bound on the distance between the times τ_t(X) and τ_t(X^(κ)) at which the extrema are attained, as these are not Lipschitz functionals of the trajectories with respect to the supremum norm.
Remark 1. The bounds on E|τ_t(X) − τ_t(X^(κ))| in Theorem 3 and Corollary 4, based on the SB representation in (2.1), require control of the expected difference between the signs of the components of (X_s, X^(κ)_s) as either s or κ tends to zero. This is achieved via the minimal transport coupling (see (6.1) and Lemma 12 below) and the general bound in Theorem 2 on the Kolmogorov distance. However, further improvements seem possible in the finite-variation case if the natural drift (i.e. the drift of X when the small jumps are not compensated) is nonzero. Intuitively, the sign of the natural drift determines the sign of both components of (X_s, X^(κ)_s) with overwhelming likelihood as s → 0. This direction is left for future research.
3.2. Bounds on the bias of functions of χ_t. The main tool for studying the bias of various Lipschitz, non-Lipschitz and discontinuous functions of χ_t is the SBG coupling underpinning Theorem 3. The Lipschitz case is a direct consequence: for any d ∈ N, let Lip_K(R^d) denote the space of real-valued Lipschitz functions on R^d (under the ℓ¹-norm given above display (3.10)) with Lipschitz constant K ≥ 0, and note that the triangle inequality and Theorem 3 imply the bounds in (3.11) on the bias. Since in applications the process X is often used to model the log-returns of a risky asset (S_0 e^{X_t})_{t≥0}, it is natural to study the bias of a Monte Carlo estimator of a locally Lipschitz function of two of the components of χ_T (the class locLip_K(R²) in Proposition 5 below). Such payoffs arise in risk management (e.g. absolute drawdown) and in the pricing of hindsight call, perpetual American call and lookback put options.
Proposition 5. Let f ∈ locLip K (R 2 ) and assume ∫ [1,∞) e 2x ν(dx) < ∞, where ν is the Lévy measure of X. For any T > 0 and κ ∈ (0, 1], with µ 2 (κ, T ) defined in (3.5), the SBG coupling satisfies the stated inequality. The assumption ∫ [1,∞) e 2x ν(dx) < ∞ is a natural requirement, as the asset price model (S 0 e X t ) t≥0 ought to have finite variance. Moreover, via the Lévy–Khintchine formula, an explicit bound on the expectation E[e 2X T ] (and hence on the constant in the inequality of Proposition 5) in terms of the Lévy triplet of X can be obtained. If one instead considers f (X T , X T ) (a function of the supremum X T ), the proof of Proposition 5 in Subsection 6.3 below can be used to establish the analogous result: the required exponential moments are finite under our assumption ∫ [1,∞) e 2x ν(dx) < ∞ and bounded explicitly in terms of the Lévy triplet of X, see the proof of [GCMUB21, Prop. 2]. Thus, by Proposition 5, the bias for f ∈ locLip K (R 2 ) is at most a multiple of κ log(1/κ), as is the case for f ∈ Lip K (R 2 ) by (3.11), cf. the discussion after Theorem 3.
In financial markets, the class of barrier-type functions arises naturally: for K, M ≥ 0 and y < 0, let BT 1 (y, K, M ) be defined as follows. Note that the indicator function ½ [y,∞) lies in BT 1 (y, 0, 1). Moreover, a down-and-out put option payoff x → max{e k − e x , 0}½ [y,∞) (x), for constants y < 0 < k, is in BT 1 (y, e k , e k − e y ). Bounding the bias of the estimators for functions in BT 1 (y, K, M ) requires the following regularity of the distribution of X T at y.
The proof of Proposition 6 is in Subsection 6.3 below. Assumption (H) with γ = 1 requires the distribution function of X T to be locally Lipschitz at y. By the Lebesgue differentiation theorem [Coh13, Thm 6.3.3], any distribution function is differentiable Lebesgue-a.e., implying that Assumption (H) holds for γ = 1 and a.e. y < 0. However, there exist Lévy processes satisfying Assumption (H) for countably many levels y with γ < 1, but not with γ = 1, see [GCMUB21, App. B]. Proposition 9 below provides simple sufficient conditions, in terms of the Lévy triplet of X, for Assumption (H) to hold with γ = 1 for all y < 0. In particular, this is the case if σ ≠ 0.
The next class arises in the analysis of the duration of the drawdown: for K, M ≥ 0 and s ∈ (0, T ), let BT 2 (s, K, M ) be defined as follows. The biases of these functions clearly include the error |P(τ T (X) > s) − P(τ T (X (κ) ) > s)|. Analogously to Proposition 6, we require the following regularity of the distribution function of τ T (X).
Proposition 9. Let ν + (x) := ν([x, ∞)) and ν − (x) := ν((−∞, −x]) for x > 0 and let β be the BG index of X defined in (2.6) above. Suppose that either (I) σ > 0 or (II) the Lévy measure ν satisfies the following conditions: ν + (x) is regularly varying with index −β as x → 0 and the limit conditions hold for the quotient ν + (x)/ν − (x). Then there exist constants γ > 0 and C such that Assumption (Hτ ) holds with γ, C for all s ∈ [0, T ].
Either (I) or (II) with β > 1 implies that (H) holds with γ = 1 and some constant C I for all y in a compact set I ⊂ (−∞, 0). Note that Proposition 9 also holds if the roles of ν + and ν − are interchanged, i.e. ν − (x) is regularly varying and the limit conditions are satisfied by the quotient ν − (x)/ν + (x). The assumptions of Proposition 9 are satisfied by most models used in practice, including tempered stable processes and most subordinated Brownian motions. Excluded are Lévy processes without a Brownian component and with barely any jump activity (i.e. BG index β = 0, which includes compound Poisson and variance gamma processes), for which the Gaussian approximation X (κ) is not useful.
Proposition 9 is a consequence of a more general result, Proposition 16 below, stating that Assumptions (Hτ ) and (H) hold uniformly and locally uniformly, respectively, if, over short time horizons, X is "attracted to" an α-stable process with non-monotone paths; see Subsection 6.3 below for details. In this case ρ := lim t→0 P(X t > 0) exists in (0, 1), and γ in the conclusion of Proposition 9, satisfying Assumption (Hτ ) on [0, T ], can be chosen arbitrarily in the interval (0, min{ρ, 1 − ρ}). In contrast to Assumption (Hτ ), a simple sufficient condition for the uniform version of Assumption (H), required in Corollary 8(a), remains elusive beyond special cases such as stable or tempered stable processes, for which γ may be taken in the interval (0, α(1 − ρ)), where α is the stability parameter and ρ is as above.

4. Simulation and the computational complexity of MC and MLMC
This section describes an MC/MLMC method for the simulation of χ T (κ) = (X T (κ) , X T (κ) , τ T (X (κ) )) (SBG-Alg in Subsection 4.1) and analyses the computational complexities for various locally Lipschitz and discontinuous functions of χ T (κ) . The expected cost of SBG-Alg is far superior to that of the "obvious" algorithm for jump diffusions (see Algorithm 2 below), particularly when the jump intensity is large (cf. Subsections 4.1.2 and 4.1.3). Moreover, SBG-Alg is designed with MLMC in mind, which turns out not to be feasible in general for the "obvious" algorithm (see Subsection 4.1.2).

4.1. Simulation of χ T (κ) . The main aim of this subsection is to develop a simulation algorithm for the pair of vectors (χ T (κ) , χ T (κ ′ ) ) such that the distance between χ T (κ) and χ T (κ ′ ) tends to zero as κ, κ ′ → 0. SBG-Alg below, based on the SB representation in (2.2), achieves this aim: it applies Algorithm 1 for the increments over the stick-breaking lengths that arise in (2.2) and Algorithm 2 for the "error term" over the time horizon [0, L n ]. By Theorem 17 below, the L 2 -distance for the coupling given in SBG-Alg decays to zero, ensuring the feasibility of MLMC (see Theorem 22 for the computational complexity of MLMC).

4.1.1. Increments in the SB representation. A simulation algorithm for a coupling of the Gaussian approximations (at levels 1 ≥ κ 1 > κ 2 > 0) of X t at an arbitrary time t > 0 is based on the following observation: the compound Poisson processes J 2,κ 1 and J 2,κ 2 in the Lévy–Itô decomposition in (2.4) can be simulated jointly, as the jumps of J 2,κ 1 are precisely those of J 2,κ 2 with modulus at least κ 1 . By choosing the same Brownian motion W in the representation (2.5) of X (κ 1 ) t and X (κ 2 ) t , we obtain the coupling sampled by Algorithm 1.

Algorithm 1. Simulation of the law Π κ 1 ,κ 2 t . Require: Cutoff levels 1 ≥ κ 1 > κ 2 > 0 and time horizon t > 0.
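The coupled sampler that Algorithm 1 specifies can be sketched in a few lines. This is a minimal illustration, not the implementation in [GCM]: it uses a toy two-sided exponential Lévy measure ν(dx) = e −|x| dx (so the tail mass and small-jump variance are explicit), all helper names are our choices, and the drift correction b κ t is omitted for brevity.

```python
import math, random

def poisson(mean, rng):
    """Knuth's Poisson sampler; adequate for the moderate means used here."""
    L, k, p = math.exp(-mean), 0, 1.0
    while p > L:
        k += 1
        p *= rng.random()
    return k - 1

# Toy two-sided exponential Levy measure nu(dx) = exp(-|x|) dx (our choice).
def nu_bar(kappa):            # mass of jumps with modulus >= kappa
    return 2.0 * math.exp(-kappa)

def sample_jump(kappa, rng):  # a jump conditioned to have modulus >= kappa
    sign = 1.0 if rng.random() < 0.5 else -1.0
    return sign * (kappa - math.log(rng.random()))   # kappa + Exp(1)

def sigma_small(kappa):       # st. dev. of the jumps with modulus < kappa
    v = 2.0 * (2.0 - math.exp(-kappa) * (kappa ** 2 + 2 * kappa + 2))
    return math.sqrt(v)

def coupled_increment(t, kappa1, kappa2, sigma, rng):
    """One sample of (X_t^(k1), X_t^(k2)) under the coupling of Algorithm 1:
    a shared Brownian increment, and the level-k1 jumps are exactly the
    level-k2 jumps of modulus at least k1."""
    n = poisson(nu_bar(kappa2) * t, rng)               # N_t ~ Poi(nu_bar(k2) t)
    jumps = [sample_jump(kappa2, rng) for _ in range(n)]
    w = rng.gauss(0.0, math.sqrt(t))                   # shared Brownian motion W
    out = []
    for k in (kappa1, kappa2):
        vol = math.sqrt(sigma ** 2 + sigma_small(k) ** 2)  # small jumps -> extra variance
        out.append(vol * w + sum(j for j in jumps if abs(j) >= k))
    return tuple(out)
```

Reusing the same w and the same jump list for both levels is exactly what makes the two marginals close; with κ 1 = κ 2 they coincide.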
The law of X t itself is, in general, not accessible from the perspective of simulation. Since the law Poi(ν(κ 2 )t) of the variable N t in line 2 of Algorithm 1 is Poisson with mean ν(κ 2 )t, the expected number of steps of Algorithm 1 is bounded by a constant multiple of 1 + ν(κ 2 )t, which is in turn bounded by a negative power of κ 2 by (2.7). Since the computational complexity of sampling the law of X (κ 2 ) t is of the same order as that of the law Π κ 1 ,κ 2 t , in the complexity analysis of SBG-Alg below we may apply Algorithm 1 with Π 1,κ t to sample X (κ) t for any κ ∈ (0, 1].

4.1.2. The "obvious" algorithm. Algorithm 2 below samples the pair (χ (κ 1 ) t , χ (κ 2 ) t ) for levels 0 < κ 2 < κ 1 ≤ 1 and any (typically very small) t > 0. In particular, it requires the sampler [Dev10, Alg. MAXLOCATION] for the law Φ t (v, µ) of (B * t , B * t , τ t (B * )), where (B * s ) s≥0 = (vB s + µs) s≥0 is a Brownian motion with drift µ ∈ R and volatility v > 0.

Algorithm 2. Simulation of the law Π κ 1 ,κ 2 t . Require: Cutoff levels 1 ≥ κ 1 > κ 2 > 0 and time horizon t > 0.
(Listing abridged: the algorithm initialises the output vectors to zero and iterates over k ∈ {1, . . ., N t + 1}.) Algorithm 2 samples the jump times and sizes of the compound Poisson process J 2,κ 2 on the interval (0, t) and prunes the jumps to get J 2,κ 1 . Then it samples the increment, infimum and the time the infimum is attained for the Brownian motion with drift on each interval between the jumps of J 2,κ 2 and assembles the pair of vectors. Since [Dev10, Alg. MAXLOCATION] samples the law Φ t (v, µ) with uniformly bounded expected runtime over the choice of parameters µ, v and t, the computational cost of sampling the pair of vectors (χ (κ 1 ) t , χ (κ 2 ) t ) using Algorithm 2 is proportional to the cost of sampling X (κ) t via Algorithm 1. In principle, Algorithm 2 is an exact algorithm for the simulation of a coupling (χ (κ 1 ) t , χ (κ 2 ) t ).
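The assembly step of Algorithm 2 can be sketched as follows for a single cutoff level. The exact sampler [Dev10, Alg. MAXLOCATION] for Φ t (v, µ) is replaced here by a crude random-walk discretisation (our stand-in, so the output is only approximate); the helper names and the grid size are our choices.

```python
import math, random

def poisson(mean, rng):
    """Knuth's Poisson sampler; adequate for moderate means."""
    L, k, p = math.exp(-mean), 0, 1.0
    while p > L:
        k += 1
        p *= rng.random()
    return k - 1

def bm_piece(mu, v, dt, rng, steps=200):
    """Approximate (increment, infimum, time of infimum) of a Brownian motion
    with drift mu and volatility v on [0, dt]; a discretised stand-in for the
    exact sampler of Phi_t(v, mu) in [Dev10, Alg. MAXLOCATION]."""
    h = dt / steps
    x, xmin, tmin = 0.0, 0.0, 0.0
    for i in range(1, steps + 1):
        x += mu * h + v * math.sqrt(h) * rng.gauss(0.0, 1.0)
        if x < xmin:
            xmin, tmin = x, i * h
    return x, xmin, tmin

def chi_jump_diffusion(t, mu, v, lam, sample_jump, rng):
    """(X_t, inf_{s<=t} X_s, time of the infimum) for a jump diffusion,
    assembled piecewise between the jumps of a Poisson process of rate lam,
    in the spirit of Algorithm 2."""
    n = poisson(lam * t, rng)
    jump_times = sorted(rng.random() * t for _ in range(n))
    x, xmin, tmin, s = 0.0, 0.0, 0.0, 0.0
    for u in jump_times + [t]:
        inc, pmin, ptmin = bm_piece(mu, v, u - s, rng)
        if x + pmin < xmin:          # running infimum attained inside the piece
            xmin, tmin = x + pmin, s + ptmin
        x += inc
        if u < t:
            x += sample_jump(rng)    # jump of the compound Poisson part at time u
            if x < xmin:             # infimum attained at the jump time
                xmin, tmin = x, u
        s = u
    return x, xmin, tmin
```

The sequential update of the running infimum is the "for-loop" that, as noted below, makes this scheme hard to parallelise.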
However, it cannot be applied within an MLMC simulation scheme for a function of χ (κ) T at a fixed time horizon T (the next paragraph explains why). SBG-Alg below circumvents this issue via the SB representation in (2.2), which also makes SBG-Alg parallelisable and thus much faster in practice, even in the context of MC simulation (see the discussion after Corollary 20 below).
To the best of our knowledge, there is no simulation algorithm for the joint law of the increment, the infima and the times the infima are attained of a Brownian motion under two different drifts. Thus, in line 7 of Algorithm 2, we are forced to take independent samples from the corresponding laws Φ t (v, µ). In particular, the coupling of the marginals X (κ 1 ) t and X (κ 2 ) t given in line 16 of Algorithm 2 amounts to taking two independent Brownian motions in the respective representations in (2.5) of X (κ 1 ) t and X (κ 2 ) t . Thus, unlike for the coupling defined in Algorithm 1, here, by Proposition 18(b) below, the squared L 2 -distance between the marginals is bounded below by 2tσ 2 for all levels 1 ≥ κ 1 > κ 2 > 0, where σ 2 is the Gaussian component of X. Hence, for a fixed time horizon, the coupling Π κ 1 ,κ 2 t of χ (κ 1 ) t and χ (κ 2 ) t is not sufficiently strong for an MLMC scheme to be feasible if X has a Gaussian component, because the level variances do not decay to zero. However, by Proposition 18(b), the L 2 -distance between ζ (κ 1 ) and ζ (κ 2 ) constructed in Algorithm 2 does tend to zero as t → 0. Thus, SBG-Alg below, which applies Algorithm 2 only over the time interval [0, L n ] (recall EL n = T /2 n from the SB representation (2.2)), circumvents this issue.

4.1.3. The SBG sampler. For a time horizon T , we can now define the coupling Π κ 1 ,κ 2 n,T of the vectors χ (κ 1 ) T and χ (κ 2 ) T via the following algorithm.
Algorithm (SBG-Alg). Simulation of the coupling (χ (κ 1 ) T , χ (κ 2 ) T ). (Listing abridged.) The output of SBG-Alg is indeed a coupling of the vectors χ (κ 1 ) T and χ (κ 2 ) T for any n ∈ N ∪ {0}. Note that if n equals zero, the set {1, . . ., n} in lines 1 and 2 of the algorithm is empty and the laws Π κ 1 ,κ 2 0,T and Π κ 1 ,κ 2 T coincide, implying that SBG-Alg may be viewed as a generalisation of Algorithm 2. The main advantage of SBG-Alg over Algorithm 2 is that it samples the n increments of the Gaussian approximation over the interval [L n , T ] using the fast Algorithm 1, with the "error term" contribution ξ i being geometrically small. The computational complexity of SBG-Alg and of Algorithms 1 & 2 is simple to analyse. Assume throughout that all mathematical operations (addition, multiplication, exponentiation, etc.), as well as the evaluation of ν(κ) and σ 2 κ for all κ ∈ (0, 1], have constant computational cost. Moreover, assume that the simulation of any of the following random variables has constant expected cost: standard normal N (0, 1), uniform U(0, 1), Poisson (independently of its mean) and any jump with distribution ν| R\(−κ,κ) /ν(κ) (independently of the cutoff level κ ∈ (0, 1]). Recall that [Dev10, Alg. MAXLOCATION] samples the law Φ t (v, µ) with uniformly bounded expected cost for all values of the parameters µ ∈ R, v > 0 and t > 0. Corollary 10 below then follows directly from the algorithms.
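Schematically, SBG-Alg combines a uniform stick-breaking of [0, T ] with the two samplers above. The sketch below (a single-level simplification with the samplers passed in as arguments; all names are our choices) assembles the triplet according to the SB representation (2.1)–(2.2), using the infimum convention of Section 6.

```python
import random

def sbg_sample(T, n, rng, sample_increment, sample_error):
    """Schematic single-level SBG sampler: break [0, T] into n sticks plus a
    residual interval [0, L_n].  sample_increment(ell, rng) plays the role of
    Algorithm 1 over a stick; sample_error(L_n, rng), standing in for
    Algorithm 2, returns (increment, infimum, time of infimum) over [0, L_n]."""
    sticks, L = [], T
    for _ in range(n):
        u = rng.random()
        sticks.append(u * L)       # stick length ell_k = U_k * L_{k-1}
        L *= 1.0 - u               # remaining length; E[L_n] = T / 2**n
    xi = [sample_increment(ell, rng) for ell in sticks]
    y, y_inf, y_tau = sample_error(L, rng)
    x = sum(xi) + y
    x_inf = sum(min(v, 0.0) for v in xi) + y_inf          # SB representation
    tau = y_tau + sum(ell for ell, v in zip(sticks, xi) if v < 0.0)
    return x, x_inf, tau
```

The n stick increments are conditionally independent given the stick lengths, which is what makes this step parallelisable.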
Corollary 10. Under the assumptions above, there exist positive constants C 1 (resp. C 2 ; C 3 ), independent of κ 1 , κ 2 ∈ (0, 1], n ∈ N and the time horizon t > 0, such that the expected computational complexity of Algorithm 1 (resp. Algorithm 2; SBG-Alg) is bounded by C 1 (1 + ν(κ 2 )t) (resp. C 2 (1 + ν(κ 2 )t); C 3 (n + 1 + ν(κ 2 )t)). Up to a multiplicative constant, Algorithms 1 and 2 have the same expected computational complexity. However, Algorithm 2 requires not only the additional simulation of the jump times of X (κ 2 ) and a sample from Φ t (v, µ) via [Dev10, Alg. MAXLOCATION] between any two consecutive jumps, but also a sequential computation of the output (the "for-loop" in lines 5-15) due to the condition in line 9 of Algorithm 2. This makes it hard to parallelise Algorithm 2. SBG-Alg avoids this issue by using the fast Algorithm 1 over the stick lengths in the SB representation (2.2) and calling Algorithm 2 only over the short time interval [0, L n ], during which very few (if any) jumps of X (κ 2 ) occur. Moreover, SBG-Alg consists of several conditionally independent evaluations of Algorithm 1 and is thus parallelisable, leading to additional numerical benefits (see Subsection 5.2 below). 4.2. Computational complexity of the MC/MLMC estimator based on SBG-Alg. This subsection gives an overview of the bounds on the computational complexity of the MC and MLMC estimators defined respectively in (6.28) and (6.29) of Subsection 6.5 below. Corollary 20 (for MC) and Theorem 22 (for MLMC) in Subsection 6.5 give the full analysis.
Assume throughout this subsection that (O-δ) holds with some δ ∈ (0, 2]. As discussed in Subsection 3.1 above, we take δ as large as possible. In particular, if σ ≠ 0 then δ = 2. Let q ∈ (0, 2] be as in (2.7), so that q ≥ δ if σ = 0. We take q as small as possible. For processes used in practice with σ = 0, we may typically take δ = q = β, where β is the BG index defined in (2.6). Assumption (Hτ ), required for the analysis of the class BT 2 in (3.14) of discontinuous functions of τ T (X), holds with γ = 1 since (O-δ) is satisfied (see the discussion following Proposition 7 above). When analysing the class of discontinuous functions BT 1 in (3.12), we assume throughout that (H) holds with some γ > 0.

Monte Carlo. An MC estimator is L 2 -accurate at level ǫ > 0 if its bias is smaller than ǫ/ √ 2 and the number N of independent samples is proportional to ǫ −2 , see Appendix A.1. The following table contains a summary of the values of κ, as a function of ǫ, such that the bias of the estimator in (6.28) is at most ǫ/ √ 2, and the associated Monte Carlo cost C MC (ǫ) (up to a constant) for the various classes of functions of χ T analysed in Subsection 3.2 (see also Corollary 20 below for full details).
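In code, L 2 -accuracy at level ǫ splits the mean squared error evenly between squared bias and estimator variance; the helper below (our naming) returns the resulting sample size.

```python
import math

def mc_sample_size(eps, var_estimate):
    """Smallest N with var_estimate / N <= eps**2 / 2.  Together with a bias
    of at most eps / sqrt(2), the MC estimator then has mean squared error
    bias**2 + var/N <= eps**2, i.e. L^2-error at most eps."""
    return math.ceil(2.0 * var_estimate / eps ** 2)
```

The N ∝ ǫ −2 scaling is visible directly: halving ǫ quadruples the sample size.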
Table 2. Asymptotic behaviour of the level κ and the complexity C MC (ǫ) as ǫ → 0 for the MC estimator in (6.28). (Its rows treat Lipschitz payoffs, the class BT 1 defined in (3.12), Lipschitz functions of τ T (X) and the class BT 2 defined in (3.14); for instance, for BT 1 with σ = 0 the level is κ = max{ǫ 3/(4−q) |log ǫ|, ǫ 2/(3−q) |log ǫ| 1/2 }.)

The number of sticks n ∈ N ∪ {0} in SBG-Alg does not affect the law of χ (κ) T . It only impacts the MC estimator in (6.28) through numerical stability and the reduction of simulation cost. It is hard to determine the optimal choice for n. Clearly, the choice n = 0 (i.e. Algorithm 2) is not a good one, as discussed in Subsection 4.1.3 above. A balance needs to be struck between (i) having a vanishingly small number of jumps in the time interval [0, L n ], so that Algorithm 2 behaves in a numerically
stable way, and (ii) not having too many sticks, so that line 2 of SBG-Alg does not execute redundant computations of many geometrically small increments of X (κ) which are not detected in the final output. A good rule of thumb is n = n 0 + ⌈log 2 (1 + ν(κ)T )⌉, where ⌈x⌉ := inf{j ∈ Z : j ≥ x}, x ∈ R, and the initial value n 0 is chosen so that some sticks are present even if, for large κ, the total expected number of jumps ν(κ)T is small (e.g. n 0 = 5 works well in Subsection 5.2 for jump diffusions with low activity, see Figures 5.3 and 5.4), ensuring that the expected number of jumps in [0, L n ] vanishes as ǫ (and hence κ) tends to zero.
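The rule of thumb translates directly into code (the function names and the toy tail mass in the test are our choices):

```python
import math

def number_of_sticks(kappa, T, nu_bar, n0=5):
    """Rule of thumb n = n0 + ceil(log2(1 + nu_bar(kappa) * T)).  Since
    E[L_n] = T / 2**n, the expected number of jumps in the residual interval
    [0, L_n] is nu_bar(kappa) * T / 2**n, which this choice keeps below
    2**(-n0)."""
    return n0 + math.ceil(math.log2(1.0 + nu_bar(kappa) * T))
```

As κ (and hence ǫ) decreases, ν(κ)T grows and n grows logarithmically with it, so the expected number of jumps in [0, L n ] stays uniformly small.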

Multilevel Monte Carlo.
The MLMC estimator in (6.29) is based on the coupling in SBG-Alg for consecutive levels of a geometrically decaying sequence (κ j ) j∈N and an increasing sequence of numbers of sticks (n j ) j∈N . Table 3 summarises the resulting MLMC complexity up to logarithmic factors, with full results available in Theorem 22 below. There are two key ingredients in the proof of Theorem 22: (I) the bounds in Theorem 17 on the L 2 -distance (i.e. the level variance, see Appendix A.2) between the functions of the marginals of the coupling Π κ j ,κ j+1 n j ,T constructed by SBG-Alg; (II) the bounds on the bias of various functions in Section 3 above. The number of levels m in the MLMC estimator in (6.29) is chosen to ensure that its bias, equal to the bias of χ (κ m ) T at the top cutoff level κ m , is bounded by ǫ/ √ 2. Thus, the value of m can be expressed in terms of ǫ using Table 2 and the explicit formula for the cutoff κ j given in the caption of Table 3. The formula for κ j at level j in the MLMC estimator in (6.29) is established in the proof of Theorem 22 by minimising the multiplicative constant in the computational complexity C ML (ǫ) over all possible rates of geometric decay of the sequence (κ j ) j∈N .
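The estimator in (6.29) has the standard MLMC telescoping-sum structure; a generic driver (our simplification, with the coupled level samplers passed in as an argument) looks like:

```python
import random

def mlmc_estimate(m, n_samples, sample_d, rng):
    """Telescoping MLMC estimator: the sum over levels j = 1..m of the sample
    mean of the coupled difference D_j (D_1 being the coarsest level itself).
    In our setting sample_d(j, rng) would draw D_j via the SBG coupling at
    consecutive cutoffs kappa_j > kappa_{j+1}."""
    est = 0.0
    for j in range(1, m + 1):
        N = n_samples(j)                       # samples allocated to level j
        est += sum(sample_d(j, rng) for _ in range(N)) / N
    return est
```

The overall bias equals the bias of the finest level m, while the cost is governed by how fast the level variances of D j decay, which is precisely what Theorem 17 controls.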
We stress that the analysis of the level variances for the various payoff functions of the coupling Π κ j ,κ j+1 n j ,T in Theorem 17 is carried out directly for locally Lipschitz payoffs, see Proposition 18. However, in the case of the discontinuous payoffs in BT 1 (see (3.12)) and BT 2 (see (3.14)), the analysis requires a certain regularity, uniform in the cutoff levels, of the coupling. This leads to the construction of a further coupling whose components can be compared to the limiting object χ T , which can be shown to possess the necessary regularity (see Proposition 19 below for details).

5. Numerical examples
In this section we study numerically the performance of SBG-Alg. All the results are based on the code available in the repository [GCM]. In Subsection 5.1 we apply SBG-Alg to two families of Lévy models (tempered stable and Watanabe processes) and verify numerically the decay of the bias (established in Subsection 3.2 above) and of the level variance (see Theorem 17 below) of the Gaussian approximations. In Subsection 5.2 we study numerically the cost reduction achieved by SBG-Alg, when compared to Algorithm 2, for the simulation of the vector χ (κ) T .

5.1. Numerical performance of SBG-Alg for tempered stable and Watanabe processes.
To illustrate our results numerically, we consider two classes of exponential Lévy models S = S 0 e X . The first is the tempered stable class, containing the CGMY (or KoBoL) model, widely used for modeling risky assets in financial mathematics (see e.g. [CT15] and the references therein), which satisfies the regularity assumptions from Subsection 3.2 above. The second is the Watanabe class, which has diffuse but singular transition laws [Sat13, Thm 27.19], making it a good candidate for stress testing our results.
We numerically study the decay of the bias and of the level variance of the MLMC estimator in (6.29) for the prices of a lookback put E[S T − S T ] and an up-and-out call E[(S T − K) + ½{S T ≤ M }], as well as for the values of the ulcer index (UI) 100 E[(S T /S T − 1) 2 ] 1/2 [MM89, Dow] and a modified ulcer index (MUI) 100 E[(S T /S T − 1) 2 ½{τ T (S) < T /2}] 1/2 . The first three quantities are commonplace in applications, see [CT15, Dow]. The MUI refines the UI by incorporating information on the drawdown duration, thereby weighting trends more heavily than short-term fluctuations.
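Given simulated price paths, the UI and MUI are plain Monte Carlo averages. The sketch below is our helper, operating on paths discretised on a uniform grid, so the running maximum and its time are grid approximations rather than exact functionals.

```python
import math

def ulcer_indices(paths, dt):
    """Estimates UI = 100 * E[(max S / S_T - 1)^2]^{1/2} and the modified
    index MUI, which counts a path only when its running maximum is attained
    before T / 2 (i.e. tau_T(S) < T / 2).  `paths` is a list of price paths
    sampled on a uniform time grid of spacing dt."""
    T = dt * (len(paths[0]) - 1)
    ui_acc = mui_acc = 0.0
    for p in paths:
        m = max(p)
        tau = p.index(m) * dt            # grid approximation of tau_T(S)
        d2 = (m / p[-1] - 1.0) ** 2      # squared relative drawdown at T
        ui_acc += d2
        if tau < T / 2:
            mui_acc += d2
    n = len(paths)
    return 100.0 * math.sqrt(ui_acc / n), 100.0 * math.sqrt(mui_acc / n)
```

Feeding in paths of the Gaussian approximation (e.g. produced by an SBG-type sampler) yields the MLMC-ready payoffs studied in this section.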
In Subsections 5.1.1 and 5.1.2 we use N = 10 5 independent samples to estimate the means and variances of the variables D 1 j in (6.29) (with χ T ), where κ j = e −r(j−1) and n j = max{j, ⌈log 2 (1 + ν(κ j+1 ))⌉}, j ∈ N, as discussed in Subsection 6.5 below.

5.1.1. Tempered stable. The characteristic triplet (σ 2 , ν, b) of the tempered stable Lévy process X is given by σ = 0, drift b ∈ R and Lévy measure ν(dx) = |x| −1−α sgn(x) c sgn(x) e −λ sgn(x) |x| dx, where α ± ∈ [0, 2), c ± ≥ 0 and λ ± > 0, cf. (2.3). Exact simulation of increments is currently out of reach if either α + > 1 or α − > 1 (see e.g. [Gra19]) and requires the Gaussian approximation.

Figure 5.1. Estimated bias and level variances, respectively, where D 1 j is given in (6.29) with κ j = exp(−r(j − 1)) for r = 1/2. The dashed lines in all the graphs plot the rates of the theoretical bounds in Subsection 3.2 (blue for the bias) and Theorem 17 (red for the level variances). In plots (A)-(D) the initial value of the risky asset is normalised to S 0 = 1 and the time horizon is set to T = 1/6. In plot (B) we set K = 1 and M = 1.2. The model parameters are given in Table 4.

Figure 5.1 suggests that our bounds are close to the exhibited numerical behaviour for continuous payoff functions. In the discontinuous case, χ (κ j ) T appears to be much closer to χ T than the theoretical bounds suggest.

5.1.2. Watanabe model. The characteristic triplet (σ, ν, b) of the Watanabe process is given by σ = 0, the Lévy measure ν = Σ n∈N (c + δ a −n + c − δ −a −n ), where a ∈ N \ {1} and δ x is the Dirac measure at x, and an arbitrary drift b ∈ R. The increments of the Watanabe process are diffuse but have no density (see [Sat13, Thm 27.19]). Since the process has very little jump activity, the bound in Proposition 7 (see also (3.6)) is non-vanishing and the bounds in Theorem 17(c) & (d) are not applicable, meaning that we have no theoretical control on the approximation of τ T (S). This is not surprising, as such an acute lack of jump activity makes the Gaussian approximation unsuitable (cf. [AR01, Prop. 2.2]). The pictures
in Figure 5.2 (A) & (C) suggest that our bounds on the bias and level variance in Subsection 3.2 and Theorem 17 are robust for continuous payoff functions, even if the underlying Lévy process has no transition densities. There are no dashed lines in Figure 5.2 (B) & (D), as there are no results for discontinuous functions of τ T (S) in this case. In fact, Figure 5.2 (B) suggests that the decay rate of the bias and level variance for functions of τ T (S) can be arbitrarily slow if the process does not have sufficient jump activity. Figure 5.2 (D), however, suggests that this decay is still fast if the underlying finite variation process X has a nonzero natural drift (see also Remark 1).

5.2. The cost reduction of SBG-Alg over Algorithm 2. Recall that Algorithm 2 and SBG-Alg both draw exact samples of the Gaussian approximation χ (κ) T . However, in practice, SBG-Alg may be many times faster than Algorithm 2: Figure 5.3 plots the speedup factor, as a function of κ, in the case of the tempered stable process defined in Subsection 5.1.1 above. In conclusion, one should use SBG-Alg instead of Algorithm 2 for the MC estimator in (6.28) (recall that Algorithm 2 is not suitable for the MLMC estimator, as discussed in Subsection 4.1.2). If the Lévy process X is a jump diffusion, i.e. ν(R \ {0}) < ∞, we may apply Algorithms 1 & 2 and SBG-Alg with κ 1 = κ 2 = 0. In that case SBG-Alg still outperforms Algorithm 2 by a constant factor, with the computational benefits being more pronounced when the total expected number of jumps λ := ν(R \ {0})T is large. The cost reduction is most drastic when λ is large, but the improvement is already significant for λ = 2.

6. Proofs
In the remainder of the paper we use the notation τ t := τ t (X) and τ (κ) t := τ t (X (κ) ) for all t > 0.
6.1. Proof of Theorems 1 and 2. In this subsection we establish bounds on the Wasserstein and Kolmogorov distances between the increment X t and its Gaussian approximation X (κ) t in (2.5).
Proof of Theorem 1. Recall the Lévy–Itô decomposition of X at level κ in (2.4) and the martingale M (κ) , where W is a standard Brownian motion in (2.5), independent of Z. Hence any coupling (W t , M (κ) t ) yields a coupling of X t and X (κ) t . Setting W := B, which amounts to the independence coupling (W, J 1,κ ), and applying Jensen's inequality for p ∈ (1, 2] gives the first bound. For any m ∈ N, M (κ) t is a sum of m i.i.d. copies of M (κ) t/m . According to [FL08, Thm 1.1], the limit as m → ∞ of the right-hand side of the display above equals the claimed bound.

Proof of Theorem 2. (a) Let the processes Z and M (κ) be as in the proof of Theorem 1. Since M (κ) is a Lévy process, for any m ∈ N, M (κ) t is a sum of m i.i.d. copies of M (κ) t/m . By the Berry–Esseen inequality [KS12, Thm 1], there exists a constant C BE ∈ (0, 1/2) such that the corresponding bound holds. According to [FL08, Thm 1.1], the limit as m → ∞ of the right-hand side of the display above equals the bound in (a). (b) By [Pic97, Thm 3.1(a)], X t has a smooth density f t and, given T > 0, the constant C ′ = sup (t,x)∈(0,T ]×R t 1/δ f t (x) is finite. Applying (3.1) and (6.9) in Lemma 14 with p = 2 gives (3.3).

6.2. Proof of Theorem 3. We recall an elementary result for stick-breaking processes.
In particular, for any a 1 , a 2 > 0 and b 1 < b 2 with b 2 > 0, the displayed bound holds. Proof. The law of − log ϖ n is gamma with shape n and scale 1. An application of Fubini's theorem then yields the first claim. The formula for φ(x) := min{a 1 x b 1 , a 2 x b 2 } follows by a direct calculation.
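The gamma law of − log ϖ n is easy to verify numerically, assuming (as in the uniform stick-breaking process) that ϖ n is a product of n i.i.d. U(0, 1) proportions; the helper names below are ours.

```python
import math, random

def neg_log_stick(n, rng):
    """-log of a product of n i.i.d. U(0,1) variables: each -log U_i is
    Exp(1), so the sum is Gamma(n, 1), with mean n and variance n."""
    return -sum(math.log(rng.random()) for _ in range(n))

def empirical_mean(n, reps, seed=0):
    """Monte Carlo estimate of E[-log(stick proportion product)]."""
    rng = random.Random(seed)
    return sum(neg_log_stick(n, rng) for _ in range(reps)) / reps
```

The empirical mean concentrates around the gamma mean n at rate (n/reps) 1/2 , consistent with the stated law.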
In either case, the result follows.
For any t > 0, let G κ t denote the joint law of the comonotonic coupling of X t and X (κ) t defined in (6.1). Note that a coupling (X t , X (κ) t ) with law G κ t satisfies the inequality in Theorem 1. The following lemma is crucial in the proof of Theorem 3.
Proof. Note that µ p (κ, t) ≤ µ 2 (κ, t) for all p ∈ (1, 2]. Hence, by Jensen's inequality, in (6.2) we need only consider p ∈ {1, 2}. Pick n ∈ N and set κ p := K p p κ p ϕ 2 κ , p ∈ {1, 2}, where K p and ϕ κ are as in the statement of Theorem 1. Condition on ℓ n and apply the bound in (3.1) to obtain (6.4). An application of (6.4) and Lemma 11 yields the first inequality in (6.2) for p = 1. Consider the case p = 2. A simple expansion splits the expression into two sums, which we proceed to bound. The inequality in (6.4) for p = 2 and Lemma 11 bound the first sum. Define the σ-algebra F n := σ(ℓ 1 , . . ., ℓ n ) and use conditional independence, the tower property, (3.1) and Lemma 11 to bound the second sum. Putting everything together yields the first inequality in (6.2) for p = 2.
Next we prove the second inequality in (6.2). By Lemma 12, we have (6.5). By Fubini's theorem, conditioning in each summand on ℓ n , equality (6.5) and Lemma 11, we obtain a bound in terms of ψ κ = Cκϕ κ , as defined in (3.7). Moreover, by (6.5) and Lemma 11, we obtain the remaining estimate.

Proof of Theorem 3. Let (ξ n , ξ (κ) n ), n ∈ N, be as in Lemma 13 and define the corresponding vectors. By (2.1) and (6.1), it suffices to show that these vectors satisfy (3.4), (3.6) and (3.7). Since x → min{x, 0} is in Lip 1 (R), the inequalities follow from the triangle inequality. The theorem follows from Lemma 13.

Remark 4. Let C t and C (κ) t denote the convex minorants of X and X (κ) on [0, t], respectively. Couple X and X (κ) in such a way that the stick-breaking processes describing the lengths of the faces of their convex minorants (see [PUB12, Thm 1] and [GCMUB21, Sec. 4.1]) coincide. (The Skorokhod space D[0, t] and the space of sequences on R are both Borel spaces by [Kal02, Thms A1.1, A1.2 & A2.2], so the existence of such a coupling is guaranteed by [Kal02, Thm 6.10].) Geometric arguments, similar to the ones in [GCMUB20], show that the sequences of heights of the faces of the convex minorants are coupled as in Lemma 13; the inequalities in (6.2) and (6.3) then yield the same bounds as in Theorem 3 but in a stronger metric (namely, the distance between the convex minorants in the supremum norm), while retaining control on the time of the infimum.

6.3. The proofs of Propositions 5, 6, 7 and 9. The Lévy–Khintchine formula for X t in (2.3), the definition of X (κ) t in (2.5) and the inequality e z ≥ 1 + z (for all z ∈ R) imply (6.6). Thus the exponential moments of X (κ) t are dominated by those of X t and, in particular, the Gaussian approximation X (κ) has as many exponential moments as the Lévy process X.
Proof of Proposition 5. By [Vil09, Thm 6.16], there exists an optimal coupling between the pairs (ξ, ζ). The identity e b − e a = ∫ a b e z dz implies that, for x ≥ y and x ′ ≥ y ′ , we have (6.7). Apply this inequality, the Cauchy–Schwarz inequality, the elementary inequalities (a + b) 2 ≤ 2(a 2 + b 2 ) and (a + b) 1/2 ≤ a 1/2 + b 1/2 (valid for all a, b ≥ 0) and the bound in (6.6). Applying Corollary 4 then gives the desired inequality, concluding the proof of the proposition.
We now introduce a key technical tool. Lemma 14. Let the functions under consideration satisfy 0 ≤ h ≤ M for some constants K, M ≥ 0. Then for any p, r > 0, we have (6.9). Remark 5. An analogous bound to the one in (6.8) holds for the indicator ½ (−∞,y] . Moreover, it follows from the proof below that the boundedness of the function h assumed in Lemma 14 may be replaced with a moment assumption ξ, ξ ′ ∈ L q for some q > 1. In such a case, Hölder's inequality could be invoked to obtain an analogue of (6.10) below. Similar arguments may be used to handle multiple indicators simultaneously.
Proof of Lemma 14. Applying the local γ-Hölder continuity of the distribution function of ζ to (6.8) and optimising over r > 0 yields (6.9). Thus, it remains to establish (6.8).
Elementary set manipulations yield the required inclusion. Hence, the triangle inequality and the Lipschitz property give (6.10). Taking expectations and using Markov's inequality completes the proof. Proof of Proposition 6. Theorem 3 and (6.9) in Lemma 14 (with C and γ given in Assumption (H) and p = 2), applied to the pairs of suprema and their Gaussian approximations under the SBG coupling, give the claim. Proof of Proposition 7. Analogously to the proof of Proposition 6, applying Theorem 3 and (6.9) in Lemma 14 (with C and γ given in Assumption (Hτ ) and p = 1) gives the result.
Lemma 15. Suppose X is not a compound Poisson process. Then the law of τ T is absolutely continuous on (0, T ) and its density is locally bounded on (0, T ).
Proof. If X or −X is a subordinator, then τ T is a.s. 0 or T , respectively. In either case, the result follows immediately. Suppose now that neither X nor −X is a subordinator. Denote by n(ζ > •) (resp. n(ζ > •)) the intensity measures of the lengths ζ of the excursions away from 0 of the Markov process X − X (resp. X − X). Then, by Theorem 5 in [Cha13] with F ≡ K ≡ 1, the law of τ T can only have atoms at 0 or T ; it is absolutely continuous on (0, T ) and its density is given by s → n(ζ > s)n(ζ > T − s), s ∈ (0, T ). The maps s → n(ζ > s) and s → n(ζ > s) are non-increasing, so the density is bounded on any compact subset of (0, T ), completing the proof.
In preparation for the next result, we introduce the following assumption.
Note that ρ is well defined under Assumption (S-α) and that X t /a(t) can only have a nonzero weak limit as t → 0 if the limit is α-stable. Moreover, in that case, a is necessarily regularly varying at 0 with index 1/α, and α is given in terms of the Lévy triplet (σ 2 , ν, b) of X as follows, where β is the BG index introduced in (2.6). In fact, the assumptions of Proposition 9 imply Assumption (S-α) by [BI20, Prop. 2.3], so Proposition 16 generalises Proposition 9. We refer the reader to [Iva18, Sec. 3 & 4] for conditions equivalent to Assumption (S-α).
Assumption (S-α) allows for the cases ρ = 0 and ρ = 1 when α ≤ 1, corresponding to the stable limit being a.s. negative or a.s. positive, respectively. In these cases, the distribution of τ T (X) may have an atom at 0 or T , while the law of τ T (X (κ) ) is absolutely continuous, making convergence in the Kolmogorov distance impossible. This is the reason for excluding ρ ∈ {0, 1} in Proposition 16.
Proof of Proposition 16. By [BI20, Lem. 5.7], under the assumptions in part (a) of the proposition, X T has a continuous density on (−∞, 0), implying the conclusion in (a).
Theorem 17. Fix T > 0, n ∈ N and 1 ≥ κ 1 > κ 2 > 0, and denote by Z the differences of the corresponding components of the vector χ n,T constructed in SBG-Alg, which follows the law Π κ 1 ,κ 2 n,T . (a) For any Lipschitz function f ∈ Lip K (R 2 ), K > 0, we have (6.12). (b) Suppose Assumption (H) is satisfied by some y < 0 and C, γ > 0. Then for any f ∈ BT 1 (y, K, M ), K, M ≥ 0, there exists some K ′ > 0, independent of (n, κ 1 , κ 2 ), such that the corresponding bound holds. (c) If δ ∈ (0, 2] satisfies Assumption (O-δ), then there exists some C > 0 such that the stated bound holds for any K > 0, f ∈ Lip K (R), n ∈ N, κ 1 > κ 2 and p ∈ {1, 2}. (d) Fix s ∈ (0, T ) and let Assumption (O-δ) hold for some δ ∈ (0, 2]; then for any f ∈ BT 2 (s, K, M ), K, M ≥ 0, there exists a constant C > 0 such that the stated bound holds for any n ∈ N, p ∈ {1, 2} and κ 1 > κ 2 . The synchronous coupling of the large jumps of the Gaussian approximations, implicit in SBG-Alg, ensures that no moment assumption on the large jumps of X is necessary for (6.11) to hold. For locally Lipschitz payoffs, however, the function may magnify the distance when a large jump occurs, which leads to the moment assumption of Proposition 5. The proof of Theorem 17 requires bounds on certain moments of the differences of the components of the outputs of Algorithms 1 & 2 and SBG-Alg, given in Proposition 18.
(a) The pairs indexed by i ∈ {1, 2} satisfy the stated inequalities. Remark 6. (i) By Proposition 18, the L 2 -norms of the differences of the components of (χ (κ 1 ) n,t , χ (κ 2 ) n,t ), constructed in SBG-Alg, decay at the same rate as the L 2 -norm of the difference constructed in Algorithm 1. Indeed, assume that κ 1 = cκ 2 for some c > 1, κ 2 → 0 and, for some c ′ , r > 0 and all x > 0, the corresponding tail bound holds; then the required comparison holds for all sufficiently small κ 1 , implying the claim by Proposition 18(a) & (c). Moreover, by Corollary 10, the corresponding expected computational complexities of Algorithm 1 and SBG-Alg are proportional as κ 2 → 0. Furthermore, since the decay of the bias of SBG-Alg is, by Theorem 3, at most a logarithmic factor away from that of Algorithm 1, the MLMC estimator based on Algorithm 1 for Ef (X t ) has the same computational complexity (up to logarithmic factors) as the MLMC estimator for Ef (X t , X t ) based on SBG-Alg (see Table 3 above for the complexity of the latter). (ii) The proof of Proposition 18 implies that an improvement in Algorithm 1 (i.e. a simulation procedure for a coupling with a smaller L 2 -norm of the difference) would result in an improvement of SBG-Alg for the simulation of a coupling (χ (κ 1 ) t , χ (κ 2 ) t ). Interestingly, this holds in spite of the fact that SBG-Alg calls Algorithm 2, whose coupling Π κ 1 ,κ 2 t is inefficient in terms of the L 2 -distance, but applies it only over the short interval [0, L n ].
(iii) A nontrivial bound on the moments of the difference of the times of the suprema under the coupling of Algorithm 2, which would complete the statement in Proposition 18(b), appears to be out of reach. By the SB representation in (2.2), such a bound is not necessary for our purposes: the corresponding bound on the moments of the difference of the components τ^{(κ_1)}_{n,t} and τ^{(κ_2)}_{n,t}, constructed in SBG-Alg, follows from Proposition 19 below, see (6.25). (iv) The bounds on the fourth moments in (6.17) and (6.19) are required to control the level variances of the MLMC estimator for locally Lipschitz payoff functions; they are applied in the proof of Theorem 17(a). Proof of Proposition 18. (a) The difference of the outputs (constructed by Algorithm 1) equals, by (2.5), a sum of two independent martingales. The first inequality follows since 0 < σ² + σ²_{κ_2} ≤ σ² + σ²_{κ_1}, implying (6.16). Similarly, by conditioning on {ℓ_k}^n_{k=1} and L_n, we deduce that the corresponding expectations vanish for any distinct k_1, k_2, k_3, k_4 ∈ {1, …, n + 1}. Thus, by expanding, we obtain the claimed identity. The summands in the first sum are easily bounded by parts (a) and (b). To bound the summands of the second sum, condition on {ℓ_k}^n_{k=1} and L_n and apply parts (a) and (b):

The representation in line 3 of SBG-Alg and an elementary inequality imply (6.20). The first term on the right-hand side of (6.20) is easily bounded via the inequalities in parts (a) and (b). To bound the second term, condition on {ℓ_k}^n_{k=1} and L_n, apply the Cauchy-Schwarz inequality, denote υ := σ² + σ²_{κ_1} and observe that, for m < k ≤ n, the displayed identities hold, where the equalities follow from the definition of the stick-breaking process (see Subsection 2.1). By (6.20), the analogous bound holds for any random variable ϑ. Hence we may bound the first and third conditional moments of |ξ_{1,k} − ξ_{2,k}| and |ξ_{1,n+1} − ξ_{2,n+1}| given {ℓ_k}^n_{k=1} and L_n. Thus, by expanding (6.21), conditioning on {ℓ_k}^n_{k=1} and L_n, and using elementary estimates as in the previously developed bounds, we obtain (6.19).
In order to control the level variances of the MLMC estimator in (6.29) for discontinuous payoffs of χ_t and functions of τ_t, we would need to apply Lemma 14 to the components of (χ^{(κ_1)}_{n,t}, χ^{(κ_2)}_{n,t}) constructed in SBG-Alg. In particular, the assumption in Lemma 14 requires control over the constants in the locally Lipschitz property of the distribution functions of the various components of (χ^{(κ_1)}_{n,t}, χ^{(κ_2)}_{n,t}) in terms of the cutoff levels κ_1 and κ_2. As such a uniform bound in the cutoff level appears to be out of reach, we establish Proposition 19, which allows us to compare the sampled quantities χ^{(κ_1)}_{n,t} and χ^{(κ_2)}_{n,t} with their limit χ_t (as κ_1, κ_2 → 0). Since, under mild assumptions, the distribution functions of the components of the limit χ_t possess the necessary regularity and do not depend on the cutoff level, the application of Lemma 14 in the proof of Theorem 17 becomes feasible via Proposition 19.
(c) Recall that the inequality in (6.25) follows from (6.24) of Proposition 19. The inequality in (6.14) of the proposition is a direct consequence of the Lipschitz property and (6.25).
6.5.2. MLMC estimator. Let (κ_j)_{j∈ℕ} (resp. (n_j)_{j∈ℕ∪{0}}) be a decreasing (resp. increasing) sequence in (0, 1] (resp. ℕ) such that lim_{j→∞} κ_j = 0. Let χ_{0,i} =_d χ^{(κ_1)}_T and (χ^1_{j,i}, χ^2_{j,i}) ∼ Π^{κ_j,κ_{j+1}}_{n_j,T}, i, j ∈ ℕ, be independent draws constructed by SBG-Alg. Then, for the parameters m, N_0, …, N_m ∈ ℕ, the MLMC estimator takes the form (6.29). The bias of the MLMC estimator equals that of the MC estimator in (6.28) with κ = κ_m. Given the sequences (n_j)_{j∈ℕ∪{0}} and (κ_j)_{j∈ℕ}, which determine the simulation algorithms used in the estimator (6.29), Appendix A.2 derives the asymptotically optimal (as ε ↓ 0) values of the integers m and (N_j)^m_{j=0} minimising the expected computational complexity of (6.29) under the constraint that the L²-accuracy of Υ_ML is of level ε. The key quantities are the bounds B(j), V(j) and C(j) on the bias, level variance and computational complexity of SBG-Alg at level j (i.e. run with parameters κ_j and n_j). The number of levels m in (6.29) is determined by the bound on the bias B(j), while the number of samples N_j used at level j is given by the bounds on the complexity and level variances, see the formulae in (A.1)-(A.2). Proposition 21, a consequence of Theorem 3 and Propositions 5, 6 & 7 (for the bias), Theorem 17 (for the level variance) and Corollary 10 (for the complexity), summarises the relevant bounds B(j), V(j) and C(j) established in this paper (suppressing the unknown constants, as we are only interested in the asymptotic behaviour as ε ↓ 0). Proposition 21. Given sequences (κ_j)_{j∈ℕ} and (n_j)_{j∈ℕ∪{0}} as above, set C(j) := n_j + ν(κ_{j+1})T.
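The telescoping structure of the estimator in (6.29) — a plain MC average at the coarsest level plus averages of coupled differences at the finer levels — can be sketched as follows. Here `sample_level0` and `sample_coupled` are hypothetical stand-ins for draws of χ_{0,i} and of the coupled pairs produced by SBG-Alg; the sketch only illustrates the shape of the estimator, not the paper's sampler.

```python
def mlmc_estimate(f, sample_level0, sample_coupled, N):
    """MLMC estimator in the shape of (6.29).

    f             -- payoff function applied to each sample
    sample_level0 -- returns one coarsest-level draw chi_{0,i} (stand-in)
    sample_coupled(j) -- returns a coupled pair (fine, coarse) at level j,
                          distributed as Pi^{kappa_j,kappa_{j+1}}_{n_j,T}
    N             -- list [N_0, ..., N_m] of samples per level
    """
    # Level 0: ordinary Monte Carlo average.
    est = sum(f(sample_level0()) for _ in range(N[0])) / N[0]
    # Levels 1..m: averages of coupled differences f(fine) - f(coarse).
    for j in range(1, len(N)):
        diff = sum(f(fine) - f(coarse)
                   for fine, coarse in (sample_coupled(j) for _ in range(N[j])))
        est += diff / N[j]
    return est
```

With couplings whose differences telescope exactly, the estimator reproduces the finest-level mean; in practice the variance of each difference term is what Theorem 17 controls.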
The following choices of the functions B and V ensure that, for any ε > 0, the MLMC estimator Υ_ML, with the integers m and {N_j}^m_{j=0} given by (A.1)-(A.2), has L²-accuracy of level ε with complexity asymptotically proportional to C_ML(ε) = 2ε^{−2}(Σ^m_{j=0} √(C(j)V(j)))².
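Given bounds B, V and C as in Proposition 21, the integers m and N_j can be obtained by the standard MLMC optimisation. The following is a sketch of that calculus (splitting the squared accuracy ε² equally between squared bias and variance), not the paper's exact formulae (A.1)-(A.2); `B`, `V` and `C` are user-supplied callables.

```python
import math

def mlmc_parameters(B, V, C, eps, m_max=60):
    """Sketch of the standard MLMC parameter choice.

    m is the smallest level whose bias bound B(m) fits the budget eps/sqrt(2);
    N_j minimises total cost sum_j N_j C(j) subject to the variance budget
    sum_j V(j)/N_j <= eps^2/2, giving N_j proportional to sqrt(V(j)/C(j)).
    """
    m = next(j for j in range(m_max) if B(j) <= eps / math.sqrt(2))
    S = sum(math.sqrt(V(l) * C(l)) for l in range(m + 1))
    N = [max(1, math.ceil(2 * eps ** -2 * math.sqrt(V(j) / C(j)) * S))
         for j in range(m + 1)]
    return m, N
```

Plugging these N_j back into the total cost recovers the complexity 2ε^{−2}(Σ_j √(C(j)V(j)))² quoted above.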
Remark 7. By (3.5) and (A.2) we note that κ_m in Proposition 21(a) is bounded by (and typically proportional to) C_0 ε/|log ε|. Moreover, if κ_m = e^{−r(m−1)} for some r > 0, then the constant C_0 does not depend on the rate r. A similar statement holds for parts (b), (c) and (d), see Table 2 above.
It remains to choose the parameters (n_j)_{j∈ℕ∪{0}} and (κ_j)_{j∈ℕ} for the estimator in (6.29). Since we require the bias to vanish geometrically fast, we set κ_j = e^{−r(j−1)} for j ∈ ℕ and some r > 0. The value of the rate r in Theorem 22 below is obtained by minimising the multiplicative constant in the complexity C_ML(ε). Note that n_j affects neither the bias nor the bound B(j) of Υ_ML. By Proposition 21, n_j may be as small as a multiple of log(1/σ²_{κ_j}) without affecting the asymptotic behaviour of the level variances V(j), and as large as ν(κ_{j+1}) without increasing the asymptotic cost C(j) of each level. Moreover, to ensure that the term σ²2^{−n_j} in the level variances (see Theorem 17 above) decays geometrically, it suffices to let n_j grow at least linearly in j. In short, there is a large interval within which we may choose n_j without any effect on the asymptotic performance of the MLMC estimator (see Theorem 22 below). The choice n_j = n_0 + max{j, ⌈log_2(1 + ν(κ_{j+1})T)⌉}, for j ∈ ℕ, made in the numerical examples of Section 5, falls within this interval (recall ⌈x⌉ = inf{j ∈ ℤ : j ≥ x} for x ∈ ℝ).
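The level parameters just described can be computed directly. In the sketch below, `levy_tail_mass` is a hypothetical stand-in for the map κ ↦ ν(κ), the Lévy mass of jumps of magnitude exceeding κ; the formulae for κ_j and n_j are those used in the numerical examples of Section 5.

```python
import math

def sbg_level_parameters(j, r, n0, levy_tail_mass, T):
    """Level-j parameters: kappa_j = exp(-r(j-1)) and
    n_j = n0 + max(j, ceil(log2(1 + nu(kappa_{j+1}) T))).

    `levy_tail_mass` maps a cutoff kappa to nu(kappa), the expected number of
    jumps per unit time larger than kappa (user-supplied, model-dependent).
    """
    kappa_j = math.exp(-r * (j - 1))
    kappa_next = math.exp(-r * j)  # kappa_{j+1}, the finer cutoff at level j
    n_j = n0 + max(j, math.ceil(math.log2(1 + levy_tail_mass(kappa_next) * T)))
    return kappa_j, n_j
```

Letting n_j grow with log_2 of the expected number of large jumps keeps the stick count comparable to the cost ν(κ_{j+1})T already incurred, while the max with j enforces the linear growth needed for the σ²2^{−n_j} term to decay geometrically.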
Figure 1.1. The dashed (resp. solid) line plots the power of ε^{−1} in the computational complexity of the MC (resp. MLMC) estimator, as a function of the BG index β defined in (2.6), for discontinuous functions in BT_1 (3.12) and BT_2 (3.14), locally Lipschitz payoffs, as well as Lipschitz functions of τ_T(X). The cases are split according to whether X has (σ ≠ 0) or does not have (σ = 0) a Gaussian component. The pictures are based on Tables 2 and 3 under assumptions typically satisfied in applications, see Subsection 4.2 below for details.

Figure 2.1. The figure illustrates the first n = 4 sticks of a stick-breaking process. The increments of Y in (2.2) are taken over the intervals [L_k, L_{k−1}] of length ℓ_k. Crucially, the time L_n featuring in the vector (Y_{L_n}, Ȳ_{L_n}, τ_{L_n}(Y)) in (2.2) is exponentially small in n.
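The uniform stick-breaking process pictured in Figure 2.1 can be sampled in a few lines; this is a minimal sketch of the stick subdivision only, not of the full SB representation (2.2).

```python
import random

def stick_breaking(T, n, rng=None):
    """Sample the first n sticks of a uniform stick-breaking process on [0, T].

    Returns (sticks, L) where sticks[k-1] = ell_k = U_k * L_{k-1} with U_k
    uniform on (0, 1), and L = L_n is the unbroken remainder.  Since
    E[U_k] = 1/2, we have E[L_n] = T * 2**-n: the residual interval is
    exponentially small in n, as the caption of Figure 2.1 notes.
    """
    rng = rng or random.Random(0)
    L, sticks = T, []
    for _ in range(n):
        ell = rng.random() * L  # break off a uniform fraction of the remainder
        sticks.append(ell)
        L -= ell
    return sticks, L
```

By construction the sticks and the remainder always partition [0, T] exactly.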

(Subsection 4.2). The numerical performance of SBG-Alg is based on the SB representation in (2.1)-(2.2) of χ^{(κ)}. Figure 5.1. Gaussian approximation of a tempered stable process: log-log plot of the bias and level variance for various payoffs as a function of log κ_j. Circle (•) and plus (+) correspond to log |E[D¹_j]| and log V[D¹_j], respectively, where D¹_j is given in (6.29) with κ_j = exp(−r(j − 1)) for r = 1/2. The dashed lines in all the graphs plot the rates of the theoretical bounds in Subsection 3.2 (blue for the bias) and Theorem 17 (red for level variances). In plots (A)-(D) the initial value of the risky asset is normalised to S_0 = 1 and the time horizon is set to T = 1/6. In plot (B) the observed decay is faster than that predicted by Propositions 6 & 7 (resp. Theorem 17(b) & (d)).

Figure 5.2. Gaussian approximation of a Watanabe process: log-log plot of the bias and level variance for various payoffs as a function of log κ_j. Circle (•) and plus (+) correspond to log |E[D¹_j]| and log V[D¹_j], respectively, where D¹_j is given in (6.29) with κ_j = exp(−r(j − 1)) for r = 1. The dashed lines in graphs (A) & (C) plot the rates of the theoretical bounds in Subsection 3.2 (blue for the bias) and Theorem 17 (red for level variances). In plots (A)-(D) the initial value of the risky asset is normalised to S_0 = 1 and the time horizon is set to T = 1. The model parameters are a = 2, c_+ = c_− = 1.

Figure 5.4. The pictures show, for several values of the number of sticks n, the ratio of the cost of Algorithm 2 to the cost of SBG-Alg (both in seconds) for jump diffusions, as a function of the mean number of jumps λ = ν(ℝ \ {0})T. The ratio for n = 15 is 11.8 (resp. 10.8) in Merton's (resp. Kou's) model when λ = 10.
Σ^m_{i=1} ξ_i, where ξ_1, …, ξ_m are iid with ξ_1 =_d M^{(κ)}_{t/m}. Hence [Pet75, Thm 16] and [Rio09, Thm 4.1] imply the existence of the required universal constants.

Table 1. The rates (as κ → 0) of decay of the bias and level variance for Lipschitz payoffs of (X_T, X̄_T) under the JAG approximation are based on [Der11, Cor. 3.2] and [DH11, Thm 2], respectively. The rates of the bias and level variance for SBG-Alg are given in Theorems 3 & 17 below.
Thus the Lévy triplet (σ², ν, b), with respect to the cutoff function x ↦ 𝟙_{(−1,1)}(x), determines the law of X. All the Lévy triplets in the present paper use this cutoff function.
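For the reader's convenience, with this choice of cutoff function the Lévy-Khintchine representation of the characteristic exponent of X takes the familiar form:

```latex
\mathbb{E}\, e^{iuX_t} = e^{t\psi(u)}, \qquad
\psi(u) = ibu - \frac{\sigma^2 u^2}{2}
  + \int_{\mathbb{R}\setminus\{0\}}
    \bigl(e^{iux} - 1 - iux\,\mathbb{1}_{(-1,1)}(x)\bigr)\,\nu(\mathrm{d}x),
\quad u \in \mathbb{R}.
```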

Table 4. Parameters used for Figure 5.1. The first set of parameters corresponds to the risk-neutral calibration to vanilla options on the USD/JPY exchange rate, see [AL13, Table 3]. The second set is the maximum likelihood estimate based on real-world S&P stock prices, see [KRBF09, Table 1].