1 Introduction

Let \(X\) be a diffusion on some filtered probability space taking values in \((\ell ,r)\) and solving

$$ X_{t}=x+ \int _{0}^{t}\sigma (X_{s})dB_{s}+ \int _{0}^{t}b(X_{s})ds, \qquad t < \zeta , $$
(1.1)

where \(B\) is a Brownian motion and \(\zeta :=\inf \{t\geq 0: X_{t} \notin (\ell ,r)\}\) is the first exit time from the interval \((\ell ,r)\). The process is killed at \(\zeta \) and sent to a cemetery state.

Let us assume that at least one of the boundaries is accessible and \(\zeta \) is finite a.s., and consider \(\mathbb{E}[g(X_{T}){\mathbf {1}}_{\{T<\zeta \}}]\) for a given function \(g\) and a deterministic \(T\). Such computations appear very naturally in many applied problems of science, engineering and finance. For instance, in mathematical finance theory, such an expectation corresponds to the price of a barrier option with payoff \(g\) and maturity \(T\) written on a stock whose price process is given by \(X\). The barrier feature renders the option worthless if the stock price hits one of the accessible boundaries before the maturity of the option.

A closed-form expression for \(\mathbb{E}[g(X_{T}){\mathbf {1}}_{\{T<\zeta \}}]\) is rarely available even in this one-dimensional setting. Thus one needs to resort to an approximation scheme for an answer. Arguably the easiest approach is to run a standard Euler–Maruyama scheme on the SDE (1.1) by setting

$$ \bar{X}_{t_{n+1}} =\bar{X}_{t_{n}} +\sigma (\bar{X}_{t_{n}})(B_{t_{n+1}}-B_{t_{n}}) + b(\bar{X}_{t_{n}}) \frac{T}{N}, $$

where \(\bar{X}_{0}=x\), \(t_{0}=0\), \(N>0\) is an integer, \(t_{n}=\frac{nT}{N}\) for \(n=1, \ldots , N\), and compute \(\mathbb{E}[g(\bar{X}_{T}){\mathbf {1}}_{\{T<\tau \}}]\), where \(\tau \) is the first time that the discrete-time process \((\bar{X}_{t_{n}})_{n=0, 1,\dots ,N}\) hits one of the barriers. Under standard regularity conditions on the diffusion process and \(g\), such a scheme indeed converges as \(N\rightarrow \infty \). However, it converges at a rate much slower than a standard Euler–Maruyama scheme applied to a diffusion process that is not killed at accessible boundaries.

Indeed it was shown by Gobet [21] that under standard hypotheses, the above scheme for the killed diffusion converges weakly at rate \(N^{-1/2}\) as opposed to \(N^{-1}\), which is the rate of weak convergence for the Euler–Maruyama scheme in the absence of killing (see e.g. Talay and Tubaro [44] or Mikulevičius and Platen [34]). This rate is optimal since it is reached when \(X\) is a Brownian motion and \(g\) is an indicator function of a set strictly contained in \((\ell ,r)\) (see Siegmund and Yuh [43]).
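For concreteness, the naive scheme described above can be sketched in a few lines of Python. This is only an illustrative sketch and is not taken from the numerical studies of Sect. 5: the function name `killed_euler_estimate`, its arguments and the plain Monte Carlo average are assumptions made for the example. The discrete path is declared killed at the first grid time at which it lies outside \((\ell ,r)\).

```python
import math
import random


def killed_euler_estimate(g, sigma, b, x0, lower, upper, T, n_steps, n_paths, seed=0):
    """Monte Carlo estimate of E[g(X_T) 1{T < tau}] with explicit Euler-Maruyama.

    tau is the first grid time at which the discrete path is outside (lower, upper).
    All names here are hypothetical and chosen for illustration only.
    """
    rng = random.Random(seed)
    dt = T / n_steps
    sqrt_dt = math.sqrt(dt)
    total = 0.0
    for _ in range(n_paths):
        x = x0
        alive = True
        for _ in range(n_steps):
            # One explicit Euler-Maruyama step with a Gaussian increment.
            x = x + sigma(x) * sqrt_dt * rng.gauss(0.0, 1.0) + b(x) * dt
            if not (lower < x < upper):
                # Killing is only checked on the time grid, which is the
                # source of the slow weak convergence discussed above.
                alive = False
                break
        if alive:
            total += g(x)
    return total / n_paths
```

Since killing is only monitored at the grid times, excursions outside \((\ell ,r)\) between two grid points are missed, which is consistent with the loss of accuracy discussed above.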

Çetin [9] conjectured that using a recurrent transformation would bring the convergence rate back to \(N^{-1}\). A recurrent transformation at heart is a change of measure that keeps the Markovian structure intact while transforming the process into a recurrent one. In particular, \(X\) never touches the boundaries of \((\ell ,r)\) under the new measure ℚ. The article [9] shows that ℚ is locally absolutely continuous with respect to the original measure ℙ, and \(X\) follows

$$ dX_{t}=\sigma (X_{t})dW_{t} + \bigg(b(X_{t})+\sigma ^{2}(X_{t}) \frac{h'}{h}(X_{t})\bigg)dt $$
(1.2)

for some function \(h\) and a ℚ-Brownian motion \(W\). The above claim was a conjecture, and does not follow immediately from the standard results on Euler–Maruyama schemes, because \(\frac{h'}{h}\) explodes near the boundaries and is not Lipschitz; this explosion is in fact needed for \(X\) not to touch the previously accessible boundaries after the measure change. It can create significant difficulties with approximation and may even lead to divergence (see e.g. Hutzenthaler et al. [27, 28] for the potential issues that may arise with non-Lipschitz drivers and for methods to resolve them).

In this paper, we prove the above conjecture with a slight “twist”. Note that if one applies the Euler–Maruyama scheme naively to (1.2), one obtains as usual a Brownian motion with drift whose parameters change at the times of discretisation. This process will hit finite boundaries with positive probability and therefore will exit the state space of \(X\) with positive probability. One way to overcome this is to impose an ad hoc reflection on the boundaries. However, this will introduce a local-time term in computations requiring additional estimates on its convergence rate to 0. Moreover, it is far from obvious that reflection is the optimal resolution of problems arising from the discretised process exiting the domain.

We instead study a drift-implicit method that keeps the state space intact after discretisation. To see how, suppose that \(b\equiv 0\), which can be obtained by changing the scale if necessary, and consider the backward Euler–Maruyama scheme

$$ \widehat{X}_{t_{n+1}} =\widehat{X}_{t_{n}} +\sigma (\widehat{X}_{t_{n}})(B_{t_{n+1}}-B_{t_{n}}) + \frac{T}{N}\sigma ^{2}(\widehat{X}_{t_{n}})\frac{h'}{h}(\widehat{X}_{t_{n+1}}), $$
(1.3)

where \(h\) becomes a concave function. Note that, in contrast to what one would expect from a backward scheme (see e.g. Mao and Szpruch [33], Alfonsi [3, 4] and Neuenkirch and Szpruch [36], to name a few), the \(\sigma ^{2}\)-term in the drift of (1.2) is still evaluated at \(\widehat{X}_{t_{n}}\). This stems from the fact that (1.2) with \(b\equiv 0\) should be viewed as a time-changed version of

$$ dY_{t}=dW_{t} +\frac{h'}{h}(Y_{t})dt, $$

where the time change is given by \(\int _{0}^{t}\sigma ^{2}(Y_{s})ds\). We make extensive use of this correspondence in our proofs.

Our main result is Theorem 4.4 which proves that the rate of weak convergence of the above backward Euler–Maruyama scheme is \(N^{-1}\) under standard assumptions on the diffusion process. Moreover, the function \(h\) achieving this rate is not unique. We show that any nonnegative concave \(h\) vanishing at accessible boundaries can be used to obtain this convergence rate as long as it satisfies some mild growth conditions. Such functions are easy to construct, and we study in Sect. 5 constructions of some particular \(h\)-functions to compute approximate prices for barrier options in a Black–Scholes framework. In fact, we observe fast convergence in our numerical studies even in the absence of the growth conditions imposed by our theoretical analysis. Our numerical results are very promising, and error terms converge to 0 very rapidly even with a small number of iterations. Moreover, in the case of a particular local volatility model with double barriers, our method yields smaller error terms than the so-called Brownian bridge method when the number of discretisations is reasonably large.

We are not the first to consider implicit schemes for studying diffusions with infinite lifetime and taking values in a strict subset of ℝ. Alfonsi [3, 4] and Neuenkirch and Szpruch [36] consider such scalar processes whose SDE representation is given by

$$ dY_{t}= dW_{t} + f(Y_{t})dt, $$
(1.4)

with \(f\) satisfying the conditions of a Feller test ensuring that \(Y\) takes values in \((\ell ,r)\) (see also Dereich et al. [15] in the special case of the Cox–Ingersoll–Ross (CIR) process). The articles [4] and [36] show that the drift-implicit Euler scheme for \(Y\) converges strongly with rate \(N^{-1}\) if \(f\) satisfies certain integrability conditions including

$$ \mathbb{E}^{\mathbb{Q}}\bigg[\int _{0}^{T}\big(f'(Y_{t})\big)^{2}dt \bigg]< \infty . $$
(1.5)

However, this condition cannot be satisfied by an \(h\) that paves the way for a recurrent transformation rendering \(X\) recurrent and following (1.2). (See Proposition C.1 in Appendix C for a proof in case \(b\equiv 0\) and \(\sigma \equiv 1\).)

The estimates obtained in [4] and [36] rely on the Burkholder–Davis–Gundy (BDG) inequality and integrability of the relevant quadratic variation process, which require the corresponding local martingale to be a true martingale. As \(\frac{1}{h(X)}\) is a strict local submartingale under ℚ, one needs to develop new techniques to arrive at the needed estimate for convergence theorems.

This brings to the fore another novelty of our paper. Given the impossibility of the use of the BDG inequality, we use potential-theoretic methods that yield the boundedness of inverse moments of \(h(X)\) under ℚ, which is crucial for obtaining the weak convergence result in our paper (or strong convergence type results considered by [4, 36]). We use the theory of Kato class potentials to show the boundedness of the required moments. Kato potentials are one of the fundamental objects in the study of Schrödinger operators (see e.g. Aizenman and Simon [1], Cranston et al. [13], Chen [11] and Chen and Song [12]). We show in Theorem 2.6 that the additive functional \(dA_{t}=-\frac{1}{2}\frac{h''(X_{t})}{h(X_{t})} dt\) belongs to a particular Kato class defined in [11], which in turn yields the boundedness of the inverse moment of \(\frac{1}{h}(\widehat{X}_{t_{n}})\) (uniformly in \(N\)) in conjunction with a comparison argument via Lemma 3.2. Potential theory also helps us to prove uniform bounds on the moments of integral functionals of \(h^{-2-p}(\widehat{X}_{t})\) (see Theorem 3.3 for an exact description).

Our methodology offers hope for studying the convergence rates for CIR processes or diffusions that live in a bounded interval or half space and do not satisfy (1.5). We show in this paper that if one considers the 3-dimensional Bessel process

$$ dX_{t}=dW_{t}+ \frac{1}{X_{t}}dt, $$

the implicit scheme in (1.3) converges weakly at rate \(N^{-1}\). Clearly, (1.5) is violated since the reciprocal of a 3-dimensional Bessel process is a prime example of a strict local martingale. This process satisfies the conditions of Theorem 4.4 and one obtains the optimal convergence rate for sufficiently smooth \(g\).
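For the Bessel example, the implicit equation in (1.3) can be solved by hand: with \(\sigma \equiv 1\) and \(h(x)=x\), one step reads \(y=a+\frac{\Delta }{y}\) with \(a=\widehat{X}_{t_{n}}+B_{t_{n+1}}-B_{t_{n}}\) and \(\Delta =\frac{T}{N}\), whose unique positive root is \(y=\frac{1}{2}(a+\sqrt{a^{2}+4\Delta })\). The following Python sketch illustrates this; the function names are hypothetical and the code is meant as an illustration under these assumptions, not as a reference implementation.

```python
import math
import random


def bessel3_implicit_step(x, dw, dt):
    """One drift-implicit step for dX = dW + dt / X (sigma = 1, h(x) = x).

    Solves y = x + dw + dt / y for the positive root of y^2 - a*y - dt = 0.
    """
    a = x + dw
    return 0.5 * (a + math.sqrt(a * a + 4.0 * dt))


def bessel3_implicit_path(x0, T, n_steps, seed=0):
    """Simulate the scheme on the grid t_n = n*T/n_steps (hypothetical helper)."""
    rng = random.Random(seed)
    dt = T / n_steps
    path = [x0]
    for _ in range(n_steps):
        dw = math.sqrt(dt) * rng.gauss(0.0, 1.0)
        path.append(bessel3_implicit_step(path[-1], dw, dt))
    return path
```

The root is strictly positive for every value of the Brownian increment, so the discretised process never leaves \((0,\infty )\), in line with the motivation for the implicit scheme.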

To the best of our knowledge, the use of the BDG inequality seems to be almost the only method to control the bounds of the moments in the literature concerning the numerical analysis of SDEs. The novel potential-theoretic approach taken in the present paper avoids the use of the BDG inequality in the computation of inverse moments and instead makes use of the concept of Kato classes. As a result, the appearance of local martingale terms does not introduce an extra difficulty in our framework. This is an important contribution in its own right and has the potential to be useful in other contexts as well. We leave the investigation of the convergence rate for conservative diffusions on \((0,\infty )\) satisfying (1.4) in the absence of condition (C.1) to future study.

Although our analysis assumes a one-dimensional framework, a close look into our technical analysis reveals that our convergence result does not depend heavily on this assumption apart from the comparison argument used in Lemma 3.2. In particular, it is relatively clear how to obtain a version of Theorem 2.9 in the multidimensional case using well-known potential-theoretic arguments on Kato classes. However, our main obstacle in extending our results to a multidimensional setting is the absence of a systematic study of recurrent transformations in higher dimensions. Also note that Lemma 3.2 is only used to obtain estimates on \(h(\widehat{X})\), which is always a one-dimensional object with \(\widehat{X}\) referring to the continuous Euler scheme. Such a study and its applications to Euler methods for killed diffusions will be the subject of future research.

The outline of the paper is as follows. Section 2 fixes the setting, gives a brief summary of results for recurrent transformations needed for this paper together with novel inverse moment estimates, and introduces the backward Euler–Maruyama scheme that is tailored for our purposes. Section 3 obtains the moment estimates that are needed for the weak convergence analysis performed in Sect. 4. Theoretical results are confirmed via numerical studies in Sect. 5, and Sect. 6 concludes the paper.

2 Preliminaries

Let \(X\) be a regular diffusion on \((\ell ,r)\), where \(-\infty \leq \ell < r \leq \infty \). We assume that infinite boundaries are inaccessible, and if any of the boundaries are reached in finite time, the process is killed and sent to the cemetery state \(\Delta \). This is the only instance when the process can be ‘killed’; we do not allow killing inside \((\ell ,r)\). The set \(I\) consists of entrance boundaries and all points that can be reached in finite time starting from the interior of \((\ell ,r)\). That is, \(I\) is the union of \((\ell ,r)\) with the regular, exit and entrance boundaries. The law induced on \(C(\mathbb{R}_{+} ; I)\), the space of \(I\)-valued continuous functions on \(\mathbb{R}_{+} = [0,\infty )\), by \(X\) with \(X_{0}=x\) is denoted by \(P^{x}\) as usual, while \(\zeta \) corresponds to its lifetime, i.e., \(\zeta :=\inf \{t>0:X_{t} \notin (\ell ,r)\}\). We also introduce the set \(I_{\Delta}:=I\cup \{\Delta \}\) and extend any \(I\)-valued Borel-measurable function \(f\) to \(I_{\Delta}\) by setting \(f(\Delta )=0\) unless stated otherwise. The filtration \(({\mathcal {F}}^{0}_{t})_{t \geq 0}\) is the natural filtration of \(X\), \(\tilde{{\mathcal {F}}}_{t}\) is the universal completion of \({\mathcal {F}}^{0}_{t}\), and \({\mathcal {F}}_{t}=\tilde{{\mathcal {F}}}_{t+}\) so that \(({\mathcal {F}}_{t})_{t\geq 0}\) is a right-continuous filtration. We also set \({\mathcal {F}}:=\bigvee _{t \geq 0} {\mathcal {F}}_{t}\). We refer the reader to Borodin and Salminen [8, Chaps. I and II] for a summary of results and references on one-dimensional diffusions. The definitive treatment of such diffusions is of course in Itô and McKean [30, Chap. 3].

Since we are only concerned with the diffusion process until it is killed, we can assume without any loss of generality that \(X\) is on natural scale. The extra regularity conditions imposed in the following assumption are standard in the theory of Euler discretisations for SDEs.

Assumption 2.1

\(X\) is a regular one-dimensional diffusion on \((\ell ,r)\) such that

$$ X_{t}=X_{0}+ \int _{0}^{t}\sigma (X_{s})dB_{s}, \qquad t < \zeta , $$

where \(\sigma :(\ell ,r)\to (0,\infty )\) is continuously differentiable with a bounded derivative, \(B\) is a standard Brownian motion and \(\zeta =\inf \{t>0: X_{t}\in \{\ell ,r\}\}\). Moreover, \(\sigma (\ell +)\) (resp. \(\sigma (r-)\)) exists and is finite if \(\ell \) (resp. \(r\)) is finite.

Note that the speed measure \(m\) associated with \(X\) is given by \(m(dx)=2\sigma ^{-2}(x)dx\) on the Borel subsets of \((\ell ,r)\).

Since we are interested in diffusions with killing, the following assumption is needed to ensure that we are not dealing with a vacuous problem.

Assumption 2.2

\(P^{x}[\zeta <\infty ]>0\) for each \(x \in (\ell ,r)\).

Let \(I_{0}\) be the set of points in \(I\) that can be reached from its interior in finite time. Note that under Assumptions 2.1 and 2.2, there are only two cases to consider:

Case 1. Both \(\ell \) and \(r\) are accessible, which in turn implies \(\ell \) and \(r\) are finite and \(I_{0}=[\ell ,r]\).

Case 2. Only one of \(\ell \) and \(r\) is accessible, which can be assumed to be \(\ell \) without any loss of generality. In particular, \(I_{0}=[\ell , r)\).

As \(\ell \) is always finite as a result of the above convention, the following is also assumed for convenience.

Assumption 2.3

\(\ell =0\).

As a transient diffusion on \((0,r)\), \(X\) has a finite potential density \(u: (0,r)^{2} \to \mathbb{R}_{+}\) with respect to its speed measure (see [8, Paragraph 11 in Sect. II.1]). That is, for any nonnegative and measurable \(f\) vanishing at accessible boundaries, we have

$$ Uf(x):=\int _{0}^{\infty}E^{x}[f(X_{t})]dt=\int _{0}^{r} f(y)u(x,y)m(dy). $$

The potential density is symmetric and explicitly known in terms of the scale function and speed measure of \(X\). Since \(X\) is on natural scale, it is given by

$$ u(x,y)= \begin{cases} (x\wedge y) \big(1-\frac{x \vee y}{r} \big), & \text{if } r< \infty , \\ x\wedge y, & \text{otherwise.} \end{cases} $$

Definition 2.4

Let \(\mathcal{S}\) be the space of continuous functions \(f:(0,r)\to (0,\infty )\) satisfying the integrability conditions \(\int _{(0,r)}f(y)m(dy)<\infty \) and \(\int _{(0,r)}yf(y)m(dy)<\infty \). We define

$$ {\mathcal {H}}_{0}:=\{h:h= Uf, f\in \mathcal{S}\}. $$

Moreover, we denote by ℋ the union of \({\mathcal {H}}_{0}\) and the identity function if \(r=\infty \). If \(r<\infty \), then \({\mathcal {H}}={\mathcal {H}}_{0}\).

Note that any \(h\in {\mathcal {H}}_{0}\) is a concave function that is twice continuously differentiable and satisfies on \((\ell ,r)\) that

$$ \frac{1}{2}\sigma ^{2} h''=-f. $$
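
As a worked illustration of Definition 2.4, suppose that \(r<\infty \) and take \(f=\sigma ^{2}/r\); this particular choice is made only for the sake of the example. Then \(f\in \mathcal{S}\) since \(\int _{(0,r)}f(y)m(dy)=2\) and \(\int _{(0,r)}yf(y)m(dy)=r\), and a direct computation with the potential density gives

$$ h(x)=Uf(x)=\frac{2}{r}\int _{0}^{r}u(x,y)dy=\frac{2}{r}\bigg(\Big(1-\frac{x}{r}\Big)\frac{x^{2}}{2}+\frac{x(r-x)^{2}}{2r}\bigg)=x\Big(1-\frac{x}{r}\Big). $$

This \(h\) is concave, vanishes at 0 and \(r\), and satisfies \(h''=-\frac{2}{r}\), so that \(\frac{1}{2}\sigma ^{2}h''=-f\) as expected. Moreover, \(h'(0)=1\) and \(h'(r)=-1\).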

The following lemma, whose proof is relegated to Appendix A, lists some important properties shared by the functions that belong to ℋ.

Lemma 2.5

Let \(h\in {\mathcal {H}}\).

1) For any given \(z> 0\), consider the function \(H\) defined by

$$ H(x)=x-z \frac{h'(x)}{h(x)}, \qquad x\in (0,r). $$

Then \(H\) is strictly increasing and \(H((0,r))=\mathbb{R}\).

2) The function \(h\) is increasing if \(r=\infty \). However, \(h'\) is bounded. In particular, for \(h\in {\mathcal {H}}_{0}\), we have

$$ \begin{aligned} h'(0)&= \begin{cases} \int _{0}^{\infty}f(y)m(dy), & \textit{if } r=\infty , \\ \int _{0}^{r} \frac{r-y}{r}f(y)m(dy)>0, & \textit{otherwise,} \end{cases} \\ h'(r)&= \begin{cases} 0, & \textit{if } r=\infty , \\ -\frac{1}{r}\int _{0}^{r} yf(y)m(dy)< 0, & \textit{otherwise.} \end{cases} \end{aligned} $$

3) For any \(\alpha \geq 0\) and \(h\in {\mathcal {H}}_{0}\), we have

$$ \int _{(0,r)}\big(u(y,y)\wedge 1\big) \frac{\alpha |h'(y)|-h''(y)}{h(y)}dy< \infty . $$
(2.1)

We are now ready to state the transformations that we use in the sequel.

Theorem 2.6

Suppose that Assumptions 2.1–2.3 are in force, and consider \(h\in {\mathcal {H}}\). Then the following hold:

1) There exists a probability measure \(Q^{h,x}\) on \({\mathcal {F}}\) that is locally absolutely continuous with respect to \(P^{x}\) such that

$$ dX_{t}=\sigma (X_{t})dW_{t} +\sigma ^{2}(X_{t}) \frac{h'(X_{t})}{h(X_{t})}dt $$
(2.2)

and \(W\) is a \(Q^{h,x}\)-Brownian motion.

2) For any \(x\in (\ell ,r)\), \(Q^{h,x}[\zeta <\infty ]=0\).

3) Let \(g:I_{0}\to \mathbb{R}\) be a continuous function vanishing at accessible boundaries. Then for any deterministic \(T>0\), we have

$$ E^{x} [g(X_{T}){\mathbf {1}}_{\{T< \zeta \}} ]=h(x) E^{h,x}\left [ \frac{g(X_{T})}{h(X_{T})}\exp \left (\frac{1}{2}\int _{0}^{T} \frac{\sigma ^{2}(X_{s})h''(X_{s})}{h(X_{s})}ds\right )\right ], $$
(2.3)

where \(E^{h,x}\) is the expectation operator associated with \(Q^{h,x}\).

Proof

If \(h\in {\mathcal {H}}_{0}\), the claims follow from Çetin [9, Theorem 3.2]. When \(r=\infty \) and \(h\) is the identity function, the stated transformation is the well-known Doob \(h\)-transform, and the reader is referred to Evans and Hening [18, Theorem 6.2] for a proof in a much more general setting. □
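
As a sanity check of (2.3) in the simplest case, consider the illustrative special case \(\sigma \equiv 1\), \(r=\infty \) and \(h\) the identity function. Then \(h''\equiv 0\), so the exponential factor in (2.3) equals 1, and (2.2) reduces to \(dX_{t}=dW_{t}+\frac{1}{X_{t}}dt\), i.e., \(X\) is a 3-dimensional Bessel process under \(Q^{{{\mathrm{Id}}},x}\). In this case, (2.3) reads

$$ E^{x} [g(X_{T}){\mathbf {1}}_{\{T< \zeta \}} ]=x E^{{{\mathrm{Id}}},x}\left [\frac{g(X_{T})}{X_{T}}\right ], $$

which is the classical relation between Brownian motion killed at 0 and the 3-dimensional Bessel process.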

Remark 2.7

When \(h\in {\mathcal {H}}_{0}\), [9] shows that \(h(X)\exp (-\frac{1}{2}\int _{0}^{\cdot} \frac{\sigma ^{2}(X_{s})h''(X_{s})}{h(X_{s})}ds )\) is a \(P^{x}\)-martingale and \(X\) is a recurrent process under \(Q^{h,x}\), where \(Q^{h,x}\) is as in Theorem 2.6. This measure transformation via the excessive function \(h\) and the multiplicative functional \(\exp (-\frac{1}{2}\int _{0}^{\cdot} \frac{\sigma ^{2}(X_{s})h''(X_{s})}{h(X_{s})}ds )\) is called a recurrent transformation.

Although similar to a Doob \(h\)-transform at heart, the concept of a recurrent transformation is fundamentally different from an \(h\)-transform. By an \(h\)-transform, we refer to a change of measure via an excessive function \(h: (0,r)\to \mathbb{R}_{+}\). In particular, \(h(X)\) is at most a local martingale and can be a supermartingale that is not a local martingale. The latter case leads to a loss of mass after the change of measure that appears in the killing measure of the resulting diffusion (see Evans and Hening [18, Sect. 6] for details).

In contrast, first, a recurrent transformation always yields a recurrent diffusion while an \(h\)-transform yields a transient one. Indeed, when \(r=\infty \) and one uses the identity function in Theorem 2.6, \(Q^{{{\mathrm{Id}}},x}[\lim _{t \rightarrow \infty}X_{t}=\infty ]=1\) since the corresponding scale function under \(Q^{{{\mathrm{Id}}},x}\) is finite at \(\infty \).

Second, the excessive function that is associated with an \(h\)-transform is in general not harmonic. That is, \(h(X)\) is a strict \(P^{x}\)-supermartingale. This leads to a killing, i.e., a loss of probability mass, under \(Q^{h,x}\) at some particular last passage time. We refer the reader to Çetin [9, Sect. 3.1] for more details on this point and the potential-theoretic connection between recurrent transformations and Doob \(h\)-transforms.

Finally, akin to what we are doing in the present paper, if one is interested in an \(h\)-transform that does not involve any killing as in the preceding paragraph and prevents the diffusion from hitting the boundaries, one needs to find an \(h\)-function such that \(h\) is harmonic, i.e., \(h(X)\) is a \(P^{x}\)-local martingale, and \(h\) vanishes at accessible boundaries. However, the only harmonic functions of a diffusion on natural scale are affine functions. Combined with the requirement that \(h\) should vanish at 0 and \(r\), this implies \(h\equiv 0\). That is, there is no \(h\)-transform that yields a conservative diffusion that does not hit boundaries when \(r<\infty \). On the other hand, any function \(h\in {\mathcal {H}}_{0}\) yields a recurrent diffusion process that avoids hitting the boundaries via a recurrent transformation.

In order to approximate the expectation on the right-hand side of (2.3), we use a backward Euler–Maruyama (BEM) scheme. Let \(N>1\) be an integer and define \(t_{n}:=\frac{n}{N}T\) for \(n=0, \ldots , N\). Set \(\widehat{X}_{0}=X_{0}\) and proceed inductively by setting

$$ \widehat{X}_{t}=\widehat{X}_{t_{n}}+\sigma (\widehat{X}_{t_{n}})(W_{t}-W_{t_{n}})+ (t-t_{n})\sigma ^{2}(\widehat{X}_{t_{n}}) \frac{h'(\widehat{X}_{t})}{h(\widehat{X}_{t})} $$
(2.4)

for \(t \in (t_{n},t_{n+1}]\) and \(n=0, \ldots , N-1\). Note that in view of Lemma 2.5, the mapping \(x\mapsto x- z\frac{h'}{h}(x)\) is one-to-one and onto for any given \(z>0\). Thus the above scheme is well defined since \(\sigma (x)>0\) for all \(x\in (0,r)\).
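To illustrate how one step of (2.4) may be computed in practice, the following Python sketch solves the implicit equation \(x-z\frac{h'}{h}(x)=a\) by bisection, which is justified by the monotonicity and surjectivity stated in Lemma 2.5. The sketch rests on several assumptions that are not part of the text above: the interval is taken to be finite, bisection is just one possible root-finding method, the function names are hypothetical, and the example uses the illustrative choice \(h(x)=x(1-\frac{x}{r})\) with \(r=1\), for which \(\frac{h'}{h}(x)=\frac{1}{x}-\frac{1}{r-x}\).

```python
def solve_implicit_step(a, z, dlog_h, lo, hi, tol=1e-13, max_iter=200):
    """Solve x - z * dlog_h(x) = a on the finite interval (lo, hi) by bisection.

    dlog_h is h'/h. The residual is strictly increasing and runs from
    -infinity to +infinity on (lo, hi), so a unique root exists.
    """
    def residual(x):
        return x - z * dlog_h(x) - a

    # Build a bracket [left, right] inside (lo, hi) with a sign change,
    # moving towards the boundaries where the residual blows up.
    left = right = 0.5 * (lo + hi)
    while residual(left) > 0.0:
        left = lo + 0.5 * (left - lo)
    while residual(right) < 0.0:
        right = hi - 0.5 * (hi - right)

    for _ in range(max_iter):
        mid = 0.5 * (left + right)
        if residual(mid) > 0.0:
            right = mid
        else:
            left = mid
        if right - left < tol:
            break
    return 0.5 * (left + right)


def bem_step(x, dw, dt, sigma, dlog_h, lo, hi):
    """One step of (2.4) over a time increment dt with Brownian increment dw."""
    return solve_implicit_step(x + sigma(x) * dw, dt * sigma(x) ** 2, dlog_h, lo, hi)


def dlog_h_example(x, r=1.0):
    """h'/h for the illustrative choice h(x) = x * (1 - x / r)."""
    return 1.0 / x - 1.0 / (r - x)
```

For instance, `bem_step(0.5, -3.0, 0.05, lambda x: 1.0, dlog_h_example, 0.0, 1.0)` returns a point of \((0,1)\) even though the explicit Euler step would land at \(-2.5\).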

Remark 2.8

One may also consider an exact simulation method as in Beskos and Roberts [6] to compute the right- or left-hand side of (2.3). However, the singularity of the drift term in (2.2) violates the regularity conditions assumed in [6] and its subsequent extensions. Thus one needs to consider the left-hand side and use the original diffusion after assuming enough regularity on \(\sigma \). Provided one can perform an exact simulation, one can get an exact simulation of the minimum or maximum of a path together with its terminal value. In [6], for \(T = 1\) year, the time taken to run 50,000 simulations to compute the maximum of \(X\) on \([0,1]\) is about 3.1 seconds in their C program. To obtain a good pricing accuracy (errors of a few bps), one may need to run 0.5 or 1 million simulation paths, which may amount to 30 seconds to 1 minute. In our numerical calculations for the BEM method, we have used the Octave software, which is much slower than C. On the other hand, the numerical experiments have shown that we needed very few discretisation time steps (20 or 30 steps) and very few paths to get comparable accuracy (see Sect. 5). As a result, our simulations took about 20 seconds to complete.

Nevertheless, the development of an exact simulation method for the recurrent transform to compute the right-hand side of (2.3) has the potential to significantly reduce the computation time by avoiding calculations using an inverse function. This interesting direction is left for future research.

As we shall see in Sect. 3, a crucial role will be played by diffusion processes on \((0,r)\) of the type

$$ dY_{t}=dW_{t} + \bigg(\frac{h'(Y_{t})}{h(Y_{t})}+c\bigg)dt, \qquad t < \zeta (Y), $$
(2.5)

where \(\zeta (Y)\) denotes the first hitting time of 0 or \(r\). Note that \(c=0\) corresponds to the transformations defined in Theorem 2.6.

Theorem 2.9

Suppose that Assumptions 2.1–2.3 are in force, \(h\in {\mathcal {H}}\) and \(Y\) is a process defined by (2.5) with \(Y_{0}=X_{0}\). Assume further that \(c\leq 0\) if \(r=\infty \), and \(c=0\) if \(h(x)=x\) for all \(x\). Then the following statements are valid:

1) \(Q^{h,X_{0}}[\zeta (Y)=\infty ]=1\).

2) For any stopping time \(S\) that is \(Q^{h,X_{0}}\)-a.s. bounded, there exists a constant \(K\) that does not depend on \(X_{0}\) such that

$$ E^{h,X_{0}}\left [\frac{1}{h(Y_{S})}\right ]< \frac{K}{h(X_{0})}. $$

3) For any \(t>0\) and \(p\in [0,1)\),

$$ E^{h,X_{0}}\left [\int _{0}^{t}\frac{1}{h^{2+p}(Y_{s})}ds\right ]< \infty . $$

Proof

1) First observe that the scale function and speed measure for \(Y\) can be chosen as

$$ s_{y}(x)= \int _{d}^{x}\frac{e^{-2cy}}{h^{2}(y)}dy, \qquad m_{y}(dx) = 2 h^{2}(x)\exp{(2cx)} dx, $$

where \(d\in (0,r)\). Since \(s_{y}(0)=-\infty \), 0 is an inaccessible boundary for \(Y\). By the same token, \(r\) is also an inaccessible boundary when \(s_{y}(r)=\infty \), which is valid when \(r<\infty \) or \(c\leq 0\).

2) Define \(Z\) by

$$ Z_{t}:= \frac{1}{h(Y_{t})}\exp \left (\frac{1}{2}\int _{0}^{t} \frac{ 2 ch'(Y_{s})+ h''(Y_{s})}{h(Y_{s})}ds\right ) $$

and note that \(Z\) is a nonnegative \(Q^{h,X_{0}}\)-local martingale by a straightforward application of Itô’s formula. By Sharpe [42, Theorem 62.19], there exists a probability measure \(\tilde{P}\) such that

$$ dY_{t}=d\beta _{t} + c dt, \quad t< \zeta (Y), $$

where \(\beta \) is a \(\tilde{P}\)-Brownian motion, and whenever \(S\) is a stopping time that is finite \(Q^{h,X_{0}}\)-a.s., one has

$$\begin{aligned} E^{h,X_{0}}\bigg[\frac{1}{h(Y_{S})}\bigg]&=\frac{1}{h(X_{0})} \tilde{E}\bigg[{\mathbf {1}}_{\{S< \zeta (Y)\}}\exp \bigg(-\frac{1}{2} \int _{0}^{S}\frac{2 ch'(Y_{s})+h''(Y_{s})}{h(Y_{s})}ds\bigg)\bigg] \\ &\leq \frac{1}{h(X_{0})}\tilde{E}\bigg[{\mathbf {1}}_{\{S< \zeta \}} \exp \bigg(\frac{1}{2}\int _{0}^{S} \frac{2 (ch'(Y_{s}))^{-}-h''(Y_{s})}{h(Y_{s})}ds\bigg)\bigg], \end{aligned}$$

where \(x^{-}\) denotes the negative part of \(x\) and we drop the dependence on \(Y\) for \(\zeta \) to ease the exposition.

Suppose that \(S< R\) \(Q^{h,X_{0}}\)-a.s., where \(R\) is a deterministic constant, and observe that \(\tilde{P}[S\geq R, S<\zeta ]=0\). Thus

$$ \begin{aligned} \tilde{E}\bigg[{\mathbf {1}}_{\{S< \zeta \}}\exp \Big(\int _{0}^{S} \frac{2 (ch'(Y_{s}))^{-}-h''(Y_{s})}{2h(Y_{s})}ds\Big)\bigg] \\ \leq \tilde{E}\bigg[\exp \Big(\int _{0}^{R\wedge \zeta} \frac{2 (ch'(Y_{s}))^{-}-h''(Y_{s})}{2h(Y_{s})}ds\Big)\bigg]. \end{aligned} $$

Let \({\mathcal {W}}^{c,y}\) denote the law of the process \(\tilde{Y}\) starting at \(y\), where \(d\tilde{Y}_{t}=d\beta _{t} +c dt\) and \(\tilde{Y}\) gets killed at hitting 0 or \(r\). Thus

$$ \tilde{E}\left [\exp \bigg(\int _{0}^{R\wedge \zeta} \frac{2 (ch'(Y_{s}))^{-}-h''(Y_{s})}{2h(Y_{s})}ds\bigg)\right ]={ \mathcal {W}}^{c,X_{0}} [\exp (C_{R}) ], $$

where \(C\) is the positive continuous additive functional of \(\tilde{Y}\) with

$$ dC_{t}= \frac{1}{2} \frac{2 (ch'(\tilde{Y}_{t}))^{-}-h''(\tilde{Y}_{t})}{h(\tilde{Y}_{t})}{ \mathbf {1}}_{\{t< \tilde{\zeta}\}}dt. $$

Note that the potential function \(u_{C}\) of \(C\) is given by

$$ u_{C}(x)={\mathcal {W}}^{c,x}[C_{\infty}]=\int _{0}^{r} v (x,y)\mu _{C}(y) \tilde{m}(dy), $$

where \(v\) is the potential density of \(\tilde{Y}\), \(\mu _{C}(y)=\frac{1}{2}\frac{(2ch'(y))^{-}- h''(y)}{h(y)}\) and \(\tilde{m}\) is the associated speed measure of \(\tilde{Y}\). As the scale function and speed measure of \(\tilde{Y}\) can be chosen as

$$ \tilde{s}(x)=\frac{1-e^{-2cx}}{2c}, \qquad \tilde{m}(dx)=2e^{2cx}dx, $$

where \(\tilde{s}(x)=x\) if \(c=0\), we obtain for \(x\leq y\) that

$$ v(x,y)= \frac{\tilde{s}(x)(\tilde{s}(r)-\tilde{s}(y))}{\tilde{s}(r)}, $$

with \(\frac{\tilde{s}(r)-\tilde{s}(y)}{\tilde{s}(r)}\) being interpreted as 1 if \(\tilde{s}(r)=\infty \).

First observe that \(v(x,y)=u(x,y)\) if \(c=0\). On the other hand, if \(r=\infty \) and \(c<0\), then

$$ v(y,y)e^{2cy}=\frac{e^{2cy}-1}{2c}\leq y\wedge \frac{1}{2|c|}. $$
(2.7)

Similarly, for \(r<\infty \),

$$ v(y,y)e^{2cy}=\frac{e^{-2cr}}{2c(1-e^{-2cr})}(e^{2cy}-1)(e^{2c(r-y)}-1) \leq K(c,r) y\bigg(1-\frac{y}{r}\bigg). $$
(2.8)

Thus for some \(K<\infty \),

$$ \int _{0}^{r} v (y,y)\mu _{C}(y)2e^{2cy}dy\leq K \int _{0}^{r}\big(u(y,y) \wedge 1\big)\frac{(2c h'(y))^{-}- h''(y)}{h(y)}dy < \infty $$
(2.9)

by an application of (2.1) due to the bounds obtained via (2.7) and (2.8), and the assumption on the choice of \(c\) when \(r=\infty \). As

$$ u_{C}(x)\leq \int _{0}^{r} v (y,y)\mu _{C}(y)2e^{2cy}dy, $$

we deduce that \(u_{C}\) is bounded. Now consider a decreasing sequence \((D_{n})\) of subsets of \((0,r)\) such that \(D_{n} \rightarrow \emptyset \). Since

$$ \int _{0}^{r}v(x,y){\mathbf {1}}_{\{D_{n}\}}(y)\mu _{C}(y)2e^{2cy}dy\leq \int _{0}^{r}v(y,y){\mathbf {1}}_{\{D_{n}\}}(y)\mu _{C}(y)2e^{2cy}dy $$

and the right-hand side converges to 0 by the dominated convergence theorem due to (2.9), we establish that \(\mu _{C} \in \mathbf{K}_{1}(\tilde{Y})\) (see Chen [11, Definition 2.2]) by [11, Proposition 2.4]. Therefore, by Chen and Song [12, Proposition 2.3] we arrive at the estimate \(\sup _{y\in (0,r)}{\mathcal {W}}^{c,y} [\exp (C_{t})]\leq d_{1} e^{d_{2} t}\) for some constants \(d_{1}\) and \(d_{2}\). This proves the claim.

3) Since the semigroup is self-dual with respect to the speed measure, denoting the measure \(dy2h^{2}(y)\) by \(\nu (dy)\) gives for any nonnegative measurable \(f\) that

$$\begin{aligned} &\int _{0}^{r} \nu (dy)e^{2cy}f(y)E^{h,y}\bigg[\int _{0}^{t} \frac{e^{-s}{\mathbf {1}}_{\{Y_{s}\in D\}}}{h^{2+p}(Y_{s})}ds\bigg] \\ &=\int _{D} \nu (dy)e^{2cy}\frac{1}{h^{2+p}(y)}E^{h,y}\bigg[\int _{0}^{t}e^{-s}f(Y_{s})ds \bigg] \\ &\leq \int _{D} dy\frac{2e^{2cy}}{h^{p}(y)}E^{h,y}\bigg[\int _{0}^{t} f(Y_{s})ds \bigg], \end{aligned}$$

where \(D:=\{y: h(y)<1\wedge \frac{1}{2}\|h\|_{\infty}\}\). In particular, when \(f(y)=q(\varepsilon , y,y^{*})\) for some \(\varepsilon >0\), where \(q\) is the transition density of \(Y\) with respect to its speed measure, we obtain

$$\begin{aligned} &\int _{0}^{r} \nu (dy)e^{2cy}q(\varepsilon , y,y^{*})E^{h,y}\bigg[ \int _{0}^{t} \frac{e^{-s}{\mathbf {1}}_{\{Y_{s}\in D\}}}{h^{2+p}(Y_{s})}ds\bigg] \\ &\leq \int _{D} dy2e^{2cy}h^{-p}(y)E^{h,y}[L^{y^{*}}_{t+\varepsilon }] \\ &\leq E^{h,y^{*}}[L^{y^{*}}_{t+\varepsilon }]\int _{D} dy2e^{2cy}h^{-p}(y), \end{aligned}$$

where \(L^{y^{*}}\) is the diffusion local time with respect to the speed measure. Letting \(\varepsilon \rightarrow 0\), we arrive at

$$ E^{h,y^{*}}\bigg[\int _{0}^{t} \frac{e^{-s}{\mathbf {1}}_{\{Y_{s}\in D\}}}{h^{2+p}(Y_{s})}ds\bigg] \leq E^{h,y^{*}}[L^{y^{*}}_{t}]\int _{D} dy2e^{2cy}h^{-p}(y)< \infty , $$

provided that \(y\mapsto E^{h,y} [\int _{0}^{t} \frac{e^{-s}{\mathbf {1}}_{\{Y_{s}\in D\}}}{h^{2+p}(Y_{s})}ds]\) is lower semicontinuous. Note that the finiteness of the integral on the right-hand side follows from the fact that \(|h'(y)|\geq \alpha \) for some \(\alpha >0\) on \(D\).

Observe that

$$ E^{h,y} \bigg[\int _{0}^{t} \frac{e^{-s}{\mathbf {1}}_{\{Y_{s}\in D\}}}{h^{2+p}(Y_{s})}ds\bigg]= \phi (y)-e^{-t}E^{h,y}[\phi (Y_{t})], $$

where

$$ \phi (y):=E^{h,y}\bigg[\int _{0}^{\infty} \frac{e^{-s}{\mathbf {1}}_{\{Y_{s}\in D\}}}{h^{2+p}(Y_{s})}ds\bigg]= \int _{D}\frac{2 e^{2cz}v_{1}(y,z)}{h^{p}(z)}dz, $$

where \(v_{1}\) is the 1-potential density of \(Y\). Since \(v_{1}\) is jointly continuous (see [8, Paragraphs 10 and 11 in Chap. II]), the claimed semicontinuity follows. Finally, since

$$ E^{h,y^{*}}\bigg[\int _{0}^{t}\frac{1}{h^{2+p}(Y_{s})}ds\bigg]\leq e^{t} E^{h,y^{*}}\bigg[\int _{0}^{t} \frac{e^{-s}{\mathbf {1}}_{\{Y_{s}\in D\}}}{h^{2+p}(Y_{s})}ds\bigg] + K $$

for some \(K\), the claim follows from the arbitrariness of \(y^{*}\). □

3 Moment estimates for the continuous BEM scheme

In this section, we obtain some moment estimates, including inverse ones, that are necessary to establish the weak rate of convergence (see Sect. 4). We start with the following consequence of Itô’s formula.

Lemma 3.1

Suppose that \(h\in {\mathcal {H}}\) is in \(C^{2}_{b}((0,r); (0,\infty ))\) and such that \(h^{(3)}\) exists and satisfies the growth condition \(|h^{(3)}| \leq K(1+h^{-p})\) for some constants \(K\) and \(p\in [0,1)\). Consider the BEM scheme defined by (2.4). Then for all \(t\in (t_{n},t_{n+1}]\),

$$\begin{aligned} d\widehat{X}_{t}&= \frac{\sigma (\widehat{X}_{t_{n}})}{H_{x}(t_{n},\widehat{X}_{t_{n}};t,\widehat{X}_{t})}dW_{t} \\ & \hphantom{=:} + \frac{\sigma ^{2}(\widehat{X}_{t_{n}})}{H_{x}^{2}(t_{n},\widehat{X}_{t_{n}};t,\widehat{X}_{t})} \bigg(\frac{h'}{h}(\widehat{X}_{t}) +\mu (t_{n}, \widehat{X}_{t_{n}}; t, \widehat{X}_{t})\bigg) dt, \end{aligned}$$
(3.1)

where

$$ \begin{aligned} H(t_{n},z;t,x)&:=x-\sigma ^{2}(z)(t-t_{n})\frac{h'}{h}(x), \\ \mu (t_{n},z;t,x)&:=\big(H_{x}(t_{n},z;t,x)-1\big)\frac{h'}{h}(x) + \frac{1}{2}\frac{\sigma ^{2}(z)(t-t_{n})}{H_{x}(t_{n},z;t,x)}\left ( \frac{h'}{h}\right )''(x). \end{aligned} $$

Consider the sets \(O_{1}:=\{x:h'(x)>0\}\) and \(O_{2}:=\{x:h'(x)<0\}\). Then

$$ \inf _{x\in O_{1}}\mu (t_{n},z;t,x)\geq c_{1} \qquad \textit{and} \qquad \sup _{x\in O_{2}}\mu (t_{n},z;t,x)\leq c_{2} $$

for some constants \(c_{1}\leq 0\leq c_{2}\) that do not depend on \(t_{n}\), \(t\) or \(z\). In particular, \(c_{1}=0\) when \(h(x)=x\).

Proof

The decomposition (3.1) follows from Itô’s formula and straightforward calculations for the derivatives of the inverse function.
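To make this calculation explicit, here is a sketch under the assumption (the scheme (2.4) is stated earlier and not reproduced here) that on \((t_{n},t_{n+1}]\) the scheme is defined implicitly by \(H(t_{n},\widehat{X}_{t_{n}};t,\widehat{X}_{t})=Z_{t}\), where \(Z_{t}:=\widehat{X}_{t_{n}}+\sigma (\widehat{X}_{t_{n}})(W_{t}-W_{t_{n}})\) is auxiliary notation used only in this sketch. Write \(u:=\frac{h'}{h}\), \(\sigma :=\sigma (\widehat{X}_{t_{n}})\) and suppress the arguments of \(H\), so that \(H_{t}=-\sigma ^{2}u\) and \(H_{xx}=-\sigma ^{2}(t-t_{n})u''\). Applying Itô's formula to \(H(t,\widehat{X}_{t})=Z_{t}\) and matching the martingale parts gives

$$ \sigma dW_{t}=H_{t}dt+H_{x}d\widehat{X}_{t}+\frac{1}{2}H_{xx}d\langle \widehat{X}\rangle _{t}, \qquad d\langle \widehat{X}\rangle _{t}=\frac{\sigma ^{2}}{H_{x}^{2}}dt, $$

so that

$$ d\widehat{X}_{t}=\frac{\sigma}{H_{x}}dW_{t}+\frac{\sigma ^{2}}{H_{x}^{2}}\bigg(H_{x}u+\frac{1}{2}\frac{\sigma ^{2}(t-t_{n})}{H_{x}}u''\bigg)dt. $$

Since \(H_{x}u=u+(H_{x}-1)u\), the term in parentheses equals \(u+\mu \), which is the drift in (3.1).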

To prove the second assertion, first note that

$$ H_{x}(t,x)-1:= H_{x} (t_{n}, z; t,x) -1=-\sigma ^{2}(z)(t-t_{n}) \bigg(\frac{h'}{h}\bigg)'(x)\geq 0, $$

where we drop the dependence on \(t_{n}\) and \(z\) to ease the exposition. Observe that

$$ \mu =-\frac{\sigma ^{2}(z)(t-t_{n}) (\frac{h'}{h} )'}{H_{x}}\bigg(H_{x} \frac{h'}{h}- \frac{1}{2}\frac{ (\frac{h'}{h} )''}{ (\frac{h'}{h} )'} \bigg), $$
(3.2)

and that the claim follows immediately if \(h(x)=x\) since the term in parentheses in (3.2) becomes nonnegative. Thus it remains to show the assertion when \(h\in {\mathcal {H}}_{0}\).
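For illustration, the case \(h(x)=x\) can be worked out directly from the definitions of \(H\) and \(\mu \) in the statement of the lemma; the shorthand \(\Delta :=t-t_{n}\) is introduced only for this example. For \(x>0\),

$$ \frac{h'}{h}(x)=\frac{1}{x}, \qquad H_{x}=1+\frac{\sigma ^{2}(z)\Delta}{x^{2}}, \qquad \mu =\frac{\sigma ^{2}(z)\Delta}{x^{3}}\bigg(1+\frac{1}{H_{x}}\bigg)\geq 0, $$

so that one may indeed take \(c_{1}=0\) in this case.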

First consider the case \(r=\infty \). Let \(u:=\frac{h'}{h}\) and note that \(\lim _{x\rightarrow \infty} u'(x)=0\) by Lemma 2.5. Moreover, \(|u'(x)|\leq K x^{-2}\) for some \(K <\infty \), which in turn implies

$$ \lim _{x\rightarrow \infty} \frac{\log (-u'(x))}{x}=0=\lim _{x \rightarrow \infty} \frac{ u''(x)}{u'(x)}, $$
(3.3)

where the second equality is an application of L’Hospital’s rule. Thus

$$ - \frac{1}{2}\frac{ (\frac{h'}{h} )''}{ (\frac{h'}{h} )'}> c \qquad \text{on } \bigg(\frac{x^{*}}{2},\infty \bigg) $$

for some \(c<0\), where \(x^{*}:=\inf \{x:h'(x)=0\}>0\) by Lemma 2.5.

An alternative representation for \(\mu \) is given by

$$ \mu = \sigma ^{2}(z)(t-t_{n})\bigg(-\frac{h'}{h}\Big(\frac{h'}{h} \Big)'\frac{1+H_{x}}{H_{x}}+\frac{1}{2H_{x}}\frac{h'''h-h''h'}{h^{2}} \bigg). $$
(3.4)
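Both representations can be checked from the definition of \(\mu \). A short verification, writing \(u=\frac{h'}{h}\), \(\sigma ^{2}=\sigma ^{2}(z)\) and \(\Delta :=t-t_{n}\) (shorthand used only here): since \(H_{x}-1=-\sigma ^{2}\Delta u'\), we have wherever \(u'\neq 0\) that

$$ \mu =-\sigma ^{2}\Delta uu'+\frac{\sigma ^{2}\Delta}{2H_{x}}u''=-\frac{\sigma ^{2}\Delta u'}{H_{x}}\bigg(H_{x}u-\frac{1}{2}\frac{u''}{u'}\bigg), $$

which is (3.2). Moreover, \(u'=\frac{h''}{h}-u^{2}\) gives \(u''=\frac{h'''h-h''h'}{h^{2}}-2uu'\), and hence

$$ \mu =\sigma ^{2}\Delta \bigg(-uu'-\frac{uu'}{H_{x}}+\frac{1}{2H_{x}}\frac{h'''h-h''h'}{h^{2}}\bigg)=\sigma ^{2}\Delta \bigg(-uu'\frac{1+H_{x}}{H_{x}}+\frac{1}{2H_{x}}\frac{h'''h-h''h'}{h^{2}}\bigg), $$

which is (3.4).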

Thus we are done if

$$ w(t,x):=\frac{\sigma ^{2}(z)(t-t_{n})}{2H_{x}} \frac{h'''h-h''h'}{h^{2}} $$

is bounded from below on \((0,\frac{x^{*}}{2})\). Indeed, as \(h'\) is bounded away from 0 on this interval, the hypothesis on \(h'''\) implies that

$$ w(t,x)\geq -K \frac{\sigma ^{2}(z)(t-t_{n}) (\frac{h'}{h} )^{2}}{1+ \sigma ^{2}(z)(t-t_{n})(\frac{h'}{h})^{2}}, $$

leading to the desired lower bound.

When \(r<\infty \), we have in particular that \(\sigma \) is bounded. Moreover, we have the estimate \(|w(t,x)|\leq K \frac{\sigma ^{2}(z)(t-t_{n})\frac{1}{h^{2}}}{2H_{x}}\) for some constant \(K\) which renders \(w\) bounded. Observing that the remaining term in (3.4) has the correct sign completes the proof. □
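The monotonicity \(H_{x}\geq 1\) used above also suggests how a single step of the scheme might be computed in practice. The following Python sketch is illustrative only: it assumes that (2.4), stated earlier, determines the next value as the solution \(x\) of \(H(t_{n},z;t,x)=y\) for a given target \(y\), and it uses the function \(h(x)=e^{-\ell}-e^{-x}\) that appears later in Sect. 5.1.1, for which \(\frac{h'}{h}(x)=1/(e^{x-\ell}-1)\). The function names and the bisection solver are hypothetical choices, not the authors' implementation.

```python
import math


def drift_ratio(x, ell):
    """h'(x)/h(x) for h(x) = exp(-ell) - exp(-x), valid for x > ell."""
    return 1.0 / math.expm1(x - ell)


def implicit_map(x, ell, var_dt):
    """H(x) = x - sigma^2 * (t - t_n) * h'(x)/h(x); var_dt stands for sigma^2 * (t - t_n)."""
    return x - var_dt * drift_ratio(x, ell)


def solve_implicit_step(target, ell, var_dt, tol=1e-12):
    """Solve H(x) = target on (ell, infinity) by bisection.

    H is strictly increasing (H_x >= 1), tends to -infinity as x decreases
    to ell and to +infinity as x grows, so the root exists and is unique.
    """
    # Lower bracket: move towards the barrier until H drops below the target.
    gap = 1.0
    while implicit_map(ell + gap, ell, var_dt) >= target:
        gap *= 0.5
    lo = ell + gap
    # Upper bracket: move right until H reaches the target.
    hi = max(target, ell) + 1.0
    while implicit_map(hi, ell, var_dt) < target:
        hi += 1.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if implicit_map(mid, ell, var_dt) < target:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)
```

Even when the target lies below the barrier \(\ell \), the computed root stays strictly above \(\ell \), which reflects the fact that the transformed process does not reach the boundary.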

The next lemma is a key comparison result that relates the inverse moments of the BEM scheme to those of the process (2.5) and thereby provides estimates that are valid uniformly in \(N\).

Lemma 3.2

Suppose that \(h\) satisfies the conditions of Lemma 3.1, \(\sigma \) is bounded, \(r=\infty \) and consider the BEM scheme defined by (2.4). Then for any nondecreasing and measurable function \(\phi \) that does not change sign, we have

$$ E^{h,X_{0}}[ \phi (\widehat{X}_{A_{t}^{-1}})] \geq E^{h,X_{0}} [ \phi (Y_{t}) ], $$

where \(Y\) is the process defined by (2.5) with \(c=c_{1}\) as in Lemma 3.1 and \(A\) is a continuous time change defined by \(A_{0}=0\) and

$$ dA_{t}= \frac{\sigma ^{2}(\widehat{X}_{t_{n}})}{H_{x}^{2}(t_{n},\widehat{X}_{t_{n}};t,\widehat{X}_{t})}dt, \qquad t\in (t_{n},t_{n+1}]. $$

Moreover, \(Q^{h,X_{0}}[A_{t}\leq t\|\sigma \|_{\infty}^{2}]=1\).

Proof

Consider the process \(\widehat{Y}\) defined by \(\widehat{Y}_{t}=\widehat{X}_{A_{t}^{-1}}\). The Dambis–Dubins–Schwarz theorem (cf. Revuz and Yor [39, Theorem V.1.6]) yields

$$ d\widehat{Y}_{t}=d\beta _{t} +\left (\frac{h'}{h}(\widehat{Y}_{t}) + \mu _{t}\right )dt, \qquad t\in (t_{n},t_{n+1}], $$

where \(\mu _{t}\geq c_{1}\) and \(\beta \) is a standard Brownian motion adapted to \(({\mathcal {F}}_{A_{t}^{-1}})_{t\geq 0}\). Then the comparison theorem for stochastic differential equations (cf. Çetin and Danilova [10, Theorem 2.10]) shows that

$$ Q^{h,X_{0}}[\widehat{Y}_{t}\geq Y_{t}, t\leq T]=1, $$

where

$$ Y_{t}= X_{0}+ \beta _{t} +\int _{0}^{t} \left (\frac{h'}{h}(Y_{s})+c_{1} \right )ds. $$

Since \(H_{x} \geq 1\), it follows that \(A_{t}\leq t \|\sigma \|_{\infty}^{2}\). This completes the proof. □

The main moment estimates are collected in the following theorem.

Theorem 3.3

Suppose that \(h\) satisfies the conditions of Lemma 3.1, \(\sigma \) is bounded and consider the BEM scheme defined by (2.4). Then for any \(T>0\) and \(p\in [0,1)\), the following statements are valid:

1) For each \(m\in \mathbb{N}\),

$$ \sup _{t\leq T, N} E^{h,X_{0}}\bigg[\frac{1}{h}(\widehat{X}_{t})+ \sum _{n=0}^{N-1}\int _{t_{n}}^{t_{n+1}} \frac{\sigma ^{2}(\widehat{X}_{t_{n}})h^{-2-p}(\widehat{X}_{t})}{H_{x}^{2}(t_{n},\widehat{X}_{t_{n}};t,\widehat{X}_{t})}dt+ |\widehat{X}_{t}|^{m} \bigg]< \infty . $$
(3.5)

2) For each \(n\),

$$ \operatorname*{{\mathrm{ess}\sup}}_{\tau \in {\mathcal {T}}_{n}}E^{h,X_{0}}\bigg[\frac{1}{h}( \widehat{X}_{\tau})+ \widehat{X}^{m}_{\tau}\bigg| {\mathcal {F}}_{t_{n}}\bigg]< \infty , $$
(3.6)

where \(m\geq 0\) is an integer and

$$ {\mathcal {T}}_{n}:=\{\tau : \tau \textit{ is a stopping time valued in } [t_{n}, t_{n+1}]\textit{ }Q^{h,X_{0}}\textit{-a.s.}\}. $$

Suppose further that \(p\leq \frac{1}{2}\) and \(\frac{h''}{h^{1-p}}\) is bounded. Then for each \(N \in \mathbb{N}\) and \(m\geq 0\),

$$\begin{aligned} &E^{h,X_{0}}\bigg[\sum _{n=0}^{N-1}\int _{t_{n}}^{t_{n+1}}\bigg(1-\exp \Big((s-t_{n}) \frac{\sigma ^{2}h''}{2h}(\widehat{X}_{t_{n}})\Big)\bigg) \frac{\sigma (\widehat{X}_{t_{n}})^{2}(h^{-p}(\widehat{X}_{s})+\widehat{X}_{s}^{m})}{H_{x}^{2}(t_{n},\widehat{X}_{t_{n}};s,\widehat{X}_{s})}ds \bigg] \\ &< \frac{KT}{N}, \end{aligned}$$
(3.7)

where \(K\) is a constant independent of \(N\).

The proof of Theorem 3.3 is lengthy and is relegated to Appendix B. We end this section with the following lemma that will be useful in our PDE approach to weak convergence rates in Sect. 4.

Lemma 3.4

Suppose that \(h\) satisfies the conditions of Lemma 3.1, \(\sigma \) is bounded and consider the BEM scheme defined by (2.4). Then for any \(T>0\), the following statements are valid:

1) Let \(p\in [0,1)\) and \(m\geq 0\) be an integer. For each \(n\),

$$ \begin{aligned} &E^{h,X_{0}}\bigg[\int _{t_{n}}^{t_{n+1}}\Big| \frac{h^{1-p}(\widehat{X}_{t})(1+\widehat{X}_{t}^{m})\mu (t_{n}, \widehat{X}_{t_{n}}; t, \widehat{X}_{t})}{H_{x}^{2}(t_{n},\widehat{X}_{t_{n}};t,\widehat{X}_{t})} \Big|dt\bigg| {\mathcal {F}}_{t_{n}}\bigg] \\ &\leq \frac{KT}{N}E^{h,X_{0}}\bigg[\int _{t_{n}}^{t_{n+1}} \frac{\sigma ^{2}(\widehat{X}_{t_{n}})(h^{-2-p}(\widehat{X}_{t})+\widehat{X}_{t}^{m})}{H_{x}^{2}(t_{n},\widehat{X}_{t_{n}};t,\widehat{X}_{t})}dt \bigg| {\mathcal {F}}_{t_{n}}\bigg], \end{aligned} $$

with \(K\) being a constant independent of \(n\).

2) Assume further that \(h\in C^{4}((0,r);(0,\infty ))\). Consider \(p\in [0,1)\) and suppose

$$ \frac{|h^{(k)}|}{h}< \frac{K}{h^{k-2+p}}, \qquad k\in \{2,3,4\}, $$

for some \(K\). Let \(f\in C^{2}((0,r); \mathbb{R})\) be a bounded function such that

$$ |f^{(k)}(x)|\leq K(1+ x^{m})h^{2-p-k}(x), \qquad k\in \{1,2\}, $$

for some \(m\geq 0\). Then for each \(n\) and \(t\in [t_{n},t_{n+1}]\),

$$\begin{aligned} &\bigg| E^{h,X_{0}} \bigg[f(\widehat{X}_{t})\bigg(\frac{h''}{h}( \widehat{X}_{t_{n}})- \frac{h''(\widehat{X}_{t})}{H_{x}^{2}(t_{n},\widehat{X}_{t_{n}};t,\widehat{X}_{t})h(\widehat{X}_{t})} \bigg)\bigg| {\mathcal {F}}_{t_{n}}\bigg]\bigg| \\ &\leq K E^{h,X_{0}}\bigg[ \int _{t_{n}}^{t} \frac{\sigma (\widehat{X}_{t_{n}})^{2}(h^{-(2+p)}(\widehat{X}_{s})+\widehat{X}_{s}^{m})}{H_{x}^{2}(t_{n},\widehat{X}_{t_{n}};s,\widehat{X}_{s})}ds \bigg| {\mathcal {F}}_{t_{n}}\bigg] \\ & \hphantom{=:} -\frac{Kh''(\widehat{X}_{t_{n}})}{h(\widehat{X}_{t_{n}})} \\ & \hphantom{=:-} \times E^{h,X_{0}}\bigg[\int _{t_{n}}^{t} \frac{\sigma ^{2}(\widehat{X}_{t_{n}}) ((h^{-p}(\widehat{X}_{s})+\widehat{X}_{s}^{m})+(s-t_{n})(h^{-2}(\widehat{X}_{s})+\widehat{X}_{s}^{m}) )}{H_{x}^{2}(t_{n},\widehat{X}_{t_{n}};s,\widehat{X}_{s})}ds \bigg| {\mathcal {F}}_{t_{n}}\bigg] \end{aligned}$$

for some constant \(K\) independent of \(n\).

3) Suppose \(f\) and \(h\) satisfy the conditions of part 2) and \(b\in C^{2}_{b}((0,r); \mathbb{R})\). Then for each \(n\) and \(t\in [t_{n},t_{n+1}]\),

$$ \begin{aligned} &\bigg|E^{h,X_{0}}\bigg[f(\widehat{X}_{t})\bigg(b( \widehat{X}_{t_{n}})- \frac{b(\widehat{X}_{t})}{H_{x}^{2}(t_{n},\widehat{X}_{t_{n}};t,\widehat{X}_{t})} \bigg)\bigg| {\mathcal {F}}_{t_{n}}\bigg]\bigg| \\ &\leq K E^{h,X_{0}}\bigg[ \int _{t_{n}}^{t} \frac{\sigma (\widehat{X}_{t_{n}})^{2}(h^{-2-p}(\widehat{X}_{s})+\widehat{X}_{s}^{m})}{H_{x}^{2}(t_{n},\widehat{X}_{t_{n}};s,\widehat{X}_{s})}ds \bigg| {\mathcal {F}}_{t_{n}}\bigg] \end{aligned} $$

for some constant \(K\) independent of \(n\).

Proof

1) It follows directly from the definition of \(\mu \) and the hypothesis on \(h'''\) that

$$ h^{1-p}(\widehat{X}_{t})|\mu (t_{n}, \widehat{X}_{t_{n}}; t, \widehat{X}_{t})|\leq K \sigma ^{2}(\widehat{X}_{t_{n}})(t-t_{n})h^{-2-p}( \widehat{X}_{t}), \qquad t\in [t_{n},t_{n+1}], $$

for some \(K\). Also note that if \(m\geq 1\), there exist \(K\) and \(c\in (0,r)\) such that we have \(x^{m} h^{-(2+p)}\leq Kh^{m-(2+p)}\) for \(x\in [0,c]\). Thus

$$ x^{m} h^{-(2+p)}\leq K(x^{m} + h^{-(2+p)}). $$
(3.8)

2) Let \(\mu _{s}:=\mu (t_{n}, \widehat{X}_{t_{n}}; s, \widehat{X}_{s})\), \(u:=\frac{h'}{h}\) and \(\eta _{s}:= H_{x}(t_{n},\widehat{X}_{t_{n}};s,\widehat{X}_{s})\). Then Itô’s formula yields

$$ f(\widehat{X}_{t})\left (\frac{h''}{h}(\widehat{X}_{t_{n}})- \frac{h''(\widehat{X}_{t})}{h(\widehat{X}_{t})\eta _{t}^{2}}\right )=M_{t} + A_{t}, $$

where \(M\) is a local martingale with \(M_{t_{n}}=0\) since \(\eta _{t_{n}}=1\), and

$$ \begin{aligned} A_{t}&=\int _{t_{n}}^{t} \frac{\sigma ^{2}(\widehat{X}_{t_{n}})f(\widehat{X}_{s})}{2\eta _{s}^{4}} \bigg(\frac{2h''h'}{h^{2}}(\widehat{X}_{s})\mu _{s}+ \frac{(h'')^{2}-h h^{(4)}}{h^{2}}(\widehat{X}_{s})\bigg)ds \\ & \hphantom{=:} -\int _{t_{n}}^{t} \frac{\sigma (\widehat{X}_{t_{n}})^{2}f(\widehat{X}_{s})}{\eta _{s}^{4}} \frac{h^{(3)}}{h}(\widehat{X}_{s})\bigg(\mu _{s}+2\sigma ^{2}( \widehat{X}_{t_{n}})(s-t_{n})\frac{u''(\widehat{X}_{s})}{\eta _{s}} \bigg)ds \\ & \hphantom{=:} -\int _{t_{n}}^{t} \frac{\sigma ^{4}(\widehat{X}_{t_{n}})(s-t_{n})f(\widehat{X}_{s})h''(\widehat{X}_{s})}{h(\widehat{X}_{s})\eta _{s}^{5}} \bigg(2\mu _{s} u''(\widehat{X}_{s})+u^{(3)}(\widehat{X}_{s})\bigg)ds \\ & \hphantom{=:} -\int _{t_{n}}^{t} \frac{\sigma ^{4}(\widehat{X}_{t_{n}})(s-t_{n})f(\widehat{X}_{s})h''(\widehat{X}_{s})}{h(\widehat{X}_{s})\eta _{s}^{5}} \frac{3\sigma ^{2}(\widehat{X}_{t_{n}})(s-t_{n})(u'')^{2}(\widehat{X}_{s})}{\eta _{s}}ds \\ & \hphantom{=:} +\int _{t_{n}}^{t}\bigg(\frac{h''}{h}(\widehat{X}_{t_{n}})- \frac{h''(\widehat{X}_{s})}{h(\widehat{X}_{s})\eta _{s}^{2}}\bigg) \frac{\sigma ^{2}(\widehat{X}_{t_{n}})}{\eta _{s}^{2}}\bigg(f'( \widehat{X}_{s})(u(\widehat{X}_{s})+\mu _{s})+\frac{1}{2}f''( \widehat{X}_{s})\bigg)ds \\ & \hphantom{=:} +\int _{t_{n}}^{t} \frac{\sigma ^{2}(\widehat{X}_{t_{n}})f'(\widehat{X}_{s})}{\eta _{s}^{4}} \bigg(\frac{h''h'-h h^{(3)}}{h^{2}}(\widehat{X}_{s}) \\ & \hphantom{=:+\int _{t_{n}}^{t}\frac{\sigma ^{2}(\widehat{X}_{t_{n}})f'(\widehat{X}_{s})}{\eta _{s}^{4}}\bigg(} -2 \frac{\sigma ^{2}(\widehat{X}_{t_{n}})(s-t_{n})u''(\widehat{X}_{s})}{\eta _{s}} \frac{h''}{h}(\widehat{X}_{s})\bigg)ds. \end{aligned} $$

Observe that the hypothesis on \(h\) implies that

$$ |u^{(k)}|\leq K h^{-1-k}, \qquad k \in \{0,1,2, 3\}, $$

for some constant \(K\). Moreover, for some other constant \(K\) that does not depend on \(s\), we have \(|\mu _{s}|\leq K \sigma ^{2}(\widehat{X}_{t_{n}}) (s-t_{n})h^{-3}\) and \(\sigma ^{2}(\widehat{X}_{t_{n}})(s-t_{n})h^{-2}\eta _{s}^{-1}\leq K\). Thus combined with the assumption on \(f\), we arrive at

$$\begin{aligned} |A_{t}| & \leq -K\frac{h''}{h}(\widehat{X}_{t_{n}})\int _{t_{n}}^{t} \frac{\sigma (\widehat{X}_{t_{n}})^{2}(1+\widehat{X}_{s}^{m})}{H_{x}^{2}(t_{n},\widehat{X}_{t_{n}};s,\widehat{X}_{s})} \Big(h^{-p}(\widehat{X}_{s})\big(1+(s-t_{n})h^{-2}(\widehat{X}_{s}) \big)\Big)ds \\ & \hphantom{=:} +K \int _{t_{n}}^{t} \frac{\sigma ^{2}(\widehat{X}_{t_{n}})(1+\widehat{X}_{s}^{m})}{H_{x}^{2}(t_{n},\widehat{X}_{t_{n}};s,\widehat{X}_{s})h^{2+p}(\widehat{X}_{s})}ds \end{aligned}$$

for some constant \(K\). This in particular implies that \(M\) is a true martingale since we can deduce from the estimates (3.5) and (3.6) that the set

$$ \bigg\{ f(\widehat{X}_{\tau}) \frac{h''(\widehat{X}_{\tau})}{h(\widehat{X}_{\tau})\eta _{\tau}^{2}}: \tau \text{ is a stopping time valued in }(t_{n}, t_{n+1}] \bigg\} $$

is uniformly integrable as soon as we once again recall that \(|h''/h| < Kh^{-p}\) for some \(p<1\). Hence the claim holds in view of (3.8).

3) Applying Itô’s formula and repeating similar estimates yields the claim. □

4 Weak convergence of the BEM scheme

On a filtered probability space \((\Omega , {\mathcal {G}}, ({\mathcal {G}}_{t})_{t\in [0,T]}, \mathbb{P})\) satisfying the usual conditions, consider the stochastic differential equation

$$ X_{t}=X_{0}+ \int _{0}^{t}\sigma (X_{s})dW_{s} +\int _{0}^{t} \mu (X_{s})ds, $$

where \(X_{0}\in (0,r)\), \(\sigma \) and \(\mu \) are bounded and Lipschitz on \((0,r)\), and \(\sigma (x)>\varepsilon \) for all \(x\in (0,r)\) and some \(\varepsilon >0\). Let \(\tau :=\inf \{t\geq 0: X_{t}\notin (0,r)\}\). We are interested in a numerical approximation for \(\mathbb{E}[\tilde{g}(X_{T}){\mathbf {1}}_{\{T<\tau \}}]\) for a sufficiently regular \(\tilde{g}\).

Observe that by a Girsanov transformation, we can rewrite the above expression in terms of a diffusion process satisfying the conditions in earlier sections. Indeed, defining ℚ on \({\mathcal {G}}\) via

$$ \frac{d\mathbb{Q}}{d\mathbb{P}}=\exp \left (-\int _{0}^{T} \frac{\mu (X_{s})}{\sigma (X_{s})}dW_{s}-\frac{1}{2}\int _{0}^{T} \frac{\mu ^{2}(X_{s})}{\sigma ^{2}(X_{s})}ds\right ) $$

makes \(X\) solve \(dX_{t}= \sigma (X_{t})dB_{t}\) for a ℚ-Brownian motion \(B\). Therefore,

$$ \begin{aligned} \mathbb{E}[\tilde{g}(X_{T}){\mathbf {1}}_{\{T< \tau \}}]&= \exp \big(-F(X_{0})\big)\mathbb{E}^{\mathbb{Q}}\bigg[g(X_{T})\exp \bigg(\int _{0}^{T} \sigma ^{2}(X_{t}) b(X_{t})dt\bigg){\mathbf {1}}_{ \{T< \tau \}}\bigg], \\ g(x)&=\tilde{g}(x)\exp \big(F(x)\big), \\ F(x)&=\int _{c}^{x} \frac{\mu (y)}{\sigma ^{2}(y)}dy, \qquad b=- \frac{1}{2}\bigg(\Big(\frac{\mu}{\sigma ^{2}}\Big)'+ \frac{\mu ^{2}}{\sigma ^{4}}\bigg), \end{aligned} $$

and \(c\in (0,r)\). Thus we may assume \(\mu \equiv 0\) and consider

$$ \begin{aligned} &E^{X_{0}}\bigg[g(X_{T})\exp \bigg(\int _{0}^{T} \sigma ^{2}(X_{t}) b(X_{t})dt\bigg){\mathbf {1}}_{\{T< \zeta \}}\bigg] \\ &=E^{h,X_{0}}\bigg[\frac{h(X_{0})g(X_{T})}{h(X_{T})}\exp \bigg(\int _{0}^{T} \sigma ^{2}(X_{t})\Big( b(X_{t})+\frac{h''(X_{t})}{2h(X_{t})}\Big)dt \bigg)\bigg], \end{aligned} $$

where \(X\) is a process satisfying Assumption 2.1, \(b\) is bounded, \(\varepsilon <\sigma < K_{\sigma}\) and \(g\) is sufficiently regular.

We next introduce a set of conditions on the function \(h\) as well as the payoff \(g\) that will be needed for our analysis. The following proposition motivates some of the conditions stated in Assumption 4.2. Its proof is relegated to the end of this section.

Proposition 4.1

Suppose \(b \in C^{4}_{b}((0,r); \mathbb{R})\), \(\sigma \in C^{4}_{b}((0,r))\), \(h\in {\mathcal {H}}\) with

$$ \frac{|h^{(k)}|}{h}< \frac{K_{h}}{h^{k-2+p}}, \qquad k\in \{2,3,4\}, $$

for some \(K_{h}\) and \(p\in (0,1)\), \(g\in C^{6}_{b}((0,r);\mathbb{R})\) is a bounded function with \(g^{(k)}(0)=0\) (and \(g^{(k)}(r)=0\) if \(r <\infty \)) for \(k\in \{0,1,2,3,4\}\), and define for \(t\leq T\)

$$ v(T-t,x):=E^{h,x}\bigg[\frac{g(X_{t})}{h(X_{t})}\exp \bigg(\int _{0}^{t} \sigma ^{2}(X_{s})\Big( b(X_{s})+\frac{h''(X_{s})}{2h(X_{s})}\Big)ds \bigg)\bigg]. $$
(4.1)

Then

$$ v_{t}+\frac{\sigma ^{2}}{2}v_{xx} +\sigma ^{2} \frac{h'}{h}v_{x}=- \sigma ^{2}v\bigg(b+\frac{h''}{2h}\bigg). $$
(4.2)

Moreover, \(v\) and \(v_{t}\) are uniformly bounded and there exists a constant \(K\) such that

$$ \sup _{t\leq T}\bigg|\frac{\partial ^{k}}{\partial x^{k}}v_{t}(t,x) \bigg|+\sup _{t\leq T}\bigg|\frac{\partial ^{k}}{\partial x^{k}}v(t,x) \bigg|\leq Kh^{2-p-k}(x), \qquad k\in \{1,2\}. $$

In view of Proposition 4.1 and for the convenience of the reader, we collect in Assumption 4.2 below all the assumptions needed to prove our convergence result.

Assumption 4.2

The functions \(\sigma \), \(b\), \(h\) and \(g\) satisfy the following regularity conditions:

1) \(h\in {\mathcal {H}}\cap C^{4}((0,r); (0,\infty ))\) is such that

$$ \frac{|h^{(k)}|}{h}< \frac{K_{h}}{h^{p+k-2}}, \qquad k\in \{2,3,4\}, $$

for some \(K_{h}\) and \(p\in [0,\frac{1}{2}]\).

2) \(\sigma \in C_{b}^{2}((0,r); (0,\infty ))\) is bounded away from 0, i.e., there is some \(\varepsilon > 0\) such that \(\sigma (x)>\varepsilon \) for all \(x\in (0,r)\).

3) \(b \in C^{2}_{b}((0,r); \mathbb{R})\).

4) \(g\in C((0,r);\mathbb{R})\) is of polynomial growth with \(g(0)=0\) (and \(g(r)=0\) if \(r <\infty \)).

5) The function \(v\) defined by (4.1) belongs to \(C^{1,4}((0,r);\mathbb{R})\) and satisfies (4.2) as well as the growth conditions

$$ \sup _{t\leq T}\bigg|\frac{\partial ^{k}}{\partial x^{k}}v_{t}(t,x) \bigg|+\sup _{t\leq T}\bigg|\frac{\partial ^{k}}{\partial x^{k}}v(t,x) \bigg|\leq K(1+x^{m})h^{2-p-k}(x), \qquad k\in \{1,2\}, $$

for some constant \(K\) and integer \(m\geq 0\).

Remark 4.3

The first condition on the derivatives of \(h\) is not restrictive for practical purposes. Indeed, if a given \(h\in {\mathcal {H}}\cap C^{4}((0,r); (0,\infty ))\) does not satisfy this condition, one can always linearise this concave function near the boundaries at which \(h\) vanishes to obtain a new concave function satisfying the stated condition.

Theorem 4.4

Consider the BEM scheme defined by (2.4) and the associated error

$$ \begin{aligned} e(N)&:= \frac{g(\widehat{X}_{T})}{h(\widehat{X}_{T})} \exp \bigg(\sum _{n=0}^{N-1}\frac{T}{N} \sigma ^{2}(\widehat{X}_{t_{n}}) \Big( b(\widehat{X}_{t_{n}})+ \frac{h''(\widehat{X}_{t_{n}})}{2h(\widehat{X}_{t_{n}})}\Big)\bigg) \\ & \hphantom{=::} - \frac{g(X_{T})}{h(X_{T})}\exp \bigg(\int _{0}^{T} \sigma ^{2}(X_{t}) \Big( b(X_{t})+\frac{h''(X_{t})}{2h(X_{t})}\Big)dt\bigg). \end{aligned} $$

Then under Assumption 4.2, \(|E^{h,X_{0}}[e(N)] |\leq \frac{KT}{N}\) for some constant \(K\) independent of \(N\).
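As a concrete illustration of the quantity whose bias Theorem 4.4 controls, the following Python sketch simulates the estimator in the simplest setting: \(h(x)=x\) on \((0,\infty )\), constant \(\sigma \) and \(b\equiv 0\), so that \(h''=0\) and the exponential weight equals 1. These choices are illustrative assumptions. The sketch further assumes that the scheme (2.4), stated earlier, determines \(\widehat{X}_{t_{n+1}}\) as the solution \(x\) of \(H(t_{n},\widehat{X}_{t_{n}};t_{n+1},x)=\widehat{X}_{t_{n}}+\sigma (W_{t_{n+1}}-W_{t_{n}})\); for \(h(x)=x\) this is a quadratic equation with a unique positive root. All function names are hypothetical.

```python
import math
import random


def bem_step_linear_h(x_prev, sigma, dt, dw):
    """One implicit step for h(x) = x: solve x - sigma^2 * dt / x = x_prev + sigma * dw.

    The positive root of x^2 - y*x - sigma^2*dt = 0 is returned, written in
    two algebraically equivalent forms to avoid cancellation when y < 0.
    """
    y = x_prev + sigma * dw
    root = math.sqrt(y * y + 4.0 * sigma * sigma * dt)
    if y >= 0.0:
        return 0.5 * (y + root)
    return 2.0 * sigma * sigma * dt / (root - y)


def bem_estimate(g, x0, sigma, T, n_steps, n_paths, seed=0):
    """Monte Carlo estimate of h(x0) * E^h[ g(X_T) / h(X_T) ] with h(x) = x and b = 0."""
    rng = random.Random(seed)
    dt = T / n_steps
    sqrt_dt = math.sqrt(dt)
    total = 0.0
    for _ in range(n_paths):
        x = x0
        for _ in range(n_steps):
            x = bem_step_linear_h(x, sigma, dt, rng.gauss(0.0, sqrt_dt))
        total += g(x) / x
    return x0 * total / n_paths
```

With \(g(y)=y\), the ratio \(g/h\) is identically 1 and the estimate returns \(X_{0}\), in line with the martingale property of Brownian motion stopped at 0. For a general payoff, the estimate carries both the discretisation bias and the usual Monte Carlo sampling error.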

Proof

Let \(\pi _{0}(s)= 1\),

$$ \pi _{k}(s):=\exp \bigg(\sum _{n=0}^{k-1}s \sigma ^{2}(\widehat{X}_{t_{n}}) \Big( b(\widehat{X}_{t_{n}})+ \frac{h''(\widehat{X}_{t_{n}})}{2h(\widehat{X}_{t_{n}})}\Big)\bigg), \qquad k=1,\ldots , N, $$

with the convention that we set \(\pi _{k}:=\pi _{k}(TN^{-1})\), and observe that

$$ \begin{aligned} &E^{h,X_{0}}[e(N)] \\ &=E^{h,X_{0}} [v(T,\widehat{X}_{T})\pi _{N} ]-v(0,X_{0}) \\ &=\sum _{n=0}^{N-1}E^{h,X_{0}} [v(t_{n+1},\widehat{X}_{t_{n+1}})\pi _{n+1}-v(t_{n}, \widehat{X}_{t_{n}})\pi _{n} ] \\ &=\sum _{n=0}^{N-1}E^{h,X_{0}}\bigg[\pi _{n}\bigg(v(t_{n+1}, \widehat{X}_{t_{n+1}}) \\ & \hphantom{=\sum _{n=0}^{N-1}E^{h,X_{0}}\bigg[\pi _{n}\bigg(} \times \exp \Big(\frac{T \sigma ^{2}(\widehat{X}_{t_{n}})}{N}\Big( b( \widehat{X}_{t_{n}})+ \frac{h''(\widehat{X}_{t_{n}})}{2h(\widehat{X}_{t_{n}})}\Big)\Big)-v(t_{n}, \widehat{X}_{t_{n}})\bigg)\bigg]. \end{aligned} $$

Next observe that

$$ \begin{aligned} &E^{h,X_{0}}\bigg[\pi _{n}\bigg(v(t_{n+1},\widehat{X}_{t_{n+1}}) \\ & \hphantom{E^{h,X_{0}}\bigg[\pi _{n}\bigg(} \times \exp \Big(\frac{T \sigma ^{2}(\widehat{X}_{t_{n}})}{N}\Big( b( \widehat{X}_{t_{n}})+ \frac{h''(\widehat{X}_{t_{n}})}{2h(\widehat{X}_{t_{n}})}\Big)\Big)-v(t_{n}, \widehat{X}_{t_{n}})\bigg)\bigg|{\mathcal {F}}_{t_{n}}\bigg] \\ &=\pi _{n}E^{h,X_{0}}\bigg[\bigg(v(t_{n+1},\widehat{X}_{t_{n+1}}) \\ & \hphantom{=\pi _{n}E^{h,X_{0}}\bigg[\bigg(} \times \exp \Big(\frac{T \sigma ^{2}(\widehat{X}_{t_{n}})}{N}\Big( b( \widehat{X}_{t_{n}})+ \frac{h''(\widehat{X}_{t_{n}})}{2h(\widehat{X}_{t_{n}})}\Big)\Big)-v(t_{n}, \widehat{X}_{t_{n}})\bigg)\bigg|{\mathcal {F}}_{t_{n}}\bigg]. \end{aligned} $$

Moreover, using Itô’s formula and (4.2) after division by \(\sigma ^{2}\) gives

$$\begin{aligned} &v(t_{n+1},\widehat{X}_{t_{n+1}})\exp \bigg( \frac{T \sigma ^{2}(\widehat{X}_{t_{n}})}{N}\Big( b(\widehat{X}_{t_{n}})+ \frac{h''(\widehat{X}_{t_{n}})}{2h(\widehat{X}_{t_{n}})}\Big)\bigg)-v(t_{n}, \widehat{X}_{t_{n}}) \\ &=M_{t_{n+1}}-M_{t_{n}}+ \sum _{j=1}^{3} I_{j}, \end{aligned}$$

where \(M\) is a local martingale and

$$ \begin{aligned} I_{1}&=\int _{t_{n}}^{t_{n+1}} \frac{\pi _{n+1}(t-t_{n})}{\pi _{n}(t-t_{n})} \frac{\sigma ^{2}(\widehat{X}_{t_{n}})v_{x}(t,\widehat{X}_{t})\mu (t_{n}, \widehat{X}_{t_{n}}; t, \widehat{X}_{t})}{H_{x}^{2}(t_{n},\widehat{X}_{t_{n}};t,\widehat{X}_{t})}dt, \\ I_{2}&=\int _{t_{n}}^{t_{n+1}} \frac{\pi _{n+1}(t-t_{n})}{\pi _{n}(t-t_{n})}\sigma ^{2}(\widehat{X}_{t_{n}})v_{t}(t, \widehat{X}_{t}) \\ & \hphantom{=:\int _{t_{n}}^{t_{n+1}}} \times \bigg(\frac{1}{\sigma ^{2}(\widehat{X}_{t_{n}})}- \frac{1}{\sigma ^{2}(\widehat{X}_{t})H_{x}^{2}(t_{n},\widehat{X}_{t_{n}};t,\widehat{X}_{t})} \bigg)dt, \\ I_{3}&=\int _{t_{n}}^{t_{n+1}} \frac{\pi _{n+1}(t-t_{n})}{\pi _{n}(t-t_{n})}\sigma ^{2}(\widehat{X}_{t_{n}})v(t, \widehat{X}_{t}) \\ & \hphantom{=:\int _{t_{n}}^{t_{n+1}}} \times \bigg(b(\widehat{X}_{t_{n}})+ \frac{h''(\widehat{X}_{t_{n}})}{2h(\widehat{X}_{t_{n}})}- \frac{b(\widehat{X}_{t})+\frac{h''(\widehat{X}_{t})}{2h(\widehat{X}_{t})}}{H_{x}^{2}(t_{n},\widehat{X}_{t_{n}};t,\widehat{X}_{t})} \bigg)dt. \end{aligned} $$

First note that \(M\) is a true martingale due to (3.5) by the hypothesis on \(v\) and the boundedness of \(h\). Moreover, Lemma 3.4 shows (for a generic constant \(K\) that may change from line to line, but remains bounded uniformly in \(N\)) that

$$ \begin{aligned} & | E^{h,X_{0}} [I_{1}+I_{2}+I_{3}|{\mathcal {F}}_{t_{n}}] | \\ &\leq K\frac{T}{N} E^{h,X_{0}}\bigg[ \int _{t_{n}}^{t_{n+1}} \frac{\sigma ^{2}(\widehat{X}_{t_{n}})(h^{-2-p}(\widehat{X}_{t})+\widehat{X}_{t}^{m})}{H_{x}^{2}(t_{n},\widehat{X}_{t_{n}};t,\widehat{X}_{t})}dt \bigg| {\mathcal {F}}_{t_{n}}\bigg] \\ & \hphantom{=:} +K E^{h,X_{0}}\bigg[\int _{t_{n}}^{t_{n+1}}dt \frac{\pi _{n+1}(t-t_{n})}{\pi _{n}(t-t_{n})}\sigma ^{2}(\widehat{X}_{t_{n}}) \\ & \hphantom{=:K E^{h,X_{0}}\bigg[=\int _{t_{n}}^{t_{n+1}}dt} \times \int _{t_{n}}^{t} \frac{\sigma (\widehat{X}_{t_{n}})^{2}(h^{-2-p}(\widehat{X}_{s})+\widehat{X}_{s}^{m})}{H_{x}^{2}(t_{n},\widehat{X}_{t_{n}};s,\widehat{X}_{s})}ds \bigg| {\mathcal {F}}_{t_{n}}\bigg] \\ & \hphantom{=:} -K E^{h,X_{0}}\bigg[\int _{t_{n}}^{t_{n+1}}dt \frac{\pi _{n+1}(t-t_{n})}{\pi _{n}(t-t_{n})}\frac{\sigma ^{2}h''}{h}( \widehat{X}_{t_{n}}) \\ & \hphantom{=:K E^{h,X_{0}}\bigg[=\int _{t_{n}}^{t_{n+1}}dt} \times \int _{t_{n}}^{t} \frac{\sigma (\widehat{X}_{t_{n}})^{2}(h^{-p}(\widehat{X}_{s})+\widehat{X}_{s}^{m})}{H_{x}^{2}(t_{n},\widehat{X}_{t_{n}};s,\widehat{X}_{s})}ds \bigg| {\mathcal {F}}_{t_{n}}\bigg] \\ & \hphantom{=:} -KE^{h,X_{0}}\bigg[\int _{t_{n}}^{t_{n+1}}dt \frac{\pi _{n+1}(t-t_{n})}{\pi _{n}(t-t_{n})}\frac{\sigma ^{2}h''}{h}( \widehat{X}_{t_{n}}) \\ & \hphantom{=:-KE^{h,X_{0}}\bigg[\int _{t_{n}}^{t_{n+1}}dt} \times \int _{t_{n}}^{t} \frac{\sigma ^{2}(\widehat{X}_{t_{n}})(h^{-2}(\widehat{X}_{s})+\widehat{X}_{s}^{m})(s-t_{n})}{H_{x}^{2}(t_{n},\widehat{X}_{t_{n}};s,\widehat{X}_{s})}ds \bigg| {\mathcal {F}}_{t_{n}}\bigg] \\ &\leq K\frac{T}{N} E^{h,X_{0}}\bigg[ \int _{t_{n}}^{t_{n+1}} \frac{\sigma ^{2}(\widehat{X}_{t_{n}})(h^{-2-p}(\widehat{X}_{t})+\widehat{X}_{t}^{m})}{H_{x}^{2}(t_{n},\widehat{X}_{t_{n}};t,\widehat{X}_{t})}dt \bigg| {\mathcal {F}}_{t_{n}}\bigg] \\ & \hphantom{=:} +E^{h,X_{0}}\bigg[\int _{t_{n}}^{t_{n+1}}\bigg(1-\exp \Big((t-t_{n}) \frac{\sigma ^{2}h''}{2h}(\widehat{X}_{t_{n}})\Big)\bigg) \\ & 
\hphantom{=:+E^{h,X_{0}}\bigg[\int _{t_{n}}^{t_{n+1}}} \times \frac{\sigma (\widehat{X}_{t_{n}})^{2}(h^{-p}(\widehat{X}_{t})+\widehat{X}_{t}^{m})}{H_{x}^{2}(t_{n},\widehat{X}_{t_{n}};t,\widehat{X}_{t})}dt \bigg| {\mathcal {F}}_{t_{n}}\bigg] \\ & \hphantom{=:} +K\frac{T}{N}E^{h,X_{0}}\bigg[\int _{t_{n}}^{t_{n+1}} \frac{\sigma (\widehat{X}_{t_{n}})^{2}(h^{-2}(\widehat{X}_{t})+\widehat{X}_{t}^{m})}{H_{x}^{2}(t_{n},\widehat{X}_{t_{n}};t,\widehat{X}_{t})}dt \bigg| {\mathcal {F}}_{t_{n}}\bigg], \end{aligned} $$

where we have used the boundedness of \(\pi _{n+1}/\pi _{n}\) several times, and the last two lines follow from interchanging the order of integration on the third and fourth lines. This proves the assertion in view of Theorem 3.3 and in particular of (3.5) and (3.7), since the \(\pi _{n}\) are nonnegative and uniformly bounded and \(H_{x}\geq 1\). □

We end this section with the

Proof of Proposition 4.1

Note that \(v(T-t,x)=\frac{u(T-t,x)}{h(x)}\), where

$$ u(T-t,x):=E^{x}\left [g(X_{t})\exp \left (\int _{0}^{t} \sigma ^{2}(X_{s}) b(X_{s})ds\right ){\mathbf {1}}_{\{t< \zeta \}} \right ]. $$

Note that \(u(t,0)=0\) for \(t\leq T\). Moreover, Ladyženskaja et al. [32, Theorem IV.5.2] yields that \(u\) is the unique solution of

$$ u_{t} +\frac{1}{2}\sigma ^{2} u_{xx} + \sigma ^{2} u b=0 $$
(4.3)

and that

$$ \sup _{t\leq T, x\in (0,r)}\bigg|\frac{\partial ^{j}}{\partial x^{j}} \frac{\partial ^{k}}{\partial t^{k}}u\bigg|< \infty , \qquad 0\leq 2k+j \leq 5. $$
(4.4)

Also note that since

$$ \begin{aligned} &E^{x}\bigg[g(X_{t})\exp \bigg(\int _{0}^{t} \sigma ^{2}(X_{s}) b(X_{s})ds\bigg){\mathbf {1}}_{\{t< \zeta \}}\bigg] \\ &=g(x) +\frac{1}{2}E^{x}\bigg[\int _{0}^{t}\sigma ^{2}(X_{u})\exp \bigg(\int _{0}^{u} \sigma ^{2}(X_{s}) b(X_{s})ds\bigg) \\ & \hphantom{=:g(x) +\frac{1}{2}E^{x}\bigg[\int _{0}^{t}} \times \big(g''(X_{u})+2g(X_{u})b(X_{u})\big){\mathbf {1}}_{\{u< \zeta \}}du\bigg], \end{aligned} $$

we have

$$\begin{aligned} u_{t}(t,x)&=-\frac{1}{2}E^{x}\big[\sigma ^{2}(X_{T-t})\exp (C_{T-t}) \\ & \hphantom{=-\frac{1}{2}E^{x}\big[} \times \big(g''(X_{T-t})+2g(X_{T-t})b(X_{T-t})\big){\mathbf {1}}_{\{T-t< \zeta \}}\big], \end{aligned}$$
(4.5)

where \(C_{t}=\int _{0}^{t} \sigma ^{2}(X_{s}) b(X_{s})ds\). In particular, \(u_{t}(\,\cdot \,,0)=0\), which in turn implies \(u_{xx}(\,\cdot \,,0)=0\). Analogous boundary conditions also hold at \(r\) if \(r\) is finite.

Let \(w:=u_{t}\) and note that \(w\) solves (4.3) with the boundary condition \(w(t,0)=0\) and \(w(T,\,\cdot \,)=-\frac{1}{2}\sigma ^{2} g'' -\sigma ^{2} gb\). Using the stochastic representation in (4.5) and analogous arguments, we again arrive at \(w_{t}\) vanishing at finite boundaries.

Using the PDE for \(u\), it is straightforward to establish that \(v\) solves (4.2) and is bounded. Moreover, as \(v_{x}=\frac{h u_{x}- uh'}{h^{2}}\), integration by parts gives

$$ v_{x}(t,x)= \frac{\int _{0}^{x} (h(y)u_{xx}(t,y)-u(t,y)h''(y) )dy}{h^{2}(x)}. $$
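Both steps are routine; a sketch is included for completeness. Substituting \(u=vh\) into (4.3) and using \(u_{xx}=hv_{xx}+2h'v_{x}+h''v\) gives

$$ hv_{t}+\frac{\sigma ^{2}}{2}\big(hv_{xx}+2h'v_{x}+h''v\big)+\sigma ^{2}bhv=0, $$

and dividing by \(h\) yields (4.2). For the representation of \(v_{x}\), observe that

$$ \frac{\partial}{\partial y}\big(h(y)u_{x}(t,y)-u(t,y)h'(y)\big)=h(y)u_{xx}(t,y)-u(t,y)h''(y), $$

and that \(hu_{x}-uh'\) vanishes at \(y=0\), assuming \(h(0)=0\) (as for a function \(h\) vanishing at the accessible boundary 0), because \(u(t,0)=0\) and both \(u_{x}\) and \(h'(0)\) are finite. Integrating over \((0,x)\) and dividing by \(h^{2}(x)\) gives the formula above.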

Since \(h'(0)<\infty \) and \(u\) and \(u_{xx}\) vanish at 0 and are jointly continuous near \(t=T\), there exists a neighbourhood of 0 in which we have \(|h''|(y) \leq Kh^{1-p}(y)\leq K^{2} y^{1-p}\), \(|u(\,\cdot \,,y)|+|u_{xx}(\,\cdot \,,y)|< Ky\) (due to Lipschitz-continuity) and \(h(y)>cy\). Thus whenever \(x\) belongs to this neighbourhood, we have

$$ \frac{|v_{x}(t,x)|}{h^{1-p}(x)}\leq \frac{K\int _{0}^{x} y (Ky+K^{2} y^{1-p} )dy}{c^{3-p}x^{3-p}}= \frac{\frac{K^{2}}{3} x^{3}+\frac{K^{3}}{3-p}x^{3-p}}{c^{3-p}x^{3-p}}. $$

Thus \(v_{x}/h^{1-p}\) is bounded near 0. Analogous considerations when \(r<\infty \) show that the ratio is bounded over \((0,r)\).

Next observe that \(v_{t}\) is bounded since \(u_{t}\) vanishes at finite boundaries and \(u_{tx}\) is bounded. In particular, \(v_{t} h^{p}\) remains bounded near finite boundaries (uniformly in \(t\)). Multiplying (4.2) by \(h^{p}\) and using the fact that \(v_{x}/h^{1-p}\) is bounded demonstrates that

$$ \sup _{t\leq T, x\in (0,r)}|v_{xx}(t,x)h^{p}(x)|< \infty . $$

Finally, since \(v_{t}=\frac{w}{h}\), repeating the above arguments and using the fact that \(w_{xx}\) vanishes at finite boundaries and is Lipschitz-continuous in view of (4.4), we deduce that \(v_{tx}/h^{1-p}\) is bounded. Similar arguments (due to the boundedness of \(w_{tx}=u_{ttx}\) in view of (4.4)) also lead to

$$ \sup _{t\leq T, x\in (0,r)}|v_{txx}(t,x)h^{p}(x)|< \infty . $$

 □

5 Numerical analysis

This section is dedicated to numerical experiments illustrating the above technical analysis. As we shall see, one does not really need to satisfy all the conditions assumed in Theorem 4.4 in order to achieve the advertised convergence rate in practice. The experiments below compare the methodology developed in this paper with standard numerical approaches for pricing barrier options (see also the symmetrisation technique from Akahori and Imamura [2] and Imamura et al. [29] for another approach to barrier option pricing).

We consider the classical Black–Scholes model in the first part. As barrier option values are quite sensitive to the market skew/smile of volatility, the time-homogeneous hyperbolic local volatility model is also studied in the second part.

Remark 5.1

In this one-dimensional setting, the value function of a “plain” knock-out option, i.e., \(E^{x}[g(X_{T}){\mathbf {1}}_{\{T<\zeta \}}]\), can be found rather easily by applying a finite difference scheme to the associated PDE with vanishing boundary conditions at accessible boundaries.

The Monte Carlo BEM scheme introduced in this paper will have a clear advantage over the PDE method in higher dimensions. However, already in the one-dimensional case, it is quite flexible with respect to additional complications compared to the PDE method. In practice, a barrier-type payoff can be combined with various features like Asianing and forward starting. As an example, one can consider

$$ {\mathbf {1}}_{\{\zeta > T\}} \bigg( \frac{1}{m} \sum _{i=1}^{m} X_{T_{i}} - X_{T_{0}} \bigg)^{+} $$

with \(0 < T_{0} < T_{1} <\cdots < T_{m} = T\), where the strike is not fixed today but at a future date \(T_{0}\). The price of this derivative can be computed without much extra effort using the BEM method, provided the discretisation includes the time points \(T_{i}\) for \(i=0, \ldots , m\). While such a derivative can still be priced via a PDE approach using a finite difference scheme, the implementation is relatively complex and essentially requires a three-dimensional PDE solver: one dimension for the spot price \(X\), one to capture the possible values of the average \(\frac{1}{m} \sum _{i=1}^{m} X_{T_{i}}\), and another for the possible values of the strike fixed at \(T_{0}\) (see e.g. Wilmott [45, Chap. 25], Musiela and Rutkowski [35, Sects. 6.2 and 7.1.10] or De Weert [14, Sect. 11]). Such a solver can also be computationally intensive.

5.1 Black–Scholes model for barrier options

For expository purposes, we assume the log-price \(X_{t} = \ln S_{t}\) under the risk-neutral probability ℙ is given by

$$ X_{t} =x - \frac{1}{2} \sigma ^{2} t + \sigma W_{t}, \qquad t < \zeta , $$
(5.1)

where \(\zeta =\inf \{t>0:X_{t} \in \{\ell ,r\}\}\) and \(\sigma >0\) is constant. Deterministic interest rates, a dividend yield or borrowing costs can be incorporated without difficulty. Above, \(\zeta \) represents the time of hitting a pre-specified barrier at which the option becomes worthless. Consequently, the value of the barrier option with payoff \(\tilde{g}\) is

$$ {\mathrm{price}} = \mathbb{E}^{\mathbb{P}} [ \tilde{g}(X_{T}) {\mathbf {1}}_{ \{\zeta > T\}} ]. $$
(5.2)

To remove the drift in (5.1), we follow the Girsanov transformation, described at the beginning of Sect. 4, and obtain

$$ dX_{t} = \sigma dW_{t}, \qquad X_{0}=x $$
(5.3)

under ℚ, where

$$ \frac{d \mathbb{Q}}{ d \mathbb{P}} = e^{\frac{1}{2} \sigma W_{T} - \frac{1}{8} \sigma ^{2} T} = e^{\frac{1}{2} (X_{T}-x+ \frac{1}{2} \sigma ^{2} T)- \frac{1}{8} \sigma ^{2} T} = e^{-\frac{1}{2} x + \frac{1}{8} \sigma ^{2} T} e^{\frac{1}{2} X_{T}}. $$

Consequently, with \(g(x) = \tilde{g}(x) e^{-\frac{1}{2}x}\), we get

$$ {\mathrm{price}} = e^{ \frac{1}{2} x - \frac{1}{8} \sigma ^{2} T} \mathbb{E}^{\mathbb{Q}} [ g(X_{T}) {\mathbf {1}}_{\{\zeta > T\}} ]. $$
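As a sanity check of this change of measure, one can compare both sides of the identity in the simpler case without killing, where \(X_{T}\) is Gaussian under both measures. The sketch below does so by trapezoidal quadrature for a put payoff; the helper `gaussian_expectation` and the parameter values are illustrative assumptions.

```python
import math
from typing import Callable


def gaussian_expectation(f: Callable[[float], float], mean: float, std: float,
                         n: int = 4000, width: float = 10.0) -> float:
    """Trapezoidal approximation of E[f(Z)] for Z ~ N(mean, std^2)."""
    lo, hi = mean - width * std, mean + width * std
    step = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        z = lo + i * step
        weight = 0.5 if i in (0, n) else 1.0
        density = math.exp(-0.5 * ((z - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))
        total += weight * f(z) * density
    return total * step


x, sigma, T, K = 0.0, 0.2, 1.0, 1.0  # illustrative values, no barrier


def g_tilde(y: float) -> float:
    """Put payoff written on the log-price y."""
    return max(K - math.exp(y), 0.0)


def g(y: float) -> float:
    """Transformed payoff g(y) = g_tilde(y) * exp(-y/2)."""
    return g_tilde(y) * math.exp(-0.5 * y)


std = sigma * math.sqrt(T)
# Under P the log-price has drift -sigma^2/2; under Q it is driftless.
lhs = gaussian_expectation(g_tilde, x - 0.5 * sigma**2 * T, std)
rhs = math.exp(0.5 * x - sigma**2 * T / 8.0) * gaussian_expectation(g, x, std)
```

The two numbers `lhs` and `rhs` agree up to quadrature error, which is what the prefactor \(e^{\frac{1}{2}x - \frac{1}{8}\sigma ^{2}T}\) is meant to guarantee.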

We shall perform a path transformation method as described earlier that either produces a recurrent process or generates a transient process with infinite lifetime (see Theorem 2.6). A lower barrier will be called \(\underline{s}\), an upper barrier \(\overline{s}\).

5.1.1 Specification of the recurrent transformation

For a single barrier with \(\ell \) finite and \(r = \infty \), we choose \(h(x) = e^{-\ell} - e^{-x}\) with \(h'(x) = e^{-x}\) and \(h''(x) = -e^{-x}\). Note that with this choice of \(h\), condition 1) of Assumption 4.2 is not satisfied for \(k = 2\) and any \(p \in [0, \frac{1}{2}]\). Nevertheless, we apply the implicit scheme (2.4) so that the price (5.2) is approximated by

$$ {\mathrm{price}} \approx e^{ \frac{1}{2} x - \frac{1}{8} \sigma ^{2} T} h(x) \mathbb{E}^{h,x} \left [ \frac{g}{h}(\widehat{X}_{t_{N}}) e^{ \frac{\sigma ^{2}}{2} \frac{T}{N} \sum _{n=0}^{N-1} \frac{h''}{h}( \widehat{X}_{t_{n}} ) } \right ] $$

and still obtain the optimal convergence rate in the numerical experiments.
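For this choice of \(h\), the ratio \(h''/h\) simplifies to \(-1/(e^{x-\ell}-1)\), so the exponential weight in the approximation is cheap to accumulate along a path. A minimal sketch, assuming the path \((\widehat{X}_{t_{0}}, \ldots , \widehat{X}_{t_{N}})\) has already been simulated and stays above \(\ell \) (the function names are hypothetical):

```python
import math
from typing import Sequence


def h_ratio(x: float, ell: float) -> float:
    """h''(x)/h(x) for h(x) = exp(-ell) - exp(-x), valid for x > ell."""
    return -1.0 / math.expm1(x - ell)


def exponential_weight(path: Sequence[float], ell: float, sigma: float, T: float) -> float:
    """exp((sigma^2 / 2) * (T / N) * sum_{n=0}^{N-1} h''/h(path[n])).

    path holds the N + 1 values at t_0, ..., t_N; the last one is not used
    in the sum, matching the left-point rule in the formula.
    """
    n_steps = len(path) - 1
    total = sum(h_ratio(x, ell) for x in path[:-1])
    return math.exp(0.5 * sigma**2 * (T / n_steps) * total)
```

Since \(h''/h<0\) on \((\ell ,\infty )\), the weight always lies in \((0,1)\).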

Remark 5.2

In the Black–Scholes model with \(X_{t} = \ln S_{t}\), the function \(H\) is identical at each time step and needs to be computed only once. In the implementation, we introduce a dense grid covering the interval \((\ell , r)\), calculate the values of \(H\) at the grid points and compute \(H^{-1}\) by piecewise constant approximation.
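A tabulated inverse of this kind can be sketched as follows. We assume here that \(H(z) = z - \sigma ^{2}\frac{T}{N}\frac{h'}{h}(z)\), which is consistent with the closed-form inverse given in Sect. 5.1.2, and we use \(h(x)=e^{-\ell}-e^{-x}\) so that \(h'/h = 1/(e^{z-\ell}-1)\). The truncation level `upper` (needed because \(r=\infty \)) and the function names are illustrative assumptions.

```python
import bisect
import math
from typing import List, Tuple


def build_h_table(ell: float, upper: float, sigma: float, dt: float,
                  n_grid: int) -> Tuple[List[float], List[float]]:
    """Tabulate H(z) = z - sigma^2 * dt * h'(z)/h(z) on a uniform grid in (ell, upper]."""
    grid = [ell + (upper - ell) * (i + 1) / n_grid for i in range(n_grid)]
    values = [z - sigma**2 * dt / math.expm1(z - ell) for z in grid]
    return grid, values


def h_inverse_piecewise_constant(y: float, grid: List[float], values: List[float]) -> float:
    """Return the grid point z_j with the largest H(z_j) <= y, clamped to the grid.

    This relies on H being increasing, so that `values` is sorted.
    """
    j = bisect.bisect_right(values, y) - 1
    return grid[min(max(j, 0), len(grid) - 1)]
```

The table is built once and reused at every time step; the lookup error is bounded by the grid spacing as long as the argument stays within the tabulated range.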

5.1.2 The transient transformation

In the single barrier case of a down-and-out option, we can also consider a transformation via \(h(x) = x-\ell \) when \(\ell \) is finite and \(r = \infty \), as in Theorem 2.6. Under \(Q^{h,x}\), the process \(X\) defined in (5.3) follows

$$ dX_{t} = \sigma dW_{t} + \frac{\sigma ^{2}}{X_{t} - \ell}dt, \qquad X_{0}=x. $$

One advantage of this transformation is that the inverse of the function \(H\) appearing in the implicit scheme (2.4) can be computed analytically and is given by

$$ H^{-1}(x) = \frac{1}{2} \bigg( \sqrt{4 \sigma ^{2} \frac{T}{N} + (x- \ell )^{2}} + x+\ell \bigg). $$
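The closed form can be checked directly. In the sketch below we assume that the implicit step reads \(H(\widehat{X}_{t_{n+1}}) = \widehat{X}_{t_{n}} + \sigma (W_{t_{n+1}}-W_{t_{n}})\) with \(H(z) = z - \sigma ^{2}\frac{T}{N}\frac{1}{z-\ell}\); this form is inferred from the drift above and from the stated inverse, and the function names are hypothetical.

```python
import math


def h_transient(z: float, ell: float, sigma: float, dt: float) -> float:
    """H(z) = z - sigma^2 * dt / (z - ell), for z > ell."""
    return z - sigma**2 * dt / (z - ell)


def h_transient_inverse(y: float, ell: float, sigma: float, dt: float) -> float:
    """Closed-form root z > ell of H(z) = y."""
    return 0.5 * (math.sqrt(4.0 * sigma**2 * dt + (y - ell) ** 2) + y + ell)


def bem_step(x: float, dw: float, ell: float, sigma: float, dt: float) -> float:
    """One implicit step: solve H(x_next) = x + sigma * dw for x_next."""
    return h_transient_inverse(x + sigma * dw, ell, sigma, dt)
```

Note that the returned value is strictly above \(\ell \) for every real argument, including arguments below the barrier, so the scheme never leaves the domain.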

5.2 Down-and-out put option

For a down-and-out put barrier option, the payoff is given by \((K - S_{T})^{+} {\mathbf {1}}_{\{\zeta > T\}}\), where \(r=\infty \) and \(\ell =\log \underline{s}\), \(K\) is the option strike and \(T\) the maturity. As mentioned at the beginning of this section, to put our methodology in perspective, we have also implemented two other approaches to the numerical pricing of the barrier option:

– Standard Euler without hitting probability. This consists of discretising the SDE (5.1) according to the Euler scheme

$$ \textstyle\begin{cases} \displaystyle \widehat{X}_{0} = \ln S_{0}, \\ \displaystyle \widehat{X}_{t_{i+1}} = \widehat{X}_{t_{i}} - \frac{1}{2} \sigma ^{2} \frac{T}{N} + \sigma (W_{t_{i+1}}-W_{t_{i}}). \end{cases} $$
(5.4)

and evaluating \(\tilde{g}(X_{T}) {\mathbf {1}}_{\{\zeta > T\}} \) by \(\tilde{g}(\widehat{X}_{t_{N}}) {\mathbf {1}}_{\{\zeta ^{N} > T\}}\), where

$$ \zeta ^{N} = \inf \{t_{i} > 0: \widehat{X}_{t_{i}} \notin (\ell = \log \underline{s}, \infty )\}. $$

This numerical scheme for barrier option pricing has been studied in Gobet [21], where it was shown to have a convergence rate of \(\mathcal{O}(\frac{1}{\sqrt{N}})\). This loss of accuracy arises mainly because \(X\) can cross the barrier \(\ell \) at some time \(t\) strictly between two grid points \(t_{i}\) and \(t_{i+1}\) without being below the barrier at any of the dates \(t_{i}\), \(i=1,\dots ,N\), so that the discrete scheme misses the exit.

– Standard Euler with hitting probability. Although this is still based on the Euler scheme simulation (5.4), it applies a further correction to remove the barrier crossing biases via the conditional no-hitting probability \(\hat{p}_{i}\) using the Brownian bridge technique (see e.g. [21, Sect. 1.1]). More precisely, the \(\hat{p}_{i}\) are defined and can be computed analytically as

$$ \hat{p}_{i} := \mathbb{P}\big[ \widehat{X}_{t} > \ell , \forall t \in [t_{i}, t_{i+1}] \big| \widehat{X}_{t_{i}} = x_{i}, \widehat{X}_{t_{i+1}} = x_{i+1}\big] =1-e^{ -2 \frac{(x_{i} - \ell )(x_{i+1} - \ell )}{\sigma ^{2} (t_{i+1}-t_{i})} }, $$

where the process \((\widehat{X}_{t})_{0 \leq t \leq T}\) is the continuous Euler scheme which interpolates \((\widehat{X}_{t_{i}})_{0 \leq i \leq N }\) via

$$ \widehat{X}_{t} = \widehat{X}_{t_{i}} - \frac{1}{2} \sigma ^{2} (t - t_{i}) + \sigma (W_{t}-W_{t_{i}}), \qquad t \in [t_{i}, t_{i+1}). $$

It then corrects the payoff \(\tilde{g}(X_{T}) {\mathbf {1}}_{\{\zeta > T\}} \) by considering instead \(\tilde{g}(\widehat{X}_{t_{N}}) \prod _{i=0}^{N-1} \hat{p}_{i}\). As shown in Gobet [22], this bias correction brings the convergence rate back to the order \(N^{-1}\), which is the rate of weak convergence for the Euler–Maruyama scheme in the absence of killing. Moreover, in this specific Black–Scholes implementation, the simulation is exact, i.e., no discretisation error occurs, because \(\sigma \) is constant.
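The two benchmark schemes above can be sketched on common random paths as follows. This is a minimal pure-Python illustration and not the code used for the experiments; the function names, the seed handling and the small path count in the usage example are assumptions.

```python
import math
import random
from typing import Tuple


def no_hit_probability(x0: float, x1: float, ell: float, sigma: float, dt: float) -> float:
    """Brownian-bridge probability of staying above ell between two grid dates."""
    if x0 <= ell or x1 <= ell:
        return 0.0
    return 1.0 - math.exp(-2.0 * (x0 - ell) * (x1 - ell) / (sigma**2 * dt))


def euler_down_and_out_put(s0: float, strike: float, barrier: float, sigma: float,
                           T: float, n_steps: int, n_paths: int,
                           seed: int = 0) -> Tuple[float, float]:
    """Return (estimate without correction, estimate with correction) on the same paths."""
    rng = random.Random(seed)
    ell, dt = math.log(barrier), T / n_steps
    plain_sum = corrected_sum = 0.0
    for _ in range(n_paths):
        x, alive, survival = math.log(s0), True, 1.0
        for _ in range(n_steps):
            x_next = x - 0.5 * sigma**2 * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
            if x_next <= ell:  # barrier breached at a grid date
                alive = False
                break
            survival *= no_hit_probability(x, x_next, ell, sigma, dt)
            x = x_next
        if alive:
            payoff = max(strike - math.exp(x), 0.0)
            plain_sum += payoff
            corrected_sum += payoff * survival
    return plain_sum / n_paths, corrected_sum / n_paths


plain, corrected = euler_down_and_out_put(1.0, 1.0, 0.8, 0.2, 1.0, 10, 2000, seed=1)
```

On identical paths the corrected estimate can never exceed the uncorrected one, since each \(\hat{p}_{i}\le 1\); this reflects the upward bias of the scheme that monitors the barrier only at grid dates.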

We next summarise the experiment details and comparison results.

5.2.1 Set of parameters

The numerical experiments are conducted with the parameter values \(S_{0} = 1\), \(T = 1\) year, \(\ell = \log (\underline{s} = 0.8)\), \(r = \infty \) and \(\sigma = 20\%\). For thoroughness, we have considered in-the-money (\(K = 1.2\)), at-the-money (\(K = 1\)) and out-of-the-money (\(K = 0.9\)) options. To reduce statistical noise, the simulations are run with 1’000’000 Monte Carlo paths. The benchmark price is calculated analytically (see e.g. Haug [23, Sect. 4.17.1]).

As our final results do not show any significant dependence on the moneyness of the option, we only report the results for at-the-money (ATM) options. In particular, the discrepancy between benchmark prices and the numerical value for ATM down-and-out put options is shown in Fig. 1. We have not observed any stability issues with any of our \(h\)-transformation schemes. As discussed earlier, the standard Euler scheme with the hitting probability method has no discretisation error. The discrepancy is therefore essentially the statistical noise.

Fig. 1 Absolute discrepancy between the benchmark price for an ATM down-and-out put and those calculated with different numerical schemes when \(S_{0} = 1\), \(K = 1\), \(T = 1\) year, \(\ell = \log (\underline{s}=0.8)\), \(r = \infty \) and \(\sigma = 20\%\)

Our numerical results show a rapid convergence of the numerical approximation of prices given by the recurrent and transient transforms via the implicit scheme and demonstrate clearly its effectiveness over the standard Euler scheme without any hitting probability correction. This confirms the findings of our theoretical analysis even without satisfying all the conditions of Theorem 4.4.

Moreover, the prices given by the recurrent and transient transforms are quite comparable, as predicted by the theoretical analysis. Figure 2 shows the log–log plot of the discrepancy associated with the recurrent and transient transforms for an ATM down-and-out put option. The observed numerical rates of convergence are 0.95 and 0.9, respectively.

Fig. 2 Log–log plot of the absolute discrepancy for an ATM down-and-out put price with recurrent and transient transform numerical schemes when \(S_{0} = 1\), \(K = 1\), \(T = 1\) year, \(\ell = \log (\underline{s}=0.8)\), \(r = \infty \) and \(\sigma = 20\%\)

5.3 Time-homogeneous hyperbolic local volatility model

Empirical asset return distributions tend to exhibit fat tails (kurtosis) and skewness (asymmetric distribution). The skew or smile in implied volatility surfaces observed across various asset classes is a market reality (see e.g. Gatheral [19, Chap. 1], Overhaus et al. [37, Sect. 1.2] or Wilmott [46, Sect. 22.4]) and a manifestation of these stylised facts. We therefore need models more suitable than Gaussian ones for the asset \(S\) in order to reproduce the implied volatility surfaces more closely. Local volatility models, either parametric or non-parametric (see e.g. Dupire [17], Derman and Kani [16] or Rubinstein [41]), arguably capture the surface of implied volatilities more precisely than other approaches such as stochastic volatility models (see e.g. Ren et al. [38] or Romo [40]). Needless to say, the volatility surface has a significant impact on barrier option valuation.

For our analysis, we consider the time-homogeneous hyperbolic local volatility model (HLV) where the dynamics of the spot price under the risk-neutral measure is given by

$$ dX_{t} = \sigma (X_{t}) dW_{t}, \qquad X_{0}=1 $$

with

$$ \sigma (x) = \nu \bigg( \frac{(1-\beta +\beta ^{2})}{\beta} x + \frac{(\beta -1)}{\beta} \Big(\sqrt{x^{2}+\beta ^{2}(1-x)^{2}}-\beta \Big) \bigg). $$
(5.5)

Here \(\nu >0\) is the level of volatility, and \(\beta \in (0,1]\) is the skew parameter. First introduced in Jäckel [31], this model behaves similarly to the constant elasticity of variance (CEV) model and has been widely used in quantitative finance for numerical experiments in Hok et al. [25, 26, 24]. A practical advantage of this model is that zero is not an attainable boundary, which in turn avoids some numerical instabilities present in the CEV model when the underlying asset price is close to zero (see e.g. Andersen and Andreasen [5]). It corresponds to the Black–Scholes model for \(\beta =1\) and exhibits a skew for the implied volatility surface when \(\beta \neq 1\). Figure 3 illustrates the impact of the parameter \(\beta \) on the skew of the volatility surface. We observe that the skew increases significantly with decreasing value of \(\beta \). For example, with \(\nu = 0.3\) and \(\beta = 0.2\), the difference in volatility between strikes at \(50\%\) and at \(100\%\) is about \(15\%\).
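The local volatility function (5.5) is straightforward to implement. The sketch below (with a hypothetical function name) also makes two of its properties easy to verify: \(\sigma (1)=\nu \) for every \(\beta \), and \(\sigma (x)=\nu x\) when \(\beta =1\).

```python
import math


def hyperbolic_local_vol(x: float, nu: float, beta: float) -> float:
    """Hyperbolic local volatility sigma(x) from (5.5), for beta in (0, 1]."""
    linear = (1.0 - beta + beta**2) / beta * x
    hyperbolic = (beta - 1.0) / beta * (math.sqrt(x**2 + beta**2 * (1.0 - x) ** 2) - beta)
    return nu * (linear + hyperbolic)
```

Both properties follow by direct substitution: at \(x=1\) the square root equals 1 and the bracket collapses to 1, while for \(\beta =1\) the second term vanishes.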

Fig. 3 Impact of the value \(\beta \) on the hyperbolic local volatility for fixed volatility level \(\nu = 0.3\)

5.3.1 Down-and-up-out double barrier call option

In this implementation, we set \(h(x) = \frac{(x-\ell )(r-x)}{2(r-\ell )^{2}}\) with the functions \(h^{(1)}(x) = \frac{\ell +r-2x}{2(r-\ell )^{2}}\), \(h^{(2)}(x) = - \frac{1}{(r-\ell )^{2}}\) and \(h^{(3)}(x) = h^{(4)}(x)=0\). Note that with this choice of \(h\), Condition 1) in Assumption 4.2 is not satisfied for \(k = 2\) and any \(p \in [0, \frac{1}{2}]\). The associated BEM scheme is then solved by using a bisection method with Octave vectorisation for faster code execution. Consequently, the price is approximated by

$$ {\mathrm{price}} \approx h(x) \mathbb{E}^{h,x} \bigg[ \frac{(\widehat{X}_{t_{N}} - K)^{+}}{h(\widehat{X}_{t_{N}})} e^{ \frac{1}{2} \frac{T}{N} \sum _{n=0}^{N-1} \sigma ^{2}(\widehat{X}_{t_{n}}) \frac{h''}{h}( \widehat{X}_{t_{n}} ) } \bigg] . $$
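The bisection solve mentioned above can be sketched for a single path as follows. We assume, for illustration only, that the implicit step requires solving \(z - \sigma ^{2}(z)\frac{h'}{h}(z)\frac{T}{N} = y\) for \(z\in (\ell ,r)\), where \(y\) is the explicit part of the step; the exact form of the scheme is the one in (2.4), and this sketch is scalar rather than vectorised. The names `solve_implicit_step` and `sigma_fn` are hypothetical.

```python
from typing import Callable


def log_derivative_h(z: float, ell: float, r: float) -> float:
    """h'(z)/h(z) for h(z) = (z - ell)(r - z) / (2 (r - ell)^2); the constant cancels."""
    return (ell + r - 2.0 * z) / ((z - ell) * (r - z))


def solve_implicit_step(y: float, ell: float, r: float, dt: float,
                        sigma_fn: Callable[[float], float],
                        tol: float = 1e-12) -> float:
    """Bisection for a root z in (ell, r) of z - sigma_fn(z)^2 * h'/h(z) * dt = y.

    The residual tends to -infinity at ell+ and +infinity at r-, so a sign
    change always exists; uniqueness of the root is assumed, not checked.
    """
    def residual(z: float) -> float:
        return z - sigma_fn(z) ** 2 * log_derivative_h(z, ell, r) * dt - y

    lo, hi = ell, r
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid <= lo or mid >= hi:  # floating-point resolution reached
            break
        if residual(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Because the root is bracketed by the barriers themselves, the simulated value remains inside \((\ell ,r)\) even when \(y\) falls outside the interval.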

For comparison, we also compute the numerical price given by the standard Euler scheme (5.4) with series hitting probability correction, where the expression is given by an infinite series in Gobet [21] (see footnote 1) as

$$\begin{aligned} \hat{p}_{i} :=& \mathbb{P}\big[ \widehat{X}_{t} \in (\ell ,r), \forall t \in [t_{i}, t_{i+1}] \big| \widehat{X}_{t_{i}} = x_{i}, \widehat{X}_{t_{i+1}} = x_{i+1}\big] \\ \hphantom{:} =& {\mathbf {1}}_{\{ \ell < x_{i}, x_{i+1} < r\}} \!\!\sum _{n=-\infty}^{ \infty} \!\! \Big( e^{ \frac{-2n (r-\ell )(n(r-\ell )+ x_{i+1} - x_{i})}{\sigma ^{2} (t_{i+1} - t_{i})} } \!- e^{ \frac{-2( n (r-\ell ) + x_{i} -r )( n(r-\ell ) + x_{i+1} - r )}{\sigma ^{2} (t_{i+1} - t_{i})} } \Big) . \end{aligned}$$
(5.6)

Here \(\sigma \) is computed using the parametric local volatility function (5.5). A numerical study of (5.6) suggests that it suffices to calculate the leading two or three terms for most cases. To be conservative, in our tests, the \(\hat{p}_{i}\) are estimated using \(n\) from −5 to 5. Experiment details and comparison results are described below.
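A truncated evaluation of (5.6) can be sketched as below. The default truncation `n_max = 5` mirrors the range \(n=-5,\ldots ,5\) used in our tests; the final clamping to \([0,1]\) and the function name are additions made for this illustration, and `sigma` stands for the local volatility value used on the step.

```python
import math


def double_barrier_no_hit_probability(x0: float, x1: float, ell: float, r: float,
                                      sigma: float, dt: float, n_max: int = 5) -> float:
    """Truncated series (5.6): probability that the bridge stays in (ell, r)."""
    if not (ell < x0 < r and ell < x1 < r):
        return 0.0
    width, var = r - ell, sigma**2 * dt
    total = 0.0
    for n in range(-n_max, n_max + 1):
        shift = n * width
        total += math.exp(-2.0 * shift * (shift + x1 - x0) / var)
        total -= math.exp(-2.0 * (shift + x0 - r) * (shift + x1 - r) / var)
    # Guard against tiny negative or >1 values caused by truncation and rounding.
    return min(max(total, 0.0), 1.0)
```

When the upper barrier is moved far away, only the \(n=0\) and \(n=1\) terms survive and the expression reduces to the single-barrier formula \(1-e^{-2(x_{i}-\ell )(x_{i+1}-\ell )/(\sigma ^{2}(t_{i+1}-t_{i}))}\), which provides a convenient consistency check.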

5.3.2 Set of parameters

The numerical experiments are conducted with the parameter values \(S_{0} = 1\), \(\nu = 20\%\), \(\beta = 0.5\), \(T = 1\) year, \(\underline{s} = 0.85\), \(\overline{s} = 1.25\). For thoroughness, we consider in-the-money (\(K = 0.9\)), at-the-money (\(K = 1\)) and out-of-the-money (\(K = 1.05\)) options. The benchmark prices for each numerical method are computed by the method itself with a very dense time grid and a high number of Monte Carlo paths.

In this case, we observed some differences regarding the moneyness of the option in our numerical results. More precisely, the method performed relatively poorly for the ATM option. For this reason, we report below the results in all three cases and provide an explanation for the seemingly poor performance for the ATM option.

The discrepancies between benchmark prices and numerical methods for ITM, ATM and OTM double barrier call options are shown respectively in Figs. 4–6. Tables 1–3 provide the \(95\%\) confidence intervals associated to each considered numerical method. For the same number of MC paths, i.e., 200’000, the recurrent transform shows tighter confidence intervals, which match the confidence intervals of the benchmark scheme with 1’000’000 paths. Overall, note that the size of the interval is about a few bps. We have not observed any stability issues with the recurrent transform scheme. Interestingly, our recurrent transformation has a much smaller error than the explicit Euler method with a hitting probability correction when the number of discretisations is reasonably large. More importantly, this outperformance is still valid even if the number of Monte Carlo simulations for the explicit Euler method is increased fivefold. Having said that, one should still treat such a conclusion with caution as our benchmark price and hitting probabilities are calculated by applying a truncation and thus subject to error. Nevertheless, the outperformance is still promising as our truncation is no coarser than the common industry practice.
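For reference, a \(95\%\) confidence interval of a Monte Carlo estimator is usually obtained from the normal approximation. The sketch below assumes this standard construction (the text does not spell out how the intervals in the tables were computed), and the function name is hypothetical.

```python
import math
from typing import Sequence, Tuple


def confidence_interval_95(samples: Sequence[float]) -> Tuple[float, float, float]:
    """Return (mean, lower, upper) of the normal-approximation 95% interval."""
    m = len(samples)
    mean = sum(samples) / m
    variance = sum((s - mean) ** 2 for s in samples) / (m - 1)
    half_width = 1.96 * math.sqrt(variance / m)
    return mean, mean - half_width, mean + half_width
```

With this construction the half-width scales like \(1/\sqrt{M}\) in the number \(M\) of paths, so matching with 200’000 paths an interval obtained with 1’000’000 paths requires a per-path variance that is about five times smaller.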

Fig. 4 Absolute discrepancy between the benchmark price and those calculated by different numerical schemes for an ITM double barrier call when \(S_{0} = 1\), \(K = 0.9\), \(\nu = 20\%\), \(\beta = 0.5\), \(T = 1\) year, \(\underline{s} = 0.85\), \(\overline{s} = 1.25\)

Table 1 ITM double barrier call: numerical prices and confidence intervals as functions of the number of time steps. Recurrent transform with 200’000 paths, Euler with series hitting probability with 1’000’000 paths (Euler+ (1M)) and Euler with series hitting probability with 200’000 paths (Euler+ (200K)). Benchmark price = 0.0462
Table 2 ATM double barrier call: numerical prices and confidence intervals as functions of the number of time steps. Recurrent transform with 200’000 paths, Euler with series hitting probability with 1’000’000 paths (Euler+ (1M)) and Euler with series hitting probability with 200’000 paths (Euler+ (200K)). Benchmark price = 0.0193
Table 3 OTM double barrier call: numerical prices and confidence intervals as functions of the number of time steps. Recurrent transform with 200’000 paths, Euler with series hitting probability with 1’000’000 paths (Euler+ (1M)) and Euler with series hitting probability with 200’000 paths (Euler+ (200K)). Benchmark price = 0.0103

Figure 7 shows the log–log plot of the discrepancy associated to the recurrent transform method for ITM, ATM and OTM double barrier call options. The numerical rates of convergence are respectively 0.91, 0.63 and 1, using 200’000 Monte Carlo simulations. Although the rate of convergence for the ATM option is far from the theoretical rate of 1, a closer look at Fig. 5 reveals a clue. Note that the error of the approximation converges very rapidly to zero after a few iterations, and further discretisations do not significantly alter the already very small error term. This indicates that the observed error in this case can be mostly attributed to the statistical noise, and the simple regression to obtain the convergence rate does not work well.
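The regression used to read off a convergence rate, and its sensitivity to a noise floor, can be illustrated as follows. The function name and the numbers in the usage example are synthetic and are not the results reported in this section.

```python
import math
from typing import Sequence


def convergence_rate(n_steps: Sequence[int], errors: Sequence[float]) -> float:
    """Least-squares slope of log(error) against log(N), with the sign flipped."""
    xs = [math.log(n) for n in n_steps]
    ys = [math.log(e) for e in errors]
    x_bar, y_bar = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return -slope


grid = (10, 20, 40, 80)
clean_rate = convergence_rate(grid, [0.5 / n for n in grid])          # pure 1/N error
noisy_rate = convergence_rate(grid, [0.01 / n + 0.001 for n in grid])  # 1/N error plus a floor
```

When a constant floor of the size of the statistical error is added to a genuine \(1/N\) error, the fitted slope drops below 1, which is the mechanism suggested above for the ATM case.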

Fig. 5 Absolute discrepancy between the benchmark price and those calculated by different numerical schemes for an ATM double barrier call when \(S_{0} = 1\), \(K = 1\), \(\nu = 20\%\), \(\beta = 0.5\), \(T = 1\) year, \(\underline{s} = 0.85\), \(\overline{s} = 1.25\)

Fig. 6 Absolute discrepancy between the benchmark price and those calculated by different numerical schemes for an OTM double barrier call when \(S_{0} = 1\), \(K = 1.05\), \(\nu = 20\%\), \(\beta = 0.5\), \(T = 1\) year, \(\underline{s} = 0.85\), \(\overline{s} = 1.25\)

Fig. 7 Log–log plot of the absolute discrepancy for double barrier call prices for ITM (\(K=0.9\)), ATM (\(K=1\)) and OTM (\(K=1.05\)) with the recurrent transform numerical scheme when \(S_{0} = 1\), \(T = 1\) year, \(\underline{s} = 0.85\), \(\overline{s} = 1.25\), \(\nu = 20\%\) and \(\beta = 0.5\)

When we run the same experiment for the Euler scheme with a hitting probability correction with 200’000 Monte Carlo simulations, we observe a similar drop in the performance, and the convergence rates are found to be 0.50, 0.59 and 0.61, respectively. However, the convergence rates for the latter scheme increase to 0.83, 0.83 and 0.77, respectively, when the number of simulations is increased fivefold.

As complements, some variations of barrier levels are considered and their impact on numerical results is quantified:

  • Low barrier \(\underline{s} = 0.8\) and high barrier \(\overline{s} = 1.3\): widening of barrier levels.

  • Low barrier \(\underline{s} = 0.8\) and high barrier \(\overline{s} = 1.15\): tightening of barrier levels.

Here, we increase the number of MC samples from 200’000 to 500’000 for the BEM method and keep the same number of MC paths at 1’000’000 for the standard Euler scheme with series hitting probability. The \(95\%\) confidence intervals for both methods are comparable and of the order 2–3 bps. For each barrier level configuration, we have considered in-the-money (\(K = 0.9\)), at-the-money (\(K = 1\)) and out-of-the-money (\(K = 1.05\)) options. The discrepancies between benchmark prices and numerical methods with respect to the number of time steps are presented in Figs. 8–13. Overall, both methods show comparable convergence results which are in accordance with the theoretical analysis.

Fig. 8 Absolute discrepancy between the benchmark price and those calculated by different numerical schemes for an ITM double barrier call when \(S_{0} = 1\), \(K = 0.9\), \(\nu = 20\%\), \(\beta = 0.5\), \(T = 1\) year, \(\underline{s} = 0.8\), \(\overline{s} = 1.3\)

Fig. 9 Absolute discrepancy between the benchmark price and those calculated by different numerical schemes for an ATM double barrier call when \(S_{0} = 1\), \(K = 1\), \(\nu = 20\%\), \(\beta = 0.5\), \(T = 1\) year, \(\underline{s} = 0.8\), \(\overline{s} = 1.3\)

Fig. 10 Absolute discrepancy between the benchmark price and those calculated by different numerical schemes for an OTM double barrier call when \(S_{0} = 1\), \(K = 1.05\), \(\nu = 20\%\), \(\beta = 0.5\), \(T = 1\) year, \(\underline{s} = 0.8\), \(\overline{s} = 1.3\)

Fig. 11 Absolute discrepancy between the benchmark price and those calculated by different numerical schemes for an ITM double barrier call when \(S_{0} = 1\), \(K = 0.9\), \(\nu = 20\%\), \(\beta = 0.5\), \(T = 1\) year, \(\underline{s} = 0.8\), \(\overline{s} = 1.15\)

Fig. 12 Absolute discrepancy between the benchmark price and those calculated by different numerical schemes for an ATM double barrier call when \(S_{0} = 1\), \(K = 1\), \(\nu = 20\%\), \(\beta = 0.5\), \(T = 1\) year, \(\underline{s} = 0.8\), \(\overline{s} = 1.15\)

Fig. 13 Absolute discrepancy between the benchmark price and those calculated by different numerical schemes for an OTM double barrier call when \(S_{0} = 1\), \(K = 1.05\), \(\nu = 20\%\), \(\beta = 0.5\), \(T = 1\) year, \(\underline{s} = 0.8\), \(\overline{s} = 1.15\)

6 Conclusion

We have introduced a novel backward Euler–Maruyama method to increase the weak convergence rate of approximations in the presence of killing. The numerical experiments confirm our theoretical result that the convergence rate is of the order \(1/N\), where \(N\) is the number of discretisation steps. This corresponds to an order-1 weak convergence rate, which is the best rate that one can achieve (see Gobet [21]).

Moreover, the numerical studies suggest that one does not need a large \(N\) to obtain a sufficiently close approximation, as all numerical studies indicate error terms diminishing very rapidly with a small number of iterations. The numerical experiments also suggest that our method outperforms the Brownian bridge method in certain cases, although such a statement currently does not have any theoretical backing.

We suggest a couple of interesting avenues for future research in addition to the extension of the BEM scheme to higher dimensions as discussed in the introduction:

– The transform method with BEM has the potential to achieve a higher order of weak convergence. Indeed, one possibility of improvement is to combine the BEM scheme with the Romberg extrapolation method (see e.g. Glasserman [20, Sect. 6.2.4]). Some preliminary tests are encouraging, showing that the Romberg method generates significantly lower errors compared to the Euler hitting probability method and BEM. We believe that the weak error can be expanded further to justify the use of the Romberg method, and a higher order weak convergence analysis is currently under investigation.

– Exact simulation of diffusions. Especially in a multidimensional setting, it is a challenge to keep the simulated values of \(X\) in its domain even though it should not touch its boundaries in theory. The implicit scheme considered in this paper is one way out. Exact simulation, introduced in Beskos and Roberts [6] for the one-dimensional case, may be another way to resolve this issue. It involves a rejection-sampling algorithm and, when applicable, returns exact draws from any finite-dimensional distribution of the solution to the SDE. The method has been further extended to multivariate diffusions in Blanchet and Zhang [7], although some open questions remain regarding the speed of convergence of the algorithm. It will be interesting to study exact simulations for the recurrent transformations that may lead to a decrease in computation time by avoiding implicit schemes, especially in higher dimensions.
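Regarding the first avenue, the Romberg extrapolation idea can be sketched in a few lines. The sketch assumes that the estimator admits an error expansion with leading term \(c/N\), which is precisely the expansion that remains to be justified for the BEM scheme; the function name and the toy estimator in the usage example are hypothetical.

```python
from typing import Callable


def romberg_extrapolate(estimator: Callable[[int], float], n_steps: int) -> float:
    """Combine estimates with N and 2N steps to cancel a leading c/N error term."""
    return 2.0 * estimator(2 * n_steps) - estimator(n_steps)


def toy_estimator(n: int) -> float:
    """Synthetic estimator with exact value 1 and errors of order 1/N and 1/N^2."""
    return 1.0 + 0.3 / n + 0.1 / n**2


extrapolated = romberg_extrapolate(toy_estimator, 10)
```

In this toy example the extrapolated value has an error of order \(1/N^{2}\) only. In a Monte Carlo setting the combination also changes the variance of the estimator, which has to be weighed against the gain in bias.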