Abstract
We prove strong convergence of order \(1/4\epsilon \) for arbitrarily small \(\epsilon >0\) of the Euler–Maruyama method for multidimensional stochastic differential equations (SDEs) with discontinuous drift and degenerate diffusion coefficient. The proof is based on estimating the difference between the Euler–Maruyama scheme and another numerical method, which is constructed by applying the Euler–Maruyama scheme to a transformation of the SDE we aim to solve.
Introduction
We consider timehomogeneous stochastic differential equations (SDEs) of the form
where \(x\in {\mathbb {R}}^d\) is the initial value, \(\mu :{\mathbb {R}}^d \longrightarrow {\mathbb {R}}^d\) is the drift and \(\sigma : {\mathbb {R}}^d \longrightarrow {\mathbb {R}}^{d \times d}\) is the diffusion coefficient.
The Euler–Maruyama approximation with stepsize \(\delta >0\) of the solution to (1) is given by
with \({\underline{s}}=j\delta \) for \(s\in [j \delta , (j+1)\delta )\), \(j=0,\ldots ,(T\delta )/\delta \). In particular, for \(t\in \{j \delta : j=0,\ldots ,(T\delta )/\delta \}\), we have
For \(\mu ,\sigma \) Lipschitz, Itô [9] proved existence and uniqueness of the solution of (1). In this case the Euler–Maruyama method (2) converges with strong order 1 / 2 to the true solution, see [12, Theorem 10.2.2]. Higher order algorithms exist, but require stronger conditions on the coefficients.
In applications, frequently SDEs with less regular coefficients appear. For example in stochastic control theory, whenever the optimal control is of bang–bang type, meaning that the strategy is of the form \({\mathbf {1}}_\mathcal{S}(X)\) for a measurable set \(\mathcal{S}\subseteq {\mathbb {R}}^d\), the drift of the controlled dynamical system is discontinuous. Furthermore, there are models which involve only noisy observations of a signal that has to be filtered. After applying filtering theory the diffusion coefficient typically is degenerate in the sense that \(\Vert \sigma (x)^\top v\Vert =0\), for some \(x,v\in {\mathbb {R}}^d\). This motivates the study of SDEs with these kind of irregularities in the coefficients.
If \(\mu \) is bounded and measurable, and \(\sigma \) is bounded, Lipschitz, and uniformly nondegenerate, i.e. if there exists a constant \(c_0>0\) such that for all \(x\in {\mathbb {R}}^d\) and all \(v\in {\mathbb {R}}^d\) it holds that \(\Vert \sigma (x)^\top v\Vert \ge c_0 \Vert v\Vert \), Zvonkin [29] and Veretennikov [25, 26] prove existence and uniqueness of a solution. Veretennikov [27] extends these results by allowing a part of the diffusion to be degenerate.
In [16] existence and uniqueness of a solution for the case where the drift is discontinuous at a hyperplane, or a special hypersurface and where the diffusion coefficient is degenerate is proven, and in [23] it is shown how these results extend to the nonhomogeneous case.
Currently, research on numerical methods for SDEs with irregular coefficients is highly active. Hutzenthaler et al. [8] introduce the tamed Euler–Maruyama scheme and prove strong order 1 / 2 convergence for SDEs with continuously differentiable and polynomially growing drift that satisfy a onesided Lipschitz condition. Sabanis [22] proves strong convergence of the tamed Euler–Maruyama scheme from a different perspective and also considers the case of locally Lipschitz diffusion coefficient. Gyöngy [4] proves almost sure convergence of the Euler–Maruyama scheme for the case where the drift satisfies a monotonicity condition.
Halidias and Kloeden [7] show that the Euler–Maruyama scheme converges strongly for SDEs with a discontinuous monotone drift coefficient. KohatsuHiga et al. [13] show weak convergence with rates smaller than 1 of a method where they first regularize the discontinuous drift and then apply the Euler–Maruyama scheme. Étoré and Martinez [1, 2] introduce an exact simulation algorithm for onedimensional SDEs with a drift coefficient which is discontinuous in one point, but differentiable everywhere else. For onedimensional SDEs with piecewise Lipschitz drift and possibly degenerate diffusion coefficient, in [14] an existence and uniqueness result is proven, and a numerical method, which is based on applying the Euler–Maruyama scheme to a transformation of (1), is presented. This method converges with strong order 1 / 2. In [15] a (nontrivial) extension of the method is introduced, which converges with strong order 1 / 2 also in the multidimensional case. The paper also contains an existence and uniqueness result for the multidimensional setting under more general conditions than, e.g., the ones stated in [16].
The method introduced in [15] is the first numerical method that is proven to converge with positive strong rate for multidimensional SDEs with discontinuous drift and degenerate diffusion coefficient. It requires application of a transformation and its numerical inverse in each step, which makes the method rather slow in practice. Furthermore, the method requires specific inputs about the geometry of the discontinuity of the drift to calculate this transformation. This is a drawback, if, e.g., the method shall be applied for solving control problems, since the control is usually not explicitly known. So a method is preferred that can deal with the discontinuities in the drift automatically.
First results in this direction are contained in a series of papers by Ngo and Taguchi. In [21] they show convergence of order up to 1 / 4 of the Euler–Maruyama method for multidimensional SDEs with discontinuous bounded drift that satisfies a onesided Lipschitz condition and with Hölder continuous, bounded, and uniformly nondegenerate diffusion coefficient. In [19] they extend this result to cases where the drift is not necessarily onesided Lipschitz for onedimensional SDEs, and in [20] they extend the result for onedimensional SDEs by allowing for discontinuities also in the diffusion coefficient. For many applications, their results fail to be applicable, since they only hold for onedimensional SDEs and their method of proof relies on uniform nondegeneracy of the diffusion coefficient.
Contrasting the above, there are several delimiting results which state that even equations with infinitely often differentiable coefficients cannot always be solved approximately in finite time, even if the Euler–Maruyama method converges, cf. Hairer et al. [6], Jentzen et al. [10], MüllerGronbah and Yaroslavtseva [18], Yaroslavtseva [28]. However, there is still a big gap between the assumptions on the coefficients under which convergence with strong convergence rate has been proven and the properties of the coefficients of the equation presented in [6].
In this paper we prove strong convergence of order \(1/4\epsilon \) for arbitrarily small \(\epsilon >0\) of the Euler–Maruyama method for multidimensional SDEs with discontinuous drift satisfying a piecewise Lipschitz condition and with a degenerate diffusion coefficient. Note that we do not impose a onesided Lipschitz condition on the drift. So even for SDEs with nondegenerate diffusion coefficient, which do not have a onesided Lipschitz drift, this result is novel.
Our convergence proof is based on estimating the difference between the Euler–Maruyama scheme and the scheme presented in [15]. Close to the set of discontinuities of the drift, we have no tight estimate of this difference, so we need to study the occupation time of an Itô process with degenerate diffusion coefficient there. Away from the set of discontinuities, it is essential to estimate the probability that during one step the distance between the interpolation of the Euler–Maruyama method and the previous Euler–Maruyama step becomes greater than some threshold.
This paper’s result is the first one that gives strong convergence and also a strong convergence rate of a fully explicit scheme for multidimensional SDEs with discontinuous drift and degenerate diffusion coefficient, and the first one for multidimensional SDEs with discontinuous drift that does not satisfy a onesided Lipschitz condition.
Preliminaries
In this section we first state the assumptions on the coefficients of SDE (1), under which the result of this paper is proven, then we study the occupation time of an Itô process close to a hypersurface, and finally we recall the transformation from [15], which is also essential for our proof.
Definitions and assumptions
We want to prove strong convergence of the Euler–Maruyama method for SDEs with discontinuous drift coefficient. Instead of the usual requirement of Lipschitz continuity we only assume that the drift is a piecewise Lipschitz function on the \({\mathbb {R}}^d\).
Definition 2.1
([15, Definitions 3.1 and 3.2]) Let \(A\subseteq {\mathbb {R}}^d\).

1.
For a continuous curve \(\gamma :[0,1]\longrightarrow {\mathbb {R}}^d\), let \(\ell (\gamma )\) denote its length,
$$\begin{aligned} \ell (\gamma )=\sup _{n,0\le t_1<\cdots <t_n\le 1}\sum _{k=1}^n \Vert \gamma (t_k)\gamma (t_{k1})\Vert \,. \end{aligned}$$The intrinsic metric \(\rho \) on A is given by
$$\begin{aligned} \rho (x,y):= & {} \inf \{\ell (\gamma ):\gamma :[0,1]\longrightarrow A \text { is a continuous curve satisfying } \gamma (0)\\= & {} x,\, \gamma (1)=y\}\,, \end{aligned}$$where \(\rho (x,y):=\infty \), if there is no continuous curve from x to y.

2.
Let \(f:A\longrightarrow {\mathbb {R}}^m\) be a function. We say that f is intrinsic Lipschitz, if it is Lipschitz w.r.t. the intrinsic metric on A, i.e. if there exists a constant L such that
$$\begin{aligned} \forall x,y\in A: \Vert f(x)f(y)\Vert \le L \rho (x,y)\,. \end{aligned}$$
The prototypical examples for intrinsic Lipschitz function are given, like in the onedimensional case, by differentiable functions with bounded derivative.
Lemma 2.2
([15, Lemma 3.8]) Let \(A\subseteq {\mathbb {R}}^d\) be open and let \(f:A\longrightarrow {\mathbb {R}}^m\) be a differentiable function with \(\Vert f'\Vert <\infty \). Then f is intrinsic Lipschitz with Lipschitz constant \(\Vert f'\Vert \).
Definition 2.3
([15, Definition 3.4]) A function \(f{:}\,{\mathbb {R}}^d\longrightarrow {\mathbb {R}}^m\) is piecewise Lipschitz, if there exists a hypersurface \(\Theta \) with finitely many connected components and with the property, that the restriction \(f_{{\mathbb {R}}^d\backslash \Theta }\) is intrinsic Lipschitz. We call \(\Theta \) an exceptional set for f, and we call
the piecewise Lipschitz constant of f.
In this paper \(\Theta \) will be a fixed \(C^3\)hypersurface, and we will only consider piecewise Lipschitz functions with exceptional set \(\Theta \). In the following, \(L_f\) denotes the piecewise Lipschitz constant of a function f, if f is piecewise Lipschitz, and it denotes the Lipschitz constant, if f is Lipschitz.
We define the distance \(d(x,\Theta )\) between a point x and the hypersurface \(\Theta \) by \(d(x,\Theta ):=\inf \{\Vert xy\Vert :y \in \Theta \}\), and for every \(\varepsilon >0\) we define \(\Theta ^\varepsilon :=\{x\in {\mathbb {R}}^d: d(x,\Theta )<\varepsilon \}\).
Recall that, since \(\Theta \in C^3\), for every \(\xi \in \Theta \) there exists an open environment \(U\subseteq \Theta \) of \(\xi \) and a continuously differentiable function \(n{:}\,U\longrightarrow {\mathbb {R}}^d\) such that for every \(\zeta \in U\) the vector \(n(\zeta )\) has length 1 and is orthogonal to the tangent space of \(\Theta \) in \(\zeta \). On a given connected open subset of \(\Theta \) the local unit normal vector n is unique up to a factor \(\pm 1\).
We recall a definition from differential geometry.
Definition 2.4
Let \(\Theta \in {\mathbb {R}}^d\) be any set.

1.
An environment \(\Theta ^\varepsilon \) is said to have the unique closest point property, if for every \(x\in {\mathbb {R}}^d\) with \(d(x,\Theta )<\varepsilon \) there is a unique \(p\in \Theta \) with \(d(x,\Theta )=\Vert xp\Vert \). Therefore, we can define a mapping \(p{:}\Theta ^{\varepsilon }\longrightarrow \Theta \) assigning to each x the point \(p(x)\) in \(\Theta \) closest to x.

2.
\(\Theta \) is said to be of positive reach, if there exists \(\varepsilon >0\) such that \(\Theta ^\varepsilon \) has the unique closest point property. The reach of \(\Theta \) is the supremum over all such \(\varepsilon \) if such an \(\varepsilon \) exists, and 0 otherwise.
Now, we give assumptions which are sufficient for the results in [15] to hold and which we need to prove the main result here.
Assumption 2.1
We assume the following for the coefficients of (1):

1.
\(\mu \) and \(\sigma \) are bounded;

2.
the diffusion coefficient \(\sigma \) is Lipschitz;

3.
the drift coefficient \(\mu \) is a piecewise Lipschitz function \({\mathbb {R}}^d\longrightarrow {\mathbb {R}}^d\). Its exceptional set \(\Theta \) is a \(C^3\)hypersurface of positive reach;

4.
nonparallelity condition: there exists a constant \(c_0>0\) such that \(\Vert \sigma (\xi )^\top n(\xi )\Vert \ge c_0\) for all \(\xi \in \Theta \);

5.
the function \(\alpha {:} \Theta \longrightarrow {\mathbb {R}}^d\) defined by
$$\begin{aligned} \alpha (\xi ):=\lim _{h\rightarrow 0}\frac{\mu (\xi h n(\xi ))\mu (\xi +hn(\xi ))}{2 \Vert \sigma (\xi )^\top n(\xi )\Vert ^2} \end{aligned}$$(3)is \(C^3\) and all derivatives up to order three are bounded.
Theorem 2.5
([15, Theorem 3.21]) Let Assumption 2.1 hold. Then SDE (1) has a unique strong solution.
Remark on Assumption 2.1:

1.
For existence and uniqueness of a solution to (1), in [15, Theorem 3.21] instead of Assumption 2.1.1 only boundedness in an \(\varepsilon \)environment of \(\Theta \) is needed. However, for the proof of our convergence result we require global boundedness. Note that other results in the literature on numerical methods for SDEs with discontinuous drift also rely on boundedness of the coefficients, cf. [19,20,21].

2.
Assumption 2.1.2 is a technical condition; the focus in this paper is on other types of irregularities in the coefficients. There are results in the literature, where the authors deal with a nonglobally Lipschitz diffusion coefficient, see, e.g., [5], but in contributions where only Hölder continuity is required for \(\sigma \), usually uniform nondegeneracy is assumed.

3.
Assumption 2.1.3 is a geometrical condition which we require in order to locally flatten \(\Theta \), i.e. to map \(\Theta \) to a hyperplane in a regular way. This is crucial in many places in [15] and here, in particular for the proof of Theorem 2.7 below. In addition to that, Assumption 2.1.3 implies that there exists a constant \(c_1\) such that \(\Vert n'(\xi )\Vert \le c_1\) for every \(\xi \in \Theta \) and every orthonormal vector n on \(\Theta \), see [15, Lemma 3.10].

4.
Assumption 2.1.4 means that the diffusion coefficient must have a component orthogonal to \(\Theta \) in all \(\xi \in \Theta \). This condition is significantly weaker than uniform nondegeneracy, and it is essential: in [16] we give a counterexample for the case where the nonparallelity condition does not hold. Then, even existence of a solution is not guaranteed.

5.
Assumption 2.1.5 is a technical condition, which is required for our transformation method to work. Boundedness of \(\alpha \) and \(\alpha '\) is needed for proving the local invertibility of our transform. Existence and boundedness of \(\alpha ''\) and \(\alpha '''\) is required for the multidimensional version of Itô’s formula to hold for the transform, see [15]. Moreover, it has been shown in [15, Proposition 3.13] that \(\alpha \) is a welldefined function on \(\Theta \), i.e. it does not depend on the choice of the normal vector n and, in particular, on its sign.
Example 2.6
Suppose \(\Theta \) is the finite and disjoint union of orientable compact \(C^3\)manifolds. Then \(\Theta \) is of positive reach by the lemma in [3], and each connected component of \(\Theta \) separates the \({\mathbb {R}}^n\) into two open connected components by the Jordan–Brouwer separation theorem, see [17].
Thus \({\mathbb {R}}^d\backslash \Theta \) is the union of finitely many disjoint open connected subsets of \({\mathbb {R}}^d\); we can write \({\mathbb {R}}^d\backslash \Theta =A_1\cup \cdots \cup A_n\).
Suppose there exist bounded and Lipschitz \(C^3\)functions \(\mu _1,\ldots ,\mu _n:{\mathbb {R}}^d\longrightarrow {\mathbb {R}}^d\) such that \(\mu =\sum _{k=1}^n {\mathbf {1}}_{A_k}\mu _k\), and suppose that \(\sigma :{\mathbb {R}}^d\longrightarrow {\mathbb {R}}^{d\times d}\) is bounded, Lipschitz, and \(C^3\) with \(\sigma (\xi )^\top n(\xi )\ne 0\) for every \(\xi \in \Theta \).
Then it is readily checked that \(\mu \) and \(\sigma \) satisfy Assumption 2.1.
In Sect. 4 we present a number of concrete examples which satisfy Assumption 2.1 and we perform numerical tests on the associated SDEs.
Occupation time close to a hypersurface
In this section we study the occupation time of an Itô process close to a \(C^3\)hypersurface. In the proof of our main theorem, the Euler–Maruyama approximation \(X^\delta \) in equation (2) will play the role of that Itô process.
Theorem 2.7
Let \(\Theta \) be a \(C^3\)hypersurface of positive reach and let \({\varepsilon _0}>0\) be such that the closure of \(\Theta ^{\varepsilon _0}\) has the unique closest point property. Let further \(X=(X_t)_{t\ge 0}\) be an \({\mathbb {R}}^d\)valued Itô process
with progressively measurable processes \(A=(A_t)_{t\ge 0}\), \(B=(B_t)_{t\ge 0}\), where \(A\) is \({\mathbb {R}}^d\)valued and \(B\) is \({\mathbb {R}}^{d\times d}\)valued. Let the coefficients \(A,B\) be such that

1.
there exists a constant \(c_{AB}\) such that for almost all \(\omega \in \Omega \) it holds that
$$\begin{aligned} \forall t\in [0,T]: X_t(\omega )\in \Theta ^{\varepsilon _0} \Longrightarrow \max (\Vert A_t(\omega )\Vert ,\Vert B_t(\omega )\Vert )\le c_{AB}\,; \end{aligned}$$ 
2.
there exists a constant \(c_0\) such that for almost all \(\omega \in \Omega \) it holds that
$$\begin{aligned} \forall t\in [0,T]: X_t(\omega )\in \Theta ^{\varepsilon _0} \Longrightarrow n (p(X_t(\omega )) )^\top B_t(\omega )B_t(\omega )^\top n (p(X_t(\omega ) ))\ge c_0\,. \end{aligned}$$
Then there exists a constant C such that for all \(0<\varepsilon <\varepsilon _0/2\),
For the proof we will construct a onedimensional Itô process Y with the property that Y is close to 0, if and only if X is close to \(\Theta \). For the construction of Y we decompose the path of X into pieces close to \(\Theta \) and pieces farther away. These pieces are then mapped to \({\mathbb {R}}\) by using a signed distance of X from \(\Theta \) and pasted together in a continuous way.
A signed distance to \(\Theta \) is locally given by \(D(x):=n(p(x))^\top (xp(x))\), where n is a local unit normal vector.
Lemma 2.8
For all \(x\in \Theta ^{\varepsilon _0}\) it holds that \(D'(x)=n(p(x))^\top \).
Proof
Fix \(x\in \Theta ^{\varepsilon _0}\backslash \Theta \) and consider the function h defined by \(h(b):=\Vert xp(x+b)\Vert ^2\). By definition of the projection map p, h has a minimum in \(b=0\), such that \(h'(0)=0\). Hence from \(h'(b)=2(xp(x+b))^\top p'(x+b)\), we get \((xp(x))^\top p'(x)=0\). This implies \(n(p(x))^\top p'(x)=0\), since \((xp(x))\) is a scalar multiple of n(p(x)).
Using that \(D(x)=a\Vert xp(x)\Vert \) for an \(a\in \{1,1\}\), we compute
For \(\psi \in {\mathbb {R}}\) with \(\psi \) small we get
such that the directional derivative of D in direction n(p(x)) in x is 1. From this and from (4) it follows that \(D'(x)=n(p(x))^\top \). This also holds for \(x\in \Theta \) by the continuity of \(D'\). \(\square \)
The following lemma states that for any continuous curve \(\gamma \) in \(\Theta ^{\varepsilon _0}\) there is a continuous path of unit normal vectors, such that to every point of \(\gamma \) we can assign a signed distance in a continuous way.
Lemma 2.9
Let \(\gamma :[a,b]\longrightarrow \Theta ^{\varepsilon _0}\) be a continuous function. Then there exists \(m:[a,b]\longrightarrow {\mathbb {R}}^d\) such that

1.
m is continuous;

2.
\(\Vert m(t)\Vert =1\) for all \(t\in [a,b]\);

3.
m(t) is orthogonal to \(\Theta \) in the point \(p(\gamma (t))\) for all \(t\in [a,b]\).
Proof
For \(\xi \in \Theta \) we denote the tangent space to \(\Theta \) in \(\xi \) by \({{\text {tang}}}_\xi \). Let
The set \(S\) is nonempty and its elements are bounded by b. Let \(s_1:=\sup S\). There exists an open and connected subset \(U\subseteq \Theta \) such that \(p(\gamma (s_1))\in U\), and a unit normal vector \(n_1:U\longrightarrow {\mathbb {R}}^d\).
Since U is open and \(p\circ \gamma \) is continuous, there exists \(\eta >0\) such that \(p(\gamma ([s_1\eta ,s_1]))\subseteq U\). By the definition of \(s_1\) there exists \(s\in (s_1\eta ,s_1)\) and \(m:[a,s]\longrightarrow {\mathbb {R}}^d\) continuous, with \(\Vert m(t)\Vert =1\) and \(m(t)\bot {{\text {tang}}}_{p(\gamma (t))}\) for all \( t\in [a,s]\).
Since \(n_1\) is unique up to a factor \(\pm 1\), the mapping \(n_1\circ p \circ \gamma \) either coincides with m or \(m\) on \((s_1\eta ,s)\). Without loss of generality we may assume that the former is the case. Thus we can extend m continuously to \([a,s_1]\) by defining \(m(t):= n_1(p(\gamma (t)))\) for all \(t\in (s,s_1]\).
Now, if \(s_1\) was strictly smaller than b, then we could use the same mapping \(n_1\circ p \circ \gamma \) to extend m continuously beyond \(s_1\), contradicting the definition of \(s_1\). \(\square \)
We will need the following estimate on the local time of a onedimensional Itô process.
Lemma 2.10
Let \(Y=(Y_t)_{t\ge 0}\) be an Itô process with bounded and progressively measurable coefficients \(\hat{A}=(\hat{A}_t)_{t\ge 0},\hat{B}=(\hat{B}_t)_{t\ge 0}\).
Then \(\sup _{y\in {\mathbb {R}}}{\mathbb {E}}(L_T^y(Y)) \le \left( 3 T^2\Vert \hat{A}\Vert _\infty ^2 +\frac{3}{2}T \Vert \hat{B}_s\Vert _\infty ^2\right) ^{1/2}\).
The claim is a special case of [19, Lemma 3.2]. We give a proof for the convenience of the reader.
Proof
From the Meyer–Tanaka formula [11, Section 3.7, Eq. (7.9)] we have
Using the inequality \((a+b+c)^2\le 3(a^2+b^2+c^2)\) we get
and, using Itô’s \(L^2\)isometry,
The claim now follows by applying the Cauchy–Schwarzinequality and taking the supremum over all \(y\in {\mathbb {R}}\). \(\square \)
We are ready to prove the result of this section.
Proof of Theorem 2.7
Let \(\varepsilon _1=\varepsilon _0/2\). Define a mapping \(\lambda : {\mathbb {R}}\longrightarrow {\mathbb {R}}\) by
Note that \(\lambda '(0)=1\) and \(\lambda '(\pm \varepsilon _1)=\lambda ''(\pm \varepsilon _1)=0\), so that \(\lambda \in C^2\).
Next we decompose the path of X: let \(\tau _{0}:=\inf \{t\ge 0: X_t\in \Theta ^{\varepsilon _1}\}\). In particular we have \(\tau _0=0\), if \(X_0\in \Theta ^{\varepsilon _1}\). For \(k\in {\mathbb {N}}_0\), define
By Lemma 2.9 there exist continuous \(m_k:[\tau _k,\kappa _{k+1}]\longrightarrow {\mathbb {R}}^d\), with \(\Vert m_k(t)\Vert =1\) and \(m_k(t)\bot {{\text {tang}}}_{p(X_t)}\) for all \(t\in [\tau _k,\kappa _{k+1}]\). Without loss of generality \(m_0\) can be chosen such that \(m_0(\tau _0)^\top (X_{\tau _0}p(X_{\tau _0}) )\ge 0\). We construct a onedimensional process Y as follows:
where without loss of generality the \(m_k\) are chosen such that
Note that by construction both sides of (5) can only take the values \(\pm \lambda (\varepsilon _1)\).
We have thus constructed a continuous \([\lambda (\varepsilon _1),\lambda (\varepsilon _1)]\)valued process Y with the property that the occupation time of Y in an environment of 0 is the same as the occupation time of X in an environment of \(\Theta \), i.e. \(Y\in (\lambda (\varepsilon ),\lambda (\varepsilon ))\), iff \(X\in \Theta ^\varepsilon \) for all \(0<\varepsilon <\varepsilon _1\).
To show that Y is an Itô process, we want to use Itô’s formula. For this we recognize that Y, depending on its proximity to \(\Theta \), is either constant or locally of the form \(Y_t=\lambda (n(p(X_t))^\top (X_tp(X_t)))\) for a suitable choice of the unit normal vector. Denote \(D(x)=n(p(x))^\top (xp(x) )\). The function \(D\) is locally a signed distance to \(\Theta \) and \(D \in C^2\). This can be seen by following the proof of [3, Theorem 1]. Hence, we may apply Itô’s formula to get
By Lemma 2.8 we have \(D'(x)=n(p(x))^\top \), and hence
Since \(\lambda '\) and \(\lambda ''\) are bounded by construction, \(\Vert n(p(x))n(p(x))^\top \Vert =1\), \(\Vert n'\Vert \) is bounded (c.f. the remark on Assumption 2.1.3), and by Assumption 1 of the theorem, the coefficients of Y are uniformly bounded. Therefore \(dY_t=\hat{A}_t dt+ \hat{B}_t dW_t\), with bounded and progressively measurable \(\hat{A},\hat{B}\).
Let \(0<\varepsilon \le \varepsilon _1/2\). For all \(z\le \varepsilon \), we have \(\lambda '(z)\ge \left( \frac{3}{4}\right) ^2\). Thus by Assumption 2 of the theorem,
By the occupation time formula [11, Chapter 3, 7.1 Theorem] for onedimensional continuous semimartingales, we get
\(\square \)
The transformation
The proof of convergence is based on a transformation that removes the discontinuity from the drift and makes the drift Lipschitz while preserving the Lipschitz property of the diffusion coefficient. A suitable transform is presented in [15]. We recall it here.
Define \(G{:}{\mathbb {R}}^d\longrightarrow {\mathbb {R}}^d\),
where \(\varepsilon _0>0\) is smaller than the reach of \(\Theta \), see Assumption 2.1.3, \(\alpha \) is the function defined in Assumption 2.1.5, and
with positive constant c and
Note that \(G''\) is piecewise Lipschitz with exceptional set \(\Theta \).
If c is chosen sufficiently small, see [15, Lemma 3.18], G is invertible by [15, Theorem 3.14]. Furthermore, Itô’s formula holds for G and \(G^{1}\) by [15, Theorem 3.19].
With this we can define a process \(Z=(Z_t)_{t\ge 0}\) by \(Z_t=G(X_t)\), which solves the SDE
where
From [15, Theorem 3.20] we know that \(\tilde{\mu }\) and \(\tilde{\sigma }\) are Lipschitz, and hence the solution to (6) can be approximated with strong order 1 / 2 using the Euler–Maruyama scheme.
Main result
We are ready to formulate the main result.
Theorem 3.1
Let Assumption 2.1 hold. Then the Euler–Maruyama method (2) converges to the solution of SDE (1) with strong order \(1/4\epsilon \) for arbitrarily small \(\epsilon >0\), i.e. there exists a constant C such that for all \(\epsilon >0\) it holds that for sufficiently small step size \(\delta >0\),
In preparation of the proof of the main result, we prove two lemmas.
Lemma 3.2
Let Assumption 2.1.1 hold. Then there exists a constant C such that for sufficiently small step size \(\delta \)
Proof
By the definition of the Euler–Maruyama method (2) we have
\(\square \)
For all \(\delta ,\varepsilon >0\) and all \(j=0,\ldots ,T/\delta 1\), define
Lemma 3.3
Let Assumption 2.1.1 hold. Then there exists a constant C such that for all \(0<\delta \le 1\), all \(\varepsilon >0\), and all \(j=0,\ldots ,T/\delta 1\), it holds that \({\mathbb P}({\Omega _{\delta ,\varepsilon ,j}})\le C\exp (\varepsilon / \Vert \sigma \Vert _\infty \delta ^{1/2})\).
Proof
where we applied Doob’s submartingal inequality, and in the last step used that \(\delta \le 1\). \(\square \)
Now, we are ready to prove our main result.
Proof of Theorem 3.1
Since \(G^{1}\) is Lipschitz by the proof of [15, Theorem 3.20],
with \(Z=G(X)\) as in (6). Let \(Z^\delta \) be the Euler–Maruyama approximation of Z. It holds that
For estimating the first term in (9), recall that by [15, Theorem 3.20], the transformed SDE (6) has Lipschitz coefficients. Since the Euler–Maruyama method converges with strong order 1 / 2 for SDEs with Lipschitz coefficients (see [12, Theorem 10.2.2]), there exists a constant \(C_1>0\) such that for sufficiently small \(\delta >0\),
We now turn to the second term in (9), i.e. we estimate the difference between G applied to the Euler–Maruyama approximation of X and the Euler–Maruyama approximation of Z. Denote, for all \(\tau \in [0,T]\),
With \(\nu (x_1,x_2)=G'(x_1)\mu (x_2)+\frac{1}{2}{\text {tr}}(\sigma (x_2)^\top G''(x_1)\sigma (x_2))\) we have by Itô’s formula,
so that
Applying the Cauchy–Schwarz inequality to the Lebesgue integrals and the ddimensional Burkholder–Davis–Gundy inequality [8, Lemma 3.7] to the Itô integrals, we obtain
For estimating \(E_1\) in (11), we will use that
With this and the definition of \({\Omega ^c_{\delta ,\varepsilon ,j}}\) in (7), we get
By Lemma 3.3, \(\int _0^T {\mathbb P}({\Omega _{\delta ,\varepsilon ,{\underline{s}}/\delta }}) ds\le C_2\exp ( \varepsilon /\Vert \sigma \Vert _\infty \delta ^{1/2} )\), and by Theorem 2.7, \(\int _0^T {\mathbb P}( \{X^\delta _s \in \Theta ^\varepsilon \})ds \le C_3 \varepsilon \), for suitable constants \(C_2,C_3\). In order to minimize the bound on \(E_1\), we choose \(\varepsilon \) such that \(\exp ( \varepsilon /\Vert \sigma \Vert _\infty \delta ^{1/2} )+\varepsilon \) is minimized for \(\delta \) sufficiently small, yielding \(\varepsilon =\Vert \sigma \Vert _\infty \delta ^{1/2} \log (\Vert \sigma \Vert _\infty \delta ^{1/2} )=\Vert \sigma \Vert _\infty \delta ^{1/22 \epsilon } (\delta ^{2 \epsilon } \log (\Vert \sigma \Vert _\infty \delta ^{1/2} ))\) for arbitrarily small \( \epsilon >0\). Hence, with \(C_4=(2 L_{G'}^2 \Vert \mu \Vert _\infty ^2+\frac{1}{2}L_{G''}^2\Vert \sigma \Vert _\infty ^4)T\), \(C_5=(4 \Vert \mu \Vert _\infty ^2 \Vert G'\Vert ^2_\infty + \Vert \sigma \Vert ^4 \Vert G''\Vert _\infty ^2) C_2\), \(C_6=(4 \Vert \mu \Vert _\infty ^2 \Vert G'\Vert ^2_\infty + \Vert \sigma \Vert ^4 \Vert G''\Vert _\infty ^2) C_3\), we get
Thus, with \(C_7=C_4 \Vert \sigma \Vert _\infty ^2 +C_5 \Vert \sigma \Vert _\infty +C_6 \Vert \sigma \Vert _\infty \) and for arbitrarily small fixed \(\epsilon >0\), it holds that for sufficiently small \(\delta \)
For estimating \(E_2\) in (11), we apply Lemma 3.2 to get
For estimating \(E_3,E_4\) in (11), we use that \(\tilde{\mu }, \tilde{\sigma }\) are Lipschitz by [15, Theorem 3.20], to get
Combining the estimates (12), (13), (14), (15) with (11), we get
with \(C_9= 4T L_{\tilde{\mu }}^2+8dL_{\tilde{\sigma }}^2\). Using that \(4 T C_7 \delta ^{1/22\epsilon }+ 8d L_{G'}^2\Vert \sigma \Vert _\infty ^2 C_8 \delta \le C_{10} \delta ^{1/22\epsilon }\) for \(\delta \le 1\), and applying Gronwall’s inequality yields for all \(\tau \in [0,T]\),
Combining (10) and (16) with (9), and the result with (8), finally yields
for a suitably chosen constant C, for arbitrarily small \(\epsilon >0\), and for \(\delta \) sufficiently small. \(\square \)
Examples
We ran simulations for several examples—ones of theoretical interest as well as an example coming from applications.
When studying stochastic dynamical systems which include a noisy signal, then filtering this signal leads to a higher dimensional system with a degenerate diffusion coefficient. Stochastic control problems often lead to an optimal control policy which makes the drift of the system discontinuous. Examples are models with incomplete market information in mathematical finance where the rate with which cashflows are paid from a firm value process change systematically when the assetliability ratio passes a certain threshold which then triggers a rating change.
The class of equations studied here appears frequently in several areas of applied mathematics and the natural sciences.
Stepfunction
In the first example the drift is the step function \(\mu (x_1,x_2)=(3({\mathbf {1}}_{\{x_1 \ge 0\}}{\mathbf {1}}_{\{x_1 < 0\}}),1)^\top \), and \(\sigma \equiv {{\text {id}}_{{\mathbb {R}}^2}}\). It can easily be checked that these coefficients satisfy Assumption 2.1. In particular, note that the nonparallelity condition is trivially satisfied, since \(\sigma \) is uniformly nondegenerate. Since \(\mu \) does not satisfy a onesided Lipschitz condition, our result is the first one that gives a strong convergence rate of the Euler–Maruyama method for this example.
Discontinuity along the unit circle
In this example the drift has a discontinuity along the unit circle, and the diffusion coefficient is degenerate on the whole of \({\mathbb {R}}^2\):
Assumption 2.1 largely follows from Example 2.6. The nonparallelity condition is readily verified:
for all points \((x_1,x_2)\) that lie on the unit circle, i.e. \(x_1^2+x_2^2=1\).
Dividend maximization under incomplete information
In insurance mathematics, a wellstudied problem is the maximization of the expected discounted future dividend payments until the time of ruin of an insurance company, a value which serves as a risk measure. In [24] the problem is studied in a setup that allows for incomplete information about the market. This leads to a joint filtering and stochastic optimal control problem, and after solving the filtering problem, the driving dynamics are high dimensional and have a degenerate diffusion coefficient. This issue is described in more detail in [24]. Solving the stochastic optimal control problem in dimensions higher than three with the usual technique (solving an associated partial differential equation) becomes practically infeasible. Therefore, one has to resort to simulation. The SDE that has to be simulated has the coefficients
where \({\bar{u}},\beta ,(\vartheta _i)_{i=1}^d,(q_{ij})_{i,j=1}^d\) are known constants. The arguments \(x_2,\ldots ,x_d\) are elements of the simplex \(\{(x_2,\ldots ,x_d)\in [0,1]^{d1}:\sum _{j=1}^{d1}x_{j+1}\le 1\}\), and the corresponding processes stay within this simplex almost surely, see [24]. The function \(f\) determines the hypersurface \(\Theta \) along which the drift is discontinuous: \(\Theta =\{(x_1,\ldots ,x_d): x_1=f(x_2,\ldots ,x_d)\}\). In our simulations we choose \(d=5\) and \(f\) affine linear, but note that we need not restrict ourselves to affine linear \(f\).
We need to check Assumption 2.1: Since \(x_2,\ldots ,x_d \in [0,1]\), \(\mu , \sigma \) are bounded, and all first order derivatives of the entries of \(\sigma \) are bounded. Hence, \(\sigma \) is Lipschitz. \(\mu \) is piecewise Lipschitz, and since \(f\) is affine linear, \(\Theta \in C^3\). Whether the nonparallelity condition holds depends on the choice of the parameters, but for ours the condition is satisfied. Assumption 2.1.5 can easily be checked. Note that the coefficients can be extended to the whole of \({\mathbb {R}}^d\) in a way that they still satisfy our assumptions.
Error estimate
The \(L^2\)error is estimated by
where \(X_T^{(k)}\) is the numerical approximation of \(X_T\) with step size \(\delta ^{(k)}\), \(\hat{E}\) is an estimator of the mean value using \(2^{14}\) paths, and \({\bar{e}}\) is a normalizing constant so that \({{\text {err}}}_1=\sqrt{1/4}\).
Figure 1 shows \(\log _2\) of the estimated \(L^2\)error of the Euler–Maruyama approximation of \(X_T\) plotted over \(\log _2 \delta ^{(k)}\) for the examples presented above. We observe that the theoretical convergence rate is approximately obtained for the example of a stepfunction and that the other examples converge at a faster rate. In particular, for the examples with degenerate diffusion coefficient, the convergence rate is not worse than for the other example. Even for the stepfunction example, for sufficiently small stepsize the convergence rate seems to be higher than the theoretical one. Hence, it will be an interesting topic for future research to prove sharpness, or find a sharp bound.
Even though the proven rate for the Euler–Maruyama method is lower than for the transformationbased method from [15], the calculations are usually faster in practice using the first method, since the simulation of a single path is faster. Table 1 confirms this claim: we observe that computation times are higher by up to two orders of magnitude for the transformation method, while the estimated error is of comparable size.
For completeness, we remark that one can construct examples, where the transformation method is much faster while giving a smaller error. For example, start with prescribing the transform \(G(x)=x+xx\phi (10x)\) and set \(\mu (x)=\frac{1}{2}(G^{1})''(G(x))\) and \(\sigma (x)=(G^{1})'(G(x))\). This leads to \(\tilde{\mu }(z)=0\) and \(\tilde{\sigma }(z)=1\). Hence, if we use the transformation method with the same G, then \(Z^\delta =Z=W\) and the transformation method gives the estimate \(G^{1}(W)\), which is the exact solution.
Conclusion
In this paper we have for the first time proven strong convergence and also a positive strong convergence rate for an explicit method (the Euler–Maruyama method) for multidimensional SDEs with discontinuous drift that has a degenerate diffusion coefficient, or with a discontinuous drift that does not satisfy a onesided Lipschitz condition, or both. The Euler–Maruyama method has the advantage that it does not need the exact form of the set of discontinuities of the drift as an input, and that in practice, computation of one path is fast in comparison to the second method in the literature that can deal with this class of SDEs. Our numerical experiments suggest that in addition to these advantages, it even seems that the Euler–Maruyama method converges at a higher than the theoretically obtained rate for many examples and it will be a topic of future research to prove sharpness, or find a sharp bound.
References
Étoré, P., Martinez, M.: Exact simulation for solutions of onedimensional stochastic differential equations involving a local time at zero of the unknown process. Monte Carlo Methods Appl. 19(1), 41–71 (2013)
Étoré, P., Martinez, M.: Exact simulation for solutions of onedimensional stochastic differential equations with discontinuous drift. ESAIM Probab. Stat. 18, 686–702 (2014)
Foote, R.L.: Regularity of the distance function. Proc. Am. Math. Soc. 92(1), 153–155 (1984)
Gyöngy, I.: A note on Euler’s approximation. Potential Anal. 8, 205–216 (1998)
Gyöngy, I., Rásonyi, M.: A note on Euler’s approximation for SDEs with Hölder continuous diffusion coefficients. Stoch. Process. Appl. 121(10), 2189–2200 (2011)
Hairer, M., Hutzenthaler, M., Jentzen, A.: Loss of regularity for Kolmogorov equations. Ann. Probab. 43(2), 468–527 (2015)
Halidias, N., Kloeden, P.E.: A note on the Euler–Maruyama scheme for stochastic differential equations with a discontinuous monotone drift coefficient. BIT Numer. Math. 48(1), 51–59 (2008)
Hutzenthaler, M., Jentzen, A., Kloeden, P.E.: Strong convergence of an explicit numerical method for SDEs with nonglobally Lipschitz continuous coefficients. Ann. Appl. Probab. 22(4), 1611–1641 (2012)
Itô, K.: On stochastic differential equations. Mem. Am. Math. Soc. 4, 1–57 (1951)
Jentzen, A., MüllerGronbach, T., Yaroslavtseva, L.: On stochastic differential equations with arbitrary slow convergence rates for strong approximation. Commun. Math. Sci. 14(6), 1477–1500 (2016)
Karatzas, I., Shreve, S.E.: Brownian motion and stochastic calculus. Graduate texts in mathematics, 2nd edn. Springer, New York (1991)
Kloeden, P.E., Platen, E.: Numerical solutions of stochastic differential equations. Stochastic modelling and applied probability. Springer, Berlin (1992)
KohatsuHiga, A., Lejay, A., Yasuda, K.: Weak approximation errors for stochastic differential equations with nonregular drift. Preprint, Inria, hal00840211 (2013)
Leobacher, G., Szölgyenyi, M.: A numerical method for SDEs with discontinuous drift. BIT Numer. Math. 56(1), 151–162 (2016)
Leobacher, G., Szölgyenyi, M.: (2016) A strong order 1/2 method for multidimensional SDEs with discontinuous drift. Ann. Appl. Probab. (2016). arXiv:1512.02807
Leobacher, G., Szölgyenyi, M., Thonhauser, S.: On the existence of solutions of a class of SDEs with discontinuous drift and singular diffusion. Electron. Commun. Probab. 20(6), 1–14 (2015)
Lima, E.L.: The Jordan–Brouwer separation theorem for smooth hypersurfaces. Am. Math. Mon. 95(1), 39–42 (1988)
MüllerGronbach, T., Yaroslavtseva, L.: On hard quadrature problems for marginal distributions of SDEs with bounded smooth coefficients (2016). arXiv:1603.08686
Ngo, H.L., Taguchi, D.: On the Euler–Maruyama approximation for onedimensional stochastic differential equations with irregular coefficients. (2016). arXiv:1509.06532
Ngo, H.L., Taguchi, D.: Strong convergence for the Euler–Maruyama approximation of stochastic differential equations with discontinuous coefficients. (2016). arXiv:1604.01174
Ngo, H.L., Taguchi, D.: Strong rate of convergence for the Euler–Maruyama approximation of stochastic differential equations with irregular coefficients. Math. Comput. 85(300), 1793–1819 (2016)
Sabanis, S.: A note on tamed Euler approximations. Electron. Commun. Probab. 18(47), 1–10 (2013)
Shardin, A.A., Szölgyenyi, M.: Optimal control of an energy storage facility under a changing economic environment and partial information. Int. J. Theor. Appl. Finance 19(4), 1–27 (2016)
Szölgyenyi, M.: Dividend maximization in a hidden Markov switching model. Stat. Risk Model. 32(3–4), 143–158 (2016)
Veretennikov, A.Y.U.: On strong solutions and explicit formulas for solutions of stochastic integral equations. Math. USSR Sb. 39(3), 387–403 (1981)
Veretennikov, A.Y.U.: On the criteria for existence of a strong solution of a stochastic equation. Theory Probab. Appl. 27(3), 441–449 (1982)
Veretennikov, A.Y.U.: On stochastic equations with degenerate diffusion with respect to some of the variables. Math. USSR Izvestiya 22(1), 173–180 (1984)
Yaroslavtseva, L.: On nonpolynomial lower error bounds for adaptive strong approximation of SDEs. (2016). arXiv:1609.08073
Zvonkin, A.K.: A transformation of the phase space of a diffusion process that removes the drift. Math. USSR Sb. 22(129), 129–149 (1974)
Acknowledgements
Open access funding provided by Vienna University of Economics and Business (WU). The authors thank Andreas Neuenkirch and Lukasz Szpruch for fruitful discussions that helped to improve the estimate in Lemma 3.3 and hence the obtained convergence rate. G. Leobacher is supported by the Austrian Science Fund (FWF): Project F5508N26, which is part of the Special Research Program “QuasiMonte Carlo Methods: Theory and Applications”. A part of this paper was written while G. Leobacher was member of the Department of Financial Mathematics and Applied Number Theory, Johannes Kepler University Linz, 4040 Linz, Austria. M. Szölgyenyi is supported by the Vienna Science and Technology Fund (WWTF): Project MA14031. Furthermore, M. Szölgyenyi is grateful for the AXA Research Grant “Numerical Methods for Stochastic Differential Equations with Irregular Coefficients with Applications in Risk Theory and Mathematical Finance”.
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Leobacher, G., Szölgyenyi, M. Convergence of the Euler–Maruyama method for multidimensional SDEs with discontinuous drift and degenerate diffusion coefficient. Numer. Math. 138, 219–239 (2018). https://doi.org/10.1007/s0021101709039
Received:
Revised:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s0021101709039
Keywords
 Stochastic differential equations
 Discontinuous drift
 Degenerate diffusion
 Euler–Maruyama method
 Strong convergence rate
Mathematics Subject Classification
 Primary 60H10
 65C30
 65C20
 Secondary 65L20