Convergence of the Euler-Maruyama method for multidimensional SDEs with discontinuous drift and degenerate diffusion coefficient

We prove strong convergence of order $1/4-\epsilon$ for arbitrarily small $\epsilon>0$ of the Euler-Maruyama method for multidimensional stochastic differential equations (SDEs) with discontinuous drift and degenerate diffusion coefficient. The proof is based on estimating the difference between the Euler-Maruyama scheme and another numerical method, which is constructed by applying the Euler-Maruyama scheme to a transformation of the SDE we aim to solve.

In applications, frequently SDEs with less regular coefficients appear. For example in stochastic control theory, whenever the optimal control is of bang-bang type, meaning that the strategy is of the form 1 S (X ) for a measurable set S ⊆ R d , the drift of the controlled dynamical system is discontinuous. Furthermore, there are models which involve only noisy observations of a signal that has to be filtered. After applying filtering theory the diffusion coefficient typically is degenerate in the sense that σ (x) v = 0, for some x, v ∈ R d . This motivates the study of SDEs with these kind of irregularities in the coefficients.
If μ is bounded and measurable, and σ is bounded, Lipschitz, and uniformly nondegenerate, i.e. if there exists a constant c 0 > 0 such that for all x ∈ R d and all v ∈ R d it holds that σ (x) v ≥ c 0 v , Zvonkin [29] and Veretennikov [25,26] prove existence and uniqueness of a solution. Veretennikov [27] extends these results by allowing a part of the diffusion to be degenerate.
In [16] existence and uniqueness of a solution for the case where the drift is discontinuous at a hyperplane, or a special hypersurface and where the diffusion coefficient is degenerate is proven, and in [23] it is shown how these results extend to the nonhomogeneous case.
Currently, research on numerical methods for SDEs with irregular coefficients is highly active. Hutzenthaler et al. [8] introduce the tamed Euler-Maruyama scheme and prove strong order 1/2 convergence for SDEs with continuously differentiable and polynomially growing drift that satisfy a one-sided Lipschitz condition. Sabanis [22] proves strong convergence of the tamed Euler-Maruyama scheme from a different perspective and also considers the case of locally Lipschitz diffusion coefficient. Gyöngy [4] proves almost sure convergence of the Euler-Maruyama scheme for the case where the drift satisfies a monotonicity condition.
Halidias and Kloeden [7] show that the Euler-Maruyama scheme converges strongly for SDEs with a discontinuous monotone drift coefficient. Kohatsu-Higa et al. [13] show weak convergence with rates smaller than 1 of a method where they first regularize the discontinuous drift and then apply the Euler-Maruyama scheme.
Étoré and Martinez [1,2] introduce an exact simulation algorithm for one-dimensional SDEs with a drift coefficient which is discontinuous in one point, but differentiable everywhere else. For one-dimensional SDEs with piecewise Lipschitz drift and possibly degenerate diffusion coefficient, in [14] an existence and uniqueness result is proven, and a numerical method, which is based on applying the Euler-Maruyama scheme to a transformation of (1), is presented. This method converges with strong order 1/2. In [15] a (non-trivial) extension of the method is introduced, which converges with strong order 1/2 also in the multidimensional case. The paper also contains an existence and uniqueness result for the multidimensional setting under more general conditions than, e.g., the ones stated in [16].
The method introduced in [15] is the first numerical method that is proven to converge with positive strong rate for multidimensional SDEs with discontinuous drift and degenerate diffusion coefficient. It requires application of a transformation and its numerical inverse in each step, which makes the method rather slow in practice. Furthermore, the method requires specific inputs about the geometry of the discontinuity of the drift to calculate this transformation. This is a drawback, if, e.g., the method shall be applied for solving control problems, since the control is usually not explicitly known. So a method is preferred that can deal with the discontinuities in the drift automatically.
First results in this direction are contained in a series of papers by Ngo and Taguchi. In [21] they show convergence of order up to 1/4 of the Euler-Maruyama method for multidimensional SDEs with discontinuous bounded drift that satisfies a onesided Lipschitz condition and with Hölder continuous, bounded, and uniformly nondegenerate diffusion coefficient. In [19] they extend this result to cases where the drift is not necessarily one-sided Lipschitz for one-dimensional SDEs, and in [20] they extend the result for one-dimensional SDEs by allowing for discontinuities also in the diffusion coefficient. For many applications, their results fail to be applicable, since they only hold for one-dimensional SDEs and their method of proof relies on uniform non-degeneracy of the diffusion coefficient.
Contrasting the above, there are several delimiting results which state that even equations with infinitely often differentiable coefficients cannot always be solved approximately in finite time, even if the Euler-Maruyama method converges, cf. Hairer et al. [6], Jentzen et al. [10], Müller-Gronbah and Yaroslavtseva [18], Yaroslavtseva [28]. However, there is still a big gap between the assumptions on the coefficients under which convergence with strong convergence rate has been proven and the properties of the coefficients of the equation presented in [6].
In this paper we prove strong convergence of order 1/4− for arbitrarily small > 0 of the Euler-Maruyama method for multidimensional SDEs with discontinuous drift satisfying a piecewise Lipschitz condition and with a degenerate diffusion coefficient. Note that we do not impose a one-sided Lipschitz condition on the drift. So even for SDEs with non-degenerate diffusion coefficient, which do not have a one-sided Lipschitz drift, this result is novel.
Our convergence proof is based on estimating the difference between the Euler-Maruyama scheme and the scheme presented in [15]. Close to the set of discontinuities of the drift, we have no tight estimate of this difference, so we need to study the occupation time of an Itô process with degenerate diffusion coefficient there. Away from the set of discontinuities, it is essential to estimate the probability that during one step the distance between the interpolation of the Euler-Maruyama method and the previous Euler-Maruyama step becomes greater than some threshold. This paper's result is the first one that gives strong convergence and also a strong convergence rate of a fully explicit scheme for multidimensional SDEs with discontinuous drift and degenerate diffusion coefficient, and the first one for multidimensional SDEs with discontinuous drift that does not satisfy a one-sided Lipschitz condition.

Preliminaries
In this section we first state the assumptions on the coefficients of SDE (1), under which the result of this paper is proven, then we study the occupation time of an Itô process close to a hypersurface, and finally we recall the transformation from [15], which is also essential for our proof.

Definitions and assumptions
We want to prove strong convergence of the Euler-Maruyama method for SDEs with discontinuous drift coefficient. Instead of the usual requirement of Lipschitz continuity we only assume that the drift is a piecewise Lipschitz function on the R d .
The intrinsic metric ρ on A is given by the piecewise Lipschitz constant of f .
In this paper will be a fixed C 3 -hypersurface, and we will only consider piecewise Lipschitz functions with exceptional set . In the following, L f denotes the piecewise Lipschitz constant of a function f , if f is piecewise Lipschitz, and it denotes the Lipschitz constant, if f is Lipschitz.
We define the distance d(x, ) between a point x and the hypersurface by d(x, ) := inf{ x − y : y ∈ }, and for every ε > 0 we define ε : Recall that, since ∈ C 3 , for every ξ ∈ there exists an open environment U ⊆ of ξ and a continuously differentiable function n: U −→ R d such that for every ζ ∈ U the vector n(ζ ) has length 1 and is orthogonal to the tangent space of in ζ . On a given connected open subset of the local unit normal vector n is unique up to a factor ±1.
We recall a definition from differential geometry.

Definition 2.4
Let ∈ R d be any set.
1. An environment ε is said to have the unique closest point property, if for every is said to be of positive reach, if there exists ε > 0 such that ε has the unique closest point property. The reach of is the supremum over all such ε if such an ε exists, and 0 otherwise. Now, we give assumptions which are sufficient for the results in [15] to hold and which we need to prove the main result here.

Assumption 2.1
We assume the following for the coefficients of (1): set is a C 3 -hypersurface of positive reach; 4. non-parallelity condition: there exists a constant c 0 > 0 such that σ (ξ) n(ξ ) ≥ c 0 for all ξ ∈ ; 5. the function α: −→ R d defined by is C 3 and all derivatives up to order three are bounded. of irregularities in the coefficients. There are results in the literature, where the authors deal with a non-globally Lipschitz diffusion coefficient, see, e.g., [5], but in contributions where only Hölder continuity is required for σ , usually uniform non-degeneracy is assumed. 3. Assumption 2.1.3 is a geometrical condition which we require in order to locally flatten , i.e. to map to a hyperplane in a regular way. This is crucial in many places in [15] and here, in particular for the proof of Theorem 2.7 below. In addition to that, Assumption 2.1.3 implies that there exists a constant c 1 such that n (ξ ) ≤ c 1 for every ξ ∈ and every orthonormal vector n on , see [15, Lemma 3.10]. 4. Assumption 2.1.4 means that the diffusion coefficient must have a component orthogonal to in all ξ ∈ . This condition is significantly weaker than uniform non-degeneracy, and it is essential: in [16] we give a counterexample for the case where the non-parallelity condition does not hold. Then, even existence of a solution is not guaranteed. 5. Assumption 2.1.5 is a technical condition, which is required for our transformation method to work. Boundedness of α and α is needed for proving the local invertibility of our transform. Existence and boundedness of α and α is required for the multidimensional version of Itô's formula to hold for the transform, see [15]. Moreover, it has been shown in [15,Proposition 3.13] that α is a well-defined function on , i.e. it does not depend on the choice of the normal vector n and, in particular, on its sign.

Example 2.6
Suppose is the finite and disjoint union of orientable compact C 3manifolds. Then is of positive reach by the lemma in [3], and each connected component of separates the R n into two open connected components by the Jordan-Brouwer separation theorem, see [17].
Thus R d \ is the union of finitely many disjoint open connected subsets of Suppose there exist bounded and Lipschitz C 3 -functions μ 1 , . . . , μ n : Then it is readily checked that μ and σ satisfy Assumption 2.1.
In Sect. 4 we present a number of concrete examples which satisfy Assumption 2.1 and we perform numerical tests on the associated SDEs.

Occupation time close to a hypersurface
In this section we study the occupation time of an Itô process close to a C 3hypersurface. In the proof of our main theorem, the Euler-Maruyama approximation X δ in equation (2) will play the role of that Itô process.

Theorem 2.7
Let be a C 3 -hypersurface of positive reach and let ε 0 > 0 be such that the closure of ε 0 has the unique closest point property. Let further X = (X t ) t≥0 be an R d -valued Itô process Then there exists a constant C such that for all 0 < ε < ε 0 /2, For the proof we will construct a one-dimensional Itô process Y with the property that Y is close to 0, if and only if X is close to . For the construction of Y we decompose the path of X into pieces close to and pieces farther away. These pieces are then mapped to R by using a signed distance of X from and pasted together in a continuous way.
A signed distance to is locally given by D( where n is a local unit normal vector. is a scalar multiple of n( p(x)).
For ψ ∈ R with |ψ| small we get such that the directional derivative of D in direction n( p(x)) in x is 1. From this and from (4) it follows that D (x) = n( p(x)) . This also holds for x ∈ by the continuity of D .
The following lemma states that for any continuous curve γ in ε 0 there is a continuous path of unit normal vectors, such that to every point of γ we can assign a signed distance in a continuous way. Proof For ξ ∈ we denote the tangent space to in ξ by tang ξ . Let The set S is nonempty and its elements are bounded by b. Let s 1 := sup S. There exists an open and connected subset U ⊆ such that p(γ (s 1 )) ∈ U , and a unit normal vector n 1 : U −→ R d .
Since n 1 is unique up to a factor ±1, the mapping n 1 • p • γ either coincides with m or −m on (s 1 − η, s). Without loss of generality we may assume that the former is the case. Thus we can extend m continuously to [a, s 1 ] by defining m(t) := n 1 ( p(γ (t))) for all t ∈ (s, s 1 ]. Now, if s 1 was strictly smaller than b, then we could use the same mapping n 1 • p •γ to extend m continuously beyond s 1 , contradicting the definition of s 1 .
We will need the following estimate on the local time of a one-dimensional Itô process.

Lemma 2.10 Let Y = (Y t ) t≥0 be an Itô process with bounded and progressively
The claim is a special case of [19,Lemma 3.2]. We give a proof for the convenience of the reader.
Proof From the Meyer-Tanaka formula [11, Section 3.7, Eq. (7.9)] we have and, using Itô's L 2 -isometry, The claim now follows by applying the Cauchy-Schwarz-inequality and taking the supremum over all y ∈ R.
We are ready to prove the result of this section.
Next we decompose the path of X : let τ 0 := inf{t ≥ 0 : X t ∈ ε 1 }. In particular we have τ 0 = 0, if X 0 ∈ ε 1 . For k ∈ N 0 , define By Lemma 2.9 there exist continuous m k : [τ k , κ k+1 ] −→ R d , with m k (t) = 1 and m k (t)⊥tang p(X t ) for all t ∈ [τ k , κ k+1 ]. Without loss of generality m 0 can be chosen such that m 0 (τ 0 ) (X τ 0 − p(X τ 0 )) ≥ 0. We construct a one-dimensional process Y as follows: where without loss of generality the m k are chosen such that Note that by construction both sides of (5) can only take the values ±λ(ε 1 ).
We have thus constructed a continuous [λ(−ε 1 ), λ(ε 1 )]-valued process Y with the property that the occupation time of Y in an environment of 0 is the same as the occupation time of X in an environment of , i.e. Y ∈ (−λ(ε), λ(ε)), iff X ∈ ε for all 0 < ε < ε 1 .
To show that Y is an Itô process, we want to use Itô's formula. For this we recognize that Y , depending on its proximity to , is either constant or locally of the form Y t = λ(n( p(X t )) (X t − p(X t ))) for a suitable choice of the unit normal vector. Denote D(x) = n( p(x)) (x − p(x)). The function D is locally a signed distance to and D ∈ C 2 . This can be seen by following the proof of [3, Theorem 1]. Hence, we may apply Itô's formula to get By Lemma 2.8 we have D (x) = n( p(x)) , and hence

The transformation
The proof of convergence is based on a transformation that removes the discontinuity from the drift and makes the drift Lipschitz while preserving the Lipschitz property of the diffusion coefficient. A suitable transform is presented in [15]. We recall it here.
where ε 0 > 0 is smaller than the reach of , see Assumption 2.1.3, α is the function defined in Assumption 2.1.5, and with positive constant c and Note that G is piecewise Lipschitz with exceptional set . If c is chosen sufficiently small, see [15,Lemma 3.18], G is invertible by [15,Theorem 3.14]. Furthermore, Itô's formula holds for G and G −1 by [15,Theorem 3.19].
With this we can define a process Z = (Z t ) t≥0 by Z t = G(X t ), which solves the SDE wherẽ From [15,Theorem 3.20] we know thatμ andσ are Lipschitz, and hence the solution to (6) can be approximated with strong order 1/2 using the Euler-Maruyama scheme.

Main result
We are ready to formulate the main result.
In preparation of the proof of the main result, we prove two lemmas.

Examples
We ran simulations for several examples-ones of theoretical interest as well as an example coming from applications. When studying stochastic dynamical systems which include a noisy signal, then filtering this signal leads to a higher dimensional system with a degenerate diffusion coefficient. Stochastic control problems often lead to an optimal control policy which makes the drift of the system discontinuous. Examples are models with incomplete market information in mathematical finance where the rate with which cashflows are paid from a firm value process change systematically when the asset-liability ratio passes a certain threshold which then triggers a rating change.
The class of equations studied here appears frequently in several areas of applied mathematics and the natural sciences.

Step-function
In the first example the drift is the step function μ(x 1 , 1) , and σ ≡ id R 2 . It can easily be checked that these coefficients satisfy Assumption 2.1.
In particular, note that the non-parallelity condition is trivially satisfied, since σ is uniformly non-degenerate. Since μ does not satisfy a one-sided Lipschitz condition, our result is the first one that gives a strong convergence rate of the Euler-Maruyama method for this example.

Discontinuity along the unit circle
In this example the drift has a discontinuity along the unit circle, and the diffusion coefficient is degenerate on the whole of R 2 : Assumption 2.1 largely follows from Example 2.6. The non-parallelity condition is readily verified: x 1 x 2 0 0 for all points (x 1 , x 2 ) that lie on the unit circle, i.e. x 2 1 + x 2 2 = 1.

Dividend maximization under incomplete information
In insurance mathematics, a well-studied problem is the maximization of the expected discounted future dividend payments until the time of ruin of an insurance company, a value which serves as a risk measure. In [24] the problem is studied in a setup that allows for incomplete information about the market. This leads to a joint filtering and stochastic optimal control problem, and after solving the filtering problem, the driving dynamics are high dimensional and have a degenerate diffusion coefficient. This issue is described in more detail in [24]. Solving the stochastic optimal control problem in dimensions higher than three with the usual technique (solving an associated partial differential equation) becomes practically infeasible. Therefore, one has to resort to simulation. The SDE that has to be simulated has the coefficients  corresponding processes stay within this simplex almost surely, see [24]. The function f determines the hypersurface along which the drift is discontinuous: In our simulations we choose d = 5 and f affine linear, but note that we need not restrict ourselves to affine linear f .
We need to check Assumption 2.1: Since x 2 , . . . , x d ∈ [0, 1], μ, σ are bounded, and all first order derivatives of the entries of σ are bounded. Hence, σ is Lipschitz. μ is piecewise Lipschitz, and since f is affine linear, ∈ C 3 . Whether the non-parallelity condition holds depends on the choice of the parameters, but for ours the condition is satisfied. Assumption 2.1.5 can easily be checked. Note that the coefficients can be extended to the whole of R d in a way that they still satisfy our assumptions.

Error estimate
The L 2 -error is estimated by T is the numerical approximation of X T with step size δ (k) ,Ê is an estimator of the mean value using 2 14 paths, andē is a normalizing constant so that err 1 = √ 1/4. Figure 1 shows log 2 of the estimated L 2 -error of the Euler-Maruyama approximation of X T plotted over log 2 δ (k) for the examples presented above. We observe that the theoretical convergence rate is approximately obtained for the example of a step-function and that the other examples converge at a faster rate. In particular, for the examples with degenerate diffusion coefficient, the convergence rate is not worse than for the other example. Even for the step-function example, for sufficiently small step-size the convergence rate seems to be higher than the theoretical one. Hence, it will be an interesting topic for future research to prove sharpness, or find a sharp bound. Even though the proven rate for the Euler-Maruyama method is lower than for the transformation-based method from [15], the calculations are usually faster in practice using the first method, since the simulation of a single path is faster. Table 1 confirms this claim: we observe that computation times are higher by up to two orders of magnitude for the transformation method, while the estimated error is of comparable size.
For completeness, we remark that one can construct examples, where the transformation method is much faster while giving a smaller error. For example, start with prescribing the transform G(x) = x + x|x|φ(10x) and set μ(x) = 1 2 (G −1 ) (G(x)) and σ (x) = (G −1 ) (G(x)). This leads toμ(z) = 0 andσ (z) = 1. Hence, if we use the transformation method with the same G, then Z δ = Z = W and the transformation method gives the estimate G −1 (W ), which is the exact solution.

Conclusion
In this paper we have for the first time proven strong convergence and also a positive strong convergence rate for an explicit method (the Euler-Maruyama method) for multidimensional SDEs with discontinuous drift that has a degenerate diffusion coefficient, or with a discontinuous drift that does not satisfy a one-sided Lipschitz condition, or both. The Euler-Maruyama method has the advantage that it does not need the exact form of the set of discontinuities of the drift as an input, and that in practice, computation of one path is fast in comparison to the second method in the literature that can deal with this class of SDEs. Our numerical experiments suggest that in addition to these advantages, it even seems that the Euler-Maruyama method converges at a higher than the theoretically obtained rate for many examples and it will be a topic of future research to prove sharpness, or find a sharp bound.
Research Grant "Numerical Methods for Stochastic Differential Equations with Irregular Coefficients with Applications in Risk Theory and Mathematical Finance".
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.