1 Introduction

In this article, we study the existence and uniqueness of strong solutions for classes of McKean–Vlasov SDEs, where the drift exhibits a discontinuity in the spatial component. We also provide time-stepping schemes of Euler–Maruyama type, for which we prove strong convergence with a certain rate.

A McKean–Vlasov equation (introduced in [44, 45]) for a d-dimensional process \(X=(X_t)_{t \in [0,T]}\), with a given finite time-horizon \(T >0\), is an SDE where the underlying coefficients depend on the current state \(X_t\) and, additionally, on the law of \(X_t\). We consider more specifically the one-dimensional equation of the form

$$\begin{aligned} \mathrm {d}X_t =b(X_t, {\mathcal {L}}_{X_t}) \, \mathrm {d}t + \sigma (X_t) \, \mathrm {d}W_t, \quad X_0 = \xi \in L_2^{0}({\mathbb {R}}), \end{aligned}$$
(1.1)

where \(L_2^{0}({\mathbb {R}})\) denotes the space of real-valued, \({\mathcal {F}}_0\)-measurable random variables with finite second moments, \((W_t)_{t \in [0,T]}\) is a one-dimensional standard Brownian motion and \({\mathcal {L}}_{X_t}\) denotes the marginal law of the process X at time \(t \in [0,T]\). In particular, we are concerned with the well-posedness of equations of the form (1.1), where \(b(\cdot ,\mu )\) is discontinuous in zero, and piecewise Lipschitz on the subintervals \((-\infty ,0)\) and \((0,\infty )\). Concerning the measure component of the drift, we will require global Lipschitz continuity with respect to the Wasserstein distance with quadratic cost, denoted by \({\mathcal {W}}_2\) (see below for a precise definition). The diffusion term will be only state dependent and globally Lipschitz continuous. Our setting contrasts with the standard case of globally Lipschitz continuous coefficients, which is well-studied in the literature, both from an analytic and a numerical perspective; see, e.g., [7, 46, 58].

The study of SDEs and McKean–Vlasov equations with discontinuous drift is motivated by such models in biology (see, e.g., [25]) and financial mathematics (see, e.g., Atlas models in equity markets [3, 33] and dividend maximisation problems [59]). Further, in stochastic control a discontinuous control can lead to equations with discontinuous drift (see [57]). In the context of stochastic N-player games, non-smooth cost functions (such as the \(\ell _1\)-regularisation) or constraints on the size of the control process can result in discontinuous controls (bang-bang type optimal controls) and hence will give controlled state dynamics with discontinuous drift, as in [11].

We start our literature review with some key references on standard SDEs with irregular and discontinuous drift, namely [39, 40, 61, 62], and then proceed to discuss some recent articles on McKean–Vlasov SDEs with non-Lipschitz drift. Zvonkin [62] (for one-dimensional SDEs) and Veretennikov [61] (for the multi-dimensional setting) prove the existence of a unique strong solution for an SDE where the drift is assumed to be measurable and bounded, but the diffusion coefficient \(\sigma \) needs to satisfy rather strong assumptions, namely that it is bounded and uniformly elliptic, i.e., there is a \(\lambda >0\) such that for all \(x \in {\mathbb {R}}^{d}\) and all \(v \in {\mathbb {R}}^{d}\), we have \(v^{\top }\sigma (x)\sigma (x)^{\top }v \ge \lambda v^{\top }v\). An interesting addition to the aforementioned results in the case where the diffusion is not uniformly elliptic was established in the one-dimensional case in [39]. The authors assume the drift coefficient to be piecewise Lipschitz and \(\sigma \) to be globally Lipschitz with \(\sigma (\eta ) \ne 0\) for each of finitely many points of discontinuity \(\eta \) of the drift. This condition guarantees that the process does not spend a positive amount of time in the singularity. Under these assumptions, by explicitly constructing a transformation that removes the singularities, the existence of a unique strong solution can be proven and a numerical procedure for solving this class of SDEs can be constructed.

The main contribution of [40] is the extension of the one-dimensional case to the multi-dimensional setting under the assumption of piecewise Lipschitz continuity of the drift. In [40], the authors introduce a meaningful concept of piecewise Lipschitz continuity in higher dimensions, which is based on the notion of the so-called intrinsic metric. As already indicated by the one-dimensional case, there needs to be an intricate connection between the geometry of the set of discontinuities and the diffusion coefficient. We note that the exceptional set of singularities, denoted by \(\varTheta \), is assumed to be a \({\mathcal {C}}^{4}\) hypersurface and for the diffusion part one requires the following: There exists a constant \(C>0\) such that \(| \sigma (\eta )^{\top } n(\eta ) | \ge C\) for all \(\eta \in \varTheta \), where \(n(\eta )\) is orthogonal to the tangent space of \(\varTheta \) in \(\eta \) and \(| n(\eta )| =1\). Under these assumptions (and some additional technical conditions on the coefficients and on the geometry of \(\varTheta \)) the existence of a unique strong solution for multi-dimensional SDEs with piecewise Lipschitz continuous drift can be proven.

Moving on to McKean–Vlasov equations, the existence and uniqueness theory for strong solutions of such SDEs with coefficients of linear growth and Lipschitz type conditions (with respect to the state and the measure component) is well-established (see, e.g., [13, 58]). More general existence/uniqueness results for weak and strong solutions of McKean–Vlasov SDEs can be found in [6, 37, 48]. The article [6] is concerned with the weak and strong existence/uniqueness of one-dimensional equations with additive noise, where the drift is assumed to be measurable, continuous in the measure component with respect to the Monge–Kantorovich metric and further satisfies a linear growth condition. In [37, 48], a d-dimensional setting is considered, where the drift is assumed to be bounded, measurable (and possibly path-dependent) and Lipschitz continuous in the measure component with respect to the total variation distance. The diffusion is non-degenerate and independent of the measure. Under these assumptions (and some technical conditions) weak existence and uniqueness are proven.

For further recent existence and uniqueness results for strong and weak solutions of McKean–Vlasov SDEs, including results concerning standard Lipschitz assumptions on the coefficients, we refer to [15, 20, 21, 27, 30, 42, 55] and the references given therein.

The numerical analysis of SDEs with discontinuous drifts has received significant attention over the last few years, see, e.g., [19, 29, 38–41, 49–52] and the references therein for well-posedness results and for strong and weak convergence rates of numerical schemes.

In particular, in [39] (for the one-dimensional case) and in [40] (for the multi-dimensional setting) the standard strong convergence rate of order 1/2 for a method derived from the Euler–Maruyama scheme was proven. However, the applicability of these schemes is limited as they require the explicit knowledge of a transformation (and its inverse) to map the SDE with discontinuous coefficients into one with Lipschitz continuous coefficients. In [41], an Euler–Maruyama scheme without the aforementioned transformation is introduced (in a multi-dimensional setting). While this scheme is easier to apply, the authors only show a strong convergence rate of order \(1/4 - \varepsilon \) for any \(\varepsilon >0\), imposing also the stronger assumption of boundedness for both coefficients of the underlying SDE. The central idea of [41] is to quantify the probability that a multi-dimensional process is in a small neighbourhood of the set of discontinuities, using an occupation time formula. In the one-dimensional case, with coefficients of linear growth, these techniques were refined in [49] and the expected strong convergence rate of order 1/2 was recovered. Other recent works concerned with the numerical approximation of SDEs with discontinuous drifts include [50, 51], where a higher order scheme and an adaptive time-stepping scheme were introduced, respectively. In [38] a numerical scheme for classical one-dimensional diffusion processes generated by a differential operator involving discontinuous coefficients is presented. As the generator is non-local for McKean–Vlasov equations it seems a challenging problem to use these techniques in our framework.

The simulation of McKean–Vlasov SDEs typically involves two steps: First, at each time t, the true measure \({\mathcal {L}}_{X_t}\) is approximated by the empirical measure

$$\begin{aligned} \mu _t^{{\varvec{X}}^{N}}(\mathrm {d}x) := \frac{1}{N}\sum _{j=1}^{N} \delta _{X_t^{j,N}}(\mathrm {d}x), \end{aligned}$$

where \(\delta _{x}\) denotes the Dirac measure at point x and \(({\varvec{X}}^{N}_t)_{ t \in [0,T]} = (X_t^{1,N}, \ldots , X_t^{N,N})_{t \in [0,T]}^{\top }\), an interacting particle system, is the solution to the \({\mathbb {R}}^{dN}\)-dimensional SDE with components

$$\begin{aligned} \mathrm {d}X_t^{i,N} = b(X_t^{i,N}, \mu _t^{{\varvec{X}}^{N}} ) \, \mathrm {d}t + \sigma (X_t^{i,N}) \, \mathrm {d}W_t^{i}, \quad X_{0}^{i,N} = \xi ^{i}. \end{aligned}$$

Here, \(W^{i} = (W_t^{i})_{t \in [0,T]}\) and \(\xi ^{i}\), for \(i \in \lbrace 1, \ldots , N \rbrace \), are independent Brownian motions (also independent of W) and independent copies of \(\xi \), respectively. In a second step, one needs to introduce a reasonable time-stepping method to discretise the particles \((X_t^{i,N})_{t \in [0,T]}\) over the finite time horizon [0, T]. Numerical schemes for interacting particle systems with Hölder continuous coefficients (in the state variable), and with coefficients satisfying certain monotonicity assumptions (in the state variable) and Lipschitz continuity (in the measure variable), can be found in [4, 5, 36, 54] (and the references cited therein), respectively, where a strong convergence analysis is conducted. In [1, 9], a quantitative \(L_p\)-error analysis in terms of density and cumulative distribution function approximation is presented. The survey [7] discusses several examples and numerical schemes for McKean–Vlasov equations involving singular drifts, e.g., a probabilistic interpretation of the Burgers equation, see also [9], of the 2D-incompressible Navier–Stokes equation (see e.g., [22, 47]) and turbulent flow models [53]. Other examples of McKean–Vlasov equations with singular drifts appear in the Keller–Segel equation [26], the Coulomb gas model [17], the Thomson problem [32], and the Stefan problem [34, 35].
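To make the two approximation steps concrete, the following minimal sketch simulates such an interacting particle system with a plain Euler–Maruyama discretisation. The coefficients (a drift with a single discontinuity in zero, a mean-field interaction through the empirical mean, and constant diffusion) are hypothetical example choices, not taken from the models above.

```python
import numpy as np

def simulate_particles(N=200, M=100, T=1.0, seed=0):
    """Plain Euler-Maruyama for an interacting particle system.

    Illustrative coefficients (assumptions for this example only):
    b(x, mu) = -sign(x) - 1 + (mean(mu) - x),  sigma(x) = 1,
    so the drift has a single discontinuity at x = 0.
    """
    rng = np.random.default_rng(seed)
    h = T / M                       # time-step size
    X = rng.standard_normal(N)      # i.i.d. copies of xi ~ N(0, 1)
    for _ in range(M):
        # the true law L_{X_t} is replaced by the empirical measure of the N particles
        drift = -np.sign(X) - 1.0 + (X.mean() - X)
        X = X + drift * h + np.sqrt(h) * rng.standard_normal(N)
    return X

X_T = simulate_particles()
```

Replacing the true law by the empirical measure is exactly the first approximation step described above; the time loop is the second.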

Our numerical schemes present an original approximation method which, as of now, is restricted to the specific case of a one-dimensional state space and a drift with a single point of discontinuity, but which provides, in this specific framework, a suitable alternative to the more common mollification/cut-off approximation methods.

In this article, we first focus on the decomposable case, namely that

$$\begin{aligned} b(x,\mu ) = b_1(x) + b_2(x,\mu ), \end{aligned}$$

where \(b_1\) is piecewise Lipschitz continuous with a discontinuity in zero, and \(b_2\) satisfies the usual Lipschitz assumptions in both components. This structure allows us to present the main ideas of the analysis to be used later in a more general setting, in particular a transformation of the state variable to remove the discontinuity. In this setting, we prove well-posedness of the McKean–Vlasov equation and the associated particle system. This structure includes the important class of McKean–Vlasov equations of the form

$$\begin{aligned} \mathrm {d}X_t = \left( V(X_t) + \int _{{\mathbb {R}}} \beta (X_t-y) \, {\mathcal {L}}_{X_t}(\mathrm {d}y) \right) \, \mathrm {d}t + \sigma (X_t) \, \mathrm {d}W_t, \quad X_0 = \xi , \end{aligned}$$
(1.2)

where V describes an external potential and \(\beta \) an interaction kernel; see, e.g., [31] and the references cited therein related to mean-field over-damped Langevin equations. These models also embed the class of self-stabilizing diffusions and the McKean–Vlasov model related to the granular media equation.

We then relax the structural assumption of decomposability slightly, in a setting which encompasses the above as a special case, but we still have to require a certain continuity of the measure derivatives at the point of discontinuity. The necessity of this condition arises from the explicit measure (or time) dependence of the employed transformation. A future research direction concerns a setting where the point of discontinuity is time-dependent, or depends on the distribution of the process \((X_t)_{t \in [0,T]}\), which is relevant for the study of further practically important models, e.g., from [33].

Having established the existence of a unique strong solution with bounded moments, we propose two Euler–Maruyama schemes for the particle systems as numerical approximations to the McKean–Vlasov equations. For an Euler–Maruyama scheme applied to the SDE in the transformed state, strong convergence of order 1/2 follows immediately, while for a direct time-discretisation of the particle system without transformation, we are only able to show order 1/9. Numerical tests indicate that this order is in general not sharp. We will discuss the reasons for this gap and possible improvements later.

The main contributions of the present article are as follows. First, we establish the well-posedness of McKean–Vlasov SDEs (with a certain discontinuity) and of their associated particle systems. Techniques from variational calculus on the measure space \({\mathcal {P}}({\mathbb {R}})\) equipped with the Wasserstein distance \({\mathcal {W}}_2\) will be essential in the proofs, due to the possible measure dependence of the transformation applied to the processes as described above. The second central contribution of the present paper is the development of numerical schemes for approximating such McKean–Vlasov SDEs and their associated particle systems. Here, a non-standard strong convergence analysis based on occupation time estimates of the discretised processes in a neighbourhood of the discontinuity will be presented.

The remainder of the paper is organised as follows: In Sect. 2, we collect all preliminary tools and notions needed throughout the paper. The precise problem description and the main results are presented in Sect. 3. Then, Sect. 4 discusses numerical schemes for McKean–Vlasov SDEs with discontinuous drift. We show strong convergence of certain orders with respect to the number of particles and time-steps, respectively. In Sect. 5, we apply our numerical scheme to a model problem arising in neuroscience [25] and to a slight modification of a mean-field game in systemic risk [16, 28].

2 Preliminaries

In the sequel, we will introduce several concepts and notions, which will be needed throughout this article. In addition, we will give a brief introduction to the so-called Lions derivative (abbreviated by L-derivative), which allows us to define a derivative with respect to measures of the space \({\mathcal {P}}_2({\mathbb {R}})\) (see below for a precise definition). Also, we recall the transformation used to cope with drifts having discontinuities in a given finite number of points and first developed in [39]. We give a summary of important properties of this mapping. Note that generic constants used in this article are denoted by \(C>0\). They are independent of the number of particles and number of time-steps, and might change their values from line to line.

2.1 Notions and notation

We start by introducing some notions and fixing the notation.

  • Throughout this article, \((\varOmega ,{\mathcal {F}},({\mathcal {F}}_t)_{t \in [0,T]},{\mathbb {P}})\) will denote a filtered probability space, where \(\mathbb {F}=({\mathcal {F}}_t)_{t \in [0,T]}\) is the natural filtration of W augmented with an independent \(\sigma \)-algebra \({\mathcal {F}}_0\) and \((\varOmega ,{\mathcal {F}},{\mathbb {P}})\) is assumed to be atomless.

  • \(({\mathbb {R}}^d,\left\langle \cdot ,\cdot \right\rangle , |\cdot |)\) represents the d-dimensional (\(d \ge 1\)) Euclidean space. As a matrix-norm, we will use \(\Vert A \Vert := \sup _{v \in {\mathbb {R}}^{d}, |v|=1}|Av|\), for any \(A \in {\mathbb {R}}^{d \times d}\).

  • We use \({\mathcal {P}}({\mathbb {R}})\) to denote the family of all probability measures on \(({\mathbb {R}},{\mathcal {B}}({\mathbb {R}}))\), where \({\mathcal {B}}({\mathbb {R}})\) denotes the Borel \(\sigma \)-field over \({\mathbb {R}}\) and define the subset of probability measures with finite second moment by

    $$\begin{aligned} {\mathcal {P}}_2({\mathbb {R}}):= \Big \{ \mu \in {\mathcal {P}}({\mathbb {R}}) : \ \int _{{\mathbb {R}}} |x|^2 \mu (\mathrm {d} x)<\infty \Big \}. \end{aligned}$$
  • We recall the definition of the standard Wasserstein distance with quadratic cost: For any \(\mu , \nu \in {\mathcal {P}}_2({\mathbb {R}})\), we define

    $$\begin{aligned} {\mathcal {W}}_2(\mu , \nu ) := \left( \inf _{\pi \in \varPi (\mu ,\nu )} \int _{{\mathbb {R}} \times {\mathbb {R}}} |x-y |^2 \pi (\mathrm {d}x,\mathrm {d}y) \right) ^{1/2}, \end{aligned}$$

    where \(\varPi (\mu ,\nu )\) denotes the set of all couplings between \(\mu \) and \(\nu \).

  • For a given \(p \ge 2\), \(L_p^{0}({\mathbb {R}})\) refers to the space of real-valued, \({\mathcal {F}}_0\)-measurable random variables X satisfying \({\mathbb {E}}[|X|^p] < \infty \) and for a terminal time \(T>0\), \({\mathcal {S}}^p([0,T])\) refers to the space of real-valued continuous, \({\mathbb {F}}\)-adapted processes, defined on the interval [0, T], with finite p-th moments, i.e., processes \((X_t)_{t \in [0,T]}\) satisfying \({\mathbb {E}} \left[ \sup _{t \in [0,T]}|X_t|^p \right] < \infty \).
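In one dimension, the Wasserstein distance \({\mathcal {W}}_2\) between two empirical measures with the same number of equally weighted atoms admits a closed form: the optimal coupling is monotone and pairs order statistics, which reduces the infimum over couplings to a sort. A small sketch of this standard fact (not specific to this paper):

```python
import numpy as np

def wasserstein2_empirical(x, y):
    """W2 between two empirical measures, each with N equal atoms.

    In one dimension the optimal coupling pairs order statistics,
    so the infimum over couplings reduces to sorting both samples.
    """
    xs, ys = np.sort(x), np.sort(y)
    return np.sqrt(np.mean((xs - ys) ** 2))

# shifting a sample by a constant moves it in W2 by exactly that constant
x = np.array([0.0, 1.0, 2.0])
assert np.isclose(wasserstein2_empirical(x, x + 3.0), 3.0)
```

This quantile representation is what makes the particle approximations of the measure component cheap to evaluate in the one-dimensional setting considered here.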

We briefly introduce the L-derivative of a functional \(f: {\mathcal {P}}_2({\mathbb {R}}) \rightarrow {\mathbb {R}}\), as it will appear in the proofs presented in the main section. For further information on this concept, we refer to [12] or [10]. Here, we follow the exposition of [14]. We will associate to the function f a lifted function \({\tilde{f}}\), defined by \({\tilde{f}}(X)=f({\mathcal {L}}_X)\), where \({\mathcal {L}}_X\) is the law of X, for \(X \in L_2(\varOmega , {\mathcal {F}},{\mathbb {P}};{\mathbb {R}})\).

This will allow us to introduce L-differentiability as Fréchet differentiability on the lifted space. In particular, a function f on \({\mathcal {P}}_2({\mathbb {R}})\) is said to be L-differentiable at \(\mu _0 \in {\mathcal {P}}_2({\mathbb {R}})\) if there exists a random variable \(X_0 \in L_2(\varOmega , {\mathcal {F}},{\mathbb {P}};{\mathbb {R}})\) with law \(\mu _0\), such that the lifted function \({\tilde{f}}\) is Fréchet differentiable at \(X_0\).

Now, the Riesz representation theorem implies that there is a (\({\mathbb {P}}\)-a.s.) unique \(\varPhi \in L_2(\varOmega , {\mathcal {F}},{\mathbb {P}};{\mathbb {R}})\) with

$$\begin{aligned} {\tilde{f}}(X) = {\tilde{f}}(X_0) + \langle \varPhi , X-X_0 \rangle _{L_2} + o(\Vert X-X_0\Vert _{L_2}), \text { as } \Vert X-X_0\Vert _{L_2} \rightarrow 0, \end{aligned}$$

with the standard inner product and norm on \(L_2(\varOmega , {\mathcal {F}},{\mathbb {P}};{\mathbb {R}})\). If f is L-differentiable for all \(\mu _0 \in {\mathcal {P}}_2({\mathbb {R}})\), then we say that f is L-differentiable.

It is known (see, e.g., [14, Proposition 5.25]) that there exists a Borel measurable function \(\chi : {\mathbb {R}} \rightarrow {\mathbb {R}}\), such that \(\varPhi = \chi (X_0)\) almost surely, and hence

$$\begin{aligned} f({\mathcal {L}}_X) = f({\mathcal {L}}_{X_0}) + {\mathbb {E}}\left\langle \chi (X_0), X -X_0 \right\rangle +o(\Vert X-X_0\Vert _{L_2}). \end{aligned}$$

Note that \(\chi \) only depends on the law of \(X_0\), but not on \(X_0\) itself. We define \(\partial _{\mu }f({\mathcal {L}}_{X_0})(y):=\chi (y)\), \(y \in {\mathbb {R}}\), as the L-derivative of f at \(\mu _0\). If, in addition, for a fixed \(y \in {\mathbb {R}}\), there is a version of the mapping \({\mathcal {P}}_2({\mathbb {R}}) \ni \mu \mapsto \partial _{\mu }f(\mu )(y)\) which is continuously L-differentiable, then the L-derivative of \(\partial _{\mu }f(\cdot )(y): {\mathcal {P}}_2({\mathbb {R}}) \rightarrow {\mathbb {R}}\), is defined as

$$\begin{aligned} \partial ^2_{\mu } f(\mu )(y,y'):= \partial _{\mu }(\partial _{\mu }f)(\cdot )(y)(\mu ,y'), \end{aligned}$$

for \((\mu ,y,y') \in {\mathcal {P}}_2({\mathbb {R}}) \times {\mathbb {R}} \times {\mathbb {R}}\).

We require a definition describing regularity properties of a function \(f: {\mathcal {P}}_2({\mathbb {R}}) \rightarrow {\mathbb {R}}\) in terms of the measure derivative (see [14, 18]).

Definition 2.1

Let \(f: {\mathcal {P}}_2({\mathbb {R}}) \rightarrow {\mathbb {R}}\) be a given functional.

  • We say that f is an element of the class \({\mathcal {C}}^{(1,1)}_{b}\) if f is continuously L-differentiable, if for any \(\mu \) there is a continuous version of the mapping \({\mathbb {R}} \ni y \mapsto \partial _{\mu } f(\mu )(y)\), and if the derivatives

    $$\begin{aligned} \partial _{\mu } f(\mu )(y), \quad \partial _{y} \lbrace \partial _{\mu } f(\mu )(\cdot ) \rbrace (y), \end{aligned}$$

    exist, are bounded and jointly continuous in the variables \((\mu ,y)\) such that \(y \in {\mathrm{Supp}}(\mu )\).

  • We say that f is an element of the class \({\mathcal {C}}^{(2,1)}_{b}\), if it is an element of \({\mathcal {C}}^{(1,1)}_{b}\) and in addition the second order Lions derivative \(\partial ^2_{\mu } f(\mu )(y,y')\) exists, is bounded and is again jointly continuous in the corresponding variables. Also, the joint continuity of all derivatives is here required globally, i.e., for all \((\mu ,y,y')\).

We give the following additional remark, which links the L-derivative of a function of empirical measures to the standard partial derivatives of its empirical projection. For a functional \(f: {\mathcal {P}}_2({\mathbb {R}}) \rightarrow {\mathbb {R}}\), we associate with it the finite-dimensional projection \(f^N: {\mathbb {R}}^N \rightarrow {\mathbb {R}}\) defined as

$$\begin{aligned} f^{N}({\varvec{x}}^{N}):=f\left( \frac{1}{N} \sum _{j=1}^N \delta _{x_j} \right) , \end{aligned}$$

for \({\varvec{x}}^{N}:= (x_1, \ldots , x_N)\). If \(f \in {\mathcal {C}}^{(2,1)}_{b}\), then \(f^{N}\) is twice differentiable (in a classical sense) and

$$\begin{aligned}&\partial _{x_i} f^{N}({\varvec{x}}^{N}) = \frac{1}{N} \partial _{\mu }f\left( \frac{1}{N} \sum _{j=1}^N \delta _{x_j} \right) (x_i), \\&\partial _{x_i} \partial _{x_k} f^{N}({\varvec{x}}^{N}) = \frac{1}{N} \partial _y \partial _{\mu }f\left( \frac{1}{N} \sum _{j=1}^N \delta _{x_j} \right) (x_i) \delta _{i,k} + \frac{1}{N^2} \partial ^2_{\mu }f\left( \frac{1}{N} \sum _{j=1}^N \delta _{x_j} \right) (x_i,x_k), \end{aligned}$$

where \(\delta _{i,k}\) is the Kronecker delta, see, e.g., [14, Proposition 5.35].
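The first of the two identities above can be verified numerically for a simple functional. For \(f(\mu ) = \int x^2 \, \mu (\mathrm {d}x)\) one has \(\partial _{\mu }f(\mu )(y) = 2y\), so the partial derivative of the empirical projection should equal \(2x_i/N\). A minimal check, assuming this choice of f (an example not taken from the text):

```python
import numpy as np

def f_N(x):
    """Empirical projection of f(mu) = int x^2 mu(dx): f_N(x) = mean(x_j^2)."""
    return np.mean(x ** 2)

# For this f, the L-derivative is d_mu f(mu)(y) = 2y, so the identity
# d_{x_i} f_N(x) = (1/N) * 2 x_i can be checked by central differences.
x = np.array([0.5, -1.0, 2.0])
N, i, eps = len(x), 1, 1e-6
e = np.zeros(N); e[i] = eps
fd = (f_N(x + e) - f_N(x - e)) / (2 * eps)        # finite-difference derivative
assert np.isclose(fd, (1 / N) * 2 * x[i], atol=1e-6)
```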

2.2 Properties of the transformation G

In [39], the authors consider one-dimensional SDEs of the form

$$\begin{aligned} \mathrm {d}X_t=b(X_t) \, \mathrm {d}t+\sigma (X_t) \, \mathrm {d}W_t, \quad X_0 = x \in {\mathbb {R}}, \end{aligned}$$

with a piecewise Lipschitz continuous drift coefficient b that is discontinuous in \(K \in {\mathbb {N}}\) points \(\eta _1, \ldots , \eta _K\), and a Lipschitz diffusion coefficient \(\sigma \) that does not vanish in any \(\eta _k\). A mapping \(G:{\mathbb {R}}\rightarrow {\mathbb {R}}\) is defined to transform the SDE into one for \(Z=G(X)\) with globally Lipschitz continuous coefficients. For simplicity, we restrict the discussion to \(K=1\) with \(\eta _1=0\). We define the mapping G by

$$\begin{aligned} G(x) := x + \alpha x |x| \phi \left( \frac{x}{c} \right) , \end{aligned}$$

where

$$\begin{aligned} \phi (x):= {\left\{ \begin{array}{ll} (1-x^2)^3, &{} |x|\le 1, \\ 0, &{} |x|>1, \end{array}\right. } \quad \alpha := \frac{b(0^{-})- b(0^{+})}{2\sigma ^2(0)}, \end{aligned}$$

and c is a constant satisfying \(0< c < 1/|\alpha |\). The choice of \(\alpha \) yields a Lipschitz continuous drift coefficient for the SDE of \(Z=G(X)\), in particular, it removes the discontinuity in 0 from the drift. The restriction on c guarantees that G possesses a global inverse.

It is known from [40] that G satisfies the following properties:

  • G is \({\mathcal {C}}^{1}({\mathbb {R}},{\mathbb {R}})\) with \(0< \inf _{x \in {\mathbb {R}}} G'(x) \le \sup _{x \in {\mathbb {R}}} G'(x) < \infty \). Therefore, G is Lipschitz continuous and has an inverse \(G^{-1}: {\mathbb {R}}\rightarrow {\mathbb {R}}\) that is Lipschitz continuous as well.

  • The derivative \(G'\) is Lipschitz continuous (i.e., also absolutely continuous). In addition, \(G'\) has a bounded Lebesgue density \(G'':{\mathbb {R}}\rightarrow {\mathbb {R}}\), which is Lipschitz continuous on each of the subintervals \((-\infty ,0)\) and \((0,\infty )\). Also, Itô’s formula can still be applied to G and \(G^{-1}\).
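The definition of G and the role of the constraint \(0< c < 1/|\alpha |\) can be made concrete in a few lines of code. The sketch below uses the illustrative values \(b(0^{-})=1\), \(b(0^{+})=-1\), \(\sigma (0)=1\) (so \(\alpha =1\)), which are assumptions for the example only, and checks numerically that \(G'\) stays positive, the property that yields the global inverse:

```python
import numpy as np

def phi(u):
    # bump function: (1 - u^2)^3 on [-1, 1], zero outside
    return np.where(np.abs(u) <= 1, (1 - u**2) ** 3, 0.0)

def G(x, alpha, c):
    """The transform G(x) = x + alpha * x * |x| * phi(x / c)."""
    return x + alpha * x * np.abs(x) * phi(x / c)

# Illustrative choice (not from the paper): b(0-) = 1, b(0+) = -1,
# sigma(0) = 1, hence alpha = (1 - (-1)) / 2 = 1.
alpha = 1.0
c = 0.5 * (1 / abs(alpha))          # any 0 < c < 1/|alpha| works

# Monotonicity check: G' > 0 on a fine grid (central differences),
# which is what guarantees that G has a global inverse.
xs = np.linspace(-2, 2, 4001)
eps = 1e-6
Gp = (G(xs + eps, alpha, c) - G(xs - eps, alpha, c)) / (2 * eps)
assert Gp.min() > 0.0
```

Outside the interval \([-c,c]\) the bump \(\phi (x/c)\) vanishes, so G coincides with the identity there; the modification is purely local around the discontinuity.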

3 Existence and uniqueness results

The following subsections are devoted to proving well-posedness results for certain classes of one-dimensional McKean–Vlasov SDEs with a drift having a discontinuity in zero. In a first step, we study a simple class where the resulting transformation will not depend on the measure. Here, the transformation techniques developed in [39] will allow us to prove existence and uniqueness of a strong solution. The second class of McKean–Vlasov SDEs investigated below has the intrinsic difficulty that the required transformation will depend on the measure (i.e., will be time-dependent). Hence, a fixed-point iteration in the measure component will be required and we need to use techniques from variational calculus on the measure space \({\mathcal {P}}_2({\mathbb {R}})\), in particular an Itô formula for functionals acting on this space.

For each of these classes of McKean–Vlasov SDEs, we will additionally study the well-posedness of their associated interacting particle system. Although they can be considered as N-dimensional classical SDEs, with N denoting the number of particles, the resulting set of discontinuities of the N-dimensional drift cannot be handled by the main results of [40].

Future work is needed to extend the methods developed in this article to a multi-dimensional framework. In particular, it seems that the decomposable case can be generalised when discontinuities of the form discussed in [40] are considered.

3.1 McKean–Vlasov SDEs and interacting particle systems with decomposable drift

For a given terminal time \(T >0\) and given \(p \ge 2\), we consider a one-dimensional McKean–Vlasov SDE of the form

$$\begin{aligned} \mathrm {d} X_t = b(X_t, {\mathcal {L}}_{X_t}) \, \mathrm {d}t + \sigma (X_t) \, \mathrm {d}W_t, \quad \ X_0= \xi \in L_p^{0}({\mathbb {R}}), \end{aligned}$$
(3.1)

where \(b:{\mathbb {R}}\times {\mathcal {P}}_2({\mathbb {R}}) \rightarrow {\mathbb {R}}\) and \(\sigma : {\mathbb {R}}\rightarrow {\mathbb {R}}\) are measurable functions.

In the following, we state the model assumptions which will specify the set-up for this subsection:

H. 1

  1. (1)

    We have \(\sigma (0) \ne 0\) and there exists a constant \(L >0\) such that

    $$\begin{aligned} | \sigma (x) - \sigma (x') | \le L |x-x'| \quad \forall x, x' \in {\mathbb {R}}. \end{aligned}$$
  2. (2)

    The drift is decomposable in the following sense:

    $$\begin{aligned} b(x, \mu ) = b_1(x) + b_2(x,\mu ) \quad \forall x \in {\mathbb {R}}, \ \forall \mu \in {\mathcal {P}}_2({\mathbb {R}}), \end{aligned}$$

    where \(b_1: {\mathbb {R}} \rightarrow {\mathbb {R}}\) is Lipschitz continuous on the subintervals \((-\infty ,0)\) and \((0,\infty )\) and there exists a constant \(L_1>0\) such that

    $$\begin{aligned} | b_2(x,\mu ) - b_2(x',\nu ) | \le L_1 \left( |x-x'| + {\mathcal {W}}_2(\mu ,\nu ) \right) \quad \forall x, x' \in {\mathbb {R}}, \ \forall \mu , \nu \in {\mathcal {P}}_2({\mathbb {R}}). \end{aligned}$$
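As a concrete instance of Assumption (H.1(2)), one may take an interaction drift of convolution type, \(b_2(x,\mu ) = \int \beta (x-y)\, \mu (\mathrm {d}y)\), with a Lipschitz kernel \(\beta \); the choice \(\beta = \tanh \) below is a hypothetical example. Since \(\tanh \) is 1-Lipschitz, the required estimate holds with \(L_1 = 1\), which the sketch spot-checks on empirical measures:

```python
import numpy as np

rng = np.random.default_rng(1)

def b2(x, sample):
    """Interaction drift b2(x, mu) = int tanh(x - y) mu(dy), mu empirical."""
    return np.mean(np.tanh(x - sample))

# tanh is 1-Lipschitz, so |b2(x,mu) - b2(x',nu)| <= |x - x'| + W2(mu, nu);
# spot-check on random inputs, computing W2 of equal-size empirical
# measures via the monotone (sorted) coupling.
for _ in range(100):
    x, xp = rng.normal(size=2)
    mu, nu = rng.normal(size=50), rng.normal(size=50)
    w2 = np.sqrt(np.mean((np.sort(mu) - np.sort(nu)) ** 2))
    assert abs(b2(x, mu) - b2(xp, nu)) <= abs(x - xp) + w2 + 1e-12
```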

We now state the main results of this section:

Proposition 3.1

Let Assumption (H.1) be satisfied, let \(\xi \in L_p^{0}({\mathbb {R}})\) for a given \(p \ge 2\) and assume \(c < 1/|\alpha |\). Then, the McKean–Vlasov SDE defined in (3.1) has a unique strong solution in \({\mathcal {S}}^{p}([0,T])\).

Proof

Transforming the McKean–Vlasov SDE (3.1), employing the transformation \(G: {\mathbb {R}} \rightarrow {\mathbb {R}}\) defined in Sect. 2.2, with

$$\begin{aligned} \alpha =\frac{b(0^{-}, \mu )- b(0^{+},\mu )}{2\sigma ^2(0)} = \frac{b_1(0^{-}) - b_1(0^{+})}{2 \sigma ^2(0)}, \end{aligned}$$

in order to eliminate the discontinuity in zero, yields a McKean–Vlasov SDE with globally Lipschitz continuous coefficients. This can be shown in a similar manner to [39, Theorem 2.5]. Moreover, G has a global inverse due to the choice \(c < 1/|\alpha |\) (see [40, Lemma 2.2]), and Itô’s formula can be applied to \(G^{-1}\), which allows us to deduce the claim. \(\square \)

The interacting particles \((X^{i,N}_t)_{t \in [0,T]}\), \(i \in \lbrace 1, \ldots , N \rbrace \), associated with (3.1) satisfy

$$\begin{aligned} \mathrm {d}X_t^{i,N} = b_1(X_t^{i,N}) \, \mathrm {d}t + b_2(X_t^{i,N}, \mu _t^{{\varvec{X}}^{N}}) \, \mathrm {d}t + \sigma (X_t^{i,N}) \, \mathrm {d}W_t^{i}, \end{aligned}$$
(3.2)

where \((\xi ^{i},W^{i})\), for \(i \in \lbrace 1, \ldots , N \rbrace \), are independent copies of \((\xi ,W)\).

Proposition 3.2

Let Assumption (H.1) be satisfied, let \(\xi \in L_p^{0}({\mathbb {R}})\) for a given \(p \ge 2\) and assume \(c < 1/|\alpha |\). Then, the interacting particle system defined in (3.2) has a unique strong solution in \({\mathcal {S}}^{p}([0,T])\).

Proof

In contrast to the set-up in [40], the set of discontinuities, denoted by \(\varTheta \), is not a differentiable manifold, but has the form

$$\begin{aligned} \varTheta =\{(x_1,\dots ,x_N)^{\top } \in {\mathbb {R}}^N:\exists j\in \{1,\dots ,N\}:x_j=0\}. \end{aligned}$$

However, we may define \({\varvec{G}}_N: {\mathbb {R}}^{N} \rightarrow {\mathbb {R}}^{N}\) by

$$\begin{aligned} {\varvec{G}}_N({\varvec{x}}^N) := (G(x_1), \ldots , G(x_N))^{\top }, \end{aligned}$$

where G is as in Proposition 3.1 and \({\varvec{x}}^N=(x_1, \ldots , x_N)^{\top }\), which allows us to transform the particle system \(({\varvec{X}}_t^{N})_{t \in [0,T]} = (X^{1,N}_t, \ldots , X^{N,N}_t)^{\top }_{t \in [0,T]}\) into a new particle system with globally Lipschitz continuous coefficients. Now, \({\varvec{G}}_N\) has a global inverse, as the mapping G has a global inverse, due to the choice of c (see Sect. 2.2). Therefore, applying Itô’s formula to the inverse allows us to deduce the claim. \(\square \)

3.2 McKean–Vlasov SDE with non-decomposable drift

Here, we consider again a one-dimensional McKean–Vlasov SDE of the form (3.1),

$$\begin{aligned} \mathrm {d} X_t = b(X_t, {\mathcal {L}}_{X_t}) \, \mathrm {d}t + \sigma (X_t) \, \mathrm {d}W_t, \quad \ X_0= \xi \in L_p^{0}({\mathbb {R}}). \end{aligned}$$
(3.3)

However, in contrast to the above setting, we will not assume that b can be decomposed into two parts as in Assumption (H.1(2)) from the previous section, and therefore the transformation will also depend on the measure. To be precise, for any \((x,\mu ) \in {\mathbb {R}}\times {\mathcal {P}}_2({\mathbb {R}})\), we define

$$\begin{aligned} G(x,\mu ) := x + \alpha (\mu ) x |x| \phi \left( \frac{x}{c} \right) , \end{aligned}$$
(3.4)

where

$$\begin{aligned}&\phi (x):= {\left\{ \begin{array}{ll} (1-x^2)^3, &{} |x|\le 1, \\ 0, &{} |x|>1, \end{array}\right. } \qquad \alpha (\mu ) := \frac{b(0^{-},\mu )- b(0^{+},\mu )}{2\sigma ^2(0)}, \end{aligned}$$
(3.5)

and \(c>0\) is a constant small enough to guarantee the invertibility of G. When we speak of an ‘inverse’ of \(G(x,\mu )\), we mean ‘inverse with respect to x’, i.e., the inverse is a function \(G^{-1}:{\mathbb {R}}\times {\mathcal {P}}_2({\mathbb {R}}) \rightarrow {\mathbb {R}}\) which satisfies \(G^{-1} \big (G(x,\mu ), \mu \big )=x\) for all \((x,\mu ) \in {\mathbb {R}}\times {\mathcal {P}}_2({\mathbb {R}})\) and \(G\big (G^{-1}(z,\mu ), \mu \big )=z\) for all \((z,\mu )\in {\mathbb {R}}\times {\mathcal {P}}_2({\mathbb {R}})\). For a given flow of measures \((\mu _t)_{t \in [0,T]} \in {\mathcal {C}}([0,T],{\mathcal {P}}_2({\mathbb {R}}))\), G may also be viewed as a mapping \(G :{\mathbb {R}}\times [0,T] \rightarrow {\mathbb {R}}\).
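Since \(\alpha \) now depends on \(\mu \), the transformation changes along the measure flow, and its inverse with respect to x has no closed form; numerically it can be obtained by bisection, exploiting the strict monotonicity of \(G(\cdot ,\mu )\) for \(0< c < 1/|\alpha (\mu )|\). A minimal sketch with an illustrative scalar value of \(\alpha (\mu )\) (an assumption, not derived from a specific drift):

```python
import numpy as np

def phi(u):
    return np.where(np.abs(u) <= 1, (1 - u**2) ** 3, 0.0)

def G(x, alpha_mu, c):
    """Measure-dependent transform; alpha_mu = alpha(mu) is a scalar."""
    return x + alpha_mu * x * np.abs(x) * phi(x / c)

def G_inverse(z, alpha_mu, c, lo=-1e6, hi=1e6, tol=1e-12):
    """Invert G(., mu) in x by bisection; valid because G(., mu) is
    strictly increasing for 0 < c < 1/|alpha(mu)|."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if G(mid, alpha_mu, c) < z:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

alpha_mu, c = -0.8, 1.0       # illustrative values with c < 1/|alpha(mu)| = 1.25
z = G(0.3, alpha_mu, c)
assert abs(G_inverse(z, alpha_mu, c) - 0.3) < 1e-8
```

In an actual scheme, \(\alpha (\mu )\) would be re-evaluated from the (empirical) measure at each time step, so the bisection is performed against the current transform.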

In the following, we state the model assumptions which will specify the set-up for this subsection:

H.2

Assumption (H.1(1)) is satisfied and we require:

  (1)

    There exist constants \(L, L_1>0\) such that

    $$\begin{aligned}&\sup _{x \ne 0} \frac{|b(x,\mu )|}{1+|x|} \le L, \quad | b(x,\mu ) - b(x,\nu ) | \le L_1 {\mathcal {W}}_2(\mu ,\nu ) \quad \forall x \in {\mathbb {R}}\setminus \lbrace 0 \rbrace ,\\ {}&\forall \mu , \nu \in {\mathcal {P}}_2({\mathbb {R}}). \end{aligned}$$

    Additionally, for all \(\mu \in {\mathcal {P}}_2({\mathbb {R}})\), \({\mathbb {R}}\ni x \mapsto b(x, \mu )\) is Lipschitz continuous on the subintervals \((-\infty ,0)\) and \((0,\infty )\), uniformly with respect to \(\mu \).

  (2)

    \(\alpha \in {\mathcal {C}}^{(1,1)}_{b}\), and the mapping \({\mathcal {P}}_2({\mathbb {R}}) \times {\mathbb {R}}\ni (\mu , y) \mapsto \partial _y \partial _{\mu } \alpha (\mu )(y)\) is Lipschitz continuous, that is, there exists a constant \(L_2>0\) such that

    $$\begin{aligned}&| \partial _y \partial _{\mu } \alpha (\mu )(y) - \partial _y \partial _{\mu } \alpha (\nu )(y')| \le L_2 \left( |y-y'| + {\mathcal {W}}_2(\mu ,\nu ) \right) \quad \forall y,y' \in {\mathbb {R}}, \\ {}&\forall \mu , \nu \in {\mathcal {P}}_2({\mathbb {R}}). \end{aligned}$$
  (3)

    For any \(\mu \in {\mathcal {P}}_2({\mathbb {R}})\), the mapping \({\mathbb {R}}\ni y \mapsto \partial _{\mu } \alpha (\mu )(y)\) vanishes at zero, i.e.,

    $$\begin{aligned}&\partial _{\mu } \alpha (\mu )(0) = \partial _{\mu } b(0^{-}, \mu )(0)- \partial _{\mu }b(0^{+},\mu )(0) = 0. \end{aligned}$$
    (3.6)
  (4)

    The mapping \({\mathcal {P}}_2({\mathbb {R}}) \times {\mathbb {R}}\ni (\mu ,y) \mapsto \partial _{\mu }\alpha (\mu )(y) b(y,\mu )\) is Lipschitz continuous.

Remark 3.1

The requirement in (H.2(2)) that \(\alpha \in {\mathcal {C}}^{(1,1)}_{b}\) is needed to apply an Itô formula for \(\alpha \) (see [14, Proposition 5.102]).

Remark 3.2

Note that (H.2(4)) could also be replaced by the following alternative set of assumptions: On each of the two subintervals \((-\infty ,0)\) and \((0,\infty )\), \(b(\cdot ,\mu )\) is a \({\mathcal {C}}^{1}\) function with bounded derivative and, additionally, for any \(\mu \in {\mathcal {P}}_2({\mathbb {R}})\), the mapping \({\mathbb {R}}\ni y \mapsto \partial _y\partial _{\mu } \alpha (\mu )(y)\) vanishes at zero, i.e.,

$$\begin{aligned} \partial _y \partial _{\mu } \alpha (\mu )(0) = \partial _y\partial _{\mu } b(0^{-}, \mu )(0)- \partial _y\partial _{\mu }b(0^{+},\mu )(0) = 0. \end{aligned}$$

The following proposition shows the Lipschitz continuity of the mapping \({\mathbb {R}}\times {\mathcal {P}}_2({\mathbb {R}}) \ni (x,\mu ) \mapsto G^{-1}(x,\mu )\).

Proposition 3.3

Let the function G be defined as in (3.4) and (3.5) with \(c < 1/ \sup _{\mu \in {\mathcal {P}}_2({\mathbb {R}})}|\alpha (\mu )|\), and let Assumption (H.2(1)) be satisfied. Then, there exists a constant L(c), depending on c and the model parameters in Assumption (H.2(1)), such that \(L(c) \rightarrow 0\) as \(c \rightarrow 0\) and, for any \(x,y \in {\mathbb {R}}\) and \(\mu , \nu \in {\mathcal {P}}_2({\mathbb {R}})\),

$$\begin{aligned} |G^{-1}(x,\mu ) - G^{-1}(y,\nu )| \le 2|x-y| + L(c){\mathcal {W}}_2(\mu ,\nu ). \end{aligned}$$

Proof

First, we note that by differentiating (3.4), we get for \(x \in [-c,c]\)

$$\begin{aligned} \partial _x G(x,\mu )&= 1 - \frac{6\alpha (\mu )}{c^2} |x| x^2 \left( 1-(x/c)^2 \right) ^2 + 2\alpha (\mu )|x|(1-(x/c)^2)^3\\&= 1 - 2c \alpha (\mu ) \frac{|x|}{c}\big (4x^2/c^2 - 1\big )\left( 1-(x/c)^2 \right) ^2\,. \end{aligned}$$

It is easy to verify that

$$\begin{aligned} \sup _{x\in [0,c]}\Big |\frac{|x|}{c}\big (4x^2/c^2 - 1\big )\left( 1-(x/c)^2 \right) ^2\Big | =\sup _{z\in [0,1]}\Big ||z|\big (4z^2 - 1\big )\left( 1-z^2 \right) ^2\Big |<\frac{1}{4}\,, \end{aligned}$$

which implies that for all \(x\in {\mathbb {R}}\) and \(\mu \in {\mathcal {P}}_2({\mathbb {R}})\)

$$\begin{aligned} \big |\partial _x G(x,\mu )-1\big | <\frac{c}{2} \sup _{\mu \in {\mathcal {P}}_2({\mathbb {R}})}|\alpha (\mu )|\,. \end{aligned}$$

In particular, if \(c < 1/\sup _{\mu \in {\mathcal {P}}_2({\mathbb {R}})} |\alpha (\mu )|\), then for all \(x,y\in {\mathbb {R}}\) and \(\mu \in {\mathcal {P}}_2({\mathbb {R}})\), we have

$$\begin{aligned} \partial _x G(x,\mu )>\frac{1}{2}, \quad \big |G^{-1}(x,\mu )-G^{-1}(y,\mu )\big |<2|x-y|\,. \end{aligned}$$
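The two elementary bounds above, the supremum being below 1/4 and the resulting lower bound \(\partial _x G > 1/2\), can be sanity-checked on a grid; the values \(\alpha \equiv 0.9\) and \(c=1\) below are assumed toy choices with \(c\,|\alpha | < 1\).

```python
# Grid-based sanity check (not a proof) of two bounds from the proof of
# Proposition 3.3, for the assumed toy choice alpha = 0.9, c = 1.
ALPHA, C = 0.9, 1.0

def h(z):
    # integrand of the supremum in the proof: |z (4z^2 - 1) (1 - z^2)^2|
    return abs(z * (4.0 * z * z - 1.0) * (1.0 - z * z) ** 2)

def dG_dx(x):
    # closed form of dG/dx derived in the proof; equals 1 outside [-c, c]
    u = x / C
    if abs(u) > 1.0:
        return 1.0
    return 1.0 - 2.0 * C * ALPHA * abs(u) * (4.0 * u * u - 1.0) * (1.0 - u * u) ** 2

grid_max = max(h(k / 100000.0) for k in range(100001))
grid_min_deriv = min(dG_dx(-C + 2.0 * C * k / 100000.0) for k in range(100001))

assert grid_max < 0.25       # sup_{z in [0,1]} |z(4z^2-1)(1-z^2)^2| < 1/4
assert grid_min_deriv > 0.5  # hence dG/dx > 1/2 when c < 1/|alpha|
```

The grid maximum is roughly 0.18, comfortably below the stated bound of 1/4.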

That \( \mu \mapsto \partial _xG(x,\mu )\) is Lipschitz continuous is a consequence of (H.2(1)). It is easy to show that the mapping \( x \mapsto \partial _xG(x,\mu )\) is also Lipschitz continuous. Denote the Lipschitz constant of \(\partial _xG\) with respect to the first and second argument by \(L_x\) and \(L_{\mu }\), respectively. Writing

$$\begin{aligned} G^{-1}(x,\mu ) = \int _{0}^{x} \frac{1}{\partial _xG(G^{-1}(y,\mu ),\mu )} \, \mathrm {d}y, \end{aligned}$$

we obtain

$$\begin{aligned}&|G^{-1}(x,\mu ) - G^{-1}(x,\nu )| \\&\quad \le \int _{0}^{x} \left| \frac{1}{\partial _xG(G^{-1}(y,\mu ),\mu )} - \frac{1}{\partial _xG(G^{-1}(y,\nu ),\mu )} \right| \, \mathrm {d}y \\&\qquad + \int _{0}^{x} \left| \frac{1}{\partial _xG(G^{-1}(y,\nu ),\mu )} - \frac{1}{\partial _xG(G^{-1}(y,\nu ),\nu )} \right| \, \mathrm {d}y \\&\quad \le \int _{0}^{x} \left| \frac{\partial _xG(G^{-1}(y,\nu ),\mu )-\partial _xG(G^{-1}(y,\mu ),\mu )}{\partial _xG(G^{-1}(y,\mu ),\mu )\partial _xG(G^{-1}(y,\nu ),\mu )} \right| \, \mathrm {d}y\\&\qquad + \int _{0}^{x} \left| \frac{\partial _xG(G^{-1}(y,\nu ),\nu )-\partial _xG(G^{-1}(y,\nu ),\mu )}{\partial _xG(G^{-1}(y,\nu ),\mu )\partial _xG(G^{-1}(y,\nu ),\nu )} \right| \, \mathrm {d}y \\&\quad \le \int _{0}^{x} 4L_x\left| G^{-1}(y,\nu )-G^{-1}(y,\mu ) \right| \, \mathrm {d}y + \int _{0}^{x} 4L_{\mu } {\mathcal {W}}_2(\mu ,\nu ) \, \mathrm {d}y \\&\quad \le 4L_x\int _{0}^{x} \left| G^{-1}(y,\nu )-G^{-1}(y,\mu ) \right| \, \mathrm {d}y + 4L_{\mu }x {\mathcal {W}}_2(\mu ,\nu ) \,. \end{aligned}$$

For \(0 \le x< c\) (the case \(-c< x \le 0\) is analogous), we have

$$\begin{aligned} |G^{-1}(x,\mu ) - G^{-1}(x,\nu )|&\le 4L_x\int _{0}^{x} \left| G^{-1}(y,\nu )-G^{-1}(y,\mu ) \right| \, \mathrm {d}y + 4L_{\mu }x {\mathcal {W}}_2(\mu ,\nu ) \\&\le 4L_x\int _{0}^{x} \left| G^{-1}(y,\nu )-G^{-1}(y,\mu ) \right| \, \mathrm {d}y + 4L_{\mu }c {\mathcal {W}}_2(\mu ,\nu ) \,, \end{aligned}$$

and hence Gronwall’s inequality implies

$$\begin{aligned} |G^{-1}(x,\mu ) - G^{-1}(x,\nu )| \le 4L_{\mu }c {\mathcal {W}}_2(\mu ,\nu )e^{4L_xx} \le 4L_{\mu }ce^{4L_xc} {\mathcal {W}}_2(\mu ,\nu )\,. \end{aligned}$$

For \(|x| \ge c\), \(|G^{-1}(x,\mu ) - G^{-1}(x,\nu )|=0\le 4L_{\mu }ce^{4L_xc} {\mathcal {W}}_2(\mu ,\nu )\) by the definition of G. We finally obtain, for all \(x,y\in {\mathbb {R}}\) and all \(\mu ,\nu \in {\mathcal {P}}_2({\mathbb {R}})\), with \(L(c):=4L_{\mu }ce^{4L_xc}\),

$$\begin{aligned}&|G^{-1}(x,\mu ) - G^{-1}(y,\nu )| \le |G^{-1}(x,\mu ) - G^{-1}(x,\nu )|+|G^{-1}(x,\nu ) - G^{-1}(y,\nu )|\\&\quad \le L(c) {\mathcal {W}}_2(\mu ,\nu ) + 2|x-y| \le \max (L(c),2) ({\mathcal {W}}_2(\mu ,\nu ) + |x-y|)\,. \end{aligned}$$

\(\square \)

Remark 3.3

In what follows, we will assume that \(c < 1/\sup _{\mu \in {\mathcal {P}}_2({\mathbb {R}})} |\alpha (\mu )|\) and that c is small enough such that the Lipschitz constant of the mapping \({\mathcal {P}}_2({\mathbb {R}}) \ni \mu \mapsto G^{-1}(x,\mu )\) (i.e., the constant L(c) from Proposition 3.3) is less than one half. The reason for this requirement will become clear in the proof of Theorem 3.1.

Similarly to the previous section, we aim to recover a unique strong solution of (3.3) by setting \(X_t = G^{-1}(Z_t^{\mu },\mu _t)\), where \(\mu _t = {\mathcal {L}}_{X_t}\) for \(t \in [0,T]\), and \((Z_t^{\mu })_{t \in [0,T]}\) is the process obtained by applying the transformation G to X. Even though G is not twice continuously differentiable in the state variable, Itô’s formula is still applicable due to the special form of the discontinuity (see [24, Theorem 2.1] and the comments after the proof of this theorem). In the following, we write \({\bar{\phi }}(x) := x|x|\phi (x/c)\), so that \(G(x,\mu ) = x + \alpha (\mu ){\bar{\phi }}(x)\). Now, observe that

$$\begin{aligned} \mathrm {d}G(X_t,\mu _t)&= \left( \partial _t G(X_t,\mu _t) + b(X_t,\mu _t) + \alpha ({\mu _t}) {\bar{\phi }}'(X_t) b(X_t,\mu _t) \right. \\&\qquad \left. + \frac{1}{2} \alpha ({\mu _t}) {\bar{\phi }}''(X_t) \sigma ^2(X_t) \right) \, \mathrm {d}t \\&\qquad + (\sigma (X_t) + \alpha ({\mu _t}) {\bar{\phi }}'(X_t)\sigma (X_t)) \, \mathrm {d}W_t. \end{aligned}$$

Itô’s formula along a flow of measures \((\mu _t)_{t \in [0,T]} \in {\mathcal {C}}([0,T],{\mathcal {P}}_2({\mathbb {R}}))\) (see, e.g., [14, Proposition 5.102]) implies

$$\begin{aligned}&\partial _t G(x,\mu _t) \\&= \int _{{\mathbb {R}}} \left( b(y,\mu _t) \partial _{\mu }G(x,\mu _t)(y) + \frac{\sigma ^{2}(y)}{2} \partial _y \partial _{\mu }G(x,\mu _t)(y) \right) \mu _t(\mathrm {d}y) \\&=:{\mathcal {L}}_{\mu _t}(G(x,\cdot ))(\mu _t), \end{aligned}$$

where we recall that \(\partial _y \partial _{\mu }G(x,\mu _t)(y)\) denotes the derivative of the mapping \({\mathbb {R}}\ni y \mapsto \partial _{\mu }G(x,\mu _t)(y)\) and

$$\begin{aligned}&\partial _{\mu }G(x,\mu _t)(y) = \partial _{\mu }\alpha (\mu _t)(y)|x|x\phi (x/c),\\&\partial _y \partial _{\mu }G(x,\mu _t)(y) = \partial _y\partial _{\mu }\alpha (\mu _t)(y)|x|x\phi (x/c). \end{aligned}$$

Hence, we define

$$\begin{aligned}&\mathrm {d}Z_t^{\mu } :={\tilde{b}}(Z^{\mu }_t,\mu _t) \, \mathrm {d}t + {\tilde{\sigma }}(Z^{\mu }_t,\mu _t) \, \mathrm {d}W_t, \quad Z_0^{\mu } = G(\xi ,\delta _{\xi }), \end{aligned}$$
(3.7)

where

$$\begin{aligned} {\tilde{b}}(z,\mu )&:= {\mathcal {L}}_{\mu }\big (G(G^{-1}(z,\mu ),\cdot )\big )(\mu ) + b(G^{-1}(z,\mu ),\mu ) \nonumber \\&\qquad + \alpha ({\mu }) {\bar{\phi }}'(G^{-1}(z,\mu )) b(G^{-1}(z,\mu ),\mu ) \nonumber \\&\qquad + \frac{1}{2} \alpha (\mu ) {\bar{\phi }}''(G^{-1}(z,\mu )) \sigma ^2(G^{-1}(z,\mu )), \nonumber \\ {\tilde{\sigma }}(z,\mu )&:= \sigma (G^{-1}(z,\mu )) + \alpha ({\mu }) {\bar{\phi }}'(G^{-1}(z,\mu ))\sigma (G^{-1}(z,\mu )). \end{aligned}$$
(3.8)

In the following, we will show that the decoupled SDE (3.7), where the flow \((\mu _t)_{t \in [0,T]} \in {\mathcal {C}}([0,T],{\mathcal {P}}_2({\mathbb {R}}))\) is fixed, has Lipschitz continuous coefficients. Note that for such a flow of measures the process \(X^{\mu }\) in (3.3) (interpreted as a classical SDE) has bounded moments, uniformly in \((\mu _t)_{t \in [0,T]}\), as a consequence of (H.1(1)) and (H.2(1)). In particular, for \(\xi \in L_2^{0}({\mathbb {R}})\), we have the a-priori estimate \({\mathbb {E}}\left[ \sup _{0 \le t \le T} |X^{\mu }_t|^{2} \right] \le C(1+ {\mathbb {E}}[|\xi |^{2}])=:{\bar{C}}\), where \(C>0\) depends only on T and the constants appearing in the model assumptions. We introduce the following subspace of \({\mathcal {C}}([0,T],{\mathcal {P}}_2({\mathbb {R}}))\): we define \({\mathcal {P}}^{b} := \lbrace \mu \in {\mathcal {C}}([0,T],{\mathcal {P}}_2({\mathbb {R}})): \ \sup _{t \in [0,T]} \int _{{\mathbb {R}}} x^2 \, \mu _t(\mathrm {d}x) \le {\bar{C}} \rbrace \), where \({\bar{C}}\) is defined as above, and equip this space with the metric \((\mu ,\nu ) \mapsto \sup _{t \in [0,T]} {\mathcal {W}}_2(\mu _t,\nu _t)\), for \((\mu _t)_{t \in [0,T]}, (\nu _t)_{t \in [0,T]} \in {\mathcal {P}}^{b}\), under which it is a complete metric space.

Lemma 3.1

Let Assumption (H.2) be satisfied and assume \(c < 1/ \sup _{\mu \in {\mathcal {P}}_2({\mathbb {R}})}|\alpha (\mu )|\). Then, \({\tilde{b}}\) and \({\tilde{\sigma }}\) given in (3.8) are Lipschitz continuous on \({\mathbb {R}}\times {\mathcal {P}}^{b}\).

Proof

Using (H.1(1)), the Lipschitz continuity of \(z \mapsto G^{-1}(z,\nu _t)\) and the uniform boundedness of \(\alpha \), see (H.2(1)), in combination with [40, Lemma 2.5], gives the Lipschitz continuity of \( z \mapsto {\tilde{\sigma }}(z,\nu _t)\), with a Lipschitz constant independent of \(\nu _t\). Similarly, we can deduce that \(\nu _t \mapsto {\tilde{\sigma }}(z,\nu _t)\) is Lipschitz continuous, due to Proposition 3.3 and the Lipschitz continuity of \( \nu _t \mapsto \alpha (\nu _t)\).

The choice of \(\alpha \) along with (H.2(1)) guarantees that the mapping

$$\begin{aligned} z \mapsto b(G^{-1}(z,\nu _t),\nu _t) + \frac{1}{2} \alpha (\nu _t) {\bar{\phi }}''(G^{-1}(z,\nu _t)) \sigma ^2(G^{-1}(z,\nu _t)), \end{aligned}$$

is Lipschitz continuous. From [40, Lemma 2.4], we can deduce the Lipschitz continuity of

$$\begin{aligned} z \mapsto \alpha (\nu _t) {\bar{\phi }}'(G^{-1}(z,\nu _t)) b(G^{-1}(z,\nu _t),\nu _t). \end{aligned}$$

That

$$\begin{aligned}&\nu _t \mapsto b(G^{-1}(z,\nu _t),\nu _t) + \frac{1}{2} \alpha (\nu _t) {\bar{\phi }}''(G^{-1}(z,\nu _t)) \sigma ^2(G^{-1}(z,\nu _t)), \\&\nu _t \mapsto \alpha (\nu _t) {\bar{\phi }}'(G^{-1}(z,\nu _t)) b(G^{-1}(z,\nu _t),\nu _t), \end{aligned}$$

are Lipschitz continuous is a consequence of (H.1(1)) and (H.2(1)), Proposition 3.3 and the fact that \(G^{-1}(z,\mu ^{(1)})\) and \(G^{-1}(z,\mu ^{(2)})\), for \(\mu ^{(1)}, \mu ^{(2)} \in {\mathcal {P}}_2({\mathbb {R}})\), have the same sign. Also note that \({\bar{\phi }}'\) and \({\bar{\phi }}''\) are (piecewise) Lipschitz continuous and bounded. It remains to analyse the Lipschitz continuity of

$$\begin{aligned}&(z,\nu _t) \mapsto \int _{{\mathbb {R}}} \left( b(y,\nu _t) \partial _{\mu }G(G^{-1}(z,\nu _t),\nu _t)(y) + \frac{\sigma ^{2}(y)}{2} \partial _y \partial _{\mu }G(G^{-1}(z,\nu _t),\nu _t)(y) \right) \nu _t(\mathrm {d}y). \end{aligned}$$
(3.9)

Assumptions (H.2(2)) and (H.2(3)) guarantee that the above mapping is well defined and, further, that the mapping \(y \mapsto b(y,\nu _t) \partial _{\mu }G(G^{-1}(z,\nu _t),\nu _t)(y)\) is continuous at zero. We start by analysing the Lipschitz continuity of (3.9) with respect to the measure variable. Consider now an arbitrary coupling \(\varPi _t(\cdot ,\cdot )\) between \(\nu _t(\cdot )\) and \(\mu _t(\cdot )\), for \((\mu _t)_{t \in [0,T]}, (\nu _t)_{t \in [0,T]} \in {\mathcal {P}}^{b}\), and estimate

$$\begin{aligned}&\int _{{\mathbb {R}}^2} \left( b(y,\nu _t) \partial _{\mu }G(G^{-1}(z,\nu _t),\nu _t)(y) - b(x,\mu _t) \partial _{\mu }G(G^{-1}(z,\mu _t),\mu _t)(x) \right) \, \varPi _t(\mathrm {d}y,\mathrm {d}x) \\&= \int _{{\mathbb {R}}^2} \left( b(y,\nu _t) \partial _{\mu }G(G^{-1}(z,\nu _t),\nu _t)(y) - b(x,\mu _t) \partial _{\mu }G(G^{-1}(z,\nu _t),\mu _t)(x) \right) \, \varPi _t(\mathrm {d}y,\mathrm {d}x) \\&\quad + \int _{{\mathbb {R}}^2} \left( b(x,\mu _t) \partial _{\mu }G(G^{-1}(z,\nu _t),\mu _t)(x) - b(x,\mu _t)\partial _{\mu }G(G^{-1}(z,\mu _t),\mu _t)(x) \right) \, \varPi _t(\mathrm {d}y,\mathrm {d}x). \end{aligned}$$

Note that

$$\begin{aligned}&\int _{{\mathbb {R}}^2} \left| b(y,\nu _t) \partial _{\mu }G(G^{-1}(z,\nu _t),\nu _t)(y) - b(x,\mu _t) \partial _{\mu }G(G^{-1}(z,\nu _t),\mu _t)(x) \right| \, \varPi _t(\mathrm {d}y,\mathrm {d}x) \\&= \int _{{\mathbb {R}}^2} \Big | b(y,\nu _t) \partial _{\mu }\alpha (\nu _t)(y) {\bar{\phi }}(G^{-1}(z,\nu _t)) - b(x,\mu _t) \partial _{\mu }\alpha (\mu _t)(x){\bar{\phi }}(G^{-1}(z,\nu _t)) \Big | \, \varPi _t(\mathrm {d}y,\mathrm {d}x) \\&\le C {\mathcal {W}}_2(\mu _t,\nu _t), \end{aligned}$$

where we used (H.2(4)) and the fact that \({\bar{\phi }}\) is bounded. Furthermore, from the boundedness of \((x,\nu _t) \mapsto \partial _{\mu }\alpha (\nu _t)(x)\) and (H.2(1)), we derive

$$\begin{aligned}&\int _{{\mathbb {R}}^2} \left| b(x,\mu _t) \partial _{\mu }G(G^{-1}(z,\nu _t),\mu _t)(x) - b(x,\mu _t) \partial _{\mu }G(G^{-1}(z,\mu _t),\mu _t)(x) \right| \, \varPi _t(\mathrm {d}y,\mathrm {d}x) \nonumber \\&\quad \le \int _{{\mathbb {R}}^2} \left| b(x,\mu _t)\partial _{\mu }\alpha (\mu _t)(x) \right| \left| {\bar{\phi }}(G^{-1}(z,\nu _t)) - {\bar{\phi }}(G^{-1}(z,\mu _t)) \right| \, \varPi _t(\mathrm {d}y,\mathrm {d}x) \nonumber \\&\quad \le C \int _{{\mathbb {R}}^2} (1+|x|) \left| {\bar{\phi }}(G^{-1}(z,\nu _t)) - {\bar{\phi }}(G^{-1}(z,\mu _t)) \right| \, \varPi _t(\mathrm {d}y,\mathrm {d}x) \nonumber \\&\quad \le C {\mathcal {W}}_2(\mu _t,\nu _t). \end{aligned}$$
(3.10)

We remark that in the last inequality, we used the Lipschitz continuity of \( z \mapsto |z| z \phi (z/c)\), Proposition 3.3 and employed that \((\mu _t)_{t \in [0,T]}\) is an element of the space \({\mathcal {P}}^{b}\). In a similar manner, we can show the Lipschitz continuity of

$$\begin{aligned} z \mapsto \int _{{\mathbb {R}}} b(y,\nu _t) \partial _{\mu }G(G^{-1}(z,\nu _t),\nu _t)(y) \, \nu _t(\mathrm {d}y). \end{aligned}$$

Analogous statements can be derived for

$$\begin{aligned} (z,\nu _t) \mapsto \int _{{\mathbb {R}}} \frac{\sigma ^{2}(y)}{2} \partial _y \partial _{\mu }G(G^{-1}(z,\nu _t),\nu _t)(y) \, \nu _t(\mathrm {d}y), \end{aligned}$$

taking (H.1(1)) and (H.2(2)) into account. \(\square \)

We are now ready to present the main result of this section, together with its proof:

Theorem 3.1

Let Assumption (H.2) be satisfied, let \(\xi \in L_p^{0}({\mathbb {R}})\) for a given \(p \ge 2\) and assume that the constant c is sufficiently small (as in Remark 3.3). Then, the McKean–Vlasov SDE defined in (3.3) has a unique strong solution in \({\mathcal {S}}^{p}([0,T])\).

Proof

First, we remark that for any given flow of measures \((\mu _t)_{t \in [0,T]} \in {\mathcal {C}}([0,T],{\mathcal {P}}_2({\mathbb {R}}))\), the SDE defined in (3.7) has a unique strong solution by Lemma 3.1. Now, for \((\mu _t)_{t \in [0,T]}, (\nu _t)_{t \in [0,T]} \in {\mathcal {P}}^{b}\), we obtain, for any \(t \in [0,T]\), using Lemma 3.1, the Burkholder–Davis–Gundy and Hölder inequalities, along with Gronwall’s inequality,

$$\begin{aligned}&{\mathbb {E}}\left[ |Z^{\mu }_t-Z^{\nu }_t|^2 \right] \\&\le C \left( {\mathbb {E}} \left[ \int _{0}^{t} |{\tilde{b}}(Z^{\mu }_s,\mu _s) - {\tilde{b}}(Z^{\nu }_s,\nu _s)|^2 \, \mathrm {d}s \right] + {\mathbb {E}} \left[ \int _{0}^{t} |{\tilde{\sigma }}(Z^{\mu }_s,\mu _s) - {\tilde{\sigma }}(Z^{\nu }_s,\nu _s)|^2 \, \mathrm {d}s \right] \right) \\&\le C \Bigg ( {\mathbb {E}} \left[ \int _{0}^{t} |{\tilde{b}}(Z^{\mu }_s,\mu _s) - {\tilde{b}}(Z^{\nu }_s,\mu _s)|^2 \, \mathrm {d}s \right] + {\mathbb {E}} \left[ \int _{0}^{t} |{\tilde{b}}(Z^{\nu }_s,\mu _s) - {\tilde{b}}(Z^{\nu }_s,\nu _s)|^2 \, \mathrm {d}s \right] \\&\quad + {\mathbb {E}} \left[ \int _{0}^{t} |{\tilde{\sigma }}(Z^{\mu }_s,\mu _s) - {\tilde{\sigma }}(Z^{\nu }_s,\mu _s)|^2 \, \mathrm {d}s \right] + {\mathbb {E}} \left[ \int _{0}^{t} |{\tilde{\sigma }}(Z^{\nu }_s,\mu _s) - {\tilde{\sigma }}(Z^{\nu }_s,\nu _s)|^2 \, \mathrm {d}s \right] \Bigg ) \\&\le C {\mathbb {E}} \left[ \int _{0}^{t} \left( |Z^{\mu }_s-Z^{\nu }_s|^2 + {\mathcal {W}}_2^{2}(\mu _s,\nu _s) \right) \mathrm {d}s \right] \le C \int _{0}^{t} {\mathcal {W}}_2^{2}(\mu _s,\nu _s) \, \mathrm {d}s. \end{aligned}$$

For \(k \ge 0\) and \(t \in [0,T]\), we define the Picard iteration

$$\begin{aligned} \mu _t^{k+1} = \text {Law}\left( G^{-1}(Z_t^{\mu ^{k}},\mu _t^{k}) \right) , \end{aligned}$$
(3.11)

with \(Z_t^{\mu ^{0}} = G(\xi ,\delta _{\xi })\) and \(\mu _t^{0}=\delta _{\xi }\). Note that by Itô’s formula (applied to \(G^{-1}\)), the process defined by \(X_t^{k+1}:= G^{-1}(Z_t^{\mu ^{k}},\mu _t^{k})\) is the solution to

$$\begin{aligned} \mathrm {d} X^{k+1}_t = b(X^{k+1}_t,\mu _t^{k}) \, \mathrm {d}t + \sigma (X^{k+1}_t) \, \mathrm {d}W_t, \quad X^{k+1}_0= \xi . \end{aligned}$$

Recall that \((X^{k+1}_t)_{t \in [0,T]}\) has uniformly bounded moments (uniformly in k), due to (H.1(1)) and (H.2(1)), i.e., we have

$$\begin{aligned} \sup _{k \ge 1} {\mathbb {E}}\left[ \sup _{0 \le t \le T} |X^{k}_t|^{p} \right] \le C(1+{\mathbb {E}}[|\xi |^{p}]). \end{aligned}$$
(3.12)

The applicability of Itô’s formula for \(G^{-1}\) is a consequence of the fact that the inverse inherits the regularity of G; in particular, the mapping \({\mathcal {P}}_2({\mathbb {R}}) \ni \mu \mapsto G^{-1}(y,\mu )\) is still an element of the class \({\mathcal {C}}^{(1,1)}_{b}\) (see Proposition A.1 in Appendix A.1). Then, the above estimate and Proposition 3.3 yield

$$\begin{aligned} \sup _{t \in [0, T]} {\mathcal {W}}_2^{2}(\mu _t^{k+1},\mu _t^{k})&\le \sup _{t \in [0, T]}{\mathbb {E}}\left[ |X^{k+1}_t-X^{k}_t|^2 \right] \nonumber \\&\le 2 \sup _{t \in [0, T]} {\mathbb {E}}\left[ |G^{-1}(Z_t^{\mu ^{k}},\mu _t^{k}) - G^{-1}(Z_t^{\mu ^{k-1}},\mu _t^{k})|^2 \right] \nonumber \\&\quad + 2 \sup _{t \in [0, T]}{\mathbb {E}}\left[ |G^{-1}(Z_t^{\mu ^{k-1}},\mu _t^{k}) - G^{-1}(Z_t^{\mu ^{k-1}},\mu _t^{k-1})|^2 \right] \nonumber \\&\le C \int _{0}^{T} {\mathcal {W}}_2^{2}(\mu ^{k}_s,\mu ^{k-1}_s) \mathrm {d}s + 2L(c) \sup _{t \in [0, T]} {\mathcal {W}}_2^{2}(\mu ^{k}_{t},\mu ^{k-1}_{t}) \nonumber \\&\le C\int _{0}^{T} {\mathcal {W}}_2^{2}(\mu ^{k}_s,\mu ^{k-1}_s) \mathrm {d}s + 2L(c) \sup _{t \in [0, T]} {\mathbb {E}}\left[ |X^{k}_{t}-X^{k-1}_{t}|^2 \right] , \end{aligned}$$
(3.13)

where \(L:= 2L(c) < 1\) due to the choice of c and \(C>0\) is a constant depending on the constants appearing in Proposition 3.3 and Lemma 3.1.

Let now \(0< T_0 <T\) be such that \(CT_0 + L < 1\). With this choice, the uniqueness of (3.3) on \([0,T_0]\) follows from the estimate in (3.13), applied to two solutions \((X,\mu )\) and \((Y,\nu )\) of (3.3), where \(\mu _t\) and \(\nu _t\) are the marginal laws of \(X_t\) and \(Y_t\), respectively, for \(t \in [0,T_0]\). In addition, we observe that the sequence of flows \((\mu ^{k})_{k}\), for \(\mu ^{k}=(\mu ^{k}_t)_{t \in [0,T_0]}\), is a Cauchy sequence in the complete metric space \({\mathcal {C}}([0,T_0],{\mathcal {P}}_2({\mathbb {R}}))\) equipped with the metric \(\sup _{t \in [0, T_0]} {\mathcal {W}}_2(\mu _t,\nu _t)\). Hence, (3.11) has a fixed point; in particular, we have \(X_t = G^{-1}(Z_t^{\mu },\mu _t)\), where \(\mu _t={\mathcal {L}}_{X_t}\). Itô’s formula applied to \(G^{-1}\) yields the claim for the time interval \([0,T_0]\). Repeating the above procedure starting at \(T_0\), we can extend the solution to the interval \([T_0,T_1]\), for some \(T_0< T_1 <T\). This is possible as the choice of \(T_1\) depends on \(X_{T_0}\) only through the second moment of \(X_{T_0}\), for which we have a uniform bound on the entire interval [0, T], see (3.12). Proceeding in this manner, we obtain well-posedness of (3.3) on [0, T]. \(\square \)
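To make the Picard iteration (3.11) concrete: each step freezes the measure flow and solves the classical SDE for \(X^{k+1}\) displayed above, which can be approximated by an Euler–Maruyama scheme over a particle cloud. The sketch below is a toy illustration under assumed coefficients (a drift with a jump at zero that depends on the measure only through its mean, a Lipschitz functional standing in for the general \({\mathcal {W}}_2\)-Lipschitz dependence, and unit diffusion); it is not the scheme analysed in the paper.

```python
import random, math

# Toy sketch of the Picard iteration (3.11): freeze the measure flow, solve
# the resulting classical SDE by Euler-Maruyama over a particle cloud, and
# update the flow by the empirical law of the solution. All coefficients are
# illustrative stand-ins, not taken from the paper.
random.seed(0)

def b(x, mean):                      # jump at x = 0, Lipschitz in the mean
    return (-1.0 if x > 0 else 1.0) - 0.5 * x + 0.2 * mean

def sigma(x):
    return 1.0                       # constant, globally Lipschitz diffusion

M, NSTEP, T = 2000, 50, 1.0          # cloud size, time steps, horizon
dt = T / NSTEP

def euler_maruyama(means):
    """Simulate M i.i.d. paths of dX = b(X, mu_t) dt + sigma(X) dW with the
    frozen flow 'means' (per-step means of mu_t); return the new flow of
    per-step sample means."""
    xs = [0.5] * M                   # deterministic initial condition xi = 0.5
    out = [0.5]
    for n in range(NSTEP):
        xs = [x + b(x, means[n]) * dt
                + sigma(x) * math.sqrt(dt) * random.gauss(0.0, 1.0)
              for x in xs]
        out.append(sum(xs) / M)
    return out

flow = [0.5] * (NSTEP + 1)           # mu^0 = delta_xi, encoded by its means
for _ in range(5):                   # a few Picard steps
    flow = euler_maruyama(flow)

assert all(math.isfinite(m) for m in flow)
```

After a few iterations the flow of means stabilises, in line with the contraction argument on \([0,T_0]\).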

3.3 Interacting particle system with non-decomposable drift

In the following, we state the model assumptions which will specify the set-up for this subsection:

H.3

Assumptions (H.2(3)) and (H.2(4)) are satisfied and we require:

  (1)

    Assumption (H.1(1)) holds and there exists a constant \(L >0\) such that \(|\sigma (x)| \le L\) for all \(x \in {\mathbb {R}}\).

  (2)

    There exists a constant \(L_1>0\) such that

    $$\begin{aligned} | b(x,\mu ) - b(x,\nu ) | \le L_1 {\mathcal {W}}_2(\mu ,\nu ) \quad \forall x \in {\mathbb {R}} \setminus \lbrace 0 \rbrace , \ \forall \mu , \nu \in {\mathcal {P}}_2({\mathbb {R}}). \end{aligned}$$

    Further, for any \(\mu \in {\mathcal {P}}_2({\mathbb {R}})\), \(x \mapsto b(x,\mu )\) is Lipschitz continuous on the subintervals \((-\infty ,0)\) and \((0,\infty )\), uniformly with respect to \(\mu \).

  (3)

    \(\alpha \in {\mathcal {C}}^{(1,2)}_{b}\) is a bounded function and the mappings

    $$\begin{aligned}&{\mathcal {P}}_2({\mathbb {R}}) \times {\mathbb {R}}\ni (\mu ,y) \mapsto \partial _{\mu } \alpha (\mu )(y), \\&{\mathcal {P}}_2({\mathbb {R}}) \times {\mathbb {R}}\ni (\mu ,y) \mapsto \partial _y \partial _{\mu } \alpha (\mu )(y),\\&{\mathcal {P}}_2({\mathbb {R}}) \times {\mathbb {R}}\times {\mathbb {R}}\ni (\mu ,y,y') \mapsto \partial ^{2}_{\mu }\alpha (\mu )(y,y'), \end{aligned}$$

    are bounded and Lipschitz continuous.

Remark 3.4

Note that, compared to (H.2(1)), we do not require the drift to be uniformly bounded in the measure component.

The interacting particles of the system \((X^{i,N}_t)_{t \in [0,T]}\), for \(i \in \lbrace 1, \ldots , N \rbrace \), associated with (3.3) satisfy

$$\begin{aligned} \mathrm {d}X_t^{i,N} = b(X_t^{i,N}, \mu _t^{{\varvec{X}}^{N}}) \, \mathrm {d}t + \sigma (X_t^{i,N}) \, \mathrm {d}W_t^{i}, \quad X_0^{i,N} = \xi ^{i}, \end{aligned}$$
(3.14)

where \((\xi ^{i},W^{i})\), for \(i \in \lbrace 1, \ldots , N \rbrace \), are independent copies of \((\xi ,W)\).

In contrast to the case of particle systems with decomposable drift, we set \(\alpha : {\mathcal {P}}_2({\mathbb {R}}) \rightarrow {\mathbb {R}}\),

$$\begin{aligned} \alpha (\mu ^{{\varvec{x}}^{N}}) = \frac{b(0^{-},\mu ^{{\varvec{x}}^{N}})- b(0^{+},\mu ^{{\varvec{x}}^{N}})}{2\sigma ^2(0)}, \end{aligned}$$

(which could also be interpreted as a mapping \(\alpha _N: {\mathbb {R}}^N \rightarrow {\mathbb {R}}\)) and apply to each particle the following transformation \(G: {\mathbb {R}}\times {\mathcal {P}}_2({\mathbb {R}}) \rightarrow {\mathbb {R}}\),

$$\begin{aligned} G \left( x_i, \mu ^{{\varvec{x}}^{N}} \right) = x_i + \alpha \left( \mu ^{{\varvec{x}}^{N}}\right) x_i|x_i| \phi \big (x_i/c \big ). \end{aligned}$$
(3.15)

We set \({\mathbb {R}}^{N} \ni {\varvec{x}}^{N} \mapsto G_i({\varvec{x}}^{N}) := G \left( x_i, \mu ^{{\varvec{x}}^{N}} \right) \), and use these mappings to define \({\varvec{G}}_N: {\mathbb {R}}^{N} \rightarrow {\mathbb {R}}^{N}\) by

$$\begin{aligned} {\varvec{G}}_N({\varvec{x}}^{N}) := \left( G_1({\varvec{x}}^{N}), \ldots , G_N({\varvec{x}}^{N}) \right) ^{\top }. \end{aligned}$$
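For illustration, the particle-level transformation (3.15) can be sketched as follows; the functional \(\alpha (\mu ) = 0.4\tanh \big (\int y \, \mu (\mathrm {d}y)\big )\) is a hypothetical stand-in (the paper defines \(\alpha \) through the jump of b at zero), and \(c = 0.5\) is an assumed toy cut-off.

```python
import math

# Sketch of the particle-level transformation (3.15): alpha is evaluated at
# the empirical measure mu^{x^N} of the cloud. The functional alpha below is
# a hypothetical stand-in, not the paper's definition.
C = 0.5  # cut-off level c; assumed small enough for invertibility

def phi(u):
    return (1.0 - u * u) ** 3 if abs(u) <= 1.0 else 0.0

def alpha_emp(xs):
    """Toy alpha(mu^{x^N}): a bounded Lipschitz functional of the mean."""
    return 0.4 * math.tanh(sum(xs) / len(xs))

def G_N(xs):
    """G_N(x^N) = (G(x_1, mu^{x^N}), ..., G(x_N, mu^{x^N}))."""
    a = alpha_emp(xs)                # one alpha value for the whole cloud
    return [x + a * x * abs(x) * phi(x / C) for x in xs]

zs = G_N([-0.6, -0.1, 0.0, 0.2, 0.8])
assert len(zs) == 5 and zs[2] == 0.0   # G fixes the origin
```

Note that particles with \(|x_i| \ge c\) are left unchanged, since the bump \(\phi (x_i/c)\) vanishes there.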

To obtain the transformed process \(({\varvec{Z}}^N_t)_{t \in [0,T]} = (Z^{1,N}_t,\ldots ,Z^{N,N}_t)_{t \in [0,T]}^{\top } \in {\mathbb {R}}^N\), we proceed as follows: For any \(t \in [0,T]\) and \(i \in \lbrace 1, \ldots , N \rbrace \), we have, using [14, Proposition 5.35] (see also Sect. 2)

$$\begin{aligned} \mathrm {d}G(X_t^{i,N},\mu _t^{{\varvec{X}}^{N}})&= \mathrm {d}G_i(X_t^{1,N},\ldots ,X_t^{N,N}) \nonumber \\&= \partial _{x_i} G(X_t^{i,N},\mu _t^{{\varvec{X}}^{N}})\mathrm {d}X_t^{i,N}+\frac{1}{2} \partial ^2_{x_i} G(X_t^{i,N},\mu _t^{{\varvec{X}}^{N}})\mathrm {d}[X^{i,N}]_t \nonumber \\&\quad + \frac{1}{N} \sum _{k =1}^{N} \partial _{\mu } G(X_t^{i,N},\mu _t^{{\varvec{X}}^{N}})(X_t^{k,N}) \mathrm {d}X^{k,N}_t \nonumber \\&\quad + \frac{1}{2N} \sum _{k =1}^{N} \partial _y \partial _{\mu }G(X_t^{i,N},\mu _t^{{\varvec{X}}^{N}})(X_t^{k,N}) \mathrm {d}[X^{k,N}]_t \nonumber \\&\quad + \frac{1}{2N^{2}} \sum _{k=1}^{N}\partial ^2_{\mu }G(X_t^{i,N},\mu _t^{{\varvec{X}}^{N}})(X_t^{k,N},X_t^{k,N}) \mathrm {d}[X^{k,N}]_t \nonumber \\&\quad + \frac{1}{N} \partial _{x_i} \partial _{\mu }G(X_t^{i,N},\mu _t^{{\varvec{X}}^{N}})(X_t^{i,N}) \mathrm {d}[X^{i,N}]_t. \nonumber \end{aligned}$$

The applicability of Itô’s formula for the function G is guaranteed by [40, Theorem 3.19]. Note that Assumption 3.4 therein is imposed to guarantee Lipschitz continuity of the second order derivatives of G outside the set of discontinuities; Assumption (H.3(3)) and the definition of G serve as a substitute for this condition.

Assuming for now the global invertibility of \({\varvec{G}}_N\), we may introduce

$$\begin{aligned} \mathrm {d}{\varvec{Z}}^{N}_t = {\varvec{B}}_N\left( {\varvec{G}}^{-1}_N({\varvec{Z}}^{N}_t) \right) \, \mathrm {d}t + \varvec{\varSigma }_N\left( {\varvec{G}}^{-1}_N({\varvec{Z}}^{N}_t) \right) \, \mathrm {d}{\varvec{W}}^{N}_t, \quad {\varvec{Z}}^{N}_0= {\varvec{G}}_N((X_0^{1,N}, \ldots , X_0^{N,N})), \end{aligned}$$

where \({\varvec{W}}^{N}_t = (W^{1}_t, \ldots , W^{N}_t)^{\top }\), \({\varvec{B}}_N({\varvec{x}}^{N})=(B_1({\varvec{x}}^{N}), \ldots , B_N({\varvec{x}}^{N}))^{\top }\) is defined by

$$\begin{aligned} B_i({\varvec{x}}^{N})&:= \partial _{x_i} G(x_i,\mu ^{{\varvec{x}}^{N}}) b(x_i,\mu ^{{\varvec{x}}^{N}}) + \frac{1}{2} \sigma ^{2}(x_i) \partial ^2_{x_i} G(x_i,\mu ^{{\varvec{x}}^{N}}) \nonumber \\&\quad + \frac{1}{N} \sum _{k =1}^{N} \partial _{\mu } G(x_i,\mu ^{{\varvec{x}}^{N}})(x_k) b(x_k,\mu ^{{\varvec{x}}^{N}}) + \frac{1}{2N} \sum _{k =1}^{N} \partial _y \partial _{\mu }G(x_i,\mu ^{{\varvec{x}}^{N}})(x_k) \sigma ^{2}(x_k) \nonumber \\&\quad + \frac{1}{2N^{2}} \sum _{k=1}^{N}\partial ^2_{\mu }G(x_i,\mu ^{{\varvec{x}}^{N}})(x_k,x_k) \sigma ^{2}(x_k) + \frac{1}{N} \partial _{x_i} \partial _{\mu }G(x_i,\mu ^{{\varvec{x}}^{N}})(x_i) \sigma ^2(x_i), \end{aligned}$$
(3.16)

and \(\varvec{\varSigma }_N({\varvec{x}}^N) = \left( \varSigma ^{i,j}({\varvec{x}}^N) \right) _{i,j \in \lbrace 1, \ldots , N \rbrace }\) by

$$\begin{aligned} \varSigma ^{i,j}({\varvec{x}}^N)&= \partial _{x_i} G(x_i,\mu ^{{\varvec{x}}^{N}}) \sigma (x_i) \delta _{i,j} + \frac{1}{N}\partial _{\mu } G(x_i,\mu ^{{\varvec{x}}^{N}})(x_j) \sigma (x_j). \end{aligned}$$
(3.17)

In the following lemma, we will prove the invertibility of \({\varvec{G}}_N\).

Lemma 3.2

Let Assumption (H.3(3)) be satisfied and assume that the constant c in (3.15) satisfies

$$\begin{aligned} c< \min \left( 1, \left( \sup _{{\varvec{x}}^N \in {\mathbb {R}}^{N}} \left( |\alpha _N({\varvec{x}}^N)| + \max _{i \in \lbrace 1, \ldots , N \rbrace } |\partial _\mu \alpha (\mu ^{{\varvec{x}}^N})(x_i)| \right) \right) ^{-1} \right) . \end{aligned}$$
(3.18)

Then, \({\varvec{G}}_N\) has a global inverse.

Proof

We will employ Hadamard’s global inverse function theorem (see, e.g., [56, Theorem 2.2]) to prove that \({\varvec{G}}_N\) has a global inverse. To do so, we need to verify the following properties of \({\varvec{G}}_N\): \({\varvec{G}}_N\) is in \({\mathcal {C}}^{1}({\mathbb {R}}^N,{\mathbb {R}}^{N})\), \(\lim _{|{\varvec{x}}^N| \rightarrow \infty } |{\varvec{G}}_N({\varvec{x}}^N)| = \infty \), and \({\varvec{G}}'_N({\varvec{x}}^N)\) is invertible for all \({\varvec{x}}^N \in {\mathbb {R}}^N\). The first two conditions follow immediately from the definition of \({\varvec{G}}_N\) and the uniform boundedness of \(\alpha \) and \({\bar{\phi }}(x) := x|x| \phi \big (x/c \big )\).

Hence, we need to prove that \({\varvec{G}}'_N({\varvec{x}}^N)\) is invertible. First, note that

$$\begin{aligned} {\varvec{G}}'_N({\varvec{x}}^N) = \mathrm {I}_{N \times N} + \text {diag}_{N \times N}({\bar{\phi }}'(x_1)\alpha _N({\varvec{x}}^N), \ldots ,{\bar{\phi }}'(x_N)\alpha _N({\varvec{x}}^N))+ \bar{\varvec{\phi }}({\varvec{x}}^N)\varvec{\alpha }'_N({\varvec{x}}^N), \end{aligned}$$

where \(\mathrm {I}_{N \times N}\) is the \(N \times N\) identity matrix, \(\bar{\varvec{\phi }}({\varvec{x}}^N) = ({\bar{\phi }}(x_1),\ldots ,{\bar{\phi }}(x_N))^{\top }\), with \((\varvec{\alpha }_N'({\varvec{x}}^N))_i= \frac{1}{N}\partial _\mu \alpha (\mu ^{{\varvec{x}}^N})(x_i)\) and \(\varvec{\alpha }'_N\) is a row vector.

Now, we define

$$\begin{aligned} {\mathcal {A}}({\varvec{x}}^N) := \text {diag}_{N \times N}({\bar{\phi }}'(x_1)\alpha _N({\varvec{x}}^N), \ldots ,{\bar{\phi }}'(x_N)\alpha _N({\varvec{x}}^N)) + \bar{\varvec{\phi }}({\varvec{x}}^N)\varvec{\alpha }'_N({\varvec{x}}^N), \end{aligned}$$

and remark that \({\varvec{G}}'_N({\varvec{x}}^N)\) can be identified with the linear operator \(\mathrm {I}_{N \times N} + {\mathcal {A}}({\varvec{x}}^{N}): {\mathbb {R}}^N \rightarrow {\mathbb {R}}^N\). Therefore, it suffices to show that c can be chosen (uniformly in \({\varvec{x}}^{N}\)) such that the operator norm of \({\mathcal {A}}({\varvec{x}}^N)\) is smaller than one, since then \({\varvec{G}}'_N({\varvec{x}}^N)\) is a small perturbation of the identity and hence invertible (see [40, Lemma 3.17]). We compute

$$\begin{aligned} \Vert {\mathcal {A}}({\varvec{x}}^N) \Vert&\le \max _{i \in \lbrace {1, \ldots , N \rbrace }} |{\bar{\phi }}'(x_i)| |\alpha _N({\varvec{x}}^N)|+ \max _{i \in \lbrace 1, \ldots , N \rbrace } |{\bar{\phi }}(x_i)| |\partial _\mu \alpha (\mu ^{{\varvec{x}}^N})(x_i)| \\&\le c |\alpha _N({\varvec{x}}^N)| + c^2 \max _{i \in \lbrace 1, \ldots , N \rbrace }|\partial _\mu \alpha (\mu ^{{\varvec{x}}^N})(x_i)|, \end{aligned}$$

which implies that for

$$\begin{aligned} c< \min \left( 1, \left( |\alpha _N({\varvec{x}}^N)| + \max _{i \in \lbrace 1, \ldots , N \rbrace }|\partial _\mu \alpha (\mu ^{{\varvec{x}}^N})(x_i)| \right) ^{-1} \right) , \end{aligned}$$

\(\Vert {\mathcal {A}}({\varvec{x}}^N) \Vert < 1\). Note that (H.3(3)) guarantees that \(\alpha \) and its derivatives are uniformly bounded, i.e., c can be chosen uniformly in \({\varvec{x}}^N\). Therefore, Hadamard’s global inverse function theorem proves that, for each given \(N \ge 1\), \({\varvec{G}}_N: {\mathbb {R}}^N \rightarrow {\mathbb {R}}^N\) is a diffeomorphism. \(\square \)
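The operator-norm bound at the heart of the proof can be sanity-checked numerically. In the sketch below, \(\alpha (\mu ) = 0.4\tanh \big (\int y\,\mu (\mathrm {d}y)\big )\) (so that \(\partial _\mu \alpha (\mu )(y) = 0.4\big (1-\tanh ^2(\int y\,\mu (\mathrm {d}y))\big )\), constant in y) and \(c = 0.5\) are hypothetical stand-ins, not taken from the paper.

```python
import math

# Numerical sanity check for the proof of Lemma 3.2, using a toy alpha.
# alpha(mu) = 0.4*tanh(mean) and its Lions derivative are stand-ins.
C = 0.5

def phi(u):
    return (1.0 - u * u) ** 3 if abs(u) <= 1.0 else 0.0

def phibar(x):                 # phibar(x) = x|x| phi(x/c);  |phibar| <= c^2
    return x * abs(x) * phi(x / C)

def phibar_prime(x):           # closed form; zero outside [-c, c], |.| < c/2
    u = x / C
    if abs(u) > 1.0:
        return 0.0
    return -2.0 * C * abs(u) * (4.0 * u * u - 1.0) * (1.0 - u * u) ** 2

xs = [-0.6, -0.1, 0.0, 0.2, 0.8]
mean = sum(xs) / len(xs)
alpha_N = 0.4 * math.tanh(mean)
dmu_alpha = 0.4 * (1.0 - math.tanh(mean) ** 2)

# The bound on ||A(x^N)|| from the proof:
bound = (max(abs(phibar_prime(x)) for x in xs) * abs(alpha_N)
         + max(abs(phibar(x)) for x in xs) * abs(dmu_alpha))
assert bound <= C * abs(alpha_N) + C * C * abs(dmu_alpha) + 1e-12
assert bound < 1.0
```

The intermediate inequality mirrors the estimate \(\Vert {\mathcal {A}}({\varvec{x}}^N)\Vert \le c|\alpha _N({\varvec{x}}^N)| + c^2 \max _i |\partial _\mu \alpha (\mu ^{{\varvec{x}}^N})(x_i)|\) from the proof.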

We proceed by showing that the transformed SDE has (locally) Lipschitz continuous coefficients.

Lemma 3.3

Let Assumption (H.3) be satisfied. Then, the coefficients \({\varvec{B}}_N\) and \(\varvec{\varSigma }_N\) introduced in (3.16) and (3.17), respectively, are locally Lipschitz continuous with linear growth.

Proof

For \({\varvec{x}}^{N}, {\varvec{y}}^{N} \in {\mathbb {R}}^N\), we obtain using (3.17)

$$\begin{aligned}&\Vert \varvec{\varSigma }_N({\varvec{x}}^{N}) - \varvec{\varSigma }_N({\varvec{y}}^{N}) \Vert ^2 \\&\quad \le \sum _{i=1}^{N} |\varSigma ^{i,i}({\varvec{x}}^{N}) - \varSigma ^{i,i}({\varvec{y}}^{N})|^2 + \sum _{i \ne j} |\varSigma ^{i,j}({\varvec{x}}^{N}) - \varSigma ^{i,j}({\varvec{y}}^{N})|^2 \\&\quad \le \sum _{i=1}^{N} |\partial _{x_i} G(x_i,\mu ^{{\varvec{x}}^{N}}) \sigma (x_i) - \partial _{y_i} G(y_i,\mu ^{{\varvec{y}}^{N}}) \sigma (y_i)|^2 \\&\qquad \quad + \frac{1}{N^2} \sum _{i \ne j} |\partial _{\mu } G(x_i,\mu ^{{\varvec{x}}^{N}})(x_j) \sigma (x_j) -\partial _{\mu } G(y_i,\mu ^{{\varvec{y}}^{N}})(y_j) \sigma (y_j)|^2 \\&\quad \le C |{\varvec{x}}^{N} - {\varvec{y}}^{N}|^2, \end{aligned}$$

where we used (H.3(1)), and the Lipschitz continuity of the functions \(x \mapsto \partial _{x} G(x,\mu )\) and \(\mu \mapsto \partial _{x} G(x,\mu )\) to estimate the first term. Assumptions (H.3(1)) and (H.3(3)), in particular the Lipschitz continuity of \(x \mapsto \partial _{\mu } G(x,\mu )(y)\) and \(\mu \mapsto \partial _{\mu } G(x,\mu )(y)\), are employed to handle the second sum. Also note that all these mappings are bounded due to (H.3(3)) and the definition of the transformation.

Noting that

$$\begin{aligned}&| {\varvec{B}}_N({\varvec{x}}^{N}) - {\varvec{B}}_N({\varvec{y}}^{N}) |^2 = \sum _{i=1}^{N} |B_i({\varvec{x}}^{N}) - B_i({\varvec{y}}^{N})|^2, \end{aligned}$$

we further obtain for the drift

$$\begin{aligned}&|B_i({\mathbf {x}}^{N}) - B_i({\mathbf {y}}^{N})|^2 \\&\quad \le C \Bigg ( |\partial _{x_i} G(x_i,\mu ^{{\mathbf {x}}^{N}}) b(x_i,\mu ^{{\mathbf {x}}^{N}}) + \frac{1}{2} \sigma ^{2}(x_i) \partial ^2_{x_i} G(x_i,\mu ^{{\mathbf {x}}^{N}}) \\& -\partial _{y_i} G(y_i,\mu ^{{\mathbf {y}}^{N}}) b(y_i,\mu ^{{\mathbf {y}}^{N}}) - \frac{1}{2} \sigma ^{2}(y_i) \partial ^2_{y_i} G(y_i,\mu ^{{\mathbf {y}}^{N}})|^2 \\&\qquad +\frac{1}{N}\sum _{k =1}^{N}| \partial _{\mu } G(x_i,\mu ^{{\mathbf {x}}^{N}})(x_k) b(x_k,\mu ^{{\mathbf {x}}^{N}}) - \partial _{\mu } G(y_i,\mu ^{{\mathbf {y}}^{N}})(y_k) b(y_k,\mu ^{{\mathbf {y}}^{N}})|^2 \\&\qquad + \frac{1}{2N} \sum _{k =1}^{N} |\partial _y \partial _{\mu }G(x_i,\mu ^{{\mathbf {x}}^{N}})(x_k) \sigma ^{2}(x_k) - \partial _y \partial _{\mu }G(y_i,\mu ^{{\mathbf {y}}^{N}})(y_k) \sigma ^{2}(y_k)|^2 \\&\qquad + \frac{1}{2N^{2}} \sum _{k=1}^{N}|\partial ^2_{\mu }G(x_i,\mu ^{{\mathbf {x}}^{N}})(x_k,x_k) \sigma ^{2}(x_k)-\partial ^2_{\mu }G(y_i,\mu ^{{\mathbf {y}}^{N}})(y_k,y_k) \sigma ^{2}(y_k)|^2 \\&\qquad + \frac{1}{N} |\partial _{x_i} \partial _{\mu }G(x_i,\mu ^{{\mathbf {x}}^{N}})(x_i) \sigma ^2(x_i)-\partial _{y_i} \partial _{\mu }G(y_i,\mu ^{{\mathbf {y}}^{N}})(y_i) \sigma ^2(y_i)|^2 \Bigg )=: \sum _{j=1}^{5} \varPi _j. \end{aligned}$$

That the terms \(\varPi _3,\varPi _4\) and \(\varPi _5\) allow a Lipschitz bound is a consequence of (H.3(1)) and (H.3(3)).

For any \(R>0\), in view of (H.3(2)) and (H.3(3)), we derive the following estimate for \(\varPi _2\):

$$\begin{aligned} \varPi _2&\le \frac{C}{N} \Bigg ( \sum _{k = 1}^{N} \left| \partial _{\mu } \alpha (\mu ^{{\mathbf {x}}^{N}})(x_k){\bar{\phi }}(x_i) b(x_k,\mu ^{{\mathbf {x}}^{N}}) - \partial _{\mu } \alpha (\mu ^{{\mathbf {y}}^{N}})(y_k){\bar{\phi }}(x_i)b(y_k,\mu ^{{\mathbf {y}}^{N}}) \right| ^2 \\&\quad + \sum _{k = 1}^{N} \left| \partial _{\mu } \alpha (\mu ^{{\mathbf {y}}^{N}})(y_k){\bar{\phi }}(x_i)b(y_k,\mu ^{{\mathbf {y}}^{N}}) - \partial _{\mu } \alpha (\mu ^{{\mathbf {y}}^{N}})(y_k){\bar{\phi }}(y_i)b(y_k,\mu ^{{\mathbf {y}}^{N}}) \right| ^2 \Bigg ) \\&\le \frac{C}{N} \sum _{k = 1}^{N} \left| \partial _{\mu } \alpha (\mu ^{{\mathbf {x}}^{N}})(x_k) b(x_k,\mu ^{{\mathbf {x}}^{N}}) - \partial _{\mu } \alpha (\mu ^{{\mathbf {y}}^{N}})(y_k)b(y_k,\mu ^{{\mathbf {y}}^{N}}) \right| ^2 + L_R|x_i-y_i|^2, \end{aligned}$$

for some constant \(L_R>0\) and any \(|{\varvec{x}}^{N}|, |{\varvec{y}}^{N}| \le R\). We proceed with the estimate

$$\begin{aligned} \sum _{k = 1}^{N} \left| \partial _{\mu } \alpha (\mu ^{{\varvec{x}}^{N}})(x_k) b(x_k,\mu ^{{\varvec{x}}^{N}}) - \partial _{\mu } \alpha (\mu ^{{\varvec{y}}^{N}})(y_k)b(y_k,\mu ^{{\varvec{y}}^{N}}) \right| ^2 \le C \sum _{k = 1}^{N} |x_k-y_k|^2, \end{aligned}$$

which holds due to Assumptions (H.2(3)) and (H.2(4)). Combining the above estimates, we obtain

$$\begin{aligned} \varPi _2 \le L_R \left( |x_i-y_i|^2 + \frac{1}{N}\sum _{k = 1}^{N} |x_k-y_k|^2 \right) . \end{aligned}$$

Finally, we point out that

$$\begin{aligned} x \mapsto b(x,\mu ) + \frac{1}{2} \sigma ^{2}(x) \partial ^2_{x} G(x,\mu ), \end{aligned}$$

is Lipschitz continuous due to the choice of \(\alpha \). Employing this along with (H.3(1)), (H.3(2)) and (H.3(3)), we derive

$$\begin{aligned} \varPi _1 \le C \left( |x_i-y_i|^2 + \frac{1}{N}\sum _{k = 1}^{N} |x_k-y_k|^2 \right) , \end{aligned}$$

for some constant \(C >0\). Taking the estimates for \(\varPi _1, \ldots , \varPi _5\) into account yields the local Lipschitz continuity of \({\varvec{B}}_N\).

The linear growth of \({\varvec{B}}_N\) and \(\varvec{\varSigma }_N\), i.e., that there exists a constant \(C>0\) such that \(|{\varvec{B}}_N({\varvec{x}}^{N})| + \Vert \varvec{\varSigma }_N({\varvec{x}}^{N})\Vert \le C(1+|{\varvec{x}}^{N}|)\) for all \({\varvec{x}}^{N} \in {\mathbb {R}}^{N}\), is a direct consequence of the growth conditions on b and \(\sigma \) along with the bounds for the derivatives of G and \(\alpha \). \(\square \)

Theorem 3.2

Let Assumption (H.3) be satisfied, let \(\xi \in L_p^{0}({\mathbb {R}})\) for a given \(p \ge 2\) and assume that the constant c in (3.15) satisfies (3.18). Then, the interacting particle system defined in (3.14) has a unique strong solution in \({\mathcal {S}}^{p}([0,T])\).

Proof

From Lemma 3.3 and the linear growth of \({\varvec{B}}_N\) and \(\varvec{\varSigma }_N\), we can deduce that the SDE for \({\varvec{Z}}^{N}\) has a unique strong solution (see [43, Chapter 5, Theorem 2.5]). Applying Itô’s formula to \({\varvec{G}}_N^{-1}({\varvec{Z}}^{N}_t)\) then yields a unique strong solution to the particle system defined in (3.14). Note that \({\varvec{G}}_N^{-1}\) exists due to Lemma 3.2 and that Itô’s formula is applicable to \({\varvec{G}}_N^{-1}\), as it inherits the regularity of \({\varvec{G}}_N\) (see Appendix A.2 for details). \(\square \)

4 Euler–Maruyama scheme with and without transformation

In this section, we restrict most of our discussion to the case of a McKean–Vlasov SDE with decomposable drift, due to the simpler structure of the underlying transformation and the particle systems.

In the following subsections, we will present two Euler–Maruyama schemes to discretise the particle system defined in (3.2) in time.

For the first scheme (Scheme 1), we will discretise the transformed (continuous) particle system in time and then exploit the global inverse \(G^{-1}\) to obtain approximations of the original (discontinuous) particle system. A slight modification of this scheme will also be applied in the non-decomposable case. An approximation result with respect to the number of particles will also be presented.

The second scheme (Scheme 2) will be defined by directly discretising the discontinuous particle system, without making use of the transformation G. We give strong convergence rates in terms of the number of time-steps and pathwise strong propagation of chaos results in order to obtain quantitative \(L_2\)-approximations for the underlying McKean–Vlasov SDE.

4.1 Scheme 1: Euler–Maruyama after transformation (decomposable case)

We define the following explicit Euler–Maruyama scheme to discretise the particle system (3.2) in time. In a first step, we partition a given time interval [0, T] into subintervals of equal length \(h=T/M\), for some integer \(M>0\), and define \(t_n:= nh\). Then, we simulate the transformed particle system by

$$\begin{aligned} Z_{t_{n+1}}^{i,N,M} = Z_{t_{n}}^{i,N,M} + {\tilde{b}}(G^{-1}(Z_{t_n}^{i,N,M}), \mu _{t_n}^{{\varvec{Z}}^{N,M}}) h + {\tilde{\sigma }}(G^{-1}(Z_{t_n}^{i,N,M})) \varDelta W_{n}^{i}, \end{aligned}$$
(4.1)

for \(n \in \lbrace 0, \ldots , M-1 \rbrace \), where \(Z_0^{i,N,M} = G(X_0^{i,N})\), \(\varDelta W_{n}^{i} = W_{t_{n+1}}^{i} - W_{t_{n}}^{i}\), for \(i \in \lbrace 1, \ldots , N \rbrace \), and

$$\begin{aligned} \mu _{t_n}^{{\varvec{Z}}^{N,M}}(\mathrm {d}x) := \frac{1}{N} \sum _{j=1}^{N} \delta _{G^{-1}(Z_{t_{n}}^{j,N,M})}(\mathrm {d}x). \end{aligned}$$

We introduce the notation \(\eta (t):= \sup \lbrace s \in \lbrace 0, h, \ldots , Mh \rbrace : s \le t \rbrace \), for \(t \in [0,T]\), which allows us to define the continuous time version of (4.1)

$$\begin{aligned} Z_{t}^{i,N,M} = Z_0^{i,N,M} + \int _{0}^{t} {\tilde{b}}(G^{-1}(Z_{\eta (s)}^{i,N,M}), \mu _{\eta (s)}^{{\varvec{Z}}^{N,M}}) \, \mathrm {d}s + \int _{0}^{t} {\tilde{\sigma }}(G^{-1}(Z_{\eta (s)}^{i,N,M})) \, \mathrm {d}W_s^{i}. \end{aligned}$$
(4.2)

Then, we propose an Euler–Maruyama approximation to \(X_t^{i,N}\), for \(i \in \lbrace 1, \ldots , N \rbrace \) and \(t \in [0,T]\), by

$$\begin{aligned} X_t^{i,N,M}=G^{-1}(Z_{t}^{i,N,M}). \end{aligned}$$
(4.3)
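For concreteness, the scheme (4.1)–(4.3) can be sketched in code. The sketch below uses one hypothetical decomposable example, \(b_1(x) = 1\) for \(x<0\) and \(-1\) for \(x \ge 0\), \(b_2(x,\mu ) = \int y \, \mu (\mathrm {d}y) - x\) and \(\sigma \equiv 1\), together with a polynomial bump \({\bar{\phi }}\) supported on \([-c,c]\) and \(\alpha \) chosen to cancel the drift jump at zero; these choices, and the Newton iteration used for \(G^{-1}\), are illustrative rather than the paper's exact construction.

```python
import numpy as np

# Illustrative sketch of Scheme 1 for a decomposable drift
# b(x, mu) = b1(x) + b2(x, mu). All concrete choices below (b1, b2, sigma,
# the bump phi_bar, the constants alpha and c) are hypothetical examples,
# not the paper's exact construction.
rng = np.random.default_rng(1)

def b1(x):                         # discontinuous at 0; jump b1(0-) - b1(0+) = 2
    return np.where(x < 0, 1.0, -1.0)

def b2(x, particles):              # mean-field interaction, Lipschitz in W_2
    return np.mean(particles) - x

sigma = 1.0                        # constant diffusion, sigma(0) != 0
alpha = (1.0 - (-1.0)) / (2 * sigma**2)   # cancels the drift jump at zero
c = 0.5                            # support radius of the bump; keeps G monotone

def _u(x):
    return np.clip(np.abs(x) / c, 0.0, 1.0)

def phi_bar(x):                    # sign(x) * c^2 * u^2 (1-u)^3, u = |x|/c
    u = _u(x)
    return np.sign(x) * c**2 * u**2 * (1 - u)**3

def phi_bar_p(x):                  # first derivative (an even function)
    u = _u(x)
    return c * u * (1 - u)**2 * (2 - 5*u)

def phi_bar_pp(x):                 # second derivative; phi_bar''(0+-) = +-2
    u = _u(x)
    return np.sign(x) * (2 - 18*u + 36*u**2 - 20*u**3)

def G(x):  return x + alpha * phi_bar(x)
def Gp(x): return 1.0 + alpha * phi_bar_p(x)

def G_inv(z, iters=30):            # Newton iteration; G' stays close to 1
    x = np.array(z, dtype=float)
    for _ in range(iters):
        x = x - (G(x) - z) / Gp(x)
    return x

N, M, T = 100, 200, 1.0
h = T / M
Z = G(rng.normal(size=N))          # Z_0^{i,N,M} = G(X_0^{i,N})
for n in range(M):
    X = G_inv(Z)                   # current particles in original coordinates
    # transformed drift G'(x) b(x, mu) + 0.5 sigma^2 G''(x), with G'' = alpha phi_bar''
    drift = Gp(X) * (b1(X) + b2(X, X)) + 0.5 * sigma**2 * alpha * phi_bar_pp(X)
    Z = Z + drift * h + Gp(X) * sigma * rng.normal(scale=np.sqrt(h), size=N)
X_T = G_inv(Z)                     # X^{i,N,M} = G^{-1}(Z^{i,N,M}), cf. (4.3)
```

Since \(G'\) here stays uniformly close to one, a few Newton steps suffice for \(G^{-1}\); bisection is a robust alternative when less is known about \(G'\).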

The convergence of this algorithm is proven in the following theorem:

Theorem 4.1

Let Assumption (H.1) be satisfied, let \(\xi \in L_p^{0}({\mathbb {R}})\) for some \(p > 4\) and assume \(c < 1/|\alpha |\). For \(i \in \lbrace 1, \ldots , N \rbrace \), let \((X_t^{i})_{t \in [0,T]}\) be the unique strong solution of (3.1) driven by the Brownian motion \((W_t^{i})_{t \in [0,T]}\) with initial data \(\xi ^{i}\), and \((X_t^{i,N,M})_{t \in [0,T]}\) be given by (4.2) and (4.3). Then, there exists a constant \(C>0\) (independent of N and M) such that

$$\begin{aligned} \max _{i \in \lbrace 1, \ldots , N \rbrace }{\mathbb {E}}\left[ \sup _{t \in [0,T]} |X_t^{i} - X_t^{i,N,M}|^2 \right] \le C \left( h + N^{-1/2} \right) . \end{aligned}$$

Proof

Note that

$$\begin{aligned} |X_t^{i} - X_t^{i,N,M}|^2 \le 2 |X_t^{i} - X_t^{i,N}|^2 + 2|X_t^{i,N} - X_t^{i,N,M}|^2, \end{aligned}$$

and recall that the dynamics of \(Z_t^{i} = G(X_t^{i})\) satisfies

$$\begin{aligned} \mathrm {d}Z_t^{i} = {\tilde{b}}(Z_t^{i}, {\tilde{\mu }}_t^{Z}) \, \mathrm {d}t + {\tilde{\sigma }}(Z_t^{i}) \, \mathrm {d}W_t^{i}, \end{aligned}$$

where \({\tilde{\mu }}_t^{Z} := {\mathcal {L}}_{G^{-1}(Z^{i}_t)}\). Therefore, we obtain for some constant \(C>0\)

$$\begin{aligned} {\mathbb {E}} \left[ \sup _{t \in [0,T]} |X_t^{i} - X_t^{i,N}|^2 \right]&= {\mathbb {E}} \left[ \sup _{t \in [0,T]} |G^{-1}(Z_t^{i}) - G^{-1}(Z_t^{i,N})|^2 \right] \\&\le L_{G^{-1}}^2 {\mathbb {E}} \left[ \sup _{t \in [0,T]} |Z_t^{i} - Z_t^{i,N}|^2 \right] \le C N^{-1/2}, \end{aligned}$$

where the last inequality can be derived similarly to the propagation of chaos results for equations with Lipschitz continuous coefficients as in, e.g., [13] for \(d=1\) (note that the rate \(N^{-1/2}\) can be improved to \(N^{-1}\) in the case where \(b_2(x,\mu ) = \int _{{\mathbb {R}}} \beta (x,y) \, \mu (\mathrm {d}y)\), with \(\beta \) Lipschitz continuous, see [46]). To apply the aforementioned propagation of chaos result, we require that \(\xi \in L_p^{0}({\mathbb {R}})\) for \(p > 4\) (see also [14, Theorem 5.8]). Furthermore, we have

$$\begin{aligned} {\mathbb {E}} \left[ \sup _{t \in [0,T]} |X_t^{i,N} - X_t^{i,N,M}|^2 \right]&= {\mathbb {E}} \left[ \sup _{t \in [0,T]} |G^{-1}(Z_t^{i,N}) - G^{-1}(Z_t^{i,N,M})|^2 \right] \\&\le L_{G^{-1}}^2 {\mathbb {E}} \left[ \sup _{t \in [0,T]} |Z_t^{i,N} - Z_t^{i,N,M}|^2 \right] \le C h, \end{aligned}$$

since the SDEs for \((Z_t^{i,N})_{t \in [0,T]}\) and \((Z_t^{i,N,M})_{t \in [0,T]}\) have globally Lipschitz continuous coefficients. From these two estimates the claim follows. \(\square \)

4.2 Scheme 2: Euler–Maruyama without transformation (decomposable case)

As G and \(G^{-1}\) may be difficult to construct in multi-dimensional settings, and since the evaluation of the inverse at each time point can be computationally expensive, it would be preferable to discretise the particle system \((X^{i,N})_{i \in \lbrace 1, \ldots , N \rbrace }\) in time directly, without the use of the transformation G. In addition, a drawback of Scheme 1 is that an SDE with an additive diffusion term is transformed into one with multiplicative noise, so that the Euler–Maruyama scheme no longer coincides with the Milstein scheme. We employ an Euler–Maruyama scheme to discretise the particle system (3.2) and compute an approximate solution by

$$\begin{aligned} X_{t}^{i,N,M} = X^{i,N}_0 + \int _{0}^{t} \left( b_1(X_{\eta (s)}^{i,N,M}) + b_2(X_{\eta (s)}^{i,N,M}, \mu _{\eta (s)}^{{\varvec{X}}^{N,M}}) \right) \, \mathrm {d}s + \int _{0}^{t} \sigma (X_{\eta (s)}^{i,N,M}) \, \mathrm {d}W_s^{i}. \end{aligned}$$
(4.4)
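Since no transformation is involved, the scheme (4.4) is straightforward to implement. The following minimal sketch uses hypothetical example coefficients: a drift with a jump at zero, an interaction through the empirical mean, and constant \(\sigma \) (so that \(\sigma (0) \ne 0\)).

```python
import numpy as np

# Minimal sketch of Scheme 2: direct Euler-Maruyama for the particle system,
# no transformation. The coefficients are hypothetical examples: a drift with
# a jump at zero, interaction through the empirical mean, and constant sigma
# (so sigma(0) != 0, as required for non-degeneracy).
rng = np.random.default_rng(2)

def b1(x):
    return np.where(x < 0, 1.0, -1.0)   # discontinuous at 0

def b2(x, particles):
    return np.mean(particles) - x       # empirical-measure dependence

sigma = 1.0
N, M, T = 100, 200, 1.0
h = T / M
X = rng.normal(size=N)                  # i.i.d. initial particles
for n in range(M):
    dW = rng.normal(scale=np.sqrt(h), size=N)
    X = X + (b1(X) + b2(X, X)) * h + sigma * dW   # one step of (4.4)
```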

The following results are concerned with moment stability of the discretised particle system and estimates for the occupation time of the particle system in the neighbourhood of the set of discontinuities.

Moment stability:

We first remark that due to the linear growth of the coefficients in the state component, the Lipschitz continuity in the measure variable and the fact that all particles are identically distributed, we have the following result (see, e.g., [43] for details):

Proposition 4.1

Let Assumption (H.1) be satisfied, and let \(\xi \in L_p^{0}({\mathbb {R}})\) for \(p \ge 2\). Then, there exist constants \(C_1, C_2>0\), such that

$$\begin{aligned} \max _{i \in \lbrace 1, \ldots , N \rbrace } \max _{n \in \lbrace 0, \ldots , M \rbrace } {\mathbb {E}}\left[ |X_{t_n}^{i,N,M} |^p \right] \le C_1, \end{aligned}$$

and for all \(i \in \lbrace 1, \ldots , N \rbrace \) and for all \(t \in [0,T]\),

$$\begin{aligned} {\mathbb {E}} \left[ |X_{t}^{i,N,M} - X_{\eta (t)}^{i,N,M}|^p\right] \le C_2h^{p/2}. \end{aligned}$$
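The second bound reflects that, within one step, the scheme moves by \(b \, h + \sigma \varDelta W\), so the p-th moment of the increment is dominated by \({\mathbb {E}}|\sigma \varDelta W|^p = O(h^{p/2})\). A quick Monte Carlo illustration with hypothetical constant coefficients checks the case \(p=2\), where quartering h should roughly quarter the second moment:

```python
import numpy as np

# Monte Carlo check of the increment bound E|X_t - X_{eta(t)}|^p <= C h^{p/2}
# for p = 2: over one step the scheme moves by b*h + sigma*dW, so the second
# moment is b^2 h^2 + sigma^2 h, i.e. of order h. The constants b_val, sig
# are hypothetical.
rng = np.random.default_rng(3)
b_val, sig, n_paths = -1.0, 1.0, 200_000

def one_step_second_moment(h):
    dW = rng.normal(scale=np.sqrt(h), size=n_paths)
    incr = b_val * h + sig * dW        # X_t - X_{eta(t)} over a full step
    return float(np.mean(incr**2))

m_h  = one_step_second_moment(0.01)
m_h4 = one_step_second_moment(0.0025)
ratio = m_h / m_h4                     # approx. (0.01 / 0.0025)^{2/2} = 4
```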

Occupation time formula for (4.4):

Below, we will show an estimate of the expected occupation time of a fixed particle of the system defined by (4.4) in a neighbourhood of zero.

Proposition 4.2

Let Assumption (H.1) be satisfied and let i be an arbitrary but fixed particle index. Further, let \(\xi \in L_p^{0}({\mathbb {R}})\) for \(p \ge 2\) and let \((X_t^{i,N,M})_{t \in [0,T]}\) be given by (4.4). Then, there exists a constant \(C>0\) such that for all \(N, M \in {\mathbb {N}}\) and all sufficiently small \(\varepsilon >0\)

$$\begin{aligned} \int _{0}^{T} {\mathbb {P}} \left( \lbrace {\varvec{X}}_{t}^{N,M} \in \varTheta ^{i,\varepsilon } \rbrace \right) \, \mathrm {d}t \le C \varepsilon , \end{aligned}$$

where \( {\varvec{X}}_{t}^{N,M} = (X_{t}^{1,N,M},\ldots ,X_{t}^{N,N,M})^{\top } \in {\mathbb {R}}^{N}\) and \(\varTheta ^{i,\varepsilon }\) is given by

$$\begin{aligned} \varTheta ^{i,\varepsilon }:= \lbrace {\varvec{x}}^{N} =(x_1, \ldots , x_N)^{\top } \in {\mathbb {R}}^N: \ \exists {\varvec{y}}^N \in \varTheta ^i \text { with } |{\varvec{x}}^N-{\varvec{y}}^N| < \varepsilon \rbrace , \end{aligned}$$

with \(\varTheta ^i:=\{{\varvec{x}}^{N}=(x_1,\ldots ,x_N)^{\top } \in {\mathbb {R}}^{N} \ :\ x_i=0\}\).

Proof

We aim to apply [41, Theorem 2.7], which states the following: Let \((X_t)_{t \in [0,T]}\) be an \({\mathbb {R}}^d\)-valued Itô process

$$\begin{aligned} X_T = X_0 + \int _{0}^{T} A_t \, \mathrm {d}t + \int _{0}^{T} B_t \, \mathrm {d}W_t, \end{aligned}$$

with progressively measurable processes \(A=(A_t)_{t \in [0,T]}\) and \(B=(B_t)_{t \in [0,T]}\), where A is \({\mathbb {R}}^d\)-valued and B is \({\mathbb {R}}^{d \times d}\)-valued. The set of discontinuities, \(\varTheta \), is assumed to be a \({\mathcal {C}}^3\) hypersurface of positive reach. Namely, there exists \( \varepsilon > 0\) such that \(p(x)= {{\,\mathrm{arg\,min}\,}}_{y \in \varTheta } |x-y|\) is a single valued function of class \({\mathcal {C}}^3\) on the tubular neighbourhood \(\varTheta ^{\varepsilon }:= \lbrace x \in {\mathbb {R}}^{d}: \ \inf _{y \in \varTheta } |x-y| < \varepsilon \rbrace \) (see Definition 2.4 in [41] for details). Then, there exists a constant \(C>0\), such that for all sufficiently small \(\varepsilon >0\)

$$\begin{aligned} \int _{0}^{T} {\mathbb {P}} \left( \lbrace X_t \in \varTheta ^{\varepsilon } \rbrace \right) \, \mathrm {d}t \le C \varepsilon , \end{aligned}$$

assuming, additionally, that

  1. the processes A and B are almost surely bounded by a constant C if \(X_t(\omega )\) is in a small neighbourhood of \(\varTheta \), and

  2. there exists a constant \(C>0\) such that for almost all \(\omega \in \varOmega \), we have: If, for any \(t \in [0,T]\), \(X_t(\omega )\) is in a small neighbourhood of \(\varTheta \) then

    $$\begin{aligned} n^{\top }\left( p\left( X_t(\omega ) \right) \right) B_t^{\top }(\omega )B_t(\omega ) n\left( p\left( X_t(\omega ) \right) \right) \ge C, \end{aligned}$$

    where n(x) has length one and is orthogonal to the tangent space of \(\varTheta \) in x.

We now return to our particular model problem. First, we remark that \(\varTheta ^{i}\) satisfies all regularity conditions of [41, Theorem 2.7], i.e., it is a \({\mathcal {C}}^{3}\) hypersurface of positive reach. We then observe that the N-dimensional particle system can be rewritten as

$$\begin{aligned} \mathrm {d}{\varvec{X}}_{t}^{N,M} = {\varvec{B}}_N({\varvec{X}}_{\eta (t)}^{N,M}) \, \mathrm {d}t + \varvec{\varSigma }_N({\varvec{X}}_{\eta (t)}^{N,M}) \, \mathrm {d}{\varvec{W}}^{N}_t, \end{aligned}$$

where \({\varvec{W}}^{N}_t= (W^{1}_t, \ldots , W^{N}_t)^{\top }\) and \({\varvec{B}}_N: {\mathbb {R}}^N \rightarrow {\mathbb {R}}^N\) and \(\varvec{\varSigma }_N: {\mathbb {R}}^N \rightarrow {\mathbb {R}}^{N \times N}\) are defined by

$$\begin{aligned} {\varvec{B}}_N({\varvec{x}}^{N})&= (b(x_1,\mu ^{{\varvec{x}}^{N}}),\ldots ,b(x_N,\mu ^{{\varvec{x}}^{N}}))^{\top }, \\ \varvec{\varSigma }_N({\varvec{x}}^N)&= \text{ diag}_{N \times N}(\sigma (x_1),\ldots ,\sigma (x_N)). \end{aligned}$$

Further, we observe that there is a constant \(C>0\) such that: If, for any \(t \in [0,T]\) and \(\omega \in \varOmega \), \({\varvec{X}}_{t}^{N,M}(\omega )\) is in a small neighbourhood of \(\varTheta ^{i}\) then

$$\begin{aligned} n^{\top }\left( p\left( {\varvec{X}}_{t}^{N,M}(\omega ) \right) \right) \varvec{\varSigma }_N^{\top }({\varvec{X}}_{t}^{N,M}(\omega ))\varvec{\varSigma }_N({\varvec{X}}_{t}^{N,M}(\omega )) n\left( p\left( {\varvec{X}}_{t}^{N,M}(\omega ) \right) \right) \ge C, \end{aligned}$$

as \(\sigma \) is continuous and \(\sigma (0) \ne 0\). Also, note that a normal vector of the tangent space of \(\varTheta ^{i}\) is \(e_i\), i.e., the i-th unit vector. Further, a close inspection of the proof of [41, Theorem 2.7], shows that the boundedness assumption on the coefficients in a neighbourhood of \(\varTheta ^{i}\) is not needed in our case, due to the moment bound of an individual particle established in Proposition 4.1. Hence,

$$\begin{aligned} \int _{0}^{T} {\mathbb {P}} \left( \lbrace {\varvec{X}}_{t}^{N,M} \in \varTheta ^{i,\varepsilon } \rbrace \right) \, \mathrm {d}t \le C \varepsilon , \end{aligned}$$

where the constant \(C>0\) is independent of N, due to the fact that the normal vector is the i-th unit vector. \(\square \)
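The linear dependence on \(\varepsilon \) in Proposition 4.2 can also be observed numerically. The following Monte Carlo sketch, with hypothetical coefficients of the type covered by the assumptions (jump drift, mean interaction, unit diffusion), estimates the expected occupation time of the particles near zero for two values of \(\varepsilon \) and checks that it roughly halves when \(\varepsilon \) is halved:

```python
import numpy as np

# Monte Carlo illustration of the occupation-time bound: the expected time a
# particle of the Euler-Maruyama system spends in (-eps, eps) scales linearly
# in eps. The coefficients below are hypothetical examples (jump drift plus
# mean interaction, unit diffusion).
rng = np.random.default_rng(4)

def b(x, particles):
    return np.where(x < 0, 1.0, -1.0) + (np.mean(particles) - x)

N, M, T, reps = 50, 100, 1.0, 200
h = T / M

def occupation(eps):
    total = 0.0
    for _ in range(reps):
        X = rng.normal(size=N)
        for n in range(M):
            # average over the (exchangeable) particles, weighted by h
            total += h * float(np.mean(np.abs(X) < eps))
            X = X + b(X, X) * h + rng.normal(scale=np.sqrt(h), size=N)
    return total / reps

occ1, occ2 = occupation(0.2), occupation(0.1)
ratio = occ1 / occ2                 # close to 2 if occupation time is O(eps)
```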

Auxiliary proposition:

Based on the occupation time estimate from Proposition 4.2, we will prove the following result, which is needed in the proof of Theorem 4.2 given below.

Proposition 4.3

Let Assumption (H.1) be satisfied, and let \(\xi \in L_p^{0}({\mathbb {R}})\) for \(p \ge 8\). Furthermore, let \((X_t^{i,N,M})_{t \in [0,T]}\) be given by (4.4). Then, there exists a constant \(C>0\) (independent of N and M) such that for any \(t \in [0,T]\), we have

$$\begin{aligned} \max _{i \in \lbrace 1, \ldots , N \rbrace } {\mathbb {E}}\left[ \left| \int _{0}^{t} \left( G''(X_s^{i,N,M}) - G''(X_{\eta (s)}^{i,N,M}) \right) \sigma ^2(X_{\eta (s)}^{i,N,M}) \, \mathrm {d}s \right| ^2 \right] \le C h^{2/9}. \end{aligned}$$

Proof

First, observe that the linear growth of \(\sigma \) and the piecewise Lipschitz continuity of \(G''\) imply that there exists a constant \(C>0\) such that

$$\begin{aligned}&\left| \left( G''(X_s^{i,N,M}) - G''(X_{\eta (s)}^{i,N,M}) \right) \sigma ^2(X_{\eta (s)}^{i,N,M}) \right| \\&\quad \le {\left\{ \begin{array}{ll} C \left( 1+ (X_{\eta (s)}^{i,N,M})^2 \right) |X_s^{i,N,M} - X_{\eta (s)}^{i,N,M}|, \ X_s^{i,N,M} \notin (-\varepsilon ,\varepsilon ), |X_s^{i,N,M} - X_{\eta (s)}^{i,N,M}| < \varepsilon , \\ C \left( 1+ (X_{\eta (s)}^{i,N,M})^2 \right) , \text{ otherwise }, \end{array}\right. } \end{aligned}$$

where \(\varepsilon >0\) will be specified later. With this at hand, we derive

$$\begin{aligned}&{\mathbb {E}}\left[ \left| \int _{0}^{t} \left( G''(X_s^{i,N,M}) - G''(X_{\eta (s)}^{i,N,M}) \right) \sigma ^2(X_{\eta (s)}^{i,N,M}) \, \mathrm {d}s \right| ^2 \right] \\&\quad \le C \int _{0}^{t} {\mathbb {E}} \left[ \left| \left( G''(X_s^{i,N,M}) - G''(X_{\eta (s)}^{i,N,M}) \right) \sigma ^2(X_{\eta (s)}^{i,N,M}) \right| ^2 \right] \, \mathrm {d}s \\&\quad \le C \Bigg ( \int _{0}^{t} {\mathbb {E}} \left[ \left| \left( G''(X_s^{i,N,M}) - G''(X_{\eta (s)}^{i,N,M}) \right) \sigma ^2(X_{\eta (s)}^{i,N,M}) \right| ^2 \right. \\&\qquad \qquad \times \left. \left( \mathrm {I}_{\lbrace X_s^{i,N,M} \notin (-\varepsilon ,\varepsilon ) \rbrace } \mathrm {I}_{\lbrace |X_s^{i,N,M} - X_{\eta (s)}^{i,N,M}|< \varepsilon \rbrace } \right) \right] \, \mathrm {d}s \\&\qquad + \int _{0}^{t} {\mathbb {E}} \Big [ \left| \left( G''(X_s^{i,N,M}) - G''(X_{\eta (s)}^{i,N,M}) \right) \sigma ^2(X_{\eta (s)}^{i,N,M}) \right| ^2 \\&\qquad \qquad \times \left( \mathrm {I}_{\lbrace X_s^{i,N,M} \in (-\varepsilon ,\varepsilon ) \rbrace } + \mathrm {I}_{\lbrace X_s^{i,N,M} \notin (-\varepsilon ,\varepsilon ) \rbrace } \mathrm {I}_{\lbrace |X_s^{i,N,M} - X_{\eta (s)}^{i,N,M}| \ge \varepsilon \rbrace } \right) \Big ] \, \mathrm {d}s \Bigg ) \\&\quad \le C \Bigg ( \int _{0}^{t} {\mathbb {E}} \Bigg [ \left( 1+ (X_{\eta (s)}^{i,N,M})^4 \right) |X_s^{i,N,M} - X_{\eta (s)}^{i,N,M}|^2\\&\quad \qquad \times \left( \mathrm {I}_{\lbrace X_s^{i,N,M} \notin (-\varepsilon ,\varepsilon ) \rbrace } \mathrm {I}_{\lbrace |X_s^{i,N,M} - X_{\eta (s)}^{i,N,M}| < \varepsilon \rbrace } \right) \Bigg ] \, \mathrm {d}s + \int _{0}^{t} {\mathbb {E}} \Bigg [ \left( 1+ (X_{\eta (s)}^{i,N,M})^4 \right) \\&\quad \qquad \qquad \times \left( \mathrm {I}_{\lbrace X_s^{i,N,M} \in (-\varepsilon ,\varepsilon ) \rbrace } + \mathrm {I}_{\lbrace X_s^{i,N,M} \notin (-\varepsilon ,\varepsilon ) \rbrace } \mathrm {I}_{\lbrace |X_s^{i,N,M} - X_{\eta (s)}^{i,N,M}| \ge \varepsilon \rbrace } \right) \Bigg ] \, \mathrm {d}s \Bigg ) \\&\quad \le C \left( \varepsilon ^2 + \varepsilon ^{1/2} + \int _{0}^{t} \left( {\mathbb {P}}(|X_s^{i,N,M} - X_{\eta (s)}^{i,N,M}| \ge \varepsilon ) \right) ^{1/2} \, \mathrm {d}s \right) , \end{aligned}$$

where we used Hölder’s inequality and Propositions 4.1 and 4.2 in the last display. Markov’s inequality along with Proposition 4.1 implies that there exists a constant \(C>0\) such that

$$\begin{aligned} \left( {\mathbb {P}}(|X_s^{i,N,M} - X_{\eta (s)}^{i,N,M}| \ge \varepsilon ) \right) ^{1/2} \le \frac{\left( {\mathbb {E}} \left[ \left| X_s^{i,N,M} - X_{\eta (s)}^{i,N,M} \right| ^8 \right] \right) ^{1/2}}{\varepsilon ^4} \le \frac{C h^2}{\varepsilon ^4}. \end{aligned}$$

Choosing \(\varepsilon = h^{4/9}\) gives the result. \(\square \)
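The choice \(\varepsilon = h^{4/9}\) balances the competing terms in the final bound: with it, \(\varepsilon ^{1/2}\) and \(h^2/\varepsilon ^4\) are both of order \(h^{2/9}\), while \(\varepsilon ^2 = h^{8/9}\) is of higher order. A short arithmetic check:

```python
# Arithmetic check of the exponent balance: with eps = h^{4/9}, the terms
# eps^2, eps^{1/2} and h^2 / eps^4 have orders h^{8/9}, h^{2/9} and h^{2/9},
# so the overall bound is O(h^{2/9}).
h = 1e-6
eps = h ** (4 / 9)
terms = (eps ** 2, eps ** 0.5, h ** 2 / eps ** 4)
dominant = h ** (2 / 9)   # the two dominant terms coincide at this order
```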

We are now ready to present our main convergence result. In this case, we only obtain a strong convergence rate of order 1/9:

Theorem 4.2

Let Assumption (H.1) be satisfied, let \(\xi \in L_p^{0}({\mathbb {R}})\) for some \(p \ge 8\) and assume \(c < 1/|\alpha |\). Furthermore, let \((X_t^{i})_{t \in [0,T]}\) be the unique strong solution of (3.1) driven by the Brownian motion \((W_t^{i})_{t \in [0,T]}\) with initial data \(\xi ^{i}\), and \((X_t^{i,N,M})_{t \in [0,T]}\) given by (4.4). Then, there exists a constant \(C>0\) (independent of N and M) such that

$$\begin{aligned} \max _{i \in \lbrace 1, \ldots , N \rbrace } {\mathbb {E}}\left[ \sup _{t \in [0,T]} |X_t^{i} - X_t^{i,N,M}|^2 \right] \le C \left( h^{2/9} + N^{-1/2} \right) . \end{aligned}$$

Proof

Note that

$$\begin{aligned} {\mathbb {E}}\left[ \sup _{t \in [0,T]} |X_t^{i} - X_t^{i,N} |^2 \right]&\le L_{G^{-1}}^2 {\mathbb {E}} \left[ \sup _{t \in [0,T]} |G( X_t^{i}) - G(X_t^{i,N})|^2 \right] \nonumber \\&\le C N^{-1/2}, \end{aligned}$$
(4.5)

where in the last display, we used the pathwise propagation of chaos result as in the previous subsection. Further, we have

$$\begin{aligned}&{\mathbb {E}} \left[ \sup _{t \in [0,T]} |X_t^{i,N}-X_t^{i,N,M}|^2 \right] \le L_{G^{-1}}^2 {\mathbb {E}} \left[ \sup _{t \in [0,T]} |Z_t^{i,N} - G(X_t^{i,N,M})|^2 \right] \\&\quad \le C \left( {\mathbb {E}} \left[ \sup _{t \in [0,T]} |Z_t^{i,N}-Z_t^{i,N,M}|^2 \right] + {\mathbb {E}} \left[ \sup _{t \in [0,T]} |Z_t^{i,N,M} - G(X_t^{i,N,M})|^2 \right] \right) \\&\quad \le C \left( h + {\mathbb {E}} \left[ \sup _{t \in [0,T]} |Z_t^{i,N,M} - G(X_t^{i,N,M})|^2 \right] \right) , \end{aligned}$$

where in the last estimate, we employed standard strong convergence results for the Euler–Maruyama scheme applied to SDEs with globally Lipschitz continuous coefficients. Following similar arguments to [41] or [49], one further obtains, by applying Itô’s formula to \(G(X_t^{i,N,M})\),

$$\begin{aligned}&{\mathbb {E}} \left[ \sup _{t \in [0,T]} |Z_t^{i,N,M} - G(X_t^{i,N,M})|^2 \right] \\&\quad \le C\Bigg (h + \int _{0}^{T} {\mathbb {E}}\left[ \left| \left( G''(X_s^{i,N,M}) - G''(X_{\eta (s)}^{i,N,M}) \right) \sigma ^2(X_{\eta (s)}^{i,N,M}) \right| ^2 \right] \, \mathrm {d}s \\&\qquad + \int _{0}^{T} {\mathbb {E}} \left[ \sup _{s \in [0,t]} |Z_s^{i,N,M} - G(X_s^{i,N,M})|^2 \right] \, \mathrm {d}t \Bigg ), \end{aligned}$$

where the second summand on the right side is of order \(h^{2/9}\) due to Proposition 4.3. Hence, Gronwall’s inequality yields

$$\begin{aligned} {\mathbb {E}} \left[ \sup _{t \in [0,T]} |X_t^{i,N}-X_t^{i,N,M}|^2 \right] \le Ch^{2/9}, \end{aligned}$$
(4.6)

and the claim follows by combining (4.5) and (4.6). \(\square \)

Remark 4.1

The convergence rate in terms of the number of particles in the above theorem can again be improved to 1/2 if the drift has the form (1.2), see [46].

The convergence rate in terms of the number of time-steps established in Theorem 4.2 could be improved by employing exponential tail estimate techniques, as in [41]. The resulting strong convergence rate would be \(1/4-\varepsilon \), for an arbitrarily small \(\varepsilon >0\). However, to achieve this, one would need to assume boundedness of the coefficients in Eq. (3.1). Another possibility to recover a better convergence rate in our setting would be to require that the initial data satisfies \(\xi \in L_p^{0}({\mathbb {R}})\) for all \(p \ge 2\) and that \(\sigma \) is uniformly bounded. This would enable us to obtain sharper estimates when employing Markov’s inequality in the proof of Proposition 4.3. If we assume moment boundedness of the initial data of all orders, but allow \(\sigma \) to grow linearly, we would obtain a rate of order \(1/8-\varepsilon \).

Moreover, although we expect the optimal convergence rate of the Euler–Maruyama scheme applied to the interacting particle system to be 1/2 (as for equations with Lipschitz coefficients), we only achieve the order 1/9 (or \(1/4-\varepsilon \) under stronger assumptions on the initial data or the coefficients of the underlying McKean–Vlasov SDE). The bottleneck is the estimate of the probability that \(X_{\eta (s)}^{i,N,M}\) and \(X_s^{i,N,M}\) have different signs, i.e., the term \(|G''(X_s^{i,N,M}) - G''(X_{\eta (s)}^{i,N,M})|\) in the proof of Proposition 4.3 does not allow a Lipschitz-type estimate. Refined estimates of this expected sign change, as derived in [49] for one-dimensional SDEs, are not easy to prove for an interacting particle system. In particular, the proof of the main result in [49] is not applicable in our setting, as an individual particle (seen as a one-dimensional equation) is not Markovian (due to the dependence on the interaction terms), a property which is key in [49].

4.3 Scheme 1 for the non-decomposable case

Here, we first prove a propagation of chaos result in the case of non-decomposable drifts in Lemma 4.1. The time-discretisation error is then given in Theorem 4.3.

Lemma 4.1

Let Assumption (H.3) hold, let \(\xi \in L_p^{0}({\mathbb {R}})\) for \(p > 4\) and assume that \(b: {\mathbb {R}} \times {\mathcal {P}}_2({\mathbb {R}}) \rightarrow {\mathbb {R}}\) is uniformly bounded. Let \((X_t^{i})_{t \in [0,T]}\) be the unique strong solution of (3.3) driven by the Brownian motion \((W_t^{i})_{t \in [0,T]}\) with initial data \(\xi ^{i}\), and let \((X_t^{i,N})_{t \in [0,T]}\) be the solution to the associated particle system. Then, there exists a constant \(C>0\) (independent of N) such that

$$\begin{aligned} \max _{i \in \lbrace 1, \ldots , N \rbrace } \sup _{t \in [0,T]} {\mathbb {E}}\left[ |X_t^{i} - X_t^{i,N}|^2 \right] \le C N^{-1/2}. \end{aligned}$$

Proof

First, we observe, using the definitions \(\mu _t = {\mathcal {L}}_{X_t}\), \(\mu ^{N}_t(\mathrm {d}x) = \frac{1}{N} \sum _{j=1}^{N} \delta _{X_t^{j}}(\mathrm {d}x)\), and \(\mu _t^{{\varvec{X}}^{N},N-1}(\mathrm {d}x) = \frac{1}{N-1} \sum _{j \ne i} \delta _{X_t^{j,N}}(\mathrm {d}x)\), that

$$\begin{aligned}&\sup _{t \in [0,T]} {\mathbb {E}} \left[ |X_t^{i} - X_t^{i,N}|^2 \right] \nonumber \\&\quad = \sup _{t \in [0,T]} {\mathbb {E}} \left[ |G^{-1}(G(X^{i}_t,\mu _t),\mu _t) - G^{-1}(G(X_t^{i,N},\mu _t^{{\mathbf {X}}^{N},N-1}),\mu _t^{{\mathbf {X}}^{N},N-1})|^2 \right] \nonumber \\&\quad \le C \Bigg (\sup _{t \in [0,T]} {\mathbb {E}} \left[ |G(X^{i}_t,\mu _t)- G(X_t^{i,N},\mu _t^{{\mathbf {X}}^{N},N-1})|^2 \right] \nonumber \\&\qquad + \sup _{t \in [0,T]} L(c)\left( {\mathcal {W}}^{2}_2(\mu _t,\mu ^{N}_t) + {\mathcal {W}}^{2}_2(\mu ^{N}_t,\mu _t^{{\mathbf {X}}^{N},N-1}) \right) \Bigg ), \end{aligned}$$
(4.7)

where \(L(c) \rightarrow 0\) as \(c \rightarrow 0\) (see Proposition 3.3). Furthermore, [14, Theorem 5.8] implies that \({\mathcal {W}}^{2}_2(\mu _t,\mu ^{N}_t) \le CN^{-1/2}\) and, in addition, by the triangle inequality, we deduce

$$\begin{aligned} L(c) {\mathcal {W}}^{2}_2(\mu ^{N}_t,\mu _t^{{\mathbf {X}}^{N},N-1})&\le 2L(c) \left( \sup _{t \in [0,T]} {\mathbb {E}} \left[ |X_t^{i} - X_t^{i,N}|^2 \right] + {\mathcal {W}}^{2}_2(\mu _t^{{\mathbf {X}}^{N}},\mu _t^{{\mathbf {X}}^{N},N-1}) \right) \nonumber \\&\le 2L(c) \sup _{t \in [0,T]} {\mathbb {E}} \left[ |X_t^{i} - X_t^{i,N}|^2 \right] + CN^{-1/2}, \end{aligned}$$
(4.8)

where we used \({\mathcal {W}}^{2}_2(\mu _t^{{\varvec{X}}^{N}},\mu _t^{{\varvec{X}}^{N},N-1}) \le C N^{-1}\), which follows from [60, Lemma 3.1].

To summarise, taking (4.7) and (4.8) into account, we obtain

$$\begin{aligned} \sup _{t \in [0,T]} {\mathbb {E}} \left[ |X_t^{i} - X_t^{i,N}|^2 \right]&\le C \Bigg ( \sup _{t \in [0,T]} {\mathbb {E}} \left[ |G(X^{i}_t,\mu _t)- G(X_t^{i,N},\mu _t^{{\mathbf {X}}^{N}})|^2 \right] \\&\quad + \sup _{t \in [0,T]} {\mathbb {E}} \left[ |G(X_t^{i,N},\mu _t^{{\mathbf {X}}^{N}})- G(X_t^{i,N},\mu _t^{{\mathbf {X}}^{N},N-1})|^2 \right] + N^{-1/2} \Bigg ). \end{aligned}$$

A similar analysis to the above can be employed to handle the second term; hence, we arrive at

$$\begin{aligned} \sup _{t \in [0,T]} {\mathbb {E}} \left[ |X_t^{i} - X_t^{i,N}|^2 \right]&\le C \left( \sup _{t \in [0,T]} {\mathbb {E}} \left[ |G(X^{i}_t,\mu _t)- G(X_t^{i,N},\mu _t^{{\mathbf {X}}^{N}})|^2 \right] + N^{-1/2} \right) . \end{aligned}$$
(4.9)

To further estimate (4.9), we apply Itô’s formula to derive

$$\begin{aligned}&G(X^{i}_t,\mu _t)- G(X_t^{i,N},\mu _t^{{\mathbf {X}}^{N}}) \\&\quad = G(X^{i}_0,\mu _0)- G(X_0^{i,N},\mu _0^{{\mathbf {X}}^{N}}) \\&\qquad + \int _{0}^{t} \left( b(X^{i}_s,\mu _s) + \alpha ({\mu _s}) {\bar{\phi }}'(X^{i}_s) b(X^{i}_s,\mu _s) + \frac{1}{2} \alpha ({\mu _s}) {\bar{\phi }}''(X^{i}_s) \sigma ^2(X^{i}_s) \right) \, \mathrm {d}s \\&\qquad - \int _{0}^{t} \Bigg (b(X^{i,N}_s,\mu _s^{{\mathbf {X}}^{N}}) + \alpha (\mu _s^{{\mathbf {X}}^{N}}) {\bar{\phi }}'(X^{i,N}_s) b(X^{i,N}_s,\mu _s^{{\mathbf {X}}^{N}}) \\& + \frac{1}{2} \alpha (\mu _s^{{\mathbf {X}}^{N}}) {\bar{\phi }}''(X^{i,N}_s) \sigma ^2(X^{i,N}_s) \Bigg ) \, \mathrm {d}s \\&\qquad + \int _{0}^{t} \left( \sigma (X^{i}_s) + \alpha ({\mu _s}) {\bar{\phi }}'(X^{i}_s)\sigma (X^{i}_s) - \sigma (X^{i,N}_s) - \alpha (\mu _s^{{\mathbf {X}}^{N}}) {\bar{\phi }}'(X^{i,N}_s)\sigma (X^{i,N}_s) \right) \, \mathrm {d}W^{i}_s \\&\qquad + \int _{0}^{t} \Big ( \partial _s G(X^{i}_s,\mu _s) - \frac{1}{2N} \sum _{k =1}^{N} \partial _y \partial _{\mu }G(X_s^{i,N},\mu _s^{{\mathbf {X}}^{N}})(X_s^{k,N}) \sigma ^2(X^{k,N}_s) \\& - \frac{1}{N} \sum _{k =1}^{N} \partial _{\mu } G(X_s^{i,N},\mu _s^{{\mathbf {X}}^{N}})(X_s^{k,N}) b(X^{k,N}_s,\mu _s^{{\mathbf {X}}^{N}}) \Big ) \, \mathrm {d}s \\&\qquad - \int _{0}^{t} \frac{1}{N} \sum _{k =1}^{N} \partial _{\mu } G(X_s^{i,N},\mu _s^{{\mathbf {X}}^{N}})(X_s^{k,N}) \sigma (X^{k,N}_s) \, \mathrm {d}W_s^{k} \\&\qquad - \int _{0}^{t} \frac{1}{2N^{2}} \sum _{k=1}^{N}\partial ^2_{\mu }G(X_s^{i,N},\mu _s^{{\mathbf {X}}^{N}})(X_s^{k,N},X_s^{k,N}) \sigma ^2(X_s^{k,N}) \, \mathrm {d}s \\&\qquad - \int _{0}^{t} \frac{1}{N} \partial _{x_i} \partial _{\mu }G(X_s^{i,N},\mu _s^{{\mathbf {X}}^{N}})(X_s^{i,N})\sigma ^2(X_s^{i,N}) \, \mathrm {d}s =: \sum _{j=1}^{7} \varPi _j. \nonumber \end{aligned}$$

Due to (H.3(3)), it is clear that \({\mathbb {E}}[|\varPi _6|^2] + {\mathbb {E}}[|\varPi _7|^2] \le CN^{-1}\). For the term \(\varPi _5\), we derive, using BDG’s inequality,

$$\begin{aligned}&\mathbb {E}[|\varPi _5|^2] \le \frac{1}{N}\mathbb {E}\Bigg [\sum _{k=1}^{N} C \int _{0}^{t}\Big ( \partial _{\mu } G(X_s^{i,N},\mu _s^{\varvec{X}^{N}})(X_s^{k,N}) \sigma (X^{k,N}_s) \\&\qquad - \partial _{\mu }\alpha (\mu _s)(X_s^{k})\bar{\phi }(X^{k}_s) \sigma (X^{k}_s) \Big )^2 \, \mathrm {d}s \\&\quad +\sum _{k,l=1}^{N} \int _{0}^{t}\partial _{\mu }\alpha (\mu _s)(X_s^{k})\bar{\phi }(X^{k}_s) \sigma (X^{k}_s) \, \mathrm {d}W_s^{k} \int _{0}^{t} \partial _{\mu }\alpha (\mu _s)(X_s^{l})\bar{\phi }(X^{l}_s) \sigma (X^{l}_s) \, \mathrm {d}W_s^{l} \Bigg ]. \end{aligned}$$

Therefore, taking into account the Lipschitz continuity of \((x,\mu ) \mapsto \partial _{\mu }\alpha (\mu )(x) {\bar{\phi }}(x) \sigma (x)\) and the independence of \((X_t^{k})_{k \in \lbrace 1, \ldots , N \rbrace }\) for \(t \in [0,T]\), and using the triangle inequality \({\mathcal {W}}_2(\mu _s^{{\varvec{X}}^{N}},\mu _s) \le {\mathcal {W}}_2(\mu _s^{{\varvec{X}}^{N}},\mu ^{N}_s) + {\mathcal {W}}_2(\mu _s^N,\mu _s)\) along with [14, Theorem 5.8], we obtain

$$\begin{aligned} {\mathbb {E}}[|\varPi _5|^2]&\le C \left( \int _{0}^{T} {\mathbb {E}} \left[ |X_t^{i} - X_t^{i,N}|^2 \right] \, \mathrm {d}t + N^{-1/2} \right) . \end{aligned}$$

Arguing as in Lemma 3.1, combined with BDG’s inequality and Hölder’s inequality, we deduce

$$\begin{aligned}&{\mathbb {E}}[|\varPi _1|^2]+ {\mathbb {E}}[|\varPi _2|^2]+ {\mathbb {E}}[|\varPi _3|^2] + {\mathbb {E}}[|\varPi _4|^2] \\&\quad \le C \left( \int _{0}^{T} {\mathbb {E}} \left[ |X_t^{i} - X_t^{i,N}|^2 \right] \, \mathrm {d}t + N^{-1/2} \right) . \end{aligned}$$

Therefore, inserting the estimates for \({\mathbb {E}}[|\varPi _1|^2], \ldots , {\mathbb {E}}[|\varPi _7|^2]\) back into (4.9), we deduce the claim using Gronwall’s inequality. \(\square \)

The following hybrid explicit-implicit time-stepping algorithm computes a discrete-time approximation of \((X_t^{i,N})_{t \in [0,T]}\), denoted by \(X_{t_n}^{i,N,M}\) for \(n \in \lbrace 0, \ldots , M \rbrace \) and \(i \in \lbrace 1, \ldots , N \rbrace \):

  • Set \({\tilde{X}}^{i,N,M}_{t_0} = G(X^{i,N}_0,\mu _0^{{\varvec{X}}^{N}})\) and \(X^{i,N,M}_{t_0} = X^{i,N}_0= \xi ^{i}\).

  • For \(n \ge 1\), compute

    $$\begin{aligned} {\tilde{X}}^{i,N,M}_{t_n} &= {\tilde{X}}^{i,N,M}_{t_{n-1}} + B_i({\tilde{X}}^{1,N,M}_{t_{n-1}}, \ldots , {\tilde{X}}^{N,N,M}_{t_{n-1}}) h \\ &\quad + \sum _{j=1}^{N}\varSigma ^{i,j}({\tilde{X}}^{1,N,M}_{t_{n-1}}, \ldots , {\tilde{X}}^{N,N,M}_{t_{n-1}}) \varDelta W^{j}_n, \end{aligned}$$

    where \(B_i\) and \(\varSigma ^{i,j}\) are defined by (3.16) and (3.17), respectively.

  • Find \(X^{i,N,M}_{t_n}\) such that \(X^{i,N,M}_{t_n} = G^{-1}({\tilde{X}}^{i,N,M}_{t_n},\mu _{t_n}^{{\varvec{X}}^{N,M}})\), with \(\mu _{t_n}^{{\varvec{X}}^{N,M}}(\mathrm {d}x) = \frac{1}{N} \sum _{j=1}^{N} \delta _{X^{j,N,M}_{t_n}}(\mathrm {d}x)\).
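The loop structure of this algorithm can be sketched in Python. The transform \(G\), the coefficient \(\alpha \), the bump \({\bar{\phi }}\) and the drift below are simplified stand-ins of our own choosing (in particular, the measure-derivative contributions to \(B_i\) and \(\varSigma ^{i,j}\) are dropped), so this illustrates only the explicit step in the transformed variable followed by the fixed-point inversion of the implicit step, not the paper's exact scheme:

```python
import numpy as np

# Toy stand-ins (assumptions, not the paper's construction): alpha(mu) is a
# measure-dependent coefficient and phibar a smooth bounded bump, so that
# G(x, mu) = x + alpha(mu) * phibar(x) is invertible in x for small alpha.
def alpha(X):
    return 0.1 / (1.0 + np.mean(X ** 2))

def phibar(x):
    return np.tanh(x)

def phibar_prime(x):
    return 1.0 - np.tanh(x) ** 2

def G(x, X):
    return x + alpha(X) * phibar(x)

def b(x, X):
    # toy drift: discontinuity at zero plus mean-field interaction
    return np.where(x <= 0, -0.5, 0.5) + np.mean(X) - x

def sigma(x):
    return 0.7 + 0.0 * x

def invert_step(Z, X0, iters=60):
    # Implicit step: solve X = G^{-1}(Z, mu^X), i.e. the fixed point
    # X = Z - alpha(X) * phibar(X), a contraction for small alpha.
    X = X0.copy()
    for _ in range(iters):
        X = Z - alpha(X) * phibar(X)
    return X

def simulate(N=200, M=100, T=1.0, seed=0):
    # Hybrid scheme: explicit Euler-Maruyama step in tilde X = G(X, mu),
    # then recover X by inverting G with the new empirical measure.
    rng = np.random.default_rng(seed)
    h = T / M
    X = rng.normal(size=N)      # xi^i
    Z = G(X, X)                 # tilde X at t_0
    for _ in range(M):
        dW = rng.normal(scale=np.sqrt(h), size=N)
        fac = 1.0 + alpha(X) * phibar_prime(X)   # leading term of the chain rule
        Z = Z + fac * b(X, X) * h + fac * sigma(X) * dW
        X = invert_step(Z, X)
    return X
```

The fixed-point iteration in `invert_step` plays the role of solving \({\varvec{F}}_N = 0\) in Remark 4.2 below; it converges here because the toy \(\alpha \) is uniformly small.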

Remark 4.2

The implicit function theorem applied to the function

$$\begin{aligned} {\varvec{F}}_N({\varvec{x}}^{N},{\varvec{y}}^{N}) = {\varvec{y}}^{N}- \left( G^{-1}(x_1,\mu ^{{\varvec{y}}^{N}}), \ldots , G^{-1}(x_N,\mu ^{{\varvec{y}}^{N}}) \right) ^{\top }, \quad {\varvec{x}}^{N}, {\varvec{y}}^{N} \in {\mathbb {R}}^{N}, \end{aligned}$$

implies that we can express \({\varvec{y}}^{N}\) in terms of \({\varvec{x}}^{N}\). The applicability of the implicit function theorem follows from similar arguments to the ones presented in Lemma 3.2 along with Proposition A.1.

Remark 4.3

We could also define an explicit scheme by setting \(X^{i,N,M}_{t_n} = G^{-1}({\tilde{X}}^{i,N,M}_{t_n},\mu _{t_n}^{\tilde{{\varvec{X}}}^{N,M}})\), with \(\mu _{t_n}^{\tilde{{\varvec{X}}}^{N,M}}(\mathrm {d}x) = \frac{1}{N} \sum _{j=1}^{N} \delta _{{\tilde{X}}^{j,N,M}_{t_n}}(\mathrm {d}x)\). However, to derive a strong convergence rate for the resulting scheme, one has to analyse the quantity \({\mathbb {E}}[|X^{i,N,M}_{t_n}-{\tilde{X}}^{i,N,M}_{t_n}|^2]\). Similar arguments as for Scheme 2 could possibly be used here, but our current analysis does not allow us to derive an optimal convergence rate in h.

Theorem 4.3

Let Assumption (H.3) hold, let \(\xi \in L_p^{0}({\mathbb {R}})\) for \(p > 4\) and assume that \(b: {\mathbb {R}} \times {\mathcal {P}}_2({\mathbb {R}}) \rightarrow {\mathbb {R}}\) is uniformly bounded. Let \((X_t^{i})_{t \in [0,T]}\) be the unique strong solution of (3.3) driven by the Brownian motion \((W_t^{i})_{t \in [0,T]}\) with initial data \(\xi ^{i}\), and \(X_{t_n}^{i,N,M}\) for \(n \in \lbrace 0, \ldots , M\rbrace \) be defined by the above algorithm. Then, there exists a constant \(C>0\) (independent of N and M) such that

$$\begin{aligned} \max _{i \in \lbrace 1, \ldots , N \rbrace } \max _{n \in \lbrace 0, \ldots , M \rbrace } {\mathbb {E}}\left[ |X_{t_n}^{i} - X_{t_n}^{i,N,M}|^2 \right] \le C (N^{-1/2} + h). \end{aligned}$$

Proof

We start with the observation that for any \(n \in \lbrace 0, \ldots , M \rbrace \)

$$\begin{aligned}&|X_{t_n}^{i,N} - X_{t_n}^{i,N,M}|^2 \\&= |G^{-1}(G(X_{t_n}^{i,N},\mu _{t_n}^{{\varvec{X}}^{N},N-1}),\mu _{t_n}^{{\varvec{X}}^{N},N-1})-G^{-1}({\tilde{X}}^{i,N,M}_{t_n},\mu _{t_n}^{{\varvec{X}}^{N,M}})|^2 \\&\le 2|G^{-1}(G(X_{t_n}^{i,N},\mu _{t_n}^{{\varvec{X}}^{N},N-1}),\mu _{t_n}^{{\varvec{X}}^{N},N-1})-G^{-1}(G(X_{t_n}^{i,N},\mu _{t_n}^{{\varvec{X}}^{N}}),\mu _{t_n}^{{\varvec{X}}^{N}})|^2 \\&\quad + 2|G^{-1}(G(X_{t_n}^{i,N},\mu _{t_n}^{{\varvec{X}}^{N}}),\mu _{t_n}^{{\varvec{X}}^{N}})-G^{-1}({\tilde{X}}^{i,N,M}_{t_n},\mu _{t_n}^{{\varvec{X}}^{N,M}})|^2. \end{aligned}$$

Using the arguments from the previous lemma, the first term can be shown to be of order \({\mathcal {O}}(N^{-1})\). For the second term, we derive the estimate

$$\begin{aligned}&|G^{-1}(G(X_{t_n}^{i,N},\mu _{t_n}^{{\varvec{X}}^{N}}),\mu _{t_n}^{{\varvec{X}}^{N}})-G^{-1}({\tilde{X}}^{i,N,M}_{t_n},\mu _{t_n}^{{\varvec{X}}^{N,M}})|^2 \\&\quad \le C \left( |G(X_{t_n}^{i,N},\mu _{t_n}^{{\varvec{X}}^{N}}) - {\tilde{X}}^{i,N,M}_{t_n}|^2 + L(c){\mathcal {W}}_2^{2}(\mu _{t_n}^{{\varvec{X}}^{N}},\mu _{t_n}^{{\varvec{X}}^{N,M}}) \right) , \end{aligned}$$

where \(L(c) \rightarrow 0\) as \(c \rightarrow 0\), as in Proposition 3.3.

Therefore, we get

$$\begin{aligned} {\mathbb {E}} \left[ |X_{t_n}^{i,N} - X_{t_n}^{i,N,M}|^2 \right] \le C \left( N^{-1} + {\mathbb {E}} \left[ |G(X_{t_n}^{i,N},\mu _{t_n}^{{\varvec{X}}^{N}}) - {\tilde{X}}^{i,N,M}_{t_n}|^2 \right] \right) . \end{aligned}$$

Using now the definition of \({\tilde{X}}^{i,N,M}_{t_n}\) and Lemma 3.3, one shows that there exists a constant \(C>0\) such that \({\mathbb {E}} \left[ |G(X_{t_n}^{i,N},\mu _{t_n}^{{\varvec{X}}^{N}}) - {\tilde{X}}^{i,N,M}_{t_n}|^2 \right] \le Ch\). This together with Lemma 4.1 yields the claim. \(\square \)

5 Numerical illustration

In the following, we will present two examples of McKean–Vlasov SDEs and interacting particle systems exhibiting discontinuous drifts in order to motivate the theoretical study of such equations and to numerically illustrate the strong convergence behaviour of an Euler–Maruyama scheme. These models find applications in biology and mathematical finance, in particular systemic risk.

As we do not know the exact solution of the considered equations, the convergence rate (in terms of the number of time-steps) is determined by comparing two solutions computed on a fine and a coarse grid, respectively, driven by the same Brownian paths. In order to illustrate the strong convergence behaviour in the uniform time-step h, we compute the root-mean-square error (RMSE) by comparing the numerical solution at level l of the time discretisation with the solution at level \(l-1\) at the final time \(T=1\). To be precise, as error measure we use the quantity

$$\begin{aligned} \text {RMSE}:= \sqrt{\frac{1}{N} \sum _{i=1}^{N} \left( X_T^{i,N,M_l} - X_T^{i,N,M_{l-1}} \right) ^2}, \end{aligned}$$

where \(M_l = 2^lT\) and by \(X_T^{i,N,M_l}\) we denote the approximation of X at time T computed based on N particles and \(2^lT\) time-steps. The number of particles used in the tests will be specified below.
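The level comparison can be sketched as follows; `euler_paths` and `rmse_between_levels` are hypothetical helper names of ours, and the coupling of the two levels is realised by summing the fine-level Brownian increments in pairs to obtain the coarse-level ones:

```python
import numpy as np

def euler_paths(drift, diff, x0, T, M, dW):
    # Euler-Maruyama for N particles over M steps, driven by prescribed
    # Brownian increments dW of shape (M, N), so levels stay coupled.
    h = T / M
    X = x0.copy()
    for n in range(M):
        X = X + drift(X) * h + diff(X) * dW[n]
    return X

def rmse_between_levels(drift, diff, x0, T, l, rng):
    # Same Brownian path on levels l and l-1: the fine increments are summed
    # in pairs to produce the coarse increments.
    M_f = 2 ** l * int(T)
    dW_f = rng.normal(scale=np.sqrt(T / M_f), size=(M_f, x0.size))
    dW_c = dW_f[0::2] + dW_f[1::2]
    X_f = euler_paths(drift, diff, x0, T, M_f, dW_f)
    X_c = euler_paths(drift, diff, x0, T, M_f // 2, dW_c)
    return np.sqrt(np.mean((X_f - X_c) ** 2))
```

Here `drift` and `diff` act on the whole particle vector, so mean-field interactions (e.g. through the empirical mean) fit into the same interface.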

5.1 Neuronal interactions

In this section, we provide a numerical illustration for a specific model for neuronal interactions. Interacting particle systems are ubiquitous in neuroscience, such as the Hodgkin–Huxley model [2, 8] or mean-field equations describing the behaviour of a (large) network of interacting spiking neurons [23]. For other mean-field models appearing in neuroscience, we refer to the references given in [23].

A recent model of the action potential of neurons is described in [25] and involves discontinuous coefficients. Discontinuities arise for the following reason: after a charging phase of an individual neuron, subject to spikes of nearby neurons, randomness, and the effect of discharge at a constant rate, the neuron emits a spike to the network once a certain threshold is hit and then enters a recovery phase. The transition between these two phases is characterised by a discontinuity in the dynamics describing the potential of each neuron.

The action potential of N interacting neurons at time \(t \in [0,T]\), \(V^{i,N}_t (\text {mod } 2) \in [0,2)\), where \(V^{i,N}_t \in {\mathbb {R}}\), \(i \in \lbrace 1, \ldots , N \rbrace \), is modelled by the discontinuous mean-field equation

$$\begin{aligned} \mathrm {d}V^{i,N}_t&= \lambda (V^{i,N}_t (\text {mod } 2)) \, \mathrm {d}t + \sigma ^{\varepsilon }(V^{i,N}_t (\text {mod } 2)) \, \mathrm {d}W^{i}_t \\&\quad + \frac{1}{N}\sum _{j=1}^N \varTheta (\xi _i,\xi _j) \mathrm {I}_{[1,1+\kappa ]}(V^{j,N}_t (\text {mod } 2)) \mathrm {I}_{[0,1]}(V^{i,N}_t (\text {mod } 2)) \, \mathrm {d}t, \end{aligned}$$

with square-integrable random initial values \(V_0^{i,N} = \eta _i\), for \(i \in \lbrace 1, \ldots , N \rbrace \) and \(0< \kappa < 1\) fixed. The set of i.i.d. random variables \(\lbrace \xi _1, \ldots , \xi _N \rbrace \), \(\xi _i \in D\), describes the locations of the N (non-moving) neurons, where D is modelled as an open connected domain of \({\mathbb {R}}^3\). Hence, the position of each neuron is given by \(X_t^{i,N} = \xi ^{i}\) at each time \(t \in [0,T]\). It is further assumed that \(\lbrace \xi _1, \ldots , \xi _N, \eta _1, \ldots , \eta _N \rbrace \) are independent for each integer \(N \ge 1\). The standard Brownian motions \((W_t^{i})_{t \in [0,T]}\) are independent, and independent of \(\xi _i\) and \(\eta _i\), for \(i \in \lbrace 1, \ldots , N \rbrace \). Observe that \(V^{i,N}\) is specified by an SDE whose drift has a discontinuity in the state variable (due to the choice of \(\lambda \); see below for details), but also jumps whenever a particle \(j \ne i\) reaches one of the critical values 1 or \(1+\kappa \).
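The mean-field spiking term in the drift above can be computed in vectorised form. The following sketch uses the kernel \(\varTheta (x,y) = \sin (|x-y|)\) from the conditions below; the function name is ours:

```python
import numpy as np

def interaction_drift(V, xi, kappa):
    # Mean-field spiking term of the neuron model: particle j in the spiking
    # window [1, 1+kappa] (mod 2) pushes particle i while i is in its
    # charging phase [0, 1] (mod 2), weighted by Theta(xi_i, xi_j) and
    # averaged over the N particles.
    v = np.mod(V, 2.0)
    spiking = (v >= 1.0) & (v <= 1.0 + kappa)    # I_{[1,1+kappa]}(V^j mod 2)
    charging = (v >= 0.0) & (v <= 1.0)           # I_{[0,1]}(V^i mod 2)
    # Theta(x, y) = sin(|x - y|) for the 3-d neuron locations xi (shape (N, 3))
    Theta = np.sin(np.linalg.norm(xi[:, None, :] - xi[None, :, :], axis=-1))
    return charging * (Theta @ spiking.astype(float)) / len(V)
```

Since \(|\varTheta | \le 1\) and the indicators are bounded by one, each component of the returned drift lies in \([-1,1]\).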

The following conditions are imposed in [25] to guarantee strong well-posedness of the particle system above (see [25, Theorem 2.2]) and of the associated McKean–Vlasov equation (see [25, Theorem 6.1]); propagation of chaos type results, i.e., weak convergence of the law of the empirical distribution of \((\xi ^{i},V^{i,N})\) to a Dirac measure centred at the law of the solution to the underlying McKean–Vlasov equation, are shown in [25, Theorem 5.7]:

  1. 1.

    \(\lambda (v) = -{\hat{\lambda }}v \mathrm {I}_{[0,1]}(v) + \mathrm {I}_{(1,2)}(v), \quad {\hat{\lambda }} >0\),    \(\varTheta (x,y) = \sin (|x-y|)\), for \(x,y \in {\mathbb {R}}^3\);

  2. 2.

    \( \sigma ^{\varepsilon }\) is a \({\mathcal {C}}_b^{1}([0,2])\) function satisfying \( \sigma ^{\varepsilon } \ge \sqrt{2 \varepsilon } > 0\) and

    $$\begin{aligned}&\sigma ^{\varepsilon }(v) = \sqrt{2\varepsilon } \text { on } [1,2], \quad \sigma ^{\varepsilon }(2) = \sigma ^{\varepsilon }(0) = \sqrt{2 \varepsilon }, \quad (\sigma ^{\varepsilon })'(0) = (\sigma ^{\varepsilon })'(2) = 0, \\&\quad \text { with } \varepsilon >0 \text { fixed}. \end{aligned}$$

These conditions specify our Example 1. For Example 2, the diffusion term was chosen as \(\sigma ^{\varepsilon }(x) = \sqrt{2\varepsilon }+x\). For our tests, we used \(N=10^3\). Furthermore, we set \(\kappa =0.01\), \({\hat{\lambda }} = 0.02\) and \(\varepsilon = 0.1\). The initial values \(\eta _1, \ldots , \eta _N\) are chosen as independent normal random variables with mean 1 and standard deviation 2, taken modulo 2. The variables \(\xi _1, \ldots , \xi _N\) are independent three-dimensional random variables drawn from the same multivariate normal distribution with some given mean vector and covariance matrix.
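One concrete function satisfying the conditions on \(\sigma ^{\varepsilon }\) (equal to \(\sqrt{2\varepsilon }\) on \([1,2]\), bounded below by \(\sqrt{2\varepsilon }\), with vanishing derivatives at the endpoints) can be built from a smooth bump on \([0,1]\). The particular bump below is our choice; the conditions only require the listed properties:

```python
import numpy as np

def sigma_eps(v, eps=0.1, delta=0.2):
    # A C^1 diffusion coefficient on [0, 2] (taken mod 2): constant
    # sqrt(2*eps) on [1, 2], plus a bump delta*sin(pi*v)^2 on [0, 1] that
    # vanishes with zero slope at v = 0 and v = 1, so values and first
    # derivatives match at 0, 1 and 2.
    v = np.mod(np.asarray(v, dtype=float), 2.0)
    base = np.sqrt(2.0 * eps)
    bump = np.where(v <= 1.0, delta * np.sin(np.pi * v) ** 2, 0.0)
    return base + bump
```

By construction \(\sigma ^{\varepsilon } \ge \sqrt{2\varepsilon }\) everywhere, as required for the non-degeneracy condition.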

We investigate numerically the convergence of Scheme 2, i.e., the Euler–Maruyama scheme without applying any transformations. In Fig. 1, we observe strong convergence of order 3/4 for Example 1, which is most likely due to the choice of the constant diffusion \(\sigma ^{\varepsilon } = \sqrt{2\varepsilon }\). In [50], a Milstein scheme for one-dimensional SDEs with discontinuities in the drift was derived and a strong convergence order of 3/4 was proven. In addition, it is conjectured in [50] (Conjectures 1 and 2) that the rate 3/4 is optimal.

Fig. 1

Strong convergence of the Euler–Maruyama scheme applied to the particle system obtained by approximating the equation for the action potential of the neurons

5.2 Systemic risk

In this section, we consider a McKean–Vlasov SDE of the form

$$\begin{aligned} \mathrm {d}X_t = \left( a \left( {\mathbb {E}}[X_t]- X_t \right) + \kappa _1 \mathrm {I}_{ \lbrace X_t \le 0 \rbrace } + \kappa _2 \mathrm {I}_{\lbrace X_t > 0 \rbrace } \right) \, \mathrm {d}t + (\sigma + X_t) \, \mathrm {d}W_t, \quad X_0=x \in {\mathbb {R}}, \end{aligned}$$
(5.1)

where \(a \ge 0\) is the mean-reversion rate, \(\kappa _1 < 0\), \(\kappa _2>0\) and \(\sigma >0\). The strong well-posedness of (5.1) follows from Proposition 3.1. This equation can be linked to a model of systemic risk in [16], where a mean-field game of N banks borrowing from, and lending to, a central bank is proposed. The banks control the rate of their borrowing depending on their (log-)monetary reserves, which are modelled by a system of SDEs with interaction through their average. In this setting flocking, and thus systemic default events, may occur. Here, we give a slight reformulation of this problem following [28, Section 4]. The problem consists in finding \(({\hat{\mu }}_t,{\hat{\beta }}_t)_{t \in [0,T]}\), where \(({\hat{\mu }}_t)_{t \in [0,T]}\) is a flow of measures in \({\mathcal {C}}([0,T],{\mathcal {P}}_2({\mathbb {R}}))\) and \(({\hat{\beta }}_t)_{t \in [0,T]}\) is an adapted, square-integrable control process (describing the rate of borrowing from, or lending to, the central bank), such that \(({\hat{\beta }}_t)_{t \in [0,T]}\) minimises the objective function given by

$$\begin{aligned} J^{{\hat{\mu }}}(x;\beta )= {\mathbb {E}}&\left[ \int _{0}^{T}\left( r |\beta _t| + \frac{\varepsilon }{2} \left( X^{{\hat{\mu }},\beta }_t - \int _{{\mathbb {R}}} x \, {\hat{\mu }}_t(\mathrm {d}x) \right) ^2 \right) \, \mathrm {d}t \right. \\&\qquad \qquad \quad \left. + \frac{c}{2} \left( X^{{\hat{\mu }},\beta }_T - \int _{{\mathbb {R}}} x \, {\hat{\mu }}_T(\mathrm {d}x) \right) ^2 \right] , \end{aligned}$$

for \(r, \varepsilon , c \ge 0\), where

$$\begin{aligned} \mathrm {d}X^{{\hat{\mu }},\beta }_t = \left( a \left( \int _{{\mathbb {R}}} x \, {\hat{\mu }}_t(\mathrm {d}x)- X^{{\hat{\mu }},\beta }_t \right) + \beta _t \right) \, \mathrm {d}t + (\sigma + X^{{\hat{\mu }},\beta }_t) \, \mathrm {d}W_t, \quad X^{{\hat{\mu }},\beta }_0=x \in {\mathbb {R}}, \end{aligned}$$
(5.2)

and \({\hat{\mu }}_t = {\mathcal {L}}_{X^{{\hat{\mu }},{\hat{\beta }}}_t}\) for all \(t \in [0,T]\).

For simplicity, we did not add the common noise term as in [16]. In addition, we modified the diffusion to allow it to be degenerate. In [28], a constraint \(\beta _t \in [\kappa _1,\kappa _2]\) on the borrowing/lending rate is imposed for all \(t \in [0,T]\). The minimiser of the objective function for this mean-field game (with constant diffusion term) is shown to be a control of bang-zero-bang type (see [28, equation (24)] for an analytic expression). Written in feedback form, this optimal control strategy has discontinuities that are time-dependent, i.e., the zero-control region changes over time, a setting not covered by our current analysis.

Using instead the special bang-bang type control \((\beta ^{*}_t)_{t \in [0,T]}\) of the form

$$\begin{aligned} \beta ^{*}_t = {\left\{ \begin{array}{ll} \kappa _1, &{}\text { if } X_t \le 0 \\ \kappa _2, &{} \text { if } X_t > 0, \\ \end{array}\right. } \end{aligned}$$

and plugging \(\beta _t^{*}\) back into (5.2) results in an equation of the form (5.1) with a one-point discontinuity.

In our numerical experiments, we set \(\kappa _1= -0.5\), \(\kappa _2=0.5\), \(x=0\) and \(\sigma =0.7\). Further, we consider three different choices for the mean-reversion rate, i.e., we set \(a=1,5,10\). The expected value is approximated by the empirical mean of \(N=10^4\) particles. For a larger value of a, we expect sample paths of (5.1) to be more concentrated around the mean of the samples; see Fig. 2a, b (with \(T=1\) and \(M=2^7\)). We can also observe that a stronger concentration effect improves the strong approximation behaviour; see Fig. 3.
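This experimental setup can be reproduced with a standard Euler–Maruyama particle scheme for (5.1), replacing \({\mathbb {E}}[X_t]\) by the empirical mean of the particles; the function name and code organisation are ours, while the parameter values are the ones quoted above:

```python
import numpy as np

def simulate_systemic(a=1.0, kappa1=-0.5, kappa2=0.5, sigma=0.7,
                      x0=0.0, T=1.0, M=2 ** 7, N=10_000, seed=0):
    # Euler-Maruyama for the particle system approximating (5.1):
    # dX = ( a*(mean - X) + kappa1*1{X<=0} + kappa2*1{X>0} ) dt
    #      + (sigma + X) dW,
    # with E[X_t] replaced by the empirical mean over N particles. The
    # bang-bang drift encodes the control beta* from the text.
    rng = np.random.default_rng(seed)
    h = T / M
    X = np.full(N, float(x0))
    for _ in range(M):
        drift = a * (X.mean() - X) + np.where(X <= 0, kappa1, kappa2)
        X = X + drift * h + (sigma + X) * rng.normal(scale=np.sqrt(h), size=N)
    return X
```

As discussed above, a larger mean-reversion rate a concentrates the terminal particle cloud around its empirical mean.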

Fig. 2

Sample trajectories of the particle system associated with (5.1) for \(N=10\) and \(a=1\) (left) and \(a=10\) (right)

Fig. 3

Strong convergence of the Euler–Maruyama scheme applied to the particle system obtained by approximating the Eq. (5.1) with \(a=1,5,10\)