1 Introduction

The main objective of this paper is to study and quantify convergence to equilibrium for McKean–Vlasov type nonlinear stochastic differential equations of the form

$$\begin{aligned} \textrm{d}\bar{X}_t= \left[ \int _{{\mathbb {R}}^d} b(\bar{X}_t-x) \textrm{d}{\bar{\mu }}_t(x) \right] \textrm{d}t+\textrm{d}B_t \;, \qquad {\bar{\mu }}_t=\textrm{Law}(\bar{X}_t) \;, \end{aligned}$$
(1)

where \((B_t)_{t\ge 0}\) is a d-dimensional standard Brownian motion and \(b:{\mathbb {R}}^d\rightarrow {\mathbb {R}}^d\) is a Lipschitz continuous function. This nonlinear SDE is the probabilistic counterpart of the Fokker–Planck equation

$$\begin{aligned} \frac{\partial }{\partial t} u_t= \nabla \cdot \Big [(1/2)\nabla u_t -(b*u_t)u_t\Big ] \;, \end{aligned}$$
(2)

which describes the time evolution of the density \(u_t\) of \( {\bar{\mu }}_t\) with respect to the Lebesgue measure on \({\mathbb {R}}^d\). Moreover, we also study uniform in time propagation of chaos for the approximating mean-field interacting particle systems

$$\begin{aligned} \textrm{d}X_t^{i,N}&= \frac{1}{N}\sum _{j=1}^N b \left( X_t^{i,N}-X_t^{j,N}\right) \textrm{d}t+\textrm{d}B_t^i \;,{} & {} i\in \{ 1,\ldots ,N\} \;, \end{aligned}$$
(3)

with i.i.d. initial values \(X_0^{1,N},\ldots ,X_0^{N,N}\), and driven by independent d-dimensional Brownian motions \(\{(B_t^i)_{t\ge 0} \}_{i=1}^N\). Our results are based on a new probabilistic approach relying on sticky couplings and comparison with solutions to a class of nonlinear stochastic differential equations on the real interval \([0,\infty )\) with a sticky boundary at 0. The study of this type of equations carried out below might also be of independent interest.
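For illustration purposes only, the mean-field particle system (3) can be approximated by a straightforward Euler–Maruyama discretization. The following Python sketch is our own illustration and not part of the analysis below; the function name, step size and the vectorized form of the drift b are assumptions of this sketch.

```python
import numpy as np

def simulate_particles(b, x0, T, n_steps, rng):
    """Naive Euler-Maruyama discretization of the mean-field system (3).

    b  : vectorized drift, maps arrays of shape (..., d) to (..., d)
    x0 : (N, d) array of i.i.d. initial positions
    """
    N, d = x0.shape
    dt = T / n_steps
    x = x0.copy()
    for _ in range(n_steps):
        # pairwise differences X^i - X^j, shape (N, N, d)
        diffs = x[:, None, :] - x[None, :, :]
        drift = b(diffs).mean(axis=1)  # (1/N) * sum_j b(X^i - X^j)
        x = x + drift * dt + np.sqrt(dt) * rng.standard_normal((N, d))
    return x
```

With an anti-symmetric drift such as \(b(z)=-Lz+\tanh (z)\) (an illustrative choice), the summed interaction drift vanishes exactly, so the empirical center of mass evolves as an average of the N driving Brownian motions; this mirrors the centering property discussed in Sect. 2.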

Equations (1) and (2) have been studied in many works. Often a slightly different setup is considered, where the interaction b is assumed to be of gradient type, i.e., \(b=-\nabla W\) for an interaction potential function \(W:{\mathbb {R}}^d\rightarrow {\mathbb {R}}\), and an additional confinement potential function \(V:{\mathbb {R}}^d\rightarrow {\mathbb {R}}\) satisfying \(\lim _{|x|\rightarrow \infty }V(x)=\infty \) is included in the equations. The corresponding Fokker–Planck equation

$$\begin{aligned} \frac{\partial }{\partial t} u_t= \nabla \cdot \Big [(1/2)\nabla u_t +(\nabla V+\nabla W*u_t)u_t\Big ] \;, \end{aligned}$$
(4)

occurs for example in the modelling of granular media, see [3, 44] and the references therein. Existence and uniqueness of solutions to (1), (2) and (4) have been studied intensively. Introductions to this topic can be found for example in [23, 35, 36, 43], while recent results have been established in [26, 37]. Under appropriate conditions, it can be shown that the solutions converge to a unique stationary distribution at some given rate, see e.g. [7, 10, 11, 16, 17, 25]. In the case without confinement considered here, convergence to equilibrium of \(({\bar{\mu }}_t)_{t \ge 0}\) defined by (1) can only be expected for centered solutions, or after recentering around the center of mass of \( {\bar{\mu }}_t\). It was first analyzed in [10, 11] by an analytic approach and under the assumption that \(b=-\nabla W\) for a convex function W. In particular, exponential convergence to equilibrium has been established under the strong convexity assumption \(\textrm{Hess}(W)\ge \rho {\text {Id}}\) for some \(\rho >0\), and polynomial convergence in the case where W is only degenerately strictly convex. Similar results and some extensions have been derived in [12, 33] using a probabilistic approach.

Our first contribution aims at complementing these results, and extending them to non-convex interaction potentials and interaction functions that are not of gradient type. More precisely, suppose that

$$\begin{aligned} b(x) = - Lx + \gamma (x) \;, \qquad x \in {\mathbb {R}}^d \;, \end{aligned}$$
(5)

where \(L\in (0,\infty ) \) is a positive real constant, and \(\gamma : {\mathbb {R}}^d \rightarrow {\mathbb {R}}^d\) is a bounded function. Then we give conditions on \(\gamma \) ensuring exponential convergence of centered solutions to (1) to a unique stationary distribution in the standard \(\textrm{L}^1\) Wasserstein metric. More generally, we show in Theorem 1 that under these conditions there exist constants \(M,c\in (0,\infty )\) that depend only on L and \(\gamma \) such that if \(({\bar{\mu }}_t)_{t \ge 0}\) and \(({\bar{\nu }}_t)_{t \ge 0}\) are the marginal distributions of two solutions of (1), then for all \(t\ge 0\),

$$\begin{aligned} {{\mathcal {W}}}_1({\bar{\mu }}_t,{\bar{\nu }}_t)&\le M \textrm{e}^{-ct}{{\mathcal {W}}}_1({\bar{\mu }}_0,{\bar{\nu }}_0) \;. \end{aligned}$$

Using a coupling approach, related results have been derived in the previous works [16, 17] for the case where an additional confinement term is included in the equations. However, the arguments in these works rely on treating the equation with confinement and interaction term as a perturbation of the corresponding equation without interaction term, which has good ergodic properties. In the unconfined case this approach does not work, since the equation without interaction is transient and hence does not admit an invariant probability measure. Moreover, we are not aware of results for this framework with non-convex interaction potentials and non-gradient interaction functions that rely on classical analytical methods. Therefore, we have to develop a new approach for analyzing the equation without confinement.

Our approach is based on sticky couplings, an idea first developed in [18] to control the total variation distance between the marginal distributions of two non-degenerate diffusion processes with identical noise but different drift coefficients. Since two solutions of (1) differ only in their drifts, we can indeed couple them using a sticky coupling in the sense of [18]. It can then be shown that the coupling distance process can be controlled by the solution \((r_t)_{t\ge 0}\) of a nonlinear SDE on \([0,\infty )\) with a sticky boundary at 0 of the form

$$\begin{aligned} \textrm{d}r_t=[{\tilde{b}}(r_t)+a {\mathbb {P}}(r_t >0)]\textrm{d}t+2 \mathbbm {1}_{(0,\infty )}(r_t) \textrm{d}W_t \;. \end{aligned}$$
(6)

Here \({\tilde{b}}\) is a real-valued function on \([0,\infty )\) satisfying \({{\tilde{b}}} (0)=0 \), a is a positive constant, and \((W_t)_{t\ge 0}\) is a one-dimensional standard Brownian motion. Solutions to SDEs with diffusion coefficient \(r \mapsto \mathbbm {1}_{(0,\infty )}(r)\), as in (6), have a sticky boundary at 0, i.e., if the drift at 0 is strictly positive, then the set of all time points \(t\in [0,\infty )\) such that \(r_t=0\) is a fractal set with strictly positive Lebesgue measure that does not contain any open interval. Sticky SDEs have attracted wide interest, starting from [21, 22] in the one-dimensional case. Multivariate extensions have been considered in [27, 45, 46] building upon results obtained in [34, 40, 41], while corresponding martingale problems have been investigated in [42]. Versions of sticky processes occur, among other fields, in the natural sciences [8, 24] and in finance [29]. Note that in general no strong solution for this class of SDEs exists, as illustrated in [13]. We refer to [2, 20] and the references therein for recent contributions on this topic. Note, however, that in contrast to standard sticky SDEs, Eq. (6) is nonlinear in the sense of McKean. We are not aware of previous studies of such nonlinear sticky equations, which seem to be a very interesting topic in their own right.

Intuitively, one would hope that as time evolves, more mass gets stuck at 0, i.e., \({\mathbb {P}}(r_t >0)\) decreases. As a consequence, the drift at 0 in Eq. (6) decreases, which in turn forces even more mass to get stuck at 0. Therefore, under appropriate conditions one could hope that \({\mathbb {P}}(r_t =0)\) converges to 1 as \(t\rightarrow \infty \). On the other hand, if a is too large, then the drift at 0 might be too strong, so that not all of the mass gets stuck at 0 eventually. This indicates that there might be a phase transition for the nonlinear sticky SDE depending on the size of the constant a compared to \({{\tilde{b}}}\). In Sect. 3, we prove rigorously that this intuition is correct. Under appropriate conditions on \({\tilde{b}}\), we first show that existence and uniqueness in law hold for solutions of (6). Then we prove that for sufficiently small a, the Dirac measure at 0 is the unique invariant probability measure, and geometric ergodicity holds. As a consequence, under corresponding assumptions, the sticky coupling approach yields exponential convergence to equilibrium for the original nonlinear SDE (1). On the other hand, we prove the existence of multiple invariant probability measures for (6) if the smallness condition on a is not satisfied. In this case, the sticky coupling approach does not yield a statement on the behaviour of the coupling distance process: the approach only provides upper bounds, and the existence of multiple invariant measures for the dominating sticky nonlinear SDE does not imply that the underlying distance process fails to converge. However, if the unconfined SDE (1) has multiple invariant measures and the two copies of (1) in the sticky coupling start in two different equilibria, then the law of the distance process does not converge to the Dirac measure at zero.
Our results for (1) can also be adapted to deal with nonlinear SDEs over the torus \({\mathbb {T}} = {\mathbb {R}}/(2\pi {\mathbb {Z}})\), as considered in [15]. As an example, we discuss the application to the Kuramoto model for which a more explicit analysis is available [1, 4, 5, 9].

Finally, in addition to studying the long-time behaviour of the nonlinear SDE (1), we are also interested in establishing propagation of chaos for the mean-field particle system approximation (3). The propagation of chaos phenomenon, first introduced by Kac [30], describes the convergence of the empirical measure of the mean-field particle system (3) to the solution of (1). More precisely, in [36, 43] it has been shown under weak assumptions on W that for i.i.d. initial laws, the random variables \(X_t^{i,N} \), \(i\in \{ 1,\ldots ,N\}\), become asymptotically independent as \(N\rightarrow \infty \), and the common law \(\mu _t^{N}\) of each of these random variables converges to \({\bar{\mu }}_t\). However, the original results are only valid uniformly over a finite time horizon. Quantifying the convergence uniformly for all times \(t\in {\mathbb {R}}_+\) is an important issue. The case with a confinement potential has been studied for example in [16], see also the references therein. Again, the case with interaction only is more difficult. Malrieu [33] seems to be the first to have considered the case without confinement. By applying a synchronous coupling, he proved uniform in time propagation of chaos for strongly convex interaction potentials. Later on, assuming that the interaction potential loses strict convexity only in a finite number of points (e.g., \(W(x)= |x|^3\)), Cattiaux, Guillin and Malrieu [12] have shown uniform in time propagation of chaos with a rate that deteriorates with the degeneracy of the convexity. In a very recent work, Delarue and Tse [14] prove uniform in time weak propagation of chaos (i.e., observable by observable) on the torus via Lions derivative methods. Remarkably, their results are not limited to the case of a unique invariant measure.

Our contribution is in the same vein using probabilistic tools in place of analytic ones. We endow the space \({\mathbb {R}}^{Nd}\) consisting of N particle configurations \(x=(x^i)_{i=1}^N\) with the semi-metric \(l^1\circ \pi \), where

$$\begin{aligned} l^1(x,y)\ =\ \frac{1}{N}\sum \nolimits _{i=1}^N \left| x^i-y^i \right| \end{aligned}$$
(7)

is a normalized \(l^1\)-distance between configurations \(x,y\in {\mathbb {R}}^{Nd}\), and

$$\begin{aligned} \pi (x,y)\ =\ \left( \left( x^i-\frac{1}{N}\sum \nolimits _{j=1}^Nx^{j}\right) _{i=1}^N,\left( y^{i}-\frac{1}{N}\sum \nolimits _{j=1}^Ny^{j}\right) _{i=1}^N\right) \;, \end{aligned}$$
(8)

is a projection from \({\mathbb {R}}^{Nd}\times {\mathbb {R}}^{Nd}\) to the subspace \(\textsf{H}_N\times \textsf{H}_N\), where

$$\begin{aligned} \textsf{H}_N\ =\ \{x\in {\mathbb {R}}^{Nd}:\sum \nolimits _{i=1}^N x^{i} =0\} \;. \end{aligned}$$
(9)

Let \({{\mathcal {W}}}_{l^1\circ \pi }\) denote the \(L^1\) Wasserstein semimetric on probability measures on \({\mathbb {R}}^{Nd}\) corresponding to the cost function \(l^1\circ \pi \). Then under assumptions stated below, we prove uniform in time propagation of chaos for the mean-field particle system in the following sense: Suppose that \((X_t^{1,N},\ldots ,X_t^{N,N})_{t\ge 0}\) is a solution of (3) such that \(X_0^{1,N},\ldots ,X_0^{N,N}\) are i.i.d. with distribution \({\bar{\mu }}_0\) having finite second moment. Let \(\nu _t^N\) denote the joint law of the random variables \(X_t^{i,N}\), \(i\in \{ 1,\ldots ,N\}\), and let \({{\bar{\mu }}}_t\) denote the law of the solution of (1) with initial law \({{\bar{\mu }}}_0\). Then there exists a constant \(C \in [0,\infty )\) such that for any \(N \in {\mathbb {N}}\),

$$\begin{aligned} \sup _{t \ge 0} \, {{\mathcal {W}}}_{l^1\circ \pi }({\bar{\mu }}_t^{\otimes N},\nu _t^N)\le CN^{-1/2} \;. \end{aligned}$$
(10)

The proof is based on a componentwise sticky coupling, and a comparison of the coupling difference process with a system of one-dimensional sticky nonlinear SDEs.

The paper is organised as follows. In Sect. 2, we state our main results regarding the long-time behaviour of (1). The main results on one-dimensional nonlinear SDEs with a sticky boundary at zero are stated in Sect. 3. Sections 4 and 5 contain the corresponding results on uniform (in time) propagation of chaos and mean-field systems of sticky SDEs. All the proofs are given in Sect. 6. In “Appendix A”, we carry the results over to nonlinear sticky SDEs over \({\mathbb {T}}\) and consider the application to the Kuramoto model.

Notation

The Euclidean norm on \({\mathbb {R}}^{d}\) is denoted by \(|\cdot |\). For \(x\in {\mathbb {R}}\), we write \(x_+=\max (0,x)\). For some space \({\mathbb {X}}\), which here is either \({\mathbb {R}}^{d}\), \({\mathbb {R}}^{Nd}\) or \({\mathbb {R}}_+\), we denote its Borel \(\sigma \)-algebra by \({{\mathcal {B}}}({\mathbb {X}})\). The space of all probability measures on \(({\mathbb {X}},{{\mathcal {B}}}({\mathbb {X}}))\) is denoted by \({{\mathcal {P}}}({\mathbb {X}})\). Let \(\mu ,\nu \in {{\mathcal {P}}}({\mathbb {X}})\). A coupling \(\xi \) of \(\mu \) and \(\nu \) is a probability measure on \(({\mathbb {X}}\times {\mathbb {X}},{{\mathcal {B}}}({\mathbb {X}})\otimes {{\mathcal {B}}}({\mathbb {X}}))\) with marginals \(\mu \) and \(\nu \). \(\Gamma (\mu ,\nu )\) denotes the set of all couplings of \(\mu \) and \(\nu \). The \(\textrm{L}^1\) Wasserstein distance with respect to a distance function \(d:{\mathbb {X}}\times {\mathbb {X}}\rightarrow {\mathbb {R}}_+\) is defined by

$$\begin{aligned} {{\mathcal {W}}}_d(\mu ,\nu )=\inf _{\xi \in \Gamma (\mu ,\nu )}\int _{{\mathbb {X}}\times {\mathbb {X}}}d(x,y)\xi (\textrm{d}x \textrm{d}y)\;. \end{aligned}$$

We write \({{\mathcal {W}}}_1\) if the underlying distance function is the Euclidean distance.
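As a sanity check on this definition: for two empirical measures on \({\mathbb {R}}\) with the same number of atoms, the infimum over couplings is attained by pairing the sorted samples, which gives a simple way to evaluate \({{\mathcal {W}}}_1\) numerically. The sketch below is our own illustration, valid only for equally sized samples on the real line:

```python
import numpy as np

def w1_empirical(xs, ys):
    # W_1 between the empirical measures of two equally sized samples on R:
    # in one dimension the optimal coupling pairs the sorted samples
    xs, ys = np.sort(np.asarray(xs, float)), np.sort(np.asarray(ys, float))
    return np.mean(np.abs(xs - ys))
```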

We denote by \({{\mathcal {C}}}({\mathbb {R}}_+,{\mathbb {X}})\) the set of continuous functions from \({\mathbb {R}}_+\) to \({\mathbb {X}}\), and by \({{\mathcal {C}}}^2({\mathbb {R}}_+,{\mathbb {X}})\) the set of twice continuously differentiable functions.

Consider a probability space \((\Omega , {{\mathcal {A}}},P)\) and a measurable function \(r:\Omega \rightarrow {{\mathcal {C}}}({\mathbb {R}}_+,{\mathbb {X}})\). Then \({\mathbb {P}}=P\circ r^{-1}\) denotes the law on \({{\mathcal {C}}}({\mathbb {R}}_+,{\mathbb {X}})\), and \(P_t=P\circ {r_t}^{-1}\) the marginal law on \({\mathbb {X}}\) at time t.

2 Long-time behaviour of McKean–Vlasov diffusions

We establish our results regarding (1) and (3) under the following assumption on b.

B1

The function \(b:{\mathbb {R}}^d\rightarrow {\mathbb {R}}^d\) is Lipschitz continuous and anti-symmetric, i.e., \(b(z)=-b(-z)\), and there exist \(L\in (0,\infty )\), a function \(\gamma :{\mathbb {R}}^d\rightarrow {\mathbb {R}}^d\) and a Lipschitz continuous function \(\kappa :[0,\infty )\rightarrow {\mathbb {R}}\) such that

$$\begin{aligned} b(z)=-Lz+ \gamma (z) \qquad \text {for all } z\in {\mathbb {R}}^d \;, \end{aligned}$$
(11)

and the following conditions are satisfied for all \(x,y\in {\mathbb {R}}^d\):

$$\begin{aligned} \langle x-y,\gamma (x)-\gamma (y)\rangle \le \kappa (|x-y|)|x-y|^2\;, \end{aligned}$$
(12)

and

$$\begin{aligned} \limsup _{r\rightarrow \infty }(\kappa (r)-L)<0 \;. \end{aligned}$$
(13)

Let \(\bar{b}(r)=(\kappa (r)-L)r\). If (13) holds, then there exist \(R_0,R_1 \ge 0\) such that

$$\begin{aligned} \bar{b}(r)&<0 \;, \qquad \qquad&\text { for any } r> R_0 \;, \end{aligned}$$
(14)
$$\begin{aligned} \bar{b}(r)/r&\le -4/[R_1(R_1-R_{0})] \;, \qquad \qquad&\text { for any } r\ge R_1 \;. \end{aligned}$$
(15)

In addition, we assume

B2

$$\begin{aligned} \Vert \gamma \Vert _\infty \le \Big (4\int _0^{R_1}\exp \Big (\frac{1}{2}\int _0^s \bar{b}(r)_+ \textrm{d}r\Big )\textrm{d}s\Big )^{-1} \;. \end{aligned}$$

Often drifts of gradient type are considered, i.e., \(b=-\nabla U\) for some potential \(U\in {{\mathcal {C}}}^2\). Then B1 is satisfied, for instance, for L-strongly convex potentials, in which case condition (12) holds with \(\kappa \equiv 0\). In this case, B2 reduces to \(\Vert \gamma \Vert _{\infty }\le \sqrt{L}/8\). But the assumptions also cover asymptotically L-strongly convex potentials, such as double-well potentials, and more general drifts, provided the deviation, represented by the function \(\gamma \), from the linear term \(-Lz\) is sufficiently small in terms of the generalized one-sided Lipschitz bound (12) and the bound in the supremum norm. In particular, this can always be achieved by considering a sufficiently small multiple of \(\gamma \).
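The reduction of B2 in the gradient case can be traced explicitly: if \(\kappa \equiv 0\), then \(\bar{b}(r)=-Lr\), so \(\bar{b}(r)_+=0\), and one may take \(R_0=0\) in (14) and \(R_1=2/\sqrt{L}\) in (15), since \(\bar{b}(r)/r=-L\le -4/R_1^2\). Hence the right-hand side of B2 becomes

$$\begin{aligned} \Big (4\int _0^{R_1}\exp \Big (\frac{1}{2}\int _0^s \bar{b}(r)_+ \textrm{d}r\Big )\textrm{d}s\Big )^{-1} = (4R_1)^{-1} = \frac{\sqrt{L}}{8} \;. \end{aligned}$$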

Additionally, we consider the following condition on the initial distribution.

B3

The initial distribution \(\mu _0\) satisfies \(\int _{{\mathbb {R}}^d} \left\| x \right\| ^{4} \mu _0(\textrm{d}x) < +\infty \) and \(\int _{{\mathbb {R}}^d} x \, \mu _0(\textrm{d}x) = 0\).

Note that under conditions B1 and B3, unique strong solutions \((\bar{X}_t)_{t\ge 0}\) and \((\{X_t^{i,N}\}_{i=1}^N)_{t\ge 0}\) exist for (1) and (3), see e.g. [12, Theorem 2.6]. In addition, note that since b is assumed to be anti-symmetric, an easy localisation argument yields \( \textrm{d}{\mathbb {E}}[\bar{X}_t]/ \textrm{d}t ={\mathbb {E}}[b*\mu _t(\bar{X}_t)]=0\) and \(\textrm{d}{\mathbb {E}}[ N^{-1}\sum _{i=1}^N X_t^{i,N}]/\textrm{d}t=0\). Thus, if \(\bar{X}_0\) and \(\{X_0^{i,N}\}_{i=1}^N\) have distribution \(\mu _0\) and \(\mu _0^{\otimes N}\), respectively, with \(\mu _0\) satisfying B3, then \({\mathbb {E}}[\bar{X}_t]=0\) and \({\mathbb {E}}[ N^{-1}\sum _{i=1}^N X_t^{i,N}]=0\) for all \(t\ge 0 \).

Suppose \(f:{\mathbb {R}}_+\rightarrow {\mathbb {R}}_+\) is an increasing, concave function vanishing at zero. Then \(d(x,y)=f(|x-y|)\) defines a distance. The corresponding \(L^1\) Wasserstein distance is denoted by \({{\mathcal {W}}}_f\). Note that in the case \(f(t) = t\) for any \(t \ge 0\), \({{\mathcal {W}}}_f\) is simply \({{\mathcal {W}}}_1\).

Theorem 1

(Contraction for nonlinear SDE) Assume B1 and B2. Let \({\bar{\mu }}_0, {\bar{\nu }}_0\) be probability measures on \(({\mathbb {R}}^d,{{\mathcal {B}}}({\mathbb {R}}^d))\) satisfying B3. For any \(t\ge 0\), let \({\bar{\mu }}_t\) and \({\bar{\nu }}_t\) denote the laws of \(\bar{X}_t\) and \(\bar{Y}_t\) where \((\bar{X}_s)_{s\ge 0}\) and \((\bar{Y}_s)_{s\ge 0}\) are solutions of (1) with initial distribution \({\bar{\mu }}_0\) and \({\bar{\nu }}_0\), respectively. Then, for all \(t\ge 0\),

$$\begin{aligned} {{\mathcal {W}}}_f({\bar{\mu }}_t, {\bar{\nu }}_t) \le \textrm{e}^{-{\tilde{c}} t} {{\mathcal {W}}}_f({\bar{\mu }}_0,{\bar{\nu }}_0) \quad \text { and }\quad {{\mathcal {W}}}_1({\bar{\mu }}_t, {\bar{\nu }}_t) \le M_1 \textrm{e}^{-{\tilde{c}} t} {{\mathcal {W}}}_1({\bar{\mu }}_0,{\bar{\nu }}_0) \;,\qquad \end{aligned}$$
(16)

where the function f is defined by (37) and the constants \({\tilde{c}}\) and \(M_1\) are given by

$$\begin{aligned} {\tilde{c}}^{-1}&=2\int _0^{R_{1}}\int _0^s \exp \Big (\frac{1}{2}\int _r^s\bar{b}(u)_+ \ \textrm{d}u\Big )\textrm{d}r\textrm{d}s \;, \end{aligned}$$
(17)
$$\begin{aligned} M_1&= 2\exp \Big (\frac{1}{2}\int _0^{R_0} \bar{b}(s)_+\textrm{d}s\Big ) \;. \end{aligned}$$
(18)

Proof

The proof is postponed to Sect. 6.2.1. \(\square \)

The construction and definition of the underlying distance function \(f(|x-y|)\) mentioned in Theorem 1 is based on the one introduced in [19].

To prove Theorem 1 we use a coupling \((\bar{X}_t,\bar{Y}_t)_{t\ge 0}\) of two copies of solutions to the nonlinear stochastic differential equation (1) with different initial conditions. The coupling \((\bar{X}_t,\bar{Y}_t)_{t\ge 0}\) will be defined as the weak limit of a family of couplings \((\bar{X}_t^\delta ,\bar{Y}_t^\delta )_{t\ge 0}\), parametrized by \(\delta >0\). Roughly, this family is a mixture of synchronous and reflection couplings and can be described as follows. For \(\delta >0\), \((\bar{X}_t^\delta ,\bar{Y}_t^\delta )_{t\ge 0}\) behaves like a reflection coupling if \(|\bar{X}_t^\delta -\bar{Y}_t^\delta |\ge \delta \), and like a synchronous coupling if \(|\bar{X}_t^\delta -\bar{Y}_t^\delta |=0\). For \(|\bar{X}_t^\delta -\bar{Y}_t^\delta |\in (0,\delta )\) we take an interpolation of synchronous and reflection coupling. We argue that the family of couplings \(\{(\bar{X}_t^\delta ,\bar{Y}_t^\delta )_{t \ge 0}:\delta >0\}\) is tight and that a subsequence \(\{(\bar{X}_t^{\delta _n},\bar{Y}_t^{\delta _n})_{t\ge 0}:n\in {\mathbb {N}}\}\) converges to a limit \((\bar{X}_t,\bar{Y}_t)_{t\ge 0}\). This limit is a coupling which we call the sticky coupling associated to (1).

To carry out the construction rigorously, we take two Lipschitz continuous functions \(\textrm{rc}^\delta , \textrm{sc}^\delta :{\mathbb {R}}_+\rightarrow [0,1]\) for \(\delta >0\) such that

$$\begin{aligned} \textrm{rc}^\delta (0)=0\;, \quad \textrm{rc}^\delta (r)=1 \text { for } r\ge \delta \;, \quad \textrm{rc}^\delta (r)>0 \text { for } r>0 \;, \quad \text {and}\quad \textrm{rc}^\delta (r)^2+\textrm{sc}^\delta (r)^2=1 \text { for } r\ge 0 \;. \end{aligned}$$
(19)

Further, we assume that there exists \(\epsilon _0>0\) such that for any \(\delta \le \epsilon _0\), \(\textrm{rc}^\delta \) satisfies

$$\begin{aligned} \begin{aligned} \textrm{rc}^\delta (r)\ge \frac{\Vert \gamma \Vert _{\textrm{Lip}}}{2\Vert \gamma \Vert _\infty }r{} & {} \text { for any }r\in (0,\delta )\;, \end{aligned} \end{aligned}$$
(20)

where \(\Vert \gamma \Vert _{\textrm{Lip}}<\infty \) denotes the Lipschitz norm of \(\gamma \). This assumption is satisfied for example if \(\textrm{rc}^\delta (r)=\sin ((\pi /2\delta )r)\mathbbm {1}_{r < \delta }+\mathbbm {1}_{r\ge \delta }\) and \(\textrm{sc}^\delta (r)=\cos ((\pi /2\delta )r)\mathbbm {1}_{r < \delta }\) with \(\delta \le \epsilon _0=2\Vert \gamma \Vert _\infty /\Vert \gamma \Vert _{\textrm{Lip}}\).

Let \((B_t^1)_{t\ge 0}\) and \((B_t^2)_{t\ge 0}\) be two d-dimensional Brownian motions. We define the coupling \((\bar{X}_t^\delta ,\bar{Y}_t^\delta )_{t\ge 0}\) as a process in \({\mathbb {R}}^{2d}\) satisfying the following nonlinear stochastic differential equation

$$\begin{aligned} \textrm{d}\bar{X}_t^\delta&=b*{\bar{\mu }}^\delta _t(\bar{X}_t^\delta )\textrm{d}t+ \textrm{rc}^\delta (\bar{r}_t^\delta )\textrm{d}B_t^1+\textrm{sc}^\delta (\bar{r}_t^\delta )\textrm{d}B_t^2\;, \qquad {\bar{\mu }}_t^\delta =\textrm{Law}(\bar{X}_t^\delta )\;,\nonumber \\ \textrm{d}\bar{Y}_t^\delta&=b*{\bar{\nu }}^\delta _t(\bar{Y}_t^\delta )\textrm{d}t+ \textrm{rc}^\delta (\bar{r}_t^\delta )({\text {Id}}-2\bar{e}_t^\delta (\bar{e}_t^\delta )^T ) \textrm{d}B_t^1 \nonumber \\ {}&\quad +\textrm{sc}^\delta (\bar{r}_t^\delta )\textrm{d}B_t^2\;, \qquad {\bar{\nu }}_t^\delta =\textrm{Law}(\bar{Y}_t^\delta ) \end{aligned}$$
(21)

with initial condition \((\bar{X}_0^\delta ,\bar{Y}_0^\delta )=(x_0,y_0)\). Here we set \(\bar{Z}_t^\delta =\bar{X}_t^\delta -\bar{Y}_t^\delta \), \(\bar{r}_t^\delta =|\bar{Z}_t^\delta |\) and \(\bar{e}_t^\delta =\bar{Z}_t^\delta /\bar{r}_t^\delta \) if \(\bar{r}_t^\delta \ne 0\). For \(\bar{r}_t^\delta =0\), \(\bar{e}_t^\delta \) is some arbitrary unit vector, whose exact choice is irrelevant since \(\textrm{rc}^\delta (0)=0\). We note that a reflection coupling is obtained if \(\textrm{rc}^{\delta }=1\), whereas a synchronous coupling is obtained if \(\textrm{sc}^{\delta }=0\). This motivates the names of the functions \(\textrm{rc}\) and \(\textrm{sc}\), respectively.

Theorem 2

Assume B1. Let \({\bar{\mu }}_0\) and \({\bar{\nu }}_0\) be probability measures on \(({\mathbb {R}}^d,{{\mathcal {B}}}({\mathbb {R}}^d))\) satisfying B3. Then, \((\bar{X}_t,\bar{Y}_t)_{t \ge 0}\) is a subsequential limit in distribution as \(\delta \rightarrow 0\) of \(\{(\bar{X}_t^{\delta },\bar{Y}_t^{\delta })_{t \ge 0} \,: \, \delta >0\}\) where \((\bar{X}_t)_{t \ge 0}\) and \((\bar{Y}_t)_{t \ge 0}\) are solutions of (1) with initial distribution \({\bar{\mu }}_0\) and \({\bar{\nu }}_0\). Further, there exists a process \((r_t)_{t \ge 0}\) defined on the same probability space as \((\bar{X}_t,\bar{Y}_t)_{t\ge 0}\) satisfying for any \(t \ge 0\), \( \vert \bar{X}_t - \bar{Y}_t \vert \le r_t\) almost surely and which is a weak solution of

$$\begin{aligned} \textrm{d}r_t=(\bar{b}(r_t)+2 \Vert \gamma \Vert _\infty {\mathbb {P}}(r_t >0))\textrm{d}t+2 \mathbbm {1}_{(0,\infty )}(r_t) \textrm{d}{\tilde{W}}_t \;, \end{aligned}$$
(22)

where \(({\tilde{W}}_t)_{t \ge 0}\) is a one-dimensional Brownian motion.

Proof

The proof is postponed to Sect. 6.2.2. \(\square \)

Therefore, next we study sticky nonlinear SDEs given by (6).

3 Nonlinear SDEs with sticky boundaries

Consider nonlinear SDEs with a sticky boundary at 0 of the form

$$\begin{aligned} \textrm{d}r_t=({\tilde{b}}(r_t)+P_t(g))\textrm{d}t+2\mathbbm {1}_{(0,\infty )}(r_t)\textrm{d}W_t\;,{} & {} P_t=\textrm{Law}(r_t) \;, \end{aligned}$$
(23)

where \({\tilde{b}}:[0,\infty )\rightarrow {\mathbb {R}}\) is some continuous function and \(P_t(g)=\int _{{\mathbb {R}}_+}g(r)P_t(\textrm{d}r)\) for some measurable function \(g:[0,\infty )\rightarrow {\mathbb {R}}\).

In this section we establish existence, uniqueness in law and comparison results for solutions of (6). Consider a filtered probability space \((\Omega , {{\mathcal {A}}}, ({{\mathcal {F}}}_t)_{t\ge 0},P)\) and a probability measure \(\mu \) on \({\mathbb {R}}_+\). We call an \(({{\mathcal {F}}}_t)_{t\ge 0}\)-adapted process \((r_t,W_t)_{t\ge 0}\) a weak solution of (23) with initial distribution \(\mu \) if the following holds: \(\mu =P\circ r_0^{-1}\), the process \((W_t)_{t\ge 0}\) is a one-dimensional \(({{\mathcal {F}}}_t)_{t\ge 0}\) Brownian motion w.r.t. P, the process \((r_t)_{t\ge 0}\) is non-negative and continuous, and satisfies almost surely

$$\begin{aligned} r_t-r_0=\int _0^t \Big ({\tilde{b}}(r_s)+P_s(g)\Big )\textrm{d}s+\int _0^t 2\cdot \mathbbm {1}_{(0,\infty )}(r_s)\textrm{d}W_s\;, \qquad \text { for } t\in {\mathbb {R}}_+ \;. \end{aligned}$$

Note that the sticky nonlinear SDE given in (6) is a special case of (23) with \(g(r)=a\mathbbm {1}_{(0,\infty )}(r)\) since \({\mathbb {P}}(r_t >0)=\int _{{\mathbb {R}}_+}\mathbbm {1}_{(0,\infty )}(y) P_t(\textrm{d}y)\) with \(P_t=P\circ r_t^{-1}\).

3.1 Existence, uniqueness in law, and a comparison result

Let \({\mathbb {W}}={{\mathcal {C}}}({\mathbb {R}}_+,{\mathbb {R}})\) be the space of continuous functions endowed with the topology of uniform convergence on compact sets, and let \({{\mathcal {B}}}({\mathbb {W}})\) be the corresponding Borel \(\sigma \)-algebra. Suppose \((r_t, W_t)_{t\ge 0}\) is a solution of (23) on \((\Omega ,{{\mathcal {A}}},P)\), then we denote by \({\mathbb {P}}=P\circ r^{-1}\) its law on \(({\mathbb {W}},{{\mathcal {B}}}({\mathbb {W}}))\). We say that uniqueness in law holds for (23) if for any two solutions \((r_t^1)_{t\ge 0}\) and \((r_t^2)_{t\ge 0}\) of (23) with the same initial law, the distributions of \((r_t^1)_{t\ge 0}\) and \((r_t^2)_{t\ge 0}\) on \(({\mathbb {W}},{{\mathcal {B}}}({\mathbb {W}}))\) are equal.

We impose the following assumptions on \({\tilde{b}}\), g and the initial condition \(\mu \):

H1

\({\tilde{b}}\) is a Lipschitz continuous function with Lipschitz constant \({\tilde{L}}\) and \({\tilde{b}}(0)=0\).

H2

g is a left-continuous, non-negative, non-decreasing and bounded function.

H3

There exists \(p>2\) such that the p-th order moment of the law \(\mu \) is finite.

Note that for (6), the condition H2 is satisfied if a is a positive constant. It follows from H1 and H2 that there is a constant \(C<\infty \) such that for all \(r\in {\mathbb {R}}_+\), the following linear growth condition holds,

$$\begin{aligned} {\tilde{b}}(r)+\sup _{ p\in {{\mathcal {P}}}({\mathbb {R}}_+)}p(g)\le C(1+|r|)\;. \end{aligned}$$
(24)

In order to get a solution to (23) on \({\mathbb {R}}_+\) we extend the function \({\tilde{b}}\) to \({\mathbb {R}}\) by setting \({\tilde{b}}(r)=0\) for \(r<0\). Note that any solution \((r_t)_{t\ge 0}\) with initial distribution supported on \({\mathbb {R}}_+\) satisfies almost surely \(r_t\ge 0\) for all \(t\ge 0\). This follows from the Itō–Tanaka formula applied to \(F(r)=\mathbbm {1}_{(-\infty ,0)}(r) r\), cf. [39, Chapter 6, Theorem 1.2 and Theorem 1.7]. Indeed

$$\begin{aligned} \mathbbm {1}_{(-\infty ,0)}(r_t)r_t&=\mathbbm {1}_{(-\infty ,0)}(r_0)r_0+\int _0^t \mathbbm {1}_{(-\infty ,0)}(r_s)\textrm{d}r_s - \frac{1}{2}\ell _t^{0-}(r) \\ {}&=\int _0^t \mathbbm {1}_{(-\infty ,0)}(r_s)({\tilde{b}}(r_s)+P_s(g))\textrm{d}s \\&\quad + \int _0^t \mathbbm {1}_{(-\infty ,0)}(r_s)\, 2 \,\mathbbm {1}_{(0,\infty )}(r_s)\textrm{d}W_s- \frac{1}{2}\ell _t^{0-}(r) \\ {}&=\int _0^t\mathbbm {1}_{(-\infty ,0)}(r_s)P_s(g)\textrm{d}s \ge 0\;, \end{aligned}$$

where \(\ell _t^{0-}(r)\) is the left local time at 0, which is given by \(\ell _t^{0-}(r)=\lim _{\epsilon \downarrow 0} \epsilon ^{-1}\int _0^t\mathbbm {1}_{\{-\epsilon \le r_s\le 0\}}\textrm{d}[r]_s\) and which vanishes, since \(\textrm{d}[r]_s=\mathbbm {1}_{(0,\infty )}(r_s)\textrm{d}s\). Since the left-hand side \(\mathbbm {1}_{(-\infty ,0)}(r_t)r_t\) is non-positive by definition, both sides vanish, and hence \(r_t\ge 0\) almost surely.

Existence and uniqueness in law of (23) is a direct consequence of a stronger result that we now introduce. To study existence and uniqueness and to compare two solutions of (23) with different drifts, we establish existence of a synchronous coupling of two copies of (23),

$$\begin{aligned} \begin{aligned} \textrm{d}r_t&=({\tilde{b}}(r_t)+P_t(g))\textrm{d}t+2\mathbbm {1}_{(0,\infty )}(r_t)\textrm{d}W_t\;, \\ \textrm{d}s_t&=({\hat{b}}(s_t)+{\hat{P}}_t(h))\textrm{d}t+2\mathbbm {1}_{(0,\infty )}(s_t)\textrm{d}W_t\;, \qquad \text {Law}(r_0,s_0)=\eta \;, \end{aligned} \end{aligned}$$
(25)

where \(P_t=P\circ r_t^{-1}\), \({\hat{P}}_t=P\circ s_t^{-1}\), \((W_t)_{t\ge 0}\) is a Brownian motion and where \(\eta \in \Gamma (\mu , \nu )\) for \(\mu , \nu \in {{\mathcal {P}}}({\mathbb {R}}_+)\).

Theorem 3

Suppose that \(({\tilde{b}},g)\) and \(({\hat{b}},h)\) satisfy H1 and H2. Let \(\eta \in \Gamma (\mu ,\nu )\) where the probability measures \(\mu \) and \(\nu \) on \({\mathbb {R}}_+\) satisfy H3. Then there exists a weak solution \((r_t,s_t)_{t\ge 0}\) of the sticky stochastic differential equation (25) with initial distribution \(\eta \) defined on a probability space \((\Omega , {{\mathcal {A}}},P)\) with values in \(({\mathbb {W}}\times {\mathbb {W}},{{\mathcal {B}}}({\mathbb {W}})\otimes {{\mathcal {B}}}({\mathbb {W}}))\). If additionally,

$$\begin{aligned}&{\tilde{b}}(r)\le {\hat{b}}(r)\quad \text {and} \quad g(r)\le h(r){} & {} \text { for any } r\in {\mathbb {R}}_+, \text { and } \\ {}&P[r_0\le s_0]=1, \end{aligned}$$

then \(P[r_t\le s_t \text { for all } t\ge 0]=1\).

Proof

The proof is postponed to Sect. 6.3.1. \(\square \)

Remark 4

We note that by the comparison result we can deduce uniqueness in law for the solution of (23).

3.2 Invariant measures and phase transition for (6)

Under the following conditions on the drift function \({\tilde{b}}\), we exhibit a phase transition phenomenon for the model (6), which, compared to (23), corresponds to the case \(P_t(g)=a {\mathbb {P}}[r_t>0]\).

Theorem 5

Suppose H1 holds and \(\limsup _{r\rightarrow \infty }(r^{-1}{\tilde{b}}(r))<0\). Then, the Dirac measure at 0, \(\delta _0\), is an invariant probability measure for (6). If there exists \(p\in (0,1)\) solving

$$\begin{aligned} (2/a)=(1-p)I(a,p) \end{aligned}$$
(26)

with

$$\begin{aligned} I(a,p)=\int _0^\infty \exp \Big (\frac{1}{2}apx+\frac{1}{2}\int _0^x {\tilde{b}}(r)\textrm{d}r\Big )\textrm{d}x\;, \end{aligned}$$
(27)

then the probability measure \(\pi \) on \([0,\infty )\) given by

$$\begin{aligned} \pi (\textrm{d}x)\propto \Big (\frac{2}{ap}\delta _0(\textrm{d}x)+\exp \Big (\frac{1}{2}apx+\frac{1}{2}\int _0^x {\tilde{b}}(r)\textrm{d}r\Big ) \lambda _{(0,\infty )}(\textrm{d}x)\Big ) \end{aligned}$$
(28)

is another invariant probability measure for (6).

Proof

The proof is postponed to Sect. 6.3.2. \(\square \)

In our next result we specify a necessary and sufficient condition for the existence of a solution of (26).

Proposition 6

Suppose that \({\tilde{b}}(r)\) in (6) is of the form \({\tilde{b}}(r)=-{\tilde{L}}r\) with a constant \({\tilde{L}}>0\). If \(a/\sqrt{{\tilde{L}}} > 2/\sqrt{\pi }\), then there exists a unique \({\hat{p}}\) solving (26). In particular, the Dirac measure \(\delta _0\) and the measure \(\pi \) given in (28) with \({\hat{p}}\) are invariant measures for (6). On the other hand, if \(a/\sqrt{{\tilde{L}}}\le 2/\sqrt{\pi }\), then there exists no \({\hat{p}}\) solving (26).

Proof

The proof is postponed to Sect. 6.3.2. \(\square \)
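For the linear drift \({\tilde{b}}(r)=-{\tilde{L}}r\) of Proposition 6, the integral (27) can be evaluated in closed form by completing the square, \(I(a,p)=\sqrt{\pi /{\tilde{L}}}\,\textrm{e}^{a^2p^2/(4{\tilde{L}})}\big (1+\textrm{erf}(ap/(2\sqrt{{\tilde{L}}}))\big )\), so (26) can be solved numerically. The following sketch locates \({\hat{p}}\) by bisection; the values \(a=2\), \({\tilde{L}}=1\) are illustrative choices, not taken from the text:

```python
import math

def I(a, p, Lt):
    # Closed form of (27) for b(r) = -Lt*r:
    # ∫_0^∞ exp(a p x/2 - Lt x^2/4) dx
    #   = sqrt(pi/Lt) * exp(a^2 p^2 / (4 Lt)) * (1 + erf(a p / (2 sqrt(Lt))))
    s = a * p / (2.0 * math.sqrt(Lt))
    return math.sqrt(math.pi / Lt) * math.exp(s * s) * (1.0 + math.erf(s))

def solve_p(a, Lt, tol=1e-12):
    # F(p) = (1-p) I(a,p) - 2/a has F(1) = -2/a < 0, and F(0) > 0
    # exactly when a/sqrt(Lt) > 2/sqrt(pi), the threshold of Proposition 6
    F = lambda p: (1.0 - p) * I(a, p, Lt) - 2.0 / a
    if F(0.0) <= 0.0:
        return None          # subcritical: only delta_0 is invariant
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if F(mid) > 0.0 else (lo, mid)
    return 0.5 * (lo + hi)

a, Lt = 2.0, 1.0             # a/sqrt(Lt) = 2 > 2/sqrt(pi): a root exists
p_hat = solve_p(a, Lt)
print(p_hat)
```

With \({\hat{p}}\) at hand, the second invariant measure (28) is fully determined, including the weight \(2/(a{\hat{p}})\) of its atom at 0.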

3.3 Convergence for sticky nonlinear SDEs of the form (6)

Under H1 and the following additional assumption we establish geometric convergence in Wasserstein distance for the marginal law of the solution \(r_t\) of (6) to the Dirac measure at 0:

H4

It holds \(\limsup _{r\rightarrow \infty }(r^{-1}{\tilde{b}}(r))<0\) and \(a\le (2\int _0^{{\tilde{R}}_1}\exp \big (\frac{1}{2}\int _0^s {\tilde{b}}(u)_+\textrm{d}u\big )ds)^{-1}\) with \({\tilde{R}}_0, {\tilde{R}}_1 \) defined by

$$\begin{aligned} {\tilde{R}}_0&=\inf \{s\in {\mathbb {R}}_+: {\tilde{b}}(r)\le 0 \ \forall r\ge s\} \;\qquad \text {and} \end{aligned}$$
(29)
$$\begin{aligned} {\tilde{R}}_1&=\inf \{s\ge {\tilde{R}}_0: -\frac{s}{r}(s-{\tilde{R}}_0) {\tilde{b}}(r)\ge 4 \ \forall r\ge s\}\;. \end{aligned}$$
(30)
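To make \({\tilde{R}}_0\) and \({\tilde{R}}_1\) concrete: for the linear drift \({\tilde{b}}(r)=-{\tilde{L}}r\) one has \({\tilde{R}}_0=0\) and \(-(s/r)(s-{\tilde{R}}_0){\tilde{b}}(r)={\tilde{L}}s^2\), so \({\tilde{R}}_1=2/\sqrt{{\tilde{L}}}\). A grid-based sketch (the grid and \({\tilde{L}}=1\) are illustrative; the infima in (29) and (30) are only checked on a finite grid):

```python
import math

def R0(b, s_grid):
    # (29): smallest grid point s with b(r) <= 0 for all grid points r >= s
    for s in s_grid:
        if all(b(r) <= 0 for r in s_grid if r >= s):
            return s
    return float("inf")

def R1(b, s_grid):
    r0 = R0(b, s_grid)
    # (30): smallest grid point s >= R0 with -(s/r)(s - R0) b(r) >= 4
    # for all grid points r >= s
    for s in s_grid:
        if s >= r0 and all(-(s / r) * (s - r0) * b(r) >= 4
                           for r in s_grid if r >= s and r > 0):
            return s
    return float("inf")

Lt = 1.0
b = lambda r: -Lt * r
grid = [0.01 * k for k in range(0, 1001)]    # [0, 10], step 0.01
r1 = R1(b, grid)
# For b(r) = -Lt*r: R0 = 0 and Lt*s^2 >= 4 gives R1 = 2/sqrt(Lt)
print(R0(b, grid), r1, 2.0 / math.sqrt(Lt))
```

In this instance the bound on a required by H4 reduces to \(a\le (2{\tilde{R}}_1)^{-1}=\sqrt{{\tilde{L}}}/4\), since \({\tilde{b}}_+\equiv 0\).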

Theorem 7

Suppose H1 and H4 hold. Then, the Dirac measure at 0, \(\delta _0\), is the unique invariant probability measure of (6). Moreover, if \((r_s)_{s\ge 0}\) is a solution of (6) with \(r_0\) distributed with respect to an arbitrary probability measure \(\mu \) on \(({\mathbb {R}}_+,{{\mathcal {B}}}({\mathbb {R}}_+))\), it holds for all \(t\ge 0 \),

$$\begin{aligned} {\mathbb {E}}[f(r_t)]\le \textrm{e}^{-ct}{\mathbb {E}}[f(r_0)]\;, \end{aligned}$$
(31)

where f and c are given by (37) and (36) with a and \({\tilde{b}}\) given in (6) and \({\tilde{R}}_0\) and \({\tilde{R}}_1\) given in (29) and (30).

Proof

The proof is postponed to Sect. 6.3.3. \(\square \)

4 Uniform in time propagation of chaos

To prove uniform in time propagation of chaos, we consider the \(L^1\) Wasserstein distance with respect to the cost function \(\bar{f}_N\circ \pi :{\mathbb {R}}^{Nd}\times {\mathbb {R}}^{Nd}\rightarrow {\mathbb {R}}_+\) with \(\pi \) given in (8), and \(\bar{f}_N\) given by

$$\begin{aligned} \bar{f}_N((x^{i,N})_{i=1}^N,(y^{i,N})_{i=1}^N)=\frac{1}{N}\sum _{i=1}^N f\left( \left| x^{i,N}-y^{i,N} \right| \right) \;, \end{aligned}$$
(32)

with \(f: {\mathbb {R}}_+ \rightarrow {\mathbb {R}}_+\) defined in (37). This distance is denoted by \({{\mathcal {W}}}_{f,N}\). Note that \(\bar{f}_N\) is equivalent to \(l^1\) defined in (7).
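The equivalence of \(\bar{f}_N\) and \(l^1\) can be illustrated numerically. The sketch below uses the linear-drift instance \({\tilde{b}}(r)=-{\tilde{L}}r\) of the construction in Sect. 6.1 (so \(\varphi \equiv 1\) and \(\Phi (r)=r\)) and scalar particles for simplicity; it assumes \(l^1\) is the averaged sum of distances, in line with the averaging in (32). By (38) with \(\varphi ({\tilde{R}}_0)=1\), one then has \(\frac{1}{2}l^1\le \bar{f}_N\le l^1\):

```python
import math, random

# Linear-drift instance of Sect. 6.1 (b(r) = -Lt*r, so phi ≡ 1, Phi(r) = r):
Lt = 1.0
a  = math.sqrt(Lt) / 4.0          # largest a allowed by H4 in this instance
R1 = 2.0 / math.sqrt(Lt)          # (30)
c  = Lt / 4.0                     # (36)

def g(r):
    r = min(r, R1)
    return 1.0 - (c / 4.0) * r * r - (a / 2.0) * r

def f(t, n=2000):
    # f(t) = ∫_0^t phi(r) g(r) dr, trapezoid rule with phi ≡ 1
    h = t / n
    return h * (0.5 * g(0.0) + sum(g(k * h) for k in range(1, n)) + 0.5 * g(t))

def f_bar(xs, ys):
    # (32): averaged cost over N particles (scalar particles for simplicity)
    return sum(f(abs(x - y)) for x, y in zip(xs, ys)) / len(xs)

def l1(xs, ys):
    return sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)

random.seed(0)
xs = [random.uniform(0, 5) for _ in range(8)]
ys = [random.uniform(0, 5) for _ in range(8)]
# Equivalence from (38) with phi(R0) = 1:  l1/2 <= f_bar <= l1
print(l1(xs, ys) / 2 <= f_bar(xs, ys) <= l1(xs, ys))
```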

We note that since \(\pi \) defines a projection from \({\mathbb {R}}^{Nd}\) to the hyperplane \(\textsf{H}_N\subset {\mathbb {R}}^{Nd}\) given in (9), for probability measures \({\hat{\mu }}\) and \({\hat{\nu }}\) on \(\textsf{H}_N\), \({{\mathcal {W}}}_{f,N}({\hat{\mu }},{\hat{\nu }})\) coincides with the Wasserstein distance given by

$$\begin{aligned} \hat{{{\mathcal {W}}}}_{f,N}({\hat{\mu }},{\hat{\nu }})=\inf _{\xi \in \Gamma ({\hat{\mu }},{\hat{\nu }})}\int _{\textsf{H}_N \times \textsf{H}_N} \bar{f}_N (x,y) \xi (\textrm{d}x \textrm{d}y) \end{aligned}$$
(33)

and \({{\mathcal {W}}}_{l^1\circ \pi }({\hat{\mu }},{\hat{\nu }})=\hat{{{\mathcal {W}}}}_{l^1}({\hat{\mu }},{\hat{\nu }})\), where \(\bar{f}_N\) and \(l^1\) are given in (32) and (7), respectively, and where \(\hat{{{\mathcal {W}}}}_{l^1}({\hat{\mu }},{\hat{\nu }})\) is defined as in (33) with respect to the distance \(l^1\).

Theorem 8

(Uniform in time propagation of chaos) Let \(N\in {\mathbb {N}}\) and assume B1 and B2. Let \({\bar{\mu }}_0\) and \(\nu _0\) be probability measures on \(({\mathbb {R}}^d,{{\mathcal {B}}}({\mathbb {R}}^d))\) satisfying B3. For \(t \ge 0\), denote by \({\bar{\mu }}_t\) and \(\nu _t^N\) the law of \(\bar{X}_t\) and \(\{X_t^{i,N}\}_{i=1}^N\) where \((\bar{X}_s)_{s \ge 0}\) and \((\{X_s^{i,N}\}_{i=1}^N)_{s \ge 0}\) are solutions of (1) and (3), respectively, with initial distributions \({\bar{\mu }}_0\) and \(\nu _0^{\otimes N}\). Then for all \(t\ge 0\),

$$\begin{aligned} {{\mathcal {W}}}_{f,N}({\bar{\mu }}_t^{\otimes N},\nu _t^N)&\le \textrm{e}^{-{\tilde{c}} t}{{\mathcal {W}}}_{f,N}({\bar{\mu }}_0^{\otimes N},\nu _0^{\otimes N})+{\tilde{C}}{\tilde{c}}^{-1}N^{-1/2}\;, \\ {{\mathcal {W}}}_{l^1\circ \pi }({\bar{\mu }}_t^{\otimes N},\nu _t^N)&\le M_1 \textrm{e}^{-{\tilde{c}} t}{{\mathcal {W}}}_{l^1\circ \pi }({\bar{\mu }}_0^{\otimes N},\nu _0^{\otimes N})+M_1{\tilde{C}}{\tilde{c}}^{-1}N^{-1/2}\;, \end{aligned}$$

where f is defined by (37), \(M_1\) by (18), \({\tilde{c}}\) by (17) and \({\tilde{C}}\) is a finite constant depending on \(\Vert \gamma \Vert _\infty \), L and the second moment of \({\bar{\mu }}_0\) and given in (77).

Proof

The proof is postponed to Sect. 6.4. \(\square \)

Remark 9

Denote by \(\mu _t^N\) and \(\nu _t^N\) the distribution of \(\{X_t^{i,N}\}_{i=1}^N\) and \(\{Y_t^{i,N}\}_{i=1}^N\) where the two processes \((\{X_s^{i,N}\}_{i=1}^N)_{s \ge 0}\) and \((\{Y_s^{i,N}\}_{i=1}^N)_{s \ge 0}\) are solutions of (3) with initial probability distributions \(\mu _0^{N},\nu _0^{N}\in {{\mathcal {P}}}({\mathbb {R}}^{Nd})\), respectively, with finite fourth moment. An easy inspection and adaptation of the proof of Theorem 8 show that if B1 holds, then

$$\begin{aligned} \begin{aligned}&{{\mathcal {W}}}_{f,N}(\mu _t^N,\nu _t^N )\le \textrm{e}^{-{\tilde{c}} t}{{\mathcal {W}}}_{f,N}(\mu _0^{N},\nu _0^{N})\;,\\&\qquad {{\mathcal {W}}}_{l^1\circ \pi }(\mu _t^N,\nu _t^N)\le 2 M_1 \textrm{e}^{-{\tilde{c}} t}{{\mathcal {W}}}_{l^1\circ \pi }(\mu _0^{N},\nu _0^{N}) \;, \end{aligned} \end{aligned}$$

where f, \({\tilde{c}}\) and \(M_1\) are defined as in Theorem 8.

5 System of N sticky SDEs

Consider a system of N one-dimensional SDEs with sticky boundaries at 0 given by

$$\begin{aligned} \textrm{d}r_t^i=\Big ({\tilde{b}}(r_t^i)+\frac{1}{N}\sum _{j=1}^N g(r_t^j)\Big )\textrm{d}t+2 \mathbbm {1}_{(0,\infty )}(r_t^i)\textrm{d}W_t^i\;, \qquad i=1,\ldots ,N. \end{aligned}$$
(34)
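A naive Euler–Maruyama discretisation gives a rough feeling for the behaviour of (34); the sticky boundary is only mimicked here by clamping at 0, and faithful numerical schemes for sticky SDEs require more care. The drift \({\tilde{b}}(r)=-{\tilde{L}}r\) and \(g(r)=a\mathbbm {1}_{(0,\infty )}(r)\) below are illustrative choices, matching the case studied in Sect. 3.2:

```python
import math, random

def simulate(N=50, T=5.0, dt=1e-3, Lt=1.0, a=2.0, seed=1):
    # Naive Euler-Maruyama sketch of (34) with b(r) = -Lt*r and
    # g(r) = a * 1_{(0,inf)}(r); stickiness at 0 is only imitated by
    # clamping, not reproduced exactly.
    random.seed(seed)
    r = [1.0] * N
    for _ in range(int(T / dt)):
        frac = sum(1 for x in r if x > 0) / N      # (1/N) sum_j g(r^j)
        z = [random.gauss(0.0, math.sqrt(dt)) for _ in range(N)]
        r = [max(0.0, x + (-Lt * x + a * frac) * dt
                 + (2.0 * z[i] if x > 0 else 0.0))  # diffusion 2*1_{(0,inf)}
             for i, x in enumerate(r)]
    return r

r = simulate()
print(sum(r) / len(r), sum(1 for x in r if x == 0.0) / len(r))
```

The particles stay non-negative by construction, in line with the non-negativity of weak solutions of (34).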

The results on existence, uniqueness and the comparison theorem for solutions of sticky nonlinear SDEs mostly carry over directly to solutions of (34) and are applied to prove propagation of chaos in Theorem 8.

Let \(\mu \) be a probability distribution on \({\mathbb {R}}_+\). For \(N\in {\mathbb {N}}\), \((\{r_t^i,W_t^i\}_{i=1}^N)_{t\ge 0}\) is a weak solution on the filtered probability space \((\Omega , {{\mathcal {A}}},({{\mathcal {F}}}_t)_{t\ge 0},P)\) of (34) with initial distribution \(\mu ^{\otimes N}\) if the following hold: \(\mu ^{\otimes N}=P\circ (\{r_0^i\}_{i=1}^N)^{-1}\), \((\{W_t^i\}_{i=1}^N)_{t\ge 0}\) is an N-dimensional \(({{\mathcal {F}}}_t)_{t\ge 0}\) Brownian motion w.r.t. P, and each process \((r_t^i)_{t\ge 0}\) is non-negative, continuous and satisfies almost surely for any \(i\in \{1,\ldots ,N\}\) and \(t\in {\mathbb {R}}_+\),

$$\begin{aligned} r_t^i-r_0^i&=\int _0^t\Big ({\tilde{b}}(r_s^i)+\frac{1}{N}\sum _{j=1}^N g(r_s^j)\Big )\textrm{d}s+\int _0^t 2\mathbbm {1}_{(0,\infty )}(r_s^i) \textrm{d}W_s^i\;. \end{aligned}$$

To show existence and uniqueness in law of a weak solution \((\{r_t^i,W_t^i\}_{i=1}^N)_{t\ge 0}\), we suppose H1 and H2 for \({\tilde{b}}\) and g.

It follows that there exists a constant \(C<\infty \) such that for all \(\{r^i\}_{i=1}^N\in {\mathbb {R}}_+^N\), it holds \(\sum _{i=1}^N |{\tilde{b}}(r^i)|+|g(r^i)|\le C(1+\sum _{i=1}^N|r^i|)\), so that any solution \((\{r_t^i\}_{i=1}^N)_{t\ge 0}\) is non-explosive. If the initial distribution is supported on \({\mathbb {R}}_+^N\), then, along the same lines as for the nonlinear SDE in Sect. 3.1, the solution \((\{r_t^i\}_{i=1}^N)_{t\ge 0}\) satisfies \(r^i_t\ge 0\) almost surely for any \(i=1,\ldots ,N\) and \(t\ge 0\) by H1 and H2.

Existence and uniqueness in law of (34) is a direct consequence of a stronger result that we now introduce. To study existence and uniqueness and to compare two solutions of (34) with different drifts, we establish existence of a synchronous coupling of two copies of (34),

$$\begin{aligned}&\textrm{d}r_t^i=\Big ({\tilde{b}}(r_t^i)+\frac{1}{N}\sum _{j=1}^Ng(r_t^j)\Big )\textrm{d}t+2 \mathbbm {1}_{(0,\infty )}(r_t^i) \textrm{d}W_t^i \;, \nonumber \\ {}&\textrm{d}s_t^i=\Big ({\hat{b}}(s_t^i)+\frac{1}{N}\sum _{j=1}^N h(s_t^j)\Big )\textrm{d}t+2 \mathbbm {1}_{(0,\infty )}(s_t^i) \textrm{d}W_t^i\;, \nonumber \\ {}&\textrm{Law}(r_0^i,s_0^i)=\eta \;, \qquad \text {for}\ i\in \{1,\ldots ,N\} \end{aligned}$$
(35)

where \((\{W_t^i\}_{i=1}^N)_{t\ge 0}\) are N independent one-dimensional Brownian motions and where \(\eta \in \Gamma (\mu , \nu )\) for \(\mu , \nu \in {{\mathcal {P}}}({\mathbb {R}}_+)\).

Let \({\mathbb {W}}^{N}={{\mathcal {C}}}({\mathbb {R}}_+,{\mathbb {R}}^{N})\) be the space of continuous functions from \({\mathbb {R}}_+\) to \({\mathbb {R}}^{N}\) endowed with the topology of uniform convergence on compact sets, and let \({{\mathcal {B}}}({\mathbb {W}}^{N})\) denote its Borel \(\sigma \)-algebra.

Theorem 10

Assume that \(({\tilde{b}},g)\) and \(({\hat{b}},h)\) satisfy H1 and H2. Let \(\eta \in \Gamma (\mu ,\nu )\) where the probability measures \(\mu \) and \(\nu \) on \({\mathbb {R}}_+\) satisfy H3. Then there exists a weak solution \((\{r^i_t,s^i_t\}_{i=1}^N)_{t\ge 0}\) of the sticky stochastic differential equation (35) with initial distribution \(\eta ^{\otimes N}\) defined on a probability space \((\Omega , {{\mathcal {A}}},P)\) with values in \({\mathbb {W}}^N\times {\mathbb {W}}^N\). If additionally,

$$\begin{aligned}&{\tilde{b}}(r)\le {\hat{b}}(r) \quad \text {and} \quad g(r)\le h(r)\;,{} & {} \text {for any } r\in {\mathbb {R}}_+ \;, \\ {}&P[r_0^{i}\le s_0^{i} \text { for all } i=1,\ldots ,N]=1\;, \end{aligned}$$

then \(P[r_t^i\le s_t^i \text { for all } t\ge 0\text { and } i=1,\ldots ,N]=1\).

Proof

The proof is postponed to Sect. 6.5. \(\square \)

Remark 11

We note that by the comparison result we can deduce uniqueness in law for the solution of (34).

6 Proofs

Before proving the statements of Sects. 2–5, let us give an overview of the proofs. The first subsection gives the definition of the underlying distance function f used in Theorems 1, 7 and 8. Sections 6.2 and 6.3 provide proofs for the convergence result for the nonlinear SDE (Theorem 1) using the sticky coupling approach and the results for the sticky nonlinear SDE (Theorem 7). Note that both Theorems 1 and 7 use the auxiliary Lemmas 14–16, where a comparison result and an approximation in two steps of the sticky nonlinear SDE are given. The existence of a solution to the sticky nonlinear SDE and a comparison result are essential to show contraction in this approach.

In Sects. 6.4 and 6.5 the proofs for the propagation of chaos for the mean-field particle system and for the system of sticky SDEs are given. Note that the techniques to prove the result for the particle systems and the system of N sticky SDEs are partially similar to the nonlinear case. In particular, the proofs of Theorems 8 and 10 and their auxiliary Lemmas 18, 20 and 23 have a similar structure to the ones of Theorems 2 and 3 and their auxiliary Lemmas 12–16, respectively.

6.1 Definition of the metrics

In Theorems 1, 7 and 8 we consider Wasserstein distances based on a carefully designed concave function \(f:{\mathbb {R}}_+\rightarrow {\mathbb {R}}_+\) that we now define. In addition, we derive useful properties of this function that will be used in our proofs of Theorems 1, 7 and 8. Let \(a\in {\mathbb {R}}_+\) and \({\tilde{b}}:{\mathbb {R}}_+\rightarrow {\mathbb {R}}\) be such that H4 is satisfied with \({\tilde{R}}_0\) and \({\tilde{R}}_1\) defined in (29) and (30). We define

$$\begin{aligned} \varphi (r)&=\exp \left( -\int _0^r \{{\tilde{b}}(s)_+/2\}\textrm{d}s \right) \;, \qquad \Phi (r)=\int _0^r \varphi (s)\textrm{d}s\;,{} & {} \text { and } \\ g(r)&=1-\frac{c}{2}\int _0^{r\wedge {\tilde{R}}_1}\{\Phi (s)/\varphi (s)\}\textrm{d}s-\frac{a}{2}\int _0^{r \wedge {\tilde{R}}_1} \{1/\varphi (s)\}\textrm{d}s \;, \end{aligned}$$

where

$$\begin{aligned} c=\left( 2\int _0^{{\tilde{R}}_1} \{\Phi (s)/\varphi (s)\}\textrm{d}s \right) ^{-1}, \end{aligned}$$
(36)

and \({\tilde{R}}_1\) is given in (30). It holds \(\varphi (r)=\varphi ({\tilde{R}}_0)\) for \( r\ge {\tilde{R}}_0\) with \({\tilde{R}}_0\) given in (29), \(g(r)=g({\tilde{R}}_1)\in [1/2,3/4]\) for \(r\ge {\tilde{R}}_1\) and \(g(r)\in [1/2,1]\) for all \(r\in {\mathbb {R}}_+\) by (36) and H4. We define the increasing function \(f:[0,\infty )\rightarrow [0,\infty )\) by

$$\begin{aligned} f(t)=\int _0^t \varphi (r)g(r) \textrm{d}r \;. \end{aligned}$$
(37)

The construction is adapted from the function f given in [19]. Here, the function g has an extra term. As we see later in the proofs of Theorems 1 and 7, this term serves to control the term \(a{\mathbb {P}}[r_t>0]\). We observe that f is concave, since \(\varphi \) and g are decreasing. Since for all \(r\in {\mathbb {R}}_+\)

$$\begin{aligned} \varphi ({\tilde{R}}_0) r/2\le \Phi (r)/2\le f(r)\le \Phi (r)\le r\;, \end{aligned}$$
(38)

\((x,y)\mapsto f(|x-y|)\) defines a distance on \({\mathbb {R}}^d\) equivalent to the Euclidean distance on \({\mathbb {R}}^d\).

Moreover, f satisfies

$$\begin{aligned} 2f''(0)=-{\tilde{b}}(0)_+-a = -a\;, \end{aligned}$$
(39)

and

$$\begin{aligned} 2f''(r)\le 2f''(0)-f'(r){\tilde{b}}(r)-cf(r)\;,\qquad \text {for all } r\in {\mathbb {R}}_+ \backslash \{{\tilde{R}}_1\} \;. \end{aligned}$$
(40)

Indeed by construction of f, \(f''(r)=-{\tilde{b}}(r)_+f'(r)/2-c\Phi (r)/2-a/2\) for \(0\le r< {\tilde{R}}_1\) and so (40) holds for \(0\le r< {\tilde{R}}_1\) by (38). To show (40) for \(r>{\tilde{R}}_1\) note that \(f''(r)=0\) and \(f'(r)\ge \varphi ({\tilde{R}}_0)/2\) hold for \(r>{\tilde{R}}_1\). Hence, by the definition (30) of \({\tilde{R}}_1\), for \(r>{\tilde{R}}_1\),

$$\begin{aligned} f''(r)+f'(r){\tilde{b}}(r)/2\le \varphi ({\tilde{R}}_0){\tilde{b}}(r)/4\le -({\tilde{R}}_1({\tilde{R}}_1-{\tilde{R}}_0))^{-1}\varphi ({\tilde{R}}_0)r\;. \end{aligned}$$
(41)

Since \(\varphi (r)=\varphi ({\tilde{R}}_0)\) for \(r\ge {\tilde{R}}_0\), it holds \(\Phi (r)=\Phi ({\tilde{R}}_0)+(r-{\tilde{R}}_0)\varphi ({\tilde{R}}_0)\) for \(r\ge {\tilde{R}}_0\). Further, it holds \(\Phi ({\tilde{R}}_0)\ge {\tilde{R}}_0\varphi ({\tilde{R}}_0)\) since \(\varphi \) is decreasing for \(r\le {\tilde{R}}_0\). Hence,

$$\begin{aligned} \frac{r}{{\tilde{R}}_1}&=\frac{(r-{\tilde{R}}_1)(\Phi ({\tilde{R}}_0)+({\tilde{R}}_1-{\tilde{R}}_0)\varphi ({\tilde{R}}_0))}{{\tilde{R}}_1\Phi ({\tilde{R}}_1)}\nonumber \\&\quad +1\ge \frac{(r-{\tilde{R}}_1){\tilde{R}}_1\varphi ({\tilde{R}}_0)}{{\tilde{R}}_1\Phi ({\tilde{R}}_1)}+1 =\frac{\Phi (r)}{\Phi ({\tilde{R}}_1)}\;. \end{aligned}$$
(42)

Furthermore, we have

$$\begin{aligned} \int _{{\tilde{R}}_0}^{{\tilde{R}}_1} \{\Phi (s)/\varphi (s)\}\textrm{d}s&=\int _{{\tilde{R}}_0}^{{\tilde{R}}_1} \frac{\Phi ({\tilde{R}}_0)+(s-{\tilde{R}}_0)\varphi ({\tilde{R}}_0)}{\varphi ({\tilde{R}}_0)}\textrm{d}s \nonumber \\ {}&= ({\tilde{R}}_1-{\tilde{R}}_0)\frac{\Phi ({\tilde{R}}_0)}{\varphi ({\tilde{R}}_0)}+\frac{1}{2}({\tilde{R}}_1-{\tilde{R}}_0)^2 \ge \frac{1}{2}({\tilde{R}}_1-{\tilde{R}}_0)\frac{\Phi ({\tilde{R}}_1)}{\varphi ({\tilde{R}}_0)}\;. \end{aligned}$$
(43)

We insert (42) and (43) in (41) and use (36) to obtain

$$\begin{aligned} f''(r)+f'(r){\tilde{b}}(r)/2&\le -\Phi (r)\Phi ({\tilde{R}}_1)^{-1}({\tilde{R}}_1-{\tilde{R}}_0)^{-1}\varphi ({\tilde{R}}_0) \end{aligned}$$
(44)
$$\begin{aligned}&\le -\frac{\Phi (r)}{2\int _{{\tilde{R}}_0}^{{\tilde{R}}_1} \{\Phi (s)/\varphi (s)\}\textrm{d}s} \le -\frac{cf(r)}{2}-\frac{c\Phi (r)}{2}\;. \end{aligned}$$
(45)

By H4 and (36), we get

$$\begin{aligned} -\frac{c\Phi (r)}{2}\le -\frac{\Phi ({\tilde{R}}_1)}{4\int _0^{{\tilde{R}}_1} \{\Phi (s)/\varphi (s)\}\textrm{d}s}\le -\frac{1}{4\int _0^{{\tilde{R}}_1}\{1/\varphi (s)\}\textrm{d}s}\le -\frac{a}{2}=f''(0)\;. \end{aligned}$$

Combining this estimate with (45) gives (40) for \(r>{\tilde{R}}_1\). Hence, the choice of the underlying function f for the Wasserstein distance ensures (39) and (40). These properties guarantee that the term \(a{\mathbb {P}}[r_t>0]\) is controlled in (6) and contraction with rate c is obtained in Theorems 1, 7 and 8.
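The properties (39) and (40) can also be checked numerically. In the linear-drift case \({\tilde{b}}(r)=-{\tilde{L}}r\) one has \(\varphi \equiv 1\), \(\Phi (r)=r\), \({\tilde{R}}_0=0\), \({\tilde{R}}_1=2/\sqrt{{\tilde{L}}}\) and \(c={\tilde{L}}/4\), and f is piecewise polynomial. A sketch (the value of a is the largest allowed by H4 in this instance; the check runs over a finite grid avoiding the kink at \({\tilde{R}}_1\)):

```python
import math

Lt = 1.0
a = math.sqrt(Lt) / 4.0             # largest a compatible with H4 here
R1 = 2.0 / math.sqrt(Lt)            # R0 = 0 in this instance
c = Lt / 4.0                        # (36), since phi ≡ 1 and Phi(r) = r

def b(r): return -Lt * r

def fp(r):                          # f'(r) = phi(r) g(r) = g(r)
    r = min(r, R1)
    return 1.0 - (c / 4.0) * r * r - (a / 2.0) * r

def fpp(r):                         # f''(r), piecewise (f'' jumps at R1)
    return 0.0 if r > R1 else -(c / 2.0) * r - a / 2.0

def f(r):                           # antiderivative of f'
    s = min(r, R1)
    val = s - (c / 12.0) * s ** 3 - (a / 4.0) * s * s
    return val if r <= R1 else val + fp(R1) * (r - R1)

# (39): 2 f''(0) = -a; (40) on a grid avoiding the kink at R1
ok39 = abs(2.0 * fpp(0.0) + a) < 1e-12
ok40 = all(2.0 * fpp(r) <= 2.0 * fpp(0.0) - fp(r) * b(r) - c * f(r) + 1e-12
           for r in (0.01 * k for k in range(0, 1001)) if abs(r - R1) > 1e-9)
print(ok39, ok40)
```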

6.2 Proof of Sect. 2

First, we prove Theorem 1 by using Theorem 2 and properties of the carefully constructed function f, before we show Theorem 2. To prove that the dominating process \(r_t\) exists, we make use of the results on the sticky nonlinear SDE, which are proven in Sect. 6.3.1.

6.2.1 Proof of Theorem 1

Proof of Theorem 1

We consider the process \((\bar{X}_t,\bar{Y}_t,r_t)_{t\ge 0}\) defined in Theorem 2, which satisfies \(|\bar{X}_t-\bar{Y}_t|\le r_t\) for any \( t\ge 0\), where \((r_t)_{t\ge 0}\) is a weak solution of (22). Set \(a=2\Vert \gamma \Vert _\infty \) and \({\tilde{b}}(r)=\bar{b}(r)\). With this notation, B1 and B2 imply H4, and \({\tilde{R}}_0=R_0\) and \({\tilde{R}}_1=R_1\) by (14), (15), (29) and (30). By the Itō–Tanaka formula, cf. [39, Chapter 6, Theorem 1.1], using that \(f'\) is absolutely continuous, we have,

$$\begin{aligned} \textrm{d}f(r_t)&\le f'(r_t)(\bar{b}(r_t)+2\Vert \gamma \Vert _\infty {\mathbb {P}}(r_t>0))\textrm{d}t +2f''(r_t)\mathbbm {1}_{(0,\infty )}(r_t)\textrm{d}t \\ {}&\quad +f'(r_t)2\mathbbm {1}_{(0,\infty )}(r_t)\textrm{d}W_t\;. \end{aligned}$$

Taking expectation we obtain by (39) and (40)

$$\begin{aligned}&\frac{\textrm{d}}{\textrm{d}t}{\mathbb {E}}[f(r_t)]\le {\mathbb {E}}[f'(r_t){\tilde{b}}(r_t)_++ 2(f''(r_t)-f''(0))]+{\mathbb {E}}[(a+2f''(0))\mathbbm {1}_{r_t>0}]\\&\quad \le -{\tilde{c}}{\mathbb {E}}[f(r_t)]\;, \end{aligned}$$

where \({\tilde{c}}\) is given by (17). Therefore by Grönwall’s lemma,

$$\begin{aligned} {\mathbb {E}}[f(|\bar{X}_t-\bar{Y}_t|)]\le {\mathbb {E}}[f(r_t)]\le \textrm{e}^{-{\tilde{c}}t} {\mathbb {E}}[f(r_0)]=\textrm{e}^{-{\tilde{c}}t}{\mathbb {E}}[f(|\bar{X}_0-\bar{Y}_0|)]\;. \end{aligned}$$

Hence, it holds

$$\begin{aligned} {{\mathcal {W}}}_f({\bar{\mu }}_t,{\bar{\nu }}_t)\le {\mathbb {E}}[f(|\bar{X}_t-\bar{Y}_t|)]\le \textrm{e}^{-{\tilde{c}}t}\int _{{\mathbb {R}}^d\times {\mathbb {R}}^d} f(|x-y|) \xi (\textrm{d}x \textrm{d}y) \end{aligned}$$

for an arbitrary coupling \(\xi \in \Gamma (\mu _0,\nu _0)\). Taking the infimum over all couplings \(\xi \in \Gamma (\mu _0,\nu _0)\), we obtain the first inequality of (16). By (38), we get the second inequality of (16). \(\square \)

6.2.2 Proof of Theorem 2

Note that the nonlinear SDE (21) has Lipschitz continuous coefficients. The existence and the uniqueness of the coupling \((\bar{X}_t^\delta ,\bar{Y}_t^\delta )_{t\ge 0}\) follow from [36, Theorem 2.2]. By Lévy's characterization, \((\bar{X}_t^\delta ,\bar{Y}_t^\delta )_{t\ge 0}\) is indeed a coupling of two copies of solutions of (1). Further, we remark that \(W_t^\delta =\int _0^t (\bar{e}_s^\delta )^T \textrm{d}B_s^1\) is a one-dimensional Brownian motion. In the next step, we analyse \(|\bar{X}_t^\delta -\bar{Y}_t^\delta |\).

Lemma 12

Suppose that the conditions B1 and B3 are satisfied. Then, setting \(\bar{r}_t^\delta =|\bar{X}_t^\delta -\bar{Y}_t^\delta |\), it holds for any \(\epsilon <\epsilon _0\), where \(\epsilon _0\) is given by (20), that

$$\begin{aligned} \textrm{d}\bar{r}_t^\delta&= \Big (-L \bar{r}_t^\delta +\Big \langle {\bar{e}_t^\delta },\int _{\mathbb {R^d}}\int _{{\mathbb {R}}^d}\gamma (\bar{X}_t^\delta -x)-\gamma (\bar{Y}_t^\delta -y)\mu _t^\delta (\textrm{d}x)\nu _t^\delta (\textrm{d}y)\Big \rangle \Big ) \textrm{d}t+2 \textrm{rc}^\delta (\bar{r}_t^\delta )\textrm{d}W_t^\delta \end{aligned}$$
(46)
$$\begin{aligned}&\le \Big (\bar{b}(\bar{r}_t^\delta )+2\Vert \gamma \Vert _\infty \int _{{\mathbb {R}}^{d}} \int _{{\mathbb {R}}^{d}} \textrm{rc}^\epsilon (|x-y|) {\bar{\mu }}_t^\delta (\textrm{d}x) {\bar{\nu }}_t^\delta (\textrm{d}y)\Big ) \textrm{d}t+2 \textrm{rc}^\delta (\bar{r}_t^\delta )\textrm{d}W_t^\delta \;, \end{aligned}$$
(47)

almost surely for all \(t\ge 0\), where \({\bar{\mu }}_t^\delta \) and \({\bar{\nu }}_t^\delta \) are the laws of \(\bar{X}_t^\delta \) and \(\bar{Y}_t^\delta \), respectively.

Proof

Using (21), B1 and B3, the stochastic differential equation of the process \(((\bar{r}_t^\delta )^2)_{t\ge 0}\) is given by

$$\begin{aligned} \textrm{d}((\bar{r}_t^\delta )^2)&=2 \Big \langle Z_t^\delta ,-LZ_t^\delta + \int _{{\mathbb {R}}^d}\int _{{\mathbb {R}}^d} \gamma (\bar{X}_t^\delta -x)-\gamma (\bar{Y}_t^\delta -y) {\bar{\mu }}_t^\delta (\textrm{d}x) {\bar{\nu }}_t^\delta (\textrm{d}y) \Big \rangle \textrm{d}t \\ {}&\quad + 4 \textrm{rc}^\delta (\bar{r}_t^\delta )^2\textrm{d}t+ 4\textrm{rc}^\delta (\bar{r}_t^\delta )\langle Z_t^\delta , e_t^\delta \rangle \textrm{d}W_t^\delta \;. \end{aligned}$$

For \({{\varepsilon }}>0\) we define as in [18, Lemma 8] a \({{\mathcal {C}}}^2\) approximation of the square root by

$$\begin{aligned} S_{{\varepsilon }}(r)={\left\{ \begin{array}{ll}(-1/8){{\varepsilon }}^{-3/2}r^2+(3/4){{\varepsilon }}^{-1/2}r+(3/8){{\varepsilon }}^{1/2} &{} \text {for } r<{{\varepsilon }} \\ \sqrt{r} &{} \text {otherwise}\;. \end{array}\right. } \end{aligned}$$

Then, by Itō’s formula,

$$\begin{aligned} \textrm{d}S_{{\varepsilon }}((\bar{r}_t^\delta )^2)&=S_{{\varepsilon }}'((\bar{r}_t^\delta )^2)\textrm{d}(\bar{r}_t^\delta )^2+\frac{1}{2}S_{{\varepsilon }}''((\bar{r}_t^\delta )^2)\textrm{d}[(\bar{r}^\delta )^2]_t \\ {}&= 2 S_{{\varepsilon }}'((\bar{r}_t^\delta )^2) \Big \langle Z_t^\delta ,-LZ_t^\delta + \int _{{\mathbb {R}}^d}\int _{{\mathbb {R}}^d} \gamma (\bar{X}_t^\delta -x)-\gamma (\bar{Y}_t^\delta -y) {\bar{\mu }}_t^\delta (\textrm{d}x) {\bar{\nu }}_t^\delta (\textrm{d}y) \Big \rangle \textrm{d}t \\ {}&\quad + S_{{\varepsilon }}'((\bar{r}_t^\delta )^2) 4 \textrm{rc}^\delta (\bar{r}_t^\delta )^2\textrm{d}t+ S_{{\varepsilon }}'((\bar{r}_t^\delta )^2)4 \textrm{rc}^\delta (\bar{r}_t^\delta )\langle Z_t^\delta , e_t^\delta \rangle \textrm{d}W_t^\delta + 8 S_{{\varepsilon }}''((\bar{r}_t^\delta )^2)(\textrm{rc}^\delta (\bar{r}_t^\delta ))^2 (\bar{r}_t^\delta )^2 \textrm{d}t\;. \end{aligned}$$

We take the limit \({{\varepsilon }} \rightarrow 0 \). Then \(\lim _{{\varepsilon }\rightarrow 0} S_{{\varepsilon }}'(r)=(1/2)r^{-1/2}\) and \(\lim _{{\varepsilon }\rightarrow 0} S_{{\varepsilon }}''(r)=-(1/4)r^{-3/2}\) for \(r > 0\). Since \(\sup _{0\le r\le \varepsilon }|S_{{\varepsilon }}'(r)|\lesssim {{\varepsilon }}^{-1/2}\), \(\sup _{0\le r\le \varepsilon }|S_{{\varepsilon }}''(r)|\lesssim {{\varepsilon }}^{-3/2}\) and \(\textrm{rc}^\delta \) is Lipschitz continuous with \(\textrm{rc}^\delta (0)=0\), we apply Lebesgue's dominated convergence theorem to show convergence for the integrals with respect to time t. More precisely, we note that the integrand \(\big (4S_{\varepsilon }'((\bar{r}_t^\delta )^2)+8S_{\varepsilon }''((\bar{r}_t^\delta )^2)(\bar{r}_t^\delta )^2\big )\textrm{rc}^\delta (\bar{r}_t^\delta )^2\) is dominated by \(3\varepsilon ^{1/2}\Vert \textrm{rc}^\delta \Vert _{\textrm{Lip}}\). For any \(\varepsilon <\varepsilon _0\) for fixed \(\varepsilon _0>0\), the integrand \(2 S_{{\varepsilon }}'((\bar{r}_t^\delta )^2) \langle Z_t^\delta ,-LZ_t^\delta + \int _{{\mathbb {R}}^d}\int _{{\mathbb {R}}^d} (\gamma (\bar{X}_t^\delta -x)-\gamma (\bar{Y}_t^\delta -y)) {\bar{\mu }}_t^\delta (\textrm{d}x) {\bar{\nu }}_t^\delta (\textrm{d}y)\rangle \) is dominated by \((3/2)(L\max (\varepsilon _0^{1/2},\bar{r}_t^\delta )+2\Vert \gamma \Vert _\infty )\).

For the stochastic integral it holds \(|S_{\varepsilon }'((\bar{r}_t^\delta )^2)4\textrm{rc}^\delta (\bar{r}_t^\delta ) \bar{r}_t^\delta |\le 3\). Hence, the stochastic integral converges along a subsequence almost surely to \(\int _0^t2 \textrm{rc}^\delta (\bar{r}_s^\delta ) \textrm{d}W_s^\delta \), see [39, Chapter 4, Theorem 2.12], and we obtain (46). Since (12) implies \(\langle x-y, \gamma (x-{\tilde{x}})-\gamma (y-{\tilde{x}})\rangle \le \kappa (|x-y|)|x-y|^2\) for all \(x,y,{\tilde{x}}\in {\mathbb {R}}^d\), we obtain by B1 and (20) for \(\epsilon <\epsilon _0\)

$$\begin{aligned}&\Big \langle {\bar{e}_t^\delta },\int _{{\mathbb {R}}^d}\int _{{\mathbb {R}}^d}(\gamma (\bar{X}_t^\delta -x)-\gamma (\bar{Y}_t^\delta -y))\mu _t^\delta (\textrm{d}x)\nu _t^\delta (\textrm{d}y)\Big \rangle \\ {}&\quad \le \Big \langle {\bar{e}_t^\delta },\int _{{\mathbb {R}}^d}\int _{{\mathbb {R}}^d}(\gamma (\bar{X}_t^\delta -x)-\gamma (\bar{Y}_t^\delta -x)+\gamma (\bar{Y}_t^\delta -x)-\gamma (\bar{Y}_t^\delta -y))\mu _t^\delta (\textrm{d}x)\nu _t^\delta (\textrm{d}y)\Big \rangle \\ {}&\quad \le \kappa (\bar{r}_t^\delta )\bar{r}_t^\delta +\int _{{\mathbb {R}}^d}\int _{{\mathbb {R}}^d} 2\Vert \gamma \Vert _\infty \textrm{rc}^\epsilon (|x-y|) \mu _t^\delta (\textrm{d}x)\nu _t^\delta (\textrm{d}y)\;, \end{aligned}$$

and hence (47) holds. \(\square \)
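The \({{\mathcal {C}}}^2\) gluing of \(S_{{\varepsilon }}\) at \(r={{\varepsilon }}\) used above is easily verified numerically: value, first and second derivative of the two branches agree at \({{\varepsilon }}\), and \(S_{{\varepsilon }}\rightarrow \sqrt{\cdot }\) uniformly as \({{\varepsilon }}\rightarrow 0\). A quick check (tolerances and grids are illustrative):

```python
def S(eps, r):
    # C^2 regularisation of the square root from the proof of Lemma 12
    if r < eps:
        return (-1.0 / 8.0) * eps ** -1.5 * r * r + 0.75 * eps ** -0.5 * r \
               + (3.0 / 8.0) * eps ** 0.5
    return r ** 0.5

def dS(eps, r):
    return (-0.25) * eps ** -1.5 * r + 0.75 * eps ** -0.5 if r < eps \
           else 0.5 * r ** -0.5

def d2S(eps, r):
    return (-0.25) * eps ** -1.5 if r < eps else -0.25 * r ** -1.5

eps = 1e-2
# value, first and second derivative match at r = eps (C^2 gluing) ...
match = all(abs(F(eps, eps - 1e-12) - F(eps, eps)) < 1e-6
            for F in (S, dS, d2S))
# ... and S_eps(r) -> sqrt(r) as eps -> 0, uniformly on [0, 1]
gap = max(abs(S(1e-6, 0.001 * k) - (0.001 * k) ** 0.5) for k in range(0, 1001))
print(match, gap)
```

The maximal deviation is attained at \(r=0\), where \(S_{{\varepsilon }}(0)=(3/8)\sqrt{{\varepsilon }}\), which is consistent with the uniform convergence used in the limiting argument.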

We define a one-dimensional process \((r_t^{\delta ,\epsilon })_{t\ge 0}\) by

$$\begin{aligned} \textrm{d}r_t^{\delta ,\epsilon } =\Big (\bar{b}(r_t^{\delta ,\epsilon })+2 \Vert \gamma \Vert _\infty \int _{{\mathbb {R}}_+} \textrm{rc}^{\epsilon }(u) P_t^{\delta ,\epsilon }(\textrm{d}u) \Big )\textrm{d}t+2 \textrm{rc}^\delta ( r_t^{\delta ,\epsilon }) \textrm{d}W_t^\delta \end{aligned}$$
(48)

with initial condition \(r_0^{\delta ,\epsilon } =\bar{r}_0^\delta \), \(P_t^{\delta ,\epsilon }=\textrm{Law}(r_t^{\delta ,\epsilon })\) and \(W_t^\delta =\int _0^t (\bar{e}_s^\delta )^T \textrm{d}B_s^1\). This process will allow us to control the distance of \(\bar{X}_t^\delta \) and \(\bar{Y}_t^\delta \).

By [36, Theorem 2.2], under B1 and B3, \((U_t^{\delta ,\epsilon })_{t\ge 0}=(\bar{X}_t^\delta ,\bar{Y}_t^\delta ,r_t^{\delta ,\epsilon })_{t\ge 0}\) exists and is unique, where \((\bar{X}_t^\delta ,\bar{Y}_t^\delta )_{t\ge 0}\) uniquely solves (21), and \((\bar{r}_t^\delta )_{t\ge 0}\) and \((r_t^{\delta ,\epsilon })_{t\ge 0}\) uniquely solve (46) and (48), respectively, with \(W_t^\delta =\int _0^t (\bar{e}_s^\delta )^T \textrm{d}B_s^1\).

Lemma 13

Assume B1 and B3. Then, for any \(\epsilon <\epsilon _0\), it holds \(|\bar{X}_t^\delta -\bar{Y}_t^\delta |=\bar{r}_t^\delta \le r_t^{\delta ,\epsilon }\) almost surely for all \(t\ge 0\).

Proof

Note that \((\bar{r}_t^\delta )_{t\ge 0}\) and \((r_t^{\delta ,\epsilon })_{t\ge 0}\) have the same initial distribution and are driven by the same noise. Since the drift of \((\bar{r}_t^\delta )_{t\ge 0}\) is smaller than the drift of \((r_t^{\delta ,\epsilon })_{t\ge 0}\) for \(\epsilon <\epsilon _0\), the result follows by Lemma 14.

\(\square \)

Proof of Theorem 2

We consider the nonlinear process \((U_t^{\delta ,\epsilon })_{t\ge 0}=(\bar{X}_t^\delta ,\bar{Y}_t^\delta ,r_t^{\delta ,\epsilon })_{t\ge 0}\) on \({\mathbb {R}}^{2d+1}\) for each \(\epsilon ,\delta >0\). We denote by \({\mathbb {P}}^{\delta ,\epsilon }\) the law of \(U^{\delta ,\epsilon }\) on the space \({{\mathcal {C}}}({\mathbb {R}}_+,{\mathbb {R}}^{2d+1})\). We denote by \({\varvec{X}},{\varvec{Y}}:{{\mathcal {C}}}({\mathbb {R}}_+,{\mathbb {R}}^{2d+1})\rightarrow {{\mathcal {C}}}({\mathbb {R}}_+,{\mathbb {R}}^{d})\) and \({\varvec{r}}:{{\mathcal {C}}}({\mathbb {R}}_+,{\mathbb {R}}^{2d+1})\rightarrow {{\mathcal {C}}}({\mathbb {R}}_+,{\mathbb {R}})\) the canonical projections onto the first d components, onto the second d components and onto the last component, respectively. By B1 and B3, following the same lines as the proof of Lemma 15, see (56), it holds for each \(T>0\)

$$\begin{aligned} E[|U_{t_2}^{\delta ,\epsilon }-U_{t_1}^{\delta ,\epsilon }|^4]\le C|t_2-t_1|^2 \qquad \text {for } t_1,t_2\in [0,T] \;, \end{aligned}$$
(49)

for some constant C depending on T, L, \(\Vert \gamma \Vert _{\textrm{Lip}}\), \(\Vert \gamma \Vert _\infty \) and on the fourth moment of \(\mu _0\) and \(\nu _0\). As in Lemma 15 the law \({\mathbb {P}}_T^{\delta ,\epsilon }\) of \((U_t^{\delta ,\epsilon })_{0\le t\le T}\) on \({{\mathcal {C}}}([0,T],{\mathbb {R}}^{2d+1})\) is tight for each \(T>0\) by [31, Corollary 14.9] and for each \(\epsilon >0\) there exists a subsequence \(\delta _n\rightarrow 0\) such that \(({\mathbb {P}}^{\delta _n,\epsilon }_T)_{n\in {\mathbb {N}}}\) on \({{\mathcal {C}}}([0,T],{\mathbb {R}}^{2d+1})\) converges to a measure \({\mathbb {P}}^\epsilon _T\) on \({{\mathcal {C}}}([0,T],{\mathbb {R}}^{2d+1})\). By a diagonalization argument and since \(\{{\mathbb {P}}^\epsilon _T: T\ge 0\}\) is a consistent family, cf. [31, Theorem 5.16], there exists a probability measure \({\mathbb {P}}^\epsilon \) on \({{\mathcal {C}}}({\mathbb {R}}_+,{\mathbb {R}}^{2d+1})\) such that for all \(\epsilon \) there exists a subsequence \(\delta _n\) such that \(({\mathbb {P}}^{\delta _n,\epsilon })_{n\in {\mathbb {N}}}\) converges along this subsequence to \({\mathbb {P}}^\epsilon \). As in the proof of Lemma 16 we repeat this argument for the family of measures \(({\mathbb {P}}^\epsilon )_{\epsilon >0}\). Hence, there exists a subsequence \(\epsilon _m\rightarrow 0\) such that \(({\mathbb {P}}^{\epsilon _m})_{m\in {\mathbb {N}}}\) converges to a measure \({\mathbb {P}}\). Let \((\bar{X}_t,\bar{Y}_t,r_t)_{t\ge 0}\) be some process on \({\mathbb {R}}^{2d+1}\) with distribution \({\mathbb {P}}\) on \(({\bar{\Omega }},\bar{{{\mathcal {F}}}}, \bar{P})\).

Since \((\bar{X}_t^\delta )_{t\ge 0}\) and \((\bar{Y}_t^\delta )_{t\ge 0}\) are solutions of (1) which are unique in law, we have that for any \(\epsilon , \delta >0\), \({\mathbb {P}}^{\delta ,\epsilon }\circ {\varvec{X}}^{-1}={\mathbb {P}}\circ {\varvec{X}}^{-1}\) and \({\mathbb {P}}^{\delta ,\epsilon }\circ {\varvec{Y}}^{-1}={\mathbb {P}}\circ {\varvec{Y}}^{-1}\). Therefore, \((\bar{X}_t)_{t\ge 0}\) and \((\bar{Y}_t)_{t\ge 0}\) are solutions of (1) as well with the same initial condition. Hence \({\mathbb {P}}\circ ({\varvec{X}},{\varvec{Y}})^{-1}\) is a coupling of two copies of (1).

Similarly to the proof of Lemmas 15 and 16 there exist an extended probability space and a one-dimensional Brownian motion \((W_t)_{t\ge 0}\) such that \((r_t,W_t)_{t\ge 0}\) is a solution to

$$\begin{aligned} \textrm{d}r_t=(\bar{b}(r_t)+2\Vert \gamma \Vert _\infty {\mathbb {P}}(r_t >0))\textrm{d}t+2 \mathbbm {1}_{(0,\infty )}(r_t) \textrm{d}W_t\;. \end{aligned}$$

In addition, the statement of Lemma 13 carries over to the limiting process \((r_t)_{t\ge 0}\), i.e., \(|\bar{X}_t-\bar{Y}_t|\le r_t\) for all \(t\ge 0\), since by the weak convergence along the subsequences \((\delta _n)_{n\in {\mathbb {N}}}\) and \((\epsilon _m)_{m\in {\mathbb {N}}}\) and the Portmanteau theorem, \(P(|\bar{X}_t-\bar{Y}_t|\le r_t)\ge \limsup _{m\rightarrow \infty }\limsup _{n\rightarrow \infty } P(|\bar{X}_t^{\delta _n}-\bar{Y}_t^{\delta _n}|\le r_t^{\delta _n,\epsilon _m})=1\).

\(\square \)

6.3 Proofs of Sect. 3

First, we introduce a family of nonlinear SDEs whose drift and diffusion coefficients are Lipschitz continuous approximations of the drift and diffusion coefficient of (25). Theorem 3 is then proved by establishing a comparison result for nonlinear SDEs, taking the limit of the approximations in two steps, and identifying the limit with the solution of (25). Finally, Theorems 5 and 7 are proved, making use of the careful construction of the function f.

6.3.1 Proof of Theorem 3

We show Theorem 3 via a family of stochastic differential equations, indexed by \(n,m\in {\mathbb {N}}\), with Lipschitz continuous coefficients,

$$\begin{aligned} \textrm{d}r_t^{n,m}&=({\tilde{b}}(r_t^{n,m})+P_t^{n,m}(g^m))\textrm{d}t+2 \theta ^n(r_t^{n,m}) \textrm{d}W_t \nonumber \\ \textrm{d}s_t^{n,m}&=({\hat{b}}(s_t^{n,m})+{\hat{P}}_t^{n,m}(h^m))\textrm{d}t+2\theta ^n(s_t^{n,m}) \textrm{d}W_t\;, \qquad \text {Law}(r_0^{n,m},s_0^{n,m})=\eta _{n,m}\;, \nonumber \\ \end{aligned}$$
(50)

where \(P_t^{n,m}=\textrm{Law}(r_t^{n,m})\), \({\hat{P}}^{n,m}_t=\textrm{Law}(s_t^{n,m})\), \(P_t^{n,m}(g^m)=\int _{{\mathbb {R}}_+} g^m(x)P_t^{n,m}(\textrm{d}x)\) and \({\hat{P}}^{n,m}_t(h^m)=\int _{{\mathbb {R}}_+} h^m(x){\hat{P}}^{n,m}_t(\textrm{d}x)\) for some measurable functions \((g^m)_{m\in {\mathbb {N}}}\) and \((h^m)_{m\in {\mathbb {N}}}\), and where \(\eta _{n,m}\in \Gamma (\mu _{n,m}, \nu _{n,m})\) for \(\mu _{n,m}, \nu _{n,m}\in {{\mathcal {P}}}({\mathbb {R}}_+)\). We identify the weak limit as \(n\rightarrow \infty \) as the solution of a family of stochastic differential equations, indexed by \(m\in {\mathbb {N}}\), given by

$$\begin{aligned} \textrm{d}r_t^{m}&=({\tilde{b}}(r_t^{m})+P_t^{m}(g^m))\textrm{d}t+2 \mathbbm {1}_{(0,\infty )}(r_t^{m}) \textrm{d}W_t \nonumber \\ \textrm{d}s_t^{m}&=({\hat{b}}(s_t^{m})+{\hat{P}}_t^{m}(h^m))\textrm{d}t+2\mathbbm {1}_{(0,\infty )}(s_t^{m}) \textrm{d}W_t\;, \qquad \text {Law}(r_0^{m},s_0^{m})=\eta _m \;, \end{aligned}$$
(51)

with \(P_t^{m}=\textrm{Law}(r_t^{m})\) and \({\hat{P}}^{m}_t=\textrm{Law}(s_t^{m})\), and where \(\eta _{m}\in \Gamma (\mu _{m}, \nu _{m})\) for \(\mu _{m}, \nu _{m}\in {{\mathcal {P}}}({\mathbb {R}}_+)\). Taking the limit \(m\rightarrow \infty \), we show in the next step that the solution of (51) converges to a solution of (25).

We assume for \((g^m)_{m\in {\mathbb {N}}}\), \((h^m)_{m\in {\mathbb {N}}}\), \((\theta ^n)_{n\in {\mathbb {N}}}\) and the initial distributions:

H5

\((g^m)_{m\in {\mathbb {N}}}\) and \((h^m)_{m\in {\mathbb {N}}}\) are sequences of non-decreasing non-negative uniformly bounded Lipschitz continuous functions such that for all \(r\ge 0\), \(g^m(r)\le g^{m+1}(r)\) and \(h^m(r)\le h^{m+1}(r)\) and \(\lim _{m \rightarrow +\infty } g^m(r) = g(r)\) and \(\lim _{m \rightarrow +\infty } h^m(r) = h(r)\) where g, h are left-continuous non-negative non-decreasing bounded functions. In addition, there exists \(K_m<\infty \) for any m such that for all \(r,s\in {\mathbb {R}}\)

$$\begin{aligned} |g^m(r)-g^m(s)|\le K_m |r-s| \qquad \text {and} \qquad |h^m(r)-h^m(s)|\le K_m |r-s|\;. \end{aligned}$$

H6

\((\theta ^n)_{n\in {\mathbb {N}}}\) is a sequence of Lipschitz continuous functions from \({\mathbb {R}}_+\) to [0, 1] with \(\theta ^n(0)=0\), \(\theta ^n(r)=1\) for all \(r\ge 1/n\) and \(\theta ^n(r)>0\) for all \(r> 0\).

H7

\((\mu _{n,m})_{n,m\in {\mathbb {N}}}\), \((\nu _{n,m})_{n,m\in {\mathbb {N}}}\), \((\mu _{m})_{m\in {\mathbb {N}}}\), \((\nu _{m})_{m\in {\mathbb {N}}}\) are families of probability distributions on \({\mathbb {R}}_+\), and \((\eta _{n,m})_{n,m\in {\mathbb {N}}}\), \((\eta _{m})_{m\in {\mathbb {N}}}\) are families of probability distributions on \({\mathbb {R}}_+^2\), such that \(\eta _{n,m}\in \Gamma (\mu _{n,m},\nu _{n,m})\) and \(\eta _{m}\in \Gamma (\mu _{m},\nu _{m})\) for any \(n,m\in {\mathbb {N}}\), and such that for any \(m\in {\mathbb {N}}\), \((\eta _{n,m})_{n\in {\mathbb {N}}}\) converges weakly to \(\eta _m\) and \((\eta _m)_{m\in {\mathbb {N}}}\) converges weakly to \(\eta \). Further, the p-th order moments of \((\mu _{n,m})_{n,m\in {\mathbb {N}}}\), \((\nu _{n,m})_{n,m\in {\mathbb {N}}}\), \((\mu _{m})_{m\in {\mathbb {N}}}\) and \((\nu _{m})_{m\in {\mathbb {N}}}\) are uniformly bounded for the exponent \(p>2\) given in H3.

Note that by H5, for any non-decreasing sequence \((u_m)_{m\in {\mathbb {N}}}\) converging to \(u\in {\mathbb {R}}_+\), \(g^m(u_m)\) and \(h^m(u_m)\) converge to g(u) and h(u), respectively. More precisely, it holds for all \(m\in {\mathbb {N}}\), \(g^m(u_m)- g(u)\le 0\), and for \(m\ge n\), \(g^m(u_m)\ge g^m(u_n)\), and therefore \(\liminf _{m\rightarrow \infty } (g^m(u_m)-g(u))\ge \lim _{n\rightarrow \infty }\lim _{m\rightarrow \infty } (g^m(u_n)-g(u))=\lim _{n\rightarrow \infty }(g(u_n)-g(u))=0\) by left-continuity of g. Hence, \(\lim _{m\rightarrow \infty }(g^m(u_m)-g(u))=0\) and analogously \(\lim _{m\rightarrow \infty }(h^m(u_m)-h(u))=0\). By H5, \(\Gamma =\max (\Vert h\Vert _\infty ,\Vert g\Vert _\infty )\) is a uniform upper bound for \((g^m)_{m\in {\mathbb {N}}}\) and \((h^m)_{m\in {\mathbb {N}}}\).
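This approximation property can be checked on a concrete example. The functions g and \(g^m\) below are our own illustrative choices (a left-continuous step function and piecewise-linear approximations with \(K_m=m\)), not taken from the paper:

```python
# Numerical illustration (our own example) of the remark above: for a
# left-continuous, non-decreasing, bounded g and monotone Lipschitz
# approximations g^m <= g^{m+1} converging pointwise to g, one has
# g^m(u_m) -> g(u) along every non-decreasing sequence u_m -> u.

def g(r):
    # left-continuous step function with a jump just above r = 1
    return 1.0 if r > 1.0 else 0.0

def g_m(m, r):
    # Lipschitz approximation with constant K_m = m; g^m <= g^{m+1} <= g
    return min(1.0, max(0.0, m * (r - 1.0)))

# monotonicity in m and pointwise convergence on a grid
for r in [0.0, 0.5, 1.0, 1.5, 2.0]:
    vals = [g_m(m, r) for m in range(1, 2001)]
    assert all(a <= b for a, b in zip(vals, vals[1:]))  # g^m non-decreasing in m
    assert abs(vals[-1] - g(r)) < 1e-12                 # g^m(r) -> g(r)

# non-decreasing sequences u_m -> u: at the jump point the left limit wins
print(g_m(1000, 1.0 - 1.0 / 1000), g_m(1000, 2.0 - 1.0 / 1000))
```

For \(u_m = 1 - 1/m \uparrow 1\) the limit is \(g(1)=0\), while for \(u_m = 2 - 1/m \uparrow 2\) it is \(g(2)=1\), matching the remark.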

Consider a probability space \((\Omega _0, {{\mathcal {A}}}_0, Q)\) supporting a one-dimensional Brownian motion \((W_t)_{t\ge 0}\). Under H5, H6 and H7, for all \(n,m\in {\mathbb {N}}\) there exist random variables \(r^{n,m}, s^{n,m}:\Omega _0\rightarrow {\mathbb {W}}\) such that \((r^{n,m}_t,s^{n,m}_t)_{t\ge 0}\) is the unique strong solution to (50) associated with \((W_t)_{t\ge 0}\) by [36, Theorem 2.2]. We denote by \({\mathbb {P}}^{n,m}=Q\circ (r^{n,m},s^{n,m})^{-1}\) the corresponding distribution on \({\mathbb {W}}\times {\mathbb {W}}\).
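For intuition, the first equation in (50) can be simulated with an Euler–Maruyama scheme. The concrete coefficients below are illustrative stand-ins satisfying H1, H5 and H6, and the mean-field term \(P_t^{n,m}(g^m)\) is approximated by an empirical average over N particles; this is a numerical sketch only, not part of the existence argument via [36, Theorem 2.2]:

```python
import math, random

# Hypothetical Euler-Maruyama sketch of the r-equation in (50).  All
# coefficients are illustrative stand-ins: b_tilde is Lipschitz, g_m
# satisfies H5, theta_n satisfies H6.  The law P_t^{n,m} is replaced by
# the empirical distribution of N simulated particles.

def b_tilde(r):            # Lipschitz drift
    return -0.5 * r

def g_m(r):                # non-decreasing, bounded, Lipschitz (H5)
    return min(1.0, max(0.0, r))

def theta_n(r, n=20):      # theta^n(0)=0, theta^n(r)=1 for r >= 1/n (H6)
    return min(1.0, n * r)

def simulate(N=500, T=1.0, dt=1e-3, seed=0):
    rng = random.Random(seed)
    r = [1.0] * N                             # deterministic initial condition
    for _ in range(int(T / dt)):
        mean_g = sum(g_m(x) for x in r) / N   # empirical stand-in for P_t(g^m)
        r = [max(0.0, x + (b_tilde(x) + mean_g) * dt
                 + 2.0 * theta_n(x) * rng.gauss(0.0, math.sqrt(dt)))
             for x in r]                      # clipped at 0: state space is R_+
    return r

r_T = simulate()
print(min(r_T), sum(r_T) / len(r_T))
```

The clipping at 0 is a crude numerical substitute for the fact that the exact solution stays on \({\mathbb {R}}_+\).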

Before studying the two limits \(n,m\rightarrow \infty \) and proving Theorem 3, we state a modification of the comparison theorem by Ikeda and Watanabe to compare two solutions of (50), cf. [28, Section VI, Theorem 1.1].

Lemma 14

Let \((r_t^{n,m},s_t^{n,m})_{t\ge 0}\) be a solution of (50) for fixed \(n,m\in {\mathbb {N}}\). Assume H1, H5 and H6. If \(Q[r_0^{n,m}\le s_0^{n,m}]=1\), \({\tilde{b}}(r)\le {\hat{b}}(r)\) and \(g^m(r)\le h^m(r)\) for any \(r\in {\mathbb {R}}_+\), then

$$\begin{aligned} Q[r_t^{n,m}\le s_t^{n,m} \text { for all } t\ge 0]=1\;. \end{aligned}$$
(52)

Proof

For simplicity, we drop the dependence on n and m in \((r_t^{n,m})\) and \((s_t^{n,m})\). Denote by \(\rho \) the Lipschitz constant of \(\theta ^n\). Let \((a_k)_{k\in {\mathbb {N}}}\) be a decreasing sequence, \(1>a_1>a_2>\ldots>a_k>\ldots >0\), such that \(\int _{a_1}^1\rho ^{-2}x^{-1}\textrm{d}x=1\), \(\int _{a_2}^{a_1}\rho ^{-2}x^{-1}\textrm{d}x=2\), \(\ldots \), \(\int _{a_k}^{a_{k-1}}\rho ^{-2}x^{-1}\textrm{d}x=k\). We choose a sequence \((\Psi _k)_{k\in {\mathbb {N}}}\) of continuous functions such that the support of \(\Psi _k\) is contained in \((a_k,a_{k-1})\), \(\int _{a_k}^{a_{k-1}}\Psi _k(u)\textrm{d}u=1\) and \(0\le \Psi _k(u)\le (2/k)\rho ^{-2} u^{-2}\). Such functions exist. We set

$$\begin{aligned} \varphi _k(x)={\left\{ \begin{array}{ll} \int _0^x \textrm{d}y \int _0^y \Psi _k(u)\textrm{d}u &{} \text { if } x\ge 0, \\ 0\;&{} \text { if } x<0\;. \end{array}\right. } \end{aligned}$$

Note that for any \(k\in {\mathbb {N}}\), \(\varphi _k\in {{\mathcal {C}}}^2({\mathbb {R}}_+)\), \(|\varphi '_k(x)|\le 1\), \(\varphi _k(x)\rightarrow x_+\) as \(k\uparrow \infty \) and \(\varphi '_k(x)\uparrow \mathbbm {1}_{(0,\infty )}(x)\). Applying Itō’s formula to \(\varphi _k(r_t-s_t)\), we obtain

$$\begin{aligned} \varphi _k(r_t-s_t)=\varphi _k(r_0-s_0)+I_1(k)+I_2(k)+I_3(k)\;, \end{aligned}$$

where

$$\begin{aligned} I_1(k)&=\int _0^t\varphi _k'(r_u-s_u)[\theta ^n(r_u)-\theta ^n(s_u)]\textrm{d}W_u\;, \\ I_2(k)&=\int _0^t \varphi _k'(r_u-s_u)[{\tilde{b}}(r_u)-{\hat{b}}(s_u)+P_u(g^m)-{\hat{P}}_u(h^m)] \textrm{d}u \;, \\ I_3(k)&=\frac{1}{2}\int _0^t \varphi _k''(r_u-s_u)[\theta ^n(r_u)-\theta ^n(s_u)]^2\textrm{d}u \;, \end{aligned}$$

with \(P_u=Q\circ r_u^{-1}\) and \({\hat{P}}_u=Q\circ s_u^{-1}\). It holds by boundedness and Lipschitz continuity of \(\theta ^n\)

$$\begin{aligned} {\mathbb {E}}[I_1(k)]=0\;, \quad \text { and }\quad {\mathbb {E}}[I_3(k)]\le \frac{1}{2}{\mathbb {E}}\Big [\int _0^t \varphi _k''(r_u-s_u)\rho ^2|r_u-s_u|^2\textrm{d}u\Big ]\le \frac{t}{k} \;. \end{aligned}$$
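For completeness, the bound on \({\mathbb {E}}[I_3(k)]\) combines the two defining properties of \(\Psi _k\) with the Lipschitz bound \(|\theta ^n(r)-\theta ^n(s)|\le \rho |r-s|\); spelled out (using \(\varphi _k''=\Psi _k\) on \((0,\infty )\)):

```latex
\mathbb{E}[I_3(k)]
  = \tfrac{1}{2}\,\mathbb{E}\Big[\int_0^t \Psi_k(r_u-s_u)\,
      [\theta^n(r_u)-\theta^n(s_u)]^2\,\mathrm{d}u\Big]
  \le \tfrac{1}{2}\,\mathbb{E}\Big[\int_0^t
      \tfrac{2}{k}\,\rho^{-2}(r_u-s_u)^{-2}\,\rho^2\,(r_u-s_u)^2\,
      \mathbbm{1}_{(a_k,a_{k-1})}(r_u-s_u)\,\mathrm{d}u\Big]
  \le \frac{t}{k}\,.
```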

We note that by H5, \({\mathbb {E}}[(g^m(r_u)-h^m(s_u))\mathbbm {1}_{r_u-s_u<0}]\le 0\) and

$$\begin{aligned} {\mathbb {E}}[(g^m(r_u)-h^m(s_u))\mathbbm {1}_{r_u-s_u\ge 0}]&\le {\mathbb {E}}[(g^m(r_u)-g^m(s_u)+g^m(s_u)-h^m(s_u))\mathbbm {1}_{r_u-s_u\ge 0}] \nonumber \\&\le {\mathbb {E}}[(g^m(r_u)-g^m(s_u))\mathbbm {1}_{r_u-s_u\ge 0}] \nonumber \\&\le K_m{\mathbb {E}}[|r_u-s_u|\mathbbm {1}_{r_u-s_u\ge 0}] \end{aligned}$$
(53)

by Lipschitz continuity of \(g^m\), by \(g^m(r)\le h^m(r)\) and since \(g^m\) and \(h^m\) are non-decreasing. Hence for \(I_2\), we obtain

$$\begin{aligned} I_2(k)&= \int _0^t \varphi _k'(r_u-s_u)[{\tilde{b}}(r_u)-{\hat{b}}(r_u)+{\hat{b}}(r_u)-{\hat{b}}(s_u)]\textrm{d}u \\ {}&\quad +\int _0^t \varphi _k'(r_u-s_u)\Big ({\mathbb {E}}[(g^m(r_u)-h^m(s_u))\mathbbm {1}_{r_u-s_u\ge 0}]+{\mathbb {E}}[(g^m(r_u)-h^m(s_u))\mathbbm {1}_{r_u-s_u<0}]\Big ) \textrm{d}u \\ {}&\le \int _0^t \varphi _k'(r_u-s_u){\tilde{L}}|r_u-s_u|\textrm{d}u+ \int _0^t \varphi _k'(r_u-s_u) K_m{\mathbb {E}}[|r_u-s_u|\mathbbm {1}_{r_u-s_u\ge 0}] \textrm{d}u\;. \end{aligned}$$

Taking the limit \(k\rightarrow \infty \) and using that \(\varphi _k(r_0-s_0)=0\) almost surely, since \(Q[r_0\le s_0]=1\), we obtain

$$\begin{aligned}&{\mathbb {E}}[(r_t-s_t)_+]\le {\tilde{L}}{\mathbb {E}}\Big [\int _0^t(r_u-s_u)_+ \textrm{d}u\Big ]\nonumber \\&\quad +K_m{\mathbb {E}}\Big [\int _0^t \mathbbm {1}_{(0,\infty )}(r_u-s_u){\mathbb {E}}[(r_u-s_u)_+] \textrm{d}u\Big ] \;, \end{aligned}$$
(54)

by the monotone convergence theorem and since \((\varphi _k')_{k\in {\mathbb {N}}}\) is a monotone increasing sequence which converges pointwise to \(\mathbbm {1}_{(0,\infty )}(x)\). Since the indicator functions on the right-hand side of (54) are bounded by one, (54) implies, using Fubini's theorem, \({\mathbb {E}}[(r_t-s_t)_+]\le ({\tilde{L}}+K_m)\int _0^t{\mathbb {E}}[(r_u-s_u)_+ ]\textrm{d}u\) for all \(t\ge 0\). Hence, \({\mathbb {E}}[(r_t-s_t)_+]=0\) for all \(t\ge 0\) by Gronwall's lemma, and since the processes have continuous paths, (52) holds. \(\square \)
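The proof above only uses abstract properties of \((a_k)\), \((\Psi _k)\) and \((\varphi _k)\). One concrete admissible instantiation (our own choice, with \(\rho =1\), so that \(a_k=a_{k-1}\textrm{e}^{-k}\), and \(\Psi _k\) proportional to \((2/k)u^{-2}\) on \((a_k,a_{k-1})\), up to smoothing at the endpoints) can be checked numerically:

```python
import math

# Illustrative construction (rho = 1) of the functions used in the proof of
# Lemma 14: a_k with int_{a_k}^{a_{k-1}} x^{-1} dx = k, and psi_k taken
# proportional to (2/k) u^{-2} on (a_k, a_{k-1}), normalized to integrate
# to one.  These concrete choices are ours, not prescribed by the proof.

def a(k):                          # a_0 = 1, a_k = exp(-(1 + 2 + ... + k))
    return math.exp(-k * (k + 1) / 2)

def phi_prime(k, x):               # phi_k'(x) = int_0^x psi_k(u) du
    ak, akm1 = a(k), a(k - 1)
    I = (2.0 / k) * (1.0 / ak - 1.0 / akm1)   # integral of the bound, >= 2
    lam = 1.0 / I                             # normalization, <= 1/2
    if x <= ak:
        return 0.0
    if x >= akm1:
        return 1.0
    return lam * (2.0 / k) * (1.0 / ak - 1.0 / x)

def phi(k, x, steps=20000):        # phi_k(x) = int_0^x phi_k'(y) dy (trapezoid)
    if x <= 0:
        return 0.0
    h = x / steps
    s = 0.5 * (phi_prime(k, 0.0) + phi_prime(k, x))
    s += sum(phi_prime(k, i * h) for i in range(1, steps))
    return s * h

k = 3
print(a(k - 1), phi_prime(k, a(k - 1)), 0.5 - phi(k, 0.5))
```

Since \(\varphi _k'\le 1\) and \(\varphi _k'=1\) beyond \(a_{k-1}\), one has \(0\le x-\varphi _k(x)\le a_{k-1}\), which quantifies \(\varphi _k(x)\rightarrow x_+\).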

Next, we show that the distribution of the solution of (50) converges as \(n\rightarrow \infty \).

Lemma 15

Assume that \({\tilde{b}}\), \({\hat{b}}\), g and h satisfy H1 and H2. Let \(\eta \in \Gamma (\mu ,\nu )\) where the probability measures \(\mu \) and \(\nu \) on \({\mathbb {R}}_+\) satisfy H3. Assume that \((g^m)_{m\in {\mathbb {N}}}\), \((h^m)_{m\in {\mathbb {N}}}\), \((\theta ^n)_{n\in {\mathbb {N}}}\), \((\mu _{n,m})_{m,n\in {\mathbb {N}}}\), \((\nu _{n,m})_{m,n\in {\mathbb {N}}}\) and \((\eta _{n,m})_{m,n\in {\mathbb {N}}}\) satisfy conditions H5, H6 and H7. Then for any \(m \in {\mathbb {N}}\), there exists a random variable \((r^m,s^m)\) defined on some probability space \((\Omega ^m,{{\mathcal {A}}}^m,P^m)\) with values in \({\mathbb {W}}\times {\mathbb {W}}\), such that \((r_t^m,s_t^m)_{t\ge 0}\) is a weak solution of the stochastic differential equation (51). More precisely, for all \(m\in {\mathbb {N}}\) the sequence of laws \(Q\circ (r^{n,m},s^{n,m})^{-1}\) converges weakly to the distribution \(P^m\circ (r^m,s^m)^{-1}\). If additionally,

$$\begin{aligned}&{\tilde{b}}(r)\le {\hat{b}}(r) \quad \text {and} \quad g^m(r)\le h^m(r) \;,{} & {} \text {for any } r\in {\mathbb {R}}_+ \text { and } \\ {}&Q[r_0^{n,m}\le s_0^{n,m}]=1{} & {} \text {for any } n,m\in {\mathbb {N}}, \end{aligned}$$

then \(P^m[r_t^{m}\le s_t^{m} \text { for all } t\ge 0]=1\).

Proof

Fix \(m\in {\mathbb {N}}\). The proof is divided into three parts. First we show tightness of the sequences of probability measures. Then we identify the limit of the sequence of stochastic processes. Finally, we compare the two limiting processes.

Tightness We show that the sequence of probability measures \(({\mathbb {P}}^{n,m})_{n\in {\mathbb {N}}}\) on \(({\mathbb {W}}\times {\mathbb {W}},{{\mathcal {B}}}({\mathbb {W}})\otimes {{\mathcal {B}}}({\mathbb {W}}))\) is tight by applying Kolmogorov's continuity theorem. Consider \(p>2\) such that the p-th moments in H3 and H7 are uniformly bounded. Fix \(T>0\). Then the p-th moment of \(r_t^{n,m}\) for \(t\le T\) can be bounded using Itō's formula,

$$\begin{aligned} \textrm{d}|r_t^{n,m}|^p&\le p|r_t^{n,m}|^{p-2}\langle r_t^{n,m},{\tilde{b}}(r_t^{n,m})+P_t^{n,m}(g^m)\rangle \textrm{d}t +2p(p-1)|r_t^{n,m}|^{p-2} \theta ^n(r_t^{n,m})^2\textrm{d}t \\&\quad + 2p\,\theta ^n(r_t^{n,m}) |r_t^{n,m}|^{p-2} r_t^{n,m}\textrm{d}W_t \\&\le p\Big ({\tilde{L}}|r_t^{n,m}|^p +\Gamma |r_t^{n,m}|^{p-1}+2(p-1)|r_t^{n,m}|^{p-2}\Big ) \textrm{d}t + 2p\,\theta ^n(r_t^{n,m}) (r_t^{n,m})^{p-1}\textrm{d}W_t \\&\le p\Big ({\tilde{L}}+\Gamma +2(p-1)\Big )|r_t^{n,m}|^p\textrm{d}t + p(\Gamma +2(p-1))\textrm{d}t + 2p\,\theta ^n(r_t^{n,m}) (r_t^{n,m})^{p-1}\textrm{d}W_t\;, \end{aligned}$$

where \(\Gamma =\max (\Vert g\Vert _\infty ,\Vert h\Vert _{\infty })\). Taking expectation yields

$$\begin{aligned} \frac{\textrm{d}}{\textrm{d}t}{\mathbb {E}}[ |r_t^{n,m}|^p]\le p\Big ({\tilde{L}}+\Gamma +2(p-1)\Big ){\mathbb {E}}|r_t^{n,m}|^p+p(\Gamma +2(p-1))\;. \end{aligned}$$

Then by Gronwall’s lemma

$$\begin{aligned} \sup _{t\in [0,T]}{\mathbb {E}}[|r_t^{n,m}|^p]\le \textrm{e}^{p({\tilde{L}}+\Gamma +2(p-1))T} ({\mathbb {E}}[|r_0^{n,m}|^p]+Tp(\Gamma +2(p-1)))<C_p<\infty \;, \end{aligned}$$
(55)

where \(C_p\) depends on T and the p-th moment of the initial distribution, which is finite by H7. Similarly, it holds \(\sup _{t\in [0,T]} {\mathbb {E}}[|s_t^{n,m}|^p]<C_p\). Using this moment bound, it holds for all \(t_1,t_2\in [0,T]\) by H1, H5 and H6,

$$\begin{aligned}&{\mathbb {E}}[|r_{t_2}^{n,m}-r_{t_1}^{n,m}|^p]\le C_1(p)\Big ({\mathbb {E}}\Big [\Big |\int _{t_1}^{t_2} \big ({\tilde{b}}(r_u^{n,m}) + P^{n,m}_u(g^m)\big )\textrm{d}u \Big |^p\Big ]+{\mathbb {E}}\Big [\Big |\int _{t_1}^{t_2}2 \theta ^n(r_u^{n,m})\textrm{d}W_u\Big |^p\Big ]\Big ) \\&\quad \le C_2(p)\Big (\Big ({\mathbb {E}}\Big [\frac{{\tilde{L}}^p}{|t_2-t_1|}\int _{t_1}^{t_2}|r_u^{n,m}|^p\textrm{d}u\Big ]+\Gamma ^p\Big )|t_2-t_1|^p+{\mathbb {E}}\Big [\Big (\int _{t_1}^{t_2}4\,\theta ^n(r_u^{n,m})^2\textrm{d}u\Big )^{p/2}\Big ]\Big ) \\&\quad \le C_2(p)\Big (\Big (\frac{{\tilde{L}}^p}{|t_2-t_1|}\int _{t_1}^{t_2}{\mathbb {E}}[|r_u^{n,m}|^p]\textrm{d}u +\Gamma ^p\Big )|t_2-t_1|^p+2^{p}|t_2-t_1|^{p/2}\Big ) \\&\quad \le C_3(p,T,{\tilde{L}},\Gamma ,C_p)|t_2-t_1|^{p/2}\;, \end{aligned}$$

where the \(C_i(\cdot )\) are constants depending on the stated arguments and independent of n and m. Note that in the second step, we used the Burkholder–Davis–Gundy inequality, see [38, Chapter IV, Theorem 48]. Similarly, it holds \({\mathbb {E}}[|s_{t_2}^{n,m}-s_{t_1}^{n,m}|^p]\le C_3(p,T,{\tilde{L}},\Gamma ,C_p)|t_2-t_1|^{p/2}\). Hence,

$$\begin{aligned} {\mathbb {E}}[|(r_{t_2}^{n,m},s_{t_2}^{n,m})-(r_{t_1}^{n,m},s_{t_1}^{n,m})|^p]\le C_4(p,T,{\tilde{L}},\Gamma ,C_p)|t_2-t_1|^{p/2} \end{aligned}$$
(56)

for all \(t_1,t_2\in [0,T]\). Hence, by Kolmogorov’s continuity criterion, cf. [31, Corollary 14.9], there exists a constant \({\tilde{C}}\) depending on p and \(\gamma \) such that

$$\begin{aligned} {\mathbb {E}}\Big [ [(r^{n,m},s^{n,m})]_\gamma ^p\Big ]\le {\tilde{C}}\cdot C_4(p,T,{\tilde{L}},\Gamma ,C_p) \;, \end{aligned}$$
(57)

where the Hölder seminorm \([x]_\gamma \) is given by \([x]_\gamma =\sup _{t_1,t_2\in [0,T],\, t_1\ne t_2}\frac{|x(t_1)-x(t_2)|}{|t_1-t_2|^\gamma }\). Hence, the sequence of laws of \((r_t^{n,m},s_t^{n,m})_{t\ge 0}\), \(n\in {\mathbb {N}}\), is tight in \({{\mathcal {C}}}([0,T],{\mathbb {R}}^2)\), so for each \(T>0\) there exist a subsequence \(n_k\rightarrow \infty \) and a probability measure \({\mathbb {P}}^m_T\) on \({{\mathcal {C}}}([0,T],{\mathbb {R}}^2)\) such that \(({\mathbb {P}}^{n_k,m})_{k\in {\mathbb {N}}}\), restricted to \({{\mathcal {C}}}([0,T],{\mathbb {R}}^2)\), converges weakly to \({\mathbb {P}}^m_T\). Since \(\{{\mathbb {P}}^m_T\}_{T>0}\) is a consistent family, there exist by [31, Theorem 5.16] a probability measure \({\mathbb {P}}^m\) on \(({\mathbb {W}}\times {\mathbb {W}}, {{\mathcal {B}}}({\mathbb {W}})\otimes {{\mathcal {B}}}({\mathbb {W}}))\) and a subsequence \((n_k)_{k\in {\mathbb {N}}}\) such that \({\mathbb {P}}^{n_k,m}\) converges along this subsequence to \({\mathbb {P}}^m\). Note that by a diagonalization argument we can take the same subsequence \((n_k)_{k\in {\mathbb {N}}}\) for all m.
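As an aside, the seminorm \([x]_\gamma \) appearing in (57) is easy to probe numerically; the helper below is our own illustration, with the grid size a numerical convenience:

```python
# Sketch (our own, not from the paper) of the Hoelder seminorm
# [x]_gamma = sup_{t1 != t2} |x(t1) - x(t2)| / |t1 - t2|^gamma from (57),
# evaluated over a finite grid on [0, T].

def holder_seminorm(x, gamma, T=1.0, n=200):
    ts = [i * T / n for i in range(n + 1)]
    return max(abs(x(t1) - x(t2)) / abs(t1 - t2) ** gamma
               for i, t1 in enumerate(ts) for t2 in ts[:i])

# For x(t) = t on [0,1] and gamma = 1/2: |t1-t2|^{1/2} <= 1, with the
# supremum attained at t1 = 0, t2 = 1.
print(holder_seminorm(lambda t: t, 0.5))
```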

Characterization of the limit measure In the following we drop for simplicity the index k in the subsequence. Denote by \(({\varvec{r}}_t,{\varvec{s}}_t) (\omega )=\omega (t)\) the canonical process on \({\mathbb {W}}\times {\mathbb {W}}\). Since \({\mathbb {P}}^{n,m}\circ ({\varvec{r}}_0,{\varvec{s}}_0)^{-1}=\eta _{n,m}\) converges weakly to \(\eta _m\) by H7, it holds \({\mathbb {P}}^m\circ ({\varvec{r}}_0,{\varvec{s}}_0)^{-1}=\eta _m\). We define the maps \(M^{n,m},N^{n,m}:{\mathbb {W}}\times {\mathbb {W}}\rightarrow {\mathbb {W}}\) by

$$\begin{aligned} M_t^{n,m}={\varvec{r}}_t-{\varvec{r}}_0-\int _0^t ({\tilde{b}}({\varvec{r}}_u)+P_u^n(g^m))\textrm{d}u \quad \text {and} \quad N_t^{n,m}={\varvec{s}}_t-{\varvec{s}}_0-\int _0^t ({\hat{b}}({\varvec{s}}_u)+{\hat{P}}_u^n(h^m))\textrm{d}u \;, \end{aligned}$$

where \(P_u^n={\mathbb {P}}^{n,m}\circ ({\varvec{r}}_u)^{-1}\) and \({\hat{P}}_u^n={\mathbb {P}}^{n,m}\circ ({\varvec{s}}_u)^{-1}\). For each \(m,n\in {\mathbb {N}}\), \((M_t^{n,m},{{\mathcal {F}}}_t,{\mathbb {P}}^{n,m})\) and \((N_t^{n,m},{{\mathcal {F}}}_t,{\mathbb {P}}^{n,m})\) are martingales with respect to the canonical filtration \({{\mathcal {F}}}_t=\sigma (({\varvec{r}}_u,{\varvec{s}}_u)_{0\le u\le t})\) by Itō's formula and the moment estimate (55). Further, the families \((M_t^{n,m},{\mathbb {P}}^{n,m})_{n\in {\mathbb {N}},t\ge 0}\) and \((N_t^{n,m},{\mathbb {P}}^{n,m})_{n\in {\mathbb {N}},t\ge 0}\) are uniformly integrable by Lipschitz continuity of \({\tilde{b}}\) and \({\hat{b}}\) and by boundedness of \(g^m\) and \(h^m\), and the mappings \(M^{n,m}\) and \(N^{n,m}\) are continuous on \({\mathbb {W}}\times {\mathbb {W}}\). We show that \({\mathbb {P}}^{n,m}\circ ({\varvec{r}},{\varvec{s}},M^{n,m},N^{n,m})^{-1}\) converges weakly to \({\mathbb {P}}^m\circ ({\varvec{r}},{\varvec{s}},M^m,N^m)^{-1}\) as \(n\rightarrow \infty \), where

$$\begin{aligned} M_t^{m}={\varvec{r}}_t-{\varvec{r}}_0-\int _0^t ({\tilde{b}}({\varvec{r}}_u)+P_u(g^m))\textrm{d}u \quad \text {and} \quad N_t^{m}={\varvec{s}}_t-{\varvec{s}}_0-\int _0^t ({\hat{b}}({\varvec{s}}_u)+{\hat{P}}_u(h^m))\textrm{d}u \;, \end{aligned}$$
(58)

with \(P_u={\mathbb {P}}^{m}\circ {\varvec{r}}_u^{-1}\) and \({\hat{P}}_u={\mathbb {P}}^{m}\circ {\varvec{s}}_u^{-1}\). To show weak convergence to \({\mathbb {P}}^m\circ ({\varvec{r}},{\varvec{s}},M^m,N^m)^{-1}\), we note that \((M^m,N^m)\) is continuous on \({\mathbb {W}}\times {\mathbb {W}}\) and consider, for a Lipschitz continuous and bounded function \(G:{\mathbb {W}}\rightarrow {\mathbb {R}}\),

$$\begin{aligned}&\left| \int _{\mathbb {W}}G(\omega )\textrm{d}{\mathbb {P}}^{n,m}\circ (M^{n,m})^{-1}(\omega )-\int _{\mathbb {W}}G(\omega )\textrm{d}{\mathbb {P}}^{m}\circ (M^{m})^{-1}(\omega ) \right| \\ {}&\le \left| \int _{\mathbb {W}}G(\omega )\textrm{d}{\mathbb {P}}^{n,m}\circ (M^{n,m})^{-1}(\omega )-\int _{\mathbb {W}}G(\omega )\textrm{d}{\mathbb {P}}^{n,m}\circ (M^{m})^{-1}(\omega ) \right| \\ {}&\quad +\left| \int _{\mathbb {W}}G(\omega )\textrm{d}{\mathbb {P}}^{n,m}\circ (M^{m})^{-1}(\omega )-\int _{\mathbb {W}}G(\omega )\textrm{d}{\mathbb {P}}^{m}\circ (M^{m})^{-1}(\omega ) \right| \;. \end{aligned}$$

The second term converges to 0 as \(n\rightarrow \infty \), since \(G\circ M^{m}\) is continuous and bounded and \({\mathbb {P}}^{n,m}\) converges weakly to \({\mathbb {P}}^{m}\). For the first term it holds

$$\begin{aligned}&\left| \int _{\mathbb {W}}G(\omega )\textrm{d}{\mathbb {P}}^{n,m}\circ (M^{n,m})^{-1}(\omega )-\int _{\mathbb {W}}G(\omega )\textrm{d}{\mathbb {P}}^{n,m}\circ (M^{m})^{-1}(\omega ) \right| \\ {}&=\left| \int _{\mathbb {W}}(G\circ M^{n,m})(\omega )\textrm{d}{\mathbb {P}}^{n,m}(\omega )-\int _{\mathbb {W}}(G\circ M^{m})(\omega )\textrm{d}{\mathbb {P}}^{n,m}(\omega ) \right| \\ {}&\le \Vert G\Vert _{\textrm{Lip}}\sup _{\omega \in {\mathbb {W}}} d_{\mathbb {W}}(M^{n,m}(\omega ),M^m(\omega )) \;, \end{aligned}$$

where \(d_{\mathbb {W}}(f,g)=\sum _{k=1}^\infty 2^{-k}\sup _{t\in [0,k]}|f(t)-g(t)|\). This term converges to 0 as \(n\rightarrow \infty \), since for all \(T>0\) and all paths \(\omega \), as \(n\rightarrow \infty \),

$$\begin{aligned}&\sup _{t\in [0,T]}|M^{n,m}_t(\omega )-M^{m}_t(\omega )|\\&\quad \le \int _0^T\left| ({\mathbb {P}}^{n,m}\circ {\varvec{r}}_s^{-1})(g^m)-({\mathbb {P}}^{m}\circ {\varvec{r}}_s^{-1})(g^m) \right| \textrm{d}s\rightarrow 0 \;, \end{aligned}$$

by the dominated convergence theorem, since \(g^m\) is bounded and \({\mathbb {P}}^{n,m}\circ {\varvec{r}}_s^{-1}\) converges weakly to \({\mathbb {P}}^{m}\circ {\varvec{r}}_s^{-1}\). Hence,

$$\begin{aligned}&\left| \int _{\mathbb {W}}G(\omega )\textrm{d}{\mathbb {P}}^{n,m}\circ (M^{n,m})^{-1}(\omega )-\int _{\mathbb {W}}G(\omega )\textrm{d}{\mathbb {P}}^{m}\circ (M^{m})^{-1}(\omega ) \right| \\&\quad \rightarrow 0 \quad \text {for } n\rightarrow \infty , \end{aligned}$$

and similarly for \((N^{n,m})\), and therefore by the Portmanteau theorem [32, Theorem 13.16], weak convergence of \({\mathbb {P}}^{n,m}\circ ({\varvec{r}},{\varvec{s}},M^{n,m},N^{n,m})^{-1}\) to \({\mathbb {P}}^m\circ ({\varvec{r}},{\varvec{s}},M^m,N^m)^{-1}\) holds.
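For illustration, the path-space metric \(d_{\mathbb {W}}\) used above can be approximated numerically by truncating the sum and discretizing time; truncation level and grid density in the sketch below are numerical conveniences of our own:

```python
# Numerical sketch (not from the paper) of the path-space metric
# d_W(f, g) = sum_{k>=1} 2^{-k} sup_{t in [0,k]} |f(t) - g(t)|,
# truncated at K levels and evaluated on a finite time grid.

def d_W(f, g, K=20, pts_per_unit=100):
    total = 0.0
    for k in range(1, K + 1):
        n = k * pts_per_unit
        sup = max(abs(f(i * k / n) - g(i * k / n)) for i in range(n + 1))
        total += 2.0 ** (-k) * sup
    return total

# Constant paths at distance 1: every sup equals 1, so the truncated sum
# is 1 - 2^{-K}.
print(d_W(lambda t: 1.0, lambda t: 0.0))
```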

Let \(G:{\mathbb {W}}\rightarrow {\mathbb {R}}_+\) be an \({{\mathcal {F}}}_s\)-measurable, bounded, non-negative function. By uniform integrability of \((M_t^{n,m},{\mathbb {P}}^{n,m})_{n\in {\mathbb {N}}, t\ge 0}\), for any \(s\le t\),

$$\begin{aligned} {\mathbb {E}}^m[G(M_t^m-M_s^m)]&={\mathbb {E}}^m[G({\varvec{r}}_t-{\varvec{r}}_s- \textstyle\int _s^t( {\tilde{b}}({\varvec{r}}_u)+P_u(g^m))\textrm{d}u)] \nonumber \\&=\lim _{n\rightarrow \infty }{\mathbb {E}}^{n,m}[G({\varvec{r}}_t-{\varvec{r}}_s- \textstyle\int _s^t ({\tilde{b}}({\varvec{r}}_u)+P_u^{n}(g^m))\textrm{d}u)] \nonumber \\&=\lim _{n\rightarrow \infty }{\mathbb {E}}^{n,m}[G(M_t^{n,m}-M_s^{n,m})]=0 \;, \end{aligned}$$
(59)

and analogously for \((N_t^{n,m})_{t\ge 0}\) and hence, \((M_t^m,{{\mathcal {F}}}_t,{\mathbb {P}}^m)\) and \((N_t^m,{{\mathcal {F}}}_t,{\mathbb {P}}^m)\) are continuous martingales. The quadratic variation \(([(M^m,N^m)]_t)\) exists \({\mathbb {P}}^m\)-almost surely. To complete the identification of the limit, it suffices to note that the quadratic variation is given by

$$\begin{aligned}&[M^m]=4\int _0^\cdot \mathbbm {1}_{(0,\infty )}({\varvec{r}}_u)\textrm{d}u \qquad {\mathbb {P}}^m\text {-almost surely,}\nonumber \\&[N^m]=4\int _0^\cdot \mathbbm {1}_{(0,\infty )}({\varvec{s}}_u)\textrm{d}u \qquad {\mathbb {P}}^m\text {-almost surely, and} \nonumber \\&[M^m,N^m]=4\int _0^\cdot \mathbbm {1}_{(0,\infty )}({\varvec{r}}_u)\mathbbm {1}_{(0,\infty )}({\varvec{s}}_u)\textrm{d}u \qquad {\mathbb {P}}^m\text {-almost surely,} \end{aligned}$$
(60)

which holds following the computations in the proof of [18, Theorem 22]. We show that \(((M_t^m)^2-4\int _0^t\mathbbm {1}_{(0,\infty )}({\varvec{r}}_u)\textrm{d}u)_{t\ge 0}\) is both a sub- and a supermartingale, and hence a martingale, using a monotone class argument. We first note that for any bounded, continuous and non-negative function \(G:{\mathbb {W}}\rightarrow {\mathbb {R}}_+\),

$$\begin{aligned} {\mathbb {E}}^m[G(M_t^m)^2]=\lim _{n\rightarrow \infty }{\mathbb {E}}^{n,m}[G(M_t^{n,m})^2] \end{aligned}$$
(61)

holds using uniform integrability of \(((M_t^{n,m})^2,{\mathbb {P}}^{n,m})_{n\in {\mathbb {N}},t\ge 0}\), which can be shown similarly as above. Note that

$$\begin{aligned} {\mathbb {E}}^m\left[ G\int _s^t \mathbbm {1}_{(0,\infty )}({\varvec{r}}_u)\textrm{d}u \right]&\le \lim _{\epsilon \downarrow 0}\liminf _{n\rightarrow \infty }{\mathbb {E}}^{n,m}\left[ G\int _s^t \mathbbm {1}_{(\epsilon ,\infty )}({\varvec{r}}_u)\textrm{d}u \right] \end{aligned}$$
(62)

holds by lower semicontinuity of \(\omega \mapsto \int _s^{t}\mathbbm {1}_{(\epsilon ,\infty )}(\omega _u)\textrm{d}u\) for each \(\epsilon >0\), Fatou's lemma and the Portmanteau theorem. For any fixed \(\epsilon >0\),

$$\begin{aligned} \liminf _{n\rightarrow \infty }{\mathbb {E}}^{n,m}\left[ G\left( \int _s^t\theta ^n({\varvec{r}}_u)^2\textrm{d}u-\int _s^t \mathbbm {1}_{(\epsilon ,\infty )}({\varvec{r}}_u)\textrm{d}u \right) \right] \ge 0 \;. \end{aligned}$$
(63)

Indeed, \(G\) is non-negative and, by H6, \(\theta ^n(r)^2\ge \mathbbm {1}_{(\epsilon ,\infty )}(r)\) for all \(r\in {\mathbb {R}}_+\) as soon as \(1/n\le \epsilon \), so the expression in (63) is non-negative. Then by (61), (62) and (63)

$$\begin{aligned} {\mathbb {E}}^m&\left[ G\left( (M_t^m)^2-(M_s^m)^2-4\int _s^t \mathbbm {1}_{(0,\infty )}({\varvec{r}}_u)\textrm{d}u \right) \right] \\ {}&\ge \lim _{\epsilon \downarrow 0}\liminf _{n\rightarrow \infty }{\mathbb {E}}^{n,m}\left[ G\left( (M_t^{n,m})^2-(M_s^{n,m})^2-4\int _s^t\theta ^n({\varvec{r}}_u)^2\textrm{d}u \right) \right] =0 \end{aligned}$$

and by a monotone class argument, cf. [38, Chapter 1, Theorem 8], \(((M_t^m)^2-4\int _0^t\mathbbm {1}_{(0,\infty )}({\varvec{r}}_u)\textrm{d}u,{\mathbb {P}}^m)\) is a submartingale. To show that it is also a supermartingale, we note that \(((M_t^m)^2-4t,{\mathbb {P}}^m)\) is a supermartingale by (61). By the uniqueness of the Doob–Meyer decomposition, cf. [38, Chapter 3, Theorem 8], \(t\mapsto [M^m]_t-4t\) is \({\mathbb {P}}^m\)-almost surely non-increasing. Note further that \(({\varvec{r}}_t,{{\mathcal {F}}}_t,{\mathbb {P}}^m)\) is a continuous semimartingale with \([{\varvec{r}}]=[M^m]\). Then by the Itō–Tanaka formula, cf. [39, Chapter 6, Theorem 1.1],

$$\begin{aligned} \int _0^t\mathbbm {1}_{\{0\}}({\varvec{r}}_u)\textrm{d}[M^m]_u=\int _0^t\mathbbm {1}_{\{0\}}({\varvec{r}}_u)\textrm{d}[{\varvec{r}}]_u=\int _0^t\mathbbm {1}_{\{0\}}(y)\ell _t^y({\varvec{r}})\textrm{d}y=0 \;, \end{aligned}$$

where \(\ell _t^y({\varvec{r}})\) is the local time of \({\varvec{r}}\) in y. Therefore, for any \(0\le s< t\),

$$\begin{aligned}{}[M^m]_t-[M^m]_s=\int _s^t\mathbbm {1}_{(0,\infty )}({\varvec{r}}_u)\textrm{d}[M^m]_u\le 4\int _s^t\mathbbm {1}_{(0,\infty )}({\varvec{r}}_u)\textrm{d}u \end{aligned}$$

and hence, for any \({{\mathcal {F}}}_s\)-measurable, bounded, non-negative function \(G:{\mathbb {W}}\rightarrow {\mathbb {R}}_+\),

$$\begin{aligned} {\mathbb {E}}^m\left[ G((M_t^m)^2-(M_s^m)^2-4\int _s^t \mathbbm {1}_{(0,\infty )}({\varvec{r}}_u)\textrm{d}u) \right] \le 0 \;. \end{aligned}$$

As before, by a monotone class argument, \(((M_t^m)^2-4\int _0^t\mathbbm {1}_{(0,\infty )}({\varvec{r}}_u)\textrm{d}u,{\mathbb {P}}^m)\) is a supermartingale, and hence a martingale.

Hence, we obtain the quadratic variation \([M^m]_t\) given in (60). The other characterizations in (60) follow by analogous arguments. Then, by a martingale representation theorem, see [28, Chapter II, Theorem 7.1], we conclude that there exist a probability space \((\Omega ^m,{{\mathcal {A}}}^m,P^m)\), a Brownian motion W and random variables \((r^m,s^m)\) on this space such that \(P^m\circ (r^m,s^m)^{-1}={\mathbb {P}}^m\circ ({\varvec{r}},{\varvec{s}})^{-1}\) and such that \((r^m,s^m,W)\) is a weak solution of (51). Finally, note that \(Q\circ (r^{n,m},s^{n,m})^{-1}\) converges weakly to \(P^m\circ (r^m,s^m)^{-1}\) not only along a subsequence, since the characterization of the limit is the same for every convergent subsequence \((n_k)_{k\in {\mathbb {N}}}\).

Comparison of two solutions To show \(P^m[r_t^{m}\le s_t^m \text { for all }t\ge 0]=1\), we note that by Lemma 14, \(Q[r_t^{n,m}\le s_t^{n,m} \text { for all } t\ge 0]=1\). The monotonicity carries over to the limit by the Portmanteau theorem for closed sets, since \({\mathbb {P}}^{n,m}\circ ({\varvec{r}},{\varvec{s}})^{-1}\) converges weakly to \({\mathbb {P}}^m\circ ({\varvec{r}},{\varvec{s}})^{-1}\). \(\square \)

We show in the next step that the distribution of the solution of (51) converges as \(m\rightarrow \infty \). For each \(m\in {\mathbb {N}}\) let \((\Omega ^m,{{\mathcal {A}}}^m,P^m)\) be a probability space and random variables \(r^m,s^m:\Omega ^m\rightarrow {\mathbb {W}}\) such that \((r^m_t,s^m_t)_{t\ge 0}\) is a solution of (51). Let \({\mathbb {P}}^m=P^m\circ (r^m,s^m)^{-1}\) denote the law on \({\mathbb {W}}\times {\mathbb {W}}\).

Lemma 16

Assume that \(({\tilde{b}},g)\) and \(({\hat{b}},h)\) satisfy H1 and H2. Let \(\eta \in \Gamma (\mu ,\nu )\) where the probability measures \(\mu \) and \(\nu \) on \({\mathbb {R}}_+\) satisfy H3. Assume that \((g^m)_{m\in {\mathbb {N}}}\), \((h^m)_{m\in {\mathbb {N}}}\), \((\mu _m)_{m\in {\mathbb {N}}}\), \((\nu _m)_{m\in {\mathbb {N}}}\) and \((\eta _m)_{m\in {\mathbb {N}}}\) satisfy conditions H5 and H7. Then there exists a random variable \((r,s)\) defined on some probability space \((\Omega ,{{\mathcal {A}}},P)\) with values in \({\mathbb {W}}\times {\mathbb {W}}\), such that \((r_t,s_t)_{t\ge 0}\) is a weak solution of the sticky stochastic differential equation (25). Furthermore, the sequence of laws \(P^{m}\circ (r^{m},s^{m})^{-1}\) converges weakly to the law \(P\circ (r,s)^{-1}\). If additionally,

$$\begin{aligned}&{\tilde{b}}(r)\le {\hat{b}}(r) \;, \qquad g(r)\le h(r) \quad \text {and} \qquad g^m(r)\le h^m(r){} & {} \text { for any } r\in {\mathbb {R}}_+ \text {, and } \\ {}&P^m[r_0^m\le s_0^m]=1{} & {} \text {for any } m\in {\mathbb {N}} \end{aligned}$$

then \(P[r_t\le s_t \text { for all } t\ge 0]=1\).

Proof

The proof is structured as the proof of Lemma 15. First, analogously to the proof of (55), we show under H1, H5 and H7,

$$\begin{aligned} \sup _{t\in [0,T]}{\mathbb {E}}[|r_t^{m}|^p]<\infty \;. \end{aligned}$$
(64)

Tightness of the sequence of probability measures \(({\mathbb {P}}^{m})_{m\in {\mathbb {N}}}\) on \(({\mathbb {W}}\times {\mathbb {W}},{{\mathcal {B}}}({\mathbb {W}})\otimes {{\mathcal {B}}}({\mathbb {W}}))\) holds by adapting the steps of the proof of Lemma 15 to (51). Note that (55) and (56) hold analogously for \((r_t^m,s_t^m)_{m\in {\mathbb {N}}}\) by H1, H5 and H7. Hence, by Kolmogorov's continuity criterion, cf. [31, Corollary 14.9], we can deduce that there exist a probability measure \({\mathbb {P}}\) on \(({\mathbb {W}}\times {\mathbb {W}},{{\mathcal {B}}}({\mathbb {W}})\otimes {{\mathcal {B}}}({\mathbb {W}}))\) and a subsequence \((m_k)_{k\in {\mathbb {N}}}\) along which \({\mathbb {P}}^{m_k}\) converges towards \({\mathbb {P}}\). To characterize the limit, we first note that by the Skorokhod representation theorem, cf. [6, Chapter 1, Theorem 6.7], we can assume without loss of generality that the \((r^m,s^m)\) are defined on a common probability space \((\Omega ,{{\mathcal {A}}},P)\) with expectation E and converge almost surely to \((r,s)\) with distribution \({\mathbb {P}}\). By H5, \(P_t^m(g^m)=E[g^m(r_t^m)]\) and the monotone convergence theorem, \(P_t^{m}(g^m)\) converges to \(P_t(g)\) for any \(t\ge 0\). Then, by the dominated convergence theorem, it holds almost surely for all \(t\ge 0\)

$$\begin{aligned} \lim _{m\rightarrow \infty }\int _0^t\Big ({\tilde{b}}(r_u^m)+P^m_u(g^m)\Big )\textrm{d}u=\int _0^t\Big ({\tilde{b}}(r_u)+P_u(g)\Big )\textrm{d}u \;, \end{aligned}$$
(65)

where \(P^m_u=P\circ (r_u^m)^{-1}\) and \(P_u=P\circ (r_u)^{-1}\). A similar statement holds for \((s_t)_{t\ge 0}\).

Consider the mappings \({M}^m,{N}^m:{\mathbb {W}}\times {\mathbb {W}}\rightarrow {\mathbb {W}}\) given by (58). Then for all \(m\in {\mathbb {N}}\), \(({M}_t^m,{{\mathcal {F}}}_t,{\mathbb {P}}^m)\) and \(({N}_t^m,{{\mathcal {F}}}_t,{\mathbb {P}}^m)\) are martingales with respect to the canonical filtration \({{\mathcal {F}}}_t=\sigma (({\varvec{r}}_u,{\varvec{s}}_u)_{0\le u\le t})\). Further, the families \(({M}_t^m,{\mathbb {P}}^m)_{m\in {\mathbb {N}},t\ge 0}\) and \(({N}_t^m,{\mathbb {P}}^m)_{m\in {\mathbb {N}},t\ge 0}\) are uniformly integrable by (64). Along the same lines as in the proof of Lemma 15 and by (65), \({\mathbb {P}}^m\circ ({\varvec{r}},{\varvec{s}},{M}^m,{N}^m)^{-1}\) converges weakly to \({\mathbb {P}}\circ ({\varvec{r}},{\varvec{s}},{M},{N})^{-1}\) where

$$\begin{aligned} {M}_t={\varvec{r}}_t-{\varvec{r}}_0-\int _0^t({\tilde{b}}({\varvec{r}}_u)+P_u(g))\textrm{d}u \quad \text {and} \quad {N}_t={\varvec{s}}_t-{\varvec{s}}_0-\int _0^t({\hat{b}}({\varvec{s}}_u)+ {\hat{P}}_u(h))\textrm{d}u \;. \end{aligned}$$

Let \(G:{\mathbb {W}}\rightarrow {\mathbb {R}}_+\) be an \({{\mathcal {F}}}_s\)-measurable, bounded, non-negative function. By uniform integrability, for any \(s\le t\),

$$\begin{aligned} {\mathbb {E}}[G(M_t-M_s)]&={\mathbb {E}}[G( \int _s^t( {\tilde{b}}({\varvec{r}}_u)+P_u(g))\textrm{d}u)]\\&=\lim _{m\rightarrow \infty }{\mathbb {E}}^m[G( \int _s^t ({\tilde{b}}({\varvec{r}}_u)+P_u(g^m))\textrm{d}u)] \\ {}&=\lim _{m\rightarrow \infty }{\mathbb {E}}^m[G(M_t^m-M_s^m)]=0 \;, \end{aligned}$$

and analogously for \((N_t)_{t\ge 0}\). Hence, \(({M}_t,{{\mathcal {F}}}_t,{\mathbb {P}})\) and \(({N}_t,{{\mathcal {F}}}_t,{\mathbb {P}})\) are martingales. Further, the quadratic variation \(([({M},{N})]_t)\) exists \({\mathbb {P}}\)-almost surely and is given by (60), which holds following the computations in the proof of Lemma 15. As in Lemma 15, we conclude by a martingale representation theorem that there exist a probability space \((\Omega ,{{\mathcal {A}}},P)\), a Brownian motion \(W\) and random variables \((r,s)\) on this space such that \(P\circ (r,s)^{-1}={\mathbb {P}}\circ ({\varvec{r}},{\varvec{s}})^{-1}\) and such that \((r,s,W)\) is a weak solution of (25). Note that the limit identification holds for all subsequences \((m_k)_{k\in {\mathbb {N}}}\), and hence \(P^m\circ (r^m,s^m)^{-1}\) converges weakly to \(P\circ (r,s)^{-1}\) as \(m\rightarrow \infty \). The monotonicity \(P^m[r_t^m\le s_t^m \text { for all } t\ge 0]=1\) carries over to the limit by the Portmanteau theorem, since \({\mathbb {P}}^m\circ ({\varvec{r}},{\varvec{s}})^{-1}\) converges weakly to \({\mathbb {P}}\circ ({\varvec{r}},{\varvec{s}})^{-1}\). \(\square \)

Proof of Theorem 3

The proof is a direct consequence of Lemmas 15 and 16. \(\square \)

6.3.2 Proof of Theorem 5

Proof of Theorem 5

Note that the Dirac measure at 0, \(\delta _0\), is by definition an invariant measure of \((r_t)_{t\ge 0}\) solving (6). Assume that the process starts from an invariant probability measure \(\pi \); hence \({\mathbb {P}}(r_t >0)= p=\pi ((0,\infty ))\) for any \(t\ge 0\). Note that for \(p=0\) the drift vanishes. If the initial measure is the Dirac measure at 0, \(\delta _0\), then the diffusion coefficient vanishes as well. Hence, \(\textrm{Law}(r_t)=\delta _0\) for any \(t\ge 0\). It remains to investigate the case \(p\ne 0\). Here, we are in the regime of [18, Lemma 24], where an invariant measure is of the form (28). Since \(p ={\mathbb {P}}(r_t >0)\), the invariant measure \(\pi \) additionally satisfies the necessary condition

$$\begin{aligned} p=\pi ((0,\infty ))=\frac{I(a,p)}{2/(ap)+I(a,p)} \end{aligned}$$
(66)

with \(I(a,p)\) given in (27). For \(p\ne 0\), this expression is equivalent to (26). \(\square \)

Proof of Theorem 6

By Theorem 5, it suffices to study the solutions of (26). By (27) and since \({\tilde{b}}(r)=-{\tilde{L}}r\), it holds for \({\hat{I}}(a,p)=(1-p)I(a,p)\),

$$\begin{aligned} {\hat{I}}(a,p)=\Big (\sqrt{\frac{\pi }{2}}+\int _0^{\frac{ap}{\sqrt{2{\tilde{L}}}}}\exp (-x^2/2)\textrm{d}x\Big )\sqrt{\frac{2}{{\tilde{L}}}}\exp \Big (\frac{a^2p^2}{4{\tilde{L}}}\Big )(1-p) \;. \end{aligned}$$
(67)

In the case \(a/\sqrt{{\tilde{L}}}\le 2/\sqrt{\pi }\), \({\hat{I}}(a,0)=\sqrt{\pi /{\tilde{L}}}\) by (67). Further, by \(1+x\le \textrm{e}^x\) and \(a/\sqrt{{\tilde{L}}}\le 2/\sqrt{\pi }\),

$$\begin{aligned}&\Big (\sqrt{\frac{\pi }{2}}+\int _0^{\frac{ap}{\sqrt{2{\tilde{L}}}}}\textrm{e}^{-\frac{x^2}{2}}\textrm{d}x\Big )(1-p)\textrm{e}^{\frac{a^2p^2}{4{\tilde{L}}}}\\&\le \sqrt{\frac{\pi }{2}}\Big (1+\sqrt{\frac{2}{\pi }}\int _0^{\frac{ap}{\sqrt{2{\tilde{L}}}}}\textrm{e}^{-\frac{x^2}{2}}\textrm{d}x\Big )\textrm{e}^{-p}\textrm{e}^{\frac{p^2}{\pi }} \\ {}&\le \sqrt{\frac{\pi }{2}}\Big (1+\frac{2p}{\pi }\Big )\textrm{e}^{-p}\textrm{e}^{\frac{p^2}{\pi }} \le \sqrt{\frac{\pi }{2}}\textrm{e}^{p(\frac{3}{\pi }-1)}< \sqrt{\frac{\pi }{2}} \end{aligned}$$

for \(p\in (0,1]\). Hence, \({\hat{I}}(a,p)<{\hat{I}}(a,0)\) by (67). Therefore, \({\hat{I}}(a,p)<{\hat{I}}(a,0)\le \frac{2}{a}\) for all \(p\in (0,1]\) and so \(\delta _0\) is the unique invariant probability measure for \(a/\sqrt{{\tilde{L}}}\le 2/\sqrt{\pi }\).
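The strict inequality chain above can be spot-checked numerically. The following sketch evaluates the left-hand side on a grid of \(p\) in the boundary case \(a/\sqrt{{\tilde{L}}}=2/\sqrt{\pi }\); the normalization \({\tilde{L}}=1\) is an illustrative choice, not imposed in the text.

```python
import math

def lhs(a, p, Ltil=1.0):
    """(sqrt(pi/2) + int_0^{ap/sqrt(2 Ltil)} e^{-x^2/2} dx) * (1-p) * exp(a^2 p^2/(4 Ltil))."""
    z = a * p / math.sqrt(2.0 * Ltil)
    # int_0^z e^{-x^2/2} dx = sqrt(pi/2) * erf(z / sqrt(2))
    integral = math.sqrt(math.pi / 2.0) * math.erf(z / math.sqrt(2.0))
    return (math.sqrt(math.pi / 2.0) + integral) * (1.0 - p) * math.exp(a * a * p * p / (4.0 * Ltil))

a_boundary = 2.0 / math.sqrt(math.pi)   # boundary case a / sqrt(Ltil) = 2 / sqrt(pi)
bound = math.sqrt(math.pi / 2.0)
print(all(lhs(a_boundary, p) < bound for p in [k / 1000.0 for k in range(1, 1001)]))  # → True
```

At \(p=0\) the two sides coincide, so the margin degenerates as \(p\downarrow 0\), matching the strict inequality only on \((0,1]\).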

To show that for \(a/{\sqrt{{\tilde{L}}}}> 2/{\sqrt{\pi }}\) there exists a unique p solving (26), we note that \({\hat{I}}(a,p)\) is continuous with \({\hat{I}}(a,0)>2/a\) and \({\hat{I}}(a,1)=0\). By the intermediate value theorem, there exists at least one \(p\in (0,1)\) satisfying (26). In the following we drop the dependence on a in \(I(a,p)\) and \({\hat{I}}(a,p)\). We show uniqueness of the solution p by contradiction. Assume that \(p_1<p_2\) are the two smallest solutions of (26). Then either \({\hat{I}}'(p_1)<0\) or \({\hat{I}}'(p_1)=0\). Note that the derivative is given by

$$\begin{aligned} {\hat{I}}'(p_i)&=-I(p_i)+(1-p_i)I'(p_i) =-I(p_i)+(1-p_i)\Big (p_i\frac{a^2}{2{\tilde{L}}}I(p_i)+\frac{a}{{\tilde{L}}}\Big ) \nonumber \\ {}&=-\frac{2}{a(1-p_i)}+(1-p_i)\frac{a}{{\tilde{L}}}\Big (\frac{p_i}{1-p_i}+1\Big )=-\frac{2}{a(1-p_i)}+\frac{a}{{\tilde{L}}} \;. \end{aligned}$$
(68)

Then, for \(p_2>p_1\), it holds

$$\begin{aligned} {\hat{I}}'(p_2)=-\frac{2}{a(1-p_2)}+\frac{a}{{\tilde{L}}}<-\frac{2}{a(1-p_1)}+\frac{a}{{\tilde{L}}}={\hat{I}}'(p_1)\le 0 \;. \end{aligned}$$

If \({\hat{I}}'(p_1)<0\), it holds \({\hat{I}}'(p_2)<0\) which contradicts that \(p_1\) and \(p_2\) are the two smallest solutions. In the second case, when \({\hat{I}}'(p_1)=0\), we note that the second derivative of \({\hat{I}}(p)\) at \(p_1\) is given by

$$\begin{aligned} {\hat{I}}''(p_1)&=-2I'(p_1)+(1-p_1)I''(p_1) \\ {}&= \Big (-2+(1-p_1)\frac{a^2p_1}{2{\tilde{L}}}\Big )\Big (I(p_1)\frac{a^2p_1}{2{\tilde{L}}}+\frac{a}{{\tilde{L}}}\Big )+(1-p_1)I(p_1)\frac{a^2}{2{\tilde{L}}} \\ {}&=\Big (-2+(1-p_1)\frac{a^2p_1}{2{\tilde{L}}}\Big )\frac{a}{{\tilde{L}}(1-p_1)}+\frac{a}{{\tilde{L}}} =-\frac{a}{{\tilde{L}}(1-p_1)}<0 \;. \end{aligned}$$

Hence, in this case there is a maximum at \(p_1\), which contradicts that \(p_1\) is the smallest solution. Thus, there exists a unique solution \(p_1\) of (26) for \(a/\sqrt{{\tilde{L}}}>2/\sqrt{\pi }\).

\(\square \)
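Since (26) is equivalent to \({\hat{I}}(a,p)=2/a\) (see the proof of Theorem 5), the dichotomy established above can be illustrated numerically. The following sketch transcribes (67) and finds the nontrivial fixed point by bisection for \(a>2/\sqrt{\pi }\); the normalization \({\tilde{L}}=1\) and the value \(a=2\) are illustrative choices. Bisection is justified because, by the uniqueness just proved, \({\hat{I}}(a,\cdot )-2/a\) changes sign exactly once on \([0,1]\).

```python
import math

def I_hat(a, p, Ltil=1.0):
    """hat-I(a,p) from (67) for the linear drift tilde-b(r) = -Ltil * r."""
    z = a * p / math.sqrt(2.0 * Ltil)
    integral = math.sqrt(math.pi / 2.0) * math.erf(z / math.sqrt(2.0))  # int_0^z e^{-x^2/2} dx
    return (math.sqrt(math.pi / 2.0) + integral) * math.sqrt(2.0 / Ltil) \
        * math.exp(a * a * p * p / (4.0 * Ltil)) * (1.0 - p)

def solve_p(a, Ltil=1.0, tol=1e-12):
    """Bisection for hat-I(a,p) = 2/a on (0,1); assumes a/sqrt(Ltil) > 2/sqrt(pi),
    so that hat-I(a,0) > 2/a and hat-I(a,1) = 0 < 2/a."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if I_hat(a, mid, Ltil) > 2.0 / a:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

p_star = solve_p(2.0)   # a = 2 > 2/sqrt(pi): nontrivial fixed point exists
print(round(p_star, 4))
```

Below the threshold, e.g. \(a=1<2/\sqrt{\pi }\), the function `I_hat` stays below \(2/a\) on all of \((0,1]\), so only \(p=0\), i.e. \(\delta _0\), remains.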

6.3.3 Proof of Theorem 7

Proof of Theorem 7

To show (31) we extend the function f to a concave function on \({\mathbb {R}}\) by setting \(f(x)=x\) for \(x<0\). Note that f is continuously differentiable, and \(f'\) is absolutely continuous and bounded. Using the Itō–Tanaka formula, cf. [39, Chapter 6, Theorem 1.1], we obtain

$$\begin{aligned} \textrm{d}f(r_t)= f'(r_t)({\tilde{b}}(r_t)+a{\mathbb {P}}(r_t >0))\textrm{d}t+2 f''(r_t)\mathbbm {1}_{(0,\infty )}(r_t)\textrm{d}t+\textrm{d}M_t \;, \end{aligned}$$

where \(M_t=2 \int _0^t f'(r_s) \mathbbm {1}_{(0,\infty )}(r_s) \textrm{d}B_s\) is a martingale. Taking expectation, we get

$$\begin{aligned} \frac{d}{dt}{\mathbb {E}}[f(r_t)]&={\mathbb {E}}[ f'(r_t)({\tilde{b}}(r_t)+a{\mathbb {P}}(r_t>0))]+2{\mathbb {E}}[f''(r_t)\mathbbm {1}_{(0,\infty )}(r_t)] \\ {}&={\mathbb {E}}[f'(r_t){\tilde{b}}(r_t)+2(f''(r_t)-f''(0))]+{\mathbb {E}}[af'(r_t)+2f''(0)]{\mathbb {P}}(r_t >0) \\ {}&\le -c{\mathbb {E}}[f(r_t)] \;, \end{aligned}$$

where the last step holds by (39) and (40). By applying Gronwall’s lemma, we obtain (31).

\(\square \)

6.4 Proof of Sect. 4

The proof of Theorem 8 proceeds along the same lines as the proofs of Theorems 1 and 2. Additionally, Lemma 19 bounds the difference between the nonlinear SDE and the mean-field system; this requires a uniform-in-time bound on the second moment of the process \((\bar{X}_t)_{t\ge 0}\) solving (1), which is established first.

Lemma 17

Let \((\bar{X}_t)_{t\ge 0}\) be a solution of (1) with \({\mathbb {E}}[|\bar{X}_0|^2]<\infty \). Assume B1. Then there exists \(C\in (0,\infty )\) depending on d, W and the second moment of \(\bar{X}_0\) such that

$$\begin{aligned} C=\sup _{t\ge 0 }{\mathbb {E}}[|\bar{X}_t|^2]<\infty \;. \end{aligned}$$
(69)

The proof relies on standard techniques (see e.g., [16, Lemma 8]) and is added for completeness.

Proof of Lemma 17

By Itō’s formula, it holds

$$\begin{aligned} \frac{1}{2}\textrm{d}|\bar{X}_t|^2=\langle \bar{X}_t,b*{\bar{\mu }}_t (\bar{X}_t)\rangle \textrm{d}t+\bar{X}_t^T \textrm{d}B_t+\frac{1}{2}d \ \textrm{d}t\;. \end{aligned}$$

Taking expectation and using symmetry, where \(({\tilde{X}}_t)_{t\ge 0}\) denotes an independent copy of \((\bar{X}_t)_{t\ge 0}\), we get

$$\begin{aligned} \frac{\textrm{d}}{ \textrm{d}t}{\mathbb {E}}[|\bar{X}_t|^2]&= {\mathbb {E}}[\langle \bar{X}_t-{\tilde{X}}_t,b(\bar{X}_t-{\tilde{X}}_t)\rangle ]+d \\ {}&= -{\mathbb {E}}[\langle \bar{X}_t-{\tilde{X}}_t,L(\bar{X}_t-{\tilde{X}}_t)-\gamma (\bar{X}_t-{\tilde{X}}_t)\rangle \mathbbm {1}_{|\bar{X}_t-{\tilde{X}}_t|>R_0}] \\ {}&-{\mathbb {E}}[\langle \bar{X}_t-{\tilde{X}}_t,L(\bar{X}_t-{\tilde{X}}_t)- \gamma (\bar{X}_t-{\tilde{X}}_t)\rangle \mathbbm {1}_{|\bar{X}_t-{\tilde{X}}_t|\le R_0}]+d \\ {}&\le {\mathbb {E}}[|\bar{X}_t|^2(-2L+ \kappa (|\bar{X}_t-{\tilde{X}}_t|) \mathbbm {1}_{|\bar{X}_t-{\tilde{X}}_t|>R_0})]+\Vert \gamma \Vert _\infty R_0+d\;. \end{aligned}$$

Hence by definition (14) of \(R_0\) and by Gronwall’s lemma we obtain the result (69). \(\square \)
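The uniform-in-time second moment bound can be illustrated by an Euler–Maruyama simulation of the particle system (3), which approximates (1) for large \(N\). The interaction \(b(z)=-z+\tanh (z)\), i.e. \(L=1\) and \(\gamma =\tanh \) bounded and Lipschitz, together with \(d=1\) and the chosen step size and horizon, are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(0)
N, h, steps = 200, 0.01, 1000          # particles, Euler step, horizon T = 10
X = rng.standard_normal(N)             # X_0^i i.i.d. standard normal

second_moments = []
for _ in range(steps):
    diff = X[:, None] - X[None, :]                # pairwise differences X^i - X^j
    drift = (-diff + np.tanh(diff)).mean(axis=1)  # (1/N) sum_j b(X^i - X^j), b(z) = -z + tanh(z)
    X = X + drift * h + np.sqrt(h) * rng.standard_normal(N)
    second_moments.append(np.mean(X ** 2))

print(max(second_moments))
```

The empirical second moment stays of order one uniformly over the horizon, consistent with (69).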

Let \(N\in {\mathbb {N}}\). We construct a sticky coupling of N i.i.d. realizations of solutions \((\{\bar{X}_t^i\}_{i=1}^N)_{t\ge 0}\) to (1) and of the solution \((\{Y_t^i\}_{i=1}^N)_{t\ge 0}\) to the mean-field particle system (3). Then, we consider a weak limit for \(\delta \rightarrow 0\) of Markovian couplings which are constructed similarly to those in Sect. 2. Let \(\textrm{rc}^{\delta }\), \(\textrm{sc}^{\delta }\) satisfy (19) and (20). The coupling \((\{\bar{X}_t^{i,\delta },{Y}_t^{i,\delta }\}_{i=1}^N)_{t\ge 0}\) is defined as a process in \({\mathbb {R}}^{2Nd}\) satisfying a system of SDEs given by

$$\begin{aligned} \textrm{d}\bar{X}_t^{i,\delta }&=b*{\bar{\mu }}^\delta _t(\bar{X}_t^{i,\delta }) \textrm{d}t +\textrm{rc}^{\delta }({\tilde{r}}_t^{i,\delta })\textrm{d}B_t^{i,1} +\textrm{sc}^{\delta }({\tilde{r}}_t^{i,\delta })\textrm{d}B_t^{i,2} \nonumber \\ \textrm{d}{Y}_t^{i,\delta }&=\frac{1}{N}\sum _{j=1}^N b(Y_t^{i,\delta }-Y_t^{j,\delta })\textrm{d}t\nonumber \\&\quad +\textrm{rc}^{\delta }({\tilde{r}}_t^{i,\delta })({\text {Id}}-2{\tilde{e}}_t^{i,\delta }({\tilde{e}}_t^{i,\delta })^T) \textrm{d}B_t^{i,1} +\textrm{sc}^{\delta }({\tilde{r}}_t^{i,\delta })\textrm{d}B_t^{i,2}\;, \end{aligned}$$
(70)

where \(\textrm{Law}(\{\bar{X}_0^{i,\delta },Y_0^{i,\delta }\}_{i=1}^N)={\bar{\mu }}_0^{\otimes N}\otimes \nu _0^{\otimes N}\), and where \((\{B_t^{i,1}\}_{i=1}^N)_{t\ge 0},(\{B_t^{i,2}\}_{i=1}^N)_{t\ge 0}\) are i.i.d. d-dimensional standard Brownian motions. We set \({\tilde{X}}_t^{i,\delta }=\bar{X}_t^{i,\delta }-\frac{1}{N}\sum _{j=1}^N\bar{X}_t^{j,\delta }\), \({\tilde{Y}}_t^{i,\delta }=Y_t^{i,\delta }-\frac{1}{N}\sum _{j=1}^N Y_t^{j,\delta }\), \({\tilde{Z}}_t^{i,\delta }={\tilde{X}}_t^{i,\delta }-{\tilde{Y}}_t^{i,\delta }\), \({\tilde{r}}_t^{i,\delta }=|{\tilde{Z}}_t^{i,\delta }|\) and \({\tilde{e}}_t^{i,\delta }={\tilde{Z}}_t^{i,\delta }/{\tilde{r}}_t^{i,\delta }\) for \({\tilde{r}}_t^{i,\delta }\ne 0\). The value \({\tilde{e}}_t^{i,\delta }\) for \({\tilde{r}}_t^{i,\delta }=0\) is irrelevant as \(\textrm{rc}^{\delta }(0)=0\). By Lévy’s characterization, \((\{\bar{X}_t^{i,\delta },{Y}^{i,\delta }_t\}_{i=1}^N)_{t\ge 0}\) is indeed a coupling of (1) and (3). Existence and uniqueness of the coupling given in (70) hold by [36, Theorem 2.2]. In the next step we analyse \({\tilde{r}}_t^{i,\delta }\).
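For intuition, one concrete choice compatible with the role of \(\textrm{rc}^{\delta }\), \(\textrm{sc}^{\delta }\) is the following sketch; the precise conditions (19) and (20) are not restated here, so this particular \(\delta \)-parametrization is an assumption: \(\textrm{rc}^{\delta }\) Lipschitz with \(\textrm{rc}^{\delta }(0)=0\), \(\textrm{rc}^{\delta }=1\) away from 0, and \((\textrm{rc}^{\delta })^2+(\textrm{sc}^{\delta })^2=1\).

```python
import math

def rc(r, delta=0.1):
    """Sketch of rc^delta: Lipschitz, rc(0) = 0, rc = 1 for r >= delta (assumed form)."""
    return min(max(r / delta, 0.0), 1.0)

def sc(r, delta=0.1):
    """sc^delta chosen so that rc^2 + sc^2 = 1 (total noise intensity preserved)."""
    return math.sqrt(max(0.0, 1.0 - rc(r, delta) ** 2))

print(rc(0.05), sc(0.05))
```

The constraint \(\textrm{rc}^2+\textrm{sc}^2=1\) keeps the total noise intensity equal to one, so each marginal in (70) remains driven by a standard Brownian motion.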

Lemma 18

Assume B1 holds. Then, for \(\epsilon <\epsilon _0\), where \(\epsilon _0\) is given in (20), and for any \(i \in \{1,\ldots ,N\}\), it holds almost surely,

$$\begin{aligned} \textrm{d}{\tilde{r}}_t^{i,\delta }&=-L{\tilde{r}}_t^{i,\delta } \textrm{d}t+\langle {\tilde{e}}_t^{i,\delta },\frac{1}{N}\sum _{j=1}^N \gamma ({\tilde{X}}_t^{i,\delta }-{\tilde{X}}_t^{j,\delta })-\gamma ({\tilde{Y}}_t^{i,\delta }-{\tilde{Y}}_t^{j,\delta })\rangle \textrm{d}t \nonumber \\ {}&\quad + 2\sqrt{1+\frac{1}{N}} \textrm{rc}^{\delta }({\tilde{r}}_t^{i,\delta })\textrm{d}W_t^{i,\delta } + \Big \langle {\tilde{e}}_t^{i,\delta },\Theta _t^{i,\delta }+\frac{1}{N}\sum _{k=1}^N\Theta _t^{k,\delta }\Big \rangle \textrm{d}t \nonumber \\ {}&\le \Big ( \bar{b}({\tilde{r}}_t^{i,\delta }) + 2\Vert \gamma \Vert _\infty \frac{1}{N}\sum _{j=1}^N\textrm{rc}^\epsilon ({\tilde{r}}_t^{j,\delta })\Big ) \textrm{d}t + 2 \sqrt{1+\frac{1}{N}} \textrm{rc}^{\delta }({\tilde{r}}_t^{i,\delta })\textrm{d}W_t^{i,\delta } \nonumber \\ {}&\quad +\Big ( A_t^{i,\delta } + \frac{1}{N}\sum _{k=1}^NA_t^{k,\delta }\Big )\textrm{d}t\;. \end{aligned}$$
(71)

with \(\Theta _t^{i,\delta }=b*{\bar{\mu }}_t^\delta (\bar{X}_t^{i,\delta })-\frac{1}{N}\sum _{j=1}^N b (\bar{X}_t^{i,\delta }-\bar{X}_t^{j,\delta })\) and

$$\begin{aligned} A_t^{i,\delta }=\Big |\Theta _t^{i,\delta }\Big |=\Big |b*{\bar{\mu }}_t^\delta (\bar{X}_t^{i,\delta })- \frac{1}{N}\sum _{j=1}^N b (\bar{X}_t^{i,\delta }-\bar{X}_t^{j,\delta })\Big | \end{aligned}$$
(72)

and where \((\{W_t^{i,\delta }\}_{i=1}^N)_{t\ge 0}\) are N one-dimensional Brownian motions given by

$$\begin{aligned} W_t^{i,\delta }=\sqrt{\frac{N}{N+1}}\left( \int _0^t ({\tilde{e}}_s^{i,\delta })^T \textrm{d}B_s^{i,1}+\frac{1}{N}\sum _{j=1}^N\int _0^t ({\tilde{e}}_s^{j,\delta })^T\textrm{d}B_s^{j,1} \right) \;, \quad i=1,\ldots , N. \end{aligned}$$
(73)

Proof

By (70) and since \(\gamma \) is anti-symmetric, Itō’s formula yields, for any \(i \in \{1,\ldots ,N\}\),

$$\begin{aligned} \textrm{d}({\tilde{r}}_t^{i,\delta })^2&=-2L({\tilde{r}}_t^{i,\delta })^2 \textrm{d}t+2\langle {\tilde{Z}}_t^{i,\delta },\frac{1}{N}\sum _{j=1}^N \gamma ({\tilde{X}}_t^{i,\delta }-{\tilde{X}}_t^{j,\delta })-\gamma ({\tilde{Y}}_t^{i,\delta }-{\tilde{Y}}_t^{j,\delta })\rangle \textrm{d}t \\ {}&\quad +4 \Big (1+\frac{1}{N}\Big ) \textrm{rc}^{\delta }({\tilde{r}}_t^{i,\delta })^2\textrm{d}t+ 4 \sqrt{1+\frac{1}{N}} \textrm{rc}^{\delta }({\tilde{r}}_t^{i,\delta })\langle {\tilde{Z}}_t^{i,\delta },{\tilde{e}}_t^{i,\delta }\rangle \textrm{d}W_t^{i,\delta } \\ {}&\quad + 2\langle {\tilde{Z}}_t^{i,\delta },b*{\bar{\mu }}_t^{\delta } (\bar{X}_t^{i,\delta })-\frac{1}{N}\sum _{j=1}^N b(\bar{X}_t^{i,\delta }-\bar{X}_t^{j,\delta })\rangle \textrm{d}t \\ {}&\quad +2 \langle {\tilde{Z}}_t^{i,\delta },-\frac{1}{N}\sum _{k=1}^N \Big (b*{\bar{\mu }}_t^{\delta } (\bar{X}_t^{k,\delta })-\frac{1}{N}\sum _{j=1}^N b(\bar{X}_t^{k,\delta }-\bar{X}_t^{j,\delta })\Big )\rangle \textrm{d}t\;. \end{aligned}$$

where \((\{W_t^i\}_{i=1}^N)_{t\ge 0}\) are N i.i.d. one-dimensional Brownian motions given by (73). Note that the prefactor \((N/(N+1))^{1/2}\) ensures that the quadratic variation satisfies \([W^i]_t=t\) for \(t\ge 0\), and hence \((\{W_t^i\}_{i=1}^N)_{t\ge 0}\) are indeed Brownian motions. This definition of \((\{W_t^i\}_{i=1}^N)_{t\ge 0}\) leads to the factor \((1+1/N)^{1/2}\) in the diffusion term of the SDE. Applying the \({{\mathcal {C}}}^2\) approximation of the square root used in the proof of Lemma 12 and taking \(\varepsilon \rightarrow 0\) in the approximation yields the stochastic differential equations of \((\{{\tilde{r}}_t^{i,\delta }\}_{i=1}^N)_{t\ge 0}\). We obtain the upper bound for \(\epsilon <\epsilon _0\) by B1 and (20) similarly to the proof of Lemma 12. \(\square \)

Next, we state a bound for (72). The result and the proof are adapted from [16, Theorem 2].

Lemma 19

Under the same assumptions as in Lemma 20, it holds for any \(i=1,\ldots ,N\)

$$\begin{aligned} E\Big [|A_t^{i,\delta }|^2\Big ]\le C_1N^{-1} \text { and } E\Big [A_t^{i,\delta }\Big ]\le C_2 N^{-1/2}\;, \end{aligned}$$

where \(A_t^{i,\delta }\) is given in (72) and \(C_1\) and \(C_2\) are constants depending on \(\Vert \gamma \Vert _\infty \), L and C given in Lemma 17.

Proof

By B3, it holds \({\mathbb {E}}[|\bar{X}_0^{i,\delta }|^2]<\infty \) for \(i=1,\ldots ,N\). Note that given \(\bar{X}_t^{i,\delta }\), the random variables \(\bar{X}_t^{j,\delta }\), \(j\ne i\), are i.i.d. with law \({\bar{\mu }}_t^{\delta }\). Hence,

$$\begin{aligned} {\mathbb {E}}[b(\bar{X}_t^{i,\delta }-\bar{X}_t^{j,\delta })|\bar{X}_t^{i,\delta }]=b*{\bar{\mu }}_t^\delta (\bar{X}_t^{i,\delta })\;. \end{aligned}$$

Since \(\gamma \) is anti-symmetric, it holds \(b(0)=0\), and we have

$$\begin{aligned} {\mathbb {E}}&\Big [|b*{\bar{\mu }}_t^\delta (\bar{X}_t^{i,\delta })-\frac{1}{N-1}\sum _{j=1}^Nb(\bar{X}_t^{i,\delta }-\bar{X}_t^{j,\delta })|^2\Big |\bar{X}_t^{i,\delta }\Big ] \\ {}&= {\mathbb {E}}\Big [|\frac{1}{N-1}\sum _{j=1}^N{\mathbb {E}}[b(\bar{X}_t^{i,\delta }-\bar{X}_t^{j,\delta })|\bar{X}_t^{i,\delta }] -\frac{1}{N-1}\sum _{j=1}^Nb(\bar{X}_t^{i,\delta }-\bar{X}_t^{j,\delta })|^2\Big |\bar{X}_t^{i,\delta }\Big ] \\ {}&=\frac{1}{N-1}\textrm{Var}_{{\bar{\mu }}_t^\delta }(b(\bar{X}_t^{i,\delta }-\cdot ))\;. \end{aligned}$$

By (11), B1, B3 and Lemma 17, we obtain

$$\begin{aligned} \textrm{Var}_{{\bar{\mu }}_t^\delta }(b(\bar{X}_t^{i,\delta }-\cdot ))&=\int _{{\mathbb {R}}^d}\Big |\Big (-L(\bar{X}_t^{i,\delta }-x)+\int _{{\mathbb {R}}^d} L(\bar{X}_t^{i,\delta }-{\tilde{x}}){\bar{\mu }}_t^\delta (\textrm{d}{\tilde{x}})\Big ) \\ {}&+\Big (\gamma (\bar{X}_t^{i,\delta }-x)-\int _{{\mathbb {R}}^d} \gamma (\bar{X}_t^{i,\delta }-{\tilde{x}}){\bar{\mu }}_t^\delta (\textrm{d}{\tilde{x}})\Big )\Big |^2 {\bar{\mu }}_t^\delta (\textrm{d}x) \\ {}&=\int _{{\mathbb {R}}^d}\Big |Lx+\Big (\gamma (\bar{X}_t^{i,\delta }-x)-\int _{{\mathbb {R}}^d} \gamma (\bar{X}_t^{i,\delta }-{\tilde{x}}){\bar{\mu }}_t^\delta (\textrm{d}{\tilde{x}})\Big )\Big |^2 {\bar{\mu }}_t^\delta (\textrm{d}x) \\ {}&\le 2L^2\int _{{\mathbb {R}}^d} |x|^2{\bar{\mu }}_t^\delta (\textrm{d}x)+8\Vert \gamma \Vert _\infty ^2\le 2L^2C+8\Vert \gamma \Vert _\infty ^2\;. \end{aligned}$$

By the Cauchy-Schwarz inequality, we have

$$\begin{aligned} {\mathbb {E}}[(A_t^{i,\delta })^2]&\le 2{\mathbb {E}}\Big [|b*{\bar{\mu }}_t^\delta (\bar{X}_t^{i,\delta })-\frac{1}{N-1}\sum _{j=1}^Nb(\bar{X}_t^{i,\delta }-\bar{X}_t^{j,\delta })|^2\Big ] \\ {}&\quad +2\Big (\frac{1}{N-1}-\frac{1}{N}\Big )^2 {\mathbb {E}}\Big [|\sum _{j=1}^Nb(\bar{X}_t^{i,\delta }-\bar{X}_t^{j,\delta })|^2\Big ] \\ {}&\le 2\frac{1}{N-1}{\mathbb {E}}[\textrm{Var}_{{\bar{\mu }}_t^\delta }(b(\bar{X}_t^{i,\delta }-\cdot ))]+\frac{1}{N^2(N-1)}{\mathbb {E}}\Big [\sum _{j=1}^N|b(\bar{X}_t^{i,\delta }-\bar{X}_t^{j,\delta })|^2\Big ] \\ {}&\le \frac{4 L^2}{N-1}C+\frac{16\Vert \gamma \Vert _\infty ^2}{N-1} +\frac{1}{N^2}\Big (8CL^2+4\Vert \gamma \Vert ^2_\infty \Big ) \\ {}&\le N^{-1}C_1<\infty \;, \end{aligned}$$

where \(C_1\) depends on \(\Vert \gamma \Vert _\infty \), L and the second moment bound C. Similarly, it holds

$$\begin{aligned} {\mathbb {E}}[A_t^{i,\delta }]&\le {\mathbb {E}}\Big [|b*{\bar{\mu }}_t^\delta (\bar{X}_t^{i,\delta })-\frac{1}{N-1}\sum _{j=1}^Nb(\bar{X}_t^{i,\delta }-\bar{X}_t^{j,\delta })|\Big ] \\ {}&\quad +\Big (\frac{1}{N-1}-\frac{1}{N}\Big ) \sum _{j=1}^N {\mathbb {E}}\Big [|b(\bar{X}_t^{i,\delta }-\bar{X}_t^{j,\delta })|\Big ] \\ {}&\le \frac{\sqrt{2} L}{\sqrt{N-1}}C^{1/2}+\frac{\sqrt{8}\Vert \gamma \Vert _\infty }{\sqrt{N-1}} +\frac{1}{N}\Big (\sqrt{2}C^{1/2}L+\Vert \gamma \Vert _\infty \Big ) \\ {}&\le N^{-1/2}C_2<\infty \;, \end{aligned}$$

where \(C_2=2LC^{1/2}+4\Vert \gamma \Vert _\infty +(\sqrt{2}C^{1/2}L+\Vert \gamma \Vert _\infty )\). \(\square \)
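The \(N^{-1/2}\) rate in Lemma 19 is a law-of-large-numbers fluctuation. The following self-contained Monte Carlo sketch checks the rate in the purely linear case \(b(z)=-z\) with standard normal samples; these are illustrative assumptions, under which the error reduces to \(A^{i}=|N^{-1}\sum _j X^j|\).

```python
import numpy as np

rng = np.random.default_rng(1)

def mean_A(N, reps=4000):
    """Monte Carlo estimate of E[A] for b(z) = -z and X^j i.i.d. N(0,1),
    in which case A = |N^{-1} sum_j X^j| and E[A] = sqrt(2/(pi N))."""
    samples = rng.standard_normal((reps, N))
    return np.abs(samples.mean(axis=1)).mean()

a100, a400 = mean_A(100), mean_A(400)
print(a100 / a400)   # should be close to sqrt(400/100) = 2
```

Quadrupling \(N\) halves the estimated \(E[A]\), i.e. the printed ratio is close to 2, in accordance with the \(N^{-1/2}\) bound.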

To control \((\{{\tilde{r}}_t^{i,\delta }\}_{i=1}^N)_{t\ge 0}\), we consider \((\{r_t^{i,\delta ,\epsilon }\}_{i=1}^N)_{t\ge 0}\) given as solution of

$$\begin{aligned} \textrm{d}r_t^{i,\delta ,\epsilon }&=\bar{b}(r_t^{i,\delta ,\epsilon }) \textrm{d}t + \frac{1}{N}\sum _{j=1}^N 2\Vert \gamma \Vert _\infty \textrm{rc}^{\epsilon }(r_t^{j,\delta ,\epsilon })\textrm{d}t +\Big ( A_t^{i,\delta }+\frac{1}{N}\sum _{k=1}^N A_t^{k,\delta }\Big )\textrm{d}t \nonumber \\ {}&\quad + 2\sqrt{1+\frac{1}{N}} \textrm{rc}^{\delta }(r_t^{i,\delta ,\epsilon })\textrm{d}W_t^{i,\delta } \end{aligned}$$
(74)

with initial condition \(r_0^{i,\delta ,\epsilon }={\tilde{r}}_0^{i,\delta }\) for all \(i=1,\ldots ,N\), \(A_t^{i,\delta }\) given in (72) and \(W_t^{i,\delta }\) given in (73).

By [36, Theorem 2.2], under B1 and B3, \((\{U_t^{i,\delta ,\epsilon }\}_{i=1}^N)_{t\ge 0}=(\{\bar{X}_t^{i,\delta },Y_t^{i,\delta },r_t^{i,\delta ,\epsilon }\}_{i=1}^N)_{t\ge 0}\) exists and is unique, where \((\{\bar{X}_t^{i,\delta },Y_t^{i,\delta }\}_{i=1}^N)_{t\ge 0}\) solves uniquely (70), and \((\{{\tilde{r}}_t^{i,\delta }\}_{i=1}^N)_{t\ge 0}\) and \((\{r_t^{i,\delta ,\epsilon }\}_{i=1}^N)_{t\ge 0}\) solve uniquely (71) and (74), respectively, with \((\{W_t^{i,\delta }\}_{i=1}^N)_{t\ge 0}\) given by (73).

Lemma 20

Assume B1 and B3. Then for any \(i=1,\ldots ,N\), \(|\bar{X}_t^{i,\delta }-Y_t^{i,\delta }-\frac{1}{N}\sum _j(\bar{X}_t^{j,\delta }-Y_t^{j,\delta })|={\tilde{r}}_t^{i,\delta }\le r_t^{i,\delta ,\epsilon }\), almost surely for all \(t\ge 0\) and \(\epsilon <\epsilon _0\).

Proof

Note that both processes \((\{{\tilde{r}}_t^{i,\delta }\}_{i=1}^N)_{t \ge 0}\) and \((\{r_t^{i,\delta ,\epsilon }\}_{i=1}^N)_{t\ge 0}\) have the same initial condition and are driven by the same noise. Since the drift of \((\{r_t^{i,\delta ,\epsilon }\}_{i=1}^N)_{t\ge 0}\) dominates the drift of \((\{{\tilde{r}}_t^{i,\delta }\}_{i=1}^N)_{t\ge 0}\) for \(\epsilon <\epsilon _0\) by (20), we can conclude \({\tilde{r}}_t^{i,\delta }\le r_t^{i,\delta ,\epsilon }\) almost surely for all \(t\ge 0\), \(\epsilon <\epsilon _0\) and \(i=1,\ldots ,N\) by Lemma 21. \(\square \)

Proof of Theorem 8

Consider the process \((\{U_t^{i,\delta ,\epsilon }\}_{i=1}^N)_{t\ge 0}=(\{\bar{X}_t^{i,\delta },Y_t^{i,\delta },r_t^{i,\delta ,\epsilon }\}_{i=1}^N)_{t\ge 0}\) on \({\mathbb {R}}^{N(2d+1)}\) for each \(\epsilon ,\delta >0\). We denote by \({\mathbb {P}}^{\delta ,\epsilon }\) the law of \(\{U^{i,\delta ,\epsilon }\}_{i=1}^N\) on \({{\mathcal {C}}}({\mathbb {R}}_+,{\mathbb {R}}^{N(2d+1)})\). We define the canonical projections \({{\varvec{X}}},{{\varvec{Y}}},{\varvec{r}}\) onto the first Nd, the second Nd and the last N components, respectively.

By B1 and B3, it holds along the same lines as in the proof of Lemma 22, for each \(T>0\),

$$\begin{aligned} E[|\{U_{t_2}^{i,\delta ,\epsilon }-U_{t_1}^{i,\delta ,\epsilon }\}_{i=1}^N|^4]\le C|t_2-t_1|^2 \qquad \text {for } t_1,t_2\in [0,T], \end{aligned}$$
(75)

for some constant C depending on T, L, \(\Vert \gamma \Vert _{\textrm{Lip}}\), \(\Vert \gamma \Vert _\infty \), N and on the fourth moment of \(\mu _0\) and \(\nu _0\). Note that we used here that the additional drift terms \((A_t^{i,\delta })_{t\ge 0}\) occurring in the SDE of \((\{r_t^{i,\delta ,\epsilon }\}_{i=1}^N)_{t\ge 0}\) are Lipschitz continuous in \((\{\bar{X}_t^{i,\delta }\}_{i=1}^N)_{t\ge 0}\). Then as in the proofs of Lemma 22 and Lemma 23, \({\mathbb {P}}^{\delta ,\epsilon }\) is tight and converges weakly along a subsequence to a measure \({\mathbb {P}}\) by Kolmogorov’s continuity criterion, cf. [31, Corollary 14.9].

As in Lemma 22, the law \({\mathbb {P}}_T^{\delta ,\epsilon }\) of \((\{U_t^{i,\delta ,\epsilon }\}_{i=1}^N)_{0\le t\le T}\) on \({{\mathcal {C}}}([0,T],{\mathbb {R}}^{N(2d+1)})\) is tight for each \(T>0\) by [31, Corollary 14.9], and for each \(\epsilon >0\) there exists a subsequence \(\delta _n\rightarrow 0\) such that \(({\mathbb {P}}^{\delta _n,\epsilon }_T)_{n\in {\mathbb {N}}}\) on \({{\mathcal {C}}}([0,T],{\mathbb {R}}^{N(2d+1)})\) converges to a measure \({\mathbb {P}}^\epsilon _T\) on \({{\mathcal {C}}}([0,T],{\mathbb {R}}^{N(2d+1)})\). By a diagonalization argument and since \(\{{\mathbb {P}}^\epsilon _T: T\ge 0\}\) is a consistent family, cf. [31, Theorem 5.16], there exists a probability measure \({\mathbb {P}}^\epsilon \) on \({{\mathcal {C}}}({\mathbb {R}}_+,{\mathbb {R}}^{N(2d+1)})\) such that for all \(\epsilon \) there exists a subsequence \(\delta _n\) such that \(({\mathbb {P}}^{\delta _n,\epsilon })_{n\in {\mathbb {N}}}\) converges along this subsequence to \({\mathbb {P}}^\epsilon \). As in the proof of Lemma 23, we repeat this argument for the family of measures \(({\mathbb {P}}^\epsilon )_{\epsilon >0}\). Hence, there exists a subsequence \(\epsilon _m\rightarrow 0\) such that \(({\mathbb {P}}^{\epsilon _m})_{m\in {\mathbb {N}}}\) converges to a measure \({\mathbb {P}}\). Let \((\{\bar{X}_t^i,{Y}_t^i,r_t^i\}_{i=1}^N)_{t\ge 0}\) be a process on \({\mathbb {R}}^{N(2d+1)}\), defined on some probability space \(({\bar{\Omega }},\bar{{{\mathcal {F}}}}, \bar{P})\), with distribution \({\mathbb {P}}\).

Since \((\{\bar{X}_t^{i,\delta }\}_{i=1}^N)_{t\ge 0}\) and \((\{{Y}_t^{i,\delta }\}_{i=1}^N)_{t\ge 0}\) are solutions that are unique in law, we have that for any \(\delta ,\epsilon >0\), \({\mathbb {P}}^{\delta ,\epsilon }\circ {\varvec{X}}^{-1}={\mathbb {P}}\circ {\varvec{X}}^{-1}\) and \({\mathbb {P}}^{\delta ,\epsilon }\circ {\varvec{Y}}^{-1}={\mathbb {P}}\circ {\varvec{Y}}^{-1}\). Hence, \({\mathbb {P}}\circ ({{\varvec{X}}},{{\varvec{Y}}})^{-1}\) is a coupling of (1) and (3).

Similarly to the proofs of Lemmas 22 and 23, there exist an extended underlying probability space and N i.i.d. one-dimensional Brownian motions \((\{W_t^i\}_{i=1}^N)_{t\ge 0}\) such that \((\{r_t^i,W_t^i\}_{i=1}^N)_{t\ge 0}\) is a solution of

$$\begin{aligned} \textrm{d}r_t^i&= \bar{b}(r_t^{i}) \textrm{d}t + \frac{1}{N}\sum _{j=1}^N 2\Vert \gamma \Vert _\infty \mathbbm {1}_{(0,\infty )}(r_t^j)\textrm{d}t +\Big (A_t^i+\frac{1}{N}\sum _{k=1}^N A_t^k\Big )\textrm{d}t\\&\quad +2\sqrt{1+\frac{1}{N}} \mathbbm {1}_{(0,\infty )}(r_t^i)\textrm{d}W_t^i\;, \end{aligned}$$

where \(A_t^i=|b*{\bar{\mu }}_t(\bar{X}_t^i)- \frac{1}{N}\sum _{j=1}^N b(\bar{X}_t^i-\bar{X}_t^j)|\).

In addition, the statement of Lemma 20 carries over to the limiting process \((\{r_t^i\}_{i=1}^N)_{t\ge 0}\), since by the weak convergence along the subsequences \((\delta _n)_{n\in {\mathbb {N}}}\) and \((\epsilon _m)_{m\in {\mathbb {N}}}\) and the Portmanteau theorem, \(P(|{\tilde{X}}^i_t-{\tilde{Y}}^i_t|\le r_t^i \text { for } i=1,\ldots ,N)\ge \limsup _{m\rightarrow \infty }\limsup _{n\rightarrow \infty } P(|{\tilde{X}}_t^{i,\delta _n}-{\tilde{Y}}_t^{i,\delta _n}|\le r_t^{i,\delta _n,\epsilon _m} \text { for } i=1,\ldots ,N)=1\), where \({\tilde{X}}^i_t=\bar{X}_t^i-(1/N)\sum _{j=1}^N\bar{X}_t^j\) and \({\tilde{Y}}^i_t={Y}_t^i-(1/N)\sum _{j=1}^N{Y}_t^j\) for all \(t\ge 0\) and \(i=1,\ldots ,N\).

Using the Itō–Tanaka formula, cf. [39, Chapter 6, Theorem 1.1], and using that \(f'\) is absolutely continuous, we obtain for f defined in (37) with \({\tilde{b}}(r)=(\kappa (r)-L)r\) and \(a=2\Vert \gamma \Vert _\infty \),

$$\begin{aligned} \textrm{d}\Big (\frac{1}{N}\sum _{i=1}^N f(r_t^i)\Big )&=\frac{1}{N}\sum _{i=1}^N\Big (\bar{b}(r_t^i)f'(r_t^i)+f''(r_t^i)2\frac{N+1}{N} \mathbbm {1}_{(0,\infty )}(r_t^i)\Big )\textrm{d}t \\ {}&\quad +\frac{1}{N^2}\sum _{i=1}^N\sum _{j=1}^N2f'(r_t^i)\Vert \gamma \Vert _\infty \mathbbm {1}_{(0,\infty )}(r_t^j)\textrm{d}t \\ {}&\quad +\frac{1}{N}\sum _{i=1}^N f'(r_t^i)2\sqrt{1+\frac{1}{N}} \mathbbm {1}_{(0,\infty )}(r_t^i)\textrm{d}W_t^i\\&\quad +\frac{1}{N}\sum _{i=1}^N f'(r_t^i)\Big (A_t^{i}+\frac{1}{N}\sum _{k=1}^NA_t^k\Big )\textrm{d}t\;. \end{aligned}$$

Taking expectation, we get using \(f'(r)\le 1\) for all \(r\ge 0\),

$$\begin{aligned} \frac{\textrm{d}}{\textrm{d}t}{\mathbb {E}}\Big [\frac{1}{N}\sum _{i=1}^N f(r_t^i)\Big ]&\le \frac{1}{N}\sum _{i=1}^N\Big \{{\mathbb {E}}\Big [\bar{b}(r_t^i)f'(r_t^i)+2\frac{N+1}{N}(f''(r_t^i)-f''(0))\Big ]\nonumber \\ {}&\quad + {\mathbb {E}}\Big [2\Big (\Vert \gamma \Vert _\infty +\frac{N+1}{N} f''(0)\Big )\mathbbm {1}_{(0,\infty )}(r_t^i)\Big ]+{\mathbb {E}}\Big [ 2A_t^{i}\Big ]\Big \}\;. \end{aligned}$$
(76)

By (39) and (40), the first two terms are bounded by \(-{\tilde{c}}\,{\mathbb {E}}[\frac{1}{N}\sum _i f(r_t^i)]\) with \({\tilde{c}}\) given in (17).

By Lemma 19 the last term in (76) is bounded by

$$\begin{aligned} 2E[A_t^i]\le {\tilde{C}}N^{-1/2}\;, \end{aligned}$$

where

$$\begin{aligned} {\tilde{C}}=2C_2=4LC^{1/2}+8\Vert \gamma \Vert _\infty +2(\sqrt{2}C^{1/2}L+\Vert \gamma \Vert _\infty )\;. \end{aligned}$$
(77)

Hence, we obtain

$$\begin{aligned} \frac{d}{dt}{\mathbb {E}}\Big [\frac{1}{N}\sum _i f(r_t^i)\Big ]&\le -{\tilde{c}} \frac{1}{N}\sum _i{\mathbb {E}}[f(r_t^i)]+{\tilde{C}}N^{-1/2} \end{aligned}$$

for \(t\ge 0\), which leads by Gronwall’s lemma to

$$\begin{aligned} {\mathbb {E}}\Big [\frac{1}{N}\sum _i f(r_t^i)\Big ]\le \textrm{e}^{-{\tilde{c}} t} {\mathbb {E}}\Big [\frac{1}{N}\sum _i f(r_0^i)\Big ]+\frac{1}{{\tilde{c}}}{\tilde{C}}N^{-1/2}\;. \end{aligned}$$

For an arbitrary coupling \(\xi \in \Gamma ({\bar{\mu }}_0^{\otimes N},\nu _0^{\otimes N})\), we have

$$\begin{aligned} {{\mathcal {W}}}_{f,N}(({\bar{\mu }}_t)^{\otimes N},\nu _t^N)\\ \le \textrm{e}^{-{\tilde{c}} t} \int _{{\mathbb {R}}^{2Nd}} \frac{1}{N}\sum _{i=1}^N f \left( \left| x^i-y^i-\frac{1}{N}\sum _{j=1}^N(x^j-y^j) \right| \right) \xi (\textrm{d}x \textrm{d}y)+\frac{{\tilde{C}}}{{\tilde{c}}N^{1/2}}\;, \end{aligned}$$

as \({\mathbb {E}}[f(r_0^i)]\le \int _{{\mathbb {R}}^{2Nd}} \frac{1}{N}\sum _{i=1}^Nf(|x^i-y^i-\frac{1}{N}\sum _{j=1}^N(x^j-y^j)|) \xi (\textrm{d}x \textrm{d}y)\). Taking the infimum over all couplings \(\xi \in \Gamma ({\bar{\mu }}_0^{\otimes N},\nu _0^{\otimes N})\) gives the first bound. By (38), the second bound follows. \(\square \)
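The Gronwall step in the proof above — passing from \(u'(t)\le -{\tilde{c}}u(t)+{\tilde{C}}N^{-1/2}\) to \(u(t)\le \textrm{e}^{-{\tilde{c}}t}u(0)+{\tilde{C}}/({\tilde{c}}N^{1/2})\) — can be sanity-checked on the extremal ODE \(u'=-{\tilde{c}}u+{\tilde{C}}N^{-1/2}\), whose exact solution must stay below the claimed bound; the constants below are illustrative.

```python
import math

c_til, C_til, N, u0 = 0.5, 2.0, 100, 5.0
K = C_til / math.sqrt(N)   # perturbation of size C~ N^{-1/2}

gaps = []
for t in [0.0, 0.5, 1.0, 2.0, 5.0, 10.0]:
    # exact solution of u' = -c_til * u + K with u(0) = u0
    u = math.exp(-c_til * t) * u0 + (K / c_til) * (1.0 - math.exp(-c_til * t))
    bound = math.exp(-c_til * t) * u0 + K / c_til   # claimed Gronwall bound
    gaps.append(bound - u)
print(min(gaps))
```

The gap \((K/{\tilde{c}})\textrm{e}^{-{\tilde{c}}t}\) is strictly positive for all \(t\), confirming that the bound is never violated (and is attained only as \(t\rightarrow \infty \)).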

6.5 Proof of Sect. 5

Analogously to the proof of Theorem 3, we introduce approximations for the system of sticky SDEs and prove Theorem 10 using the comparison result given in Lemma 21, taking the limit of the approximations in two steps and identifying the limit with the solution of (35).

As in the nonlinear case, we prove Theorem 10 via a family of stochastic differential equations with Lipschitz continuous coefficients,

$$\begin{aligned} \begin{aligned}&\textrm{d}r_t^{i,n,m}=\Big ({\tilde{b}}(r_t^{i,n,m})+\frac{1}{N}\sum _{j=1}^Ng^m(r_t^{j,n,m})\Big )\textrm{d}t+2\theta ^n(r_t^{i,n,m}) \textrm{d}W_t^i \\ {}&\textrm{d}s_t^{i,n,m}=\Big ({\hat{b}}(s_t^{i,n,m})+\frac{1}{N}\sum _{j=1}^Nh^m(s_t^{j,n,m})\Big )\textrm{d}t+2\theta ^n(s_t^{i,n,m}) \textrm{d}W_t^i \\ {}&\textrm{Law}(r_0^{i,n,m},s_0^{i,n,m})=\eta _{n,m}\;, \qquad i\in \{1,\ldots ,N\}\;, \end{aligned} \end{aligned}$$
(78)

where \(\eta _{n,m}\in \Gamma (\mu _{n,m},\nu _{n,m})\). Under H1, H2, H5, H6 and H7 we identify the weak limit, as \(n\rightarrow \infty \), of \((\{r_t^{i,n,m},s_t^{i,n,m}\}_{i=1}^N)_{t\ge 0}\), \(n,m\in {\mathbb {N}}\), solving (78) with \((\{r_t^{i,m},s_t^{i,m}\}_{i=1}^N)_{t\ge 0}\), \(m\in {\mathbb {N}}\), solving the family of SDEs given by

$$\begin{aligned} \begin{aligned}&\textrm{d}r_t^{i,m}=\Big ({\tilde{b}}(r_t^{i,m})+\frac{1}{N}\sum _{j=1}^Ng^m(r_t^{j,m})\Big )\textrm{d}t+2 \mathbbm {1}_{(0,\infty )}(r_t^{i,m}) \textrm{d}W_t^i \;, \\ {}&\textrm{d}s_t^{i,m}=\Big ({\hat{b}}(s_t^{i,m})+\frac{1}{N}\sum _{j=1}^Nh^m(s_t^{j,m})\Big )\textrm{d}t+2 \mathbbm {1}_{(0,\infty )}(s_t^{i,m}) \textrm{d}W_t^i \;, \\ {}&\textrm{Law}(r_0^{i,m},s_0^{i,m})=\eta _m \;, \qquad i\in \{1,\ldots ,N\}\;, \end{aligned} \end{aligned}$$
(79)

where \(\eta _m\in \Gamma (\mu _m,\nu _m)\).

Taking the limit \(m\rightarrow \infty \), we obtain (35) as the weak limit of (79). In the case \(g(r)=\mathbbm {1}_{(0,\infty )}(r)\), we can choose \(g^m=\theta ^m\).

Consider a probability space \((\Omega _0,{{\mathcal {A}}}_0,Q)\) and N i.i.d. one-dimensional Brownian motions \((\{W_t^i\}_{i=1}^N)_{t\ge 0}\). Note that under H1–H7, there are random variables \(\{r^{i,n,m}\}_{i=1}^N,\{s^{i,n,m}\}_{i=1}^N:\Omega _0\rightarrow {\mathbb {W}}^{N}\) for each n, m such that \((\{r^{i,n,m},s^{i,n,m}\}_{i=1}^N)\) is the unique solution to (78) by [36, Theorem 2.2]. We denote by \({\mathbb {P}}^{n,m}=Q\circ (\{r^{i,n,m},s^{i,n,m}\}_{i=1}^N)^{-1}\) the law on \({\mathbb {W}}^N\times {\mathbb {W}}^N\).

Before taking the two limits and proving Theorem 10, we introduce a modification of Ikeda and Watanabe’s comparison theorem to compare two solutions of (78), cf. [28, Section VI, Theorem 1.1].

Lemma 21

Suppose a solution \((\{r_t^{i,n,m},s_t^{i,n,m}\}_{i=1}^N)_{t\ge 0}\) of (78) is given for fixed \(n,m\in {\mathbb {N}}\). Assume H5 for \(g^{m}\) and \(h^{m}\), H1 for \({\tilde{b}}\) and \( {\hat{b}}\), H6 for \(\theta ^n\). If \(Q[r_0^{i,n,m}\le s_0^{i,n,m} \text { for all } i=1,\ldots ,N]=1\), \({\tilde{b}}(r)\le {\hat{b}}(r)\) and \(g^m(r)\le h^m(r)\) for any \(r\in {\mathbb {R}}_+\), then

$$\begin{aligned} Q[r_t^{i,n,m}\le s_t^{i,n,m} \text { for all } t\ge 0 \text { and } i=1,\ldots ,N]=1 \;. \end{aligned}$$

Proof

For each component \(i=1,\ldots ,N\), the proof is analogous to the proof of Lemma 14. For the interaction part, similarly to (53) and using the properties of \(g^m\) and \(h^m\), it holds

$$\begin{aligned}&\frac{1}{N}\sum _{j=1}^N (g^m(r^{j,n,m}_t)-h^m(s^{j,n,m}_t))\\&\le K_m\frac{1}{N}\sum _{j=1}^N|r^{j,n,m}_t-s^{j,n,m}_t|\mathbbm {1}_{(0,\infty )}(r^{j,n,m}_t-s^{j,n,m}_t) \;. \end{aligned}$$

Hence, we obtain analogously to (54),

$$\begin{aligned} {\mathbb {E}}[(r_t^{i,n,m}-s_t^{i,n,m})_+]&\le {\tilde{L}}{\mathbb {E}}\Big [\int _0^t(r^{i,n,m}_u-s^{i,n,m}_u)_+ \textrm{d}u\Big ] \\&\quad +K_m{\mathbb {E}}\Big [\int _0^t \frac{1}{N}\sum _{j=1}^N(r_u^{j,n,m}-s_u^{j,n,m})_+ \textrm{d}u\Big ] \end{aligned}$$

for all \(i=1,\ldots ,N\). Suppose for contradiction that \(t^*=\inf \{t\ge 0:{\mathbb {E}}[(r_t^{i,n,m}- s_t^{i,n,m})_+]>0 \text { for some } i \}<\infty \). Then, by the above inequality, there exists \(i\in \{1,\ldots ,N\}\) such that \(\int _0^{t^*}{\mathbb {E}}[(r_u^{i,n,m}-s^{i,n,m}_u)_+ ]\textrm{d}u>0\). But by the definition of \(t^*\), \({\mathbb {E}}[(r^{i,n,m}_u-s^{i,n,m}_u)_+ ]=0\) for all \(i\) and all \(u<t^*\), so this integral vanishes, which is a contradiction. Hence, \(Q[r_t^{i,n,m}\le s_t^{i,n,m} \text { for all } i, \ t\ge 0]=1\). \(\square \)
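The mechanism behind Lemma 21 can be illustrated in discrete time: drive both systems with the same Brownian increments, so that with a common diffusion coefficient the noise cancels in the difference and the ordered drifts alone control the ordering. In the sketch below the sticky diffusion is deliberately replaced by a constant one for exactly this reason, and all coefficients are hypothetical choices satisfying \({\tilde{b}}\le {\hat{b}}\) and \(g\le h\) with nondecreasing Lipschitz interaction; none of them come from the paper.

```python
import numpy as np

def coupled_euler(r0, s0, T=1.0, dt=1e-3, seed=1):
    """Synchronous-coupling sketch for Lemma 21: both systems are driven by
    the SAME Brownian increments with a constant diffusion coefficient, so
    the noise terms cancel in the difference and the drift comparison alone
    governs the ordering r <= s."""
    rng = np.random.default_rng(seed)
    b_tilde = lambda x: -x                        # one-sided Lipschitz drift
    b_hat = lambda x: -x + 0.5                    # dominates b_tilde pointwise
    g = lambda x: 0.5 * np.minimum(x, 1.0)        # nondecreasing, Lipschitz
    h = lambda x: 0.5 * np.minimum(x, 1.0) + 0.1  # dominates g pointwise
    r, s = np.array(r0, float), np.array(s0, float)
    N = len(r)
    for _ in range(int(T / dt)):
        dW = rng.normal(0.0, np.sqrt(dt), size=N)  # shared noise
        r = np.maximum(r + (b_tilde(r) + g(r).mean()) * dt + 2.0 * dW, 0.0)
        s = np.maximum(s + (b_hat(s) + h(s).mean()) * dt + 2.0 * dW, 0.0)
    return r, s

r_T, s_T = coupled_euler(r0=np.full(20, 0.5), s0=np.full(20, 1.0))
```

With these choices the ordering is preserved deterministically at every step, since the difference contracts by a factor \((1-\textrm{d}t)\) and clipping at \(0\) is monotone.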

In the next step, we prove that the distribution of the solution of (78) converges as \(n\rightarrow \infty \).

Lemma 22

Assume that H1 and H2 are satisfied for \(({\tilde{b}},g)\) and \(({\hat{b}},h)\). Further, let \((\theta ^n)_{n\in {\mathbb {N}}}\), \((g^m)_{m\in {\mathbb {N}}}\), \((h^m)_{m\in {\mathbb {N}}}\), \((\mu _{n,m})_{n,m\in {\mathbb {N}}}\), \((\nu _{n,m})_{n,m\in {\mathbb {N}}}\) and \((\eta _{n,m})_{n,m\in {\mathbb {N}}}\) be such that H5, H6 and H7 hold. Let \(m\in {\mathbb {N}}\). Then there exists a random variable \((\{r^{i,m},s^{i,m}\}_{i=1}^N)\) defined on some probability space \((\Omega ^m,{{\mathcal {A}}}^m,P^m)\) with values in \({\mathbb {W}}^N\times {\mathbb {W}}^N\) such that \((\{ r_t^{i,m},s_t^{i,m}\}_{i=1}^N)_{t\ge 0}\) is a weak solution of (79). Moreover, the laws \(Q\circ (\{r^{i,n,m},s^{i,n,m}\}_{i=1}^N)^{-1}\) converge weakly, as \(n\rightarrow \infty \), to \(P^m\circ (\{r^{i,m},s^{i,m}\}_{i=1}^N)^{-1}\). If in addition,

$$\begin{aligned}&{\tilde{b}}(r)\le {\hat{b}}(r) \quad \text {and} \quad g^m(r)\le h^m(r){} & {} \text { for any } r\in {\mathbb {R}}_+, \\&Q[r_0^{i,n,m}\le s_0^{i,n,m}]=1{} & {} \text { for any } n\in {\mathbb {N}}, i=1,\ldots ,N, \end{aligned}$$

then \(P^m[r^{i,m}_t\le s^{i,m}_t \text { for all } t\ge 0 \text { and } i\in \{1,\ldots ,N\}]=1\).

Proof

Fix \(m\in {\mathbb {N}}\). The proof is divided into three parts and is similar to the proof of Lemma 15. First, we show tightness of the sequence of probability measures. Then we identify the limit of the sequence of stochastic processes. Finally, we compare the two limiting processes.

Tightness We show, analogously to the proof of Lemma 15, that the sequence of probability measures \(({\mathbb {P}}^{n,m})_{n\in {\mathbb {N}}}\) on \(({\mathbb {W}}^N\times {\mathbb {W}}^N,{{\mathcal {B}}}({\mathbb {W}}^N)\otimes {{\mathcal {B}}}({\mathbb {W}}^N))\) is tight by applying Kolmogorov’s continuity theorem. We consider \(p>2\) such that the p-th moments in H7 are uniformly bounded. Fix \(T>0\). Then the p-th moments of \(r_t^{i,n,m}\) and \(s_t^{i,n,m}\) for \(t\le T\) are bounded using Itô’s formula,

$$\begin{aligned} \textrm{d}|r_t^{i,n,m}|^p&\le p|r_t^{i,n,m}|^{p-2}\langle r_t^{i,n,m},({\tilde{b}}(r_t^{i,n,m})+\frac{1}{N}\sum _{j=1}^Ng^m(r_t^{j,n,m}))\rangle \textrm{d}t \\ {}&\quad + 2\theta ^n(r_t^{i,n,m}) p|r_t^{i,n,m}|^{p-2} r_t^{i,n,m}\textrm{d}W_t^i +p(p-1)|r_t^{i,n,m}|^{p-2}2 \theta ^n(r_t^{i,n,m})^2\textrm{d}t \\ {}&\le p\Big (|r_t^{i,n,m}|^p {\tilde{L}}+\Gamma |r_t^{i,n,m}|^{p-1}+2(p-1)|r_t^{i,n,m}|^{p-2}\Big ) \textrm{d}t + 2\theta ^n(r_t^{i,n,m}) p(r_t^{i,n,m})^{p-1}\textrm{d}W_t^i \\ {}&\le p\Big ({\tilde{L}}+\Gamma +2(p-1)\Big )|r_t^{i,n,m}|^p\textrm{d}t + p(\Gamma +2(p-1))\textrm{d}t + 2\theta ^n(r_t^{i,n,m}) p(r_t^{i,n,m})^{p-1}\textrm{d}W_t^i\;. \end{aligned}$$

Taking expectation yields

$$\begin{aligned} \frac{\textrm{d}}{\textrm{d}t}{\mathbb {E}}[ |r_t^{i,n,m}|^p]&\le p\Big ({\tilde{L}}+\Gamma +2(p-1)\Big ){\mathbb {E}}[|r_t^{i,n,m}|^p] +p(\Gamma +2(p-1))\;. \end{aligned}$$

Then by Gronwall’s lemma

$$\begin{aligned} \sup _{t\in [0,T]}{\mathbb {E}}[|r_t^{i,n,m}|^p] \le \textrm{e}^{p({\tilde{L}}+\Gamma +2(p-1))T} ({\mathbb {E}}[|r_0^{i,n,m}|^p]+Tp(\Gamma +2(p-1)) )\le C_p<\infty \;, \end{aligned}$$
(80)

where \(C_p\) depends on T and the p-th moment of the initial distribution, which is finite by assumption. Similarly, it holds \(\sup _{t\in [0,T]} {\mathbb {E}}[|s_t^{i,n,m}|^p]\le C_p\). Using these moment bounds, it holds for all \(t_1,t_2\in [0,T]\) by H1, H5 and H6,
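The passage from the differential inequality to (80) is an instance of Gronwall’s lemma: if \(\varphi '\le a\varphi +c\), then \(\varphi (t)\le \textrm{e}^{at}(\varphi (0)+ct)\). As a sanity check, one can verify numerically that the exact solution of the equality case never exceeds this bound. The constants below are illustrative stand-ins for \(a=p({\tilde{L}}+\Gamma +2(p-1))\) and \(c=p(\Gamma +2(p-1))\).

```python
import math

def gronwall_bound(phi0, a, c, t):
    """The bound used in (80): if phi' <= a*phi + c with phi(0) = phi0, then
    Gronwall's lemma gives phi(t) <= exp(a*t) * (phi0 + c*t)."""
    return math.exp(a * t) * (phi0 + c * t)

def ode_solution(phi0, a, c, t):
    """Exact solution of the equality case phi' = a*phi + c, the worst case
    among all functions satisfying the differential inequality."""
    return math.exp(a * t) * phi0 + (c / a) * (math.exp(a * t) - 1.0)

# Illustrative constants a = p*(L~ + Gamma + 2*(p-1)), c = p*(Gamma + 2*(p-1)).
p, L_tilde, Gamma = 4.0, 1.0, 2.0
a = p * (L_tilde + Gamma + 2 * (p - 1))
c = p * (Gamma + 2 * (p - 1))
checks = [ode_solution(1.0, a, c, t) <= gronwall_bound(1.0, a, c, t)
          for t in (0.01, 0.1, 0.5, 1.0)]
```

The inequality reduces to \(\textrm{e}^{at}-1\le at\,\textrm{e}^{at}\) for \(a,t\ge 0\), so the check holds for every grid point.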

$$\begin{aligned} {\mathbb {E}}&[|r_{t_2}^{i,n,m}-r_{t_1}^{i,n,m}|^p] \\ {}&\le C_1(p)\Big ({\mathbb {E}}\Big [\Big |\int _{t_1}^{t_2} {\tilde{b}}(r_u^{i,n,m}) +\frac{1}{N}\sum _{j=1}^N g^m(r_u^{j,n,m})\textrm{d}u \Big |^p\Big ]+{\mathbb {E}}\Big [\Big |\int _{t_1}^{t_2}2 \theta ^n(r_u^{i,n,m})\textrm{d}W_u^i\Big |^p\Big ]\Big ) \\ {}&\le C_2(p)\Big (\Big ({\mathbb {E}}\Big [\frac{{\tilde{L}}^p}{|t_2-t_1|}\int _{t_1}^{t_2}|r_u^{i,n,m}|^p\textrm{d}u\Big ]+\Gamma ^p\Big )|t_2-t_1|^p+{\mathbb {E}}\Big [\Big |\int _{t_1}^{t_2}4\theta ^n(r_u^{i,n,m})^2\textrm{d}u\Big |^{p/2}\Big ]\Big ) \\ {}&\le C_2(p)\Big (\Big (\frac{{\tilde{L}}^p}{|t_2-t_1|}\int _{t_1}^{t_2}{\mathbb {E}}[|r_u^{i,n,m}|^p]\textrm{d}u +\Gamma ^p\Big )|t_2-t_1|^p+2^{p}|t_2-t_1|^{p/2}\Big ) \\ {}&\le C_3(p,T,{\tilde{L}},\Gamma ,C_p)|t_2-t_1|^{p/2} \;, \end{aligned}$$

where the \(C_k(\cdot )\) are constants depending on the stated arguments, but independent of \(n\) and \(m\). Note that in the second step, we use the Burkholder–Davis–Gundy inequality, see [38, Chapter IV, Theorem 48]. Similarly, it holds \({\mathbb {E}}[|s_{t_2}^{i,n,m}-s_{t_1}^{i,n,m}|^p]\le C_3(p,T,{\tilde{L}},\Gamma ,C_p)|t_2-t_1|^{p/2}\). Hence,

$$\begin{aligned} {\mathbb {E}}[|(\{r_{t_2}^{i,n,m},s_{t_2}^{i,n,m}\}_{i=1}^N)&-(\{r_{t_1}^{i,n,m},s_{t_1}^{i,n,m}\}_{i=1}^N)|^p] \\ {}&\le C_4(p,N)(\sum _{i=1}^N( {\mathbb {E}}[|r_{t_2}^{i,n,m}-r_{t_1}^{i,n,m}|^p]+{\mathbb {E}}[|s_{t_2}^{i,n,m}-s_{t_1}^{i,n,m}|^p])) \\ {}&\le C_5(p,N,T,{\tilde{L}},\Gamma ,C_p)|t_2-t_1|^{p/2} \end{aligned}$$

for all \(t_1,t_2\in [0,T]\). Hence, by Kolmogorov’s continuity criterion, cf. [31, Corollary 14.9], there exists a constant \({\tilde{C}}\) depending on p and \(\gamma \) such that

$$\begin{aligned} {\mathbb {E}}\Big [ [(\{r^{i,n,m},s^{i,n,m}\}_{i=1}^N)]_\gamma ^p\Big ]\le {\tilde{C}}\cdot C_5(p,N,T,{\tilde{L}},\Gamma ,C_p) \;. \end{aligned}$$
(81)

Here \([x]_\gamma =\sup _{t_1\ne t_2\in [0,T]}\frac{|x(t_1)-x(t_2)|}{|t_1-t_2|^\gamma }\) denotes the Hölder seminorm. By (81), the sequence \((\{r_t^{i,n,m},s_t^{i,n,m}\}_{i=1}^N)_{t\ge 0}\), \(n\in {\mathbb {N}}\), is tight in \({{\mathcal {C}}}([0,T],{\mathbb {R}}^{2N})\). Hence, for each \(T>0\) there exist a subsequence \(n_k\rightarrow \infty \) and a probability measure \({\mathbb {P}}_T^m\) on \({{\mathcal {C}}}([0,T],{\mathbb {R}}^{2N})\) such that the laws restricted to \([0,T]\) converge weakly to \({\mathbb {P}}_T^m\) along \((n_k)\). Since \(\{{\mathbb {P}}_T^m\}_{T>0}\) is a consistent family, there exists by [31, Theorem 5.16] a probability measure \({\mathbb {P}}^m\) on \(({\mathbb {W}}^N\times {\mathbb {W}}^N,{{\mathcal {B}}}({\mathbb {W}}^N)\otimes {{\mathcal {B}}}({\mathbb {W}}^N))\) such that \({\mathbb {P}}^{n_k,m}\) converges weakly to \({\mathbb {P}}^m\). Note that by a diagonalization argument we can take the same subsequence \((n_k)\) for all \(m\).
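The moment estimate behind the Kolmogorov criterion has the same \(|t_2-t_1|^{p/2}\) scaling as Brownian increments, for which the \(p=4\) case is exact: \({\mathbb {E}}|W_{t_2}-W_{t_1}|^4=3|t_2-t_1|^2\). The following Monte Carlo check (with an assumed sample size and seed, purely for illustration) recovers this scaling.

```python
import numpy as np

def increment_fourth_moment(dt, n_samples=200_000, seed=2):
    """Monte-Carlo estimate of E|W_{t+dt} - W_t|^4 for Brownian motion.
    The exact value is 3*dt**2, i.e. of order |t2 - t1|^{p/2} with p = 4,
    which is the moment scaling behind the Kolmogorov tightness argument."""
    rng = np.random.default_rng(seed)
    dW = rng.normal(0.0, np.sqrt(dt), size=n_samples)
    return np.mean(dW ** 4)

# The ratio E|dW|^4 / dt^2 should be close to 3 for every step size.
ratios = [increment_fourth_moment(dt) / dt**2 for dt in (0.01, 0.1)]
```

The sample size is chosen so that the Monte Carlo error (standard error about \(0.02\)) is far below the tolerance used in the check.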

Characterization of the limit measure Denote by \((\{{\varvec{r}}_t^i,{\varvec{s}}_t^i\}_{i=1}^N)=\omega (t)\) the canonical process on \({\mathbb {W}}^N\times {\mathbb {W}}^N\). To characterize the measure \({\mathbb {P}}^m\), we first note that \({\mathbb {P}}^m\circ ({\varvec{r}}_0^i,{\varvec{s}}_0^i)^{-1}=\eta _m\) for all \(i\in \{1,\ldots ,N\}\), since \({\mathbb {P}}^{n,m}\circ ({\varvec{r}}_0^i,{\varvec{s}}_0^i)^{-1}=\eta _{n,m}\) converges weakly to \(\eta _m\) by assumption. We define maps \(M^{i,m},N^{i,m}:{\mathbb {W}}^{N}\times {\mathbb {W}}^{N}\rightarrow {\mathbb {W}}\) by

$$\begin{aligned}&M_t^{i,m}={\varvec{r}}_t^i-{\varvec{r}}_0^i-\int _0^t\Big ({\tilde{b}}({\varvec{r}}_u^i)+\frac{1}{N}\sum _{j=1}^Ng^m({\varvec{r}}_u^j)\Big )\textrm{d}u\;, \quad \text { and }\nonumber \\ {}&N_t^{i,m}={\varvec{s}}_t^i-{\varvec{s}}_0^i-\int _0^t\Big ({\hat{b}}({\varvec{s}}_u^i)+\frac{1}{N}\sum _{j=1}^Nh^m({\varvec{s}}_u^j)\Big )\textrm{d}u \;. \end{aligned}$$
(82)

For each \(n,m\in {\mathbb {N}}\) and \(i=1,\ldots ,N\), \((M_t^{i,m},{{\mathcal {F}}}_t,{\mathbb {P}}^{n,m})\) and \((N_t^{i,m},{{\mathcal {F}}}_t,{\mathbb {P}}^{n,m})\) are martingales with respect to the filtration \({{\mathcal {F}}}_t=\sigma (({\varvec{r}}_u^{j},{\varvec{s}}_u^{j}):{j=1,\ldots ,N,0\le u\le t})\). Note that the families \((\{M_t^{i,m}\}_{i=1}^N,{\mathbb {P}}^{n,m})_{n\in {\mathbb {N}},t\ge 0}\) and \((\{N_t^{i,m}\}_{i=1}^N,{\mathbb {P}}^{n,m})_{n\in {\mathbb {N}},t\ge 0}\) are uniformly integrable. Since the mappings \(M^{i,m}\) and \(N^{i,m}\) are continuous in \({\mathbb {W}}\), \({\mathbb {P}}^{n,m}\circ (\{{\varvec{r}}^i,{\varvec{s}}^i,M^{i,m},N^{i,m}\}_{i=1}^N)^{-1}\) converges weakly to \({\mathbb {P}}^{m}\circ (\{{\varvec{r}}^i,{\varvec{s}}^i,M^{i,m},N^{i,m}\}_{i=1}^N)^{-1}\) by the continuous mapping theorem. Then applying the same argument as in (59), \((M^{i,m}_t,{{\mathcal {F}}}_t,{\mathbb {P}}^m)\) and \((N^{i,m}_t,{{\mathcal {F}}}_t,{\mathbb {P}}^m)\) are continuous martingales for all \(i=1,\ldots ,N\) and the quadratic variation \(([\{M^{i,m},N^{i,m}\}_{i=1}^N]_t)_{t\ge 0}\) exists \({\mathbb {P}}^m\)-almost surely. To complete the identification of the limit, it suffices to identify the quadratic variation. Similar to the computations in the proof of Lemma 15, it holds

$$\begin{aligned}&[M^{i,m}]=4\int _0^\cdot \mathbbm {1}_{(0,\infty )}({\varvec{r}}_u^i)\textrm{d}u \qquad {\mathbb {P}}^m\text {-almost surely,} \nonumber \\&[N^{i,m}]=4\int _0^\cdot \mathbbm {1}_{(0,\infty )}({\varvec{s}}_u^i)\textrm{d}u \qquad {\mathbb {P}}^m\text {-almost surely, and} \nonumber \\&[M^{i,m},N^{i,m}]=4\int _0^\cdot \mathbbm {1}_{(0,\infty )}({\varvec{r}}_u^i)\mathbbm {1}_{(0,\infty )}({\varvec{s}}_u^i)\textrm{d}u \qquad {\mathbb {P}}^m\text {-almost surely.} \end{aligned}$$
(83)

Further, \([M^{i,m},M^{j,m}]_t=[N^{i,m},N^{j,m}]_t=[M^{i,m},N^{j,m}]_t=0\) \({\mathbb {P}}^{n,m}\)-almost surely for \(i\ne j\) and \((M_t^{i,m}M_t^{j,m},{\mathbb {P}}^{n,m})\), \((N_t^{i,m}N_t^{j,m},{\mathbb {P}}^{n,m})\) and \((M_t^{i,m}N_t^{j,m},{\mathbb {P}}^{n,m})\) are martingales. For any bounded, continuous non-negative function \(G:{\mathbb {W}}\rightarrow {\mathbb {R}}\), it holds

$$\begin{aligned} {\mathbb {E}}^m[G(M^{i,m}_tM^{j,m}_t-M^{i,m}_sM^{j,m}_s)]=\lim _{n\rightarrow \infty }{\mathbb {E}}^{n,m}[G(M_t^{i,m}M_t^{j,m}-M_s^{i,m}M_s^{j,m})]=0 \;, \end{aligned}$$

respectively, \({\mathbb {E}}^m[G(N^{i,m}_tN^{j,m}_t-N^{i,m}_sN^{j,m}_s)]=0\) and \({\mathbb {E}}^m[G(M^{i,m}_tN^{j,m}_t-M^{i,m}_sN^{j,m}_s)]=0\). Then

$$\begin{aligned}&[M^{i,m},M^{j,m}]=[N^{i,m},N^{j,m}]=[M^{i,m},N^{j,m}]=0{} & {} {\mathbb {P}}^m\text {-almost surely}, \nonumber \\&\quad \text {for all } i\ne j \;. \end{aligned}$$
(84)

Then by a martingale representation theorem, cf. [28, Chapter II, Theorem 7.1], there are a probability space \((\Omega ^m, {{\mathcal {A}}}^m,P^m)\), independent Brownian motions \(\{W^i\}_{i=1}^N\) and random variables \((\{r^{i,m},s^{i,m}\}_{i=1}^N)\) on this space, such that \(P^m\circ (\{r^{i,m},s^{i,m}\}_{i=1}^N)^{-1}={\mathbb {P}}^m\circ (\{{\varvec{r}}^i,{\varvec{s}}^i\}_{i=1}^N)^{-1}\) and such that \((\{r^{i,m},s^{i,m},W^i\}_{i=1}^N)\) is a weak solution of (79).
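Quadratic-variation identities of the type (83) can also be checked on simulated paths: for \(M_t=\int _0^t 2\,\mathbbm {1}_{(0,\infty )}(r_u)\,\textrm{d}W_u\), the realized quadratic variation should be close to \(4\int _0^t\mathbbm {1}_{(0,\infty )}(r_u)\,\textrm{d}u\). The sketch below uses a hypothetical deterministic indicator path, not an actual solution of (79), purely to illustrate the identity.

```python
import numpy as np

def realized_qv(indicator, T=1.0, dt=1e-4, seed=3):
    """Realized quadratic variation of M_t = int_0^t 2*indicator(u) dW_u,
    approximated by the sum of squared increments.  By identities of type
    (83) it should be close to 4 * int_0^T indicator(u) du."""
    rng = np.random.default_rng(seed)
    t = np.arange(0.0, T, dt)
    sigma = 2.0 * indicator(t)                  # diffusion coefficient 2*1{...}
    dW = rng.normal(0.0, np.sqrt(dt), size=len(t))
    dM = sigma * dW                             # martingale increments
    return np.sum(dM ** 2)

# Hypothetical indicator path: "away from the boundary" on [0, 0.5) only,
# so the theoretical quadratic variation at T = 1 is 4 * 0.5 = 2.
qv = realized_qv(lambda t: (t < 0.5).astype(float))
```

With step size \(10^{-4}\) the standard deviation of the realized quadratic variation is of order \(0.04\), so it concentrates tightly around the theoretical value \(2\).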

Comparison of two solutions To show \(P^m[r^{i,m}_t\le s^{i,m}_t \text { for all }t\ge 0 \text { and } i=1,\ldots ,N]=1\), it suffices to note that \(Q[r_t^{i,n,m}\le s_t^{i,n,m} \text { for all } t\ge 0 \text { and }i=1,\ldots ,N]=1\), which holds by Lemma 21, carries over to the limit by the Portmanteau theorem, since the set of ordered pairs of paths is closed in \({\mathbb {W}}^N\times {\mathbb {W}}^N\) and \({\mathbb {P}}^{n,m}\) converges weakly to \({\mathbb {P}}^m\). \(\square \)

In the next step we show that the distribution of the solution of (79) converges as \(m\rightarrow \infty \). Consider a probability space \((\Omega ^m,{{\mathcal {A}}}^m,P^m)\) for each \(m\in {\mathbb {N}}\) and random variables \(\{r^{i,m}\}_{i=1}^N,\{s^{i,m}\}_{i=1}^N:\Omega ^m\rightarrow {\mathbb {W}}^N\) such that \((\{r^{i,m}_t,s_t^{i,m}\}_{i=1}^N)_{t\ge 0}\) is a solution to (79). Denote by \({\mathbb {P}}^m=P^m\circ (\{r^{i,m},s^{i,m}\}_{i=1}^N)^{-1}\) the law on \({\mathbb {W}}^N\times {\mathbb {W}}^N\).

Lemma 23

Assume that H1 and H2 are satisfied for \(({\tilde{b}},g)\) and \(({\hat{b}},h)\). Let \(\eta \in \Gamma (\mu ,\nu )\) where the probability measures \(\mu \) and \(\nu \) on \({\mathbb {R}}_+\) satisfy H3. Further, let \((g^m)_{m\in {\mathbb {N}}}\), \((h^m)_{m\in {\mathbb {N}}}\), \((\mu _{m})_{m\in {\mathbb {N}}}\), \((\nu _{m})_{m\in {\mathbb {N}}}\) and \((\eta _{m})_{m\in {\mathbb {N}}}\) be such that H5 and H7 hold. Then there exists a random variable \((\{r^i,s^i\}_{i=1}^N)\) defined on some probability space \((\Omega ,{{\mathcal {A}}},P)\) with values in \({\mathbb {W}}^N\times {\mathbb {W}}^N\) such that \((\{r_t^{i},s_t^{i}\}_{i=1}^N)_{t\ge 0}\) is a weak solution of (35). Moreover, the laws \(P^{m}\circ ( \{r^{i,m},s^{i,m}\}_{i=1}^N)^{-1}\) converge weakly to \(P\circ (\{r^i,s^i\}_{i=1}^N)^{-1}\). If in addition,

$$\begin{aligned}&{\tilde{b}}(r)\le {\hat{b}}(r), \quad g(r)\le h(r), \quad \text {and} \quad g^m(r)\le h^m(r){} & {} \text { for any } r \in {\mathbb {R}}_+ \text {, and } \\ {}&P^m[r_0^{i,m}\le s_0^{i,m} \text { for all } i\in \{1,\ldots ,N\}]=1{} & {} \text { for any } m\in {\mathbb {N}}, \end{aligned}$$

then \(P[r^i_t\le s^i_t \text { for all } t\ge 0 \text { and } i\in \{1,\ldots ,N\}]=1\).

Proof

The proof is structured as the proof of Lemma 22. Tightness of the sequence of probability measures \(({\mathbb {P}}^{m})_{m\in {\mathbb {N}}}\) on \(({\mathbb {W}}^N\times {\mathbb {W}}^N,{{\mathcal {B}}}({\mathbb {W}}^N)\otimes {{\mathcal {B}}}({\mathbb {W}}^N))\) holds by adapting the steps of the proof of Lemma 22 to (79). Note that (80) and (81) hold analogously for \((\{r_t^{i,m},s_t^{i,m}\}_{i=1}^N)\) by H1, H5 and H7. Hence, by Kolmogorov’s continuity criterion, cf. [31, Corollary 14.9], there exist a probability measure \({\mathbb {P}}\) on \(({\mathbb {W}}^N\times {\mathbb {W}}^N,{{\mathcal {B}}}({\mathbb {W}}^N)\otimes {{\mathcal {B}}}({\mathbb {W}}^N))\) and a subsequence \((m_k)_{k\in {\mathbb {N}}}\) along which \({\mathbb {P}}^{m_k}\) converges weakly to \({\mathbb {P}}\).

To characterize the limit, we first note that by the Skorokhod representation theorem, cf. [6, Chapter 1, Theorem 6.7], we can assume without loss of generality that the \((\{r^{i,m},s^{i,m}\}_{i=1}^N)\) are defined on a common probability space \((\Omega ,{{\mathcal {A}}},P)\) with expectation \(E\) and converge almost surely to \((\{r^i,s^i\}_{i=1}^N)\) with distribution \({\mathbb {P}}\). Then, by H5 and the dominated convergence theorem, it holds almost surely for all \(t\ge 0\),

$$\begin{aligned} \lim _{m\rightarrow \infty }\int _0^t {\tilde{b}}(r_u^{i,m})+\frac{1}{N}\sum _{j=1}^N g^m(r_u^{j,m})\textrm{d}u=\int _0^t {\tilde{b}}(r_u^i)+\frac{1}{N}\sum _{j=1}^N g(r_u^j)\textrm{d}u \;. \end{aligned}$$
(85)

Consider the mappings \({M}^{i,m},{N}^{i,m}:{\mathbb {W}}^N\times {\mathbb {W}}^N\rightarrow {\mathbb {W}}\) defined by (82). Then for all \(m\in {\mathbb {N}}\) and \(i=1,\ldots ,N\), \(({M}_t^{i,m},{{\mathcal {F}}}_t,{\mathbb {P}}^m)\) and \(({N}_t^{i,m},{{\mathcal {F}}}_t,{\mathbb {P}}^m)\) are martingales with respect to the canonical filtration \({{\mathcal {F}}}_t=\sigma ((\{{\varvec{r}}_u^{i},{\varvec{s}}_u^{i}\}_{i=1}^N)_{0\le u\le t})\). Further, the families \((\{M_t^{i,m}\}_{i=1}^N,{\mathbb {P}}^m)_{m\in {\mathbb {N}},t\ge 0}\) and \((\{N_t^{i,m}\}_{i=1}^N,{\mathbb {P}}^m)_{m\in {\mathbb {N}},t\ge 0}\) are uniformly integrable. Along the same lines as weak convergence is shown in the proof of Lemma 15, and by (85), \({\mathbb {P}}^m\circ (\{{\varvec{r}}^i,{\varvec{s}}^i,M^{i,m},N^{i,m}\}_{i=1}^N)^{-1}\) converges weakly to \({\mathbb {P}}\circ (\{{\varvec{r}}^i,{\varvec{s}}^i,M^{i},N^{i}\}_{i=1}^N)^{-1}\) where

$$\begin{aligned}&M_t^{i}={\varvec{r}}_t^i-{\varvec{r}}_0^i-\int _0^t\Big ({\tilde{b}}({\varvec{r}}_u^i)+\frac{1}{N}\sum _{j=1}^Ng({\varvec{r}}_u^j)\Big )\textrm{d}u \;,{} & {} \text { and } \\ {}&N_t^{i}={\varvec{s}}_t^i-{\varvec{s}}_0^i-\int _0^t\Big ({\hat{b}}({\varvec{s}}_u^i)+\frac{1}{N}\sum _{j=1}^Nh({\varvec{s}}_u^j)\Big )\textrm{d}u\;. \end{aligned}$$

Then \((\{{M}_t^i\}_{i=1}^N,{{\mathcal {F}}}_t,{\mathbb {P}})\) and \((\{{N}_t^i\}_{i=1}^N,{{\mathcal {F}}}_t,{\mathbb {P}})\) are continuous martingales by the same argument as in (59). Further, the quadratic variation \(([\{{M}^i,{N}^i\}_{i=1}^N]_t)_{t\ge 0}\) exists \({\mathbb {P}}\)-almost surely and is given by (83) and (84) \({\mathbb {P}}\)-almost surely, which holds following the computations in the proofs of Lemma 15 and Lemma 22. As in Lemma 22, we conclude by a martingale representation theorem that there are a probability space \((\Omega ,{{\mathcal {A}}},P)\), independent Brownian motions \(\{W^i\}_{i=1}^N\) and random variables \((\{r^i\}_{i=1}^N,\{s^i\}_{i=1}^N)\) on this space such that \(P\circ (\{r^i,s^i\}_{i=1}^N)^{-1}={\mathbb {P}}\circ (\{{\varvec{r}}^i,{\varvec{s}}^i\}_{i=1}^N)^{-1}\) and such that \((\{r^i,s^i,W^i\}_{i=1}^N)\) is a weak solution of (35).

By the Portmanteau theorem, the ordering \(r^i\le s^i\) carries over to the limit, since \({\mathbb {P}}^m\circ (\{{\varvec{r}}^i,{\varvec{s}}^i\}_{i=1}^N)^{-1}\) converges weakly to \({\mathbb {P}}\circ (\{{\varvec{r}}^i,{\varvec{s}}^i\}_{i=1}^N)^{-1}\). \(\square \)

Proof of Theorem 10

The proof is a direct consequence of Lemmas 22 and 23. \(\square \)